As you probably know, Sitefinity CMS makes it easy to keep a page out of external search crawlers (such as Googlebot) by unchecking the "Allow search engines to index this page" property. However, the page is still indexed by Sitefinity's internal search engine and still appears in the search results on your website.
Use the steps below to gain more control over which pages Sitefinity indexes automatically.
1. In Visual Studio, create a class that inherits from the PageInboundPipe class in the Telerik.Sitefinity.Publishing.Pipes namespace. Override its LoadPageNodes and CanProcessItem methods:
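    using System.Collections.Generic;
    using System.Linq;
    using Telerik.Sitefinity.Pages.Model;
    using Telerik.Sitefinity.Publishing.Pipes;

    public class PagePipeNoIndex : PageInboundPipe
    {
        // Keep only the page nodes that pass the checks in CanProcessItem.
        protected override IEnumerable<PageNode> LoadPageNodes()
        {
            return base.LoadPageNodes().Where(n => this.CanProcessItem(n));
        }

        public override bool CanProcessItem(object item)
        {
            if (item == null)
                return false;

            if (item is PageData)
            {
                var pageData = item as PageData;
                // Skip backend pages.
                if (pageData.NavigationNode.IsBackend)
                {
                    return false;
                }
                // Skip pages with "Allow search engines to index this page" unchecked.
                if (!pageData.Crawlable)
                {
                    return false;
                }
            }

            if (item is PageNode)
            {
                var pageNode = (PageNode)item;
                if (pageNode.IsBackend)
                    return false;
                // Skip nodes that are neither standard nor external pages, and non-crawlable pages.
                if ((pageNode.NodeType != NodeType.Standard && pageNode.NodeType != NodeType.External) || !pageNode.Page.Crawlable)
                {
                    return false;
                }
            }

            return base.CanProcessItem(item);
        }
    }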
These overrides are invoked whenever Sitefinity needs to update its page search index (for example, when a page is created or updated). CanProcessItem checks the Crawlable property, which corresponds to the "Allow search engines to index this page" checkbox, and keeps the page out of the index when the box is unchecked.
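To see the filtering rules in isolation, here is a minimal, self-contained sketch that mirrors the PageNode branch of CanProcessItem. PageStub, NodeKind, and IndexFilter are hypothetical stand-ins invented for this illustration; they are not Sitefinity types, and the real pipe additionally defers to base.CanProcessItem.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for illustration only; these are NOT Sitefinity types.
public enum NodeKind { Standard, External, Group }

public sealed class PageStub
{
    public string Title { get; set; }
    public bool IsBackend { get; set; }
    public NodeKind Kind { get; set; }
    public bool Crawlable { get; set; }
}

public static class IndexFilter
{
    // Same order of checks as the PageNode branch above: null, backend,
    // node type, then the Crawlable flag.
    public static bool ShouldIndex(PageStub page)
    {
        if (page == null) return false;
        if (page.IsBackend) return false;
        if (page.Kind != NodeKind.Standard && page.Kind != NodeKind.External) return false;
        return page.Crawlable;
    }

    // Returns the titles of the pages that would stay in the index.
    public static List<string> IndexableTitles(IEnumerable<PageStub> pages)
    {
        return pages.Where(ShouldIndex).Select(p => p.Title).ToList();
    }
}
```

With this model, a standard frontend page with Crawlable set to true is kept, while a backend page, a group page, or any page with Crawlable set to false is dropped.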
2. Replace the built-in page pipe with the custom pipe from above. This is done in the Global.asax.cs file as follows:
    using System;
    using Telerik.Sitefinity.Abstractions;
    using Telerik.Sitefinity.Publishing;
    using Telerik.Sitefinity.Publishing.Pipes;

    public class Global : System.Web.HttpApplication
    {
        protected void Application_Start(object sender, EventArgs e)
        {
            Bootstrapper.Initialized += Bootstrapper_Initialized;
        }

        void Bootstrapper_Initialized(object sender, Telerik.Sitefinity.Data.ExecutedEventArgs e)
        {
            if (e.CommandName == "Bootstrapped")
            {
                ReplacePagePipeWithCustomPagePipe();
            }
        }

        private void ReplacePagePipeWithCustomPagePipe()
        {
            // Remove the default page pipe.
            PublishingSystemFactory.UnregisterPipe(PageInboundPipe.PipeName);
            // Register PagePipeNoIndex under the original page pipe name, so that
            // the publishing system uses it whenever it asks for the page pipe.
            PublishingSystemFactory.RegisterPipe(PageInboundPipe.PipeName, typeof(PagePipeNoIndex));
        }
        ...
    }
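As a rough mental model of why re-registering under the original name is enough, the sketch below uses a hypothetical name-to-type registry. PipeRegistry, IPipe, DefaultPagePipe, and NoIndexPagePipe are invented for this illustration and say nothing about how PublishingSystemFactory is actually implemented; in particular, rejecting duplicate names is an assumption of the sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are NOT Sitefinity types.
public interface IPipe { string Describe(); }
public sealed class DefaultPagePipe : IPipe { public string Describe() => "default"; }
public sealed class NoIndexPagePipe : IPipe { public string Describe() => "no-index"; }

public static class PipeRegistry
{
    private static readonly Dictionary<string, Type> pipes = new Dictionary<string, Type>();

    public static void RegisterPipe(string name, Type pipeType)
    {
        if (!typeof(IPipe).IsAssignableFrom(pipeType))
            throw new ArgumentException("Type must implement IPipe.", nameof(pipeType));
        // In this sketch a name can be registered only once, so the old
        // entry has to be removed before a replacement is added.
        pipes.Add(name, pipeType);
    }

    public static void UnregisterPipe(string name) => pipes.Remove(name);

    // Callers look pipes up by name, so they never see which type backs it.
    public static IPipe GetPipe(string name) => (IPipe)Activator.CreateInstance(pipes[name]);
}
```

In this model, calling UnregisterPipe("PagePipe") and then RegisterPipe("PagePipe", typeof(NoIndexPagePipe)) means every later GetPipe("PagePipe") call returns the replacement, without any change to the calling code.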
That's it. Build the project, and from now on, unchecking the "Allow search engines to index this page" checkbox hides the page from both external search crawlers and Sitefinity's internal search.
To learn more about the Publishing system in Sitefinity CMS please check this blog post or the online documentation.