Search with accented characters

Sitefinity CMS uses the Lucene search provider by default. Lucene uses analyzer classes to extract index terms from text and generate a token stream. To implement an accent-insensitive search in Sitefinity CMS, you replace the default Lucene analyzer with one that converts accented characters to their unaccented equivalents.

Lucene provides several filter classes, for example, the ASCIIFoldingFilter class, which you can use to customize the search functionality and convert special characters.
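To get a feel for what accent folding does to a term, the following minimal sketch approximates it with plain .NET Unicode normalization. It is not how ASCIIFoldingFilter is implemented, and the AccentFoldingDemo class and FoldAccents method are hypothetical names used only for illustration. Unlike ASCIIFoldingFilter, this approach leaves characters that have no Unicode decomposition (for example, ø or ß) unchanged.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class AccentFoldingDemo
{
    // Hypothetical helper: approximates accent folding by decomposing
    // each character and dropping the combining marks.
    public static string FoldAccents(string text)
    {
        // FormD splits "é" into "e" followed by a combining acute accent.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var builder = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Keep base characters, skip the combining accent marks.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(c);
            }
        }

        // Recompose whatever remains into the usual composed form.
        return builder.ToString().Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        Console.WriteLine(FoldAccents("Crème brûlée")); // prints "Creme brulee"
        Console.WriteLine(FoldAccents("Málaga"));       // prints "Malaga"
    }
}
```

With folding applied at indexing time, a document that contains "Málaga" is stored under the term "Malaga".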

The following example demonstrates how to implement a custom analyzer class:

using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Util;
using System.Collections.Generic;

namespace SitefinityWebApp
{
    public class AccentInsensitiveAnalyzer : StandardAnalyzer
    {
        // Lucene compatibility version used by the analyzer and the tokenizer.
        public static readonly Version LuceneVersion = Version.LUCENE_24;

        public AccentInsensitiveAnalyzer(ISet<string> stopWords)
            : base(AccentInsensitiveAnalyzer.LuceneVersion, stopWords)
        {
        }

        public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
        {
            // Split the text into tokens.
            TokenStream stream = new StandardTokenizer(AccentInsensitiveAnalyzer.LuceneVersion, reader);
            // Apply the standard token normalization.
            stream = new StandardFilter(stream);
            // Replace accented characters with their ASCII equivalents.
            stream = new ASCIIFoldingFilter(stream);
            return stream;
        }
    }
}

In the code above, the ASCIIFoldingFilter class filters the token stream of the custom analyzer and replaces accented characters with their unaccented equivalents.

To enable Lucene to use your custom analyzer, register it in Sitefinity CMS using the ObjectFactory class. You do this in the Application_Start method of your Global.asax class:

using Lucene.Net.Analysis;
using System;
using System.Collections.Generic;
using Telerik.Microsoft.Practices.Unity;
using Telerik.Sitefinity.Abstractions;

namespace SitefinityWebApp
{
    public class Global : System.Web.HttpApplication
    {
        // ASP.NET calls Application_Start once, when the application starts.
        protected void Application_Start(object sender, EventArgs e)
        {
            // Resolve Analyzer to a single shared AccentInsensitiveAnalyzer instance.
            ObjectFactory.Container.RegisterType<Analyzer, AccentInsensitiveAnalyzer>(
                new ContainerControlledLifetimeManager(),
                // Pass null as the stopWords constructor argument.
                new InjectionConstructor(new InjectionParameter<ISet<string>>(null)));
        }
    }
}

RESULT: Your new analyzer class is used during indexing. This means that any accented characters are replaced with their unaccented equivalents only during indexing and not during searching.
