Wednesday, November 12, 2025

Sitecore Forms and the Content Security Policy

Sitecore Forms and the Content-Security-Policy (CSP)

The situation

A site is using Sitecore XM/XP and Sitecore Forms and has implemented a Content Security Policy (CSP) header, which is a best practice. However, Sitecore Forms does not play well with such policies, as they tend to block the inline scripts it outputs. We had a (rather big) content site that had made edits to its CSP based on a recent penetration test. The CSP is managed in Sitecore, so it can be adapted per site. After this tightening of the CSP we noticed that the forms from Sitecore Forms were not working anymore. To be more precise: the form appeared fine, but the submit action no longer worked when it included a "Redirect to Page" action.

The error

Luckily the error was pretty clear in the browser console: the CSP header was blocking the script.

The error mentions "Executing inline script violates the following Content Security Policy directive 'script-src ...'. Either the 'unsafe-inline' keyword, a hash ('sha256-Dcwc6bB3ob8DnpIRKtqhRwu0Wl6bkf7uLnQFk3g6bPQ='), or a nonce ('nonce-...') is required to enable inline execution. The action has been blocked."

The solution

As the code which outputs the inline script is in the Sitecore assemblies, adding a nonce value does not seem to be an option. Adding unsafe-inline everywhere is also not a good option, as that would weaken the CSP dramatically. So we went for another option and tried to add unsafe-inline only when there is a form on the page.

Adding unsafe-inline conditionally 

First of all we will add an indicator in the HttpContext to tell us whether there is a form on the page. This can be done in Form.cshtml (located in \Views\FormBuilder), which is the main cshtml file of Sitecore Forms. But as we are using SXA it can also be done in the SXA Forms wrapper. This is Sitecore Form Wrapper.cshtml (located in \Views\Shared) and as we already had some customization in this file to add translations (see my previous post on this topic) we added a few lines here:
var context = HttpContext.Current;
context.Items["WeHaveAForm"] = "Y";
You can name the context item whatever you want of course.

Now we need to act on this context item. Again, there are options. We already had some code that placed a CSP in the header based on a value set in Sitecore on the Site item. But if you do not, a generic solution would be to place it in the Application_EndRequest method in Global.asax.
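// Requires: using System.Text.RegularExpressions;
// Flag set by the form wrapper view earlier in the request
var formFlag = HttpContext.Current?.Items["WeHaveAForm"];
if (formFlag != null && formFlag.Equals("Y"))
{
    var csp = Response.Headers["Content-Security-Policy"];
    if (string.IsNullOrEmpty(csp))
    {
        return;
    }

    // Match the script-src directive only, not script-src-elem or script-src-attr
    csp = Regex.Replace(csp, @"(?<![\w-])script-src(?![\w-])", "script-src 'unsafe-inline'");
    // Remove every nonce source, otherwise browsers ignore 'unsafe-inline'
    var pattern = @"'nonce-[^']+'";
    csp = Regex.Replace(csp, pattern, string.Empty);
    Response.Headers.Set("Content-Security-Policy", csp);
}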
As you can see we do just a little bit more here:
  1. We check if the item is present in the context and get out if it is not
  2. We check if we have a csp value - if not we don't need to do anything so we get out
  3. We add the 'unsafe-inline' part to the script-src, if it is present in the csp
  4. We remove the complete nonce if that is present
  5. We set the new value in the Content-Security-Policy header

It is important to also remove the nonce. When a CSP directive includes both a nonce (or a hash) and 'unsafe-inline', the browser ignores 'unsafe-inline' and only allows the inline elements that carry a matching nonce or hash. So if we keep the nonce, adding 'unsafe-inline' will not do anything. Note that the regular expression strips nonces from every directive in the header, so a nonce in style-src is removed as well.
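For policies with many directives it can be easier to work per directive instead of on the whole header string. The sketch below is one possible variant, not the code we deployed: a hypothetical CspRewriter helper that splits the header value into directives and only changes the exact script-src directive. The class and method names are made up for this example, and dropping hash sources next to nonces is an assumption based on browsers ignoring 'unsafe-inline' when either is present.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of Sitecore): rewrites only the script-src
// directive of a CSP header value and leaves every other directive untouched.
public static class CspRewriter
{
    public static string AllowInlineScripts(string csp)
    {
        if (string.IsNullOrWhiteSpace(csp))
        {
            return csp;
        }

        // A CSP value is a list of directives separated by semicolons.
        var directives = csp.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(d => d.Trim())
            .Where(d => d.Length > 0)
            .Select(RewriteDirective);

        return string.Join("; ", directives);
    }

    private static string RewriteDirective(string directive)
    {
        // A null separator splits on any whitespace.
        var tokens = directive.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        // Exact name match, so script-src-elem and script-src-attr are skipped.
        if (!tokens[0].Equals("script-src", StringComparison.OrdinalIgnoreCase))
        {
            return directive;
        }

        // Nonce and hash sources make browsers ignore 'unsafe-inline', so drop them.
        var sources = tokens.Skip(1)
            .Where(t => !t.StartsWith("'nonce-", StringComparison.OrdinalIgnoreCase)
                     && !t.StartsWith("'sha", StringComparison.OrdinalIgnoreCase))
            .ToList();

        if (!sources.Contains("'unsafe-inline'"))
        {
            sources.Insert(0, "'unsafe-inline'");
        }

        return "script-src " + string.Join(" ", sources);
    }
}
```

Calling CspRewriter.AllowInlineScripts with "default-src 'self'; script-src 'self' 'nonce-abc123'; script-src-elem 'self'" returns "default-src 'self'; script-src 'unsafe-inline' 'self'; script-src-elem 'self'". Because it is a pure string function, it can be unit tested without an HttpContext.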

Conclusion

We fixed the redirects on the forms without adding unsafe-inline on all pages. I would assume that is the best solution we could find here. 

Monday, October 27, 2025

Sitecore PowerShell Reporting

Sitecore PowerShell Reporting

The Sitecore PowerShell Extensions (SPE) module has a lot of nice features. One of them is creating reports. You get quite a few reports out-of-the-box with the module, but you can also create your own.

I recently noticed, however, that people are still creating custom pages to build reports for the customer's admins. Although this gives you a lot of flexibility, and perhaps options to include data that does not reside in Sitecore, it also has some drawbacks compared to creating reports in PowerShell.

The SPE reports are completely integrated in the Sitecore editing environment. This makes it possible to (amongst others):
  • use context items: the output in the report can be based upon the item it was requested on
  • open items in the editor: directly open the editor from the report
  • stay more future-proof, especially if you move to a SaaS solution
Next to that, the SPE module comes with some very handy features like Show-ListView, which gives your admins a nice toolbox without any effort.

But enough about the benefits of SPE - let's dive into the example I wanted to show here.

Redirect module

I assume many Sitecore developers have had the request to add some sort of redirect module, enabling editors to create redirects without intervention of IT-people having to change the rewrite instructions.

There used to be several modules floating around in the Sitecore community, but to be honest none of them was ever really perfect. SXA also has its own implementation of redirects, and it takes a different approach than most of the modules: instead of having a repository of redirect items and using pipeline code to handle them, you place the redirects in the tree where you actually want them.

This brings me to the request that led to this post: creating redirects in a non-SXA XM project in a way that they are easily handled by the admins.

We decided to create a page template with a layout that redirects the page based on some values that can be edited inside the page (e.g. url, permanent redirect, ...). You can actually go pretty far in this if you want, but as that is not our main focus here let's keep it to that. The focus here is on answering the question from the Sitecore admins:

"Where are all my redirects?"

Let's answer that question with Sitecore PowerShell. Note that we are using redirects here as an example and this can be done for any kind of data in Sitecore. Also note that out-of-the-box there already is a report to fetch all items based on a template within a folder. But that report is very generic and we wanted a fancy one. Of course, we will copy from this one to give us a head start.

We start with some functions to assist the actual reports.

Sitecore PowerShell Functions

Functions are a way to reuse parts of the code. We will use two functions for our report. The first one checks whether the item is published to the web database. Like many pieces of software, this one is just a copy. But as we in the Sitecore community are honest, we mention our sources. Kudos to Gabriel Streza for this one.

Get-IsPublished
function Get-IsPublished {
    [CmdletBinding()]
    param( 
        [Parameter(Position = 0, Mandatory = $true, ValueFromPipeline = $true)]
        [ValidateNotNullOrEmpty()]
        [Sitecore.Data.Items.Item]$Item        
    )
   
    $WebDbItem = Get-Item web: -Id $Item.ID

    if ($null -ne $WebDbItem) {
        return $true
    }else{
        return $false
    }
}
I assume this one is fairly easy and doesn't need more explanation. 

The second function will fetch the items and show the report. This is actually the core of the whole thing. 

Show-RedirectReport
function Show-RedirectReport{ 
    [CmdletBinding()]
    param (
        [Parameter(Mandatory=$true, Position=0)]
        [string]$templateId,
        [Parameter(Mandatory=$true, Position=1)]
        [string]$path
    )
        
    $items = Find-Item -Index sitecore_master_index `
        -Criteria @{Filter = "Equals"; Field = "_template"; Value = "$templateId"},
        @{Filter = "Equals"; Field = "_language"; Value = "en"},
        @{Filter = "StartsWith"; Field = "_fullpath"; Value = "$path" } | Initialize-Item

    Import-Function Get-IsPublished

    if ($items.Count -eq 0) {
        Show-Alert "There are no redirects here."
    } else {
        $props = @{
            Title = "Redirect Report"
            PageSize = 25
        }

        $items |
            Show-ListView @props -Property @{Label="Name"; Expression={$_.DisplayName} },
                @{Label="Updated"; Expression={$_.__Updated} },
                @{Label="Updated by"; Expression={[Sitecore.Security.Accounts.User]::FromName($_."__Updated by", $false).Profile.FullName}},
                @{Label="Path"; Expression={$_.ItemPath} },
                @{Label="Target - Url"; Expression={$link = $_._.RedirectUrl
                    if ($link.IsInternal) { $link.TargetItem.Paths.Path } else { $link.Url }} },
                @{Label="Permanent"; Expression={$_._.Permanent.Checked} },
                @{Label="Published"; Expression={Get-IsPublished -Item $_ } }
    }
}
We will go a little deeper into this one. It has two parts.
First of all we are fetching items through the master index. The function expects a templateID and a path and we will use those to fetch all items of that template in that path (in English).

If we found some items, we are using Show-ListView to display the report. Most of the fields we are showing are quite common but a few deserve some attention:
  • Updated by: we are using the security account here to fetch the full name of the profile instead of the Sitecore login name as that might be much more readable. Kudos to the one and only Adam Najmanowicz for this one.
  • Target: this can be an internal or external link so we use the "._" notation (short for .PSFields) to access the typed field and check what we need to show
  • Permanent: again using ._ to handle this as a checkbox
  • Published: using our Get-IsPublished function here


The report

Once we have our two functions, the report itself is very easy. We need to collect the parameters to pass to the report function and we should be ok.
$templatePath = "master:\templates\xxx\Shared\Redirect"
$baseTemplate = Get-Item -Path $templatePath
$templateId = $baseTemplate.ID
$path = "/sitecore/content/Sites"

Import-Function Show-RedirectReport
Show-RedirectReport -templateId $templateId -path $path

Close-Window
As this report is intended to be used for one specific template we can hardcode that. We did the same for the base path for now - it's the root path for all content in this case.

Location
In our Sitecore PowerShell module we create (or go to) a folder Reports: "/sitecore/system/Modules/PowerShell/Script Library/xxx/Reports". Here we create a PowerShell Script Library that will be the folder in the reporting tools section. In that folder we place our PowerShell Script - "Redirects". That will result as shown here:




This is nice - we have our report together with the other reports. But we mentioned before that we can make it context aware. Of course, we could ask for a start item in a dialog box. But we can also use a very similar report in the context menu. 

The context menu

To add the report to the context menu and use the context item as the start path, we simply add a new script to our library.
$templatePath = "master:\templates\xxx\Shared\Redirect"
$baseTemplate = Get-Item -Path $templatePath
$templateId = $baseTemplate.ID
$contextItem = Get-Item .
$path = $contextItem.Paths.FullPath

Import-Function Show-RedirectReport
Show-RedirectReport -templateId $templateId -path $path

Close-Window

Not many differences compared to what we already had - just the path is now coming from the context item ".".

Location

We are placing the new script in the folder "Content Editor/Context Menu" in our SPE module. In this case the location alone is not enough: it would add the script to the context menu of every item, and we want to limit that a little bit. We can use the Show Rule to define where the script will be shown in the context menu. In our case we limit it to items of 2 specific templates, but you can use any rule here.

This will look like this:



The final result

I want to show a screenshot of the final result as well - especially for those people who are not so familiar with Sitecore PowerShell yet. For others it will look very familiar - but that doesn't make it less nice :) 


In conclusion I would remind you once more that if you need to bring reports to your editors or admins, think about Sitecore PowerShell. It is a really cool tool that will bring you the result you need. And on top of that you can find a lot of resources already from Sitecore community people that have shared scripts and/or snippets to give you ideas and help you write your own scripts.  And maybe this one extra post helps someone as well.  





One small final note: use this script in the desktop mode...

Thursday, September 18, 2025

EasyLingo 3.0

EasyLingo 3.0 - update for Sitecore 10.4


Years ago I created (together with Kris - Kevin - Verheire) a module for XM/XP customers that had multiple languages in their site and wanted an easy overview of which language versions existed on every page. To be honest, I almost forgot about this module as my active customers had stopped using it but this year I was asked to consult on an upgrade project for a customer towards Sitecore 10.4 - and this customer was still using the module.

I wrote posts about the module when it was released, and the code is available on GitHub.

Sitecore 10.4

Apparently the latest version of the module was not compatible with Sitecore 10.4. So I had to make a few adjustments and released version 3.0, which is compatible again.

Experience editor

One problem though: this release seemed to break the Experience Editor in a weird way. I think it is related to the way some custom JavaScript code was inserted, but as I am not a JavaScript expert I might be wrong. Anyway, I couldn't get it fixed, so version 3.0.1 was released without Experience Editor support.

For the customer that was no issue as they didn't use the functionality in that editor that much and time was running out so I'm afraid that for now this will be it. 

Due to circumstances this issue is still not fixed, but as the code is openly available anyone can jump in and contribute.

I assumed (and still do) that the module was very close to the end of its lifetime, but I was also glad to see that at least some people are actually still using it after all those years. If you still have a customer on XM/XP with multiple languages, feel free to have a look. And fix my Experience Editor issue if you can 😉

Wednesday, September 17, 2025

Sitecore RTE broken - h is not a constructor

Sitecore - Telerik RTE broken

Recently we installed the Sitecore Security Bulletin SC2025-004 on a project that is deployed on Azure as a PaaS solution. The project includes SXA, so we also installed the Cumulative hotfix for SXA 10.2.0 and Sitecore XP 10.2 as mentioned in the security bulletin (for sites running on 10.2 - there are different versions for the other Sitecore releases).

At first everything seemed fine, but then we noticed the Telerik Rich Text Editor (RTE) wasn't working anymore. Or at least, the display of it was completely wrong:



In the meantime we learned that this is not related to the patch install and that it was just a coincidence, although it is weird that it only happened on the environments where those patches got installed. Also worth mentioning: we could not reproduce it locally (although we installed the whole patch there as well).


Sitecore StackExchange

I started searching for a solution and while doing that bumped into someone on Sitecore StackExchange who seems to have faced the same issue: both the PaaS environment and the error. Anyway, after finding a fix I also posted it there, so I hope he reads it someday and can fix his issue as well. You can check the post on https://sitecore.stackexchange.com/a/39965/237

The error - h is not a constructor



Our main starting point to solving the issue was the error above that we got in the browser. I found my way to a Telerik knowledge-base article that mentions the error: TypeError: h is not a constructor at Sys.Component.create.

As you probably know, the Rich Text Editor in Sitecore XM/XP is from Telerik, and although the error message stack trace was not completely the same as mine, the solution was worth a shot as it was close enough.

Solution ?

The solution they propose is adding a setting in the appSettings part of the web.config. Note that many solutions (including ours) rip this part from the actual web.config to place it in a separate file. The section should already contain some keys related to the Telerik editor. Add this: 

<configuration>
  <appSettings>
    <add key="ValidationSettings:UnobtrusiveValidationMode" value="None" />
  </appSettings>
</configuration>

Once this was added, the Telerik editor was working fine again on our development environment. We deployed the same fix to our test environment and there it didn't fix the issue...

User & role manager

So back to the research table. I found something very weird. Actually the trigger was that the user and role manager also started to act weird - displaying a white screen. This led me to the finding that the WebResource.axd calls were giving wrong results - in fact, they all gave the same result. Which of course broke several other javascript calls.

This smelled like caching, so I checked with our networking guys. They told me that the Front Door in front of the PaaS systems (which we of course do not have locally) was caching some routes. So my assumption is that it caches those axd calls without taking the full querystring into account.
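To illustrate that assumption: the hypothetical BuildKey method below is not Azure Front Door code, it only shows what happens when a cache key is built with or without the query string. The host name and the d and t values are made up for this example.

```csharp
using System;

// Hypothetical illustration of a cache key; this is not Azure Front Door code.
public static class CacheKeyDemo
{
    // Builds the key a cache could use to store the response for a request URL.
    public static string BuildKey(Uri url, bool includeQueryString)
    {
        var key = url.Host + url.AbsolutePath;
        // Uri.Query includes the leading '?' when a query string is present.
        return includeQueryString ? key + url.Query : key;
    }

    public static void Main()
    {
        // Two different embedded resources: only the query string differs.
        var first = new Uri("https://cm.example.com/WebResource.axd?d=AAA&t=1");
        var second = new Uri("https://cm.example.com/WebResource.axd?d=BBB&t=1");

        // Query string ignored: both requests share one cache entry,
        // so the second request is served the content of the first.
        Console.WriteLine(BuildKey(first, false) == BuildKey(second, false)); // True

        // Query string included: each resource gets its own entry.
        Console.WriteLine(BuildKey(first, true) == BuildKey(second, true)); // False
    }
}
```

With the query string left out of the key, every WebResource.axd request maps to the same entry, which matches the symptom of all calls returning the same result.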

The real solution

After disabling that caching in Azure Front Door everything started working again. And hopefully it stays that way. And if not, you will read about it here...


As at least one other person encountered this, I thought there might be more, so it was worth sharing. If you run into it too, I hope this fixes it for you.



Thursday, July 3, 2025

XP editor navigation issues in Sitecore JSS

Experience Editor & Sitecore JSS

As you probably know, Sitecore's experience editor does work on headless sites created with Sitecore SXA and JSS enabling editors to work in headless sites just as they would in the non-headless ones.  We had such a headless site and bumped into a few issues in the experience editor though. 

An important note here is that this headless site is not the only one on this Sitecore instance (10.2) - there are others, non-headless and non-sxa mvc sites. 

Navigation bar

Our first issue was the navigation bar. The navigation bar allows you to navigate to a specific page through a menu structure, a bit like a breadcrumb.  To enable the navigation bar, in the ribbon, on the View tab, select Navigation bar.


Our issue was that the urls in that navigation were wrong - they were using the wrong sc_site parameter in the querystring. 

SiteResolving

The first thing we had to check was the configuration of the sites. When adding sites with config patches they can be ordered. We had done this properly, but the headless site is an SXA site and that does not get added through a config file. However, there also is a config section siteResolving in the ExperienceAccelerator section. This needs to include all your non-SXA sites, so you need to patch them in.
<siteResolving patch:source="Sitecore.XA.SitesToResolveAfterSxa.config">
  <site name="exm" resolve="after"/>
  <site name="website" resolve="after"/>
  <site name="unicorn" resolve="after"/>
  <site name="x" resolve="after" patch:source="xxx.Sites.config"/>
  <site name="y" resolve="after" patch:source="xxx.Sites.config"/>
  <site name="z" resolve="after" patch:source="xxx.Sites.config"/>
</siteResolving>
This is a good and necessary step forward, but unfortunately it did not solve our problem yet. 

TargetHostName

The final step to get it working was adding a target hostname to the site definition of our headless sxa-jss site. Not an ideal solution, but it does work. 

Ok, issue number one solved - the navigation bar is working.




LinkFields

We also noticed we had a problem when using LinkFields. 

When we have a Rich Text field in the XP editor, the links inside are transformed into something like https://.../~/link.aspx?_id=5F7338F6FCFC404EB1EF67FAE1F71F7D&_z=z. In components using integrated GraphQL to fetch data however, we notice that the LinkFields are not transformed in the same way (on the same page). We fetch the jsonValue of the fields and use the Link component from the JSS SDK in our Next.js code. This works perfectly in normal mode, but in the XP editor we get links like https://.../en/Test%3Fsc_site=X. If editors click on those links they get an error.

I asked this on Sitecore StackExchange but didn't get an answer there. So I opened a Sitecore support ticket and after a few rounds with questions and answers and logs, videos and data gathering we finally came to a solution. 

But before we come to the solution, let's take a step back to see how we got there. 

Step 1 - the ?

First thing we noticed was the %3F instead of a ? to start the querystring. Apparently there is a bug called "Internal link query string encoded in Next.js app" but we actually just fixed it in the Next app itself by replacing the character. Problem 1 solved.

Step 2 - sc_site

To get the correct site in the links I first wanted to check the data that came from the GraphQL queries. To do this, I wanted to use the GraphQL edge UI in edit mode and apparently that is possible. How to do that is documented on StackExchange - https://sitecore.stackexchange.com/a/39193/237 - so no need to repeat it all here. In short, if you want to fetch the data in edit mode:
  1. Fetch a token from the Identity Server using Postman (or similar tool) as explained here
  2. Add the authorization header in the http headers
  3. Add sc_mode=edit to your querystring



This way we learned that the url was definitely coming from Sitecore the wrong way. So back to Sitecore... 

Maybe now is a good time to mention that although this headless site is rather new, the other mvc sites on the same platform are not. Instead, it's an old spaghetti platform that we inherited - I guess if you have been around in this business for a while you know what I mean.

And so, eventually we (to be honest, Sitecore support) found that someone patched a setting Languages.AlwaysStripLanguage to false. 

The default value for this setting is true, and it specifies if the StripLanguage processor in the "preprocessRequest" pipeline will parse and remove languages from the url, even when the languageEmbedding attribute of the linkProvider is set to "never".  

Changing this setting back to the original value (true) fixed our issue. And didn't cause any new ones... 


Conclusion

Don't touch settings that you should not touch... 

Thursday, June 12, 2025

Solr query with n-gram

A Search story with Solr N-Gram part 2


Querying our index

In part 1 of this search story I described the setup we did to create a custom Solr index in Sitecore that had a few fields with the n-gram tokenizer. 

A small recap: we are trying to create a search on a bunch of similar Sitecore items that uses tagging but also free text search in the title and description. We want to make sure the users always get results if possible with the most relevant on top.

In this second part I will describe how we queried that index to get what we need. We are trying to use only solr - not retrieving any data from Sitecore - as we want to be ready to move this solution out of the Sitecore environment some day. This is the reason we are not using the Sitecore search layer, but instead the SolrNet library.


Query options and basics

Let's start easy with setting some query options.
var options = new QueryOptions
{
  Rows = parameters.Rows,
  StartOrCursor = new StartOrCursor.Start(parameters.Start)
};
We are just setting the parameters for paging here - number of rows and the start row.
var query = new List<ISolrQuery>()
  {
    new SolrQueryByField("_template", "bdd6ede443e889619bc01314c027b3da"),
    new SolrQueryByField("_language", language),
    new SolrQueryByField("_path", "5bbbd9fa6d764b01813f0cafd6f5de31")
  };
We start the query by setting the desired template, language and path.
We use SolrQueryInList with an IEnumerable to add the tagging parts to the query, but as that is not the most relevant part here I will not go into more details. You can find all the information on querying with SolrNet in their docs on GitHub.


Search query

The next step and most interesting one is adding the search part to the query.
if (!string.IsNullOrEmpty(parameters.SearchTerm))
{
  var searchQuery = new List<ISolrQuery>()
  {
    new SolrQueryByField("titlestring_s", parameters.SearchTerm),
    new SolrQueryByField("descriptionstring_s", parameters.SearchTerm),
    new SolrQueryByField("titlesearch_txts", parameters.SearchTerm),
    new SolrQueryByField("descriptionsearch_txts", parameters.SearchTerm)
  };
  var search = new SolrMultipleCriteriaQuery(searchQuery, SolrMultipleCriteriaQuery.Operator.OR);
  query.Add(search);
  options.AddOrder(new SortOrder("score", Order.DESC));
  options.ExtraParams = new Dictionary<string, string>
  {
      { "defType", "edismax" },
      { "qf", "titlestring_s^9 descriptionstring_s^5 titlesearch_txts^2 descriptionsearch_txts" }
  };
}
else
{
  options.AddOrder(new SortOrder("__smallupdateddate_tdt", Order.DESC));
}

What are we doing here? First of all, we check if we actually have a search parameter. If we do not, we do not add any search query and keep the sorting as default - being the last update date in our case. 

But what if we do have a search string? We make a new solr query that combines 4 field queries. We search in the string and ngram version of the title and description. We combine the field queries with an OR operator and add the query to the global solr query. 

We then set the sorting on the score field - this is the score calculated by solr and indicating the relevancy of the result. 

Last we also add extra parameters to indicate the edismax boosting we want to use. We boost the full string matches most, and also title more than description. 

This delivers us the requirements we wanted:
  • search in title and description
  • get results as often as possible
  • show exact matches first
  • get the most relevant results on top


Wrap up

To wrap things up we combine everything and execute the query:
var q = new SolrMultipleCriteriaQuery(query, SolrMultipleCriteriaQuery.Operator.AND);
logger.LogDebug($"[Portal] Information center search: {solrQuerySerializer.Serialize(q)}");
var results = await solrDocuments.QueryAsync(q, options);
Next to gathering the results note that we can also use the provided serializer to log our queries for debugging.

As a final remark I do need to add that a search like this needs fine-tuning. That is tuning the size of the ngrams and also tuning the boost factors. Change the parameters (one at a time) and test until you get the results as you want them.
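When you are tuning one boost at a time, it helps to keep the boost factors in one place instead of editing the qf string by hand. A minimal sketch, assuming a hypothetical QueryFieldsBuilder helper (the name is made up and it is not part of SolrNet) that turns field/boost pairs into the qf value:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper, not part of SolrNet: builds the edismax "qf" value.
public static class QueryFieldsBuilder
{
    // A boost of 1 is written without a suffix, other boosts as field^boost.
    public static string Build(IEnumerable<KeyValuePair<string, double>> boosts)
    {
        return string.Join(" ", boosts.Select(b =>
            b.Value == 1.0
                ? b.Key
                : b.Key + "^" + b.Value.ToString(CultureInfo.InvariantCulture)));
    }

    public static void Main()
    {
        // Same fields and boosts as in the search query of this post.
        var qf = Build(new[]
        {
            new KeyValuePair<string, double>("titlestring_s", 9),
            new KeyValuePair<string, double>("descriptionstring_s", 5),
            new KeyValuePair<string, double>("titlesearch_txts", 2),
            new KeyValuePair<string, double>("descriptionsearch_txts", 1)
        });

        // Prints: titlestring_s^9 descriptionstring_s^5 titlesearch_txts^2 descriptionsearch_txts
        Console.WriteLine(qf);
    }
}
```

The resulting string can be passed as the "qf" entry of options.ExtraParams. The invariant culture keeps a fractional boost such as 2.5 formatted with a dot, whatever the server culture is.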

And that's it for this second and final part of this ngram search series. As mentioned in the first post, this information is not new and most of it can be found in several docs and posts, but I thought it would be a good idea to bring it all together. Enjoy your search ;)

Wednesday, June 4, 2025

Search with Solr n-gram in Sitecore

A Search story with Solr N-Gram 

For a customer on Sitecore XM 10.2 we have a headless site running JSS with NextJS and a very specific search request.
One section of their content is an unstructured bunch of help related articles - like a frequently asked questions section. This content is heavily tagged and contains quite a few items (in a bucket). We already had an application showing this data with the option to use the tags to filter and get to the required content. But now we also had to add free text search.

There is nothing more frustrating than finding no results, especially when looking for help - so we want to give as many relevant results as possible, with the most relevant on top.

Also note that we do not have a solution like Sitecore Search or Algolia at our disposal here. So we need to create something with basic Solr. 

As I gathered information from several resources and also found quite a bit of outdated information this post seemed like a good idea. I will split it in two - a first part here on how to do the solr setup and a second post on the search code itself.

Solr N-Gram

To be able to (almost) always get results, we decided to use the N-Gram tokenizer.  An n-gram tokenizer splits text into overlapping sequences of characters of a specified length. This tokenizer is useful when you want to perform partial word matching because it generates substrings (character n-grams) of the original input text.
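To make that concrete, here is a small sketch of what character n-grams look like. It only mimics the idea behind the tokenizer: the NGramDemo class is made up for this example, it is not Solr code, and it ignores lowercasing, stop words and synonyms. The order in which Solr emits the grams may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of character n-grams; this is not Solr code.
public static class NGramDemo
{
    // Returns every substring of text with a length between minGram and maxGram.
    public static List<string> CharacterNGrams(string text, int minGram, int maxGram)
    {
        var grams = new List<string>();
        for (var start = 0; start < text.Length; start++)
        {
            for (var size = minGram; size <= maxGram && start + size <= text.Length; size++)
            {
                grams.Add(text.Substring(start, size));
            }
        }
        return grams;
    }

    public static void Main()
    {
        // "redirect" has 8 characters: 6 grams of size 3, 5 of size 4 and 4 of size 5.
        Console.WriteLine(string.Join(" ", CharacterNGrams("redirect", 3, 5)));
    }
}
```

Every gram of a partial search term like "direc" (dir, dire, direc, ire, irec, rec) also occurs in the grams of "redirect", which is why a partial word can still match the indexed value.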

Step 1 in the process is to create a field type in the Solr schema that will use this tokenizer. We will be using it on indexing and on querying, meaning the indexed value and the search string will be split into n-grams.

We could update the schema in Solr (manually) - but every time someone would populate the index schema our change would be gone. 

Customize index schema population 

An article on the Sitecore documentation helped us to customize the index schema population - which is exactly what we need. We took the code from https://doc.sitecore.com/xp/en/developers/latest/platform-administration-and-architecture/add-custom-fields-to-a-solr-schema.html and changed the relevant methods as such:
private IEnumerable<XElement> GetAddCustomFields()
{
  yield return CreateField("*_txts",
    "text_searchable",
    isDynamic: true,
    required: false,
    indexed: true,
    stored: true,
    multiValued: false,
    omitNorms: false,
    termOffsets: false,
    termPositions: false,
    termVectors: false);
}
So we are creating a new dynamic field "*_txts" of type "text_searchable" that will get indexed and stored.

private IEnumerable<XElement> GetAddCustomFieldTypes()
{
  var fieldType = CreateFieldType("text_searchable", "solr.TextField",
    new Dictionary<string, string>
    {
      { "positionIncrementGap", "100" },
      { "multiValued", "false" },
    });
  var indexAnalyzer = new XElement("indexAnalyzer");
  indexAnalyzer.Add(new XElement("tokenizer", new XElement("class", "solr.NGramTokenizerFactory"), new XElement("minGramSize", "3"), new XElement("maxGramSize", "5")));
  indexAnalyzer.Add(new XElement("filters", new XElement("class", "solr.StopFilterFactory"), new XElement("ignoreCase", "true"), new XElement("words", "stopwords.txt")));
  indexAnalyzer.Add(new XElement("filters", new XElement("class", "solr.LowerCaseFilterFactory")));
  fieldType.Add(indexAnalyzer);
  
  var queryAnalyzer = new XElement("queryAnalyzer");
  queryAnalyzer.Add(new XElement("tokenizer", new XElement("class", "solr.NGramTokenizerFactory"), new XElement("minGramSize", "3"), new XElement("maxGramSize", "5")));
  queryAnalyzer.Add(new XElement("filters", new XElement("class", "solr.StopFilterFactory"), new XElement("ignoreCase", "true"), new XElement("words", "stopwords.txt")));
  queryAnalyzer.Add(new XElement("filters", new XElement("class", "solr.SynonymFilterFactory"), new XElement("synonyms", "synonyms.txt"), new XElement("ignoreCase", "true"), new XElement("expand", "true")));
  queryAnalyzer.Add(new XElement("filters", new XElement("class", "solr.LowerCaseFilterFactory")));
  fieldType.Add(queryAnalyzer);
  yield return fieldType;
}
Here we are adding the type for text_searchable as a text field that uses the NGramTokenizerFactory. We are also setting the min and max gram size. This will determine the minimum and maximum number of characters that are used to create the fractions of your text (check the solr docs for more details). 

Don't forget to also add the factory class and the configuration patch and that's it. 

We created a custom index for this purpose in order to be able to have a custom configuration with computed fields and such specific on this index - with a limited number of items. If we now populate the schema for that index, our n-gram field type is added.

Sitecore index configuration

As mentioned earlier we have a custom index configured.  This was done for 2 reasons:
  • setting the crawlers: plural as we have two, covering both locations where we have items that should be included in the application
  • custom index configuration: we wanted our own index configuration to be completely free in customizing it just for this index without consequences in all the others. The default solr configuration is referenced so we don't need to copy all the basics though
    <ourcustomSolrIndexConfiguration ref="contentSearch/indexConfigurations/defaultSolrIndexConfiguration">
In order to get what we need in the index, we configure:
  • AddIncludedTemplate: list the templates to be added in the index
  • AddComputedIndexField: all computed fields to be added in the index

Computed Fields

Next to a number of computed fields for the extra tagging and such, we also used computed fields to add the title and the description field two more times in the index. Why? Well, it's an easy way to copy a field (and apply some extra logic if needed). And we do need a copy. Well, copies actually.

The first copy will be set as a text_searchable field as we just created, the second copy will be a string field. Again, why?

As you will see in the next part of this blog where we talk about querying the data, we will use all data from the index and not go to Sitecore to fetch anything. This means we need everything we want to return in the index and that is why we are creating a string field copy of our text fields. It's all about tokenizers☺. The text_searchable copy is to have an n-gram version as well.

I am not going to share code for a computed field here - that has been documented enough already and a simple copy of a field is really very basic. 

Configuration

I will share the configuration parts to add the computed fields.
<fields hint="raw:AddComputedIndexField">
  <field fieldName="customtagname" type="Sitecore.XA.Foundation.Search.ComputedFields.ResolvedLinks, Sitecore.XA.Foundation.Search" returnType="stringCollection" referenceField="contenttype" contentField="title"/>
  ...
  <field fieldName="titlesearch" type="X.Index.CopyField, X" returnType="string" referenceField="title" />
  <field fieldName="descriptionsearch" type="X.Index.CopyField, X" returnType="string" referenceField="description" />
  <field fieldName="titlestring" type="X.Index.CopyField, X" returnType="string" referenceField="title" />
  <field fieldName="descriptionstring" type="X.Index.CopyField, X" returnType="string" referenceField="description" />
</fields>
This config will create all the computed index fields. Note that we are also using the ResolvedLinks from SXA to handle reference fields.
Adding the fields with the correct type to the field map:
<fieldMap ref="contentSearch/indexConfigurations/defaultSolrIndexConfiguration/fieldMap">
  <typeMatches hint="raw:AddTypeMatch">
    <typeMatch type="System.String" typeName="text_searchable" fieldNameFormat="{0}_txts" settingType="Sitecore.ContentSearch.SolrProvider.SolrSearchFieldConfiguration, Sitecore.ContentSearch.SolrProvider" />
  </typeMatches>
  <fieldNames hint="raw:AddFieldByFieldName">
    <field fieldName="titlesearch" returnType="text_searchable"/>
    <field fieldName="descriptionsearch" returnType="text_searchable"/>
  </fieldNames>
</fieldMap>  


Our index is ready now. In part 2 we will query this index to get the required results.