Friday, February 26, 2016

Custom indexes in a Sitecore Helix architecture

Sitecore Helix/Habitat

Most Sitecore developers probably know what Sitecore Habitat and Sitecore Helix is about right now, especially since the 2016 Hackathon. Several interesting blog posts have already been written about the architecture (e.g. this one by Anders Laub) and video's have been posted on Youtube by Thomas Eldblom.

Custom indexes

I use custom indexes very frequently. So I started thinking about how I could use custom indexes in a "Helix" architecture. When creating a feature that uses such a custom index, the configuration for that index has to be in the feature. That is perfectly possible as we can create a separate index config file. But what are the things we define in that config file?

  • index
    • a name
    • the strategy
    • crawler(s)
    • ...
  • index configuration
    • field map
    • document options
      • computed index fields
      • include templates
      • ...
    • ...

Some of these settings can be defined in our feature without issue: the name -obviously- and the stategy (e.g. onPublishEndAsync) can be defined. A crawler might be the first difficulty but this can be set to a very high level (e.g. <Root>/sitecore/content</Root>)

In the index configuration we can also define the fieldMap and such. In the documentOptions section we can (must) define our computed index fields. But then we should define our included templates. And that was were I got stuck..  in a feature I don't know my template, just base templates..


A first thought was to use the patching mechanism from Sitecore. We could define our index and it's configuration in the feature and patch the included templates and/or crawlers in the project.
Sounds like a plan, but especially for the included templates it didn't feel quite right.

For the index itself patching will be necessary in some cases.. e.g. to enable or disable Item/Field language fallback. If needed it is also possible to patch the content root in the project level.

Included templates? Included base templates!

For the included templates in the document options I was searching for another solution so I decided to throw my question on the Sitecore Slack Helix/Habitat channel and ended up in a discussion with Thomas Eldblom and Sitecore junkie Mike Reynolds. Thomas came up with the idea to hook into the index process to enable it to include base templates and Mike kept pushing me to do it and so.. I wrote an extension to configure your index based on base templates.

The code is a proof of concept.. it can -probably- be made better still but let this be a start.

Custom document options

I started by taking a look at one of my custom indexes to see what Sitecore was doing with the documentOptions sections and took a look at their code in Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilderOptions. As you can guess, the poc is done with Lucene..


The idea was to create a custom document options class by inheriting from the LuceneDocumentBuilderOptions. I could add a new method to allow adding templates in a new section with included base templates. This will not break any other configuration sections.

An example config looks like:
<documentOptions type="YourNamespace.TestOptions, YourAssembly">
    <include hint="list:AddIncludedBaseTemplate">
This looks very familiar - as intended. We create a new include section with the hint "list:AddIncludedBaseTemplate". The name 'AddIncludedBaseTemplate' will come back later in our code.


Related templates

The first function we created was to get all templates that relate to our base template:

private IEnumerable<Item> GetLinkedTemplates(Item item)
  var links = Globals.LinkDatabase.GetReferrers(item, new ID("{12C33F3F-86C5-43A5-AEB4-5598CEC45116}"));
  if (links == null)
    return new List<Item>();

  var items = links.Select(i => i.GetSourceItem()).Where(i => i != null).ToList();
  var result = new List<Item>();
  foreach (var linkItem in items)

  return items;

We use the link database to get the referrers and use the Guid of the "Base template" field of a template to make sure that we get references in that field only - which also makes sure that all results are actual Template items.
The function is recursive because a template using your base template can again be a base template for another template (which will by design also include your original base template). The result is a list of items.

A second function will use our first one to generate a list of Guids from the ID of the original base template:
public IEnumerable<string> GetLinkedTemplates(ID id)
  var item = Factory.GetDatabase("web").GetItem(id);
  Assert.IsNotNull(item, "Configuration : templateId cannot be found");

  var linkedItems = GetLinkedTemplates(item);
  return linkedItems.Select(l => l.ID.Guid.ToString("B").ToUpperInvariant()).Distinct();

As you can see what we do here is try to fetch the item from the id and call our GetLinkedTemplates function. From the results we take the distinct list of guid-strings - in uppercase.

Context database

One big remark here is the fact that I don't know what the database is - if somebody knows how to do that, please let me (and everybody) know. The context database in my tests was 'core' - I tried to find the database defined in the crawler because that is the one you would need but no luck so far.

And finally...


public virtual void AddIncludedBaseTemplate(string templateId)
  Assert.ArgumentNotNull(templateId, "templateId");
  ID id;
  Assert.IsTrue(ID.TryParse(templateId, out id), "Configuration: AddIncludedBaseTemplate entry is not a valid GUID. Template ID Value: " + templateId);
  foreach (var linkedId in GetLinkedTemplates(id))
    AddTemplateFilter(linkedId, true);

Our main function is called "AddIncludedBaseTemplate" - this is consistent with the name used in the configuration. In the end we want to use the "AddTemplateFilter" function from the base DocumentBuilderOptions - the 'true' parameter is telling the function that the templates are included (false is excluded). So we convert the template guid coming in to an ID to validate it and use it in the functions we created to get all related templates.


Determining your included templates is apparently only done once at startup. So if you have a lot of base templates to include which have lots of templates using them, don't worry about this code being called on every index update. Which of course doesn't mean we shouldn't think about performance here ;)


So we are now able to configure our index to only use templates that inherit from our base templates. Cool. Does it end here? No.. you can re-use this logic to create other document options as well to tweak your index behavior. 

And once more: thanks to Thomas & Mike for the good chat that lead to this.. The community works :)

No comments:

Post a Comment