Miauw - a Sitecore blog: November 2015

Sunday, November 29, 2015

SOLID Sitecore templates

Every programmer in an object-oriented environment should know the S.O.L.I.D. principles:

Single Responsibility
Open/closed
Liskov substitution
Interface segregation
Dependency inversion

More information about this topic can be found all over the internet, for example on Wikipedia.

What I want to talk about here is using these principles when working with Sitecore, and Sitecore templates in particular. It will come down to a basic conclusion.
I hope that for most of you I will not tell anything new, but I've seen quite a lot of code already that does not follow these principles, and the reason is almost always: we had to do it "fast".

There are already lots of blog posts about the SOLID principles in C#, which of course also apply to a Sitecore-based solution. I will not repeat those but focus on the architecture of the Sitecore templates. That is not quite the same as OO coding, but as you will see the SOLID principles have their meaning here as well.

Single Responsibility

"A template should have only a single responsibility"

Do not use a template called "News" for something other than a news article, even if your other item has the same fields. You will create confusion and the maintenance of the templates will become harder. You might consider using base templates if your items have the same fields, but the single responsibility principle applies there as well.

Base templates
Do not create base templates that have more than a single responsibility. If you want to re-use fields for SEO purposes, create an SEO base template. Do not mix them with e.g. fields for your navigation in a single global "page" base template. If you really need such a 'base', compose it from several other base templates.
A base template that has more than a single section should get you thinking...

Open/Closed

"An entity should be open for extension, but closed for modification"

For this one my example does not apply to the template itself, but to functions that switch on templates (or template IDs). Quite often I get to see code that has a set of if-statements (or a switch) to identify the template of an item and do something related but still different per template. After a while you might even see these if/switch statements more than once in the same class. If you create a new template you might break those functions, because your new template will not be covered by their conditions.

How can we make this more maintainable? Create an interface (if you don't already have one, as you should) and expose only that interface to the outside world. Implement the interface for each template (or set of templates) that you need, and use a base class with virtual functions if needed. In front of those classes you create a façade (class) that does the template detection and uses the matching implementation of the interface. The consumers of your code will always see the interface, and the implementing classes do not need to check templates anymore.
If you create a new template you need to check your façades and maybe create a new implementation of an interface. But you don't need to scan your solution for template detecting statements anymore.
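
The façade idea can be sketched roughly as follows. This is a minimal illustration without Sitecore dependencies: the names ITeaserProvider, NewsTeaserProvider, EventTeaserProvider and TeaserFacade are made up for this example, and a plain Guid plus a dictionary of field values stand in for the template ID and the item.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: this is all that consumers get to see.
public interface ITeaserProvider
{
    string GetTeaser(IDictionary<string, string> fields);
}

// One implementation per template (or set of templates); no template checks inside.
public class NewsTeaserProvider : ITeaserProvider
{
    public string GetTeaser(IDictionary<string, string> fields)
    {
        return "News: " + fields["Title"];
    }
}

public class EventTeaserProvider : ITeaserProvider
{
    public string GetTeaser(IDictionary<string, string> fields)
    {
        return "Event: " + fields["Title"] + " (" + fields["StartDate"] + ")";
    }
}

// Facade: the single place that maps a template ID to an implementation.
public class TeaserFacade
{
    private readonly IDictionary<Guid, ITeaserProvider> providers =
        new Dictionary<Guid, ITeaserProvider>();

    public void Register(Guid templateId, ITeaserProvider provider)
    {
        providers[templateId] = provider;
    }

    public string GetTeaser(Guid templateId, IDictionary<string, string> fields)
    {
        ITeaserProvider provider;
        if (!providers.TryGetValue(templateId, out provider))
        {
            // A new, unregistered template fails loudly here instead of silently
            // falling through an if/switch somewhere in the solution.
            throw new NotSupportedException("No teaser provider registered for template " + templateId);
        }

        return provider.GetTeaser(fields);
    }
}
```

With this shape, supporting a new template means adding one implementation and one Register call; existing implementations stay untouched.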

Liskov substitution

"Objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program"

Applied to templates this comes down to a few simple rules that everyone finds very obvious, and still... the "fast" way, you know.

Do not "overwrite" fields from a base template: if a base template already has a field "Title" do not create another "Title" field. Also make sure you do not have multiple base templates that include the same field name.
Do not re-use a base template field for a purpose it was not designed for. You will not only confuse your fellow programmers, but also your content editors.

Interface segregation

"Specific interfaces are better than one general-purpose interface"

"Clients should not be forced to implement interfaces they don't use"

Applying this to Sitecore templates comes down to using base templates. Use base templates, and use them wisely. Create a base template for every need and make sure that your templates do not end up with fields they do not use.

Unused fields will confuse your content editors: they might get the impression that they are missing functionality.

Dependency inversion

"One should depend upon Abstractions. Do not depend upon concretions. A high-level modules or class should not depend upon low-level modules or classes"

This principle is mostly implemented with Dependency Injection. As there are already lots of blog posts on that subject (also in combination with Sitecore) I won't go into details here. And I couldn't find an example related to templates...

One thing worth mentioning, however, is that you are not done just because you use inversion of control. If you want your classes and functions to be independent and testable, you must also make sure they do not use Sitecore.Context. A simple (but often forgotten) example is the "current language". Your business functions should ask for the language as a parameter if they need it, and not depend on Sitecore.Context!
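
As a small sketch of that last point: the hypothetical GreetingService below takes the language name as a parameter instead of reading it from the context, so it can be tested without any Sitecore context. The class and its data are invented for illustration; in a real solution the caller at the edge of the application would pass in something like Sitecore.Context.Language.Name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical business class: it never touches Sitecore.Context.
public class GreetingService
{
    private readonly IDictionary<string, string> greetings =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "en", "Welcome" },
            { "nl", "Welkom" }
        };

    // The language is an explicit input, supplied by the caller.
    public string GetGreeting(string languageName)
    {
        if (languageName == null)
        {
            throw new ArgumentNullException("languageName");
        }

        string greeting;
        // Fall back to English when the language is unknown.
        return greetings.TryGetValue(languageName, out greeting) ? greeting : greetings["en"];
    }
}
```

A unit test can now call GetGreeting("nl") directly, without having to fake a current language.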

Conclusion

Wrapping this up for using templates one could say:

use base templates
use base templates wisely
use small base templates wisely

and try to keep your business logic code independent of Sitecore.Context and templateId's.

Friday, November 27, 2015

Preview.ResolveSite and the disabled Shared Layout button in Sitecore Experience Editor

After installing some Sitecore 8.1 sites, we noticed something weird in one of the new features of the Experience Editor: the Final/Shared layout button, which quite a few editors had been waiting for.

The symptoms:

The Final/Shared button is disabled.

Some more testing came up with the following results:

It happened on some Sitecore 8.1 installations, but not on all of them
It happened on all browsers, on different client machines
It happened only if we went from the LaunchPad to the Content Editor and then via the Publish ribbon to the Experience Editor
It did not happen if we went from the LaunchPad immediately to the Experience Editor - and strangely enough once we had visited the Experience Editor this way the button was always enabled throughout the entire session

It took us quite a while and help from Sitecore Support (thanks Yuriy) to figure it out, but then they pointed us to the Preview.ResolveSite setting. A small explanation can also be found on Kirkegaard's blog.

The preview resolving settings

On a multisite solution, when one opens a page in the Experience Editor, by default, Sitecore resolves the corresponding site using the value of the Preview.DefaultSite setting from the \App_Config\Sitecore.config file. The default value there is "website".

However if one sets Preview.ResolveSite setting value to "true" (it is located in the same file and by default it is "false"), Sitecore tries to resolve the root item and the context site based on the current content language and the path to the item. If Sitecore cannot resolve the context site, it uses the site that is specified in the Preview.DefaultSite setting.

So, if Sitecore renders the page on the solution in context of the "website", the Final Layout button is disabled. Why that happens is still a mystery as the rest of the Experience Editor is working fine.

But anyway: setting the Preview.ResolveSite to true fixed the issue.

Friday, November 13, 2015

Sitecore indexes

There are already a lot of blog posts describing the use of Sitecore indexes, especially since Sitecore 7 introduced the ContentSearchManager and made them much easier to use.
And still... I see lots of people writing queries that go through lots of items. So: yet another post to promote the use of indexes.

Sitecore indexes, indexes, indexes!

Sitecore has some built-in indexes (since Sitecore 8 even more). The best known are probably the sitecore_master and sitecore_web indexes. Personally I never use those. I always create a custom index. Why?

I don't want to mess with the indexes that Sitecore uses
I want my indexes small and lean

faster (re)build
easier to check

When to use?

It's hard to say exactly when to use an index, but I'll try to give some common real-life examples of requests where I almost always think "index":

fetch all news items from year x
fetch all products from category x
fetch all events happening in the future
fetch the latest news items
get the last 3 blog posts written by x
...

Too often developers write queries that are fast with the test data... but after a while the real data has outgrown the solution and it gets slow. So it's better to think ahead and make more use of those indexes. In lots of cases the result will be faster than a (fast) query.

Create a custom index

As there already are a lot of examples out there, just a fast introduction on how you can create (and use) a custom index. For more information on all the possibilities, check the Sitecore docs (or the default index config files which include examples and comments).

Configuration

I usually create a separate config file where I put the index definition and configuration together.

Example index definition:

<configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
  <indexes hint="list:AddIndex">
    <index id="MyCustom_index" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">
      <param desc="name">$(id)</param>
      <param desc="folder">$(id)</param>
      <!-- This initializes index property store. Id has to be set to the index id -->
      <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
      <configuration ref="contentSearch/indexConfigurations/myCustomIndexConfiguration" />
      <strategies hint="list:AddStrategy">
       <!-- NOTE: order of these is controls the execution order -->
       <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
      </strategies>
      <commitPolicyExecutor type="Sitecore.ContentSearch.CommitPolicyExecutor, Sitecore.ContentSearch">
        <policies hint="list:AddCommitPolicy">
          <policy type="Sitecore.ContentSearch.TimeIntervalCommitPolicy, Sitecore.ContentSearch" />
        </policies>
      </commitPolicyExecutor>
      <locations hint="list:AddCrawler">
        <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
         <Database>web</Database>
           <Root>/sitecore/content/Corporate</Root>
        </crawler>
      </locations>
    </index>
  </indexes>
</configuration>

In this example I used the LuceneProvider, the onPublishEndAsync update strategy (info on update strategies by John West here) and refer to my custom configuration.
Note that I added a crawler for the web database and gave it a root path (can also be an ID).

Example index configuration:

<indexConfigurations>
  <myCustomIndexConfiguration type="Sitecore.ContentSearch.LuceneProvider.LuceneIndexConfiguration, Sitecore.ContentSearch.LuceneProvider">
    <indexAllFields>true</indexAllFields>
    <initializeOnAdd>true</initializeOnAdd>
    <analyzer ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/analyzer" />
    <documentBuilderType>Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilder, Sitecore.ContentSearch.LuceneProvider</documentBuilderType>
    <fieldMap type="Sitecore.ContentSearch.FieldMap, Sitecore.ContentSearch">
      <fieldNames hint="raw:AddFieldByFieldName">
        <field fieldName="_uniqueid" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
          <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
        </field>
        <field fieldName="__sortorder" storageType="YES" indexType="UNTOKENIZED" vectorType="NO" boost="1f" type="System.Integer" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider"/>
        <field fieldName="title" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
          <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
        </field>
        <field fieldName="date" storageType="YES" indexType="UNTOKENIZED" vectorType="NO" boost="1f" type="System.DateTime" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider"/>
        <field fieldName="sequence" storageType="YES" indexType="UNTOKENIZED" vectorType="NO" boost="1f" type="System.Integer" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider"/>
        <field fieldName="topics" storageType="YES" indexType="UNTOKENIZED" vectorType="NO" boost="1f" type="System.Guid" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider"/>
        <field fieldName="applications" storageType="YES" indexType="UNTOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider"/>
      </fieldNames>
    </fieldMap>
    <include hint="list:IncludeTemplate">
      <NewsTemplate>{5CD362B8-C129-437A-A0D4-4EE58E71FEB1}</NewsTemplate>
      <ProductTemplate>{18D5467C-79F9-405B-AA87-2BA4B7CDB443}</ProductTemplate>
      <EventTemplate>{6CA9AC2A-1A9D-429B-870C-FC9417D3A1C7}</EventTemplate>
    </include>
    <fieldReaders ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/fieldReaders"/>
    <indexFieldStorageValueFormatter ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/indexFieldStorageValueFormatter"/>
    <indexDocumentPropertyMapper ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/indexDocumentPropertyMapper"/>
  </myCustomIndexConfiguration>
</indexConfigurations>

Notice here:

the "indexAllFields" : if not true, you need to define the fields in an include (just like the IncludeTemplate, but then with includeField)
the "fieldMap": here we define the field options: storageType, type (can be string, int, guid, date, ...) and if needed an analyzer (all information on analyzers by Adam Conn here)
the IncludeTemplate section where we define the templates of the items to include in the index (the guid is important, the name is useful for understanding the config)

Standard Sitecore fields
Most of the standard Sitecore fields are included automatically.

Some are not, but you can include them (in the example above the SortOrder field is included).

After creating the configuration you should see your index in the Index Manager in Sitecore and you can rebuild it. Check your data in the index with a tool like Luke. This way you are sure your config is good before you start using it. Luke is also handy later on to check your queries.

Computed fields

Computed fields are fields that are added to the index through custom code (the value is computed instead of just fetched from a field). Of course, computed fields can also be added to a custom index.

Querying your index

Sitecore has a class SearchResultItem that can be used to fetch results from the index, but in most cases you will want to extend this class.

Example SearchResultItem:

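using System;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;
using Sitecore.Data;

public class EventItem : SearchResultItem
{
  // Each property is mapped to a field in the index by name
  [IndexField("title")]
  public string Title { get; set; }
  
  [IndexField("startdate")]
  public DateTime StartDate { get; set; }
  
  [IndexField("profile")]
  public ID Profile { get; set; }
}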

We use the SearchContext to do the actual query. Example code:

private IEnumerable<EventItem> GetEventItems()
{
  // Templates that count as "event" (the ID comes from the application settings)
  var templateRestrictions = new List<ID>
  {
    new ID(applicationSettings.EventsTemplateId)
  };

  using (var context = ContentSearchManager.GetIndex("MyCustom_index").CreateSearchContext())
  {
    // OR over all allowed templates: start from False
    var templatePredicate = PredicateBuilder.False<EventItem>();
    templatePredicate = templateRestrictions.Aggregate(templatePredicate, (current, template) => current.Or(p => p.TemplateId == template));

    // Only events starting today or later
    var datePredicate = PredicateBuilder.True<EventItem>();
    datePredicate = datePredicate.And(p => p.StartDate >= DateTime.Today);

    // AND everything together, including the language: start from True
    var predicate = PredicateBuilder.True<EventItem>();
    predicate = predicate.And(templatePredicate);
    predicate = predicate.And(datePredicate);
    predicate = predicate.And(p => p.Language == Sitecore.Context.Language.Name);

    var query = context.GetQueryable<EventItem>(new CultureExecutionContext(Sitecore.Context.Language.CultureInfo)).Where(predicate).OrderBy(p => p.StartDate);
    var queryResults = query.GetResults();
    foreach (var hit in queryResults.Hits)
    {
      // Skip documents without a title
      if (string.IsNullOrEmpty(hit.Document.Title))
      {
        continue;
      }

      yield return hit.Document;
    }
  }
}

We use "predicates" to define our query. I find them useful to create reusable code (not shown here), especially combined with generics. Predicates are created with the PredicateBuilder (use true for "and" and false for "or" queries).

First we define a predicate to check the template ID (from a list of possibilities). We also check a date field, and in the end we have a predicate for the language.

In the example we sort (OrderBy), but the queryable also has options for paging, faceting, ... The result set includes a list of results, but also the facets, the total number of results (important when paging), ...

Make sure that if you sort you are using the correct types. Sorting numbers as string will give you unexpected results..
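
The effect is easy to reproduce outside of the index. The sketch below (plain LINQ, no Sitecore involved; the SortDemo class is just for illustration) sorts the same values once as strings and once as numbers:

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class SortDemo
{
    // Character-by-character comparison, as happens when a number is stored as a string.
    public static string[] SortAsStrings(string[] values)
    {
        return values.OrderBy(v => v, StringComparer.Ordinal).ToArray();
    }

    // Numeric comparison after parsing the values to integers.
    public static int[] SortAsNumbers(string[] values)
    {
        return values.Select(v => int.Parse(v, CultureInfo.InvariantCulture))
                     .OrderBy(v => v)
                     .ToArray();
    }
}
```

For the input "9", "10", "100" the string sort yields "10", "100", "9", while the numeric sort yields 9, 10, 100. This is why the type in the field map and on the SearchResultItem property matters.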

Fetching Sitecore items

It is also important to know that the results are not yet Sitecore items - we get the items we define (our SearchResultItem's). It is however quite easy to fetch the actual Sitecore items here, also using Glass if you want. Be aware though that the part after the index is sometimes the performance bottleneck: you wouldn't be the first to lose all performance benefits from the index by fetching too many Sitecore items or writing a slow Linq query after the search.

Before fetching the real Sitecore items (or Glass-mapped-classes), consider if you really need them. In lots of cases you will, but sometimes the information from the index can be sufficient and you can save even more time not retrieving actual items.

Logs

If your query returns unexpected results, a good place to start looking is the search log file. All queries that are performed are logged there, and if you are using Luke you can copy/paste the query into Luke and test it.

Issues

There are some known issues.. some unknown as well. I have a few open tickets with Sitecore support regarding indexes at the moment, so maybe more posts will follow...

Thursday, November 12, 2015

Delivering instant data to a high traffic Sitecore site

The challenge

We had to deliver data to 12,000 to 15,000 concurrent users on 2 Azure servers running Sitecore (7).
Most of the data was coming from an external source (RESTful services) and could (should) not be cached for longer than 1 second, because it was real-time (sports) scoring information. That data was merged with extra information from Sitecore.

In this post I will describe what we did to achieve this, knowing that we could not "just add some servers". We had an architecture with 2 content delivery servers, 1 content management server and a database server. That could not be changed.

All things described here had some impact, some more than others or on different levels of the application, but I hope it might give you some ideas when facing a similar challenge. The solution was built on Sitecore 7 with WebForms and WebApi.

Coding

Using context classes

Our most visited pages had up to 10 controls showing data (not counting controls for creating grids and so on). All of these controls needed the same set of data, fetched from the URL, the page, ... The worst solution would be to have all that logic in every control. A better (and actually faster) way could have been to put that logic in a class and use that as a base class or call it from all controls.

But to prevent all these controls from going through that process, and possibly making requests to back-end services, we took another approach and created a "context" class, injected per request (using Autofac in our case). The "context" class was called before any control was initialized and prepared all necessary context data for the page, without having to know what controls are on the page, because that would break the whole idea of Sitecore renderings.

Example

We had "team pages". Each team had several pages with different controls on them (player list, statistics, coach info, pictures, ...) and, as it should be in a Sitecore environment, the editors decide what controls they want on each page. But on each team page we need to know the team, and we could already fetch the global team information as this was one object in the back-end systems. Depending on the control this could cover up to 50% of the data needed, but for each control it was one request less. If you have 10 controls on a page, this matters (even if the data would be cached).
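
A minimal sketch of such a per-request context class is shown below. TeamInfo, TeamPageContext and the loadTeam delegate are hypothetical names, and the sketch loads the team lazily on first access; the setup described above prepared the data up front, which would correspond to reading the Team property once at the start of the request. Registering the class per request in the container (Autofac in our case) is left out.

```csharp
using System;

// Hypothetical business object for the global team information.
public class TeamInfo
{
    public string Id { get; set; }
    public string Name { get; set; }
}

// One instance per request; every control on the page receives the same instance.
public class TeamPageContext
{
    private readonly Lazy<TeamInfo> team;

    // loadTeam stands in for the call to the back-end service.
    public TeamPageContext(string teamId, Func<string, TeamInfo> loadTeam)
    {
        team = new Lazy<TeamInfo>(() => loadTeam(teamId));
    }

    // The back-end is hit at most once, no matter how many controls ask for the team.
    public TeamInfo Team
    {
        get { return team.Value; }
    }
}
```

Ten controls reading context.Team then cause a single back-end call instead of ten.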

Caching

Some of the data from Sitecore could be cached. At least, until it was changed in Sitecore. There are lots of posts already out there on how to clear your cache when a Sitecore publish is done, but then you might clear your cache too often. So we created a more granular caching system. Maybe I should write a blog post on this one alone, but in a nutshell it comes down to this: each entry in the cache has a delegate that determines whether the entry should be cleared (and maybe even immediately refilled), and after a Sitecore publish the delegate is called for each cache entry.

This way each cache entry is responsible for clearing itself. In the delegate we can check language, published item, .. If a lot of Sitecore publishes are happening this mechanism can prevent quite some unneeded cache clearances. Of course, one badly written delegate method could kill your publish performance, but if you keep them simple (try to get out of the method as soon as possible) it's worth it.
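
The mechanism could look roughly like the sketch below. It is not our actual implementation: GranularCache and PublishInfo are names invented for this example, refilling an entry is left out, and the OnPublishEnd method is assumed to be called from a publish-end event handler.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical summary of what was published.
public class PublishInfo
{
    public Guid ItemId { get; set; }
    public string Language { get; set; }
}

public class GranularCache
{
    private class Entry
    {
        public object Value;
        public Func<PublishInfo, bool> ShouldClear;
    }

    private readonly ConcurrentDictionary<string, Entry> entries =
        new ConcurrentDictionary<string, Entry>();

    // Every entry brings its own rule that decides when it becomes invalid.
    public void Set(string key, object value, Func<PublishInfo, bool> shouldClear)
    {
        entries[key] = new Entry { Value = value, ShouldClear = shouldClear };
    }

    public bool TryGet(string key, out object value)
    {
        Entry entry;
        var found = entries.TryGetValue(key, out entry);
        value = found ? entry.Value : null;
        return found;
    }

    // Called after a publish; returns the number of entries that were removed.
    public int OnPublishEnd(PublishInfo publish)
    {
        var removed = 0;
        foreach (var pair in entries)
        {
            // Keep the delegates cheap: they run for every entry on every publish.
            if (pair.Value.ShouldClear(publish))
            {
                Entry ignored;
                if (entries.TryRemove(pair.Key, out ignored))
                {
                    removed++;
                }
            }
        }

        return removed;
    }
}
```

An entry that only depends on Dutch content would be stored with a delegate such as p => p.Language == "nl", and would survive every publish in another language.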

Threads

As mentioned before we could not cache the data coming from the external back-end system for any longer than 1 second. The API to the system was built with RESTful services. Calling the services when we needed the data was not an option if we wanted to serve that many users. Our solution was threading. We created a first long-running thread that called the back-end every 5 minutes to see whether there was data to be fetched (this timing was fine, as data started to show at least a day before the actual live games started). When we detected live data coming in, we would start a new thread that fetched the actual data constantly and kept it in memory, available for the whole application (until we detected the end of the data stream and let the thread stop). With the constant loop that fetched the data, mapped it to our own business objects (AutoMapper to the rescue) and sometimes even performed some business logic on it, we were able to keep the "freshness" of the data always under the required 1 second.

So the data was available on our site, in memory, at all times, and the threads of the web application were not harmed as the retrieval itself had its own threads. A monitoring system was put in place to detect the status of the running threads and we included quite some logging to know what was going on, but in the end it all went well and was stunningly fast.
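
The fetch loop can be sketched as below. Note the assumptions: LiveDataPoller is a name made up for this example, it uses a Task with a CancellationToken rather than the dedicated threads we had, and the detection of the start and end of the data stream, the mapping, the monitoring and the logging are all left out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Keeps the most recent snapshot of the back-end data in memory.
public class LiveDataPoller<T> where T : class
{
    private readonly Func<T> fetch;
    private readonly TimeSpan interval;
    private T latest;

    // fetch stands in for the call to the back-end plus the mapping to business objects.
    public LiveDataPoller(Func<T> fetch, TimeSpan interval)
    {
        this.fetch = fetch;
        this.interval = interval;
    }

    // Request threads only read this reference; they never wait for the back-end.
    public T Latest
    {
        get { return Volatile.Read(ref latest); }
    }

    // Runs until the token is cancelled (for example when the data stream has ended).
    public Task RunAsync(CancellationToken token)
    {
        return Task.Run(async () =>
        {
            while (!token.IsCancellationRequested)
            {
                try
                {
                    // Replace the whole snapshot at once so readers never see half an update.
                    Volatile.Write(ref latest, fetch());
                }
                catch (Exception)
                {
                    // Keep serving the previous snapshot; real code would log this.
                }

                try
                {
                    await Task.Delay(interval, token);
                }
                catch (OperationCanceledException)
                {
                    break;
                }
            }
        });
    }
}
```

With an interval well below one second, the age of the snapshot is roughly the interval plus the duration of one fetch, which is how the freshness requirement translates into this loop.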

WebAPI / AngularJS

To deliver the live data on the pages we used AngularJS and WebApi to change the data on the pages without extra page requests. For some of the controls this also enabled us to provide at least some content immediately to the users while fetching the rest.
Other tricks like progressive loading of images brought the amount of data that is initially loaded by the page down to a minimum.

Tuning

Pagespeed optimizations

This is something you should consider on every site, no matter what the traffic will be like. I am talking about bundling and minifying CSS and JavaScript, optimizing images (for Sitecore images, use the parameters to define width and/or height), enabling gzip compression and so on. These are mostly quite easy tasks that can give your site that extra boost.

IIS tuning

Probably a bit less known, and for many sites not necessary, but your IIS web server can be tuned as well.

There is a good article by Stuart Brierly on the topic here.

What we did in particular is change the configs to adapt to the number of processors, allowing more threads and connections. When doing this you need to make sure, of course, that your application can handle those extra simultaneous requests.

We also adapted the number of connections that were allowed to the external webservice (by allowing more parallel connections to the IP-address).
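
One way to raise that limit from code in a .NET Framework application is through the ServicePoint for the back-end address, as sketched below. The BackendConnections helper, the host name in the usage note and the limit itself are only illustrative; the same limit can also be set in configuration.

```csharp
using System;
using System.Net;

public static class BackendConnections
{
    // Raises the number of parallel outgoing connections allowed to one back-end address.
    public static ServicePoint AllowParallelConnections(Uri backendBaseUri, int maxConnections)
    {
        var servicePoint = ServicePointManager.FindServicePoint(backendBaseUri);
        servicePoint.ConnectionLimit = maxConnections;
        return servicePoint;
    }
}
```

It would be called once at application start, for example BackendConnections.AllowParallelConnections(new Uri("http://backend.example.com/"), 24), with the value chosen based on load tests.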

These changes made sure that IIS did not put connections on hold while we still had some resources left on the server.

This looks like a small change, but it has quite some impact, and you will need to perform load tests to verify your changes.

Infrastructure

Servers

As said in the beginning we could not add more servers, but we did upscale the servers as far as we could.

Caching

The last step to the solution was adding caching servers (Varnish) in front of the solution. This could free the web servers from a lot of requests that could easily be cached. Here the use of WebApi to load some data also helped to get quite some requests cached: this can be resources like JavaScript files or images, but also complete pages. If you can serve these requests without them going all the way to your web server, your server has a smaller request queue and more resources to handle the remaining requests.

This last step does not come for free, but it had a huge impact once configured properly.