Friday, January 5, 2018

A Sitecore 9 upgrade story

Sitecore 9 upgrade

I was asked to upgrade a Sitecore 8.2 site to the new Sitecore 9. The site includes:

Sitecore 9 - © @jammykam
  • xDB with custom facets, outcomes, ...
  • WFFM with custom save actions (e.g. write data to the custom facets)
  • custom indexes (on Lucene)
  • extra publishing target
  • Sitecore Powershell Extensions
  • ...
For the upgrade we decided to use the default upgrade tools as explained in the Sitecore Upgrade guide which can be found on the developer network site.

We did bump into a few issues - some of them our fault, others maybe not.. :)
But I thought it might be a good idea to share the experience - the upgrade is not fully finished yet so I might be adding stuff later.
First tip of the day: stay awake and follow the install guide thoroughly. During the process we sometimes forgot a (small) step, which resulted in errors later on - easily fixed by doing that step, but it takes much longer because you first need to figure out what's wrong..

The upgrade package

The analysis of the upgrade package gave us a lot of warnings. Most of them could be ignored; some of them concerned code: our own code and the Powershell Extensions. We decided to ignore these, as this was a test and we wanted to know whether the upgrade would succeed without removing that code. Of course, we had removed all configs that might break stuff (like the ones for our custom WFFM code - as you need to disable the standard WFFM configs as well). We had no complaints about configs - those were clean :)

The installation of the package went well.. almost. At the very end it gave an error:
"An error occured while copying files. Some files might not have been updated."

Inspection of the log files showed us that a file could not be deleted and we had to finish manually - luckily this was in the post-installation steps, so we can assume that the installation/upgrade itself was done:
ERROR:System.Exception: Could not find a part of the path '...\sitecore\shell\Applications\Social\Wizards'.
ERROR:An error occured while copying files. Some files might not have been updated.
You must manually run the following batch file to complete the installation ...\temp\__Upgrade\Upgrade_20171121T163720573\process.bat
Details: [s]Sitecore.Update.Installer.Exceptions.PostStepInstallerException: Error has occurred during file installation. 
at Sitecore.Update.Installer.Items.PostStepInstaller.Process(IProcessingContext entry, IProcessingContext context)

We examined the other logs and it appears that this path in the Social folder was already deleted during the installation so why delete it again? Anyway, we ran the batch file and proceeded.

WFFM

Our site includes WFFM, so we had to update this as well. This is done after the installation of the upgrade package and before the installation of xConnect. We did make a small mistake though.. The upgrade guide does mention updating your Solr setup prior to updating any modules. But as we didn't have any Solr to start with, we assumed (or we just weren't thinking) that we could upgrade WFFM and still keep Lucene as our search provider - this is still supported on a standalone instance without xDB, so that should work...

But we got: 
System.AggregateException: One or more errors occurred. Sitecore.Update.Installer.Exceptions.CriticalInstallationException: Critical installation exception occurred. System.AggregateException: One or more exceptions occurred while processing the subscribers to the 'packageinstall:items:ended' event

We do have a custom index though.. and the syntax for Lucene indexes changed slightly (so we learned here). This caused our upgrade process to fail miserably. After changing the search provider to Solr (including our custom index configuration) the upgrade worked.
Tip: fix your (custom) indexes when the upgrade guide tells you to "update" Solr

Solr

When switching to Solr, do test your queries. This seems obvious of course, but we noticed that Solr behaves slightly differently from Lucene in some cases. This is not really an upgrade issue, but as more people will be switching to Solr when upgrading to Sitecore 9 it might be worth mentioning. One issue we had was a query that retrieves lots of entries: it worked fine with Lucene but was terribly slow with Solr - getting the results in batches resolved this.
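To illustrate the batching idea, here is a minimal, language-neutral sketch in Python (not the code we used on the project). It assumes a hypothetical `fetch_page(start, rows)` function that wraps the actual search call; in Solr terms those two arguments correspond to the `start` and `rows` query parameters. The function name, the batch size and the in-memory stand-in data are all made up for the example.

```python
from typing import Callable, Iterator, List


def fetch_in_batches(
    fetch_page: Callable[[int, int], List[dict]],
    batch_size: int = 500,
) -> Iterator[dict]:
    """Yield all results by requesting fixed-size pages until a short page comes back."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = 0
    while True:
        # fetch_page is hypothetical: it stands for one search request
        # limited to `batch_size` results, beginning at offset `start`.
        page = fetch_page(start, batch_size)
        yield from page
        if len(page) < batch_size:
            break  # last (possibly empty) page reached
        start += batch_size


if __name__ == "__main__":
    # Stand-in for a real index: 1,234 fake documents held in memory.
    documents = [{"id": i} for i in range(1234)]

    def fake_fetch_page(start: int, rows: int) -> List[dict]:
        return documents[start:start + rows]

    results = list(fetch_in_batches(fake_fetch_page, batch_size=500))
    print(len(results))
```

The right batch size depends on the index and the documents, so it is worth measuring a few values rather than trusting the 500 used here.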

Installing xConnect

xConnect is new, so it has to be installed rather than upgraded. We had done this before so that would be a piece of cake. The prerequisites on the server(s) were ok and all was set to run SIF (the Sitecore Install Framework). All went fine and the webdeploy was running and.. boom.  The database deployment went wrong because the SQL user we had defined in the json configuration already existed.

Hmz.. indeed. One of the steps during the upgrade is deploying new databases (ExperienceForms and Processing.tasks). And I assumed (yes, assumptions.. it was not in the document) I had to add the SQL user to those databases, which probably is indeed the case. I first thought that was the cause of the error, but it was not (adding the user was still a good idea). Apparently we also had to remove the user from the Marketingautomation, ReferenceData and Processing.pools databases.

Tip: make sure your SQL user is not attached to the 3 mentioned databases before installing xConnect
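One way to verify this up front is to look for the user in each database's `sys.database_principals` view. The small Python helper below only builds the T-SQL text so it can be reviewed and then run in your SQL tool of choice; it is a sketch, and the user name and database names in the usage part are hypothetical placeholders (your databases will carry your own prefix).

```python
from typing import List


def bracket(name: str) -> str:
    """Quote a SQL Server identifier with brackets, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def user_check_statements(user: str, databases: List[str]) -> List[str]:
    """Build one query per database that returns a row if the user exists there."""
    literal = user.replace("'", "''")  # escape single quotes for the string literal
    return [
        f"SELECT name FROM {bracket(db)}.sys.database_principals "
        f"WHERE name = N'{literal}';"
        for db in databases
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in user_check_statements(
        "xconnect_user",
        ["MySite_MarketingAutomation", "MySite_ReferenceData", "MySite_Processing.Pools"],
    ):
        print(statement)
```

Any query that returns a row points at a database where the user still has to be removed before running SIF.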

Yes, we are good to go! The webdeploy managed to finish this time.

Well, almost.. because this time I did something really silly (just before holidays people do silly things): I tried to install xConnect in my "current" folder. This causes the installer to fail, as it can no longer access its own log files because they are "in use". So:
Tip: do not install xConnect in your "current" folder

Starting windows services

We also had an issue starting the Windows services installed with xConnect. This meant we had to re-run the installation with a custom json, just to perform the tasks that come after starting the services. The reason the services didn't start is probably related to our environment, but I'll mention it in case it helps someone: the "local service" user that runs the services had no access rights to the folders where the jobs are located (..\App_data\jobs\continuous\..). Just add those rights and test by starting the services manually.

After care

We have a running site now. Time to take a look at the logs ;)

Issue 1 : Sql Exceptions

Exception: System.Data.SqlClient.SqlException
Message: Could not obtain information about Windows NT group/user 'DOMAIN\...', error code 0x5.
Source: .Net SqlClient Data Provider
We restored our databases on a SQL 2016 instance before the upgrade, so we already matched the required version. According to Sitecore this is not necessary - you can upgrade and switch afterwards as well. The issue was not that we switched beforehand, but that we restored the databases on the SQL server with our Windows accounts, which left those accounts as the database owner. We had to change the db owner to a dedicated SQL user to fix this.
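In SQL Server the owner of a database can be changed with `ALTER AUTHORIZATION ON DATABASE::... TO ...`. As a rough sketch (not the exact script we ran), the Python snippet below generates that statement for a list of databases; the database names and the login name in the usage part are hypothetical, so replace them with your own.

```python
from typing import List


def quote_identifier(name: str) -> str:
    """Quote a SQL Server identifier with brackets, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def change_owner_statements(databases: List[str], new_owner: str) -> List[str]:
    """Build one ALTER AUTHORIZATION statement per database."""
    owner = quote_identifier(new_owner)
    return [
        f"ALTER AUTHORIZATION ON DATABASE::{quote_identifier(db)} TO {owner};"
        for db in databases
    ]


if __name__ == "__main__":
    # Hypothetical database and login names, for illustration only.
    for statement in change_owner_statements(
        ["MySite_Core", "MySite_Master", "MySite_Web"], "sitecore_owner"
    ):
        print(statement)
```

Note that the new owner must be an existing login on the server, and that a login which is already mapped to a user inside the database may need that mapping removed first.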

Issue 2 : Path analyzer errors

Path analyzer errors. Quite a few of those in the logs...  We checked the /sitecore/admin/PathAnalyzer.aspx page and noticed indeed that we had issues - some maps did not get deployed. We checked those in Sitecore, and noticed that our marketer friends for some reason deleted some standard goals. One of them was "Login" and this was now required. Packaging the missing goals from another instance, installing them and deploying the maps made the issues disappear. 

Remember that the upgrade process starts with running scripts that restore deleted Marketing Taxonomies and Marketing Definitions but that does not include any missing goals..

Publishing target

For our extra publishing target we had to add some (new) configuration as well.

xDB Data Migration

This one is still under investigation... we successfully ran the migration tool but are now facing some aggregation errors in the logs, and we seem to be missing some data... Will update if applicable ;)


Next steps

  • Rewriting our custom facet code, ...  might be another post.
  • Try this all over again when update-1 is released :)
  • Try this all over again on another project with all our lessons learned...

4 comments:

  1. How do we handle this step? I tried removing the affected dll but the wizard gave an error, and if I leave them the upgrade will overwrite them... how do we handle this? Thank you

    The wizard only analyzes .dll files that are stored in the \bin folder.
    To resolve these warnings:  Remove all the affected .dll files from the \bin folder.
    Or  If you are sure that the affected code is not used during Sitecore startup or in important Sitecore operations such as item management, security management, file management, or in regular website requests that require Sitecore functionality, you can ignore the warnings.

    Replies
    1. For solving specific issues like this, these comments are not really the best place. I can highly recommend Sitecore StackExchange: https://sitecore.stackexchange.com - ask your question there and you will have access to a large number of specialists to help you.

  2. Hey Gert, thank you for the article. Any updates on the xDB migration tool? It would be awesome if you could update the article.

    Replies
    1. We ran the migration tool again on an update-1 environment and had errors during the migration. Our problem was that our model inherited from Sitecore.XConnect.Collection.Model.CollectionModel.Model instead of Sitecore.DataExchange.Tools.XdbDataMigration.Models.DataMigrationCollectionModel. Apparently that last one is needed during the migration.
