Monday, January 4, 2021

Sitecore media cache headers on pdf extension

MediaCache headers on PDF extension

MediaResponse.MaxAge

Based upon a recommendation from Google, we set the MediaResponse.MaxAge setting to 365.00:00:00. This sets the Cache-Control header for media from the Sitecore MediaLibrary to "public, max-age=31536000", which is very good for images within the site. The header tells browsers that cached content younger than max-age seconds can be used without consulting the server - it should be used for content that doesn't change.
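
For reference, a minimal sketch of how such a setting is typically patched in a Sitecore include file. The file itself and its placement (for example somewhere under App_Config/Include) are assumptions; only the setting name and value come from the text above.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Format is d.hh:mm:ss, so this is 365 days = 31536000 seconds -->
      <setting name="MediaResponse.MaxAge">
        <patch:attribute name="value">365.00:00:00</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```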

But we also have other media types within the MediaLibrary, and today I am focusing on one particular type: PDF documents. We wanted different cache settings for pdf documents because we ran into a situation where a pdf file did change and only the internet-gods know where the old document was cached - it was not that easy to get rid of. So, I would like to be able to alter the cache settings for certain media types.


SXA and the MediaRequestHandler 

Note that I am using SXA. The Sitecore eXperience Accelerator makes our developer lives so much easier in various parts of the Sitecore development process, and this is one of those occasions. If you have SXA, you have a Sitecore.XA.Foundation.MediaRequestHandler.MediaRequestHandler which overrides the default Sitecore.Resources.Media.MediaRequestHandler, and this handler introduces 2 new pipelines. We could try to override this code again, but checking the pipelines seems like the better option. The first one is the <mediaRequestHandler> pipeline, which contains some processors from SXA. The second one is much more interesting for me here and is called <mediaRequestHeaders>. This pipeline is called just after the default media headers are set and can be used to alter them. By default it is empty (it doesn't even exist in the configuration) - let's add a processor to it.

MediaRequestHeaders pipeline

Adding a processor to the pipeline is rather easy:

<mediaRequestHeaders>
    <processor type="Feature.Caching.Pipelines.MediaRequestHeaderProcessor, Feature.Caching" />
</mediaRequestHeaders>
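
To show where this fragment lives, here is a sketch of a complete patch file around it. The assumption is that the pipeline sits under the regular <pipelines> node like other Sitecore pipelines; the processor type is the one used above.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <!-- Assumption: the node is created by this patch, as it does not exist by default -->
      <mediaRequestHeaders>
        <processor type="Feature.Caching.Pipelines.MediaRequestHeaderProcessor, Feature.Caching" />
      </mediaRequestHeaders>
    </pipelines>
  </sitecore>
</configuration>
```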

Now I just need to write some code for that processor:

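using System;
using System.Web;
using Sitecore.Diagnostics;
// MediaRequestHeadersArgs ships with the SXA MediaRequestHandler assembly;
// add a using for its namespace as well.

namespace Feature.Caching.Pipelines
{
    public class MediaRequestHeaderProcessor
    {
        public void Process(MediaRequestHeadersArgs args)
        {
            Assert.ArgumentNotNull(args, nameof(args));
            var media = args.Media;
            var cache = args.Context.Response.Cache;

            // Only touch "application/*" mime types, such as application/pdf.
            if (media.MimeType.StartsWith("application", StringComparison.OrdinalIgnoreCase))
            {
                cache.SetMaxAge(TimeSpan.FromMinutes(15));                // max-age=900
                cache.SetCacheability(HttpCacheability.ServerAndPrivate); // private
                cache.AppendCacheExtension("no-cache");                   // revalidate before reuse
            }
        }
    }
}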

The code provided here is just an example of what it could be; you can do anything you want in here. Our processor gets the media item and the cache context from the provided arguments - we don't have to fetch any extra data, which is nice. We check the MimeType of the media item and act accordingly. Note again that we can implement any logic for any media item here, based on any type of information we can get from the media item (type, extension, …).
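
As an illustration of extension-based logic, here is a small sketch of a lookup that a processor could consult. MediaCacheRules and its entries are hypothetical and not part of Sitecore or SXA; the 15 minutes for pdf simply mirrors the processor above, and the docx entry is only there to show a second rule.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Sitecore or SXA): maps a file extension
// to the max-age a processor could apply for that media type.
public static class MediaCacheRules
{
    // Example values only; extensions are matched case-insensitively.
    private static readonly Dictionary<string, TimeSpan> MaxAgeByExtension =
        new Dictionary<string, TimeSpan>(StringComparer.OrdinalIgnoreCase)
        {
            { "pdf", TimeSpan.FromMinutes(15) },
            { "docx", TimeSpan.FromMinutes(15) }
        };

    // Returns true and the configured max-age when an override exists;
    // returns false when the default MediaResponse.MaxAge should stay in place.
    public static bool TryGetMaxAge(string extension, out TimeSpan maxAge)
    {
        maxAge = TimeSpan.Zero;
        if (string.IsNullOrWhiteSpace(extension))
        {
            return false;
        }

        // Accept both "pdf" and ".pdf".
        return MaxAgeByExtension.TryGetValue(extension.Trim().TrimStart('.'), out maxAge);
    }
}
```

Inside Process, this could be called with the extension of the media item (assuming the media object exposes one) and the result passed to cache.SetMaxAge. Keeping the rules in one lookup means adding a media type does not require touching the processor logic.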

Using the cache context we can set all the header information we want. As an example in the code I am setting the max-age, the cacheability and I'm appending a custom extension. This will result in a cache header like "private, no-cache, max-age=900". 

Such a header lets the browser keep the result with a max-age of 900 seconds, makes sure no intermediate (proxy) servers can cache it, and tells the browser not to trust its local copy without the server's agreement. The no-cache addition doesn't mean the response can't be cached; it means a browser must check ("revalidate") with the server before using the cached resource. For our pdf documents, this sounds reasonable.


Conclusion

SXA made my life easier once again 😃

The extra mediaRequestHeaders pipeline made it very easy to include any business logic I want in order to alter the cache headers that are being sent with media requests.


Thanks to Sitecore StackExchange - this answer from Richard pointed me in the right direction.


4 comments:

  1. We use the append revision id setting to solve the pdf caching issue. Will that help in your case?

    1. That is also an option - sounds like a good idea to combine that.

  2. Replies
    1. My test team and customer confirm that the headers are set, so yes - that works.
