XING Devblog

Browser caching – Why is it not a good standalone solution?


Also available is the ‘memory cache’, which resides in the system’s volatile main memory (RAM) and merely serves to improve the surfing experience. Pages that appear in the browser history are held there only temporarily (the data is lost on power-off). The same applies to assets from SSL connections, which are not stored in the disk cache because they may pose a security risk. In Mozilla browsers the disk cache takes up 50MB by default, whereas Microsoft browsers are not transparent in this regard and, as expected, leave the handling of the memory cache to the operating system.

Browsers that support HTML5 and session storage also have an ‘offline disk cache’, which is comparable to a SQLite database and can be operated via JavaScript. In general, this cache is about 500MB in size. Google Mail uses this feature to make the application available offline.

In terms of caching, XING focuses on optimizing ‘disk cache’ utilization, as the ‘memory cache’ cannot be used in a targeted manner and is not a sound basis for long-term cache optimization. We also do not consider the specific use of offline disk caching to be effective, due to a lack of general support.

The ideal client caching state can be seen in the following figure:

Ideal state of a page request

As all assets are already cached by the client, no requests for them are sent to the server; only the HTML content and tracking pixels are reloaded. At most, individual assets are checked for freshness by means of a conditional request that the server answers with 304 (Not Modified). This keeps traffic, the number of requests, and download time to the bare minimum.
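
The 304 check can be illustrated with a minimal sketch. It assumes ETag-based validation; the function name `respond`, the ETag value and the 12 KB asset are hypothetical and only serve the illustration, they are not taken from xing.com.

```python
from typing import Optional, Tuple


def respond(current_etag: str, if_none_match: Optional[str], body: bytes) -> Tuple[int, bytes]:
    """Return (status, payload) for a request to a single asset."""
    # A client only sends If-None-Match when it already holds a cached copy.
    if if_none_match is not None and if_none_match == current_etag:
        return 304, b""  # cached copy is still valid: no body is transferred
    return 200, body     # no cached copy, or a stale one: full download


logo = b"x" * 12_000  # hypothetical 12 KB image

first = respond('"v1"', None, logo)     # empty cache
repeat = respond('"v1"', '"v1"', logo)  # primed cache, asset unchanged

print(first[0], len(first[1]))    # 200 12000
print(repeat[0], len(repeat[1]))  # 304 0
```

Even in the 304 case a round trip to the server still takes place, which is why the ideal state avoids the request altogether wherever possible.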

But can this state actually be achieved in a real-life scenario?

Currently, at least 50% of our PCs run Internet Explorer 8, Opera and Firefox 2/3.x browsers, and this number is expected to rise (see the following figure).

Percentage distribution of the top 5 browsers in use at XING

In their standard delivered state, these browsers have a disk cache of 50MB (Opera 20MB). At first glance this seems to be enough as an entire page request hardly ever exceeds 500KB.
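
As a rough back-of-the-envelope check of that first impression, the following sketch computes how many completely uncached page requests of 500KB fit into the default caches. It assumes 1MB = 1024KB; the function name `pages_that_fit` is made up for this example.

```python
PAGE_REQUEST_KB = 500  # upper bound for one full page request, taken from the text


def pages_that_fit(cache_mb: int, page_kb: int = PAGE_REQUEST_KB) -> int:
    """Number of complete page requests that fit into a cache of the given size."""
    return (cache_mb * 1024) // page_kb


print(pages_that_fit(50))  # 102 (50MB default)
print(pages_that_fit(20))  # 40  (Opera, 20MB default)
```

Roughly a hundred full page loads sounds generous, but this only holds as long as nothing else competes for the same space.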

However, a number of key aspects need to be taken into consideration:

  1. Most users never clear their cache.
  2. A rising number of (Web 2.0) websites that keep growing in size, coupled with well-documented instructions available to everyone on the web (e.g. from Steve Souders), means that operators constantly have to work on squeezing as much efficiency out of user caches as possible.
  3. YouTube, last.fm and other streaming sites can fill up the cache, sometimes with just one video or stream.
  4. Viewed simplistically, the cache is a kind of “least recently used” (LRU) list, i.e. files that have not been used for a week may be dropped if other files come in in the meantime.
  5. With some browsers, only the newer versions cache SSL content like that of xing.com (Firefox version 3.0 and above), and only if the cache header is ‘public’.
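
Points 3 and 4 can be made concrete with a small simulation. This is a deliberately simplified sketch of a pure LRU cache, not the eviction logic of any real browser; the class `LruDiskCache`, the URLs and all sizes are hypothetical.

```python
from collections import OrderedDict


class LruDiskCache:
    """Toy disk cache that evicts the least recently used entries first."""

    def __init__(self, capacity_kb: int) -> None:
        self.capacity_kb = capacity_kb
        self.entries: "OrderedDict[str, int]" = OrderedDict()  # url -> size in KB
        self.used_kb = 0

    def get(self, url: str) -> bool:
        """Return True on a cache hit and mark the entry as recently used."""
        if url not in self.entries:
            return False
        self.entries.move_to_end(url)
        return True

    def put(self, url: str, size_kb: int) -> None:
        """Store an entry, evicting least recently used entries to make room."""
        if url in self.entries:
            self.used_kb -= self.entries.pop(url)
        if size_kb > self.capacity_kb:
            return  # larger than the whole cache: not stored at all
        while self.used_kb + size_kb > self.capacity_kb:
            _, evicted_kb = self.entries.popitem(last=False)  # oldest entry first
            self.used_kb -= evicted_kb
        self.entries[url] = size_kb
        self.used_kb += size_kb


cache = LruDiskCache(capacity_kb=50 * 1024)  # 50MB default disk cache
for i in range(100):
    cache.put(f"site/asset-{i}.png", 400)    # 100 hypothetical site assets, 400KB each
cache.put("video/stream.flv", 45 * 1024)     # one hypothetical 45MB stream

surviving = sum(cache.get(f"site/asset-{i}.png") for i in range(100))
print(surviving)  # 12
```

In this toy setup a single large stream pushes 88 of the 100 previously cached assets out of the cache, so the next visit to the site starts with an almost empty cache.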

The continual development of the web means that the ratio of kilobytes to page view is likely to rise, as confirmed by our in-house statistics based on the xing.com website.

Growing amounts of content and optimized caching headers (see 2. + 3.) will make user disk caches fill up faster than ever. When a user’s cache is full (see 1.), the browser deletes data that should actually be retained.
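
For readers unfamiliar with such headers, here is a minimal sketch of what an ‘optimized’ caching header for a static asset might look like, including the ‘public’ directive mentioned in point 5. The helper `caching_headers` and the one-year lifetime are assumptions for illustration, not XING’s actual configuration.

```python
def caching_headers(max_age_seconds: int, public: bool = True) -> dict:
    """Build a Cache-Control header for a long-lived static asset."""
    visibility = "public" if public else "private"
    return {"Cache-Control": f"{visibility}, max-age={max_age_seconds}"}


ONE_YEAR = 365 * 24 * 60 * 60  # assumed far-future lifetime in seconds

headers = caching_headers(ONE_YEAR)
print(headers)  # {'Cache-Control': 'public, max-age=31536000'}
```

The longer assets are allowed to live in the cache, the more space they occupy over time, which is exactly the pressure described above.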

A combination of the ‘total time of the file in the cache’ and ‘last use’ then determines which files are deleted first.

So what is the upshot of this? Most users don’t have the pleasure of fully retrieving assets from the browser cache. Instead, the assets have to be requested again from the server, which increases loading time by at least 40%. When measured precisely, the number of visitors to the site with an empty or partially empty cache is surprisingly high.

Our own JavaScript-based measurements showed that 45% to 62% of users had an ‘empty cache experience’ while visiting xing.com, which is a surprisingly poor value.

See the green line in the following figure for more information.

Image caching rate at xing.com

Our measurements correlate with those of a study conducted by Yahoo which used a backend-based method and returned an ‘empty cache experience’ of around 40% to 60%.
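
Neither measurement method is described in detail here. One way a backend-based measurement could work, sketched under that assumption, is to serve a small probe asset that the browser always revalidates and to count how often the server has to answer with a full 200 instead of a 304. The function `empty_cache_rate` and the sample data below are hypothetical.

```python
from typing import Iterable


def empty_cache_rate(probe_statuses: Iterable[int]) -> float:
    """Share of probe requests answered with a full 200 response."""
    # Only 200 (no cached copy) and 304 (cached copy present) are meaningful here.
    statuses = [s for s in probe_statuses if s in (200, 304)]
    if not statuses:
        raise ValueError("no probe requests to evaluate")
    return statuses.count(200) / len(statuses)


# Hypothetical status codes logged for the probe asset, one per page view.
sample = [200, 304, 200, 304, 304, 200, 200, 304, 200, 304]

rate = empty_cache_rate(sample)
print(f"empty cache experience: {rate:.0%}")  # empty cache experience: 50%
```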

Conclusion: Well-designed caching has been proven to have a positive effect on user experience and is a genuine must both now and in the future. However, it is not the only area we need to focus on, as too many factors outside our control can quickly lead to the entire client cache being cleared out, so that the user has to download all of the assets again. To this end, strategies such as minimizing the code and modularization need to be implemented together with lazy loading. Efficient caching and minimal requests are the only combination that can guarantee a positive user experience in the long term.


About the author

Bjoern Kaiser works as a Frontend Engineer at XING.
He is responsible for Performance & Architecture.



5 thoughts on “Browser caching – Why is it not a good standalone solution?”

  1. Pingback: 10 steps to make your site cacheable | Frontend Force Blog

  2. “Our own JavaScript-based measurements showed that 45% to 62% of users had an ‘empty cache experience’ while visiting xing.com”

    Could you make this publicly available? I’d love to be able to put those on some sites to get an idea of end-users cache. I’m sure others would as well.

  3. Hi,

    Good point!
    we are planning to release this very piece of code on github as soon as the documentation is done.
    When this has happened we will write a blog post here.

    Thx
    Björn

  4. There are so many reasons why you should avoid browser caching with your site, and I believe you list all of them. As browsers continue to change and new ones are introduced you are asking for new problems down the road. As each browser re-defines their cache options and requirements you might find yourself constantly fixing your utilization strategy just to catch up.
