Monday, January 12, 2009

How does mod_cache work?

The upcoming Helicon Ape release (coming very soon) will include the mod_cache module. As promised in our previous article, here is a more thorough description of how mod_cache operates.

mod_cache starts working

mod_cache first comes into play after the authentication/authorization events but before the request handler is executed. At this stage the module does the following:
  • checks whether a cached response may be used for the current request
  • if so, generates a key and looks up a cached response by this key
  • if the response is found in the cache, returns it to the client and ends request processing; the request handler is not invoked.
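
As a rough illustration, the lookup stage above could be sketched in Python as follows. This is not Helicon Ape's actual code: the function name, the dict-based cache holding (expiration time, response) pairs, and the is_cacheable and make_key callables are all hypothetical and exist only to show the order of the steps.

```python
import time


def try_serve_from_cache(request, cache, is_cacheable, make_key, now=None):
    """Return a cached response, or None if the request handler must run.

    `cache` is assumed to be a dict mapping key -> (expires_at, response).
    """
    now = time.time() if now is None else now

    # Step 1: may a cached response be used for this request at all?
    if not is_cacheable(request):
        return None

    # Step 2: generate the key and look the response up.
    key = make_key(request)
    entry = cache.get(key)
    if entry is None:
        return None

    expires_at, response = entry
    if expires_at <= now:
        # The entry has outlived its expiration time; drop it.
        del cache[key]
        return None

    # Step 3: cache hit, so the request handler is never invoked.
    return response
```

A return value of None means processing continues as usual and the request handler runs.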

Cacheable or not cacheable: request check


A response may be cached if the request meets all of the following requirements:
  • request method is GET
  • request does not contain an Authorization header
  • Cache-Control request header must not contain no-cache. This condition is ignored if CacheIgnoreCacheControl On is used
  • Pragma request header must not contain no-cache. This condition is ignored if CacheIgnoreCacheControl On is used
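
The request check can be expressed as a small predicate. The sketch below is only an illustration of the four rules listed above; the function name and the ignore_cache_control parameter (standing in for CacheIgnoreCacheControl On) are hypothetical.

```python
def is_request_cacheable(method, headers, ignore_cache_control=False):
    """Apply the request-side rules to an HTTP method and a header dict."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}

    if method.upper() != "GET":
        return False
    if "authorization" in h:
        return False

    if not ignore_cache_control:
        cache_control = [d.strip().lower() for d in h.get("cache-control", "").split(",")]
        pragma = [d.strip().lower() for d in h.get("pragma", "").split(",")]
        if "no-cache" in cache_control or "no-cache" in pragma:
            return False

    return True
```

For example, is_request_cacheable("GET", {"Pragma": "no-cache"}) is False, while the same call with ignore_cache_control=True is True.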

mod_cache attempts to save response

Once the request handler has completed its job and all defined filters have been applied to the response, mod_cache steps in again. At this stage the module does the following:
  • determines whether the response is cacheable
  • checks whether CacheEnable is set for this request
  • generates the cache key
  • calculates how long the response is to be stored in the cache (absolute expiration time)
  • saves the response in the cache under this key
Cacheable or not cacheable: response check


The following conditions are considered when deciding whether a response is cacheable (all of them must be met):
  • request method is GET
  • response status is 200 (200, 203, 300, 301 or 410 in Apache)
  • Expires response header, if present, contains a valid date in the future
  • response contains an expiration time (i.e. Expires or Cache-Control: max-age=XX headers), an ETag header or a Last-Modified header. This condition is ignored if CacheIgnoreNoLastMod On is used
    • if request has a QueryString, only responses containing an expiration time (i.e. Expires or Cache-Control: max-age=XX headers) are cached. This condition is ignored if CacheIgnoreQueryString On is used
  • Cache-Control request header must not contain no-store. This condition is ignored if CacheStoreNoStore On is used
  • Cache-Control response header must not contain private. This condition is ignored if CacheStorePrivate On is used
  • request does not contain an Authorization header (Apache still caches such a response if its Cache-Control contains s-maxage, must-revalidate or public)
  • Vary response header does not contain "*".
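
To make the combination of rules easier to follow, here is a hedged Python sketch of the response check. It rests on several assumptions that are ours, not confirmed details of Helicon Ape: the Expires rule is applied only when the header is present, the CacheStoreNoStore rule is read as referring to Cache-Control: no-store in either the request or the response (as Apache documents it), and private is looked for in the response. The Apache-specific status codes and the Authorization exception are not modelled. All function and parameter names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _directives(value):
    """Split a Cache-Control value into a set of lowercase directive names."""
    return {p.split("=")[0].strip().lower() for p in value.split(",") if p.strip()}


def _parse_http_date(value):
    """Parse an HTTP date; return None if it is malformed."""
    try:
        dt = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def is_response_cacheable(method, has_query, status, req_headers, resp_headers, now,
                          ignore_no_last_mod=False, ignore_query_string=False,
                          store_no_store=False, store_private=False):
    req = {k.lower(): v for k, v in req_headers.items()}
    resp = {k.lower(): v for k, v in resp_headers.items()}
    req_cc = _directives(req.get("cache-control", ""))
    resp_cc = _directives(resp.get("cache-control", ""))

    if method.upper() != "GET" or status != 200:
        return False
    if "authorization" in req:
        return False
    if "*" in [v.strip() for v in resp.get("vary", "").split(",")]:
        return False

    # Assumption: an Expires header is only checked when it is present.
    has_expires = False
    if "expires" in resp:
        expires = _parse_http_date(resp["expires"])
        if expires is None or expires <= now:
            return False
        has_expires = True

    has_expiry = has_expires or "max-age" in resp_cc
    has_validator = "etag" in resp or "last-modified" in resp
    if not (has_expiry or has_validator) and not ignore_no_last_mod:
        return False
    if has_query and not has_expiry and not ignore_query_string:
        return False

    # Assumption: no-store in either direction blocks storing.
    if "no-store" in (req_cc | resp_cc) and not store_no_store:
        return False
    if "private" in resp_cc and not store_private:
        return False
    return True
```

Note how a response carrying only an ETag passes for a plain URL but is rejected once the request has a query string, because the query string rule requires an explicit expiration time.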

Cache key generation

The response is saved in the cache under a key. This key includes:
  • normalized (canonical) request URI without QueryString or, in case of a proxy request, normalized proxy request URL;
  • all QueryString parameters and their values in alphabetical order (default behavior)
    • CacheIgnoreQueryString On directive excludes request parameters from the cache key
    • CacheVaryByParams param1 param2 ... directive defines which parameters are included in the cache key
  • all request headers specified in the CacheVaryByHeaders header1 header2 ... directive. Headers are not included in the cache key by default.
  • if the response contains a Vary header, all request headers listed in it are included in the cache key.
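
The following Python sketch shows one way such a key could be assembled. The key layout (parts joined with "|"), the function name and its parameters are assumptions made for illustration; they mirror the directives above but say nothing about the real internal format. URI normalization is reduced to taking the path as is.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def make_cache_key(url, request_headers, vary_by_params=None,
                   ignore_query_string=False, vary_by_headers=(),
                   response_vary=""):
    """Build a cache key from the URI, query parameters and selected headers."""
    parts = urlsplit(url)
    key_parts = [parts.path or "/"]

    if not ignore_query_string:
        params = parse_qsl(parts.query, keep_blank_values=True)
        if vary_by_params is not None:
            # Hypothetical stand-in for CacheVaryByParams: keep only listed names.
            params = [(k, v) for k, v in params if k in vary_by_params]
        # Sorting makes ?a=1&b=2 and ?b=2&a=1 map to the same key.
        key_parts.append(urlencode(sorted(params)))

    headers = {k.lower(): v for k, v in request_headers.items()}
    names = {h.strip().lower() for h in vary_by_headers}
    # Request headers named in the response's Vary header are added as well.
    names |= {h.strip().lower() for h in response_vary.split(",") if h.strip()}
    for name in sorted(names):
        key_parts.append(f"{name}={headers.get(name, '')}")

    return "|".join(key_parts)
```

With this sketch, make_cache_key("/page?b=2&a=1", {}) and make_cache_key("/page?a=1&b=2", {}) produce the same key, so the order of parameters in the URL does not create duplicate cache entries.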

When cached response dies

An HTTP response is stored in the cache for a specific period of time, which is computed in the following way:
  • If the response contains an Expires header whose value is valid and does not refer to the past, the cached response is stored until the time specified in it.
  • If the response contains a Cache-Control header with either max-age=X or s-maxage=X, the cached response is stored in the cache for X seconds.
  • If the response contains a Last-Modified header, the cached response is stored in the cache until: expiry date = date + min((date - lastmod) * factor, maxexpire), where date is the current date, lastmod is the value of the Last-Modified header, factor is a float value set via the CacheLastModifiedFactor directive (default value = 0.1), and maxexpire is the value set via the CacheMaxExpire directive (default value = 86400 seconds = 1 day).
  • If mod_cache is unable to calculate the expiration date using any of the methods above (this happens when the response has no Expires, Cache-Control or Last-Modified header but does have an ETag header), the storage period defaults to 1 hour, which may be changed using the CacheDefaultExpire directive.
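
The calculation can be sketched in Python like this. The order in which the rules are tried simply follows the list above and is an assumption, as are the function and parameter names; the defaults are the ones quoted for CacheLastModifiedFactor, CacheMaxExpire and CacheDefaultExpire.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def _parse_http_date(value):
    """Parse an HTTP date; return None if it is malformed."""
    try:
        dt = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def compute_expiry(headers, now, last_modified_factor=0.1,
                   max_expire=86400, default_expire=3600):
    """Return the absolute time until which a response would be kept."""
    h = {k.lower(): v for k, v in headers.items()}

    # 1. A valid Expires header that lies in the future.
    expires = _parse_http_date(h["expires"]) if "expires" in h else None
    if expires is not None and expires > now:
        return expires

    # 2. Cache-Control: max-age=X or s-maxage=X (first one found wins here).
    match = re.search(r"\b(?:s-maxage|max-age)\s*=\s*(\d+)",
                      h.get("cache-control", ""), re.IGNORECASE)
    if match:
        return now + timedelta(seconds=int(match.group(1)))

    # 3. Heuristic based on Last-Modified.
    lastmod = _parse_http_date(h["last-modified"]) if "last-modified" in h else None
    if lastmod is not None and lastmod <= now:
        age = (now - lastmod).total_seconds()
        return now + timedelta(seconds=min(age * last_modified_factor, max_expire))

    # 4. Nothing usable (e.g. only an ETag): fall back to the default period.
    return now + timedelta(seconds=default_expire)
```

For instance, a response last modified exactly one day ago and carrying no other caching headers would be kept for 86400 * 0.1 = 8640 seconds (2 hours 24 minutes), while one modified a year ago would hit the 1 day cap.
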
All this may look a little confusing at first glance, but in reality it is a well-composed and efficient scheme, and our upcoming article will convince you of this.
