We're always looking for ways to improve performance without scaling horizontally by adding new hardware.  Recently there's been a lot of buzz about caching, specifically memcached and Microsoft's Velocity.  These are distributed caching technologies ideally suited to server clusters.  This post isn't about them.  Instead it's about plain ol' regular caching: the kind you implement as you move from a small web-based project to a medium one, before a distributed cache is something you'd reach for.  The techniques here are incredibly simple to implement, but the results are astounding.

Our system handles tens of thousands of connections per day, with each client creating a transaction that involves querying a database, retrieving data, and then packaging that data up to transmit to the client.  The whole thing is fairly processor-intensive during peak operating hours.  Caching was "low hanging fruit," to quote John, and we knew that implementing it would drastically improve performance.  Here's a shot of our database server's processor usage just before we implemented caching, as generated by PAL:

Before Caching

Yikes!  You may not be able to tell from the picture, but those are the four cores sampled at 10-minute intervals, peaking at 100%.  There were additional optimizations to be made at the OS and software level, so this is an extreme case; however, none of those optimizations account for what we got out of caching.

The problem: You have a data set that is retrieved by numerous clients regularly, or you have large chunks of data that are retrieved by numerous clients.  The data is not volatile - you can certainly cache volatile data, but that involves more extensive work synchronizing the data store and the cache.  This even works with smaller chunks of data - especially those read-only lookup tables you use for UI labels like state, gender, part number descriptions, whatever - but the gain grows with the size of your data set.  Our particular problem involves large chunks of data.

The solution: When data is retrieved from the database, cache that data in the application server's memory so the database doesn't have to be queried for the data again. Store the data in a structure that allows it to be easily retrieved.

For this example, let's assume you are caching a statistical report which contains oodles of data, images, etc. (like a PAL report).  Your clients connect to the server periodically and retrieve one or more of these reports at least once, if not several times, during the day.  The shape of the query may change - for example, one user may request 3 reports, then another may request 2 of those reports and 2 other reports not yet cached.  In any case we are still executing queries against the database to figure out which reports to return, but instead of returning all that data we simply return IDs, and we cache our reports on the application server, keyed by ID.

using System;
using System.Web.Caching;

public class CacheService
{
   private readonly Cache _cache;
   private static CacheService _instance;

   private CacheService(Cache httpCache)
   {
      _cache = httpCache;
      HitCount = 0;
   }

   public T Get<T>(int id, ICacheFulfiller<T> fulfiller)
      where T : new()
   {
      var ttl = new TimeSpan(1, 0, 0); // absolute expiration of one hour
      var ret = default(T);
      var key = CreateKey(typeof(T), id);
      var obj = _cache.Get(key);
      if (obj == null)
      {
         // Cache miss: have the fulfiller retrieve the data, then cache it
         ret = fulfiller.Get(id);
         if (ret != null) // Cache.Insert throws on a null value
            _cache.Insert(key, ret, null,
              DateTime.UtcNow.Add(ttl),
              Cache.NoSlidingExpiration);
      }
      else
      {
         // Cache hit: no database work required
         ret = (T)obj;
         HitCount++;
      }
      Total++;

      return ret;
   }

   public int HitCount { get; private set; }
   public int Total { get; private set; }

   // Note: not thread-safe as written; a real implementation should
   // lock around the instance creation
   public static CacheService GetInstance(Cache httpCache)
   {
      if (_instance == null)
         _instance = new CacheService(httpCache);
      return _instance;
   }

   private static string CreateKey(Type t, int id)
   {
      return t + "_" + id;
   }
}

The sample CacheService above is implemented as a singleton and stores cached items in the ASP.NET Cache, with keys built from the type name of the cached item plus its unique integer identifier (for the sake of this example let's assume we're using auto-incrementing integer primary keys in the database, and that the queries I talked about above return zero or more IDs when executed).  HitCount and Total are used simply for gathering performance statistics; the places that benefit most from caching will end up with a 99% ratio of HitCount (number of actual cache hits) to Total (number of total requests) over time.  We also have an absolute expiration of 1 hour with no sliding expiration; it's a good idea to expire your cached content because it could change, and you don't want to have to recycle the app pool over a change that isn't critical enough to re-cache the moment it's made.

So how are misses - those items that have not yet made it into the cache - handled?  We want to handle them internally, as part of the cache, so you don't have to check whether an item is cached, retrieve it if not, and then cache it yourself.  We also don't want to entangle the retrieval logic for every cacheable item with our cache, so what we end up with is an ICacheFulfiller<T>:

public interface ICacheFulfiller<T> where T : new()
{
   T Get(int id);
}

And what, exactly, is an ICacheFulfiller<T>?  Essentially any service-level class you use to retrieve your data can become one.  Going back to our fictional report retrieval example (assume we have a Report class which stores all of the data about a report, including its images):

public class ReportService : ICacheFulfiller<Report>
{
   public Report Get(int id)
   {
      Report r = null;
      // TODO get the report from the database and
      // return a Report instance mapped to the data
      return r;
   }
}
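
Putting the pieces together, here's a minimal usage sketch (the report IDs below are placeholders for whatever your lookup query actually returns, and this assumes it runs inside an ASP.NET request so HttpContext.Current is available):

// A minimal usage sketch - requires using System.Web;
var cache = CacheService.GetInstance(HttpContext.Current.Cache);
var reportService = new ReportService();

foreach (var id in new[] { 1, 2, 3 }) // stand-ins for the queried IDs
{
   // Served from memory on a hit; fulfilled by ReportService on a miss
   var report = cache.Get<Report>(id, reportService);
   // ... package the report up for the client ...
}

// The hit ratio you'd expect to creep toward 99% over time
var hitRatio = cache.Total == 0 ? 0 : (double)cache.HitCount / cache.Total;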

You can actually nix the entire ICacheFulfiller<T> idea and rely on Func<T> instead, as seen in Steve Smith's more concise Cache Access Pattern Revised article.  I like the contractfulness of interfaces, though the usefulness of Func<T> is not lost on me.
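
If you go that route, a Func-based overload of Get might look something like this sketch (an illustrative addition to CacheService, not part of the original service):

// Hypothetical Func-based overload: the caller supplies the retrieval
// logic as a delegate instead of implementing ICacheFulfiller<T>
public T Get<T>(int id, Func<int, T> fulfill) where T : new()
{
   var key = CreateKey(typeof(T), id);
   var obj = _cache.Get(key);
   Total++;
   if (obj != null)
   {
      HitCount++;
      return (T)obj;
   }

   var ret = fulfill(id);
   if (ret != null) // Cache.Insert throws on a null value
      _cache.Insert(key, ret, null,
        DateTime.UtcNow.AddHours(1),
        Cache.NoSlidingExpiration);
   return ret;
}

Calling it is then just cache.Get<Report>(42, reportService.Get), or an inline lambda like id => LoadReport(id).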

So by implementing this relatively easy pattern, our tens of thousands of transactions were hitting the cache at nearly a 99% ratio, which took the CPU utilization of the database server from nearly 100% to...

After Caching

Those are once again ten-minute intervals over the same window of time (peak operating hours), and that's right: utilization is staying at around 10%.  The anomaly at the end is a backup job or report generation; whatever it is, it didn't have to compete with our primary operation to do its work.

So what about processor and memory utilization on the application (web) server? Aren't we just shifting some of the workload? Well, processor usage actually didn't go up - it sat at around 10%, or, more informatively, pre-cache and post-cache performance was the same - and while memory utilization increased by about 100 MB for the cache, it also decreased by about 100 MB (or more) because far less data mapping had to occur.

In our case we're still executing pretty complex queries against the database for every single transaction being made (a transaction being between a client and the server), but what the results say is that returning just integer IDs, instead of all the columns of joined tables as a single result set, is faster - not by a bit, but by orders of magnitude.  My point isn't that a recordset of integers is less intensive; it's that the actual data retrieval is extremely intensive, especially in bulk, and caching, when viable, alleviates that.
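
As a rough sketch of what that looks like in practice (the table and column names here are invented for illustration, and cache/reportService come from the usage sketch above), the per-transaction query returns only IDs, and each ID is resolved through the cache:

// Requires using System.Collections.Generic; and using System.Data.SqlClient;
// The lookup query returns only the report IDs the client should receive.
var ids = new List<int>();
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
   "SELECT s.ReportId FROM Subscriptions s WHERE s.ClientId = @clientId",
   conn))
{
   cmd.Parameters.AddWithValue("@clientId", clientId);
   conn.Open();
   using (var reader = cmd.ExecuteReader())
      while (reader.Read())
         ids.Add(reader.GetInt32(0));
}

// Each ID is resolved from the cache rather than by a second, much
// heavier query that joins and returns every column of every report.
var reports = new List<Report>();
foreach (var id in ids)
   reports.Add(cache.Get<Report>(id, reportService));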

In the code above you'll note that I'm using the ASP.NET Cache from the HttpContext.  The reason for this is the already-implemented expiration behavior (among other things), but as an additional treat there is the AspAlliance CacheManager which, while written some years ago, works perfectly fine today.  Just keep in mind that if you are using IIS7 on Vista or Server 2008 you're not adding its handler to the httpHandlers section, but to the system.webServer/handlers section instead.
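
Something along these lines - note that the handler path, type, and assembly names below are placeholders, so check the CacheManager documentation for the real values:

<!-- Pre-IIS7 registration (httpHandlers) - placeholder names throughout -->
<system.web>
  <httpHandlers>
    <add verb="*" path="CacheManager.axd"
         type="HandlerTypeName, AssemblyName" />
  </httpHandlers>
</system.web>

<!-- IIS7 on Vista / Server 2008 - register under system.webServer instead -->
<system.webServer>
  <handlers>
    <add name="CacheManager" verb="*" path="CacheManager.axd"
         type="HandlerTypeName, AssemblyName" />
  </handlers>
</system.webServer>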