One thing I’m finding, and I may have mentioned this before, is that lazy loading and distributed caching don’t play nice.
Over the past few months I’ve been identifying heavily accessed objects in our application that seldom change and introducing caching for them. Until recently that caching was accomplished using the built-in ASP.NET in-memory cache. In tandem with this caching effort we’ve grown our website from a single-node web server to an N-node web farm, and with that comes the need for a distributed cache to keep the various nodes’ caches in sync.
To accomplish distributed caching we’ve been using NCache, which is a great tool, very cool. However, one of the major differences between in-memory caches and distributed caches is that an in-memory cache holds a direct reference to the object, while a distributed cache works with serialized copies of it. If you weren’t already familiar with the problem, you’re probably starting to see why that’s an issue for lazy loading.
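The reference-versus-copy distinction is the crux of everything that follows, so here’s a minimal sketch of it. Our app is .NET, but the mechanics are the same in any language with serialization; this uses Python’s `pickle` as a stand-in for whatever wire format the distributed cache uses, and `Product` is a made-up example class:

```python
import pickle

class Product:
    def __init__(self, name):
        self.name = name

# In-memory cache: stores a direct reference to the live object.
memory_cache = {}
product = Product("widget")
memory_cache["p1"] = product
assert memory_cache["p1"] is product        # same object, same memory

# Distributed cache: the object is serialized on the way in and
# deserialized on the way out, so every caller gets a copy.
wire_bytes = pickle.dumps(product)          # what crosses the network
copy = pickle.loads(wire_bytes)             # what a caller receives
assert copy is not product                  # a distinct object...
assert copy.name == product.name            # ...with the same data
```

The copy carries only whatever state was populated at serialization time, which is exactly where lazy loading gets into trouble.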
Many of the objects in our application use lazy loading. With an in-memory cache this isn’t an issue: the lazy-loaded properties are loaded on demand on the first call, and because the cache holds the same object instance, they’re already there for every subsequent caller. In a distributed cache, however, the object is a copy. If the object is placed into the cache before the lazy-loaded properties are accessed, which is generally the case, every caller gets a copy of the object without the lazy-loaded data, and each copy must then load its properties on demand. So while the top-level object may be cached, none of the lazy-loaded child objects are.
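To make that concrete, here’s a toy sketch of the failure mode, again in Python rather than our actual C#. `Order` and its `items` property are hypothetical; the `print` stands in for a database round trip that every node ends up repeating:

```python
import pickle

class Order:
    def __init__(self, order_id):
        self.order_id = order_id
        self._items = None  # lazy: not loaded until first access

    @property
    def items(self):
        if self._items is None:
            print(f"hitting the database for order {self.order_id}")
            self._items = ["item-a", "item-b"]  # stand-in for a DB query
        return self._items

order = Order(42)

# Cached before the lazy property is ever touched -- so the serialized
# form carries _items == None.
cached_bytes = pickle.dumps(order)

# Every caller deserializes a copy without the lazy data, and each
# copy pays for its own database hit.
copy = pickle.loads(cached_bytes)
assert copy._items is None   # the lazy-loaded data didn't travel
copy.items                   # triggers another "database" hit
```

With an in-memory cache the first caller’s load would have populated the one shared instance; here the load happens once per retrieval instead.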
My solution to this, thus far, has been to force the eager loading of those child objects before placing the objects in the cache. It does mean that for an object with a deep graph there’s a fair amount of analysis to make sure all the significant child objects are eager loaded when cached, but lazy loaded otherwise. If this isn’t done properly, and I’ve been burned by it several times, switching from an in-memory cache to a distributed cache can result in significant performance degradation due to the reduced amount of caching taking place.
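The eager-load-before-cache idea can be sketched like so, continuing the hypothetical `Order` example from above; `cache_order` is an illustrative helper, not anything from NCache’s API:

```python
import pickle

class Order:
    def __init__(self, order_id):
        self.order_id = order_id
        self._items = None  # lazy by default

    @property
    def items(self):
        if self._items is None:
            self._items = ["item-a", "item-b"]  # stand-in for a DB query
        return self._items

def cache_order(order):
    # Touch the significant lazy children so their data is populated
    # and travels with the serialized object.
    _ = order.items
    return pickle.dumps(order)

cached_bytes = cache_order(Order(42))
copy = pickle.loads(cached_bytes)
assert copy._items == ["item-a", "item-b"]  # children arrived pre-loaded
```

The analysis burden mentioned above is deciding which properties belong in that “touch before serializing” step: too few and you’re back to per-copy loads, too many and you’ve dragged the whole graph into every cache entry.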
Some might argue that caching objects with large or deep graphs is a bad idea and the source of my woes, but it works so naturally with an in-memory cache that it’s hard to pass up. I just wish it were as natural with a distributed cache.