
May 30, 2008

Why Caching is so Old-School

"Please describe how your solution accelerates WAN transfers by caching files and emails." I review many of the RFPs that Riverbed receives from prospective customers, and this is an example of a question that I see from time to time. I struggle with how to respond. To me, this question is like asking a jet engine manufacturer to describe how their piston-driven propeller engines work. Or, in the networking field, asking Cisco to describe how their token-ring switches work. How would you respond to a question like that?

Sure, caching approaches can work well in some environments, particularly when dealing with static data. Today, web caches are in widespread use, and they have been effective in optimizing delivery of static web pages. Blue Coat, one of our competitors, continues to find a good market for web proxies that cache static web pages found on public Internet sites such as ebay.com or salesforce.com. In a similar way, there continues to be a good market in piston-driven propeller engines for small aircraft, many decades after the invention of the jet aircraft engine.

But the fact is that caching has encountered insurmountable problems when dealing with dynamic data in collaborative enterprise environments, where new data is constantly being created and shared with others. Those internal enterprise environments, not the contents of websites on the public Internet, are the chief focus of WDS solutions. Enterprise applications not only read the contents of files, but also create new files and change the contents of existing ones. Data coherency and integrity problems are a common experience when using WDS solutions that cache files, particularly when the files are being accessed and shared by two or more remote sites.

However, there are other reasons why caching as a technical approach is self-defeating when applied to WDS solutions. A cache by definition only supports the application that it is designed for. A file cache can only cache files. A web cache can only cache web objects. An email cache can only cache emails. There are many applications besides these, so I need a cache for each one of them. Of course, a vendor wouldn't expect their customers to deploy so many boxes, each with an individual cache, so all of these caches must be combined into one platform. What we end up with is a patched-together Frankenstein of a device with a separate cache for each application, where each cache requires its own application-specific configuration. The result is often an enormously complicated product with many configuration steps. That is why many have found Blue Coat's SGOS caching product to be so difficult to configure. On the other hand, some competitors have such difficulty engineering their caching products that their application-specific support is very limited. For example, it's been more than 3 years since Cisco introduced their WAFS solution, and they STILL can only cache one application protocol--CIFS.

The application-specific nature of caches spawns yet another problem--a cache stores application-specific data that is not deduplicated, what we might call "exploded" data. For example, a file cache stores and represents file data no differently than a typical Windows or Unix file server. And because there is no deduplication of the file data, the file cache consumes copious amounts of storage. If I save an existing file under a new filename, then the file cache must store the same data a second time under the new name, even though the contents of the renamed file are identical to the first file.
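
To make the storage cost concrete, here is a minimal Python sketch. It is not taken from any vendor's product: `FileCache` and `DedupStore` are hypothetical names invented for illustration. The first keeps one full copy per filename, the way the file cache described above does. The second keys data by a SHA-256 digest of its contents, so identical data is held only once.

```python
import hashlib


class FileCache:
    """Hypothetical name-keyed cache: every filename holds a full copy."""

    def __init__(self):
        self.files = {}  # filename -> contents

    def put(self, name, data):
        self.files[name] = data

    def stored_bytes(self):
        # Each name is charged for its full contents, duplicates included.
        return sum(len(d) for d in self.files.values())


class DedupStore:
    """Hypothetical content-keyed store: identical contents are held once."""

    def __init__(self):
        self.blobs = {}  # SHA-256 digest -> contents
        self.names = {}  # filename -> digest

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # store only if not seen before
        self.names[name] = digest

    def get(self, name):
        return self.blobs[self.names[name]]

    def stored_bytes(self):
        return sum(len(d) for d in self.blobs.values())


data = b"x" * 1000  # stand-in for the contents of a document

cache = FileCache()
cache.put("report.doc", data)
cache.put("report-copy.doc", data)  # "Save As" under a new name

dedup = DedupStore()
dedup.put("report.doc", data)
dedup.put("report-copy.doc", data)

print(cache.stored_bytes(), dedup.stored_bytes())
```

Saving the same 1,000 bytes under two names is charged twice in the name-keyed cache and once in the content-keyed store.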

That is why a caching-based solution to facilitate a server-consolidation project is a self-defeating endeavor. I may have 300GB of data in my Windows file servers deployed at a given branch office; deploying a file caching-based WDS solution such as Cisco WAAS requires that I have at least 300GB of storage capacity in my branch office WAAS device. By removing my Windows file servers from the branch office and replacing them with Cisco's WAAS product, all I have accomplished is migrating my data from my Windows file servers to a proprietary Cisco device. I'm puzzled as to what the value is in doing that.

The Riverbed WDS solution does not use caching. Steelhead appliances generally do not store application-specific data. Rather, the Steelhead solution only stores raw byte-level data in deduplicated format, and this data store is application-agnostic. This is new. This is revolutionary, as revolutionary as the invention of the jet aircraft engine was in the field of aviation. Riverbed's approach is completely different from the old-school caching approaches that many of our competitors use.
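
As a rough illustration of what an application-agnostic, deduplicated byte store could look like, consider the Python sketch below. This post does not describe how Steelhead actually segments or indexes data, so everything here is an assumption made for brevity: the `ByteStore` name, the fixed 64-byte chunks, and the use of SHA-256 digests as references are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 64  # assumed fixed chunk size, chosen only to keep the demo small


class ByteStore:
    """Hypothetical store that sees only bytes, never files or emails."""

    def __init__(self):
        self.chunks = {}  # SHA-256 digest -> chunk bytes

    def encode(self, data):
        """Split data into chunks, store unseen ones, return the digests."""
        refs = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).digest()
            self.chunks.setdefault(digest, chunk)  # duplicate chunks are skipped
            refs.append(digest)
        return refs

    def decode(self, refs):
        """Rebuild the original bytes from a list of digests."""
        return b"".join(self.chunks[ref] for ref in refs)

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = ByteStore()

# Two payloads from different "applications" that share one 64-byte region.
file_bytes = b"A" * 64 + b"B" * 64
mail_bytes = b"B" * 64 + b"C" * 64

file_refs = store.encode(file_bytes)
mail_refs = store.encode(mail_bytes)

print(len(file_bytes) + len(mail_bytes), store.stored_bytes())
```

The store never learns whether a payload was a file or an email. The region the two payloads share is kept once, and `decode` still returns each payload intact.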

Comments

Ok. So you're caching at the transport layer instead of the application layer? And your cache is compressed and deduplicated? Still seems like "caching" in the general sense is the applicable verb. Claiming otherwise is just marketing hype.

Well, that would be a valid point if you were to describe the Riverbed approach as a "transport cache" or a "byte cache." But I wouldn't necessarily call it marketing hype, because "caching" in the traditional sense of the term refers to an application-specific mechanism. For Riverbed, the core principle of how the product works is that the Steelhead data store holds application-neutral data. If we call it a "cache", then people are not going to understand the application-neutral aspect of the data store.

Fair enough.

Caching is caching.

It's like calling a car a car.

Yes, there are different types, but at the level most people understand these products, it's a cache, just at the bit level.

Yes, a Steelhead does more than caching. However, in the industry, accelerators are known for this basic feature above all others, and it's your job and ours not to take it personally and to educate your customers without sounding condescending, which is how this post comes off.

Don't take it personally; it's just tech.

Hi Mead,

I suppose someone else can accuse Riverbed of being just a compression device, right? After all, isn't what we're doing just transforming the raw byte-level data into a small amount of metadata, which the device on the other side of the WAN then transforms back into the original raw data? Sounds like simple compression to me, so why not call Riverbed just another compression device?

You see, calling Riverbed just a caching device oversimplifies things. It's like calling a motorcycle the same thing as a bicycle. Yes, they both have the same objective of transporting someone from point A to point B, but the manner in which they work is completely different.

We have a right and obligation to explain to our customers how the Riverbed product is different from a caching device. There are profound differences, as outlined in my original blog, that include impacts on scalability and data integrity compared to caching-based products such as Cisco WAAS and Blue Coat's SGOS.

As far as the condescending tone you perceive in my post, that probably has more to do with the fact that you work for a competitor. Unfortunately, there's nothing I can do about how you feel. But I am certain not everyone shares your feelings, and that includes our other competitors who also don't use caching-based approaches.

Best regards,
Josh

I think you miss the point. I understand yours but apparently you take your work too seriously.

I am not going to argue with a blogger.

Just to share: I am going to be a large customer (making my first big purchase this month), and I find this post and your responses disgusting.

You're welcome to email me, and I'll send you my phone number if you want to discuss specifics.

Mead,

My apologies for anything I said that was inappropriate. I do have competitors posting comments to my blogs (see the discussion on VoIP), and I sometimes lose track of who is who.

As I said to Jayerandom, I'm fine if Riverbed is described as a caching device as long as that description is qualified. But without that qualification, one could easily mistake Riverbed for a file cache or web cache, in the traditional meaning of the term "cache".

I hope you can understand why I take this topic seriously. Having been at Riverbed since we first introduced Steelhead in 2004, I have personally struggled to explain to countless customers and technologists why our products are different. Back then, we were unique, and there were no reference products from competitors that could serve as a mental model of how we worked. "Oh, you're just like Tacit's file cache, right?" Well, not quite... Then when Cisco acquired Actona's file cache, explaining why we were different became even more important.

Best regards,
Josh Tseng
