They said "it" couldn't be done. They said nobody else's array could do "it" – that only their array architecture could handle "it." They said all kinds of things about how "it" was going to bring the demise of Symmetrix, because Symmetrix would never do "it." Even if we could do "it," they said, we wouldn't – but mostly, they said we couldn't.
But they were wrong. VERY wrong.
Today EMC announced "it" is now available on VMAX. And then EMC went one better than they ever imagined – taking "it" further than they have managed to, even after the 8+ years they have been shipping "it."
And of course, they will try to deflect from the fact that they now have DIRECT competition from another array vendor who has implemented "it" – highlighting the history of EMC bashing "it," as if that matters any more. As I have noted before, being "first" is only important until there is a second – then all that matters is which implementation is better. And so they will childishly act as if first means best, perpetually.
Have you guessed what "it" is yet?
More importantly, do you know who “they” are?
Read on to see what they never expected…and should have feared...
Yup. THOSE guys tried to convince users that EMC would never implement what they call “array-based virtualization.” They made crazy videos featuring zombies and even Mr. T in their lame attempts to discredit EMC’s network-based virtualization approach instead of “it.” Humorous though they may have been, word is that these marketing shenanigans ultimately led to the replacement of at least one generation of their marketing organization.
But the news this week isn’t simply that EMC has introduced array-based VMAX Federated Tiered Storage (a descriptive name we prefer over simple “storage virtualization” for reasons I will explain in this post).
No, the news is that we’ve upped the ante on what it means to leverage external capacity “behind” an array, in several key ways:
And finally, while VMAX FTS has not yet been qualified with as many external arrays as they currently support, EMC innovation is already at work. EMC is encouraging users to self-verify FTS interoperability with their external arrays and to submit the results to EMC's eLab for ratification. The expectation is that FTS will "just work" with most storage – the product has been architected around common FC personality models. Working with our users, we expect to round out the support matrix quite rapidly over the coming months.
Of this list of differentiated features, I assert that item #2 above is the most important.
Long the Prime Objective of Symmetrix, Data Integrity should never be taken for granted. This is why EMC implemented a more robust variant of T10 DIF in the VMAX – to verify that the data blocks we get back from a disk or flash drive have not been corrupted. The VMAX DIF also verifies that each block actually comes from the specified location (LBA).
As I have noted before, silent data corruption is a reality of modern disk drives that most people aren’t aware of. It can happen on any drive, at any time, and unless the array implements checks and validations, these drive errors can be passed on to applications totally without notice.
So, for FTS, VMAX calculates CRC checksums for the data blocks that we write, and we keep them with the global metadata maintained within the VMAX. Although we do not write additional bytes to the external array, the checksum implementation can detect a huge percentage of what would be otherwise overlooked errors. If the data read is bad, FTS will attempt to reread the data multiple times before returning the standard SCSI error code for a failed read.
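The write-verify-reread mechanism just described can be sketched in a few lines. To be clear, this is an illustrative sketch only – the names (`write_block`, `read_block`, `MAX_REREADS`, and the external-array callbacks) are hypothetical and bear no relation to any actual Enginuity interface, and a simple CRC-32 stands in for whatever checksum VMAX actually computes:

```python
import zlib

# Hypothetical constant: the real number of rereads is an implementation detail.
MAX_REREADS = 3

class SilentCorruptionError(Exception):
    """Raised when rereads are exhausted without a checksum match
    (analogous to returning the standard SCSI error for a failed read)."""

def write_block(metadata, lba, data, write_external_lba):
    # Compute a CRC over the block and keep it in array-side global metadata;
    # note that nothing extra is written to the external array itself.
    metadata[lba] = zlib.crc32(data)
    write_external_lba(lba, data)

def read_block(metadata, lba, read_external_lba):
    # Verify every read against the stored CRC, rereading before giving up.
    for _ in range(MAX_REREADS):
        data = read_external_lba(lba)
        if zlib.crc32(data) == metadata[lba]:
            return data
    raise SilentCorruptionError(f"LBA {lba}: checksum mismatch after rereads")
```

The key design point the sketch illustrates: because the checksum lives in VMAX metadata rather than on the external array, a drive (or downstream array) that silently returns the wrong bits is caught at read time, instead of passing corrupt data up to the application.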
Now, they have been known to argue that the FC spec includes a per-frame CRC that does the same thing, but that's misleading. The FC frame CRC ensures that the data received from the source is unmodified in transit. But if the drive returns bad data to the array and it goes undetected by the array, the FC frame CRC will serve to ensure that the bad data is still bad…it cannot correct for silent data corruption that happens before the data is transmitted.
"Storage Virtualization" is a very general term. Taken literally, it simply describes the foundation of all storage arrays – take the various storage components controlled by the array (disks, DRAM, NAND, etc.), and present them to hosts as LUNs that behave as if they were in fact real, physical SCSI drives. This is pretty much what EMC invented over 23 years ago with its first external storage array.
So, rather than co-opt the term for yet another variation of "storage component" (i.e., external arrays), EMC chose to name the feature descriptively. Federated Tiered Storage is what you get when you put storage arrays behind a VMAX: "Federated" because the arrays are now working together to support the storage needs of the hosts, and "Tiered" because the external storage has different operational, performance and reliability characteristics from the internal storage – which is pretty much how Flash SSD, Enterprise HDD and Nearline HDD are differentiated within the array.
This notion of Federation – the cooperation and aggregation of disparate resources towards a common objective (from the BG dictionary) – is the foundation of how EMC is tackling the demands of the hybrid cloud.
FTS joins Federated Live Migration (FLM), a feature introduced last year to enable seamless and non-disruptive migrations of LUNs during Symmetrix-to-Symmetrix tech refreshes. Prior to the introduction of FLM, they would try to position storage virtualization as the only non-disruptive answer to tech refresh – when in fact, it still requires a disruption to insert the virtualization layer in between the hosts and the “old” storage…not to mention the fact that an outage is also required to tech refresh their array-based virtualization implementation.
Fact is, FLM is currently a Symmetrix-to-Symmetrix solution; FTS may thus be attractive to some for heterogeneous environments, and it does in fact support non-disruptively moving LUNs between any arrays within the host VMAX’s domain of capacity. VPLEX is another solution that is non-disruptive after the initial insertion into the I/O stream.
Like FTS, VPLEX delivers far more than simple migrations – it is the only solution that delivers a true high-availability active/active LUN presentation that remains HA through a site outage or failure. But more on that in another EMC World-week related post.
The immediate focus of FTS is not migrations, but to support the utilization of external storage as another tier under the control of VMAX. FTS can be a powerful component for improving both operational and capital efficiency in customer environments. Being able to standardize on the robust Symmetrix suite of industry-defining storage services is very attractive, especially given the tremendous improvements that have been made in simplicity and ease of use in the latest release of Enginuity 5876.
Oh, and there will be more value-add built upon the foundation of FTS coming in the future.
I just can’t talk about that quite yet.
Admittedly, FTS represents an expansion of EMC's earlier approach to integrating storage resources. I think of it as transformational – an evolution of strategy towards a much broader objective than simple migration challenges. In fact, FLM and FTS represent two of the first steps towards the seamless federation of storage assets into a coherent, scalable, distributed and manageable pool of information storage resources. We at EMC see this notion of federation as the requisite foundation of the hybrid cloud.
In closing, I must admit that my persona (the storage anarchist) has long argued against the very notion of storage virtualization. In retrospect, I disagreed mostly with the limited focus (migrations) and the lack of data integrity in most everyone’s implementation. FTS addresses these concerns and expands the use cases, so I now have to admit…
Brian Gallagher, President of EMC’s Enterprise Storage Division, gave his keynote address today at EMC World 2012. (If you were unable to make it to Las Vegas, you can watch the video here.)
In his keynote, Brian spoke about how enterprises of all sizes are increasingly seeking to leverage cloud technology to meet their constantly expanding IT demands. But, he noted, the cloud demands of enterprise computing aren’t adequately addressed by most of the current public cloud service providers.
No, enterprise IT requires that their clouds deliver the same continuous availability, predictable performance, assured data integrity, and security that they currently enjoy from their own internal data centers. And in fact, the lack of such "High QoS clouds" has slowed cloud adoption by enterprises globally.
That’s all changing – transforming, if you will – thanks in no small part to EMC’s relentless focus on cloud computing. In his keynote, Brian talked about how customers are building out their next-generation data centers around hybrid clouds, and (more importantly) how the new product announcements made by ESD on Monday are laser-focused on delivering enterprise-class service levels to the hybrid cloud.
From the incredible scalability of the new (Powerful. Trusted. Smart. and Efficient.) VMAX Family and radically improved simplicity and automation of VMAX administration, to the revolutionary high-availability active/active distributed data infrastructure uniquely delivered by VPLEX, to the glimpse into technologies that will help to dissolve distance to reduce the effect of latency on remote data centers, the biggest takeaway from Gallagher’s talk is that big enterprises no longer have any excuses. It is time to transform to the hybrid cloud.
And we just might have a few things to help accelerate your transformation…
The new VMAX Family, anchored by the VMAX 40K, is tailor-made for the hybrid cloud. The flagship 40K scales to more than 2X the capacity and up to 3X the performance of the nearest competitor. In fact, HALF a VMAX 40K (4 engines) is faster than the top-of-the-line Hitachi VSP or IBM DS8800. To compete with a full-blown VMAX 40K, competitors are going to have to quote multiple arrays, even though that will by definition increase the complexity (and reduce the flexibility) as compared to the VMAX Family.
I won’t bore you with an exhaustive list of the VMAX 40K features and such; you can get the details over at the new VMAX Family pages on EMC.com. That said, here are a few highlights on the 40K:
Combined with Enginuity 5876, VMAX 40K is designed to rein in the complexity of the hybrid cloud. It helps you simplify and consolidate the management of your storage resources. You can use Federated Tiered Storage (FTS) to subordinate lesser storage systems behind the VMAX 20K or 40K, leveraging the full power of Symmetrix market-defining functionality and simplified management to extend the life and value of your pre-existing storage assets. You can even use external capacity as a tier with FAST VP, something no competitor currently supports. And so you can rest without fear of silent data corruption, FTS incorporates active data integrity verification to double-check that the external arrays actually return the same data that we wrote to them…another innovation that others haven’t figured out yet.
With RecoverPoint for VMAX, you can leverage Continuous Data Protection (CDP) to scroll back to virtually any point in time in the event of a failure or data corruption, and you can replicate to/from VPLEX and/or VNX platforms without the complexity of host- or fabric-based splitters.
Go even further, and put VPLEX Metro in front of the VMAX or VNX families (or deploy it with your VBlock), and you can deploy the only distributed active/active virtual storage that is
The world’s largest Enterprise-class storage array, plus the only array-based heterogeneous replication solution, plus the ONLY HA active/active solution. Surround that with the power of the rest of EMC’s portfolio, from #1 dedupe backup to the #1 scale-out NAS to the #1 unified storage platform…
…everything you need to
Transform your IT, your Business and Yourself!
Just a quick post to update readers with some behind-the-scenes perspectives on today’s events here at EMC World 2012.
The day here started with the release of 9 press releases covering the announcement of 42 new products. These were followed by a series of press briefings, led off by Pat Gelsinger and followed by the division presidents, each covering their announcements.
Then there was the mad dash as more than 15,000 people proceeded to the main ballroom to hear Joe Tucci and Pat Gelsinger’s keynote presentations. While these were also simulcast and available for later viewing, I can assure you that nothing can hold a candle to actually being there – imagine a screen that is actually wider than an American football field, driven by ELEVEN widescreen projectors, providing a wrap-around view. Now, project onto this ultimate widescreen a star field from the perspective of a spaceship travelling through space and time (complete with a Store Trek theme), and you get perhaps a tiny fraction of the live experience. I was sitting in the back, and I watched people actually lean in their chairs as the starship banked into turns.
Maximum wow factor, to be sure.
The keynote presentations weren’t bad, either!!!
For me, the rest of the day was filled with 1-1 briefings with analysts, customers and press…and there will be more tomorrow.
I am purposefully NOT discussing the VMAX, VPLEX and RecoverPoint announcements just yet. Brian Gallagher will be covering these tomorrow in his SuperSession keynote. If you are here at EMC World, you won’t want to miss that, as Brian has really amped it up another notch this year with customer testimonials, videos and yet another episode of “Brian. Brian Gallagher.” Once his session is done, I’ll start rolling out some posts providing some of my perspectives of the launches.
Wow - EMC World 2012 is only a few days away! Are you ready? (I’m not.)
The slogan for this year’s EMC World- Transform: IT+Business+Yourself could not be more accurate for what can be expected at the show. Just about everything at EMC World this year is about how Cloud and Big Data are driving revolutionary change in information technology, business and the people behind the scenes who drive value out of information assets.
As you might imagine, things have been hectic around the Enterprise Storage Division (ESD) offices during the run-up to World. For the past 4 months our global team has been designing, scripting and rehearsing the 40-odd presentations and a dozen or so hands-on lab sessions that ESD engineering will be presenting at World.
On top of all that, the cross-functional teams of development, manufacturing, training, services and go-to market have been working feverishly to put the finishing touches on the more than 14 new VMAX, VPLEX and RecoverPoint products plus literally hundreds of enhancements that are being announced and discussed at EMC World next week. The scope of these announcements is even larger than our “megalaunch” back at the beginning of 2011, and that was the largest announcement in EMC’s history.
Among all the announcements are a few gems that are sure to cause some heartburn for the competition.
But then, that’s always a fun part of such announcements!
With all these new products coming to market at once, we have been busy collecting beta feedback from customer test partners around the world, and in many cases we’ve been granted permission to use names and testimonials in support of the launches next week. We have created some very interesting and exciting content for the keynotes and super-sessions, including several custom videos. There is at least one video that I am sure will go viral after it debuts during ESD President Brian Gallagher’s keynote (11:30am-12:30pm PST) – it’s near the beginning of his session, so you don’t want to be late!
Brian’s keynote – Accelerating Transformation to the Hybrid Cloud – will focus on the opportunity that Hybrid Cloud presents to businesses and the ways in which our products can help you get there. This year EMC will make all of the keynotes available to anyone, anywhere in the world via live simulcast. You can even ask questions during the simulcast and get immediate answers from subject matter experts. Visit here to set up a calendar reminder or on the day of to tune in.
Right after his keynote, Brian will host a BUZZ session with Chuck Hollis, EMC Global Marketing CTO, and Raj Rajkotia, National Manager and Chief Engineer - Head of Design and Engineering of Toyota Financial Services. Buzz sessions are meet-ups on the show floor where attendees can engage with EMC execs, engineers, or customers. Brian’s Buzz will take place directly following his keynote (1:00-2:00pm PST) at the Buzz area next to the bloggers’ lounge. He’ll be available to discuss and answer questions on all of the new products he just announced, as well as the Hybrid Cloud. Buzz sessions are also being simulcast live and can be viewed here.
Backing up all the announcements, ESD is delivering 46 individual deep-dive technical sessions. This is the real meat of every EMC World – engineers presenting directly to customers how the products work and how to use them most effectively. Be sure to scope out your planned sessions early and don’t dawdle: several of the sessions will be run more than once, and the first showings usually fill up early.
Also – this year the session titles and locations are being distributed electronically – watch closely as the various press releases hit the wires to see session titles change to reveal the new products!
Be sure also to check out all of the labs (HOLs) at EMC World for the opportunity to experience on-demand self-directed exercises on the full range of EMC products including: VMAX, VPLEX, RecoverPoint, VMware, RSA, Documentum, Avamar, Greenplum, VNX, Isilon, ProSphere, and Unisphere. For ESD, there are 9 labs and 18 individual exercises for VMAX, VPLEX and RecoverPoint – all hosted as virtual machines on EMC’s Demo Cloud (and yes, I believe they will actually be running vVMAX out in the cloud).
The HOLs are open during full conference hours (Monday 7:30am-8:30pm, Tuesday/Wednesday 7:30-5:15pm, Thursday 7:30-2:00pm).
There’s going to be a lot of cool stuff happening around the Enterprise Storage booths: VMAX, VPLEX, RecoverPoint, and Enterprise Storage Central can be found in booths 560, 160, 152 and 361 respectively.
We’re also going to have theatres featuring customer presentations, meet the experts stations, a photo op with the world renowned Transformer himself: Optimus Prime (get it? ) and some pretty cool giveaways.
Also, you don’t want to miss our demo “Mission Critical Business Continuity.” Check out how EMC, Cisco, and VCE remove the physical barriers within, across, and between data centers to facilitate data center resizing, consolidation and relocation, technology refresh, and workload balancing. You can see it at booth 160 on Monday at 7:30pm and Tuesday and Wednesday at 4:00pm.
In addition to Brian’s Buzz session after his super-session, Fidelma Russo, SVP Symmetrix Integration Business Unit and Matthew Yeager, CTO, Enterprise Storage Division of Colt Technologies are scheduled at the Buzz on Wednesday May 23rd from 11:00-11:30 am (EMC World Village, next to the bloggers lounge or view it via simulcast here).
Fidelma and Matthew will discuss the exciting opportunities and investments EMC is making to bring enterprise quality to the cloud so that customers can do “clouds without compromises” as they undergo their transformation.
Last, but not least, Brian and his staff have hosted this event for a few years now and it’s a lot of fun. Whether you want to compete for some prizes and show off your EMC product knowledge or just come for the entertainment and the free food and libations, it’s an event not to be missed. Grab an invite at the VPLEX, VMAX, or RecoverPoint booth and join us Tuesday night 5-6:30 in the Venetian Bellini 2101A.
As always, the more the merrier. This year’s event should be particularly interesting, as we’ll introduce new categories related to VPLEX and RecoverPoint.
Whew! That’s a lot of stuff to cover! Add to all that the plethora of other EMC product announcements, partner showcases, cocktail receptions and the big Party Wednesday night with Maroon 5!
I’ll bet there are some “transformed” folks come Thursday morning (and they won’t all just be EMC employees).
Hmmm…I wonder if the Venetian has a line on that?
See you there!
Last week’s VSPEX announcement let the world know how serious EMC is about being a channel-friendly partner. Compared to competitive announcements made by various companies last week, EMC demonstrated far more commitment to the channel community. In essence, the VSPEX announcement was about EMC’s partners, whereas the NetApp, IBM and HP announcements were about… well, NetApp, IBM and HP.
Last week's announcements are being vigorously debated in the blogosphere, so for my part I'll try to explore some ground that may not be covered elsewhere.
First, my observations on the VSPEX announcement and why EMC's event was different than what was announced by the other folks last week.
In all, far more comprehensive than simply another reference architecture.
Now, what I'd really like to talk about was what was not discussed as much over the past week…
In my job as Chief Strategy Officer for EMC's Enterprise Storage Division, I interact daily with global corporations that are looking beyond incremental improvements and asking "what do I really want IT to look like, and how can I best support the business by getting things into production fast, operating efficiently and lowering costs?".
At their core, all the announcements last week stressed flexibility as a way of communicating that customers can "have it their way", choosing the components that they want from the vendors that they want.
Now, this has been a fact of life in the IT industry for the last 4 decades or so, but it is also a trend that is increasingly out of step with large complex systems in other industries. Imagine an airline insisting that Boeing or Airbus include a specific engine component from a different vendor and what the implications would be on cost, performance, safety and ongoing support from the supplier. Ditto for the car in your garage or the central air conditioning system in your house. While you could theoretically hand-select all the components that go into the next HDTV or cell phone that you purchase, it simply isn’t practical for 99.99999999% of consumers – especially if you want a service and replacement warranty on the device to boot (and who doesn’t?).
While small businesses and a large portion of the commercial market may have budget, skills and political obstacles to doing things differently, the global customers I meet with are increasingly seeing that standardization is the only way to bring complexity under control. When they look to standardize, they are taking a strategic view and looking for long-term benefits. EMC's recent run of market share gains are at least partly due to customer recognition that EMC's technology is superior and consistent with their objectives to increase efficiency at scale – whether it be FAST VP, VFCache, VNX, VPLEX, Isilon or Data Domain that delivers the quantifiable value and allows IT organizations to be more effective and nimble.
My prediction is that reference architectures will dominate airwaves over the next year or two and they will make for some very entertaining posturing – especially in the commercial market segments. Further, I expect that the not-so-secret sauce behind VSPEX will emerge as the most viable approach to channel-delivered integrated solutions, thanks to its comprehensive nature and the immense value of the companies backing this initiative.
Meanwhile, I also expect that large enterprises will increasingly make the long term commitments to fully standardized and integrated product offerings like VCE's Vblock platforms. VCE is the antithesis of reference architectures, and they've made the tough decision to focus solely on industry leading technology – VMware virtualization, Cisco UCS, and EMC storage – that is designed as a fully unified system by their engineering team working closely with development teams at VMware, EMC and Cisco.
Through Vblock platforms, customers are able to
I also predict that as a byproduct benefit of VSPEX being introduced and marketed to the SMB and commercial customers, it will be easier for VCE to rise above the fray with their "un-reference architecture" message. VCE's strong growth is precisely because VCE understands their market and they provide exceptional differentiation to their target customers. While the big public debate goes on about whose reference architecture is more flexible (I think EMC made a huge statement with VSPEX), VCE is blazing the path of the future that is rapidly becoming mainstream across the Enterprise IT market.
Almost three years after Hitachi announced its High Availability Manager (HHAM), they have finally delivered the long-promised nondisruptive migration service capability, henceforth to be referred to as The Bridge to Nowhere (BTN).
I mean, seriously, who in their right mind
would want to migrate from one to another Hitachi array… ;0)
Read the press release (and HDS CTO Hu Yoshida's blog post), and you'll be inclined to believe that Hitachi's engineers have one-upped the industry with their latest "capability."
But that would be incorrect, dear reader, for EMC's Federated Live Migration has been delivering zero-downtime migrations to VMAX arrays from prior-generation Symmetrix DMX arrays for over a year. In a race to remain relevant in the face of accelerating competition, Hitachi's engineers have seemingly abandoned the green eggs and ham clustered-array approach to tech refreshing its USP/VSP product line in favor of what is inarguably a direct copy of EMC's FLM.
Well, actually, it's not an exact copy – there are several rather significant deficiencies in Hitachi's nondisruptive migration service (aka the Bridge To Nowhere) as compared to EMC's Federated Live Migration. We'll explore these after the break.
Hitachi has a lot of engineers, but they seem to have lost their knack for copying other vendors' innovations of late. Dynamic Tiering is a poor excuse for automated tiering, especially owing to its bloated relocation size (42MB) and slow reaction time to workload changes. They were slow on the uptake for flash drives, and they STILL haven't figured out how to deliver active/active distributed access to LUNs over distance like VPLEX (which is what HHAM was supposed to deliver, if only within a single data center).
So it's no surprise that behind all the marketing hyperbole, the Bridge To Nowhere is, well, both poorly made AND unfinished:
If that's not enough, the press releases (and corresponding coverage) would have you believe that Hitachi's new offering will concurrently migrate up to 8 arrays into 1 VSP, never requires a reboot, and will also automagically transfer over local and remote replication relationships.
Finally, nobody has yet automated the migration of replication relationships. The challenge isn't so much the array-side – it's all the scripts that run on hosts that make this difficult. Still, customers are demanding this, and so we can expect that there will be solutions. But I doubt it will ever be truly seamless – unless you really want to live in "spoofed mode" forever…
There has been lots of discussion since EMC's announcement of VFCache, much of it about the implications of said announcement on the storage industry. I've seen all sorts of assertions made by analysts, competitors, wannabes and prognosticators from all backgrounds – some thoughtful, some diversionary and some that are just downright silly.
There are those that say EMC's entry into the server-side Flash market validates the market for the early entrants. While that may be true in some regards, I will point out that when considered within the entire scope of the announcement, VFCache actually offers significant differentiation from would-be competitors. It is yet to be seen if or how the "established" players in server-side Flash market will respond to that differentiation. (More on this after the break).
There were some who turned this argument around – because VFCache was implemented as a "cache", it couldn't compete with the "established" players in this space – even though VFCache offers the traditional "Flash-as-DAS" for those that want it. So then they said VFCache was too small to be competitive, especially since some of the other players were talking about 10TB devices and such. I found all this humorous – not surprising, just funny. I always get a chuckle when the success of something revolutionary is measured using the yardstick of the "old" way. Like when EMC introduced the first Flash drives for an enterprise storage array back in January 2008. There were a lot of people (and even a certain competitor's CTO) who asserted Flash was too expensive to have any real utility, and that "nobody was asking for it." Today, barely 4 years later, it is hard to find any commercial mid-range or enterprise arrays that don't offer SSDs in one capacity or another (pun intended).
Then there are those that assert this movement to server-side (Flash) storage represents a full circle return from the 20+ year external storage "diversion," portending the impending doom of the disk drive and/or the external storage array altogether. I assert that for either of these to be true requires an unforeseen discontinuity of pricing: solid state has to get a LOT cheaper than any reasonable projection, or hard disk drives have to get a LOT more expensive. Short of that, there remains a niche opportunity for flash-only solutions, but the sheer economics of $/GB will ensure that the vast majority of the storage market will be dominated by spinning rust for a VERY long time – though increasingly complemented by solid-state persistent storage to deliver the performance required by the typically small subset of any dataset that is "hot" at any given time.
And finally there are those that have made claims that server-side Flash is the precursor to entirely new ways of developing applications, fueled by the heretofore unattainable I/O performance levels delivered by affordable server-side large-scale solid state storage. Some of these pundits go on to assert that server-side solid state technology will drive such a revolutionary overhaul of application development that external storage itself will cease to exist. I personally believe these are fool's forecasts, proffered by those who ignore the reality of history. In the high-tech industry, new technologies rarely supplant the old – neither overnight, nor even over decades. The IT landscape is littered with still-functioning dinosaurs that may well never be recoded or replaced: mainframes, tape, COBOL, SCSI, Ethernet, perl, etc. Switching and conversion costs are formidable barriers to overcome. In a world where more than 2/3 of the average IT budget is spent just keeping things running, and the other 1/3 is being invested in storing the growing flood of new information and in perhaps a token few NEW applications to leverage it all, there is little opportunity to invest in rewriting anything. If it ain't broke, don't fix it. The more probable reality is that server-side Flash (like ever-cheaper DRAM) will lead to new ways of building file systems, databases and applications – BUT these will not represent an overnight revolution. Instead, this new "new" will follow the same evolutionary path as have the new technologies that have come before.
With that expression of my humble opinion, I'll spend the 2nd half of this post exploring how I see VFCache fitting into this information-centric world we live in…
It is no surprise that the first server-side Flash solutions were solid-state drives. With simple packaging that fits into existing form factors and delivers immediate benefit for everything from boot times to application startup and switching to accelerated application I/O, Flash drives relatively quickly earned their spot as the preferred choice for almost every modern laptop/netbook/tablet, as well as for most desktops and servers.
So when the first PCI-based flash cards emerged, it was logical that the first use case be to emulate the proven drive interface model – one that affords immediate utility with no programming (but perhaps a minuscule bit of scripting) to deliver even lower latencies than the physical hard disk controller I/O path. Applications that really needed something approaching DRAM speeds, but across datasets too large to be affordable (or addressable) in the current processor generations, took quickly to these PCI-based solid-state "disk" emulations, often employing a simple load-run-reload workload script.
Going back to when I began meeting with customers in 2008, after EMC's ground-breaking introduction of Enterprise Flash Drives, nearly every customer's use case for server-side flash had followed this basic model.
More importantly, nearly every one of them wanted to understand if they could get similar performance from array-based flash as they were getting from the (then SSD-based, and later PCI-based) embedded server flash approach. Surprised as I was at the time that there actually existed applications that could live with the occasional complete restart from scratch on an error, I was equally relieved to learn that there are many more applications that customers want to accelerate that also require features the Flash-as-DAS (direct access storage) solutions cannot deliver, namely:
While flash deployed within a Symmetrix can meet all of these requirements, especially with the introduction of sub-LUN Fully Automated Storage Tiering (FAST), array-based Flash performance is limited by the latencies of physical SCSI I/O over the SAN. We recognized that a hybrid approach was required to address these customer requirements: one where the majority of read I/O operations are serviced locally over the PCI bus, but where writes are delivered synchronously to the external array. Always seeing the writes as-they-happen, the array can then reliably protect the changes (through RAID plus local and/or remote replication), ensuring that the data has been reliably persisted outside of the server just as the application or database engine expects.
The decision to utilize server-side Flash as a write-through read cache in front of the external array capacity was huge. Instead of requiring datasets to be loaded into the Flash before applications could begin, applications can begin immediately and VFCache will start warming up with whatever data is being requested and reused most frequently. No longer limited by the size of the local Flash, VFCache can be used against very large LUNs – in fact, VFCache can accelerate multiple different devices, for multiple different applications!
Depending upon the I/O demands of the application(s) and the size of their working set(s), a 300GB VFCache will generally take less than 45 minutes from cold boot to warm up and reach equilibrium. In order to provide the maximum benefits for the most challenging of workloads – small block random I/O – the current VFCache drivers will generally avoid trying to cache large-block sequential I/Os (64KB and larger), under the assumption that these could well be a backup operation that would otherwise flush the cache unnecessarily. But many of us remember the ReadyBoost option introduced with Windows Vista, and we can imagine that future VFCache drivers might set aside a small amount of the cache to accelerate application load times (which also typically utilize large-block sequential I/O).
But the most important feature of VFCache is that it is a write-through cache. While read hits are serviced with no impact on the array, writes are always forwarded to the storage device for persistence and protection. While this operation will encounter the added latency of traversing the host bus adapter (HBA), the storage area network (SAN) and the array interface, this is a small price to pay when data updates and additions are important. With a copy of the data stored safely on the external RAID-protected device, the risk and impact of a server or flash failure is minimized.
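The write-through model can be sketched in a few lines of code. This is a toy illustration of the general technique, not EMC's driver: the `backing_store` dict stands in for the external array, an `OrderedDict` stands in for the PCI flash card, and eviction is plain LRU (the same least-recently-used behavior described for the current VFCache driver later in this post).

```python
# Minimal sketch of a write-through read cache with LRU eviction.
# Reads are served from local "flash" on a hit; every write goes
# synchronously to the backing "array" before being acknowledged.
from collections import OrderedDict

class WriteThroughReadCache:
    def __init__(self, backing_store, capacity_blocks):
        self.backing = backing_store      # dict-like: block id -> data ("the array")
        self.capacity = capacity_blocks
        self.cache = OrderedDict()        # stands in for the PCI flash card
        self.hits = self.misses = 0

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)  # refresh LRU position
            return self.cache[block]
        self.misses += 1
        data = self.backing[block]         # fetch over the "SAN"
        self.cache[block] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return data

    def write(self, block, data):
        self.backing[block] = data         # write-through: array always sees the write
        if block in self.cache:            # keep any cached copy coherent
            self.cache[block] = data
            self.cache.move_to_end(block)
```

Note that a write never populates the cache by itself; only re-read data earns a spot on the flash, which is what lets the cache warm up with exactly the blocks being reused most frequently.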
And when the target is an intelligently cached disk array like Symmetrix VMAX with Fast Write capabilities (where writes are acknowledged back to the host as soon as they land in the array's protected global memory), the total write latency can be a fraction of a millisecond. Slower than a read hit to the PCI flash, but potentially faster than a write to a local disk drive within the server.
Some of the early naysayers about VFCache claimed that this write overhead would severely limit the utility of the product in the real world; they seemed to be saying that Flash only had value if 100% of the I/Os were to the flash device.
But customers had told us exactly the opposite of that – they said that they had lots more applications that required the same data protection they enjoyed from their array-based datasets and databases, and for which the Flash-as-DAS approach was a total non-starter.
What most people didn’t understand back in 2008 was what EMC’s engineers were learning about application working sets and workload skews – that for nearly all applications, a small portion of the total data will be the target of the vast majority of I/O operations. Frequently referred to as the “80-20 rule”, where 80% of the I/O lands on only 20% of the data, the reality is that it’s more like “95-5” – 95% of the I/O targets only 5% of the data. This target 5% may change over time – sometimes gradually, sometimes dramatically, depending upon the nature of the application.
EMC’s FAST VP leverages this knowledge to automate internal tiering across Flash, fast HDD and slow Nearline devices. Now proven with more than a year in production applications, being able to put the right 5% on the Flash tier enables that 95% of I/O operations to be responded to in a fraction of a millisecond. And even if the remaining 5% of I/Os take 10 or even 20 times longer, the average response time for all I/Os is drastically reduced.
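The arithmetic behind that claim is simple weighted averaging. Using illustrative latencies (assumed for the example, not measured figures): 0.5 ms for a flash hit and 10 ms – 20 times longer – for an I/O that falls through to slower disk.

```python
# Average-latency arithmetic for the "95-5" skew, with assumed latencies:
# 95% of I/Os hit flash at 0.5 ms, the remaining 5% take 10 ms (20x longer).
flash_ms, slow_ms = 0.5, 10.0
hit_ratio = 0.95

avg_ms = hit_ratio * flash_ms + (1 - hit_ratio) * slow_ms
print(f"blended average: {avg_ms:.3f} ms vs {slow_ms} ms all-disk")
# blended average: 0.975 ms vs 10.0 ms all-disk
```

With these numbers, putting the right 5% of data on Flash cuts the blended average response time by roughly an order of magnitude versus serving everything from the slow tier.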
VFCache leverages this same knowledge, although a bit differently than does VMAX FAST VP. On the VMAX, FAST tries to predict what will be required and then tries to get that into the Flash tier before it is accessed. The current VFCache driver, on the other hand, is a more traditional LRU-type cache – data that is frequently re-read is kept in the Flash cache, and data that is touched once (or infrequently) is aged out to make room for new data.
This difference in caching strategy underlies the first integration feature between FAST VP and VFCache. In fact, it may surprise people to learn that FAST VP will keep promoting data to Flash even when VFCache is active on the host(s). Both the Symmetrix caching and FAST VP algorithms are adaptive to the workloads they experience, seeking to reduce “Miss” operations using a variety of strategies. So, while VFCache changes the workload that the array sees to be predominantly Read “Miss” and writes, FAST VP adjusts to best optimize this workload automatically.
Further integrations between VFCache and VMAX FAST VP are planned. Some will be focused on management and reporting – Symmetrix Management Console will recognize and highlight servers with VFCache this summer. There are also plans to collect and report on VFCache stats, like hit percentage and such.
But the real opportunity is for tighter integration between the two. As a host-resident driver, VFCache has the ability to inform the array about what's going on inside the server – what blocks are actually scoring read hits and at what rates, for example. VMAX caching algorithms and FAST VP can use this information to complement the metadata kept within the array about every block, track, extent or extent group. The cache algorithms might accelerate the fall-through rate for small block random read misses for data that VFCache is holding, so that the array's global memory is quickly put to use servicing other applications, for example. FAST VP might choose to keep those heavily-reused blocks on a lower tier within the array, recognizing that any future "miss" request for those blocks from the server is very likely to be a one-off that will be cached by VFCache – the slightly longer "miss" time to read from SATA will be more than recovered by numerous subsequent VFCache "hits." And the integration can work in the other direction as well – the array might inform VFCache that the requested blocks of read data have historically not seen any significant VFCache "hits", and so the blocks should "fall through" on the server side as well…allowing VFCache to store only data that has the highest "hit" probability.
There are numerous additional possibilities to leverage the server side knowledge to help the array optimize better for VFCache, and for the array side to help increase VFCache hit rates while minimizing "miss" response times. EMC has some of its brightest working on optimizing the integration, across all of our storage platform, storage federation and solid-state development teams.
And we all share a common goal:
Ensure that VFCache works best with VMAX, and that VMAX works best with VFCache!
Just as when EMC first brought Enterprise Flash Drives to market, we continue to believe that solid state storage is going to dramatically change the way we store and utilize data. Thanks to innovations like EMC's Fully Automated Storage Tiering and now VFCache, virtually any existing application today can cost-effectively enjoy sub-millisecond I/O latencies. Coupled with VMAX, VFCache-equipped servers have practically unlimited capacity backed by the industry-leading data protection, business continuity and disaster recovery capabilities that make Symmetrix the most trusted storage platform in the world.