Boot storms are a new challenge in enterprise VMware environments. Every Monday morning we boot up our VDI desktops and the tsunami hits the storage farm!
Solving Boot Storms With High Performance NAS
George Crump, Senior Analyst, January 3, 2011
Desktop virtualization is growing in popularity. Its ability to dramatically reduce operational expenses and increase IT flexibility can deliver significant benefits to the data center. Like any virtualization technology, though, it brings challenges of its own as the environment grows. Many of those challenges land on the storage infrastructure and storage system, with boot storms at the top of the list.
Boot storms occur when a large number of virtual desktops are powered on or activated simultaneously, typically in the morning when workers first arrive, and all attempt to log into the hosts that provide their processing power and applications. For most of the rest of the day, virtual desktops have very modest storage I/O needs. There are exceptions, though, such as system software updates or virus scans, which, when executed simultaneously across many virtual desktops, can dramatically degrade both overall and individual performance. Even end-of-day logouts can cause problems: if many users sign out within a few minutes of one another, their combined storage I/O demands can delay or slow down overnight backup and replication jobs.
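To put rough numbers on the problem, the short Python sketch below estimates the aggregate I/O spike for a hypothetical 500-seat desktop pool. The per-desktop figures are illustrative assumptions, not measurements from any particular deployment.

# Back-of-the-envelope boot storm sizing. All figures are illustrative
# assumptions, not measurements from a real VDI deployment.

def aggregate_iops(desktops: int, iops_per_desktop: float) -> float:
    """Aggregate storage I/O demand if all desktops are active at once."""
    return desktops * iops_per_desktop

DESKTOPS = 500
BOOT_IOPS = 50          # assumed per-desktop demand while the OS boots
STEADY_STATE_IOPS = 7   # assumed per-desktop demand during normal work

print(f"Steady state: {aggregate_iops(DESKTOPS, STEADY_STATE_IOPS):,.0f} IOPS")
print(f"Boot storm:   {aggregate_iops(DESKTOPS, BOOT_IOPS):,.0f} IOPS")

In this example the pool jumps from roughly 3,500 IOPS at steady state to 25,000 IOPS during the boot window, a spike the storage system must absorb for only a short period each morning.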
Attempts To Calm The Boot Storm
While approaches to solving the problem of boot storms exist, many introduce problems of their own. One of the biggest is expanding storage to the point where the original savings realized by the virtual desktop infrastructure (VDI) project are eaten up by additional storage investments. For example, one of the most common remedies is inserting a tier of solid state storage to accelerate the system’s response time and absorb the boot storm. The problem is that the storage in a non-virtualized laptop is dramatically less expensive than data center storage, especially solid state storage. Essentially, this approach replaces the least expensive storage with the most expensive, dramatically reducing the VDI project’s ROI.
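A back-of-the-envelope comparison makes the economics concrete. The per-gigabyte prices below are placeholder assumptions chosen only to illustrate the arithmetic; the ratio, not the absolute numbers, is the point.

# Illustrative ROI erosion when a boot storm is "fixed" with an SSD tier.
# All prices are placeholder assumptions for the sake of the arithmetic.

LAPTOP_DISK_COST_PER_GB = 0.10     # assumed commodity laptop drive
ENTERPRISE_SSD_COST_PER_GB = 10.0  # assumed data center SSD pricing

DESKTOPS = 500
IMAGE_GB_PER_DESKTOP = 25

local_cost = DESKTOPS * IMAGE_GB_PER_DESKTOP * LAPTOP_DISK_COST_PER_GB
ssd_cost = DESKTOPS * IMAGE_GB_PER_DESKTOP * ENTERPRISE_SSD_COST_PER_GB
print(f"Local laptop storage: ${local_cost:,.0f}")
print(f"SSD tier equivalent:  ${ssd_cost:,.0f} ({ssd_cost / local_cost:.0f}x)")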
Moving the VDI data to a faster tier of storage also brings management concerns. This strategy adds the complexity of managing a separate tier of storage, and compounds the problem when the storage administrator decides to use that high speed tier for something other than desktop virtualization data. For example, to take full advantage of the high performance tier and spread out its cost, many storage managers would want to move data in and out of the high speed tier throughout the day; that way they could use it to absorb boot storms in the morning and to improve database response time the rest of the day. While all of this can be done, it adds a layer of complexity, and a monitoring burden, that storage or virtual administrators may not have the time to carry.
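The sketch below is a deliberately naive example of the kind of hand-rolled policy this strategy implies; the schedule and workload names are hypothetical. The point is not the code itself but that someone has to write, monitor and maintain this logic.

import datetime

def workload_for_fast_tier(now: datetime.time) -> str:
    """Decide which dataset should occupy the high speed tier right now."""
    if datetime.time(6, 0) <= now < datetime.time(10, 0):
        return "vdi-boot-images"   # absorb the morning boot storm
    return "oltp-database"         # improve database response the rest of the day

print(workload_for_fast_tier(datetime.datetime.now().time()))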
Desktop virtualization is also an environment where the headline-grabbing “scale-out” storage solution may be inappropriate. Scale-out storage systems typically gain performance as the capacity demands of the environment grow. Interestingly, a desktop virtualization project, while large enough that it must be accounted for, often does not have the capacity requirements and storage footprint of server virtualization. In a scale-out infrastructure this can mean nodes are added for performance while the capacity that comes with those nodes goes unused, which is not an efficient use of resources.
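A quick sizing sketch shows how that capacity gets stranded. The node specifications below are hypothetical, but the pattern holds whenever performance, not capacity, drives the node count.

# Why scale-out can strand capacity under VDI: nodes are added to hit an
# IOPS target, and the capacity bundled with them goes unused.
# Node specifications below are hypothetical.

NODE_IOPS = 5_000
NODE_CAPACITY_TB = 20

required_iops = 25_000      # e.g., the morning boot storm peak
required_capacity_tb = 15   # desktop images are small relative to servers

nodes = -(-required_iops // NODE_IOPS)  # ceiling division
provisioned_tb = nodes * NODE_CAPACITY_TB
print(f"Nodes needed for the IOPS target: {nodes}")
print(f"Capacity provisioned: {provisioned_tb} TB, "
      f"needed: {required_capacity_tb} TB, "
      f"stranded: {provisioned_tb - required_capacity_tb} TB")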
The problems of cost, complexity and storage efficiency have led some infrastructure vendors to go against conventional wisdom and suggest that local storage may be a better alternative. After all, it’s inexpensive, easy to cache with low-cost SSD drives and, most importantly, easy to isolate to a specific workload like desktop virtualization. This recommendation runs counter to all the justifications for shared storage: improved resource utilization, improved data protection and improved data accessibility between servers, the last of which is especially important in virtual environments. The benefits of using local storage to support virtual desktop infrastructures are easily outweighed by these shortcomings, and most enterprises will be quick to dismiss it as an option.
Solving Boot Storms Without Breaking VDI
How does one overcome the challenge of boot storms in the virtual desktop environment without breaking the cost model, introducing new layers of complexity or ‘going backwards in time’ by abandoning shared storage? The answer may be to leverage high performance NAS systems designed to meet high storage I/O demands. Companies like BlueArc, for example, have been providing high-IOPS NAS systems to high performance computing and high end database environments for years. These types of storage systems are now gaining traction in the server virtualization market: as dozens of servers are consolidated onto a single host, the virtualization infrastructure becomes its own ‘HPC environment in a rack’.
Virtual desktop infrastructures are similar to server virtualization in this respect. Instead of consolidating dozens of modestly performing servers, these projects consolidate hundreds (or thousands) of low performing desktops onto just a few hosts, once again creating a mini-HPC environment that requires high end IOPS performance. Since systems like BlueArc’s are designed to provide scale-up performance, as discussed in the article “Storage: Scale Up Or Scale Out”, they may be a better match for infrastructures where performance is as important as capacity, which is the reality of desktop virtualization.
With the exception of a very large VDI project, most data centers are not going to be able to justify the expense of dedicating a storage system to the VDI environment alone. In fact, this I/O isolation is one of the arguments made for local storage. For VDI to cost justify shared storage, that storage will also have to serve other projects in the environment simultaneously. This means having the IOPS headroom to meet the needs of highly random I/O profiles, as well as the ability to scale storage I/O performance independently of storage capacity.
Finally, these storage systems should be able to provide valuable data services to the virtual desktop environment, like cloning and capacity optimization, that help continue to drive out costs. For example, BlueArc’s systems have a feature called JetClone that can customize a single image to support thousands of copies of an original desktop master. While some desktop virtualization products have this capability, taking that load off of the software will improve performance and virtual desktop density. Additionally, if SSD technology is needed to help with boot storm performance, systems like BlueArc’s can transparently move data in and out of the faster tier automatically. This allows the tier to be used where it is needed most throughout the course of the day, essentially delivering SSD integration without the complexity of managing multiple tiers.
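The capacity optimization argument is easy to quantify. In the sketch below the image and delta sizes are illustrative assumptions, and the clone arithmetic stands in for any writable-clone feature, JetClone included.

# Capacity saved by cloning desktops from one master image instead of
# storing full copies. Sizes are illustrative assumptions.

MASTER_IMAGE_GB = 25
DELTA_PER_DESKTOP_GB = 2   # assumed per-desktop writable delta
DESKTOPS = 1_000

full_copies_gb = DESKTOPS * MASTER_IMAGE_GB
cloned_gb = MASTER_IMAGE_GB + DESKTOPS * DELTA_PER_DESKTOP_GB
print(f"Full copies:     {full_copies_gb / 1024:.1f} TB")
print(f"Master + clones: {cloned_gb / 1024:.1f} TB "
      f"({full_copies_gb / cloned_gb:.0f}x less capacity)")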
The article “IOPS are more Important than Air” discussed how important performance is for server virtualization. When it comes to desktop virtualization, solving boot storms with high performance NAS allows those same IOPS to be delivered, and does so while maintaining optimum capacity utilization.