Tuesday, May 3, 2011

Long-Distance vMotion: Part 1

This is the first part in a multi-part series on the topic of Long-Distance vMotion. I am currently architecting and building this out for a few of my customers.

Long-Distance vMotion (LDVM) is the holy grail of business continuity – the ability to migrate workloads across data centers or in and out of clouds with no disruption of service and zero downtime. When I started consulting in the 1990s, after several years as a software developer, I was a high availability clustering consultant, among other things. Later I architected geographic clusters, but one thing was certain: they were very expensive, architecturally complex and difficult to manage.

Long-Distance vMotion attempts to tackle one issue: that of business continuity. Let’s face it, disasters are rare. I know there are earthquakes, tornadoes, hurricanes, floods and other bad things that happen. In my many years of consulting, these have rarely happened to my customers. I have two customers that have had their storage ruined, each by their own fire suppression system failing and pouring water onto their equipment. These disasters, although rare, do happen, and they must be planned for. It’s risk mitigation: a business decision that doesn’t come for free.

What is much more common, what happens all the time, is maintenance. More and more frequently, that maintenance needs to occur without downtime, or at the very least with a minimum of downtime. The most common disasters my customers have experienced in the last decade are brownouts and blackouts. These are the more common problems we face as data centers and power grids are stretched to capacity.

One of our challenges in building for disaster recovery is providing geographic protection. Equipment needs to be far enough away to avoid one disaster also affecting our recovery site. Flying in the face of this is latency. To achieve business continuity, we need to have all transactions mirrored with guaranteed delivery, and that means we need low storage latency, which limits us to synchronous distances (less than 100 km, or 62 mi).
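To put a rough number on that synchronous-distance limit, consider the propagation delay alone. This is a back-of-the-envelope sketch, assuming light travels through fiber at roughly 200,000 km/s (about two-thirds the speed of light in a vacuum); real latency is higher once switches and array processing are added:

```python
# Back-of-the-envelope round-trip propagation delay over fiber.
# Assumption: signal speed in fiber is ~200,000 km/s, i.e. 200 km per millisecond.
SPEED_IN_FIBER_KM_PER_MS = 200.0

def round_trip_latency_ms(distance_km: float) -> float:
    """Best-case round-trip propagation delay, in milliseconds."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_MS

# At the 100 km synchronous limit, every mirrored write pays at least
# ~1 ms of propagation delay before the remote site can acknowledge it.
print(round_trip_latency_ms(100))   # 1.0
print(round_trip_latency_ms(1000))  # 10.0 - far too slow for synchronous mirroring
```

Since a synchronously mirrored write cannot complete until the remote side acknowledges it, that round trip is added to every single transaction, which is why the distance cap exists.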

As we strive for business continuity, or absolute zero downtime, we can proactively move workloads ahead of an impending disaster, also known as disaster avoidance. With that storm coming in, we can have our workload moved before the power goes out or the pipes go down. We can also proactively move workloads for planned maintenance. When we get good and comfortable with the technology, we can start to migrate workloads between datacenters to balance workloads: for servers, for storage and for bandwidth.

We would like to have the best of both worlds: business continuity and disaster recovery. Business continuity asks for short distances today, like across the metro. Disaster recovery screams for longer geographic distances for protection. We can combine the two of them.

Before I tackle the topic of LDVM, I think it’s important to understand how we got here. So as I’m prone to do, first a little history ….

The Evolution of Recovery Point and Recovery Time Objectives

When I entered the industry in the 1980s, we protected our data with tape copies. These took the form of copy to tape or dump to tape. Commands varied by operating system but the result was the same: a tape copy or tape archive of our data sets. Our recovery points were daily for our critical data sets and weekly for less critical, less frequently changing data, such as the application code and the operating system itself. Our recovery times ranged from days to weeks. It wasn’t a perfect world, and the concept of a service level agreement was more of a target. A computer that went down didn’t kill a business, but slowed it down for a time. We reverted to more manual processes.

In the 1990s backup applications entered the scene. These took the form of IBM’s TSM (ADSM in those days), Legato NetWorker and others. Their main strength was automation. Backups were now automated, scheduled jobs, either over a network or to a directly attached tape drive, tape library or silo. Because of the automation, our recovery points became daily and our recovery time objectives shrank to mere days.

The last decade brought us the era of snapshots and mirrors. BCVs, FlashCopies and snapshots created very fast hot backups. Mirroring extended data protection to the recovery site without the need to lug tapes across the country. Our recovery points now ranged from mere seconds to a handful of hours. We were protected at the recovery site and could be back up and running in anywhere from under 5 minutes to a few hours. Life was certainly improving.

This decade we enter the cloud and grid era. We want our recovery points to be continuous: we should be able to roll back or forward to any point in time like the DVR at home. We want recovery time to be zero, a non-disruptive recovery. While some of this is available today, it’s not quite there yet (non-disruptive recovery?). However, I’ll also point out, we’re just into this decade.

To address and achieve business continuity, we look at the same familiar components we address when we deal with virtualization: namely servers, networks and storage.


I’ve covered virtualization previously; it’s an over-used word, much like “cloud computing” in a marketing brochure today. But virtualization is nothing new.

Everyone’s heard of server virtualization. Blade technology made an initial splash at being the datacenter’s savior by offering a smaller footprint, centralized administration and some limited physical resource abstraction. Server virtualization really made a splash with hypervisor technology like IBM’s PowerVM (for UNIX), VMware’s vSphere ESX, Microsoft’s Hyper-V and Oracle’s VM for x86. It’s been proven over and over again to provide real physical server consolidation (hundreds of servers to a handful), as well as a serious enabling technology for disaster recovery and business continuity (mirroring and fault-tolerant VMs). Of course this is old hat on the mainframe, but that’s not the topic of this post.

Network virtualization is something we don’t think too much about; we take it for granted. But do you recall the period before VLANs, when networks were physically separate? How about VPNs? DNS is a way of virtualizing IP addresses, as is NAT. Interfaces are virtualized with link aggregation (802.3ad/802.1AX), and Layer 2 paths with TRILL. Even the switches themselves are virtualized with a vSwitch, IBM’s IVE/HEA, Cisco’s Palo adapter and Nexus 7000 Virtual Device Contexts (VDCs), where the switch is split up and virtualized just like a server running ESX is today.

Storage virtualization has evolved throughout the years as well. There was a time when a file could not be larger than a physical disk, which came in MBs, and not many of them. Then RAID and LUNs allowed us to span the disks in a RAID set. Logical volume managers let us aggregate even greater amounts of storage. SANs were virtualized with VSANs, similar to VLANs, as well as with NPIV (N-Port ID Virtualization). Storage subsystems were virtualized with IBM’s SVC, NetApp’s V-Series, EMC’s VPLEX and HDS’ USP-V. Secondary storage was virtualized too, whether as tape volumes with a VTS or as a whole tape library with VTLs.

Needless to say virtualization has been around a long time and continues to evolve and permeate throughout the data center.

As I said above, when we virtualize we look at servers, networks and storage. Servers contain the CPU and memory components, the network connects them to the outside world, and the storage is shared between the nodes in a grid (an ESX cluster, PowerVM servers). When we virtualize a server, we take that physical server and create virtual machines (or LPARs), which contain virtualized CPU, memory, network interfaces and storage interfaces. When we do a vMotion, we copy the memory of the VM from one physical server to another, suspend it on the source and resume it on the target – all transparently and non-disruptively to that server’s clients. The same can be said of moving LPARs with Live Partition Mobility.
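The copy-suspend-resume sequence above can be sketched in a few lines of Python. To be clear, this is a simplified illustrative model of iterative pre-copy live migration, not VMware’s actual implementation; the VM and Host classes are hypothetical stand-ins:

```python
# Illustrative sketch of pre-copy live migration (the idea behind vMotion).
# A VM's memory is modeled as a dict of page_id -> contents; pages the guest
# writes during the copy become "dirty" and must be copied again.

class VM:
    def __init__(self, memory):
        self.memory = dict(memory)   # page_id -> contents
        self._dirty = set()          # pages written since the last copy pass
        self.running = True

    def write(self, page, value):
        self.memory[page] = value
        self._dirty.add(page)

    def drain_dirty_pages(self):
        dirty, self._dirty = self._dirty, set()
        return dirty

class Host:
    def __init__(self):
        self.pages = {}
    def write_page(self, page, value):
        self.pages[page] = value

def live_migrate(vm, target, threshold=2):
    """Iteratively pre-copy memory, then suspend briefly for the final cut-over."""
    dirty = set(vm.memory)                 # the first pass copies everything
    while len(dirty) > threshold:
        for page in dirty:
            target.write_page(page, vm.memory[page])
        dirty = vm.drain_dirty_pages()     # pages re-written during the copy
    vm.running = False                     # brief "stun": no new writes possible
    for page in dirty:                     # the final set is small, so downtime is tiny
        target.write_page(page, vm.memory[page])
    vm.running = True                      # resume on the target

vm = VM({i: f"page-{i}" for i in range(8)})
host = Host()
live_migrate(vm, host)
assert host.pages == vm.memory             # target holds an identical memory image
```

The key design point is the shrinking dirty set: each pass copies only what changed during the previous pass, so by the time the VM is stunned, the remaining copy – and therefore the downtime – is nearly zero.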

This is our basic building block we’ll use.

Disaster Recovery Today

So, taking that basic building block, we build out the disaster recovery architecture used today. We connect two sites together with a wide-area network (WAN) or metro-area network (MAN). Once they’re connected, we mirror the storage between the two sites. Add in Site Recovery Manager (SRM) and we are now ready for disaster recovery. Sure, there may be some other pieces (places where NAT, DDNS and load balancers play), but these are the guts of it. This basic architecture is the playbook most of us use. This is our starting point, the architecture we’ll extend in the next part of this article.
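The storage mirroring at the heart of this architecture can be modeled very simply. Here is a minimal sketch of synchronous mirroring, where a write completes only after both sites acknowledge it; the site names are hypothetical, and in practice this is done by the storage arrays or appliances themselves, not application code:

```python
# Minimal model of synchronous storage mirroring between two sites.
# Hypothetical sites; real deployments do this in the storage layer.

class Site:
    def __init__(self, name):
        self.name = name
        self.blocks = {}             # lba -> data
    def write(self, lba, data):
        self.blocks[lba] = data
        return True                  # acknowledgment back to the mirror

class SyncMirror:
    """A write is complete only after BOTH sites acknowledge it."""
    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
    def write(self, lba, data):
        ok_local = self.primary.write(lba, data)
        ok_remote = self.secondary.write(lba, data)  # this hop pays the WAN round trip
        return ok_local and ok_remote

primary = Site("datacenter-a")
recovery = Site("datacenter-b")
mirror = SyncMirror(primary, recovery)
mirror.write(42, b"transaction")
# After a disaster at site A, site B already holds every acknowledged write.
assert recovery.blocks == primary.blocks
```

This is also where the latency discussion earlier bites: because the remote acknowledgment sits in the write path, every transaction slows down as the sites move farther apart.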

Part 2 will build out the Long-Distance vMotion architecture with a few different approaches.
