My long-distance VMotion experience.


The company I work for is currently going through a datacenter migration in the Americas, Asia, and European regions. In the Americas we were lucky enough to have the new datacenter that we're moving into about 15 miles from the current one. This gave us an opportunity to try VMotion and Storage VMotion over long distance for our VI3 environment of 100+ VM guests. Latency is one of the biggest factors in having a successful outcome with VMotion when you put long distances between your ESX hosts, so this is not something that can be done in every scenario.
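
Before committing to anything we wanted a feel for the round-trip time between the two sites. Something as simple as the sketch below would do; the host names are placeholders and the ~5 ms budget is just an assumption for illustration, not a VMware-published limit.

```python
#!/usr/bin/env python
# Rough latency check between sites before attempting long-distance VMotion.
# Host names are placeholders; adjust the threshold for your own environment.
import subprocess

DEST_HOSTS = ["esx-new-01.example.com", "esx-new-02.example.com"]  # hypothetical names
MAX_RTT_MS = 5.0                                                   # assumed budget

def avg_rtt(host, count=10):
    """Ping the host and pull the average round-trip time from the summary line."""
    out = subprocess.check_output(["ping", "-c", str(count), host]).decode()
    for line in out.splitlines():
        if "rtt min/avg/max" in line or "round-trip" in line:
            # e.g. "rtt min/avg/max/mdev = 0.906/1.263/1.803/0.284 ms"
            return float(line.split("=")[1].split("/")[1])
    raise RuntimeError("could not parse ping output for %s" % host)

for host in DEST_HOSTS:
    rtt = avg_rtt(host)
    status = "OK" if rtt <= MAX_RTT_MS else "TOO HIGH"
    print("%s: avg rtt %.2f ms [%s]" % (host, rtt, status))
```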

The first thing we had to do, of course, was extend the network and SAN from the current datacenter to the new one. After some discussion we decided on 2x 4 Gb fiber connections in a ring for the SAN and 2x 10 Gb for the network. We are a Cisco shop for both network and SAN switches. Inter-VSAN Routing (IVR) was turned on so our fabrics could talk to each other. Keeping this high level, the ESX hosts in the primary datacenter were presented the LUNs of the new ESX cluster in the new datacenter. You are only required to make sure the source has access to the destination, but you may want to consider failback if needed.
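
Once the LUNs were presented, it was worth confirming that every source host actually saw them before scheduling any moves. A minimal sketch of that check is below, assuming passwordless SSH to the service console and that `esxcfg-mpath -l` is available (it was on ESX 3.x); the host names and LUN identifiers are placeholders.

```python
#!/usr/bin/env python
# Check that each source ESX host can see the LUNs presented from the new datacenter.
import subprocess

SOURCE_HOSTS = ["esx-old-01.example.com", "esx-old-02.example.com"]  # hypothetical
NEW_DC_LUNS = ["vmhba1:0:20", "vmhba1:0:21"]                         # hypothetical LUN ids

for host in SOURCE_HOSTS:
    # List every storage path the host knows about and look for the new LUNs.
    output = subprocess.check_output(["ssh", "root@" + host, "esxcfg-mpath -l"]).decode()
    missing = [lun for lun in NEW_DC_LUNS if lun not in output]
    if missing:
        print("%s is MISSING destination LUNs: %s" % (host, ", ".join(missing)))
    else:
        print("%s sees all destination LUNs" % host)
```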

On the VMware side we turned on Enhanced VMotion Compatibility so we wouldn't have to worry about incompatible processors, which also meant changing some settings in the BIOS on some systems. Now of course I can't leave out testing: testing, and more testing, was done on non-production VM guests. You can convert templates to VMs and use them as your first round of test subjects. All of our VM guests were updated with the latest VMware Tools, and we made sure that ALL ESX hosts, new and old, were at the same build to prevent any conflicts.
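
Verifying the build levels doesn't have to be manual. The sketch below just runs `vmware -v` on each service console over SSH and groups the hosts by the version string it returns; the host names are placeholders.

```python
#!/usr/bin/env python
# Sanity check that every ESX host, old and new, reports the same build.
# Assumes SSH access to the service console, where `vmware -v` prints
# something like "VMware ESX Server 3.5.0 build-xxxxxx".
import subprocess

ALL_HOSTS = ["esx-old-01.example.com", "esx-old-02.example.com",
             "esx-new-01.example.com", "esx-new-02.example.com"]  # hypothetical

builds = {}
for host in ALL_HOSTS:
    version = subprocess.check_output(["ssh", "root@" + host, "vmware -v"]).decode().strip()
    builds.setdefault(version, []).append(host)

if len(builds) == 1:
    print("All hosts match: %s" % list(builds.keys())[0])
else:
    print("Build mismatch!")
    for version, hosts in builds.items():
        print("  %s -> %s" % (version, ", ".join(hosts)))
```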

We could have written svmotion scripts, but since we were on VI3 we chose to use the SVMotion plugin you can find on SourceForge. The plugin is not supported by VMware, but then neither is doing VMotion and Storage VMotion over long distances. It made things simpler for the IT team doing the migrations: they could follow a step-by-step document instead of written scripts. The plan was NOT to move everything at once anyway, due to the potential risk if anything went wrong.
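
For anyone curious what the script route might have looked like, a rough sketch is below: it just wraps the VI 3.5 Remote CLI `svmotion` command in a loop over the guests in a migration wave. The flag names are from memory of the RCLI documentation, so check them against your own install, and the vCenter URL, datacenter, and datastore names are all placeholders.

```python
#!/usr/bin/env python
# Sketch of scripting Storage VMotion with the VI Remote CLI instead of the plugin.
import subprocess

VC_URL = "https://vcenter.example.com/sdk"      # hypothetical
DATACENTER = "Americas-DC1"                     # hypothetical
TARGET_DATASTORE = "new_dc_lun20"               # hypothetical

# (VM name, current datastore) pairs for this migration wave -- placeholders
WAVE = [("testvm01", "old_dc_lun05"),
        ("testvm02", "old_dc_lun06")]

for vm, old_ds in WAVE:
    # svmotion expects "<source vmx datastore path>:<destination datastore>"
    vm_spec = "[%s] %s/%s.vmx:%s" % (old_ds, vm, vm, TARGET_DATASTORE)
    cmd = ["svmotion",
           "--url=" + VC_URL,
           "--username=administrator",
           "--password=changeme",
           "--datacenter=" + DATACENTER,
           "--vm=" + vm_spec]
    print("Relocating %s ..." % vm)
    subprocess.check_call(cmd)
```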

So, the groundwork was now laid so that VM migrations could be done at the click of a button. The first set of systems went better than expected, which really instilled confidence in the application owners and management. The only real issue was with IVR. It seemed to be related to using the Auto settings, but what happened was that not all ESX hosts were able to log in to the fabric with both HBAs. It was a bit confusing because a couple of ESX hosts worked as they should have, on some only hba0 worked, and on others only hba1 worked. We ended up forcing the use of only one path, since ESX only used one path anyway, but this meant no failover.