vSphere Metro Stretched Clusters – Some Info/Links

I have been getting a lot of questions lately about vSphere clusters stretched across distance. I need to learn more about this myself, so I collected some good links.

Make sure you understand what “Only Non-uniform host access configuration is supported” means. Someone correct me if I have this wrong, but as I understand it, the device that provides the distributed virtual storage needs to make sure that hosts in Site A write to their preferred volumes in Site A, and that hosts in Site B do the same in Site B. I am probably oversimplifying it.


LINKS

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007545

http://virtualgeek.typepad.com/virtual_geek/2011/10/new-vmware-hcl-category-vsphere-metro-stretched-cluster.html

http://www.yellow-bricks.com/2011/10/07/vsphere-metro-storage-cluster-solutions-what-is-supported-and-what-not/

http://www.yellow-bricks.com/2011/10/05/vsphere-5-0-ha-and-metro-stretched-cluster-solutions/

Big thanks to Scott Lowe for clearing the details on this topic.

All out of HA Slots

A few weeks ago I was moving a customer from an old set of ESX servers (not HA clustered) to a new infrastructure of clustered ESX hosts. After building, testing and verifying the hosts, we started moving the VMs. After a little while it became apparent that there were some resource issues. After just a few VMs were moved, an alert appeared saying we could not start any new machines. I looked at the cluster and there was plenty of spare memory and CPU. Still, nothing would start.
I said to myself, “Self, we have read about this before.” I thought back to the HA Deep Dive article by Duncan Epping.
Let’s check the HA slots! (On a side note, if you use HA and have never read the Deep Dive, go do it now!)

media_1276972861425.png

As you can see here, the slot size is rather giant. HA takes the largest CPU reservation and the largest memory reservation plus some overhead (to keep it simple), and that blows the size of the slot way up. I didn’t set the reservations, but sure enough, they were there: 8GB of reserved memory and 4000MHz of CPU. Ouch. Where did that come from? The reservations followed the VM from the old host to the new one. One of the reasons I was there was to set up a new cluster, since the older hosts were performing so slowly on local storage. It seems someone had tried to help some critical VMs along the way by adding the reservations. I removed the reservations and had plenty of slots, as you see below.

media_1276973677553.png

Yeah! I was able to power on another VM!
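To make the slot math a little more concrete, here is a rough Python sketch of how I understand the calculation. Treat it as a simplification, not VMware's actual algorithm. The 256MHz default CPU slot, the host sizes, the 200MB memory overhead, and all the class and function names are my own assumptions for illustration. Only the 4000MHz and 8GB reservations come from the cluster above.

```python
from dataclasses import dataclass

# Assumed default CPU slot when no VM has a CPU reservation. The real value
# depends on the vSphere version and advanced HA options, so verify it.
DEFAULT_CPU_SLOT_MHZ = 256


@dataclass
class VM:
    name: str
    cpu_reservation_mhz: int = 0
    mem_reservation_mb: int = 0
    mem_overhead_mb: int = 0  # hypothetical per-VM memory overhead


@dataclass
class Host:
    name: str
    cpu_mhz: int
    mem_mb: int


def slot_size(vms):
    """Return (cpu_mhz, mem_mb): the largest reservation wins for each resource."""
    cpu = max([DEFAULT_CPU_SLOT_MHZ] + [vm.cpu_reservation_mhz for vm in vms])
    mem = max([0] + [vm.mem_reservation_mb + vm.mem_overhead_mb for vm in vms])
    return cpu, mem


def slots_per_host(host, cpu_slot, mem_slot):
    """A host holds as many slots as its scarcer resource allows."""
    cpu_slots = host.cpu_mhz // cpu_slot
    # Guard against a zero memory slot (no reservations and no overhead).
    mem_slots = host.mem_mb // mem_slot if mem_slot > 0 else cpu_slots
    return min(cpu_slots, mem_slots)


def cluster_slots(hosts, vms, host_failures=1):
    """Slots left after setting aside the largest hosts for failover."""
    cpu_slot, mem_slot = slot_size(vms)
    per_host = sorted(slots_per_host(h, cpu_slot, mem_slot) for h in hosts)
    keep = max(len(per_host) - host_failures, 0)
    return sum(per_host[:keep])


# Hypothetical hosts: 12 cores x 2660MHz and 64GB of RAM each.
hosts = [Host(f"esx{i}", cpu_mhz=12 * 2660, mem_mb=65536) for i in range(1, 4)]

# One VM carrying the old reservations, and the same VM with them removed.
with_reservation = [
    VM("critical", cpu_reservation_mhz=4000, mem_reservation_mb=8192, mem_overhead_mb=200),
    VM("web", mem_overhead_mb=200),
]
without_reservation = [
    VM("critical", mem_overhead_mb=200),
    VM("web", mem_overhead_mb=200),
]

if __name__ == "__main__":
    print("slots with reservation:", cluster_slots(hosts, with_reservation))
    print("slots without reservation:", cluster_slots(hosts, without_reservation))
```

The point is that a single VM with a big reservation sets the slot size for the whole cluster, so the number of slots collapses even though the hosts have plenty of free CPU and memory.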

The new cluster blew away the old one. We went from older Xeons to 6-core Nehalems, and from local disks to 48 disks of EqualLogic storage. The reservations were no longer needed.

Lessons:
1. Be careful with reservations; they can reduce your failover capacity.
2. Reservations set on a VM will follow it to a new host.