All out of HA Slots

A few weeks ago I was moving a customer from an old set of ESX servers (not HA clustered) to a new infrastructure of clustered ESX hosts. After building, testing and verifying the hosts, we started moving the VMs. After a little while it became apparent there were some resource issues. After just a few VMs were moved, an alert appeared saying we could not start any new machines. I started looking at the cluster and there was plenty of spare memory and CPU. Still, nothing would start.
I say to myself, “Self, we have read about this before.” I thought back to this HA Deep Dive article by Duncan Epping.
Let's check the HA slots! (On a side note, if you use HA and have never read the Deep Dive, go do it now!)

media_1276972861425.png

As you can see here, the slot size is rather giant. HA takes the largest CPU reservation and the largest memory reservation, plus some overhead (simplifying a bit), and that blows the size of the slot way up. I didn’t set the reservations, but sure enough they were there: 8GB of reserved memory and 4000MHz of CPU. Ouch. Where did that come from? It followed the VM from the old host to the new one. One of the reasons I was there was to set up a new cluster, since the older hosts were performing so slowly on local storage. It seems someone tried to help some critical VMs along the way by adding the reservations. I removed the reservations and had plenty of slots, as you see below.

media_1276973677553.png

Yeah! I was able to power on another VM!
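For anyone who wants to see the arithmetic, here is a rough Python sketch of the slot math as I understand it from the Deep Dive. Treat it as an illustration and not as what HA actually runs: the 256MHz default CPU slot, the host sizes, the per-VM overhead figures and every name in the code are my assumptions, and only the 8GB / 4000MHz reservation comes from the cluster above.

```python
from dataclasses import dataclass

# Assumption: CPU slot size used when no VM has a CPU reservation.
DEFAULT_CPU_SLOT_MHZ = 256


@dataclass
class VM:
    name: str
    cpu_reservation_mhz: int
    mem_reservation_mb: int
    mem_overhead_mb: int  # hypothetical per-VM memory overhead


@dataclass
class Host:
    name: str
    cpu_mhz: int
    mem_mb: int


def slot_size(vms):
    """Return (cpu_mhz, mem_mb): the largest reservations set the slot size."""
    cpu = max([vm.cpu_reservation_mhz for vm in vms] + [0]) or DEFAULT_CPU_SLOT_MHZ
    mem = max([vm.mem_reservation_mb + vm.mem_overhead_mb for vm in vms] + [0])
    return cpu, max(mem, 1)  # never let the memory slot be zero


def slots_per_host(host, slot):
    """A host holds as many slots as its scarcer resource allows."""
    cpu_slot, mem_slot = slot
    return min(host.cpu_mhz // cpu_slot, host.mem_mb // mem_slot)


def usable_slots(hosts, vms, host_failures=1):
    """Slots left after setting aside the largest hosts as failover capacity."""
    slot = slot_size(vms)
    per_host = sorted(slots_per_host(h, slot) for h in hosts)
    return sum(per_host[: max(len(per_host) - host_failures, 0)])


# Made-up cluster: three hosts with 12 cores at 2660MHz and 48GB each.
hosts = [Host(f"esx{i}", cpu_mhz=12 * 2660, mem_mb=48 * 1024) for i in range(1, 4)]

with_reservation = [
    VM("critical-db", cpu_reservation_mhz=4000, mem_reservation_mb=8192, mem_overhead_mb=300),
    VM("web01", cpu_reservation_mhz=0, mem_reservation_mb=0, mem_overhead_mb=100),
]
without_reservation = [
    VM("critical-db", cpu_reservation_mhz=0, mem_reservation_mb=0, mem_overhead_mb=300),
    VM("web01", cpu_reservation_mhz=0, mem_reservation_mb=0, mem_overhead_mb=100),
]

print(slot_size(with_reservation), usable_slots(hosts, with_reservation))
print(slot_size(without_reservation), usable_slots(hosts, without_reservation))
```

With these made-up numbers, the one big reservation leaves 10 usable slots in the whole cluster, while removing it leaves 248. The exact figures on a real cluster will differ, but the shape of the problem is the same: one VM's reservation sets the slot size for everybody.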

The new cluster blew away the old one. We went from older Xeons to 6-core Nehalems, and from local disks to 48 disks of EqualLogic storage. The reservations were no longer needed.

Lessons:
1. Be careful with reservations; they can impact your failover capacity.
2. Reservations set on the machine will follow it to a new host.
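
Since reservations travel with the VM, it is worth listing them before a migration instead of discovering them afterwards. A minimal sketch, assuming you have already exported the inventory into a list of dictionaries (the field names and the sample VMs here are hypothetical, not something vCenter produces in this shape):

```python
def vms_with_reservations(inventory):
    """Return the VMs that carry a CPU or memory reservation, largest memory first."""
    flagged = [
        vm for vm in inventory
        if vm.get("cpu_reservation_mhz", 0) > 0 or vm.get("mem_reservation_mb", 0) > 0
    ]
    return sorted(
        flagged,
        key=lambda vm: (vm.get("mem_reservation_mb", 0), vm.get("cpu_reservation_mhz", 0)),
        reverse=True,
    )


# Hypothetical export of the VMs about to be moved.
inventory = [
    {"name": "web01", "cpu_reservation_mhz": 0, "mem_reservation_mb": 0},
    {"name": "critical-db", "cpu_reservation_mhz": 4000, "mem_reservation_mb": 8192},
    {"name": "file01", "cpu_reservation_mhz": 500, "mem_reservation_mb": 0},
]

for vm in vms_with_reservations(inventory):
    print(vm["name"], vm["cpu_reservation_mhz"], vm["mem_reservation_mb"])
```

Anything this prints is a candidate for a conversation with whoever set the reservation before the VM lands on the new cluster.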

8 thoughts on “All out of HA Slots”

    1. I’m glad though, because that 8GB reservation (on several servers) might have bitten him in the backside after I was long gone. So it was a good chance for me to teach him.

  1. I’m with Duncan on this one. The funny thing is that this “slots” concept has been around since 3.x.x, but VMware did absolutely nothing to educate people about it. I mean, back in those days it wasn’t in any courseware… So you have my sympathies, because personally I think we dropped the ball on this one by not educating the end-user base correctly.

    I find customers find the whole slot concept very alienating. Having explained it a number of times, I’ve yet to meet a customer who says “gee, that’s so intuitive and logical.” However, comparing a cluster to a resource pool, with the % as a “reservation” not for performance but for spare failover capacity, goes down a treat…
