Leadership in the Cloud (And everywhere else)

This is really a post about leadership in general, but I like to apply it to our industry. I am totally cool if you take these concepts and apply them elsewhere.

In any work environment there is constant posturing, politicking, and conflict that has nothing to do with the actual purpose of the workplace. I am going to offer a few leadership tips for everyone, not just for managers, VPs and directors. Tips that we can all put to use.

1. It is not all about you. We all know that “guy” (or girl), using every opportunity to push others down and himself up. Climbing on others’ backs never lasts. Being the MVP of a losing team is never my goal; make everyone around you better. The skills involved in doing that will take you further than your daily task knowledge. No one ever says, “Wow, Jon sure can deploy a sweet VM.” If you are known for adding value, contributing and making everyone better, what you do will last. Valuing your team as something more than tools to make you look good is a good start.

2. Have a Purpose/Mission. I am here to change the world. Personally and Professionally. I have done jobs and have volunteered with people and organizations where no one knows why they do what they do. If you are making Pizza, make life changing pizza. If you are building next-gen datacenters, do it in a way that will alter life for someone.

3. Lead, Even if you aren’t supposed to. Don’t sit around and wait to be asked to do something leadershippy.

4. Have a Strategy. If you don’t know why you do what you do, get that first. Then decide how the world will look when you are done. Impact (well, good impact) on people will not happen by accident.

5. If you see a problem be part of the Solution. Stop complaining. There is only so much time in the day. Personally, it is natural for me to complain. I am very good at pointing out faults in everything. I have to consciously make the decision to work on the solutions for things I can change and shut up about the other stuff (for now). Some things just need the proper timing.

6. Community. Jump into the deep end of the pool of community. Make this a core tenet of everything you participate in. You cannot do it all by yourself. Community substitutes like Twitter and Facebook are a start, but go meet in person with some real people. Just an idea.

The most cynical of my readers never started reading this. If you got this far, I hope in your mind you see how this applies to you. Of course any comments are welcome.

Some Reality for us Infrastructure Peeps or Apps are cool too

Don’t you just love double titles?

For many years I have been an infrastructure guy. I really liked how the cables and processors and memory and blinking lights worked. Applications were often the necessary evil tolerated so that I could play with cool technology. During my own journey toward learning about the cloud it has become increasingly important to consider the function of the application. Six-years-ago me would totally punch me in the face right now. Traitor. :)

1 – Don’t get your App messed up in my resource buckets of awesomeness

 

So the reality check to the Infrastructure geek in me is this: The application teams really think of what you do as the network. That is why when anything is ever wrong it is always “the network’s” fault. What we love to do is getting abstracted more and more. I will still contend that it is very important and very hard to do. Whether you are building reference architectures or deploying a converged infrastructure appliance, almost no one but us cares. They just want the data to do their jobs. So while we have really great discussions about speeds and feeds, the guy in the picture below just wants the app. From the hypervisor down we need to design with the application in mind, or we risk becoming like that goth dude locked in the server room on The IT Crowd.

 

2 Honey badger don’t care about FCoE

My next post will get into what I have been researching regarding what is out there and hopefully help us (infra. peeps) understand our App/Dev brothers better.

You are probably an Infrastructure person if:

  1. You read this blog.
  2. You work mainly with Virtualization
  3. You are a Storage Admin
  4. You are a Network Admin
  5. You like to make fun of DBA’s

 

No clever title – ESXCLI

I have been missing in action for a few weeks. It is time to catch up for all the lost time. One topic I feel many people don’t know too much about is esxcli. I know how to do what I usually do with esxcli. There is a lot more there for us to explore.

First, stop and take a look at the virtuallyGhetto article.

It can be run from the Service Console, the ESXi Tech Support Mode command line, or from the vMA. As William points out, if you are running these commands from the vMA you need to authenticate individually to each host. He goes on to list some articles that go over the most common use case of esxcli, swiscsi.

A couple of quick examples I like to use:

esxcli nmp device setpolicy --device naa.6090a07800c2ea66b8c114050000c00d --psp VMW_PSP_RR

This command changes the policy for a storage device to another path selection policy. In this case it is Round Robin. This is great for when you are rebuilding ESX and the storage is already zoned. ESX will add the storage with the default PSP and changing a few dozen datastores on each host one at a time via the GUI can be VERY tedious.

Then how do I change the default PSP?

esxcli nmp satp setdefaultpsp --psp VMW_PSP_RR --satp VMW_SATP_DEFAULT_AA

This can be modified for different array types after the “--satp” flag or different path policies after the “--psp” flag.
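For the per-device case, a small loop saves the GUI clicking. This is only a sketch: the function name build_rr_commands is mine, and it assumes the output of “esxcli nmp device list” puts each device ID at the start of a line beginning with “naa.”, like the ID in the example above. It only prints the commands so you can review them first.

```shell
# Dry-run helper: reads `esxcli nmp device list` output on stdin and prints
# one setpolicy command per device. Assumption: device IDs start at column 0
# with "naa."; indented detail lines are ignored.
build_rr_commands() {
    psp="${1:-VMW_PSP_RR}"   # path selection policy, Round Robin by default
    grep '^naa\.' | while read -r device _rest; do
        echo "esxcli nmp device setpolicy --device $device --psp $psp"
    done
}

# On a host (review the printed commands before piping them to sh):
#   esxcli nmp device list | build_rr_commands
#   esxcli nmp device list | build_rr_commands | sh
```

Printing first and executing second is deliberate. Local disks or devices you did not mean to touch show up in the list before anything changes.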

For the VCAP-DCA4 exam I am studying for, I wonder how much deeper than this they will go. I would think most Data Center Administrators need to set up swiscsi settings and possibly change path policies. Anything I am missing? If you check out Duncan’s article here, it will be great to know how to list what is available.

Equallogic, VAAI and the Fear of Queues

Previously I posted on how using bigger VMFS volumes helps Equallogic reduce their scalability issues when it comes to total iSCSI connections. There was a comment asking whether this means we can have a new best practice for VMFS size. I quickly said, “Yeah, make em big or go home.” I didn’t really say that, but something like it. Since then the commenter responded with a long reply from Equallogic saying VAAI only fixes SCSI locks and all the other issues with bigger datastores still remain. ALL the other issues being “Queue Depth.”

Here is my order of potential IO problems with VMware on Equallogic:

  1. Being spindle bound. You have an awesome virtualized array that will send IO to every disk in the pool or group. Unlike some others you can take advantage of a lot of spindles. Even then, depending on the types of disks some IO workloads are going to use up all your potential IO.
    Solution(s): More spindles is always a good solution if you have unlimited budget. Not always practical. Put some planning into your deployment. Don’t just buy 17TB of SATA. Get some faster disk and break your Group into pools and separate the workloads into something better suited to the IO needs.
  2. Connection Limits. The next problem you will run into if you are not having IO problems is the total iSCSI connections. In an attempt to get all of the IO you can from your array you have multiple vmk ports using MPIO. This multiplies the connections very quickly. When you reach the limit, connections drop and bad things happen.
    Solution: The new 5.02 firmware increases the total maximum connections. Additionally, bigger datastores mean fewer connections. Do the math.
  3. Queue Depth. There are queues everywhere: the SAN ports have queues, each LUN has a queue, and the HBA has a queue. I would defer to this article by Frank Denneman (a much smarter guy than myself), which argues that balanced storage design is the best course of action.
    Solution(s): Refer to problem 1. Properly designed storage is going to give you the best solution for any potential (even though unlikely) queue problems. In your great storage design, make room for monitoring. Equallogic gives you SANHQ. USE IT!!! See how your front-end queues are doing on all your ports. Use ESXTOP or RESXTOP to see how the queues look on the ESX host. Most of us will find that queues are not a problem when problem one is properly taken care of. If you still have a queuing problem, then go ahead and make a new datastore. I would also request that Equallogic (and others) release a Path Selection Policy plugin that uses a Least Queue Depth algorithm (or something smarter). That would help a lot.
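To “do the math” on connections from problem 2, here is a rough sketch. It assumes every host opens one iSCSI session per vmk port per volume, which is a simplification on my part, and the host, port and datastore counts are made up. Check the real session counts in SANHQ before trusting any of it.

```shell
# Rough estimate of total iSCSI connections to the group.
# Assumption: each host opens one session per vmk port per volume.
iscsi_connections() {
    hosts=$1; vmk_ports=$2; volumes=$3
    echo $(( hosts * vmk_ports * volumes ))
}

# Hypothetical environment: 8 hosts with 4 vmk ports each.
echo "20 small datastores: $(iscsi_connections 8 4 20) connections"
echo "5 big datastores:    $(iscsi_connections 8 4 5) connections"
```

Same hosts, same capacity, a quarter of the connections. That is the whole argument for bigger datastores on the connection side.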

So I will repeat my earlier statement that VAAI allows you to make bigger datastores and house more VMs per store. I will add a caveat: if you have a particular application that needs a high IO workload, give it its own datastore.

Update Manager Problem after 4.1 Upgrade

A quick note to hopefully publicize a problem I had which I see is discussed in the VMware Community Forums already.

After building a new vCenter Server and upgrading the vSphere 4.0 databases for vCenter and Update Manager, I noticed I could not scan hosts that were upgraded to 4.1. To be fair, by upgraded I mean rebuilt with a fresh install but with the exact same names and IP addresses. It seems that the process I took to upgrade has some kind of weird effect on the Update Manager database. The scans fail almost immediately. I searched around the internet and found a couple of posts on the VMware Forums about the subject. One person was able to fix the problem by removing Update Manager and, when reinstalling, selecting the option to install a new database. I figured I didn’t have anything important in my UM database, so I gave it a try and it worked like a champ.

Right now there are not any new patches for vSphere 4.1, but I have some Extension packages that need to be installed (Xsigo HCA Drivers). I wanted to note that I like the ability to upload extensions directly into Update Manager. For tracking and change control purposes, this is a much cleaner process than loading the patches via the vMA.

Finding the Fusion OVFTool

The OVFtool is something I wished VMware Fusion had a while back, and I finally got a chance to use it the other day. I checked Google and found that it was located at:

/Library/Application Support/VMware Fusion/ovftool

As I looked for that path I was surprised it was not there. I upgraded from Fusion 2 to 3 to 3.1 and never recalled a chance or a place to add the OVFtool to my install. I could not find an independent download for the Mac OVFtool. I ended up re-installing the newest version of Fusion and I had to click “Advanced” during the install and turn on the OVFtool to install. Not sure if that is the best way, but that is how I got it to work. 🙂

Now that the path exists I was able to convert the OVF Appliance to be used on my Mac.

ovftool --help reveals a ton of options. To do a basic conversion though try this:


$ mkdir "/Users/username/Documents/Virtual Machines/ApplianceName"
$ "/Library/Application Support/VMware Fusion/ovftool/ovftool" ./Appliance.ovf "/Users/username/Documents/Virtual Machines/ApplianceName"

This will expand and convert the VM to be used with Fusion. The quotes matter because both paths contain spaces. Now just open the VM in Fusion and play away.

Operational Readiness

One thing I am thinking about due to the VCDX application is operational readiness. What does it mean to pronounce this project or solution good-to-go? In my world it would be to test that each feature does exactly what it should be doing. Most commonly this will be failover testing, but it could reach into any feature or be as big as a DR plan that involves much more than the technical parts doing what they should. Some things I think need to be checked:

Resources

Are the CPU, Memory, Network and Storage doing what they should be? Some load generating programs like IOmeter can be fine to test network and storage performance. CPU busy programs can verify Resource Pools and DRS are behaving the way they should.

Failover

You have redundant links, right? Start pulling cables. Do the links fail over for Virtual Machines, Service Console, and iSCSI? How about the redundancy of the physical network? Even more cables to pull! Also test that the storage controllers fail over correctly. I will also make sure HA does what it is supposed to: instantly power off a host and make sure some test virtual machines start up somewhere else on the cluster.

Virtual Center Operations

Deploying new virtual machines, host and storage VMotion, deploying from a template, and cloning a VM are all things we need to make sure are working. If this is a big enough deployment, make sure the customer can use the deployment appliance if you are making use of one. Make sure the alarms send traps and emails too.

Storage Operations

Create new LUNs, test replication, test storage alarms and make sure the customer understands thin provisioning if it is in use. Make sure you are getting IO as designed from the storage side. Make use of the SAN tools to be sure the storage is doing what it should.

Applications

You can verify that each application is working as intended within the virtual environment.

There must be something I am missing but the point is trying to test out everything so you can tell that this virtualization solution is ready to be used.

My Fun with the VMware Enterprise Administration and Design Exams

Sorry I have been missing for a few weeks. I know many were quite worried about why I hadn’t blogged (not really).

Back in February I sat for the Enterprise Administration Exam at PEX in Las Vegas. It was scheduled the day after the Super Bowl; what a bunch of distractions. Thankfully I passed, and I want to share my experience without violating any rules or anything I agreed to. This was a technical test. A lot of settings and configurations and information like that. Still multiple choice, so at least you know the right answer is on the screen (hopefully; I did have one where I thought none of these are right). The lab section was actually as fun as test taking could be. I wish there were more lab practical type things when it comes to these kinds of tests. Overall there are more intricate settings and config questions than you will find on the VCP exam.

At the end of April I took the Design Exam. This was a much different experience. I had an extremely hard time finding a study list of things that would help. Know the Exam Blueprint is all I would say. Also, this I think is where VMware can start finding out who does Architecture work and who may be an Administrator. I would say you could read every PDF on VMware.com and still not know how to pass this test unless you have worked with the solutions multiple times. The design drawing was a challenge. I wasted too much time reading the requirements document and ran out of time, but I feel I was able to get a good portion of what I needed up on the page. Technically the interface was kind of quirky.

I felt both exams were challenging but fair to the Exam Blueprints. Nothing on there made me scream, “they didn’t say they would test on THAT!” The design exam needs some technical improvement (matching questions were buggy).

Now begins the harder and more involved process: the Design submission and hopefully an invitation to a defense.

VMware View – User Profile Options

All the technology and gadgets for managing desktops are worthless if your users complain about their experience with the desktop. Something I learned administering Citrix Presentation Server. Differing methods exist to keep the technical presentation of the desktop usable, for example the mouse being in sync and the right pixels show the right colors. What is also included in the user experience is a consistent environment where their personal data and settings are where they should be. Here are a few methods for managing those bits when using VMware View.

Mandatory Profiles
This profile is kept on a central file share. The profile is copied to the machine on login, and when the user logs out the changes are not kept. Great way to keep a consistent profile on kiosk type and data entry desktops. Where customization is not needed and most likely not wanted, mandatory profiles are worth exploring. The main change is that you set up the profile just like you want it, then rename NTUSER.DAT to NTUSER.MAN. A lot exists on the internet about setting up mandatory profiles.

Local Profiles
If you go through life never changing a thing in your Windows environment, you are using a Local Profile. Not to say you don’t change settings, save files or customize your background. You just have Windows running as the default. This is an option I will usually discourage because it is hard to back up data that is often kept in the local profile. VMware View will redirect user data to a User Data Disk (or whatever it is called today) on Persistent Desktop Pools. This is a good way to get the data on another VMDK, but it introduces problems when looking at data recovery. There are solutions; it is just something you will need to remember to look into.

Roaming Profiles
Roaming profiles are a great way to redirect current profiles to a central location. In theory this works great. In a View environment you can keep a local copy of the profile on a user’s desktop and the changes are copied back and forth. I have often seen this work just great. Then from time to time the profile will become corrupt; many times it does not unload correctly when users disconnect or log out. Then you may have to pick through folders trying to find their “My Documents”. This is why I would suggest using this with Group Policy and Folder redirection, which I will cover next.

Redirecting Folders
You may end up using a folder redirection group policy. This will move folders like the Desktop and My Documents for a user to a file server. This slims down the roaming profile as those locations are redirected to another location outside of the profile. This data is not copied from the machine to the server over and over. More information here.

Other Options
Immidio Flex Profiles
I really liked this option. It was a way to combine mandatory profiles and a Roaming profile. This program would run some scripts on logon and logoff to save files and settings. A really great paper on how to use it can be found here. Just like any great program that takes a new way to solve an annoying old problem, this is now not free.

RTO Virtual Profiles
I have never implemented this solution before. I have used it as part of a few training labs. I liked the feel. Now that VMware has purchased this software from RTO, the website redirects to a transition page. So I am looking for a way to test it in the lab, hoping the next set of bits of View includes RTO. Check this FAQ out for more information.

Maybe once it is built into View this will no longer be a serious issue. Profiles will be one of those things we tell stories to young padawan VM admins about, “We used to have to fight profiles, they were big and slow, and sometimes they would disappear!” Until that day…

VMware View and Xsigo

*Disclaimer – I work for a Xsigo and VMware partner.

I was in the VMware View Design and Best practices class a couple weeks ago. Much of the class is built on the VMware View Reference Architecture. The picture below is from that PDF.

It really struck me how many IO connections (Network or Storage) it would take to run this POD. Minimum (in my opinion) would be 6 cables per host; with ten 8-host clusters that is 480 cables! Let’s say that 160 of those are 4 Gb Fibre Channel and the other 320 are 1 Gb Ethernet. That is 640 Gb for storage and 320 Gb for network.

Xsigo currently uses 20 Gb InfiniBand and best practice would be to use 2 cards per server. The same 80 servers in the above cluster would have 3200 Gb of bandwidth available. Add in the flexibility and ease of management you get using virtual IO. Between the cost savings on the director-class Fibre Channel switches and datacenter switches you no longer need, I would think the ROI pays for the Xsigo Directors. I don’t deal with pricing, so this is pure contemplation. So I will stick with the technical benefits. Being in the datacenter, I like any solution that makes provisioning servers easier, takes less cabling, and gives me unbelievable bandwidth.

So just in the way VMware changed the way we think about the datacenter, Virtual IO will once again change how we deal with our deployments.
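If you want to check my cabling and bandwidth numbers, the arithmetic is simple enough to sketch in a few lines of shell. All figures are the ones from this post. The one thing I am assuming is a single cable per Xsigo card.

```shell
# Traditional design: ten 8-host clusters, 6 cables per host.
clusters=10
hosts_per_cluster=8
cables_per_host=6
hosts=$(( clusters * hosts_per_cluster ))
cables=$(( hosts * cables_per_host ))

fc_gb=$(( 160 * 4 ))     # 160 Fibre Channel links at 4 Gb each
eth_gb=$(( 320 * 1 ))    # 320 Ethernet links at 1 Gb each
legacy_gb=$(( fc_gb + eth_gb ))

# Xsigo design: 2 cards per server at 20 Gb each.
# Assumption: one cable per card.
xsigo_cables=$(( hosts * 2 ))
xsigo_gb=$(( xsigo_cables * 20 ))

echo "Traditional: $cables cables, $legacy_gb Gb total"
echo "Xsigo:       $xsigo_cables cables, $xsigo_gb Gb total"
```

A third of the cables and more than three times the raw bandwidth, before you even get to the management side.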