TL;DR – Move Kubernetes volumes from legacy storage to Pure Storage.
So you have an amazing new Pure Storage array in the datacenter or in public cloud. The Container Storage Interface doesn’t provide a built in way to migrate data between backend devices. I previously blogged about a few ways to clone and migrate data between clusters but the data has to already be located on a Pure FlashArray.
Lately, Pure has been working with a new partner, Kasten, and more is yet to come. Check out this demo (just 5:30) to see how easy it is to move PVCs while maintaining the configuration of the rest of the k8s application.
This demo used EKS in AWS for the Kubernetes cluster.
Application initially installed using a PVC for MySQL on EBS.
Kasten is used to backup the entire state of the app with the PVC to S3. This target could be a FlashBlade in your datacenter.
The application is restored to the same namespace, but a Kasten Transform is used to convert the PVC to the “pure-block” StorageClass (a sketch of the resulting PVC spec follows this list).
Application is live and using PSO for the storage on Cloud Block Store.
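To illustrate what the Transform accomplishes, here is a minimal sketch of what the restored PVC might look like once it points at the PSO StorageClass. The PVC name and namespace are assumptions; the 10Gi size matches the volume discussed below.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data          # hypothetical PVC name
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi         # matches the 10GB volume discussed below
  storageClassName: pure-block   # was "gp2" (EBS) before the Kasten Transform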
Why
Like the book says, “End with why.” OK, maybe it doesn’t actually say that. Let’s answer the “why should I do this?” question.
First: Why move from EBS to CBS? This PVC is 10GB on EBS. At this point in time it consumes about 30MB. How much does AWS bill for on the 10GB EBS volume? The full 10GB. On Cloud Block Store this data is reduced (compressed and deduped) and thin provisioned. How much does it consume on the CBS? 3MB in this case. Does this make sense for 1 or 2 volumes? Nope. But if your CIO has stated “move it all to the cloud!”, this can be a significant savings on overall storage cost.
Second: Why move from (some other thing) to Pure? I am biased toward PSO for Kubernetes, so I will start there and then give a few bullets on why Pure, but this isn’t the sales pitch blog. Pure Service Orchestrator gives you a simple, single-line install so you can begin getting storage on demand for your container clusters. One customer says, “It just works, we kind of forget it is there,” and another commented, “I want 100GB of storage for my app, and everything else is automated for me.”
Why Pure?
Efficiency – Get more out of all-flash; higher dedupe with no performance penalty does matter.
Availability – 6×9’s uptime measured across our customer base, not an array in a validation lab. Actual customers love us.
TL;DR – EBS Volumes fail to mount when multipathd is installed on EKS worker nodes.
AWS Elastic Kubernetes Service is a great way to dive in with managed Kubernetes in the cloud. Pure Service Orchestrator integrates EKS worker nodes with Cloud Block Store on AWS. I created this Ansible playbook to make sure the right packages and services are started on my worker nodes.
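The playbook isn’t reproduced in full here, but a minimal sketch of what it does looks like this. The package list is an assumption based on typical PSO prerequisites for Amazon Linux 2 workers, and the multipath.conf it copies is shown further down.

---
- hosts: all
  become: true
  tasks:
    - name: Install multipath, iSCSI and NFS packages
      package:
        name:
          - device-mapper-multipath
          - iscsi-initiator-utils
          - nfs-utils
        state: present

    - name: Copy multipath.conf to each worker node
      copy:
        src: multipath.conf
        dest: /etc/multipath.conf

    - name: Enable and restart multipathd
      service:
        name: multipathd
        state: restarted
        enabled: true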
In my previous testing with PSO and EKS I was basically focused on using PSO only. Recently, the use case of migrating from EBS to CBS has proven to be pretty valuable to our customers in the cloud. To create the demo I used an app I often use for demoing PSO: two web server containers attached to a MySQL container with a persistent volume. Very easy. I noticed, though, that the built-in gp2 StorageClass started behaving very oddly after I installed PSO. I installed the AWS EBS CSI driver. Same thing. It could not mount or snapshot volumes in EBS. PSO volumes on CBS worked just fine. I figure most customers don’t want me to break EBS.
After digging around the internet and random old GitHub issues, no one seemed to be hitting exactly the same problem; people were reporting issues with maybe one of the four symptoms I was seeing. I decided to test at which point in my process things broke, and it was after I enabled the device-mapper-multipath package. So it wasn’t PSO so much as a very important prerequisite of PSO causing the issue. What it came down to is that the EBS volumes were getting grabbed by multipathd, and the StorageClass didn’t know how to handle the different device names. So I had to find a way to use multipathd for just the Pure volumes. The right settings in multipath.conf solved this. This is what I used as an example:
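A minimal sketch, assuming the standard PURE vendor and FlashArray product strings:

blacklist {
    devnode "*"
}
blacklist_exceptions {
    device {
        vendor "PURE"
        product "FlashArray"
    }
}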
I am telling multipathd to ignore everything BUT Pure. This solved my issue. So I saved this into the local directory and added a section to the ansible playbook to copy that file to each worker node in EKS.
1. Copy the ansible playbook above to a file prereqs.yaml.
2. Copy the multipath blacklist settings above to multipath.conf and save it in the same directory as prereqs.yaml.
3. Run the ansible playbook as shown below (make sure inventory.ini has the IPs of each node and you have the SSH key to log in to each worker node).
# Make sure inventory.ini has the ssh IP's of each node.
# prereqs.yaml includes the content from above
ansible-playbook -i inventory.ini -b -v prereqs.yaml -u ec2-user
This will install the packages, copy multipath.conf to /etc and restart the services to make sure they pick up the new config.
In this post, I’m going to discuss how to load balance your storage provisioning across a fleet of Pure Storage FlashArrays.
As an integral part of Pure’s Kubernetes integration, Pure Service Orchestrator has the ability to load balance across a fleet of Pure storage devices. This is great for your containerized space, but I wondered how you could do something similar for arrays in a non-containerized environment; for example, a vSphere environment where multiple Pure Storage FlashArrays are available to a vCenter and, when creating a new datastore, you want to place it on the least-full array.
What I wanted to do was orchestrate which storage array a volume was provisioned on automatically without the storage administrator having to keep checking which array to use. Now, when I think of automation and orchestration I immediately think of Ansible.
Pure Storage has an amazing suite of modules for Ansible that we can leverage to do this work, so I created an Ansible load-balancing role called, logically enough, lb.
The role takes a list of arrays, interrogates them, works out which has the least used capacity, and then provides the information required for the rest of the playbook to provision against that array.
So where can you find this role and how do you use it?
The role can be found on the Pure Storage OpenConnect GitHub account in the ansible-playbook-examples repository, under the flasharray/roles directory.
To use it, you populate the variables file roles/lb/vars/main.yml with the management IP addresses of your fleet of arrays, together with an API token for a user with storage admin privileges on each array. I guess there is no limit to the number of arrays you can load balance over, but the example below is for a fleet of six FlashArrays.
The populated file would look something like this (use your own array credentials):
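Since the exact variable names are defined inside the role, treat this as a hypothetical illustration of the shape and check the repository for the real ones; the IPs and tokens are placeholders:

---
# Hypothetical layout - the role in the repository defines the real variable names
arrays:
  - url: 10.21.200.11
    api_token: <array1-api-token>
  - url: 10.21.200.12
    api_token: <array2-api-token>
  - url: 10.21.200.13
    api_token: <array3-api-token>
  - url: 10.21.200.14
    api_token: <array4-api-token>
  - url: 10.21.200.15
    api_token: <array5-api-token>
  - url: 10.21.200.16
    api_token: <array6-api-token>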
To use the role just add it into your Ansible playbook.
If you are security-minded, then all of these URL and API token entries can be encrypted using Ansible Vault. I wrote another blog post that includes details on how to implement Vault for these variables.
When the role has run, two variables will have been defined: use_url and use_api. These identify the array with the lowest utilization level and therefore the one you should be provisioning to. There is also an additional variable (use_name) that identifies the actual name of the selected array.
A super simple playbook that uses the lb role and then provisions a single volume to the least full array is shown here:
- name: Pure Storage load balancing example
  hosts: localhost
  gather_facts: no
  vars:
    array_usage: []   # Do not remove - required by the role
  roles:
    - role: lb
  tasks:
    - name: Provisioning to {{ use_name }}
      purefa_volume:
        fa_url: "{{ use_url }}"
        api_token: "{{ use_api }}"
        name: lb_test
        size: 50G
I hope this short post and this role prove useful, and if you have any of your own roles or playbooks for Pure Storage devices that you think would be useful to other users, please feel free to contribute them to the ansible-playbook-examples GitHub repository.
How to Upgrade your PSO FlexDriver deployment to the latest CSI-based driver
Over the past few months, the Kubernetes FlexDriver codebase has been deprecated and there is a solid shift towards using CSI-based drivers for providing Persistent Volumes to Kubernetes environments.
I’m not going to address the reasons behind that shift here, but suffice to say that all the major storage providers are now using the CSI specification for their persistent storage drivers in Kubernetes.
This is great, but what about those early adopters who installed FlexDriver based drivers?
It’s not the easiest thing to migrate the control of a persistent volume from one driver to another, in fact, it is practically impossible unless you are a Pure Storage customer and are using PSO.
With the latest release of PSO, i.e. 5.2.0, there is now a way to migrate your PSO FlexDriver-created volumes under the control of the PSO CSI driver.
It’s still not simple, it’s a little time-consuming, and you do need an outage for your application, but it is possible.
Simply (sic), these are the steps you need to undertake to perform your migration:
Scale down your applications so that no pods are using the FlexDriver managed PVCs and PVs.
Uninstall your FlexDriver – don’t worry, all your PVs and PVCs will remain and the applications using them won’t notice.
Install the CSI based driver – now all new PVs will be managed by this new driver.
Identify your PVs that were created by the FlexDriver.
Patch the PV definition to ensure it doesn’t get automatically deleted by Kubernetes (see the example patch after this list).
Delete the PVC and then the PV – sounds scary, but the previous patch means that the underlying volume on the backend storage is retained.
Import the storage volume back into Kubernetes and under the CSI drivers control – this is where you need PSO v5.2.0 or higher…
Scale back up your applications.
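For the patch step, a command along these lines keeps Kubernetes from removing the backing volume when the PV object is deleted (the PV name is a placeholder; the PSO documentation has the exact procedure):

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

With the reclaim policy set to Retain, the later delete removes only the Kubernetes objects, not the volume on the array.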
Well that was easy, wasn’t it?
More details on exactly how to perform the steps above are detailed in the PSO GitHub repository documentation.
Now, you may feel a little paranoid about these deletion commands you are running against your precious data, so as a “belt and braces” type activity, you could always make a clone or a snapshot of your underlying storage volumes on your array before you do step 6. But remember to delete these clones when you have completed the migration.
This morning I needed to upgrade one of my dev clusters to 1.17.4. I decided to capture the experience. Don’t worry, I sped up the video of the Ansible output flying by.
I use Kubespray to deploy and upgrade my clusters. I didn’t do anything really to prepare. All of my clusters can be rebuilt pretty easily from Terraform if anything breaks.
git clone git@github.com:kubernetes-sigs/kubespray.git
cd kubespray
## Make sure you copy your actual inventory. For more information see the kubespray github repo
ansible-playbook -i inventory/dev/inventory.ini -b -v upgrade-cluster.yaml
Watch it go for about 40 minutes in my case. Remember, this is a dev cluster and the pods I have running can restart all they want; I don’t care. Everything upgrades through the first part of the video. Now let’s upgrade Pure Service Orchestrator.
Now, if you watch the video, you will notice I had to add the Pure Storage Helm repo. This was a new jump box in the lab, so I had PSO installed, just not from this actual host. It is easy to add; more details are in the Pure Helm Chart README.
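For reference, adding the repo and upgrading the chart comes down to something like this; the release name, namespace, and chart name here are assumptions from my environment, so match them to your own install:

helm repo add pure https://purestorage.github.io/helm-charts
helm repo update
helm upgrade pure-storage-driver pure/pure-csi --namespace pso -f values.yaml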
Since the Mitaka release of OpenStack, the Pure Storage Cinder driver has supported Cinder replication, although this first iteration only supported asynchronous replication.
The Rocky release of OpenStack saw Pure’s Cinder driver support synchronous replication by integrating our ActiveCluster feature from the FlashArray.
This synchronous replication automatically created an ActiveCluster pod on the paired FlashArrays called cinder-pod. A pretty obvious name I would say.
While this provided a seamless integration for OpenStack users to create a synchronously replicated volume using a correctly configured volume type, there was one small limitation. ActiveCluster pods were limited to 3000 volumes.
Now you might think that is more than enough volumes for any single ActiveCluster environment. I certainly did until I received a request to be able to support 6000 volumes synchronously replicated.
After some scratching of my head, I remembered that from the OpenStack Stein release of the Pure Cinder driver there is an undocumented (well, not very well documented) parameter that allows the name of the ActiveCluster pod to be customizable and that gave me an idea….
Can you configure Cinder to use the same backend array in separate stanzas of the Cinder config file, each with different parameters?
It turns out the answer is Yes.
So, here’s how to enable your Pure FlashArray Cinder driver to use a single ActiveCluster pair of FlashArrays to allow for 6000 synchronously replicated volumes.
First, we need to edit the cinder.conf file and create two different stanzas for the same array that is configured in an ActiveCluster pair and ensure we have enabled both of these backends:
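Here is a sketch of what I mean; the management IPs, API tokens, and target array details are placeholders, and everything except pure_replication_pod_name matches a normal ActiveCluster backend definition:

[DEFAULT]
enabled_backends = pure-sync-1, pure-sync-2

[pure-sync-1]
volume_driver = cinder.volume.drivers.pure.PureISCSIDriver
volume_backend_name = pure
san_ip = <source array management IP>
pure_api_token = <source array API token>
replication_device = backend_id:<target array name>,san_ip:<target array management IP>,api_token:<target array API token>,type:sync
pure_replication_pod_name = cinder-pod1

[pure-sync-2]
volume_driver = cinder.volume.drivers.pure.PureISCSIDriver
volume_backend_name = pure
san_ip = <source array management IP>
pure_api_token = <source array API token>
replication_device = backend_id:<target array name>,san_ip:<target array management IP>,api_token:<target array API token>,type:sync
pure_replication_pod_name = cinder-pod2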
If we look at the two stanzas, the only difference is that the pure_replication_pod_name is different. I have also set the volume_backend_name to be the same for both configurations. There is a reason for this I will cover later.
After altering the configuration file, make sure to restart your Cinder Volume service to implement the changes.
After restarting the cinder-volume service, you will see on the FlashArray that two ActiveCluster pods now exist with the names defined in the configuration file.
This is the first step.
Now we need to enable volume types to use these pods, and also to load-balance across the two pods. Why load-balance? It just seems to make more sense to have volumes evenly utilize the pods, but there is no specific reason for doing this. If you wanted to use each pod separately, you would need to set a different volume_backend_name in the Cinder configuration file for each array stanza.
When creating a volume type to use synchronous replication you need to set some specific extra_specs in the type definition. These are the commands to use:
openstack volume type create pure-repl
openstack volume type set --property replication_type='<in> sync' pure-repl
openstack volume type set --property replication_enabled='<is> True' pure-repl
openstack volume type set --property volume_backend_name='pure' pure-repl
The final configuration of the volume type would now look something like this:
openstack volume type show pure-repl
+--------------------+-------------------------------------------------------------------------------------------+
| Field | Value |
+--------------------+-------------------------------------------------------------------------------------------+
| access_project_ids | None |
| description | None |
| id | 2b6fe658-5bbf-405c-a0b6-c9ac23801617 |
| is_public | True |
| name | pure-repl |
| properties | replication_enabled='<is> True', replication_type='<in> sync', volume_backend_name='pure' |
| qos_specs_id | None |
+--------------------+-------------------------------------------------------------------------------------------+
Now, all we need to do is use the volume type when creating our Cinder volumes.
Let’s create two volumes and see how they appear on the FlashArray:
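The creation commands would be something like this (the volume names and sizes are just examples):

openstack volume create --type pure-repl --size 10 repl-vol-1
openstack volume create --type pure-repl --size 10 repl-vol-2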
Looking at the FlashArray, we can see the two volumes we just created (I am filtering the volume name on cinder just so you only see the OpenStack-related volumes on this array).
The volume naming convention we use at Pure shows that these volumes are in a pod, due to the double colon (::) in the name, and the pod name for each volume is cinder-pod1 and cinder-pod2 respectively.
The view of each pod also shows only one volume in each.
If you didn’t want to load-balance across the pods and needed the flexibility to specify which pod a volume exists in, all you would need to do is set volume_backend_name to be different in each array stanza of the configuration file and then create two volume types, each pointing to a different volume_backend_name setting.
Please welcome Simon making a guest appearance to go through whatever it is this is about. 🙂 – Jon
Got to love those TLAs!!
To demystify the title of this blog, this will be about installing Pure Service Orchestrator (PSO) with Docker Kubernetes Service (DKS).
Specifically, I’ll be talking about PSO CSI driver v5.0.8, running with Docker EE 3.0 and the Universal Control Plane (UCP) 3.2.6, managing Kubernetes 1.14.8.
Let’s assume you have Docker Enterprise 3.0 installed on 3 Linux nodes, in my case running Ubuntu 18.04. You decide you want them all to run the Docker Kubernetes Service (DKS) and have any persistent storage provided by your Pure Storage FlashArray or FlashBlade – how do you go about installing and configuring all of this?
Prerequisites
As we are going to be using PSO with a Pure Storage array for the persistent storage, ensure that all nodes that will be part of DKS have the following software installed:
nfs-common
multipath-tools
Install UCP
The first step to getting your DKS environment up is to install the Docker Universal Control Plane (UCP) from the node you will be using as your master.
As PSO supports CSI snapshots, you will want to ensure that when installing UCP you tell it to open the Kubernetes feature gates, thereby enabling persistent volume snapshots through PSO.
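A minimal sketch of that install, assuming a single master node whose IP you pass as the host address (adjust the UCP image tag to your version):

docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  docker/ucp:3.2.6 install \
  --host-address <master node IP> \
  --interactive \
  --storage-expt-enabled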
If you don’t want to open the feature gates, don’t use the --storage-expt-enabled switch in the install command.
Answer the questions the installer asks, wait a few minutes, and voila, you have Docker UCP installed and can access it through its GUI at http://<host IP>. Note that you may be prompted to enter your Docker EE license key on the first login.
When complete you will have a basic, single-node environment consisting of Docker EE 3.0, UCP 3.2.6 and Kubernetes 1.14.8.
Add Nodes to Cluster
Once you have your master node up and running, you can add your two worker nodes to the cluster.
The first step is to ensure your default scheduler is Kubernetes, not Swarm. If you don’t set this, pods will not run on the worker nodes due to taints that are applied.
Navigate to your username in the left pane, select Admin Settings and then Scheduler. Set the default Orchestrator type to Kubernetes and save your change.
Now, to add nodes, navigate to Shared Resources, select Nodes and then Add Nodes. You will see something like this:
Use the command on each worker node to get them to join the Kubernetes cluster. When complete, your nodes should be correctly joined and look like this in your Nodes display.
You now have a fully functioning Kubernetes cluster managed by Docker UCP.
Get your client ready
Before you can install PSO you need to install a Docker Client Bundle onto the local node that will be used to communicate with your cluster. I use a Windows 10 laptop, but run the Ubuntu shell provided by Windows to do this.
To get the bundle, navigate to your user profile, select Client Bundles and then Generate Client Bundle from the dropdown menu. Unzip the tar file you get into your working directory.
Next, you need to get the correct kubectl version, which with UCP 3.2.6 is 1.14.8, by running the following commands:
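Something along these lines fetches that specific release; the URL is the standard upstream download location, so double-check it against the Kubernetes docs:

# curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.14.8/bin/linux/amd64/kubectl
# chmod +x kubectl
# mv kubectl /usr/local/bin/kubectl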
Check your installation by running the following commands:
# kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.8", GitCommit:"211047e9a1922595eaa3a1127ed365e9299a6c23", GitTreeState:"clean", BuildDate:"2019-10-15T12:11:03Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.8-docker-1", GitCommit:"8100f4dfe656d4a4e5573fe86375a5324771ec6b", GitTreeState:"clean", BuildDate:"2019-10-18T17:13:51Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
docker1 Ready master 24h v1.14.8-docker-1
docker2 Ready <none> 24h v1.14.8-docker-1
docker3 Ready <none> 24h v1.14.8-docker-1
Now we are nearly ready to install PSO, but PSO requires Helm, so now we install Helm3 (I’m using v3.1.2 here, but check for newer versions) and validate:
# wget https://get.helm.sh/helm-v3.1.2-linux-amd64.tar.gz
# tar -zxvf helm-v3.1.2-linux-amd64.tar.gz
# mv linux-amd64/helm /usr/bin/helm
# helm version
version.BuildInfo{Version:"v3.1.2", GitCommit:"d878d4d45863e42fd5cff6743294a11d28a9abce", GitTreeState:"clean", GoVersion:"go1.13.8"}
And finally…
We are ready to install PSO. Here we are just going to follow the instructions in the PSO GitHub repo, so check there for updates if you are reading this in the future…
# helm repo add pure https://purestorage.github.io/helm-charts
# helm repo update
The latest version at this time is 5.0.8, so we should get the values.yaml configuration file that matches this version…
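Assuming the pure-csi chart layout in that repository (verify the path against the repo itself), fetching the file and installing the chart looks something like this; the release name and namespace are my own choices, and values.yaml needs your array details filled in first:

# wget https://raw.githubusercontent.com/purestorage/helm-charts/5.0.8/pure-csi/values.yaml
# helm install pure-storage-driver pure/pure-csi --namespace pso -f values.yaml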
What does this look like in Docker UCP, you ask? Well, this is what you will see in various screens:
Now you can start using PSO to provide persistent storage to your containerized applications, and if you enabled the feature gates as suggested at the start of this blog, you can also take snapshots of your PVs and restore them to new volumes. For details on exactly how to do this, read this: https://github.com/purestorage/helm-charts/blob/5.0.8/docs/csi-snapshot-clones.md, but make sure you install the VolumeSnapshotClass first with this command:
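The command itself amounts to applying a VolumeSnapshotClass manifest roughly like the one below with kubectl create -f. The v1alpha1 API matches Kubernetes 1.14, and the snapshotter name assumes PSO’s CSI driver registers as pure-csi, so check the linked documentation for the exact manifest.

apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshotClass
metadata:
  name: pure-snapshotclass
snapshotter: pure-csi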
The version of Kubernetes provided in Docker UCP 3.2.6 does not support volume cloning, but future releases may enable this functionality – check the Docker UCP and Docker EE release notes.