Kubernetes on AWS with Cloud Block Store

Only a slight nudge at from @CodyHosterman to put this post together.

Kubernetes deployed into AWS is a method many organizations are using to get into using K8s. Whether you deploy K8s with Kubeadm, Kops, Kubespray, Rancher, WeaveWorks, OpenShift, etc the next big question is how do I do persistent volumes? While EBS has StorageClass integrations you may be interesting in getting better efficiency and reliability than traditional block in the cloud. That is one of the great uses of Cloud Block Store. Highly efficient and highly reliable storage built for AWS with the same experience as the on prem FlashArray. By utilizing Pure Service Orchestrator’s helm chart or operator you can now take advantage of Container Storage as a Service in the cloud. Are you using Kubernetes in AWS on EC2 and have questions about how to take advantage of Cloud Block Store? Please ask me here in the comments or @jon_2vcps on twitter.

  1. Persistent Volume Claims may will not always be 100% full. Cloud Block Store is Deduped, Compressed and Thin. Don’t pay for 100% of a TB if it is only 1% full. I do not want to be in the business of keeping developers from getting the resources they need, but I also do not want to be paying for when they over-estimate.
  2. Migrate data from on prem volumes such as K8s PVC, VMware vVols, Native physical volumes into the cloud and attach them to your Kubernetes environment. See the youtube demo below for an example. What we are seeing in the demo is creating an app in Kubernetes on prem, loading it with some data (photos), replicating that application to the AWS cloud and using Pure Service Orchestrator to attach the data to the K8s orchestrated application using Cloud Block Store. This is my re-working of Simon’s tech preview demo from the original launch of Cloud Block Store last November.

3. Simple. Make storage simple. One common tweet I see on twitter from the Kubernetes detractors is how complicated Kubernetes can be. Pure Service Orchestrator makes the storage layer amazingly simple. A single command line to install or upgrade. Pooling across multiple devices.

Get Started today:
Below I will include some links on the different installs of PSO. Now don’t let the choices scare you. Container Storage Interface or CSI is the newest API for common interaction with all storage providers. While flexvol was the original storage solution it makes sense to move forward with CSI. This is very true for newer versions of kubernetes that include CSI by default. So if you are starting to use K8s for the first time today or your cluster is K8s 1.11 we have you covered. Use the links below to see the install process and prerequisites for PSO.

FlexVol Driver:
Pure Service Orchestrator Helm Chart
Pure Service Orchestrator Operator

CSI Driver:
Pure Service Orchestrator CSI Helm
Pure Service Orchestrator CSI Operator

Talking Pure and K8s on the Virtually Speaking Podcast at #PureAccelerate

Migrate Persistent Data into PKS with Pure vVols

While I discussed in my VMworld session this week some of the architectural decisions to be made while deploying PKS on vSphere my demo revolved around once it is up and running how to move existing data into PKS.

First, using the Pure FlashArray and vVols we are able to automate that process and quickly move data from another k8s cluster into PKS. It is not limited to that but this is the use case I started with.

Part 1 of the demo shows taking the persistent data from a deployment on and cloning it over the vVol that is created by using the vSphere Cloud Provider with PKS. vVols are particularly important because they keep the data in a native format and make copy/replication and snapshotting much easier.

Part 2 is the same process just scripted using Python and Ansible.

Demo Part 1 – Manual process of migrating data into PKS

Demo Part 2 – Using Python and Ansible to migrate data into PKS

How to automate the Migration with some Python and Ansible

The code I used is available from code.purestorage.com. Which also links to the GitHub repo https://github.com/PureStorage-OpenConnect/k8s4vvols

They let me on a stage. Again. 🙂

Use PKS Enterprise on VMware SDDC and Pure Storage

Use PKS Enterprise on VMware SDDC and Pure Storage

Pivotal Container Services (PKS) provides a deeply integrated Kubernetes (k8s) architecture for the VMware SDDC. It is a joint engineering project from VMware and Pivotal. In my conversations with Pure Storage customers or potential customers around Kubernetes I often get asked about how Pure Storage can help a PKS Enterprise environment. The good news is there is a very easy path to utilizing k8s with Pure + VMware + PKS.

The Architecture

Using Pure with PKS is actually very straight forward. Since Pure FlashArray is already leading choice for all VMware environments it is not anything out of the ordinary to support PKS. 

Understanding the underlying technology that integrates PKS into VMware you may soon realize that highly reliable, stateless and shared storage is the best choice when deploying PKS. 

The choice between drivers (shown in the graphic above) to deliver the Storage is up to you. The vSphere Cloud Provider provides automated creation and management of the virtual disks presented to containers in PKS. This supports the use of vVols and enables great possibilities for your PKS environment.  Pure Service Orchestrator utilizes a direct connection to Pure Storage FlashArrays, FlashBlades and Cloud Block Stores. It is installed with a single Helm command or Kubernetes Operator. It includes Smart Provisioning in order to place volumes on the most optimal storage device in your fleet.

The choice of which tool will be dictated by your workload. It is not an exclusive choice either. It is easy to do both. After VMworld I hope to publish the details on how to install PSO on PKS. If you have really good github search foo you may be able to find the bosh deployment.

Highly Reliable

Pure Storage has measured 6×9’s of uptime across its customer base. Many storage solutions for container environments will require hours of planning and weeks of proper implementation to provide high availability. Do not spend time re-architecting your storage infrastructure for PKS. Spend your time delivering k8s to your customers so they can deliver innovation for your business.  Use the Pure Storage devices you already have. You may not even need a whole new dedicated array (don’t tell sales I said that). 

Stateless Arrays for Stateful Data

Migrating data should be eliminated from your daily tasks. As FlashArrays move further into the future where data always stays in place. The ability to keep the data in place for multiple hardware generations is a proven benefit of Pure. Migrating persistent storage in k8s even on VMware is a non-trivial task. Depending on your scale this could take weeks of planning and careful flawless execution to accomplish non-disruptively. The underlying hardware should not be a concern for delivering applications. Pure Storage has made this a reality since the FlashArray debut 7 years ago.

Shared Storage

Delivering highly reliable data across multiple PKS and vSphere clusters, allowing applications to failover if the compute in an availability zone becomes unavailable, is key to delivering a cloud experience for your k8s rollout. While the Pure sales teams would gladly help you acquire a FlashArray per vSphere cluster hosting PKS this is simply un-needed for nearly all situations. Especially as you start on your Kubernetes journey.

But Why PURE?

Simple; vVols on the FlashArray combined with the PKS integration with vSphere enables mobility of data and freedom unavailable on a legacy datastore. Have a group that rolled their own k8s? FlashArray can clone their persistent data instantly into PKS using vVols. Need to copy data from a bare metal (non-VM) k8s cluster to PKS? Pure vVols makes this possible. Have multiple k8s clusters within PKS today that require the same data for test/dev/prod Pure Storage enables this nearly instantly. Pure Storage FlashArray Snapshots and Clones move at the speed of an API call from any of our SDK’s from Python to Powershell to Ansible to Terraform and more to give you an easy way to fit Pure Storage into your Infrastructure as Code tools. 

You can probably spend the next 5 hours reading blogs and papers of all the other benefits of Pure Storage and they all apply to your PKS on vSphere environment but I wanted to provide a few examples directly related to operating PKS on Pure.

VMworld 2019 Session

In my session for VMworld in San Francisco I will demonstrate how Pure Storage is able to instantly migrate persistent volumes from “other” k8s clusters to PKS. Make sure you make it to this session if you considering PKS.

Storage Quotas in Kubernetes

One thing since we released Pure Service Orchestrator I get asked is, “How do we control how much developer/user can deploy?”

I played around with some of the settings from the K8s documentation for quotas and limits. I uploaded these into my gists on GitHub.

git clone git@gist.github.com:d0fba9495975c29896b98531b04badfd.git
#create the namespace as a cluster-admin
kubectl create -f dev-ns.yaml
#create the quota in that namespace
kubectl -n development create -f storage-quota.yaml
#or if you want to create CPU and Memory and other quotas too
kubectl -n development create -f quota.yaml

This allows users in that namespace to be limitted to a certain number of Persistent Volume Claims (PVC) and/or total requested storage. Both can be useful in scenarios where you don’t want someone to create 10,000 1Gi volumes on an array or create one giant 100Ti volume. 

Credit to dilbert.com When I searched for quotas on the internet this made me laugh. I work with salespeople a lot.

 

Creating a Helm Repo with Github

Next step in learning helm is being able to take an existing helm package and put it in your own repo.

There are ways to do this with github pages. I don’t really want mess withthat right now, how can I use a Github repo to host my changes to the deployment?

For installing helm and an additional demo please see part 1 of this series.

http://54.88.246.86/2018/03/27/getting-started-with-helm-for-k8s/

Continue reading “Creating a Helm Repo with Github”

Getting Started with Helm for K8s

Over the last few weeks I was setting up Kubernetes in the lab. One thing I quickly learned was managing and editing yaml files for deployments, services and persistent volume claims became confusing and hard. Even when I had things commited in github sometimes I would make edits then not push them then rebuild my K8s cluster.

The last straw was when 2 of our Pure developers said that editing yaml in vi wasn’t very cool and to start using helm.

Needless to say that was good advice. I still have to remember to push my repos to github. Now my demostration applications are more “cloud native”. I can create and edit them in one environment and use helm install in another and have it just work.

Continue reading “Getting Started with Helm for K8s”

Using Snapshots with the Pure Storage Plugin for Kubernetes

One request from customers is not only provision persistent storage for Kubernetes but also integrate into workflows that may need to snap and copy the data for different environments. Much like we do this with powershell or python for SQL and Oracle environments to accelerate development or QA. Pure has enabled snapshots using the Pure Provisioner as part of our Kubernetes Plugin.

In this demo I am showing how I can take a users data directory for JupyterHub and clone it for another user to take advantage of all the benefits of Pure’s snapshots and clones. You instantly get access to a copy of the dataset. The dataset doesn’t take up room on the backend storage. Only globally unique changes will grow the volume. In this use case the Data Science team will see increases in productivity as they are not waiting for data to download from the cloud or copy from another place on the array.

The command to run the snap using kubectl is below:

kubectl exec <pure provisioner pod name> -- snapshot create -n <namespace> <pvc-claim-name>

Kubernetes and the Pure Storage FlexVolume Plugin

First, if you are using Pure Storage and Kubernetes make life easier and take a look at our plugin. Now version 1.2.2 and GA.

https://hub.docker.com/r/purestorage/k8s/

Make sure the follow the directions on the page to pull and install the plugin. If you are using Openshift pay special attention to the Readme. I will post more on this in the near future.

Cockroach DB as our Persistent Database

I want to simulate a very easy database that I can easily use in a container. That is also not the same old. I built a Go app that will write to a database over and over to kind of demonstrate the inner workings of the plugin but not necessarily supply a performance test.

To learn more about the steps I use in the video to deploy and manage CRDB in K8s please check out this link. https://www.cockroachlabs.com/docs/stable/orchestrate-cockroachdb-with-kubernetes.html

With that said, please check out how to deploy and scale a database with a persistent data platform from a Pure FlashArray. Watch this in Full screen to make the CLI commands easier to see.

What you are seeing in the video:

  1. Deploy the initial 3 pods with volumes automatically created and connected on the Pure FA.
  2. Initialize the cluster.
  3. Fail a node and watch K8s redeploy a new container and re-attach the data volume.
  4. Run a load generation application as a K8s Job.
  5. Scale the DB cluster out to 8 nodes.

What is next?

This is a really easy and quick demo but it show the ease of using the Pure Plugin to manage the persistent data, making sure you do not lose data in the event of app crashes. Also easily scaling. This can all be done via policy and the deployment can be made even easier using Helm. In a future post we will see how we can take advantage of these methods and keep the same highly available, high performance and very easy to use persistent data platform for your application.

CockroachDB with Persistent Data

There IS an Official Whitepaper!

While I was writing this post the awesome Simon Dodsley was writing a great whitepaper on Persistent storage with Pure. As you can see there is some very different ways to deploy CockroachDB but the main goal is to keep your important data persistent no matter what happens to the containers as the scale, live and die.

I know most everyone loved seeing the demo of the most mission critical app in my house. I also want to show a few quick ways to leverage the Pure plugin to provide persistent data to a database. I am posting my files I used to create the demo here https://github.com/2vcps/crdb-demo-pure

First note
I started with the instructions provided here by Cockroach Labs.
This is an insecure installation for demo purposes. They do provide the instructions for a more Prod ready version. This is good enough for now.

Second note
The loadbalancer I used was created for my environment using the intructions to output the HAProxy file found here on the Cockroach Labs website:
https://www.cockroachlabs.com/docs/stable/generate-cockroachdb-resources.html

My yaml file refers to a docker image I built for the HAproxy loadbalancer. If it works for you cool! If not please follow the instructions above to create your own. If you really need to know more I can write another post showing how to take the Dockerfile and copy the CFG generated by CRDB into a new image just for you.

 

My nice little docker swarm

media_1501095950777.png

I have three VMware VM’s running Ubuntu 16.04. With Docker CE and the Pure plugin already installed. Read more here if you want to install the plugin.

media_1501096079095.png

Run the deploy

https://github.com/2vcps/crdb-demo-pure/blob/master/3node-cockroachdb-pure.yml

version: '3.1'
services:
    db1:
      image: cockroachdb/cockroach:v1.0.2
      deploy:
            mode: replicated
            replicas: 1
      ports:
            - 8888:8080
      command: start --advertise-host=cockroach_db1 --logtostderr --insecure
      networks:
            - cockroachdb
      volumes:
            - cockroachdb-1:/cockroach/cockroach-data
    db2:
      image: cockroachdb/cockroach:v1.0.2
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db2 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-2:/cockroach/cockroach-data
    db3:
      image: cockroachdb/cockroach:v1.0.2
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db3 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-3:/cockroach/cockroach-data
    crdb-proxy:
      image: jowings/crdb-proxy:v1
      deploy:
         mode: replicated
         replicas: 1
      ports:
         - 26257:26257
      networks: 
         - cockroachdb

networks:
    cockroachdb:
        external: true

volumes:
    cockroachdb-1:
      driver: pure
    cockroachdb-2:
      driver: pure
    cockroachdb-3:
      driver: pure

 

#docker stack deploy -c 3node-cockroachdb-pure.yml cockroach

Like it shows in the compose file This command deploys 4 services. 3 database nodes and 1 HAproxy. Each database node gets a brand new volume attached directly to the path by the Pure Docker Volume Plugin.

New Volumes

media_1501098437804.png

Each new volume created and attached to the host via iSCSI and mounted into the container.

Cool Dashboard

media_1501098544719.png

Other than being no data do you notice something else?
First lets generate some data.
I run this from a client machine but you can attach to one of the DB containers and run this command to generate some sample data.

cockroach gen example-data | cockroach sql --insecure --host [any host ip of your docker swam]

media_1501098910914.png

I am also going to create a “bank” database and use a few containers to start inserting data over and over.

cockroach sql --insecure --host 10.21.84.7
# Welcome to the cockroach SQL interface.
# All statements must be terminated by a semicolon.
# To exit: CTRL + D.
root@10.21.84.7:26257/> CREATE database bank;
CREATE DATABASE
root@10.21.84.7:26257/> set database = bank;
SET
root@10.21.84.7:26257/bank> create table accounts (
-> id INT PRIMARY KEY,
-> balance DECIMAL
-> );
CREATE TABLE
root@10.21.84.7:26257/bank> ^D

I created a program in golang to insert some data into the database just to make the charts interesting. This container starts, inserts a few thousand rows then exits. I run it as a service with 12 replicas so it is constantly going, I call it gogogo because I am funny.

media_1501108005294.png

gogogo

media_1501108062456.png
media_1501108412285.png

You can see the data slowly going into the volumes.

media_1501171172944.png

Each node remains balanced (roughly) as cockroachdb stores that data.

What happens if a container dies?

media_1501171487843.png

Lets make this one go away.

media_1501171632191.png

We kill it.
Swarm starts a new one. The Docker engine uses the Pure plugin and remounts the volume. The CRDB cluster keeps on going.
New container ID but the data is the same.

media_1501171737281.png

Alright what do I do now?

media_1501171851533.png

So you want to update the image to the latest version of Cockroach? Did you notice this in our first screenshot?

Also our database is getting a lot of hits, (not really but lets pretend), so we need to scale it out. What do we do now?

https://github.com/2vcps/crdb-demo-pure/blob/master/6node-cockroachdb-pure.yml

version: '3.1'
services:
    db1:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
            mode: replicated
            replicas: 1
      ports:
            - 8888:8080
      command: start --advertise-host=cockroach_db1 --logtostderr --insecure
      networks:
            - cockroachdb
      volumes:
            - cockroachdb-1:/cockroach/cockroach-data
    db2:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db2 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-2:/cockroach/cockroach-data
    db3:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db3 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-3:/cockroach/cockroach-data
    crdb-proxy:
      image: jowings/crdb-haproxy:v2
      deploy:
         mode: replicated
         replicas: 1
      ports:
         - 26257:26257
      networks: 
         - cockroachdb
    db4:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db4 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-4:/cockroach/cockroach-data
    db5:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db5 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-5:/cockroach/cockroach-data
    db6:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db6 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-6:/cockroach/cockroach-data
networks:
    cockroachdb:
        external: true

volumes:
    cockroachdb-1:
      driver: pure
    cockroachdb-2:
      driver: pure
    cockroachdb-3:
      driver: pure
    cockroachdb-4:
      driver: pure
    cockroachdb-5:
      driver: pure
    cockroachdb-6:
      driver: pure
$docker stack deploy -c 6node-cockroachdb-pure.yml cockroach

(important to provide the name of the stack you already used, or else errors)

media_1501172007803.png

We are going to update the services with the new images.

  1. This will replace the container with the new version — v1.0.3
  2. This will attach the existing volumes for nodes db1,db2,db3 to the already created FlashArray volumes.
  3. Also create new empty volumes for the new scaled out nodes db4,db5,db6
  4. CockroachDB will begin replicating the data to the new nodes.
  5. My gogogo client “barage” is still running

This is kind of the shotgun approach in this non-prod demo environment. If you want no downtime upgrades to containers I suggest reading more on blue-green deployments. I will show how to make the application upgrade with no downtime and use blue-green in another post.

Cockroach DB begins to reblance the data.

media_1501172638117.png

6 nodes

media_1501172712079.png

If you notice the gap in the queries it is becuase I updated every node all at once. A better way would be to do one at a time and make sure each node is back up while they “roll” through the upgrade to the new image. Not prod remember?

media_1501172781312.png
media_1501172828992.png

Application says you are using 771MiB of your 192GB. While the FlashArray is using just maybe 105MB across these volumes.

A little while later…

media_1501175811897.png

Now we are mostly balanced with replicas in each db node.

Conclusion
This is just scratching the surface and running highly scalable data applications in containers with persistent data on a FlashArray. Are you a Pure customer or potential Pure customer about to run stateful/persistent apps on Docker/Kubernetes/DCOS? I want to hear from you. Leave a comment or send me a message on Twitter @jon_2vcps.

If you are a developer and have no clue what your infrastructure team does or is doing I am here to help make everyone’s life better. No more weekend long deployments or upgrades. Get out of doing storage performance troubleshooting.

Go to more of your kids soccer games.

Come see @CodyHosterman at VMworld, and if he is too busy you can see me

Co89Fz2UAAAiAD9Look for a post about going to In-n-Out some time soon, it is my tradition.

Be sure to check out what we will be doing at VMworld at the end of the month. Click the banner below once you are done being mesmerized by Chappy. Sign up for a 1:1 demo or meeting, I’ll be there are would love to meed with you. See how focused a demo I give.

 

vmworld-sig chappyFB

Sessions to be sure to see featuring the Amazing Cody Hosterman

SDDC9456-SPO: Implementing Self-Service Storage Provisioning with vRealize Automation Xaas

VMware vCenter is no longer meant to be the end-user interface for requesting and managing virtual machines and related resources. Storage is no exception. Join Cody Hosterman as he discusses how vRealize Automation Anything-as-a-Service (Xaas) provides the ability to easily import vRealize Orchestrator workflows to control, manage and provision storage via the self-service catalog offering vRealize Automation.

Wednesday, Aug 31, 2:00 PM – 3:00 PM

NF9455-SPO: Best Practices for All-Flash Data Reduction Arrays with VMware vSphere

As All-Flash Data Reduction arrays are becoming common place in VMware environments due to their performance, flexibility and ease-of-use, it is important to understand how to best implement and manage them with EXXi. Data-reduction and flash changes how an administrator should think about various configuration options within VMware and those will be discussed in detail. VAAI, Space Reclamation, virtual disks, SIOC, SDRS Queue depths, Multipathing and other points will be highlighted.

Monday, Aug 29, 2:30 PM – 3:30 PM