Persistent Volumes in Kubernetes

Introduction

In this post, we will learn how persistent storage objects in Kubernetes provide an easy way to meet the storage requirements of stateful applications.

In the previous post, we learned how to use volume plugins for external storage, and we used an AWS EBS volume for this purpose. However, we put the AWS EBS volume settings directly inside the pod YML file. This model works, but the storage information is hard-coded inside the Pod, so it is not flexible when it comes to application portability.

To decouple this hard-wiring and take one step closer to making our application more portable, Kubernetes has persistence objects such as Persistent Volume and Persistent Volume Claim. These are full-fledged Kubernetes objects and can be defined in a YAML file.

Kubernetes Persistent Volume Subsystem decouples data from application pods and containers and abstracts implementation details. The main components are as follows:

Persistent Volume (represents a piece of external storage inside the cluster)
Persistent Volume Claim (a request for persistent volume)
Storage Classes (for dynamic provisioning)

Now, we can use these objects in the following two ways:

Static Provisioning
Dynamic Provisioning

We will cover Persistent Volume (PV) and Persistent Volume Claim (PVC) using the static provisioning workflow in today's post. This will help us understand the simple workflow and give us a baseline for the next steps. Storage Classes (SC) are used for dynamic provisioning, and we will cover those in a later post.

Setting The Scene

We currently have the following setup for storage from the last post, where we put the volume and mount information directly inside the Pod:

This setup works, and our data is not lost even if the pod dies. However, as we know, it has portability issues: any future change in storage requires changes to the Pod definition and a redeployment of the pods. That is the drawback of this model.
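For reference, a Pod with inline storage details might look roughly like the sketch below. It is a minimal illustration, not the manifest from the previous post: the pod name, image, password, mount path and volume ID are all placeholders I am assuming here. Also keep in mind that newer Kubernetes versions favour the EBS CSI driver over the in-tree awsElasticBlockStore plugin.

```yaml
# Sketch only: every name and value below is a placeholder, not taken from the repo.
apiVersion: v1
kind: Pod
metadata:
  name: accounting-db                # hypothetical pod name
spec:
  containers:
    - name: db
      image: postgres:16             # assumed image; the post does not name the database
      env:
        - name: POSTGRES_PASSWORD
          value: example             # demo only; use a Secret in practice
        - name: PGDATA               # keep data in a subdirectory of the mounted volume
          value: /var/lib/postgresql/data/pgdata
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      # Provider-specific details live inside the Pod: this is the hard-coding
      # that makes the model less portable.
      awsElasticBlockStore:
        volumeID: vol-0123456789abcdef0   # placeholder EBS volume ID
        fsType: ext4
```

Switching to a different storage backend here means editing the volumes section of every Pod that uses it.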

Let's see how the setup looks with Kubernetes persistence objects such as PV and PVC.

In this setup, we've moved the volume details out of the pod and into separate PV and PVC objects. This way, changes to storage won't affect the pod definition, and all of the storage details are abstracted away from our apps.

Let's see the general workflow for this model.

Workflow

First, someone creates a volume in external storage (e.g. 10GB).
We then create a Persistent Volume (PV) in Kubernetes that links back to the volume in external storage. At this point, the volume exists on its own in Kubernetes.
After that, we can claim the PV created above using a request called a Persistent Volume Claim (PVC) from a Pod, and our application can use it.

We will see all of these items in our demo in the coming sections.

You can download the source code for the application and the YML files from this git repository (branch: k8storage). This is the same repo we used in the previous posts; I will simply add new YML files to it.

Create an EBS Volume in AWS

We have seen this process in the previous post. Following the same process, I've created a volume using the AWS web console:

So the first step is done, i.e. someone (e.g. a storage admin) created a volume on external storage.

Create a Persistent Volume (PV)

Next, we create a Persistent Volume (PV). Here is the YAML file for the PV.

kubectl apply -f .\accounting-pv.yml

We can apply this file using kubectl as shown above, and at this point the persistent volume exists in the Kubernetes cluster on its own.
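As a rough idea of what such a PV manifest can contain, here is a minimal sketch. It is not the exact accounting-pv.yml from the repository: the object name, the label, the reclaim policy, the file system type and the volume ID are assumptions made for illustration, and the capacity simply mirrors the 10GB volume mentioned in the workflow.

```yaml
# Sketch of a statically provisioned PV backed by an existing EBS volume.
# Names, labels and the volume ID are placeholders, not values from the repo.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: accounting-pv                # hypothetical name
  labels:
    app: accounting                  # hypothetical label a claim can select on
spec:
  capacity:
    storage: 10Gi                    # should not exceed the size of the EBS volume
  accessModes:
    - ReadWriteOnce                  # an EBS volume attaches to a single node at a time
  persistentVolumeReclaimPolicy: Retain   # keep the EBS volume when the claim is deleted
  awsElasticBlockStore:
    volumeID: vol-0123456789abcdef0  # placeholder; use the ID of the volume created earlier
    fsType: ext4
```

The label under metadata.labels is what a claim's selector can match on, while capacity and accessModes also have to satisfy the claim for binding to happen.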

I am using AWS EBS here, but you can check the details for other storage plugins on the official Kubernetes website at this link:

So now, all of the storage-specific information is contained in this YML file (a Kubernetes object), and if we want to change the storage provider, we do it here.

Persistent Volume Claim

So, we have a PV, and next we will create a Persistent Volume Claim. A claim is just a request for some storage and, depending on its parameters, it will be bound to a matching PV in our system.

The following is the YML file for the PVC and the corresponding command output. As you can see, we requested 10GB of storage and also specified accessModes and some selector information. The link to a PV is created based on these parameters, and in this case the claim will be bound to the PV we created in the previous section.
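A claim with those three ingredients (requested size, accessModes and a selector) could be sketched as follows. The claim name and the label are assumptions chosen for illustration, and the empty storageClassName is my addition rather than something the original file is known to contain.

```yaml
# Sketch of a PVC meant to bind to a pre-created PV via a label selector.
# The name and label are placeholders, not values from the repo.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: accounting-pvc               # hypothetical name
spec:
  storageClassName: ""               # assumption: bind to a class-less PV, skip dynamic provisioning
  accessModes:
    - ReadWriteOnce                  # must be offered by the PV
  resources:
    requests:
      storage: 10Gi                  # the PV capacity must be at least this large
  selector:
    matchLabels:
      app: accounting                # must match a label on the PV
```

Setting storageClassName to an empty string asks Kubernetes to consider only PVs without a class, so a default StorageClass in the cluster does not provision a brand-new volume instead.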

We can then use the same kubectl command to apply this YML to Kubernetes:

kubectl apply -f .\accounting-pvc.yml

Once the PVC is created, we can check the PV again and see that it's now bound to the claim.

Up to this point, we have created a PVC that is bound to a PV, which points to an EBS volume on AWS. All that is left is to use this PVC from our Pod.

Using PVC with a Pod

As we are using a Kubernetes deployment to manage the pod, let's reference the PVC in it as shown below:

As you can see, the change is very straightforward and, thanks to this abstraction, storage is now decoupled from the Pod definition. The Pod just uses a storage claim; it has no idea how or where this storage is managed, whether it is on AWS, Google or Azure Files, and our solution is more portable.
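To make the change concrete, here is a minimal sketch of a deployment whose pod template refers to a claim by name. It is an illustration under assumptions, not the deployment file from the repository: the deployment name, labels, image, password, mount path and claim name are all placeholders.

```yaml
# Sketch of a Deployment that consumes storage through a PVC.
# All names, the image and the mount path are placeholders, not values from the repo.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: accounting-db                # hypothetical name
spec:
  replicas: 1                        # a ReadWriteOnce volume suits a single replica
  selector:
    matchLabels:
      app: accounting-db
  template:
    metadata:
      labels:
        app: accounting-db
    spec:
      containers:
        - name: db
          image: postgres:16         # assumed image; the post does not name the database
          env:
            - name: POSTGRES_PASSWORD
              value: example         # demo only; use a Secret in practice
            - name: PGDATA           # keep data in a subdirectory of the mounted volume
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: accounting-pvc   # the only storage detail the Pod knows about
```

Nothing in this file mentions AWS or EBS; moving to another storage backend would only touch the PV.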

Now, we can apply this new YML file:

kubectl apply -f .\accounting-db.deploymet-pv.yml

You can remove the previous deployment using the following command before applying the new storage information:

At this point, we have completed static provisioning using the persistence objects.

Testing

Our application should work as before. We can visit the deployed app URL and enter some data:

Then we can delete the Pod, so that the deployment behind that pod spins up a new one, and check whether our data is still available afterwards. We covered this part in detail in the previous post.

Here I have done those steps again. Once the Pod is up, visit the app again, and you can check that the application data is not lost.

It's a good idea to delete your test cluster once you are done with testing. This way, you will not be billed for the cluster.

Summary

In this post, we saw how Kubernetes persistence objects help to decouple storage details from our pod definitions. This model is more portable because, if the persistence details change, we do not have to update or redeploy the pods.

We learned about PersistentVolume and PersistentVolumeClaim objects and how to use them in static provisioning.

Static provisioning is good, but we can go one step further with dynamic provisioning. We will cover dynamic provisioning in the next post.

Let me know if you have any comments or questions. Till next time, happy coding.