Kubernetes Storage – Dynamic Provisioning using Storage Classes

Introduction

In the previous post, we learned that the Kubernetes Persistent Volume subsystem decouples data from application pods and containers and abstracts away implementation details. Its main components are:

  • PersistentVolume (PV): exposes external storage to the cluster
  • PersistentVolumeClaim (PVC): a request for a persistent volume
  • StorageClass (SC): enables dynamic provisioning

We then used PersistentVolume and PersistentVolumeClaim with static provisioning, together with the AWS EBS service, to provide storage for the stateful part of our application (Postgres).

In this post, we will discuss StorageClass and see how this Kubernetes object allows for dynamic provisioning, which results in greater portability and eliminates the need to manually create volumes on external storage and then create PVs for them. We will see how to use this powerful Kubernetes mechanism for our application's storage needs.

A StorageClass (SC) is a kind of storage template used to provision storage dynamically. It is a full Kubernetes API object: you define it in a YML file, and it has a name and some properties. A StorageClass is largely immutable; fields such as the provisioner and parameters cannot be changed after creation, although metadata such as annotations can still be edited.

You can create and use multiple storage classes that point to different storage providers at the same time, e.g. a fast storage class for things like caches and a slow storage class for backups. When you have multiple storage classes, you can mark one as the default: when a PVC does not specify a StorageClass name, Kubernetes uses the default one, which makes the solution even more portable.

Typically we give these storage classes abstract names, e.g. slow, fast, backup, common, and all the PV details stay hidden. This abstracts the implementation details away from the user.
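To make the idea of abstractly named tiers concrete, here is a minimal sketch of two storage classes. It is an illustration rather than part of the original setup: it assumes the in-tree AWS EBS provisioner (kubernetes.io/aws-ebs), and the mapping of "fast" and "slow" to the gp2 and sc1 volume types is my own choice.

```yaml
# Sketch only: two StorageClasses with abstract names backed by different EBS volume types.
# Assumption: in-tree AWS EBS provisioner; the fast/slow mapping is illustrative.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2          # general purpose SSD
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/aws-ebs
parameters:
  type: sc1          # cold HDD; AWS enforces a larger minimum volume size for this type
```

A PVC would then ask for "fast" or "slow" by name and never mention EBS at all.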

Setting the Scene

We will be building on top of the previous post, so if you are new to these topics, check the previous posts first for background information. You can download the source code for the application and the YML files from this git repository (branch: k8storage). This is the same code base as in earlier posts; I will simply add new files to it.

We have the following infrastructure setup from the previous post:

We also have the corresponding YML files in the code repo. Please check the setup and create this infrastructure if needed.

Create a Kubernetes cluster (if you do not already have one):

eksctl create cluster `
--name ab-cluster `
--nodegroup-name standard-workers `
--node-type t3.micro `
--nodes 3 `
--nodes-min 2 `
--nodes-max 4 `
--node-ami auto

I am using AWS EKS for the Kubernetes cluster and EBS for storage, but you can use any other provider, e.g. Google Cloud or Azure; the principles are the same.

Dynamic Provisioning (Storage Classes)

In Kubernetes, dynamic provisioning is usually the preferred way to meet storage needs. In general, it is about the following:

  • Creating volumes dynamically, on demand
  • Offering different classes or tiers of volumes
  • Reducing complexity

We use the StorageClass object for this purpose. The first benefit is that we no longer need to create the PersistentVolume (PV) manually. We also avoid the wasted storage space we had with static provisioning: because the PV is now created dynamically, its size matches the PVC request.

The following is the updated arrangement, with a StorageClass (SC) managing dynamic PV creation:

Workflow

Here is a simple workflow involving storage classes:

  • Create a StorageClass (SC).
  • Create a PVC that references the SC.
  • Kubernetes uses the SC's provisioner to provision a PersistentVolume once the PVC is referenced in a pod volume and the pod is applied.
  • The storage is provisioned, and the PV is created and bound to the PVC.

Creating a StorageClass (SC)

Before we actually create a storage class, let's first check whether we already have any SC:

kubectl get sc

Here is the command output:

As you can see, a default storage class (gp2) has already been set up by AWS, and we could simply use it. Note how easy this is: we just create a PVC, and if it doesn’t reference any SC, this default is ready to be used. However, we will create a StorageClass ourselves.

Here is a very simple YML file for storage class:

As you can see, the kind is StorageClass. We gave it the name “standard” and also specified, using annotations, that it shall be the default storage class.

annotations: ‘storageclass.kubernetes.io/is-default-class: “true”’ marks this class as the default for the cluster, so any PVC that doesn’t request a particular class of storage gets the cluster’s default class.

provisioner: the name of the volume plugin responsible for provisioning and deleting volumes.

parameters: details about the storage that the provisioner uses when creating volumes. These parameters are specific to each storage provider and differ from one provisioner to another.
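For reference, a storage class along these lines could look like the sketch below. The name, the default-class annotation and the WaitForFirstConsumer binding mode come from this post; the provisioner (the in-tree kubernetes.io/aws-ebs plugin) and the parameter values are assumptions on my part, so check the file in the repository for the exact definition.

```yaml
# Sketch of the "standard" StorageClass; provisioner and parameters are assumed values.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    # Marks this class as the cluster default
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs      # assumption: in-tree EBS plugin
parameters:
  type: gp2                             # assumption: general purpose SSD
  fsType: ext4                          # assumption
reclaimPolicy: Delete                   # the default, written out for clarity
volumeBindingMode: WaitForFirstConsumer # PV is created once a pod uses the PVC
```

On newer clusters that use the EBS CSI driver, the provisioner name would differ (ebs.csi.aws.com), which is exactly the kind of detail the storage class keeps away from the PVC.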

That way, the non-portable part of the setup is contained in one place: the storage class, with its provisioner and parameters. This is good for portability. For example, if you have to port your application from, say, a Dev environment with NFS storage to AWS with EBS or to Google Cloud, all you have to change is the storage class provisioner and its parameters. Everything else stays the same; all a Kubernetes user sees is the storage class name. For information and examples about other provisioners, please check the official documentation on this link.

At this point, the storage class is created and its definition exists on the cluster, but there are no PVs for it yet. The whole point is that PVs get created on demand.

In the background, a controller running on the master watches the API server for new PVCs that reference this standard class. Whenever the controller sees one, it creates the right type of volume on the external storage backend and then creates the PV.
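As a rough illustration of that portability, here is what a storage class with the same name might look like on Google Cloud. This is a sketch under assumptions: it uses the in-tree kubernetes.io/gce-pd provisioner and the pd-standard disk type, neither of which is part of this post's setup.

```yaml
# Sketch: the same class name backed by a different provider.
# Only the provisioner and parameters change; PVCs that reference "standard" stay untouched.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/gce-pd       # assumption: in-tree GCE persistent disk plugin
parameters:
  type: pd-standard                     # assumption: standard persistent disk
volumeBindingMode: WaitForFirstConsumer
```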

So we no longer have to create PVs, but we do need to create the PVCs that reference the storage class. Before we do that, note that we now have two default storage classes. Let’s edit the gp2 storage class and remove the annotation marking it as default. This step is optional and not needed if you do not have multiple default storage classes:

kubectl edit sc gp2

This command opens the definition in Notepad, where we can edit it (by setting an empty value for is-default-class) and save it as shown below:

Now, if we check the storage classes again, we can see that the change has taken place:

So, at this point, we have a StorageClass (SC) named standard. Next, let's see how it is used to provide storage for our application.
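For orientation, the relevant part of the gp2 definition after such an edit might look like the fragment below. Note one deliberate difference from the step above: I set the annotation to "false", which is the variant the Kubernetes documentation describes, rather than leaving it empty. Treat it as a sketch of the metadata only; the other fields stay as AWS created them.

```yaml
# Fragment of the gp2 StorageClass after the edit (remaining fields unchanged and omitted)
metadata:
  name: gp2
  annotations:
    # "false" means this class is no longer the cluster default
    storageclass.kubernetes.io/is-default-class: "false"
```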

Create a PVC

The following is the PVC definition, which uses the standard storage class created above. Because standard is also the default storage class for the cluster, the PVC would use it even if we did not specify it in this YML file, but here we mention it explicitly by name, as shown below:

As you can see, we specified the storageClassName and that’s it. We can now apply the YML file and our PVC is created. However, no PV is created yet (because we specified WaitForFirstConsumer in the SC).
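A PVC of this shape could be sketched as follows. The claim name (accounting-pvc-sc) and the storage class name come from this post; the access mode and the requested size are assumptions for illustration, so use the values from the repository file.

```yaml
# Sketch of a PVC that references the "standard" StorageClass by name.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: accounting-pvc-sc
spec:
  storageClassName: standard   # could be omitted, since standard is the default class
  accessModes:
    - ReadWriteOnce            # assumption: an EBS volume attaches to a single node
  resources:
    requests:
      storage: 1Gi             # assumption: placeholder size
```

With WaitForFirstConsumer, this claim stays in Pending state until a pod that uses it is scheduled.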

POD YML to use PVC

I copied the deployment file for the db from the previous post and just changed the PVC name, as shown below:

This is exactly the same YML file as in the previous post. I copied it to a new file, renamed it, and the only thing I changed is the claimName, which now points to the new PVC we just created.

When we apply this file, the PersistentVolume (PV) is created automatically as well, because it was waiting for its first consumer.
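To show where the claim plugs in, here is a minimal sketch of a Postgres deployment that mounts the PVC. Apart from the claimName, everything in it (deployment name, labels, image tag, password, mount path) is hypothetical and not taken from the repository.

```yaml
# Sketch: a Deployment whose pod consumes the PVC; only claimName comes from this post.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: accounting-db                 # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: accounting-db              # hypothetical label
  template:
    metadata:
      labels:
        app: accounting-db
    spec:
      containers:
        - name: postgres
          image: postgres:12          # assumption: any recent Postgres image
          env:
            - name: POSTGRES_PASSWORD
              value: example          # placeholder; use a Secret in practice
            - name: PGDATA
              # Subdirectory avoids the lost+found folder at the root of a fresh EBS volume
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: db-storage
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: db-storage
          persistentVolumeClaim:
            claimName: accounting-pvc-sc   # the PVC created in the previous step
```

Scheduling this pod is what triggers the provisioner: the claim now has its first consumer, so the EBS volume and the PV are created and bound.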

If I visit the AWS EBS web console, I can see that a new volume is created with exactly the same size we requested through PVC:

Testing

Our db pod is deployed and is using the StorageClass. For testing, let's deploy the service for the db pod:

kubectl apply -f .\accounting-db.service.yml

I also deployed the web part to the cluster using the following command:

kubectl apply -f .\accounting-web.yml

We have done these parts in earlier posts, so I am not going into those details.

Here is the load balancer address once web is deployed:

and when I visit the URL, I can see the application is working as expected.

I created a few records using the web UI and then deleted the pod. Because there is a deployment behind the pod, Kubernetes starts a new pod, and we can check whether our data is still there. In this case, it is.

Deleting a PVC

So, what happens if we delete a PVC? Well, it depends on the reclaim policy of the bound PV, which a dynamically provisioned PV inherits from its StorageClass (reclaimPolicy). For dynamic provisioning, if we do not specify anything, the default is Delete.

So if we delete a PVC, it will also delete the corresponding PV and the Volume. Let’s check this next.

I have deleted the application-related Pods, Services, etc. As you can see, I still have the PVC and the corresponding PV.

We can delete the PVC using the following command:

kubectl delete pvc accounting-pvc-sc

Once it's deleted, if we check the PVs again, we will see that the PV is gone too, and so is the volume on AWS EBS:

As mentioned above, this is the default behavior for a StorageClass; if you need to, you can change it to Retain. But if possible, leave the default behavior (Delete) unchanged and instead introduce a backup strategy for your data.
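If you do want volumes to survive PVC deletion, the policy to set on a storage class is Retain. The sketch below shows a separate class for that, since the reclaim policy of an existing StorageClass cannot be changed in place. The class name is hypothetical, and the provisioner and parameters are the same assumed EBS values as before.

```yaml
# Sketch: a StorageClass whose dynamically provisioned PVs are kept after the PVC is deleted.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-retain               # hypothetical name
provisioner: kubernetes.io/aws-ebs    # assumption: in-tree EBS plugin
parameters:
  type: gp2                           # assumption
reclaimPolicy: Retain                 # PV and EBS volume remain; cleanup becomes manual
volumeBindingMode: WaitForFirstConsumer
```

With Retain, the PV moves to the Released state when its claim is deleted, and both the PV and the EBS volume have to be removed by hand.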

Summary

StorageClass is a Kubernetes API object that eliminates the work required to provision volumes statically. We can have multiple storage classes for different business needs, and we can specify a default storage class for our cluster, which is used automatically when none is specified in a PVC. Using StorageClasses is recommended because they enhance application portability by abstracting concrete storage details.

In addition to StorageClasses, we can still use static provisioning at the same time (we do that by specifying an empty storageClassName in the PVC, which forces the use of static provisioning).
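As a small sketch of that last point, a PVC that opts out of dynamic provisioning sets storageClassName to an empty string. The claim name, access mode and size below are hypothetical.

```yaml
# Sketch: an empty storageClassName disables dynamic provisioning for this claim,
# so it binds only to a manually created PV that has no class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: accounting-pvc-static   # hypothetical name
spec:
  storageClassName: ""          # explicitly empty: do not use the default class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi              # assumption: must fit within an existing PV's capacity
```

Leaving the field out entirely is not the same thing: a PVC without storageClassName falls back to the cluster's default class.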

You can download the source-code for the application and yml files from this git repository (branch: k8storage).

Let me know if you have some questions or comments. Till next time, Happy Coding.