
Persistent Storage and PVCs in Kubernetes
In Kubernetes, storage is a critical component of managing applications, especially stateful ones that require data persistence. By default, a container's filesystem is ephemeral: data written to it is lost when the container restarts or when the pod is rescheduled or deleted. To solve this problem, Kubernetes provides Persistent Volumes (PVs) and Persistent Volume Claims (PVCs), which allow applications to retain data beyond the lifecycle of a pod.
In this chapter, we will explore how Kubernetes handles persistent storage, the key components involved, and how to use Persistent Volumes and Persistent Volume Claims to ensure data is retained across pod restarts.
Understanding Kubernetes Storage
Kubernetes offers two primary types of storage −
- Ephemeral Storage − Temporary storage that is lost when the pod is deleted.
- Persistent Storage − Storage that remains available even if the pod using it is removed or rescheduled.
Why Persistent Storage Matters
Many applications, such as databases, need to store data that survives beyond the pod's lifecycle. Kubernetes dynamically schedules pods across different nodes, making local storage unreliable. Persistent storage provides a way to decouple storage from compute resources, enabling flexible scaling.
What is a Persistent Volume?
A Persistent Volume (PV) is a piece of storage in the Kubernetes cluster that has been provisioned either manually by an administrator or dynamically through a Storage Class. It is independent of any single pod and provides an abstraction for storage systems like NFS, AWS EBS, Azure Disks, or Google Persistent Disks.
What is a Persistent Volume Claim?
A Persistent Volume Claim (PVC) is a request for storage made by a user. A pod then references the claim in order to use the Persistent Volume bound to it. Claims let users request storage by size and access mode without needing to manage the storage infrastructure themselves.
Key Components of a PV
Following are the key components of a Persistent Volume −
- Capacity − Defines the storage size (e.g., 10Gi).
- Access Modes − Define how nodes may mount the volume.
  - ReadWriteOnce (RWO) − Read/write by a single node.
  - ReadOnlyMany (ROX) − Read-only by multiple nodes.
  - ReadWriteMany (RWX) − Read/write by multiple nodes.
- Reclaim Policy − Defines what happens to the PV after its claim is deleted.
  - Retain − Keeps the volume and its data for manual reclamation.
  - Delete − Deletes the PV and the underlying storage.
  - Recycle − Performs a basic cleanup before reuse (deprecated in favor of dynamic provisioning).
- Storage Class − Names the class of storage the PV belongs to; a PVC binds only to a PV of the class it requests.
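As a minimal sketch of where these four components appear in a manifest, the following PV uses an NFS backend, one of the storage systems mentioned above. The PV name, the storage class name, the server address, and the export path are all placeholders chosen for illustration, not values from a real cluster.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-pv                        # hypothetical name
spec:
  capacity:
    storage: 10Gi                        # Capacity
  accessModes:
    - ReadWriteMany                      # Access mode: read/write from multiple nodes
  persistentVolumeReclaimPolicy: Retain  # Reclaim policy: keep the data after the claim is deleted
  storageClassName: shared-nfs           # Storage class (hypothetical); a PVC must request the same class
  nfs:
    server: nfs.example.internal         # placeholder NFS server address
    path: /exports/data                  # placeholder export path
```

Whether a given access mode actually works depends on the backend. NFS can typically serve ReadWriteMany, while block devices such as AWS EBS are generally limited to ReadWriteOnce.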
How PVCs Work
When a PVC is created, Kubernetes tries to bind it to an existing PV with matching storage requirements.
If a suitable PV is found, it is bound to the PVC.
If no suitable PV is available and no Storage Class can provision one dynamically, the PVC remains pending until a suitable PV is created.
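By default, Kubernetes picks any PV that satisfies the claim. When a claim should bind to one particular PV, the claim can name it explicitly. The sketch below assumes a PV called example-pv already exists; that name and the claim name are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pinned-pvc           # hypothetical name
spec:
  storageClassName: ""       # empty string: do not fall back to a default StorageClass
  volumeName: example-pv     # bind to this specific PV (assumed to exist)
  accessModes:
    - ReadWriteOnce          # must be one of the modes the PV offers
  resources:
    requests:
      storage: 1Gi           # must not exceed the PV's capacity
```

Setting storageClassName to an empty string matters on clusters that define a default StorageClass. Without it, the claim may be satisfied by a freshly provisioned volume instead of the pre-created PV.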
Setting Up Persistent Storage in Kubernetes
To understand how Kubernetes handles persistent storage, let's go through a hands-on example where we create a Persistent Volume (PV), a Persistent Volume Claim (PVC), and attach it to a pod.
Step 1. Creating a Persistent Volume
A PV is defined using a YAML manifest, specifying storage capacity, access modes, and the underlying storage provider. We will first create a Persistent Volume (PV) that provisions 1Gi of storage from the host path.
Using an editor, create the following YAML file -
$ nano pv.yaml
Paste the following content −
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/mnt/data"
Apply the configuration -
$ kubectl apply -f pv.yaml
Output
persistentvolume/my-pv created
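This PV declares 1Gi of storage backed by the /mnt/data directory on the node. Because the data lives on a single node's disk, hostPath volumes are suited to single-node test clusters only. The Retain reclaim policy ensures that the storage is not deleted when the PVC is deleted.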
Step 2. Creating a Persistent Volume Claim
A PVC allows pods to request storage. Kubernetes will bind it to an available PV.
Using an editor, create the following YAML file -
$ nano pvc.yaml
Paste the following content -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Apply the configuration -
$ kubectl apply -f pvc.yaml
Output
It will produce the following output -
persistentvolumeclaim/my-pvc created
The PVC requests 1Gi of storage with ReadWriteOnce access mode. If a matching PV is available, Kubernetes binds the PVC to it.
Verify the PVC status -
$ kubectl get pvc
Output
It will produce the following output -
NAME             STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
data-mymysql-0   Pending                                                     <unset>                 6d1h
my-pvc           Bound     my-pv    1Gi        RWO                           <unset>                 55s
The my-pvc claim is in the Bound state, meaning Kubernetes successfully bound it to my-pv. The data-mymysql-0 entry is an older, unrelated claim that was already present in this cluster.
Step 3. Mounting a PVC to a Pod
To use the PVC, we need to attach it to a pod. Using an editor, create the following YAML file -
$ nano pod.yaml
Paste the following content -
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
    - name: app-container
      image: nginx
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: my-pvc
Explanation −
When the Pod is scheduled, Kubernetes attaches the Persistent Volume (PV) associated with the my-pvc claim. The container accesses the PVC storage at /usr/share/nginx/html. Any data placed in /usr/share/nginx/html inside the container remains available across Pod restarts.
Apply the configuration -
$ kubectl apply -f pod.yaml
Output
It will produce the following output -
pod/my-pod created
Now, the pod will use the PVC to persist its data.
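A bare Pod is not recreated after it is deleted, so in practice the same claim is usually mounted from a controller-managed workload. The following is a sketch of a Deployment that mounts my-pvc in the same way. The Deployment name and the app label are hypothetical, and the sketch assumes the claim is not in use by a pod on another node.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # hypothetical name
spec:
  replicas: 1                   # a ReadWriteOnce volume is mounted by one node at a time
  strategy:
    type: Recreate              # stop the old pod before starting its replacement
  selector:
    matchLabels:
      app: my-app               # hypothetical label
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app-container
          image: nginx
          volumeMounts:
            - mountPath: "/usr/share/nginx/html"
              name: storage
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: my-pvc   # the claim created in Step 2
```

If the pod managed by this Deployment is deleted, its replacement mounts the same claim and sees the files written by its predecessor. The Recreate strategy avoids a rollout in which the old and new pods contend for the same ReadWriteOnce volume.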
Dynamic Provisioning with Storage Classes
Manual PV provisioning is inefficient for large-scale deployments. Storage Classes allow dynamic provisioning, enabling Kubernetes to create PVs automatically when a PVC is requested.
Creating a Storage Class
Using an editor, create the following YAML file −
$ nano storage-class.yaml
Paste the following content −
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Delete
Apply the configuration −
$ kubectl apply -f storage-class.yaml
Output
It will produce the following output -
storageclass.storage.k8s.io/fast-storage created
This confirms that a new StorageClass named fast-storage has been created. This StorageClass enables dynamic provisioning using AWS EBS volumes with the gp2 type, and its reclaimPolicy is set to Delete, meaning the volume is automatically deleted when the PVC is deleted.
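The kubernetes.io/aws-ebs provisioner is the legacy in-tree plugin. Recent clusters generally provision EBS volumes through the AWS EBS CSI driver instead. The variant below is a sketch under that assumption: it presumes the CSI driver is installed in the cluster, and the class name and the gp3 volume type are illustrative choices rather than part of the original example.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage-csi                 # hypothetical name
provisioner: ebs.csi.aws.com             # AWS EBS CSI driver; must be installed in the cluster
parameters:
  type: gp3                              # EBS volume type (illustrative choice)
reclaimPolicy: Delete                    # remove the volume when its claim is deleted
volumeBindingMode: WaitForFirstConsumer  # provision only once a pod using the claim is scheduled
allowVolumeExpansion: true               # allow a bound claim to be grown by raising its request
```

WaitForFirstConsumer delays volume creation until the scheduler has chosen a node, which helps keep a zonal volume in the same availability zone as the pod that uses it.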
Using a Storage Class in a PVC
Using an editor, create the following YAML file -
$ nano dynamic-pvc.yaml
Paste the following content -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  storageClassName: fast-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
Apply the configuration −
$ kubectl apply -f dynamic-pvc.yaml
Output
It will produce the following output -
persistentvolumeclaim/dynamic-pvc created
Since this PVC references the fast-storage StorageClass, Kubernetes asks the class's provisioner to create a new 5Gi AWS EBS volume and a matching PV, so no PV has to be created by hand. This only succeeds on a cluster where that provisioner is available (for example, one running on AWS); elsewhere the claim stays in the Pending state.
Managing and Deleting PVCs and PVs
Checking Persistent Volumes −
To list existing PVs -
$ kubectl get pv
Output
It will produce the following output -
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
my-pv   1Gi        RWO            Retain           Bound    default/my-pvc                  <unset>                          11m
Checking Persistent Volume Claims −
To list all PVCs -
$ kubectl get pvc
Output
It will produce the following output -
NAME             STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
data-mymysql-0   Pending                                                     <unset>                 6d1h
dynamic-pvc      Pending                                      fast-storage   <unset>                 2m30s
my-pvc           Bound     my-pv    1Gi        RWO                           <unset>                 11m
Deleting a PVC
To remove a PVC −
$ kubectl delete pvc my-pvc
Output
It will produce the following output -
persistentvolumeclaim "my-pvc" deleted
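Note − Deleting a PVC deletes the associated PV only if the reclaim policy is Delete. With the Retain policy, the PV moves to the Released state and keeps its data until an administrator reclaims it manually. If a pod is still using the claim, Kubernetes defers the actual removal of the PVC until that pod is gone.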
Real-World Use Cases for Persistent Storage
Databases − Persistent storage is critical for databases such as MySQL, PostgreSQL, or MongoDB in Kubernetes. Without it, database data would be lost every time a pod restarts.
Logging and Monitoring − Applications often need persistent storage for logging purposes. Tools like ELK Stack (Elasticsearch, Logstash, Kibana) or Prometheus rely on PVCs to store log and metric data.
File Storage for Web Applications − Web applications, such as a CMS (WordPress, Drupal), require persistent storage to save uploaded images, documents, and media files.
Troubleshooting Persistent Storage Issues
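PVC Stuck in Pending State − Run kubectl describe pvc <pvc-name> and read the events. For static provisioning, check with kubectl get pv that a PV exists whose storage class, access modes, and capacity satisfy the claim. For dynamic provisioning, check that the referenced StorageClass exists and that its provisioner is running in the cluster.
Pod Stuck in ContainerCreating State − Verify that the PVC is correctly bound using kubectl get pvc. Check the pod's events using kubectl describe pod <pod-name>.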
Best Practices for Managing Persistent Storage
- Use Storage Classes − Instead of manually defining PVs, use StorageClass to provision storage dynamically.
- Monitor Storage Usage − kubectl get pvc shows the capacity of each claim, not how full it is. Track actual utilization with Prometheus metrics such as kubelet_volume_stats_used_bytes.
- Backup Strategies − Implement backup solutions like Velero or cloud-based snapshots.
- Use ReadWriteMany (RWX) when needed − Some workloads may require shared access to the same storage.
- Automate Cleanup − Implement policies to reclaim unused storage to optimize resource utilization.
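For the backup practice above, one Kubernetes-native option is a CSI volume snapshot. The sketch below assumes the cluster has the snapshot CRDs and snapshot controller installed and uses a CSI driver that supports snapshots. The snapshot name and the snapshot class name are hypothetical.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: dynamic-pvc-snapshot              # hypothetical name
spec:
  volumeSnapshotClassName: csi-snapclass  # hypothetical; a VolumeSnapshotClass with this name must exist
  source:
    persistentVolumeClaimName: dynamic-pvc   # the claim to snapshot
```

A snapshot taken this way is a point-in-time copy held by the storage backend. It complements, rather than replaces, an off-cluster backup tool such as Velero.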
Conclusion
Persistent storage in Kubernetes ensures that applications requiring data retention can function reliably. By using Persistent Volumes (PVs) and Persistent Volume Claims (PVCs), Kubernetes enables decoupled and scalable storage solutions. Dynamic provisioning with Storage Classes further automates storage management, making deployments more efficient.
Understanding these concepts is essential for running stateful applications, such as databases and content management systems, in a Kubernetes environment. By properly configuring persistent storage, we can ensure that critical data remains intact, even when pods are rescheduled or restarted.