The Container Data Problem
Containers are Ephemeral and Stateless
- You usually don't store data in containers
- Non-persistent data is stored locally on a writable layer
- This is the default behavior - just write to the filesystem
- When containers are destroyed, so is the data inside them
Definition of 'Ephemeral'
ephemerous (ɪˈfɛmərəs)
ADJECTIVE
1. zoology: relating to an ephemeron
2. short-lived
Collins English Dictionary. Copyright © HarperCollins
Important: Without persistent storage, any data written to a container's filesystem will be lost when the container is terminated, updated, or rescheduled to a different node.
The Solution: Volumes
We need to store data outside the container, in a volume. Volumes let containers read and write data on external storage systems.
Data Persistence
Data survives container restarts, updates, and rescheduling
Data Sharing
Multiple containers can share the same volume
Storage Plugins
Vendors build storage plugins that implement the Container Storage Interface (CSI)
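As a minimal sketch of the data-sharing point, the Pod below runs two containers that mount the same volume. The Pod name, container names, and busybox image are illustrative assumptions. It uses an emptyDir volume to keep the example self-contained, so it shows sharing only: an emptyDir is removed together with the pod.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-data-demo        # hypothetical name
spec:
  containers:
    - name: writer              # appends a timestamp every 5 seconds
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /data/log.txt; sleep 5; done"]
      volumeMounts:
        - name: shared
          mountPath: /data
    - name: reader              # follows the file written by the other container
      image: busybox:1.36
      command: ["sh", "-c", "touch /data/log.txt; tail -f /data/log.txt"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      emptyDir: {}              # pod-scoped scratch volume, not persistent
```

Swapping the emptyDir entry for a persistentVolumeClaim reference would keep the data after the pod is gone.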
Storage Provisioning Methods
Static Provisioning
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: "/mnt/data"
- Admin manually creates PersistentVolumes
- Developer claims them with PersistentVolumeClaims
- Good for small clusters or specific storage requirements
- More administrative overhead
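The developer side of static provisioning can be sketched as a claim whose storage class, access mode, and size fit the pre-created pv-static volume. The claim name pvc-static is an assumption made for illustration.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-static              # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce             # must be offered by the PersistentVolume
  storageClassName: manual      # matches the class set on pv-static
  resources:
    requests:
      storage: 10Gi             # must not exceed the PV capacity
  # volumeName: pv-static       # optional: pin the claim to one specific PV
```

Kubernetes binds the claim to any available PersistentVolume that satisfies these fields; uncommenting volumeName restricts it to that one volume.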
Dynamic Provisioning
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-dynamic
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: 5Gi
- Storage automatically provisioned on demand
- Uses StorageClasses to define provisioners
- Ideal for cloud environments and large clusters
- Less administrative overhead
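A claim only becomes useful once a pod mounts it. The following sketch assumes a hypothetical Pod named app-with-storage running nginx and references the pvc-dynamic claim by name.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-storage        # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.27         # illustrative image choice
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc-dynamic  # the claim defined above
```

The pod never refers to the underlying disk directly; it only names the claim, and the StorageClass decides how the volume is created.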
StorageClass Example
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
reclaimPolicy: Delete
volumeBindingMode: Immediate
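This StorageClass provisions AWS EBS gp2 volumes formatted with ext4. Note that the in-tree kubernetes.io/aws-ebs provisioner is deprecated in favor of the EBS CSI driver (ebs.csi.aws.com).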
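On clusters that run the AWS EBS CSI driver, a comparable class might look like the sketch below. It assumes the driver is installed; the class name fast-csi, the gp3 volume type, and the WaitForFirstConsumer binding mode are illustrative choices rather than part of the original example.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-csi                        # hypothetical name
provisioner: ebs.csi.aws.com            # requires the AWS EBS CSI driver
parameters:
  type: gp3
  csi.storage.k8s.io/fstype: ext4       # CSI-style filesystem parameter
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer # provision in the zone of the scheduled pod
allowVolumeExpansion: true              # lets PVCs using this class be resized
```

WaitForFirstConsumer delays provisioning until a pod uses the claim, which avoids creating a zonal disk in a zone where the pod cannot be scheduled.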
Common Volume Types
HostPath
Mounts a directory from the host node's filesystem
path: /data
Cloud Volumes
AWS EBS, GCE PD, Azure Disk
volumeID: vol-123456
NFS
Network File System for shared storage
server: nfs-server.local
path: /exports/data
ConfigMap/Secret
Mount configuration data as files
name: app-config
EmptyDir
Temporary directory that shares a pod's lifetime
CSI Volumes
Container Storage Interface for vendor plugins
driver: ebs.csi.aws.com
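The fragments above are fields inside a pod's volumes list. As a rough sketch, a single Pod could combine several of these types as follows; the Pod name and busybox image are assumptions, and the ConfigMap app-config and NFS server nfs-server.local are taken from the fragments and would have to exist already.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-types-demo       # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
        - name: config
          mountPath: /etc/app
          readOnly: true
        - name: shared
          mountPath: /mnt/shared
  volumes:
    - name: scratch
      emptyDir: {}              # deleted when the pod is removed
    - name: config
      configMap:
        name: app-config        # each key becomes a file under /etc/app
    - name: shared
      nfs:
        server: nfs-server.local
        path: /exports/data
```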
Essential Storage Commands
List Storage Resources
kubectl get pv
kubectl get pvc
kubectl get storageclass
Get PersistentVolumes, PersistentVolumeClaims, and StorageClasses
Describe Resources
kubectl describe pvc [pvc-name]
kubectl describe storageclass [sc-name]
Get detailed information about storage resources
Create from YAML
kubectl apply -f pvc.yaml
kubectl apply -f storageclass.yaml
Create storage resources from configuration files
Delete Resources
kubectl delete pvc [pvc-name]
kubectl delete -f storage-config.yaml
Delete storage resources
Complete Example: Database with Persistent Storage
# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: 10Gi
---
# Deployment with Volume
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - image: mysql:8.0
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: password
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          persistentVolumeClaim:
            claimName: mysql-pvc
Key Points
- PVC requests 10Gi (10 GiB) of storage from the "fast" StorageClass
- Deployment mounts the PVC at /var/lib/mysql in the container
- MySQL data persists even if the pod is rescheduled
- The Recreate strategy stops the old pod before starting its replacement, so two MySQL pods never mount the volume at the same time during an update
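The Deployment reads MYSQL_ROOT_PASSWORD from a Secret named mysql-secret with a key called password, which the manifest does not define. A minimal sketch of that Secret could look like this; the password value is a placeholder, not a real credential.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret            # name referenced by secretKeyRef in the Deployment
type: Opaque
stringData:
  password: change-me           # placeholder value; replace before applying
```

Using stringData lets the value be written in plain text; the API server stores it base64-encoded under data. The Secret needs to exist in the same namespace before the mysql pod can start.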
Access Modes
- ReadWriteOnce (RWO): Read-write by a single node
- ReadOnlyMany (ROX): Read-only by many nodes
- ReadWriteMany (RWX): Read-write by many nodes
- ReadWriteOncePod (RWOP): Read-write by a single pod
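The access mode is requested on the claim. The sketch below asks for ReadWriteMany so that pods on several nodes can write to the same volume. The claim name and the nfs-shared StorageClass are hypothetical, and the request only succeeds if the backing storage supports that mode; block volumes such as AWS EBS are typically limited to ReadWriteOnce.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-content          # hypothetical name
spec:
  accessModes:
    - ReadWriteMany             # read-write from many nodes at once
  storageClassName: nfs-shared  # hypothetical class backed by shared file storage
  resources:
    requests:
      storage: 20Gi
```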
Storage Best Practices
Planning & Configuration
- Use dynamic provisioning for most use cases
- Choose appropriate access modes for your workload
- Set proper storage class based on performance needs
- Use StatefulSets for stateful applications
- Consider volume snapshots for backup strategies
Operations & Maintenance
- Monitor storage usage and set up alerts
- Understand reclaim policies (Delete and Retain; Recycle is deprecated)
- Use volume expansion features when available
- Test backup and restore procedures regularly
- Document storage requirements for each application
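The snapshot recommendation can be sketched with a VolumeSnapshot object. This assumes the cluster has the snapshot CRDs and controller installed and a CSI driver that supports snapshots; the snapshot name and the csi-snapclass VolumeSnapshotClass are hypothetical. It targets the mysql-pvc claim from the database example.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-snapshot                      # hypothetical name
spec:
  volumeSnapshotClassName: csi-snapclass    # hypothetical VolumeSnapshotClass
  source:
    persistentVolumeClaimName: mysql-pvc    # claim to snapshot
```

A new PVC can later be restored from the snapshot by listing it as the claim's dataSource, which is one way to exercise the restore procedure mentioned above.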
StatefulSet Example
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "fast"
        resources:
          requests:
            storage: 1Gi
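The StatefulSet controller creates one PVC per pod from the claim template: www-web-0, www-web-1, and www-web-2 for pods web-0, web-1, and web-2. Each pod reattaches to its own PVC when it is rescheduled.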
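The serviceName field refers to a headless Service that gives each pod a stable DNS name. The StatefulSet manifest does not include it, so a minimal sketch is shown here; the port name and number are assumptions based on the default nginx port.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx                   # must match serviceName in the StatefulSet
spec:
  clusterIP: None               # headless: no virtual IP, per-pod DNS records
  selector:
    app: nginx
  ports:
    - name: web                 # assumed port name
      port: 80
```

With this Service in place, the pods are reachable as web-0.nginx, web-1.nginx, and web-2.nginx within the namespace.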