What are Kubernetes Probes?
Probing the Container
- The kubelet checks containers periodically using probes
- Probes determine the health and readiness of containers
- Three types of probes: Startup, Readiness, and Liveness
- Each probe can use different action types to check health
- Essential for maintaining application reliability
Startup Probe
To know when a container has started
Readiness Probe
To know when a container is ready to accept traffic
Liveness Probe
To know whether a running container is still healthy or needs a restart
Probe Types Explained
Startup Probe
Used for slow-starting containers to determine when the application has successfully started.
- Disables liveness and readiness checks until it succeeds
- Useful for legacy applications with long startup times
- Prevents killing containers during initialization
Readiness Probe
Determines if a container is ready to serve requests.
- A failing readiness probe removes the pod from Service endpoints, so it stops receiving traffic
- Container remains running but not receiving traffic
- Essential for rolling updates and load balancing
Liveness Probe
Determines if the container is running properly.
- A failing liveness probe restarts the container
- Detects deadlocks and hung applications
- Ensures application remains responsive
Important: A failing readiness probe will stop the application from receiving traffic. A failing liveness probe will restart the container.
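To make the rolling-update point concrete, here is a minimal sketch of a Deployment that relies on a readiness probe. The name `web`, the image, and the `/ready` path are placeholders rather than anything from the examples in this deck. With `maxUnavailable: 0`, the rollout should only retire an old pod once a new pod has passed its readiness probe.

```yaml
# Hypothetical Deployment; name, image, and /ready path are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # keep all 3 replicas serving during the rollout
      maxSurge: 1         # bring up one extra pod at a time
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: example.com/web:1.0   # placeholder image
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /ready             # assumed readiness endpoint
            port: 8080
          periodSeconds: 5
```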
Probe Action Types
ExecAction
Execute a command inside the container
exec:
  command:
  - cat
  - /app/healthy
TCPSocketAction
Check if a TCP socket port is open
tcpSocket:
  port: 8080
HTTPGetAction
Performs an HTTP GET against a specific port and path
httpGet:
  path: /healthz
  port: 8080
HTTPGet Additional Options
httpGet:
  path: /health
  port: 8080
  host: 127.0.0.1
  scheme: HTTPS
  httpHeaders:
  - name: Custom-Header
    value: Awesome
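The `port` field of an HTTP or TCP probe can also refer to a named container port instead of a number. The fragment below is a small sketch of that; the port name `health-port` is made up for illustration.

```yaml
# Fragment of a container spec; "health-port" is a hypothetical name.
ports:
- name: health-port
  containerPort: 8080
livenessProbe:
  httpGet:
    path: /health
    port: health-port   # resolves to containerPort 8080 above
```

Naming the port keeps the probe in step with the container port if the number changes later.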
Probe Configuration Parameters
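- initialDelaySeconds: Delay after container start before the first probe (default 0)
- periodSeconds: How often to probe (default 10)
- timeoutSeconds: How long to wait before a single probe attempt counts as failed (default 1)
- successThreshold: Consecutive successes needed after a failure (default 1; must be 1 for liveness and startup probes)
- failureThreshold: Consecutive failures before the probe is considered failed (default 3)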
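As a rough illustration of how these parameters combine, the sketch below sets all five on one readiness probe. The `/ready` path and the values are assumptions chosen for the example, not recommendations from this deck.

```yaml
# Fragment of a container spec; path and values are illustrative.
readinessProbe:
  httpGet:
    path: /ready           # assumed endpoint
    port: 8080
  initialDelaySeconds: 5   # first probe about 5s after the container starts
  periodSeconds: 10        # then one probe every 10s
  timeoutSeconds: 2        # an attempt with no answer within 2s is a failure
  successThreshold: 1      # one success marks the container ready again
  failureThreshold: 3      # three failures in a row mark it unready
```

With these values, a container that stops responding would be marked unready after roughly failureThreshold x periodSeconds = 30 seconds.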
Complete Probes Example
apiVersion: v1
kind: Pod
metadata:
  name: goproxy
  labels:
    app: goproxy
spec:
  containers:
  - name: goproxy
    image: k8s.gcr.io/goproxy:0.1
    ports:
    - containerPort: 8080
    # Startup Probe - for slow starting containers
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 3
      periodSeconds: 10
    # Readiness Probe - when container is ready for traffic
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    # Liveness Probe - if container is running properly
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
Startup Probe
- HTTP GET to /healthz on port 8080
- Checks every 10 seconds
- Allows 3 consecutive failures (about 30 seconds) before the kubelet kills the container and applies the pod's restartPolicy
- Disables other probes until successful
Readiness Probe
- TCP socket check on port 8080
- Starts after 5 seconds
- Checks every 10 seconds
- If fails, stops traffic to the pod
Liveness Probe
- TCP socket check on port 8080
- Starts after 15 seconds
- Checks every 20 seconds
- If fails, restarts the container
Advanced Probe Examples
Exec Action Example
apiVersion: v1
kind: Pod
metadata:
  name: postgres-db
spec:
  containers:
  - name: postgres
    image: postgres:13
    env:
    - name: POSTGRES_PASSWORD
      value: "secret"
    livenessProbe:
      exec:
        command:
        - sh
        - -c
        - exec pg_isready -U postgres
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      exec:
        command:
        - sh
        - -c
        - exec pg_isready -U postgres
      initialDelaySeconds: 5
      periodSeconds: 5
HTTP Get with Headers Example
apiVersion: v1
kind: Pod
metadata:
  name: web-application
spec:
  containers:
  - name: webapp
    image: nginx:latest
    livenessProbe:
      httpGet:
        path: /health
        port: 80
        httpHeaders:
        - name: X-Custom-Auth
          value: "Bearer token123"
        - name: Accept
          value: application/json
      initialDelaySeconds: 10
      periodSeconds: 5
      timeoutSeconds: 2
    readinessProbe:
      httpGet:
        path: /ready
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
      successThreshold: 2
      failureThreshold: 3
Database with All Three Probes
apiVersion: v1
kind: Pod
metadata:
  name: mysql-database
spec:
  containers:
  - name: mysql
    image: mysql:8.0
    env:
    - name: MYSQL_ROOT_PASSWORD
      value: "secret123"
    startupProbe:
      exec:
        command:
        - sh
        - -c
        - mysqladmin ping -h localhost -uroot -p${MYSQL_ROOT_PASSWORD}
      failureThreshold: 30
      periodSeconds: 10
    readinessProbe:
      exec:
        command:
        - sh
        - -c
        - mysql -e 'SELECT 1' -uroot -p${MYSQL_ROOT_PASSWORD}
      initialDelaySeconds: 5
      periodSeconds: 5
    livenessProbe:
      tcpSocket:
        port: 3306
      initialDelaySeconds: 30
      periodSeconds: 10
Probes Best Practices
Configuration Guidelines
- Set appropriate initial delays: Allow applications to start properly
- Use conservative timeouts: Avoid false positives from slow responses
- Configure realistic periods: Balance between responsiveness and load
- Use startup probes for slow applications: Prevent unnecessary restarts
- Make readiness probes lightweight: They keep running for the container's whole lifetime, not only at startup
- Test failure scenarios: Ensure probes work as expected
Application Design
- Implement proper health endpoints: /healthz, /readyz, /livez
- Keep liveness checks independent: Don't depend on external services
- Put dependency checks in readiness only: Database, cache, etc.
- Keep liveness checks simple: Basic "can the process still respond" check
- Use different endpoints: Separate health, readiness, and liveness
- Log probe failures: Help with debugging issues
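The design points above could translate into a container spec along these lines. This is only a sketch: the image and the `/readyz` and `/livez` handlers are assumed to exist in the application, and what each handler checks is up to the application code.

```yaml
# Fragment of a pod spec; image and endpoints are assumptions.
containers:
- name: api
  image: example.com/api:1.0   # placeholder image
  ports:
  - containerPort: 8080
  readinessProbe:
    httpGet:
      path: /readyz            # may verify database, cache, and other dependencies
      port: 8080
    periodSeconds: 5
  livenessProbe:
    httpGet:
      path: /livez             # process-local check only, no external calls
      port: 8080
    periodSeconds: 10
```

Splitting the endpoints this way means a database outage takes pods out of rotation without restarting them.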
Probe Configuration Recommendations
Startup Probe
- failureThreshold: 30
- periodSeconds: 10
- Gives the app up to 5 minutes (30 x 10s) to start; raise failureThreshold for slower apps
Readiness Probe
- periodSeconds: 5-10
- timeoutSeconds: 1-3
- failureThreshold: 3
Liveness Probe
- periodSeconds: 10-30
- initialDelaySeconds: 15-30
- failureThreshold: 3
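Put together, the recommended values might look like the fragment below. The TCP check on port 8080 is an assumption used to keep the sketch short, and the values are picked from within the ranges listed above.

```yaml
# Fragment of a container spec; port 8080 is assumed.
startupProbe:
  tcpSocket:
    port: 8080
  failureThreshold: 30
  periodSeconds: 10        # 30 x 10s = up to 300s allowed for startup
readinessProbe:
  tcpSocket:
    port: 8080
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 3      # unready after about 3 x 5s = 15s of failures
livenessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  failureThreshold: 3      # restarted after about 3 x 10s = 30s of failures
```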
Warning: Avoid making liveness probes dependent on external services. If an external service fails, it could cause all your containers to restart in a cascade failure.