FastAPI applications, when deployed to Kubernetes, often need to expose health and readiness endpoints so the orchestrator can properly manage their lifecycle.
Imagine a FastAPI app running inside a Kubernetes pod. Kubernetes needs to know if your app is alive and ready to serve traffic. It does this by periodically hitting specific URLs (endpoints) on your application. If your app doesn’t respond or responds with an error, Kubernetes might restart the container (liveness probe) or stop routing traffic to the pod (readiness probe).
Here’s a simple FastAPI app:
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
async def read_root():
    return {"Hello": "World"}


@app.get("/status")
async def get_status():
    # In a real app, this would check database connections, external services, etc.
    return {"status": "ok"}
Now, let’s add proper health and readiness checks. These are typically just HTTP endpoints that return a success or failure status.
from fastapi import FastAPI, HTTPException
import logging

app = FastAPI()

# Simulate a dependency that might fail
DATABASE_CONNECTED = True
EXTERNAL_SERVICE_AVAILABLE = True

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@app.get("/")
async def read_root():
    return {"Hello": "World"}


@app.get("/health")
async def health_check():
    """
    Liveness probe endpoint.
    Checks if the application process is running and responsive.
    """
    logger.info("Health check requested.")
    # A basic liveness check might just ensure the process is alive.
    # For a more robust check, you could ping a local resource or
    # check internal state that indicates the application is running
    # but not necessarily ready for traffic.
    return {"status": "up"}


@app.get("/ready")
async def readiness_check():
    """
    Readiness probe endpoint.
    Checks if the application is ready to serve traffic.
    This means all its dependencies are met (e.g., database, cache, external services).
    """
    logger.info("Readiness check requested.")
    if not DATABASE_CONNECTED:
        logger.error("Database connection failed.")
        raise HTTPException(status_code=503, detail="Database not available")
    if not EXTERNAL_SERVICE_AVAILABLE:
        logger.error("External service unavailable.")
        raise HTTPException(status_code=503, detail="External service not available")
    # In a real application, you'd perform actual checks here:
    # try:
    #     db_connection.execute("SELECT 1")
    # except Exception as e:
    #     logger.error(f"Database check failed: {e}")
    #     raise HTTPException(status_code=503, detail="Database connection error")
    #
    # try:
    #     response = requests.get("http://external-service.com/health")
    #     response.raise_for_status()
    # except Exception as e:
    #     logger.error(f"External service check failed: {e}")
    #     raise HTTPException(status_code=503, detail="External service error")
    return {"status": "ready"}


# Simulate dependency failures for testing
@app.post("/simulate_db_failure")
async def simulate_db_failure():
    global DATABASE_CONNECTED
    DATABASE_CONNECTED = False
    logger.warning("Simulating database failure.")
    return {"message": "Database connection simulated as failed"}


@app.post("/simulate_db_recovery")
async def simulate_db_recovery():
    global DATABASE_CONNECTED
    DATABASE_CONNECTED = True
    logger.warning("Simulating database recovery.")
    return {"message": "Database connection simulated as recovered"}


@app.post("/simulate_external_service_failure")
async def simulate_external_service_failure():
    global EXTERNAL_SERVICE_AVAILABLE
    EXTERNAL_SERVICE_AVAILABLE = False
    logger.warning("Simulating external service failure.")
    return {"message": "External service simulated as unavailable"}


@app.post("/simulate_external_service_recovery")
async def simulate_external_service_recovery():
    global EXTERNAL_SERVICE_AVAILABLE
    EXTERNAL_SERVICE_AVAILABLE = True
    logger.warning("Simulating external service recovery.")
    return {"message": "External service simulated as available"}
In your Kubernetes deployment YAML, you would configure these endpoints as probes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-fastapi-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: fastapi
  template:
    metadata:
      labels:
        app: fastapi
    spec:
      containers:
        - name: app
          image: your-docker-image # Replace with your actual image
          ports:
            - containerPort: 8000
          livenessProbe:
            httpGet:
              path: /health # This hits our /health endpoint
              port: 8000
            initialDelaySeconds: 15 # Wait 15 seconds before first probe
            periodSeconds: 20 # Probe every 20 seconds
            timeoutSeconds: 5 # Fail if no response within 5 seconds
            failureThreshold: 3 # Consider the container unhealthy after 3 failures
          readinessProbe:
            httpGet:
              path: /ready # This hits our /ready endpoint
              port: 8000
            initialDelaySeconds: 5 # Wait 5 seconds before first probe
            periodSeconds: 10 # Probe every 10 seconds
            timeoutSeconds: 5 # Fail if no response within 5 seconds
            failureThreshold: 3 # Consider the container not ready after 3 failures
The core idea is that /health (liveness) is a quick check that the process is running and the application can still respond. It might just confirm that the web server is able to answer a request, and it should stay cheap, because a failing liveness probe leads to a container restart. /ready (readiness), on the other hand, is a more thorough check. It verifies that the application has successfully initialized all its critical dependencies, such as connecting to a database, loading configuration, or establishing connections to essential services. If /ready fails, Kubernetes stops routing new traffic to that pod until the endpoint returns a success code again (e.g., 200 OK).
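To make the readiness side more concrete, one way to think about it is "run every dependency check with a timeout, and report 503 if any of them fails". The sketch below uses only the standard library so it can run on its own; `check_database`, `check_external_service`, and `evaluate_readiness` are hypothetical names, and the two checks are stand-ins rather than real connections.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

Check = Callable[[], Awaitable[None]]


# Hypothetical stand-ins for real dependency checks. In a real app these
# might run "SELECT 1" or call a downstream health URL.
async def check_database() -> None:
    await asyncio.sleep(0.01)  # pretend to do a quick round trip


async def check_external_service() -> None:
    raise ConnectionError("external service unreachable")


async def evaluate_readiness(
    checks: Dict[str, Check], timeout: float = 1.0
) -> Tuple[int, Dict[str, str]]:
    """Run each check with a timeout; return (status_code, per-check results)."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            await asyncio.wait_for(check(), timeout=timeout)
            results[name] = "ok"
        except asyncio.TimeoutError:
            results[name] = "timeout"
        except Exception as exc:  # any other failure also means "not ready"
            results[name] = f"failed: {exc}"
    # 200 only when every dependency passed; otherwise 503.
    status = 200 if all(value == "ok" for value in results.values()) else 503
    return status, results


if __name__ == "__main__":
    status, results = asyncio.run(
        evaluate_readiness(
            {"database": check_database, "external": check_external_service}
        )
    )
    print(status, results)
```

Inside the /ready handler, a result like this could be turned into a normal response when the status is 200 and an HTTPException with status 503 otherwise. The checks run one after another here for simplicity; running them concurrently would shorten the worst-case response time, which matters because the probe itself has a timeout.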
initialDelaySeconds is crucial. It gives your application time to start up and initialize before Kubernetes begins probing. Without it, the liveness probe can start failing while the app is still booting, and Kubernetes might restart the container before it ever has a chance to become ready. periodSeconds defines how often the probe runs, timeoutSeconds is how long Kubernetes waits for a response, and failureThreshold is how many consecutive failures it takes before Kubernetes acts on the probe: restarting the container for liveness, or marking the pod as not ready for readiness.
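A bit of arithmetic shows what the values in the deployment above mean in practice. The sketch below is a simplified model, not how the kubelet is implemented: it ignores timeoutSeconds and any scheduling jitter, and the names `ProbeConfig`, `earliest_failure_seconds`, and `detection_window_seconds` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProbeConfig:
    initial_delay_seconds: int
    period_seconds: int
    failure_threshold: int


def earliest_failure_seconds(cfg: ProbeConfig) -> int:
    """Approximate time after container start at which the probe is first
    considered failed, assuming every attempt fails from the very first one.

    The first attempt happens at about initial_delay_seconds, each later
    attempt follows period_seconds after the previous one, and
    failure_threshold consecutive failures are required.
    """
    return cfg.initial_delay_seconds + cfg.period_seconds * (cfg.failure_threshold - 1)


def detection_window_seconds(cfg: ProbeConfig) -> int:
    """Rough upper bound on how long an already running container can be
    broken before failure_threshold consecutive failures have been seen."""
    return cfg.period_seconds * cfg.failure_threshold


# Values taken from the deployment YAML in this article.
liveness = ProbeConfig(initial_delay_seconds=15, period_seconds=20, failure_threshold=3)
readiness = ProbeConfig(initial_delay_seconds=5, period_seconds=10, failure_threshold=3)

print(earliest_failure_seconds(liveness))   # 15 + 20 * 2 = 55
print(earliest_failure_seconds(readiness))  # 5 + 10 * 2 = 25
print(detection_window_seconds(liveness))   # 20 * 3 = 60
print(detection_window_seconds(readiness))  # 10 * 3 = 30
```

Under these assumptions, an app that never comes up would be restarted after roughly 55 seconds, and a pod whose database drops out would keep receiving traffic for up to about 30 seconds. Lowering periodSeconds or failureThreshold shortens those windows at the cost of more probe traffic and less tolerance for brief hiccups.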
A subtle but important point is how Kubernetes interprets FastAPI’s HTTPException with a 503 Service Unavailable status code. An HTTP probe counts as successful when the response status is at least 200 and below 400; any other status, including 503, registers as a probe failure. This is exactly what you want for a readiness probe when a dependency is down. The application itself is running, but it’s not ready to serve traffic.
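The rule can be written down in a few lines. For HTTP probes, Kubernetes treats any status from 200 up to, but not including, 400 as success. The sketch below pairs that rule with a simplified model of how consecutive results flip a pod between ready and not ready; `probe_succeeded` and `ReadinessTracker` are hypothetical names, and the tracker is an approximation of the documented behavior rather than kubelet code.

```python
def probe_succeeded(status_code: int) -> bool:
    """HTTP probe rule: 200 <= status < 400 is success, anything else fails."""
    return 200 <= status_code < 400


class ReadinessTracker:
    """Simplified model of readiness state driven by probe results."""

    def __init__(self, failure_threshold: int = 3, success_threshold: int = 1):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.ready = False  # a container is not ready until a probe succeeds
        self._failures = 0
        self._successes = 0

    def record(self, status_code: int) -> bool:
        """Feed one probe response into the model and return the ready state."""
        if probe_succeeded(status_code):
            self._successes += 1
            self._failures = 0
            if self._successes >= self.success_threshold:
                self.ready = True
        else:
            self._failures += 1
            self._successes = 0
            if self._failures >= self.failure_threshold:
                self.ready = False
        return self.ready


tracker = ReadinessTracker(failure_threshold=3)
for code in (200, 503, 503, 200, 503, 503, 503, 200):
    print(code, "ready" if tracker.record(code) else "not ready")
```

In this model two 503 responses in a row are tolerated, the third consecutive one marks the pod as not ready, and a single 200 brings it back. That matches the intent of raising a 503 from /ready: the process keeps running while traffic is held back until the dependency recovers.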
The next thing you’ll likely encounter is managing the lifecycle of these dependencies more dynamically, perhaps using background tasks or FastAPI’s lifespan handlers (the successor to the older startup events) to signal readiness changes.