Kubernetes doesn’t actually run your Flask app; it orchestrates containers that do.
Let’s see it in action. Imagine we have a simple Flask app, app.py:
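import os

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    # Inside a pod, HOSTNAME is set to the pod's name
    return f"Hello from pod: {os.environ.get('HOSTNAME', 'unknown')}"

if __name__ == '__main__':
    # debug stays off: debug mode exposes an interactive debugger, which is unsafe on 0.0.0.0
    app.run(debug=False, host='0.0.0.0', port=5000)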
To deploy this on Kubernetes, we need a few things. First, a Dockerfile to build an image of our app:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
And a requirements.txt:
Flask==2.2.2
We’d build this image and push it to a container registry (like Docker Hub, Google Container Registry, or AWS Elastic Container Registry). Let’s assume our image is my-flask-app:latest; in a real cluster the name would include your registry path.
Now, the Kubernetes magic. We need two main YAML files: a Deployment and a Service.
The Deployment tells Kubernetes how to run our application. It specifies the container image, how many replicas (copies) of our app we want, and how to update them.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-app-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: flask-app
  template:
    metadata:
      labels:
        app: flask-app
    spec:
      containers:
      - name: flask-app-container
        image: my-flask-app:latest # Replace with your actual image path
        ports:
        - containerPort: 5000
This Deployment ensures that three pods running our Flask app exist at all times. If a container crashes, Kubernetes restarts it; if a pod is deleted or its node fails, Kubernetes creates a replacement pod.
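The "always three pods" behaviour comes from a control loop that keeps comparing the desired replica count with what is actually running. The toy Python model below only illustrates that idea; the reconcile function and the pod naming scheme are hypothetical and are not how Kubernetes' own controllers are written.

```python
import itertools

# Counter used to give each simulated pod a unique suffix (hypothetical naming).
_pod_ids = itertools.count(1)


def reconcile(desired_replicas, running_pods):
    """Return a new pod list that matches the desired replica count.

    Toy model only: real controllers talk to the API server and act on
    Pod objects, not on a list of strings.
    """
    pods = list(running_pods)
    # Too few pods: start replacements until the count matches.
    while len(pods) < desired_replicas:
        pods.append(f"flask-app-{next(_pod_ids)}")
    # Too many pods: drop the surplus.
    return pods[:desired_replicas]


pods = reconcile(3, [])     # initial rollout: three pods are created
pods.remove(pods[0])        # simulate one pod disappearing
pods = reconcile(3, pods)   # the loop notices the gap and adds a pod
print(len(pods))            # 3
```

The same comparison explains scaling: changing replicas in the YAML changes the desired count, and the loop adds or removes pods to match.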
The Service is what exposes our application to the outside world or to other services within the cluster. It acts as a stable IP address and load balancer for our pods.
apiVersion: v1
kind: Service
metadata:
  name: flask-app-service
spec:
  selector:
    app: flask-app # This must match the labels in the Deployment's template
  ports:
  - protocol: TCP
    port: 80 # The port the service will be available on
    targetPort: 5000 # The port your Flask app listens on inside the container
  type: LoadBalancer # Exposes the service externally using a cloud provider's load balancer
With type: LoadBalancer, a cloud provider (like AWS, GCP, Azure) will provision an external IP address and load balancer for us. When you apply these YAMLs (kubectl apply -f deployment.yaml and kubectl apply -f service.yaml), Kubernetes does the heavy lifting. It pulls the my-flask-app:latest image, creates 3 pods, and then creates a Service that directs traffic to those pods on port 5000.
You can then get the external IP of the service:
kubectl get service flask-app-service
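And access your app via that IP in your browser. Each time you refresh, you might see "Hello from pod: " followed by a different pod name, because the Service spreads requests across the three replicas.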
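If you would rather check this from a script than by refreshing the browser, a small standard-library sketch like the one below can count which pods answered. The tally_pods and fetch helpers are names made up for this example, and the URL you pass in is whatever external IP kubectl reported.

```python
from collections import Counter
from urllib.request import urlopen

# Prefix returned by the Flask route in app.py.
PREFIX = "Hello from pod: "


def fetch(url):
    """Return the body of a GET request as text."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def tally_pods(url, attempts=10, fetch=fetch):
    """Request the URL several times and count responses per pod name."""
    counts = Counter()
    for _ in range(attempts):
        body = fetch(url)
        # Ignore anything that is not the expected greeting.
        if body.startswith(PREFIX):
            counts[body[len(PREFIX):].strip()] += 1
    return counts
```

Calling tally_pods("http://EXTERNAL_IP/") with the real address in place of the EXTERNAL_IP placeholder returns a Counter keyed by pod name. The distribution is not guaranteed to be even, so a short run may show some pods more often than others.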
The core problem this solves is managing the lifecycle and scaling of your application. Instead of manually starting, stopping, and load-balancing instances, Kubernetes handles it. It provides self-healing (restarting failed pods), scaling (adjusting replica count), and service discovery. The Deployment manages the desired state of your application’s pods, while the Service provides a consistent network endpoint.
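A common point of confusion is the selector in the Service and the matchLabels in the Deployment. These are the glue, and both refer to the labels in the Deployment's pod template (app: flask-app here). The Service uses its selector to find the pods it should send traffic to. The Deployment uses matchLabels to identify the pods it is responsible for managing. If the Service selector does not match the pod labels, the Service has no pods to route to; if matchLabels does not match the template labels, Kubernetes rejects the Deployment.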
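The matching rule itself is simple: an equality-based selector matches a pod when every key/value pair in the selector also appears in the pod's labels. Here is a minimal Python sketch of that rule; selector_matches is a hypothetical helper for illustration, not a Kubernetes API.

```python
def selector_matches(selector, pod_labels):
    """True if every key/value pair in the selector appears in the pod's labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


service_selector = {"app": "flask-app"}

# Exact match: the pod is selected.
print(selector_matches(service_selector, {"app": "flask-app"}))                 # True
# Extra labels on the pod do not prevent a match.
print(selector_matches(service_selector, {"app": "flask-app", "tier": "web"}))  # True
# A different value means the pod is not selected.
print(selector_matches(service_selector, {"app": "flask"}))                     # False
```

The third case is the typical failure: a small typo in a label leaves the Service with nothing to route to, even though the pods themselves are healthy.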
The next step is often integrating your Flask app with a proper WSGI server like Gunicorn or uWSGI, rather than using Flask’s built-in development server, for production readiness.
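As a rough idea of what that involves with Gunicorn, its configuration file is itself plain Python. The values below are illustrative assumptions rather than tuned recommendations; bind, workers, and accesslog are standard Gunicorn settings.

```python
# gunicorn.conf.py (illustrative values, not tuned recommendations)

# Listen on the same port the Deployment and Service already expect.
bind = "0.0.0.0:5000"

# Worker processes per pod; Kubernetes replicas provide further scaling.
workers = 2

# "-" sends the access log to stdout, where kubectl logs can show it.
accesslog = "-"
```

With gunicorn added to requirements.txt, the container would be started with gunicorn -c gunicorn.conf.py app:app instead of python app.py, where app:app refers to the app object in app.py.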