Express.js applications deployed to Kubernetes are often treated as ephemeral, stateless services. In practice, many real-world Express apps maintain some form of state or depend on external services, and both require careful handling within the Kubernetes model.

Let’s look at a minimal Express.js app that will run in a Kubernetes pod.

// app.js
const express = require('express');
const app = express();
const port = 3000;

app.get('/', (req, res) => {
  res.send('Hello from Express on Kubernetes!');
});

app.listen(port, () => {
  console.log(`Express app listening on port ${port}`);
});

This simple app doesn’t do much, but it’s the foundation. To run this on Kubernetes, you’d typically have a Dockerfile:

# Dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]

And then a Kubernetes deployment manifest:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: express-app-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: express-app
  template:
    metadata:
      labels:
        app: express-app
    spec:
      containers:
      - name: express-app
        image: your-dockerhub-username/express-app:latest
        ports:
        - containerPort: 3000

And a service to expose it:

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: express-app-service
spec:
  selector:
    app: express-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: LoadBalancer

This setup lets multiple instances of your Express app run concurrently, managed by Kubernetes. The Service of type LoadBalancer obtains an external IP address from your cloud provider (or reuses an existing one if the provider supports it) and distributes incoming traffic across the replicas defined in the Deployment.

This setup addresses scalability and resilience. Instead of managing a single server yourself, you let Kubernetes replace pods that fail, and you can scale the number of replicas up or down as traffic changes. The Service abstraction gives traffic a stable entry point, even as individual pods are created or destroyed.
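Because pods are created and destroyed routinely, the app itself should shut down cleanly when asked. Kubernetes sends SIGTERM to the container before stopping a pod, so a minimal sketch of a shutdown hook might look like the following. The helper name `installGracefulShutdown`, its `timeoutMs` option, and the injectable `proc` parameter are all hypothetical and are not part of Express or Kubernetes.

```javascript
// Hypothetical helper: close the HTTP server cleanly when the pod is asked to stop.
// `proc` defaults to the real `process`; it is a parameter only so the helper can be tested.
function installGracefulShutdown(server, { timeoutMs = 10000, proc = process } = {}) {
  proc.once('SIGTERM', () => {
    // Safety net: force a non-zero exit if open connections never drain.
    const timer = setTimeout(() => proc.exit(1), timeoutMs);
    timer.unref();

    // Stop accepting new connections; the callback runs once existing ones have ended.
    server.close(() => {
      clearTimeout(timer);
      proc.exit(0);
    });
  });
}
```

In app.js this could be wired up by keeping the return value of `app.listen(...)`, which is a Node `http.Server`, and passing it to `installGracefulShutdown(server)`. The 10 second timeout is an arbitrary choice that stays under the default 30 second termination grace period.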

Internally, when a request hits the LoadBalancer IP, it’s directed to one of the pods running your Express app that Kubernetes considers ready. Kubernetes’ kube-proxy (or an equivalent component, depending on your CNI) programs the rules that route and load-balance Service traffic at the network level. Your Express app itself simply listens on port 3000 on all interfaces inside the container and responds to requests.
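To make the routing step concrete, here is a toy model of how a backend pod might be chosen. It is only an illustration: real selection happens in the kernel through rules that kube-proxy programs, and in its iptables mode the backend is picked at random. The `pickEndpoint` function and the `{ ip, ready }` record shape are hypothetical.

```javascript
// Toy model only: this is not how kube-proxy is implemented.
// Each endpoint is a hypothetical { ip, ready } record describing one pod.
function pickEndpoint(endpoints, random = Math.random) {
  // Only pods that are currently ready are eligible to receive traffic.
  const ready = endpoints.filter((endpoint) => endpoint.ready);
  if (ready.length === 0) {
    return null; // no ready pod, so the request cannot be served
  }
  // Choose one eligible pod uniformly at random.
  return ready[Math.floor(random() * ready.length)];
}

const endpoints = [
  { ip: '10.0.0.1', ready: true },
  { ip: '10.0.0.2', ready: false }, // skipped until it becomes ready again
  { ip: '10.0.0.3', ready: true },
];
console.log(pickEndpoint(endpoints));
```

Note that the Deployment above defines no readiness probe, so Kubernetes treats a pod as ready as soon as its container is running.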

The critical levers you control are:

  • replicas: How many instances of your app should run.
  • image: The Docker image containing your Express app.
  • containerPort: The port your Express app listens on inside the container.
  • port (in Service): The port the Kubernetes Service exposes.
  • targetPort (in Service): The containerPort to which the Service directs traffic.
  • selector: How the Service finds the pods belonging to your Deployment.
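The selector is the lever that most often causes silent failures, so it helps to see the matching rule spelled out. The sketch below models an equality-based selector like the one in service.yaml: a pod matches when every key/value pair in the selector appears in the pod's labels. The function name `selectorMatches` and the extra `tier` label are hypothetical.

```javascript
// Equality-based matching: every selector pair must be present in the pod's labels.
// Extra labels on the pod do not prevent a match.
function selectorMatches(selector, labels) {
  return Object.entries(selector).every(([key, value]) => labels[key] === value);
}

// `app: express-app` comes from the manifests above; `tier` is a made-up extra label.
const serviceSelector = { app: 'express-app' };
const podLabels = { app: 'express-app', tier: 'web' };

console.log(selectorMatches(serviceSelector, podLabels)); // true
console.log(selectorMatches(serviceSelector, { app: 'other-app' })); // false
```

If the Service's selector and the Deployment's pod template labels drift apart, the Service ends up with no endpoints and requests fail even though all pods are running.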

A common misconception is that LoadBalancer is the only way to expose an Express app. For internal cluster communication, or when using an Ingress controller, you’d use type: ClusterIP for the Service and manage external access via an Ingress resource. This is often more cost-effective and provides richer routing capabilities.
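For the internal case, other pods usually reach a Service through its cluster DNS name rather than an external IP. A minimal sketch of building that URL follows, assuming the cluster runs a DNS add-on with the default `cluster.local` domain and that the Service lives in the `default` namespace. The `serviceUrl` helper is hypothetical.

```javascript
// Builds the in-cluster URL for a Service, following the
// <service>.<namespace>.svc.<cluster-domain> naming scheme.
// The defaults below are assumptions about the cluster, not values from the manifests.
function serviceUrl(name, { namespace = 'default', port = 80, clusterDomain = 'cluster.local' } = {}) {
  return `http://${name}.${namespace}.svc.${clusterDomain}:${port}`;
}

console.log(serviceUrl('express-app-service'));
// http://express-app-service.default.svc.cluster.local:80
```

Another workload in the same cluster could pass this URL to an HTTP client such as the global `fetch` in Node 18. The port is the Service's `port` (80), not the container's 3000.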

When you define a Service with type: LoadBalancer, Kubernetes doesn’t magically create a load balancer. It signals to your cloud provider (AWS, GCP, Azure, etc.) or an on-premises load balancer controller (like MetalLB) to provision an actual external load balancer and configure it to forward traffic to the NodePorts that kube-proxy opens on each node for your Service. The LoadBalancer type is essentially a higher-level abstraction that orchestrates these underlying resources.

The next concept you’ll likely grapple with is how to manage persistent data or shared configurations for your Express.js applications running in Kubernetes.
