FastAPI applications can serve production traffic directly from a container with Uvicorn as the only server process, without placing a separate process manager such as Gunicorn in front of it.

Let’s see this in action. Imagine a simple FastAPI app:

# main.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def read_root():
    return {"Hello": "World"}

@app.get("/items/{item_id}")
async def read_item(item_id: int, q: str | None = None):
    return {"item_id": item_id, "q": q}

To containerize this, we’ll use a Dockerfile:

# Dockerfile
FROM python:3.10-slim

WORKDIR /app

COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# This is the key part for direct serving
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]

And a requirements.txt file:

fastapi
uvicorn[standard]

Now, build the image:

docker build -t my-fastapi-app .

And run it:

docker run -d -p 8000:80 my-fastapi-app

You can then access your API at http://localhost:8000.

The key here is that Uvicorn is an ASGI server, designed specifically for asynchronous Python frameworks like FastAPI. Traditional WSGI servers such as Waitress speak a synchronous interface and cannot run an ASGI application on their own, and Gunicorn has traditionally been paired with a Uvicorn worker class to do so. Running Uvicorn directly removes that extra layer, which means fewer moving parts and a simpler deployment pipeline.

By default, Uvicorn runs a single process whose event loop listens for incoming connections; the --workers option starts several such processes when you need more than one. Each incoming request is parsed and handed to your FastAPI application through the ASGI interface, and the response is written back over the same connection. The [standard] part in uvicorn[standard] pulls in optional dependencies such as uvloop (where available), httptools, and websockets, which provide a faster event loop, faster HTTP parsing, and WebSocket support, respectively. They are not strictly required to serve HTTP, but they are the usual choice for a production container.
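To make the split between server and application concrete, here is a minimal sketch of the ASGI calling convention, written with the standard library only. It is not Uvicorn's implementation: the call_once helper is a made-up stand-in that plays the server's role for a single request, and the bare app function stands in for what FastAPI provides.

```python
import asyncio
import json


async def app(scope, receive, send):
    """A bare ASGI application: a coroutine the server calls per connection."""
    if scope["type"] != "http":
        return  # This sketch ignores lifespan and websocket scopes.
    body = json.dumps({"Hello": "World"}).encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


async def call_once(asgi_app, path="/"):
    """Play the server's role for one GET request and collect the messages sent."""
    sent = []

    async def receive():
        # A GET request with an empty body.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await asgi_app(scope, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(call_once(app)):
        print(message)
```

A real server does far more (socket handling, HTTP parsing, timeouts, graceful shutdown), but the contract with the application is this pair of receive and send callables plus a scope dictionary.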

The CMD instruction in the Dockerfile is critical. It specifies the command to run when the container starts. Here, uvicorn main:app --host 0.0.0.0 --port 80 uses the module:attribute form: it tells Uvicorn to import the main module (main.py) and use the object named app inside it, bind to all network interfaces (0.0.0.0) on port 80 (the default HTTP port), and start serving requests. Binding to 0.0.0.0 matters inside a container, because a server bound only to 127.0.0.1 would not be reachable through the published port.
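The main:app argument is an import string. As a rough illustration of how such a string can be resolved, the sketch below uses importlib from the standard library; load_app is a hypothetical helper written for this example, not Uvicorn's own loader, and json:dumps stands in for main:app so the snippet runs anywhere.

```python
import importlib


def load_app(import_string: str):
    """Resolve "module:attribute" to the object it names."""
    module_name, sep, attr_name = import_string.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f'Expected "module:attribute", got {import_string!r}')
    obj = importlib.import_module(module_name)
    # Walk dotted attributes so a path like "os:path.join" also resolves.
    for part in attr_name.split("."):
        obj = getattr(obj, part)
    return obj


if __name__ == "__main__":
    # "json:dumps" is a stand-in for "main:app".
    dumps = load_app("json:dumps")
    print(dumps({"Hello": "World"}))
```

This is also why WORKDIR /app and COPY . . matter in the Dockerfile: the main module has to be importable from the directory in which the command starts.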

A common point of confusion is Uvicorn’s --reload flag, which is widely used during development. When enabled, Uvicorn watches your application files for changes and restarts the server process whenever one changes, which interrupts in-flight requests and open connections. It is a development convenience, not a zero-downtime deployment mechanism. For production, leave --reload off and rely on container orchestration (like Kubernetes or Docker Swarm) for process management and scaling.
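One way to keep --reload out of production images is to derive the start command from the environment. The sketch below is only an illustration under assumptions: APP_ENV is a hypothetical variable name, uvicorn_command is a made-up helper, and the flags are the same ones used in the Dockerfile above plus --reload.

```python
import os
import shlex


def uvicorn_command(env: str, port: int = 80) -> list[str]:
    """Return the argv for Uvicorn; --reload is added only in development."""
    command = ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", str(port)]
    if env == "development":
        command.append("--reload")
    return command


if __name__ == "__main__":
    # APP_ENV is a hypothetical variable name chosen for this sketch.
    env = os.environ.get("APP_ENV", "production")
    print(shlex.join(uvicorn_command(env)))
```

Defaulting to production means a missing variable can never switch reloading on by accident. An entrypoint script could pass the returned list to os.execvp instead of printing it.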

The next step is often to integrate this containerized application into a CI/CD pipeline for automated deployments.
