The order in which FastAPI middleware runs is determined by the order in which you add it: the last middleware added executes first for incoming requests and last for outgoing responses.
Let’s see this in action. Imagine we have a simple FastAPI app:
import time

from fastapi import FastAPI, Request

app = FastAPI()


# Registered first, so this is the inner middleware layer.
@app.middleware("http")
async def add_process_time_header(request: Request, call_next):
    start_time = time.time()
    response = await call_next(request)
    process_time = time.time() - start_time
    response.headers["X-Process-Time"] = str(process_time)
    return response


# Registered last, so this is the outermost middleware layer.
@app.middleware("http")
async def add_server_header(request: Request, call_next):
    response = await call_next(request)
    response.headers["X-Server"] = "MyFastAPIServer"
    return response


@app.get("/")
async def main():
    return {"message": "Hello World"}
If you run this with uvicorn main:app --reload and hit http://127.0.0.1:8000/, you’ll see something like this in the response headers:
HTTP/1.1 200 OK
...
X-Process-Time: 0.00123456789
X-Server: MyFastAPIServer
...
Notice X-Process-Time appears before X-Server. This is because add_server_header was defined after add_process_time_header in the code. For an incoming request, FastAPI calls add_server_header first. Inside add_server_header, it calls await call_next(request), which then calls add_process_time_header. This continues down the chain until the actual endpoint handler (main in this case) is reached.
Once the endpoint handler returns a response, the middleware execution reverses. The response from main is first passed back to add_process_time_header, which adds its header, and then that response is passed back to add_server_header, which adds its header. Finally, the fully decorated response is sent back to the client.
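The request and response flow described above can be traced with a small framework-free sketch. This is a simplified model of the wrapping behavior, not FastAPI's or Starlette's actual implementation; the helpers make_middleware and build_stack and the events list are invented for illustration, and only the two middleware names come from the example.

```python
import asyncio

events = []  # records the order in which each layer runs


async def endpoint(request):
    events.append("endpoint")
    return {"headers": {}}


def make_middleware(name):
    """Build a toy middleware that logs before and after calling the next layer."""
    async def middleware(request, call_next):
        events.append(f"{name}: request")
        response = await call_next(request)
        events.append(f"{name}: response")
        return response
    return middleware


def build_stack(endpoint, middlewares):
    """Wrap the endpoint in registration order, so the last one added ends up outermost."""
    app = endpoint
    for mw in middlewares:
        # Bind the current mw and app as defaults so each layer keeps its own pair.
        async def layer(request, mw=mw, inner=app):
            return await mw(request, inner)
        app = layer
    return app


# Same registration order as the example: process-time first, server header second.
registered = [
    make_middleware("add_process_time_header"),
    make_middleware("add_server_header"),
]
app = build_stack(endpoint, registered)
asyncio.run(app({"path": "/"}))

for event in events:
    print(event)
```

Running the sketch prints the add_server_header request line first and its response line last, with add_process_time_header and the endpoint nested in between.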
This stacking behavior is fundamental to how middleware works. Each middleware wraps the subsequent layers (including other middleware and the endpoint). When call_next is invoked, it passes the request to the next layer in the chain and returns that layer's response. If call_next is not invoked, the inner layers and the endpoint never run, and the middleware must return a response of its own.
This allows for powerful patterns like authentication, logging, request modification, and response transformation. For instance, an authentication middleware added last (and thus executing first) can check credentials and either return an error response immediately or call call_next to allow the request to proceed to the rest of the application. A logging middleware that should record the final response, including headers set by other middleware, also belongs among the last added, because the outermost layer is the last to see the response. A logging middleware added first sits innermost and sees the response before any other middleware has modified it.
The key is understanding that call_next is the bridge between layers. If you want middleware A to run before middleware B on the way in, define A after B in your code. The same placement means A processes the response after B on the way out, because the outermost layer is both the first to see the request and the last to see the response.
To recap with the example: add_server_header is registered after add_process_time_header, so it becomes the outermost layer. FastAPI does not call middleware in registration order; each newly registered middleware wraps everything registered before it. When a request comes in, add_server_header receives it first and calls await call_next(request), which passes the request inward to add_process_time_header, and from there to the endpoint. The response then travels back out in the reverse order: add_process_time_header sets its header first, and add_server_header sets its header on the result. This is why X-Process-Time is added before X-Server in the final response headers, even though add_server_header was defined later in the code.
When you define middleware using @app.middleware("http"), you are essentially pushing it onto a stack. The first middleware defined becomes the innermost handler for the request, and the last middleware defined becomes the outermost. For an incoming request, control flows from the outermost middleware inwards. For an outgoing response, control flows from the innermost middleware outwards. This means the middleware added last in your code will be the first one to receive the incoming request and the last one to see the outgoing response.
The next logical step is understanding how to conditionally skip middleware execution or how to chain custom middleware classes instead of using the decorator.