Serve gRPC Endpoints Alongside FastAPI REST Routes (2026)

FastAPI can serve gRPC endpoints alongside its RESTful routes, but it requires a bit of explicit setup to bridge the two worlds.

Here’s a working example of how you might structure this:

from concurrent import futures
from contextlib import asynccontextmanager

import grpc
from fastapi import FastAPI

from my_grpc_service_pb2_grpc import add_MyServiceServicer_to_server
from my_grpc_server import MyServiceServicer  # Your gRPC servicer implementation

# --- FastAPI App Setup ---
@asynccontextmanager
async def lifespan(app: FastAPI):
    # Setup gRPC server. grpc.server() requires a thread pool; its size caps
    # how many RPC handlers run at the same time.
    grpc_server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    add_MyServiceServicer_to_server(MyServiceServicer(), grpc_server)
    grpc_server.add_insecure_port('[::]:50051')  # gRPC listens on a different port
    grpc_server.start()  # Non-blocking: returns once the server is listening
    print("gRPC server started on port 50051")

    # Setup FastAPI app
    app.state.grpc_server = grpc_server

    yield

    # Shutdown gRPC server: refuse new RPCs, give in-flight ones up to 5 seconds
    app.state.grpc_server.stop(grace=5).wait()
    print("gRPC server stopped")

app = FastAPI(lifespan=lifespan)

@app.get("/")
async def read_root():
    return {"message": "Hello from FastAPI!"}

@app.get("/items/{item_id}")
async def read_item(item_id: int, query_param: str | None = None):
    return {"item_id": item_id, "query_param": query_param}

# --- gRPC Service Definition (my_grpc_service.proto) ---
# syntax = "proto3";
#
# service MyService {
#   rpc SayHello (HelloRequest) returns (MyResponse);
# }
#
# message HelloRequest {
#   string name = 1;
# }
#
# message MyResponse {
#   string message = 1;
# }

# --- gRPC Servicer Implementation (my_grpc_server.py) ---
# import my_grpc_service_pb2_grpc
# from my_grpc_service_pb2 import MyResponse
#
# class MyServiceServicer(my_grpc_service_pb2_grpc.MyServiceServicer):
#     # Plain def, not async def: grpc.server() runs handlers on worker threads
#     def SayHello(self, request, context):
#         return MyResponse(message=f"Hello, {request.name}!")

# --- To run this: ---
# 1. Save the proto as my_grpc_service.proto
# 2. Generate Python code: python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. my_grpc_service.proto
# 3. Save the FastAPI code as main.py
# 4. Save the servicer as my_grpc_server.py
# 5. Install dependencies: pip install fastapi uvicorn grpcio grpcio-tools
# 6. Run: uvicorn main:app --reload

# --- To test gRPC: ---
# (You'll need a separate gRPC client)
# import grpc
# from my_grpc_service_pb2_grpc import MyServiceStub
# from my_grpc_service_pb2 import HelloRequest
#
# with grpc.insecure_channel('localhost:50051') as channel:
#     stub = MyServiceStub(channel)
#     response = stub.SayHello(HelloRequest(name='World'))
#     print(response.message) # Output: Hello, World!

The key point is that you don’t run a single server that handles both HTTP and gRPC natively; you run two separate servers, each on its own port, inside one process and orchestrated by a single lifecycle manager.

Here’s the system in action. We have a FastAPI app that exposes simple REST endpoints (/ and /items/{item_id}). In parallel, it starts a gRPC server that listens on a different port (50051 in this case) and exposes a SayHello RPC. The lifespan context manager is key here: it lets us start and stop the gRPC server as part of the FastAPI application’s startup and shutdown sequence. When FastAPI starts, the code before the yield runs, creating and starting the gRPC server. When FastAPI shuts down, the code after the yield runs and stops the gRPC server.
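To make the startup and shutdown ordering concrete, here is a minimal sketch that uses only the standard library. FakeGrpcServer is a hypothetical stand-in for the real gRPC server (it only records calls), and the async with block plays the role that Uvicorn and FastAPI play when they enter and exit the lifespan.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []


class FakeGrpcServer:
    """Hypothetical stand-in for a gRPC server; it only records calls."""

    def start(self) -> None:
        events.append("grpc start")

    def stop(self, grace: float) -> None:
        events.append(f"grpc stop (grace={grace})")


@asynccontextmanager
async def lifespan(app: object):
    server = FakeGrpcServer()
    server.start()  # runs on entry, before the app serves any request
    try:
        yield
    finally:
        server.stop(grace=5)  # runs on exit, even if serving raised


async def main() -> None:
    # Stand-in for the ASGI server entering and leaving the lifespan.
    async with lifespan(app=None):
        events.append("serving HTTP")


asyncio.run(main())
print(events)  # ['grpc start', 'serving HTTP', 'grpc stop (grace=5)']
```

The try/finally around the yield is a design choice in this sketch, not something the original example does; it keeps the stop call from being skipped if an exception propagates through the context manager.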

The core idea is that FastAPI is an ASGI framework, served by an asynchronous web server such as Uvicorn. gRPC brings its own server implementation. grpcio, the standard Python gRPC library, offers two: the thread-pool-based grpc.server used here and the asyncio-based grpc.aio.server. We instantiate grpc.server and tell it to listen on its own port, separate from the one FastAPI uses (8000 by default for Uvicorn). The add_insecure_port method binds that port without TLS; use add_secure_port with server credentials when you need encryption.

The add_MyServiceServicer_to_server function is generated by grpc_tools.protoc from your .proto file. It takes an instance of your servicer class (which implements the actual RPC logic) and registers it with the gRPC server. Your servicer class, like MyServiceServicer in the example, inherits from the generated base servicer and implements the RPC methods defined in the .proto file. Because grpc.server dispatches handlers on a thread pool, these must be regular (synchronous) methods; async def handlers require grpc.aio.server instead.

The mental model here is a "dual-headed" application. One head is the familiar FastAPI/Uvicorn stack handling HTTP(S) requests. The other head is a standalone gRPC server running concurrently. They share a common lifecycle managed by FastAPI’s lifespan hook, so both start and stop together. You don’t modify FastAPI’s core request handling to intercept gRPC calls; instead, you run two distinct network services from a single deployment unit. Storing objects on app.state is a common pattern for application-level state; here it holds a reference to the running gRPC server instance so it can be stopped later.

What most people don’t realize is that the gRPC server runs independently of the asyncio event loop that Uvicorn manages for FastAPI. The lifespan context manager only starts and stops it; in between, grpcio serves requests on its own internal threads and dispatches each RPC handler to the concurrent.futures.ThreadPoolExecutor passed to grpc.server. The pool’s max_workers setting controls how many RPCs are handled concurrently, and it is entirely separate from FastAPI’s concurrency model.
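The effect of the pool size can be illustrated without gRPC at all. The sketch below is not grpcio code: handle_rpc is a hypothetical stand-in for a blocking RPC handler, and the executor is driven directly to show that max_workers bounds how many handlers run at once while the remaining calls wait in the queue.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 2  # illustrative value; the article's example uses 10

active = 0  # handlers currently running
peak = 0    # highest value of `active` observed
lock = threading.Lock()


def handle_rpc(name: str) -> str:
    """Hypothetical stand-in for a blocking RPC handler."""
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)  # simulate blocking work (database call, disk I/O, ...)
    with lock:
        active -= 1
    return f"Hello, {name}!"


# Submit six "requests" to a pool that has only two worker threads.
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    replies = list(pool.map(handle_rpc, [f"client-{i}" for i in range(6)]))

print(f"peak concurrent handlers: {peak}")
print(replies[0])
```

None of this touches the asyncio event loop, which is why a slow gRPC handler in this setup occupies a worker thread rather than stalling FastAPI’s request handling.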

The next hurdle is often managing authentication and authorization consistently across both HTTP and gRPC endpoints.
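One possible starting point, sketched here under stated assumptions, is to keep the credential check in a single transport-agnostic function and write thin adapters for each side. Everything in this block is hypothetical: the bearer-token scheme, the EXPECTED_TOKEN constant, and the function names are illustrative, and the wiring into a FastAPI dependency or a gRPC interceptor is left out.

```python
import hmac
from typing import Iterable, Mapping, Optional, Tuple

EXPECTED_TOKEN = "s3cr3t"  # hypothetical; load from configuration in practice


def token_is_valid(token: Optional[str]) -> bool:
    """Single source of truth for the credential check."""
    if token is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(token.encode(), EXPECTED_TOKEN.encode())


def token_from_http_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Extract a bearer token; assumes header names are already lowercase."""
    scheme, _, token = headers.get("authorization", "").partition(" ")
    return token if scheme.lower() == "bearer" and token else None


def token_from_grpc_metadata(
    metadata: Iterable[Tuple[str, str]],
) -> Optional[str]:
    """gRPC metadata arrives as (key, value) pairs with lowercase keys."""
    return token_from_http_headers(dict(metadata))


# The same rule applied to both transports.
http_ok = token_is_valid(
    token_from_http_headers({"authorization": "Bearer s3cr3t"})
)
grpc_ok = token_is_valid(
    token_from_grpc_metadata([("authorization", "Bearer s3cr3t")])
)
print(http_ok, grpc_ok)
```

On the HTTP side the header mapping would come from the request; on the gRPC side the pairs would come from context.invocation_metadata() inside a handler or interceptor.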