The most surprising thing about multi-tenant Flask apps with per-tenant schemas is that you can achieve true data isolation without spinning up entirely separate database instances for each tenant.
Let’s see this in action. Imagine we have a simple Flask app managing users and products.
from flask import Flask, request, jsonify
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import create_engine, text
from contextlib import contextmanager
app = Flask(__name__)
db = SQLAlchemy()
# --- Tenant-aware database connection ---
@contextmanager
def tenant_session(tenant_id):
"""Creates a temporary session for a specific tenant."""
# In a real app, you'd fetch this from a config or metadata store
DATABASE_URL = f"postgresql://user:password@host:port/mydatabase"
engine = create_engine(DATABASE_URL)
# Dynamically set the search_path for this connection
with engine.connect() as connection:
connection.execute(text(f"SET search_path TO {tenant_id}, public;"))
yield connection
# --- Models (common for all tenants) ---
class User(db.Model):
id = db.Column(db.Integer, primary_key=True)
username = db.Column(db.String(80), unique=True, nullable=False)
class Product(db.Model):
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String(120), nullable=False)
price = db.Column(db.Float, nullable=False)
# --- Routes ---
@app.route('/<tenant_id>/users', methods=['POST'])
def create_user(tenant_id):
data = request.get_json()
username = data.get('username')
with tenant_session(tenant_id) as connection:
# We need to bind the SQLAlchemy session to this specific connection context
# This is a simplified example; a robust solution might involve custom session factories
session = db.session(bind=connection)
new_user = User(username=username)
session.add(new_user)
session.commit()
return jsonify({"message": "User created", "id": new_user.id}), 201
@app.route('/<tenant_id>/users', methods=['GET'])
def get_users(tenant_id):
with tenant_session(tenant_id) as connection:
session = db.session(bind=connection)
users = session.query(User).all()
return jsonify([{"id": u.id, "username": u.username} for u in users])
@app.route('/<tenant_id>/products', methods=['POST'])
def create_product(tenant_id):
data = request.get_json()
name = data.get('name')
price = data.get('price')
with tenant_session(tenant_id) as connection:
session = db.session(bind=connection)
new_product = Product(name=name, price=price)
session.add(new_product)
session.commit()
return jsonify({"message": "Product created", "id": new_product.id}), 201
@app.route('/<tenant_id>/products', methods=['GET'])
def get_products(tenant_id):
with tenant_session(tenant_id) as connection:
session = db.session(bind=connection)
products = session.query(Product).all()
return jsonify([{"id": p.id, "name": p.name, "price": p.price} for p in products])
if __name__ == '__main__':
# --- Setup (run once) ---
# In a real app, this would be part of your deployment/migration process
DATABASE_URL = "postgresql://user:password@host:port/mydatabase"
engine = create_engine(DATABASE_URL)
# Create schemas for tenants
tenants = ["tenant_a", "tenant_b"]
with engine.connect() as connection:
for tenant_id in tenants:
connection.execute(text(f"CREATE SCHEMA IF NOT EXISTS {tenant_id};"))
# Apply models to each tenant schema
db.metadata.create_all(bind=engine, tables=[User.__table__, Product.__table__], schema=tenant_id)
# Ensure public schema also has tables if needed for shared data
db.metadata.create_all(bind=engine, tables=[User.__table__, Product.__table__], schema="public")
app.run(debug=True)
When tenant_a makes a request to /tenant_a/users, the tenant_session context manager executes SET search_path TO tenant_a, public;. This tells PostgreSQL to look for tables first in the tenant_a schema, and if not found, then in the public schema. So, when session.query(User).all() is called, it queries tenant_a.users, not public.users or tenant_b.users.
The core problem this solves is data isolation for multiple customers on a single database instance, drastically reducing infrastructure costs and simplifying management compared to provisioning separate databases. Each tenant’s data resides in its own PostgreSQL schema. When a request comes in, we dynamically alter the search_path for the database connection. This is a PostgreSQL-specific feature that allows you to define a list of schemas to search in order when resolving unqualified object names (like table names). By setting the search_path to tenant_id, public, we ensure that queries for User or Product tables will first look within that specific tenant’s schema. If there were shared tables (e.g., a global_settings table), they could reside in the public schema, and the search_path ensures they are still accessible.
The exact levers you control are primarily the tenant_id that gets passed into the request handling and subsequently used to construct the search_path. In a real-world application, this tenant_id would typically be derived from the incoming request’s subdomain (e.g., tenant_a.myapp.com), a JWT token, or a lookup based on the authenticated user. The db.metadata.create_all call, when run with a schema argument, tells SQLAlchemy to create the tables within that specific schema during your initial setup or migration process.
What most people don’t realize is that the search_path mechanism isn’t just about isolating data; it’s also incredibly efficient. PostgreSQL’s query planner is aware of the search_path and will optimize queries accordingly. For instance, if tenant_a has a users table and you query SELECT * FROM users, PostgreSQL will attempt to resolve users in tenant_a first. If it exists there, it won’t even bother looking in public or tenant_b. This means you don’t pay a performance penalty for having multiple schemas; the lookups are direct and efficient.
The next concept you’ll likely run into is handling migrations across all tenant schemas consistently.