Soft delete isn’t just about marking a record as inactive; it’s fundamentally about preserving historical data while maintaining the illusion of a clean, current dataset.
Let’s see this in action. Imagine we have a simple User model with a deleted flag.

    from flask import Flask
    from flask_sqlalchemy import SQLAlchemy

    app = Flask(__name__)
    app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///:memory:'
    db = SQLAlchemy(app)


    # The model must inherit from db.Model (not a separate declarative base),
    # otherwise db.create_all() and User.query know nothing about it.
    class User(db.Model):
        __tablename__ = 'users'

        id = db.Column(db.Integer, primary_key=True)
        username = db.Column(db.String(50), unique=True, nullable=False)
        email = db.Column(db.String(120), unique=True, nullable=False)
        deleted = db.Column(db.Boolean, default=False, nullable=False)

        def __repr__(self):
            return f"<User {self.username}>"


    with app.app_context():
        db.create_all()

        # Create some users
        user1 = User(username='alice', email='alice@example.com')
        user2 = User(username='bob', email='bob@example.com')
        user3 = User(username='charlie', email='charlie@example.com')
        db.session.add_all([user1, user2, user3])
        db.session.commit()

        print("All users:")
        print(User.query.all())

        # Soft delete bob
        bob = User.query.filter_by(username='bob').first()
        bob.deleted = True
        db.session.commit()

        # An unfiltered query still returns bob: the row was only flagged
        print("\nUsers after soft deleting bob:")
        print(User.query.all())

        # Let's try to query bob directly by primary key
        print("\nAttempting to query bob directly:")
        print(db.session.get(User, bob.id))  # This will still show bob

        # Now, the crucial part: querying without deleted users
        print("\nQuerying only active users:")
        print(User.query.filter_by(deleted=False).all())

        # Restoring bob
        bob_to_restore = User.query.filter_by(username='bob').first()
        bob_to_restore.deleted = False
        db.session.commit()

        print("\nUsers after restoring bob:")
        print(User.query.all())

The core problem soft delete addresses is that a committed DELETE statement is irreversible. Once the row is gone, it’s gone, short of restoring a backup. This can be catastrophic for auditing, compliance, or simply recovering from accidental data removal. Soft delete, by introducing a deleted (or is_active, status, etc.) flag, allows us to logically remove data without physically expunging it.
Internally, SQLAlchemy’s ORM maps Python objects to database rows. When you call session.delete(obj) and the session flushes, SQLAlchemy emits a DELETE FROM ... WHERE id = ? statement. Soft delete sidesteps that path: instead of calling session.delete(), we set deleted=True on the object and call session.commit(). The flush then emits an UPDATE instead: UPDATE users SET deleted = true WHERE id = ?.
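To see the difference between the two paths, here is a minimal sketch that records the SQL each one emits. It uses plain SQLAlchemy rather than Flask-SQLAlchemy to stay self-contained; the `captured` list and the `record_statement` and `writes_only` helpers are names invented for this illustration, and the exact SQL text will vary by dialect.

```python
from sqlalchemy import Boolean, Column, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    username = Column(String(50), unique=True, nullable=False)
    deleted = Column(Boolean, default=False, nullable=False)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

captured = []  # every SQL string the engine sends to the database


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    captured.append(statement)


def writes_only(statements):
    """Keep only UPDATE/DELETE statements, dropping SELECTs and INSERTs."""
    return [
        s for s in statements
        if s.lstrip().upper().startswith(("UPDATE", "DELETE"))
    ]


with Session(engine) as session:
    alice = User(username="alice")
    bob = User(username="bob")
    session.add_all([alice, bob])
    session.commit()

    captured.clear()
    bob.deleted = True        # soft delete: an ordinary attribute change
    session.commit()
    soft_delete_sql = writes_only(captured)

    captured.clear()
    session.delete(alice)     # hard delete: the row is physically removed
    session.commit()
    hard_delete_sql = writes_only(captured)

print(soft_delete_sql)  # a single UPDATE against users
print(hard_delete_sql)  # a single DELETE against users
```

Nothing here hooks into session.delete(); the soft delete is just a normal attribute update that the unit of work flushes like any other change.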
The key to making soft delete feel like actual deletion is in how you query. You must always include User.deleted == False in your WHERE clauses for any query that should only return active records. This means overriding default query behaviors and ensuring all application logic adheres to this convention.
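One way to avoid repeating that filter by hand is to attach it to every ORM SELECT in a session event. The sketch below follows the global-filter pattern from the SQLAlchemy documentation, using the do_orm_execute event and with_loader_criteria (both require SQLAlchemy 1.4 or later). The `include_deleted` execution option and the `hide_soft_deleted` function are names made up for this example, not part of SQLAlchemy.

```python
from sqlalchemy import Boolean, Column, Integer, String, create_engine, event, select
from sqlalchemy.orm import declarative_base, sessionmaker, with_loader_criteria

Base = declarative_base()


class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    username = Column(String(50), unique=True, nullable=False)
    deleted = Column(Boolean, default=False, nullable=False)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
SessionFactory = sessionmaker(bind=engine)


# Listening on the factory limits the filter to sessions it creates.
@event.listens_for(SessionFactory, "do_orm_execute")
def hide_soft_deleted(execute_state):
    """Filter out soft-deleted users unless the caller opts out."""
    if (
        execute_state.is_select
        and not execute_state.is_column_load
        and not execute_state.is_relationship_load
        # "include_deleted" is a hypothetical option name for this sketch.
        and not execute_state.execution_options.get("include_deleted", False)
    ):
        execute_state.statement = execute_state.statement.options(
            with_loader_criteria(User, User.deleted.is_(False), include_aliases=True)
        )


with SessionFactory() as session:
    session.add_all([User(username="alice"), User(username="bob", deleted=True)])
    session.commit()

    stmt = select(User).order_by(User.id)

    # Default behavior: soft-deleted rows are invisible.
    active_names = [u.username for u in session.execute(stmt).scalars()]

    # Explicit opt-out for admin or audit code paths.
    audit_stmt = stmt.execution_options(include_deleted=True)
    all_names = [u.username for u in session.execute(audit_stmt).scalars()]

print(active_names)
print(all_names)
```

The trade-off is that the filter becomes implicit, so code that genuinely needs deleted rows (restore, audit, purge) has to remember to opt out.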
To simplify this, you can leverage SQLAlchemy’s default argument in Column definitions for initial states and implement custom query methods or even a Query subclass. For instance, defining a BaseQuery with a get_active method that automatically filters out soft-deleted records can drastically reduce boilerplate.
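As a rough illustration of the Query subclass idea, the sketch below uses plain SQLAlchemy's legacy Query class and the query_cls argument of sessionmaker. `SoftDeleteQuery` is a hypothetical name; `get_active` is taken from the paragraph above. In Flask-SQLAlchemy the equivalent base class and the way it is wired in differ by version, so treat this as a pattern rather than a drop-in.

```python
from sqlalchemy import Boolean, Column, Integer, String, create_engine
from sqlalchemy.orm import Query, declarative_base, sessionmaker

Base = declarative_base()


class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    username = Column(String(50), unique=True, nullable=False)
    deleted = Column(Boolean, default=False, nullable=False)


class SoftDeleteQuery(Query):
    """Hypothetical Query subclass that knows about the deleted flag."""

    def get_active(self):
        # Assumes the queried entity has a boolean `deleted` column.
        return self.filter_by(deleted=False)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

# Every session.query(...) from this factory returns a SoftDeleteQuery.
SessionFactory = sessionmaker(bind=engine, query_cls=SoftDeleteQuery)

with SessionFactory() as session:
    session.add_all([User(username="alice"), User(username="bob", deleted=True)])
    session.commit()

    active_names = [
        u.username for u in session.query(User).get_active().order_by(User.id)
    ]
    all_names = [u.username for u in session.query(User).order_by(User.id)]

print(active_names)
print(all_names)
```

Unlike a global filter, this keeps the exclusion explicit at each call site: a plain session.query(User) still returns everything, and only get_active() hides soft-deleted rows.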
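Consider the implications for relationships. If User has a posts relationship, you’ll need to decide whether a user’s posts should also be hidden when the user is soft-deleted. The ORM’s delete cascade only runs on session.delete(), not when a flag is updated, so this usually means custom logic in the related models’ query methods, such as joining to the owning user and filtering on its deleted flag.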
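A small sketch of one possible policy, hiding posts whose author is soft-deleted, might look like this. The `Post` model, its `author` relationship, and the `visible_posts` helper are all assumptions made for illustration; the original example only defines User.

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    username = Column(String(50), unique=True, nullable=False)
    deleted = Column(Boolean, default=False, nullable=False)

    posts = relationship("Post", back_populates="author")


class Post(Base):  # hypothetical related model
    __tablename__ = "posts"

    id = Column(Integer, primary_key=True)
    title = Column(String(100), nullable=False)
    user_id = Column(Integer, ForeignKey("users.id"), nullable=False)

    author = relationship("User", back_populates="posts")


def visible_posts():
    """Select posts whose author has not been soft-deleted."""
    return (
        select(Post)
        .join(Post.author)
        .where(User.deleted.is_(False))
        .order_by(Post.id)
    )


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice = User(username="alice", posts=[Post(title="Hello from alice")])
    bob = User(username="bob", posts=[Post(title="Hello from bob")])
    session.add_all([alice, bob])
    session.commit()

    bob.deleted = True
    session.commit()

    visible_titles = [p.title for p in session.execute(visible_posts()).scalars()]

    # The relationship itself is unaffected by the flag on the parent.
    bob_post_count = len(bob.posts)

print(visible_titles)
print(bob_post_count)
```

Note that bob.posts still loads his post: flagging the parent changes nothing about the child rows, so the hiding has to happen in the queries that read them.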
The most impactful, yet often overlooked, aspect of soft delete is how it affects unique constraints. If your User model has a unique constraint on email and you soft-delete a user, you cannot create a new user with the same email until the old row is permanently purged, because the soft-deleted row still exists in the database and still occupies that unique slot. There are three common ways out: remove the unique constraint and enforce uniqueness in application logic; make the constraint conditional on the deleted flag, for example with a partial unique index on databases that support one (PostgreSQL and SQLite do) or with triggers elsewhere; or design your system to allow duplicate emails for inactive users.
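For the conditional-constraint option, a partial unique index is one way to do it where the database supports it. The sketch below targets SQLite through the sqlite_where argument of Index; PostgreSQL has an analogous postgresql_where argument. The index name `uq_users_email_active` is made up, and the plain unique=True on the column is deliberately dropped so that only the partial index enforces uniqueness.

```python
from sqlalchemy import Boolean, Column, Index, Integer, String, create_engine, text
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    email = Column(String(120), nullable=False)  # no column-level unique=True
    deleted = Column(Boolean, default=False, nullable=False)

    __table_args__ = (
        # Uniqueness applies only to rows that are not soft-deleted.
        Index(
            "uq_users_email_active",  # hypothetical index name
            "email",
            unique=True,
            sqlite_where=text("deleted = 0"),
        ),
    )


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # A soft-deleted row and an active row may share an email.
    session.add(User(email="alice@example.com", deleted=True))
    session.add(User(email="alice@example.com"))
    session.commit()
    reuse_allowed = True

    # A second *active* row with the same email is still rejected.
    session.add(User(email="alice@example.com"))
    try:
        session.commit()
        duplicate_rejected = False
    except IntegrityError:
        session.rollback()
        duplicate_rejected = True

print(reuse_allowed, duplicate_rejected)
```

One consequence to plan for: restoring a soft-deleted user can now fail with an integrity error if someone else has claimed the email in the meantime.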
The next logical step after implementing soft delete is to define a clear strategy for purging data.