The most surprising thing about archiving completed workflows is that it’s often less about cleaning up and more about creating a high-fidelity, executable snapshot of past business logic, ready to be replayed or audited.
Imagine a customer order that flows through your system. It’s not just a record of what happened, but a series of discrete steps, each with its own inputs, outputs, and state. When that order is "completed," the workflow engine doesn’t just mark it done; it has a wealth of information about the journey. Archiving captures this entire journey, not just the final outcome.
Here’s a simplified view of a workflow execution in action. Let’s say we’re using a workflow engine like Temporal (temporal.io). A workflow definition might look something like this (in Go):
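```go
package main

import (
	"context"
	"time"

	"go.temporal.io/sdk/workflow"
)

// Placeholder types and activity stubs (assumptions) so the example compiles.
type OrderDetails struct{ OrderID string }
type ValidationResult struct{ ValidatedOrder OrderDetails }
type PaymentResult struct{ OrderWithPayment OrderDetails }

func ValidateOrderActivity(ctx context.Context, o OrderDetails) (ValidationResult, error) {
	return ValidationResult{ValidatedOrder: o}, nil
}
func ProcessPaymentActivity(ctx context.Context, o OrderDetails) (PaymentResult, error) {
	return PaymentResult{OrderWithPayment: o}, nil
}
func ShipOrderActivity(ctx context.Context, o OrderDetails) error { return nil }

func OrderProcessingWorkflow(ctx workflow.Context, orderDetails OrderDetails) error {
	// Activities need a timeout; without one, ExecuteActivity fails.
	ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: time.Minute,
	})
	logger := workflow.GetLogger(ctx)
	logger.Info("Order processing started", "OrderID", orderDetails.OrderID)

	// Activity to validate the order
	var validationResult ValidationResult
	err := workflow.ExecuteActivity(ctx, ValidateOrderActivity, orderDetails).Get(ctx, &validationResult)
	if err != nil {
		logger.Error("Order validation failed", "Error", err)
		return err
	}

	// Activity to process payment
	var paymentResult PaymentResult
	err = workflow.ExecuteActivity(ctx, ProcessPaymentActivity, validationResult.ValidatedOrder).Get(ctx, &paymentResult)
	if err != nil {
		logger.Error("Payment processing failed", "Error", err)
		return err
	}

	// Activity to ship the order
	err = workflow.ExecuteActivity(ctx, ShipOrderActivity, paymentResult.OrderWithPayment).Get(ctx, nil)
	if err != nil {
		logger.Error("Order shipping failed", "Error", err)
		return err
	}

	logger.Info("Order processing completed successfully", "OrderID", orderDetails.OrderID)
	return nil
}
```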
When OrderProcessingWorkflow runs for a specific orderDetails, the Temporal server records every step: ValidateOrderActivity was scheduled with orderDetails and returned validationResult; ProcessPaymentActivity was then scheduled with validationResult.ValidatedOrder and returned paymentResult; and so on. This entire sequence of events, including any retries, timeouts, or signals received, is what gets stored as the workflow’s event history.
The problem archiving solves is the unbounded growth of stored workflow history. As workflows execute, the Temporal server (or any similar system) stores their complete execution history, and that history grows with every event. For long-running or high-volume workflows, it can become massive, increasing storage costs and potentially slowing reads for historical queries. Archiving moves this historical data from hot, primary storage to cooler, cheaper storage, while still retaining its integrity and auditability.
Internally, workflow engines achieve this by serializing the execution history. This history is essentially an append-only log of all events that occurred during the workflow’s lifecycle: workflow started, activity scheduled, activity completed, timer fired, signal received, workflow completed, and so on. When archiving, this log is serialized, often compressed, and written to the archival store. Depending on the engine, primary storage then holds at most a pointer or reference to the archived history (for example, an archival URI), which is enough to retrieve it later for replay or inspection.
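The serialize, compress, and keep-a-reference flow described above can be sketched in a few lines of Go. This is a minimal illustration using only the standard library, not how Temporal or any specific engine implements archival: the HistoryEvent type, the in-memory coldStore, the JSON-plus-gzip encoding, and the archive/… key format are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
)

// HistoryEvent is a hypothetical, simplified event record; real engines
// store much richer, engine-specific payloads.
type HistoryEvent struct {
	ID   int64  `json:"id"`
	Type string `json:"type"`
}

// coldStore stands in for a blob store such as S3; keys are archive references.
type coldStore map[string][]byte

// archiveHistory serializes the events to JSON, gzips them, writes the blob
// to cold storage, and returns the reference the engine would keep.
func archiveHistory(store coldStore, workflowID string, events []HistoryEvent) (string, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(events); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil { // flushes the remaining compressed bytes
		return "", err
	}
	ref := "archive/" + workflowID + ".json.gz" // assumed key format
	store[ref] = buf.Bytes()
	return ref, nil
}

// loadHistory reverses archiveHistory: fetch by reference, gunzip, decode.
func loadHistory(store coldStore, ref string) ([]HistoryEvent, error) {
	blob, ok := store[ref]
	if !ok {
		return nil, fmt.Errorf("no archive at %q", ref)
	}
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var events []HistoryEvent
	if err := json.NewDecoder(zr).Decode(&events); err != nil {
		return nil, err
	}
	return events, nil
}

func main() {
	store := coldStore{}
	events := []HistoryEvent{
		{ID: 1, Type: "WorkflowExecutionStarted"},
		{ID: 2, Type: "ActivityTaskScheduled"},
		{ID: 3, Type: "ActivityTaskCompleted"},
		{ID: 4, Type: "WorkflowExecutionCompleted"},
	}
	ref, err := archiveHistory(store, "order-42", events)
	if err != nil {
		panic(err)
	}
	restored, err := loadHistory(store, ref)
	if err != nil {
		panic(err)
	}
	fmt.Println("archived at", ref, "with", len(restored), "events")
}
```

The point of the round trip is that the archived blob is self-describing: given only the reference, the full ordered event list comes back unchanged.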
The exact levers you control in archiving typically involve:
- Retention Policies: Defining how long workflow histories are kept in hot storage before being eligible for archiving. This is often expressed in days, e.g., history_cleanup_delay = 7d.
- Archival Destinations: Specifying where the history should be moved. This could be cloud storage like Amazon S3, Google Cloud Storage, or Azure Blob Storage, or even a local filesystem.
- Archival Triggers: Whether archiving is automatic based on retention policies or manually initiated.
- Rehydration/Replay Mechanisms: Ensuring that archived history can be loaded back into the engine for debugging or replaying specific past executions.
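To make the first three levers concrete, here is a small sketch of how a retention window, a destination, and an automatic trigger might combine into an eligibility check. Everything here is hypothetical and for illustration only: ArchivalPolicy, ClosedWorkflow, eligibleForArchival, selectForArchival, and the example bucket URI are assumed names, not part of any engine’s API.

```go
package main

import (
	"fmt"
	"time"
)

// ArchivalPolicy is a hypothetical bundle of the levers listed above.
type ArchivalPolicy struct {
	Retention   time.Duration // how long closed histories stay in hot storage
	Destination string        // where archived histories are written
	Automatic   bool          // false means archiving is only triggered manually
}

// ClosedWorkflow is a hypothetical record of a completed execution.
type ClosedWorkflow struct {
	ID       string
	ClosedAt time.Time
}

// days converts a whole number of days to a Duration.
func days(n int) time.Duration { return time.Duration(n) * 24 * time.Hour }

// exampleStart is a fixed reference time so the example is reproducible.
var exampleStart = time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)

// eligibleForArchival reports whether the automatic trigger would pick up wf:
// the policy must be automatic and the retention window must have elapsed.
func eligibleForArchival(p ArchivalPolicy, wf ClosedWorkflow, now time.Time) bool {
	return p.Automatic && !now.Before(wf.ClosedAt.Add(p.Retention))
}

// selectForArchival returns the destination path for each eligible workflow.
func selectForArchival(p ArchivalPolicy, closed []ClosedWorkflow, now time.Time) []string {
	var targets []string
	for _, wf := range closed {
		if eligibleForArchival(p, wf, now) {
			targets = append(targets, p.Destination+"/"+wf.ID)
		}
	}
	return targets
}

func main() {
	policy := ArchivalPolicy{
		Retention:   days(7),
		Destination: "s3://example-bucket/histories", // assumed destination
		Automatic:   true,
	}
	closed := []ClosedWorkflow{
		{ID: "order-41", ClosedAt: exampleStart},              // closed 8 days before "now"
		{ID: "order-42", ClosedAt: exampleStart.Add(days(5))}, // closed 3 days before "now"
	}
	now := exampleStart.Add(days(8))
	fmt.Println(selectForArchival(policy, closed, now))
}
```

Passing the current time in as a parameter, rather than calling time.Now inside the check, keeps the policy logic easy to test against fixed dates.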
The most crucial aspect of archiving is that it preserves the history that deterministic replay depends on. A workflow’s history is designed to be replayed to reconstruct its state. Archiving keeps this replayable history intact, meaning you can, in theory, take an archived history, load it back, and have the workflow engine replay the workflow code against it. Replay does not re-execute side effects such as activities; it feeds their recorded results back into the workflow code, so the code makes the same decisions and arrives at the identical outcome. This works only if the workflow code is still compatible with the recorded history. It is invaluable for debugging obscure bugs that only appear under specific historical conditions, and for regulatory compliance.
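The idea of replaying against recorded results can be shown with a toy model. The sketch below is not Temporal’s replayer: RecordedActivity, replayer, orderWorkflow, and replay are hypothetical names, and the history is reduced to an ordered list of activity names and results. It only illustrates the principle that replay returns recorded results instead of running activities, and flags a mismatch when the code asks for something the history does not contain.

```go
package main

import (
	"errors"
	"fmt"
)

// RecordedActivity is a hypothetical, flattened history entry: which
// activity ran and what it returned the first time.
type RecordedActivity struct {
	Name   string
	Result string
}

// ErrNonDeterministic signals that the code no longer matches the history.
var ErrNonDeterministic = errors.New("workflow code diverged from recorded history")

// replayer hands back recorded results in order instead of running activities.
type replayer struct {
	history []RecordedActivity
	next    int
}

func (r *replayer) execute(name string) (string, error) {
	if r.next >= len(r.history) || r.history[r.next].Name != name {
		return "", fmt.Errorf("%w: unexpected activity %q at step %d", ErrNonDeterministic, name, r.next)
	}
	res := r.history[r.next].Result
	r.next++
	return res, nil
}

// orderWorkflow mirrors the three-step order flow; it only sees the exec
// function, so it cannot tell a live run from a replay.
func orderWorkflow(exec func(name string) (string, error)) (string, error) {
	var last string
	for _, name := range []string{"ValidateOrderActivity", "ProcessPaymentActivity", "ShipOrderActivity"} {
		res, err := exec(name)
		if err != nil {
			return "", err
		}
		last = res
	}
	return last, nil
}

// replay runs the workflow code against an archived history and checks that
// every recorded step was consumed.
func replay(history []RecordedActivity) (string, error) {
	r := &replayer{history: history}
	out, err := orderWorkflow(r.execute)
	if err != nil {
		return "", err
	}
	if r.next != len(history) {
		return "", fmt.Errorf("%w: %d recorded steps unused", ErrNonDeterministic, len(history)-r.next)
	}
	return out, nil
}

func main() {
	archived := []RecordedActivity{
		{Name: "ValidateOrderActivity", Result: "validated"},
		{Name: "ProcessPaymentActivity", Result: "paid"},
		{Name: "ShipOrderActivity", Result: "shipped"},
	}
	out, err := replay(archived)
	fmt.Println(out, err)
}
```

If the workflow code were later changed to ship before charging, replaying the same archived history would fail with ErrNonDeterministic, which is the kind of incompatibility to watch for when replaying old executions against newer code.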
The next logical step after mastering workflow archiving is understanding how to effectively query and analyze the content of that archived data, rather than just its existence.