Cloud Run’s traffic splitting lets you run canary deployments: you roll out a new version of your service gradually, so a bad release only affects a small share of requests.
Let’s see it in action. Imagine you have a service called my-app running on Cloud Run, and you want to deploy a new version, v2.
# Deploy the new version
gcloud run deploy my-app \
--image gcr.io/my-project/my-app:v2 \
--platform managed \
--region us-central1 \
--no-traffic \
--allow-unauthenticated
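This command deploys v2 as a new revision but doesn’t send any traffic to it yet. Next, send 10% of traffic to v2 and keep 90% on the current stable version (let’s call it v1). In the commands below, v1 and v2 stand in for the actual revision names, which Cloud Run generates on each deploy unless you set one with --revision-suffix; you can look them up with gcloud run revisions list.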
# Update traffic percentages
gcloud run services update-traffic my-app \
--platform managed \
--region us-central1 \
--to-revisions v1=90,v2=10
Now roughly 10% of incoming requests to my-app will hit v2, and the remaining 90% will go to v1. Monitor v2’s metrics (error rates, latency, custom application metrics) while it serves this slice. If everything looks good, gradually increase the traffic to v2:
# Increase traffic to v2 to 50%
gcloud run services update-traffic my-app \
--platform managed \
--region us-central1 \
--to-revisions v1=50,v2=50
And eventually, send 100% of traffic to v2 once you’re confident:
# Send 100% traffic to v2
gcloud run services update-traffic my-app \
--platform managed \
--region us-central1 \
--to-revisions v2=100
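The three steps above (10%, 50%, 100%) follow one pattern, so they can be sketched as a small script. This is only an illustration: the function names are made up for this example, v1 and v2 remain placeholders for real revision names, and the script prints the gcloud commands rather than running them unless DRY_RUN is set to 0.

```shell
# Sketch of a staged canary rollout. SERVICE, REGION, STABLE and CANARY are
# placeholder values; replace them with your own service and revision names.
SERVICE="my-app"
REGION="us-central1"
STABLE="v1"   # placeholder for the stable revision name
CANARY="v2"   # placeholder for the canary revision name

# Build the --to-revisions value for a given canary percentage (0-100).
canary_split() {
  local pct="$1"
  if [ "$pct" -ge 100 ]; then
    echo "${CANARY}=100"
  else
    echo "${STABLE}=$((100 - pct)),${CANARY}=${pct}"
  fi
}

# Print the command by default; set DRY_RUN=0 to actually call gcloud.
shift_traffic() {
  local spec
  spec="$(canary_split "$1")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "gcloud run services update-traffic ${SERVICE} --region ${REGION} --to-revisions ${spec}"
  else
    gcloud run services update-traffic "${SERVICE}" \
      --region "${REGION}" --to-revisions "${spec}"
  fi
}

# Walk through the stages used in this article.
rollout() {
  local pct
  for pct in 10 50 100; do
    shift_traffic "$pct"
    # In a real rollout, pause here and check your metrics before continuing.
  done
}
```

Calling rollout with the defaults prints the three update-traffic commands in order, which is a cheap way to review a rollout plan before letting it touch the service.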
This phased rollout allows you to catch regressions early. If v2 starts showing errors, you can immediately roll back by shifting traffic back to v1:
# Rollback to v1
gcloud run services update-traffic my-app \
--platform managed \
--region us-central1 \
--to-revisions v1=100,v2=0
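Deciding when to roll back is the part that depends on your own metrics. As a rough sketch, the check below compares the canary’s error count against a threshold and prints the rollback command from above when the threshold is exceeded. The 5% threshold, the function names, and the idea of passing in raw request counts are all assumptions for illustration; in practice the numbers would come from your monitoring system.

```shell
# Sketch: decide whether to roll back from canary request counts.
# The counts and the 5% threshold are hypothetical inputs.
MAX_ERROR_PCT=5

# Succeeds (exit 0) when errors exceed MAX_ERROR_PCT percent of total requests.
should_rollback() {
  local errors="$1" total="$2"
  # With no traffic yet there is nothing to judge, so do not roll back.
  [ "$total" -gt 0 ] || return 1
  # Compare errors/total > MAX_ERROR_PCT/100 using integer math only.
  [ $((errors * 100)) -gt $((MAX_ERROR_PCT * total)) ]
}

# Print the rollback command from the article, or a short status message.
rollback_if_needed() {
  if should_rollback "$1" "$2"; then
    echo "gcloud run services update-traffic my-app --platform managed --region us-central1 --to-revisions v1=100,v2=0"
  else
    echo "canary healthy, no rollback"
  fi
}
```

For example, rollback_if_needed 10 100 (10 errors in 100 requests) prints the rollback command, while rollback_if_needed 5 100 sits exactly on the threshold and reports the canary as healthy.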
Cloud Run handles the routing automatically based on these revision weights. Percentages are not the only tool: you can also attach a tag to a revision, which gives it its own URL so you can test it directly without routing any production traffic to it. The key point is that Cloud Run manages the underlying load balancing and routing logic, so you don’t have to set up separate ingress or load balancer configurations for each version. When you update traffic, Cloud Run adjusts the routing rules dynamically.
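One way to reach a specific revision for testing is a revision tag, set with --tag at deploy time and addressed later with --to-tags. The sketch below only prints the commands so you can inspect them first; the tag name canary and the helper function names are assumptions chosen for this example.

```shell
# Sketch: print gcloud commands that tag a revision and route traffic by tag.
# The tag name "canary" is an assumption for illustration.
SERVICE="my-app"
REGION="us-central1"
TAG="canary"

# Deploy a new revision with a tag and no production traffic.
tag_deploy_cmd() {
  echo "gcloud run deploy ${SERVICE} --image $1 --region ${REGION} --no-traffic --tag ${TAG}"
}

# Send a percentage of production traffic to whichever revision holds the tag.
tag_traffic_cmd() {
  echo "gcloud run services update-traffic ${SERVICE} --region ${REGION} --to-tags ${TAG}=$1"
}

tag_deploy_cmd "gcr.io/my-project/my-app:v2"
tag_traffic_cmd 10
```

Routing by tag means the traffic command never needs the generated revision name, which can make scripted rollouts a little less brittle.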
The most counterintuitive aspect of Cloud Run’s traffic splitting is that it’s tied directly to revisions, not to deployments or service versions in the traditional sense. Each time you deploy a new container image or change the service’s configuration, Cloud Run creates a new revision. Traffic splitting then works by assigning percentages to these immutable revisions. You can’t, for instance, split traffic between two running instances of the same revision; you can only split it between different revisions. This immutability is what makes rollbacks and gradual rollouts so robust: every percentage points at a fixed image and configuration, so rolling back returns you to exactly what was running before.
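Because a split is always a list of revision names with percentages attached, it can be worth sanity-checking the list before handing it to gcloud. The helper below is a minimal sketch with a made-up name (valid_split); it assumes the REVISION=PERCENT,... form used in this article and the rule that percentages are whole numbers totalling at most 100.

```shell
# Sketch: sanity-check a --to-revisions value before passing it to gcloud.
# Returns 0 when every entry is NAME=INTEGER and the total is at most 100.
valid_split() {
  local spec="$1" total=0 entry pct
  local old_ifs="$IFS"
  [ -n "$spec" ] || return 1
  IFS=','
  for entry in $spec; do
    # Take the text after the first "="; entries without "=" fail below.
    pct="${entry#*=}"
    case "$pct" in
      # Reject empty, non-numeric, zero-padded, or oversized values.
      ''|*[!0-9]*|0?*|????*) IFS="$old_ifs"; return 1 ;;
    esac
    total=$((total + pct))
  done
  IFS="$old_ifs"
  [ "$total" -le 100 ]
}
```

A call such as valid_split "v1=90,v2=10" succeeds, while valid_split "v1=90,v2=20" fails because the percentages add up to 110.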
The next step is understanding how to automate these traffic shifts based on observed metrics.