Cloud Run Jobs and Services let you run containerized applications, but they’re designed for fundamentally different tasks.
Let’s see a Cloud Run Service in action. Imagine you have a simple web app that serves a static greeting.
# cloudrun-service-template.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: greeting-service
spec:
  template:
    spec:
      containers:
      - image: gcr.io/cloudrun/hello # A simple public image
        ports:
        - containerPort: 8080
When you deploy this with gcloud run deploy greeting-service --image gcr.io/cloudrun/hello --platform managed --region us-central1 --allow-unauthenticated, Cloud Run gives the service a stable HTTPS endpoint and starts container instances that listen for incoming HTTP requests. Each request is routed to an available instance. If traffic spikes, Cloud Run automatically scales out to more instances. If traffic drops, it scales them back in, by default all the way to zero, so instances are not literally always running unless you set a minimum instance count. The key here is a continuously available endpoint and request-driven scaling.
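To make the Service side concrete, here is a minimal sketch of the kind of process such a container runs: a server that binds to the port Cloud Run passes in the PORT environment variable and then waits for requests. The handler name, the greeting text, and the make_server helper are assumptions for illustration; this is not the code inside gcr.io/cloudrun/hello.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class GreetingHandler(BaseHTTPRequestHandler):
    """Answers every GET with a fixed greeting (hypothetical handler)."""

    def do_GET(self):
        body = b"Hello from a Cloud Run Service!\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real service would log requests


def make_server(port=None):
    """Bind to the given port, or to $PORT (8080 if unset) when none is given."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("0.0.0.0", port), GreetingHandler)


def main():
    # serve_forever() does not return: the process stays up and waits
    # for requests, which is exactly what a Service container does.
    make_server().serve_forever()
```

A container entrypoint would simply call main(). The point of the sketch is the shape of the process: it never finishes on its own, it only answers requests until Cloud Run stops the instance.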
Now, consider a Cloud Run Job. This is for tasks that need to run to completion, not indefinitely. Think batch processing, data ingestion, or scheduled reports.
# cloudrun-job-template.yaml
apiVersion: run.googleapis.com/v1
kind: Job
metadata:
  name: data-processor-job
spec:
  template:
    spec:
      template:
        spec:
          containers:
          - image: us-docker.pkg.dev/cloudrun/container/job # A sample job image
            command: ["python", "process.py", "--input", "gs://my-bucket/data.csv"]
          maxRetries: 3 # Retries per failed task (Cloud Run Jobs use maxRetries, not restartPolicy)
Once the job exists (for example, after gcloud run jobs replace cloudrun-job-template.yaml), you execute it with gcloud run jobs execute data-processor-job --region us-central1 --tasks=1. Cloud Run provisions an instance, runs the command, and when that command exits successfully (with exit code 0), the task is marked complete and the instance is shut down. If the command fails (non-zero exit code), the task is retried up to the job's configured maximum number of retries (--max-retries). The emphasis is on task completion and finite execution.
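The exit-code contract is easiest to see from the script's side. The sketch below shows one way a script like process.py might be structured; the process function and the record format are assumptions for illustration, since the original script is not shown.

```python
import sys


def process(records):
    """Hypothetical batch step: sum numeric records, raise on bad input."""
    return sum(int(r) for r in records)


def main(records):
    """Run the batch step and translate the outcome into an exit code."""
    try:
        total = process(records)
    except ValueError as exc:
        print(f"processing failed: {exc}", file=sys.stderr)
        return 1  # non-zero: Cloud Run treats the task attempt as failed
    print(f"processed {len(records)} records, total={total}")
    return 0  # zero: the task attempt succeeded
```

The real entrypoint would end with sys.exit(main(...)) so that the return value becomes the container's exit code. Nothing here listens on a port; the process does its work and leaves.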
The core problem Cloud Run solves is abstracting away the infrastructure for running containers. You don’t manage VMs, load balancers, or scaling logic. For Services, it provides a highly available, auto-scaling HTTP endpoint. For Jobs, it provides a way to run discrete, batch-oriented tasks without managing the underlying compute.
The primary lever you control for both is the container image. For Services, you also configure scaling parameters like --min-instances and --max-instances. For Jobs, you define the task execution, including --tasks (how many tasks an execution runs), --parallelism (how many of those tasks may run at the same time), --task-timeout (how long each task attempt can run), and --max-retries (how many times a failed task is retried).
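As a rough illustration of how the job-side settings fit together, the following sketch assembles a gcloud run jobs create command line from them. The helper name and its default values are assumptions for this example; it only builds the argument list and does not call gcloud.

```python
import shlex


def job_create_command(name, image, region, tasks=1, parallelism=None,
                       task_timeout="10m", max_retries=3):
    """Build (but do not run) a gcloud command that creates a job."""
    cmd = [
        "gcloud", "run", "jobs", "create", name,
        "--image", image,
        "--region", region,
        "--tasks", str(tasks),
        "--task-timeout", task_timeout,
        "--max-retries", str(max_retries),
    ]
    if parallelism is not None:
        # Cap how many of the tasks may run at the same time.
        cmd += ["--parallelism", str(parallelism)]
    return cmd


cmd = job_create_command(
    "data-processor-job",
    "us-docker.pkg.dev/cloudrun/container/job",
    "us-central1",
    tasks=5,
    parallelism=2,
)
print(shlex.join(cmd))
```

Keeping the settings in one function like this makes it obvious which knobs belong to the job as a whole (task count, parallelism) and which apply to each task attempt (timeout, retries).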
A common misconception is that Jobs are just Services that you manually trigger and then expect to stop. This isn’t quite right. A Job’s execution model is fundamentally different: it’s about a container running a specific command and exiting. A Service is about a container listening for requests. You can’t effectively run a long-running, request-handling web server as a Cloud Run Job, because a Job has no HTTP endpoint to receive requests and the execution model expects the container to finish its work and exit. Conversely, a batch script that runs to completion does not fit a Service: a Service container is expected to listen on its port and answer HTTP requests, so a script that never opens the port fails the startup check, and one that exits when its work is done looks like a crashed instance.
The most surprising aspect, mechanically, is how Cloud Run Jobs handle parallelism. When you specify --tasks=5 for a job, Cloud Run doesn’t just launch five identical containers one after another. It orchestrates their execution so that the five tasks run independently and concurrently, up to the job’s parallelism limit and available quota. Each task runs in its own container instance with the full CPU and memory configured for the job, not a share of a common pool, and Cloud Run tracks the success or failure of each task individually.
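Because every task runs the same image and command, each one needs a way to pick its own share of the work. Cloud Run sets the CLOUD_RUN_TASK_INDEX and CLOUD_RUN_TASK_COUNT environment variables in each task for this purpose. The sketch below uses them for a simple round-robin split; the shard and current_task helpers and the file names are assumptions for illustration.

```python
import os


def current_task():
    """Read this task's index and the total task count from the environment."""
    index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))
    return index, count


def shard(items, task_index, task_count):
    """Return the items this task should handle (round-robin split)."""
    return items[task_index::task_count]


# Hypothetical input list; a real job might list objects in a bucket instead.
files = [f"part-{i}.csv" for i in range(7)]
index, count = current_task()
my_files = shard(files, index, count)
print(f"task {index} of {count} handles {my_files}")
```

With --tasks=5, task 0 would take part-0.csv and part-5.csv, task 1 would take part-1.csv and part-6.csv, and so on. Run outside Cloud Run, the defaults make the script behave as a single task that handles everything.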
Understanding whether your application needs to be "always on" and responsive to requests (Service) or execute a defined piece of work and terminate (Job) is the deciding factor.