Cloud Trace doesn't trace what happens inside your Cloud Run container; on its own it records only the request span that Cloud Run's infrastructure generates at ingress.
Here’s a Cloud Run service configured to send traces to Cloud Trace. Notice the trace setting:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-traced-service
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: "10"
        autoscaling.knative.dev/minScale: "0"
        run.googleapis.com/launch-stage: BETA
        run.googleapis.com/ingress: all
        # This is the key for tracing!
        run.googleapis.com/trace: "true"
    spec:
      containerConcurrency: 80
      timeoutSeconds: 300
      containers:
        - image: gcr.io/my-project/my-traced-app:latest
          ports:
            - containerPort: 8080
When a request hits your Cloud Run service, Cloud Run injects a traceparent header into the incoming request before it reaches your container. Your application then needs to propagate this header to any downstream services it calls, and ideally, use it to create new spans within your application’s execution.
Let’s say you have a service service-a that calls service-b.
service-a (Go example):
package main

import (
	"log"
	"net/http"
	"os"
)

func main() {
	http.HandleFunc("/", handler)

	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}

	log.Printf("Listening on port %s", port)
	// ListenAndServe only returns on failure, so surface the error.
	log.Fatal(http.ListenAndServe(":"+port, nil))
}

func handler(w http.ResponseWriter, r *http.Request) {
	// Extract the traceparent header from the incoming request.
	traceHeader := r.Header.Get("traceparent")

	// Build the request to service-b, tied to the incoming request's context
	// so the call is cancelled if the caller disconnects.
	req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, "http://service-b.example.com", nil)
	if err != nil {
		http.Error(w, "Failed to create request", http.StatusInternalServerError)
		return
	}

	// Propagate the traceparent header.
	if traceHeader != "" {
		req.Header.Set("traceparent", traceHeader)
	}

	// Make the call to service-b.
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		http.Error(w, "Failed to call service-b", http.StatusInternalServerError)
		return
	}
	defer resp.Body.Close()

	w.WriteHeader(http.StatusOK)
	w.Write([]byte("Called service-b successfully!"))
}
service-b (Python example):
from flask import Flask, request, Response
import os

app = Flask(__name__)


@app.route('/')
def hello_world():
    # Log the incoming trace context, if any.
    trace_header = request.headers.get('traceparent')
    if trace_header:
        print(f"Received traceparent: {trace_header}")
    return Response("Hello from service-b!", status=200)


if __name__ == "__main__":
    port = int(os.environ.get("PORT", 8080))
    # Keep debug off: the Werkzeug debugger must not be exposed on 0.0.0.0.
    app.run(debug=False, host='0.0.0.0', port=port)
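In this setup, Cloud Run automatically starts the trace for requests hitting service-a. When service-a calls service-b, it must forward the traceparent header. If service-b also participates in tracing (either through Cloud Run itself or by using a tracing library that understands the traceparent header), it can continue the trace. Note that forwarding the header unchanged keeps both services in the same trace but reuses the ingress span as the parent; for service-b's spans to appear as children of service-a's work, service-a has to create its own span and send that span's ID instead.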
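The forwarding step can be reduced to a small helper. The following is a minimal sketch using only the standard library; the name `forward_trace_headers` is hypothetical, and it assumes you also want to carry `tracestate`, the companion header defined by W3C Trace Context, alongside `traceparent`.

```python
from typing import Dict, Mapping

# W3C Trace Context headers that should travel with every outgoing call.
TRACE_HEADERS = ("traceparent", "tracestate")


def forward_trace_headers(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Return the trace headers found on an incoming request.

    The result can be merged into the headers of any outgoing request.
    (Hypothetical helper, not part of any library.)
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in incoming.items()}
    # Copy only the headers that are present and non-empty.
    return {name: lowered[name] for name in TRACE_HEADERS if lowered.get(name)}


if __name__ == "__main__":
    incoming = {
        "Traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
        "Accept": "*/*",
    }
    print(forward_trace_headers(incoming))
```

In a Flask handler like the one for service-b, you would pass `request.headers` to this helper and hand the result to whatever HTTP client makes the downstream call.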
The actual "tracing" happens at multiple levels:
- Cloud Run Ingress: Cloud Run's load balancer generates a trace ID and span ID for the initial ingress request and injects the traceparent header.
- Your Application: Your application receives the traceparent header. If you're using a tracing SDK (like OpenTelemetry), you can create new spans within your application code to represent specific operations (e.g., "calling service-b", "processing database query"). These new spans are linked to the incoming trace ID.
- Downstream Services: If your application calls other services (which might also be on Cloud Run or elsewhere), it must propagate the traceparent header. The downstream service then uses this header to continue the trace, creating its own spans.
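To make the linking in the list above concrete, here is a rough sketch of how a new span's traceparent relates to the one it received: the trace ID and flags are kept, and only the span ID changes. The function name `child_traceparent` is hypothetical, and the sketch assumes the four-field layout of version 00. It shows the header arithmetic only; a real SDK such as OpenTelemetry also records timing for the span and exports it.

```python
import secrets


def child_traceparent(parent: str) -> str:
    """Build the traceparent for a new span in the same trace as `parent`.

    Assumes a version-00 header with exactly four dash-separated fields.
    (Hypothetical helper for illustration only.)
    """
    version, trace_id, _parent_span_id, flags = parent.split("-")
    # A span ID is 8 random bytes (16 hex characters); all zeros is invalid.
    span_id = secrets.token_hex(8)
    while span_id == "0" * 16:
        span_id = secrets.token_hex(8)
    # Same trace ID and flags, new span ID: this is what links the spans.
    return f"{version}-{trace_id}-{span_id}-{flags}"


if __name__ == "__main__":
    incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    # The span ID in the printed header differs on every run.
    print(child_traceparent(incoming))
```

Sending the returned value downstream, instead of the header that came in, is what makes the downstream service's spans children of your span rather than siblings of it.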
The key to building a complete trace is consistent propagation of the traceparent header and using tracing libraries that correctly generate and link spans.
The most surprising thing about Cloud Run tracing is that the run.googleapis.com/trace: "true" annotation doesn’t magically make your application code aware of tracing; it only tells Cloud Run to start the trace and inject the header. You still need your application to participate by forwarding the header and, optionally, creating its own detailed spans.
When you view traces in Cloud Trace, you’ll see a waterfall diagram. Each box in the diagram represents a span. The top-level span is often the request handled by Cloud Run itself. If your application makes outgoing calls that also propagate the trace, you’ll see those as separate, indented spans, showing the latency of each hop.
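The traceparent header follows the W3C Trace Context standard. It looks something like 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01. The first part is the version (2 hex characters), the second is the trace ID (32 hex characters), the third is the parent span ID (16 hex characters), and the last is the trace flags (2 hex characters, where 01 means the trace is sampled). A trace ID or parent span ID consisting entirely of zeros is invalid.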
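If you want to inspect the header yourself, for example to log the trace ID next to your application logs, a small parser is enough. This is a minimal sketch, assuming the four-field layout of version 00; `TraceParent` and `parse_traceparent` are hypothetical names, not part of any library. Following the W3C Trace Context rules, it rejects all-zero trace IDs and span IDs as well as the reserved version ff.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace ID (32 hex) - parent span ID (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Parse a traceparent header, returning None if it is malformed."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid under W3C Trace Context.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    # The lowest bit of the flags byte is the "sampled" flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


if __name__ == "__main__":
    print(parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"))
```

Returning `None` rather than raising keeps the handler simple: a request with a missing or broken header is treated as untraced.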
If your application is not emitting spans for its internal operations, you’ll primarily see the latency of the Cloud Run service itself and the network latency to downstream services, but not the breakdown of work within your application.
The next step after enabling basic tracing is to integrate a tracing SDK like OpenTelemetry into your application code.