OpenTelemetry traces can often be more valuable than logs for understanding complex distributed system behavior.

Let’s see it in action. Imagine a user request hitting your web application.

sequenceDiagram
    participant User
    participant WebApp
    participant AuthService
    participant Database

    User->>WebApp: GET /products
    WebApp->>AuthService: ValidateToken(token)
    AuthService-->>WebApp: TokenValid
    WebApp->>Database: SELECT * FROM products
    Database-->>WebApp: Product data
    WebApp-->>User: Rendered product page

This diagram shows a simplified flow, but in reality, each of these interactions can involve multiple services, network hops, and internal operations. OpenTelemetry allows us to capture this as a trace. A trace is a collection of spans, where each span represents a unit of work, like an HTTP request to another service or a database query.

When you add OpenTelemetry to your stack, you’re instrumenting your applications and infrastructure to emit these traces, along with metrics (numerical measurements over time) and logs (timestamped events). The "observability" part comes from collecting and analyzing this data in a unified way. Instead of stitching together separate logging, metrics, and tracing tools, each with its own format, you generate all three signals through one common standard and set of APIs, then send them to a backend of your choice (such as Jaeger for traces, Prometheus for metrics, or a commercial observability platform).

Here’s how you might instrument a simple Go application. The program imports the OpenTelemetry API and SDK packages, configures a tracer provider, and then creates spans inside an HTTP handler.

package main

import (
	"context"
	"log"
	"net/http"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/stdout/stdouttrace" // Example exporter
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/sdk/resource"
	"go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.17.0" // Use a specific version
)

func initTracer() (func(), error) {
	// Create a stdout exporter to print traces to the console for demonstration.
	// In production, you'd use an exporter for Jaeger, OTLP, etc.
	exporter, err := stdouttrace.New(stdouttrace.WithPrettyPrint())
	if err != nil {
		return nil, err
	}

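	// For the purpose of this example, we name our service "my-go-app".
	// In a real scenario, this would be dynamically set or configured.
	// Note: resource.Merge returns an error when the two resources carry
	// different schema URLs, so the semconv version imported above needs to
	// match the one your SDK version uses for resource.Default().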
	res, err := resource.Merge(
		resource.Default(),
		resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceName("my-go-app"),
			semconv.ServiceVersion("1.0.0"),
			attribute.String("deployment.environment", "development"), // semantic-convention key
		),
	)
	if err != nil {
		return nil, err
	}

	// Configure the tracer provider.
	tp := trace.NewTracerProvider(
		trace.WithBatcher(exporter),
		trace.WithResource(res),
	)
	otel.SetTracerProvider(tp)

	// Set the global text map propagator to propagate trace context across services.
	// W3C Trace Context and Baggage are standard formats.
	otel.SetTextMapPropagator(
		propagation.NewCompositeTextMapPropagator(
			propagation.TraceContext{},
			propagation.Baggage{},
		),
	)

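	// tp.Shutdown takes a context and returns an error, so wrap it to match
	// the func() return type. Shutdown flushes spans still held by the batcher.
	return func() {
		if err := tp.Shutdown(context.Background()); err != nil {
			log.Printf("tracer provider shutdown: %v", err)
		}
	}, nil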
}

func main() {
	shutdown, err := initTracer()
	if err != nil {
		log.Fatal(err)
	}
	defer shutdown()

	// Get a tracer for the current package.
	tracer := otel.Tracer("my-package")

	http.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
		// Start a new span for the incoming HTTP request.
		// The context from the incoming request is crucial for propagating the trace.
		ctx, span := tracer.Start(r.Context(), "say-hello")
		defer span.End()

		span.SetAttributes(attribute.String("request.method", r.Method))

		// Simulate work
		name := r.URL.Query().Get("name")
		if name == "" {
			name = "World"
		}

		// Start a child span for an internal operation. Passing ctx, which
		// carries the say-hello span, is what makes "greet" its child.
		_, childSpan := tracer.Start(ctx, "greet")
		greeting := "Hello, " + name + "!"
		childSpan.SetAttributes(attribute.String("message", greeting))
		childSpan.End() // End the child span

		w.Write([]byte(greeting))
	})

	log.Println("Server started on :8080")
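	// ListenAndServe only returns on failure. Log the error rather than calling
	// log.Fatal so the deferred shutdown still runs and flushes buffered spans.
	if err := http.ListenAndServe(":8080", nil); err != nil {
		log.Printf("server error: %v", err)
	}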
}

The initTracer function sets up the global OpenTelemetry configuration. It creates a TracerProvider, which hands out tracers and passes finished spans to the exporter (in batches, because of WithBatcher). The resource describes the service itself, with attributes such as its name and version, and is attached to every span the provider emits. The propagator is critical for distributed tracing: it defines how trace context (the trace ID and the parent span ID) is written to and read from request headers, so spans created in different services can be linked into a single trace.
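To show what the trace context looks like on the wire, the following sketch formats and parses a version-00 W3C traceparent header value using only the standard library. The TraceContext type and the function names are hypothetical helpers written for this article; in a real service the propagation.TraceContext propagator does this work for you.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// TraceContext holds the fields carried by a W3C traceparent header.
// Illustrative type, not part of the OpenTelemetry API.
type TraceContext struct {
	TraceID string // 32 lowercase hex characters
	SpanID  string // 16 lowercase hex characters (the caller's span)
	Sampled bool   // lowest bit of the trace-flags byte
}

// FormatTraceparent renders tc as a version-00 traceparent header value.
func FormatTraceparent(tc TraceContext) string {
	flags := "00"
	if tc.Sampled {
		flags = "01"
	}
	return fmt.Sprintf("00-%s-%s-%s", tc.TraceID, tc.SpanID, flags)
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')) {
			return false
		}
	}
	return true
}

// ParseTraceparent validates and decodes a version-00 traceparent value.
func ParseTraceparent(h string) (TraceContext, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return TraceContext{}, errors.New("unsupported traceparent format")
	}
	if !isLowerHex(parts[1], 32) || !isLowerHex(parts[2], 16) || !isLowerHex(parts[3], 2) {
		return TraceContext{}, errors.New("malformed traceparent field")
	}
	if parts[1] == strings.Repeat("0", 32) || parts[2] == strings.Repeat("0", 16) {
		return TraceContext{}, errors.New("all-zero trace or span ID")
	}
	flags, err := strconv.ParseUint(parts[3], 16, 8)
	if err != nil {
		return TraceContext{}, err
	}
	return TraceContext{TraceID: parts[1], SpanID: parts[2], Sampled: flags&1 == 1}, nil
}
```

The calling service writes this value into the traceparent request header, and the receiving service parses it to learn which trace and parent span its own spans belong to.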

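When tracer.Start(r.Context(), "say-hello") is called, a new span is created. If the context passed in already carries a span, the new span becomes its child; otherwise a new trace is started. Note that the plain net/http handler above does not read incoming trace headers on its own: for a request from another instrumented service to continue that service's trace, the trace context must first be extracted from the request headers, either with otel.GetTextMapPropagator().Extract or with middleware such as otelhttp. The nested tracer.Start(ctx, "greet") call shows the child case: ctx carries the say-hello span, so greet is recorded as its child. span.End() marks the completion of the operation and records its end time, from which the duration is derived.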
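The parent-or-new-trace decision can be sketched without the SDK. The SpanContext type and StartSpan function below are hypothetical, not OpenTelemetry's; they only mimic the rule that a span with a parent joins the parent's trace while a span without one starts a fresh trace. The ID sizes (16-byte trace ID, 8-byte span ID) follow the W3C Trace Context format.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
)

// SpanContext identifies a span within a trace. Illustrative type, not the SDK's.
type SpanContext struct {
	TraceID  string
	SpanID   string
	ParentID string // empty when the span is the root of its trace
}

// randomHex returns n random bytes encoded as 2n hex characters.
func randomHex(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err) // not recoverable in this sketch
	}
	return hex.EncodeToString(b)
}

// StartSpan mimics the parent/child decision: with a parent, the new span
// joins the parent's trace; without one, it starts a fresh trace.
func StartSpan(parent *SpanContext) SpanContext {
	if parent == nil {
		return SpanContext{TraceID: randomHex(16), SpanID: randomHex(8)}
	}
	return SpanContext{
		TraceID:  parent.TraceID,
		SpanID:   randomHex(8),
		ParentID: parent.SpanID,
	}
}
```

StartSpan(nil) yields a root span with a new trace ID, and StartSpan(&root) yields a span that shares root's trace ID and records root's span ID as its parent.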

The stdouttrace.Exporter is just for demonstration; in a real-world scenario, you’d configure an exporter that sends data to a backend like Jaeger. You’d also typically use libraries that automatically instrument common operations, like HTTP clients, database drivers, and message queues.

The most surprising thing about distributed tracing is how often the root cause of an issue lies between services rather than within a single one. When a request fails or is slow, looking at the trace visually allows you to pinpoint which service in the chain is the bottleneck or the source of the error, even if that service itself reports no errors in its own logs. You see the latency added at each hop.

Understanding the resource semantic conventions is key to effective filtering and aggregation in your observability backend. For instance, consistently tagging all your services with deployment.environment: "production" or cloud.region: "us-east-1" allows you to easily isolate issues to specific environments or regions.
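The payoff of consistent tagging can be sketched as a simple attribute filter. The Resource type, the Filter function, and the service names used with them are hypothetical; a real observability backend offers this through its query language, but the matching logic is essentially the same.

```go
package main

// Resource is a simplified set of resource attributes attached to a
// service's telemetry. Illustrative type, not the SDK's resource.Resource.
type Resource map[string]string

// Matches reports whether r carries every key/value pair in want.
func (r Resource) Matches(want map[string]string) bool {
	for k, v := range want {
		if got, ok := r[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// Filter returns the resources that match all of want, in input order.
func Filter(all []Resource, want map[string]string) []Resource {
	var out []Resource
	for _, r := range all {
		if r.Matches(want) {
			out = append(out, r)
		}
	}
	return out
}
```

If one service were tagged env: "prod" while the others used deployment.environment: "production", a filter on the latter key would silently skip it, which is why agreeing on the conventional keys matters.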

The next step is to explore how to collect and visualize these traces using a backend like Jaeger or Grafana Tempo.
