The Datadog Service Map doesn’t actually show you all your dependencies; it only visualizes the ones that appear in traces Datadog has actually received.
Let’s see this in action. Imagine you have a simple web application: a frontend service (webapp) that calls a backend API (api-service).
Here’s a snapshot of what the Service Map might look like before any tracing is enabled:
(No services visible in the Service Map)
Now, let’s enable distributed tracing. We’ll use OpenTelemetry for this.
Frontend (webapp) Configuration (simplified):
import requests
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.requests import RequestsInstrumentor
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import ALWAYS_ON
# Name the service so its spans are attributed to "webapp"
resource = Resource(attributes={"service.name": "webapp"})
provider = TracerProvider(resource=resource, sampler=ALWAYS_ON)
# Batch spans and send them to the Collector over OTLP/HTTP
span_processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="http://otel-collector:4318/v1/traces"))
provider.add_span_processor(span_processor)
trace.set_tracer_provider(provider)
# Patch requests so outgoing calls carry the trace context headers
# (requires the opentelemetry-instrumentation-requests package)
RequestsInstrumentor().instrument()
tracer = trace.get_tracer(__name__)
# ... later in your request handler ...
with tracer.start_as_current_span("process_request"):
    # HTTP call to the backend API
    response = requests.get("http://api-service:8000/data")
    # ...
Backend (api-service) Configuration (simplified):
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import ALWAYS_ON
resource = Resource(attributes={"service.name": "api-service"})
provider = TracerProvider(resource=resource, sampler=ALWAYS_ON)
span_processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="http://otel-collector:4318/v1/traces"))
provider.add_span_processor(span_processor)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)
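# ... later in your API handler ...
# Assumes the web framework's OpenTelemetry instrumentation has already
# extracted the incoming trace context; otherwise this span starts a new trace.
with tracer.start_as_current_span("get_data"):
    # ... fetch data ...
    pass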
OTLP Collector (Receiving spans and forwarding to Datadog):
Your OpenTelemetry Collector would be configured to receive OTLP over HTTP on port 4318 (or over gRPC on port 4317) and then export to Datadog using the Datadog exporter.
receivers:
  otlp:
    protocols:
      http:
      grpc:
exporters:
  datadog:
    api:
      site: datadoghq.com # or datadoghq.eu, etc.
      key: <YOUR_DATADOG_API_KEY> # Alternatively, reference an env var: ${env:DD_API_KEY}
    # Traces are exported once this exporter is listed in a traces pipeline below
processors:
  batch:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [datadog]
Once requests flow through your instrumented services and the Collector forwards their spans to Datadog, the Service Map populates dynamically.
You’ll start seeing something like this:
webapp (10 requests/min) ----> api-service (8 requests/min)
The arrows represent requests where a span from webapp directly triggered a span in api-service (e.g., via an HTTP call). The numbers indicate the rate of requests that are part of a trace.
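To make the arrow rule concrete, here is a minimal sketch of how service-to-service edges could be derived from span data. The Span record and the service_edges function are hypothetical names made up for illustration; this is not Datadog's actual algorithm, only the parent-child idea described above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span record (not the OpenTelemetry or Datadog type)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str


def service_edges(spans):
    """Count caller -> callee edges where a span's parent belongs to another service."""
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    edges = Counter()
    for span in spans:
        if span.parent_id is None:
            continue  # root span: nothing called it within this trace
        parent = by_id.get((span.trace_id, span.parent_id))
        # No edge if the parent is unknown or lives in the same service.
        if parent is not None and parent.service != span.service:
            edges[(parent.service, span.service)] += 1
    return edges


spans = [
    Span("t1", "a", None, "webapp"),
    Span("t1", "b", "a", "api-service"),   # context propagated: child of the webapp span
    Span("t2", "c", None, "webapp"),       # request that never called the backend
    Span("t3", "d", None, "api-service"),  # context lost: backend span starts its own trace
]
edges = service_edges(spans)
print(edges)
```

Only the first trace produces a webapp to api-service edge. The backend span in trace t3 has no parent, so it contributes to api-service's own request count but not to any arrow.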
The core problem this solves is understanding the runtime interactions between your services. Before tracing, you might know webapp can call api-service from your documentation or code review, but you don’t know if it’s happening, how often, or if it’s successful in the context of a user request. The Service Map shows you the operational reality.
The key levers you control are:
- Instrumentation: Ensuring your services are correctly emitting spans with appropriate parent-child relationships. This means propagating trace context (like traceparent headers) across network calls.
- Sampling: Deciding which traces to keep. ALWAYS_ON is for debugging; for production, you’d typically use a probabilistic or rate-limiting sampler to manage volume while still getting representative data.
- Collector Configuration: Ensuring your OTLP Collector is correctly configured to receive spans and export them to Datadog.
- Datadog Agent/API Key: Making sure Datadog can ingest the data.
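The sampling lever is easier to reason about with a small example. The sketch below illustrates the general idea behind probabilistic, trace-ID-based sampling: the decision is a pure function of the trace ID, so every service that sees the same trace reaches the same verdict. The should_sample function and TRACE_ID_MASK constant are hypothetical names for this illustration, not the SDK's code. In a real setup you would more likely use the TraceIdRatioBased and ParentBased samplers from opentelemetry.sdk.trace.sampling.

```python
import random

TRACE_ID_MASK = (1 << 64) - 1  # look only at the low 64 bits of the 128-bit trace ID


def should_sample(trace_id: int, ratio: float) -> bool:
    """Keep a trace when the low 64 bits of its ID fall below ratio * 2**64."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    bound = round(ratio * (1 << 64))
    return (trace_id & TRACE_ID_MASK) < bound


# Simulate 10,000 random trace IDs and keep roughly 10% of them.
rng = random.Random(0)
trace_ids = [rng.getrandbits(128) for _ in range(10_000)]
kept = sum(should_sample(t, 0.1) for t in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

Because the decision depends only on the trace ID, a trace is either kept in full or dropped in full, which keeps the edges on the map consistent. The rates shown on the map are then based on whatever survives sampling, so a lower ratio means less data behind each number.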
The most surprising thing is that the Service Map doesn’t show you static infrastructure dependencies or every possible API call. It’s a dynamic representation of the request flows that were actually traced. If a service call is happening but isn’t part of a distributed trace (e.g., tracing isn’t enabled, or trace context isn’t propagated), it won’t appear on the map as a dependency. This means the map is a powerful tool for identifying working and traced paths, but you need other tools (like network monitoring or static configuration analysis) to understand all potential paths or untraced traffic.
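Since a missing propagation header is the usual reason a real call is absent from the map, it helps to see what that header looks like. The sketch below builds and parses a W3C Trace Context traceparent value (version 00: a 32-hex-character trace ID, a 16-hex-character parent span ID, and a 2-hex-character flags field). The make_traceparent and parse_traceparent helpers are hypothetical names for illustration; in practice the OpenTelemetry propagators and instrumentation libraries do this for you.

```python
import re
import secrets
from typing import Optional, Tuple

# Only version 00 of the header is handled in this sketch.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Format the header the caller attaches to its outgoing request."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: Optional[str]) -> Optional[Tuple[str, str, bool]]:
    """Return (trace_id, parent_span_id, sampled), or None if missing or malformed."""
    if not header:
        return None
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid in the W3C Trace Context format.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return trace_id, span_id, bool(int(flags, 16) & 0x01)


# Caller side (webapp): attach the header to the outgoing request.
trace_id = secrets.token_hex(16)  # 32 hex characters
span_id = secrets.token_hex(8)    # 16 hex characters
headers = {"traceparent": make_traceparent(trace_id, span_id)}

# Callee side (api-service): continue the trace if the header is present.
parsed = parse_traceparent(headers.get("traceparent"))
untraced = parse_traceparent({}.get("traceparent"))
print(parsed, untraced)
```

When the callee gets None back, it has no parent to attach to and starts a fresh trace. That is exactly the case where both services show up on the map but no arrow connects them.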
The next concept you’ll likely encounter is how to use the Service Map to drill down into specific performance issues, like high latency or error rates, by filtering and interacting with the map’s visualization.