The Elastic APM Service Map visualizes the flow of requests across your microservices, showing which services call which and how traffic moves between them.
Let’s see it in action. Imagine a typical e-commerce flow: a user browses products, adds items to a cart, and then checks out. Here’s what that might look like in the APM Service Map:
```mermaid
graph LR
A[Browser] --> B(Frontend Web Server);
B --> C{Product Catalog Service};
B --> D{Cart Service};
C --> E[Database];
D --> F[Database];
B --> G{Checkout Service};
G --> H{Payment Gateway};
G --> I[Database];
```
In this diagram, Browser represents the user’s browser, Frontend Web Server is your main web application, and Product Catalog Service, Cart Service, and Checkout Service are individual microservices. The arrows indicate the direction of requests. In the actual APM UI, selecting a service opens a summary of metrics such as latency, throughput, and failed transaction rate, and when machine learning anomaly detection is enabled, service nodes are colored by health status. You’d quickly see if, for example, the Checkout Service has an elevated failed transaction rate or the Product Catalog Service is handling unusually high throughput.
The Service Map addresses the "distributed debugging nightmare" problem. Before APM tools, debugging a request that traversed multiple services meant sifting through logs from each individual service and trying to stitch together a coherent timeline. Distributed tracing does that stitching for you, and the Service Map aggregates the resulting traces into a visual picture of the paths requests take end to end. Requests are correlated automatically through distributed tracing headers (such as traceparent in W3C Trace Context), so you don’t have to configure this correlation manually.
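To make the correlation header concrete, here is a minimal sketch of parsing a traceparent value in plain Python. It assumes the version 00 layout from the W3C Trace Context specification (version, trace ID, parent ID, flags) and is not how any Elastic agent implements it; the helper name parse_traceparent is hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Version 00 layout: 00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>.
# This sketch rejects anything else, including future versions.
_TRACEPARENT_RE = re.compile(
    r"^00-(?P<trace_id>[0-9a-f]{32})-(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    trace_id: str   # shared by every service that handles the request
    parent_id: str  # ID of the caller's span that made the outgoing call
    sampled: bool   # lowest bit of the flags field


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Hypothetical helper: return the parsed header, or None if it is invalid."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id = match["trace_id"], match["parent_id"]
    # The specification treats all-zero IDs as invalid.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    sampled = bool(int(match["flags"], 16) & 0x01)
    return TraceParent(trace_id, parent_id, sampled)


example = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
parsed = parse_traceparent(example)
print(parsed)
```

Every service that sees the same trace_id is part of the same request, which is what lets the backend join events from different services without any log stitching.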
Internally, Elastic APM agents in each service automatically capture transaction and span data. A transaction represents a single request handled by a service (e.g., an HTTP request to /products). Spans are the individual operations within that transaction (e.g., a database query or an outgoing HTTP call to another service). When a service calls another service, the APM agent injects the trace ID and the ID of the current span into the outgoing request headers. The receiving service’s APM agent reads these headers and starts a new transaction that belongs to the same trace, with the caller’s span as its parent. Elastic APM then collects this data and builds the Service Map by aggregating these linked spans and transactions.
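The aggregation idea can be illustrated with a small, self-contained sketch. It is a deliberate simplification and not Elastic’s actual algorithm: each event is reduced to a hypothetical (id, parent id, service name) tuple, and an edge is counted whenever an event’s parent belongs to a different service.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Simplified event: (event id, parent id, service name).
# Real APM documents carry far more fields; this shape is an assumption.
Event = Tuple[str, Optional[str], str]


def build_service_edges(events: List[Event]) -> Counter:
    """Count caller -> callee links between different services."""
    service_by_id: Dict[str, str] = {
        event_id: service for event_id, _, service in events
    }
    edges: Counter = Counter()
    for _, parent_id, service in events:
        parent_service = service_by_id.get(parent_id) if parent_id else None
        # Only a parent in another service represents a cross-service call.
        if parent_service is not None and parent_service != service:
            edges[(parent_service, service)] += 1
    return edges


events: List[Event] = [
    ("t1", None, "frontend"),   # transaction: incoming browser request
    ("s1", "t1", "frontend"),   # span: outgoing HTTP call to the cart service
    ("t2", "s1", "cart"),       # transaction in cart; its parent is span s1
    ("s2", "t2", "cart"),       # span: database query inside cart
    ("s3", "t1", "frontend"),   # span: outgoing HTTP call to checkout
    ("t3", "s3", "checkout"),   # transaction in checkout; its parent is span s3
]

edges = build_service_edges(events)
for (caller, callee), count in sorted(edges.items()):
    print(f"{caller} -> {callee}: {count}")
```

Spans that stay inside one service, such as the database query above, do not produce a service-to-service edge in this sketch; only the parent links that cross a service boundary do.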
You control the granularity of what you see by configuring the APM agents. Depending on the agent, you can limit which transactions are traced (e.g., HTTP requests but not background jobs), set a sampling rate (to reduce data volume by fully tracing only a percentage of requests), and choose which outgoing requests are automatically instrumented. For example, in your apm-agent-python configuration, you might set the transaction_sample_rate option to 0.1 to sample 10% of transactions. This significantly reduces the load on your APM server and Elasticsearch cluster while still providing a representative view of your system’s behavior.
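As a sketch of what that looks like in practice, the Python agent reads its settings from an ELASTIC_APM dictionary in frameworks such as Django. The service name, server URL, and environment below are placeholder values chosen for illustration, not values from any real deployment.

```python
# settings.py (Django-style configuration for the Elastic APM Python agent)
# SERVICE_NAME, SERVER_URL, and ENVIRONMENT are placeholders for illustration.
ELASTIC_APM = {
    "SERVICE_NAME": "cart-service",
    "SERVER_URL": "http://localhost:8200",
    "ENVIRONMENT": "production",
    # Sample roughly 10% of transactions; valid values range from 0.0 to 1.0.
    "TRANSACTION_SAMPLE_RATE": 0.1,
}
```

The same option can also be supplied through the ELASTIC_APM_TRANSACTION_SAMPLE_RATE environment variable, which is convenient when the rate should differ between environments.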
The most surprising thing about how service maps are constructed is that they don’t necessarily represent all possible communication paths. They represent the paths actually taken by requests that were traced and successfully reported to the APM server. A link might exist in your architecture but not appear on the map for several reasons: no traced requests have traversed it recently, the agent configuration is too restrictive, or the communication is failing at the network level before APM instrumentation can report it.
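One way to act on this is to compare the links you expect from your architecture with the links that were actually observed. The sketch below assumes you already have both as sets of (caller, callee) pairs; the edge data and the missing_links helper are hypothetical, and how you obtain the observed set is outside its scope.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (caller service, callee service)


def missing_links(expected: Set[Edge], observed: Set[Edge]) -> Set[Edge]:
    """Return expected links that never showed up in traced traffic."""
    return expected - observed


# Hypothetical data: links documented in the architecture ...
expected = {
    ("frontend", "catalog"),
    ("frontend", "cart"),
    ("frontend", "checkout"),
    ("checkout", "payment-gateway"),
}
# ... versus links seen in recently traced requests.
observed = {
    ("frontend", "catalog"),
    ("frontend", "cart"),
    ("frontend", "checkout"),
}

gaps = sorted(missing_links(expected, observed))
print(gaps)
```

A non-empty result does not prove the link is broken. It only tells you which of the causes above (no recent traffic, restrictive agent configuration, or a network-level failure) is worth investigating.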
Once you’ve mastered visualizing your current service dependencies, the next logical step is to start defining and visualizing SLOs (Service Level Objectives) on top of that map.