The most surprising thing about benchmarking etcd is that raw speed is only part of the picture. What matters more is how predictable its latency stays under load, because a single slow request can cascade into system-wide failures.
Let’s see etcdutl benchmark in action. Imagine you’ve got a cluster running, and you want to simulate some read and write traffic to see how it holds up.
```shell
ETCDCTL_API=3 etcdutl benchmark \
  --endpoints=https://etcd-0.etcd.svc.cluster.local:2379,https://etcd-1.etcd.svc.cluster.local:2379,https://etcd-2.etcd.svc.cluster.local:2379 \
  --keys=10000 \
  --conns=50 \
  --clients=10 \
  --key-size=100 \
  --value-size=1000 \
  --total-ops=100000 \
  --put-rate=1000 \
  --get-rate=1000 \
  --limit-rate=1000 \
  --insecure-transport=false \
  --cacert=/etc/etcd/ssl/etcd-ca.crt \
  --cert=/etc/etcd/ssl/etcd-client.crt \
  --key=/etc/etcd/ssl/etcd-client.key
```
This command is telling etcdutl benchmark to:
- Connect to three etcd endpoints.
- Use 10,000 unique keys.
- Open 50 connections to the cluster.
- Run 10 concurrent clients that issue the operations.
- Make each key 100 bytes and each value 1,000 bytes.
- Perform a total of 100,000 operations.
- Attempt to put 1,000 keys per second.
- Attempt to get 1,000 keys per second.
- Limit the overall rate to 1,000 operations per second.
- Use TLS for secure transport.
- Specify the CA certificate, client certificate, and client key for authentication.
The output reports latency percentiles and throughput for the put and get operations. You’re looking for a low average latency and, more importantly, a low p99 (99th percentile) latency. A high p99 means that one request in a hundred is taking a very long time, which is a major red flag for etcd.
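To make the difference between average and p99 concrete, here is a minimal sketch that summarizes a list of latencies. It assumes one latency value in milliseconds per line on stdin, which is an illustrative format and not the benchmark tool's own output. The function name latency_summary is hypothetical, and the percentile uses the nearest-rank method.

```shell
# Print the mean and the nearest-rank p50/p99 of latencies read from stdin.
# Assumed input format: one latency in milliseconds per line.
latency_summary() {
  sort -n | awk '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) { print "no samples"; exit 1 }
      printf "avg=%.2f p50=%s p99=%s\n", sum / NR, v[rank(50, NR)], v[rank(99, NR)]
    }
    # Nearest-rank position: ceil(pct * n / 100), using integer arithmetic.
    function rank(pct, n,    r) {
      r = int((pct * n + 99) / 100)
      return (r < 1) ? 1 : r
    }
  '
}

# Hypothetical sample: nine fast requests and one slow outlier.
printf '%s\n' 2 2 2 2 2 2 2 2 2 500 | latency_summary
# prints: avg=51.80 p50=2 p99=500
```

In this sample the median is 2 ms, yet one outlier drags the average to 51.80 ms and sets the p99 at 500 ms. That is why the tail percentile is the number to watch.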
The core problem etcd solves is providing a distributed, strongly consistent key-value store. Once a write has been acknowledged, every subsequent read returns that value (or a newer one), no matter which etcd node you talk to. This holds for linearizable reads, which are etcd’s default; serializable reads trade that guarantee for lower latency and may return stale data. etcd achieves this using the Raft consensus algorithm. Each write is proposed through the leader, replicated to a majority of nodes, and only then committed. This consensus process is what guarantees consistency, but it also adds latency to every write.
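The "majority of nodes" requirement is simple arithmetic, and a small sketch shows why clusters are usually sized at 3 or 5 members. The helper names quorum and fault_tolerance are illustrative only.

```shell
# Quorum size for an n-member cluster: floor(n / 2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Members that can fail while the cluster can still commit writes.
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerates=$(fault_tolerance "$n")"
done
```

A 4-member cluster needs 3 acknowledgements per write and still tolerates only one failure, the same as a 3-member cluster, so the extra member adds replication work without adding resilience.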
The etcdutl benchmark tool simulates this by having multiple clients (controlled by --clients and --conns) issuing PUT and GET requests against your etcd cluster. The --put-rate, --get-rate, and --limit-rate flags are crucial for controlling the load. --limit-rate is particularly important because it caps the overall number of operations per second the benchmark tool will attempt to send, preventing it from overwhelming the cluster in a way that isn’t representative of real-world traffic patterns. It’s not a guarantee of actual throughput, but rather a target for the client.
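One way to see why a rate flag is a target and not a guarantee is Little's law: with a fixed number of clients, throughput cannot exceed clients divided by mean latency. The sketch below assumes each simulated client keeps exactly one request in flight at a time, which is an assumption about the load generator and not something the tool's output confirms. The function names are hypothetical.

```shell
# max_throughput CLIENTS MEAN_LATENCY_MS
# Ceiling in ops/s when every client waits for a reply before sending again.
max_throughput() {
  awk -v c="$1" -v ms="$2" 'BEGIN { printf "%d\n", c * 1000 / ms }'
}

# effective_rate CLIENTS MEAN_LATENCY_MS TARGET_RATE
# The rate actually reachable: the smaller of the target and the ceiling.
effective_rate() {
  awk -v c="$1" -v ms="$2" -v t="$3" 'BEGIN {
    cap = c * 1000 / ms
    printf "%d\n", (t < cap) ? t : cap
  }'
}

max_throughput 10 10        # 10 clients at 10 ms each -> 1000
effective_rate 10 25 1000   # at 25 ms the 1000 ops/s target drops to 400
```

With 10 clients, a 1,000 ops/s target is only reachable while mean latency stays at or below 10 ms. If the reported throughput falls short of the target, check latency before concluding that the rate limit was misconfigured.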
The --keys, --key-size, and --value-size parameters define the data landscape. A larger number of keys means more data to manage and potentially more disk I/O. Larger keys and values increase network traffic and processing overhead. Benchmarking with realistic data sizes and a sufficient number of keys is vital for accurate results.
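As a rough sanity check on the data landscape, the raw payload can be estimated from the three parameters. This sketch gives a lower bound only: it ignores revision history, index structures, and protocol framing, and the function names are illustrative.

```shell
# payload_bytes KEYS KEY_SIZE VALUE_SIZE
# Raw bytes needed to hold one revision of every key.
payload_bytes() { echo $(( $1 * ($2 + $3) )); }

# write_bytes_per_second PUT_RATE KEY_SIZE VALUE_SIZE
# Raw payload the leader must replicate to each follower per second.
write_bytes_per_second() { echo $(( $1 * ($2 + $3) )); }

payload_bytes 10000 100 1000          # 11000000 bytes, about 11 MB
write_bytes_per_second 1000 100 1000  # 1100000 bytes/s, about 1.1 MB/s
```

With the values from the command above, the working set is small enough to fit in memory on almost any node. If your production data set is much larger, the benchmark will understate disk pressure.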
When you run this benchmark, you’re essentially probing the cluster’s ability to maintain Raft consensus under concurrent load. High latency, especially at the higher percentiles, points to issues like disk I/O bottlenecks, network saturation, or insufficient CPU resources on the etcd nodes. etcd is notoriously sensitive to disk performance: it appends every write to its write-ahead log (WAL) and syncs it to disk before acknowledging it. Slow disks mean slow commits, which means slow writes, and potentially slow reads too, since a linearizable read has to wait until the node has caught up with the latest committed entries.
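One way to check whether disk latency is the bottleneck is etcd's own WAL fsync histogram, which is exposed in Prometheus text format as etcd_disk_wal_fsync_duration_seconds. The sketch below reads that text on stdin and prints the smallest bucket bound that covers 99% of fsyncs. The function name and the sample counts are made up for illustration, and the metrics URL in the comment is an assumption about where your member listens.

```shell
# Print the smallest histogram bound (in seconds) covering 99% of WAL fsyncs.
# Reads Prometheus text-format metrics on stdin.
fsync_p99_bucket() {
  awk '
    /^etcd_disk_wal_fsync_duration_seconds_bucket/ {
      # Pull the upper bound out of le="..." and keep the cumulative count.
      le = $0; sub(/.*le="/, "", le); sub(/".*/, "", le)
      n++; bound[n] = le; count[n] = $NF
    }
    END {
      if (n == 0 || count[n] == 0) { print "no fsync samples found"; exit 1 }
      total = count[n]   # the last (+Inf) bucket counts every observation
      for (i = 1; i <= n; i++)
        if (count[i] * 100 >= total * 99) { print bound[i]; exit 0 }
    }
  '
}

# Against a live member (address is an assumption):
#   curl -s http://127.0.0.1:2379/metrics | fsync_p99_bucket

# Made-up sample data:
cat <<'EOF' | fsync_p99_bucket
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 800
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 950
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 985
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"} 995
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"} 999
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"} 1000
EOF
# prints: 0.008
```

In the sample, 99% of fsyncs finish within 8 ms. The etcd documentation's rule of thumb is a p99 WAL fsync below 10 ms, so a result above that bound suggests the disk, and not the benchmark parameters, is what limits write latency.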
The etcdutl benchmark tool is designed to mimic the behavior of etcd clients, but it’s important to remember that real-world applications often have more complex access patterns and retry logic. The benchmark provides a baseline, but it’s not a perfect replica of production. For instance, the --write-batch flag (not used here but available) can simulate batched writes, which can significantly improve throughput by reducing the overhead of individual Raft proposals. Understanding how your application’s write patterns align with batching can be a key optimization.
After a successful benchmark, the next problem you’ll likely encounter is optimizing etcd’s disk I/O performance.