Scale Docker Compose Services with Multiple Replicas (2026)

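Docker Compose, by default, spins up one instance of each service you define. To handle more load, you need more instances. Compose gives you two ways to ask for them: the deploy.replicas key in the Compose file, and the --scale flag on the command line. How the two interact depends on your Compose version, and that’s where most of the confusion comes from.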

Let’s say you’ve got a simple docker-compose.yml for a web app and a database:

services:
  web:
    image: nginx:latest
    ports:
      - "80"   # container port only; Docker picks a free host port per replica
    deploy:
      replicas: 3
  db:
    image: postgres:13
    environment:
      POSTGRES_PASSWORD: mysecretpassword

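With Compose V2 (the docker compose plugin), docker compose up honors deploy.replicas, so it asks for three web containers and one db container. The legacy docker-compose v1 binary ignored the deploy section unless you passed --compatibility, so there a plain docker-compose up gave you one container per service. Either way, replicas can’t share a fixed host port: a mapping like 8080:80 on a scaled service fails with a “port is already allocated” error as soon as the second replica starts. Publish only the container port, or keep the service internal and put a proxy in front of it. To change the count without editing the file, you use the command line.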

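To change the count at run time, pass --scale to docker compose up. Compose has no daemon of its own; the CLI reads your docker-compose.yml and asks the Docker Engine to create or remove containers until the count matches. The older docker-compose scale command from Compose v1 was deprecated in favor of this flag. (Recent Compose V2 releases also ship a docker compose scale subcommand with the same SERVICE=NUM syntax.)

For example, to scale the web service to 3 instances and the db service to 2 instances, you’d run:

docker compose up -d --scale web=3 --scale db=2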

After running this, you can verify by checking your running containers:

docker ps

You’ll see output like this, showing multiple web containers (IDs and host ports are illustrative and will differ on your machine; this assumes web publishes container port 80 without a fixed host port):

CONTAINER ID   IMAGE          COMMAND                  CREATED         STATUS         PORTS                   NAMES
a1b2c3d4e5f6   nginx:latest   "/docker-entrypoint.…"   5 seconds ago   Up 4 seconds   0.0.0.0:32770->80/tcp   myproject-web-3
f7e6d5c4b3a2   nginx:latest   "/docker-entrypoint.…"   5 seconds ago   Up 4 seconds   0.0.0.0:32769->80/tcp   myproject-web-2
9876543210ab   nginx:latest   "/docker-entrypoint.…"   5 seconds ago   Up 4 seconds   0.0.0.0:32768->80/tcp   myproject-web-1
... (db containers would also appear if scaled)

Notice how the container names end in -1, -2, -3, etc. (Compose v1 used underscores instead, as in myproject_web_1.) The number is the replica index, which Compose also records in a label on each container so it can tell the instances of a service apart.
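To list only the replicas of one service instead of every container on the host, you can filter on the labels Compose attaches to its containers. The sketch below wraps that in a small shell function. The function name list_replicas and the project name myproject are placeholders for illustration, not part of Compose.

```shell
# list_replicas PROJECT SERVICE
# Prints the names of the running containers that Compose created for
# SERVICE in PROJECT, one per line, by filtering on Compose's labels.
list_replicas() {
  docker ps \
    --filter "label=com.docker.compose.project=$1" \
    --filter "label=com.docker.compose.service=$2" \
    --format '{{.Names}}'
}
```

Calling list_replicas myproject web should print one line per web replica. If you are already in the project directory, docker compose ps web gives a similar view without the function.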

Scaling works by comparing the containers that already exist for a service with the count you asked for, then creating or removing containers to close the gap. Compose doesn’t shell out to docker run; it calls the Docker Engine API directly. Every replica is created from the same service definition, so each one gets the same image, environment variables, and volume mounts as the first (which also means replicas share any named volumes).

What’s a bit counterintuitive is the history of the deploy section. It was introduced for Docker Swarm (docker stack deploy), and Compose v1 skipped it on a single host, which is why many guides still call replicas a Swarm-only key. Compose V2 follows the Compose Specification and honors deploy.replicas on a plain single-host setup. The --scale flag takes precedence over the file, so you can keep a default in docker-compose.yml and override it per run.

Scaling is idempotent. If you run docker compose up -d --scale web=3 when you already have 3 web containers running (and the service definition hasn’t changed), nothing happens. If you ask for web=5 and you currently have 3, Compose adds 2 more. If you ask for web=1 and you have 3, it stops and removes 2 of them.
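The three cases above boil down to comparing the current count with the desired count. The function below is a toy model of that decision, written only to make the idempotency concrete. It is not Compose’s actual implementation, and the name scale_plan is made up for this example.

```shell
# scale_plan CURRENT DESIRED
# Prints the action needed to move from CURRENT running replicas to
# DESIRED replicas. Illustrative model only; it does not call Docker.
scale_plan() {
  current=$1
  desired=$2
  if [ "$desired" -gt "$current" ]; then
    echo "create $((desired - current))"
  elif [ "$desired" -lt "$current" ]; then
    echo "remove $((current - desired))"
  else
    echo "no change"
  fi
}

scale_plan 3 3   # prints "no change"
scale_plan 3 5   # prints "create 2"
scale_plan 3 1   # prints "remove 2"
```

Running the same target twice produces "no change" the second time, which is the property that makes scaling safe to repeat in scripts.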

More replicas only help if traffic actually reaches them. A load balancer (like Nginx or HAProxy) in front of your scaled service can send each request to any of the available instances. Docker Compose doesn’t ship one for scaled services on a single host; you’d typically set one up as another service in your docker-compose.yml or manage it externally.

When you scale a service up, each new container gets its own network namespace and IP address on the Compose network. Replicas share the image and configuration defined in docker-compose.yml, but they are separate processes, so a crash in one replica doesn’t take the others down. They still share the host’s CPU, memory, and disk, so this is not protection against the host itself failing.
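To see the separate addresses for yourself, you can ask Docker for the IP of each replica. This is a minimal sketch that assumes you run it from the project directory; the function name replica_ips is a placeholder.

```shell
# replica_ips SERVICE
# Prints "<container name> <IP address>" for each container of SERVICE
# in the Compose project of the current directory.
replica_ips() {
  # `docker compose ps -q SERVICE` prints one container ID per replica.
  for id in $(docker compose ps -q "$1"); do
    docker inspect \
      --format '{{.Name}} {{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' \
      "$id"
  done
}
```

Calling replica_ips web should print one line per replica, each with a different address on the project network. Container names in docker inspect output start with a slash.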

If you scale down by asking for a count N that is lower than the number of running containers (docker compose up -d --scale service_name=N), Compose stops and removes containers until the count is met. It typically removes the containers with the highest numerical suffix first, but don’t build anything that depends on which replicas survive.

The most surprising thing about scaling with Compose is how little routing it does for you. Service discovery is covered: every replica is registered under the service name in Docker’s embedded DNS, so another container on the same network that looks up web gets the IPs of all the web replicas. What’s missing is a real load balancer. DNS rotation is coarse, clients often cache the first answer, and nothing gives you a single host port that fans out to the replicas. If you’re scaling a web server, you’ll want an Nginx or HAProxy container acting as a reverse proxy: it publishes the host port and forwards requests to the web service by name over the internal network.
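One way to wire up such a proxy is sketched below. The script only writes two files; it does not start anything. The directory name scaled-demo, the service name proxy, and host port 8080 are assumptions chosen for the example, and the nginx config is deliberately minimal.

```shell
# Write a minimal Compose file and nginx config into ./scaled-demo.
# Directory name, service names, and port 8080 are illustrative.
mkdir -p scaled-demo

cat > scaled-demo/docker-compose.yml <<'EOF'
services:
  proxy:
    image: nginx:latest
    ports:
      - "8080:80"          # the only service that publishes a host port
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - web                # start web first so the name "web" resolves
  web:
    image: nginx:latest
    deploy:
      replicas: 3          # no host port, so replicas cannot collide
EOF

cat > scaled-demo/nginx.conf <<'EOF'
events {}
http {
  server {
    listen 80;
    location / {
      # "web" resolves through Docker's embedded DNS to the replica IPs.
      # nginx resolves the name once at startup, so reload the proxy
      # after changing the replica count.
      proxy_pass http://web:80;
    }
  }
}
EOF

echo "wrote scaled-demo/docker-compose.yml and scaled-demo/nginx.conf"
```

From there, cd scaled-demo followed by docker compose up -d should bring up one proxy and three web containers, with requests to http://localhost:8080 spread across the replicas. When a hostname resolves to several addresses, nginx rotates through them, which is what makes this small config enough for a demo. For production you would likely want health checks and dynamic re-resolution as well.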

The next challenge you’ll likely encounter is managing stateful services, like databases, when scaled.