PM2’s cluster mode is surprisingly bad at making a deployment truly zero-downtime from your users’ point of view.

Let’s watch it in action. Imagine you have an Express app that prints the current process ID and a timestamp to the console when it receives a request.

// app.js
const express = require('express');
const app = express();
const port = 3000;

app.get('/', (req, res) => {
  console.log(`Request received by PID ${process.pid} at ${new Date().toISOString()}`);
  res.send(`Hello from process ${process.pid}!`);
});

app.listen(port, () => {
  console.log(`App listening on port ${port} with PID ${process.pid}`);
});

Now, let’s start PM2 with clustering:

pm2 start app.js -i max --name "my-app"

pm2 list will show something like this (with -i max, PM2 starts one worker per CPU core; three here):

┌─────┬──────────┬─────────┬─────────┬─────────┬──────────┬────────┬────────┬───────────┬──────────┬────────────┐
│ id  │ name     │ mode    │ status  │ restart │ uptime   │ cpu    │ mem    │ watching  │ pid      │ created_at │
├─────┼──────────┼─────────┼─────────┼─────────┼──────────┼────────┼────────┼───────────┼──────────┼────────────┤
│ 0   │ my-app   │ cluster │ online  │ 0       │ 0s       │ 0%     │ 0.0 Mo │ disabled  │ 12345    │ 2023-10-27 │
│ 1   │ my-app   │ cluster │ online  │ 0       │ 0s       │ 0%     │ 0.0 Mo │ disabled  │ 12346    │ 2023-10-27 │
│ 2   │ my-app   │ cluster │ online  │ 0       │ 0s       │ 0%     │ 0.0 Mo │ disabled  │ 12347    │ 2023-10-27 │
└─────┴──────────┴─────────┴─────────┴─────────┴──────────┴────────┴────────┴───────────┴──────────┴────────────┘

You have multiple Node.js processes (workers) all listening on the same port. Under the hood this is Node’s cluster module: the PM2 daemon acts as the primary process and load balancer, distributing incoming connections across the workers. This is the foundation for zero-downtime reloads.

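The magic happens during a reload. When you run pm2 reload my-app, PM2 doesn’t just restart everything at once. Instead, it restarts the worker processes one by one. It sends a SIGINT signal to a worker, which is the app’s cue to stop accepting new connections and finish its current requests before exiting. That draining is only reliable if the app handles SIGINT itself: without a handler, Node’s default response to SIGINT is to exit right away, and PM2 follows up with SIGKILL if the process is still alive after kill_timeout (1600 ms by default). While one worker is being restarted, the other workers are still online and handling traffic.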
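The "finish its current requests before exiting" part has to come from the application. The following is a minimal sketch of one way to do it, not PM2's own code: installGracefulShutdown is a hypothetical helper name, and the 1500 ms safety timeout is an assumption chosen to sit under PM2's default kill_timeout.

```javascript
// Hypothetical helper (not part of PM2 or Express): drains an http.Server
// when the given signal arrives, then exits.
function installGracefulShutdown(server, options = {}) {
  const {
    signal = 'SIGINT',                      // the signal the article says PM2 sends on reload
    timeoutMs = 1500,                       // assumption: stay below the default kill_timeout
    exit = (code) => process.exit(code),    // injectable so the helper can be tested
  } = options;

  const handler = () => {
    // Safety net: give up and exit non-zero if draining takes too long.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref();

    // Stop accepting new connections; the callback runs once open ones have ended.
    server.close((err) => {
      clearTimeout(timer);
      exit(err ? 1 : 0);
    });

    // Idle keep-alive sockets would otherwise hold the server open (Node 18.2+).
    if (typeof server.closeIdleConnections === 'function') {
      server.closeIdleConnections();
    }
  };

  process.once(signal, handler);
  return handler;
}
```

In app.js this would be called as installGracefulShutdown(server), where server is the value returned by app.listen(...).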

Here’s the critical part: PM2’s default behavior for pm2 reload is to send SIGINT to each worker sequentially. This means that for a brief moment, between the signal being sent and the process exiting, and then again while the new process is starting up and becoming ready, there’s a gap. If a request happens to land exactly during that micro-window, it might not be picked up by any worker.

To illustrate, let’s simulate a deployment. Start your app as shown above. Then, in one terminal, tail the logs:

pm2 logs my-app

In another terminal, run the reload:

pm2 reload my-app

Watch the logs. You’ll see messages along these lines (the exact wording varies by PM2 version):

[TAIL] Tailing last 15 lines for [my-app] process (id 0)
/home/user/.pm2/logs/my-app-out.log last 15 lines
...
0|my-app   | App listening on port 3000 with PID 12345
0|my-app   | Request received by PID 12345 at 2023-10-27T10:00:00.000Z
0|my-app   | Request received by PID 12345 at 2023-10-27T10:00:01.000Z
...
[PM2] Applying action reload on app my-app
[PM2] [my-app] sending SIGINT to pid 12345
[PM2] [my-app] process with id 0 was stopped
[PM2] [my-app] starting in cluster_mode (1 worker)
[PM2] [my-app] process with id 3 started
0|my-app   | App listening on port 3000 with PID 12348
0|my-app   | Request received by PID 12348 at 2023-10-27T10:00:05.000Z
...
[PM2] [my-app] sending SIGINT to pid 12346
[PM2] [my-app] process with id 1 was stopped
[PM2] [my-app] starting in cluster_mode (1 worker)
[PM2] [my-app] process with id 4 started
0|my-app   | App listening on port 3000 with PID 12349
...

If you were to hit http://localhost:3000 rapidly with curl or a browser during the reload, you might occasionally get a connection refused or a timeout. This is because the load balancer (PM2) might try to send a request to a worker that has just received the SIGINT and is shutting down, or to a new worker that hasn’t fully initialized yet.
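Rather than hammering the URL by hand, a small script can count the failures. This is a rough sketch using only Node's built-in http module; hit and probe are hypothetical helper names, and the request count, gap, and timeout are arbitrary assumptions.

```javascript
const http = require('http');

// Hypothetical helper: one GET request. Resolves true on a 2xx response,
// false on a connection error, a timeout, or a truncated response.
function hit(url, timeoutMs = 1000) {
  return new Promise((resolve) => {
    // agent: false opens a fresh connection per request, so every request
    // has to be accepted by whichever worker is available at that moment.
    const req = http.get(url, { agent: false, timeout: timeoutMs }, (res) => {
      res.resume(); // discard the body
      res.on('end', () => resolve(res.statusCode >= 200 && res.statusCode < 300));
      res.on('close', () => resolve(false)); // no-op if 'end' already resolved
    });
    req.on('timeout', () => req.destroy(new Error('timeout')));
    req.on('error', () => resolve(false));
  });
}

// Hypothetical helper: send `total` requests, `gapMs` apart, and tally the results.
async function probe(url, total = 200, gapMs = 10) {
  const tally = { ok: 0, failed: 0 };
  for (let i = 0; i < total; i++) {
    if (await hit(url)) {
      tally.ok += 1;
    } else {
      tally.failed += 1;
    }
    await new Promise((resolve) => setTimeout(resolve, gapMs));
  }
  return tally;
}
```

Calling probe('http://localhost:3000/').then(console.log) from a third terminal while pm2 reload my-app is running gives a count of served versus failed requests for that run.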

Restarting workers one at a time is generally fine, but it doesn’t guarantee that every single request during the entire reload cycle will be handled. The "zero downtime" here is more about avoiding a full service interruption; a few dropped requests are sometimes an acceptable trade-off for faster deployments.

If you truly need zero dropped requests during a reload, you need to configure PM2 to wait for the new process to be ready before killing the old one. This is what the wait_ready option (--wait-ready on the command line) is for.

To enable this, your app has to tell PM2 when it has finished initializing and is ready to accept connections. It does so by sending a "ready" message with process.send('ready').

// app.js (modified for wait_ready)
const express = require('express');
const app = express();
const port = 3000;

app.get('/', (req, res) => {
  console.log(`Request received by PID ${process.pid} at ${new Date().toISOString()}`);
  res.send(`Hello from process ${process.pid}!`);
});

const server = app.listen(port, () => {
  console.log(`App listening on port ${port} with PID ${process.pid}`);
});

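// This is the key for wait_ready: tell PM2 this worker can take traffic.
server.on('listening', () => {
  // process.send only exists when an IPC channel is present (as under PM2),
  // so guard it to keep a plain `node app.js` working.
  if (process.send) {
    process.send('ready'); // Sends the 'ready' message PM2 waits for
  }
});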

Then, your reload command becomes:

pm2 reload my-app --wait-ready

With --wait-ready, PM2 no longer treats a new worker as usable just because its process has started. Instead, it waits for the new worker to send the ready message (which our modified app.js does on its listening event). Only after the new worker has signaled it’s ready does PM2 send SIGINT to the old worker. This ensures there’s always at least one process fully capable of handling requests throughout the entire reload sequence.
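Why the ordering matters can be seen with a toy model. This is purely illustrative arithmetic, not PM2's implementation: minCapacity is a hypothetical function that counts how many workers can serve traffic at the worst moment of a rolling reload, depending on whether each replacement starts before or after its old worker stops.

```javascript
// Toy model only: tracks the number of live workers through a rolling reload
// and returns the lowest value seen.
function minCapacity(workers, startFirst) {
  let live = workers;
  let min = live;
  for (let i = 0; i < workers; i++) {
    if (startFirst) {
      live += 1;                  // replacement is up and has signalled ready
      min = Math.min(min, live);
      live -= 1;                  // only now is the old worker stopped
    } else {
      live -= 1;                  // old worker is stopped first
      min = Math.min(min, live);
      live += 1;                  // replacement comes up afterwards
    }
    min = Math.min(min, live);
  }
  return min;
}

console.log(minCapacity(1, false)); // 0: with a single worker there is a moment with nobody serving
console.log(minCapacity(1, true));  // 1
console.log(minCapacity(3, false)); // 2
console.log(minCapacity(3, true));  // 3
```

With a single worker, stopping first leaves a window with no capacity at all, while starting first never drops below the original count.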

The next thing you’ll likely wrestle with is how to manage environment variables and configuration across these multiple processes in a robust way, especially when dealing with secrets.
