Node.js Process Management with PM2

Node.js is incredibly powerful for building fast, scalable applications, but running it in a production environment comes with a unique set of challenges. By default, a Node.js application runs as a single process with a single JavaScript event loop thread. While libuv provides a background thread pool for certain operations (such as filesystem or DNS work), application code itself runs on one core unless you explicitly scale it across processes. If an unhandled exception occurs, the process crashes, taking your entire application offline. Furthermore, running a single thread on a modern multi-core server leaves valuable CPU resources sitting idle.

This is where PM2 (Process Manager 2) comes in. PM2 is a production-grade daemon process manager for Node.js that helps keep your applications online. It provides automatic crash recovery, simplifies load balancing, enables zero-downtime deployments, and includes built-in monitoring tools.

In this article, we’ll look at the essential features of PM2 and explore how developers can use its advanced capabilities to stabilize and scale production workloads.

Getting Started: Moving Beyond `node app.js`

In development, you might rely on nodemon or standard node commands to run your applications. In production, however, you need a background process that outlives your SSH session.

To install PM2 globally, run:

npm install pm2@latest -g

Once installed, starting an application is as simple as passing the entry file:

pm2 start app.js

This command daemonizes the process, meaning it runs in the background and survives terminal closure. PM2 will automatically restart the app if it crashes. You can view all currently managed applications using the pm2 list (or pm2 ls) command, which displays a table containing memory usage, CPU utilization, restart counts, and uptime.

When you run pm2 start app.js, two actions are performed:

The app is registered in PM2’s process list, which tracks every application PM2 manages (a process is one instance of an app that PM2 has started).
The app is started in the background.

You can also pass a name for easier identification:

pm2 start app.js --name "my-api"

The following lifecycle commands are commonly used when managing PM2 processes:

pm2 start my-api      # Starts a stopped process that is already in PM2's list
pm2 stop my-api       # Stops the process but keeps it in PM2's list
pm2 restart my-api    # Restarts a running process (kills and respawns)
pm2 reload my-api     # Performs a zero-downtime reload for networked apps
pm2 delete my-api     # Stops and removes a process from the PM2 list
pm2 start my-api -f   # Force start even if already running

While the CLI is great for quick tests, production environments demand a more declarative approach. This is where the Ecosystem File comes in.

Process Configuration with Ecosystem Files

Managing complex configurations via command-line flags (like --watch or --max-memory-restart 500M) quickly becomes unmanageable. PM2 solves this with the Ecosystem File, a centralized JavaScript (or JSON/YAML) configuration file describing how applications should be launched, scaled, and monitored.

Generate a boilerplate file by running:

pm2 ecosystem

This creates an ecosystem.config.js file. Here is an example of a production configuration:

module.exports = {
  apps: [
    {
      name: 'payment-service',
      script: './src/index.js',
      instances: 'max',
      exec_mode: 'cluster',
      max_memory_restart: '1G',
      env: {
        NODE_ENV: 'development',
      },
      env_production: {
        NODE_ENV: 'production',
        PORT: 8080,
      },
    },
    {
      name: 'background-worker',
      script: './src/worker.js',
      instances: 2,
      env_production: {
        NODE_ENV: 'production',
      },
    },
  ],
};

Using this file, you can start all defined services simultaneously and specify the environment:

pm2 start ecosystem.config.js --env production

This declarative configuration is particularly useful for CI/CD pipelines and infrastructure-as-code workflows, ensuring consistency across different deployment environments.
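
To make the --env flag more concrete, here is a minimal sketch of how the env and env_production blocks in the example above combine. It approximates the documented behavior (values from env_<name> override the defaults in env) and is not PM2’s own code. The resolveEnv helper is a hypothetical name, and the real process also inherits variables from the environment PM2 was started in.

```javascript
// Illustrative only: approximates how an app's environment is resolved
// when an ecosystem file is started with `--env <name>`.
// resolveEnv is a hypothetical helper, not part of PM2's API.
function resolveEnv(appConfig, envName) {
  const base = appConfig.env || {};
  const overrides = envName ? appConfig[`env_${envName}`] || {} : {};
  // Later spreads win, so env_<name> values override the defaults in `env`.
  return { ...base, ...overrides };
}

const paymentService = {
  name: 'payment-service',
  env: { NODE_ENV: 'development' },
  env_production: { NODE_ENV: 'production', PORT: 8080 },
};

console.log(resolveEnv(paymentService, 'production'));
// { NODE_ENV: 'production', PORT: 8080 }
console.log(resolveEnv(paymentService));
// { NODE_ENV: 'development' }
```

Keeping shared defaults in env and only the differences in env_production avoids repeating every variable per environment.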

Scaling Up: Cluster Mode and Load Balancing

A key limitation of Node.js is that application JavaScript runs on a single thread. If you run one Node.js process on an 8-core server, your code effectively uses only one of those cores.

PM2’s Cluster Mode uses the native Node.js cluster module to spawn multiple child processes (workers) that share the same server port. Incoming HTTP, TCP, or WebSocket connections are distributed across these workers, so PM2 effectively acts as a built-in load balancer.

Note: For Cluster Mode to work reliably, applications should avoid keeping state in memory. Session data, caches, and WebSocket coordination should typically be stored in shared systems such as Redis, as consecutive requests from the same user might be handled by different worker processes.

In your ecosystem file, enabling cluster mode requires just two lines:

instances: 0, // or a specific number like 4
exec_mode: 'cluster',

Setting instances to 0 (or 'max', which still works but is marked as deprecated in the PM2 docs) tells PM2 to detect the number of available CPU cores and spawn one worker per core. If a worker crashes due to an out-of-memory error or an unhandled exception, PM2 automatically restarts it while the remaining workers continue to handle incoming traffic, improving both fault tolerance and overall throughput.
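
As a rough illustration of how the instances value maps to a worker count, the sketch below models the rules described here. The resolveInstances helper is hypothetical and is not PM2’s implementation. The handling of negative values ("all cores minus N") follows the PM2 cluster documentation as I understand it, and clamping to at least one worker is an assumption.

```javascript
const os = require('node:os');

// Hypothetical helper that models the meaning of `instances`;
// it is not PM2's own code.
function resolveInstances(instances, cpuCount = os.cpus().length) {
  const n = instances === 'max' ? 0 : Number(instances);
  if (!Number.isInteger(n)) {
    throw new TypeError(`invalid instances value: ${instances}`);
  }
  // 0 (or 'max') means one worker per detected core.
  if (n === 0) return cpuCount;
  // Assumption: negative values mean "all cores minus |n|", never below 1.
  if (n < 0) return Math.max(1, cpuCount + n);
  return n;
}

console.log(resolveInstances(0, 8));     // 8
console.log(resolveInstances('max', 8)); // 8
console.log(resolveInstances(4, 8));     // 4
console.log(resolveInstances(-1, 8));    // 7
```

Leaving one core free (for example with -1) can be a reasonable choice when the same host also runs a database, a reverse proxy, or the PM2 daemon under heavy load.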

Seamless Updates: Zero-Downtime Deployments

Deploying new code traditionally requires stopping the old server and starting the new one, resulting in dropped connections and brief periods of downtime. PM2 eliminates this issue through the reload command.

When operating in cluster mode, issuing a standard pm2 restart app kills all instances simultaneously and restarts them. In contrast, pm2 reload app performs a rolling restart.

Here is how the zero-downtime deployment lifecycle works:

PM2 spawns a new worker process running the updated code.
PM2 waits for the new process to report that it is ready to accept connections (signaled by the ready message if wait_ready is enabled).
PM2 sends a shutdown signal to one of the old worker processes, allowing it to gracefully finish its active requests and exit.
PM2 repeats this cycle until all workers have been replaced with the updated version.
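
The steps above can be sketched as a toy simulation. No real processes are started here: workers are plain objects, and rollingReload is a hypothetical function, not PM2’s implementation. The point is only to show that a replacement joins the pool before an old worker leaves, so serving capacity never drops.

```javascript
// Toy simulation of a rolling reload; all names are hypothetical.
function rollingReload(workers, newVersion) {
  const pool = [...workers];
  const capacityPerStep = [];

  for (const old of workers) {
    // 1. Spawn a replacement on the new code and wait until it is ready.
    const replacement = { id: `${old.id}-new`, version: newVersion, ready: true };
    pool.push(replacement);

    // 2. Only then retire one old worker.
    pool.splice(pool.indexOf(old), 1);

    // Record how many ready workers can serve traffic after this step.
    capacityPerStep.push(pool.filter((w) => w.ready).length);
  }
  return { pool, lowestCapacity: Math.min(...capacityPerStep) };
}

const before = [1, 2, 3].map((id) => ({ id, version: 'v1', ready: true }));
const { pool, lowestCapacity } = rollingReload(before, 'v2');
console.log(pool.map((w) => w.version)); // [ 'v2', 'v2', 'v2' ]
console.log(lowestCapacity);             // 3
```

A plain restart corresponds to emptying the pool first and refilling it afterwards, which is where the downtime window comes from.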

For APIs or applications that require time to establish database connections before serving traffic, you can configure PM2 to wait for a specific signal from your app.

wait_ready: true,
listen_timeout: 10000 // Wait up to 10 seconds for the ready signal

In your Node.js code:

db.connect().then(() => {
  app.listen(3000, () => {
    // Tell PM2 the app is ready to receive traffic
    if (process.send) {
      process.send('ready');
    }
  });
});
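
The other half of a clean reload is shutting down gracefully when PM2 asks the old worker to exit. By default PM2 sends SIGINT and follows up with SIGKILL after the kill_timeout period (1600 ms unless configured otherwise, to my knowledge). The sketch below is one way to handle that signal. installGracefulShutdown and its options are hypothetical names, and the 1500 ms fallback is an assumption chosen to stay under the default kill timeout.

```javascript
// Sketch of a graceful-shutdown hook. `installGracefulShutdown` and its
// options are hypothetical names, not part of PM2 or Node.js.
function installGracefulShutdown(server, options = {}) {
  const { timeoutMs = 1500, exit = (code) => process.exit(code) } = options;

  const onSignal = () => {
    // Fallback: exit anyway if open connections do not drain in time.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref();

    // Stop accepting new connections; the callback fires once in-flight
    // requests have finished.
    server.close(() => {
      clearTimeout(timer);
      exit(0);
    });
  };

  process.once('SIGINT', onSignal);
  return onSignal;
}

// Usage with a real server (commented out so the sketch has no side effects):
// const http = require('node:http');
// const server = http.createServer((req, res) => res.end('ok')).listen(3000);
// installGracefulShutdown(server);
```

If your app needs longer to drain (for example, to close database pools), raise kill_timeout in the ecosystem file and the fallback timer together.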

Ensuring Persistence Across Server Reboots

A common pitfall for developers new to PM2 is discovering that their applications did not restart after the host server was rebooted (e.g., due to a system update or hardware failure). PM2 must be configured to hook into your server’s init system (like systemd or launchd).

To set up process persistence, execute two commands:

Generate the startup script:

pm2 startup

PM2 will analyze your OS and output a sudo command for you to copy and paste into your terminal. Run that generated command.

Save the current process list:

pm2 save

This freezes the current list of running applications into a dump file. On the next system boot, PM2 will read this file and resurrect your applications automatically. Remember to run pm2 save anytime you add or remove an application from PM2.

Process Resilience and Auto-Restart Strategies

PM2’s default behavior is to restart a process immediately if it crashes. Several options let you fine-tune this behavior.

Crash Loop Protection

If a process crashes repeatedly on startup (a missing dependency, a bad config file), PM2 can detect the loop and stop retrying:

{
  max_restarts: 10,
  min_uptime: "5s",
}

With these settings, PM2 considers a process “unstable” if it runs for fewer than 5 seconds before crashing. After 10 such unstable restarts, PM2 stops trying and marks the process as errored, preventing the server from burning CPU on a crash loop.
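
The counting rule can be modeled in a few lines. This is a toy model of the behavior described above, not PM2’s implementation: simulateRestarts is a hypothetical function, and uptimesMs is a made-up list of how long each run lasted before exiting.

```javascript
// Toy model of crash-loop detection; not PM2's implementation.
function simulateRestarts(uptimesMs, options = {}) {
  const { maxRestarts = 10, minUptimeMs = 5000 } = options;
  let unstableRestarts = 0;

  for (const uptime of uptimesMs) {
    // A run shorter than min_uptime counts as an unstable restart.
    if (uptime < minUptimeMs) unstableRestarts += 1;
    if (unstableRestarts >= maxRestarts) {
      return { status: 'errored', unstableRestarts };
    }
  }
  return { status: 'online', unstableRestarts };
}

const crashLoop = Array(12).fill(200);        // dies after 200 ms, every time
const occasional = [3600000, 200, 7200000];   // one quick crash between long runs

console.log(simulateRestarts(crashLoop).status);  // errored
console.log(simulateRestarts(occasional).status); // online
```

The second case shows why min_uptime matters: an app that crashes once in a while after hours of healthy uptime keeps being restarted, while one that dies on startup is stopped quickly.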

Memory Threshold Restarts

Node.js applications can develop slow memory leaks that are difficult to track down immediately. As a safety net, max_memory_restart tells PM2 to restart a process if its memory usage exceeds a given threshold:

{
  max_memory_restart: "500M",
}

This is not a fix for memory leaks (you should still profile and patch them), but it prevents a single leaking process from starving the server.
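
Conceptually, the check compares a process’s resident memory against the parsed threshold. The sketch below illustrates that comparison under two assumptions: that K, M, and G are binary units (1024-based), and that resident set size is the figure being measured. parseMemoryLimit and exceedsLimit are hypothetical helpers, not PM2 functions. Note that PM2 performs its check periodically rather than continuously, so a restart may lag slightly behind the moment the threshold is crossed.

```javascript
// Hypothetical helpers illustrating a memory-threshold check.
// Assumption: K, M and G are treated as 1024-based units.
const UNITS = { K: 1024, M: 1024 ** 2, G: 1024 ** 3 };

function parseMemoryLimit(limit) {
  const match = /^(\d+(?:\.\d+)?)([KMG])B?$/i.exec(String(limit).trim());
  if (!match) throw new Error(`unrecognized memory limit: ${limit}`);
  return Math.round(Number(match[1]) * UNITS[match[2].toUpperCase()]);
}

function exceedsLimit(limit, rssBytes = process.memoryUsage().rss) {
  return rssBytes > parseMemoryLimit(limit);
}

console.log(parseMemoryLimit('500M'));              // 524288000
console.log(exceedsLimit('500M', 600 * 1024 ** 2)); // true
console.log(exceedsLimit('1G', 600 * 1024 ** 2));   // false
```

When picking a threshold, leave headroom below the machine’s physical memory divided by the number of workers, since every cluster instance is checked against the same limit.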

Watch Mode

During development, watch mode automatically restarts the process whenever a file changes:

{
  watch: true,
  ignore_watch: ["node_modules", "logs", ".git"],
  watch_delay: 1000,
}

This is a convenience for local development. In production, watch mode should always be disabled.

Scheduled Restarts

For processes that degrade over time (long-running workers accumulating handles, for example), a cron-based restart can be a pragmatic solution:

{
  cron_restart: "0 3 * * *",   // Restart every day at 3:00 AM
}

Logs, Monitoring, and Diagnostics

Visibility into your application’s health is non-negotiable in production. PM2 handles standard output (stdout) and error logging (stderr) automatically, saving them to ~/.pm2/logs/.

Each process gets two files: <name>-out.log for standard output and <name>-error.log for standard error. You can override these paths in the ecosystem file:

{
  out_file: "/var/log/myapp/out.log",
  error_file: "/var/log/myapp/error.log",
  merge_logs: true,        // Merge logs from all cluster instances into one file
  log_date_format: "YYYY-MM-DD HH:mm:ss Z",
}

Setting merge_logs: true is particularly useful in cluster mode, where separate log files per worker can be difficult to correlate.

Real-time Monitoring

To view a live stream of all your application logs, run:

pm2 logs                  # All processes
pm2 logs api-server       # Specific process
pm2 logs --lines 200      # Last 200 lines

For a more comprehensive view, PM2 includes a terminal-based dashboard. Running pm2 monit opens an interface that displays CPU usage, memory consumption, log streams, and custom metrics for every running process in real time. For quick checks, pm2 list shows the key figures (CPU, memory, restarts, and uptime) in a compact table.

Preventing Disk Exhaustion with Log Rotation

By default, PM2 appends to log files indefinitely, which can eventually fill your server’s disk. To prevent this, install the pm2-logrotate module. Because PM2 has its own module system, you install it with PM2 rather than npm:

pm2 install pm2-logrotate

This module automatically rotates, compresses, and purges old log files based on configurable thresholds, ensuring your server remains healthy without manual log pruning.

The following commands set a few common pm2-logrotate options:

pm2 set pm2-logrotate:max_size 50M
pm2 set pm2-logrotate:retain 10
pm2 set pm2-logrotate:compress true

This rotates each log file when it exceeds 50 MB, keeps the last 10 rotated files, and compresses old files with gzip. For many deployments, this is sufficient. For larger operations where logs feed into centralized systems (ELK, Datadog, Splunk), you can disable PM2’s file logging entirely by setting both out_file and error_file to "/dev/null" and ship structured JSON logs to your collector instead.
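
A structured logger for that setup can be very small. The sketch below emits one JSON object per line and hands it to a write function, which stands in for whatever transport your collector expects (a socket, an HTTP client, a local agent). createLogger and the field names are illustrative choices, not a PM2 convention, and the in-memory sink is only there to keep the example self-contained.

```javascript
// Minimal structured logger sketch; names and fields are illustrative.
function createLogger(write, baseFields = {}) {
  const emit = (level) => (message, fields = {}) => {
    write(JSON.stringify({
      ...baseFields,
      ...fields,
      // Core keys come last so callers cannot overwrite them by accident.
      time: new Date().toISOString(),
      level,
      message,
    }));
  };
  return { info: emit('info'), warn: emit('warn'), error: emit('error') };
}

// Demo sink: collect lines in memory. In a real deployment this function
// would pass each line to your log shipper instead.
const lines = [];
const logger = createLogger((line) => lines.push(line), {
  service: 'payment-service',
  pid: process.pid,
});

logger.info('payment accepted', { orderId: 'A-1001' });
console.log(lines[0]);
```

Including the pid in every line makes it possible to tell cluster workers apart once their output lands in the same index.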

PM2 Plus

PM2 Plus (formerly Keymetrics) is PM2’s hosted monitoring service that collects metrics, exceptions, and deployment events from your PM2 processes and displays them in a web dashboard. It adds alerting (Slack, email, webhook), transaction tracing, remote actions (restart, pull, and reload from the browser), and memory/CPU profiling triggers. Connecting is a single command:

pm2 plus

For teams that already use Datadog, Prometheus, or Grafana, application metrics can be routed through those pipelines instead, typically via the vendor’s agent or a community-maintained exporter.

Custom Application Metrics

The @pm2/io library lets you expose application-specific metrics that appear in the PM2 dashboard and are reported to PM2 Plus:

const io = require('@pm2/io');
// Assumes `app` is an existing Express-style application (app.use middleware).

const requestCounter = io.counter({ name: 'Requests served' });
const responseLatency = io.histogram({
  name: 'Response latency',
  measurement: 'mean',
});

app.use((req, res, next) => {
  const start = Date.now();
  requestCounter.inc();
  res.on('finish', () => {
    responseLatency.update(Date.now() - start);
  });
  next();
});

You can define counters, histograms, gauges, and meters. These are lightweight and designed for production use.

Common PM2 Commands

For rapid day-to-day operations, keep these essential commands handy:

Command	Action
`pm2 start app.js`	Starts a process in the background.
`pm2 list`	Lists all active PM2 processes and their metrics.
`pm2 reload all`	Executes a zero-downtime rolling restart (requires Cluster Mode).
`pm2 monit`	Opens the terminal-based monitoring dashboard.
`pm2 logs`	Streams standard output and error logs.
`pm2 save`	Saves the current process list for automatic respawn on reboot.

Conclusion

PM2 is one of the most widely used tools for managing Node.js applications in production. It provides process monitoring, automatic restarts, clustering, and centralized configuration.

Mastering PM2 helps turn fragile, single-process Node.js scripts into resilient, multi-core services that can handle production traffic. By using ecosystem files, cluster mode, and a sound logging strategy, you can significantly reduce maintenance overhead and deployment anxiety.

Whether you are running a small API or a large backend service, PM2 simplifies the operational side of Node.js deployments.
