1. Node.js Process

  • Single Threaded: Node.js operates on a single thread using an event-driven, non-blocking I/O model.
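  • Event Loop: Coordinates asynchronous operations in Node.js. It runs on the main thread and invokes the callbacks for timers, I/O events, and other completed operations once the call stack is empty.
  • Call Stack: Executes functions in the order they are called. When an async operation is started, the work is handed to libuv or the operating system, and its callback is queued for the event loop once the operation completes.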
  • Memory: The process has its own memory heap and can execute JS code, load modules, and perform I/O operations.
  • Global Object: The process object in Node.js provides information about, and control over, the current process. It can handle signals, terminate the process, and report memory usage (process.memoryUsage()).
  • Concurrency: Although JavaScript runs on a single thread, Node.js achieves concurrency by handing I/O to the operating system and the libuv thread pool. The event loop runs each callback as its operation completes, so many operations can be in flight at once.
  • Main Thread: This is responsible for executing the JavaScript code. It interacts with the event loop to manage async callbacks.
  • process API: Contains properties like process.env, process.argv, and process.pid for inspecting the runtime environment, and methods like process.exit() and process.nextTick() for terminating the process and scheduling callbacks.
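The interplay between the call stack, process.nextTick(), and the event loop can be sketched as follows. This is a minimal illustration, not a full description of the event loop phases; the names order, info, and finished are made up for the example.

```javascript
// Record the order in which callbacks run relative to synchronous code.
const order = [];

setTimeout(() => order.push('timeout'), 0);      // queued for the timers phase of the event loop
process.nextTick(() => order.push('nextTick'));  // runs as soon as the current call stack unwinds
order.push('sync');                              // runs immediately, on the call stack

// A few read-only fields from the process API.
const info = {
  pid: process.pid,               // OS process id
  nodeBinary: process.argv[0],    // path of the node executable
  extraArgs: process.argv.slice(2),
};

// Resolve after both callbacks above have had a chance to run.
const finished = new Promise((resolve) => {
  setTimeout(() => resolve(order), 10);
});

finished.then((result) => {
  console.log(result.join(' -> ')); // sync -> nextTick -> timeout
  console.log(`pid ${info.pid}`);
});
```

Synchronous code always finishes first, the nextTick callback runs before the event loop continues, and the timer callback runs last.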

2. Threads in Node.js

  • Single Threaded: Node.js is single-threaded by default, meaning only one thread is used to handle the execution of JavaScript code.
  • Libuv: The library that provides the event loop and asynchronous I/O. It uses an internal thread pool (4 threads by default, configurable through the UV_THREADPOOL_SIZE environment variable) for operations that have no non-blocking OS equivalent, such as file system operations and dns.lookup(), but only the main thread executes JS code.
  • Asynchronous Execution: I/O operations are carried out in the background by the operating system and the libuv thread pool, and timers are tracked by the event loop. Their callbacks run on the main thread when they are ready, which lets the main thread remain non-blocking.
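As a rough illustration of the thread pool at work, the sketch below uses the asynchronous crypto.pbkdf2(), whose hashing runs on a libuv pool thread. The helper deriveKey, the salt, and the iteration count are arbitrary choices for the example.

```javascript
const crypto = require('crypto');

const events = [];

// Hypothetical helper: wraps the callback-style API in a Promise.
function deriveKey(password) {
  return new Promise((resolve, reject) => {
    // The hashing work runs on a libuv pool thread, not on the main thread.
    crypto.pbkdf2(password, 'example-salt', 50000, 32, 'sha256', (err, key) => {
      if (err) return reject(err);
      events.push('key ready'); // the callback itself runs back on the main thread
      resolve(key);
    });
  });
}

const pending = deriveKey('secret');
events.push('main thread continues'); // recorded before the callback above fires

pending.then((key) => {
  console.log(events);      // [ 'main thread continues', 'key ready' ]
  console.log(key.length);  // 32
});
```

The main thread keeps running while the key is derived, and only the final callback is scheduled back onto it.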

3. Worker Threads in Node.js

  • What Are Worker Threads?: Introduced in Node.js v10.5.0, worker threads allow the creation of additional threads for parallel execution of JavaScript code.
  • When to Use: Worker threads are useful for CPU-bound tasks, such as complex computations, which would block the event loop if executed on the main thread.
  • Execution Model: Each worker thread runs in its own isolated context with its own V8 instance and event loop, so ordinary JavaScript objects are not shared between threads. Communication happens via message passing (using the MessagePort API), which copies or transfers data; memory can be shared explicitly through SharedArrayBuffer.
  • Thread Management: Worker threads are managed through the worker_threads module, which lets you create new threads (new Worker()), pass messages between threads (worker.postMessage()), and listen for messages (worker.on('message')).
  • Example:
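const { Worker } = require('worker_threads');

// Start the worker; worker-script.js must listen on parentPort to receive messages.
const worker = new Worker('./worker-script.js');
worker.postMessage('Hello from main thread!');

// Messages posted by the worker arrive here.
worker.on('message', (message) => {
  console.log(`Received from worker: ${message}`);
});

// Uncaught exceptions inside the worker surface as 'error' events.
worker.on('error', (err) => {
  console.error(`Worker failed: ${err.message}`);
});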
  • Performance: Worker threads avoid blocking the main event loop, improving the performance of Node.js applications when handling CPU-bound tasks.
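A single-file variant of the idea, offloading a CPU-bound computation, might look like the sketch below. It inlines the worker source as a string and uses the eval option of the Worker constructor so that no separate script file is needed; fibInWorker and workerSource are hypothetical names.

```javascript
const { Worker } = require('worker_threads');

// Source of the worker, inlined as a string so the sketch is a single file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // Deliberately naive recursive Fibonacci: pure CPU-bound work.
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData));
`;

// Hypothetical helper: computes fib(n) in a worker and resolves with the result.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

const result = fibInWorker(20);
result.then((value) => console.log(`fib(20) = ${value}`)); // fib(20) = 6765
```

While the worker computes, the main event loop stays free to handle other callbacks.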

4. Child Process in Node.js

  • What is Child Process?: Node.js allows creating new processes that run independently but can communicate with the parent process. The child_process module enables spawning new processes.
  • Types of Child Processes:
    • spawn(): Spawns a new process and allows communication through streams (stdout, stdin, stderr). Suitable for long-running processes.
    • exec(): Runs a command in a shell and buffers the output. Useful for short-lived commands with small output.
    • execFile(): Similar to exec(), but it executes a file directly without spawning a shell, which avoids shell-injection risks.
    • fork(): A specialized version of spawn() for creating child Node.js processes. It automatically sets up an IPC (Inter-Process Communication) channel between parent and child.
  • Example (fork):
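const { fork } = require('child_process');

// child-script.js is expected to use process.on('message') and process.send().
const child = fork('child-script.js');
child.send('Hello from parent!');

// Messages sent by the child over the IPC channel arrive here.
child.on('message', (message) => {
  console.log(`Received from child: ${message}`);
});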
  • Use Cases: Child processes are useful for:
    • Running shell commands or external programs.
    • Performing CPU-intensive tasks in parallel.
    • Offloading long-running tasks to a separate process.
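  • Communication: A forked child communicates with the parent through message passing over the IPC channel (child.send() in the parent, process.send() in the child). Other child processes exchange data through their stdin, stdout, and stderr streams.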
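The two communication styles can be tried without any extra script files. The sketch below runs the current Node.js binary (process.execPath) with -e, assuming that is acceptable for illustration; runNodeSnippet and echoOverIpc are hypothetical helpers.

```javascript
const { execFile, spawn } = require('child_process');

// execFile(): run a program without a shell and buffer its stdout.
function runNodeSnippet(code) {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', code], (err, stdout) => {
      if (err) return reject(err);
      resolve(stdout.trim());
    });
  });
}

// spawn() with an 'ipc' stdio entry: the same kind of channel fork() sets up.
function echoOverIpc(message) {
  return new Promise((resolve, reject) => {
    // Child side: reply to each message with its upper-cased text.
    const childCode = "process.on('message', (m) => process.send(m.toUpperCase()));";
    const child = spawn(process.execPath, ['-e', childCode], {
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    child.once('error', reject);
    child.once('message', (reply) => {
      child.disconnect(); // close the channel so the child can exit
      resolve(reply);
    });
    child.send(message);
  });
}

runNodeSnippet('console.log(6 * 7)').then((out) => console.log(out)); // 42
echoOverIpc('hello').then((reply) => console.log(reply));             // HELLO
```

The first helper only ever sees buffered output, while the second exchanges structured messages over the IPC channel.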

5. Clustering in Node.js

  • What is Clustering?: Clustering allows Node.js to take advantage of multi-core systems by creating multiple instances (workers) of the Node.js application.
  • Why Clustering?: Since Node.js runs on a single thread by default, clustering helps distribute incoming connections across multiple workers, effectively utilizing multiple CPU cores.
  • Master-Worker Model:
    • Master Process (called the primary process since Node.js v16): Manages worker processes and distributes incoming connections among them.
    • Worker Processes: Each worker is a separate Node.js process with its own memory and event loop, handling requests independently.
  • Communication: The master process and workers communicate through IPC (Inter-Process Communication) to share information or pass messages.
  • Cluster Module: Provides APIs for setting up clusters.
  • Example:
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;
 
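// On Node.js v16+, cluster.isPrimary replaces the deprecated cluster.isMaster.
if (cluster.isMaster) {
  // Fork one worker per CPU core.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }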
 
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died`);
    cluster.fork(); // Restart a worker if one dies
  });
} else {
  // Every worker listens on the same port; the cluster module shares it among them.
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello, world!\n');
  }).listen(8000);
}
 
  • Load Balancing: By default (on every platform except Windows), clustering uses a round-robin approach: the master process accepts incoming connections and distributes them to the workers in turn.
  • Scaling: Clustering is a simple way to scale a Node.js application across the CPU cores of a single machine so that it can handle a larger number of concurrent users.
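The load-balancing policy can also be set explicitly. A minimal sketch, assuming it runs in the master (primary) process before any worker is forked; policyName is a hypothetical helper:

```javascript
const cluster = require('cluster');

// Must be assigned before the first cluster.fork(); the setting is frozen afterwards.
// SCHED_RR:   the primary accepts connections and hands them out round-robin.
// SCHED_NONE: distribution is left to the operating system.
cluster.schedulingPolicy = cluster.SCHED_RR;

function policyName() {
  return cluster.schedulingPolicy === cluster.SCHED_RR ? 'round-robin' : 'os-default';
}

console.log(`Scheduling policy: ${policyName()}`); // Scheduling policy: round-robin
```

The same choice can be made from outside the program with the NODE_CLUSTER_SCHED_POLICY environment variable (values rr or none).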

6. Comparing Worker Threads, Child Processes, and Clustering

  • Worker Threads:
    • Best for CPU-intensive tasks, especially ones that benefit from sharing memory between threads (via SharedArrayBuffer).
    • Run within the same process but in different threads.
    • Communication via message passing (worker.postMessage).
    • Low overhead compared to creating a new process.
  • Child Processes:
    • Useful for running external programs or long-running tasks outside the main Node.js process.
    • Each child process runs independently with its own memory and event loop.
    • Communication via IPC or standard streams (stdin, stdout).
    • Higher overhead than threads due to separate memory space and OS-level process management.
  • Clustering:
    • Helps scale Node.js across CPU cores by creating multiple worker processes.
    • Best suited for network servers whose request handling can be spread across several processes.
    • Each worker is an independent process, so no shared state between workers.
    • Communication via IPC channels.
    • Useful for load balancing and high concurrency scenarios.
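The shared-data point that sets worker threads apart from the two process-based options can be sketched with a SharedArrayBuffer. This is an illustrative example only; incrementInWorker and the counter layout are assumptions made for the sketch.

```javascript
const { Worker } = require('worker_threads');

// One 4-byte shared buffer, viewed as a single 32-bit integer counter.
const shared = new SharedArrayBuffer(4);
const counter = new Int32Array(shared);

// Worker source inlined as a string so the sketch is a single file.
const workerSource = `
  const { workerData } = require('worker_threads');
  const counter = new Int32Array(workerData); // same memory as in the main thread
  for (let i = 0; i < 1000; i++) Atomics.add(counter, 0, 1);
`;

// Hypothetical helper: resolves once the worker has finished its increments.
function incrementInWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: shared });
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code === 0) resolve();
      else reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

// Two workers increment the same counter concurrently.
const total = Promise.all([incrementInWorker(), incrementInWorker()])
  .then(() => Atomics.load(counter, 0));

total.then((n) => console.log(`counter = ${n}`)); // counter = 2000
```

Child processes and cluster workers cannot do this: each has its own memory, so data must be serialized and sent over IPC or streams.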