Signal Handling and the PID 1 Problem

Why `Ctrl+C` Sometimes Does Nothing

You run a container interactively and press Ctrl+C. Nothing happens. You switch to another terminal, run docker stop, and wait; after 10 seconds Docker kills the container forcibly. This is not a Docker bug. It is the PID 1 problem, and understanding it will save you from data corruption and dropped connections in production.

How Linux Signal Handling Works

In Linux, every process can receive signals: small integer messages sent by the kernel or by other processes. Ctrl+C in an attached terminal sends SIGINT (signal 2); the two signals that matter most when stopping a container are:

SIGTERM (signal 15): "Please shut down gracefully." The process can catch this, finish in-flight requests, flush buffers, and exit cleanly.
SIGKILL (signal 9): "Die immediately." Cannot be caught or ignored. The kernel terminates the process instantly, with no cleanup.
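
The difference between the two signals can be checked from inside a process. The following is a minimal Python sketch (the helper name can_install_handler is made up for illustration) that tries to register a handler for a given signal. On Linux the kernel rejects the attempt for SIGKILL, which is what "cannot be caught" means in practice.

```python
import signal


def can_install_handler(signum):
    """Return True if this process may install its own handler for signum."""
    try:
        previous = signal.signal(signum, lambda received, frame: None)
    except OSError:
        # The kernel refuses handlers for SIGKILL (and SIGSTOP).
        return False
    # Put the earlier disposition back so the check has no lasting effect.
    if previous is not None:
        signal.signal(signum, previous)
    return True


if __name__ == "__main__":
    print("SIGTERM catchable:", can_install_handler(signal.SIGTERM))
    print("SIGKILL catchable:", can_install_handler(signal.SIGKILL))
```

Note that signal.signal() must be called from the main thread of the Python process.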

When you run docker stop <container>, Docker sends SIGTERM to PID 1 inside the container. If PID 1 doesn't exit within the stop timeout (default: 10 seconds), Docker sends SIGKILL.
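
The same escalation can be sketched against an ordinary child process. This is not Docker's implementation, only an illustration of the sequence, assuming the target is a subprocess.Popen object; stop_like_docker and its return strings are hypothetical.

```python
import signal
import subprocess
import sys


def stop_like_docker(proc, timeout=10.0):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL."""
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
        return "exited after SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: cannot be caught or ignored
        proc.wait()
        return "killed with SIGKILL"


if __name__ == "__main__":
    # A child that ignores SIGTERM, standing in for a misbehaving PID 1.
    stubborn = (
        "import signal, time;"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
        "print('ready', flush=True); time.sleep(60)"
    )
    child = subprocess.Popen([sys.executable, "-c", stubborn], stdout=subprocess.PIPE)
    child.stdout.readline()  # wait until the child has set up its signal handling
    print(stop_like_docker(child, timeout=2.0))
```

A process that exits during the wait gets its chance to clean up; one that is still running when the timeout expires does not.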

The Special Responsibility of PID 1

Process ID 1 has a unique role in Linux: it is the init process. It has two responsibilities relevant here:

Signal forwarding: It must receive signals and forward them to child processes.
Zombie reaping: When a child process exits, it remains a "zombie" until its parent calls wait(). If the parent dies first, its children are re-parented to PID 1, which is then expected to call wait() for them. If PID 1 never does, zombies accumulate in the process table.
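
Reaping itself is a small loop around wait(). Here is a minimal Python sketch of what an init process does whenever children may have exited; reap_exited_children is a hypothetical helper, and os.waitstatus_to_exitcode assumes Python 3.9 or newer.

```python
import os


def reap_exited_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, exit_code) tuples. A negative exit_code means
    the child was terminated by that signal number.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, os.waitstatus_to_exitcode(status)))
    return reaped
```

An init would typically run a loop like this whenever it receives SIGCHLD. An application that runs as PID 1 and never does so leaves every orphaned process in the container as a zombie.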

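The problem: most application processes (Node.js, Python, Java, nginx) are not designed to be init systems. The kernel also treats PID 1 specially: a signal whose action is still the default, including SIGTERM and SIGINT, is not delivered to PID 1 at all, so an application that never installs a handler appears to ignore SIGTERM entirely. Applications that do install one often fail to forward the signal to their child processes or to reap orphans.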

Shell Form Makes It Worse

If you use shell form in your Dockerfile:

# Shell form — your app is NOT PID 1
CMD python app.py

Docker actually runs /bin/sh -c "python app.py". The shell (/bin/sh) becomes PID 1, and your Python process is typically its child. When Docker sends SIGTERM to PID 1, the shell has no handler for it, so the signal is ignored and never reaches your Python process. The shell keeps waiting for its child, and 10 seconds later Docker sends SIGKILL, which takes down the whole container with no chance to clean up.

The Solutions

Option 1: Always use exec form

# Exec form — your app IS PID 1
CMD ["python", "app.py"]

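Your application becomes PID 1 directly, so SIGTERM is addressed to it rather than to a shell. Because the kernel only delivers a signal to PID 1 when a handler is installed for it, the application must register a SIGTERM handler itself (Option 4) or run under an init (Options 2 and 3). Exec form removes the shell from the signal path; it does not by itself guarantee a graceful shutdown.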
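
A startup check can make this situation visible. The following is a minimal Python sketch, assuming a Linux container; sigterm_would_be_ignored is a hypothetical helper, and the rule it encodes is that PID 1 only receives signals for which it has installed a handler.

```python
import os
import signal


def sigterm_would_be_ignored(pid=None):
    """Return True if this process is PID 1 and has no SIGTERM handler."""
    if pid is None:
        pid = os.getpid()
    disposition = signal.getsignal(signal.SIGTERM)
    # A Python-level handler is a callable; SIG_DFL, SIG_IGN and None are not.
    has_handler = callable(disposition)
    return pid == 1 and not has_handler


if __name__ == "__main__":
    if sigterm_would_be_ignored():
        print("warning: running as PID 1 without a SIGTERM handler; "
              "docker stop will end in SIGKILL")
```

The pid parameter exists only so the check can be exercised outside a container.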

Option 2: Use tini as a minimal init

tini is a tiny (~25 KB) init process designed specifically for containers. It handles signal forwarding and zombie reaping correctly:

FROM ubuntu:22.04

# Install tini and a Python interpreter (the base image ships neither)
RUN apt-get update && apt-get install -y --no-install-recommends tini python3 && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY app.py .

# tini becomes PID 1, your app is its child
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["python3", "app.py"]
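
To make concrete what an init such as tini does with signals, here is a rough Python sketch of the forwarding half only. It is not tini's implementation and it does not reap adopted orphans; run_as_init and FORWARDED are hypothetical names, and the 128 + signal exit code follows the usual shell convention.

```python
import signal
import subprocess
import sys

# Signals this sketch passes through to the child (an illustrative subset).
FORWARDED = (signal.SIGTERM, signal.SIGINT, signal.SIGHUP)


def run_as_init(argv):
    """Start argv as a child, forward signals to it, and return its exit code."""
    child = subprocess.Popen(argv)

    def forward(signum, frame):
        # Relay the signal to the child if it is still running.
        if child.poll() is None:
            child.send_signal(signum)

    # Note: a signal arriving before this line is not forwarded by the sketch.
    previous = {sig: signal.signal(sig, forward) for sig in FORWARDED}
    try:
        code = child.wait()
    finally:
        # Restore the earlier handlers so the function has no lasting effect.
        for sig, handler in previous.items():
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
    # Popen reports "killed by signal N" as -N; map it to 128 + N.
    return code if code >= 0 else 128 - code


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_as_init(sys.argv[1:]))
```

Because the forwarding process has explicit handlers installed, the kernel delivers SIGTERM to it even when it runs as PID 1. The child is an ordinary process, so the default action applies to it even if it has no handler of its own.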

Option 3: Use the --init flag at runtime

Docker can inject tini automatically without modifying your Dockerfile:

docker run --init my-image

This is useful when you don't control the Dockerfile (e.g., third-party images).

Option 4: Handle SIGTERM in application code

The most robust solution, and one that complements the options above, is to handle SIGTERM in the application itself:

import signal
import sys

def handle_sigterm(signum, frame):
    print("Received SIGTERM, shutting down gracefully...")
    # flush buffers, close DB connections, finish in-flight requests
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_sigterm)
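
Calling sys.exit() inside the handler is fine for simple scripts. A common variation, sketched here under the assumption of a simple polling worker loop (serve and handle_one are hypothetical names), only sets a flag in the handler and lets the loop finish the current unit of work before returning.

```python
import signal
import time


def serve(handle_one, poll_interval=0.1):
    """Call handle_one() repeatedly until SIGTERM arrives.

    Returns the number of completed units of work.
    """
    stop_requested = False

    def request_shutdown(signum, frame):
        # Keep the handler tiny: record the request and return.
        nonlocal stop_requested
        stop_requested = True

    signal.signal(signal.SIGTERM, request_shutdown)

    completed = 0
    while not stop_requested:
        handle_one()  # never interrupted halfway by the shutdown request
        completed += 1
        time.sleep(poll_interval)
    # Cleanup (flush buffers, close connections) would go here.
    return completed
```

With this shape, the time needed to shut down is bounded by one unit of work plus poll_interval, which is the figure to compare against the stop timeout.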

Interview Tip

"What happens when you run docker stop on a container?" is a deceptively deep interview question. A surface answer is "it stops the container." A strong answer covers the full sequence: Docker sends SIGTERM to PID 1 → waits for the stop timeout (-t, default 10 seconds) → sends SIGKILL if the container is still running. Then explain the PID 1 problem, why shell form breaks signal delivery, and the tini solution. Mentioning that Kubernetes uses the same mechanism (SIGTERM → terminationGracePeriodSeconds → SIGKILL) shows you understand the production implications.

Key Point: In Kubernetes, the terminationGracePeriodSeconds field (default: 30 seconds) is the equivalent of Docker's stop timeout. If your application doesn't handle SIGTERM, every rolling deployment and pod eviction will wait out the grace period and end in a forceful kill. At scale, this means dropped HTTP connections, incomplete database transactions, and lost or duplicated message queue acknowledgments. Proper signal handling is not optional in production.
