Docker Architecture Deep Dive

January 26, 2026 · Software Engineering · 3 min read
Tecker Yu
AI Native Cloud Engineer × Part-time Investor

Namespaces

Namespaces are a Linux kernel mechanism for partitioning resources such as process trees, network interfaces, mount points, and inter-process communication, so that a process sees only its own instance of each. They are the core technology behind process isolation: by choosing which namespaces to create when starting a new process, we decide which resources are isolated from the host machine.

docker run and docker start set up not only the PID namespace but also network, IPC, UTS, and mount namespaces for the container; a user namespace is used only when user-namespace remapping is enabled. Together these isolate the container from the host machine’s processes and network.

This solves process, network, and filesystem isolation.
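
One way to see this isolation is to compare the namespace identifiers of two processes. On Linux, each entry under /proc/<pid>/ns is a symbolic link whose target looks like net:[4026531840]; two processes share a namespace exactly when those numbers match. The sketch below is a minimal illustration under that assumption. The function names are hypothetical, and reading another process's entries may require elevated privileges.

```python
import os
import re

_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(link: str) -> tuple[str, int]:
    """Split a link target such as 'net:[4026531840]' into (kind, inode)."""
    match = _LINK.match(link)
    if match is None:
        raise ValueError(f"not a namespace link: {link!r}")
    return match.group(1), int(match.group(2))


def read_namespaces(pid: str = "self") -> dict[str, int]:
    """Return {entry name: inode} for one process. Linux only."""
    base = f"/proc/{pid}/ns"
    result = {}
    for name in os.listdir(base):
        _, inode = parse_ns_link(os.readlink(os.path.join(base, name)))
        result[name] = inode  # keyed by entry name, e.g. 'pid', 'net', 'mnt'
    return result


def isolated_kinds(host: dict[str, int], container: dict[str, int]) -> list[str]:
    """Names of the namespaces in which the two processes differ."""
    common = host.keys() & container.keys()
    return sorted(name for name in common if host[name] != container[name])


if __name__ == "__main__":
    if os.path.isdir("/proc/self/ns"):
        print(read_namespaces())
```

Calling isolated_kinds with the host shell's namespaces and those of a container's main process would list the namespaces that separate the two.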

Network

By default, every container started with docker run gets its own network namespace. Docker provides four network modes: Host, Container, None, and Bridge. In the default Bridge mode, Docker also assigns an IP address to each container. When the Docker daemon starts on the host, it creates a virtual bridge named docker0, and every container subsequently launched on that host connects to this bridge by default.

By default, Docker creates a pair of virtual network interfaces (a veth pair) for each new container. The pair forms a data channel: one end is placed inside the container and the other is attached to the docker0 bridge.
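
The address assignment on the bridge can be pictured as handing out addresses from one subnet. The sketch below is only a model: it assumes the commonly seen default subnet 172.17.0.0/16 for docker0 (this is configurable), gives the first host address to the bridge itself, and allocates the rest in order. BridgeAllocator is a hypothetical name, not a Docker API, and Docker's real allocator is more involved.

```python
import ipaddress


class BridgeAllocator:
    """Hand out addresses from a bridge subnet; the first host address is the gateway."""

    def __init__(self, cidr: str = "172.17.0.0/16"):
        self.network = ipaddress.ip_network(cidr)
        hosts = iter(self.network.hosts())
        self.gateway = next(hosts)  # address held by the bridge itself
        self._free = hosts          # remaining addresses, in ascending order
        self.leases = {}            # container name -> assigned address

    def attach(self, container: str):
        """Return the container's address, allocating one on first attach."""
        if container not in self.leases:
            self.leases[container] = next(self._free)
        return self.leases[container]


if __name__ == "__main__":
    bridge = BridgeAllocator()
    print("gateway", bridge.gateway)
    print("web", bridge.attach("web"))
    print("db", bridge.attach("db"))
```

Attaching the same container twice returns the same lease, which mirrors the idea that the address belongs to the container's end of the veth pair for as long as it stays connected.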

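When a container needs to expose a service to the host machine, Docker uses the IP address assigned to the container and appends a new rule to iptables. The rule lives in the NAT table and is reached from the PREROUTING chain; it rewrites the destination of packets arriving on the published host port to the container’s IP address and port (DNAT).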
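
To make the port-forwarding rule concrete, the sketch below builds the argument list for an iptables DNAT rule of the kind described above. It assumes IPv4, a chain named DOCKER in the nat table, and the docker0 bridge; the exact rules Docker installs vary by version and configuration, so treat the shape as illustrative. dnat_rule is a hypothetical helper and does not run iptables.

```python
import ipaddress


def dnat_rule(host_port: int, container_ip: str, container_port: int,
              proto: str = "tcp", bridge: str = "docker0") -> list[str]:
    """Build an iptables argv that forwards host_port to container_ip:container_port."""
    if proto not in ("tcp", "udp"):
        raise ValueError(f"unsupported protocol: {proto}")
    for port in (host_port, container_port):
        if not 0 < port < 65536:
            raise ValueError(f"port out of range: {port}")
    ip = ipaddress.IPv4Address(container_ip)  # raises on a malformed address
    return [
        "iptables", "-t", "nat", "-A", "DOCKER",
        "!", "-i", bridge,                      # skip traffic that originates on the bridge
        "-p", proto, "--dport", str(host_port),
        "-j", "DNAT", "--to-destination", f"{ip}:{container_port}",
    ]


if __name__ == "__main__":
    print(" ".join(dnat_rule(8080, "172.17.0.2", 80)))
```

Returning a list rather than a string keeps the arguments safe to pass to subprocess.run without shell quoting, should the rule ever be applied for real.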

libnetwork

libnetwork provides the implementation for connecting containers to one another. It also defines the Container Network Model, which gives applications a consistent programming interface and network-layer abstraction.

Mount Points

Mount namespaces solve the problem of file and directory access. To give a new process an isolated mount namespace, CLONE_NEWNS must be passed to the clone function. The child process then gets its own copy of the parent process’s mount points. Without this flag, the child shares the parent’s mount namespace, so any mount or unmount it performs is visible to the parent process and the rest of the host.
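
The flags passed to clone are bit masks that can be combined, one bit per namespace kind. The sketch below shows how such a mask is assembled and decoded. The numeric values are the ones defined in the Linux header linux/sched.h; the helper names and the short kind labels are hypothetical, and the code only computes masks without creating any namespace.

```python
# Values from <linux/sched.h>.
CLONE_NEWNS = 0x00020000      # mount namespace
CLONE_NEWCGROUP = 0x02000000  # cgroup namespace
CLONE_NEWUTS = 0x04000000     # hostname and domain name
CLONE_NEWIPC = 0x08000000     # System V IPC and POSIX message queues
CLONE_NEWUSER = 0x10000000    # user and group IDs
CLONE_NEWPID = 0x20000000     # process IDs
CLONE_NEWNET = 0x40000000     # network stack

FLAGS = {
    "mnt": CLONE_NEWNS,
    "cgroup": CLONE_NEWCGROUP,
    "uts": CLONE_NEWUTS,
    "ipc": CLONE_NEWIPC,
    "user": CLONE_NEWUSER,
    "pid": CLONE_NEWPID,
    "net": CLONE_NEWNET,
}


def clone_flags(*kinds: str) -> int:
    """OR together the flags for the requested namespace kinds."""
    mask = 0
    for kind in kinds:
        mask |= FLAGS[kind]
    return mask


def describe(mask: int) -> list[str]:
    """List the namespace kinds whose bit is set in mask."""
    return sorted(kind for kind, bit in FLAGS.items() if mask & bit)


if __name__ == "__main__":
    mask = clone_flags("mnt", "pid", "net")
    print(hex(mask), describe(mask))
```

On Linux with Python 3.12 or later, a mask like this can be passed to os.unshare, which needs sufficient privileges to succeed.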

To start, a container must be given a root filesystem (rootfs). The container process is created inside this filesystem, and every binary it executes must come from it.

To ensure that the container process cannot access other directories on the host machine, we also need to change the root of the directory tree visible to the process. This is done with the pivot_root or chroot system calls, which libcontainer invokes on the container’s behalf.

CGroups

CGroups solve the problem of isolating physical resources, which namespaces do not address. Linux CGroups allocate resources such as CPU, memory, and network bandwidth to a group of processes.

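CGroups are organized as a hierarchy and exposed through a virtual filesystem: the kernel backs the generic VFS interfaces with CGroup-specific implementations, so each group appears as a directory and each setting as a file.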

In CGroups, every task is a system process, and a CGroup is a group of processes divided according to some criterion. All resource control is applied per CGroup. A process can join or leave a CGroup at any time.

There are several ways to add a process to a child CGroup; the most direct is to write its pid to the tasks file (or cgroup.procs) in that child node’s directory.
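
Because CGroups are driven through files, joining a group is an ordinary file write. The sketch below assumes the cgroup.procs membership file and the cgroup v2 memory.max limit file; on a cgroup v1 hierarchy the limit file would be memory.limit_in_bytes instead. The function names are hypothetical, and the demo writes into a temporary directory rather than a real CGroup filesystem, so it runs without root.

```python
import tempfile
from pathlib import Path


def add_process(cgroup_dir, pid: int, procs_file: str = "cgroup.procs") -> None:
    """Move a process into the group by writing its pid to the membership file."""
    # On a real cgroup filesystem each write moves exactly one pid.
    (Path(cgroup_dir) / procs_file).write_text(f"{pid}\n")


def set_memory_limit(cgroup_dir, limit_bytes: int,
                     limit_file: str = "memory.max") -> None:
    """Cap the memory of every process in the group (cgroup v2 file name assumed)."""
    (Path(cgroup_dir) / limit_file).write_text(f"{limit_bytes}\n")


def demo() -> dict[str, str]:
    """Exercise both helpers against a scratch directory and return what was written."""
    with tempfile.TemporaryDirectory() as scratch:
        add_process(scratch, 1234)
        set_memory_limit(scratch, 256 * 1024 * 1024)
        return {
            "cgroup.procs": (Path(scratch) / "cgroup.procs").read_text(),
            "memory.max": (Path(scratch) / "memory.max").read_text(),
        }


if __name__ == "__main__":
    print(demo())
```

Pointed at a real directory under /sys/fs/cgroup, the same writes would take effect in the kernel, provided the caller has permission to modify that group.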

UnionFS

UnionFS solves the image packaging problem. A Docker image is essentially an archive: a set of files packaged in layers.

Every image in Docker consists of a series of read-only layers. Each Dockerfile instruction that changes the filesystem, such as RUN, COPY, or ADD, creates a new layer on top of the existing read-only layers.

Difference between containers and images: an image is always read-only, while a container is an image plus a writable layer on top. As a result, the same image can back multiple containers.
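
The relationship between read-only layers and a per-container writable layer can be modelled in memory. The sketch below is an analogy, not how a union filesystem is implemented: it represents each layer as a dictionary from path to content and uses collections.ChainMap, which looks keys up from the first mapping to the last. The Container class and the WHITEOUT marker are hypothetical names; real storage drivers operate on directories and record deletions with special whiteout entries.

```python
from collections import ChainMap

WHITEOUT = object()  # marker recording that a path was deleted in the writable layer


class Container:
    """An image (shared read-only layers) plus one private writable layer."""

    def __init__(self, image_layers):
        # image_layers is ordered bottom to top. ChainMap searches left to right,
        # so the writable layer goes first, followed by the layers top to bottom.
        self.writable = {}
        self._view = ChainMap(self.writable, *reversed(image_layers))

    def read(self, path):
        value = self._view.get(path, WHITEOUT)
        if value is WHITEOUT:
            raise FileNotFoundError(path)
        return value

    def write(self, path, data):
        self.writable[path] = data  # image layers are never modified

    def delete(self, path):
        self.read(path)                  # raises if the path does not exist
        self.writable[path] = WHITEOUT   # hide the lower-layer copy


if __name__ == "__main__":
    image = [{"/etc/os-release": "debian"}, {"/app/main.py": "v1"}]
    a, b = Container(image), Container(image)
    a.write("/app/main.py", "v2")
    print(a.read("/app/main.py"), b.read("/app/main.py"))
```

Two containers built from the same layer list share the image data, yet a write or delete in one never shows up in the other, because each change lands only in that container's own writable layer.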
