Docker Engine from Docker, Inc. has been at the forefront of container technology. It was so synonymous with containers that, for a long time, "Docker" and "containers" meant the same thing. That held until Kubernetes and the OCI arrived, and confusion has reigned since. Even so, Docker Engine includes so many UX enhancements that it still feels like magic today, and it remains a recommended option for anyone starting to get their hands dirty with container technology. What we call Docker Engine, or simply Docker, is composed of many small components tied together, much like a car engine. Note that Docker Engine is itself just one component in the set of tools included in Docker Desktop, which is a separate world of its own.
At the time of writing, the major components that make up the Docker Engine are the Docker daemon, containerd, runc, and various plugins for things such as networking and storage. Together, these tools are used to create and run containers.
It has been really hard to find an in-depth description of the Docker component architecture that explains how all these components are tied together, as it is not included in the official documentation either. However, the picture below can be used to understand these components:
Again, it is worth noting that this shim is entirely different from dockershim, which Kubernetes uses to interact with dockerd and which will be explained later. Dockershim itself was newly deprecated in Kubernetes v1.20, the latest version at the time of writing.
Before Docker 1.11 and Before that…
The notions that we have come to associate with containers (lightweight, easy to start, fast, easily scalable, etc.) were not achieved in a single day. Containers are fundamentally composed of several underlying kernel primitives: namespaces (who you are allowed to talk to), cgroups (the amount of resources you are allowed to use), and LSMs (Linux Security Modules: what you are allowed to do). Together, these kernel primitives allow us to set up secure, isolated, and metered execution environments for our processes. These concepts have been around for a long time and have been incorporated into successive Linux kernel versions.
It may also be worth mentioning at this point that Linux containers differ from Solaris Zones or BSD Jails, since they are assembled from discrete kernel features such as cgroups, namespaces, SELinux, and more. Something interesting and worth watching is here and here.
In the beginning… there was LXC. Docker was a monolith then or, more precisely, the Docker daemon was a monolith! That is no secret today. In its defense, it was a very small yet very ambitious project at the time, and no one knew how it would be adopted. When Docker was first released, the Docker Engine had two major components:
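To make two of these primitives a little more concrete, here is a minimal sketch that inspects the namespaces and cgroups of the current process through `/proc`. It assumes a Linux host with `/proc` mounted; the function names (`list_namespaces`, `parse_cgroup_line`, `read_cgroups`) are made up for this illustration, and LSMs are not covered.

```python
import os


def list_namespaces(pid="self"):
    """Map namespace name -> kernel identifier for a process.

    Returns an empty dict when /proc is unavailable (for example, not on Linux).
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces
    for name in names:
        try:
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable by this user
    return namespaces


def parse_cgroup_line(line):
    """Split one /proc/<pid>/cgroup line into (hierarchy_id, controllers, path)."""
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    return int(hierarchy_id), [c for c in controllers.split(",") if c], path


def read_cgroups(pid="self"):
    """Return the parsed cgroup memberships of a process, or [] without /proc."""
    try:
        with open(f"/proc/{pid}/cgroup") as handle:
            return [parse_cgroup_line(line) for line in handle if line.strip()]
    except OSError:
        return []


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"namespace {name:10s} {ident}")
    for hierarchy_id, controllers, path in read_cgroups():
        label = ",".join(controllers) or "(unified)"
        print(f"cgroup {hierarchy_id} {label} {path}")
```

Running this on the host and then inside a container should show different namespace identifiers and a different cgroup path, which is roughly what "isolated and metered" means in practice.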
- The Docker daemon (or just daemon)
- LXC
The Docker daemon was a monolithic binary. It contained all of the code for the Docker client, the Docker API, the container runtime, image builds, and much more. LXC provided the daemon with access to the fundamental building-blocks of containers that existed in the Linux kernel.
But LXC was always an issue. It was hard to understand and implement and, perhaps more importantly, it was Linux-specific. That Linux-specific nature eventually became a hindrance to wider goals developed later, such as being platform agnostic, and that eventually led to the development of container runtimes such as runV, Kata, nabla, etc. In the words of Docker Inc itself, container start times were on the order of minutes back then, which was a lot of pain.
There were further issues with debugging, security, container management, etc.
Eventually, the Docker team developed its own tool, called libcontainer, as a replacement for LXC. The goal of libcontainer was to be a platform-agnostic tool that provided Docker with access to the fundamental container building blocks that exist in the host kernel.
Libcontainer replaced LXC as the default execution driver in Docker 0.9, and it made life easier. It achieved this by dropping a lot of additional functionality that Docker did not need, such as DHCP, upstart, dnsmasq, etc., and by replacing some pieces, such as init.
Community folks saw that and said, "Let there be Docker..."
Getting Rid of the Monolithic Docker Daemon
Over time, the monolithic nature of the Docker daemon became more and more problematic:
- It was hard to innovate on
- It got slower
- It was not what the ecosystem wanted
The Docker team was aware of these challenges. At the same time, containers had become immensely popular, and a lot of folks were interested in doing all kinds of things with them. So Docker, Google, CoreOS, and others came together and founded the Open Container Initiative (OCI). It launched two specifications:
- Runtime Specification (runtime-spec) – It outlines how to run a filesystem bundle that is unpacked on disk. At a high level, an OCI implementation would download an OCI Image, then unpack that image into an OCI Runtime filesystem bundle. At this point, the OCI Runtime Bundle would be run by an OCI Runtime.
- Image Specification (image-spec) – The OCI Image Format contains sufficient information to launch the application on the target platform (e.g. command, arguments, environment variables, etc). This specification defines how to create an OCI Image, which will generally be done by a build system, and output an image manifest, a filesystem (layer) serialization, and an image configuration.
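As a rough illustration of the three outputs named in the image-spec item above (a manifest, a layer serialization, and an image configuration), the sketch below builds all three in memory. It is a simplified assumption-laden example, not a full image-spec implementation: the helper names (`make_layer`, `descriptor`, `build_image`) and the sample file are hypothetical, and many optional fields are left out.

```python
import gzip
import hashlib
import io
import json
import tarfile


def sha256_digest(blob):
    """Return a digest string in the 'sha256:<hex>' form used by descriptors."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def make_layer(files):
    """Serialize {path: bytes} into a tar layer; return (raw tar, gzipped tar)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    raw = buffer.getvalue()
    return raw, gzip.compress(raw, mtime=0)


def descriptor(media_type, blob):
    """Describe a blob by media type, digest, and size."""
    return {"mediaType": media_type, "digest": sha256_digest(blob), "size": len(blob)}


def build_image(files, cmd):
    """Return (manifest dict, config blob, compressed layer blob)."""
    raw_layer, gz_layer = make_layer(files)
    image_config = {
        "architecture": "amd64",  # assumed target platform
        "os": "linux",
        "config": {"Cmd": list(cmd), "Env": ["PATH=/usr/bin:/bin"]},
        # diff_ids are digests of the *uncompressed* layer tars.
        "rootfs": {"type": "layers", "diff_ids": [sha256_digest(raw_layer)]},
    }
    config_blob = json.dumps(image_config, sort_keys=True).encode("utf-8")
    manifest = {
        "schemaVersion": 2,
        "config": descriptor("application/vnd.oci.image.config.v1+json", config_blob),
        "layers": [descriptor("application/vnd.oci.image.layer.v1.tar+gzip", gz_layer)],
    }
    return manifest, config_blob, gz_layer


if __name__ == "__main__":
    manifest, _, _ = build_image({"hello.txt": b"hello\n"}, ["/bin/sh"])
    print(json.dumps(manifest, indent=2))
```

The point to notice is that the manifest carries no content itself; it only references the configuration and the layers by digest and size, which is what lets a build system and a runtime exchange images without sharing code.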
Both specifications were released as version 1.0 in July 2017. Since then these specifications have been mostly stable. The latest image spec is v1.0.1, released in November 2017. The latest runtime spec is v1.0.2, released March 2020.
The team at Docker, Inc. put in a huge effort to refactor the daemon and divide it into small individual components such as containerd, runc, the shim, etc. They also managed to do it without regressions at the time.
As of Docker 1.11 (early 2016), the Docker Engine implements the OCI specifications as closely as possible. For example, the Docker daemon no longer contains any container runtime code; all container runtime code is implemented in a separate OCI-compliant layer. By default, Docker uses runc for this. runc is the reference implementation of the OCI runtime-spec. In addition, the containerd component of the Docker Engine makes sure Docker images are presented to runc as valid OCI bundles.
Also, containerd was donated to the CNCF in March 2017, and runc was donated to the OCI. Both have since been embraced and managed separately by the community.
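To give a feel for what an OCI bundle looks like on disk, here is a minimal sketch that writes one: a `rootfs` directory plus a `config.json` next to it. This is an illustration under assumptions, not what containerd actually does; `write_bundle` is a hypothetical helper, the root filesystem is left empty, and the configuration is far smaller than the default one that `runc spec` generates.

```python
import json
import os
import tempfile


def write_bundle(bundle_dir, args, env=None, cwd="/"):
    """Create <bundle_dir>/rootfs and <bundle_dir>/config.json; return the config path."""
    # A real bundle would have the unpacked image layers inside rootfs.
    os.makedirs(os.path.join(bundle_dir, "rootfs"), exist_ok=True)
    config = {
        "ociVersion": "1.0.2",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),
            "env": list(env or ["PATH=/usr/bin:/bin"]),
            "cwd": cwd,
        },
        # The path is relative to the bundle directory.
        "root": {"path": "rootfs", "readonly": True},
        "hostname": "sketch",  # hypothetical hostname for the example
        "linux": {
            "namespaces": [
                {"type": kind} for kind in ("pid", "network", "ipc", "uts", "mount")
            ]
        },
    }
    config_path = os.path.join(bundle_dir, "config.json")
    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=2)
    return config_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_bundle(tmp, ["/bin/sh"])
        print(sorted(os.listdir(tmp)))
```

Note how the namespaces mentioned earlier reappear here as plain configuration: the runtime reads the list and sets up the corresponding kernel primitives before starting the process.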