Going down the Rabbit Hole of Docker Engine… – runc

As we mentioned previously, runc was donated to the OCI and has been managed separately since then. It is hosted on GitHub at https://github.com/opencontainers/runc and is the de facto standard low-level container runtime. runc is a CLI tool written in Go for spawning, running and managing containers according to the OCI specification, and it is very fast at that. To create a container, runc needs two things, both placed inside a single directory called a bundle: a specification file named config.json and a root filesystem, as defined by the OCI runtime specification.

Image courtesy of https://www.slideshare.net/PhilEstes/runc-the-little-engine-that-could-run-docker-containers

As we mentioned previously, the specification file contains metadata about the contents and dependencies of the image including the content-addressable identity of one or more filesystem serialization archives that will be unpacked to make up the final runnable filesystem. The image configuration includes information such as application arguments, environments, etc.:

Here’s one of the documents that should be able to explain this:

Install runc

If you have installed a container engine such as Podman or Docker, runc is most likely already installed. Otherwise, you can install it directly from its releases page on GitHub. Once it's installed, check that you can run it from your shell.
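A minimal sketch of that check follows. The helper name runc_version is made up for this example; the only runc call it relies on is runc --version.

```shell
# runc_version is a hypothetical helper, not part of runc.
# It prints the runc version if the binary is on PATH,
# and a short notice otherwise.
runc_version() {
    if command -v runc >/dev/null 2>&1; then
        runc --version
    else
        echo "runc not found on PATH"
    fi
}

runc_version
```

If the notice is printed even though a container engine is installed, the engine may ship its runtime outside PATH or use a different runtime altogether.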

Creating an OCI Bundle

Let's first create a directory called mycontainer, cd into it, and create another directory called rootfs. Then let's create a Docker container and export its filesystem into rootfs:

# create the top most bundle directory
mkdir /mycontainer
cd /mycontainer

# create the rootfs directory
mkdir rootfs

# export busybox via Docker into the rootfs directory
docker export $(docker create busybox) | tar -C rootfs -xvf -

Once the root filesystem is in place, we need a spec in the form of a config.json file inside the bundle directory. runc provides a spec command that generates a base template, which you can then edit. To generate this file, run runc spec from inside the bundle directory.
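The sketch below shows one way to generate the template and then confirm that the directory has the two pieces a bundle needs. Both generate_spec and check_bundle are hypothetical helpers written for this post, not runc commands; the only runc call used is runc spec.

```shell
# generate_spec: hypothetical helper that runs `runc spec` inside the
# given bundle directory, unless a config.json is already there.
generate_spec() {
    bundle="$1"
    if [ -e "$bundle/config.json" ]; then
        echo "config.json already exists in $bundle"
        return 0
    fi
    (cd "$bundle" && runc spec)
}

# check_bundle: hypothetical helper that verifies the bundle layout
# described above (a config.json file plus a rootfs directory).
check_bundle() {
    bundle="$1"
    if [ -f "$bundle/config.json" ] && [ -d "$bundle/rootfs" ]; then
        echo "ok: $bundle has config.json and rootfs/"
    else
        echo "incomplete: $bundle needs both config.json and rootfs/"
        return 1
    fi
}

# Usage (assuming the /mycontainer bundle from above):
#   generate_spec /mycontainer
#   check_bundle /mycontainer
```

check_bundle only looks at the layout; it does not validate the contents of config.json against the OCI schema.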

Running an OCI Bundle / Create Container with runc

One way to create a container from the OCI bundle we built above is to cd into the bundle directory and run runc with root privileges. Since we are using the unmodified runc spec template, this should give us an sh session inside the container. Because sh is the first process in the container, it has PID 1. Let's go ahead and create a container:

# mycontainer01 is the id for the container created
sudo runc run mycontainer01

Another way to start a container is to use the spec's lifecycle operations, which give us more control over how the container is created and managed while it is running. Because this launches the container in the background, we have to edit config.json to set terminal to false and replace "sh" with "sleep","5" in args. The process {} section of config.json should then look like this:

"process": {
                "terminal": false,
                "user": {
                        "uid": 0,
                        "gid": 0
                },
                "args": [
                        "sleep","5"
                ],
                "env": [
                        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                        "TERM=xterm"
                ],
                "cwd": "/",
                "capabilities": {
                        "bounding": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ],
                        "effective": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ],
                        "inheritable": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ],
                        "permitted": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ],
                        "ambient": [
                                "CAP_AUDIT_WRITE",
                                "CAP_KILL",
                                "CAP_NET_BIND_SERVICE"
                        ]
                },
                "rlimits": [
                        {
                                "type": "RLIMIT_NOFILE",
                                "hard": 1024,
                                "soft": 1024
                        }
                ],
                "noNewPrivileges": true
        },
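If you would rather not edit the file by hand, the same two changes can be scripted. This is only a sketch: it assumes jq is installed and that config.json came from an unmodified runc spec, and set_background_process is a hypothetical helper name.

```shell
# set_background_process: hypothetical helper that applies the two edits
# described above to the given config.json using jq (assumed installed).
set_background_process() {
    config="$1"
    tmp="$(mktemp)" || return 1
    # Turn the terminal off and replace the default "sh" with "sleep 5".
    if jq '.process.terminal = false | .process.args = ["sleep", "5"]' \
            "$config" > "$tmp"; then
        # Write back in place so the file keeps its owner and mode.
        cat "$tmp" > "$config"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage, from inside the bundle directory:
#   set_background_process config.json
```

jq reformats the file with its own indentation, so the result will not match the listing above character for character, but the content is the same.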

Now we can run through the lifecycle operations:

# run as root
cd /mycontainer
runc create mycontainerid

# view the container is created and in the "created" state
runc list

# start the process inside the container
runc start mycontainerid

# after 5 seconds view that the container has exited and is now in the stopped state
runc list

# now delete the container
runc delete mycontainerid
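Besides runc list, the state of a single container can be read with runc state, which prints a JSON document that includes a "status" field. The sketch below pulls that field out with sed; container_status is a hypothetical helper, and the parsing assumes "status" appears only once in the output.

```shell
# container_status: hypothetical helper that reads `runc state` JSON on
# stdin and prints the value of its "status" field.
container_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (as root, between `runc create` and `runc delete`):
#   runc state mycontainerid | container_status
```

For the walkthrough above, this should report created after runc create and stopped once the sleep has finished.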

Running Rootless Containers

Rootless containers are all the rage these days. TL;DR: running containers as a non-root user reduces the attack surface that a process inside the container can use to exploit the host it runs on. Some container engines, such as Podman, have had this capability from the beginning. Docker Engine did not have it initially, but today it can also launch, run and manage rootless containers. In fact, most container engines that use runc as their runtime should be able to offer this, since the feature became part of runc itself.

When we say rootless containers, we mean the ability to run containers without root privileges on the host. This is separate from the user that the first process (PID 1) runs as inside the container, which for most OCI-compatible images is root, with its uid and gid set to 0. That is a separate discussion altogether.

Note that in order to use this feature, user namespaces must be enabled in the host OS kernel. To create a rootless container using runc, we first need to generate a compatible spec using runc spec --rootless. We can then instruct runc to run a rootless container with the following command:

# The --root parameter tells runc where to store the container state. It must be writable by the user.
# note that we are not using sudo this time
runc --root /tmp/runc run mycontainer01
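Before running the command above, it can help to confirm that user namespaces are available and to regenerate the spec in rootless mode. The following is a sketch under a few assumptions: the kernel exposes its limit at /proc/sys/user/max_user_namespaces (common on recent Linux kernels, but not guaranteed everywhere), the bundle directory is writable by your user, and userns_status is a hypothetical helper.

```shell
# userns_status: hypothetical helper that reports whether the kernel
# appears to allow user namespaces, based on the per-user limit in /proc.
userns_status() {
    limit_file=/proc/sys/user/max_user_namespaces
    if [ ! -r "$limit_file" ]; then
        echo "unknown: cannot read $limit_file"
        return 0
    fi
    limit="$(cat "$limit_file")"
    if [ "$limit" = "0" ]; then
        echo "disabled: max_user_namespaces is 0"
    else
        echo "enabled: max_user_namespaces is $limit"
    fi
}

userns_status

# Regenerating the spec for rootless use, from inside the bundle directory.
# runc spec typically refuses to overwrite an existing config.json,
# so the old file is removed first:
#   rm -f config.json
#   runc spec --rootless
```

Some distributions gate unprivileged user namespaces behind additional settings, so an "enabled" result here is a hint rather than a guarantee.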

runc has more capabilities for running containers than we have discussed here, and we will not delve into them for now. The point is that runc is not intended to be used directly by end users, so you may not need to learn all the gory details. It is, however, an important part of the container ecosystem. One of its direct challengers is crun.

crun is a Red Hat-led OCI runtime implementation that is part of the broader containers project and a sibling to libpod. It is written in C, is performant and lightweight, and was one of the first runtimes to support cgroups v2.

One thought on “Going down the Rabbit Hole of Docker Engine… – runc”

  1. Excellent article Mohit. For rootless containers, I would also suggest that people look into the new Sysbox runtime. It’s an enhanced “runc” that is pushing the limits of rootless containers, in particular by enabling them to run not just microservices, but most workloads that run in VMs (such as systemd, Docker itself, and even Kubernetes). This way it avoids the need for VMs or insecure privileged containers in many scenarios.
