Commit 31a8ca3

Added draft of advanced instrumentation (#121)
* Added draft of advanced instrumentation * Additonal info * Added docker rootless infos * Added root mode security note to installation
1 parent 023979b commit 31a8ca3

File tree

2 files changed

+180
-1
lines changed

content/en/docs/installation/installation-linux.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -118,7 +118,7 @@ However you need to add your current user to the `docker` group. We need this so
118118

119119
Please follow this explanation how to do it: [Official docker docs on docker group add](https://docs.docker.com/engine/install/linux-postinstall/)
120120

121-
=> If you want to use the rootless mode anyway you do not have to to that. Just read the next paragraph.
121+
=> Rootless mode is recommended if you run GMT in cluster mode. If you want to install it, just read the next paragraph. If you want to run root mode in a cluster as well, be sure to read the Security paragraph under [Advanced Instrumentation →]({{< relref "measuring/advanced-instrumentation" >}}).
122122

123123
### Rootless mode (optional)
124124

Lines changed: 179 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,179 @@
1+
---
2+
title: "Advanced instrumentation"
3+
description: ""
4+
date: 2025-05-25T01:49:15+00:00
5+
weight: 900
6+
toc: true
7+
---
8+
9+
GMT is designed to be a measurement tool with extremely low overhead.
10+
11+
This is achieved by leveraging classic *docker* containers with native runtimes like *runc* (optionally in rootless mode, see below).
12+
13+
To benchmark more complex applications, however, it might be necessary to use alternative
14+
runtimes to leverage functionalities like:
15+
16+
- Docker in Docker
17+
- Workloads that need systemd
18+
- K8s or K3s inside of GMT
19+
- Benchmarking alternative kernels and kernel modules with GMT
20+
21+
In the following we show runtimes supported by GMT and their pros, cons and caveats.
22+
23+
{{< callout context="caution" icon="outline/alert-triangle" >}}
24+
Using a different runtime for the docker orchestrator will almost always result in more overhead. Only choose this path if there is no other way to run the workload, for example because the base system might get damaged, or because the measurement would be impossible or distorted without the isolation.
25+
{{< /callout >}}
26+
27+
## Kata Containers
28+
29+
[GitHub Repo](https://github.com/kata-containers/kata-containers/)
30+
31+
*Kata Containers* is a *containerd* compatible runtime that creates a *qemu VM* and launches a new *docker* container inside of it.
32+
33+
### Pros
34+
35+
- Highest degree of isolation
36+
- Enables *docker-in-docker* workloads
37+
- Enables *systemd* workloads
38+
- Enables loading alternative kernels and kernel modules
39+
40+
### Cons
41+
42+
- Big overhead ... although not as big as gVisor
43+
- Requires nested virtualization
44+
45+
### Caveats
46+
47+
- GMT orchestrated containers cannot be put on one network. This seems to be a bug ...
48+
- Unclear if GPU forwarding is supported
49+
- *systemd* workloads need a patched image. So far we have not managed to get them running
50+
- So far we have not managed to get *docker-in-docker* workloads running, although they should work
51+
52+
### Activating
53+
54+
Install *Kata Containers* and then just supply `--runtime io.containerd.kata.v2` as a `docker-run-args` entry in the *service* definition of your `usage_scenario.yml`.
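
As a rough illustration of the step above, the following sketch writes an excerpt of a `usage_scenario.yml` that passes the runtime flag to one service. The directory `kata-example`, the service name `example-service`, the image `alpine` and the list form of `docker-run-args` are assumptions made for this example. Check the GMT `usage_scenario.yml` documentation for the exact schema and for the remaining required keys such as the flow.

```shell
# Hypothetical example directory; adjust to your repository layout.
mkdir -p kata-example

# Runtime value taken from the text above.
RUNTIME="io.containerd.kata.v2"

# Write only the services excerpt; a complete scenario needs more keys.
cat > kata-example/usage_scenario.yml <<EOF
---
services:
  example-service:        # hypothetical service name
    image: alpine         # hypothetical image
    docker-run-args:      # assumed to be a list of extra docker run arguments
      - --runtime ${RUNTIME}
EOF
```

Keeping the runtime in a shell variable is only a convenience for switching between runtimes while experimenting; writing the value directly into the file works just as well.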
55+
56+
## Sysbox
57+
58+
[GitHub Repo](https://github.com/nestybox/sysbox)
59+
60+
*sysbox* enables bare metal workloads and provides a bit more isolation than normal docker containers by providing stronger namespaces that are effectively rootless.
61+
62+
### Pros
63+
64+
- Slightly higher degree of isolation, although still very close to native *docker*
65+
- Does not require nested virtualization
66+
- Enables *docker-in-docker* workloads
67+
- Enables *systemd* workloads
68+
69+
### Cons
70+
71+
- Biggest overhead of all runtimes
72+
- Unclear if GPU forwarding is supported
73+
- Cannot load other kernels or kernel modules
74+
75+
### Caveats
76+
77+
- Networking for *docker-in-docker* workloads seems to fail when containers are put on a custom network. Network connections of the normal docker containers seem to work though. Surprisingly, direct IP connections work, but DNS resolution fails for the *docker-in-docker* workloads
78+
- This can be mitigated at the moment by putting the containers on the *default bridge* network. This should have no further security implications
79+
- Alternatively one can also set a proxy for the docker container and forward the *HTTP_PROXY* variables to all applications that are started in the *docker-in-docker* containers.
80+
- Alternatively, all docker containers created inside the container can be started with `--network=host`, which also retains connectivity.
81+
- Why the mitigations work is not exactly clear, but it might be related to this: https://github.com/nestybox/sysbox/issues/456. It seems clear, however, that it is a routing issue from the inner container to the internet. It is surprising that it can be fixed both by changing how the interface for the outer container is created and by skipping the creation of the inner network adapter with `--network=host`.
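
To see whether one of the mitigations above applies to your setup, a small check like the following could be run inside the outer *sysbox* container. It is only a sketch: the function name `check_inner_dns`, the `alpine` image and the lookup target `example.com` are assumptions chosen for illustration.

```shell
# Run this inside the outer sysbox container, where the inner docker daemon lives.
# $1: network mode for the inner container, for example "host" or "bridge".
check_inner_dns() {
  network="$1"
  # Start a throwaway inner container and try one DNS lookup from it.
  if docker run --rm --network="$network" alpine nslookup example.com >/dev/null 2>&1; then
    echo "DNS works with --network=$network"
  else
    echo "DNS fails with --network=$network"
    return 1
  fi
}
```

Calling `check_inner_dns host` and `check_inner_dns bridge` one after the other gives a quick comparison. Note that a failure can also have unrelated causes, such as the image not being pullable.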
82+
83+
### Activating
84+
85+
Install *sysbox* and then just supply `--runtime sysbox-runc` as a `docker-run-args` entry in the *service* definition of your `usage_scenario.yml`.
86+
87+
## gVisor
88+
89+
*gVisor* emulates the whole kernel in user-space thus protecting the host kernel.
90+
91+
### Pros
92+
93+
- High degree of isolation. Probably on par with
94+
- Does not require nested virtualization
95+
96+
### Cons
97+
98+
- Biggest overhead of all runtimes
99+
- Unclear if GPU forwarding is supported
100+
101+
### Caveats
102+
103+
- Unclear if *systemd* workloads work
104+
- Unclear if GPU forwarding is supported
105+
- Unclear if *docker-in-docker* workloads work
106+
- Unclear if it can load other kernels or kernel modules
107+
108+
**Note**: Currently in alpha and not officially supported. Ping us if you want to help develop this feature into a stable version :)
109+
110+
## Firecracker
111+
112+
[GitHub Repo](https://github.com/firecracker-microvm/firecracker-containerd)
113+
114+
*Firecracker* launches a micro-VM that can also be made *containerd* compatible through a shim.
115+
116+
### Caveats
117+
118+
- Unclear if *systemd* workloads work
119+
- Unclear if *docker-in-docker* workloads work
120+
- Unclear if it can load other kernels or kernel modules
121+
122+
**Note**: Currently in alpha and not officially supported. Ping us if you want to help develop this feature into a stable version :)
123+
124+
## Docker Rootless
125+
126+
*Docker Rootless* is a configuration of the *runc* runtime that ships with *docker*. It is the default configuration officially endorsed by GMT.
127+
128+
Making containers rootless comes with some trade-offs:
129+
130+
### Pros
131+
132+
- Higher security: if a container is escaped, the attacker does not gain true root on the host
133+
- No *bridges* or *nftables* rules are created that might pollute the host networking rules
134+
135+
### Cons
136+
137+
- Docker networking is done completely in user space via *slirp4netns* and is thus very inefficient
138+
- *slirp4netns* is another tool you need to learn in order to create custom networking rules for docker containers
139+
140+
## More runtimes?
141+
142+
Technically more runtimes can be supported as long as they are *containerd* compatible.
143+
144+
This requirement comes from the fact that many native *docker* functionalities are used inside of GMT:
145+
146+
- `docker exec`
147+
- `docker logs`
148+
- `docker network`
149+
- `docker run`
150+
- `docker images`
151+
- etc.
152+
153+
## Security
154+
155+
When setting your system up with alternative runtimes that need a *docker* root daemon you might want to lock out the default runtimes that ship with *docker*. Typically these are:
156+
157+
- *runc*
158+
- *io.containerd.runc.v2*
159+
160+
But double check with `docker info`.
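
One possible way to do this check is to ask `docker info` for just the runtime fields. The helper name `list_docker_runtimes` is made up for this sketch. The output depends entirely on your installation, so none is shown here.

```shell
# Print the default runtime and all runtimes registered with the docker daemon.
list_docker_runtimes() {
  echo "Default runtime: $(docker info --format '{{.DefaultRuntime}}')"
  echo "Registered runtimes: $(docker info --format '{{json .Runtimes}}')"
}
```

Run `list_docker_runtimes` once before and once after installing an alternative runtime to see which names the daemon accepts for `--runtime`.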
161+
162+
### Disable runc
163+
164+
The easiest way to disable `runc` is to introduce an *AppArmor* or *SELinux* rule. Since GMT favors *Ubuntu/Debian* here is an example for *AppArmor*.
165+
166+
First check where your `runc` binary is with `$ realpath $(which runc)`. The typical location is `/usr/local/bin/runc`.
167+
168+
Then create a file at `/etc/apparmor.d/runc`:
169+
170+
```AppArmor
# Block execution of runc
profile runc-deny /usr/local/bin/runc {
  /usr/local/bin/runc ix,
  deny /** mrwklx,
}
```
177+
178+
Test with `runc` or `docker run --rm -it --runtime runc ubuntu bash`. It should fail.
179+
Also `docker run --rm -it --runtime io.containerd.runc.v2 ubuntu bash` should fail.
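
The steps above do not say how the profile is activated. On AppArmor systems a profile file is typically loaded or reloaded with `apparmor_parser -r`, so the following sketch assumes that step and then repeats the two tests in a scriptable form. The function names are hypothetical, and the check treats any failing `docker run` as "blocked", which can also happen for unrelated reasons such as a missing image.

```shell
# Load (or reload) the profile created above; requires root privileges.
load_runc_profile() {
  sudo apparmor_parser -r /etc/apparmor.d/runc
}

# $1: runtime name to test, for example "runc" or "io.containerd.runc.v2".
# Succeeds if starting a container with that runtime fails.
expect_runtime_blocked() {
  runtime="$1"
  if docker run --rm --runtime "$runtime" ubuntu true >/dev/null 2>&1; then
    echo "WARNING: runtime $runtime still starts containers"
    return 1
  fi
  echo "OK: runtime $runtime is blocked"
}
```

A typical sequence would be `load_runc_profile`, followed by `expect_runtime_blocked runc` and `expect_runtime_blocked io.containerd.runc.v2`.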
