runc exec --cgroup #3059
This must be a unified path for hybrid cgroup v1+v2, which we do not handle well. I think I see what needs to be fixed (not a problem in this PR).
This is presumably fixed by #2087. Let me check...
Yes, but it requires criu 3.16+, so this stays a draft until that is released.
Rebased; made the cgroup v1 test case work without cgroupns.
Looks like criu 3.16 won't be out in time for the runc 1.1.0 release, so I'm moving the milestone to 1.2.0.
Or maybe it will; if so, I'll move the milestone back to 1.1.0, as this is quite important (not the feature per se, but the hybrid cgroup entering code from #2087).
Rebased; this is expected to fail until the criu PPA has CRIU 3.16 (soon, I hope).
No need to have an intermediate variable here. Signed-off-by: Kir Kolyshkin <[email protected]>
No need to add a file name to the error messages, as errors from OpenFile and (*os.File).Write both contain the file name already. Signed-off-by: Kir Kolyshkin <[email protected]>
The function used here, cgroups.EnterPid, silently skips non-existing paths, and it does not look like a good idea to do so for an existing container with already configured cgroups. Switch to cgroups.WriteCgroupProc which does not do that, so in case a cgroup does not exist, we'll get an error. Signed-off-by: Kir Kolyshkin <[email protected]>
Currently the parent process of the container is moved to the right cgroup v2 tree when systemd is using a hybrid model (last line with 0::):

    $ runc --systemd-cgroup run myid
    / # cat /proc/self/cgroup
    12:cpuset:/system.slice/runc-myid.scope
    11:blkio:/system.slice/runc-myid.scope
    10:devices:/system.slice/runc-myid.scope
    9:hugetlb:/system.slice/runc-myid.scope
    8:memory:/system.slice/runc-myid.scope
    7:rdma:/
    6:perf_event:/system.slice/runc-myid.scope
    5:net_cls,net_prio:/system.slice/runc-myid.scope
    4:freezer:/system.slice/runc-myid.scope
    3:pids:/system.slice/runc-myid.scope
    2:cpu,cpuacct:/system.slice/runc-myid.scope
    1:name=systemd:/system.slice/runc-myid.scope
    0::/system.slice/runc-myid.scope

However, if a second process is executed in the same container, it is not moved to the right cgroup v2 tree:

    $ runc exec myid /bin/sh -c 'cat /proc/self/cgroup'
    12:cpuset:/system.slice/runc-myid.scope
    11:blkio:/system.slice/runc-myid.scope
    10:devices:/system.slice/runc-myid.scope
    9:hugetlb:/system.slice/runc-myid.scope
    8:memory:/system.slice/runc-myid.scope
    7:rdma:/
    6:perf_event:/system.slice/runc-myid.scope
    5:net_cls,net_prio:/system.slice/runc-myid.scope
    4:freezer:/system.slice/runc-myid.scope
    3:pids:/system.slice/runc-myid.scope
    2:cpu,cpuacct:/system.slice/runc-myid.scope
    1:name=systemd:/system.slice/runc-myid.scope
    0::/user.slice/user-1000.slice/session-8.scope

This commit ensures that processes executed with exec are placed into the right cgroup v2 tree. The implementation checks whether systemd is using hybrid mode (by checking if cgroup v2 is mounted at /sys/fs/cgroup/unified); if so, the path of the cgroup v2 slice for this container is saved into the cgroup path list.

The fs cgroup driver has a similar issue: in that case neither runc run nor runc exec puts the process in the right cgroup v2. This commit also fixes that.

Having the processes of the container in their own cgroup v2 is useful for any BPF programs that rely on bpf_get_current_cgroup_id(), like https://github.com/kinvolk/inspektor-gadget/ for instance.

[@kolyshkin: rebased]

Signed-off-by: Mauricio Vásquez <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]>
Should this have a changelog entry?
@h-vetinari added, PTAL
Check that runc run and runc exec put the process on the same cgroups v2 when using hybrid mode. Signed-off-by: Mauricio Vásquez <[email protected]> Signed-off-by: Kir Kolyshkin <[email protected]>
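The kind of check described in this commit message can be sketched in Go by comparing the `0::` entries of two `/proc/<pid>/cgroup` dumps. This is only an illustration, not the test added by the PR; `unifiedPath` is a hypothetical helper, and the sample strings are shortened from the outputs quoted in the commit message for the hybrid-mode fix.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// unifiedPath (hypothetical helper) extracts the cgroup v2 path from the
// contents of /proc/<pid>/cgroup, i.e. the entry of the form "0::<path>".
func unifiedPath(procCgroup string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(procCgroup))
	for sc.Scan() {
		// Each line has the form hierarchy-ID:controller-list:path.
		parts := strings.SplitN(sc.Text(), ":", 3)
		if len(parts) == 3 && parts[0] == "0" && parts[1] == "" {
			return parts[2], true
		}
	}
	return "", false
}

func main() {
	// Shortened samples of the pre-fix output of `runc run` and `runc exec`.
	runOut := "1:name=systemd:/system.slice/runc-myid.scope\n" +
		"0::/system.slice/runc-myid.scope\n"
	execOut := "1:name=systemd:/system.slice/runc-myid.scope\n" +
		"0::/user.slice/user-1000.slice/session-8.scope\n"

	a, _ := unifiedPath(runOut)
	b, _ := unifiedPath(execOut)
	fmt.Println("same cgroup v2:", a == b) // false for the pre-fix samples
}
```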
In some setups, multiple cgroups are used inside a container, and sometimes there is a need to execute a process in a particular sub-cgroup (in case of cgroup v1, for a particular controller). This is what this commit implements. Signed-off-by: Kir Kolyshkin <[email protected]>
LGTM. My only minor concern is that someone will think that --cgroup applies to the host cgroup paths, but since it's well documented hopefully that won't happen.
I guess we can merge this one.
This does two things:

1. Adds minimal hybrid hierarchy support to the cgroup v1 fs driver. It seems it's needed when systemd is used; it also makes it possible for cgroup v1 drivers to use the new `cgroup.kill` kernel API (in case a hybrid hierarchy is available). This part is a carry of #2087 (cgroups/systemd: add cgroup-v2 path to the list when using hybrid mode).
2. Implements `runc exec --cgroup` to be able to run a process in a sub-cgroup of a container (as per #3040, [RFC] runc exec --cgroup).

Fixes: #3040.
Closes: #2087.
Usage (option for runc exec):
--cgroup path | controller[,controller...]:path
Execute a process in a sub-cgroup. If the specified cgroup does not exist, an
error is returned. Default is empty path, which means to use container's top
level cgroup.
For cgroup v1 only, a particular controller (or multiple comma-separated
controllers) can be specified, and the option can be used multiple times to set
different paths for different controllers.
Note for cgroup v2, in case the process can't join the top level cgroup,
runc exec fallback is to try joining the cgroup of container's init.
This fallback can be disabled by using --cgroup /.
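To make the option syntax above concrete, here is a minimal Go sketch of how values of the form `path` or `controller[,controller...]:path` could be split into per-controller paths. It is not the PR's implementation: `parseCgroupArgs` is a hypothetical name, using the empty key for a path given without a controller is a choice of this sketch, and rejecting a controller that is set twice is an assumption rather than documented behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCgroupArgs (hypothetical) turns repeated --cgroup values into a map
// from controller name to sub-cgroup path. The key "" holds a path that was
// given without any controller prefix.
func parseCgroupArgs(args []string) (map[string]string, error) {
	paths := make(map[string]string)
	for _, arg := range args {
		ctrls, path := "", arg
		// Everything before the first ':' is a comma-separated controller list.
		if i := strings.Index(arg, ":"); i >= 0 {
			ctrls, path = arg[:i], arg[i+1:]
		}
		for _, c := range strings.Split(ctrls, ",") {
			// Assumption of this sketch: setting a controller twice is an error.
			if _, dup := paths[c]; dup {
				return nil, fmt.Errorf("cgroup path for %q set more than once", c)
			}
			paths[c] = path
		}
	}
	return paths, nil
}

func main() {
	paths, err := parseCgroupArgs([]string{"cpu,memory:/sub", "pids:/other"})
	if err != nil {
		panic(err)
	}
	for _, c := range []string{"cpu", "memory", "pids"} {
		fmt.Printf("%s -> %s\n", c, paths[c])
	}
}
```

The paths here are relative to the container's own cgroup, as the usage text says, not to the host cgroup hierarchy.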
Proposed changelog entry