@Sojamann Sojamann commented Oct 19, 2025

Content

This implements the cgroup awareness mentioned in #25491.

Things I'm unsure about:

  • using std.fs.openFileAbsolute instead of posix.open
  • code duplication when determining cgroup membership
  • rounding 2.9 cores to 2

Other than that, everything is implemented as stated in the issue.
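To make the rounding question concrete, here is a minimal sketch of how a cgroup v2 `cpu.max` value could be turned into a CPU count. It is not the code from this PR: the helper name `cpuCountFromCpuMax`, the `fallback` parameter, and the choice to clamp to at least 1 are assumptions made for illustration.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): derives a CPU count from the
/// contents of a cgroup v2 `cpu.max` file, which holds "<quota> <period>"
/// or "max <period>". Returns `fallback` (e.g. the online CPU count) when
/// no limit is set or the content cannot be parsed.
fn cpuCountFromCpuMax(content: []const u8, fallback: usize) usize {
    var it = std.mem.tokenizeAny(u8, content, " \n");
    const quota_str = it.next() orelse return fallback;
    const period_str = it.next() orelse return fallback;

    // "max" means the cgroup imposes no CPU bandwidth limit.
    if (std.mem.eql(u8, quota_str, "max")) return fallback;

    const quota = std.fmt.parseInt(usize, quota_str, 10) catch return fallback;
    const period = std.fmt.parseInt(usize, period_str, 10) catch return fallback;
    if (period == 0) return fallback;

    // Integer division rounds down, so a quota worth 2.9 cores yields 2.
    const limit = quota / period;
    // Assumption: never report fewer than one CPU.
    if (limit == 0) return 1;
    // Assumption: never report more CPUs than the fallback count.
    return if (limit < fallback) limit else fallback;
}
```

Rounding down is the conservative choice here: a thread pool sized from the result never asks for more CPU time than the quota grants, at the cost of leaving a fractional core unused.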

Testing It

This outlines how one can test the changes without playing around with the cgroup fs manually.

Building The Binary

const std = @import("std");

pub fn main() !void {
    // With this change, both calls should reflect cgroup limits when present.
    const cpus = try std.Thread.getCpuCount();
    const mem = try std.process.totalSystemMemory();

    // Unbuffered stdout writer (empty buffer), so output is written directly.
    const stdout = std.fs.File.stdout();
    var file_writer = stdout.writer(&.{});
    try file_writer.interface.print("cpus: {d}\nmem: {d}\n", .{ cpus, mem });
    try file_writer.interface.flush();
}

zig build-exe -static -target x86_64-linux-musl -femit-bin=test main.zig

Building The OCI Image

FROM scratch
ARG BINARY_NAME
COPY ${BINARY_NAME} /exe
ENTRYPOINT ["/exe"]

docker build -t testimage --build-arg BINARY_NAME=test .

Testing It

# without limits
docker run --rm testimage
# with limits
docker run --memory 1G --cpu-quota 30000 --cpu-period 10000 --rm testimage
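For the memory side of the limited run, the cgroup v2 `memory.max` file holds either `max` or a byte count (`--memory 1G` corresponds to 1073741824 bytes). The following is a minimal sketch of how such a value could be combined with the physical memory size. The helper name `memoryLimitFromMemoryMax` and its `physical` parameter are hypothetical and not taken from this PR.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): interprets the contents of a
/// cgroup v2 `memory.max` file. `physical` is the total physical memory in
/// bytes and is returned when no limit is set or parsing fails.
fn memoryLimitFromMemoryMax(content: []const u8, physical: u64) u64 {
    const trimmed = std.mem.trim(u8, content, " \n");

    // "max" means the cgroup imposes no memory limit.
    if (std.mem.eql(u8, trimmed, "max")) return physical;

    const limit = std.fmt.parseInt(u64, trimmed, 10) catch return physical;

    // Assumption: a limit above physical memory is not meaningful, so the
    // smaller of the two values is reported.
    return if (limit < physical) limit else physical;
}
```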

Contributor

@rpkak rpkak left a comment

I am not a zig team member, so feel free to ignore me if you want.

Some of your code is the same in totalSystemMemory and getCpuCount. Maybe it makes sense to introduce some cgroup-specific code to avoid this code duplication and enable access to other cgroup files from user code.

I compared this against gnulib nproc.c. Here are some differences:

  • gnulib lets you decide which method(s) to use to get the cpu count.

  • gnulib calls sched_getscheduler and checks its return value before using cgroup. I don't know enough about schedulers to know if this should also be done here.

  • gnulib also looks at parent cgroups.

  • gnulib looks at /proc/mounts to find a cgroup2 mount if it is not mounted at /sys/fs/cgroup.

Author

Sojamann commented Oct 20, 2025

> Some of your code is the same in totalSystemMemory and getCpuCount. Maybe it makes sense to introduce some cgroup-specific code to avoid this code duplication and enable access to other cgroup files from user code.

I would also like to get rid of the code duplication, but I don't know where such helpers should live, since they should probably not be publicly available. Can you think of a place where they could go?
Since the cgroup2 interface is stable and this code is unlikely to change, the duplication could also be justified until more general cgroup-related functions are added to std.linux.
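As an illustration of what a shared, non-public helper might look like, here is a minimal sketch that extracts the cgroup v2 membership from the contents of `/proc/self/cgroup`, where the unified hierarchy appears as a line of the form `0::/some/path`. The function name `cgroup2PathFromProcCgroup` is hypothetical, and taking the file contents as a slice (rather than opening the file) is a choice made here to keep the sketch testable.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): returns the cgroup v2 path of a
/// process given the contents of its /proc/<pid>/cgroup file, or null if
/// no cgroup v2 entry is present.
fn cgroup2PathFromProcCgroup(content: []const u8) ?[]const u8 {
    var lines = std.mem.splitScalar(u8, content, '\n');
    while (lines.next()) |line| {
        // The cgroup v2 entry has hierarchy ID 0 and an empty controller list.
        const prefix = "0::";
        if (std.mem.startsWith(u8, line, prefix)) return line[prefix.len..];
    }
    return null;
}
```

Both `getCpuCount` and `totalSystemMemory` could call one function like this and then append `cpu.max` or `memory.max` to the resulting directory, which would confine the duplication to a single place.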

> gnulib lets you decide which method(s) to use to get the cpu count.

Since the interface should be the same across operating systems, I did not see how one could pass Linux-only options without introducing something like an options parameter, which would be a breaking change.

> gnulib also looks at parent cgroups

In most containerization scenarios this won't be possible due to namespacing, but in other scenarios it is, so adding it here might not hurt either.
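Because a limit set on an ancestor cgroup also constrains its descendants, looking at parent cgroups means walking the path upwards and reading the limit file at each level. A minimal sketch of the path-walking step follows. The helper name `parentCgroup` is hypothetical, and reading the files themselves is left out.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): returns the parent of a cgroup
/// path such as "/a/b", or null once the root "/" has been reached.
fn parentCgroup(path: []const u8) ?[]const u8 {
    // The root cgroup (or an empty path) has no parent.
    if (path.len <= 1) return null;
    const idx = std.mem.lastIndexOfScalar(u8, path, '/') orelse return null;
    // A top-level cgroup like "/a" has the root as its parent.
    if (idx == 0) return path[0..1];
    return path[0..idx];
}
```

A caller could loop with `while (parentCgroup(current)) |p| { ...; current = p; }` and keep the smallest limit seen along the way.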

> gnulib looks at /proc/mounts to find a cgroup2 mount if it is not mounted at /sys/fs/cgroup

My thought was that opening and processing another file just to get the CPU count might be too much overhead, but it might be better than relying on the convention of mounting the cgroup2 fs at /sys/fs/cgroup.
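To gauge the extra work involved, here is a minimal sketch of the lookup, operating on the contents of `/proc/mounts`. Each line there has the form `device mount_point fs_type options dump pass`, so the task is to find the first line whose third field is `cgroup2`. The function name `findCgroup2Mount` is hypothetical, and mount points containing escaped characters (such as `\040` for a space) are returned undecoded.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): returns the mount point of the
/// first cgroup2 filesystem listed in the given /proc/mounts contents, or
/// null if none is mounted.
fn findCgroup2Mount(mounts: []const u8) ?[]const u8 {
    var lines = std.mem.splitScalar(u8, mounts, '\n');
    while (lines.next()) |line| {
        var fields = std.mem.tokenizeScalar(u8, line, ' ');
        _ = fields.next() orelse continue; // device, unused
        const mount_point = fields.next() orelse continue;
        const fs_type = fields.next() orelse continue;
        if (std.mem.eql(u8, fs_type, "cgroup2")) return mount_point;
    }
    return null;
}
```

One possible compromise is to try `/sys/fs/cgroup` first and fall back to a scan like this only when that path is not a cgroup2 mount, so the common case pays no extra cost.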

> gnulib calls and checks the return value sched_getscheduler before using cgroup. I don't know enough about schedulers to know if this should also be done here.

It would make sense to check whether the process is scheduled 'fairly', since otherwise the cpu.max file will be ignored. On the other hand, I feel that someone who cares about the scheduling of their process in that much detail would not use a generic getCpuCount() function. It also adds another call that won't do much for most people.
Still, if you think it makes sense to check the scheduling policy, then I don't mind adding it. Up to now my main focus was to add as little additional complexity as possible, but if it is justified then of course.


@rpkak thanks for taking the time.

@Sojamann
Author

The CPU controller can also be used in threaded mode, so theoretically different threads of a process can have different cpu.max limits, but taking this into account seems overkill.
