New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cgroup-util: make sure cg_read_pid() can deal sanely with unmapped PIDs (as in foreign pidns) #32534
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICS WSL1 still uses cgroups v1? Support for that is already obsolete in systemd v256 onwards, and WSL1 is not the development target of systemd. Sorry.
See also: #29512 (comment). I think the same argument applies to the cgroupfs implementation.
I'm not sure how WSL1 is involved in this, I haven't used that in ages. |
Just checked, latest WSL boots with cgroup_no_v1=all, so it's full cgroup v2. |
where do these zeros come from? What precisely is WSL doing there? Are these simply PIDs that live in a separate pidns that we cannot map? before we add any code around this I'd really prefer to understand what's going on here. |
I am not sure cg_read_pid() just skipping over these processes is really the right approach. For various purposes (i.e. "is this cgroup empty?") such a logic would simply be wrong |
I'm not sure where they come from. I can only speculate that they are processes from the WSL system distribution, that hosts the X server and friends. Looks something like this:
I'm not sure if there is any other sane way to deal with those zeros than to ignore them. Another approach that I originally implemented and been using for a while specifically modifies append_cgroup() in src/core/dbus-unit.c to continue on EIO from cg_read_pid(ref). |
Comments in microsoft/WSL#8879 imply that WSL is the problem:
|
That would have to be fixed from Microsofts side then, wouldn't it? Given its WSL itself that's launching systemd here. |
If the suspicions prove correct, yes |
I tried replacing /sbin/init with a small shellscript that runs |
so i am pretty sure systemd should be fixed to be fine with processes with unmappable pids (which is what those zero PIDs are). But skipping over them generically as in the proposed patch is problematic. As mentioned for code that checks if a cgroup is empty it is very much relevant to know that there is a process even if its pid cannot be mapped locally. I do have the suspicion that most of the time we want to skip those processes, but just not always. hence, I figure cg_read_pid() should be changed to take a flags param or so, with a flag CG_PID_SKIP_UNMAPPED or so, which must be explicitly specified to skip the unmapped processes as if they didn't exist, and then every single call site has to be looked at to decide if we should skip or not. messy, and involved, but I think that's the only right fix. |
There's not thaaat many callers from a quick glance, I'll have a look at it. |
I've turned the flag around, since for the majority of callers, ignoring the zero-pids is the desired mode of operation. Not sure what's up with the failing/stuck autopkgtests, I've trouble even identifying what the error is, and it looks unrelated. |
c6943a3
to
9d8f5e8
Compare
On 09.05.2024 12:38, Mike Yuan wrote:
In src/cgtop/cgtop.c
<#32534 (comment)>:
> @@ -207,7 +207,7 @@ static int process(
return r;
g->n_tasks = 0;
- while (cg_read_pid(f, &pid) > 0) {
+ while (cg_read_pid(f, &pid, /* flags = */ 0) > 0) {
Hmm, this is changed back? It should still set
|CGROUP_INCLUDE_UNMAPPABLE_PID|?
I don't think it was ever set in this version of the set, fixed now.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I do think this is the better approach, due to the aforementioned reasons. @keszybz please take a final look.
Is this a known issue @mrc0mmand ?
|
lgtm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
testing-farm:fedora-rawhide-x86_64 seems to have failed due to some network problems. Both look unrelated. |
Suggestion 1: #define PID_UNMAPPED 0 // use this macro in situation when checking for external pidns PIDs Usage of the macro in conditions will be self-documenting without need to comment each of the guards. Suggestion 2: - CGROUP_DONT_SKIP_UNMAPPED = 1 << 3,
+ CGROUP_INCLUDE_UNMAPPED = 1 << 3, The double negative is confusing. Suggestion 3: (Believe it or not, when I started typing this comment, the PR was still unmerged ;-) |
This is already discussed in #32534 (comment). |
Could I PR this patch to the 255 branch of the stable repo? So that Ubuntu 24.04, which ships on WSL with systemd enabled by default, could hopefully pick it up at some point? Or is that process automatic? |
I didn't notice that -- I believe that proves the strength of confusion it causes. The error is IMO fine since you can't make pidref to a not mapped PID. (Triple negative 🤯 ) BTW Can this situation be achieved (processes from inaccessible pidns in managed cgroups) anyhow but by the violation of the single writer rule? |
Feel free to open a PR. Please use |
In some environments, namely WSL, the cgroup.procs PID list for some reason contain a ton of zeros everywhere.
My suspicion is that those are from other instances under the same WSL Kernel, which at least always hosts the system instance with the X/Wayland/PA/Pipe server, so there is a bunch of zeros to be had.
Without this patch, whenever cg_read_pid encounters such a zero, it throws an error. This makes systemd near unusable inside of WSL.
Just skipping over any zeros in those lists makes systemd run without any issues for me.
On normal systems, where the list does not contain any zeros to begin with, this has no averse effects.
See also:
microsoft/WSL#8879