Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent fails when setting up namespace in TEST-29-PORTABLE #29690

Open
mrc0mmand opened this issue Oct 24, 2023 · 7 comments
Open

Intermittent fails when setting up namespace in TEST-29-PORTABLE #29690

mrc0mmand opened this issue Oct 24, 2023 · 7 comments
Labels
portable Anything to do with systemd-portable and portablectl and portables tests

Comments

@mrc0mmand
Copy link
Member

In some very recent CI runs (in both CentOS CI and Ubuntu CI) I noticed a couple of fails in TEST-29-PORTABLE which I haven't seen before. However, they seem to be very rare and I currently don't have any luck reproducing them locally.

The first one (from Arch) seem to be related to #28445 (or at least it touches the same paths):

$ journalctl -o short-monotonic --no-hostname --file system.journal SYSLOG_IDENTIFIER="(cat)" + UNIT=minimal-app0.service 
...
[   11.099164] (cat)[625]: Not unmounting /run/systemd/mount-rootfs/run/systemd/mount-rootfs/proc, outside of prefix: /run/systemd/mount-rootfs/run/credentials
[   11.099194] (cat)[625]: Not unmounting /run/systemd/mount-rootfs/run/systemd/mount-rootfs/proc/sys/fs/binfmt_misc, outside of prefix: /run/systemd/mount-rootfs/run/credentials
[   11.099218] (cat)[624]: Bind-mounting /run/systemd/inaccessible/dir on /run/systemd/mount-rootfs/run/credentials (MS_BIND|MS_REC "")...
[   11.099261] (cat)[624]: Successfully mounted /run/systemd/inaccessible/dir to /run/systemd/mount-rootfs/run/credentials
[   11.099299] (cat)[625]: Bind-mounting /run/systemd/inaccessible/dir on /run/systemd/mount-rootfs/run/credentials (MS_BIND|MS_REC "")...
[   11.099337] (cat)[624]: Applying namespace mount on /run/systemd/mount-rootfs/run/host/.os-release-stage/
[   11.099342] (cat)[625]: Successfully mounted /run/systemd/inaccessible/dir to /run/systemd/mount-rootfs/run/credentials
[   11.099414] (cat)[625]: Applying namespace mount on /run/systemd/mount-rootfs/run/host/.os-release-stage/
[   11.099414] (cat)[624]: Followed source symlinks /run/systemd/propagate/.os-release-stage → /run/systemd/propagate/.os-release-stage.
[   11.099444] (cat)[624]: Bind-mounting /run/systemd/propagate/.os-release-stage on /run/systemd/mount-rootfs/run/host/.os-release-stage/ (MS_BIND "")...
[   11.099482] (cat)[625]: Followed source symlinks /run/systemd/propagate/.os-release-stage → /run/systemd/propagate/.os-release-stage.
[   11.099487] (cat)[624]: Failed to mount /run/systemd/propagate/.os-release-stage (type n/a) on /run/systemd/mount-rootfs/run/host/.os-release-stage/ (MS_BIND ""): No such file or directory
[   11.099525] (cat)[625]: Bind-mounting /run/systemd/propagate/.os-release-stage on /run/systemd/mount-rootfs/run/host/.os-release-stage/ (MS_BIND "")...
[   11.099560] (cat)[625]: Failed to mount /run/systemd/propagate/.os-release-stage (type n/a) on /run/systemd/mount-rootfs/run/host/.os-release-stage/ (MS_BIND ""): No such file or directory
[   11.099568] (cat)[624]: Bind-mounting /run/systemd/propagate/.os-release-stage on /run/systemd/mount-rootfs/run/host/.os-release-stage/ (MS_BIND "")...
[   11.099598] (cat)[625]: Failed to mount /run/systemd/propagate/.os-release-stage to /run/systemd/mount-rootfs/run/host/.os-release-stage/: No such file or directory
[   11.099618] (cat)[624]: Successfully mounted /run/systemd/propagate/.os-release-stage to /run/systemd/mount-rootfs/run/host/.os-release-stage/
[   11.099676] (cat)[625]: Releasing crypt device /usr/share/minimal_0.verity context.
[   11.099696] (cat)[624]: Applying namespace mount on /run/systemd/mount-rootfs/run/systemd/incoming
[   11.099703] (cat)[625]: Releasing device-mapper backend.
[   11.099741] (cat)[625]: Closing read only fd for /usr/share/minimal_0.verity.
[   11.099761] (cat)[624]: Followed source symlinks /run/systemd/propagate/minimal-app0-foo.service → /run/systemd/propagate/minimal-app0-foo.service.
[   11.099770] (cat)[625]: Closed loop /dev/loop3 (/usr/share/minimal_0.verity).
[   11.099789] (cat)[624]: Bind-mounting /run/systemd/propagate/minimal-app0-foo.service on /run/systemd/mount-rootfs/run/systemd/incoming (MS_BIND "")...
[   11.099828] (cat)[624]: Successfully mounted /run/systemd/propagate/minimal-app0-foo.service to /run/systemd/mount-rootfs/run/systemd/incoming
[   11.099895] (cat)[624]: Too many messages being logged to kmsg, ignoring
[   11.104172] (cat)[625]: minimal-app0.service: Failed to set up mount namespacing: /run/host/.os-release-stage/: No such file or directory
[   11.028329] systemd[1]: minimal-app0.service: Child 625 belongs to minimal-app0.service.
[   11.028489] systemd[1]: minimal-app0.service: Control process exited, code=exited, status=226/NAMESPACE
[   11.028583] systemd[1]: minimal-app0.service: Got final SIGCHLD for state start-pre.
[   11.028929] systemd[1]: minimal-app0.service: Failed with result 'exit-code'.
[   11.029019] systemd[1]: minimal-app0.service: Service will not restart (restart setting)
[   11.029090] systemd[1]: minimal-app0.service: Changed start-pre -> failed
[   11.067245] systemd[1]: minimal-app0.service: Job 285 minimal-app0.service/start finished, result=failed
[   11.067455] systemd[1]: Failed to start minimal-app0.service.
[   11.072684] systemd[1]: minimal-app0.service: Unit entered failed state.
[   11.072766] systemd[1]: minimal-app0.service: Consumed 90ms CPU time.
[   11.080739] systemd[1]: minimal-app0.service: Releasing resources...

Journals:

The second one is from Ubuntu:

$ journalctl -o short-monotonic --no-hostname --file artifacts/TEST-29-PORTABLE.journal -u app0.service
...
[  172.683146] systemd[1]: Starting app0.service...
[  172.767690] (ript0.sh)[1357]: Failed to connect to nscd socket: No such file or directory
[  172.773125] (ript0.sh)[1357]: Failed to connect to nscd socket: No such file or directory
[  172.778645] systemd[1]: app0.service: User lookup succeeded: uid=63383 gid=63383
[  172.780791] (ript0.sh)[1357]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
[  172.790914] (ript0.sh)[1357]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
[  172.794779] (ript0.sh)[1357]: SELinux enabled state cached to: disabled
[  172.810378] (ript0.sh)[1357]: /tmp/rootdir/dev (bind-dev) is duplicate.
[  172.817001] (ript0.sh)[1357]: /tmp/rootdir/dev (read-write-implicit) is duplicate.
[  172.828328] (ript0.sh)[1357]: /tmp/rootdir/home (read-write-implicit) is duplicate.
[  172.830641] (ript0.sh)[1357]: /tmp/rootdir/proc (read-write-implicit) is duplicate.
[  172.831014] (ript0.sh)[1357]: /tmp/rootdir/root (read-write-implicit) is duplicate.
[  172.831351] (ript0.sh)[1357]: /tmp/rootdir/run/user (read-write-implicit) is duplicate.
[  172.831722] (ript0.sh)[1357]: /tmp/rootdir/sys (read-only) is duplicate.
[  172.832299] (ript0.sh)[1357]: /tmp/rootdir/sys (read-write-implicit) is duplicate.
[  172.832936] (ript0.sh)[1357]: /tmp/rootdir/sys/fs/cgroup (read-write-implicit) is duplicate.
[  172.833347] (ript0.sh)[1357]: /sys/fs/cgroup/system.slice/app0.service/memory.pressure is outside of root directory.
[  172.844630] (ript0.sh)[1357]: Applying namespace mount on /run/systemd/unit-extensions/0
[  172.845756] (ript0.sh)[1357]: Checking for /usr/lib/extension-release.d/extension-release.app0: Success
[  172.846370] (ript0.sh)[1357]: Checking for /usr/lib/extension-release.d/extension-release.app0: Success
[  172.863831] (ript0.sh)[1357]: Version info of extension 'app0' matches host.
[  172.867061] (ript0.sh)[1357]: Followed source symlinks /tmp/app0 → /tmp/app0.
[  172.868812] (ript0.sh)[1357]: Bind-mounting /tmp/app0 on /run/systemd/unit-extensions/0 (MS_BIND "")...
[  172.874850] (ript0.sh)[1357]: Successfully mounted /tmp/app0 to /run/systemd/unit-extensions/0
[  172.877428] (ript0.sh)[1357]: Applying namespace mount on /run/systemd/unit-extensions/1
[  172.881713] (ript0.sh)[1357]: Checking for /usr/lib/extension-release.d/extension-release.app1: No such file or directory
[  172.883426] (ript0.sh)[1357]: usr/lib/extension-release.d/extension-release.app2: 'user.extension-release.strict' attribute is false…
[  172.883804] (ript0.sh)[1357]: Checking for /usr/lib/extension-release.d/extension-release.app1: No such file or directory
[  172.884340] (ript0.sh)[1357]: usr/lib/extension-release.d/extension-release.app2: 'user.extension-release.strict' attribute is false…
[  172.884684] (ript0.sh)[1357]: Version info of extension 'app1' matches host.
[  172.885024] (ript0.sh)[1357]: Followed source symlinks /tmp/app1 → /tmp/app1.
[  172.885348] (ript0.sh)[1357]: Bind-mounting /tmp/app1 on /run/systemd/unit-extensions/1 (MS_BIND "")...
[  172.885666] (ript0.sh)[1357]: Failed to mount /tmp/app1 (type n/a) on /run/systemd/unit-extensions/1 (MS_BIND ""): No such file or directory
[  172.897880] (ript0.sh)[1357]: Failed to mount /tmp/app1 to /run/systemd/unit-extensions/1: No such file or directory
[  172.898658] (ript0.sh)[1357]: app0.service: Failed to set up mount namespacing: /run/systemd/unit-extensions/1: No such file or directory
[  172.960496] systemd[1]: app0.service: Child 1357 belongs to app0.service.
[  172.962349] systemd[1]: app0.service: Main process exited, code=exited, status=226/NAMESPACE
[  172.973274] systemd[1]: app0.service: Failed with result 'exit-code'.
[  172.973689] systemd[1]: app0.service: Service will not restart (restart setting)
[  172.974026] systemd[1]: app0.service: Changed start -> failed
[  172.975041] systemd[1]: app0.service: Job 1448 app0.service/start finished, result=failed
[  172.975378] systemd[1]: Failed to start app0.service.
[  172.998568] systemd[1]: app0.service: Unit entered failed state.
[  172.998911] systemd[1]: app0.service: Consumed 237ms CPU time.
[  173.007786] systemd[1]: app0.service: Releasing resources...

Journal: https://mrc0mmand.fedorapeople.org/TEST-29-PORTABLE_fails/ubuntu-unit-extensions.journal

FWIW, I know only about these three instances, the earliest being from 2023-10-20.

@mrc0mmand mrc0mmand added tests portable Anything to do with systemd-portable and portablectl and portables labels Oct 24, 2023
@bluca
Copy link
Member

bluca commented Nov 1, 2023

Looks like it happened in TEST-44-LOG_NAMESPACE:

Oct 31 20:45:36 H (echo)[51]: Bind-mounting /run/systemd/propagate/run-u0.service on /run/systemd/mount-rootfs/run/systemd/incoming (MS_BIND "")...
Oct 31 20:45:36 H (echo)[51]: Failed to mount /run/systemd/propagate/run-u0.service (type n/a) on /run/systemd/mount-rootfs/run/systemd/incoming (MS_BIND ""): No such file or directory

https://autopkgtest.ubuntu.com/results/autopkgtest-jammy-upstream-systemd-ci-systemd-ci/jammy/amd64/s/systemd-upstream/20231031_212620_c5cbd@/log.gz

The common theme being a namespaced bind mount being set up and failing. We really need better logs - what was not found, the source or the mountpoint?

@bluca bluca pinned this issue Nov 1, 2023
@yuwata
Copy link
Member

yuwata commented Nov 6, 2023

@bluca
Copy link
Member

bluca commented Apr 8, 2024

@mrc0mmand haven't seen this in a long time, is it still relevant?

@mrc0mmand
Copy link
Member Author

According to my archives the last occurrence is from 12th of March, so hard to say if it's really gone.

@bluca
Copy link
Member

bluca commented Apr 8, 2024

ok that's recent enough, must have missed it

@mrc0mmand
Copy link
Member Author

ok that's recent enough, must have missed it

I usually retrigger jobs that fail for PR-unrelated reasons soon after receiving an email from CentOS CI, so it might've never popped up anywhere for you to see :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
portable Anything to do with systemd-portable and portablectl and portables tests
Development

No branches or pull requests

3 participants