New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
regression: some rkt-tests fail #4752
Comments
Thanks for the heads-up! According to our nightly builds, rkt:
In that timeframe we didn't merge any relevant PR and v231 is still green. (yes, rkt configure should probably spit out the systemd commit in use instead of "master" :) |
👍
Hm, I was wrong. #4259 (3d474ef) introduced the regression $ git bisect log
git bisect start
# bad: [a748a0169a11ed0803fd5b1df6f8eb1c4af61803] Merge pull request #4736 from dobyrch/calendar-cleanup
git bisect bad a748a0169a11ed0803fd5b1df6f8eb1c4af61803
# good: [1a1b13c9573b8cd30a4ab8dca2ec7961e460f083] seccomp: add @filesystem syscall group (#4537)
git bisect good 1a1b13c9573b8cd30a4ab8dca2ec7961e460f083
# bad: [fadc06bb8166b7ee494ed90b054f083ac4db4e11] Merge pull request #4259 from joukewitteveen/notify
git bisect bad fadc06bb8166b7ee494ed90b054f083ac4db4e11
# good: [ef8b0084552e05f28b9132d5dfc75edae164a991] sd-dhcp-client: use free_and_strdup
git bisect good ef8b0084552e05f28b9132d5dfc75edae164a991
# good: [c5c755e1bc76165a635e7036b0f7888395d82ae7] Merge pull request #4693 from poettering/nspawn-ephemeral
git bisect good c5c755e1bc76165a635e7036b0f7888395d82ae7
# good: [3d4cf7de48a74726694abbaa09f9804b845ff3ba] build-sys: check for lz4 in the old and new numbering scheme (#4717)
git bisect good 3d4cf7de48a74726694abbaa09f9804b845ff3ba
# bad: [3d474ef7a687e2052aa303e0f95893b2fc610475] service: fix main processes exit behavior for type notify services
git bisect bad 3d474ef7a687e2052aa303e0f95893b2fc610475
# good: [c35755fb878af58b80ac62a501a75f79c90a3763] service: introduce protocol error type
git bisect good c35755fb878af58b80ac62a501a75f79c90a3763
# first bad commit: [3d474ef7a687e2052aa303e0f95893b2fc610475] service: fix main processes exit behavior for type notify services I'm trying to understand how to implement the "minimal" reproducer. I need help :-) |
BTW I don't see
And the |
Hm, Some logs:
I removed - `/bin/sh -c "export FILE=/dir1/file CONTENT=1 ; ^RKT_BIN^ --debug --insecure-options=image run --mds-register=false --volume=dir1,kind=host,source=^TMPDIR^ --inherit-env=true ^VOL_RW_WRITE_FILE_ONLY^ ^VOL_RO_READ_FILE_ONLY^ --exec /inspect -- --pre-sleep=1 --read-file"`,
+ `/bin/sh -c "export FILE=/dir1/file CONTENT=1 ; ^RKT_BIN^ --debug --insecure-options=image run --mds-register=false --volume=dir1,kind=host,source=^TMPDIR^ --inherit-env=true ^VOL_RW_WRITE_FILE_ONLY^ ^VOL_RO_READ_FILE_ONLY^ --exec /inspect -- --read-file"`,
`<<<1>>>`,
0,
}, Hm. I'm not sure I understand how does |
@lucab, why does I've prepared a patch: diff --git a/tests/image/manifest b/tests/image/manifest
index 75fd5d3..bc52c47 100644
--- a/tests/image/manifest
+++ b/tests/image/manifest
@@ -73,7 +73,7 @@
},
{
"name": "appc.io/executor/supports-systemd-notify",
- "value": "true"
+ "value": "false"
}
]
} All tests passed (against 3d474ef) :-) Anyway, that |
Also,
What does it mean? :-) |
3d474ef~1
Related: Race condition causing sd_notify messages to get dropped |
@evverx you are right, it doesn't send a Regarding the sleep race, it looks like the whole test is broken with type=notify but should be back to normality with your oneliner. Do you still see some failures without type=notify and with the sleep in place? |
@lucab , thanks! So, the regression is -bash4.3# systemctl cat exit --no-pager
# /etc/systemd/system/exit.service
[Service]
User=1001
Type=notify
NotifyAccess=all
ExecStart=/bin/sh -x -c 'systemd-notify --ready --pid=$$$$'
-bash4.3# while systemctl start exit; do :; done
See "systemctl status exit.service" and "journalctl -xe" for details. |
BTW this is a regression too: -bash-4.3# systemctl cat exit --no-pager
# /etc/systemd/system/exit.service
[Service]
Type=notify
ExecStart=/bin/sh -x -c 'sleep 20'
-bash-4.3# journalctl -u exit --no-pager
...
exit.service: Failed with result 'protocol'. The docs says nothing about 'protocol'
) Another manpage:
But it doesn't say "the daemon must send a notification message" So, why should we break all such services? |
When a notify service fails after not having received a READY notification, that is not a regression but improved behavior. Earlier, such units would silently succeed and ignore any StartPost commands (see the commit message of 3d474ef. This was wrong.
I couldn't find any better wording that was not too long (#4724). Service configurations come with their own set of expectations. When this expected behavior is not showcased by the unit, the unit fails with this new protocol error.
As noted before, these services were already broken, we just didn't fail on them. As I read it, 'it is expected' should be read as 'must' in the manpage.
Agreed. This is added to #4745. |
@joukewitteveen , thanks! I'm closing this one. |
I run https://github.com/coreos/rkt/tree/master/tests sometimes
1a1b13c - all tests passed
a748a01 (master)
Seems like #4693 introduced the regression. I'll try to bisect tomorrow.
cc @lucab, @s-urbaniak
The text was updated successfully, but these errors were encountered: