New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Log output can be lost from services using log namespaces that only produce output immediately before exiting #32604
Comments
richardmaw-codethink
added
the
bug 🐛
Programming errors, that need preferential fixing
label
May 1, 2024
I think this is subtly different from #32539 since it's not that the log message is currently in-flight, it's that systemd never gives the journal the chance to see it. |
yuwata
added a commit
to yuwata/systemd
that referenced
this issue
May 1, 2024
…nits Otherwise, if a service unit that requests LogNamespace= stopped before systemd-journald@.service is started, logs generated by the service will be lost, as systemd-journald@.socket is stopped and systemd-journald@.service will never started. To prevernt the issue, let's introduce another implicit dependency to a oneshot service that explicity synchronizes a namespaced journal file when the log namespace is not needed anymore. Fixes systemd#32604.
yuwata
added a commit
to yuwata/systemd
that referenced
this issue
May 1, 2024
…nits Otherwise, if a service unit that requests LogNamespace= stopped before systemd-journald@.service is started, logs generated by the service will be lost, as systemd-journald@.socket is stopped and systemd-journald@.service will never started. To prevent the issue, let's introduce another implicit dependency to a oneshot service that explicitly synchronizes a namespaced journal file when the log namespace is not needed anymore. Fixes systemd#32604.
@richardmaw-codethink Could you test #32619? |
bluca
pushed a commit
that referenced
this issue
May 2, 2024
…nits Otherwise, if a service unit that requests LogNamespace= stopped before systemd-journald@.service is started, logs generated by the service will be lost, as systemd-journald@.socket is stopped and systemd-journald@.service will never started. To prevent the issue, let's introduce another implicit dependency to a oneshot service that explicitly synchronizes a namespaced journal file when the log namespace is not needed anymore. Fixes #32604.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
systemd version the issue has been seen with
256~rc1-gf19076e^
Used distribution
Fedora 39
Linux kernel version used
kernel-6.8.7-200.fc39.x86_64
CPU architectures issue was seen on
x86_64
Component
systemd, systemd-journald
Expected behaviour you didn't see
Unexpected behaviour you saw
Steps to reproduce the problem
Run the TEST-44-LOG-NAMESPACE test with the log level set to debug on moderately slow hardware a few times, or get unlucky in CI.
Additional program output to the terminal or log subsystem illustrating the issue
See https://gist.github.com/richardmaw-codethink/05cce7763ad30a6009053961359282c0 for full logs.
I think the relevant logs are:
The
systemd-journald@foobaz.socket
"Unit is not needed anymore" is becausesystemd-journald@.service
hasStopWhenUnneeded=yes
and beforesystemd-journald@foobaz.service
is started the only service withRequires=systemd-journald@foobaz.socket
isrun-u9.service
There isn't an accompanying
Apr 30 18:32:40 H systemd[1]: Received SIGCHLD from PID 596 ((echo)).
,for this exit, the closest is
Apr 30 18:32:39 H systemd[1]: Received SIGCHLD from PID 581 (systemd-machine).
which is fromsystemd-machine-id-commit.service
, which is still running because there's no dependency on after firstboot.This may imply something subtle with handling multiple
SIGCHLD
events in calls tomanager_dispatch_sigchld
whenmanager_dispatch_signal_fd
returns aSIGCHLD
event,but I suspect it's just that systemd was kept too busy to handle the incoming IO on the journald stdio socket before
run-u9.service
exited, and since SIGCHLD has priority over incoming IO the socket is shut down instead of started.It's possible that explicitly starting journald for namespaces (rather than leaving it to socket activation) would fix this, but I'm not familiar enough with the code to say whether journald could still exit for being idle if there's a long running service that only produces output immediately before exiting.
The text was updated successfully, but these errors were encountered: