Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbus-broker-launch[9179]: ERROR listener_dispatch @ ../src/bus/listener.c +42: Bad file descriptor #330

Open
jstrausd opened this issue Dec 20, 2023 · 15 comments

Comments

@jstrausd
Copy link

jstrausd commented Dec 20, 2023

Hello, since a few days my server with AlmaLinux 9 randomly started making problems. After a reboot NetworkManager and systemd-login and d-bus can't start or logged some errors.

Dec 20 11:25:59 host systemd[1]: Starting D-Bus System Message Bus...
Dec 20 11:25:59 host systemd[1]: Started D-Bus System Message Bus.
Dec 20 11:25:59 host dbus-broker-launch[9179]: ERROR listener_dispatch @ ../src/bus/listener.c +42: Bad file descriptor
Dec 20 11:25:59 host dbus-broker-launch[9179]: dispatch_context_dispatch @ ../src/util/dispatch.c +344
Dec 20 11:25:59 host dbus-broker-launch[9179]: broker_run @ ../src/broker/broker.c +225
Dec 20 11:25:59 host dbus-broker-launch[9179]: run @ ../src/broker/main.c +261
Dec 20 11:25:59 host dbus-broker-launch[9179]: main @ ../src/broker/main.c +295
Dec 20 11:25:59 host dbus-broker-lau[9178]: Ready
Dec 20 11:25:59 host dbus-broker[9179]: Dispatched 0 messages @ 0(±0)μs / message.
Dec 20 11:25:59 host systemd[1]: NetworkManager.service: Unexpected error response from GetNameOwner(): Connection terminated
Dec 20 11:25:59 host systemd[1]: polkit.service: Unexpected error response from GetNameOwner(): Connection terminated
Dec 20 11:25:59 host systemd[1]: tuned.service: Unexpected error response from GetNameOwner(): Connection terminated
Dec 20 11:25:59 host dbus-broker-launch[9178]: Caught SIGCHLD of broker.

Is there any way to fix my problem with dbus/dbus-broker?
I tried several things I found online, but nothing seemed to help...

@dvdhrm
Copy link
Member

dvdhrm commented Dec 20, 2023

What version of dbus-broker are you running?

@jstrausd
Copy link
Author

jstrausd commented Dec 20, 2023

[root@host ~]# dbus-broker --v
dbus-broker 28

[root@host ~]# dbus-broker-launch --version
dbus-broker-launch 28

I saw the latest version is 35, how to properly update?

@jstrausd
Copy link
Author

image

When trying to build the newest version, 5 tests are failing with the same error as the installed version described above.

@dvdhrm
Copy link
Member

dvdhrm commented Dec 21, 2023

Apparently accept4(2) returns EBADF on the D-Bus listener socket. This sounds like either a wrong systemd-unit-setup or a bad kernel update. Does this happen with a clean AlmaLinux9 install?

@jstrausd
Copy link
Author

No the system is round about 2 month 24/7 in use. I also had the same problem 2 month ago and then had to reinstall. Now it's time to find the cause. It happend out of nowhere. So do you recommend update kernel?

@teg
Copy link
Contributor

teg commented Dec 21, 2023

No the system is round about 2 month 24/7 in use. I also had the same problem 2 month ago and then had to reinstall. Now it's time to find the cause. It happend out of nowhere. So do you recommend update kernel?

How long had the old system been running, also around two months?

Do you have any custom config of either systemd or dbus?

@jstrausd
Copy link
Author

Yes, it's about the same duration the server lasted. I just use DirectAdmin on it for Webhosting and Docker, with no custom configuration of the system itself.

@dvdhrm
Copy link
Member

dvdhrm commented Jan 10, 2024

This is really strange, especially given that you are the only one hitting this so far. I really suspect that some other component is at fault here, but that is always easy to say.

The thing is, the error indicates that the socket that is created by systemd and then passed in as D-Bus listener socket to dbus-broker, somehow is not valid. And this is the early startup code of dbus-broker, so not much can go wrong before this happens. And even then, it sounds highly unlikely that dbus-broker is at fault if a faulty socket is passed in. But cannot be sure.

I would expect this error to go away with a clean install of an up-to-date Fedora/Alma/..., but you are saying it does not, right?

@DimeOne
Copy link

DimeOne commented Jan 18, 2024

I'm having the same issue on Arch linux after upgrade, but only on my Cloud vserver.

$ dbus-broker --v
dbus-broker 35

$ systemctl status dbus
● dbus-broker.service - D-Bus System Message Bus
Loaded: loaded (/usr/lib/systemd/system/dbus-broker.service; static)
Active: active (running) since Thu 2024-01-18 20:59:33 CET; 50s ago
TriggeredBy: ● dbus.socket
Docs: man:dbus-broker-launch(1)
Main PID: 1513 (dbus-broker-lau)
Tasks: 1 (limit: 19176)
Memory: 996.0K (peak: 3.0M)
CPU: 30ms
CGroup: /system.slice/dbus-broker.service
└─1513 /usr/bin/dbus-broker-launch --scope system --audit

Jan 18 20:59:33 vs0101 systemd[1]: Starting D-Bus System Message Bus...
Jan 18 20:59:33 vs0101 systemd[1]: Started D-Bus System Message Bus.
Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: ERROR listener_dispatch @ ../dbus-broker-35/src/bus/listener.c +42: Bad file descriptor
Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: dispatch_context_dispatch @ ../dbus-broker-35/src/util/dispatch.c +344
Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: broker_run @ ../dbus-broker-35/src/broker/broker.c +229
Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: run @ ../dbus-broker-35/src/broker/main.c +261
Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: main @ ../dbus-broker-35/src/broker/main.c +295
Jan 18 20:59:33 vs0101 dbus-broker[1515]: Dispatched 0 messages @ 0(±0)μs / message.
Jan 18 20:59:33 vs0101 dbus-broker-launch[1513]: Ready
Jan 18 20:59:33 vs0101 dbus-broker-launch[1513]: Caught SIGCHLD of broker.

$ journalctl --since 18:00 | grep Unexpected

Jan 18 20:59:33 vs0101 systemd[1]: systemd-logind.service: Unexpected error response from GetNameOwner(): Connection terminated
Jan 18 20:59:33 vs0101 systemd[1]: systemd-networkd.service: Unexpected error response from GetNameOwner(): Connection terminated
Jan 18 20:59:33 vs0101 systemd[1]: firewalld.service: Unexpected error response from GetNameOwner(): Connection terminated
Jan 18 20:59:33 vs0101 systemd[1]: polkit.service: Unexpected error response from GetNameOwner(): Connection terminated
Jan 18 20:59:33 vs0101 systemd[1]: NetworkManager.service: Unexpected error response from GetNameOwner(): Connection terminated
Jan 18 20:59:33 vs0101 systemd[1]: systemd-resolved.service: Unexpected error response from GetNameOwner(): Connection terminated
Jan 18 20:59:33 vs0101 systemd[1]: systemd-logind.service: Unexpected error response from GetNameOwner(): Connection terminated
Jan 18 20:59:33 vs0101 systemd[1]: systemd-networkd.service: Unexpected error response from GetNameOwner(): Connection terminated
Jan 18 20:59:33 vs0101 systemd[1]: firewalld.service: Unexpected error response from GetNameOwner(): Connection terminated

$ pacman -Qikk dbus-broker dbus-broker-units

Name : dbus-broker
Version : 35-2
Description : Linux D-Bus Message Broker
Architecture : x86_64
URL : https://github.com/bus1/dbus-broker/wiki
Licenses : Apache-2.0
Groups : None
Provides : None
Depends On : audit libcap-ng expat systemd-libs libaudit.so=1-64 libcap-ng.so=0-64 libexpat.so=1-64 libsystemd.so=0-64
Optional Deps : None
Required By : dbus-broker-units
Optional For : None
Conflicts With : None
Replaces : None
Installed Size : 365.64 KiB
Packager : Jan Alexander Steffens (heftig) heftig@archlinux.org
Build Date : Fri 05 Jan 2024 03:45:05 PM CET
Install Date : Tue 16 Jan 2024 07:18:16 PM CET
Install Reason : Installed as a dependency for another package
Install Script : No
Validated By : Signature

dbus-broker: 18 total files, 0 altered files
Name : dbus-broker-units
Version : 35-2
Description : Linux D-Bus Message Broker - Service units
Architecture : x86_64
URL : https://github.com/bus1/dbus-broker/wiki
Licenses : Apache-2.0
Groups : None
Provides : dbus-units
Depends On : dbus-broker
Optional Deps : None
Required By : systemd
Optional For : None
Conflicts With : dbus-daemon-units
Replaces : None
Installed Size : 0.00 B
Packager : Jan Alexander Steffens (heftig) heftig@archlinux.org
Build Date : Fri 05 Jan 2024 03:45:05 PM CET
Install Date : Tue 16 Jan 2024 07:18:16 PM CET
Install Reason : Installed as a dependency for another package
Install Script : No
Validated By : Signature

dbus-broker-units: 7 total files, 0 altered files

@dvdhrm
Copy link
Member

dvdhrm commented Jan 23, 2024

@jstrausd @DimeOne I am trying to trace this down. On an affected machine, can you clone the dbus-broker repository and run the test-suite? And, if it shows the mentioned error, can you post the meson error logs? And lastly, if possible, can you run one of the affected tests via strace -f ./<build>/test/dbus/<some-test>?

Something like this:

git clone https://github.com/bus1/dbus-broker.git
cd dbus-broker
make meson-setup
make meson-build
make meson-test

cd ./build/meson
./test/dbus/test-matches
strace -f ./test/dbus/test-matches

(You might have to install strace on your machine first. This should be packaged on almost every distribution.)

@rafaelwastaken
Copy link

Also experiencing this on an Arch VM running on Hetzner Cloud. I ran the tests above on the affected VM, here's output from both the test and strace: https://gist.github.com/rafaelwastaken/2582a88888330243a5ea25207ae5fe48

@dvdhrm
Copy link
Member

dvdhrm commented Mar 5, 2024

Thanks a lot for submitting the information! Unfortunately, I picked the wrong directory to run the test from (it runs dbus-broker from a relative path). Can you try this instead (I also edited the previous message):

cd ./build/meson
./test/dbus/test-matches
strace -f ./test/dbus/test-matches

@rafaelwastaken
Copy link

Sorry for the delayed response. Re-ran the new commands, output here: https://gist.github.com/rafaelwastaken/7d1e097aa56e15a8632c0e05957378a9

@dvdhrm
Copy link
Member

dvdhrm commented Mar 20, 2024

Thanks a lot for the trace! I think the relevant piece is this:

[pid 475832] accept4(8, NULL, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK <unfinished ...>
[pid 475832] <... accept4 resumed>)     = -1 EAGAIN (Resource temporarily unavailable)
[pid 475832] getsockopt(-1, SOL_SOCKET, SO_TYPE,  <unfinished ...>
[pid 475832] <... getsockopt resumed>0x7ffda8d59b70, [4]) = -1 EBADF (Bad file descriptor)
[pid 475832] write(2, "ERROR listener_dispatch @ ../../"..., 76 <unfinished ...>
[pid 475832] <... write resumed>)       = 76

If we sanitize this, it becomes:

accept4(8, NULL, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK)     = -1 EAGAIN (Resource temporarily unavailable)
getsockopt(-1, SOL_SOCKET, SO_TYPE, 0x7ffda8d59b70, [4]) = -1 EBADF (Bad file descriptor)
write(2, "ERROR listener_dispatch @ ../../"..., 76)       = 76

I have no clue where this getsockopt() call comes from. I also don't know why it completely confuses the error-handler to pretend it is called from listener_dispatch().

A few followup questions:

  1. Is this reproducible when compiling without optimizations? That is, using --buildtype plain (or debug) instead of --buildtype debugoptimized? Is this reproducible when compiling with clang instead of gcc?

  2. Do you run any non-standard system? Things like non-glibc, or non-systemd, or non-upstream-kernel?

  3. Do you run any non-standard intrusive utilities that use LD_PRELOAD or similar to inject code? Or nss-modules?

@dvdhrm
Copy link
Member

dvdhrm commented Mar 20, 2024

There is a lot of suspicious activity in that strace. There is access on /tmp/.apid, there is an existing /etc/ld.so.preload, there is a load of /lib/libgcwrap.so.

Can you look into /etc/ld.so.preload and check whether it is correct? Can you check which software installed /lib/libgcwrap.so?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants