Segfault on thread destruction with Wayland but not X11. #2536

Open
pascal-boeschoten-hapteon opened this issue Apr 12, 2024 · 7 comments
Labels: cannot reproduce (bugs that have failed verification), Wayland

Comments

@pascal-boeschoten-hapteon

This program segfaults when run under Wayland, but runs fine under X11.
The segfault occurs while the std::jthread is being destroyed.

#include <cassert>
#include <thread>
#include <GLFW/glfw3.h>

int main() {
  std::jthread gui_thread{[] {
    assert(glfwInit() == GLFW_TRUE);
    auto window = glfwCreateWindow(1280, 720, "title", nullptr, nullptr);
    assert(window);
    glfwDestroyWindow(window); // NOTE: Not necessary to reproduce.
    glfwTerminate();
    assert(glfwGetError(nullptr) == 0);
  }};
  assert(glfwGetError(nullptr) == 0);
  return 0;
}

OS and version: Ubuntu 22.04.4 LTS
Release or commit: 228e58262e18f2ee61799bd86d0be718b1e31f9f

Error messages:

Segmentation fault (core dumped)

Call stack:

Thread 1
#0  __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=74121, futex_word=0x7ffff4dc6910) at ./nptl/futex-internal.c:57
#1  __futex_abstimed_wait_common (cancel=true, private=128, abstime=0x0, clockid=0, expected=74121, futex_word=0x7ffff4dc6910) at ./nptl/futex-internal.c:87
#2  __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7ffff4dc6910, expected=74121, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=128)
    at ./nptl/futex-internal.c:139
#3  0x00007ffff5a96624 in __pthread_clockjoin_ex (threadid=140737301472832, thread_return=0x0, clockid=0, abstime=0x0, block=<optimised out>) at ./nptl/pthread_join_common.c:105
#4  0x00007ffff56e8567 in __gthread_join (__value_ptr=0x0, __threadid=<optimised out>)
    at [...]/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:669
#5  std::thread::join (this=0x7fffffffdd90) at [...]/libstdc++-v3/src/c++11/thread.cc:134
#6  0x00005555555578b9 in std::jthread::join (this=0x7fffffffdd88) at [...]/gcc/x86_64-pc-linux-gnu/13.2.0/../../../../include/c++/13.2.0/thread:191
#7  0x0000555555557818 in std::jthread::~jthread (this=0x7fffffffdd88) at [...]/gcc/x86_64-pc-linux-gnu/13.2.0/../../../../include/c++/13.2.0/thread:161
#8  0x0000555555557317 in main () at [...]/main.cpp:22

Thread 2
#0  0x00007ffff4392d30 in ?? ()
#1  0x00007ffff5a91691 in __GI___nptl_deallocate_tsd () at ./nptl/nptl_deallocate_tsd.c:73
#2  __GI___nptl_deallocate_tsd () at ./nptl/nptl_deallocate_tsd.c:22
#3  0x00007ffff5a9494a in start_thread (arg=<optimised out>) at ./nptl/pthread_create.c:453
#4  0x00007ffff5b26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

@elmindreda added the Wayland and cannot reproduce labels Apr 12, 2024

@elmindreda
Member

Unable to reproduce on Ubuntu 23.10. The program runs without error. I also don't see GLFW in the call stack of either thread above. This may be a bug elsewhere. Did you build the program with debug information?

@elmindreda added the waiting for reply label Apr 12, 2024

@pascal-boeschoten-hapteon
Author

Yes, it's built with debug symbols. I did some digging around; hopefully this will be useful.

It crashes in ./nptl/nptl_deallocate_tsd.c:73, which is this bit: https://github.com/bminor/glibc/blob/glibc-2.35/nptl/nptl_deallocate_tsd.c#L73.
I think it cleans up thread-specific data at thread exit by calling the destructors stored in __pthread_keys.
Those destructors were registered in ./nptl/pthread_key_create.c:37: https://github.com/bminor/glibc/blob/glibc-2.35/nptl/pthread_key_create.c#L37.
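
For illustration, here is a minimal sketch of the mechanism described above: a key created with pthread_key_create and a destructor, which the threading library calls at thread exit for every thread that stored a non-null value under that key. The names on_thread_exit and run_tsd_demo are made up for this example and do not come from glibc, GLFW or EGL.

```cpp
#include <pthread.h>

#include <atomic>
#include <cassert>
#include <thread>

// Counts how many times the key destructor has been invoked.
static std::atomic<int> g_destructor_calls{0};

// Key destructor: called at thread exit with the thread's stored value,
// but only if that value is non-null.
static void on_thread_exit(void* value) {
    delete static_cast<int*>(value);
    g_destructor_calls.fetch_add(1);
}

// Creates a key with a destructor, stores a value on a worker thread,
// joins the worker, and returns how often the destructor ran (-1 on error).
int run_tsd_demo() {
    g_destructor_calls = 0;

    pthread_key_t key;
    if (pthread_key_create(&key, on_thread_exit) != 0) {
        return -1;
    }

    std::thread worker([key] {
        int* value = new int(42);
        int rc = pthread_setspecific(key, value);
        assert(rc == 0);
        (void)rc;  // silence unused-variable warnings when NDEBUG is set
        // No explicit cleanup here: the destructor runs when the thread exits.
    });
    worker.join();

    // Deleting the key does not invoke destructors; it only frees the slot.
    pthread_key_delete(key);
    return g_destructor_calls.load();
}
```

Calling run_tsd_demo() from main should report one destructor call. A key created with a null destructor, like the first two entries in the backtraces below (destr=0x0), has nothing to call at thread exit.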

If I run the program and break on ./nptl/pthread_key_create.c:37 it stops there 4 times, producing these backtraces:

1st:

#0  ___pthread_key_create (key=0x5555555e8320 <_glfw+133408>, destr=0x0) at ./nptl/pthread_key_create.c:37
#1  0x0000555555570193 in _glfwPlatformCreateTls (tls=0x5555555e831c <_glfw+133404>) at [...]/glfw/src/posix_thread.c:44
#2  0x0000555555563121 in glfwInit () at [...]/glfw/src/init.c:412
#3  0x0000555555560bbb in operator() (__closure=0x5555555fd2d8) at [...]/main.cpp:13
#4  0x00005555555611b8 in std::__invoke_impl<void, main()::<lambda()> >(std::__invoke_other, struct {...} &&) (__f=...) at /usr/include/c++/13/bits/invoke.h:61
#5  0x000055555556117b in std::__invoke<main()::<lambda()> >(struct {...} &&) (__fn=...) at /usr/include/c++/13/bits/invoke.h:96
#6  0x0000555555561128 in std::thread::_Invoker<std::tuple<main()::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5555555fd2d8) at /usr/include/c++/13/bits/std_thread.h:292
#7  0x00005555555610fc in std::thread::_Invoker<std::tuple<main()::<lambda()> > >::operator()(void) (this=0x5555555fd2d8) at /usr/include/c++/13/bits/std_thread.h:299
#8  0x00005555555610e0 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<main()::<lambda()> > > >::_M_run(void) (this=0x5555555fd2d0) at /usr/include/c++/13/bits/std_thread.h:244
#9  0x00007ffff7ce62b3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007ffff7894ac3 in start_thread (arg=<optimised out>) at ./nptl/pthread_create.c:442
#11 0x00007ffff7926850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

2nd:

#0  ___pthread_key_create (key=0x5555555e8328 <_glfw+133416>, destr=0x0) at ./nptl/pthread_key_create.c:37
#1  0x0000555555570193 in _glfwPlatformCreateTls (tls=0x5555555e8324 <_glfw+133412>) at [...]/glfw/src/posix_thread.c:44
#2  0x000055555556313b in glfwInit () at [...]/glfw/src/init.c:413
#3  0x0000555555560bbb in operator() (__closure=0x5555555fd2d8) at [...]/main.cpp:13
#4  0x00005555555611b8 in std::__invoke_impl<void, main()::<lambda()> >(std::__invoke_other, struct {...} &&) (__f=...) at /usr/include/c++/13/bits/invoke.h:61
#5  0x000055555556117b in std::__invoke<main()::<lambda()> >(struct {...} &&) (__fn=...) at /usr/include/c++/13/bits/invoke.h:96
#6  0x0000555555561128 in std::thread::_Invoker<std::tuple<main()::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5555555fd2d8) at /usr/include/c++/13/bits/std_thread.h:292
#7  0x00005555555610fc in std::thread::_Invoker<std::tuple<main()::<lambda()> > >::operator()(void) (this=0x5555555fd2d8) at /usr/include/c++/13/bits/std_thread.h:299
#8  0x00005555555610e0 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<main()::<lambda()> > > >::_M_run(void) (this=0x5555555fd2d0) at /usr/include/c++/13/bits/std_thread.h:244
#9  0x00007ffff7ce62b3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007ffff7894ac3 in start_thread (arg=<optimised out>) at ./nptl/pthread_create.c:442
#11 0x00007ffff7926850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

3rd:

#0  ___pthread_key_create (key=0x7ffff6eff170, destr=0x7ffff6e88d90) at ./nptl/pthread_key_create.c:37
#1  0x00007ffff6e89594 in __glDispatchInit () from /lib/x86_64-linux-gnu/libGLdispatch.so.0
#2  0x00007ffff7ad4da3 in ?? () from /lib/x86_64-linux-gnu/libEGL.so.1
#3  0x00007ffff7fc947e in call_init (l=<optimised out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdf08, env=env@entry=0x7fffffffdf18) at ./elf/dl-init.c:70
#4  0x00007ffff7fc9568 in call_init (env=0x7fffffffdf18, argv=0x7fffffffdf08, argc=1, l=<optimised out>) at ./elf/dl-init.c:33
#5  _dl_init (main_map=0x7ffff003dc00, argc=1, argv=0x7fffffffdf08, env=0x7fffffffdf18) at ./elf/dl-init.c:117
#6  0x00007ffff7974af5 in __GI__dl_catch_exception (exception=<optimised out>, operate=<optimised out>, args=<optimised out>) at ./elf/dl-error-skeleton.c:182
#7  0x00007ffff7fd0ff6 in dl_open_worker (a=0x7ffff77fe4d0) at ./elf/dl-open.c:808
#8  dl_open_worker (a=a@entry=0x7ffff77fe4d0) at ./elf/dl-open.c:771
#9  0x00007ffff7974a98 in __GI__dl_catch_exception (exception=<optimised out>, operate=<optimised out>, args=<optimised out>) at ./elf/dl-error-skeleton.c:208
#10 0x00007ffff7fd134e in _dl_open (file=<optimised out>, mode=-2147483647, caller_dlopen=0x55555556fff3 <_glfwPlatformLoadModule+33>, nsid=-2, argc=1, argv=<optimised out>, env=0x7fffffffdf18)
    at ./elf/dl-open.c:883
#11 0x00007ffff789063c in dlopen_doit (a=a@entry=0x7ffff77fe740) at ./dlfcn/dlopen.c:56
#12 0x00007ffff7974a98 in __GI__dl_catch_exception (exception=exception@entry=0x7ffff77fe6a0, operate=<optimised out>, args=<optimised out>) at ./elf/dl-error-skeleton.c:208
#13 0x00007ffff7974b63 in __GI__dl_catch_error (objname=0x7ffff77fe6f8, errstring=0x7ffff77fe700, mallocedp=0x7ffff77fe6f7, operate=<optimised out>, args=<optimised out>) at ./elf/dl-error-skeleton.c:227
#14 0x00007ffff789012e in _dlerror_run (operate=operate@entry=0x7ffff78905e0 <dlopen_doit>, args=args@entry=0x7ffff77fe740) at ./dlfcn/dlerror.c:138
#15 0x00007ffff78906c8 in dlopen_implementation (dl_caller=<optimised out>, mode=<optimised out>, file=<optimised out>) at ./dlfcn/dlopen.c:71
#16 ___dlopen (file=<optimised out>, mode=<optimised out>) at ./dlfcn/dlopen.c:81
#17 0x000055555556fff3 in _glfwPlatformLoadModule (path=0x5555555b867a "libEGL.so.1") at [...]/glfw/src/posix_module.c:39
#18 0x0000555555591ad0 in _glfwInitEGL () at [...]/glfw/src/egl_context.c:388
#19 0x000055555558c12d in _glfwCreateWindowWayland (window=0x7ffff0022fd0, wndconfig=0x7ffff77fe8f0, ctxconfig=0x7ffff77fe860, fbconfig=0x7ffff77fe8a0)
    at [...]/glfw/src/wl_window.c:2149
#20 0x000055555556b0fc in glfwCreateWindow (width=1280, height=720, title=0x55555559508d "title", monitor=0x0, share=0x0) at [...]/glfw/src/window.c:247
#21 0x0000555555560c0c in operator() (__closure=0x5555555fd2d8) at [...]/main.cpp:14
#22 0x00005555555611b8 in std::__invoke_impl<void, main()::<lambda()> >(std::__invoke_other, struct {...} &&) (__f=...) at /usr/include/c++/13/bits/invoke.h:61
#23 0x000055555556117b in std::__invoke<main()::<lambda()> >(struct {...} &&) (__fn=...) at /usr/include/c++/13/bits/invoke.h:96
#24 0x0000555555561128 in std::thread::_Invoker<std::tuple<main()::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5555555fd2d8) at /usr/include/c++/13/bits/std_thread.h:292
#25 0x00005555555610fc in std::thread::_Invoker<std::tuple<main()::<lambda()> > >::operator()(void) (this=0x5555555fd2d8) at /usr/include/c++/13/bits/std_thread.h:299
#26 0x00005555555610e0 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<main()::<lambda()> > > >::_M_run(void) (this=0x5555555fd2d0) at /usr/include/c++/13/bits/std_thread.h:244
#27 0x00007ffff7ce62b3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#28 0x00007ffff7894ac3 in start_thread (arg=<optimised out>) at ./nptl/pthread_create.c:442
#29 0x00007ffff7926850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

4th:

#0  ___pthread_key_create (key=0x7ffff7ae3568, destr=0x7ffff7ad5d30) at ./nptl/pthread_key_create.c:37
#1  0x00007ffff7ad538a in ?? () from /lib/x86_64-linux-gnu/libEGL.so.1
#2  0x00007ffff7fc947e in call_init (l=<optimised out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdf08, env=env@entry=0x7fffffffdf18) at ./elf/dl-init.c:70
#3  0x00007ffff7fc9568 in call_init (env=0x7fffffffdf18, argv=0x7fffffffdf08, argc=1, l=<optimised out>) at ./elf/dl-init.c:33
#4  _dl_init (main_map=0x7ffff003dc00, argc=1, argv=0x7fffffffdf08, env=0x7fffffffdf18) at ./elf/dl-init.c:117
#5  0x00007ffff7974af5 in __GI__dl_catch_exception (exception=<optimised out>, operate=<optimised out>, args=<optimised out>) at ./elf/dl-error-skeleton.c:182
#6  0x00007ffff7fd0ff6 in dl_open_worker (a=0x7ffff77fe4d0) at ./elf/dl-open.c:808
#7  dl_open_worker (a=a@entry=0x7ffff77fe4d0) at ./elf/dl-open.c:771
#8  0x00007ffff7974a98 in __GI__dl_catch_exception (exception=<optimised out>, operate=<optimised out>, args=<optimised out>) at ./elf/dl-error-skeleton.c:208
#9  0x00007ffff7fd134e in _dl_open (file=<optimised out>, mode=-2147483647, caller_dlopen=0x55555556fff3 <_glfwPlatformLoadModule+33>, nsid=-2, argc=1, argv=<optimised out>, env=0x7fffffffdf18)
    at ./elf/dl-open.c:883
#10 0x00007ffff789063c in dlopen_doit (a=a@entry=0x7ffff77fe740) at ./dlfcn/dlopen.c:56
#11 0x00007ffff7974a98 in __GI__dl_catch_exception (exception=exception@entry=0x7ffff77fe6a0, operate=<optimised out>, args=<optimised out>) at ./elf/dl-error-skeleton.c:208
#12 0x00007ffff7974b63 in __GI__dl_catch_error (objname=0x7ffff77fe6f8, errstring=0x7ffff77fe700, mallocedp=0x7ffff77fe6f7, operate=<optimised out>, args=<optimised out>) at ./elf/dl-error-skeleton.c:227
#13 0x00007ffff789012e in _dlerror_run (operate=operate@entry=0x7ffff78905e0 <dlopen_doit>, args=args@entry=0x7ffff77fe740) at ./dlfcn/dlerror.c:138
#14 0x00007ffff78906c8 in dlopen_implementation (dl_caller=<optimised out>, mode=<optimised out>, file=<optimised out>) at ./dlfcn/dlopen.c:71
#15 ___dlopen (file=<optimised out>, mode=<optimised out>) at ./dlfcn/dlopen.c:81
#16 0x000055555556fff3 in _glfwPlatformLoadModule (path=0x5555555b867a "libEGL.so.1") at [...]/glfw/src/posix_module.c:39
#17 0x0000555555591ad0 in _glfwInitEGL () at [...]/glfw/src/egl_context.c:388
#18 0x000055555558c12d in _glfwCreateWindowWayland (window=0x7ffff0022fd0, wndconfig=0x7ffff77fe8f0, ctxconfig=0x7ffff77fe860, fbconfig=0x7ffff77fe8a0)
    at [...]/glfw/src/wl_window.c:2149
#19 0x000055555556b0fc in glfwCreateWindow (width=1280, height=720, title=0x55555559508d "title", monitor=0x0, share=0x0) at [...]/glfw/src/window.c:247
#20 0x0000555555560c0c in operator() (__closure=0x5555555fd2d8) at [...]/main.cpp:14
#21 0x00005555555611b8 in std::__invoke_impl<void, main()::<lambda()> >(std::__invoke_other, struct {...} &&) (__f=...) at /usr/include/c++/13/bits/invoke.h:61
#22 0x000055555556117b in std::__invoke<main()::<lambda()> >(struct {...} &&) (__fn=...) at /usr/include/c++/13/bits/invoke.h:96
#23 0x0000555555561128 in std::thread::_Invoker<std::tuple<main()::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5555555fd2d8) at /usr/include/c++/13/bits/std_thread.h:292
#24 0x00005555555610fc in std::thread::_Invoker<std::tuple<main()::<lambda()> > >::operator()(void) (this=0x5555555fd2d8) at /usr/include/c++/13/bits/std_thread.h:299
#25 0x00005555555610e0 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<main()::<lambda()> > > >::_M_run(void) (this=0x5555555fd2d0) at /usr/include/c++/13/bits/std_thread.h:244
#26 0x00007ffff7ce62b3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#27 0x00007ffff7894ac3 in start_thread (arg=<optimised out>) at ./nptl/pthread_create.c:442
#28 0x00007ffff7926850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

All these stacks involve glfw.
After this, gdb continues until it hits the segfault at ./nptl/nptl_deallocate_tsd.c:73, after which printing __pthread_keys outputs:

(gdb) print __pthread_keys
$1 = {{seq = 2, destr = 0x0}, {seq = 2, destr = 0x0}, {seq = 2, destr = 0x7ffff6e88d90}, {seq = 1, destr = 0x7ffff7ad5d30}, {seq = 0, destr = 0x0} <repeats 1020 times>}

The first four entries correspond to the 4 backtraces, so it looks like these were all created through glfw.

@elmindreda
Member

elmindreda commented Apr 12, 2024

Two of them are created by GLFW, one by EGL and one by libGLdispatch. This is normal behavior on any modern Linux system and typically doesn't segfault. Edit: GLFW also doesn't register a destructor for any of its keys.

@elmindreda removed the waiting for reply label Apr 12, 2024

@elmindreda
Member

Hmm, see if removing the unloading of the EGL and GL libraries from GLFW resolves this. Remove all the calls to _glfwPlatformFreeModule in egl_context.c.

@pascal-boeschoten-hapteon
Author

After removing those (at lines 334 and 551 of egl_context.c) I no longer get the segfault.
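
One possible reading of this result, not confirmed in this thread: if a library that registered a key destructor is unloaded while the key is still in use, the stored destr pointer would refer to unmapped code by the time the thread exits. As an application-side experiment that avoids patching GLFW, one could hold an extra reference to the library so that a later unload has no effect. The sketch below assumes glibc's dlopen with the RTLD_NODELETE flag; pin_library is a hypothetical helper name, not a GLFW or EGL API.

```cpp
#include <dlfcn.h>

// Hypothetical helper: loads (or re-references) a shared library and asks the
// dynamic loader never to unload it, so a later dlclose() elsewhere in the
// process leaves its code mapped. Returns false if the library is not found.
bool pin_library(const char* soname) {
    // RTLD_NODELETE keeps the object resident even after dlclose().
    void* handle = dlopen(soname, RTLD_LAZY | RTLD_NODELETE);

    // The handle is intentionally never closed: the point is to keep the
    // library, and any destructor addresses it registered, valid until exit.
    return handle != nullptr;
}
```

The experiment here would be to call pin_library("libEGL.so.1") before glfwInit() and check whether the segfault still occurs. On older glibc versions the program may also need to link with -ldl.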

@pascal-boeschoten-hapteon
Author

Does that indicate an issue with the graphics driver?

@mcgrew

mcgrew commented Apr 21, 2024

I came across a similar issue in FamiStudio (a C# application) which uses GLFW. I tracked it down to a NULL pointer in XCreateFontCursor in my case.

Everything works fine with GLFW 3.3, but under GLFW 3.4, dpy is NULL when XCreateFontCursor is called, which crashes the application: https://github.com/mirror/libX11/blob/ff8706a5eae25b8bafce300527079f68a201d27f/src/Cursor.c#L45

For clarity: this application currently runs under XWayland.
