Avoid synchronising by sleep in a systhreads test #13142

OlivierNicole · 2024-05-02T10:05:00Z

The lib-threads/sockets.ml test exercises systhreads and socket communication, by creating a server thread, two client threads, and a thread for each connection. The reproducibility of the output is ensured by inserting a half-second delay before starting the second client.

As it happens, half a second doesn’t seem to suffice when the program is monitored by ThreadSanitizer—at least not on some CI workers: this test has been failing intermittently on the TSan CI for a long while, generating noise.

@fabbing and I propose to remove the sleep and synchronise using a shared counter instead. It’s more robust and has the secondary benefit of making the test faster.

gasche · 2024-05-02T11:26:40Z

testsuite/tests/lib-threads/sockets.ml

  let sock =
    Unix.socket (Unix.domain_of_sockaddr addr) Unix.SOCK_STREAM 0 in
  Unix.connect sock addr;
  let buf = Bytes.make 1024 ' ' in
  ignore(Unix.write_substring sock msg 0 (String.length msg));
  let n = Unix.read sock buf 0 (Bytes.length buf) in
+  while not (!client_turn = id) do Thread.yield () done;


Note to self: this looks like the busy-wait pattern that we are taught to avoid, but in fact it is okay because this function itself is the one that makes progress by incrementing client_turn below (unlocking another thread), so calling Thread.yield () is not busy-waiting but actually progressing towards termination.
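To make the turn-taking argument concrete, here is a minimal, self-contained sketch of the shared-counter pattern. It is not the test's actual code: the socket I/O is left out, and `order`, `client` and `run_clients` are hypothetical names introduced for illustration. Only `client_turn`, `id` and the `Thread.yield` loop come from the diff above. It assumes the program is linked against the threads library.

```ocaml
(* Sketch only: requires the threads library (e.g. threads.posix). *)

(* Id of the client currently allowed to run its ordered section. *)
let client_turn = ref 0

(* Hypothetical log of the order in which clients ran that section. *)
let order = ref []

(* Hypothetical client body; the real test does socket I/O before waiting. *)
let client id =
  (* Give up the processor until it is this client's turn. *)
  while not (!client_turn = id) do Thread.yield () done;
  order := id :: !order;
  (* Hand over to the next client; this is the step that makes progress. *)
  incr client_turn

let run_clients n =
  client_turn := 0;
  order := [];
  (* Start the threads in reverse id order: start order should not matter. *)
  let threads = List.init n (fun i -> Thread.create client (n - 1 - i)) in
  List.iter Thread.join threads;
  List.rev !order
```

Whatever order the scheduler picks, each client runs its ordered section only when the counter equals its id, so `run_clients 3` should yield the ids in increasing order.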

OlivierNicole · 2024-05-14

Yes. I won’t claim this is a very efficient way of doing it, as the thread might be woken up one or more times with the loop condition still not fulfilled. Using a mutex and a condition variable would probably be more efficient, but for such a simple test we went for the first simple solution that seemed correct.

gasche · 2024-05-14

Personally, I would feel more confident that the test behaves correctly with a condition variable rather than a busy-wait loop, which looks tricky to think about and more likely to backfire.
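For comparison, the condition-variable variant suggested here might look like the sketch below. This is an assumption about what such a version could be, not code from the PR; `turn_mutex`, `turn_changed`, `current_turn`, `cv_order`, `wait_turn_and_run` and `run_with_condition` are all hypothetical names. It uses only `Mutex` and `Condition` from the threads library.

```ocaml
(* Sketch only: requires the threads library (e.g. threads.posix). *)

let turn_mutex = Mutex.create ()
let turn_changed = Condition.create ()

(* Id of the client allowed to proceed; protected by turn_mutex. *)
let current_turn = ref 0

(* Hypothetical log of the order in which clients ran. *)
let cv_order = ref []

(* Block until it is [id]'s turn, run [f], then pass the turn on. *)
let wait_turn_and_run id f =
  Mutex.lock turn_mutex;
  (* Condition.wait releases the mutex while sleeping and re-acquires it
     on wake-up; the loop guards against waking up on someone else's turn. *)
  while !current_turn <> id do
    Condition.wait turn_changed turn_mutex
  done;
  f ();
  incr current_turn;
  (* Wake every waiter; only the one whose turn it is leaves its loop. *)
  Condition.broadcast turn_changed;
  Mutex.unlock turn_mutex

let run_with_condition n =
  current_turn := 0;
  cv_order := [];
  let body id =
    wait_turn_and_run id (fun () -> cv_order := id :: !cv_order) in
  let threads = List.init n (fun i -> Thread.create body (n - 1 - i)) in
  List.iter Thread.join threads;
  List.rev !cv_order
```

Compared with the yield loop, a waiting thread here sleeps until the turn actually changes. The sketch does not unlock the mutex if `f` raises, which a real test would probably want to handle.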

fabbing · 2024-05-15

Following your idea not to rely on an explicit busy loop, Olivier and I reimplemented the synchronisation with an Event.channel, which internally uses Mutex and Condition.
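The thread does not show the final code, so the following is only a guess at the general shape of an `Event.channel` handshake, not the merged implementation. `run_with_event`, `first`, `second` and `record` are hypothetical names, and the socket work is replaced by appending to a log. The sketch relies on `Event.sync (Event.send ...)` being a synchronous rendezvous: the send completes only when a receiver takes the value.

```ocaml
(* Sketch only: requires the threads library (e.g. threads.posix). *)

let run_with_event () =
  (* Channel used purely as a signal: the payload carries no data. *)
  let ch : unit Event.channel = Event.new_channel () in
  let log = ref [] in
  let log_mutex = Mutex.create () in
  let record s =
    Mutex.lock log_mutex;
    log := s :: !log;
    Mutex.unlock log_mutex in
  (* Hypothetical first client: does its work, then signals completion. *)
  let first () =
    record "first";
    Event.sync (Event.send ch ()) in
  (* Hypothetical second client: waits for the signal before working. *)
  let second () =
    Event.sync (Event.receive ch);
    record "second" in
  (* Start the waiting client first to show that start order is irrelevant. *)
  let t2 = Thread.create second () in
  let t1 = Thread.create first () in
  Thread.join t1;
  Thread.join t2;
  List.rev !log
```

Because the second client cannot get past `Event.receive` until the first one has offered its send, the log should always read first, then second, without any sleep or polling loop.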

gasche

I'm okay with the new code if the test still passes in the CI. Thanks!

dustanddreams approved these changes May 2, 2024

gasche reviewed May 2, 2024

OlivierNicole added the no-change-entry-needed and run-thread-sanitizer (makes the CI run the testsuite with TSan enabled) labels May 2, 2024

gasche added testsuite bug labels May 15, 2024

fabbing force-pushed the no_sync_by_sleep branch from 7d20192 to 056c7ff Compare May 15, 2024 15:20

Avoid synchronising by sleep in a systhreads test

c9b3918

Achieved by using an Event.channel. Co-authored-by: Olivier Nicole <olivier@chnik.fr>

fabbing force-pushed the no_sync_by_sleep branch from 7d20192 to c9b3918 Compare May 15, 2024 16:11

gasche approved these changes May 15, 2024

gasche merged commit fc809d3 into ocaml:trunk May 15, 2024
16 checks passed

OlivierNicole commented May 2, 2024

gasche May 2, 2024

OlivierNicole May 14, 2024

gasche May 14, 2024

fabbing May 15, 2024

gasche left a comment
