Avoid synchronising by sleep in a systhreads test #13142

Merged
1 commit merged on May 15, 2024

Conversation

OlivierNicole
Contributor

The lib-threads/sockets.ml test exercises systhreads and socket communication by creating a server thread, two client threads, and a thread for each connection. The reproducibility of the output is ensured by inserting a half-second delay before starting the second client.

As it happens, half a second doesn’t seem to suffice when the program is monitored by ThreadSanitizer—at least not on some CI workers: this test has been failing intermittently on the TSan CI for a long while, generating noise.

@fabbing and I propose to remove the sleep and synchronise using a shared counter instead. It’s more robust and has the secondary benefit of making the test faster.

  let sock =
    Unix.socket (Unix.domain_of_sockaddr addr) Unix.SOCK_STREAM 0 in
  Unix.connect sock addr;
  let buf = Bytes.make 1024 ' ' in
  ignore(Unix.write_substring sock msg 0 (String.length msg));
  let n = Unix.read sock buf 0 (Bytes.length buf) in
  while not (!client_turn = id) do Thread.yield () done;
Member

Note to self: this looks like the busy-wait pattern that we are taught to avoid, but in fact it is okay because this function itself is the one that makes progress by incrementing client_turn below (unlocking another thread), so calling Thread.yield () is not busy-waiting but actually progressing towards termination.
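
The turn-taking wait discussed here can be sketched in isolation as follows. This is a minimal illustration, not the code from the pull request: only `client_turn` and the `Thread.yield` loop come from the excerpt above, while `order`, `worker` and `run_in_turn` are hypothetical names. It assumes all threads belong to the same domain, so accesses to the plain references are serialised by the runtime lock.

```ocaml
(* Needs the standard threads library. *)

(* Shared turn counter: thread [id] may proceed only when [!client_turn = id]. *)
let client_turn = ref 0

(* Hypothetical log of the order in which threads got past the wait. *)
let order = ref []

let worker id =
  (* Give other threads a chance to run until it is this thread's turn. *)
  while !client_turn <> id do Thread.yield () done;
  order := id :: !order;
  (* Hand over to the next thread; this is what lets a yielding waiter finish. *)
  incr client_turn

let run_in_turn n =
  client_turn := 0;
  order := [];
  (* Start threads in reverse id order to show that start order does not matter. *)
  let threads = List.init n (fun i -> Thread.create worker (n - 1 - i)) in
  List.iter Thread.join threads;
  List.rev !order
```

Each waiter only yields while some other thread still has to increment the counter, which is the progress argument made in the comment above.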

Contributor Author

Yes. I won’t claim this is a very efficient way of doing it, as the thread might be woken up one or more times with the loop condition still not fulfilled; using a mutex and a condition variable would probably be more efficient, but for such a simple test we went for the first simple solution that seemed correct.
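
For comparison, the mutex and condition variable alternative mentioned here might look roughly like the sketch below. All names (`turn`, `turn_changed`, `wait_for_turn`, `pass_turn`, `run_with_condition`) are hypothetical and none of this is taken from the test itself.

```ocaml
(* Needs the standard threads library. *)

let turn = ref 0                          (* protected by [m] *)
let m = Mutex.create ()
let turn_changed = Condition.create ()

let wait_for_turn id =
  Mutex.lock m;
  (* Re-check in a loop: a wake-up does not guarantee that it is our turn. *)
  while !turn <> id do Condition.wait turn_changed m done;
  Mutex.unlock m

let pass_turn () =
  Mutex.lock m;
  incr turn;
  (* Broadcast rather than signal: only one waiter matches the new value,
     and we cannot choose which waiter a signal would wake. *)
  Condition.broadcast turn_changed;
  Mutex.unlock m

let run_with_condition n =
  turn := 0;
  let log = ref [] in
  let worker id =
    wait_for_turn id;
    (* Only the thread holding the turn reaches this point. *)
    log := id :: !log;
    pass_turn ()
  in
  let threads = List.init n (fun i -> Thread.create worker (n - 1 - i)) in
  List.iter Thread.join threads;
  List.rev !log
```

Unlike the yield loop, a waiting thread sleeps inside `Condition.wait` and is only rescheduled when the counter actually changes.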

Member

Personally, I would feel more confident that the test behaves correctly with a condition variable rather than a busy-wait loop, which looks tricky to think about and more likely to backfire.

Contributor

Following your idea not to rely on an explicit busy loop, Olivier and I reimplemented the synchronisation with an Event.channel, which internally uses Mutex and Condition.
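
As a rough illustration of the idea, one thread can wait for another through an `Event.channel` as sketched below. This is not the merged code; `first_done`, `first`, `second` and `run_with_channel` are hypothetical names, and the two threads merely stand in for the two clients.

```ocaml
(* Needs the standard threads library. *)

let run_with_channel () =
  (* Channel used only as a "first thread is done" notification. *)
  let first_done : unit Event.channel = Event.new_channel () in
  let log = ref [] in
  let first () =
    log := "first" :: !log;
    (* Synchronous send: blocks until the other thread is ready to receive. *)
    Event.sync (Event.send first_done ())
  in
  let second () =
    (* Blocks until the first thread sends its notification. *)
    Event.sync (Event.receive first_done);
    log := "second" :: !log
  in
  (* Start the waiting thread first to show that start order does not matter. *)
  let t2 = Thread.create second () in
  let t1 = Thread.create first () in
  Thread.join t1;
  Thread.join t2;
  List.rev !log
```

The rendezvous replaces both the shared counter and the explicit loop: the blocking is handled inside `Event.sync`.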

OlivierNicole added the no-change-entry-needed and run-thread-sanitizer (this label makes the CI run the testsuite with TSAN enabled) labels on May 2, 2024
Achieved by using an Event.channel.

Co-authored-by: Olivier Nicole <olivier@chnik.fr>
Member

gasche left a comment

I'm okay with the new code if the test still passes in the CI. Thanks!

gasche merged commit fc809d3 into ocaml:trunk on May 15, 2024
16 checks passed
Labels
bug, no-change-entry-needed, run-thread-sanitizer (this label makes the CI run the testsuite with TSAN enabled), testsuite

4 participants