Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce fuzz testing to separate unit tests from repetition tests #219

Merged
merged 1 commit into from
May 17, 2024

Conversation

masih
Copy link
Member

@masih masih commented May 14, 2024

Prior to changes introduced here the majority of time during testing was spent on repeating the same test while changing the seed for generators or latency models.

Instead of repeating the same test over and over use the fuzz testing mechanism provided by Go SDK. This change allows the unit tests to run much faster while offering a configurable "fuzztime" that dictates for how long the test cases should be fuzzed.

The changes here introduce the following fuzz tests:

  • FuzzAbsentAdversary
  • FuzzImmediateDecideAdversary
  • FuzzHonest_AsyncRequireStrongQuorumToProgress
  • FuzzHonest_SyncMajorityCommonPrefix
  • FuzzHonest_AsyncMajorityCommonPrefix
  • FuzzHonestMultiInstance_AsyncDisagreement
  • FuzzHonestMultiInstance_SyncAgreement
  • FuzzHonestMultiInstance_AsyncAgreement
  • FuzzStoragePower_SyncIncreaseMidSimulation
  • FuzzStoragePower_AsyncIncreaseMidSimulation
  • FuzzStoragePower_SyncDecreaseRevertsToBase
  • FuzzStoragePower_AsyncDecreaseRevertsToBase
  • FuzzRepeatAdversary

The CI workflow is updated to run each of the fuzz tests for 30 seconds in a dedicated job.

Resolves #218

@masih masih force-pushed the masih/fuzz-all-da-things branch from 5dbb063 to 33ef804 Compare May 14, 2024 19:58
Copy link
Member Author

@masih masih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-review

Makefile Show resolved Hide resolved
@Stebalien
Copy link
Member

Not now but... it would be really nice if we could have coverage guided fuzzing.

  1. Do a bunch of fuzzing.
  2. For each input, record the code coverage from that input.
  3. If the input exercises new code, add it to the corpus.

We'll still want random fuzzing, but this will help us make sure we at least cover as many error paths as possible.

@masih
Copy link
Member Author

masih commented May 14, 2024

Not now but... it would be really nice

Indeed it would 👍

Captured at #220

@Stebalien
Copy link
Member

I do have one concern: the tests will remain really fast even if other parts slow down. Basically, we'll just start losing test coverage. But that's probably not a huge deal (maybe run a nightly fuzzing job in CI that's longer)?

Makefile Outdated
test:
GOGC=$(GOGC) go test ./...
.PHONY: test

fuzz: FUZZTIME ?= 10s # The duration to run fuzz testing, default to 10s if unset.
fuzz: # List all fuzz tests across the repo, and run them one at a time with the configured fuzztime.
@go run ./scripts/list_fuzz_tests | while read -r line; do \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future readers, this is parallelized because go will internally parallelize fuzzing.

Makefile Outdated Show resolved Hide resolved
Copy link
Member

@anorth anorth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgive me, but I don't think I'm aligned on goals here. This seems to be sacrificing test coverage (though I could well be misunderstanding something here) to make them faster and less deterministic. Faster is good, but I want more coverage and full determinism in most cases.

I would love to discuss this, but I don't see how changing the random seed without actually changing how the message queue works adds any coverage within the scope of an individual run (which must be repeatable to be useful). Yes, I could see that varying seeds in some long-running CI job could help, but the proposed change seems much deeper than that, with big effects on the standard dev flow.

test/constants_test.go Show resolved Hide resolved
test/absent_test.go Show resolved Hide resolved
test/decide_test.go Outdated Show resolved Hide resolved
test/honest_test.go Show resolved Hide resolved
test/multi_instance_test.go Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@Stebalien
Copy link
Member

Continuing from #219 (comment) and #219 (comment), I'd consider:

  1. Skipping this in make test (leaving a single fuzz run from go test ./....
  2. Running a bit of fuzzing in CI (30s, one job per fuzzer ideally).
  3. A longer fuzzing run in the merge queue.

That gives us immediate "did we break something" feedback while leaving the slow tests to CI. Although I'm starting to think that should happen in this PR so we don't lose coverage and/or make the tests take longer.

test/decide_test.go Outdated Show resolved Hide resolved
@masih masih force-pushed the masih/fuzz-all-da-things branch 2 times, most recently from 0ecae4d to 8a46e3b Compare May 15, 2024 17:19
@codecov-commenter
Copy link

codecov-commenter commented May 15, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 71.66%. Comparing base (837ffde) to head (5d0cad7).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #219      +/-   ##
==========================================
+ Coverage   71.61%   71.66%   +0.04%     
==========================================
  Files          31       31              
  Lines        2336     2333       -3     
==========================================
- Hits         1673     1672       -1     
+ Misses        533      532       -1     
+ Partials      130      129       -1     
Files Coverage Δ
sim/latency/log_normal.go 91.30% <50.00%> (+6.68%) ⬆️

@Kubuxu
Copy link
Collaborator

Kubuxu commented May 15, 2024

Let's merge #230, rebase and compare then.

Makefile Outdated Show resolved Hide resolved
@Stebalien
Copy link
Member

Note, to compare coverage, don't just look at the numbers.

  1. Generate a report with go test -cover -covermode=atomic -coverpkg github.com/filecoin-project/go-f3/gpbft -coverprofile coverage.out ./.... -coverpkg ensures we don't get a bunch of noise because we're changing the test framework.
  2. Trim the counts with cat coverage.out | grep -v ' 0$' | cut -d' ' -f-2 > coverage.trimmed.
  3. Do the same thing on the branch you're comparing against.
  4. Diff the results.

That'll let you explore the actual coverage diff.

(Maybe codecov.io will do this? But I've never had a good experience with it)

@Kubuxu
Copy link
Collaborator

Kubuxu commented May 15, 2024

Codecov will do it (apart from the fact that right now we are collecting coverage across all packages, we might want to change it).

@Stebalien
Copy link
Member

I'm looking into parallelizing fuzzing "seeds" and it's not clear if it actually works.

Copy link
Member

@anorth anorth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I am digesting this change more thoroughly, I see there are two different things going one. One is a consolidation of tests code, removing some duplication by moving to table-driven tests. This seems to be independent of the fuzzing. Next time it would be nice to review these separately.

Makefile Show resolved Hide resolved
test/multi_instance_test.go Outdated Show resolved Hide resolved
test/honest_test.go Show resolved Hide resolved
@anorth
Copy link
Member

anorth commented May 15, 2024

I had not updated to the latest PR code when writing the below. I'm still nervous, but my initial observations were of an earlier version of the code. Invoking some of the fuzzing from go test seems like a good arrangement.

I remain a bit nervous about the effects of these changes on development and testing workflows. The tests that have been moved out to fuzzing now are not captured by running "all tests". "All tests" is now much too fast, so I have to run some fuzzing to have any confidence. Fuzzing is difficult to run from the IDE*. And I'm even more nervous about how hard it will be to replicate a single test instance (participants, config, random seed) to read the logs, insert a debugger, etc.

[UPDATE] I see that when a fuzz test fails, go outputs a command to re-run that test, which shouldn't be too hard to invoke from IDE context
[MORE UPDATE] I see that output in https://github.com/filecoin-project/go-f3/actions/runs/9100009614/job/25014012083?pr=219, but I can't replicate it locally after introducing a bug

I have at least now discovered how to run fuzz tests one at a time in GoLand, but I wish it was simpler.

Would it be unreasonable to suggest adding some fuzz tests in addition to retaining the existing tests on main? Consider them the regression tests, which we can add to if a bug is discovered by fuzzing. Perhaps we could turn down the iteration count a bit in that case.

(*) My attempt to right-click and run all fuzz tests in tests failed with

/opt/homebrew/opt/go/libexec/bin/go test -json github.com/filecoin-project/go-f3/test -fuzz . -run ^$
{"Time":"2024-05-16T10:59:41.884345+12:00","Action":"start","Package":"github.com/filecoin-project/go-f3/test"}
testing: will not fuzz, -fuzz matches more than one fuzz test: [FuzzAbsentAdversary FuzzImmediateDecideAdversary FuzzHonest_RequireStrongQuorumToProgress FuzzMultiSyncAgreement FuzzMultiAsyncAgreement FuzzRepeatAdversary]

The shell script in the Makefile is too sophisticated for GoLand to naively replicate. Can we change something so that it's not necessary?

@anorth
Copy link
Member

anorth commented May 15, 2024

@masih in case you're reading this in email, I retracted much of the above comment after realising I didn't have the latest code locally.

@masih masih force-pushed the masih/fuzz-all-da-things branch 3 times, most recently from c871b5a to 095e47a Compare May 16, 2024 20:00
Copy link
Member

@anorth anorth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I think we're on the same page now about replicating something similar to the existing coverage of inputs being tested (scenario x participant count x seed). I will stop objecting and leave it to other reviewers to 👍 the details.

test/constants_test.go Show resolved Hide resolved
test/honest_test.go Outdated Show resolved Hide resolved
test/multi_instance_test.go Outdated Show resolved Hide resolved
test/honest_test.go Outdated Show resolved Hide resolved
@anorth
Copy link
Member

anorth commented May 16, 2024

I found the current test failure, so can confirm that the workflow for isolating a failing test, tracing, IDE support etc are adequate here.

@masih
Copy link
Member Author

masih commented May 17, 2024

Folks this is ready for a final look; it:

  • offers slightly higher coverage compared to main.
  • covers the previous test cases as static fuzz corpus.
  • refactors tests to remove duplicate code while generally keeping the same parameter search space as the tests in main branch.
  • introduces fuzzing to CI at 30s per fuzz test.
    • I intend to reflect on suggestions to run longer nightly fuzzing / fuzz in parallel in a separate PR as this one is large enough already.

Please let me know if there is anything that I might have missed.

Happy fuzzing 🪲

@masih masih force-pushed the masih/fuzz-all-da-things branch 3 times, most recently from 147783a to 4bf5f54 Compare May 17, 2024 14:02
@Kubuxu
Copy link
Collaborator

Kubuxu commented May 17, 2024

Fuzzing failed with

--- FAIL: FuzzHonestMultiInstance_SyncAgreement (19.62s)
    fuzzing process hung or terminated unexpectedly: exit status 2
    Failing input written to testdata/fuzz/FuzzHonestMultiInstance_SyncAgreement/3612110feb3427c0
    To re-run:
    go test -run=FuzzHonestMultiInstance_SyncAgreement/3612110feb3427c0

but IMO let's get this merged as it is a pure improvement given the baseline coverage without continuous fuzzing.

@masih masih force-pushed the masih/fuzz-all-da-things branch 3 times, most recently from 0ab34b9 to 13060c5 Compare May 17, 2024 14:21
@masih
Copy link
Member Author

masih commented May 17, 2024

fuzzing process hung or terminated unexpectedly: exit status 2

see: golang/go#56238

Prior to changes introduced here the majority of time during testing was
spent on repeating the same test while changing the seed for generators
or latency models.

Instead of repeating the same test over and over use the fuzz testing
mechanism provided by Go SDK. This change allows the unit tests to run
much faster while offering a configurable "fuzztime" that dictates for
how long the test cases should be fuzzed.

The changes here introduce the following fuzz tests:
* `FuzzAbsentAdversary`
* `FuzzImmediateDecideAdversary`
* `FuzzHonest_AsyncRequireStrongQuorumToProgress`
* `FuzzHonest_SyncMajorityCommonPrefix`
* `FuzzHonest_AsyncMajorityCommonPrefix`
* `FuzzHonestMultiInstance_AsyncDisagreement`
* `FuzzHonestMultiInstance_SyncAgreement`
* `FuzzHonestMultiInstance_AsyncAgreement`
* `FuzzStoragePower_SyncIncreaseMidSimulation`
* `FuzzStoragePower_AsyncIncreaseMidSimulation`
* `FuzzStoragePower_SyncDecreaseRevertsToBase`
* `FuzzStoragePower_AsyncDecreaseRevertsToBase`
* `FuzzRepeatAdversary`

The CI workflow is updated to run each of the fuzz tests for 30 seconds
in a dedicated job.
@masih masih force-pushed the masih/fuzz-all-da-things branch from 13060c5 to 5d0cad7 Compare May 17, 2024 16:09
@masih masih added this pull request to the merge queue May 17, 2024
Merged via the queue into main with commit 7c2f72b May 17, 2024
4 checks passed
@masih masih deleted the masih/fuzz-all-da-things branch May 17, 2024 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fuzz RNG seed in simulation tests
5 participants