Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit test fails at clock.test.ts from time to time #6734

Open
ensi321 opened this issue May 4, 2024 · 9 comments · May be fixed by #6828
Open

Unit test fails at clock.test.ts from time to time #6734

ensi321 opened this issue May 4, 2024 · 9 comments · May be fixed by #6828
Assignees
Labels
meta-bug Issues that identify a bug and require a fix. scope-testing Issues for adding test coverage, fixing existing tests or testing strategies.

Comments

@ensi321
Copy link
Contributor

ensi321 commented May 4, 2024

Describe the bug

Sometimes clock.test.ts fails. Re-running it usually makes the error go away. Same behaviour is observed when running clock.test.ts on my local machine.

@Chainsafe validator % yarn vitest test/unit/utils/clock.test.ts 
yarn run v1.22.22
$ node_modules/.bin/vitest test/unit/utils/clock.test.ts

 DEV  v1.4.0 lodestar/packages/validator

 ❯ test/unit/utils/clock.test.ts (6)
   ❯ util / Clock (6)
     ✓ Should call on slot
     ✓ Should stop calling on slot after stop()
     ✓ Should call on epoch
     ❯ getCurrentSlot (3)
       × 'should return next slot after 11.5s'
       ✓ 'should return next slot after 12s'
       ✓ 'should return next slot after 12.5s'

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ Failed Tests 1 ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯

 FAIL  test/unit/utils/clock.test.ts > util / Clock > getCurrentSlot > 'should return next slot after 11.5s'
AssertionError: expected 8780981 to be 8780982 // Object.is equality

- Expected
+ Received

- 8780982
+ 8780981

 ❯ test/unit/utils/clock.test.ts:96:61
     94|       const currentSlot = getCurrentSlotAround(testConfig, genesisTime);
     95|       vi.advanceTimersByTime(delta * 1000);
     96|       expect(getCurrentSlotAround(testConfig, genesisTime)).toBe(currentSlot + 1);
       |                                                             ^
     97|     });
     98|   });

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[1/1]⎯

 Test Files  1 failed (1)
      Tests  1 failed | 5 passed (6)
   Start at  21:56:05
   Duration  731ms (transform 435ms, setup 12ms, collect 591ms, tests 6ms, environment 0ms, prepare 33ms)


 FAIL  Tests failed. Watching for file changes...
       press h to show help, press q to quit
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
@Chainsafe validator % 
@Chainsafe validator % 
@Chainsafe validator % yarn vitest test/unit/utils/clock.test.ts
yarn run v1.22.22
$ lodestar/node_modules/.bin/vitest test/unit/utils/clock.test.ts

 DEV  v1.4.0 lodestar/packages/validator

 ✓ test/unit/utils/clock.test.ts (6)
   ✓ util / Clock (6)
     ✓ Should call on slot
     ✓ Should stop calling on slot after stop()
     ✓ Should call on epoch
     ✓ getCurrentSlot (3)
       ✓ 'should return next slot after 11.5s'
       ✓ 'should return next slot after 12s'
       ✓ 'should return next slot after 12.5s'

 Test Files  1 passed (1)
      Tests  6 passed (6)
   Start at  21:56:12
   Duration  560ms (transform 354ms, setup 12ms, collect 446ms, tests 4ms, environment 0ms, prepare 23ms)


 PASS  Waiting for file changes...
       press h to show help, press q to quit
✨  Done in 3.84s.

Workflows log here

Expected behavior

The test should pass every time

Steps to reproduce

Run yarn vitest test/unit/utils/clock.test.ts multiple times

Additional context

No response

Operating system

Linux

Lodestar version or commit hash

v.1.18.0

@ensi321 ensi321 added the meta-bug Issues that identify a bug and require a fix. label May 4, 2024
@philknows philknows added the scope-testing Issues for adding test coverage, fixing existing tests or testing strategies. label May 7, 2024
@divyesh87
Copy link

Hi, I would like to work on this. Can i get this assigned?

@ensi321
Copy link
Contributor Author

ensi321 commented May 23, 2024

Hi, I would like to work on this. Can i get this assigned?

Done! Thanks for your interest!

If you encounter any issue just drop a message in our discord and we can follow up from there

@divyesh87
Copy link

divyesh87 commented May 24, 2024

Hi there, I investigated this issue and i can think of some cases where these tests could fail (This is my first issue so please forgive me if i am not following any conventions :-p )

So the failing test is this :

   const testConfig = {SECONDS_PER_SLOT: 12} as BeaconConfig;
    const genesisTime = Math.floor(new Date("2021-01-01").getTime() / 1000);
    const testCase: {name: string; delta: number}[] = [
      {name: "should return next slot after 11.5s", delta: 11.5},
      {name: "should return next slot after 12s", delta: 12},
      {name: "should return next slot after 12.5s", delta: 12.5},
    ];

    it.each(testCase)("$name", async function ({delta}) {
      const currentSlot = getCurrentSlotAround(testConfig, genesisTime);
      vi.advanceTimersByTime(delta * 1000);
      expect(getCurrentSlotAround(testConfig, genesisTime)).toBe(currentSlot + 1);
    });

The function getCurrentSlotAround simply takes in the genesis time and your config and returns the current slot.
We have three test cases here, of which case 1 (delta : 11.5) and case 3 (delta : 12.5) are usually the ones that will fail.
Let my try and highlight the cases where these would fail:

1> Delta 11.5 (Case 1):

Assume genesis time = 0, +Date.now() / 1000 = 6 (For simplicity , Its 1970 and we are 6 secs past the Unix time lol)

You try to get the current slot on line 1 of the test case:
Return value : 1 (Slot 1)
Reason : The getCurrentSlotAround function rounds the current timestamp to its nearest slot using the below logic:

Math.round(6 / 12) // Round(0.5) => 1

You advance the timers by 11.5 and the run the assertion:

Now the current unix timestamp is : 6 + 11.5 => 17.5;
You try to get the current slot for assertion of the test case:

Return value : 1 (Slot 1 again!)
Reason :

Math.round(17.5 / 12) //  Round(1.458) => 1

We expected the slot to advance by 1 but it didnt!

2> Delta 12.5 (Case 3):

Assume genesis time = 0, +Date.now() / 1000 = 17.5 (We are 17.5 secs past the Unix time)

You try to get the current slot on line 1 of the test case:
Return value : 1 (Slot 1)
Reason :

Math.round(17.5 / 12) // Round(1.458) => 1

You advance the timers by 12.5 and the run the assertion:

Now the current unix timestamp is : 17.5 + 12.5 => 30;
You try to get the current slot for assertion of the test case:

Return value : 3 (Slot advanced by 2!)
Reason :

Math.round(30 / 12) //  Round(2.5) => 3

We expected the slot to advance by 1 but it advanced by 2.

So the test cases will only fail in a certain 0.5 sec window if you think about it, they dont always fail but in special cases described above. (Case number 2 would never fail)

The solution i can think of is to remove these two test cases here. Please let me know what you think of this and sorry for this colossal comment :-p

@ensi321
Copy link
Contributor Author

ensi321 commented May 24, 2024

Hi there, I investigated this issue and i can think of some cases where these tests could fail (This is my first issue so please forgive me if i am not following any conventions :-p )

Awesome explanation. Very easy to follow. Indeed it's the rounding that's causing the issue. There was a brief discussion on this issue: https://github.com/ChainSafe/lodestar/pull/6407/files#r1481758624.

The solution i can think of is to remove these two test cases here. Please let me know what you think of this and sorry for this colossal comment :-p

The delta + advance timer approach in the current unit test does not suit the behaviour of getCurrentSlotAround(). We will need to ditch it and come up with another approach and/or better test cases to truly reflect our expectation of the result from getCurrentSlotAround().

I think it's tricky to unit test getCurrentSlotAround() because it has dependency on Date.now(). I think if we override global.Date.now = () => desiredValue to return our desired value, it will be easier to test. What do you think?

cc. @nazarhussain

@nflaig
Copy link
Member

nflaig commented May 24, 2024

Just a note, this test is really old and never failed when we used mocha, it started to get flaky after vitest migration.

@divyesh87
Copy link

Agreed. Dependency on Date.now is going to cause these tests to fail under certain time slots.
The genesis time is already hardcoded, overriding the global Date.now seems to be a good way to fix this.

@divyesh87
Copy link

divyesh87 commented May 27, 2024

What do you think of this approach?

 describe("getCurrentSlot", function () {
    // eslint-disable-next-line @typescript-eslint/naming-convention
    const testConfig = {SECONDS_PER_SLOT: 12} as BeaconConfig;
    const genesisTime = 0;
    const originalDate = global.Date.now;

    beforeEach(() => {
      global.Date.now = () => genesisTime;
    });

    afterEach(() => {
      global.Date.now = originalDate;
    });

    const testCase: {name: string; delta: number}[] = [
      {name: "should return next slot after 11.5s", delta: 11.5},
      {name: "should return next slot after 12s", delta: 12},
      {name: "should return next slot after 12.5s", delta: 12.5},
    ];

    it.each(testCase)("$name", async function ({delta}) {
      const currentSlot = getCurrentSlotAround(testConfig, genesisTime);
      const advancement = delta * 1000;
      global.Date.now = () => genesisTime + advancement;
      vi.advanceTimersByTime(advancement);
      expect(getCurrentSlotAround(testConfig, genesisTime)).toBe(currentSlot + 1);
    });
  });

Changes made here:

genesisTime = 0
override the Date.now to return the same time as genesis time before a testcase
advance Date.now by delta inside the testcase

Changing the genesis time to special values will cause the test cases to fail, for eg 6.5 (as explained above),
with this override the test cases will either always fail or pass (depending on the value of genesis time)

@nflaig
Copy link
Member

nflaig commented May 28, 2024

@divyesh87 Thanks for looking into this, we can achieve the same without modifying the Date global

diff --git a/packages/validator/test/unit/utils/clock.test.ts b/packages/validator/test/unit/utils/clock.test.ts
index dfa9d386e6..b5ef2ab1fa 100644
--- a/packages/validator/test/unit/utils/clock.test.ts
+++ b/packages/validator/test/unit/utils/clock.test.ts
@@ -90,6 +90,10 @@ describe("util / Clock", function () {
       {name: "should return next slot after 12.5s", delta: 12.5},
     ];
 
+    beforeEach(() => {
+      vi.setSystemTime(genesisTime * 1000);
+    });
+
     it.each(testCase)("$name", async function ({delta}) {
       const currentSlot = getCurrentSlotAround(testConfig, genesisTime);
       vi.advanceTimersByTime(delta * 1000);

This seems to make the test runs deterministic, no failed tests after more than 100 runs on my machine

@divyesh87 divyesh87 linked a pull request May 29, 2024 that will close this issue
@divyesh87
Copy link

Oh agreed. Thats a really neat way to do this

Thanks, will create a pr soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meta-bug Issues that identify a bug and require a fix. scope-testing Issues for adding test coverage, fixing existing tests or testing strategies.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants