Poll FIL events, populate daily_reward_transfers table, reformat endpoint handling #102

Open · PatrickNercessian wants to merge 7 commits into main
Conversation

bajtos mentioned this pull request May 30, 2024
bajtos (Member) commented May 30, 2024

@PatrickNercessian I have the new monorepo layout with a new spark-observer service almost ready for you, see #120.

There is one more improvement I'd like to implement before I hand this over to you: connect spark-observer to the spark-stats database and set up a framework for running DB schema migration scripts. I hope to get it done later today.

```js
  if (event.name === 'Transfer') {
    await onTransfer(...event.args)
  }
}
```
Member:
Please revert the changes in this file. The necessary infrastructure should already be prepared for you in https://github.com/filecoin-station/spark-stats/blob/1c069bac5a87de7b2ea58f395bb338f9abe67947/observer/bin/spark-observer.js. Feel free to make any adjustments in that file as needed.

lib/config.js Outdated
```js
  RPC_URL,
  DATABASE_URL,
  rpcHeaders
}
```
Member:

It's a good idea to have lib/config.js; please move this new code to observer/lib/config.js and clean up observer/bin/spark-observer.js accordingly.
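For illustration, observer/lib/config.js could then look roughly like this. This is a sketch only: the GLIF_TOKEN handling and the default RPC URL are assumptions, not taken from this PR.

```js
// observer/lib/config.js -- sketch, not the actual file from this PR
const {
  // Hypothetical default; the real endpoint may differ
  RPC_URL = 'https://api.node.glif.io/rpc/v0',
  DATABASE_URL,
  GLIF_TOKEN
} = process.env

// Assumed auth scheme: attach a bearer token when one is configured
const rpcHeaders = GLIF_TOKEN ? { Authorization: `Bearer ${GLIF_TOKEN}` } : {}

export { RPC_URL, DATABASE_URL, rpcHeaders }
```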

Member:

This file should go to observer/lib/

Member:

This file should go to observer/test/

PatrickNercessian (Contributor, Author):

The functionality of actually checking the blockchain and getting the events is currently untested. I'd like to create a simple dry-run script to test it before opening this PR.
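Such a dry run could be as small as the following sketch, assuming the same ieContract setup as the service; the -1000 relative block window is illustrative only.

```js
// dry-run.js -- sketch: print recent Transfer events without touching the database
const events = await ieContract.queryFilter(ieContract.filters.Transfer(), -1000)
for (const event of events) {
  console.log('%s FIL to %s at block %s', event.args.amount, event.args.to, event.blockNumber)
}
```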

Comment on lines 57 to 70
```js
ieContract.queryFilter(ieContract.filters.Transfer(), lastCheckedBlock)
  .then(events => {
    for (const event of events) {
      console.log('%s FIL to %s at block %s', event.args.amount, event.args.to, event.blockNumber)
      updateDailyFilStats(
        pgPool,
        {
          to_address: event.args.to,
          amount: event.args.amount,
          blockNumber: event.blockNumber
        }
      )
    }
  })
```
PatrickNercessian (Contributor, Author):

Also, because I made the switch to use queryFilter instead of "on", this is now a one-time call instead of a continuously listening action, which is wrong as-is.

We need to either run this script on a schedule (e.g. once daily), or run a scheduler within this script's execution.

PatrickNercessian (Contributor, Author):

I think we can do cron jobs using Fly.io config: https://fly.io/docs/machines/flyctl/fly-machine-run/#start-a-machine-on-a-schedule

What do we think about that? @juliangruber @bajtos

Member:

I'd prefer to run something like

```js
while (true) {
  await run()
  await sleep(ONE_HOUR)
}
```

This way we use fewer platform features and own the iteration logic.
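A sleep helper like the one above isn't built into Node.js as a global; a minimal sketch, assuming millisecond units:

```js
const ONE_HOUR = 60 * 60 * 1000

// Resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
```

Alternatively, node:timers/promises ships a promisified setTimeout that does the same job.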

PatrickNercessian marked this pull request as ready for review June 3, 2024 21:53
Comment on lines 16 to 20
```js
// Listen for Transfer events from the IE contract
while (true) {
  observeTransferEvents(pgPool, ieContract, provider)
  await new Promise(resolve => setTimeout(resolve, OBSERVATION_INTERVAL_MS))
}
```
PatrickNercessian (Contributor, Author):

Should we 'await' observeTransferEvents here?

I think as-is, this means it will run every hour, regardless of how long the observeTransferEvents method takes.

Member:

Awaiting is safer; I think this doesn't need to run exactly every hour. If we want it to be exact and safe, we can await, measure how long it took, and then wait INTERVAL - duration.

Member:

We should also handle the case where it errors. I believe a try/catch with console.error() and Sentry.captureException() is the least we need to do.
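Putting both suggestions together, the loop could look roughly like this. A sketch only: OBSERVATION_INTERVAL_MS, pgPool, ieContract, provider, and the Sentry setup are assumed to exist elsewhere in the service.

```js
import * as Sentry from '@sentry/node'

while (true) {
  const started = Date.now()
  try {
    // Await so a slow poll can't pile up overlapping runs
    await observeTransferEvents(pgPool, ieContract, provider)
  } catch (err) {
    // Log and report, but keep the loop alive
    console.error(err)
    Sentry.captureException(err)
  }
  // Sleep out the remainder of the interval so iterations start roughly on schedule
  const elapsed = Date.now() - started
  await new Promise((resolve) => setTimeout(resolve, Math.max(0, OBSERVATION_INTERVAL_MS - elapsed)))
}
```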

PatrickNercessian changed the title from "Poll FIL events, populate daily_fil table, reformat endpoint handling" to "Poll FIL events, populate daily_reward_transfers table, reformat endpoint handling" Jun 3, 2024
```diff
@@ -1 +0,0 @@
-SELECT now();
```
Member:

This migration should be kept; otherwise the migration library will complain that the hash of already-performed migrations doesn't match what's on disk.

```sql
CREATE TABLE reward_transfer_last_block (
  last_block INTEGER NOT NULL
);
INSERT INTO reward_transfer_last_block (last_block) VALUES (0);
```
Member:

What about instead we add a block column to daily_reward_transfers, and then query that to get the last block seen?

Member:

This way we don't run the risk of the two tables getting into an inconsistent state.

PatrickNercessian (Contributor, Author):

I think we don't actually want the last block with an Event; we want the last "checked" block. So if it's Friday and spark-observer runs, we don't want it to have to check since Sunday (assuming no Transfers since Sunday), for two reasons:

1. We should have already checked all those blocks.
2. We will fail because GLIF only allows querying the last 16h40m. Not a big problem because I have it set to retry over the last 16 hours, but it still doesn't seem like a clean approach.

And I don't see a great way to incorporate what we actually want (the last "checked" block) into daily_reward_transfers. But if you have some specifics in mind for that, I'm open to it!

Member:

I'm suggesting to put the last checked block in the above table.

> And I don't see a great way to incorporate what we actually want (the last "checked" block) into daily_reward_transfers. But if you have some specifics in mind for that, I'm open to it!

Why not put the last checked block in a last_checked_block column in that table?

PatrickNercessian (Contributor, Author):

Oh, I think I see: so you mean for every row, we have a new field called last_checked_block? Then do we query by MAX to get the highest block seen?

I wonder if it will end up being slow to do that MAX query after we have hundreds of thousands of rows? In the current implementation it's just a single value that is updated over time.

Also, it feels a bit unrelated to each row. The last_checked_block for any given row doesn't really tell you anything about that row, and we lose some semantic consistency of the table.

That said, I see the benefit of keeping daily_reward_transfers consistent with the last_checked_block value. Let me know which you prefer.

Member:

> Oh, I think I see: so you mean for every row, we have a new field called last_checked_block? Then do we query by MAX to get the highest block seen?

Yeah, that's what I'm thinking 👍

> I wonder if it will end up being slow to do that MAX query after we have hundreds of thousands of rows? In the current implementation it's just a single value that is updated over time.

With an index on the column I don't see performance being an issue. I would rather start correct with this (if it's one table, it's consistent by design) and optimize later if necessary.

> Also, it feels a bit unrelated to each row. The last_checked_block for any given row doesn't really tell you anything about that row, and we lose some semantic consistency of the table.

It tells us at which state of the blockchain this event was received, which is related to the event.

> That said, I see the benefit of keeping daily_reward_transfers consistent with the last_checked_block value. Let me know which you prefer.

I would prefer to fold it into this table. Maybe @bajtos will disagree next week, and if we decide to go that way after all, I think it won't be too bad to follow up again. To me, correctness and simplicity outweigh the optimization here.
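In code, the folded-in approach might look roughly like this. A sketch: the last_checked_block column name comes from the discussion, while the index name and the DEFAULT are illustrative assumptions.

```js
// One-time migration: track the last checked block on each row and index it
await pgPool.query(`
  ALTER TABLE daily_reward_transfers
  ADD COLUMN last_checked_block INTEGER NOT NULL DEFAULT 0
`)
await pgPool.query(`
  CREATE INDEX daily_reward_transfers_last_block
  ON daily_reward_transfers (last_checked_block)
`)

// With the index in place, MAX() resolves via a cheap index lookup
const { rows } = await pgPool.query(
  'SELECT MAX(last_checked_block) AS last_block FROM daily_reward_transfers'
)
const lastCheckedBlock = rows[0].last_block
```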

PatrickNercessian (Contributor, Author):

Makes sense! Addressed in 3939e80, along with the suggestions below.

However, we can't run the dry run anymore on real events because of the 16h40m max lookback, so we'll have to wait until the next rewards are released (or just merge anyway, since it shouldn't break anything else).

Comment on lines 18 to 20
```js
await pgPool.query('DELETE FROM reward_transfer_last_block')
// Set the last block to -800 to simulate the observer starting from the beginning
await pgPool.query('INSERT INTO reward_transfer_last_block (last_block) VALUES (-800)')
```
Member:

Suggested change

```diff
-await pgPool.query('DELETE FROM reward_transfer_last_block')
-// Set the last block to -800 to simulate the observer starting from the beginning
-await pgPool.query('INSERT INTO reward_transfer_last_block (last_block) VALUES (-800)')
```

Can be removed with the suggestion above.

```js
// Get the last checked block. Even though there should be only one row, use MAX just to be safe
const lastCheckedBlock = await pgPool.query(
  'SELECT MAX(last_block) AS last_block FROM reward_transfer_last_block'
).then(res => res.rows[0].last_block)
```
Member:

With the suggestion above, this needs to be adjusted to select from the transfers table.

```js
for (const event of events) {
  const transferEvent = {
    to_address: event.args.to,
    amount: event.args.amount
```
Member:

With the suggestion above, this needs to include provider.getBlockNumber().
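Roughly like this (a sketch of the adjusted loop; the last_checked_block field name follows the suggestion above):

```js
// Fetch the chain head once per batch, then stamp it onto every row
const blockNumber = await provider.getBlockNumber()
for (const event of events) {
  const transferEvent = {
    to_address: event.args.to,
    amount: event.args.amount,
    last_checked_block: blockNumber
  }
  // ...insert transferEvent as before
}
```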

Comment on lines 37 to 46

```js
// Get the current block number and update the last_block in reward_transfer_last_block table
// For safety, only update if the new block number is greater than the existing one
const blockNumber = await provider.getBlockNumber()
console.log('Current block number:', blockNumber)
await pgPool.query(`
  UPDATE reward_transfer_last_block
  SET last_block = $1
  WHERE $1 > last_block
`, [blockNumber])
```
Member:

Suggested change

```diff
-// Get the current block number and update the last_block in reward_transfer_last_block table
-// For safety, only update if the new block number is greater than the existing one
-const blockNumber = await provider.getBlockNumber()
-console.log('Current block number:', blockNumber)
-await pgPool.query(`
-  UPDATE reward_transfer_last_block
-  SET last_block = $1
-  WHERE $1 > last_block
-`, [blockNumber])
```


juliangruber mentioned this pull request Jun 6, 2024