Poll FIL events, populate daily_reward_transfers table, reformat endpoint handling #102

Open · PatrickNercessian wants to merge 7 commits into main
Conversation

bajtos mentioned this pull request May 30, 2024
bajtos (Member) commented May 30, 2024

@PatrickNercessian I have the new monorepo layout with a new spark-observer service almost ready for you, see #120.

There is one more improvement I'd like to implement before I hand this over to you: connect spark-observer to the spark-stats database and set up a framework for running DB schema migration scripts. I hope to get it done later today.

```js
  if (event.name === 'Transfer') {
    await onTransfer(...event.args)
  }
}
```
Member:
Please revert the changes in this file. The necessary infrastructure should already be prepared for you in https://github.com/filecoin-station/spark-stats/blob/1c069bac5a87de7b2ea58f395bb338f9abe67947/observer/bin/spark-observer.js. Feel free to make any adjustments in that file as needed.

lib/config.js Outdated
```js
  RPC_URL,
  DATABASE_URL,
  rpcHeaders
}
```
Member:

It's a good idea to have lib/config.js; please move this new code to observer/lib/config.js and clean up observer/bin/spark-observer.js accordingly.
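For illustration, observer/lib/config.js could then look roughly like this. This is a sketch only: the GLIF_TOKEN handling and the default RPC URL are assumptions, not taken from this PR.

```js
// observer/lib/config.js -- sketch, not the actual file from this PR
const {
  // Hypothetical default; the real endpoint may differ
  RPC_URL = 'https://api.node.glif.io/rpc/v0',
  DATABASE_URL,
  GLIF_TOKEN
} = process.env

// Assumed auth scheme: attach a bearer token when one is configured
const rpcHeaders = GLIF_TOKEN ? { Authorization: `Bearer ${GLIF_TOKEN}` } : {}

export { RPC_URL, DATABASE_URL, rpcHeaders }
```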

Member:

This file should go to observer/lib/

Member:

This file should go to observer/test/

PatrickNercessian (Contributor, Author):

The functionality of actually checking the blockchain and getting the events is currently untested. I'd like to create a simple dry-run script to test it before opening this PR.
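Such a dry run could be as small as the following sketch, assuming the same ieContract setup as the service; the -1000 relative block window is illustrative only.

```js
// dry-run.js -- sketch: print recent Transfer events without touching the database
const events = await ieContract.queryFilter(ieContract.filters.Transfer(), -1000)
for (const event of events) {
  console.log('%s FIL to %s at block %s', event.args.amount, event.args.to, event.blockNumber)
}
```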

Comment on lines 57 to 70
```js
ieContract.queryFilter(ieContract.filters.Transfer(), lastCheckedBlock)
  .then(events => {
    for (const event of events) {
      console.log('%s FIL to %s at block %s', event.args.amount, event.args.to, event.blockNumber)
      updateDailyFilStats(
        pgPool,
        {
          to_address: event.args.to,
          amount: event.args.amount,
          blockNumber: event.blockNumber
        }
      )
    }
  })
```
PatrickNercessian (Contributor, Author):

Also, because I made the switch to use queryFilter instead of "on", this is now a one-time call instead of a continuously listening action, which is wrong as-is.

We need to either run this script on a schedule (e.g. once daily), or run a scheduler within this script's execution.

PatrickNercessian (Contributor, Author):

I think we can do cron jobs using Fly.io config: https://fly.io/docs/machines/flyctl/fly-machine-run/#start-a-machine-on-a-schedule

What do we think about that? @juliangruber @bajtos

Member:

I'd prefer to run something like

```js
while (true) {
  await run()
  await sleep(ONE_HOUR)
}
```

This way we use fewer platform features and own the iteration logic.
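A sleep helper like the one above isn't built into Node.js as a global; a minimal sketch, assuming millisecond units:

```js
const ONE_HOUR = 60 * 60 * 1000

// Resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
```

Alternatively, node:timers/promises ships a promisified setTimeout that does the same job.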

PatrickNercessian marked this pull request as ready for review June 3, 2024 21:53
Comment on lines 16 to 20
```js
// Listen for Transfer events from the IE contract
while (true) {
  observeTransferEvents(pgPool, ieContract, provider)
  await new Promise(resolve => setTimeout(resolve, OBSERVATION_INTERVAL_MS))
}
```
PatrickNercessian (Contributor, Author):

Should we 'await' observeTransferEvents here?

I think as-is, this means it will run every hour, regardless of how long the observeTransferEvents method takes.

Member:

Awaiting is safer; I think this doesn't need to run exactly every hour. If we want it to be exact and safe, we can await, measure how long it took, and then wait INTERVAL - duration.

Member:

We should also handle the case where it errors. I believe a try/catch with console.error() and Sentry.captureException() is the least we need to do.
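Putting both suggestions together, the loop could look roughly like this. A sketch only: OBSERVATION_INTERVAL_MS, pgPool, ieContract, provider, and the Sentry setup are assumed to exist elsewhere in the service.

```js
import * as Sentry from '@sentry/node'

while (true) {
  const started = Date.now()
  try {
    // Await so a slow poll can't pile up overlapping runs
    await observeTransferEvents(pgPool, ieContract, provider)
  } catch (err) {
    // Log and report, but keep the loop alive
    console.error(err)
    Sentry.captureException(err)
  }
  // Sleep out the remainder of the interval so iterations start roughly on schedule
  const elapsed = Date.now() - started
  await new Promise((resolve) => setTimeout(resolve, Math.max(0, OBSERVATION_INTERVAL_MS - elapsed)))
}
```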

PatrickNercessian changed the title from "Poll FIL events, populate daily_fil table, reformat endpoint handling" to "Poll FIL events, populate daily_reward_transfers table, reformat endpoint handling" Jun 3, 2024
```diff
@@ -1 +0,0 @@
-SELECT now();
```
Member:

This migration should be kept; otherwise the migration library will complain that the hash of already-performed migrations doesn't match what's on disk.

```sql
CREATE TABLE reward_transfer_last_block (
  last_block INTEGER NOT NULL
);
INSERT INTO reward_transfer_last_block (last_block) VALUES (0);
```
Member:

What about instead we add a block column to daily_reward_transfers, and then query that to get the last block seen?

Member:

This way we don't run the risk of the two tables getting into an inconsistent state.

PatrickNercessian (Contributor, Author):

I think we don't actually want the last block with an Event; we want the last "checked" block. So if it's Friday and spark-observer runs, we don't want it to have to check since Sunday (assuming no Transfers since Sunday), for two reasons:

1. We should have already checked all those blocks.
2. We will fail because GLIF only allows querying the last 16h40m. Not a big problem because I have it set to retry over the last 16 hours, but it still doesn't seem like a clean approach.

And I don't see a great way to incorporate what we actually want (the last "checked" block) into daily_reward_transfers. But if you have some specifics in mind for that, I'm open to it!

Member:

I'm suggesting to put the last checked block in the above table.

> And I don't see a great way to incorporate what we actually want (the last "checked" block) into daily_reward_transfers. But if you have some specifics in mind for that, I'm open to it!

Why not put the last checked block in a last_checked_block column in that table?

PatrickNercessian (Contributor, Author):

Oh, I think I see: so you mean for every row, we have a new field called last_checked_block? Then do we query by MAX to get the highest block seen?

I wonder if it will end up being slow to do that MAX query after we have hundreds of thousands of rows? In the current implementation it's just a single value that is updated over time.

Also, it feels a bit unrelated to each row. The last_checked_block for any given row doesn't really tell you anything about that row, and we lose some semantic consistency of the table.

That said, I see the benefit of keeping daily_reward_transfers consistent with the last_checked_block value. Let me know which you prefer.

Member:

> Oh, I think I see: so you mean for every row, we have a new field called last_checked_block? Then do we query by MAX to get the highest block seen?

Yeah, that's what I'm thinking 👍

> I wonder if it will end up being slow to do that MAX query after we have hundreds of thousands of rows? In the current implementation it's just a single value that is updated over time.

With an index on the column I don't see performance being an issue. I would rather start correct with this (if it's one table, it's consistent by design) and optimize later if necessary.

> Also, it feels a bit unrelated to each row. The last_checked_block for any given row doesn't really tell you anything about that row, and we lose some semantic consistency of the table.

It tells us at which state of the blockchain this event was received, which is related to the event.

> That said, I see the benefit of keeping daily_reward_transfers consistent with the last_checked_block value. Let me know which you prefer.

I would prefer to fold it into this table. Maybe @bajtos will disagree next week, and if we decide to go that way after all, I think it won't be too bad to follow up again. To me, correctness and simplicity outweigh the optimization here.
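In code, the folded-in approach might look roughly like this. A sketch: the last_checked_block column name comes from the discussion, while the index name and the DEFAULT are illustrative assumptions.

```js
// One-time migration: track the last checked block on each row and index it
await pgPool.query(`
  ALTER TABLE daily_reward_transfers
  ADD COLUMN last_checked_block INTEGER NOT NULL DEFAULT 0
`)
await pgPool.query(`
  CREATE INDEX daily_reward_transfers_last_block
  ON daily_reward_transfers (last_checked_block)
`)

// With the index in place, MAX() resolves via a cheap index lookup
const { rows } = await pgPool.query(
  'SELECT MAX(last_checked_block) AS last_block FROM daily_reward_transfers'
)
const lastCheckedBlock = rows[0].last_block
```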

PatrickNercessian (Contributor, Author):

Makes sense! Addressed in 3939e80, along with the suggestions below.

However, we can't run the dry run anymore on real events because of the 16h40m max lookback, so we'll have to wait until the next rewards are released (or just merge anyway, since it shouldn't break anything else).

Comment on lines 18 to 20
```js
await pgPool.query('DELETE FROM reward_transfer_last_block')
// Set the last block to -800 to simulate the observer starting from the beginning
await pgPool.query('INSERT INTO reward_transfer_last_block (last_block) VALUES (-800)')
```
Member:

Suggested change

```diff
-await pgPool.query('DELETE FROM reward_transfer_last_block')
-// Set the last block to -800 to simulate the observer starting from the beginning
-await pgPool.query('INSERT INTO reward_transfer_last_block (last_block) VALUES (-800)')
```

Can be removed with the suggestion above.

```js
// Get the last checked block. Even though there should be only one row, use MAX just to be safe
const lastCheckedBlock = await pgPool.query(
  'SELECT MAX(last_block) AS last_block FROM reward_transfer_last_block'
).then(res => res.rows[0].last_block)
```
Member:

With the suggestion above, this needs to be adjusted to select from the transfers table.

```js
for (const event of events) {
  const transferEvent = {
    to_address: event.args.to,
    amount: event.args.amount
```
Member:

With the suggestion above, this needs to include provider.getBlockNumber().
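Roughly like this (a sketch of the adjusted loop; the last_checked_block field name follows the suggestion above):

```js
// Fetch the chain head once per batch, then stamp it onto every row
const blockNumber = await provider.getBlockNumber()
for (const event of events) {
  const transferEvent = {
    to_address: event.args.to,
    amount: event.args.amount,
    last_checked_block: blockNumber
  }
  // ...insert transferEvent as before
}
```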

Comment on lines 37 to 46

```js
// Get the current block number and update the last_block in reward_transfer_last_block table
// For safety, only update if the new block number is greater than the existing one
const blockNumber = await provider.getBlockNumber()
console.log('Current block number:', blockNumber)
await pgPool.query(`
  UPDATE reward_transfer_last_block
  SET last_block = $1
  WHERE $1 > last_block
`, [blockNumber])
```
Member:

Suggested change

```diff
-// Get the current block number and update the last_block in reward_transfer_last_block table
-// For safety, only update if the new block number is greater than the existing one
-const blockNumber = await provider.getBlockNumber()
-console.log('Current block number:', blockNumber)
-await pgPool.query(`
-  UPDATE reward_transfer_last_block
-  SET last_block = $1
-  WHERE $1 > last_block
-`, [blockNumber])
```


juliangruber mentioned this pull request Jun 6, 2024