bank: rewrite status cache #1790
base: main
The status cache has two main issues:
(1) It is not particularly concurrent, and causes slowdowns when running with highly parallel bank execution.
(2) It has unbounded memory consumption, and can cause out-of-memory conditions when it needs to store large numbers of transactions.
It is rewritten in Firedancer for performance. The general design is that two operations, insert and query, need to be highly concurrent; everything else is rare, happens between slots, and can largely just lock the whole structure.
The base case for insert is optimized to three compare-and-swap operations: two that are always uncontended and one that is very lightly contended. Query is fully lockless.
Transaction result storage is shared between the snapshot service cache and the query lookup cache, which more than halves memory usage, and memory use is fixed up front.
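As a rough illustration of the insert/query split described above, here is a minimal sketch of a fixed-size, lock-free hash chain using C11 atomics. It is not the code from this PR: every `sketch_` name, the pool size, and the bucket count are hypothetical, and it uses two compare-and-swap operations per insert rather than the three the description mentions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAP_CNT 8u          /* hypothetical bucket count */
#define SKETCH_TXN_MAX 16u         /* hypothetical pool size, fixed up front */
#define SKETCH_NULL    UINT32_MAX  /* end-of-chain marker */

typedef struct {
  uint64_t key;    /* stand-in for a transaction hash */
  int      result; /* stand-in for a transaction result */
  uint32_t next;   /* pool index of the next entry in this bucket */
} sketch_txn_t;

typedef struct {
  _Atomic uint32_t heads[ SKETCH_MAP_CNT ]; /* per-bucket chain heads */
  _Atomic uint32_t used;                    /* number of pool entries handed out */
  sketch_txn_t     pool[ SKETCH_TXN_MAX ];
} sketch_cache_t;

static void sketch_init( sketch_cache_t * c ) {
  atomic_init( &c->used, 0u );
  for( uint32_t i=0u; i<SKETCH_MAP_CNT; i++ ) atomic_init( &c->heads[ i ], SKETCH_NULL );
}

/* Returns 1 on success, 0 if the fixed pool is exhausted. */
static int sketch_insert( sketch_cache_t * c, uint64_t key, int result ) {
  /* CAS 1: reserve a pool entry without ever exceeding the pool size. */
  uint32_t idx = atomic_load( &c->used );
  do {
    if( idx>=SKETCH_TXN_MAX ) return 0;
  } while( !atomic_compare_exchange_weak( &c->used, &idx, idx+1u ) );

  sketch_txn_t * txn = &c->pool[ idx ];
  txn->key    = key;
  txn->result = result;

  /* CAS 2: publish the entry at the head of its bucket chain.  The
     release ordering makes the fields above visible to any reader
     that later acquires the head. */
  _Atomic uint32_t * head = &c->heads[ key % SKETCH_MAP_CNT ];
  uint32_t old = atomic_load_explicit( head, memory_order_relaxed );
  do {
    txn->next = old;
  } while( !atomic_compare_exchange_weak_explicit( head, &old, idx,
                                                   memory_order_release,
                                                   memory_order_relaxed ) );
  return 1;
}

/* Lock-free query: walks the chain without taking any lock.
   Returns 1 and sets *result if key is found, 0 otherwise. */
static int sketch_query( sketch_cache_t * c, uint64_t key, int * result ) {
  uint32_t cur = atomic_load_explicit( &c->heads[ key % SKETCH_MAP_CNT ], memory_order_acquire );
  while( cur!=SKETCH_NULL ) {
    sketch_txn_t const * txn = &c->pool[ cur ];
    if( txn->key==key ) { *result = txn->result; return 1; }
    cur = txn->next;
  }
  return 0;
}
```

Because entries are only ever prepended and never removed in this sketch, a reader can traverse a chain while writers insert concurrently. Reclaiming entries would need the coarse, between-slot locking the description alludes to.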
if( FD_UNLIKELY( max_rooted_slots<1UL || max_live_slots<1UL ) ) return 0UL;
if( FD_UNLIKELY( max_live_slots<max_rooted_slots ) ) return 0UL;
if( FD_UNLIKELY( max_txn_per_slot<1UL ) ) return 0UL;
if( FD_UNLIKELY( !fd_ulong_is_pow2( max_live_slots || !fd_ulong_is_pow2( max_txn_per_slot ) ) ) ) return 0UL;
Suggested change:
- if( FD_UNLIKELY( !fd_ulong_is_pow2( max_live_slots || !fd_ulong_is_pow2( max_txn_per_slot ) ) ) ) return 0UL;
+ if( FD_UNLIKELY( !fd_ulong_is_pow2( max_live_slots ) || !fd_ulong_is_pow2( max_txn_per_slot ) ) ) return 0UL;
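To see why the misplaced parenthesis matters, the sketch below compares the two forms. `sketch_is_pow2` is a hypothetical stand-in for `fd_ulong_is_pow2`, assumed here to return nonzero only for exact powers of two; both check functions return 1 when the parameters would be rejected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for fd_ulong_is_pow2: nonzero iff x is a
   power of two (assumed behavior, not taken from the PR). */
static int sketch_is_pow2( uint64_t x ) { return x && !( x & ( x-1UL ) ); }

/* Original form: the argument of the outer call is a logical OR, so
   it is always 0 or 1.  For any nonzero max_live_slots the argument
   is 1, which is a power of two, so nothing is ever rejected. */
static int sketch_check_buggy( uint64_t max_live_slots, uint64_t max_txn_per_slot ) {
  return !sketch_is_pow2( max_live_slots || !sketch_is_pow2( max_txn_per_slot ) );
}

/* Suggested form: each parameter is tested on its own. */
static int sketch_check_fixed( uint64_t max_live_slots, uint64_t max_txn_per_slot ) {
  return !sketch_is_pow2( max_live_slots ) || !sketch_is_pow2( max_txn_per_slot );
}
```

With `max_live_slots=3` and `max_txn_per_slot=5`, the original form accepts the parameters while the suggested form rejects them.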
ulong txnhash_offset = blockcache->txnhash_offset;
ulong head_hash = FD_LOAD( ulong, query->txnhash+txnhash_offset ) % FD_TXNCACHE_BLOCKCACHE_MAP_CNT;
for( uint head=blockcache->heads[ head_hash ]; head!=UINT_MAX; head=tc->txnpages[ head/FD_TXNCACHE_TXNS_PER_PAGE ].txns[ head%FD_TXNCACHE_TXNS_PER_PAGE ]->blockcache_next ) {
You could consider a macro for tc->txnpages[ head/FD_TXNCACHE_TXNS_PER_PAGE ].txns[ head%FD_TXNCACHE_TXNS_PER_PAGE ], since it shows up in several places and it would be nice if it weren't so long.
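One possible shape for such a macro is sketched below. The struct layout, the page size, and the name `SKETCH_TXN` are all hypothetical simplifications for illustration (the real `txns` member is dereferenced with `->` in the diff, so its declaration differs from the plain array used here).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TXNS_PER_PAGE 4u /* hypothetical stand-in for FD_TXNCACHE_TXNS_PER_PAGE */

typedef struct { uint32_t blockcache_next; } sketch_txn_t;
typedef struct { sketch_txn_t txns[ SKETCH_TXNS_PER_PAGE ]; } sketch_txnpage_t;
typedef struct { sketch_txnpage_t * txnpages; } sketch_txncache_t;

/* Maps a flat transaction index to a pointer to its entry:
   page = idx / per-page count, slot within page = idx % per-page count. */
#define SKETCH_TXN( tc, idx ) \
  ( &(tc)->txnpages[ (idx)/SKETCH_TXNS_PER_PAGE ].txns[ (idx)%SKETCH_TXNS_PER_PAGE ] )

/* Example use: follow one link of a chain. */
static uint32_t sketch_next( sketch_txncache_t * tc, uint32_t head ) {
  return SKETCH_TXN( tc, head )->blockcache_next;
}
```

A `static inline` function would work equally well and gives type checking on the arguments; the macro form is shown only because that is what the comment proposes.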
typedef struct fd_txncache_private_txn fd_txncache_private_txn_t;

struct fd_txncache_private_txnpage {
You should mention explicitly here that all the txns refer to the same blockhash
typedef struct fd_txncache_private_txn fd_txncache_private_txn_t;

struct fd_txncache_private_txnpage {
  ushort free; /* The number of free txn entries in this page. */
Alignment of txns is 4, so you have two bytes of padding here, FYI.
I haven't finished understanding it, so I'm not sure this is possible, but what about making a linked list of the pages for a blockhash using these bytes? Then you could get rid of the pages pointer, and freeing a list of pages would be O(1).
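The layout point in the two comments above can be checked with `offsetof`. The structs below are hypothetical stand-ins (the entry fields, the page size, and the `next_page` member are assumptions), and the asserted offsets hold on ABIs where `uint32_t` has 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry type whose strictest member alignment is 4. */
typedef struct { uint32_t blockcache_next; uint32_t slot_next; } sketch_txn_t;

/* Mirrors the shape under review: a 2-byte counter followed by a
   4-byte-aligned array, so the compiler inserts 2 bytes of padding. */
typedef struct {
  uint16_t     free;
  sketch_txn_t txns[ 4 ];
} sketch_txnpage_t;

/* Same size, but the padding bytes now hold the index of the next
   page in a per-blockhash list (the idea floated in the review). */
typedef struct {
  uint16_t     free;
  uint16_t     next_page; /* hypothetical field occupying the former padding */
  sketch_txn_t txns[ 4 ];
} sketch_txnpage_linked_t;

_Static_assert( offsetof( sketch_txnpage_t, txns )==4,
                "two bytes of padding follow free" );
_Static_assert( sizeof( sketch_txnpage_t )==sizeof( sketch_txnpage_linked_t ),
                "next_page fits in the padding" );
```

Whether a 16-bit page index is wide enough, and whether the list can be maintained safely under concurrent inserts, depends on details of the real structure that this sketch does not model.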
ulong idx;
for( idx=0UL; idx<tc->root_slots_cnt; idx++ ) {
  if( FD_UNLIKELY( tc->root_slots[ idx ]==slot ) ) goto unlock;
Mention in the function comment that re-registering a root slot is a no-op.
ushort txnpages_per_blockhash_max;

ulong root_slots_cnt; /* The number of root slots being tracked in the below array. */
ulong * root_slots; /* The highest N slots that have been rooted. These slots are
Could you add a comment about the way this is used? I was definitely expecting something more like a ring buffer with far fewer memmoves, and that initially confused me.
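For comparison, the ring-buffer shape the reviewer had in mind might look like the sketch below. It is not what the PR does: all names and the capacity are hypothetical, and it assumes slots are rooted in increasing order, so that evicting the oldest registration is the same as keeping the highest N slots. If roots can arrive out of order, this would not be equivalent to a sorted array.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ROOT_MAX 4u /* hypothetical capacity */

typedef struct {
  uint64_t slots[ SKETCH_ROOT_MAX ];
  uint64_t cnt; /* total number of distinct slots ever registered */
} sketch_roots_t;

/* Linear scan over the live entries.  Until the ring first fills,
   only indices [0,cnt) are valid; afterwards every index is. */
static int sketch_roots_contains( sketch_roots_t const * r, uint64_t slot ) {
  uint64_t live = r->cnt<SKETCH_ROOT_MAX ? r->cnt : SKETCH_ROOT_MAX;
  for( uint64_t i=0UL; i<live; i++ ) if( r->slots[ i ]==slot ) return 1;
  return 0;
}

/* Registers a root slot, overwriting the oldest entry once the ring
   is full.  Re-registering a slot that is still tracked is a no-op. */
static void sketch_roots_register( sketch_roots_t * r, uint64_t slot ) {
  if( sketch_roots_contains( r, slot ) ) return;
  r->slots[ r->cnt % SKETCH_ROOT_MAX ] = slot;
  r->cnt++;
}
```

Registration here is a single store with no memmove, at the cost of the array no longer being sorted.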
Related to #1770