Dry-run multi-block migrations #17

liamaharon · 2023-08-14T07:01:16Z

Original issue paritytech/substrate#14291

          @kianenigma am I right to assume we'll want to make it easy to test these with try-runtime-cli? Ideally should be as easy as `on-runtime-upgrade`, maybe make it auto-mine blocks until all multi-block migrations are complete & provide a hook that runs afterwards?

Originally posted by @liamaharon in paritytech/substrate#14275 (comment)

Kian response:

Good to think about it early, and yes I think it is possible with little effort. I believe follow-chain will already do everything needed to test an MBM?

The text was updated successfully, but these errors were encountered:

liamaharon · 2024-04-10T05:53:52Z

The inherents issue is fixed, so this is unblocked :)

liamaharon · 2024-04-16T04:50:10Z

Broadly, there seem to be two approaches we could take for adding try-runtime testing for MBMs, I'd like to allow for some discussion before choosing one to start work on.

1. Run MBMs in existing `try_on_runtime_upgrade` logic until complete

This approach would integrate MBMs into the existing try_on_runtime_upgrade logic, so they're tested as part of the existing migration testing logic flow.

Simply, after dry-running the migrations, step would be called until all MBMs are completed.

Pros:

Simple and fast to implement
Non-breaking change to try-runtime-cli

Cons:

Less realistic as MBMs would be progressing without inherents coming in between.

2. Mine new blocks after `try_on_runtime_upgrade` has executed

In this approach, we'd continue to mine empty blocks after the runtime upgrade, checking the MBM status (maybe needs a new runtime api?) each block to check the state of MBMs.

Pros:

More realistic, since inherents will be applied and the block number will progress between each MBM step as it would when run for real

Cons:

More complex implementation
Requires new runtime api to check MBM status (I think?)
Breaking change to try-runtime-cli, and try-runtime-cli will become more complex to use to dry-run migrations because the user will need to specify their chain inherents.

My thoughts

I'm on the fence, but slightly leaning towards option 1.

Although option 2 is more realistic, I can't foresee why MBMs would be built in such a way that inherents should impact their progress. That sounds like it would be an anti-pattern anyway, that would be fine to not allow?

If option 2 did not come with the drawbacks of being significantly more complex to implement, or make try-runtime-cli harder to use for the end-user, I would suggest it. But given the drawbacks, my gut feeling is it's not worth the additional time or complexity to implement and we should proceed with option 1.

ggwpez · 2024-04-17T09:54:09Z

Yea i think initially we can do option 1, to get something going quick since the first MBMs are being written.

Although option 2 is more realistic, I can't foresee why MBMs would be built in such a way that inherents should impact their progress. That sounds like it would be an anti-pattern anyway, that would be fine to not allow?

"should not" and "do not" is exactly why we need to test it. Streaming some data in from an inherent to a MBM would work i guess - but very non-standard to do.
Would it work to add a --mbm-test-strategy flag or something that can toggle between option 1 and (possibly future) option 2? Not sure how difficult it would be to write it in a general way to have this possibility.

kianenigma · 2024-04-17T16:48:04Z

My main reservation towards option 1 is that it might miss something like checking the weight of each step. Can we strictly ensure that a dry-run-ed MBM with approach 1 will detect if it is overweight?

If these two are the only options, I would also lean towards 1. I agree that getting something out the door to test MBMs is important and trumps everything else. But, even though I would support a more sophisticated approach, I don't like option 2 per-se, as it is not the right abstraction.

If try-runtime provides a good system to expresses "please dry-run 5 empty blocks for me", which was always part of the plan, testing MBMs is more or less trivial. I think this is what we should aim for, and then testing all types of migrations will be a specialized version of this.

In other words, if try-runtime can create empty blocks for a given, which I believe with the correct inherent extension it can, even the existing try-runtime-upgrade would be rendered deprecated, as it can be tested more realistically by just creating the next empty block.

Even now, I believe the main advantage of try_runtime_upgrade being a custom runtime-api is such that it can bypass block level details such as inherents and just execute the migration code path.

liamaharon · 2024-04-25T05:57:49Z

It seems there is consensus that we should pursue option 2, even if it will take longer.

My first question about implementing it is what is the best way to get the MBM migration status from try-runtime-cli while mining empty blocks?

Initially I was thinking to directly query the event storage, but I don't think I'd be able to decode it because the RuntimeEvent type varies between every runtime, and we don't know what it is.

We also I don't think can query the pallet storage directly, because the key it's under varies across runtimes.

We could introduce a new runtime api for querying MBM status, but that's far less than ideal because it creates non-trivial work for runtime devs to implement.

The approach that seems best would be to modify Executive's try_execute_block methods return value to include information about MBM status. Then, looks like try-runtime-cli can get MBM information from fn it's calling already.

@bkchr @ggwpez @kianenigma let me know if this sounds sensible.

kianenigma · 2024-04-26T09:33:59Z

My first question about implementing it is what is the best way to get the MBM migration status from try-runtime-cli while mining empty blocks?

I think this new return type of initailize_block is exactly what is meant for this:

https://paritytech.github.io/polkadot-sdk/master/sp_runtime/enum.ExtrinsicInclusionMode.html

I think if you update this code to respect this new type, you would already have MBM testing done:

try-runtime-cli/core/src/commands/fast_forward.rs

Lines 153 to 222 in 226514e

    
           async fn produce_next_block<Block: BlockT, HostFns: HostFunctions>( 
        
               externalities: &mut TestExternalities<HashingFor<Block>>, 
        
               executor: &WasmExecutor<HostFns>, 
        
               parent_height: NumberFor<Block>, 
        
               parent_hash: Block::Hash, 
        
               inherent_provider: &dyn InherentProvider<Err = String>, 
        
               previous_block_building_info: Option<(InherentData, Digest)>, 
        
           ) -> Result<(Block, Option<(InherentData, Digest)>)> { 
        
               let (inherent_data_provider, pre_digest) = 
        
                   inherent_provider.get_inherent_providers_and_pre_digest(previous_block_building_info)?; 
        
               let inherent_data = inherent_data_provider 
        
                   .create_inherent_data() 
        
                   .await 
        
                   .map_err(|s| sc_cli::Error::Input(s.to_string()))?; 
        
               let digest = Digest { logs: pre_digest }; 
        
               let header = Block::Header::new( 
        
                   parent_height + One::one(), 
        
                   Default::default(), 
        
                   Default::default(), 
        
                   parent_hash, 
        
                   digest.clone(), 
        
               ); 
        
               call::<Block, _>( 
        
                   externalities, 
        
                   executor, 
        
                   "Core_initialize_block", 
        
                   &header.encode(), 
        
               ) 
        
               .await?; 
        
               let extrinsics = dry_call::<Vec<Block::Extrinsic>, Block, _>( 
        
                   externalities, 
        
                   executor, 
        
                   "BlockBuilder_inherent_extrinsics", 
        
                   &inherent_data.encode(), 
        
               )?; 
        
               for xt in &extrinsics { 
        
                   call::<Block, _>( 
        
                       externalities, 
        
                       executor, 
        
                       "BlockBuilder_apply_extrinsic", 
        
                       &xt.encode(), 
        
                   ) 
        
                   .await?; 
        
               } 
        
               let header = dry_call::<Block::Header, Block, _>( 
        
                   externalities, 
        
                   executor, 
        
                   "BlockBuilder_finalize_block", 
        
                   &[0u8; 0], 
        
               )?; 
        
               call::<Block, _>( 
        
                   externalities, 
        
                   executor, 
        
                   "BlockBuilder_finalize_block", 
        
                   &[0u8; 0], 
        
               ) 
        
               .await?; 
        
               Ok(( 
        
                   Block::new(header, extrinsics), 
        
                   Some((inherent_data, digest)), 
        
               )) 
        
           }

ggwpez · 2024-04-26T11:36:59Z

I think just for PoC, we can also use chopsticks, since it exposes a block-mining RPC and the possibility to decode events.

I think this new return type of initailize_block is exactly what is meant for this:

Kind of, yes. Although there is a hail-mary last resort option, where a failed migration can decide what to do. In that case, one option is ForceUnstuck, and then the chain resumes. So just looking at that could be misinterpreted as success.
It is still better to either 1) parse events or 2) create a Runtime API.

liamaharon · 2024-04-26T15:00:07Z

I think if you update this code to respect this new type, you would already have MBM testing done:

Unfortunately as @ggwpez said just checking the ExtrinsicInclusionMode won't give us any solid MBM status.

I think just for PoC, we can also use chopsticks, since it exposes a block-mining RPC and the possibility to decode events.

I am worried devex will suck if we start requiring two completely different tools to test a runtime upgrade. Also, I'm not sure Chopsticks handles pre/post hooks.

For now proceed with modifying the try_execute_block ret value unless u have some other ideas.

bkchr · 2024-04-28T20:25:27Z

Unfortunately as @ggwpez said just checking the ExtrinsicInclusionMode won't give us any solid MBM status.

What solid MBM status? What do you need to know besides if they are still running?

liamaharon · 2024-04-29T05:28:03Z

Unfortunately as @ggwpez said just checking the ExtrinsicInclusionMode won't give us any solid MBM status.

What solid MBM status? What do you need to know besides if they are still running?

Need to also know whether or not it succeeded.

bkchr · 2024-04-29T20:55:05Z

Hmm? Run post_runtime_upgrade after all steps are finished.

liamaharon · 2024-04-30T05:05:26Z

Hmm? Run post_runtime_upgrade after all steps are finished.

Do you mean some try-runtime feature-gated code that will panic if the MBMs fail?

bkchr · 2024-04-30T08:01:47Z

No I mean as we have for OnRuntimeUpgrade the pre and post runtime upgrade functions. Here we could have something similar, aka a function that we call after the entire MBM is finished.

liamaharon added the enhancement New feature or request label Aug 14, 2023

liamaharon mentioned this issue Aug 14, 2023

try-runtime-cli: support testing new multi-block migrations paritytech/substrate#14291

Closed

liamaharon mentioned this issue Apr 24, 2024

MBM try-runtime support paritytech/polkadot-sdk#4251

Open

2 tasks

liamaharon self-assigned this Apr 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dry-run multi-block migrations #17

Dry-run multi-block migrations #17

liamaharon commented Aug 14, 2023 •

edited

liamaharon commented Apr 10, 2024

liamaharon commented Apr 16, 2024 •

edited

ggwpez commented Apr 17, 2024 •

edited

kianenigma commented Apr 17, 2024 •

edited

liamaharon commented Apr 25, 2024

kianenigma commented Apr 26, 2024

ggwpez commented Apr 26, 2024

liamaharon commented Apr 26, 2024

bkchr commented Apr 28, 2024

liamaharon commented Apr 29, 2024

bkchr commented Apr 29, 2024

liamaharon commented Apr 30, 2024

bkchr commented Apr 30, 2024

Dry-run multi-block migrations #17

Dry-run multi-block migrations #17

Comments

liamaharon commented Aug 14, 2023 • edited

liamaharon commented Apr 10, 2024

liamaharon commented Apr 16, 2024 • edited

1. Run MBMs in existing try_on_runtime_upgrade logic until complete

2. Mine new blocks after try_on_runtime_upgrade has executed

My thoughts

ggwpez commented Apr 17, 2024 • edited

kianenigma commented Apr 17, 2024 • edited

liamaharon commented Apr 25, 2024

kianenigma commented Apr 26, 2024

ggwpez commented Apr 26, 2024

liamaharon commented Apr 26, 2024

bkchr commented Apr 28, 2024

liamaharon commented Apr 29, 2024

bkchr commented Apr 29, 2024

liamaharon commented Apr 30, 2024

bkchr commented Apr 30, 2024

liamaharon commented Aug 14, 2023 •

edited

liamaharon commented Apr 16, 2024 •

edited

1. Run MBMs in existing `try_on_runtime_upgrade` logic until complete

2. Mine new blocks after `try_on_runtime_upgrade` has executed

ggwpez commented Apr 17, 2024 •

edited

kianenigma commented Apr 17, 2024 •

edited