XDM with MMR proofs: 1 #2508

vedhavyas · 2024-02-05T14:37:12Z

This is the first of multiple PRs that update the XDM protocol to use MMR proofs for verification. This PR brings most, if not all, of the type changes required for a complete migration to MMR proofs. The goal was to introduce new types and add the necessary infra to generate and verify XDM proofs while Domains are not yet enabled.

This PR also handles the verification part of the XDM, but I have not updated the tests yet since the plan was to focus on the infra.

Some notes:

Storage changes to pallet-domains
Host function updates and new additions for Consensus chain and Domains
Added a new crate for messenger host functions instead of using sp-messenger due to cyclic dependencies; changing that would be too invasive, which I'm not quite excited about yet. Once we have the initial changes in, we can think about refactoring if the need arises.

Overall, I recommend going commit by commit for a better reviewing experience.

Removed an existing now redundant storage

Code contributor checklist:

I have read, understood and followed contributing guide

nazar-pc

I get the idea here, but I do not believe host functions are necessary. Left some other minor comments and didn't review last 3 commits at all.

crates/pallet-domains/src/block_tree.rs

crates/pallet-domains/src/lib.rs

nazar-pc · 2024-02-06T09:25:19Z

crates/pallet-domains/src/staking.rs

@@ -1231,9 +1233,10 @@ where

 #[cfg(test)]


Not about this PR, but tests should generally be in their own module. It is a better dev experience (library doesn't recompile when tests change) and cleaner overall.

domains/primitives/messenger/src/messages.rs

nazar-pc · 2024-02-06T09:49:00Z

crates/subspace-runtime/src/lib.rs

+        opaque_leaf: EncodableOpaqueLeaf,
+        proof: Proof<mmr::Hash>,
+    ) -> Option<Hash> {
+        let leaf: mmr::Leaf = opaque_leaf.into_opaque_leaf().try_decode()?;


The way it is written works and is correct, but I find that placing types where they come from, rather than making the compiler infer them backwards, is easier to read and looks more linear:

Suggested change

-        let leaf: mmr::Leaf = opaque_leaf.into_opaque_leaf().try_decode()?;
+        let leaf = opaque_leaf.into_opaque_leaf().try_decode::<mmr::Leaf>()?;

There are also other places like this in the code.

This is similar to let x = X::from(y) being more readable than let x: X = y.into(): just try to read it out loud, it is harder to do smoothly with .into(). In most cases : Type is not necessary.
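As a minimal, self-contained sketch of the two styles being compared (using `str::parse` and `u64::from` from the standard library rather than the PR's `try_decode`; the function names are hypothetical and only for illustration):

```rust
use std::num::ParseIntError;

/// Type named on the binding; the compiler infers `parse`'s target backwards.
fn parse_with_annotation(input: &str) -> Result<u32, ParseIntError> {
    let value: u32 = input.trim().parse()?;
    Ok(value)
}

/// Type named at the call that produces it, via turbofish.
fn parse_with_turbofish(input: &str) -> Result<u32, ParseIntError> {
    let value = input.trim().parse::<u32>()?;
    Ok(value)
}

/// The same contrast for conversions: `X::from(y)` versus `let x: X = y.into()`.
fn widen(y: u8) -> (u64, u64) {
    let a = u64::from(y);
    let b: u64 = y.into();
    (a, b)
}
```

Both forms compile to the same thing; the difference is only where the reader encounters the type.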

Hmm, that is fair. For me this approach always made a lot of sense from a reading perspective, since I know the final type being assigned to the variable up front instead of reading through to the end to find out what the type is.

This is similar to let x = X::from(y) being more readable than let x: X = y.into(),

This I agree with, since the position of the final type is the same in both scenarios and, readability-wise, it is much cleaner to read.

Having said that, I don't have a strong opinion, just a personal preference.

nazar-pc · 2024-02-06T09:56:07Z

domains/runtime/evm/src/lib.rs

+        let leaf: MmrLeaf<ConsensusBlockNumber, ConsensusBlockHash> =
+            opaque_leaf.into_opaque_leaf().try_decode()?;
+        let state_root = leaf.state_root();
+        verify_mmr_proof(vec![EncodableOpaqueLeaf::from_leaf(&leaf)], proof.encode())


Why not use Mmr::verify_leaves here?

This is a domain runtime, and domain runtimes do not have access to Mmr::verify_leaves, unlike the Consensus chain runtime. Hence, we have the host functions to verify such proofs for domains.

It may not have it right now, but what prevents us from including it as a dependency in the domain runtime?

I'm not sure I completely follow you here. I have explained in another comment why I chose to go the route of using a host function to verify the proof.

crates/sp-domains/src/lib.rs

domains/pallets/messenger/src/benchmarking.rs

nazar-pc · 2024-02-06T10:14:10Z

domains/primitives/messenger-host-functions/src/host_functions.rs

+            StorageKeyRequest::ConfirmedDomainBlockStorageKey(domain_id) => runtime_api
+                .confirmed_domain_block_storage_key(best_hash, domain_id)
+                .map(Some),


I don't think I understand this host function either; the comment is the same as for the previous host functions.

This host function dates back to a discussion we had on assumed storage keys, where we were defining mocked storages and deriving their storage keys to verify the storage proof.

Here in messenger, when a message comes from Domain_a to Domain_b, Domain_b previously derived the storage key from its own storage. This forced the pallet messenger definition to be the same across all the different runtimes.

What this host function does is let Domain_b use the Domain_a runtime from the consensus chain and then use its Stateless API to get the storage key. This approach does not hold us back with strict assumptions that pallets are defined exactly the same across runtimes.
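A rough sketch of the decoupling described above, under stated assumptions: `KeyRequest`, `StorageKeyApi`, `DomainA`, `DomainB` and `build_key` are hypothetical names, and plain byte concatenation stands in for whatever hashing a real runtime applies when deriving storage keys. The point is only that the destination asks the source runtime for the key instead of deriving it from its own pallet layout.

```rust
/// Hypothetical request type; the real host function uses its own request enum.
enum KeyRequest {
    Outbox { channel_id: u32, nonce: u32 },
}

/// Hypothetical stand-in for a source runtime's stateless storage-key API.
trait StorageKeyApi {
    fn storage_key(&self, request: &KeyRequest) -> Vec<u8>;
}

/// Builds a key from pallet name, storage item and arguments.
/// Concatenation is a placeholder for the hashing a real runtime would apply.
fn build_key(pallet: &str, item: &str, args: &[u32]) -> Vec<u8> {
    let mut key = Vec::new();
    key.extend_from_slice(pallet.as_bytes());
    key.push(b'/');
    key.extend_from_slice(item.as_bytes());
    for arg in args {
        key.extend_from_slice(&arg.to_le_bytes());
    }
    key
}

/// Source runtime whose messenger pallet is named "Messenger".
struct DomainA;
/// Source runtime that names the same pallet differently.
struct DomainB;

impl StorageKeyApi for DomainA {
    fn storage_key(&self, request: &KeyRequest) -> Vec<u8> {
        match request {
            KeyRequest::Outbox { channel_id, nonce } => {
                build_key("Messenger", "Outbox", &[*channel_id, *nonce])
            }
        }
    }
}

impl StorageKeyApi for DomainB {
    fn storage_key(&self, request: &KeyRequest) -> Vec<u8> {
        match request {
            KeyRequest::Outbox { channel_id, nonce } => {
                build_key("XdmMessenger", "Outbox", &[*channel_id, *nonce])
            }
        }
    }
}

/// Destination side: ask the source runtime for the key rather than deriving it locally.
fn key_for_incoming_message(src_runtime: &dyn StorageKeyApi, channel_id: u32, nonce: u32) -> Vec<u8> {
    src_runtime.storage_key(&KeyRequest::Outbox { channel_id, nonce })
}
```

With this shape, two source runtimes can lay out their messenger pallet differently and the destination still obtains the right key for each.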

I see the benefits of decoupling, but I'm confused about what this has to do with domains, this seems to call consensus client runtime API 🤔

So the domain_proof is a storage proof derived from LatestConfirmedDomainBlock on the consensus runtime. We use the consensus runtime API to get that storage key while verifying the domain_proof, and then use the src domain runtime to fetch the Outbox and InboxResponse storage keys to complete the XDM message verification.

vedhavyas

I do not believe host functions are necessary.

I tried to explain in the comments themselves why the host functions are required, but we can do a sync on this PR specifically if that helps you.

crates/pallet-domains/src/lib.rs

crates/sp-domains/src/lib.rs

vedhavyas · 2024-02-06T10:31:42Z

crates/sp-subspace-mmr/src/host_functions.rs

@@ -56,4 +63,16 @@ where
            extrinsics_root: H256::from_slice(header.extrinsics_root().as_ref()),
        })
    }
+
+    fn verify_mmr_proof(&self, leaves: Vec<EncodableOpaqueLeaf>, encoded_proof: Vec<u8>) -> bool {


I understand why a host function is needed for proof generation, but why is it needed for proof verification, and why are you calling the runtime API for proof verification anyway?

Okay, I see the confusion here. I decided to reuse this extension implementation for the Domain host function. See the definition of DomainMmrRuntimeInterface. Proof verification for the consensus chain is done directly on the consensus chain itself. Proof verification for Domains, on the other hand, requires a host function that uses the consensus chain runtime_api to do such verification.

vedhavyas · 2024-02-06T10:38:45Z

crates/sp-subspace-mmr/src/host_functions.rs

@@ -56,4 +63,16 @@ where
            extrinsics_root: H256::from_slice(header.extrinsics_root().as_ref()),
        })
    }
+
+    fn verify_mmr_proof(&self, leaves: Vec<EncodableOpaqueLeaf>, encoded_proof: Vec<u8>) -> bool {


Moreover, you're requesting something about the best block, which may not even belong to the same fork as the block that made the host function call.

For MMR, the proof is strictly self-contained in the sense that we do not need to use the same consensus chain hash that generated the proof. The reason is that the proof holds the total leaf count at the time of proof generation. So while verifying, the MMR root is calculated with that leaf count, since the MMR is strictly append-only. Also, the leaves before a block is finalized are stored in a fork-aware fashion, and the runtime picks what it assumes to be the canonical fork while verifying. So the proof will be either valid or invalid.
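To illustrate why the leaf count alone pins down the shape being verified, here is a small sketch based on the general MMR structure (not on this PR's code; the function names are hypothetical). In an append-only MMR, each set bit of the leaf count corresponds to one peak, so a verifier that knows the leaf count from the proof knows exactly which peaks the root must commit to, regardless of how many leaves were appended afterwards.

```rust
/// Heights of the MMR peaks for `leaf_count` leaves, tallest first.
/// A set bit at position `h` means one perfect subtree holding 2^h leaves.
fn peak_heights(leaf_count: u64) -> Vec<u32> {
    (0..u64::BITS)
        .rev()
        .filter(|&h| leaf_count & (1u64 << h) != 0)
        .collect()
}

/// Half-open leaf index ranges `(start, end)` covered by each peak, left to right.
fn peak_leaf_ranges(leaf_count: u64) -> Vec<(u64, u64)> {
    let mut start = 0u64;
    peak_heights(leaf_count)
        .into_iter()
        .map(|h| {
            let end = start + (1u64 << h);
            let range = (start, end);
            start = end;
            range
        })
        .collect()
}
```

For example, 11 leaves (binary 1011) give peaks covering leaves 0..8, 8..10 and 10..11. Appending a 12th leaf changes the later peaks but never rewrites the leaves themselves, which is why a proof generated at leaf count 11 can still be checked later.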

vedhavyas · 2024-02-06T10:43:19Z

crates/subspace-runtime/src/lib.rs

+        opaque_leaf: EncodableOpaqueLeaf,
+        proof: Proof<mmr::Hash>,
+    ) -> Option<Hash> {
+        let leaf: mmr::Leaf = opaque_leaf.into_opaque_leaf().try_decode()?;


Hmm, that is fair. For me this approach always made a lot of sense from a reading perspective, since I know the final type being assigned to the variable up front instead of reading through to the end to find out what the type is.

This is similar to let x = X::from(y) being more readable than let x: X = y.into(),

This I agree with, since the position of the final type is the same in both scenarios and, readability-wise, it is much cleaner to read.

Having said that, I don't have a strong opinion, just a personal preference.

vedhavyas · 2024-02-06T10:48:41Z

domains/primitives/messenger-host-functions/src/host_functions.rs

+            StorageKeyRequest::ConfirmedDomainBlockStorageKey(domain_id) => runtime_api
+                .confirmed_domain_block_storage_key(best_hash, domain_id)
+                .map(Some),


This host function dates back to a discussion we had on assumed storage keys, where we were defining mocked storages and deriving their storage keys to verify the storage proof.

Here in messenger, when a message comes from Domain_a to Domain_b, Domain_b previously derived the storage key from its own storage. This forced the pallet messenger definition to be the same across all the different runtimes.

What this host function does is let Domain_b use the Domain_a runtime from the consensus chain and then use its Stateless API to get the storage key. This approach does not hold us back with strict assumptions that pallets are defined exactly the same across runtimes.

domains/primitives/messenger/src/messages.rs

vedhavyas · 2024-02-06T11:02:10Z

domains/runtime/evm/src/lib.rs

+        let leaf: MmrLeaf<ConsensusBlockNumber, ConsensusBlockHash> =
+            opaque_leaf.into_opaque_leaf().try_decode()?;
+        let state_root = leaf.state_root();
+        verify_mmr_proof(vec![EncodableOpaqueLeaf::from_leaf(&leaf)], proof.encode())


This is a domain runtime, and domain runtimes do not have access to Mmr::verify_leaves, unlike the Consensus chain runtime. Hence, we have the host functions to verify such proofs for domains.

vedhavyas · 2024-02-06T11:59:28Z

Seems like this PR might have broken the operator tests, though I'm not sure if that is the case yet. For now I'm cancelling these CI runs until I can figure out what is going on.

vedhavyas · 2024-02-06T15:19:02Z

@nazar-pc I was just verifying how state_pruning and block_pruning work live with @jfrank-summit. This is currently Jeremy's setup:

state_pruning: archive-canonical
block_pruning: 256

We decided to pick block #10 on gemini-3h and, as expected, the state for block #10 was available but the block body was not. The header was still available, which reminds me of this issue from you, paritytech/substrate#14758. Since Shamil is handling the pruning of headers as well, the reference implementation will have state that keeps growing with each state transition. I'm assuming this was unintentional, but I was previously under the assumption that it was intentional.

Now coming to the crux of the PR and our earlier discussion: we can make the MMR verification stateless by storing the MMR roots on the consensus runtime and pruning them on a rolling window. As you pointed out, there are some edge cases with domains, which would need to access such MMR roots to verify these proofs; the roots might not be available when required if, for example, a domain halted for a long enough period while the consensus chain continued to progress.

Overall, it looks like we would need some form of time limit for XDMs by which they should be included; if they are not, that particular channel is completely halted. There will be a way to unhalt it by submitting some extension proofs, but we have not discussed or spec'ed this out yet.

I'll try to put together some more points after looking at other options, but at the moment I feel the usage of host functions to fetch the offchain data for verification is more viable.
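The rolling-window alternative and its edge case can be sketched as follows. This is an illustration only, assuming a hypothetical `RootWindow` type keyed by consensus block number; it is not how the PR stores anything.

```rust
use std::collections::BTreeMap;

/// Hypothetical rolling window of MMR roots keyed by consensus block number.
struct RootWindow {
    capacity: usize,
    roots: BTreeMap<u32, [u8; 32]>,
}

impl RootWindow {
    fn new(capacity: usize) -> Self {
        Self { capacity, roots: BTreeMap::new() }
    }

    /// Records a root and prunes the oldest entries beyond `capacity`.
    fn insert(&mut self, block_number: u32, root: [u8; 32]) {
        self.roots.insert(block_number, root);
        while self.roots.len() > self.capacity {
            self.roots.pop_first();
        }
    }

    /// `None` means the root was already pruned (or never stored), so the proof
    /// can be neither accepted nor rejected; `Some(bool)` is the comparison result.
    fn verify_against(&self, block_number: u32, claimed_root: &[u8; 32]) -> Option<bool> {
        self.roots.get(&block_number).map(|root| root == claimed_root)
    }
}
```

The `None` case is the edge case mentioned above: a domain that halted for longer than the window would present proofs against roots that are no longer stored, which is why some form of time limit for XDM comes up.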

nazar-pc · 2024-02-06T15:26:25Z

We decided to pick block #10 on gemini-3h and as expected, the state for block #10 was available but block body was not available

Hm... this is very problematic then 😞 We'll need to fix it. And this also explains a lot, thanks!

I'm assuming this was unintentional but I was under the assumption that it was indeed intentional.

This was not intentional, but there is no API in Substrate to deliberately prune things we want to prune, rather than everything at a fixed distance from the tip of the chain.

Now coming to the crux of the PR and our earlier discussion: we can make the MMR verification stateless by storing the MMR roots on the consensus runtime and pruning them on a rolling window. As you pointed out, there are some edge cases with domains, which would need to access such MMR roots to verify these proofs; the roots might not be available when required if, for example, a domain halted for a long enough period while the consensus chain continued to progress.

Overall, it looks like we would need some form of time limit for XDMs by which they should be included; if they are not, that particular channel is completely halted. There will be a way to unhalt it by submitting some extension proofs, but we have not discussed or spec'ed this out yet.

I'll try to put together some more points after looking at other options, but at the moment I feel the usage of host functions to fetch the offchain data for verification is more viable.

I think time limit is necessary either way due to the fact that we are pruning MMR data in offchain storage anyway and we will prune state once we can as well.

NingLin-P · 2024-02-06T21:36:08Z

I have spent some time understanding how MMR proof generation and verification work in more detail and catching up with the above discussion.

I think a host function is inevitable for verifying XDM on the domain, because the dest domain has to verify that the XDM comes from a confirmed block of the src domain, and only the consensus chain knows whether a domain block (or ER) is confirmed; thus the domain has to query the consensus chain when verifying the XDM.

we can make the mmr verification stateless by storing the mmr roots on the consensus runtime and prune them on a rolling window

I'm not sure about the whole idea of storing multiple MMR roots on the consensus runtime, but it feels like we can achieve the same by using an old consensus hash (instead of the best hash) to call the consensus runtime API and get the expected MMR root.

vedhavyas · 2024-02-07T04:02:03Z

I'm not sure about the whole idea of storing multiple MMR roots on the consensus runtime, but it feels like we can achieve the same by using an old consensus hash (instead of the best hash) to call the consensus runtime API and get the expected MMR root.

You are making an assumption here that the state for such an old hash exists on every consensus node that is currently importing / doing state transitions. This is not required to be true, and the current state_pruning being either archive or archive-canonical just happens to be an unintentional change, as we discussed above.

I think time limit is necessary either way due to the fact that we are pruning MMR data in offchain storage anyway and we will prune state once we can as well.

True, but we can make this change specifically for MMR offchain data rather than relying on state to be available, since this would be self-contained, unlike a dependency on state being available.

vedhavyas · 2024-02-14T07:22:16Z

@nazar-pc while tracking MMR roots might be one possible solution, doing so introduces a bunch of edge cases, some we already know and some we might not know yet. It also defeats the purpose of introducing MMR: if we are tracking those roots, then we would rather just track state_roots instead and completely eliminate MMR for XDM. Personally, I'm not super excited about this.

As for the expiration time for XDM, it would still remain the same and we just need to prune the offchain leaves. I initially thought it would be invasive, but it is actually quite straightforward to prune a leaf, and this is self-contained: it would not need to rely on state or block body for any verification.

I have updated the PR with the earlier review feedback. PTAL and let me know your thoughts?

nazar-pc · 2024-02-16T12:08:34Z

@nazar-pc while tracking MMR roots might be one possible solution, doing so introduces a bunch of edge cases, some we already know and some we might not know yet. It also defeats the purpose of introducing MMR: if we are tracking those roots, then we would rather just track state_roots instead and completely eliminate MMR for XDM. Personally, I'm not super excited about this.

I believe MMR is useful beyond the immediate need of XDM. While I also think storing roots in the runtime is preferable to storing them in the client (it might allow for some interesting proofs in the future), it is not a blocker.

Please don't let this PR be blocked by me, @NingLin-P can you take a look here, please?

vedhavyas · 2024-02-22T05:56:19Z

Conflicts are not a blocker for PR review. Will fix the conflicts in a moment. Thanks!

nazar-pc

Makes sense overall, but the way force pushes were done after initial review with zero explanation (especially the first one) makes it really hard to understand what has actually changed since last time I reviewed earlier version of this PR.

nazar-pc · 2024-02-22T12:24:07Z

domains/service/Cargo.toml

 sp-transaction-pool = { version = "4.0.0-dev", git = "https://github.com/subspace/polkadot-sdk", rev = "d6b500960579d73c43fc4ef550b703acfa61c4c8" }
 subspace-core-primitives = { version = "0.1.0", path = "../../crates/subspace-core-primitives" }
 subspace-runtime-primitives = { version = "0.1.0", path = "../../crates/subspace-runtime-primitives" }
+subspace-service = { version = "0.1.0", path = "../../crates/subspace-service" }


I really don't like this change. This adds a huge subspace-service to domains-service significantly reducing compilation parallelism and likely increasing compilation time.

That is fair. If not this, then I would have to introduce another crate just for Domain-specific host functions and the extension factory. It did not feel justified enough, but if that is preferable, I can do that instead 👍🏼

vedhavyas

the way force pushes were done after initial review with zero explanation (especially the first one) makes it really hard to understand what has actually changed since last time I reviewed earlier version of this PR

Apologies for the lack of explanation. I picked out some commits, specifically the first one, into a separate PR that has already landed in main. GitHub's diff was all messed up when I initially went with a merge, so I ended up rebasing for a cleaner diff while taking the opportunity to address the feedback provided earlier.

vedhavyas · 2024-02-23T04:37:45Z

domains/service/Cargo.toml

 sp-transaction-pool = { version = "4.0.0-dev", git = "https://github.com/subspace/polkadot-sdk", rev = "d6b500960579d73c43fc4ef550b703acfa61c4c8" }
 subspace-core-primitives = { version = "0.1.0", path = "../../crates/subspace-core-primitives" }
 subspace-runtime-primitives = { version = "0.1.0", path = "../../crates/subspace-runtime-primitives" }
+subspace-service = { version = "0.1.0", path = "../../crates/subspace-service" }


That is fair. If not this, then I would have to introduce another crate just for Domain-specific host functions and the extension factory. It did not feel justified enough, but if that is preferable, I can do that instead 👍🏼

nazar-pc · 2024-02-23T10:30:29Z

Apologies for the lack of explanation. I picked out some commits, specifically the first one, into a separate PR that has already landed in main. GitHub's diff was all messed up when I initially went with a merge, so I ended up rebasing for a cleaner diff while taking the opportunity to address the feedback provided earlier.

It was typically done in the service file, but I see that you included it in subspace-service because it is needed there as well, not just in domain's service. Separate crate works for me.

nazar-pc

The changes make sense, but I'd like to avoid re-exports both in subspace-service and in domains-service.

vedhavyas · 2024-02-23T12:51:15Z

The changes make sense, but I'd like to avoid re-exports both in subspace-service and in domains-service.

I'm not sure why that is preferable here, since this is a mono-repo crate and would not have any issues with semver bumps, unlike an external crate, but I don't have a strong stance on this. Should be updated now.

nazar-pc

I'm not sure why that is preferable here, since this is a mono-repo crate and would not have any issues with semver bumps, unlike an external crate, but I don't have a strong stance on this. Should be updated now.

That way it is clearer what the explicit dependencies are. Also, public re-exports sometimes result in dependencies that remain even when no longer used, simply due to the public re-export. IMO it is a good default; the only exception I'm aware of in our repo is the whole libp2p re-exported from subspace-networking, but arguably it makes a lot of sense there.

vanhauser-thc · 2024-04-05T11:03:58Z

domains/client/relayer/src/lib.rs

-                        BlockInfo {
-                            block_number: number,
-                            block_hash: hash,
+                // TODO: Derive correct domain proof


Is there tracking for this TODO? please put a link if so. if not - well there should be :)

This todo is already resolved as part of the next PR.
more here - https://github.com/subspace/subspace/pull/2562/files#diff-f6f9310595b3fedfb829530fe9971b90398313a38566a75f08479f8518db178fL384

vanhauser-thc · 2024-04-05T12:00:43Z

LGTM on the surface but we will review XDM once the implementation is stable.

vedhavyas requested review from NingLin-P, nazar-pc and rg3l3dr as code owners February 5, 2024 14:37

vedhavyas added execution (Subspace execution) and need to audit (This change needs to be audited) labels Feb 5, 2024

vedhavyas force-pushed the xdm_mmr branch 6 times, most recently from 2f2ace6 to 5ee10ec Compare February 6, 2024 08:01

nazar-pc reviewed Feb 6, 2024

vedhavyas commented Feb 6, 2024

vedhavyas mentioned this pull request Feb 7, 2024

Update storages and types for Domains #2512

Merged

1 task

re-define XDM proof and mark todos where necessary.

1a6aea5

These todos are cleared in the next coming commits

vedhavyas force-pushed the xdm_mmr branch from 5ee10ec to ad4f880 Compare February 14, 2024 06:55

vedhavyas added 7 commits February 14, 2024 12:37

define domain mmr host functions.

21310d5

This will be used by the domains to verify MMR proof

define singular mmr proof verification and state root extraction for …

489e22e

…paller-messenger

define host function to get storage keys for a given chain and update…

3cc34b4

… messenger xdm verification.

update messenger runtime api definitions and cleanup unused leftover …

faf85fd

…code

update runtimes with actual xdm verification

9c6a750

remove state root api since relayer would not use it anymore but inst…

2acab7c

…ead rely on on Confirmed domain blocks Also fixed some WASM sp-std and missing exclusion of default features

refactor domain specific host functions and ext factory and update fr…

4ee91fd

…aud proof verifier to use the extensions

vedhavyas force-pushed the xdm_mmr branch from ad4f880 to 4ee91fd Compare February 14, 2024 07:07

vedhavyas requested a review from nazar-pc February 14, 2024 07:22

nazar-pc previously approved these changes Feb 22, 2024

NingLin-P previously approved these changes Feb 22, 2024

vedhavyas commented Feb 23, 2024

Merge branch 'main' into xdm_mmr

e4d9d28

vedhavyas dismissed stale reviews from NingLin-P and nazar-pc via e4d9d28 February 23, 2024 04:49

vedhavyas requested review from nazar-pc and NingLin-P February 23, 2024 04:49

nazar-pc reviewed Feb 23, 2024

create sc-domains to hold domain specific host functions and extensio…

5cdebfd

…n factory

vedhavyas force-pushed the xdm_mmr branch from 63b46d9 to 5cdebfd Compare February 23, 2024 12:49

vedhavyas requested a review from nazar-pc February 23, 2024 12:51

vedhavyas enabled auto-merge February 23, 2024 12:51

nazar-pc approved these changes Feb 23, 2024

vedhavyas added this pull request to the merge queue Feb 23, 2024

Merged via the queue into main with commit ab65eba Feb 23, 2024
11 checks passed

vedhavyas deleted the xdm_mmr branch February 23, 2024 14:03

vedhavyas mentioned this pull request Feb 29, 2024

Introduce new versioned execution_proof_check #2572

Merged

1 task

vanhauser-thc reviewed Apr 5, 2024

vanhauser-thc added audited (This change was audited) and removed need to audit (This change needs to be audited) labels Apr 5, 2024
