-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add local registry overlays #13926
base: master
Are you sure you want to change the base?
Add local registry overlays #13926
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @ehuss (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
|
crates/cargo-test-support/src/lib.rs
Outdated
if let Some(ref mut p) = self.process_builder { | ||
let env_value = format!("{}={}", url, path); | ||
p.env( | ||
"__CARGO_TEST_PACKAGE_CONFUSION_VULNERABILITY_DO_NOT_USE_THIS", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The term of art for this kind of vulnerability is "Dependency Confusion". As with so many names in computer science, it's better to have consistency then trying to find a better name. This term was coined in the blog post https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610 which might be worth linking in comments about how users could easily miss use this.
Catching Eric up on some of the context, Arlo and I discussed the questions from #10948 (comment) at office hours on Thursday. Adding this API, exclusively for internal use, seemed like the least hacky solution. |
src/cargo/sources/overlay.rs
Outdated
/// expose this publicly. However, it is useful for some very specific private | ||
/// things, like locally verifying a bunch of packages at a time before any of | ||
/// them have been published. | ||
pub struct OverlayConfusionAttack<'gctx> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, I think OverlaySource
is a better fitting name 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That name definitely makes more sense, yes. But in office hours someone suggested giving it a scary-looking name to make people less likely to misuse it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like OverlaySource
being the suffix. Maybe DependencyConfusionThreatOverlaySource
?
Thanks for the review! I think this is ready for another look -- we managed to successfully use this for #10948 as discussed. |
tests/testsuite/registry_overlay.rs
Outdated
// An older version in the local registry doesn't confuse it. | ||
Package::new("baz", "0.0.1") | ||
.alternative(true) | ||
.local(true) | ||
.publish(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is really testing anything because baz
is too low of a version for the version requirement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The motivation here was to test that the overlay is correctly the union of the versions in the overlay and the overlaid registries. (If it were only returning the overlay versions, it would only return the too-old version and fail to resolve)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except this is relying on the user to look at the error and know the fine distinction of whether it says baz@0.0.1
exists
We also should probably have a test the other direction, that the overlay source properly shadows the regular source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I'm now testing both directions, and checking the appropriate versions in the stderr.
tests/testsuite/registry_overlay.rs
Outdated
Package::new("baz", "0.1.1").dep("baz", "=0.0.1").publish(); | ||
Package::new("baz", "0.0.1") | ||
.alternative(true) | ||
.local(true) | ||
.publish(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have the local / selected version depend on the registry version? Thats more like what we'll be dealing with for cargo package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, but also happy to discuss it further. My understanding was that the behavior of cargo package
w.r.t. intra-workspace deps should be exactly the same as if the deps were published before packaging. Resolution should have the option to select the versions that are being packaged simultaneously, but it's also allowed not to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that should be the behavior but we aren't testing that situation to confirm it. cargo package
will involve a local package depending on an overlayed package which may depend on a registry package.
I guess there are also situations where you depend on a registry package which will depend on the overlaid package (if clap_complete
used criterion
it would both depend on clap directly and through criterion
). So maybe its best to cover both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense, I just misunderstood your original comment. I'm now testing both directions.
I guess there are also situations where you depend on a registry package which will depend on the overlaid package
I think the registry package will never require an overlaid package, right? Because the registry package can satisfy all its dependencies from the registry. I added a test where the registry package will pick up an overlaid dependency because it's newer (but compatible with) the registry version of that dependency.
/// affect any newly created `PackageRegistry`s. | ||
/// | ||
/// See [`crate::sources::overlay::OverlayConfusionAttack`] for why you shouldn't use this. | ||
pub fn local_overlays(&self) -> RefMut<'_, HashMap<SourceId, PathBuf>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd hope we can avoid putting this on GlobalContext but its hard to say without reviewing the follow up at the same time. Should we defer this to the follow up PR?
My motivation is that it gets messy adding and removing from the Global Context and if we can directly instantiate things, the easier it will be to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can give a little more context before we decide whether or not to defer. cargo package
needs the overlay to take effect during a call to ops::compile
. AFAICT there are three possibilities:
- put the overlay configuration in
GlobalContext
- put the overlay configuration in the ephemeral workspace that
cargo package
creates - have
cargo package
do something lower-level thanops::compile
The workspace option has the advantage that the overlay configuration doesn't need to be removed (because the workspace is discarded). It seems like it would just require a little work in resolve_ws
and resolve_ws_with_opts
to get the overlay configuration out of the workspace and insert it into the PackageRegistry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The workspace seems the most beneficial. Whichever way we go, still seems like it could be split out into a follow up PR
src/cargo/sources/config.rs
Outdated
let src = id.load(self.gctx, yanked_whitelist)?; | ||
return self.overlay_source(id, src, yanked_whitelist); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only call overlay_source
on the result of a load
Should we name it fn load
and perform the load and apply the overlay? I worry that the current structure could make it easy to overlook applying the overlay while this pattern would make it the default to go through Self::load
and we'd be less likely to introduce bugs to this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately we're already in a fn load
. I renamed overlay_source
to load_and_overlay
, and modified it to do the load and the overlay. I think this is an improvement but not sure it it really addresses the concern.
for (key, path) in gctx.local_overlays().iter() { | ||
let overlay_id = SourceId::for_local_registry(path)?; | ||
base.overlays.insert(*key, overlay_id); | ||
} | ||
|
||
if let Ok(overlay) = | ||
gctx.get_env("__CARGO_TEST_DEPENDENCY_CONFUSION_VULNERABILITY_DO_NOT_USE_THIS") | ||
{ | ||
let (url, path) = overlay.split_once('=').ok_or(anyhow::anyhow!( | ||
"invalid overlay format. I won't tell you why; you shouldn't be using it anyway" | ||
))?; | ||
let overlay_id = SourceId::for_local_registry(path.as_ref())?; | ||
base.overlays.insert(SourceId::from_url(url)?, overlay_id); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move this hack up into the caller? PackageRegistry
is the immediate caller but maybe a layer above that? That would help with the GlobalContext
. The main downside is it would make it more annoying to support __CARGO_TEST_DEPENDENCY_CONFUSION_VULNERABILITY_DO_NOT_USE_THIS
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we moved the overlay configuration into Workspace
as discussed below then this could be pushed up into resolve_ws
and resolve_ws_with_opts
tests/testsuite/registry_overlay.rs
Outdated
) | ||
} | ||
|
||
fn simple_project() -> Project { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's avoid using this. This reduces "duplication" but that is false code duplication. They have the appearance of being the same but they do not share requirements. Instead it affects the code locality (the ability to read tightly coupled code). I'm trying to double check what these tests are doing and jumping back and forth is making this more difficult.
This PR adds (private to cargo internals) support for local registry overlays, in which you can locally pretend to add packages to remote registries; the local packages will have the same source ids as the remote registry that you're overlaying.
There are two ways to set up these overlays: programmatically using
GlobalContext::local_overlays
and through the__CARGO_TEST_PACKAGE_CONFUSION_VULNERABILITY_DO_NOT_USE_THIS
environment variable. You can't set up these overlays with.cargo/config
.The motivation for this is packaging workspaces. When we're packing a workspace, we'd like to be able to pretend (for lockfile generation and verification) that some workspace packages are already published even though they aren't.