-
Notifications
You must be signed in to change notification settings - Fork 21
Install a curated registry of crates with libraries #67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #67 +/- ##
==========================================
- Coverage 72.46% 72.41% -0.05%
==========================================
Files 9 9
Lines 345 377 +32
Branches 60 68 +8
==========================================
+ Hits 250 273 +23
- Misses 61 64 +3
- Partials 34 40 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
maspe36
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change doesn't yet add the code necessary to instruct cargo to use our curated registry.
I am extremely interested in this part. Do you have an example of we expect users to have in their .cargo/config.toml or what they should have in their Cargo.toml to reference these deps?
As I understand it, you cannot mix local registry dependencies, with crates.io dependencies, without either
A. having an index for the local registry
B. Mirroring all of the crates used from crates.io locally and assuming all dependencies come from this local registry
Really, this is the biggest "opportunity" for I'll call out a few "features" of how colcon's install space is shaped:
In reference to Cargo behaviors, I draw these conclusions:
There appears to be no official Cargo mechanism for declaring that a package may be acquired from more than one source, so we can't say "use the current workspace, then fall back to my parent workspace(s), then fall back to crates.io". Something will need to make the resolution decision before Cargo is run, and instruct Cargo which registry it should use for the dependency. Developing that pre-resolution mechanism is where my attention lies, and the Cargo instrument through which it communicates to the build invocation would be to use selective patching. I'd love to have a more in-depth discussion about developing that mechanism further, but to plant the seed, I'm currently hung up on the fact that a single cargo package build can (and some do today) utilize more than one version of a particular package in the dependency tree, and replacing each occurrence of that package with one sourced from a workspace (current or parent) my not satisfy the build. This is in conflict with a colcon user's expectation that if an upstream package is in their workspace(s), it will be used in the build of a downstream package. |
14da959 to
66f76c7
Compare
|
This ended up taking longer to write than I expected, so I did my best to structure it clearly. Hopefully it’s not too much of a wall of text! 😁 Cargo Directory SourcesDetails
TThat’s true — though there’s an important caveat worth calling out, there is no way to mix curated crates and those from crates-io. Take a look at the Rust page for Debian. They even call this out
That is to say, your [source]
[source.debian-packages]
directory = "/usr/share/cargo/registry"
[source.crates-io]
replace-with = "debian-packages"Do note, that they are replacing All cargo sources require an index (associated with a registry) of some form. Debian could have a [source.debian-packages]
directory = "/usr/share/cargo/registry"But then you'd have no way from a [dependencies]
# This does _NOT_ work
some-crate = { version = "1.0", source = "debian-packages" }error: no matching package named `some-crate` found
location searched: crates.io indexInterestingly, you can have crates in these directory sources that are not actually uploaded on crates.io. # Assume a `.cargo/config.toml` similar to what Debian's Rust page lays out
[dependencies]
# This is on crates.io and vendored locally
rand_core = "0.9.3"
# This is _not_ on crates.io, but is vendored locally, and does work
my_autogen_msg_crate = "*"Directory sources are what I'm referring to in B from my initial comment
Unfortunately, that means we can’t use crates directly from crates-io with directory sources as our approach, we have to mirror them locally. Local Registry SourcesDetails
This option does allow you to mix crates-io crates with local crates. However, it comes with even more overhead than just creating the But, assuming we actually do all of that leg work, we could have user [dependencies]
rand_core = "0.9.3"
my_autogen_msg_crate = { version="*", registry="A" }
my_crate_from_overlay = { version="*", registry="B" }Naming these registries would also be a huge challenge... This approach is what I was referring to in A in my original comment
Cargo Feature GapIn my opinion, Cargo is indeed missing a feature here. I have asked on the rust internals forum and didn't get much traction. The cargo team claims that
But from my perspective, they already have mutable registries built-in (look no further than patches), just not in an ergonomic way. I have been on and off drafting an RFC to close this gap in Cargo. On the Topic of Patching
We are already using patching today. We don't need either directory sources or local registry sources if patching is the means to the end. The downside to patching is that you need to patch in each workspace, and you have no way to know if you'll use a package from a workspace you've sourced. That generates a flood of warnings which you cannot disable (as of this moment). I'm sure there is some way to make our patching logic smarter such that this is less of a pain, but then that's code we need to maintain, and behavior we need to explain to colcon-cargo users until the end of time. Speaking from experience with rclrs's current message generation pipeline, this is a huge pain to teach and debug issues. Single Source of Truth for a Crate
If I'm understanding you correctly here, this appears to be a fundamental difference in philosophy between colcon and cargo. Because Rust leans so heavily into full source builds and static linking, its completely fine to use multiple versions of a crate in a single build [1]. I suspect this is extremely common and even if a crate author is aware and trying not to, their dependencies probably exhibit this behavior. I'm not sure we'll ever be able to have complete alignment on this. ConclusionI don't think either directory sources or local registry sources are robust solutions to the problems we're facing, without adding significant maintenance cost. I think effort is best spent trying to improve Cargo and in the meantime, leveraging both Ultimately, I think the less development effort required of colcon itself, the better — it keeps us aligned with Cargo’s ecosystem and lets existing tools continue to “just work.” Slightly tangential question — wouldn’t having colcon manage either a directory source or a local registry source potentially go against some of the items that are explicitly out of scope for colcon itself? From the design doc (link)
It seems like trying to vendor or curate crates from
|
A very quick drive-by comment, I'm afraid you might be reading too much into this. Patching a source, and rewriting stuff in that source (e.g. git history rewrites, mutating files) will lead to weirdness if you don't religiously clean your build caches. Yes, you can use patches to sneakily build with a not-yet-upstreamed bugfix included (but buyer beware). |
This largely mirrors the approach taken by major Linux distributions like Ubuntu and Fedora. We can curate the crates in a way that should allow us to use them in downstream package builds within this colcon workspace or downstream workspaces. This change doesn't yet add the code necessary to instruct cargo to use our curated registry.
66f76c7 to
cffbe77
Compare
|
While testing this I noticed that the general approach we have been taking, both here and in It's allowed to specify a dependency on the root level And it the workspace members can just refer to it: What Cargo will then do when building the crate is look for the workspace Cargo.toml and use it to define the dependency. However, if we only copy the crate's Cargo.toml the workspace Cargo.toml won't be found, hence we will get errors such as: This is not a super simple fix, I suppose a way to do it could be to introduce one additional intermediate folder with a custom rust workspace |
|
@maspe36 Thank you for your detailed answer in #67 (comment), it really helped me understanding the big picture. For Cargo directory sources, I think there's a way to mix directory sources with crates.io, PTAL to the CI in: Blast545/test_cargo_pulling_deps#1 I had to use a "bundler_tool" to trick cargo into exporting the local files of the local dependency in a format that can be consumed by the main application (this part probably has to be combined somehow with pallet patcher). But otherwise, after that's process is completed there's only a small addition needed to the main user .cargo/config.toml file: |
|
Hey @Blast545, thanks for spending time on this and bringing some fresh air to what feels like a stale problem :) I believe the big discovery in your approach is that the registry index can essentially be completely empty, which is indeed interesting. Here are my thoughts after spending a few hours familiarizing myself with this approach. We do not need the
We could potentially skirt around this by manually specifying this env var while running our cargo commands (or having this be set when the colcon workspace is sourced). But this will also require repopulating this cache with not just our generated interface packages, but any dependency, meaning this folder can be large. My local
This isn't as simple as running These obstacles can be overcome, but I question what this additional complexity buys us. Your example can be further simplified to not depend on And my [registries]
my-curated-ros-registry = { index = "https://empty" }
[source.my-curated-ros-registry]
registry = "https://empty"
replace-with = "local-mirror"
[source.local-mirror]
directory = "build/"
[dependencies]
msg_pkg = { registry = "my-curated-ros-registry", version = "0.1.0"}
// build/msg_pkg/src/lib.rs
pub fn add(left: u64, right: u64) -> u64 {
left + right
}And finally, // src/foo_node/src/main.rs
use msg_pkg;
fn main() {
println!("2 + 2 = {}", msg_pkg::add(2, 2));
}If we build and run, you'll see everything works as expected But, here is where the trouble starts. What if // build/msg_pkg/src/lib.rs
pub fn add(left: u64, right: u64) -> u64 {
// LGTM
left - right
}And rebuild No changes were picked up even though we are pointed directly at the source code of The critical piece here is this pesky In my example, this is the contents. {"files":{},"package":null}The hashes of the files are used by cargo to determine if a rebuild is necessary. Even if we go to the trouble of trying to manage these hashes ourself, cargo will actually just fail in this case as this is not a supported workflow for directory sources. So the only way for us to actually see our changes is by clearing out the build folder of I believe this is what @juleskers was referring to with this comment
Again, I come to the conclusion that without changes to Cargo itself, we should be looking into "pure" I recently put up a PR to re-export all of our generated rust crate symbols in PTAL and let me know what you think :) |
|
Considering that a lot of this is figuring out the non-public implementation behaviour of cargo, have you seen the recent writeup by the cargo team? There is some big work going in in revamping the caching, detection of staleness thereof, and increased value for incremental compilation. "inside rust" Writeup: |
Yeah, totally. I mainly used
I mean, yes, I agree it's an inconvenience, but how often do developers have to change code related to a lib dependency from their code? I would be expecting that users packages from the "curated registry" will be making changes to their projects and not directly to the dependencies. If they have to change their dependencies they can push the changes upstream or patch it locally. We can add documentation appropriate for those scenarios.
If we don't need to maintain an additional index for the packages, users of the curated registry can install their "core ROS 2 Rust" with vcs via vcs import --input https://raw.githubusercontent.com/ros2/ros2/ros2_rust_rolling/ros2.repos srcAnd the only burden of internal maintenance would be thinking what kind of requirements we want out of the curated registry and adding CI to it. We can request maintainers of the packages in this list to keep their checksum.json files per commit, so we don't use If we had to create a formal registry index, host it somewhere and maintain it, that would have thing more complicated on our end. |
|
I'm trying to assemble a more complete workflow to highlight how I see this could work in the future. Scott also told me in a meeting that he thinks this might not solve the problems related to |
I would argue that this is actually the main case where |
This largely mirrors the approach taken by major Linux distributions like Ubuntu and Fedora. We can curate the crates in a way that should allow us to use them in downstream package builds within this colcon workspace or downstream workspaces.
This change doesn't yet add the code necessary to instruct cargo to use our curated registry.
Example of an Ubuntu package:
Details
Example of a Fedora package:
Details