Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure search paths inside OUT_DIR precede external paths #15221

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

vlovich
Copy link

@vlovich vlovich commented Feb 22, 2025

If a library exists both in an added folder inside OUT_DIR and in the OS, prefer to use the one within OUT_DIR. Folders within OUT_DIR and folders outside OUT_DIR do not change their relative order between themselves.

This is accomplished by sorting by whether we think the path is inside the search path or outside.

What does this PR try to resolve?

Fixes #15220. If a Rust crates builds a dynamic library & that same dynamic library is installed in the host OS, the result of the build's success & consistent behavior of executed tools depends on whether or not the user has the conflicting dynamic library in the external search path. If they do, then the host OS library will always be used which is unexpected - updates to your Rust dependency will still have you linking & running against an old host OS library (i.e. someone who doesn't have that library has a different silent behavior).

How should we test and review this PR?

This is what I did to verify my issue got resolved but I'm sure there's a simpler example one could construct.

  • Make sure Alsa and libllama.so are installed (on Arch I installed alsa-lib and llama.cpp-cuda).
  • Clone llama-cpp-2 & init llama.cpp submodule & update the submodule to point to llama : expose llama_model_n_head_kv in the API ggml-org/llama.cpp#11997 instead.
  • Add plumbing to expose the new method within llama-cpp-2 as a public facing function on the LlamaModel struct (it's basically the same code as for n_head, just calling n_head_kv from llama.cpp).
  • Add cpal as a dependency in crate "foo"
  • Add llama-cpp-2 via path as a dependency in crate "foo" and enable the dynamic-link feature.
  • Add code using the newly expose n_head_kv method in crate "foo" in main.rs. NOTE: Code just needs to compile & be exported, doesn't have to be correct (fn main is probably easiest.
  • Add some basic code that tries to initialize cpal in crate "foo" in fn main.
  • Try to build / run crate "foo"

Before my change, it fails with a linker error saying it can't find llama_model_n_head_kv because /usr/lib appears in the search path before the directory that contains the libllama.so that was built internally by the crate. This is because cpal depends on alsa-sys which uses pkg-config which adds /usr/lib to the search path before the llama-cpp-sys-2 build.rs is run.

Additional information

I'm not sure how to add tests so open to some help on that. I wanted to make sure that this approach is even correct. I coded this to change Cargo minimally and defensively since I don't know the internals of Cargo very well (e.g. I don't know if I have to compare against both script_out_dir / script_out_dir_when_generated since I don't know the difference & there's not really any explanation on what they are).

It's possible this over-complicates the implementation so open to any feedback. Additionally, the sort that happens prior to each build up of the rustc environment is not where I'd ideally place it. I think it would be more efficient to have the list of search paths be free-floating and not tied to a BuildOutput so that they could be kept updated live & resorted only on insertion (since it's changed less frequently than rustc is invoked). Additionally, the generalized sort is correct but pessimistic - maintaining the list sorted could be done efficiently with some minor book keeping (i.e. you'd only need to sort the new paths & then could quickly inject into the middle of a VecDeque).

And of course in terms of correctness, I didn't do a thorough job testing across all possible platforms. From first principles this seems directionally correct but it's always possible this breaks someone else's workflow. I'm also uneasy that the relative position of -L / -l arguments changes in this PR & I'm not sure if that's observable behavior or not (i.e. it used to be -L for a crate followed by -l for a crate), but now it's -L for all crates, still grouped by crated internally, followed by -l by crate).

…the link order

If a library exists both in an added folder inside OUT_DIR and in the
OS, prefer to use the one within OUT_DIR. Folders within OUT_DIR and
folders outside OUT_DIR do not change their relative order between
themselves.
@rustbot
Copy link
Collaborator

rustbot commented Feb 22, 2025

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @ehuss (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added A-build-execution Area: anything dealing with executing the compiler A-build-scripts Area: build.rs scripts S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 22, 2025
@vlovich
Copy link
Author

vlovich commented Feb 22, 2025

Not sure what the CI failure for "check-version-bump" is but I'm assuming it's some minor book keeping that's validating a version number somewhere needs to be bumped because I made a consequential change? But the output when running locally doesn't make sense to me:

--- failure function_requires_different_generic_type_params: function now requires a different number of generic type parameters ---

Description:
A function now requires a different number of generic type parameters than it used to. Uses of this function that supplied the previous number of generic types will be broken.
        ref: https://doc.rust-lang.org/reference/items/generics.html
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.39.0/src/lints/function_requires_different_generic_type_params.ron

Failed in:
  function strip_prefix_canonical (1 -> 0 generic types) in /home/vlovich/projects/cargo/crates/cargo-util/src/paths.rs:713

I haven't touched strip_prefix_canonical - running cargo semver-checks check-release --workspace on master seems to indicate this is broken there too so I think this step might just be broken for everyone right now.

@weihanglo
Copy link
Member

That CI job failure is completely unrelated to this PR. See #15222.

@weihanglo
Copy link
Member

Just note that it is generally recommended an issue to be discussed and labeled as S-accepted and than open a PR, though you are welcome to open this as a draft PR and hack with it :)

https://doc.crates.io/contrib/process/working-on-cargo.html#before-hacking-on-cargo

@vlovich
Copy link
Author

vlovich commented Feb 22, 2025

Good note but I also wanted to see how difficult such a change would be to help guide any questions that might come up in the issue.

@epage epage marked this pull request as draft February 24, 2025 17:33
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it might be a bit fragile, it is totally possible to run Cargo with --verbose in test and assert the order of -L flags occurrences in rustc arguments.

See the doc and this example test:

p.cargo("build --verbose").with_stderr_data(str![[r#"
[LOCKING] 1 package to latest compatible version
[COMPILING] foo v0.5.0 ([ROOT]/foo/foo)
[RUNNING] `rustc --crate-name build_script_build --edition=2015 foo/build.rs [..]`
[RUNNING] `[ROOT]/foo/target/debug/build/foo-[HASH]/build-script-build`
[RUNNING] `rustc --crate-name foo --edition=2015 foo/src/lib.rs [..]-L dependency=[ROOT]/foo/target/debug/deps -L [ROOT]/foo/dummy-path1 -L [ROOT]/foo/dummy-path2 -l nonexistinglib`
[COMPILING] bar v0.5.0 ([ROOT]/foo)
[RUNNING] `rustc --crate-name bar --edition=2015 src/main.rs [..]-L dependency=[ROOT]/foo/target/debug/deps --extern foo=[ROOT]/foo/target/debug/deps/libfoo-[HASH].rlib -L [ROOT]/foo/dummy-path1 -L [ROOT]/foo/dummy-path2`
[FINISHED] `dev` profile [unoptimized + debuginfo] target(s) in [ELAPSED]s
"#]]).run();

Mind that in this repo we prefer writing the test first with a snapshot of the current behavior, and passes. Then we add a commit with the actual fix and the test update, so the test snapshot update can precisely show the behavior change.

Comment on lines +967 to +968
script_out_dir,
script_out_dir_when_generated,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we still need script_out_dir_when_generated here, isn't value already rewritten with script_out_dir at line 932?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was being overly defensive. It should just pass in script_out_dir.

@@ -74,11 +75,83 @@ pub enum Severity {

pub type LogMessage = (Severity, String);

#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub enum LibraryPath {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enum and variants may be worth some documentations about what they are and why we're doing this. And I think we can derive Ord here.

https://doc.rust-lang.org/stable/std/cmp/trait.Ord.html#derivable

When derived on enums, variants are ordered primarily by their discriminants. Secondarily, they are ordered by their fields. By default, the discriminant is smallest for variants at the top, and largest for variants at the bottom. Here’s an example:

@@ -649,6 +667,13 @@ fn add_plugin_deps(
Ok(())
}

fn get_dynamic_search_path(path: &PathBuf) -> &Path {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind moving this function extraction to its own commit? That would help review and trace code in the future.

(Git is pretty bad at displaying this kind of diff 😞).

@@ -649,6 +667,13 @@ fn add_plugin_deps(
Ok(())
}

fn get_dynamic_search_path(path: &PathBuf) -> &Path {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn get_dynamic_search_path(path: &PathBuf) -> &Path {
fn get_dynamic_search_path(path: &Path) -> &Path {

If nothing terribly needs it, we usually favor &RefType over &OwnedType for function arguments.

library_paths.extend(output.library_paths.iter());
}

library_paths.sort_by_key(|p| match p {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've implemented (or derived) Ord, so sort should be good enough. No need to

Suggested change
library_paths.sort_by_key(|p| match p {
library_paths.sort()

Copy link
Author

@vlovich vlovich Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not. This does a grouped sort, keeping the relative ordering within a group the same (i.e. it moves all Cargo artifacts to the front of the search path and external paths to the end, but within cargo artifacts & within external paths the relative order of specified paths doesn't change. If you just did .sort, then it would sort by path within the group & screw up the ordering that build.rs requested. This needs a comment & the test should explicitly verify that this condition holds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct. I somehow thought it was already handled as I saw the manual Ord impl.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-build-execution Area: anything dealing with executing the compiler A-build-scripts Area: build.rs scripts S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

link-search-path doesn't have a way to specify priority / doesn't auto deprioritize OS link paths
4 participants