Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always inline functions signatures containing f16 or f128 #133050

Merged
merged 2 commits into from
Nov 15, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions compiler/rustc_mir_transform/src/cross_crate_inline.rs
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,16 @@ fn cross_crate_inlinable(tcx: TyCtxt<'_>, def_id: LocalDefId) -> bool {
_ => {}
}

let sig = tcx.fn_sig(def_id).instantiate_identity();
for ty in sig.inputs().skip_binder().iter().chain(std::iter::once(&sig.output().skip_binder()))
{
// FIXME(f16_f128): in order to avoid crashes building `core`, always inline to skip
// codegen if the function is not used.
Comment on lines +53 to +57
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you move this down after let mir, you can just iterate over mir.args_iter() instead of having to call fn_sig

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would require moving let mir above the checks against opts.incremental, OptLevel::No, and threashold to avoid hitting a return false before we check for the types. Would this wind up with a perf impact since we don't generate mir here in these cases?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I thought having this check so far down would be fine. But maybe not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a (temporary, right?) hack, I'd rather the diff be small and localized than pretty. What you've written here is good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(temporary, right?)

Yes please, I don't think we will be able to stabilize these types until LLVM at least doesn't crash on everything T2+.

if ty == &tcx.types.f16 || ty == &tcx.types.f128 {
return true;
}
}

// Don't do any inference when incremental compilation is enabled; the additional inlining that
// inference permits also creates more work for small edits.
if tcx.sess.opts.incremental.is_some() {
Expand Down
5 changes: 2 additions & 3 deletions library/core/src/num/f128.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1258,9 +1258,8 @@ impl f128 {
min <= max,
"min > max, or either was NaN",
"min > max, or either was NaN. min = {min:?}, max = {max:?}",
// FIXME(f16_f128): Passed by-ref to avoid codegen crashes
min: &f128 = &min,
max: &f128 = &max,
min: f128,
max: f128,
);

if self < min {
Expand Down
5 changes: 2 additions & 3 deletions library/core/src/num/f16.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1235,9 +1235,8 @@ impl f16 {
min <= max,
"min > max, or either was NaN",
"min > max, or either was NaN. min = {min:?}, max = {max:?}",
// FIXME(f16_f128): Passed by-ref to avoid codegen crashes
min: &f16 = &min,
max: &f16 = &max,
min: f16,
max: f16,
);

if self < min {
Expand Down
29 changes: 29 additions & 0 deletions tests/codegen/float/f16-f128-inline.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
//@ revisions: default nopt
//@[nopt] compile-flags: -Copt-level=0 -Zcross-crate-inline-threshold=never -Zmir-opt-level=0 -Cno-prepopulate-passes

// Ensure that functions using `f16` and `f128` are always inlined to avoid crashes
// when the backend does not support these types.

#![crate_type = "lib"]
#![feature(f128)]
#![feature(f16)]

pub fn f16_arg(_a: f16) {
// CHECK-NOT: f16_arg
todo!()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the automatic inlining heuristics wouldn't anyway already mark these functions as inlineable? Might be worth setting cross_crate_inline_threshold to 0 to be sure that doesn't mask the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a revision with -Zcross-crate-inline-threshold=never -Zmir-opt-level=0 -Cno-prepopulate-passes, is that reasonable?

Copy link
Member

@saethlin saethlin Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also add -Copt-level=0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah of course. Done, tests still pass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-Cno-prepopulate-passes just disables LLVM optimizations, rustc changes its behavior in a few places based on the value of opt-level.

}

pub fn f16_ret() -> f16 {
// CHECK-NOT: f16_ret
todo!()
}

pub fn f128_arg(_a: f128) {
// CHECK-NOT: f128_arg
todo!()
}

pub fn f128_ret() -> f128 {
// CHECK-NOT: f128_ret
todo!()
}
Loading