Skip to content

Let codegen_transmute_operand just handle everything #143860

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 27, 2025

Conversation

scottmcm
Copy link
Member

@scottmcm scottmcm commented Jul 12, 2025

When combined with #143720, this means rvalue_creates_operand can just return true for every Rvalue. (A future PR could consider removing it, though just letting it optimize out is fine for now.)

It's nicer anyway, IMHO, because it avoids needing the layout checks to be consistent in the two places, and thus is an overall reduction in code. Plus it's a more helpful building block when used in other places this way.

(TBH, it probably would have been better to have it this way the whole time, but I clearly didn't understand rvalue_creates_operand when I originally wrote #109843.)

@rustbot
Copy link
Collaborator

rustbot commented Jul 12, 2025

r? @petrochenkov

rustbot has assigned @petrochenkov.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 12, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jul 12, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

@scottmcm
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 13, 2025
bors added a commit that referenced this pull request Jul 13, 2025
Let `codegen_transmute_operand` just handle everything

When combined with #143720, this means `rvalue_creates_operand` can just return `true` for *every* `Rvalue`.  (A future PR could consider removing it, though just letting it optimize out is fine for now.)

It's nicer anyway, IMHO, because it avoids needing the layout checks to be consistent in the two places, and thus is an overall reduction in code.  Plus it's a more helpful building block when used in other places this way.

(TBH, it probably would have been better to have it this way the whole time, but I clearly didn't understand `rvalue_creates_operand` when I originally wrote #109843.)
@bors
Copy link
Collaborator

bors commented Jul 13, 2025

⌛ Trying commit af1b680 with merge 1790c15...

@bors
Copy link
Collaborator

bors commented Jul 13, 2025

☀️ Try build successful - checks-actions
Build commit: 1790c15 (1790c15cb9074d67a9a5edd4ea709796b02df1ff)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (1790c15): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.4% [0.4%, 0.5%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results (secondary -4.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-4.0% [-4.0%, -4.0%] 1
All ❌✅ (primary) - - 0

Cycles

Results (primary -2.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.1% [-2.1%, -2.1%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -2.1% [-2.1%, -2.1%] 1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 464.174s -> 465.333s (0.25%)
Artifact size: 374.70 MiB -> 374.75 MiB (0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 13, 2025
@petrochenkov
Copy link
Contributor

r? mir

@rustbot rustbot assigned saethlin and unassigned petrochenkov Jul 14, 2025
@scottmcm scottmcm force-pushed the transmute-always-rvalue branch from 5035094 to 70df8f6 Compare July 18, 2025 08:34
@rust-log-analyzer

This comment has been minimized.

@scottmcm scottmcm force-pushed the transmute-always-rvalue branch from 70df8f6 to 704af5e Compare July 18, 2025 16:29
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jul 18, 2025
…lcnr

Allow `Rvalue::Repeat` to return true in `rvalue_creates_operand` too

The conversation in rust-lang#143502 (comment) made be realize how easy this is to handle, since the only possibilty is ZSTs -- everything else ends up with the destination being `LocalKind::Memory` and thus doesn't call `codegen_rvalue_operand` at all.

This gets us perilously close to a world where `rvalue_creates_operand` only ever returns true.  (See rust-lang#143860 for more.)
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jul 20, 2025
…lcnr

Allow `Rvalue::Repeat` to return true in `rvalue_creates_operand` too

The conversation in rust-lang#143502 (comment) made be realize how easy this is to handle, since the only possibilty is ZSTs -- everything else ends up with the destination being `LocalKind::Memory` and thus doesn't call `codegen_rvalue_operand` at all.

This gets us perilously close to a world where `rvalue_creates_operand` only ever returns true.  (See rust-lang#143860 for more.)
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jul 20, 2025
…lcnr

Allow `Rvalue::Repeat` to return true in `rvalue_creates_operand` too

The conversation in rust-lang#143502 (comment) made be realize how easy this is to handle, since the only possibilty is ZSTs -- everything else ends up with the destination being `LocalKind::Memory` and thus doesn't call `codegen_rvalue_operand` at all.

This gets us perilously close to a world where `rvalue_creates_operand` only ever returns true.  (See rust-lang#143860 for more.)
rust-timer added a commit that referenced this pull request Jul 20, 2025
Rollup merge of #143720 - scottmcm:rvalue-always-operand, r=lcnr

Allow `Rvalue::Repeat` to return true in `rvalue_creates_operand` too

The conversation in #143502 (comment) made be realize how easy this is to handle, since the only possibilty is ZSTs -- everything else ends up with the destination being `LocalKind::Memory` and thus doesn't call `codegen_rvalue_operand` at all.

This gets us perilously close to a world where `rvalue_creates_operand` only ever returns true.  (See #143860 for more.)
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Jul 21, 2025
Allow `Rvalue::Repeat` to return true in `rvalue_creates_operand` too

The conversation in rust-lang/rust#143502 (comment) made be realize how easy this is to handle, since the only possibilty is ZSTs -- everything else ends up with the destination being `LocalKind::Memory` and thus doesn't call `codegen_rvalue_operand` at all.

This gets us perilously close to a world where `rvalue_creates_operand` only ever returns true.  (See rust-lang/rust#143860 for more.)
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Jul 21, 2025
Allow `Rvalue::Repeat` to return true in `rvalue_creates_operand` too

The conversation in rust-lang/rust#143502 (comment) made be realize how easy this is to handle, since the only possibilty is ZSTs -- everything else ends up with the destination being `LocalKind::Memory` and thus doesn't call `codegen_rvalue_operand` at all.

This gets us perilously close to a world where `rvalue_creates_operand` only ever returns true.  (See rust-lang/rust#143860 for more.)
@scottmcm
Copy link
Member Author

Re-running CI now that #143720 landed.

@scottmcm scottmcm closed this Jul 21, 2025
@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 21, 2025
// from fat pointers to vectors of u16. (See #143194 #110021 ...)
(abi::BackendRepr::SimdVector { .. }, _) | (_, abi::BackendRepr::SimdVector { .. }) => false,
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

annot: if you expand down from here, you'll see that everything else in this function is just true.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to keep this function then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the bad reason is that this PR was opened while it couldn't be removed, because there were too parallel PRs neither of which could remove it.

I suppose I could now. I wasn't sure if we were willing to commit to saying that we'll definitely have all rvalues work like this in the future, and thus if we wanted to remove this fully or not.

*goes to look at some stuff*

Well, looks like it only has one caller that's not an assert!, so I guess it wouldn't be too bad to put back again later if it really became needed for some reason? @WaffleLapkin since you have yourself setup for pings on this, do you have any feelings about it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I had to rebase anyway for the codegen test move, so I added a commit that just removes the function entirely. It's pretty small, so I guess my inclination now is "sure, why not".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the function makes sense to me ✅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the (surprising, TBH) perf improvement from it is a pretty good argument 🙂

Thanks for reviewing!

// CHECK: store i64 %x, ptr %_0, align 4
// CHECK: %[[TEMP:.+]] = alloca [8 x i8], align 8
// CHECK: call void @llvm.lifetime.start.p0(i64 8, ptr %[[TEMP]])
// CHECK: store i64 %x, ptr %[[TEMP]], align 8
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

annot: the sparse previous checks here make it look like this changed more than it really did. On nightly today https://rust.godbolt.org/z/nrrnKo146 it does still have the store, two loads, and two insert values.

The material difference is that now the alloca is under the control of the transmute, not the LocalKind::Memory, so the store i64 gets align 8 instead of align 4. (And the lifetime annotations on said local exist.)

@bors
Copy link
Collaborator

bors commented Jul 22, 2025

☔ The latest upstream changes (presumably #144249) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't this test use std::simd or the core::arch::x86_64 types?

Copy link
Member Author

@scottmcm scottmcm Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sure, I guess I can use simd::Simd for this.

(I don't want to use arch-specific ones so that both our normal contributor targets can run it.)

EDIT: Done.

scottmcm added 3 commits July 23, 2025 08:25
When combined with 143720, this means `rvalue_creates_operand` can just return `true` for *every* `Rvalue`.  (A future PR could consider removing it, though just letting it optimize out is fine for now.)

It's nicer anyway, IMHO, because it avoids needing the layout checks to be consistent in the two places, and thus is an overall reduction in code.  Plus it's a more helpful building block when used in other places this way.
Split to a separate commit to it could be reverted later if necessary, should we get new `Rvalue`s where we can't handle it this way.
@scottmcm scottmcm force-pushed the transmute-always-rvalue branch from 24a73dc to b7e025c Compare July 23, 2025 15:40
@scottmcm
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

rust-bors bot added a commit that referenced this pull request Jul 23, 2025
Let `codegen_transmute_operand` just handle everything
@rust-bors
Copy link

rust-bors bot commented Jul 23, 2025

⌛ Trying commit b7e025c with merge 1ee335b

To cancel the try build, run the command @bors try cancel.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 23, 2025
@rust-bors
Copy link

rust-bors bot commented Jul 23, 2025

☀️ Try build successful (CI)
Build commit: 1ee335b (1ee335ba59c92cc05eb39b94c4e542718519b762, parent: 4ff3fa01cbdd468851b1b859541ee1c648cde7de)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (1ee335b): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.4% [-0.7%, -0.1%] 14
Improvements ✅
(secondary)
-0.3% [-0.6%, -0.1%] 19
All ❌✅ (primary) -0.4% [-0.7%, -0.1%] 14

Max RSS (memory usage)

Results (primary 4.1%, secondary 2.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
4.1% [4.1%, 4.1%] 1
Regressions ❌
(secondary)
2.6% [2.6%, 2.6%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 4.1% [4.1%, 4.1%] 1

Cycles

Results (secondary -2.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.7% [-2.8%, -2.5%] 2
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 468.179s -> 468.256s (0.02%)
Artifact size: 374.69 MiB -> 374.67 MiB (-0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 23, 2025
// from fat pointers to vectors of u16. (See #143194 #110021 ...)
(abi::BackendRepr::SimdVector { .. }, _) | (_, abi::BackendRepr::SimdVector { .. }) => false,
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the function makes sense to me ✅

@WaffleLapkin
Copy link
Member

@bors r+ rollup=never

@bors
Copy link
Collaborator

bors commented Jul 25, 2025

📌 Commit b7e025c has been approved by WaffleLapkin

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 25, 2025
@bors
Copy link
Collaborator

bors commented Jul 26, 2025

⌛ Testing commit b7e025c with merge 283a074...

@bors
Copy link
Collaborator

bors commented Jul 27, 2025

☀️ Test successful - checks-actions
Approved by: WaffleLapkin
Pushing 283a074 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 27, 2025
@bors bors merged commit 283a074 into rust-lang:master Jul 27, 2025
13 checks passed
@rustbot rustbot added this to the 1.90.0 milestone Jul 27, 2025
Copy link
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing ce5fdd7 (parent) -> 283a074 (this PR)

Test differences

Show 8 test diffs

Stage 1

  • [codegen] tests/codegen-llvm/intrinsics/transmute-simd.rs#aarch64: [missing] -> ignore (only executed when the architecture is aarch64) (J0)
  • [codegen] tests/codegen-llvm/intrinsics/transmute-simd.rs#x86_64: [missing] -> pass (J0)

Stage 2

  • [codegen] tests/codegen-llvm/intrinsics/transmute-simd.rs#x86_64: [missing] -> pass (J1)
  • [codegen] tests/codegen-llvm/intrinsics/transmute-simd.rs#aarch64: [missing] -> ignore (only executed when the architecture is aarch64) (J2)
  • [codegen] tests/codegen-llvm/intrinsics/transmute-simd.rs#aarch64: [missing] -> pass (J3)
  • [codegen] tests/codegen-llvm/intrinsics/transmute-simd.rs#x86_64: [missing] -> ignore (only executed when the architecture is x86_64) (J4)

Additionally, 2 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 283a0746a244a88503fed61844f44df925ccdbb6 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-x86_64-apple: 7364.4s -> 10469.6s (42.2%)
  2. dist-aarch64-linux: 8075.6s -> 5916.6s (-26.7%)
  3. dist-aarch64-apple: 5704.7s -> 6644.1s (16.5%)
  4. pr-check-1: 1723.0s -> 1516.5s (-12.0%)
  5. pr-check-2: 2445.1s -> 2166.5s (-11.4%)
  6. aarch64-gnu-debug: 4147.5s -> 3743.8s (-9.7%)
  7. aarch64-gnu-llvm-19-2: 2484.9s -> 2251.0s (-9.4%)
  8. dist-apple-various: 5803.0s -> 6316.5s (8.8%)
  9. x86_64-gnu-llvm-19: 2737.3s -> 2506.1s (-8.4%)
  10. x86_64-rust-for-linux: 2837.7s -> 2627.5s (-7.4%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (283a074): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 4.5%, secondary 1.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
4.5% [3.2%, 5.8%] 2
Regressions ❌
(secondary)
4.0% [4.0%, 4.0%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.1% [-1.1%, -1.1%] 1
All ❌✅ (primary) 4.5% [3.2%, 5.8%] 2

Cycles

Results (secondary -8.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-8.2% [-8.2%, -8.2%] 1
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 466.456s -> 467.05s (0.13%)
Artifact size: 376.58 MiB -> 376.65 MiB (0.02%)

@scottmcm scottmcm deleted the transmute-always-rvalue branch July 27, 2025 05:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants