Skip to content

Conversation

@xiaotujinbnb
Copy link

DO NOT FILE A PULL REQUEST

This repository does not accept pull requests. Please follow http://llvm.org/docs/Contributing.html#how-to-submit-a-patch for contribution to LLVM.

DO NOT FILE A PULL REQUEST

dianqk and others added 30 commits March 11, 2025 14:02
Fixes #106846.

This is what I learned from GCC. I found that GCC does not duplicate the
BB that has indirect jumps with the jump table. I believe GCC has
provided a clear explanation here:

> Duplicate the blocks containing computed gotos. This basically
unfactors computed gotos that were factored early on in the compilation
process to speed up edge based data flow. We used to not unfactor them
again, which can seriously pessimize code with many computed jumps in
the source code, such as interpreters.

(cherry picked from commit dd21aac)
Fixes a crash with extract_subvectors in Hexagon backend seen when the
source vector is a vector-pair and result vector is not hvx vector size.

LLVM Issue: llvm/llvm-project#128775
Fixes #128775
---------

Co-authored-by: aankit-quic <[email protected]>
(cherry picked from commit 29d3fc3)
…`isGuaranteedNotToBeUndefOrPoison` (#130111)

Fixes (keep it open) #130110.

If the incoming value is PHI itself, we can skip this. If we can
guarantee that the other incoming values are neither undef nor poison,
then we can also guarantee that the value isn't either. If we cannot
guarantee that, it makes no sense in calculating it.

(cherry picked from commit 462eb7e)
…and 'w' (#129864)

- Allow 'u' and 'w' on LASX, LSX or floating point register operands.
- Also add missing description in LangRef.

Fixes #129863.

(cherry picked from commit bae6644)
The default triple of Amazon Linux on AArch64 is aarch64-amazon-linux,
see issue highlighded by PR #109263, somewhat serious linker issues are
encountered if any other triple is being used.

Unfortunately, this makes XFAIL lines like
`XFAIL: target=aarch64{{.*}}-linux-gnu` ineffective, making it
impossible to complete all of the check-cxx on Amazon Linux without
failing.

(cherry picked from commit 8f4ee42)
Summary:
These return values are actually signed, meaning that casting will
extend it and then all the bits will be one.

(cherry picked from commit 4ca8ea8)
…ny extract is waiting to be erased (#129375)

If any extract is waiting to be erased, then bail out as this will distort the cost calculation and possibly lead to infinite loops.

Fixes #129373

(cherry picked from commit 5ddf40f)
…l(). (#130693)

It has found to be quite a slowdown to traverse the users of a
function from each call site when it is called many (~70k)
times. This patch fixes this for now as long as this verification
is disabled by default, but there is still a need to eventually
cache the results to avoid recomputation.

Fixes #130541

(cherry picked from commit 378739f)
This effectively reverts #108535. The old AA code was looking for
the *first* clobber between the load and store and then trying to
move all the way up there. The new MSSA based code instead found
the *last* clobber. There might still be an earlier clobber that
has not been accounted for.

Fixes #130632.

(cherry picked from commit 5da9044)
…epth (#128704)

Backports b8d1f3d.

This fixes a potential integer overflow bug that has been around for
many versions and was exposed by my patch recently. So we think it
warrants a backport
… (#130517)

Bug introduced #120995. The LLD code calculates the "size" of the Mach-O
headers, and then uses that size to place the segments, but the `__TEXT`
section stays at `fileoff` zero. When I wrote the code into llvm-objcopy
I calculated the extra space into the initial offset, which moved all
the sections back 1 page.

Besides the modified test checking for the right `fileoff` values of the
sections and the segments, I also manually checked the generated
binaries after `llvm-objcopy` using `dyld_info`, as the bug report
suggested.

Fixes #130472

(cherry picked from commit 8413f4d)
…7) (#129686)

Fix's not rejecting global_load_lds_dwordx3 and x4 on other targets.
The encoded versions of instructions should not touch
SubtargetPredicate,
and only AssemblerPredicate.
    
(cherry picked from commit f319a65)
…f nodes have been seen (#129934)

This mitigates a regression introduced in #87824.

The mitigation here is to store pointers the deduplicated AST nodes, rather than copies of the nodes themselves. This allows a pointer-optimized set to be used and saves a lot of memory because `clang::DynTypedNode` is ~5 times larger than a pointer.

Fixes #129808.

(cherry picked from commit 8c7f0ea)
Alexei reported a bpf selftest failure with recent llvm for bpf prog
file progs/arena_spin_lock.c. The failure only happens when clang is
built with cmake option LLVM_ENABLE_ASSERTIONS=ON.

The error message looks like:
```
 clang: /home/yhs/work/yhs/llvm-project/llvm/lib/IR/Instructions.cpp:3460:
   llvm::BitCastInst::BitCastInst(Value *, Type *, const Twine &, InsertPosition):
   Assertion `castIsValid(getOpcode(), S, Ty) && "Illegal BitCast"' failed.
```
Further investigation shows that the problem is triggered in
  BPF/BPFAbstractMemberAccess.cpp
for code
```
  auto *BCInst =
      new BitCastInst(Base, PointerType::getUnqual(BB->getContext()));
```
For the above BitCastInst, Since 'Base' has non-zero AddrSapce, the
compiler expects the type also has the same AddrSpace. But the above
PointerType::getUnqual(...) does not have AddrSpace and hence causes the
assertion failure.

Providing the proper AddrSpace for the BitCast type fixed the issue.

Co-authored-by: Yonghong Song <[email protected]>
(cherry picked from commit 5686786)
This backports d911085 because it fixes a regression introduced in 19
and we don't want it to persist in 20
…ss. (#129087)

In some cases, it is possible for the same underlying object to be
accessed via pointers to different address spaces. This could lead to
pointers from different address spaces ending up in the same dependency
set, which isn't allowed (and triggers an assertion).

Update the mapping from underlying object -> last access to also include
the accessing address space.

Fixes llvm/llvm-project#124759.

PR: llvm/llvm-project#129087
…131438)

It seems that Apple Clang 17 starts to be used for CI, while it hasn't
supported `__builtin_is_virtual_base_of` yet. And thus we need to skip
the test for `is_virtual_base_of`.

Follows up #131302.
When inferring host device attr of virtual dtor of explicit
template class instantiation, clang should be conservative.
This guarantees dtors that may call host functions not to
have implicit device attr, therefore will not be emitted
on device side.

Backports: 0f0665d d37a392

Fixes: #108548
…mber access (#131450)

For an expression `nodiscard_function().static_member(), the nodiscard
warnings added by #120223, are not useful or actionable, and are
disruptive to some library implementations; we just remove them.

Fixes #131410

(cherry picked from commit 9a1e390)
The pull request [1] changed bpf default cpu from -mcpu=v1 to
-mcpu=v3 in clang20. Recently in [1], Yuval Deutscher suggested
to add an entry to clang20 ReleaseNotes so users can easily find
the change from documentation.

  [1] llvm/llvm-project#107008

Co-authored-by: Yonghong Song <[email protected]>
…rToZero` (#131522)

llvm/llvm-project#94525 assumes that the loop
will be infinite when the stride is zero. However, it doesn't hold when
the start value of addrec is also zero.

Closes llvm/llvm-project#131465.

(cherry picked from commit c5a491e)
This is to fix compile error with explicit Clang modules like
```
../../third_party/libc++/src/include/__vector/vector_bool.h:85:11: error: default argument of '__bit_iterator' must be imported from module 'std.bit_reference_fwd' before it is required
   85 |   typedef __bit_iterator<vector, false> pointer;
      |           ^
../../third_party/libc++/src/include/__fwd/bit_reference.h:23:68: note: default argument declared here is not reachable
   23 | template <class _Cp, bool _IsConst, typename _Cp::__storage_type = 0>
      |                                                                    ^
```

(cherry picked from commit 672e385)
…824)

Address missing changes:
- V[,U]COMXSD need to have XD (F3.0F –> F2.0F)
- V[,U]COMXS[S,H] need to have XS (F2.[0F,MAP5] -> F3.[0F,MAP5])
- VGETEXPBF16 needs to have T_MAP6 and NP (66.MAP5 -> NP.MAP6)

 Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965

(cherry picked from commit 975c208)
"libmsvcrt-os" was added to the list of excluded libs in binutils in
9d9c67b06c1bf4c4550e3de0eb575c2bfbe96df9 in 2017.

"libucrt" was added in c4a8df19ba0a82aa8dea88d9f72ed9e63cb1fa84 in 2022.

"libucrtapp" isn't in the binutils exclusion list yet, but a patch for
adding it has been submitted. Since
0d403d5dd13ce22c07418058f3b640708992890c in mingw-w64 (in 2020), there's
such a third variant of the UCRT import library available.

Since 18df3e8323dcf9fdfec56b5f12c04a9c723a0931 in 2025, "libpthread" and
"libwinpthread" are also excluded.

(cherry picked from commit af93db9)
Without this we can try to generate invalid instructions or create
illegal types. This patch generates a SVE fcopysign instead and use its
lowering. BF16 is left out of the moment as it doesn't lower
successfully (but could use the same code as fp16).

(cherry picked from commit d4ab3df)
komimoe added 30 commits July 21, 2025 05:15
修复当entry节点终端无条件br目标是phi块时, phi指令降级而块不拆分却替换br指令导致跳转错误的问题, 为什么这么大一个bug这么久没人发现?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.