llvm 19 support #227


Draft · wants to merge 2 commits into main
Conversation

@brandonros commented Jun 8, 2025

potentially addresses all of:

@brandonros (Author)

i'm on the fence about renaming llvm to llvm7 and llvm19

i think it might not actually be needed and i might put it back

@brandonros (Author)

@LegNeato my measuring stick here is does vecadd example build with LLVM v19

could you tell me if I'm close or if I'm actually missing something huge like a mountain of work I'm not seeing?

@brandonros (Author)

 DEBUG: About to call LLVMRunPasses - THIS IS THE CRITICAL POINT
  DEBUG: Parameters:
  DEBUG:   llmod: 0x770f3d33a580
  DEBUG:   pipeline: "default<O0>"
  DEBUG:   tm: 0x770f27dafb00
  DEBUG:   pass_options: 0x770f27e00060
  DEBUG: LLVMRunPasses returned: 0
  DEBUG: LLVMRunPasses completed successfully
  DEBUG: Cleaning up pass builder options
  DEBUG: Pass builder options disposed
  DEBUG: optimize function completed successfully
  DEBUG: About to verify module before prepare_thin
  DEBUG: Module verification result: 0
  DEBUG: LLVMRustThinLTOBufferCreate called with is_thin=1, emit_summary=1
  DEBUG: Taking ThinLTO path
  DEBUG: About to run ThinLTO pass
  error: rustc interrupted by SIGSEGV, printing backtrace
(gdb) bt
#0  0x0000772679b89e10 in llvm::ValueEnumerator::EnumerateType(llvm::Type*) ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#1  0x0000772679b8d240 in llvm::ValueEnumerator::incorporateFunction(llvm::Function const&) ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#2  0x0000772679b615b2 in (anonymous namespace)::ModuleBitcodeWriter::write() ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#3  0x0000772679b5b341 in llvm::BitcodeWriter::writeModule(llvm::Module const&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#4  0x0000772679b66802 in llvm::WriteBitcodeToFile(llvm::Module const&, llvm::raw_ostream&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#5  0x000077267944549c in llvm::ThinLTOBitcodeWriterPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#6  0x00007726786b34a7 in llvm::detail::PassModel<llvm::Module, llvm::ThinLTOBitcodeWriterPass, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#7  0x000077267a2e0dc8 in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#8  0x000077267867e178 in LLVMRustThinLTOBufferCreate ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#9  0x000077267859d7be in rustc_codegen_nvvm::lto::ThinBuffer::new ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#10 0x000077267858512a in <rustc_codegen_nvvm::NvvmCodegenBackend as rustc_codegen_ssa::traits::write::WriteBackendMethods>::prepare_thin () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#11 0x00007726785ec605 in rustc_codegen_ssa::back::write::execute_optimize_work_item ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#12 0x00007726785e3cd5 in rustc_codegen_ssa::back::write::spawn_work::{{closure}} ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#13 0x0000772678404d97 in std::sys::backtrace::__rust_begin_short_backtrace ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#14 0x000077267847357f in std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}} ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#15 0x0000772678520144 in <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
    () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#16 0x0000772678466e24 in std::panicking::try::do_call ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#17 0x000077267847381b in __rust_try () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#18 0x00007726784730f1 in std::thread::Builder::spawn_unchecked_::{{closure}} ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#19 0x000077267848f987 in core::ops::function::FnOnce::call_once{{vtable.shim}} ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#20 0x000077268c55b8ab in std::sys::pal::unix::thread::Thread::new::thread_start ()
   from /root/.rustup/toolchains/nightly-2025-03-02-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-e3b06f91230294e6.so
#21 0x000077268668aaa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#22 0x0000772686717a34 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

pain

@LegNeato (Contributor)

I don't know, llvm is an area of the project I have not touched.

@LegNeato (Contributor) left a comment:

This should stay the same, right? The newer support should be optional / only enabled when

static PREBUILT_LLVM_URL: &str =
    "https://github.com/rust-gpu/rustc_codegen_nvvm-llvm/releases/download/LLVM-7.1.0/";

-static REQUIRED_MAJOR_LLVM_VERSION: u8 = 7;
+static REQUIRED_MAJOR_LLVM_VERSION: u8 = 19;
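One way to reconcile these two, keeping 7 as the default and making the newer requirement opt-in, could look like the sketch below; the `llvm19` cargo feature name is hypothetical and the real gating may end up different:

```rust
// Sketch only: `llvm19` is a made-up cargo feature name, not one that
// exists in the codegen crate. Default builds keep the existing LLVM 7
// prebuilt URL and version check.
#[allow(dead_code)]
#[cfg(not(feature = "llvm19"))]
static PREBUILT_LLVM_URL: &str =
    "https://github.com/rust-gpu/rustc_codegen_nvvm-llvm/releases/download/LLVM-7.1.0/";

#[cfg(not(feature = "llvm19"))]
static REQUIRED_MAJOR_LLVM_VERSION: u8 = 7;

// Opting in to the newer toolchain bumps the requirement.
#[cfg(feature = "llvm19")]
static REQUIRED_MAJOR_LLVM_VERSION: u8 = 19;

fn main() {
    // Without the feature enabled, the old default still holds.
    assert_eq!(REQUIRED_MAJOR_LLVM_VERSION, 7);
    println!("ok");
}
```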
Contributor:

The requirement doesn't bump unless targeting a higher arch? So the logic is:

  • 7 if targeting arch supported by 7 and 19
  • 19 if targeting arch not supported by 7
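That selection rule could be sketched like this; the compute-capability cutoff of 9.0 is an assumed placeholder for illustration, not a value confirmed anywhere in the codegen:

```rust
/// Hypothetical helper: pick the required LLVM major version from the
/// CUDA compute capability being targeted. The (9, 0) cutoff is an
/// assumption for this sketch; the real boundary is whatever the
/// bundled LLVM 7 / old libNVVM actually supports.
fn required_llvm_major(compute_capability: (u32, u32)) -> u8 {
    if compute_capability >= (9, 0) {
        // Newer arches (e.g. Blackwell) are only known to the newer
        // NVPTX backend / libNVVM, so they force the bump.
        19
    } else {
        // Everything the old toolchain can already target keeps 7.
        7
    }
}

fn main() {
    assert_eq!(required_llvm_major((7, 0)), 7);   // e.g. sm_70
    assert_eq!(required_llvm_major((12, 0)), 19); // e.g. sm_120
    println!("ok");
}
```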

@LegNeato (Contributor)

We should probably break this down, seeing as neither of us understand this space.

First, we should probably add various values for arch and stuff to enums on the rust side. Some of these might need to be gated if on 7 vs 19.

Then, we should get switching working between 7 and a stubbed-out / non-working 19 via target arch.

Then, we should systematically fix each issue and refactor common code on the way.
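The first step above (arch values as Rust enums, gated per LLVM version) might look roughly like the following; all the names and the `llvm19` feature are made up for illustration:

```rust
/// Hypothetical target-arch enum; the variant names and the `llvm19`
/// cargo feature are illustrative placeholders, not the real crate's API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NvvmArch {
    Compute60,
    Compute70,
    // Newer arches would only exist when the LLVM 19 path is compiled in:
    #[cfg(feature = "llvm19")]
    Compute120,
}

impl NvvmArch {
    /// Which LLVM major version this arch requires (assumed mapping).
    fn required_llvm_major(self) -> u8 {
        match self {
            NvvmArch::Compute60 | NvvmArch::Compute70 => 7,
            #[cfg(feature = "llvm19")]
            NvvmArch::Compute120 => 19,
        }
    }
}

fn main() {
    assert_eq!(NvvmArch::Compute60.required_llvm_major(), 7);
    assert_eq!(NvvmArch::Compute70.required_llvm_major(), 7);
    println!("ok");
}
```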

@brandonros (Author)

seeing as neither of us understand this space.

ok... kind of a strange remark...

i'm just going to work on my local branch and make kernels compile with 19.1

then i was going to work backwards and "make it upstream worthy"...

@brandonros brandonros closed this Jun 10, 2025
@brandonros (Author) commented Jun 14, 2025

@LegNeato

i rented a big aws spot VM in the cloud and built an llvm-19 debug build with assertions enabled, and am using it in the devcontainer (it's a 15gb .tar.xz and 67gb uncompressed)

it's finding issues that a typical llvm release build wouldn't, fyi, just a little tip i'd share

i have the "shaved yaks" opinionated way to get that instance if you want, cost me about $3 total

https://github.com/brandonros/cloud-llvm-build

@brandonros brandonros reopened this Jun 15, 2025
@brandonros brandonros force-pushed the llvm-19 branch 4 times, most recently from 3cac1eb to f50708b Compare June 15, 2025 01:14
@brandonros (Author) commented Jun 15, 2025

@LegNeato i'm on the fence about having two rustc_codegen_nvvm_v{{version}} crates 85% copy and pasted but.... i got this to work. vecadd is working at least, going to see if ed25519_vanity_rs compiles later

proof:
vecadd_kernel.ptx.txt

@brandonros (Author) commented Jun 20, 2025

@LegNeato i've read multiple conflicting things from multiple "official nvidia sources/documentation" that the new cuda 12.9 toolkit is based on/adds support for either llvm 18, llvm 19, or llvm 20.

i can see references to llvm 20 in their cicc binary, so....

i also question if we need this. this might sound dumb but... what if we used a simple official rust target like riscv64gc-unknown-none-elf with the official rust compiler (no custom nightly, no custom codegen llvm integration) to spit out llvm ir, and then patched it to work with cuda...

i totally agree with you/the project's view on "use nvvm to compile llvm ir to ptx"

https://github.com/brandonros/vanity-miner-rs/pull/8/files i haven't had a chance to test it yet (working on it) but in terms of thinking outside the box, i was for sure able to get nvvm to accept llvm ir and make what seems to be a "valid ptx"

@brandonros (Author)

i got this working for blackwell a different way. this branch/pr/LLVM v19 integration might work just fine but it's kind of a lot to maintain if there's an "easier" (albeit hackier) way to solve this

https://github.com/brandonros/vanity-miner-rs/actions/runs/15809309968/job/44558442212

Build Pipeline

  1. compile the no_std Rust logic + kernel libraries (specifically Rust 1.86.0, because it was built against LLVM 19) targeting riscv64gc-unknown-none-elf due to the simplicity of its instruction set
  2. make it emit LLVM IR instead of an actual binary
  3. adapt the RISC-V LLVM IR to NVPTX64 LLVM IR
  4. assemble the NVPTX64 LLVM IR to NVPTX64 LLVM bitcode
  5. feed the NVPTX64 LLVM bitcode to the new CUDA toolkit 12.9 libNVVM, which adds LLVM 19 support for Blackwell (previous architectures only support the very old LLVM 7), to get Nvidia's PTX (Parallel Thread Execution)
  6. feed the PTX to ptxas to get a CUBIN containing SASS (Streaming ASSembler)
  7. run the CUBIN on device with gpu_runner
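The steps above could be driven by a small Rust harness along these lines; every tool name, flag, and file name here is an assumption sketching the data flow, not a tested pipeline (steps 3 and 5 in particular are custom tooling and the libNVVM C API in reality, shown as placeholder CLIs):

```rust
/// Build (but don't run) the hypothetical rust -> cubin pipeline from the
/// numbered steps above. All file names, flags, and the `patch-ir` /
/// `nvvm-compile` tools are illustrative placeholders.
fn pipeline_commands(kernel_crate: &str) -> Vec<Vec<String>> {
    let cmd = |parts: &[&str]| -> Vec<String> {
        parts.iter().map(|s| s.to_string()).collect()
    };
    vec![
        // 1+2: emit LLVM IR for the RISC-V target instead of a binary
        cmd(&["cargo", "rustc", "-p", kernel_crate,
              "--target", "riscv64gc-unknown-none-elf",
              "--", "--emit=llvm-ir"]),
        // 3: rewrite the IR's target triple/datalayout for NVPTX64
        //    (custom patching in reality; placeholder tool name here)
        cmd(&["patch-ir", "kernel.ll", "--target", "nvptx64-nvidia-cuda"]),
        // 4: assemble textual IR to bitcode
        cmd(&["llvm-as", "kernel.ll", "-o", "kernel.bc"]),
        // 5: libNVVM turns the bitcode into PTX (its C API in reality;
        //    a CLI is used here only to sketch the data flow)
        cmd(&["nvvm-compile", "kernel.bc", "-o", "kernel.ptx"]),
        // 6: ptxas lowers PTX to SASS inside a cubin
        cmd(&["ptxas", "-arch=sm_120", "kernel.ptx", "-o", "kernel.cubin"]),
    ]
}

fn main() {
    let cmds = pipeline_commands("kernels");
    assert_eq!(cmds.len(), 5);
    assert_eq!(cmds[0][0], "cargo");
    assert_eq!(cmds[4][0], "ptxas");
    println!("ok");
}
```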

let me know if you actually want this / want to put time into it. otherwise blackwell+ might be able to avoid cuda_builder, or i'd need to make a cuda_builder variant that drives this super opinionated rust -> cubin pipeline i made

@brandonros brandonros closed this Jun 22, 2025
@LegNeato LegNeato reopened this Aug 1, 2025
@LegNeato (Contributor) commented Aug 1, 2025

I'm re-opening because I think we want to go this route. Totally understand if you go a different route for your project!

@LegNeato (Contributor) commented Aug 1, 2025

I plan to poke at it this week. Apologies for the previous response saying we don't know the space, I don't really know the LLVM side of the house and I misspoke.

@brandonros (Author)

Totally understand if you go a different route for your project!

Nope! Here to help, let's land this!

#216

Let's land that and then I'll rebase?

@LegNeato (Contributor) commented Aug 3, 2025

I don't have time to jam on this with you until later in the week, but I think this is a great start! Ideally both are compiled in statically or as dylibs and runtime chooses based on arch selected. But distribution with dylibs is annoying, and compiling 2 llvm versions in the same process will be annoying. So I think the first step is what this is doing, manually switching, but we should be aware where we would like it to go.

@tyler274

Im willing to put some time in here if I can get some pointers.

@devillove084

@tyler274 I'm also spending time on this and have followed the same approach as @brandonros. Writing extensive conditional compilation is truly a pain. Perhaps we should focus our efforts on this branch and explore how to improve it together?

@devillove084

@tyler274 #229 (comment) Here are some previous hints from @LegNeato.

@devillove084

@LegNeato Additionally, based on my previous open-source contributions and discussions, I've learned that both Graphite and Turso are exploring GPU acceleration. I believe this represents a significant opportunity for us. I'm highly motivated to drive this initiative forward and position our project as the leading GPU-accelerated solution within the Rust ecosystem.

@LegNeato (Contributor)

@Firestar99 is working on Graphite's support via rust-gpu!

@devillove084

@Firestar99 is working on Graphite's support via rust-gpu!

@LegNeato That's awesome! The good news is, after a month of learning and hands-on practice, I've basically figured out the workflow of LLVM backend generation. The bad news is debugging conditional compilation remains quite painful. What do you think – would it be better to write conditional compilation, or maintain two separate branches?

@LegNeato (Contributor)

I think conditional is better, as I think upgrading drops support for a bunch of devices? Or is that not the case?

@brandonros (Author)

what would it take to land this?

@devillove084

@LegNeato Based on my research, here are the facts: Taking PassManager as an example, this component orchestrates LLVM optimizations and analyses during backend code generation. Typically, multiple passes need to collaborate (e.g., constant propagation followed by dead code elimination). The PassManager schedules passes in a predefined order or based on dependencies to ensure logical execution sequence.

Prior to LLVM 9/10, implementations required inheriting from specific PassManager virtual base classes. However, starting from LLVM 14, everything has been unified into a CRTP (Curiously Recurring Template Pattern) code structure. Additionally, header files may have been relocated or modified across different LLVM versions.

Therefore, if we choose conditional compilation, I would need to create CI workflows for every single version from LLVM 7 to LLVM 19 to ensure compatibility.

@brandonros (Author)

I would need to create CI workflows for every single version from LLVM 7 to LLVM 19 to ensure compatibility.

I think this is a misunderstanding. NVIDIA cards either run LLVM 7 or LLVM 19, nothing in between. Please correct me if I am wrong.

@devillove084 commented Aug 22, 2025

I would need to create CI workflows for every single version from LLVM 7 to LLVM 19 to ensure compatibility.

I think this is a misunderstanding. NVIDIA cards either run LLVM 7 or LLVM 19, nothing in between. Please correct me if I am wrong.

@brandonros I'm referring specifically to modifications in the LLVM backend generation logic within wrapper files like rustc_llvm_wrapper. This is not targeting NVVM specifically.

@devillove084

@brandonros @LegNeato https://gist.github.com/ax3l/9489132 This source perfectly highlights that while NVIDIA officially certifies specific LLVM versions per CUDA release (as shown in the version matrix), the reality is more nuanced:

  • The crt/host_config.h hack (as noted) allows unofficial flexibility for newer LLVM versions
  • Production environments often use LLVM 11-15 (especially with CUDA 11.x/12.x)
  • NVIDIA's Enhanced Compatibility (since CUDA 11.1) intentionally supports cross-version compatibility

@devillove084

@brandonros I think we could start by refining the branches for LLVM 7 and LLVM 19 based on your existing work, then progressively extend support to other components like NVVM (CUDA) and additional LLVM versions.

@devillove084

@LegNeato I strongly agree that prioritizing support for the newer Rust toolchain, CUDA, and LLVM versions is critical. Our project's future hinges on optimizing for cutting-edge hardware like the H100, A100, and even the GH200 (which I currently have access to). This strategic focus will enable major enterprise customers to integrate our solution into their infrastructure, which is the key to maximizing our long-term growth and impact.

@brandonros (Author)

I think we could start by refining the branches for LLVM 7 and LLVM 19 based on your existing work,

I will have rebased this massive thing twice now and it continues to go stale at almost 2+ months old. Are we serious about upstreaming this? Otherwise I'm hesitant to keep doing this same song and dance of "get it ready for merge, put it on the shelf".

@devillove084

@LegNeato Based on the current modifications, what are the primary remaining challenges? Let's explore what additional efforts and adjustments we can make.

@devillove084 commented Aug 22, 2025

@LegNeato After reviewing his code changes, I can roughly see that you don't want separate code for different LLVM versions in multiple folders; instead, you'd prefer to balance NVVM and LLVM support through conditional compilation.

I sincerely request that you take the time to delve into this part.
