-
Notifications
You must be signed in to change notification settings - Fork 2
Add symbols to wheel builds in the CI #174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds debugging symbols to wheel builds in the CI to facilitate triage and debugging. The changes include:
- Adding Bazel build options to retain symbols (
--strip=never,-g3) - Updating Docker setup to use a more generic Python venv package
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tools/docker_dev_setup.sh | Changes Python venv package from version-specific (python3.10-venv) to generic (python3-venv) |
| jax_rocm_plugin/build/rocm/tools/build_wheels.py | Adds Bazel options to disable stripping and enable debug symbols in wheel builds |
Comments suppressed due to low confidence (1)
jax_rocm_plugin/build/rocm/tools/build_wheels.py:180
- When xla_path is set, a second --bazel_options argument is appended at line 180. Since argparse with action='append' creates a list, this will result in args.bazel_options containing two elements: the string '--strip=never --copt=-g3 --cxxopt=-g3' and the string '--override_repository=xla='. While the second option will work correctly, the first one will fail as explained in Comment 1. The fix should ensure all Bazel options are passed as individual entries.
"--bazel_options=--strip=never --copt=-g3 --cxxopt=-g3",
"--wheels=jax-rocm-plugin,jax-rocm-pjrt",
"--rocm_path=%s" % rocm_path,
"--rocm_version=%s" % version_string,
"--use_clang=%s" % use_clang,
"--verbose",
"--output_path=%s" % output_dir,
]
# Add clang path if clang is used.
if use_clang:
clang_path = find_clang_path()
if clang_path:
cmd.append("--clang_path=%s" % clang_path)
else:
raise RuntimeError("Clang binary not found in /usr/lib/llvm-*")
if xla_path:
cmd.append("--bazel_options=--override_repository=xla=%s" % xla_path)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <[email protected]>
| "python", | ||
| "build/build.py", | ||
| "build", | ||
| "--bazel_options=--strip=never", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a full debug build with assertions enabled and minimal optimizations, there's also a debug config: .bazelrc:522
python build/build.py build ....
--bazel_options=--config=debug
https://github.com/jax-ml/jax/blob/1d0d80c65ed4897e784ac898787cc0d7697ec134/.bazelrc#L521
and this is the option to turn it on or off @charleshofer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you meant to link to our .bazelrc? It has the same config https://github.com/ROCm/rocm-jax/blob/master/jax_rocm_plugin/.bazelrc#L121
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yes, we should use that instead, and modify it with whatever flags we want
charleshofer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to talk to Aleksei about how this will relate to the debug flags he put in the Makefile template. We it'd be nice if we could unify on one set of debug flags that can just sit in .bazelrc under their own config: https://github.com/ROCm/rocm-jax/blame/master/stack.py#L28
| make \ | ||
| patchelf \ | ||
| python3.10-venv \ | ||
| python3-venv \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
| "python", | ||
| "build/build.py", | ||
| "build", | ||
| "--bazel_options=--strip=never", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer that as many Bazel options as possible make their way down to the jax_rocm_plugin/.bazelrc file rather than getting threaded through one of our dozen-or-so wrapper scripts. Preferably, it'd be inside of something like a build:debug config, so you can enable it with ./bazel build ... --config=debug ... or build/buid.py ... --bazel_options="--config=debug" ....
| "python", | ||
| "build/build.py", | ||
| "build", | ||
| "--bazel_options=--config=debug_symbols", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we stick this in the build/build.py instead rather than storing bazel arguments directly in here?
Motivation
In many instances it is useful to have symbols in the wheel to facilitate triage etc.
Technical Details
When the
build.pyscript is triggered, we add the appropriate bazel flags to retain symbols in the different SO files in our wheels.Test Plan
The produced artifacts are manually tests using the
nm -Dcommand to ensure that symbols are present.Test Result
The output of the
nm -Dcommand shows that the symbols are indeed present in all the SOs in the wheelSubmission Checklist