Contributing to wild

If you'd like to help out, we'd love to hear from you. It's a good idea to reach out first to avoid duplication of effort. Also, it'll make it possible for us to provide hints that might make what you're trying to do easier.

LLM / AI use policy

It's OK to use an LLM or AI to help with your work on the Wild linker. However, there must be a human in the loop. We want to communicate with you, not a bot. Please do not use an AI agent to communicate with us and don't just copy and paste AI-written responses to review comments. If your English isn't good, it's fine to use automatic translation; however, please let us know that you're doing so. Alternatively, perhaps even better, you can post in your native language and we'll use a tool to translate.

Ideally you should fully understand the change that you're sending us. If there are parts of your change that work, but which you don't understand, please let us know up front. This applies whether you were using an AI or not. For example, if you happen to find a way to make something work but you're not sure why it works, say so. If you send us code that you don't understand and don't tell us up front that you don't understand it, that will quickly erode trust.

If it's not just parts of your PR that you don't understand, but the majority of it, then please don't send it. Start with something smaller that you can understand.

Your PR / issue should be written by you, not by an LLM. LLMs tend to write excessively long PR descriptions. If you write something short and concise, it shouldn't take you long provided you understand the change that you're sending us.

When starting out contributing, try to start out small, both in terms of size of changes and number of PRs. We'd like to avoid the scenario where you send us lots of PRs at once and we have similar feedback on all of them. We'd rather get one PR, provide feedback and then you can take that into consideration when doing your next PR. That's a better use of both your time and ours.

Contribution Etiquette

Please refer to the Rust guidelines on this topic.

Options for communicating

Feel free to start a discussion, open an issue, or start a thread on our Zulip.

You're also welcome to reach out directly to the maintainers.

Meetings

We try to have about one video call per month. To add this to your calendar, join the wild-dev-meetings google group. If this doesn't work for some reason, come and talk to us on Zulip and we'll try to find another way to add you.

Ways you can contribute

Use wild and let us know your experiences, or file issues for problems found.
Open an issue or a discussion here on GitHub.
Send a PR related to an issue.

Running tests

To run tests (and have them pass) there are a number of pre-requisites to have installed on Linux:

clang 'C' compiler
lld linker
nightly-x86_64-unknown-linux-gnu toolchain (add with rustup install nightly-x86_64-unknown-linux-gnu)
x86_64-unknown-linux-musl target for the nightly toolchain (add with rustup target add --toolchain nightly x86_64-unknown-linux-musl)
cranelift backend (add with rustup component add rustc-codegen-cranelift-preview --toolchain nightly)
clang-format formatter

then use cargo test as usual.

Running tests for other architectures on x86_64

Wild supports testing on non-native architectures using QEMU.

Setup

Add required Rust targets:

# For aarch64
rustup target add --toolchain nightly aarch64-unknown-linux-gnu aarch64-unknown-linux-musl

# For riscv64
rustup target add --toolchain nightly riscv64gc-unknown-linux-gnu riscv64gc-unknown-linux-musl

# For loongarch64
rustup target add --toolchain nightly loongarch64-unknown-linux-gnu loongarch64-unknown-linux-musl

Install required packages (for apt-based systems):

# For aarch64
sudo apt install qemu-user gcc-aarch64-linux-gnu g++-aarch64-linux-gnu binutils-aarch64-linux-gnu

# For riscv64
sudo apt install qemu-user gcc-riscv64-linux-gnu g++-riscv64-linux-gnu binutils-riscv64-linux-gnu

# For loongarch64
sudo apt install qemu-user gcc-loongarch64-linux-gnu g++-loongarch64-linux-gnu binutils-loongarch64-linux-gnu

Running tests

To run tests for a specific architecture:

# For aarch64
WILD_TEST_CROSS=aarch64 cargo test

# For riscv64
WILD_TEST_CROSS=riscv64 cargo test

# For loongarch64
WILD_TEST_CROSS=loongarch64 cargo test

# All
WILD_TEST_CROSS=all cargo test

This runs both native tests and architecture-specific tests. QEMU is used to execute binaries for non-native architectures, while linking and diffing are performed natively. Note that cross-compilation only works with GCC and rustc tests; clang-based tests currently disable cross-compilation.

Configuration file for tests

Currently, the behavior for the following test options can be configured using the TOML format:

rustc_channel: Specifies which Rust compiler channel to use when running tests that build Rust code. The default value is "default", which means no explicit toolchain is specified.
qemu_arch: List of additional architectures (different from the host) to run tests for. This setting is overridden by the $WILD_TEST_CROSS environment variable. The default value is [].
allow_rust_musl_target: Specifies whether to allow the musl target for Rust tests. The default value is false, so you’ll need to set it to true if you want to run tests targeting musl.
diff_ignore: Adds extra rules to ignore certain diffs. This can be useful if you're developing on a system with an older version of GNU ld that doesn't perform certain optimisations.
run_all_diffs: Enables diffing the output of wild against that of the existing linkers. By default, diffs are skipped. Set to true to enable.

A sample configuration file is provided as test-config.toml.sample. By default, Wild uses test-config.toml as the configuration file. If you have written your configuration in a different file, specify its location using the WILD_TEST_CONFIG environment variable as follows:

WILD_TEST_CONFIG=path_to_config cargo test

Coding style for tests

C and C++ files follow the Google style from clang-format, which is enforced by a unit test. You can use the following command to format them:

clang-format -i wild/tests/sources/*.{c,cc,h}

Newer versions of clang-format might produce slightly different formatting. If your code is correctly formatted locally but the CI job still reports formatting errors, update your clang-format to either the latest version or the version used in the failing CI job (the CI output prints its version on failure). In general, formatting generated by newer versions should be accepted by older versions, but this isn’t guaranteed.

When working on tests, you can temporarily disable the formatting check by setting the WILD_TEST_IGNORE_FORMAT environment variable.

Running external tests

Wild can run some external test suites. Currently only the test suite of mold is supported.

You can run the mold tests as follows:

# Clone the mold repository (only needed once)
git submodule update --init --recursive

cargo test --features mold_tests

The output will be placed under fakes-debug/out/test/.

You can use this command instead of the second one to run all external tests together:

cargo test --features external_tests

Some tests are configured to be skipped by default. A list of these skipped tests can be found at:

wild/tests/external_tests/mold_skip_tests.toml: for the mold tests.

However, you can also run the tests without skipping any of them:

# Run mold tests without skipping any test
WILD_IGNORE_SKIP=mold cargo test --features mold_tests

# Run all external tests without skipping any test
WILD_IGNORE_SKIP=all cargo test --features external_tests

Running external tests with other linkers

When debugging a failing test, it can be useful to see how other linkers (such as GNU ld or lld) behave on the same test. You can use the WILD_EXTERNAL_LINKER environment variable to run test scripts with a different linker:

WILD_EXTERNAL_LINKER=ld cargo test --features mold_tests discard.sh

WILD_EXTERNAL_LINKER=lld cargo test --features mold_tests allow-multiple-definition.sh

The skip list is still applied, so expect_failure tests work as usual. This is useful for determining whether a test that fails with wild also fails with another linker, or whether the failure is specific to wild.

When using a third-party linker, the path to the temporary directory is printed to stderr, so you can inspect the output files (e.g. under /tmp/foo/out/test/).

Commit messages

The title of your commit message should ideally be written with a view to it being included in the release notes. i.e. write what would be useful for a user of the linker to read. Implementation details can go in the extended description.

When we write the release notes, we try to categorise commits into groups. You can help by prefixing the commit title with a word that gives a clue as to which section it should go in. If you think it shouldn't go in the release notes, you can prefix the title with "chore:" or "typo:", neither of which is included in the notes. If in doubt, conventional commits is probably a good source of information for how we aspire to format commit messages.

To see the current rules that we use for categorising commits, see cliff.toml.

GitHub workflow

TL;DR: We're pretty relaxed. Feel free to force push or not. Squash, rebase, merge, whatever you like, we'll find ways to work with it.

In order to make things like git bisect easier, this project maintains a linear sequence of commits.

It's fine for you to use whatever workflow you like when making a PR. For example, if you want to add fix-up commits as the PR progresses, that's fine. It's also fine to amend commits as you go.

All PRs get squashed when merging. We do this so that we have a reference to the PR in the commit message, which makes it easier to find the change later and write the release notes. If you have changes that you'd like to keep as separate commits, please send them as separate PRs.

When merging, by default we'll use the title and description of your PR as the commit message. So if you want to change the commit message after you've created the PR, edit the title and / or the description (first comment on the PR). We may sometimes edit the commit message ourselves when merging.

Feel free to mark your PR as a draft at any stage if you know there's more you'd like to do with it and want to avoid us merging it before it's ready.

Coding style

This is mostly handled by rustfmt. A couple of the format options that we use aren't yet stable, so you'll need to format with nightly. Before you upload your PR, you should run the following:

cargo +nightly fmt

YAML files are formatted with yamlfmt. If you change any workflow or other YAML files, run:

yamlfmt .

Naming

When in doubt, do what is done in the Rust standard library. e.g. when making a name CamelCase, only the first letter of each "word" should be uppercase even if the "word" is an acronym. You can see this in the standard library with types like TcpStream and IoSlice.
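As a rough illustration of the acronym rule, here is a minimal sketch. The type names `ElfHeader` and `GotEntry` are hypothetical, chosen only for this example; they are not taken from the Wild codebase.

```rust
// Hypothetical type names, used only to illustrate the naming convention.

/// Preferred: the acronym "ELF" is written as an ordinary word, like `TcpStream`.
pub struct ElfHeader {
    pub is_64_bit: bool,
}

/// Preferred: `Got`, not `GOT`.
pub struct GotEntry {
    pub address: u64,
}

// Avoided: `ELFHeader`, `GOTEntry`.

impl GotEntry {
    /// Creates an entry that records the given address.
    pub fn new(address: u64) -> Self {
        Self { address }
    }
}
```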

One import per line

One style thing that might be slightly different is that we use one import per line. i.e. we don't use {} in use statements. This has two benefits. Firstly, merge conflicts are significantly less likely. Secondly, if a merge conflict does happen, it's significantly easier to resolve. The downside is that it's more verbose, but since your IDE is probably adding these lines for you anyway, it shouldn't matter.
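A small sketch of what this looks like in practice. The helper `count_by_extension` is hypothetical and exists only so that every import is used; the standard-library items are real.

```rust
// One import per line: each `use` statement names exactly one item.
use std::collections::HashMap;
use std::collections::HashSet;
use std::path::Path;
use std::path::PathBuf;

// Avoided:
// use std::collections::{HashMap, HashSet};
// use std::path::{Path, PathBuf};

/// Hypothetical helper: counts distinct paths per file extension.
pub fn count_by_extension(paths: &[PathBuf]) -> HashMap<String, usize> {
    let mut seen: HashSet<&Path> = HashSet::new();
    let mut counts = HashMap::new();
    for path in paths {
        // Skip paths that we've already counted.
        if !seen.insert(path.as_path()) {
            continue;
        }
        if let Some(ext) = path.extension().and_then(|e| e.to_str()) {
            *counts.entry(ext.to_owned()).or_insert(0) += 1;
        }
    }
    counts
}
```

With this layout, adding or removing a single import touches exactly one line, which is what keeps merge conflicts small.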

Panic policy

Panicking if there's a bug is OK. It's generally better to crash in a way that's easy to tell what happened rather than produce an invalid executable. That said, lots of the code, when it detects an inconsistent internal state (a bug), returns an error rather than panicking. The reason for this is not to avoid the panic per se, but rather because by returning an error, we can attach more contextual information to the error to help diagnose the problem. For example, we can add information about what symbol we were processing and which input file we were looking at. This is usually more useful for us than a stack trace showing where it was in the code. Also, since Wild is very multi-threaded, if there's a bug that causes all the threads to panic, the output can get pretty messed up.

So in summary, if you think something shouldn't happen, it's fine to panic. Calling unwrap is fine. But if you're less sure that it can't happen, or you've observed it happen and need to debug why it happened, then switching to returning an error is recommended.
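The trade-off can be sketched roughly as follows. This is only an illustration: `SymbolTable` and its methods are hypothetical, and a plain `String` stands in for the error type, which is an assumption and not how Wild's own error handling is necessarily written.

```rust
use std::collections::HashMap;

/// Hypothetical symbol table mapping symbol names to addresses.
pub struct SymbolTable {
    addresses: HashMap<String, u64>,
}

impl SymbolTable {
    pub fn new(entries: &[(&str, u64)]) -> Self {
        Self {
            addresses: entries
                .iter()
                .map(|(name, address)| (name.to_string(), *address))
                .collect(),
        }
    }

    /// Panics if the symbol is missing. Fine when a miss can only mean a bug.
    pub fn address_or_panic(&self, name: &str) -> u64 {
        *self.addresses.get(name).unwrap()
    }

    /// Returns an error that records which symbol and which input file were
    /// being processed, which is often more useful than a stack trace.
    pub fn address(&self, name: &str, input_file: &str) -> Result<u64, String> {
        self.addresses.get(name).copied().ok_or_else(|| {
            format!("undefined symbol `{name}` while processing `{input_file}`")
        })
    }
}
```

Both methods surface the same bug; the second simply carries the context along with it.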

Building wild with wild

You can add or modify a .cargo/config.toml file to change the linker used to build wild to be wild!

The below example has entries for musl and gnu ABI targets:

[target.x86_64-unknown-linux-musl]
linker = "/usr/bin/clang"
rustflags = ["-C", "relocation-model=static", "-C", "link-arg=--ld-path=wild"]

[target.x86_64-unknown-linux-gnu]
linker = "/usr/bin/clang"
rustflags = ["-C", "link-arg=--ld-path=wild"]

The .cargo/config.toml file can be added in the root folder of the project, or somewhere else according to the Hierarchical structure that cargo uses to determine config.

Reading

Linkers are complex bits of software. Here are some resources that are good for learning what linkers need to do.

Ian Lance Taylor's blog post series. Ian wrote the GNU Gold linker. This series is a bit old now, so doesn't have some more recent stuff, but is nonetheless a great introduction.
Maskray's blog. Maskray maintains the LLD linker and has many awesome blog posts about various linker-related topics. A few posts in particular:
- All about thread-local storage
- All about Global Offset Table
- Copy relocations, canonical PLT entries and protected visibility
- All about COMMON symbols. Despite their name, common symbols aren't commonly used. They are however used in libc, so are necessary if you want to be able to link pretty much anything.
- Everything else with the linker tag
For Wild specific content, there's David Lattimore's blog.
There are also various specification documents. These may not be the best to read start-to-finish, but can be good when you need some specific details on something.
A Deep dive into (implicit) Thread Local Storage

Finding an issue to work on

Whatever issue you work on, please comment on the issue to let us know you're working on it, otherwise two people might end up working on the same issue and that could be disappointing if someone then felt like they'd wasted their time. It's perfectly OK to say that you're going to work on something, then later realise that it's not for you.
If you'd like to work on something that someone said they're working on, but they haven't provided an update in a while, feel free to politely ask if they're still working on it and mention that if they're not, you'd like to have a go.
We may on occasion tag issues as good first issue. One person's good-first-issue might be too hard or too easy for another person, so this is a somewhat hard judgement to make.
You're welcome to help out with other unassigned issues too, even if they don't have tags. If you're interested in possibly working on such an issue, comment on it and we'll see what guidance we can provide. This will also allow us to assign the issue to you so that others don't duplicate efforts.
