Validate WHEEL file tags #19411

Open
konstin wants to merge 7 commits into pypi:main from konstin:konsti/validate-wheel-file-tags

@konstin
Contributor

@konstin konstin commented Jan 28, 2026

It can happen that the filename and the dist-info `WHEEL` file mismatch, either because the author manually changed the wheel filename, or due to a bug in the build backend implementation.

This caused a number of bugs in uv and pip, which expected the `WHEEL` file to contain accurate tags in the basic `Key: Value` format:

* astral-sh/uv#2149
* astral-sh/uv#6615
* astral-sh/uv#15832
* astral-sh/uv#16581
* astral-sh/uv#16823

Another notable example is ziglang/zig-pypi#40, where the author wasn't aware of the expanded tags requirement.

This PR adds validation that enforces the basic `Key: Value` format, that the tags in the `WHEEL` file are expanded, and that they match the tags in the filename.

Speaking of ziglang/zig-pypi#40, pypi/support#8180 would be really valuable to have: zig is an important part of the toolchain for building manylinux-compliant Python packages using Rust.
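
For readers unfamiliar with the three checks, here is a minimal sketch of what they amount to. It is not the code from this PR: the function names are hypothetical, it only uses the standard library, and it assumes that a `.` inside a `Tag` value always indicates a compressed (unexpanded) tag set and that the last three dash-separated filename fields are the Python, ABI, and platform tags.

```python
from itertools import product


def expand_filename_tags(filename: str) -> set[str]:
    """Expand the compressed tag set of a wheel filename into single tags."""
    stem = filename.removesuffix(".whl")
    # Assumption: the last three dash-separated fields are the tag fields.
    python_tags, abi_tags, platform_tags = stem.split("-")[-3:]
    return {
        "-".join(parts)
        for parts in product(
            python_tags.split("."), abi_tags.split("."), platform_tags.split(".")
        )
    }


def parse_wheel_tags(wheel_text: str) -> list[str]:
    """Return all Tag values, rejecting lines that are not `Key: Value`."""
    tags = []
    for line in wheel_text.splitlines():
        if not line.strip():
            continue  # this sketch tolerates blank lines
        key, sep, value = line.partition(": ")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"not in `Key: Value` format: {line!r}")
        if key == "Tag":
            tags.append(value.strip())
    return tags


def check_wheel_tags(filename: str, wheel_text: str) -> list[str]:
    """Return a list of problems; an empty list means the tags are consistent."""
    problems = []
    wheel_tags = parse_wheel_tags(wheel_text)
    for tag in wheel_tags:
        if "." in tag:
            problems.append(f"tag is not expanded: {tag}")
    expected = expand_filename_tags(filename)
    if set(wheel_tags) != expected:
        problems.append(
            f"WHEEL tags {sorted(set(wheel_tags))} do not match "
            f"filename tags {sorted(expected)}"
        )
    return problems
```

For example, `foo-1.0-py2.py3-none-any.whl` passes only if the `WHEEL` file lists `Tag: py2-none-any` and `Tag: py3-none-any` as two separate lines; a single `Tag: py2.py3-none-any` line is reported as unexpanded.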
@konstin konstin requested a review from a team as a code owner January 28, 2026 10:44
@di
Member

di commented Feb 13, 2026

Just want to say, I think I would like this to eventually land in pypa/packaging#1009 instead, but since progress has (admittedly) been slow there, I have no problems adding support here first.

@konstin
Contributor Author

konstin commented Feb 16, 2026

For some context on the impact here, uv is basically a WHEEL test harness for invalid tags, as they cause a reinstall loop. The description lists the cases that were reported over the months, including the notable recent one where torch accidentally published a broken wheel, which would have been caught by this PR (astral-sh/uv#17711). uv also validates the Version field (https://github.com/astral-sh/uv/blob/6b2d6f2c401aefb9d97cff180df0110c99e3493e/crates/uv-install-wheel/src/wheel.rs#L273-L304).

@konstin
Contributor Author

konstin commented Feb 16, 2026

Here are the results for the first 50k projects from a hacky script that checks the latest release of every project: https://gist.github.com/konstin/755aead7facf690efb476b90ecee8fe7. I've split them into maturin and non-maturin, since maturin produces a lot of wheels and has since been fixed. I'll update it when I have a full run.

@miketheman
Member

@konstin for impact analysis reporting - something that would help contextualize the impact of this change would be adding the file's date to the output, CSV-style - so that these could be inspected relatively easily.

@konstin
Contributor Author

konstin commented Feb 16, 2026

Like this? https://gist.github.com/konstin/56afd979b79ed86afa944ef196d27d7d

@miketheman
Member

> Like this? gist.github.com/konstin/56afd979b79ed86afa944ef196d27d7d

Yes, thanks!

@konstin
Contributor Author

konstin commented Feb 16, 2026

@miketheman
Member

Thanks @konstin - I was looking to generate some idea of how much this may impact existing upload / builders.

Here's a chart where I only look at the past couple of years of uploads in the data you furnished:

[Chart: COUNTA of project vs upload_date (since 2024-01-01)]
source, data massaged a bit for parsing

Those numbers aren't as low as I'd hope/expect them to be.

Slicing another way, it looks like maturin is the biggest contributor to the issue, with over 87% of all problematic uploads in the past year attributed to this generator.

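| generator | COUNTA of project (count) | COUNTA of project (% of total) |
|---|---|---|
| maturin | 31370 | 87.46% |
| setuptools | 2217 | 6.18% |
| bdist_wheel | 750 | 2.09% |
| hatchling | 516 | 1.44% |
| poetry-core | 409 | 1.14% |
| skbuild | 134 | 0.37% |
| scikit-build-core | 118 | 0.33% |
| xmake | 53 | 0.15% |
| (blank) | 31 | 0.09% |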

...
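
For anyone who wants to reproduce this kind of slicing, here is a rough sketch of the aggregation. It is not the spreadsheet used for the table above: the CSV column names, the sample rows, and the exact shape of the `Generator` values are assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the spirit of the CSV-style output discussed above.
SAMPLE = """project,upload_date,generator
example-a,2025-03-01,maturin (1.8.0)
example-b,2024-07-15,maturin (1.5.1)
example-c,2025-01-20,setuptools (75.0.0)
example-d,2023-11-02,hatchling 1.18.0
example-e,2024-02-09,
"""


def generator_name(value: str) -> str:
    """Strip the version from a Generator value; empty values become '(blank)'."""
    value = value.strip()
    if not value:
        return "(blank)"
    return value.split()[0].split("(")[0]


def generator_shares(csv_text: str, since: str) -> list[tuple[str, int, float]]:
    """Count rows per generator since an ISO date and compute percentage shares."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # ISO dates compare correctly as plain strings.
    counts = Counter(
        generator_name(row["generator"])
        for row in rows
        if row["upload_date"] >= since
    )
    total = sum(counts.values())
    return [
        (name, count, round(100 * count / total, 2))
        for name, count in counts.most_common()
    ]


for name, count, share in generator_shares(SAMPLE, since="2024-01-01"):
    print(f"{name} {count} {share}%")
```

The date filter mirrors the "since 2024-01-01" cutoff of the chart, so the 2023 row in the sample is excluded from the counts.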

It looks like something was fixed, since the errors only appear up to maturin v1.9.6. A lot of folks are still using those versions, but there's a path forward for them: upgrade maturin and keep going?

For setuptools, the most recent release of 82.0.0 is reporting similar behavior - one example: https://pypi.org/project/OpenGeode-core/15.31.4rc1/#opengeode_core-15.31.4rc1-cp310-cp310-manylinux_2_28_x86_64.whl

Do those users have a path forward with their tool of choice? Or will their upload be rejected without a path forward with their chosen tool?

Generally speaking the code changes are fine, I'm just worried that if we ship this without having a clear answer for users impacted, it will increase failures and end user confusion.

@konstin
Contributor Author

konstin commented Feb 17, 2026

For maturin you can update; the releases should be highly compatible (at least I'm not aware of any blockers to upgrading from the issue tracker). It's kinda embarrassing that this went uncaught for so long.

For the other cases, I'm not clear on whether they are actually build system bugs, or if people are manually renaming wheels and aren't aware that they also have to update the WHEEL file (or really, shouldn't edit this by hand at all).

@miketheman
Member

> ... It's kinda embarrassing that this went uncaught for so long.

heh, we try our best 😉

> For the other cases, I'm not clear on whether they are actually build system bugs, or if people are manually renaming wheels and aren't aware that they also have to update the WHEEL file (or really, shouldn't edit this by hand at all).

Yeah, that would make sense - if the tool generates one thing and the publishing process renames files, that would probably lead to this mismatch. So what can we offer folks in the way of understanding the problem and resolving it? Do we need to have a Pinned Issue here for a while, improve the messaging they receive from an upload failure, or something else?

@ewdurbin
Member

At least in the example provided for opengeode-core it appears to be renaming of files post build:

https://github.com/Geode-solutions/actions/blob/97d31ebb48d31dbcb706296cc313b5c56c181892/.github/workflows/cpp-deploy-linux-python.yml#L122-L125


I think it's good to be cautious, but we cannot debug everyone's builds for them preemptively.

Perhaps we could have a brief window where we notify people that their builds are bad?

@di
Member

di commented Feb 18, 2026

Seems to me like this should have a deprecation period with notification via email before it becomes an outright block.

@miketheman
Member

> Perhaps we could have a brief window where we notify people that their builds are bad?

> Seems to me like this should have a deprecation period with notification via email before it becomes an outright block.

These both seem reasonable, but I'm also curious what an infrequent publisher, who hasn't published during the deprecation window, would experience when hitting this. Say they only publish a new version once a year.

In my mind, having a GH Issue with a "you hit this error during publishing, here's a few things you can explore to resolve, like upgrade maturin past x version, stop renaming files, etc" and link the issue in the error message provides a longer-term solution that folks can use when they hit this condition to self-resolve.
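
As one thing such an issue could suggest, here is a hedged sketch of a local check that a publisher might run on a built wheel before uploading. The helper names are hypothetical, it uses only the standard library, and the demo builds a tiny in-memory archive instead of reading a real wheel. It shows how renaming a file after the build leaves the `WHEEL` tags stale.

```python
import io
import zipfile
from itertools import product


def filename_tags(wheel_name: str) -> set[str]:
    """Expand the tag fields of a wheel filename into individual tags."""
    python, abi, platform = wheel_name.removesuffix(".whl").split("-")[-3:]
    return {
        "-".join(parts)
        for parts in product(python.split("."), abi.split("."), platform.split("."))
    }


def recorded_tags(archive: zipfile.ZipFile) -> set[str]:
    """Read the Tag lines from the top-level *.dist-info/WHEEL file."""
    candidates = [
        name
        for name in archive.namelist()
        if name.endswith(".dist-info/WHEEL") and name.count("/") == 1
    ]
    if len(candidates) != 1:
        raise ValueError(f"expected exactly one WHEEL file, found {candidates}")
    text = archive.read(candidates[0]).decode("utf-8")
    return {
        line.partition(": ")[2].strip()
        for line in text.splitlines()
        if line.startswith("Tag: ")
    }


def diagnose(wheel_name: str, archive: zipfile.ZipFile) -> str:
    """Return 'ok' or a short description of the mismatch."""
    expected = filename_tags(wheel_name)
    found = recorded_tags(archive)
    if expected == found:
        return "ok"
    return f"filename implies {sorted(expected)}, WHEEL records {sorted(found)}"


# Demo: a minimal in-memory archive standing in for a built wheel.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "demo-1.0.dist-info/WHEEL",
        "Wheel-Version: 1.0\nGenerator: demo\nRoot-Is-Purelib: false\n"
        "Tag: cp310-cp310-linux_x86_64\n",
    )

with zipfile.ZipFile(buffer) as zf:
    # Name as produced by the build: consistent with the WHEEL file.
    print(diagnose("demo-1.0-cp310-cp310-linux_x86_64.whl", zf))
    # Name after a manual rename: the WHEEL file still has the old tag.
    print(diagnose("demo-1.0-cp310-cp310-manylinux_2_28_x86_64.whl", zf))
```

A real check would open the wheel from disk with `zipfile.ZipFile(path)` and pass the file's basename as `wheel_name`.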
