Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build platform-specific wheels containing libmagic #294

Open
wants to merge 38 commits into
base: master
Choose a base branch
from

Conversation

ddelange
Copy link
Contributor

@ddelange ddelange commented Sep 6, 2023

Hi @ahupp 👋

This PR builds self-contained wheels as discussed in #233. For Windows users, this renders python-magic-bin from @julian-r obsolete.

pip install these wheels

pip can install from GitHub Release assets from my fork:

# comma separated list for --find-links
export PIP_FIND_LINKS=https://github.com/ddelange/python-magic/releases/expanded_assets/0.4.28.post7
pip install --force-reinstall python-magic
- python-magic-0.4.27.tar.gz
+ python-magic-0.4.28.tar.gz
- python_magic-0.4.27-py2.py3-none-any.whl
+ python_magic-0.4.28-py2.py3-none-macosx_10_9_x86_64.whl
+ python_magic-0.4.28-py2.py3-none-macosx_11_0_arm64.whl
+ python_magic-0.4.28-py2.py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
+ python_magic-0.4.28-py2.py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl
+ python_magic-0.4.28-py2.py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl
+ python_magic-0.4.28-py2.py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+ python_magic-0.4.28-py2.py3-none-musllinux_1_1_aarch64.whl
+ python_magic-0.4.28-py2.py3-none-musllinux_1_1_ppc64le.whl
+ python_magic-0.4.28-py2.py3-none-musllinux_1_1_s390x.whl
+ python_magic-0.4.28-py2.py3-none-musllinux_1_1_x86_64.whl
+ python_magic-0.4.28-py2.py3-none-win32.whl
+ python_magic-0.4.28-py2.py3-none-win_amd64.whl

The wheels:

  • Come bundled with libmagic (.dylib on mac, .so on nix, .ddl on win) - no additional user action needed
    • latest libmagic (5.45), except on windows (5.44, used the latest GitHub Release at https://github.com/julian-r/file-windows)
    • building from source on linux and macos with maximum backwards compatibility
    • skipping i686 linux wheels as they don't pass pytest ref ca4def3
  • Are forward compatible, they will install on cpython 2.7 and all current and future cpython 3.5+ distributions
  • Are built for the major cpu architectures (and also for alpine linux)
    • Linux architectures limited by availability: https://pkgs.org/search/?q=file-libs now building from source on linux
    • For windows I couldn't find compiled libmagic for ARM64, skipped it
  • Are uploaded as GitHub Release assets, as well as upload to PyPI, whenever a GitHub Release is (pre)released.
    • 🚩 @ahupp for this, you still need to add wheels.yml as trusted publisher here

CI/CD

dists build with official cibuildwheel on GitHub Actions, and they build in parallel:
image

  • The PyPI publish step is failing here ^ because my fork is not a trusted publisher
  • For any system not covered by these wheels, pip will fall back to the source distribution, which will check for proper libmagic on the systen at install time.

fix #137, fix #288, fix #225, fix #276, fix #248, fix #87, fix #139, fix #233, fix #73, fix #60, fix #34, fix #293, fix #233, fix #278, fix #262, fix #248, fix #238, fix #145, fix #61, fix #12, fix #295, fix #311, fix #312, fix #313, fix #321, fix #332, fix #249, fix #333

@ddelange ddelange changed the title Build ABI3 wheels containing libmagic Build platform-specific wheels containing libmagic Sep 7, 2023
@ddelange ddelange force-pushed the abi3-wheels branch 2 times, most recently from a98f13b to dc9c393 Compare September 7, 2023 07:26
@ddelange ddelange force-pushed the abi3-wheels branch 12 times, most recently from d672b91 to 14f7dbb Compare September 7, 2023 10:03
@apirogov
Copy link

This is nice! Hope this will be merged soon!

Just ran into issues with my library being not usable by Mac and Windows users because I rely on python-magic. If there are wheels, I don't need to find a workaround or replace the library :)

python-magic-bin did not work for some of them, by the way.

with:
files: dist/*

- name: Upload to PyPI

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small improvement here might be to use the PyPa Action instead: https://github.com/pypa/gh-action-pypi-publish

The big advantage is trusted publishing, instead of storing a password or token as a secret

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's cool, thanks for sharing!

@ahupp shall I make that change and you set it up on PyPI side?

Copy link
Contributor Author

@ddelange ddelange Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like trusted publishing is the way to go. I recently got this email:

Hi ddelange!
Earlier this year, we announced that PyPI would require all users to enable a form of two-factor authentication on their accounts by the end of 2023.

Keeping your PyPI account secure is important to all of us. We encourage you to enable two-factor authentication on your PyPI account as soon as possible.

What forms of 2FA can I use?
We currently offer two main forms of 2FA for your account:

Security device including modern browsers (preferred) (e.g. Yubikey, Google Titan)
Authentication app (e.g. Google Authenticator)
Once one of these secure forms is enabled on your account, you will also need to use either Trusted Publishers (preferred) or API tokens to upload to PyPI.

What do I do if I lose my 2FA device?
As part of 2FA enrollment, you will receive one-time use recovery codes. One of them must be used to confirm receipt before 2FA is fully active. Keep these recovery codes safe - they are equivalent to your 2FA device. Should you lose access > to your 2FA device, use a recovery code to log in and swap your 2FA to a new device.

Read more aboutrecovery codes.

Why is PyPI requiring 2FA?
Keeping all users of PyPI is a shared responsibility we take seriously. Strong passwords combined with 2FA is a recognized secure practice for over a decade.

We are requiring 2FA to protect your account and the packages you upload, and to protect PyPI itself from malicious actors. The most damaging attacks are account takeover and malicious package upload.

To see this and other security events for your account, visit your account security history.

Read more on this blog post.

If you run into problems, read the FAQ page. If the solutions there are unable to resolve the issue, contact us via [email protected].

Thanks,
The PyPI Admins

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahupp so the last thing for you to do is adding this repo as trusted publisher to https://pypi.org/manage/project/python-magic/settings/publishing/

@ahupp
Copy link
Owner

ahupp commented Sep 28, 2023

This is huge, thank you! Apology for the delay I thought I'd commented earlier but guess not. I'll look this over soon; I didn't quite understand how bad the binary dep situation was expecially on windows.

.github/workflows/main.yml Outdated Show resolved Hide resolved
@jean-humann
Copy link

@ahupp @stumpylog could we have this one merged (and released) by the end of the year please ?

add_libmagic.sh Outdated Show resolved Hide resolved
add_libmagic.sh Outdated Show resolved Hide resolved
Comment on lines +20 to +26
magic_file = None
if not os.environ.get("MAGIC"):
# wheels package the mime database in this directory
# prefer it when no magic file is specified by the user
mime_db = os.path.join(os.path.dirname(__file__), 'magic.mgc')
if os.path.exists(mime_db):
magic_file = mime_db
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems weird duplicating this logic between here and __init__.py, but I guess it's unavoidable since the __init__.py version is hidden inside Magic.__init__().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one way to avoid code duplication (and cyclic import) would be to add from magic.compat import magic_file on top of __init__.py, and use it as default argument:

Magic.__init__(magic_file=magic_file, ...)

option 2 would be to put the snippet in a new file and import it in both places (avoids cyclic import as well).

any preference here @ahupp?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add from magic.compat import magic_file on top of __init__.py,

I don't think that's a good idea, as it means that it'd be impossible to import any part of python-magic without also importing magic.compat. A compat module, specifically, should only be imported when needed.

The code could be moved to the top level of __init__.py, and imported into compat.py with from . import magic_file. I think that would be a better direction for things to flow, if going that route.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, never mind. __init__.py already unconditionally imports magic.compat in its _add_compat() function. Which is always run on import.

@ddelange ddelange force-pushed the abi3-wheels branch 5 times, most recently from 2822bca to 3d7698c Compare May 28, 2024 23:53
@ddelange
Copy link
Contributor Author

Hi 👋 I'll be AFK until end of June. @ahupp feel free to take over my branch, or merge as is!
https://http.cat/301

Copy link

@Privat33r-dev Privat33r-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM apart from some minor issues. Ideally, to have better control over dependencies, I would keep binaries inside of the repo, like pyexiv2 and some other libraries do.

.github/workflows/ci.yml Outdated Show resolved Hide resolved
CHANGELOG Show resolved Hide resolved
README.md Show resolved Hide resolved
Depending on your system and CPU architecture, there might be no compatible wheel uploaded.
However, precompiled libmagic might still be available for your system:

```sh

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be beneficial to add a library installation guide for SUSE as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you provide the relevant command?

fwiw, I think mostly all linux flavours will be covered by the wheels in the PR description, so those users won't be needing the install from source instructions provided here.

Copy link

@Privat33r-dev Privat33r-dev Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that it would be
zypper install file-devel
it might as well be required to do (in vivo test is required though)
zypper install file-magic

Currently I don't have OpenSUSE at my disposal for tests and it's likely that it would be a default package. I don't promise anything, but I might find time soon-ish to test it.

tar xvf "${tmpfile}" &&
cd "${version}" &&
./configure &&
make &&
Copy link

@Privat33r-dev Privat33r-dev Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dev dependencies to use "make" and similar commands are not shipped by default in some distros, especially in those that are usually used by Docker. I'd suggest to ensure that necessary dependencies are installed first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wished set -euxo pipefail also effects commands inside function definitions. then we could remove && from every line here (and the subshell parentheses here and the || on L21), and we could test which make on top of this function or similar.

now, it simply falls back to install_precompiled (L69) when curl or make or compilation fails (error will be in output).

if make is not available, in the current setup we will only have done an unnecessary curl. not the worst but could be avoided.

any ideas?

Copy link

@Privat33r-dev Privat33r-dev Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wished set -euxo pipefail also effects commands inside function definitions. then we could remove && from every line here (and the subshell parentheses here and the || on L21), and we could test which make on top of this function or similar.

any ideas?

It appears to be doing so already. Try this (and also the same script, but remove the first line or replace exit 1 to exit 0):

set -euxo pipefail
func(){
  echo 1
  (exit 1)
  echo 2
}

func

You made me think about other topic when you mentioned that curl can be executed. There is no cleanup on any step, it might be logical to make a ZIP (or gzip, doesn't matter) file cleanup from external function and do it regardless of the outcome. Same for windows. Pseudocode:

install_stuff(){
 real_install()
 rm potential_zip
}

README.md Outdated Show resolved Hide resolved
(
version="file-5.45" &&
tmpfile="$(mktemp)" &&
curl -sSLo "${tmpfile}" "https://astron.com/pub/file/${version}.tar.gz" &&
Copy link

@Privat33r-dev Privat33r-dev Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried it on ubuntu's Docker, fails here :)
bash: curl: command not found

install_precompiled would work though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the script is only intended for github actions default runner images and the cibuildwheel docker images (both have curl and make)

@cclauss
Copy link
Contributor

cclauss commented Jun 26, 2024

There is wonderful work in this pull request and it has four positive reviews. Unfortunately, it has been open for ~9 months without landing. Perhaps it would be useful to break it into three separate PRs that are easier to review and land.

One PR that deals only with macOS and another that deals only with Windows might be easier to land. Once that is done then this PR could be rebased to deal only with Linux and friends. I know it is extra work but I sense that new momentum is needed.

@estarfoo
Copy link

Hi! Great PR, is there any real contention about it beyond the scope of OS/distro support in add_libmagic.sh and README.md?

Most users will only ever want the wheels from this repo. In particular, this looks like it will solidly cover usage in Docker images. Anyone who wants to use the source version and provide libmagic themselves, probably knows best how to do the latter in their environment. given info on where this package will look for the library. (Those who package python-magic for their Linux distro of choice will already have a preferred way of ensuring libmagic presence. This will probably not exactly match anything suggested in python-magic docs.)

Even for those particularly invested in having a range of setup instructions, the PR in its current state should look like a clear improvement on master, and further improvements in that are will come more easily as separate PRs (because they won’t be tied to CI scripts).

So: how about merging this without completing libmagic setup instructions for every possible platform? Seems like it already does what the PR title says.

@Privat33r-dev
Copy link

So: how about merging this without completing libmagic setup instructions for every possible platform? Seems like it already does what the PR title says.

Totally for it. My suggestions just show the way to improve it, but I would merge it "as is" since it already provides a huge value. "Perfect is the enemy of good".

@ahupp hopefully you can find some time to review this most discussed PR in the python-magic's history :)

Comment on lines +11 to +13
tmpfile="$(mktemp)" &&
curl -sSLo "${tmpfile}" "https://astron.com/pub/file/${version}.tar.gz" &&
tar xvf "${tmpfile}" &&

This comment was marked as resolved.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@estarfoo

Not a big deal, but if going with the piped curl output, personally I'd pair that with an explicit tar xvf -. Again, for the paranoid.

import platform, sysconfig, io, zipfile, urllib.request
assert platform.system() == "Windows"
machine = "x86" if sysconfig.get_platform() == "win32" else "x64"
url = f"https://github.com/julian-r/file-windows/releases/download/v5.44/file_5.44-build104-vs2022-{machine}.zip"

This comment was marked as resolved.

@ddelange
Copy link
Contributor Author

@ahupp this also fixes failing CI on master (looks like the github actions linux runner image no longer ships libmagic by default)

@ashtonpaul
Copy link

@ahupp @ddelange Any update on this PR and the release? 👀 Is there anything that needs to be tested or are we waiting on anything other than a review?

@ddelange
Copy link
Contributor Author

ddelange commented Sep 2, 2024

@ahupp please review, merge, add trusted publisher (see PR description) and release 🙏

@wilhelmklopp
Copy link

Also would love to see this merged, it's causing some issues for windows users of our package (which depends on mocket which depends on python-magic). Thank you! 😄

@cclauss
Copy link
Contributor

cclauss commented Oct 18, 2024

@ashtonpaul @ddelange @wilhelmklopp @ferdnyc #294 (comment)

@ferdnyc
Copy link

ferdnyc commented Oct 21, 2024

@ahupp Speaking personally, this is a big, complex PR that I don't really feel qualified to approve given the scope and nature of all the changes here. I guess I'm in agreement with your #294 (comment)

Perhaps it would be useful to break it into three separate PRs that are easier to review and land.

One PR that deals only with macOS and another that deals only with Windows might be easier to land. Once that is done then this PR could be rebased to deal only with Linux and friends. I know it is extra work but I sense that new momentum is needed.

OTOH, the PR has received a number of approvals as-is, so there's ample indication that others are satisfied with it in its current form. I left some comments, which were addressed, so my input certainly shouldn't be viewed as disapproval by any means.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet