Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support publishing with static python dependency #720

Open
wants to merge 24 commits into
base: v0.10.x
Choose a base branch
from

Conversation

isc-shuliu
Copy link
Collaborator

@isc-shuliu isc-shuliu commented Jan 30, 2025

See #698

This PR adds the ability to include python wheels in a package. Developers should be responsible for creating their own wheel and adding a <PythonWheel /> entry in module.xml.

This PR also supports pure offline installation of the oras python package and its dependencies.

Even if pip install of the wheel fails, the installation of IPM still continues.

@isc-tleavitt isc-tleavitt linked an issue Feb 10, 2025 that may be closed by this pull request
@isc-tleavitt
Copy link
Contributor

isc-tleavitt commented Feb 10, 2025

Discussed with @isc-shuliu today. Plan is:

  • Have an optional flag during package phase (also used during IPM installer package creation - see make-release.sh) to create a wheel based on requirements.txt for the package.
  • On install of a package artifact (or source branch), see if a .whl exists. If it does, try to install it rather than loading from requirements.txt. If installation fails, fall back to requirements.txt. In the case of ORAS, as a pure python dependency, we don't need to worry about cross-OS/architecture concerns; in general we might need to and may revisit this as an additional set of dimensions (along with IRIS version for deployed code) for artifact publication and discovery.
  • Ensure that IPM installer package creation cleanly regenerates the wheel

@isc-tleavitt
Copy link
Contributor

Important caveat here for IPM itself: we should be able to say that installation will continue without error (and without installing the wheels) if pip isn't available.

Developers should be responsible for placing python dependencies manually in the wheel files (or requirements.txt)
@isc-shuliu
Copy link
Collaborator Author

So far, Publish works with wheel files correctly bundled in the exported item, but $System.OBJ.Load("/path/to/xml", "ck") is failing because some (not all) of the wheels cannot be installed via the $zf(-100) call.

After testing locally, the following packages have been successfully installed:

  • attrs
  • charset-normalizer
  • idna
  • jsonschema-specifications
  • referencing

While these fails:

  • certifi
  • jsonschema
  • oras
  • requests
  • typing_extensions
  • urllib3

Will take a deeper look

@isc-tleavitt
Copy link
Contributor

@isc-shuliu another thing to test here will be upgrading from a version of IPM without support - I think this should be OK but we need to double check.

@isc-shuliu
Copy link
Collaborator Author

isc-shuliu commented Feb 18, 2025

It appears the reconstructed wheel files from extracting the artifact xml are different from the original wheel files.

For example, the original /wheels/certifi-2025.1.31-py3-none-any.whl has an md5 hash of

c205c7808745af1661fb0fba4df49238

But after bundled into zpm.xml and extracted during $system.OBJ.Load(), the md5 of the same file becomes

8190678a878b9747bbbe34ad7388bd37

This causes pip's failure to read the wheel file. Specifically, in line 749 of the library module /usr/lib/python3.10/zipfile.py, it tries to call self._file.seek(self._pos) where self._pos is a negative number -4, which resulted in an error in the underlying syscall. On my container, it returns an OS error code of 22 -- Invalid Argument.

But strangely, although the md5 changed, the wheel file remained zip-extractable as before!

@isc-tleavitt Something must be going on when creating and/or extracting zpm.xml that caused the wheel file to change.

@isc-tleavitt
Copy link
Contributor

@isc-shuliu I'm taking a look.

@isc-shuliu
Copy link
Collaborator Author

isc-shuliu commented Feb 18, 2025

@isc-tleavitt
I converted the binary files to ascii (by printing 1 byte a line using hexadecimal 0-9, A-F), and it turns out that at the 4 following lines, the byte 0D or 00001101 were missing 4 times. This would explain the -4 in file.seek()

$ wc -l original
  166393 original

$ wc -l reconstructed
  166389 reconstructed

$ diff original reconstructed
87806d87805
< 0D
113708d113706
< 0D
149351d149348
< 0D
159900d159896
< 0D

I wonder if there's a (sort-of) alignment issue here

@isc-tleavitt
Copy link
Contributor

irisowner@85ad414f9f53:/tmp/dirdvZ2Sd/wheels$ cmp -l jsonschema-4.23.0-py3-none-any.whl /home/irisowner/zpm/wheels/jsonschema-4.23.0-py3-none-any.whl | gawk '{printf "%08X %02X %02X\n", $1, strtonum(0$2), strtonum(0$3)}' | more 
000005F9 0A 0D
000005FA BF 0A
000005FB 7B BF
000005FC 19 7B
000005FD 9C 19
000005FE 9E 9C
000005FF 5C 9E

Somewhere we're treating these as text files and putting Windows-style line endings on there!?

@isc-tleavitt
Copy link
Contributor

... ok, we were doing that in parallel. Similar finding regardless and your way is better. :)

@isc-shuliu
Copy link
Collaborator Author

Yes, it seems we are converting CR LF to only LF even in binary files. Therefore, only some of the wheel files are affected. I assume those unaffected wheels don't have consecutive CR LF bytes in them.

@isc-tleavitt
Copy link
Contributor

@isc-shuliu I think I got it... let's see how this CI does

@isc-tleavitt
Copy link
Contributor

Re: 7fa5d33 good catch... fixed that in my exported installer XML and forgot to change the source

@isc-shuliu isc-shuliu marked this pull request as ready for review February 18, 2025 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bootstrapping IPM from artifact ignores requirements.txt
2 participants