-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Avoid installing test dependencies in system python path. NFC #25401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
958ae17 to
0bc31ad
Compare
a73b170 to
5ac23bf
Compare
|
Previously I have been able to test end user setup by leaving out installing the dev dependencies. But because bootstrap is mandatory, and after this PR it will install python dev dependencies, it looks like I will need to develop some kind of delete step to remove the dev dependency packages from Python for testing and shipping. This would be a divergence between node.js vs python, where we don't install node.js dev dependency packages via boostrap either? Also, on Linux e.g. on Debian where |
| __rootpath__ = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) | ||
| sys.path.insert(0, __rootpath__) | ||
| # Add `out/python_deps` to ensure that we can import our dev dependencies. | ||
| sys.path.insert(0, os.path.join(__rootpath__, 'out', 'python_deps')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, isn't this causing a divergence between "what is tested" vs "what user has installed"?
I.e. before this PR, when running the test suite, it would test the user's Python installation, which could be for example the Python we ship via emsdk?
But after this change, the Emscripten test runner would stop testing the package setup from user's Python installation, and start testing a local sandboxed Python out/python_deps package path that is set up just for local testing purposes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only things out/python_deps are the packages listed in requirements-dev.txt.
The emscripten compiler itself does not depend on any of these. In fact it has zero dependencies on non-standard packages.
On advantage of this change is that we no longer install our test-only/dev-only packages in the system python (or in the emsdk python) which means that when we run out test there is no longer any risk that emscripten itself will accidentally depend on one of the dev dependencies.
I designed this change specifically with your workflow in mind. IIRC the reason you didn't want to install the dev (test) dependencies was that you were worried about emscripten accidenatlly depending on these. With this system that risk has been removed. The emscripten compiler code itself cannot see or use these dependencies.
The bootstrap script does install the node dev dependencies. The place were with using
Indeed, and we are not installing anything system-wide install. Everything is being put in That is another advantage of this PR is that now we don't depend on installing anything in the system python path. |
pip install as part of bootstrap.py|
I re-titled, and updated the description to be a little bit more explicit about the intent here. |
|
Anther potential upside here is that we add new test dependencies without effecting the distributed version of emscripten, or introducing new dependencies for our users. For example, we could starting using new packages in our python test code such that could improve the testing experience. Right now we are artificially limited in the packages we can use in our test framework because we want to avoid new product dependencies. This change makes a clear separation and allows us to move forward without risking new deps for our users. |
Also, install pip packages in local `out/python_deps` directory instead of installing system wide. This means we can consistently expect dev dependencies to be available in test code without needing to include an opt out mechanism. We were already doing this for `psutil`, but for `websockify` we made it optional. This means that only python scripts that explicitly add `out/python_deps` to their python path will be able use the packages, and in particular it means that the emscripten compiler itself won't end up implicitly/accidentally depending on them.
Thanks, that is much appreciated.
I suppose the question I have is - how do we test and verify that this will remain to be the case? Since it is not possible to run tests without running bootstrap, then it is no longer possible to launch the test runner in a mode that does not have these dependencies installed? What I would like is to keep running the Emscripten test suite in a mode that does not require dev dependencies. Those tests prove that Emscripten does not depend on unwanted packages. I.e. even if we would statically say that Emscripten does not depend on dev packages, it would not automatically mean that we would be testing to verify that it does not depend on those packages? So we wouldn't/couldn't catch an error if that assumption regresses?
That does not (fortunately) seem to be the case. Or at least if one is installing via emsdk. It is important for us to be able to do a bootstrap that does not install dev dependencies. |
The idea is that the test runner has a hard dependency on these dev packages, but emscripten itself does not.
The emscripten compiler itself is never run with I suppose its possible that someone could defeat that by adding |
|
I added some assertion to emcc.py to ensure that the compiler itself doesn't run with those python libs in its path. |
5ac23bf to
996aa63
Compare
I guess this is the part that I feel off about. Currently there does not exist a hard dependency in the test runner to dev packages, and so I can test the majority of the test suite using the end user python configuration. If we require a hard dependency to dev packages to be installed when testing, then we lose guarantee that we are testing the end user configuration. In our packaging process on the CI, what I am currently doing is:
It is beneficial to do this testing without dev dependencies installed. That ensures that the tested Python setup matches the production one. I suppose I could introduce a second delete pass between steps 4 and 5 to delete the dev dependencies from running Python unit tests (nuke
What I worry is that not all tests start with invoking production Or another way an issue might occur is if test Python has a PYTHON_PATH, that launching a child tool (not necessarily just emcc, but maybe other .py subtool), might inherit the PYTHON_PATH in env, and leak the subprocess launch to use the dev dependencies. Installing the python dev dependencies to |
I'm not sure what you mean by this. 99% of the test run
There are two reason I think this cannot happen:
|
Its seems like we already have hard dependency on |
I think you certainly want to remove the |
Both pywin32 and psutil are production dependencies as per https://github.com/emscripten-core/emsdk/blob/7b4e60e4bfcba326025e373024369eaa9904af55/scripts/update_python.py#L77-L78 . This has been the case for a very long time because the
And the remaining 1% are unlabeled, which we don't have a good grasp of which would be subject to a possible regression. Currently the known discrepancies are explicitly tracked and flagged with the By construction, we know that the tests will run on the very same Python setup that goes out to the end users. The two points you mention, require tacit knowledge. For example, I did not know about the It looks for example that this test is not passing the It seems simpler to set up Python the way that end users will have it, and then test that? Then if there are dev packages that add extra, those would be managed with |
But on linux we won't have python package in emsdk. How does it work on linux? I guess the OS python always supplies this? The emun usage of |
But this construction is very new and only exists in your CI. The emscripten CI on github and emsripten-releases bots have always run with dev dependencies installed. In all the years we have been doing that we have not (as far as I remember) run into an issue where the compiler code ended up accidentally depending on a dev package. But I agree it is possible, which is why I'm going to all these lengths to make it close to impossible. I think the mitigations in this PR against this happening are very strong. In summary, I think we have low risk and high level of mitigation. |
|
BTW I hope is that this PR will bring the emscripten CI and emscripten-releases builders closer to your CI, in that they will no longer run dev dependencies visible to the compiler code. |
|
I suppose we can land this change without removing the opt-out, and then we and continue to discuss the opt out separately. |
|
That sounds good, if you have the cycles to slice the PR. |
Instead we use pip to install the dev dependencies in
out/python_depsdirectory.This means that the emscripten compiler itself does not have access to them,
only the test code which explictly adds this path.
This means we can perform the installation as part of the bootstrap script
(just like we already do for node dev dependencies) which in turn means
we can consistently rely on dev dependencies to be available in test code
(without needing to include an opt out mechanism).