py_binary rules default to importing non-sandboxed code #7091

@FuegoFro

Description of the problem / feature request:

Python (2.7.15 in this case) populates sys.path[0] by resolving the script path it is given to its real path (following symlinks) and then using the directory containing that real path. This undermines Bazel's sandboxing of Python files via symlinks, and the runfiles tree in general, because the original source directory, rather than the runfiles directory, becomes the first entry in sys.path.

More info on the method of populating sys.path[0] can be found in these discussions:
https://bugs.python.org/issue6386
https://bugs.python.org/issue17639
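
The symlink-following behavior described above can be observed without Bazel. The following is a minimal sketch, not part of the original report: the directory names `source_tree` and `runfiles` and the helper name are made up for illustration, and it assumes a platform where `os.symlink` works without extra privileges. It is written to run on both Python 2.7 and Python 3.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def sys_path0_for_symlinked_script():
    """Run a script through a symlink and report what it sees as sys.path[0].

    Returns (reported sys.path[0], real script dir, symlink dir).
    """
    root = os.path.realpath(tempfile.mkdtemp())
    try:
        # Hypothetical layout: "source_tree" stands in for the checkout,
        # "runfiles" for the symlink tree that Bazel builds.
        real_dir = os.path.join(root, "source_tree")
        link_dir = os.path.join(root, "runfiles")
        os.mkdir(real_dir)
        os.mkdir(link_dir)

        real_script = os.path.join(real_dir, "main.py")
        with open(real_script, "w") as f:
            f.write("import sys\nsys.stdout.write(sys.path[0])\n")

        linked_script = os.path.join(link_dir, "main.py")
        os.symlink(real_script, linked_script)

        # Newer Pythons can suppress sys.path[0] via PYTHONSAFEPATH; clear it
        # so the default behavior is what gets observed.
        env = dict(os.environ)
        env.pop("PYTHONSAFEPATH", None)

        out = subprocess.check_output([sys.executable, linked_script], env=env)
        return out.decode().strip(), real_dir, link_dir
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

Calling `sys_path0_for_symlinked_script()` should return a first element that matches the `source_tree` directory rather than the `runfiles` directory, which is the same effect the repro below shows under Bazel.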

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Init a fresh git repo, git apply this patch, and run bazel run py:test

diff --git a/WORKSPACE b/WORKSPACE
new file mode 100644
index 0000000..e69de29
diff --git a/py/BUILD b/py/BUILD
new file mode 100644
index 0000000..ec84ab9
--- /dev/null
+++ b/py/BUILD
@@ -0,0 +1,6 @@
+py_binary(
+        name = 'test',
+        srcs = glob(['**/*.py'], exclude=['foo/baz.py']),
+        imports = [''],
+        main = 'main.py',
+)
diff --git a/py/foo/__init__.py b/py/foo/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/py/foo/bar.py b/py/foo/bar.py
new file mode 100644
index 0000000..e69de29
diff --git a/py/foo/baz.py b/py/foo/baz.py
new file mode 100644
index 0000000..e69de29
diff --git a/py/main.py b/py/main.py
new file mode 100644
index 0000000..5c21a2d
--- /dev/null
+++ b/py/main.py
@@ -0,0 +1,10 @@
+import sys
+# sys.path = sys.path[1:]  # Uncommenting this line gives us the expected behavior
+print sys.path
+
+import foo.bar
+print foo.bar.__file__
+
+# This shouldn't work!
+import foo.baz
+print foo.baz.__file__

Note that when sys.path is printed out, the first entry points into the original source tree, not the Bazel output directory. Also note that py/foo/baz.py isn't included in the target but can still be imported. Manually dropping the first entry of sys.path at the beginning of main.py yields the expected behavior.

One solution to this is to copy the main file rather than symlink it. There are likely other solutions (maybe something involving a .pth file?), but I wasn't able to determine whether this symlink-following behavior can be turned off in Python (it seems not, given the discussions I linked to earlier).
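
As a stopgap inside user code, the manual trimming mentioned above can be made slightly more defensive. This is only a sketch under stated assumptions, not something Bazel provides: the helper name `drop_resolved_script_dir` is hypothetical, and it assumes the path passed in as `script_path` is still the unresolved symlink (for example `sys.argv[0]`, if the launcher leaves it untouched).

```python
import os
import sys


def drop_resolved_script_dir(path_entries, script_path):
    """Return a copy of path_entries without a leading symlink-resolved script dir.

    The first entry is dropped only when script_path is a symlink whose
    target lives in a different directory and that target directory is
    exactly what the first entry points at.
    """
    entries = list(path_entries)
    if not entries or not os.path.islink(script_path):
        return entries

    invoked_dir = os.path.realpath(os.path.dirname(os.path.abspath(script_path)))
    resolved_dir = os.path.dirname(os.path.realpath(script_path))
    first_entry = os.path.realpath(entries[0])

    if invoked_dir != resolved_dir and first_entry == resolved_dir:
        return entries[1:]
    return entries
```

A possible use at the very top of main.py, before any other imports, would be `sys.path[:] = drop_resolved_script_dir(sys.path, sys.argv[0])`. This only hides the symptom in one binary; the rule itself would still place the source directory on sys.path for any script that lacks the guard.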

What operating system are you running Bazel on?

macOS 10.14.2

What's the output of bazel info release?

release 0.20.0-homebrew
(though I also repro'd this with non-Brew 0.21.0 and Brew 0.17.2)

Have you found anything relevant by searching the web?

Python issues listed above, regarding sys.path[0]:
https://bugs.python.org/issue6386
https://bugs.python.org/issue17639
Bazel issue regarding symlinking to runfiles:
#4022
Commit handling runfiles symlinks on Windows:
#6036

Labels: P3 (We're not considering working on this, but happy to review a PR. No assignee), team-Rules-Python (Native rules for Python), type: bug
