🚀 feature request
Relevant Rules
Add a runfiles object containing all runfiles of the (stage2) bootstrapper to PyExecutableInfo (affects py_binary and py_test).
Description
We have a custom collect_layers rule that takes a py_binary and produces output groups, each mapping to a layer of an OCI image that is built by downstream rules. Most of what we need is captured by PyRuntimeInfo (for the interpreter) and PyExecutableInfo.app_runfiles. However, we would like to split third-party deps/runfiles (from PyPI) and our own code into separate layers for optimization. At the moment, we effectively rebuild depsets similar to app_runfiles, using an aspect that traverses the build graph so that we can do this efficiently without flattening the depset.
#3324 added extra fields to identify the stage2_bootstrap, but additional files that the stage2 bootstrapper needs to work correctly (like the venv site-packages files _bazel_site_init.pth and bazel_site_init.py) are still only surfaced via PyExecutableInfo.app_runfiles. This makes it impossible to add them to my output groups without flattening that depset, which is undesirable.
Describe the solution you'd like
It would be great if PyExecutableInfo exposed a stage2_runfiles field (for stage2 specifically) or a bootstrap_runfiles field (for all bootstrap-related files) that includes any additional runfiles the bootstrapper needs.
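To illustrate how a downstream rule might consume such a field, here is a minimal sketch. It assumes the proposed bootstrap_runfiles field would be a runfiles object like app_runfiles. The field does not exist in rules_python today, and the rule name, attribute name, and output group names are purely illustrative; the load path is the public one as far as I know.

```starlark
load("@rules_python//python:py_executable_info.bzl", "PyExecutableInfo")

def _collect_layers_impl(ctx):
    info = ctx.attr.binary[PyExecutableInfo]

    # ASSUMPTION: `bootstrap_runfiles` is the field proposed in this issue.
    # It is not part of PyExecutableInfo today, hence the getattr fallback.
    bootstrap = getattr(info, "bootstrap_runfiles", None)
    bootstrap_files = bootstrap.files if bootstrap != None else depset()

    return [
        OutputGroupInfo(
            # Existing field; the depset is passed through unflattened.
            app_layer = info.app_runfiles.files,
            # Hypothetical: bootstrap-only files as their own layer.
            bootstrap_layer = bootstrap_files,
        ),
    ]

# Hypothetical rule; our real collect_layers does more than this.
collect_layers = rule(
    implementation = _collect_layers_impl,
    attrs = {
        "binary": attr.label(
            providers = [PyExecutableInfo],
            mandatory = True,
        ),
    },
)
```

The point is that the bootstrap files would arrive as their own depset, so they could be placed in a separate output group without ever calling to_list() on app_runfiles.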
Describe alternatives you've considered
These "extra runfiles" are already modeled to some extent in the code as files_without_interpreter; for now, we surface them via a targeted patch of the provider, like so:
```diff
# This patch is for rules_python 1.9.0, as of time of writing
diff --git python/private/py_executable.bzl python/private/py_executable.bzl
index 284aea6b..95bb4676 100644
--- python/private/py_executable.bzl
+++ python/private/py_executable.bzl
@@ -491,6 +491,7 @@ WARNING: Target: {}
         app_runfiles = app_runfiles.build(ctx),
         # File|None; the venv `bin/python3` file, if any.
         venv_python_exe = venv.interpreter if venv else None,
+        files_without_interpreter = venv.files_without_interpreter if venv else None,
     )
 
 def _create_zip_main(ctx, *, stage2_bootstrap, runtime_details, venv):
@@ -1094,6 +1095,7 @@ def py_executable_base_impl(ctx, *, semantics, is_test, inherited_environment =
         app_runfiles = app_runfiles,
         venv_python_exe = exec_result.venv_python_exe,
         interpreter_args = ctx.attr.interpreter_args,
+        files_without_interpreter = exec_result.files_without_interpreter,
     )
 
 def _get_build_info(ctx, cc_toolchain):
@@ -1639,7 +1641,8 @@ def _create_providers(
         stage2_bootstrap,
         app_runfiles,
         venv_python_exe,
-        interpreter_args):
+        interpreter_args,
+        files_without_interpreter):
     """Creates the providers an executable should return.
 
     Args:
@@ -1700,6 +1703,7 @@ def _create_providers(
             app_runfiles = app_runfiles,
             venv_python_exe = venv_python_exe,
             interpreter_args = interpreter_args,
+            files_without_interpreter = files_without_interpreter,
         ),
     ]
 
diff --git python/private/py_executable_info.bzl python/private/py_executable_info.bzl
index defbd3a0..c1b7592a 100644
--- python/private/py_executable_info.bzl
+++ python/private/py_executable_info.bzl
@@ -87,5 +87,6 @@ mode is not enabled.
 :::{versionadded} 1.9.0
 :::
 """,
+        "files_without_interpreter": "Extra files required by the stage2 bootstrapper",
     },
 )
```
At the moment, we still rely on bootstrap_impl=system_python, but "hack" the shebang to point to the interpreter in runfiles (the interpreter is the very first layer in the image, before the other two).
I believe that the venv/bin/python3 symlink is also required for the stage1 bootstrap if one were to use bootstrap_impl=script. If so, it might be nicer to model this as a generic bootstrap_runfiles field in which all bootstrap runfiles could live (so everything under venv, plus the stage2 bootstrap file itself)?