Surfacing all bootstrap runfiles in PyExecutableInfo #3745

# 🚀 feature request

### Relevant Rules

Add a runfiles object containing all runfiles of the (stage2) bootstrapper to `PyExecutableInfo` (affects `py_binary` and `py_test`).

### Description

We have a custom `collect_layers` rule that takes a `py_binary` and produces output groups, each mapping to a layer of an OCI image built by downstream rules. Most of what we need is captured by `PyRuntimeInfo` (for the interpreter) and `PyExecutableInfo.app_runfiles`. However, we would like to split third-party deps/runfiles (from PyPI) and our own code into separate layers for optimization. At the moment, we effectively rebuild depsets similar to `app_runfiles` via an `aspect` that traverses the build graph, which lets us do the split efficiently without [flattening the depset](https://bazel.build/rules/performance#avoid-depset-to-list).
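For illustration, the aspect-based split described above could look roughly like the sketch below. This is not our actual implementation: `LayerFilesInfo`, `layer_files_aspect`, and the body of `collect_layers` are hypothetical, and it assumes that any target living in an external repository counts as third-party.

```starlark
# Hypothetical names throughout; only the Bazel built-ins are real APIs.
LayerFilesInfo = provider(
    doc = "Transitive files split by origin (illustrative only).",
    fields = {
        "first_party": "depset[File] from targets in the main repository",
        "third_party": "depset[File] from targets in external repositories",
    },
)

def _layer_files_aspect_impl(target, ctx):
    # Files this target contributes directly. These are plain lists, so no
    # depset is expanded here.
    direct = getattr(ctx.rule.files, "srcs", []) + getattr(ctx.rule.files, "data", [])

    # Assumption: a non-empty repository name means "third-party",
    # e.g. a wheel fetched from PyPI.
    is_third_party = target.label.workspace_name != ""

    deps = getattr(ctx.rule.attr, "deps", [])
    infos = [dep[LayerFilesInfo] for dep in deps if LayerFilesInfo in dep]

    return [LayerFilesInfo(
        first_party = depset(
            direct = [] if is_third_party else direct,
            transitive = [i.first_party for i in infos],
        ),
        third_party = depset(
            direct = direct if is_third_party else [],
            transitive = [i.third_party for i in infos],
        ),
    )]

layer_files_aspect = aspect(
    implementation = _layer_files_aspect_impl,
    attr_aspects = ["deps"],
)

def _collect_layers_impl(ctx):
    info = ctx.attr.binary[LayerFilesInfo]

    # Each output group maps to one image layer downstream.
    return [OutputGroupInfo(
        first_party = info.first_party,
        third_party = info.third_party,
    )]

collect_layers = rule(
    implementation = _collect_layers_impl,
    attrs = {
        "binary": attr.label(aspects = [layer_files_aspect]),
    },
)
```

Because each depset is assembled from the dependencies' depsets via `transitive`, nothing needs `to_list()`. The downside is that files that only appear in `app_runfiles` (such as the bootstrap files discussed below) are invisible to this traversal.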

https://github.com/bazel-contrib/rules_python/issues/3324 added extra fields to identify the `stage2_bootstrap`, but the additional files the stage2 bootstrapper needs to work correctly (like the venv site-packages files `_bazel_site_init.pth` and `bazel_site_init.py`) are still only surfaced via `PyExecutableInfo.app_runfiles`. This makes it impossible to add them to our output groups without flattening that depset, which is not desirable.

### Describe the solution you'd like

It would be great if `PyExecutableInfo` could expose a `stage2_runfiles` field (for stage2 specifically) or a `bootstrap_runfiles` field (for all bootstrap-related files) that includes any additional runfiles the bootstrapper needs.
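As a rough sketch of how a consumer might use such a field: the `bootstrap_runfiles` name and its type (a `runfiles` object, like the existing `app_runfiles`) are the proposal here, not something rules_python provides today, and the `bootstrap_layer` rule is hypothetical.

```starlark
load("@rules_python//python:py_executable_info.bzl", "PyExecutableInfo")

def _bootstrap_layer_impl(ctx):
    exe_info = ctx.attr.binary[PyExecutableInfo]

    # PROPOSED field, does not exist yet. Assumed to be a `runfiles` object,
    # so `.files` is a depset[File] that can be forwarded without flattening.
    bootstrap_files = exe_info.bootstrap_runfiles.files

    return [OutputGroupInfo(
        bootstrap = bootstrap_files,
    )]

# Hypothetical rule name, for illustration only.
bootstrap_layer = rule(
    implementation = _bootstrap_layer_impl,
    attrs = {
        "binary": attr.label(providers = [PyExecutableInfo]),
    },
)
```

Note that `runfiles.files` does not cover entries registered as `symlinks` or `root_symlinks`, so a real consumer may need to handle those separately if the venv is modeled that way.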

### Describe alternatives you've considered

At the moment, these "extra runfiles" are already modeled to some extent in code as `files_without_interpreter`. We surface them via a targeted patch of the provider, like so:
```patch
# This patch is for rules_python 1.9.0, as of time of writing
diff --git python/private/py_executable.bzl python/private/py_executable.bzl
index 284aea6b..95bb4676 100644
--- python/private/py_executable.bzl
+++ python/private/py_executable.bzl
@@ -491,6 +491,7 @@ WARNING: Target: {}
         app_runfiles = app_runfiles.build(ctx),
         # File|None; the venv `bin/python3` file, if any.
         venv_python_exe = venv.interpreter if venv else None,
+        files_without_interpreter = venv.files_without_interpreter if venv else None,
     )
 
 def _create_zip_main(ctx, *, stage2_bootstrap, runtime_details, venv):
@@ -1094,6 +1095,7 @@ def py_executable_base_impl(ctx, *, semantics, is_test, inherited_environment =
         app_runfiles = app_runfiles,
         venv_python_exe = exec_result.venv_python_exe,
         interpreter_args = ctx.attr.interpreter_args,
+        files_without_interpreter = exec_result.files_without_interpreter,
     )
 
 def _get_build_info(ctx, cc_toolchain):
@@ -1639,7 +1641,8 @@ def _create_providers(
         stage2_bootstrap,
         app_runfiles,
         venv_python_exe,
-        interpreter_args):
+        interpreter_args,
+        files_without_interpreter):
     """Creates the providers an executable should return.
 
     Args:
@@ -1700,6 +1703,7 @@ def _create_providers(
             app_runfiles = app_runfiles,
             venv_python_exe = venv_python_exe,
             interpreter_args = interpreter_args,
+            files_without_interpreter = files_without_interpreter,
         ),
     ]
 
diff --git python/private/py_executable_info.bzl python/private/py_executable_info.bzl
index defbd3a0..c1b7592a 100644
--- python/private/py_executable_info.bzl
+++ python/private/py_executable_info.bzl
@@ -87,5 +87,6 @@ mode is not enabled.
 :::{versionadded} 1.9.0
 :::
 """,
+        "files_without_interpreter": "Extra files required by the stage2 bootstrapper",
     },
 )
```

At the moment, we still rely on `bootstrap_impl=system_python`, but "hack" the shebang to point to the interpreter in runfiles (the interpreter is the very first layer in the image, before the other two).

I believe the `venv/bin/python3` symlink is also required by the stage1 bootstrap when using `bootstrap_impl=script`. If so, it might be nicer to model this as a generic `bootstrap_runfiles` field in which all bootstrap runfiles could live (so _everything_ under venv plus the stage2 bootstrap file itself)?
