fix(py_image_layer): resolve symlinks in mtree specs to preserve them in tar layers (#567) #892
I'll leave it to @thesayyn, who has deeper context on this than I do, to really review this change, which on the surface looks great. My one comment is that rules_py is moving towards putting as much as possible in the e2e tests, both for test organization and to isolate dependencies from the main repo. I'd ask that you shuffle (or have Claude shuffle) this test case into an e2e regression test case dir bearing the #567 issue suffix, as with the others.
One comment here is that I elsewhere found that I wanted to do the same thing (resolve symlinks) for a Go image rule. I wonder if it’s worth moving the dedup by symlinks logic into […]
@mrdomino, in most cases Bazel is aware of symlinks in its output (e.g. they're declared with […]). In this case, we download/extract the toolchain (Python) as-is, and thus Bazel has never been told what's a symlink and what's a regular file. As such, I believe the solution strategies should be different in those two cases. Additionally, please note that there's an experimental […]
Thanks for the comment :)

mtree_spec records every file as type=file even when files are symlinks (e.g. python/python3 -> python3.13 in the Python toolchain), causing tars to contain multiple full copies instead of lightweight links.
Add a _resolve_symlinks rule that post-processes mtree specs using a two-step readlink to detect real filesystem symlinks and convert them to type=link entries. Only relative symlink targets are converted; absolute targets are left as-is to avoid breaking cross-repo references.
Fixes: #567
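
The post-processing described above can be illustrated with a minimal Python sketch. It is not the PR's `_resolve_symlinks` rule: the function name `resolve_symlinks` is hypothetical, the spec lines are assumed to be whitespace-separated `keyword=value` fields with a `content=` path that exists on disk, and a single `os.readlink` stands in for the two-step readlink, whose details are not given here.

```python
import os


def resolve_symlinks(spec_lines):
    """Rewrite type=file entries whose content= path is a relative symlink.

    Hypothetical helper: assumes each line is "<path> key=value ..." with
    no escaped whitespace inside the content= value.
    """
    out = []
    for line in spec_lines:
        fields = line.split()
        # The first field is the path inside the archive; the rest are keywords.
        content = next(
            (f[len("content="):] for f in fields[1:] if f.startswith("content=")),
            None,
        )
        # Leave directories, comments, and entries backed by regular files untouched.
        if "type=file" not in fields or content is None or not os.path.islink(content):
            out.append(line)
            continue
        target = os.readlink(content)
        # Absolute targets are left as-is, matching the behaviour described above.
        if os.path.isabs(target):
            out.append(line)
            continue
        # Drop type=file and content=, then emit a link entry pointing at the target.
        kept = [f for f in fields if f != "type=file" and not f.startswith("content=")]
        out.append(" ".join(kept + ["type=link", "link=" + target]))
    return out
```

With an entry such as `bin/python3 ... type=file content=<path to a python3 -> python3.13 symlink>`, the sketch emits `bin/python3 ... type=link link=python3.13`, so the tar carries one copy of the interpreter plus a link instead of two full copies.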
Changes are visible to end-users: yes
Fix py_image_layer interpreter layer bloat by preserving filesystem-level symlinks (e.g. python -> python3.13), reducing the layer size by ~60%.
Test plan