-
-
Notifications
You must be signed in to change notification settings - Fork 659
exp(pypi): store necessary facts fetched from SimpleAPI #3559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This allows us to cache what we store from the SimpleAPI in between the runs.
Summary of ChangesHello @aignas, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the efficiency and reproducibility of PyPI package resolution within Bazel by implementing a caching mechanism for SimpleAPI metadata. By storing essential 'facts' about distributions (such as filenames, hashes, and yanked status) directly in the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a caching mechanism for SimpleAPI lookups using Bazel's facts feature, which is a significant improvement for performance on subsequent runs. The implementation involves a substantial refactoring of the download and parsing logic to be more modular and to better support asynchronous operations through a new _await helper. The changes are well-structured, particularly in how facts are managed to be compact for storage in the lockfile. My feedback focuses on improving documentation and code clarity in a few areas.
| """ | ||
|
|
||
| def parse_simpleapi_html(*, url, content): | ||
| def parse_simpleapi_html(*, url, content, distribution = None, return_absolute = True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few TODOs in the docstrings for new and modified functions in this file that should be addressed:
parse_simpleapi_html(lines 24, 26): Thedistributionandreturn_absoluteparameters need descriptions.pkg_version(lines 129, 130): Thefilenameanddistributionparameters need descriptions.
Please fill these in to improve the documentation.
|
|
||
| def _read_index_result(ctx, result, output, url, cache, cache_key): | ||
| if not result.success: | ||
| def _read_simpleapi(ctx, index_url, pkg, attr, cache, get_auth = None, return_absolute = True, **download_kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few TODOs in the docstrings for new functions in this file that should be addressed:
_read_simpleapi(line 282): Thereturn_absoluteparameter needs a description._read_simpleapi_with_facts(lines 342-343): Theindex_urlanddistributionparameters need descriptions.
Please fill these in to improve the documentation.
| sha256s_by_version = sha256s_by_version, | ||
| ) | ||
|
|
||
| def _facts(ctx, store_facts, facts_version = _FACT_VERSION): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter name store_facts in this function is a bit confusing as it shadows the boolean flag with the same name in the simpleapi_download function. Here it refers to the dictionary that will be populated with facts. Consider renaming it to something like facts_dict or facts_to_populate to improve clarity and avoid ambiguity.
This allows us to cache what we store from the SimpleAPI in between the
runs.
Summary:
awaithelper to handle parallelismbetter when we need to wait for the download of the Simple API contents and
then do some processing.
instead of everything that we find on the index.
TODO:
pip.defaultAPI for the root module only.Fixes #2731