-
Notifications
You must be signed in to change notification settings - Fork 24
Source Links Fix #440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Source Links Fix #440
Changes from all commits
bde02c2
a898eee
c2861cd
fb5bdc5
8ea7417
f818cf2
adeae4f
b92bf6e
5a08e5d
ce31387
c573b2f
9d87c2c
f50ef0c
976c2ec
d094914
0d1e490
7cda52e
1d74ad0
ddfa34c
6230c47
a8ce766
740b019
6915283
a836298
133a44d
748467d
ae5969d
78b0095
9c35436
fdfeb2b
8c7aee6
9473852
feb9863
2d58de2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -25,25 +25,43 @@ | |||||||||||||||
| from src.extensions.score_source_code_linker.generate_source_code_links_json import ( | ||||||||||||||||
| _extract_references_from_file, # pyright: ignore[reportPrivateUsage] TODO: move it out of the extension and into this script | ||||||||||||||||
| ) | ||||||||||||||||
| from src.extensions.score_source_code_linker.helpers import parse_repo_name_from_path | ||||||||||||||||
| from src.extensions.score_source_code_linker.needlinks import ( | ||||||||||||||||
| store_source_code_links_json, | ||||||||||||||||
| DefaultMetaData, | ||||||||||||||||
| store_source_code_links_with_metadata_json, | ||||||||||||||||
| ) | ||||||||||||||||
|
|
||||||||||||||||
| logging.basicConfig(level=logging.INFO, format="%(message)s") | ||||||||||||||||
| logger = logging.getLogger(__name__) | ||||||||||||||||
|
|
||||||||||||||||
|
|
||||||||||||||||
| def clean_external_prefix(path: Path) -> Path: | ||||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is this still here? With our new design, this should not be necessary.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is neccessary. Otherwise the filename will be wrong that is passed to the extraction function.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wonder if this can be used in the testlink cleaning too, or if they are too different. |
||||||||||||||||
| """ | ||||||||||||||||
| As it can be in combo builds that the path is: | ||||||||||||||||
| `external/score_docs_as_code+/....` or similar | ||||||||||||||||
| We have to remove this prefix from the file | ||||||||||||||||
| before we pass it to the extract function. Otherwise we have | ||||||||||||||||
| this prefix inside the `file` attribute leading to wrong links | ||||||||||||||||
| """ | ||||||||||||||||
| if not str(path).startswith("external/"): | ||||||||||||||||
| return path | ||||||||||||||||
| # This allows for files / folders etc. to have `external` in their name too. | ||||||||||||||||
| path_raw = str(path).removeprefix("external/") | ||||||||||||||||
| filepath_split = str(path_raw).split("/", maxsplit=1) | ||||||||||||||||
| return Path("".join(filepath_split[1:])) | ||||||||||||||||
|
|
||||||||||||||||
|
|
||||||||||||||||
| def main(): | ||||||||||||||||
| parser = argparse.ArgumentParser( | ||||||||||||||||
| description="Generate source code links JSON from source files" | ||||||||||||||||
| ) | ||||||||||||||||
| parser.add_argument( | ||||||||||||||||
| _ = parser.add_argument( | ||||||||||||||||
| "--output", | ||||||||||||||||
| required=True, | ||||||||||||||||
| type=Path, | ||||||||||||||||
| help="Output JSON file path", | ||||||||||||||||
| ) | ||||||||||||||||
| parser.add_argument( | ||||||||||||||||
| _ = parser.add_argument( | ||||||||||||||||
| "files", | ||||||||||||||||
| nargs="*", | ||||||||||||||||
| type=Path, | ||||||||||||||||
|
|
@@ -53,15 +71,23 @@ def main(): | |||||||||||||||
| args = parser.parse_args() | ||||||||||||||||
|
|
||||||||||||||||
| all_need_references = [] | ||||||||||||||||
|
|
||||||||||||||||
| metadata = DefaultMetaData() | ||||||||||||||||
| metadata_set = False | ||||||||||||||||
| for file_path in args.files: | ||||||||||||||||
| if "known_good.json" not in str(file_path) and not metadata_set: | ||||||||||||||||
| metadata["repo_name"] = parse_repo_name_from_path(file_path) | ||||||||||||||||
| metadata_set = True | ||||||||||||||||
|
Comment on lines
+78
to
+80
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||||||||
| abs_file_path = file_path.resolve() | ||||||||||||||||
| assert abs_file_path.exists(), abs_file_path | ||||||||||||||||
| clean_path = clean_external_prefix(file_path) | ||||||||||||||||
| references = _extract_references_from_file( | ||||||||||||||||
| abs_file_path.parent, Path(abs_file_path.name) | ||||||||||||||||
| abs_file_path.parent, Path(abs_file_path.name), clean_path | ||||||||||||||||
| ) | ||||||||||||||||
MaximilianSoerenPollak marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||||||||
| all_need_references.extend(references) | ||||||||||||||||
|
|
||||||||||||||||
| store_source_code_links_json(args.output, all_need_references) | ||||||||||||||||
| store_source_code_links_with_metadata_json( | ||||||||||||||||
| file=args.output, metadata=metadata, needlist=all_need_references | ||||||||||||||||
| ) | ||||||||||||||||
|
Comment on lines
+88
to
+90
|
||||||||||||||||
| logger.info( | ||||||||||||||||
| f"Found {len(all_need_references)} need references in {len(args.files)} files" | ||||||||||||||||
| ) | ||||||||||||||||
|
|
||||||||||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,6 +21,8 @@ | |
| import sys | ||
| from pathlib import Path | ||
|
|
||
| from src.extensions.score_source_code_linker.helpers import parse_info_from_known_good | ||
|
|
||
| logging.basicConfig(level=logging.INFO, format="%(message)s") | ||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
@@ -29,28 +31,57 @@ def main(): | |
| parser = argparse.ArgumentParser( | ||
| description="Merge multiple sourcelinks JSON files into one" | ||
| ) | ||
| parser.add_argument( | ||
| _ = parser.add_argument( | ||
| "--output", | ||
| required=True, | ||
| type=Path, | ||
| help="Output merged JSON file path", | ||
| ) | ||
| parser.add_argument( | ||
| _ = parser.add_argument( | ||
| "--known_good", | ||
| required=True, | ||
| help="Path to a required 'known good' JSON file (provided by Bazel).", | ||
| ) | ||
| _ = parser.add_argument( | ||
| "files", | ||
| nargs="*", | ||
| type=Path, | ||
| help="Input JSON files to merge", | ||
| ) | ||
|
|
||
| args = parser.parse_args() | ||
| all_files = [x for x in args.files if "known_good.json" not in str(x)] | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Add a comment: |
||
|
|
||
| merged = [] | ||
| for json_file in args.files: | ||
| for json_file in all_files: | ||
| with open(json_file) as f: | ||
| data = json.load(f) | ||
| assert isinstance(data, list), repr(data) | ||
| merged.extend(data) | ||
| # If the file is empty e.g. '[]' there is nothing to parse, we continue | ||
| if not data: | ||
| continue | ||
| metadata = data[0] | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we need a worldview assertion here? |
||
| if not isinstance(metadata, dict) or "repo_name" not in metadata: | ||
| logger.warning( | ||
MaximilianSoerenPollak marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| f"Unexpected schema in sourcelinks file '{json_file}': " | ||
| "expected first element to be a metadata dict " | ||
| "with a 'repo_name' key. " | ||
| ) | ||
|
Comment on lines
+64
to
+68
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should this be an error instead? |
||
| # As we can't deal with bad JSON structure we just skip it | ||
| continue | ||
| if metadata["repo_name"] and metadata["repo_name"] != "local_repo": | ||
| hash, repo = parse_info_from_known_good( | ||
| known_good_json=args.known_good, repo_name=metadata["repo_name"] | ||
| ) | ||
MaximilianSoerenPollak marked this conversation as resolved.
Show resolved
Hide resolved
MaximilianSoerenPollak marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| metadata["hash"] = hash | ||
| metadata["url"] = repo | ||
| # In the case that 'metadata[repo_name]' is 'local_module' | ||
| # hash & url are already existing and empty inside of 'metadata' | ||
| # Therefore all 3 keys will be written to needlinks in each branch | ||
|
|
||
| for d in data[1:]: | ||
| d.update(metadata) | ||
| assert isinstance(data, list), repr(data) | ||
| merged.extend(data[1:]) | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Add comment: |
||
| with open(args.output, "w") as f: | ||
| json.dump(merged, f, indent=2, ensure_ascii=False) | ||
|
|
||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.