Draft: Switch to image_file_name as the standard key for cache and other dictionaries. #77
Closed
Abdul-Mukit wants to merge 3 commits into MultimediaTechLab:main from
…_dict and image_info_dict with image file name as key to ensure a uniform key across the code base.
refactor: dataset_utils.map_annotations_to_image_names returns the annotations list mapped to image file names instead of image_id.
refactor: several variable names made more descriptive.
docs: docstrings updated.
…stead of image_path as the key. refactor: annotations_index renamed to annotations_dict.
@henrytsui000 I would like your initial opinion about this matter before I invest more time in this MR. I appreciate your time. Thank you. Please let me know what you like, or don't like. Does switching to …
Working on it more.
Summary:
This MR attempts to establish "image file names" (e.g. `000123_xyz_.jpg`) as the consistent `key` when creating `image_info` or `annotation_info` dictionaries or saving caches like `train.cache`/`val.cache`.
In the current behaviour, "image paths" are used as keys for the data cache. `image_id` values read from COCO json files are used as keys in some dictionaries. In some other functions, `image_id` is derived from image file names based on the incorrect assumption that image file names will always be int convertible. In some cases, image file names without the extension are used as keys.
Because no standard/uniform convention is followed throughout this code base, several problems are arising: #67 and #72.
This is creating friction and has to be improved. I think it would be best to settle on a uniform/standard key for all/most of the workflow. This MR attempts to make the codebase more predictable and thus easier to debug.
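To make the difference between these key conventions concrete, here is a minimal sketch. The helper names `key_from_path` and `legacy_image_id` and the example paths are hypothetical and are not taken from the repository; only the `int(Path(img_path).stem)` expression is quoted from the code discussed in this MR.

```python
from pathlib import Path


def key_from_path(img_path: str) -> str:
    """Hypothetical helper: use the file name, extension included, as the key."""
    return Path(img_path).name


def legacy_image_id(img_path: str) -> int:
    """The assumption under discussion: the file stem is int convertible."""
    return int(Path(img_path).stem)


numeric_path = "data/images/val/000123.jpg"
industrial_path = "data/images/val/cam1_2024-05-01_dock3.jpg"

# A purely numeric stem converts, but the leading zeros are lost.
print(legacy_image_id(numeric_path))  # 123

# A descriptive file name cannot be converted at all.
try:
    legacy_image_id(industrial_path)
except ValueError as err:
    print(f"int conversion failed: {err}")

# The file name works as a key in both cases and does not depend on the folder.
print(key_from_path(numeric_path))     # 000123.jpg
print(key_from_path(industrial_path))  # cam1_2024-05-01_dock3.jpg
```

The file name also stays the same when the dataset root moves, which is not true of a full image path.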
Please note that using the image file name as the standard key will allow for:
- … `phase_name`. The folder "images" is already consistent between COCO and yolo format.
- … `image_id` from paths.
- … `"image_id": int(Path(img_path).stem)` in `predicts_to_json()`. The assumption that the image file name is int convertible and will be equal to the `image_id` defined in the json file is inaccurate. Often industrial applications have image file names like `<time>_<date>_<location>.jpg` to be able to track performance/incidents.
To address these issues, this MR will try to switch all necessary keys to just the image file names, including the extensions.
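As a rough illustration of the direction described above, the sketch below groups COCO annotations by image file name and looks up the real `image_id` from the file name instead of parsing it. The function names `group_annotations_by_file_name` and `build_name_to_id` and the sample data are hypothetical; this is not the implementation in this MR. It only relies on the standard COCO fields `images[].id`, `images[].file_name` and `annotations[].image_id`, and assumes file names are unique across the dataset.

```python
from collections import defaultdict
from pathlib import Path


def group_annotations_by_file_name(coco: dict) -> dict:
    """Hypothetical helper: map each image file name to its annotation list."""
    # COCO links an annotation to its image through an integer id.
    id_to_name = {img["id"]: Path(img["file_name"]).name for img in coco["images"]}
    annotations_dict = defaultdict(list)
    for ann in coco["annotations"]:
        file_name = id_to_name.get(ann["image_id"])
        if file_name is not None:  # skip annotations that point at no known image
            annotations_dict[file_name].append(ann)
    return dict(annotations_dict)


def build_name_to_id(coco: dict) -> dict:
    """Hypothetical helper: recover the COCO image id from a file name."""
    return {Path(img["file_name"]).name: img["id"] for img in coco["images"]}


# Made-up sample data; note that the id of 000123.jpg is not 123.
coco = {
    "images": [
        {"id": 7, "file_name": "cam1_2024-05-01_dock3.jpg"},
        {"id": 8, "file_name": "000123.jpg"},
    ],
    "annotations": [
        {"id": 1, "image_id": 7, "category_id": 0, "bbox": [10, 20, 30, 40]},
        {"id": 2, "image_id": 7, "category_id": 2, "bbox": [5, 5, 15, 15]},
        {"id": 3, "image_id": 8, "category_id": 1, "bbox": [0, 0, 50, 50]},
    ],
}

annotations_dict = group_annotations_by_file_name(coco)
name_to_id = build_name_to_id(coco)

# When writing predictions, look the id up by file name instead of calling int().
img_path = "data/images/val/cam1_2024-05-01_dock3.jpg"
prediction = {"image_id": name_to_id[Path(img_path).name], "bbox": [12, 18, 28, 41]}
print(prediction["image_id"])  # 7
```

With a lookup like this, a file named `000123.jpg` can carry any `image_id` in the json file without the prediction output going out of sync.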
This is a WIP. I am publishing it to gain feedback from others. Once I successfully train a model after these changes I'll remove the Draft.
Any thoughts are welcome.