Navigation: Project README | Docs Index
This guide covers day-to-day usage, workflow details, and project behavior.
- ML engineers maintaining image-caption or image-tag datasets.
- Researchers running iterative dataset cleanup before training or fine-tuning.
- Synthetic data and LoRA workflow builders who need fast human-in-the-loop correction.
- Small teams that prefer local desktop workflows over heavier annotation platforms.
ImageTagger works with image and sidecar text pairs.
- Each image is associated with a text file that has the same base name.
- Example pair:
- photo01.jpg
- photo01.txt
- If the text file does not exist, it is created on save.
- Open a folder with images.
- If needed, install Ollama from https://ollama.com.
- Connect to an Ollama server and select a model.
- Select one or more images in the left panel.
- Run Generate, Validate, AI Find, and Fixup as needed.
- Ollama is the local model runtime used by ImageTagger. If you do not already have it, install it from https://ollama.com.
- Recommended model: Qwen3-VL-8B.
- Current practical recommendation: start with Qwen3-VL-8B for the best observed performance/quality balance.
- Thread count defaults to 1 for safety and to avoid overcommitting slower GPUs.
- Setting thread count to 0 enables auto mode, which rebalances thread count dynamically and usually produces good results once it settles.
- On RTX 3090-class and newer top-end GPUs, Qwen3-VL-8B can often scale up to 16 threads with a significant speedup compared to a single thread.
- The best thread count depends on your GPU, VRAM pressure, and what else is running on the machine, so it is worth experimenting.
- During Generate, Validate, and AI Find tasks, the current thread count is shown in the status line.
Generate adds tags and/or description to selected images.
- Use checkboxes to choose Tags and Description.
- Timeout is a per-image budget.
- Retries can be configured.
- Downscale controls image query resolution before sending to Ollama.
- Threads can be fixed or set to 0 for auto behavior.
- Default is 1 thread for safety; on strong GPUs, especially RTX 3090-class and newer hardware running Qwen3-VL-8B, testing higher values up to 16 can produce substantial speedups.
- Setting threads to 0 enables automatic balancing; after it settles, it usually gives good results and the live thread count is visible in the status line while a task is running.
Validate checks existing annotations and writes fixup files when needed.
- Runs on selected images with existing annotations.
- Creates one .fixup file per image with detected issues.
- Removes stale .fixup files when validation result is OK.
AI Find checks whether selected images contain a target concept.
- Enter a concept in the AI Find field.
- Run on selected images.
- Matching images are tracked and recorded in fixup/search data.
Fixup opens the merge dialog for the current image when a fixup exists.
- Review proposed description and tag changes.
- Apply accept, reject, merge, and next/previous navigation actions.
- Regenerate can be run from inside the fixup dialog with its own controls.
- Useful when existing tags and description are completely messed up; regenerate fresh candidates and compare side-by-side before merging.
Inside the merge dialog, regeneration has local controls that can override your main-window defaults for the current fixup pass:
- Server URL input, Fetch models, model dropdown, and Use button let you switch regenerate calls to a different Ollama/OpenAI-compatible endpoint and model.
- Description prompt and Tags prompt tabs let you locally edit prompt text used by regenerate.
- These overrides are scoped to merge-dialog regenerate behavior and do not replace your main-window model selection.
This is especially useful when a model struggles to regenerate good tags or description with its default settings. You can test a stronger model, a different endpoint, or stricter local prompt wording immediately, then compare results side-by-side before merging.
The merge dialog presents a meld-like 2-way comparison:
- Left pane: current (existing) annotation.
- Right pane: proposed (fixup) annotation.
Navigation:
- Arrow keys: Move through comparison rows inside the table.
- Alt+Up / Alt+Down: Jump to previous/next difference row.
Quick Actions:
- Alt+T: Quick add a new tag not present in existing rows.
- Alt+A: Accept all proposed rows and merge.
- Alt+R: Start regeneration to create fresh candidates.
- Alt+Enter: Merge current change (save left/current pane to image and proceed to next image).
- Left arrow key: Accept proposed change from right into result.
- Del key: Delete selected row.
Use Merge/Reject buttons to apply your final decision and navigate to the next fixup image.
The merge comparison table also supports mouse and trackpad actions:
- Double-click (left button): triggers the current row action (apply proposed value, delete current value, or add suggested tag depending on row/action state).
- Right-click opens row context menu.
- On macOS trackpads, a two-finger tap maps to right-click and opens the same row context menu.
- Horizontal swipe/drag can trigger row actions when enabled.
- Horizontal scroll (mouse horizontal wheel or trackpad horizontal two-finger scroll) can trigger row actions when enabled.
Defaults:
- Double-click action: enabled.
- Swipe actions: disabled.
- Horizontal scroll actions: disabled.
- Horizontal scroll reverse: disabled.
- Horizontal scroll stop-idle seconds: 0.45.
- Horizontal scroll row-target mode: 3 (safest mode).
Configuration is stored in config.json under merge_table_mouse_actions:
"merge_table_mouse_actions": {
"double_click_action_enabled": true,
"swipe_actions_enabled": false,
"horizontal_scroll_actions_enabled": false,
"horizontal_scroll_reverse_enabled": false,
"horizontal_scroll_stop_idle_seconds": 0.45,
"horizontal_scroll_row_target_mode": 3
}Possible values:
double_click_action_enabled:trueorfalse.swipe_actions_enabled:trueorfalse.horizontal_scroll_actions_enabled:trueorfalse.horizontal_scroll_reverse_enabled:trueorfalse.horizontal_scroll_stop_idle_seconds: number>= 0.0disables stop-before-rearm gating, so long continuous scrolling can trigger repeated actions.> 0requires scrolling to stop/idle for that many seconds before another scroll action can trigger.
horizontal_scroll_row_target_mode:1,2, or3.1: action applies to row under mouse pointer.2: action applies to selected row regardless of pointer position.3: action applies only when pointer is over selected row (default).
Use the image list filter to narrow large datasets quickly.
Supported terms:
fixup: images with a fixup file.untagged: images that have no annotation (.txt) file at all.resolution <, >, <=, >=: images matching a resolution threshold in megapixels."tag": exact tag match.'text': case-insensitive text match against annotation content.
Operators:
!or~: NOT (negates the following term or group)&: AND|: OR( ... ): grouping
Precedence (highest to lowest): NOT, AND, OR — same as C. So a | b & c is a | (b & c) and !a & b is (!a) & b.
Examples:
!fixup— images without fixup filesresolution < 1.0— images with resolution lower than 1 MPxresolution >= 5— images with resolution 5 MPx or higher(resolution > 5) & 'landscape'— high-res landscape imagesfixup & "portrait"!"landscape" | 'sunset'~fixup & ("animal" | 'night')
See the full cross-platform shortcut reference:
Prompt files are optional and loaded from the prompts directory in the project root. If missing, built-in defaults are used.
- prompts/description_prompt.txt
- prompts/tags_prompt.txt
- prompts/validation_prompt.txt
- prompts/search_prompt.txt
In the app, prompt tabs allow in-memory apply, save to file, and reset to default behavior.
Main window:
Merge dialog:
ImageTagger is built with Python and PyQt6.
Python 3.9 or newer is required.
Windows, Linux, and macOS are tested.
Install and run scripts are provided for all three platforms:
- Windows:
install.bat,run.bat,update.bat - Linux / macOS:
install.sh,run.sh,update.sh
config.json stores session and UI state, including:
- last opened directory
- selected Ollama server and model
- query downscale value (llm_max_resolution_mpx)
- thread setting
- window geometry state
- merge-dialog mouse action settings in
merge_table_mouse_actions
For a full list of Ollama and auto-mode keys, see ollama_settings.md.
This project is heavily inspired by TagGUI.
TagGUI deserves full credit for the core layout direction and practical workflow ideas that informed this project.

