Skip to content

feat: implement dynamic date filtering for image search #1166#1256

Closed
Varun-JP wants to merge 1 commit intoAOSSIE-Org:mainfrom
Varun-JP:feat-task-1166
Closed

feat: implement dynamic date filtering for image search #1166#1256
Varun-JP wants to merge 1 commit intoAOSSIE-Org:mainfrom
Varun-JP:feat-task-1166

Conversation

@Varun-JP
Copy link
Copy Markdown

@Varun-JP Varun-JP commented Mar 31, 2026

Addressed Issues:

Fixes #1166

Screenshots/Recordings:

N/A (backend/thread-safety fix, no UI changes)

Additional Notes:

  • Added a global threading.RLock() in sync-microservice/app/utils/watcher.py to protect shared watcher state.
  • Wrapped critical sections in:
    • watcher_util_start_folder_watcher
    • watcher_util_stop_folder_watcher
    • watcher_util_restart_folder_watcher
    • watcher_util_is_watcher_running
    • watcher_util_get_watcher_info
  • This prevents concurrent /start requests from spawning multiple watcher threads (TOCTOU race condition).

AI Usage Disclosure:

We encourage contributors to use AI tools responsibly when creating Pull Requests. While AI can be a valuable aid, it is essential to ensure that your contributions meet the task requirements, build successfully, include relevant tests, and pass all linters. Submissions that do not meet these standards may be closed without warning to maintain the quality and integrity of the project. Please take the time to understand the changes you are proposing and their impact. AI slop is strongly discouraged and may lead to banning and blocking. Do not spam our repos with AI slop.

Check one of the checkboxes below:

  • This PR does not contain AI-generated code at all.
  • This PR contains AI-generated code. I have read the AI Usage Policy and this PR complies with this policy. I have tested the code locally and I am responsible for it.

I have used the following AI models and tools:

  • Gemini (Strategy & Logic)
  • Claude (Code Correction & Optimization)

Checklist

  • My PR addresses a single issue, fixes a single bug or makes a single improvement.
  • My code follows the project's code style and conventions
  • If applicable, I have made corresponding changes or additions to the documentation
  • If applicable, I have made corresponding changes or additions to tests
  • My changes generate no new warnings or errors
  • I have joined the [Discord server] and I will share a link to this PR with the project maintainers there
  • I have read the Contribution Guidelines
  • Once I submit my PR, CodeRabbit AI will automatically review it and I will address CodeRabbit's comments.
  • I have filled this PR template completely and carefully, and I understand that my PR may be closed without review otherwise.

Summary by CodeRabbit

  • Refactor
    • Improved concurrency handling and synchronization within the folder watcher system to enhance reliability and stability during concurrent operations.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 31, 2026

Walkthrough

Introduced a shared threading.RLock to synchronize access to watcher global state (watcher_thread, watched_folders, folder_id_map). Updated five watcher utility functions to perform critical state checks and mutations atomically under lock protection, preventing concurrent thread spawning and race conditions during watcher initialization and termination.

Changes

Cohort / File(s) Summary
Watcher State Synchronization
sync-microservice/app/utils/watcher.py
Added state_lock (threading.RLock) to guard watcher global state. Updated watcher_util_is_watcher_running(), watcher_util_start_folder_watcher(), watcher_util_stop_folder_watcher(), watcher_util_restart_folder_watcher(), and watcher_util_get_watcher_info() to acquire lock during state access and mutation, ensuring atomic operations and preventing TOCTOU race conditions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

Python

Poem

🐰 A lock for every thread so fine,
No more race conditions in this line,
The watcher state now safely guarded,
With RLock embrace, no care discarded,
Atomic flows make harmony divine! 🔒✨

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Title check ⚠️ Warning The PR title claims to implement dynamic date filtering for image search, but the actual changes implement thread-safety synchronization using RLock in the watcher utility—a completely different feature. Update the PR title to accurately reflect the changes, e.g., 'fix: add thread-safe locking to watcher state management' or 'fix: prevent race condition in watcher initialization #1166'.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed All coding requirements from issue #1166 are met: global RLock introduced, critical sections wrapped in watcher_util_start_folder_watcher, watcher_util_stop_folder_watcher, watcher_util_restart_folder_watcher, and watcher_util_is_watcher_running/watcher_util_get_watcher_info to prevent TOCTOU race condition.
Out of Scope Changes check ✅ Passed All changes are scoped to protecting watcher global state and preventing the TOCTOU race condition in sync-microservice/app/utils/watcher.py as required by issue #1166; no out-of-scope modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sync-microservice/app/utils/watcher.py (1)

336-347: ⚠️ Potential issue | 🔴 Critical

Critical: Thread cannot join itself when restart is called from the worker thread.

When a watched folder is deleted, watcher_util_handle_file_changes (line 89) calls this function from within the worker thread. This causes watcher_thread.join() to attempt joining the current thread, raising RuntimeError: cannot join current thread. The exception is caught, state is cleared, and a new thread spawns—while the old worker is still running. This defeats the PR's purpose of preventing multiple watcher threads.

🐛 Proposed fix: detect self-join and handle restart differently
 def watcher_util_restart_folder_watcher() -> bool:
     """
     Restart the folder watcher by stopping the current one and starting fresh.
 
     Returns:
         True if restart was successful, False otherwise
     """
+    global watcher_thread
+    
     with state_lock:
+        # Check if we're being called from the worker thread itself
+        if watcher_thread is not None and watcher_thread is threading.current_thread():
+            logger.info("Restart requested from worker thread - scheduling async restart")
+            # Signal stop; a separate mechanism must handle the actual restart
+            stop_event.set()
+            return False
+        
         logger.info("Restarting folder watcher...")
         watcher_util_stop_folder_watcher()
         return watcher_util_start_folder_watcher()

Alternatively, consider refactoring watcher_util_handle_file_changes to signal a restart request rather than performing it synchronously. A separate thread or event loop could then handle the restart after the worker exits.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sync-microservice/app/utils/watcher.py` around lines 336 - 347, The restart
function can try to join the watcher from within the watcher thread causing
RuntimeError; update watcher_util_restart_folder_watcher to detect if
threading.current_thread() is the active watcher_thread (the watcher_thread
variable used by watcher_util_start_folder_watcher /
watcher_util_stop_folder_watcher) and avoid calling watcher_thread.join() in
that case — instead set a restart_requested flag (or schedule a restart) and
return True so the worker can exit cleanly; also update
watcher_util_stop_folder_watcher to respect the restart flag and not clear state
prematurely, and ensure watcher_util_handle_file_changes signals the
restart_requested flag rather than doing a synchronous restart when running
inside the worker thread.
🧹 Nitpick comments (3)
sync-microservice/app/utils/watcher.py (3)

32-51: Shared state watched_folders is read without lock protection.

While list iteration during concurrent assignment is generally safe under CPython's GIL, this relies on implementation details. For consistency with the PR's thread-safety goals, consider protecting reads of watched_folders here and in watcher_util_handle_file_changes (line 75).

♻️ Proposed fix for watcher_util_get_folder_id_if_watched
 def watcher_util_get_folder_id_if_watched(file_path: str) -> Optional[str]:
     ...
     # Normalize the file path
     normalized_path = os.path.abspath(file_path)
 
+    with state_lock:
+        folders_snapshot = list(watched_folders)
+
     # Check if this path matches any of our watched folders
-    for folder_id, folder_path in watched_folders:
+    for folder_id, folder_path in folders_snapshot:
         if os.path.abspath(folder_path) == normalized_path:
             return folder_id
 
     return None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sync-microservice/app/utils/watcher.py` around lines 32 - 51, The function
watcher_util_get_folder_id_if_watched reads the shared list watched_folders
without synchronization; wrap the read/iteration in the same thread-safety lock
used elsewhere (e.g., watched_folders_lock) to prevent races, i.e., acquire the
lock before normalizing/iterating watched_folders and release it after returning
(or use a context manager) and apply the same locking discipline in
watcher_util_handle_file_changes to protect all accesses to watched_folders.

349-362: Lock ensures consistent snapshot of watcher state.

The implementation correctly reads all state under the lock for consistency.

Consider adding a more specific return type hint for better IDE support:

-def watcher_util_get_watcher_info() -> dict:
+def watcher_util_get_watcher_info() -> Dict[str, Any]:

You'd also need to add Any to the imports from typing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sync-microservice/app/utils/watcher.py` around lines 349 - 362, The return
type of watcher_util_get_watcher_info is currently untyped (dict); change it to
a more specific type such as dict[str, Any] (or Mapping[str, Any]) to improve
IDE/type support and add Any to the typing imports; update the function
signature watcher_util_get_watcher_info() -> dict[str, Any] and ensure the
module imports Any from typing alongside any existing typing imports.

364-376: Consider adding lock protection for consistency.

This function reads watcher_thread without acquiring state_lock, which is inconsistent with the thread-safety pattern applied elsewhere. While primarily used in the example main(), protecting it ensures consistent behavior if called from other contexts.

♻️ Proposed fix
 def watcher_util_wait_for_watcher() -> None:
     """
     Wait for the watcher to finish (useful for keeping the program running).
     """
-    if watcher_thread and watcher_thread.is_alive():
+    with state_lock:
+        thread = watcher_thread
+    
+    if thread and thread.is_alive():
         try:
-            watcher_thread.join()  # Wait indefinitely
+            thread.join()  # Wait indefinitely
         except KeyboardInterrupt:
             logger.info("Interrupted by user")
             watcher_util_stop_folder_watcher()
     else:
         logger.info("No watcher thread to wait for")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sync-microservice/app/utils/watcher.py` around lines 364 - 376, Protect reads
of the shared watcher_thread with the existing state_lock in
watcher_util_wait_for_watcher: acquire state_lock before checking watcher_thread
and watcher_thread.is_alive(), store a local reference, release the lock, then
perform the blocking join and KeyboardInterrupt handling as before (call
watcher_util_stop_folder_watcher() and use logger). This mirrors the
thread-safety pattern used elsewhere and keeps use of watcher_thread consistent
with state_lock while preserving behavior of watcher_util_wait_for_watcher.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@sync-microservice/app/utils/watcher.py`:
- Around line 336-347: The restart function can try to join the watcher from
within the watcher thread causing RuntimeError; update
watcher_util_restart_folder_watcher to detect if threading.current_thread() is
the active watcher_thread (the watcher_thread variable used by
watcher_util_start_folder_watcher / watcher_util_stop_folder_watcher) and avoid
calling watcher_thread.join() in that case — instead set a restart_requested
flag (or schedule a restart) and return True so the worker can exit cleanly;
also update watcher_util_stop_folder_watcher to respect the restart flag and not
clear state prematurely, and ensure watcher_util_handle_file_changes signals the
restart_requested flag rather than doing a synchronous restart when running
inside the worker thread.

---

Nitpick comments:
In `@sync-microservice/app/utils/watcher.py`:
- Around line 32-51: The function watcher_util_get_folder_id_if_watched reads
the shared list watched_folders without synchronization; wrap the read/iteration
in the same thread-safety lock used elsewhere (e.g., watched_folders_lock) to
prevent races, i.e., acquire the lock before normalizing/iterating
watched_folders and release it after returning (or use a context manager) and
apply the same locking discipline in watcher_util_handle_file_changes to protect
all accesses to watched_folders.
- Around line 349-362: The return type of watcher_util_get_watcher_info is
currently untyped (dict); change it to a more specific type such as dict[str,
Any] (or Mapping[str, Any]) to improve IDE/type support and add Any to the
typing imports; update the function signature watcher_util_get_watcher_info() ->
dict[str, Any] and ensure the module imports Any from typing alongside any
existing typing imports.
- Around line 364-376: Protect reads of the shared watcher_thread with the
existing state_lock in watcher_util_wait_for_watcher: acquire state_lock before
checking watcher_thread and watcher_thread.is_alive(), store a local reference,
release the lock, then perform the blocking join and KeyboardInterrupt handling
as before (call watcher_util_stop_folder_watcher() and use logger). This mirrors
the thread-safety pattern used elsewhere and keeps use of watcher_thread
consistent with state_lock while preserving behavior of
watcher_util_wait_for_watcher.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 26da2d06-4c21-42f0-b597-3c50755db897

📥 Commits

Reviewing files that changed from the base of the PR and between 8274ebd and 2c3da01.

📒 Files selected for processing (1)
  • sync-microservice/app/utils/watcher.py

@SIDDHANTCOOKIE
Copy link
Copy Markdown

PR without issue assignment is not appreciated please wait for reviewers to workout and assign your issues first :)

@Varun-JP
Copy link
Copy Markdown
Author

Varun-JP commented Apr 2, 2026

Hi @SIDDHANTCOOKIE, Thanks for the clarification, and apologies for opening the PR without prior assignment.
I’ll follow the proper process going forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Critical Race Condition in Sync Microservice Watcher Initialization

2 participants