
UN-3266 [FEAT] Async Executor Backend for Prompt Studio#1849

Open
harini-venkataraman wants to merge 88 commits into main from feat/async-prompt-service-v2

Conversation


@harini-venkataraman harini-venkataraman commented Mar 11, 2026

What

Introduces a pluggable executor system that replaces Docker-container-based tool execution with Celery worker tasks, and migrates the Prompt Studio IDE to an async execution model using Socket.IO for result delivery. Gated behind the async_prompt_execution feature flag for safe rollout.

Why

The existing architecture has several limitations:

  • Prompt Studio IDE executions block HTTP connections — Django workers are tied up waiting for LLM responses (up to minutes per prompt), limiting concurrency and causing timeouts
  • Docker-container-based tool execution requires spinning up containers per workflow step, adding overhead and complicating deployments
  • No real-time feedback — the frontend polls for results, wasting resources and providing poor UX
  • Tight coupling between prompt-service HTTP calls and the Django backend makes it hard to scale execution independently

How

Backend (65 files)

  • Async Prompt Studio views: index_document, fetch_response, single_pass_extraction now return HTTP 202 (accepted) with a task_id instead of blocking. Gated by async_prompt_execution feature flag — old sync path preserved as fallback
  • Celery callback tasks (backend/prompt_studio/prompt_studio_core_v2/tasks.py): ide_index_complete, ide_prompt_complete, ide_prompt_error etc. run on prompt_studio_callback queue, perform ORM writes via OutputManagerHelper, and emit prompt_studio_result Socket.IO events
  • Worker dispatch Celery app (backend/backend/worker_celery.py): A second Celery app instance that coexists with Django's Celery app, configured to route tasks to executor workers
  • prompt_studio_helper.py rewrite: Removed PromptTool HTTP calls entirely. New build_index_payload(), build_fetch_response_payload(), build_single_pass_payload() methods construct ExecutionContext objects with all ORM data pre-loaded
  • Removed: backend/backend/workers/, file_execution_tasks.py, celery_task.py (old in-process workers)
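
The flag-gated branch described above amounts to the following shape (a minimal sketch; `dispatch_async` and `run_sync` are illustrative stand-ins for `ExecutionDispatcher.dispatch_with_callback()` and the legacy blocking helper, not actual backend APIs):

```python
import uuid

def dispatch_async(payload: dict) -> str:
    """Stand-in for ExecutionDispatcher.dispatch_with_callback(); returns a task id."""
    return str(uuid.uuid4())

def run_sync(payload: dict) -> dict:
    """Stand-in for the legacy blocking helper (old sync path)."""
    return {"answer": "sync result"}

def fetch_response(flag_enabled: bool, payload: dict) -> dict:
    """Flag ON: fire-and-forget dispatch, HTTP 202. Flag OFF: old blocking path, HTTP 200."""
    if flag_enabled:
        return {"status": 202, "task_id": dispatch_async(payload), "state": "accepted"}
    return {"status": 200, "result": run_sync(payload)}
```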

Workers (70 files, ~19,500 new lines)

  • Executor Worker (workers/executor/): New WorkerType.EXECUTOR Celery worker with LegacyExecutor handling all operations: extract, index, answer_prompt, single_pass_extraction, summarize, agentic_extraction, structure_pipeline
  • Pluggable Executor Framework: BaseExecutor → ExecutorRegistry (class-decorator self-registration) → ExecutionOrchestrator → ExecutionDispatcher (Celery send_task)
  • ExecutorToolShim: Lightweight stand-in for BaseTool that satisfies SDK1 adapter interfaces without Docker context
  • Structure tool task (workers/file_processing/structure_tool_task.py): Celery-native replacement for Docker-based StructureTool.run() with profile overrides, smart table detection, and output file management
  • 26 test files (~10,000+ lines): Comprehensive coverage from unit tests through full Celery eager-mode integration tests

SDK1 (22 files)

  • Execution framework (unstract/sdk1/src/unstract/sdk1/execution/): ExecutionContext, ExecutionResult (serializable DTOs for Celery JSON transport), ExecutionDispatcher (dispatch() + dispatch_with_callback()), BaseExecutor, ExecutorRegistry
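
The round-trip contract these DTOs must satisfy can be sketched as below (a reduced illustration: field names beyond `operation`, `executor_params`, and `request_id` are assumptions, and the real `ExecutionContext` carries far more data):

```python
import uuid
from dataclasses import asdict, dataclass, field
from enum import Enum

class Operation(str, Enum):
    ANSWER_PROMPT = "answer_prompt"
    INDEX = "index"
    SINGLE_PASS_EXTRACTION = "single_pass_extraction"

@dataclass
class ExecutionContext:
    operation: Operation
    executor_params: dict
    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def __post_init__(self) -> None:
        # Normalize a bare string (as it arrives from Celery JSON) back into the enum.
        if isinstance(self.operation, str) and not isinstance(self.operation, Operation):
            self.operation = Operation(self.operation)

    def to_dict(self) -> dict:
        d = asdict(self)
        d["operation"] = self.operation.value  # emit a plain string for JSON transport
        return d

    @classmethod
    def from_dict(cls, d: dict) -> "ExecutionContext":
        return cls(**d)
```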

Frontend (275 files)

  • Async prompt execution: usePromptStudioSocket hook listens for prompt_studio_result Socket.IO events. usePromptRun rewritten from polling to fire-and-forget. PromptRun.jsx conditionally renders async or sync path based on feature flag
  • CRA → Vite migration: Build tooling migrated to Vite + Bun with Biome linter replacing ESLint
  • Dashboard metrics UI: New metrics dashboard with charts, LLM usage table, and recent activity
  • Card-based layouts: New card grid views for pipelines and API deployments

Docker / Infrastructure

  • Added: worker-executor-v2, worker-prompt-studio-callback, worker-metrics
  • Promoted: All workers-v2 services from opt-in (profiles: [workers-v2]) to default

Architecture Change

BEFORE:  FE → Django (blocks) → PromptTool HTTP → prompt-service → LLM
AFTER:   FE → Django (HTTP 202) → ExecutionDispatcher → Executor Worker → LLM
              ↑ Socket.IO result    (Celery send_task)    (LegacyExecutor)

Can this PR break any existing features? If yes, please list possible items. If no, please explain why.

Yes, potential breaking changes — mitigated by feature flag:
Prompt Studio IDE async path — gated by async_prompt_execution feature flag. When flag is OFF (default), all 3 endpoints (index_document, fetch_response, single_pass_extraction) use the old sync path returning HTTP 200. No behavior change for existing users.

Review Guidelines

This PR touches 441 files across backend, frontend, workers, and SDK1. Below is a structured review path to navigate it efficiently.

Code Structure Overview

unstract/sdk1/src/unstract/sdk1/execution/   ← Core abstractions (review FIRST)
    context.py          ExecutionContext dataclass (the universal payload)
    result.py           ExecutionResult dataclass (success/failure container)
    executor.py         BaseExecutor ABC (the executor contract)
    registry.py         ExecutorRegistry (class-decorator self-registration)
    dispatcher.py       ExecutionDispatcher (Celery send_task, 3 dispatch modes)
    orchestrator.py     ExecutionOrchestrator (worker-side: find executor → execute)

workers/executor/                            ← Executor worker (review SECOND)
    worker.py           Celery app entry point
    tasks.py            Single task: execute_extraction (deserialize → orchestrate → return)
    executor_tool_shim.py   BaseTool substitute for worker context
    executors/
        legacy_executor.py  Main executor: 7 operations via _OPERATION_MAP strategy pattern
        answer_prompt.py    Prompt answering pipeline (retrieve → LLM → postprocess)
        index.py            Document indexing (vectorDB operations)
        retrieval.py        RetrievalService + 7 retriever strategies
        variable_replacement.py, postprocessor.py, json_repair_helper.py, usage.py

backend/prompt_studio/prompt_studio_core_v2/ ← Django async wiring (review THIRD)
    views.py            3 endpoints return HTTP 202 (gated by feature flag)
    prompt_studio_helper.py   build_*_payload() methods construct ExecutionContext
    tasks.py            Celery callbacks: ORM writes + Socket.IO emission

frontend/src/                                ← Frontend async path (review FOURTH)
    hooks/usePromptRun.js           Fire-and-forget POST + 5-min timeout safety net
    hooks/usePromptStudioSocket.js  Socket.IO listener for prompt_studio_result
    components/.../PromptRun.jsx    Headless queue manager (dequeues + calls runPrompt)
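
The registry and strategy-map pattern referenced throughout this overview looks roughly like this (a self-contained sketch with a toy handler; the real classes live in registry.py and legacy_executor.py and differ in detail):

```python
class ExecutorRegistry:
    """Class-decorator self-registration with strict duplicate detection."""
    _executors: dict[str, type] = {}

    @classmethod
    def register(cls, name: str):
        def decorator(executor_cls: type) -> type:
            if name in cls._executors:
                raise ValueError(f"duplicate executor: {name}")
            cls._executors[name] = executor_cls
            return executor_cls
        return decorator

    @classmethod
    def get(cls, name: str) -> type:
        return cls._executors[name]

@ExecutorRegistry.register("legacy")
class LegacyExecutor:
    # Strategy pattern: operation name -> handler method name.
    _OPERATION_MAP = {"answer_prompt": "_handle_answer_prompt"}

    def execute(self, operation: str, params: dict) -> dict:
        handler = getattr(self, self._OPERATION_MAP[operation])
        return handler(params)

    def _handle_answer_prompt(self, params: dict) -> dict:
        return {"answer": f"echo:{params['prompt']}"}
```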

Recommended Review Order

Review in dependency order — each layer builds on the previous:

| Step | Area | Key Files | What to Look For |
|------|------|-----------|------------------|
| 1 | SDK1 Execution Framework | execution/context.py, result.py, dispatcher.py, registry.py | Contract stability: are to_dict()/from_dict() round-trips correct? Is the Operation enum complete? Queue naming (celery_executor_{name}). |
| 2 | Executor Worker Entry | executor/tasks.py, executor/worker.py | Single entry point execute_extraction: retry policy, error handling, log correlation. |
| 3 | LegacyExecutor Core | executors/legacy_executor.py (focus on _OPERATION_MAP + execute()) | Strategy pattern routing. Unsupported operation handling. Error wrapping. |
| 4 | LegacyExecutor Handlers | answer_prompt.py, index.py, retrieval.py | Parameter contracts: do the keys in executor_params match what build_*_payload() sends? Lazy import pattern (_get_prompt_deps(), _get_indexing_deps()). |
| 5 | Backend Views (async path) | views.py lines 351–583 | Feature flag gating. 202 vs 200 response. dispatch_with_callback usage with correct callback task names and queue. |
| 6 | Backend Payload Builders | prompt_studio_helper.py (build_index_payload, build_fetch_response_payload, build_single_pass_payload) | ORM data loading. Are all required params packed into executor_params? Key compatibility with executor handlers. |
| 7 | Backend Callbacks | tasks.py (callback tasks) | ide_prompt_complete: ORM writes via OutputManagerHelper. Socket.IO emission shape. Error callback cleanup. State store setup/teardown. |
| 8 | Frontend | usePromptRun.js, usePromptStudioSocket.js, PromptRun.jsx | Socket event shape matches backend _emit_result(). Timeout handling. Status cleanup on failure. |
| 9 | Docker/Infra | docker/docker-compose.yaml | New services: worker-executor-v2, worker-prompt-studio-callback. Removed old workers. Queue bindings. |
| 10 | Tests | workers/tests/test_sanity_phase*.py | Integration tests validate end-to-end Celery chains in eager mode. |

Data Flow (End-to-End)

User clicks "Run" in Prompt Studio IDE
  │
  ▼
[Frontend] PromptRun.jsx dequeues → usePromptRun.runPromptApi()
  │  POST /fetch_response/{tool_id}  (fire-and-forget)
  ▼
[Django View] views.py:fetch_response()
  │  if feature_flag ON → build_fetch_response_payload() → dispatch_with_callback()
  │  Returns HTTP 202 {task_id, run_id, status: "accepted"}
  ▼
[RabbitMQ] → celery_executor_legacy queue
  ▼
[Executor Worker] tasks.py:execute_extraction()
  │  ExecutionOrchestrator → ExecutorRegistry.get("legacy") → LegacyExecutor
  │  → _handle_answer_prompt() → RetrievalService → LLM call → postprocess
  │  Returns ExecutionResult.to_dict()
  ▼
[Celery link callback] → prompt_studio_callback queue
  ▼
[Django Callback Worker] tasks.py:ide_prompt_complete()
  │  OutputManagerHelper.handle_prompt_output_update() (ORM write)
  │  _emit_result() → Socket.IO "prompt_studio_result" event
  ▼
[Frontend] usePromptStudioSocket.onResult()
  │  handleCompleted("fetch_response", result)
  │  → updatePromptOutputState(data) → clears spinner
  ▼
User sees result in UI
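
The Django-side dispatch step above, pre-generating the task_id so the callback kwargs can carry it, can be illustrated with a fake broker (`FakeBroker` is a test double for `celery_app.send_task`, not a real Celery API):

```python
import uuid

class FakeBroker:
    """Test double recording what would be handed to Celery's send_task."""
    def __init__(self):
        self.sent = []

    def send_task(self, name, kwargs, queue, task_id, link, link_error):
        self.sent.append({"name": name, "queue": queue, "task_id": task_id,
                          "link": link, "link_error": link_error})
        return task_id

def dispatch_with_callback(broker, context_dict: dict, cb_kwargs: dict) -> str:
    # Pre-generate the task_id so the success/error callbacks can embed it
    # in their kwargs before the task is ever enqueued.
    task_id = str(uuid.uuid4())
    cb = dict(cb_kwargs, task_id=task_id)
    return broker.send_task(
        "execute_extraction",
        kwargs={"context": context_dict},
        queue="celery_executor_legacy",
        task_id=task_id,
        link=("ide_prompt_complete", cb),
        link_error=("ide_prompt_error", cb),
    )
```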

Known Code Duplication

| Where | What's Duplicated | Severity | Notes |
|-------|-------------------|----------|-------|
| views.py — 3 view actions | Dispatch pattern: build_payload → get_dispatcher → dispatch_with_callback → return 202 | Low | Each view has different ORM/param resolution before the common dispatch. Could be a helper but manageable at 3 instances. |
| tasks.py — callback tasks | ide_index_complete and ide_prompt_complete follow same structure: extract kwargs → setup state → check result → ORM work → emit → cleanup | Low | Different ORM logic per callback type. Acceptable for 2 callbacks; monitor if more are added. |
| tasks.py — legacy tasks | run_index_document, run_fetch_response, run_single_pass_extraction kept alongside new callback tasks | Intentional | Legacy tasks retained for backward compatibility during feature flag rollout. Can be removed once flag is permanently ON. |
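
If the three view instances ever warrant consolidation, the shared tail could be factored as a small helper (hypothetical; `dispatch_and_accept` does not exist in the PR):

```python
def dispatch_and_accept(build_payload, dispatch, on_success: str, on_error: str) -> dict:
    """Shared dispatch tail: build the payload, dispatch with callbacks, return a 202 body."""
    context, cb_kwargs = build_payload()          # per-view ORM/param resolution
    task_id = dispatch(context, cb_kwargs, on_success, on_error)
    return {"status": 202, "task_id": task_id,
            "run_id": cb_kwargs.get("run_id"), "state": "accepted"}
```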

Files Safe to Skim

  • workers/tests/ — 24 test files, ~10,000 lines. Well-structured but high volume. Focus on test_sanity_phase2.py (full Celery chain) and test_sanity_phase4.py (IDE payload compatibility) as representative examples.
  • workers/executor/executors/retrievers/ — 7 retriever implementations. All follow the same pattern. Reviewing one (simple.py) covers the pattern.
  • Architecture docs at repo root (architecture-*.md, phase*.md) — Reference material, not code.

Relevant Docs

  • Architecture: architecture-executor-system.md, architecture-flow-diagram.md, architecture-sequence-diagrams.md in repo root
  • Migration phases: architecture-migration-phases.md
  • Rollout: rollout-plan.md

Related Issues or PRs

  • Async Prompt Studio Execution epic

Dependencies Versions / Env Variables

New env variables:

| Variable | Purpose | Default |
|----------|---------|---------|
| FLIPT_SERVICE_AVAILABLE | Enable Flipt feature flag service | false |

Notes on Testing

  • Workers: cd workers && uv run pytest -v — 490+ tests (444 in workers/tests/ + extras)
  • SDK1: cd unstract/sdk1 && uv run pytest -v — 146+ tests
  • Backend callbacks: cd backend && python -m pytest prompt_studio/prompt_studio_core_v2/test_tasks.py -v
  • Manual testing: Enable flag in Flipt (async_prompt_execution=true), trigger prompt runs in IDE, verify Socket.IO events deliver results via Network → WS → Messages tab
  • Feature flag OFF: Verify all sync paths still work identically to main branch

Screenshots

N/A (primarily backend/worker architecture change; frontend UX unchanged when feature flag is off)

Checklist

I have read and understood the Contribution Guidelines.

harini-venkataraman and others added 30 commits February 19, 2026 20:39
Conflicts resolved:
- docker-compose.yaml: Use main's dedicated dashboard_metric_events queue for worker-metrics
- PromptCard.jsx: Keep tool_id matching condition from our async socket feature
- PromptRun.jsx: Merge useEffect import from main with our branch
- ToolIde.jsx: Keep fire-and-forget socket approach (spinner waits for socket event)
- SocketMessages.js: Keep both session-store and socket-custom-tool imports + updateCusToolMessages dep
- SocketContext.js: Keep simpler path-based socket connection approach
- usePromptRun.js: Keep Celery fire-and-forget with socket delivery over polling
- setupProxy.js: Accept main's deletion (migrated to Vite)

greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR introduces a substantial architectural shift for Prompt Studio IDE: replacing blocking HTTP-based tool execution with an async Celery + Socket.IO pipeline. The core design — ExecutionContext/ExecutionResult DTOs, ExecutorRegistry, ExecutionDispatcher, and callback tasks — is well-structured and the SDK1 abstractions are clean. However, several correctness and safety issues need resolution before this is production-ready.

Key findings:

  • Feature flag gating is absent — all three IDE endpoints (index_document, fetch_response, single_pass_extraction) unconditionally use the async path and return HTTP 202. The PR description promises a safe rollout where the sync path is preserved when async_prompt_execution=false, but this is not implemented. Any deployment immediately breaks all users without the new worker services running.
  • _resolve_llm_ids null dereference — when a tool has no monitor_llm/challenge_llm and no default LLM profile, ProfileManager.get_default_llm_profile() returns None and the subsequent dp.llm.id access raises AttributeError instead of DefaultProfileError.
  • set_document_indexing not rolled back on dispatch failure — if dispatch_with_callback raises after build_index_payload marks a document as "indexing in progress", no cleanup occurs and the document is permanently stuck in that state.
  • Partial ORM state in ide_index_complete — mark_document_indexed is written before the ProfileManager lookup; a missing profile leaves the document marked indexed while the frontend receives a failure event.
  • IDOR risk in task_status — any authenticated user can poll any task_id regardless of tool ownership by supplying an arbitrary UUID in the URL.
  • Socket events not scoped per tool — prompt_studio_result events are emitted to the user-level log_events_id room; with multiple Prompt Studio tabs open, results from one tool pollute another tool's state.
  • _emit_error missing _json_safe() — _emit_result sanitizes its payload; _emit_error does not, creating a potential silent failure path for future callers passing ORM/UUID values in the extra dict.
  • 5-minute timeout stale closure — the safety-net setTimeout in usePromptRun captures run-N identifiers and can falsely clear run N+1's spinner if the same prompt is re-triggered within the window.

Confidence Score: 2/5

  • Not safe to merge — the feature flag gate is missing, causing an immediate breaking change for all users; several null-dereference bugs, an IDOR, and a multi-tab state corruption issue also need resolution.
  • The architectural design is sound and the SDK1 abstractions are clean, but the implementation has multiple critical correctness issues: (1) the async path is unconditional with no feature-flag fallback, meaning all deployments immediately break without the new worker services; (2) _resolve_llm_ids raises AttributeError on a missing default profile; (3) set_document_indexing is not rolled back on broker failure; (4) the task_status IDOR allows cross-user data leakage; (5) Socket.IO events are per-user not per-tool, causing multi-tab state corruption. Together these block a safe production rollout.
  • backend/prompt_studio/prompt_studio_core_v2/views.py (missing feature flag gate, IDOR), backend/prompt_studio/prompt_studio_core_v2/prompt_studio_helper.py (null dereferences, unrolled-back state), backend/prompt_studio/prompt_studio_core_v2/tasks.py (partial ORM state, missing _json_safe), frontend/src/hooks/usePromptStudioSocket.js (multi-tab socket scope), frontend/src/hooks/usePromptRun.js (stale timeout closure).

Important Files Changed

| Filename | Overview |
|----------|----------|
| backend/prompt_studio/prompt_studio_core_v2/views.py | All three IDE endpoints (index_document, fetch_response, single_pass_extraction) unconditionally dispatch to Celery workers and return HTTP 202, bypassing the described feature-flag gate. The new task_status endpoint exposes an IDOR risk. Redundant inline uuid imports remain in all three action methods. |
| backend/prompt_studio/prompt_studio_core_v2/prompt_studio_helper.py | New _resolve_llm_ids helper raises AttributeError instead of DefaultProfileError when no default profile is set. build_index_payload calls validators before null-checking default_profile. set_document_indexing is not rolled back if dispatch_with_callback raises. build_fetch_response_payload calls _resolve_llm_ids before the profile_manager null guard. |
| backend/prompt_studio/prompt_studio_core_v2/tasks.py | Well-structured callback tasks with proper state store setup/teardown. _emit_error lacks _json_safe() serialization, inconsistent with _emit_result. ide_index_complete marks document indexed before fetching ProfileManager, leaving potential partial ORM state. Legacy tasks preserved alongside new callback tasks as intended. |
| backend/backend/worker_celery.py | New _WorkerDispatchCelery Celery subclass for worker-v2 dispatch. DB_USER is not URL-encoded in result backend connection string. Singleton init uses double-check without a lock. Declared task_queues=["executor"] does not match the actual dispatch queue name celery_executor_legacy. |
| unstract/sdk1/src/unstract/sdk1/execution/dispatcher.py | Clean ExecutionDispatcher with three dispatch modes (sync, async, callback). Well-documented with docstrings. Minor: dispatch_with_callback passes a pre-generated task_id so callbacks can embed it in cb_kwargs before dispatch — a sound pattern, but relies on the same ID being used for both the task and the socket emit. |
| unstract/sdk1/src/unstract/sdk1/execution/context.py | Clean ExecutionContext dataclass with solid to_dict()/from_dict() round-trip, enum normalization in __post_init__, and auto-generated request_id. No issues found. |
| workers/executor/tasks.py | Single execute_extraction entry point is well-structured. autoretry_for retries on transient errors while returning a structured ExecutionResult.failure() on non-retryable errors to keep the link (success) path engaged. Inheriting link/link_error callbacks on retry can fire ide_index_complete/ide_prompt_complete multiple times in some edge cases. |
| workers/executor/executors/answer_prompt.py | SSRF protection in _is_safe_public_url validates the webhook URL by resolving DNS at validation time, but the actual HTTP request is made later in postprocess_data. A DNS rebinding attack can bypass this protection by re-resolving to an internal IP after validation passes. |
| frontend/src/hooks/usePromptRun.js | Migrated from polling to fire-and-forget. The 5-minute timeout safety net captures promptId/docId/profileId from run N and can falsely clear run N+1's status if a second run is triggered within the window. HTTP 200 "pending" responses are handled correctly with an info alert. |
| frontend/src/hooks/usePromptStudioSocket.js | New socket listener for prompt_studio_result events. Socket events are scoped to log_events_id (per-user, not per-tool), so a multi-tab user sees results from one tool bleed into another tool's tab. clearResultStatuses will silently no-op if item.profile_manager doesn't match the stored UUID string format, leaving spinners stuck. |
| unstract/sdk1/src/unstract/sdk1/execution/registry.py | Clean class-decorator registry with strict duplicate detection, list_executors(), and a clear() method for test isolation. No issues found. |

Sequence Diagram

sequenceDiagram
    participant FE as Frontend (PromptRun.jsx)
    participant DV as Django View (views.py)
    participant PSH as PromptStudioHelper
    participant RMQ as RabbitMQ
    participant EW as Executor Worker (execute_extraction)
    participant LX as LegacyExecutor
    participant CBW as Callback Worker (prompt_studio_callback)
    participant SIO as Socket.IO

    FE->>DV: POST /fetch_response/{tool_id}
    DV->>PSH: build_fetch_response_payload()
    Note over PSH: ORM loads, extraction, indexing (blocking)
    PSH-->>DV: (ExecutionContext, cb_kwargs)
    DV->>DV: pre-generate executor_task_id
    DV->>RMQ: send_task("execute_extraction", link=ide_prompt_complete, task_id=executor_task_id)
    DV-->>FE: HTTP 202 {task_id, run_id, status:"accepted"}

    RMQ->>EW: execute_extraction(context_dict)
    EW->>LX: ExecutionOrchestrator.execute(context)
    LX->>LX: _handle_answer_prompt() → RetrievalService → LLM
    LX-->>EW: ExecutionResult
    EW-->>RMQ: result.to_dict() [stores in PG result backend]

    RMQ->>CBW: ide_prompt_complete(result_dict, callback_kwargs)
    CBW->>CBW: OutputManagerHelper.handle_prompt_output_update()
    CBW->>SIO: emit "prompt_studio_result" to log_events_id room

    SIO-->>FE: prompt_studio_result event
    FE->>FE: usePromptStudioSocket.onResult()
    FE->>FE: updatePromptOutputState(data)
    FE->>FE: clearResultStatuses(data) [remove spinners]
This is a comment left during a code review.
Path: backend/prompt_studio/prompt_studio_core_v2/prompt_studio_helper.py
Line: 484-491

Comment:
**`AttributeError` on `None` default profile in `_resolve_llm_ids`**

When `tool.monitor_llm` or `tool.challenge_llm` is `None` (not explicitly set), the code falls through to `ProfileManager.get_default_llm_profile(tool)`. If that also returns `None` (no default profile configured), the next line `str(dp.llm.id)` immediately raises `AttributeError: 'NoneType' object has no attribute 'llm'`. This surfaces as an opaque 500 error rather than the expected `DefaultProfileError`.

This can happen when a user creates a Prompt Studio tool, sets a per-prompt profile manager, but has never configured a default tool-level profile AND has not set explicit `monitor_llm`/`challenge_llm` adapters.

Note that `_resolve_llm_ids` is called in `build_fetch_response_payload` **before** the `if not profile_manager: raise DefaultProfileError()` guard (line 536), so a missing default profile causes an `AttributeError` that bypasses the intended error handling entirely.

```suggestion
        if monitor_llm_instance:
            monitor_llm = str(monitor_llm_instance.id)
        else:
            dp = ProfileManager.get_default_llm_profile(tool)
            if not dp:
                raise DefaultProfileError()
            monitor_llm = str(dp.llm.id)
        if challenge_llm_instance:
            challenge_llm = str(challenge_llm_instance.id)
        else:
            dp = ProfileManager.get_default_llm_profile(tool)
            if not dp:
                raise DefaultProfileError()
            challenge_llm = str(dp.llm.id)
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: backend/prompt_studio/prompt_studio_core_v2/tasks.py
Line: 74-94

Comment:
**`_emit_error` missing `_json_safe()` serialization**

`_emit_result` wraps its payload through `_json_safe()` to handle `uuid.UUID` and `datetime` objects before emitting via Socket.IO. `_emit_error` has no equivalent sanitization — it calls `_emit_websocket_event` directly with raw data.

In practice the current callers store only strings in `extra` (e.g. from `cb.get("document_id", "")`), so this is latent rather than immediately broken. However, future callers could pass ORM values or UUIDs into `extra`, causing `json.dumps` inside `_emit_websocket_event` to raise a `TypeError` and leave the frontend without any error feedback at all.

```suggestion
    if extra:
        data.update(extra)
    _emit_websocket_event(
        room=log_events_id,
        event=PROMPT_STUDIO_RESULT_EVENT,
        data=_json_safe(data),
    )
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "Merge branch 'main' ..."


greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR introduces a pluggable async executor framework that replaces blocking Django HTTP calls to the prompt-service with a fire-and-forget Celery dispatch chain (Django → RabbitMQ → ExecutorWorker → CallbackWorker → Socket.IO). The architecture is well-designed: the SDK1 ExecutionContext/ExecutionResult DTOs are clean and fully serialisable, the ExecutorRegistry self-registration pattern is solid, and the callback task structure correctly separates ORM writes from result delivery. The frontend migration to a Socket.IO listener (usePromptStudioSocket) with a 5-minute timeout safety net is a sensible approach.

Key issues found during review:

  • Feature flag gate is missing: The PR description explicitly states that all three IDE endpoints (index_document, fetch_response, single_pass_extraction) are gated behind the async_prompt_execution Flipt flag with a sync fallback. The actual views.py contains no feature flag check — all users receive HTTP 202 unconditionally. This directly breaks the stated safe-rollout strategy.
  • IDOR risk in task_status: The endpoint looks up a Celery task by task_id from the URL without verifying that the task was produced by an operation on the tool identified by pk. A user with access to any tool can query results belonging to a different user's execution.
  • Null guards after use in prompt_studio_helper.py: Both build_fetch_response_payload and build_single_pass_payload call validators on profile_manager/default_profile and then dereference them before the if not …: raise DefaultProfileError() guard, making those guards dead code for the None case.
  • Queue name mismatch in worker_celery.py: task_queues=[Queue("executor")] is configured, but ExecutionDispatcher dispatches to "celery_executor_legacy". The declared queue is never consumed by the dispatched tasks.

Confidence Score: 2/5

  • Not safe to merge without the feature flag gate — the async path is enabled for all users despite the PR's claim of a gated rollout, and there is an IDOR risk in the new task_status endpoint.
  • The SDK1 execution framework, executor worker, and callback task infrastructure are well-implemented. However, the missing feature flag check is a critical gap between the stated design (safe, flag-gated rollout) and the actual implementation (always-async). Every Prompt Studio user is immediately affected. The task_status IDOR allows cross-user data access. The null-guard ordering issues are latent bugs that could surface with incomplete profile configurations. These issues need to be resolved before merging.
  • backend/prompt_studio/prompt_studio_core_v2/views.py (missing feature flag gate + IDOR), backend/prompt_studio/prompt_studio_core_v2/prompt_studio_helper.py (null guard ordering in two build methods), backend/backend/worker_celery.py (queue name mismatch)

Important Files Changed

| Filename | Overview |
|----------|----------|
| backend/prompt_studio/prompt_studio_core_v2/views.py | Three new async endpoints (index_document, fetch_response, single_pass_extraction) are missing the advertised async_prompt_execution feature flag gate, unconditionally breaking the sync fallback; task_status endpoint has an IDOR risk; redundant inner import uuid in three methods. |
| backend/prompt_studio/prompt_studio_core_v2/prompt_studio_helper.py | New build_*_payload methods correctly pre-load ORM data and construct ExecutionContext objects; two methods have misordered null guards (DefaultProfileError check placed after the variable is already dereferenced). |
| backend/backend/worker_celery.py | New second Celery app for dispatching to executor workers. Queue name in task_queues config ("executor") does not match the actual queue used by ExecutionDispatcher ("celery_executor_legacy"), making the queue config a no-op. |
| backend/prompt_studio/prompt_studio_core_v2/tasks.py | New Celery callback tasks (ide_index_complete, ide_prompt_complete, ide_index_error, ide_prompt_error) cleanly handle ORM writes and Socket.IO emission; _json_safe serialisation guard is present; legacy tasks are correctly retained for rollback safety. |
| unstract/sdk1/src/unstract/sdk1/execution/context.py | Clean, well-documented ExecutionContext dataclass with correct to_dict/from_dict round-trips, enum normalisation in post_init, and auto-generated request_id; no issues found. |
| unstract/sdk1/src/unstract/sdk1/execution/dispatcher.py | ExecutionDispatcher provides three dispatch modes; queue naming convention is clear; disable_sync_subtasks=False use is justified; timeout resolution logic is sound. |
| unstract/sdk1/src/unstract/sdk1/execution/registry.py | ExecutorRegistry class-decorator self-registration pattern is well-implemented with duplicate-name protection and a clear get/list/clear API for testing. |
| workers/executor/tasks.py | Single execute_extraction entry point with sensible retry policy (ConnectionError, TimeoutError, OSError), proper context deserialization error handling, and lightweight log-correlation setup. |
| frontend/src/hooks/usePromptStudioSocket.js | New Socket.IO listener correctly maps prompt_studio_result events to UI state updates (clearResultStatuses, updatePromptOutputState); handles fetch_response, single_pass_extraction, and index_document operations cleanly. |
| frontend/src/hooks/usePromptRun.js | Correctly rewritten to fire-and-forget POST; 5-minute timeout safety net clears stale spinners; freeActiveApi() in finally is intentional (frees HTTP slot, not Celery task slot). |
| unstract/sdk1/src/unstract/sdk1/execution/orchestrator.py | ExecutionOrchestrator correctly wraps all executor exceptions into failed ExecutionResult, measures elapsed time, and logs lifecycle events. |

Sequence Diagram

sequenceDiagram
    participant FE as Frontend (PromptRun.jsx)
    participant DJV as Django View (views.py)
    participant PSH as PromptStudioHelper
    participant DISP as ExecutionDispatcher
    participant RMQ as RabbitMQ
    participant EW as Executor Worker (execute_extraction)
    participant LE as LegacyExecutor
    participant CBW as Callback Worker (prompt_studio_callback)
    participant SIO as Socket.IO
    participant FEH as Frontend (usePromptStudioSocket)

    FE->>DJV: POST /fetch_response/{tool_id}
    DJV->>PSH: build_fetch_response_payload() [ORM + extract + index - blocking]
    PSH-->>DJV: (ExecutionContext, cb_kwargs)
    DJV->>DISP: dispatch_with_callback(context, on_success=ide_prompt_complete, on_error=ide_prompt_error)
    DISP->>RMQ: send_task("execute_extraction", queue="celery_executor_legacy")
    DJV-->>FE: HTTP 202 {task_id, run_id, status:"accepted"}

    RMQ->>EW: execute_extraction(context_dict)
    EW->>LE: LegacyExecutor.execute(context)
    LE-->>EW: ExecutionResult
    EW-->>RMQ: result.to_dict() [Celery link callback triggered]

    RMQ->>CBW: ide_prompt_complete(result_dict, callback_kwargs)
    CBW->>CBW: OutputManagerHelper.handle_prompt_output_update() [ORM write]
    CBW->>SIO: _emit_websocket_event(room=log_events_id, event="prompt_studio_result")
    SIO-->>FEH: "prompt_studio_result" {status:"completed", operation, result}
    FEH->>FEH: handleCompleted() → updatePromptOutputState() + clearResultStatuses()
Prompt To Fix All With AI
This is a comment left during a code review.
Path: backend/prompt_studio/prompt_studio_core_v2/views.py
Line: 364-595

Comment:
**Missing feature flag gate on async endpoints**

The PR description states that all three IDE endpoints (`index_document`, `fetch_response`, `single_pass_extraction`) are gated behind the `async_prompt_execution` Flipt feature flag, with the old synchronous path preserved as a fallback when the flag is `OFF`. However, none of the three view methods contain any feature flag check — they unconditionally invoke the async/Celery path and return HTTP 202.

This means the breaking architectural change (fire-and-forget + Socket.IO result delivery) is deployed for **all users** regardless of the feature flag, directly contradicting the safe-rollout strategy described in the PR. When `async_prompt_execution=false`, users would still receive HTTP 202 with no result, because the old synchronous code path is never reached.

The sync fallback (e.g. delegating to the old `run_index_document` / `run_fetch_response` / `run_single_pass_extraction` Celery tasks or the direct helper methods) should be invoked when the flag is off.

How can I resolve this? If you propose a fix, please make it concise.
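A concise way to restore the documented fallback is a small gate helper; `flag_check`, `sync_path`, and `async_path` below are hypothetical stand-ins for the project's Flipt client and the existing HTTP 200 / HTTP 202 implementations — this is a sketch of the pattern, not the PR's code:

```python
def gate_on_flag(flag_check, sync_path, async_path, flag="async_prompt_execution"):
    """Return a view handler that picks the sync or async path per request.

    flag_check, sync_path and async_path are placeholders for the real
    Flipt lookup and the existing blocking / fire-and-forget view bodies.
    """
    def handler(*args, **kwargs):
        if flag_check(flag):
            return async_path(*args, **kwargs)  # new fire-and-forget path (202)
        return sync_path(*args, **kwargs)       # preserved blocking path (200)
    return handler
```

Each of the three view methods would then delegate to the handler instead of unconditionally dispatching to Celery.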

---

This is a comment left during a code review.
Path: backend/prompt_studio/prompt_studio_core_v2/views.py
Line: 597-629

Comment:
**`task_status` lacks task-ownership verification (IDOR risk)**

The endpoint looks up `task_id` directly in the Celery result backend without verifying that the task belongs to the tool identified by `pk`. A user who has legitimate access to any Prompt Studio tool can supply an arbitrary `task_id` from a different tool/user's execution and retrieve that execution's `result` (the full `ExecutionResult` dict, which may contain extracted document data).

For example:
```
GET /prompt-studio/<my_tool_pk>/task-status/<other_users_task_id>
```
The permission check only validates access to `pk` (via `IsOwnerOrSharedUserOrSharedToOrg`), not whether `task_id` was produced by operations on that tool.

Consider either (a) storing a `(tool_id, task_id)` mapping server-side and validating the lookup, or (b) returning only the task's `status` from this endpoint (omitting the full `result` payload, since the real result is already delivered via Socket.IO).

How can I resolve this? If you propose a fix, please make it concise.
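Option (a) can be sketched with two tiny helpers; the cache key format and the `cache` interface here are illustrative stand-ins, not the project's actual names:

```python
TASK_OWNER_KEY = "ps:task_owner:{task_id}"  # hypothetical key format

def record_task_owner(cache, tool_id, task_id, ttl=3600):
    # Called at dispatch time, right after executor_task_id is generated.
    cache.set(TASK_OWNER_KEY.format(task_id=task_id), str(tool_id), ttl)

def task_belongs_to_tool(cache, tool_id, task_id):
    # Called by task_status before touching the Celery result backend;
    # an unknown task_id or a mismatched tool both fail the check.
    return cache.get(TASK_OWNER_KEY.format(task_id=task_id)) == str(tool_id)
```

If the check fails, the endpoint can return 404 without ever reading the result backend.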

---

This is a comment left during a code review.
Path: backend/prompt_studio/prompt_studio_core_v2/views.py
Line: 401-403

Comment:
**Redundant `import uuid as _uuid` inside method bodies**

`uuid` is already imported at the module level (line 2). The three identical inner imports (`import uuid as _uuid` in `index_document`, `fetch_response`, and `single_pass_extraction`) are redundant. Simply use the already-imported `uuid.uuid4()`.

```suggestion
        executor_task_id = str(uuid.uuid4())
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: backend/prompt_studio/prompt_studio_core_v2/prompt_studio_helper.py
Line: 520-532

Comment:
**Null guard after the variable is already dereferenced**

`validate_adapter_status(profile_manager)` and `validate_profile_manager_owner_access(profile_manager)` are both called **before** the `if not profile_manager` guard. If `profile_manager` is `None` (e.g. when `prompt.profile_manager` is unset and no `profile_manager_id` is passed), those helper calls will raise an `AttributeError` inside them, not the intended `DefaultProfileError`. The guard at line 531–532 is effectively dead code for the `None` case.

The null check should be moved to immediately after `profile_manager` is resolved:

```python
profile_manager = prompt.profile_manager
if profile_manager_id:
    profile_manager = ProfileManagerHelper.get_profile_manager(
        profile_manager_id=profile_manager_id
    )

if not profile_manager:
    raise DefaultProfileError()

# Only then call validators
PromptStudioHelper.validate_adapter_status(profile_manager)
PromptStudioHelper.validate_profile_manager_owner_access(profile_manager)
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: backend/prompt_studio/prompt_studio_core_v2/prompt_studio_helper.py
Line: 733-747

Comment:
**Null guard on `default_profile` comes after it is already used**

`default_profile.chunk_size = 0` mutates the object **before** the `if not default_profile: raise DefaultProfileError()` check. If `ProfileManager.get_default_llm_profile(tool)` returns `None`, the assignment at line 744 would raise `AttributeError` rather than the intended `DefaultProfileError`. The guard is dead code for the `None` case.

Move the null check to immediately after `default_profile` is assigned (before the validators and the `chunk_size` assignment):

```python
default_profile = ProfileManager.get_default_llm_profile(tool)
if not default_profile:
    raise DefaultProfileError()

PromptStudioHelper.validate_adapter_status(default_profile)
PromptStudioHelper.validate_profile_manager_owner_access(default_profile)
default_profile.chunk_size = 0
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: backend/backend/worker_celery.py
Line: 85-92

Comment:
**Configured queue name `"executor"` doesn't match the actual dispatch queue**

`get_worker_celery_app()` registers `task_queues=[Queue("executor")]`, but `ExecutionDispatcher._get_queue()` (in `sdk1/execution/dispatcher.py`) constructs the actual queue name as `celery_executor_{executor_name}` — for the legacy executor this becomes `"celery_executor_legacy"`.

The queue declared on the app (`"executor"`) never matches the queue used by `send_task`, so this `task_queues` setting has no practical effect. While `send_task` with an explicit `queue` parameter bypasses queue routing and the task is delivered correctly, the misconfigured `task_queues` setting means any queue-routing policies (e.g. prefetch limits, fair scheduling) configured on `"executor"` will not apply.

Either align the queue name to `"celery_executor_legacy"` (or the appropriate prefix), or remove the stale `task_queues` declaration from this app's config if it is intentionally unused.

How can I resolve this? If you propose a fix, please make it concise.
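If aligning the names, the dispatcher's `celery_executor_{executor_name}` scheme described above can be mirrored in one shared helper so the declared queue and the dispatch queue cannot drift apart (the helper name is illustrative):

```python
EXECUTOR_QUEUE_PREFIX = "celery_executor_"

def executor_queue_name(executor_name: str) -> str:
    """Build the queue name the same way ExecutionDispatcher._get_queue() does."""
    return f"{EXECUTOR_QUEUE_PREFIX}{executor_name}"

# e.g. task_queues=[Queue(executor_queue_name("legacy"))] in get_worker_celery_app()
```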

Last reviewed commit: 4200ac1

# Check if highlight data should be removed using configuration registry
# Ensure workflow identification keys are always in item metadata
organization = api.organization if api else None
org_id = str(organization.organization_id) if organization else ""
Contributor

I don’t think this should be allowed when the organization is missing. Also, how does this work with an empty org_id?
cc: @vishnuszipstack

) -> None:
    """Inject per-model usage breakdown into item['result']['metadata']."""
    inner_result = item.get("result")
    if not isinstance(inner_result, dict):
Contributor

NIT: improve the `ExecutionResponse` class by adding a DTO for `result`.

)
return APIExecutionResponseSerializer(result).data

@staticmethod
Contributor

Hope there is no structure change of `result` here. Can you please add the model/sample in the description, or alongside the `ExecutionResponse` class?

_worker_app: Celery | None = None


class _WorkerDispatchCelery(Celery):
Contributor

Why Celery here? We already moved it out of the backend. What do these methods do here?

Contributor Author

@muhammad-ali-e The backend Celery worker handles fire-and-forget callback tasks that run after the executor worker finishes. Here's the flow:

Backend dispatches task → Executor Worker (does the heavy lifting)
↓ (Celery link/link_error)
Backend Callback Task (lightweight)
├── ORM writes (persist results to DB)
└── WebSocket push (notify frontend in real-time)

Why these run on the backend (not the executor worker):

  • They need Django ORM access (database models, services) — the executor worker doesn't have Django loaded
  • They need the Socket.IO emitter to push real-time updates to the frontend
  • They're lightweight — just DB writes + WebSocket emit, no heavy computation
  • Keeps the executor worker stateless and focused on execution only
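The link/link_error contract described above — exactly one callback fires per task — can be simulated in plain Python; the function below is a stand-in illustrating the flow, not Celery's actual API:

```python
def run_with_callbacks(task, payload, on_success, on_error):
    """Mimic Celery's link/link_error: success fires one callback, failure the other."""
    try:
        result = task(payload)
    except Exception as exc:
        on_error(exc)      # e.g. ide_prompt_error -> Socket.IO "failed" event
        raise
    on_success(result)     # e.g. ide_prompt_complete -> ORM write + "completed" event
    return result
```

In the real system, `task` runs on the executor worker while both callbacks run on the backend's `prompt_studio_callback` queue, where Django and the Socket.IO emitter are available.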

the action.
"""
profile_manager_owner = profile_manager.created_by
if profile_manager_owner is None:
Contributor

This `created_by` has a default value, right? When will it be None?

@athul-rs athul-rs self-requested a review March 18, 2026 04:51
Comment on lines +61 to +63
global _worker_app
if _worker_app is not None:
    return _worker_app
Contributor

P2 Unsynchronised singleton initialisation — race condition under concurrent requests

get_worker_celery_app() uses the classic double-check-without-lock pattern:

if _worker_app is not None:
    return _worker_app

Under gunicorn with threaded workers (or any multi-threaded Django deployment), two threads can simultaneously see _worker_app is None and both proceed to create a new _WorkerDispatchCelery instance. The second assignment overwrites the first (last-writer-wins), so each thread may end up holding a reference to a different object than what ends up in the module global. This is benign in practice because both instances are configured identically, but it is wasteful and could cause subtle issues if Celery connection pools are per-instance.

The idiomatic Python fix is to use a module-level lock:

import threading
_worker_app: Celery | None = None
_worker_app_lock = threading.Lock()

def get_worker_celery_app() -> Celery:
    global _worker_app
    if _worker_app is not None:
        return _worker_app
    with _worker_app_lock:
        if _worker_app is None:   # re-check inside lock
            ...
            _worker_app = app
    return _worker_app

@athul-rs
Copy link
Contributor

Code review

Found 1 issue:

  1. Missing feature flag gate on async endpoints. The PR description states that all three endpoints (index_document, fetch_response, single_pass_extraction) are gated behind the async_prompt_execution Flipt feature flag, with the old sync path as fallback when the flag is OFF. However, the actual code contains no feature flag check — all three endpoints unconditionally dispatch to Celery and return HTTP 202. The old sync path is fully replaced. This means merging this PR immediately switches all users to the async path with no rollback mechanism, contradicting the stated safety guarantee of "When flag is OFF (default), all 3 endpoints use the old sync path returning HTTP 200."

@action(detail=True, methods=["post"])
def index_document(self, request: HttpRequest, pk: Any = None) -> Response:
    """API Entry point method to index input file.

    Builds the full execution payload (ORM work), then fires a
    single executor task with Celery link/link_error callbacks.
    The backend worker slot is freed immediately.

    Args:
        request (HttpRequest)

    Raises:
        IndexingError
        ValidationError

    Returns:
        Response
    """
    tool = self.get_object()
    serializer = PromptStudioIndexSerializer(data=request.data)
    serializer.is_valid(raise_exception=True)
    document_id: str = serializer.validated_data.get(ToolStudioPromptKeys.DOCUMENT_ID)
    document: DocumentManager = DocumentManager.objects.get(pk=document_id)
    file_name: str = document.document_name
    run_id = CommonUtils.generate_uuid()
    context, cb_kwargs = PromptStudioHelper.build_index_payload(
        tool_id=str(tool.tool_id),
        file_name=file_name,
        org_id=UserSessionUtils.get_organization_id(request),
        user_id=tool.created_by.user_id,
        document_id=document_id,
        run_id=run_id,
    )
    dispatcher = PromptStudioHelper._get_dispatcher()
    # Pre-generate task ID so callbacks can reference it
    import uuid as _uuid

    executor_task_id = str(_uuid.uuid4())
    cb_kwargs["executor_task_id"] = executor_task_id
    task = dispatcher.dispatch_with_callback(
        context,
        on_success=signature(
            "ide_index_complete",
            kwargs={"callback_kwargs": cb_kwargs},
            queue="prompt_studio_callback",
        ),
        on_error=signature(
            "ide_index_error",
            kwargs={"callback_kwargs": cb_kwargs},
            queue="prompt_studio_callback",
        ),
        task_id=executor_task_id,
    )
    return Response(
        {"task_id": task.id, "run_id": run_id, "status": "accepted"},
        status=status.HTTP_202_ACCEPTED,
    )

@action(detail=True, methods=["post"])
def fetch_response(self, request: HttpRequest, pk: Any = None) -> Response:
    """API Entry point method to fetch response to prompt.

    Builds the full execution payload (ORM work), then fires a
    single executor task with Celery link/link_error callbacks.

    Args:
        request (HttpRequest)

    Returns:
        Response
    """
    custom_tool = self.get_object()
    document_id: str = request.data.get(ToolStudioPromptKeys.DOCUMENT_ID)
    prompt_id: str = request.data.get(ToolStudioPromptKeys.ID)
    run_id: str = request.data.get(ToolStudioPromptKeys.RUN_ID)
    profile_manager_id: str = request.data.get(
        ToolStudioPromptKeys.PROFILE_MANAGER_ID
    )
    if not run_id:
        run_id = CommonUtils.generate_uuid()
    org_id = UserSessionUtils.get_organization_id(request)
    user_id = custom_tool.created_by.user_id
    # Resolve prompt
    prompt = ToolStudioPrompt.objects.get(pk=prompt_id)
    # Build file path
    doc_path = PromptStudioFileHelper.get_or_create_prompt_studio_subdirectory(
        org_id,
        is_create=False,
        user_id=user_id,
        tool_id=str(custom_tool.tool_id),
    )
    document: DocumentManager = DocumentManager.objects.get(pk=document_id)
    doc_path = str(Path(doc_path) / document.document_name)
    context, cb_kwargs = PromptStudioHelper.build_fetch_response_payload(
        tool=custom_tool,
        doc_path=doc_path,
        doc_name=document.document_name,
        prompt=prompt,
        org_id=org_id,
        user_id=user_id,
        document_id=document_id,
        run_id=run_id,
        profile_manager_id=profile_manager_id,
    )
    # If document is being indexed, return pending status
    if context is None:
        return Response(cb_kwargs, status=status.HTTP_200_OK)
    dispatcher = PromptStudioHelper._get_dispatcher()
    import uuid as _uuid

    executor_task_id = str(_uuid.uuid4())
    cb_kwargs["executor_task_id"] = executor_task_id
    task = dispatcher.dispatch_with_callback(
        context,
        on_success=signature(
            "ide_prompt_complete",
            kwargs={"callback_kwargs": cb_kwargs},
            queue="prompt_studio_callback",
        ),
        on_error=signature(
            "ide_prompt_error",
            kwargs={"callback_kwargs": cb_kwargs},
            queue="prompt_studio_callback",
        ),
        task_id=executor_task_id,
    )
    return Response(
        {"task_id": task.id, "run_id": run_id, "status": "accepted"},
        status=status.HTTP_202_ACCEPTED,
    )

@action(detail=True, methods=["post"])
def single_pass_extraction(self, request: HttpRequest, pk: uuid) -> Response:
    """API Entry point method for single pass extraction.

    Builds the full execution payload (ORM work), then fires a
    single executor task with Celery link/link_error callbacks.

    Args:
        request (HttpRequest)
        pk: Primary key of the CustomTool

    Returns:
        Response
    """
    custom_tool = self.get_object()
    document_id: str = request.data.get(ToolStudioPromptKeys.DOCUMENT_ID)
    run_id: str = request.data.get(ToolStudioPromptKeys.RUN_ID)
    if not run_id:
        run_id = CommonUtils.generate_uuid()
    org_id = UserSessionUtils.get_organization_id(request)
    user_id = custom_tool.created_by.user_id
    # Build file path
    doc_path = PromptStudioFileHelper.get_or_create_prompt_studio_subdirectory(
        org_id,
        is_create=False,
        user_id=user_id,
        tool_id=str(custom_tool.tool_id),
    )
    document: DocumentManager = DocumentManager.objects.get(pk=document_id)
    doc_path = str(Path(doc_path) / document.document_name)
    # Fetch prompts eligible for single-pass extraction.
    # Mirrors the filtering in _execute_prompts_in_single_pass:
    # only active, non-NOTES, non-TABLE/RECORD prompts.
    prompts = list(
        ToolStudioPrompt.objects.filter(tool_id=custom_tool.tool_id).order_by(
            "sequence_number"
        )
    )
    prompts = [
        p
        for p in prompts
        if p.prompt_type != ToolStudioPromptKeys.NOTES
        and p.active
        and p.enforce_type != ToolStudioPromptKeys.TABLE
        and p.enforce_type != ToolStudioPromptKeys.RECORD
    ]
    if not prompts:
        return Response(
            {"error": "No active prompts found for single pass extraction."},
            status=status.HTTP_400_BAD_REQUEST,
        )
    context, cb_kwargs = PromptStudioHelper.build_single_pass_payload(
        tool=custom_tool,
        doc_path=doc_path,
        doc_name=document.document_name,
        prompts=prompts,
        org_id=org_id,
        document_id=document_id,
        run_id=run_id,
    )
    dispatcher = PromptStudioHelper._get_dispatcher()
    import uuid as _uuid

    executor_task_id = str(_uuid.uuid4())
    cb_kwargs["executor_task_id"] = executor_task_id
    task = dispatcher.dispatch_with_callback(
        context,
        on_success=signature(
            "ide_prompt_complete",
            kwargs={"callback_kwargs": cb_kwargs},
            queue="prompt_studio_callback",
        ),
        on_error=signature(
            "ide_prompt_error",
            kwargs={"callback_kwargs": cb_kwargs},
            queue="prompt_studio_callback",
        ),
        task_id=executor_task_id,
    )
    return Response(
        {"task_id": task.id, "run_id": run_id, "status": "accepted"},
        status=status.HTTP_202_ACCEPTED,
    )

🤖 Generated with Claude Code


Comment on lines +44 to +76
def _is_safe_public_url(url: str) -> bool:
    """Validate webhook URL for SSRF protection.

    Only allows HTTPS and blocks private/loopback/internal addresses.
    """
    try:
        p = urlparse(url)
        if p.scheme not in ("https",):
            return False
        host = p.hostname or ""
        if host in ("localhost",):
            return False

        addrs = _resolve_host_addresses(host)
        if not addrs:
            return False

        for addr in addrs:
            try:
                ip = ipaddress.ip_address(addr)
            except ValueError:
                return False
            if (
                ip.is_private
                or ip.is_loopback
                or ip.is_link_local
                or ip.is_reserved
                or ip.is_multicast
            ):
                return False
        return True
    except Exception:
        return False
Contributor

P1 DNS rebinding (TOCTOU) bypasses SSRF protection

_is_safe_public_url resolves the webhook hostname via DNS at validation time, but the actual HTTP request to the webhook (via postprocess_data) happens afterwards. An attacker who controls a DNS server can perform a DNS rebinding attack:

  1. During validation: attacker.com → 1.2.3.4 (a public IP) — validation passes.
  2. Attacker flips DNS TTL to 0 and rebinds attacker.com → 169.254.169.254 (AWS metadata), 10.0.0.1 (internal service), etc.
  3. During the actual HTTP request: the OS re-resolves attacker.com and connects to the internal address.

The current is_private / is_loopback / etc. checks are bypassed entirely because they only guard the validation-time resolution. This is a real risk in a multi-tenant environment where arbitrary webhook URLs can be registered.

The standard mitigation is to make the HTTP request through a socket-level wrapper that re-validates the resolved IP at connection time — for example by overriding the socket's connect() to check the destination address immediately before the connection is established, or by using a library such as urllib3-pyOpenSSL with a custom ProxyManager, or setting a fixed DNS-resolved IP on the requests session host header.
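The pinning step of that mitigation can be sketched as a pure function: resolve and validate once, then connect to the validated IP while preserving the original Host header. This is an illustrative helper, not the PR's code, and for HTTPS the caller still needs SNI/certificate handling against the original hostname (e.g. a custom transport adapter) — only the URL rewrite is shown:

```python
from urllib.parse import urlparse, urlunparse

def pin_resolved_ip(url: str, validated_ip: str) -> tuple[str, dict]:
    """Rewrite url to target the already-validated IP, keeping virtual
    hosting intact via an explicit Host header."""
    p = urlparse(url)
    host = p.hostname or ""
    # Preserve any explicit port while swapping the hostname for the IP
    netloc = validated_ip if p.port is None else f"{validated_ip}:{p.port}"
    pinned = urlunparse(p._replace(netloc=netloc))
    return pinned, {"Host": host}
```

The HTTP client then connects to `pinned` with the returned header, so a post-validation DNS rebind can no longer change the destination address.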


harini-venkataraman and others added 2 commits March 19, 2026 14:39
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
@harini-venkataraman
Copy link
Contributor Author

@claude review

Comment on lines +392 to +395
DocumentIndexingService.set_document_indexing(
    org_id=org_id, user_id=user_id, doc_id_key=doc_id_key
)

Contributor

P1 set_document_indexing not rolled back on broker failure

DocumentIndexingService.set_document_indexing(...) is called at the end of build_index_payload (before dispatch_with_callback is called in the view). If dispatch_with_callback subsequently raises — for example, because the broker is unavailable, the Celery app is not configured, or any other exception — the document is permanently left in the "indexing in progress" state. The ide_index_error errback is never invoked because no task was dispatched, so there is no mechanism to clear the stuck flag.

The view code does not wrap dispatch_with_callback in a try/except that would call DocumentIndexingService.remove_document_indexing(...) on failure. The result is that the user sees an infinite "indexing in progress" indicator and cannot re-index the document without manual DB intervention.

To fix this, either:

  1. Move set_document_indexing to after successful dispatch (wrap the dispatch, set the flag only on success), or
  2. Add a try/except around dispatch_with_callback in the view that calls remove_document_indexing on error.
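Option 2 can be sketched as a small wrapper; the service and dispatcher names mirror the ones mentioned above, but the wrapper itself is hypothetical:

```python
def dispatch_with_rollback(dispatcher, indexing_service, context, **flag_kwargs):
    """Clear the 'indexing in progress' flag if the dispatch never reaches the broker."""
    try:
        return dispatcher.dispatch_with_callback(context)
    except Exception:
        # No task was queued, so ide_index_error will never fire; undo the flag here.
        indexing_service.remove_document_indexing(**flag_kwargs)
        raise
```

The view would call this instead of `dispatch_with_callback` directly, passing the same `org_id`/`user_id`/`doc_id_key` used when the flag was set.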

Comment on lines +855 to +865
cb_kwargs = {
    "log_events_id": log_events_id,
    "request_id": request_id,
    "org_id": org_id,
    "operation": "single_pass_extraction",
    "run_id": run_id,
    "document_id": document_id,
    "tool_id": tool_id,
    "prompt_ids": [str(p.prompt_id) for p in prompts],
    "is_single_pass": True,
}
Contributor

P1 Missing profile_manager_id in single_pass_extraction cb_kwargs

build_single_pass_payload does not include profile_manager_id in its cb_kwargs (lines 855–865). When ide_prompt_complete processes this callback it reads:

profile_manager_id = cb.get("profile_manager_id")  # always None for single-pass

and passes profile_manager_id=None to OutputManagerHelper.handle_prompt_output_update. Depending on how that helper uses the field, single-pass outputs may not be correctly associated with the profile manager, producing a different storage behavior than the fetch_response path (which always passes the explicit profile_manager_id).

More concretely, when ide_prompt_error fires for a single-pass failure, the emitted error event also lacks profile_manager_id. The frontend's handleFailed falls through to the broad clearPromptStatusById(promptId) fallback, which clears ALL doc/profile status combinations for those prompts — not just the one that was actually running. This means an error in one single-pass run would cancel the loading spinners for unrelated concurrent runs.

Consider adding the default profile's profile_id to cb_kwargs:

cb_kwargs = {
    ...
    "profile_manager_id": str(default_profile.profile_id),
    ...
}

Comment on lines +131 to +149
const onResult = useCallback(
  (payload) => {
    try {
      const msg = payload?.data || payload;
      const { status, operation, result, error, ...extra } = msg;

      if (status === "completed") {
        handleCompleted(operation, result);
      } else if (status === "failed") {
        handleFailed(operation, error, extra);
      }
    } catch (err) {
      setAlertDetails(
        handleException(err, "Failed to process prompt studio result"),
      );
    }
  },
  [handleCompleted, handleFailed, setAlertDetails, handleException],
);
Contributor

P1 Socket result event not scoped to the current tool — multi-tab state corruption

prompt_studio_result events are emitted to the log_events_id Socket.IO room, which is per-user-session, not per-tool or per-tab. If a user has two Prompt Studio tools open simultaneously in separate tabs (both sharing the same Socket.IO connection and log_events_id), a result from Tool A's execution will be received and processed by Tab B's usePromptStudioSocket listener as well.

In handleCompleted("fetch_response", result):

updatePromptOutputState(data, false);  // writes Tool A's outputs into Tab B's store
clearResultStatuses(data);             // tries to clear statuses using Tool A's prompt IDs

updatePromptOutputState in Tab B would overwrite prompt output state with data belonging to Tool A's prompts. This can cause phantom outputs to appear under the wrong tool and leave Tab B in an inconsistent state.

The socket event payload (_emit_result in tasks.py) does not include a tool_id field, so the frontend has no way to discard irrelevant events. Consider adding tool_id (or custom_tool_id) to the emitted payload and filtering it in onResult:

const onResult = useCallback((payload) => {
  const msg = payload?.data || payload;
  if (msg.tool_id && msg.tool_id !== details?.tool_id) return; // ignore events for other tools
  ...
}, [..., details?.tool_id]);

Comment on lines +28 to +44
```js
const clearResultStatuses = useCallback(
  (data) => {
    if (!Array.isArray(data)) {
      return;
    }
    data.forEach((item) => {
      const promptId = item?.prompt_id;
      const docId = item?.document_manager;
      const profileId = item?.profile_manager;
      if (promptId && docId && profileId) {
        const statusKey = generateApiRunStatusId(docId, profileId);
        removePromptStatus(promptId, statusKey);
      }
    });
  },
  [removePromptStatus],
);
```

P1 **`clearResultStatuses` spinner-clearing may permanently fail**

`clearResultStatuses` derives the status key from `item.profile_manager` on the result data items. The status was originally stored using a `profileId` taken directly from the queue item string — a raw UUID string. For `clearResultStatuses` to match and call `removePromptStatus`, `item.profile_manager` in the result data must be the exact same UUID string.

If `OutputManagerHelper.handle_prompt_output_update` returns serialized objects where `profile_manager` is an integer PK, a nested object, or `null`, the condition `if (promptId && docId && profileId)` will be false, `removePromptStatus` will never be called, and the loading spinner for the prompt will remain active forever. The user would be unable to re-run the prompt without a page refresh.

The old polling path avoided this by explicitly removing the status with the IDs already available in the callback closure. The new socket path has no such explicit fallback.

Consider including `prompt_ids`, `document_id`, and `profile_manager_id` in the socket event payload (they are already present in `cb_kwargs`) so the frontend can always do a direct cleanup regardless of the result data format, rather than relying on parsing the ORM-serialized result items.
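That payload change could be sketched like this; the field names and the `cb_kwargs` keys are assumptions taken from this comment, not the actual task contract:

```python
# Hypothetical sketch: carry the identifiers already present in cb_kwargs into
# the socket payload as plain strings, so status cleanup never depends on how
# the ORM serializes profile_manager on the result items.
def build_cleanup_fields(cb_kwargs: dict) -> dict:
    return {
        "prompt_ids": [str(p) for p in cb_kwargs.get("prompt_ids", [])],
        "document_id": str(cb_kwargs["document_id"]),
        "profile_manager_id": str(cb_kwargs["profile_manager_id"]),
    }
```

With these fields in the event, the frontend can clear statuses directly instead of parsing result items.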


harini-venkataraman and others added 2 commits March 19, 2026 15:17
…t drift

- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines +56 to +70
```js
const statusKey = generateApiRunStatusId(docId, profileId);
removePromptStatus(promptId, statusKey);
setAlertDetails({
  type: "info",
  content:
    res?.data?.message || "Document is being indexed. Please wait.",
});
return;
}

// Timeout safety net: clear stale status if socket event never arrives.
setTimeout(() => {
  const statusKey = generateApiRunStatusId(docId, profileId);
  const current = usePromptRunStatusStore.getState().promptRunStatus;
  if (
```

P1 **5-minute timeout can falsely cancel a later re-run of the same prompt**

The `setTimeout` closure captures `promptId`, `docId`, and `profileId` from run N. When it fires 5 minutes later it reads the *current* store state and checks whether `current?.[promptId]?.[statusKey] === PROMPT_RUN_API_STATUSES.RUNNING`. If the user triggered another run of the same `(promptId, docId, profileId)` combination (run N+1) within that 5-minute window, the timer from run N will see run N+1's `RUNNING` state, clear it, and display a spurious "timed out" warning — even though run N+1 may still be processing and will later receive a valid socket event. The result is a permanently stuck spinner for run N+1 (status cleared by the stale timer) while the socket result arrives and tries to call `clearResultStatuses` on an already-removed entry.

Mitigation: tag each dispatch with a unique `runId` and include it in the status store key, so the timeout only clears the *specific* run it corresponds to:

```js
const runNonce = generateUUID();
addPromptStatus(promptId, statusKey, PROMPT_RUN_API_STATUSES.RUNNING, runNonce);

setTimeout(() => {
  const current = usePromptRunStatusStore.getState().promptRunStatus;
  if (current?.[promptId]?.[statusKey]?.nonce === runNonce) {
    removePromptStatus(promptId, statusKey);
    setAlertDetails({ type: "warning", content: "Prompt execution timed out. Please try again." });
  }
}, SOCKET_TIMEOUT_MS);
```
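For a nonce check like that to work, the status store would also have to keep the nonce alongside each status entry. A self-contained sketch of such a store (the real one is a zustand store in the frontend; these names and shapes are assumptions for illustration):

```javascript
// Hypothetical sketch: a prompt-run status store whose entries carry the
// nonce of the run that created them, so a stale timer can recognise that
// the RUNNING state it sees belongs to a newer run and leave it alone.
function createPromptRunStatusStore() {
  let promptRunStatus = {};
  return {
    getState: () => ({ promptRunStatus }),
    addPromptStatus(promptId, statusKey, status, nonce) {
      promptRunStatus = {
        ...promptRunStatus,
        [promptId]: {
          ...(promptRunStatus[promptId] || {}),
          // store the status together with the nonce that created it
          [statusKey]: { status, nonce },
        },
      };
    },
    removePromptStatus(promptId, statusKey) {
      const entry = { ...(promptRunStatus[promptId] || {}) };
      delete entry[statusKey];
      promptRunStatus = { ...promptRunStatus, [promptId]: entry };
    },
  };
}
```

Callers that only care about the status would read `entry.status` instead of the bare string, which is the one breaking change this shape introduces.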

Comment on lines +297 to +302
```python
        )
        return str(platform_key.key)

    # ------------------------------------------------------------------
    # Phase 5B — Payload builders for fire-and-forget dispatch
    # ------------------------------------------------------------------
```

P1 **`default_profile` dereferenced before null guard in `build_index_payload`**

`build_index_payload` calls `ProfileManager.get_default_llm_profile(tool)`, then immediately passes the result to `validate_adapter_status` and `validate_profile_manager_owner_access` without any null check. If no default LLM profile is configured for the tool, `get_default_llm_profile` returns `None` and both validators will raise `AttributeError` deep inside the helper, surfacing as an opaque 500 error instead of the intended `DefaultProfileError`.

The same defensiveness present in `build_single_pass_payload` (`if not default_profile: raise DefaultProfileError()`) should be applied here:

```python
default_profile = ProfileManager.get_default_llm_profile(tool)
if not default_profile:
    raise DefaultProfileError()

PromptStudioHelper.validate_adapter_status(default_profile)
PromptStudioHelper.validate_profile_manager_owner_access(default_profile)
```

Comment on lines 430 to +443

```python
    Raises:
        FilenameMissingError: _description_
    Args:
        request (HttpRequest)

    Returns:
        Response
    """
    custom_tool = self.get_object()
    tool_id: str = str(custom_tool.tool_id)
    document_id: str = request.data.get(ToolStudioPromptKeys.DOCUMENT_ID)
    id: str = request.data.get(ToolStudioPromptKeys.ID)
    prompt_id: str = request.data.get(ToolStudioPromptKeys.ID)
    run_id: str = request.data.get(ToolStudioPromptKeys.RUN_ID)
    profile_manager: str = request.data.get(ToolStudioPromptKeys.PROFILE_MANAGER_ID)
    profile_manager_id: str = request.data.get(
        ToolStudioPromptKeys.PROFILE_MANAGER_ID
    )
```

P2 **HubSpot first-run analytics event silently dropped in async path**

The old sync `fetch_response` path tracked `output_count_before` and called `notify_hubspot_event(user, "PROMPT_RUN", is_first_for_org=..., ...)` to fire a business analytics event on the first prompt run for an organisation. The new async path removes both the count query and the notification call entirely with no comment or TODO.

If this is intentional (e.g., to be re-added once the async path is stable), a comment noting this would prevent it from being permanently lost. If it is unintentional, first-run HubSpot events will silently stop firing for any organisation that has `async_prompt_execution` enabled, skewing adoption metrics.
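If the omission is unintentional, the first-run event could be restored in the async view just before dispatch. A hedged sketch, where `track_prompt_run` and its arguments are stand-ins modeled on the sync path described above (the output-count query and `notify_hubspot_event` are injected, not reimplemented):

```python
# Hypothetical sketch: fire the PROMPT_RUN analytics event before dispatching
# the async task. output_count_before stands in for the pre-dispatch output
# count query; notify stands in for notify_hubspot_event.
def track_prompt_run(user, output_count_before: int, notify) -> None:
    try:
        # First run for the organisation iff no outputs existed beforehand.
        notify(user, "PROMPT_RUN", is_first_for_org=output_count_before == 0)
    except Exception:
        # Analytics is best-effort and must never block prompt execution.
        pass
```

Running the count query before `dispatch_with_callback` keeps the "first run" semantics identical to the sync path, since the callback task has no cheap way to know the pre-run count.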


@github-actions

Frontend Lint Report (Biome)

All checks passed! No linting or formatting issues found.

@github-actions

Test Results

Summary
  • Runner Tests: 11 passed, 0 failed (11 total)
  • SDK1 Tests: 142 passed, 0 failed (142 total)

Runner Tests - Full Report
| filepath | function | passed | SUBTOTAL |
| --- | --- | --- | --- |
| runner/src/unstract/runner/clients/test_docker.py | test_logs | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_cleanup | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_cleanup_skip | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_client_init | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_image_exists | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_image | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_container_run_config | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_container_run_config_without_mount | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_run_container | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_image_for_sidecar | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_sidecar_container | 1 | 1 |
| TOTAL | | 11 | 11 |
SDK1 Tests - Full Report
| filepath | function | passed | SUBTOTAL |
| --- | --- | --- | --- |
| tests/test_execution.py | TestExecutionContext.test_round_trip_serialization | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_json_serializable | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_enum_values_normalized | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_string_values_accepted | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_auto_generates_request_id | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_explicit_request_id_preserved | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_optional_organization_id | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_empty_executor_params_default | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_complex_executor_params | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_validation_rejects_empty_required_fields | 4 | 4 |
| tests/test_execution.py | TestExecutionContext.test_all_operations_accepted | 1 | 1 |
| tests/test_execution.py | TestExecutionContext.test_from_dict_missing_optional_fields | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_success_round_trip | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_failure_round_trip | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_json_serializable | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_failure_requires_error_message | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_success_allows_no_error | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_failure_factory | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_failure_factory_no_metadata | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_error_not_in_success_dict | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_error_in_failure_dict | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_default_empty_dicts | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_from_dict_missing_optional_fields | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_response_contract_extract | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_response_contract_index | 1 | 1 |
| tests/test_execution.py | TestExecutionResult.test_response_contract_answer_prompt | 1 | 1 |
| tests/test_execution.py | TestBaseExecutor.test_cannot_instantiate_abstract | 1 | 1 |
| tests/test_execution.py | TestBaseExecutor.test_concrete_subclass_works | 1 | 1 |
| tests/test_execution.py | TestBaseExecutor.test_execute_returns_result | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_register_and_get | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_get_returns_fresh_instance | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_register_as_decorator | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_list_executors | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_list_executors_empty | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_get_unknown_raises_key_error | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_get_unknown_lists_available | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_duplicate_name_raises_value_error | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_register_non_subclass_raises_type_error | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_register_non_class_raises_type_error | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_clear | 1 | 1 |
| tests/test_execution.py | TestExecutorRegistry.test_execute_through_registry | 1 | 1 |
| tests/test_execution.py | TestExecutionOrchestrator.test_dispatches_to_correct_executor | 1 | 1 |
| tests/test_execution.py | TestExecutionOrchestrator.test_unknown_executor_returns_failure | 1 | 1 |
| tests/test_execution.py | TestExecutionOrchestrator.test_executor_exception_returns_failure | 1 | 1 |
| tests/test_execution.py | TestExecutionOrchestrator.test_exception_result_has_elapsed_metadata | 1 | 1 |
| tests/test_execution.py | TestExecutionOrchestrator.test_successful_result_passed_through | 1 | 1 |
| tests/test_execution.py | TestExecutionOrchestrator.test_executor_returning_failure_is_not_wrapped | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_sends_task_and_returns_result | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_uses_default_timeout | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_timeout_from_env | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_explicit_timeout_overrides_env | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_timeout_returns_failure | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_generic_exception_returns_failure | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_async_returns_task_id | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_no_app_raises_value_error | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_async_no_app_raises_value_error | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_failure_result_from_executor | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_context_serialized_correctly | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_with_callback_sends_link_and_link_error | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_with_callback_success_only | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_with_callback_error_only | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_with_callback_no_callbacks | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_with_callback_returns_async_result | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_with_callback_no_app_raises_value_error | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_with_callback_context_serialized | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_with_callback_custom_task_id | 1 | 1 |
| tests/test_execution.py | TestExecutionDispatcher.test_dispatch_with_callback_no_task_id_omits_kwarg | 1 | 1 |
| tests/test_execution.py | TestExecutorToolShim.test_platform_api_key_returned | 1 | 1 |
| tests/test_execution.py | TestExecutorToolShim.test_platform_api_key_missing_raises | 1 | 1 |
| tests/test_execution.py | TestExecutorToolShim.test_other_env_var_from_environ | 1 | 1 |
| tests/test_execution.py | TestExecutorToolShim.test_missing_env_var_raises | 1 | 1 |
| tests/test_execution.py | TestExecutorToolShim.test_empty_env_var_raises | 1 | 1 |
| tests/test_execution.py | TestExecutorToolShim.test_stream_log_routes_to_logging | 1 | 1 |
| tests/test_execution.py | TestExecutorToolShim.test_stream_log_respects_level | 1 | 1 |
| tests/test_execution.py | TestExecutorToolShim.test_stream_error_and_exit_raises_sdk_error | 1 | 1 |
| tests/test_execution.py | TestExecutorToolShim.test_stream_error_and_exit_wraps_original | 1 | 1 |
| tests/test_platform.py | TestPlatformHelperRetry.test_success_on_first_attempt | 2 | 2 |
| tests/test_platform.py | TestPlatformHelperRetry.test_retry_on_connection_error | 2 | 2 |
| tests/test_platform.py | TestPlatformHelperRetry.test_non_retryable_http_error | 1 | 1 |
| tests/test_platform.py | TestPlatformHelperRetry.test_retryable_http_errors | 3 | 3 |
| tests/test_platform.py | TestPlatformHelperRetry.test_post_method_retry | 1 | 1 |
| tests/test_platform.py | TestPlatformHelperRetry.test_retry_logging | 1 | 1 |
| tests/test_prompt.py | TestPromptToolRetry.test_success_on_first_attempt | 1 | 1 |
| tests/test_prompt.py | TestPromptToolRetry.test_retry_on_errors | 2 | 2 |
| tests/test_prompt.py | TestPromptToolRetry.test_wrapper_methods_retry | 4 | 4 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_connection_error_is_retryable | 1 | 1 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_timeout_is_retryable | 1 | 1 |
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestIsRetryableError.test\_http\_error\_retryable\_status\_codes}}$$ $$\textcolor{#23d18b}{\tt{3}}$$ $$\textcolor{#23d18b}{\tt{3}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestIsRetryableError.test\_http\_error\_non\_retryable\_status\_codes}}$$ $$\textcolor{#23d18b}{\tt{5}}$$ $$\textcolor{#23d18b}{\tt{5}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestIsRetryableError.test\_http\_error\_without\_response}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestIsRetryableError.test\_os\_error\_retryable\_errno}}$$ $$\textcolor{#23d18b}{\tt{5}}$$ $$\textcolor{#23d18b}{\tt{5}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestIsRetryableError.test\_os\_error\_non\_retryable\_errno}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestIsRetryableError.test\_other\_exception\_not\_retryable}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCalculateDelay.test\_exponential\_backoff\_without\_jitter}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCalculateDelay.test\_exponential\_backoff\_with\_jitter}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCalculateDelay.test\_max\_delay\_cap}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCalculateDelay.test\_max\_delay\_cap\_with\_jitter}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestRetryWithExponentialBackoff.test\_successful\_call\_first\_attempt}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestRetryWithExponentialBackoff.test\_retry\_after\_transient\_failure}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestRetryWithExponentialBackoff.test\_max\_retries\_exceeded}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestRetryWithExponentialBackoff.test\_retry\_with\_custom\_predicate}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestRetryWithExponentialBackoff.test\_no\_retry\_with\_predicate\_false}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestRetryWithExponentialBackoff.test\_exception\_not\_in\_tuple\_not\_retried}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_default\_configuration}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_environment\_variable\_configuration}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_invalid\_max\_retries}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_invalid\_base\_delay}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_invalid\_multiplier}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_jitter\_values}}$$ $$\textcolor{#23d18b}{\tt{2}}$$ $$\textcolor{#23d18b}{\tt{2}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_custom\_exceptions\_only}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_custom\_predicate\_only}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_both\_exceptions\_and\_predicate}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestCreateRetryDecorator.test\_exceptions\_match\_but\_predicate\_false}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestPreconfiguredDecorators.test\_retry\_platform\_service\_call\_exists}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestPreconfiguredDecorators.test\_retry\_prompt\_service\_call\_exists}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestPreconfiguredDecorators.test\_platform\_service\_decorator\_retries\_on\_connection\_error}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestPreconfiguredDecorators.test\_prompt\_service\_decorator\_retries\_on\_timeout}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestRetryLogging.test\_warning\_logged\_on\_retry}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestRetryLogging.test\_info\_logged\_on\_success\_after\_retry}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{tests/utils/test\_retry\_utils.py}}$$ $$\textcolor{#23d18b}{\tt{TestRetryLogging.test\_exception\_logged\_on\_giving\_up}}$$ $$\textcolor{#23d18b}{\tt{1}}$$ $$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{TOTAL}}$$ $$\textcolor{#23d18b}{\tt{142}}$$ $$\textcolor{#23d18b}{\tt{142}}$$

@sonarqubecloud

**Quality Gate failed**

Failed conditions:
- 6.9% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Comment on lines +484 to +491
```python
            dp = ProfileManager.get_default_llm_profile(tool)
            monitor_llm = str(dp.llm.id)
        if challenge_llm_instance:
            challenge_llm = str(challenge_llm_instance.id)
        else:
            dp = ProfileManager.get_default_llm_profile(tool)
            challenge_llm = str(dp.llm.id)
        return monitor_llm, challenge_llm
```
Contributor

**P1: `AttributeError` on `None` default profile in `_resolve_llm_ids`**

When `tool.monitor_llm` or `tool.challenge_llm` is `None` (not explicitly set), the code falls through to `ProfileManager.get_default_llm_profile(tool)`. If that also returns `None` (no default profile configured), the next line `str(dp.llm.id)` immediately raises `AttributeError: 'NoneType' object has no attribute 'llm'`. This surfaces as an opaque 500 error rather than the expected `DefaultProfileError`.

This can happen when a user creates a Prompt Studio tool, sets a per-prompt profile manager, but has never configured a default tool-level profile and has not set explicit `monitor_llm`/`challenge_llm` adapters.

Note that `_resolve_llm_ids` is called in `build_fetch_response_payload` **before** the `if not profile_manager: raise DefaultProfileError()` guard (line 536), so a missing default profile causes an `AttributeError` that bypasses the intended error handling entirely.

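To see the failure mode outside the codebase, here is a minimal, self-contained sketch. The class and function names (`get_default_llm_profile`, `resolve_monitor_llm`) are stand-ins for the real `ProfileManager` API, not the project's implementation: without the `if not dp` guard, `dp.llm.id` on a `None` profile raises `AttributeError`; with it, callers get the intended `DefaultProfileError`.

```python
class DefaultProfileError(Exception):
    """Domain error the caller is expected to handle."""


def get_default_llm_profile(tool):
    # Simulate a tool with no default tool-level profile configured.
    return None


def resolve_monitor_llm(tool, monitor_llm_instance=None):
    if monitor_llm_instance:
        return str(monitor_llm_instance.id)
    dp = get_default_llm_profile(tool)
    if not dp:
        # Guard: raise the domain error instead of evaluating dp.llm.id on None,
        # which would surface as an opaque AttributeError / HTTP 500.
        raise DefaultProfileError()
    return str(dp.llm.id)
```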
Suggested change

```python
        if monitor_llm_instance:
            monitor_llm = str(monitor_llm_instance.id)
        else:
            dp = ProfileManager.get_default_llm_profile(tool)
            if not dp:
                raise DefaultProfileError()
            monitor_llm = str(dp.llm.id)
        if challenge_llm_instance:
            challenge_llm = str(challenge_llm_instance.id)
        else:
            dp = ProfileManager.get_default_llm_profile(tool)
            if not dp:
                raise DefaultProfileError()
            challenge_llm = str(dp.llm.id)
```
Path: `backend/prompt_studio/prompt_studio_core_v2/prompt_studio_helper.py`, lines 484-491

Comment on lines +74 to +94
```python
def _emit_error(
    log_events_id: str,
    task_id: str,
    operation: str,
    error: str,
    extra: dict[str, Any] | None = None,
) -> None:
    """Push a failure event to the frontend via Socket.IO."""
    data: dict[str, Any] = {
        "task_id": task_id,
        "status": "failed",
        "operation": operation,
        "error": error,
    }
    if extra:
        data.update(extra)
    _emit_websocket_event(
        room=log_events_id,
        event=PROMPT_STUDIO_RESULT_EVENT,
        data=data,
    )
```
Contributor

**P2: `_emit_error` missing `_json_safe()` serialization**

`_emit_result` wraps its payload through `_json_safe()` to handle `uuid.UUID` and `datetime` objects before emitting via Socket.IO. `_emit_error` has no equivalent sanitization; it calls `_emit_websocket_event` directly with raw data.

In practice the current callers store only strings in `extra` (e.g. from `cb.get("document_id", "")`), so this is latent rather than immediately broken. However, future callers could pass ORM values or UUIDs into `extra`, causing `json.dumps` inside `_emit_websocket_event` to raise a `TypeError` and leave the frontend without any error feedback at all.

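The PR's `_json_safe` implementation is not shown in this excerpt; a minimal sketch of what such a sanitizer typically looks like follows. The name `json_safe` and its exact behavior here are illustrative assumptions, not the project's code: it recursively converts `uuid.UUID` and `datetime`/`date` values into strings so `json.dumps` cannot raise `TypeError` on them.

```python
import json
import uuid
from datetime import date, datetime
from typing import Any


def json_safe(value: Any) -> Any:
    """Recursively convert values json.dumps rejects (UUID, datetime)
    into plain strings, leaving everything else untouched."""
    if isinstance(value, dict):
        return {str(k): json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [json_safe(v) for v in value]
    if isinstance(value, uuid.UUID):
        return str(value)
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    return value


# Without sanitization, json.dumps raises TypeError on UUID/datetime values.
payload = {"task_id": uuid.uuid4(), "finished_at": datetime(2026, 3, 11)}
serialized = json.dumps(json_safe(payload))
```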
Suggested change

```python
    if extra:
        data.update(extra)
    _emit_websocket_event(
        room=log_events_id,
        event=PROMPT_STUDIO_RESULT_EVENT,
        data=_json_safe(data),
    )
```
Path: `backend/prompt_studio/prompt_studio_core_v2/tasks.py`, lines 74-94
