Conversation
📝 Walkthrough
Replaces KafkaTopic-based routing with EventType/string topics, removes the KafkaTopic enum and topic config helpers, consolidates DLQ logic into a DLQWorkerProvider with APScheduler-managed retry monitoring, and removes the standalone DLQ processor worker; many consumers/subscribers and tests are updated accordingly.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Container as Container (create_*)
    participant Provider as DLQWorkerProvider
    participant Scheduler as APScheduler
    participant DLQMgr as DLQManager
    participant Broker as KafkaBroker
    Container->>Provider: resolve DLQWorkerProvider
    Provider->>Scheduler: start scheduler (on provider start)
    Provider->>DLQMgr: instantiate DLQManager (get_dlq_manager)
    DLQMgr->>Broker: publish(received_event) -> topic = prefix + EventType
    Scheduler->>DLQMgr: trigger retry check (periodic)
    DLQMgr->>Broker: publish(retried_event/discarded_event) -> topic = prefix + EventType
    Provider->>Scheduler: stop scheduler (on provider stop)
```
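The convention the diagram summarizes is that each event is published to a topic derived from its own event type. A minimal sketch of that naming scheme (the prefix value and helper name here are illustrative, not taken from the PR):

```python
# Minimal sketch of per-event-type topic naming; names and prefix are illustrative.
def topic_for(prefix: str, event_type: str) -> str:
    # e.g. topic_for("myapp.", "execution_requested") == "myapp.execution_requested"
    return f"{prefix}{event_type}"


# Publishing then targets the event's own topic rather than a shared-topic enum value:
# await broker.publish(event, topic=topic_for(prefix, event.event_type))
```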
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~70 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed (warning)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@backend/app/dlq/manager.py`:
- Around line 145-161: The publish of DLQMessageRetriedEvent via
self._broker.publish in retry_message can raise and cause retry_messages_batch
to mark a message as "failed" even though retry and DB update succeeded; wrap
the publish call in a try/except that catches exceptions from
self._broker.publish, logs the failure (or records a non-fatal notification
error) and does not re-raise so the successful retry/discard state remains;
apply the same pattern to the publish calls in handle_message and
discard_message to ensure notification failures never overwrite successful
operations (reference functions: retry_message, handle_message, discard_message;
event class: DLQMessageRetriedEvent; method: self._broker.publish; surrounding
caller: retry_messages_batch).
In `@backend/tests/e2e/dlq/test_dlq_discard.py`:
- Line 35: The test DLQ document uses
original_topic=str(EventType.EXECUTION_REQUESTED) which omits the
KAFKA_TOPIC_PREFIX and creates inconsistent, non-production-like topic names;
update the creation to include the prefix (e.g.,
f"{KAFKA_TOPIC_PREFIX}{EventType.EXECUTION_REQUESTED}") or change the helper
_create_dlq_document to accept a prefix parameter and use it when building
original_topic so tests (including where _create_dlq_document is called) produce
prefixed topics consistent with production and with the behavior in
test_dlq_manager.py.
🧹 Nitpick comments (5)
backend/tests/unit/events/test_mappings_and_types.py (1)
5-10: Consider testing a negative case (unknown/unmapped event type). The test only verifies two happy-path lookups. A test for an event type that has no class mapping (returning None) would increase confidence in the function's behavior.
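A minimal sketch of such a negative-case test follows; the import path and the assumption that unmapped types return None are inferred from this review, not verified against the code:

```python
# Hypothetical negative-case test; adjust the import to whatever the existing
# tests in this module already use.
from app.infrastructure.kafka.mappings import get_event_class_for_type


def test_unmapped_event_type_returns_none() -> None:
    # A value deliberately absent from the event-type -> event-class mapping.
    assert get_event_class_for_type("nonexistent.event.type") is None
```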
backend/app/infrastructure/kafka/mappings.py (1)

46-49: Minor: redundant lru_cache on get_event_class_for_type. Since _get_event_type_to_class() is already cached (returns the same dict every time), caching the .get() lookup on top adds negligible value. Not a problem, just unnecessary layering.
backend/tests/e2e/dlq/test_dlq_retry.py (1)

19-44: Duplicated _create_dlq_document helper across DLQ test files. This helper is nearly identical to the one in test_dlq_discard.py (lines 19–44). Both also share the same original_topic prefix inconsistency. Consider extracting it into a shared DLQ test fixture or conftest to reduce duplication and fix the topic format in one place.
backend/tests/e2e/events/test_producer_roundtrip.py (1)

15-22: Test has no assertions — consider adding a basic check. The test fires produce() but never asserts on the outcome. Even for an e2e smoke test, verifying that produce completes without exception (or returns a non-None result) would add a small safety net. Consider at minimum an explicit assertion or a comment explaining that "no exception == pass" is intentional.
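A minimal hardening might look like the sketch below; the fixture names and the assumption that produce() returns delivery metadata are illustrative:

```python
# Hypothetical sketch: `producer` and `sample_event` stand in for whatever
# fixtures the test module already defines.
import pytest


@pytest.mark.asyncio
async def test_producer_roundtrip(producer, sample_event) -> None:
    result = await producer.produce(sample_event)
    # If produce() returns delivery metadata, make success explicit; otherwise,
    # reaching this line without an exception is the check.
    assert result is not None
```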
87-105: DuplicatedEventMetadataconstruction across three methods.The same
EventMetadata(service_name="dlq-manager", service_version="1.0.0", ...)pattern is repeated inhandle_message(Line 95),retry_message(Line 151), anddiscard_message(Line 183). Consider extracting a small helper to reduce duplication and ensure consistency if the service name/version ever changes.♻️ Example helper extraction
+ def _event_metadata(self, message: DLQMessage) -> EventMetadata: + return EventMetadata( + service_name="dlq-manager", + service_version="1.0.0", + correlation_id=message.event.metadata.correlation_id, + user_id=message.event.metadata.user_id, + )Then replace each inline
EventMetadata(...)withself._event_metadata(message).
```python
        retried_event = DLQMessageRetriedEvent(
            dlq_event_id=message.event.event_id,
            original_topic=message.original_topic,
            original_event_type=message.event.event_type,
            retry_count=new_retry_count,
            retry_topic=message.original_topic,
            metadata=EventMetadata(
                service_name="dlq-manager",
                service_version="1.0.0",
                correlation_id=message.event.metadata.correlation_id,
                user_id=message.event.metadata.user_id,
            ),
            topic=self._dlq_events_topic,
        )
        await self._broker.publish(
            retried_event,
            topic=f"{self._topic_prefix}{retried_event.event_type}",
        )
```
Notification publish failure after successful retry can corrupt batch results.
In retry_message, the core retry (Line 127-131) and DB update (Line 136-143) complete before the DLQMessageRetriedEvent is published (Line 158-161). If this notification publish fails, the exception propagates to retry_messages_batch (Line 246), which catches it and marks the message as "failed" — even though the retry actually succeeded.
Consider wrapping the notification publishes in a try/except so that a notification failure doesn't mask a successful retry/discard.
🛡️ Proposed fix
```diff
-        await self._broker.publish(
-            retried_event,
-            topic=f"{self._topic_prefix}{retried_event.event_type}",
-        )
+        try:
+            await self._broker.publish(
+                retried_event,
+                topic=f"{self._topic_prefix}{retried_event.event_type}",
+            )
+        except Exception:
+            self.logger.warning(
+                "Failed to publish DLQ retried notification event",
+                extra={"event_id": message.event.event_id},
+                exc_info=True,
+            )
```

Apply the same pattern to handle_message (Line 102-105) and discard_message (Line 190-193).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        retried_event = DLQMessageRetriedEvent(
            dlq_event_id=message.event.event_id,
            original_topic=message.original_topic,
            original_event_type=message.event.event_type,
            retry_count=new_retry_count,
            retry_topic=message.original_topic,
            metadata=EventMetadata(
                service_name="dlq-manager",
                service_version="1.0.0",
                correlation_id=message.event.metadata.correlation_id,
                user_id=message.event.metadata.user_id,
            ),
        )
        try:
            await self._broker.publish(
                retried_event,
                topic=f"{self._topic_prefix}{retried_event.event_type}",
            )
        except Exception:
            self.logger.warning(
                "Failed to publish DLQ retried notification event",
                extra={"event_id": message.event.event_id},
                exc_info=True,
            )
```
🤖 Prompt for AI Agents
In `@backend/app/dlq/manager.py` around lines 145 - 161, The publish of
DLQMessageRetriedEvent via self._broker.publish in retry_message can raise and
cause retry_messages_batch to mark a message as "failed" even though retry and
DB update succeeded; wrap the publish call in a try/except that catches
exceptions from self._broker.publish, logs the failure (or records a non-fatal
notification error) and does not re-raise so the successful retry/discard state
remains; apply the same pattern to the publish calls in handle_message and
discard_message to ensure notification failures never overwrite successful
operations (reference functions: retry_message, handle_message, discard_message;
event class: DLQMessageRetriedEvent; method: self._broker.publish; surrounding
caller: retry_messages_batch).
```diff
     doc = DLQMessageDocument(
         event=event_dict,
-        original_topic=KafkaTopic.EXECUTION_EVENTS,
+        original_topic=str(EventType.EXECUTION_REQUESTED),
```
original_topic is missing the KAFKA_TOPIC_PREFIX, inconsistent with production topic names.
In production, topics are always prefixed (e.g., f"{prefix}{EventType.EXECUTION_REQUESTED}"). In test_dlq_manager.py:89, the same field uses the prefix. Here, str(EventType.EXECUTION_REQUESTED) yields just "execution_requested".
While this doesn't affect these repository-level tests (they don't route by topic), it creates unrealistic test data and inconsistency across the DLQ test suite.
♻️ Suggested fix

```diff
-        original_topic=str(EventType.EXECUTION_REQUESTED),
+        original_topic=f"test-prefix-{EventType.EXECUTION_REQUESTED}",
```

Or better, accept prefix as a parameter for _create_dlq_document.
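One possible shape for that parameterization, shown as a small topic-building helper; the name, default prefix, and call site are illustrative only:

```python
# Hypothetical helper: build a production-like, prefixed topic name for test data.
# `event_type` is expected to be an EventType member, imported as the test module
# already does.
def _build_original_topic(event_type, prefix: str = "test-prefix-") -> str:
    return f"{prefix}{event_type}"


# Inside _create_dlq_document(...), the field would then read:
#     original_topic=_build_original_topic(EventType.EXECUTION_REQUESTED, prefix)
```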
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        original_topic=f"test-prefix-{EventType.EXECUTION_REQUESTED}",
```
🤖 Prompt for AI Agents
In `@backend/tests/e2e/dlq/test_dlq_discard.py` at line 35, The test DLQ document
uses original_topic=str(EventType.EXECUTION_REQUESTED) which omits the
KAFKA_TOPIC_PREFIX and creates inconsistent, non-production-like topic names;
update the creation to include the prefix (e.g.,
f"{KAFKA_TOPIC_PREFIX}{EventType.EXECUTION_REQUESTED}") or change the helper
_create_dlq_document to accept a prefix parameter and use it when building
original_topic so tests (including where _create_dlq_document is called) produce
prefixed topics consistent with production and with the behavior in
test_dlq_manager.py.
4 issues found across 27 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/core/providers.py">
<violation number="1" location="backend/app/core/providers.py:227">
P2: Execution event retry policies now exclude several execution lifecycle topics (accepted/queued/started/running), so those topics fall back to the default policy instead of the execution-specific policy. Include all execution lifecycle event types to preserve prior retry behavior after the per-topic split.</violation>
<violation number="2" location="backend/app/core/providers.py:240">
P2: Pod retry policies now omit several pod lifecycle topics (scheduled/running/terminated/deleted), so those topics fall back to the default policy instead of the pod-specific policy. Add the missing pod event types to keep retry behavior consistent after the topic split.</violation>
<violation number="3" location="backend/app/core/providers.py:251">
P2: Saga command retry policies now omit allocate/release resource commands, so those topics fall back to the default policy instead of the saga-specific policy. Include the missing saga command types to keep retry behavior consistent after the topic split.</violation>
</file>
<file name="backend/app/infrastructure/kafka/mappings.py">
<violation number="1" location="backend/app/infrastructure/kafka/mappings.py:6">
P1: Missing subscription entries for active consumer groups. `POD_MONITOR`, `WEBSOCKET_GATEWAY`, and `DLQ_MANAGER` still exist in the codebase (e.g., `workers/run_pod_monitor.py`, `services/pod_monitor/event_mapper.py`) but are absent from the new `CONSUMER_GROUP_SUBSCRIPTIONS`. If this map is used to determine topic subscriptions, these services will have no subscriptions or fail with a `KeyError`.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
```diff
-    EventType.DLQ_MESSAGE_RECEIVED: KafkaTopic.DLQ_EVENTS,
-    EventType.DLQ_MESSAGE_RETRIED: KafkaTopic.DLQ_EVENTS,
-    EventType.DLQ_MESSAGE_DISCARDED: KafkaTopic.DLQ_EVENTS,
+CONSUMER_GROUP_SUBSCRIPTIONS: dict[GroupId, set[EventType]] = {
```
P1: Missing subscription entries for active consumer groups. POD_MONITOR, WEBSOCKET_GATEWAY, and DLQ_MANAGER still exist in the codebase (e.g., workers/run_pod_monitor.py, services/pod_monitor/event_mapper.py) but are absent from the new CONSUMER_GROUP_SUBSCRIPTIONS. If this map is used to determine topic subscriptions, these services will have no subscriptions or fail with a KeyError.
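If these groups do still need entries, the fix itself is mechanical. A hedged sketch of what the added entries might look like; the event-type sets below are placeholders and must be derived from what each worker actually consumes:

```python
# Placeholder entries only: the correct EventType sets have to come from each
# worker's handlers, not from this sketch.
CONSUMER_GROUP_SUBSCRIPTIONS: dict[GroupId, set[EventType]] = {
    # ... existing groups (EXECUTION_COORDINATOR, ...) ...
    GroupId.POD_MONITOR: {EventType.POD_CREATED, EventType.POD_RUNNING, EventType.POD_FAILED},
    GroupId.WEBSOCKET_GATEWAY: {EventType.EXECUTION_COMPLETED, EventType.EXECUTION_FAILED},
    GroupId.DLQ_MANAGER: {EventType.DLQ_MESSAGE_RECEIVED},
}
```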
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/app/infrastructure/kafka/mappings.py, line 6:
<comment>Missing subscription entries for active consumer groups. `POD_MONITOR`, `WEBSOCKET_GATEWAY`, and `DLQ_MANAGER` still exist in the codebase (e.g., `workers/run_pod_monitor.py`, `services/pod_monitor/event_mapper.py`) but are absent from the new `CONSUMER_GROUP_SUBSCRIPTIONS`. If this map is used to determine topic subscriptions, these services will have no subscriptions or fail with a `KeyError`.</comment>
<file context>
@@ -1,78 +1,35 @@
- EventType.DLQ_MESSAGE_RECEIVED: KafkaTopic.DLQ_EVENTS,
- EventType.DLQ_MESSAGE_RETRIED: KafkaTopic.DLQ_EVENTS,
- EventType.DLQ_MESSAGE_DISCARDED: KafkaTopic.DLQ_EVENTS,
+CONSUMER_GROUP_SUBSCRIPTIONS: dict[GroupId, set[EventType]] = {
+ GroupId.EXECUTION_COORDINATOR: {
+ EventType.EXECUTION_REQUESTED,
</file context>
```diff
-            topic=saga_commands,
+    )
+
+    for et in (EventType.CREATE_POD_COMMAND, EventType.DELETE_POD_COMMAND):
```
P2: Saga command retry policies now omit allocate/release resource commands, so those topics fall back to the default policy instead of the saga-specific policy. Include the missing saga command types to keep retry behavior consistent after the topic split.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/app/core/providers.py, line 251:
<comment>Saga command retry policies now omit allocate/release resource commands, so those topics fall back to the default policy instead of the saga-specific policy. Include the missing saga command types to keep retry behavior consistent after the topic split.</comment>
<file context>
@@ -222,74 +222,56 @@ def _default_retry_policies(prefix: str) -> dict[str, RetryPolicy]:
- topic=saga_commands,
+ )
+
+ for et in (EventType.CREATE_POD_COMMAND, EventType.DELETE_POD_COMMAND):
+ topic = f"{prefix}{et}"
+ policies[topic] = RetryPolicy(
</file context>
```diff
-    for et in (EventType.CREATE_POD_COMMAND, EventType.DELETE_POD_COMMAND):
+    for et in (EventType.CREATE_POD_COMMAND, EventType.DELETE_POD_COMMAND,
+               EventType.ALLOCATE_RESOURCES_COMMAND, EventType.RELEASE_RESOURCES_COMMAND):
```
```diff
-            topic=pod_events,
+    )
+
+    for et in (EventType.POD_CREATED, EventType.POD_FAILED, EventType.POD_SUCCEEDED):
```
P2: Pod retry policies now omit several pod lifecycle topics (scheduled/running/terminated/deleted), so those topics fall back to the default policy instead of the pod-specific policy. Add the missing pod event types to keep retry behavior consistent after the topic split.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/app/core/providers.py, line 240:
<comment>Pod retry policies now omit several pod lifecycle topics (scheduled/running/terminated/deleted), so those topics fall back to the default policy instead of the pod-specific policy. Add the missing pod event types to keep retry behavior consistent after the topic split.</comment>
<file context>
@@ -222,74 +222,56 @@ def _default_retry_policies(prefix: str) -> dict[str, RetryPolicy]:
- topic=pod_events,
+ )
+
+ for et in (EventType.POD_CREATED, EventType.POD_FAILED, EventType.POD_SUCCEEDED):
+ topic = f"{prefix}{et}"
+ policies[topic] = RetryPolicy(
</file context>
```diff
-    for et in (EventType.POD_CREATED, EventType.POD_FAILED, EventType.POD_SUCCEEDED):
+    for et in (EventType.POD_CREATED, EventType.POD_SCHEDULED, EventType.POD_RUNNING,
+               EventType.POD_SUCCEEDED, EventType.POD_FAILED, EventType.POD_TERMINATED,
+               EventType.POD_DELETED):
```
```diff
+    for et in (EventType.EXECUTION_REQUESTED, EventType.EXECUTION_COMPLETED,
+               EventType.EXECUTION_FAILED, EventType.EXECUTION_TIMEOUT,
+               EventType.EXECUTION_CANCELLED):
```
P2: Execution event retry policies now exclude several execution lifecycle topics (accepted/queued/started/running), so those topics fall back to the default policy instead of the execution-specific policy. Include all execution lifecycle event types to preserve prior retry behavior after the per-topic split.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/app/core/providers.py, line 227:
<comment>Execution event retry policies now exclude several execution lifecycle topics (accepted/queued/started/running), so those topics fall back to the default policy instead of the execution-specific policy. Include all execution lifecycle event types to preserve prior retry behavior after the per-topic split.</comment>
<file context>
@@ -222,74 +222,56 @@ def _default_retry_policies(prefix: str) -> dict[str, RetryPolicy]:
- topic=execution_events,
+ policies: dict[str, RetryPolicy] = {}
+
+ for et in (EventType.EXECUTION_REQUESTED, EventType.EXECUTION_COMPLETED,
+ EventType.EXECUTION_FAILED, EventType.EXECUTION_TIMEOUT,
+ EventType.EXECUTION_CANCELLED):
</file context>
```diff
-    for et in (EventType.EXECUTION_REQUESTED, EventType.EXECUTION_COMPLETED,
-               EventType.EXECUTION_FAILED, EventType.EXECUTION_TIMEOUT,
-               EventType.EXECUTION_CANCELLED):
+    for et in (EventType.EXECUTION_REQUESTED, EventType.EXECUTION_ACCEPTED,
+               EventType.EXECUTION_QUEUED, EventType.EXECUTION_STARTED,
+               EventType.EXECUTION_RUNNING, EventType.EXECUTION_COMPLETED,
+               EventType.EXECUTION_FAILED, EventType.EXECUTION_TIMEOUT,
+               EventType.EXECUTION_CANCELLED):
```
|



Summary by cubic
Switches to one Kafka topic per event type across the backend, with producers and consumers publishing and subscribing directly to event-type topics, and cleans up the deployment by removing the DLQ processor service.
Written for commit 0b220b5. Summary will update on new commits.