fix: structlog instead of logging, also updated middlewares #175
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
📝 Walkthrough

Removed legacy correlation-ID handling and the adaptive sampler; consolidated tracing into a new Tracer class with OTLP support; migrated logging to structlog and updated many DI/provider/service/repository signatures to accept a structlog BoundLogger; renamed/removed correlation_id in replay/event models and APIs.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant Tracer as Tracer (new)
    participant OTLP as OTLP Exporter
    participant Libs as Instrumented Libraries
    App->>Tracer: instantiate Tracer(settings, logger)
    Tracer->>Tracer: build Resource, choose Sampler, configure Provider
    Tracer->>OTLP: attach exporter if OTLP_TRACES_ENDPOINT set
    Tracer->>Libs: instrument FastAPI / HTTPX / PyMongo / logging
    Tracer->>App: set global tracer provider & propagator
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 2 failed (1 warning, 1 inconclusive)
3 issues found across 85 files
Note: This PR contains a large number of files. cubic only reviews up to 75 files per PR, so some files may not have been reviewed.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/core/logging.py">
<violation number="1" location="backend/app/core/logging.py:26">
P1: Sensitive data redaction now only applies to the `event` field, so exception/stack strings added by `format_exc_info`/`StackInfoRenderer` remain unredacted. This can leak secrets in stack traces or exception messages. Consider sanitizing all string fields in the event dict.</violation>
</file>
<file name="backend/app/core/tracing/__init__.py">
<violation number="1" location="backend/app/core/tracing/__init__.py:49">
P2: Hard-coding `insecure=True` disables TLS for all OTLP gRPC exports, which can leak trace data or break when the collector requires TLS. Make this configurable or derive it from the endpoint scheme so production can use secure transport.</violation>
</file>
<file name="backend/app/db/repositories/event_repository.py">
<violation number="1" location="backend/app/db/repositories/event_repository.py:48">
P2: OpenTelemetry span attributes only allow primitive types; passing EventType (likely an Enum) can lead to attributes being dropped or rejected. Convert the event type to a string or value before setting the attribute.</violation>
</file>
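For illustration, a minimal sketch of the third finding's suggested fix. The `EventType` enum and helper below are hypothetical stand-ins, not the project's actual classes:

```python
from enum import Enum

from opentelemetry import trace


class EventType(str, Enum):  # hypothetical stand-in for the project's EventType
    EXECUTION_REQUESTED = "execution.requested"


def record_event_type(event_type: EventType) -> None:
    # OTel span attributes accept only str/bool/int/float (or sequences of them);
    # coercing the Enum to its string value keeps the attribute from being dropped.
    trace.get_current_span().set_attribute("event.type", event_type.value)
```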
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app/services/saved_script_service.py (1)
21-28: ⚠️ Potential issue | 🟠 Major

Use keyword arguments directly instead of `extra={}` for idiomatic structlog logging.

The current `extra={...}` pattern doesn't work as intended with this structlog configuration. Without `structlog.stdlib.ExtraAdder()` in the processor chain, the `extra` dict is treated as a single nested key in the JSON output, not as flattened context fields. This affects ~10 logging call sites in this file. Pass context as direct keyword arguments instead:

```python
self.logger.info(
    "Creating new saved script",
    user_id=user_id,
    script_name=saved_script_create.name,
    script_length=len(saved_script_create.script),
)
```

This is the idiomatic structlog approach and will ensure proper log structure for analysis and aggregation.
backend/app/services/notification_service.py (1)
144-153: ⚠️ Potential issue | 🟠 Major

`extra={...}` creates nested dicts instead of top-level fields in structlog output.

In stdlib `logging`, `extra` is a special parameter that merges keys into the `LogRecord`. In structlog, `extra` has no special meaning; it becomes a nested dict under an `"extra"` key in the structured output.

The test suite (test_logging_and_correlation.py:180) demonstrates the idiomatic pattern: `logger.info("test message", key="value")` produces `{"event": "test message", "key": "value"}` at the top level. Using `extra={"key": "value"}` instead produces `{"event": "test message", "extra": {"key": "value"}}`, which breaks downstream log parsing and alerting that depend on top-level field access.

This pattern appears across 30+ files in the codebase (kafka_event_service.py, sse_service.py, user_settings_service.py, admin services, saga services, execution_service.py, dlq/manager.py, repositories, and others).

♻️ Suggested fix (example for lines 144-153)

```diff
 self.logger.info(
-    f"Creating notification for user {user_id}",
-    extra={
-        "user_id": user_id,
-        "channel": channel,
-        "severity": str(severity),
-        "tags": tags,
-        "scheduled": scheduled_for is not None,
-    },
+    "Creating notification",
+    user_id=user_id,
+    channel=channel,
+    severity=str(severity),
+    tags=tags,
+    scheduled=scheduled_for is not None,
 )
```
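To make the difference concrete, here is a minimal, runnable sketch of the two call styles under a default structlog configuration (the field names are illustrative):

```python
import structlog

log = structlog.get_logger()

# Idiomatic structlog: keyword arguments become top-level keys in the event
# dict, e.g. {"event": "Creating notification", "user_id": "42", ...}.
log.info("Creating notification", user_id="42", channel="email")

# stdlib-style `extra=` has no special meaning here; it is just another kwarg
# and nests everything under one "extra" key:
# {"event": "Creating notification", "extra": {"user_id": "42"}}.
log.info("Creating notification", extra={"user_id": "42"})
```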
🤖 Fix all issues with AI agents
In `@backend/app/core/logging.py`:
- Around line 20-31: The sanitizer currently only redacts sensitive data in
event_dict["event"] inside sanitize_sensitive_data, so keyword args (e.g.,
token=secret) remain unhashed; update the function to iterate over event_dict
items and, for any value that is a str, apply the same SENSITIVE_PATTERNS
substitutions and replace the value in event_dict, while leaving non-string
values untouched; also update the docstring of sanitize_sensitive_data to
document that all string fields in event_dict are sanitized and reference
SENSITIVE_PATTERNS and the function name for clarity (a sketch follows after this file's fixes).
- Around line 58-88: The setup_logger function is reconfiguring structlog every
time it's called, clearing global state; add an idempotency guard (e.g., a
module-level boolean like _logger_configured and a cached module-level logger
variable) so setup_logger(...) performs structlog.configure(...) only once and
subsequent calls return the cached structlog.get_logger("integr8scode") without
reconfiguring; alternatively remove direct calls to setup_logger and ensure
callers use the DI-provided LoggingProvider/APP-scoped logger (update references
in callers such as main.py and dishka_lifespan.py to accept/inject the logger
instead of calling setup_logger directly).
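A minimal sketch of the first fix above, assuming SENSITIVE_PATTERNS is a list of (compiled regex, replacement) pairs; the single pattern shown is illustrative, not the module's actual list:

```python
import re
from typing import Any

SENSITIVE_PATTERNS: list[tuple[re.Pattern[str], str]] = [
    (re.compile(r"(?i)(token=)\S+"), r"\1[REDACTED]"),  # illustrative pattern
]


def sanitize_sensitive_data(
    logger: Any, method_name: str, event_dict: dict[str, Any]
) -> dict[str, Any]:
    """Structlog processor: redact sensitive substrings in all string fields,
    not just event_dict["event"], so stack traces and kwargs are covered too."""
    for key, value in event_dict.items():
        if isinstance(value, str):
            for pattern, replacement in SENSITIVE_PATTERNS:
                value = pattern.sub(replacement, value)
            event_dict[key] = value
    return event_dict
```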
In `@backend/app/domain/events/typed.py`:
- Line 37: EventMetadata's correlation_id default of "" causes app-scoped
services to emit events without tracing; to fix, ensure correlation IDs are
propagated from request-scoped code into services that create events: either
make services that emit events (e.g., UserSettingsService) request-scoped
(change DI registration to request scope) so they can read
request.state.correlation_id in publish_event, or explicitly thread a
correlation_id parameter through public APIs and constructors so methods that
build EventMetadata (and event classes like UserSettingsUpdatedEvent) receive
and set correlation_id; update call sites in ExecutionService and request
handlers to pass the request's correlation_id into service calls and into
publish_event instead of relying on the empty default.
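A compact sketch of the explicit-threading option; the dataclass and function signature below are hypothetical simplifications of the real EventMetadata and service API:

```python
from dataclasses import dataclass
from uuid import uuid4


@dataclass
class EventMetadata:  # hypothetical simplification of the real model
    user_id: str
    service_name: str
    correlation_id: str = ""


async def publish_settings_updated(user_id: str, correlation_id: str | None) -> None:
    # The request handler reads request.state.correlation_id and passes it in,
    # instead of letting the empty-string default silently break tracing.
    metadata = EventMetadata(
        user_id=user_id,
        service_name="user_settings",
        correlation_id=correlation_id or str(uuid4()),
    )
    ...  # build the event with this metadata and publish it
```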
In `@backend/pyproject.toml`:
- Line 125: The pyproject.toml dependency for structlog is using a range
operator ("structlog>=25.5.0") which breaks reproducible builds; change that
entry to an exact pin ("structlog==25.5.0") so it matches the rest of the file
and ensures deterministic installs—update the line containing
"structlog>=25.5.0" accordingly and run your dependency install/check to
confirm.
🧹 Nitpick comments (26)
backend/app/services/saga/execution_saga.py (1)
23-23: Prefer structlog keyword arguments over f-strings for structured log messages.

Using f-strings with structlog defeats the purpose of structured logging; the interpolated values won't be individually queryable/filterable in log aggregation systems. This pattern recurs throughout this file (lines 23, 52, 90, 97, 126, 149, 167, 183).

Example fix for line 23:

♻️ Suggested refactor

```diff
- logger.info(f"Validating execution {event.execution_id}")
+ logger.info("Validating execution", execution_id=event.execution_id)
```

Apply the same pattern to all other f-string log calls in this file.
backend/app/services/user_settings_service.py (1)
41-50: Mixed logging styles: f-strings and `extra={}` dict are both non-idiomatic for structlog.

The file mixes two patterns:

- `extra={...}` dict (line 41): a stdlib `logging` convention; structlog passes context as keyword arguments directly.
- f-strings (lines 50, 201, 206, 212): interpolated values are not structured.

With `structlog.stdlib.BoundLogger`, the idiomatic approach is:

```python
# Instead of extra={}
self.logger.info(
    "UserSettingsService initialized",
    cache_ttl_seconds=self._cache_ttl.total_seconds(),
    max_cache_size=self._max_cache_size,
)

# Instead of f-strings
self.logger.debug("Settings cache hit", user_id=user_id, cache_size=len(self._cache))
```

Whether `extra=` works depends on your structlog processor chain configuration, but switching to direct kwargs is more portable and idiomatic.

backend/app/services/execution_service.py (1)
116-124: Same `extra={}` pattern note as in user_settings_service.py.

The `extra=` dict style is used extensively throughout this file. With structlog, prefer passing context as direct keyword arguments for idiomatic structured logging. This applies to all `extra={...}` calls in this file.

backend/app/services/admin/admin_user_service.py (1)
36-36: Consider using structlog's native keyword arguments instead of `extra={}`.

With `structlog.stdlib.BoundLogger`, context variables can be passed directly as keyword arguments (e.g., `self.logger.info("Admin getting user overview", target_user_id=user_id, hours=hours)`). The `extra={}` dict is a stdlib `logging` convention. While it works when structlog is configured with stdlib compatibility, using native kwargs is more idiomatic and ensures the context is properly processed by all structlog processors.

This applies throughout this file and likely across the codebase.
backend/app/services/result_processor/processor.py (1)
73-74: Prefer structured key-value pairs over f-string interpolation in structlog.

Using f-strings in log messages (e.g., `f"Failed to handle ExecutionCompletedEvent: {e}"`) defeats the purpose of structured logging — the error details get baked into the message string rather than being searchable fields. This pattern appears in lines 73, 95, and 121.

♻️ Example refactor

```diff
- self.logger.error(f"Failed to handle ExecutionCompletedEvent: {e}", exc_info=True)
+ self.logger.error("Failed to handle ExecutionCompletedEvent", error=str(e), exc_info=True)
```

backend/app/services/pod_monitor/event_mapper.py (1)
82-85: Structured logging calls still use f-strings instead of structlog keyword arguments.

Throughout this file, log calls embed data in f-strings (e.g., `self.logger.info(f"POD-EVENT: type={event_type} ...")`). This negates the key benefit of structlog — machine-parseable structured fields. Consider using keyword arguments instead:

```python
self.logger.info("pod_event_received", event_type=event_type, name=..., namespace=..., phase=...)
```

This applies to all ~20 log calls in the file. Not blocking for this PR, but worth tracking as a follow-up.
backend/tests/unit/services/pod_monitor/test_event_mapper.py (1)
206-206: `caplog` fixture is unused in this test.

The `caplog: pytest.LogCaptureFixture` parameter is declared but never asserted on. With structlog (unless explicitly configured to route through stdlib), `caplog` won't capture structured log output anyway. Consider removing it.

backend/app/services/sse/sse_service.py (1)
6-7: Nit: extra blank line after import.

Line 7 is an empty line that creates a double blank line between the `structlog` import and the next import block (line 8). Minor formatting nit.

backend/app/core/tracing/__init__.py (2)
45-51: `insecure=True` is hardcoded — make it configurable.

This forces plaintext gRPC for all environments. If the OTLP collector is reachable over a network (not localhost), spans are transmitted without TLS. Consider adding a setting (e.g., `OTLP_INSECURE: bool = True`) so production deployments with TLS-enabled collectors can disable this.

🔒 Suggested change

```diff
 provider.add_span_processor(
     BatchSpanProcessor(OTLPSpanExporter(
         endpoint=settings.OTLP_TRACES_ENDPOINT,
-        insecure=True,
+        insecure=settings.OTLP_INSECURE,
     ))
 )
```
20-64: No `shutdown()` — pending spans may be lost on graceful exit.

`BatchSpanProcessor` buffers spans and flushes them periodically. Without calling `provider.shutdown()` during application teardown, any buffered spans at exit will be dropped. Consider exposing a shutdown method (or registering an `atexit` handler) so the DI container or lifespan can flush spans cleanly.

♻️ Suggested addition

```diff
 class Tracer:
     """DI-managed OpenTelemetry tracer. Initialization happens on construction."""

     def __init__(self, settings: Settings, logger: structlog.stdlib.BoundLogger) -> None:
+        self._provider: TracerProvider | None = None
         name = settings.TRACING_SERVICE_NAME
         rate = settings.TRACING_SAMPLING_RATE
         ...
         provider = TracerProvider(resource=resource, sampler=sampler)
+        self._provider = provider
         ...

+    def shutdown(self) -> None:
+        """Flush pending spans and shut down the tracer provider."""
+        if self._provider:
+            self._provider.shutdown()
```

backend/app/services/event_replay/replay_service.py (3)
60-61: Prefer structured kwargs over f-strings with structlog.

With structlog, using f-strings loses the structured context. Pass the error as a keyword argument so it appears as a separate field in structured output.

♻️ Suggested change

```diff
- self.logger.error(f"Failed to create replay session: {e}")
+ self.logger.error("Failed to create replay session", error=str(e))
```
270-278: Use direct keyword arguments instead of `extra={}` for structlog.

The `extra={}` dict is a stdlib `logging` pattern. With `structlog.stdlib.BoundLogger`, keyword arguments should be passed directly so they appear as top-level keys in the structured log output. Using `extra={...}` nests them under an `extra` key, which defeats the purpose of the structlog migration.

This applies to all `extra={}` usages in this file (lines 172, 270–278, 296–298, 325–327, and 369).

♻️ Example fix for this call site

```diff
 self.logger.info(
     "Replay session finished",
-    extra={
-        "session_id": session.session_id,
-        "status": session.status,
-        "replayed_events": session.replayed_events,
-        "failed_events": session.failed_events,
-    },
+    session_id=session.session_id,
+    status=session.status,
+    replayed_events=session.replayed_events,
+    failed_events=session.failed_events,
 )
```
368-369: Same two issues here: f-string interpolation plus the `extra` pattern.

```diff
- self.logger.error(f"Failed to update session in database: {e}")
+ self.logger.error("Failed to update session in database", error=str(e))
```

backend/app/db/repositories/notification_repository.py (1)

205-205: Use structured key-value pairs instead of f-string for structlog.

With structlog, embedding data via f-strings loses the benefit of structured logging (fields won't be queryable in log aggregators). Pass values as keyword arguments instead.

♻️ Suggested fix

```diff
- self.logger.info(f"Found {len(user_ids)} users with roles {roles}")
+ self.logger.info("Found users by roles", count=len(user_ids), roles=roles)
```

backend/app/services/admin/admin_settings_service.py (1)
20-23: Prefer structlog keyword arguments over `extra={}` dict.

With structlog's `BoundLogger`, pass context as direct keyword arguments rather than wrapping them in `extra={}`. The `extra` pattern is a stdlib logging convention; structlog natively supports `self.logger.info("msg", user_id=user_id)`, which makes fields first-class in the structured output.

This pattern appears in multiple log calls in this file (lines 22, 29, 39) and likely across other migrated files.

♻️ Suggested fix (example for one call site)

```diff
 self.logger.info(
     "Admin retrieving system settings",
-    extra={"user_id": user_id},
+    user_id=user_id,
 )
```

backend/app/db/repositories/execution_repository.py (1)
22-24: Same `extra={}` pattern — use structlog keyword args directly.

As noted in admin_settings_service.py, prefer passing context as keyword arguments rather than wrapping in `extra={}`. This applies to all log calls in this file (lines 22, 24, 28, 31, 34, 51).

♻️ Example fix

```diff
- self.logger.info("Inserting execution into MongoDB", extra={"execution_id": doc.execution_id})
+ self.logger.info("Inserting execution into MongoDB", execution_id=doc.execution_id)
```

backend/app/db/repositories/event_repository.py (1)
55-57: Consider using structlog's key-value style instead of f-strings.

With structlog, the idiomatic approach is to pass context as keyword arguments (e.g., `self.logger.debug("Event already stored", event_id=event.event_id)`). Using f-strings defeats the purpose of structured logging since the data gets baked into the message string and can't be queried/filtered by log aggregation tools.

This pattern appears throughout the codebase (not just this file), so it could be addressed as a follow-up.

Example for this file

```diff
- self.logger.debug(f"Event {event.event_id} already stored, skipping")
+ self.logger.debug("Event already stored, skipping", event_id=event.event_id)
  return event.event_id
- self.logger.debug(f"Stored event {event.event_id} of type {event.event_type}")
+ self.logger.debug("Stored event", event_id=event.event_id, event_type=event.event_type)
```

backend/app/api/routes/execution.py (1)
60-68: Hardcoded `http.method` and `http.route` are fragile and likely redundant.

OpenTelemetry's FastAPI/Starlette instrumentation (if enabled) already sets `http.method` and `http.route` on the server span automatically. Hardcoding them here means they'll go stale if the route path or method changes. The custom attributes (`execution.language`, `execution.script_length`, `user.id`, `client.address`) are the valuable additions.

Also, note that `user.id` in trace attributes could be a PII/compliance consideration depending on your organization's observability data policies — ensure your trace backend's retention and access controls account for this.

Remove redundant attributes

```diff
 trace.get_current_span().set_attributes({
-    "http.method": "POST",
-    "http.route": "/api/v1/execute",
     "execution.language": execution.lang,
     "execution.language_version": execution.lang_version,
     "execution.script_length": len(execution.script),
     "user.id": current_user.user_id,
     "client.address": get_client_ip(request),
 })
```

backend/app/services/admin/admin_events_service.py (1)
107-113: `extra={}` is a stdlib logging pattern; structlog prefers direct keyword arguments.

With `structlog.stdlib.BoundLogger`, passing `extra={...}` wraps the context under an `"extra"` key in the log event dict rather than promoting each field to a top-level key. The idiomatic structlog approach is to pass context as keyword arguments directly. This affects all logging calls in this file that use `extra=`.

Whether this matters depends on your structlog processor pipeline (e.g., if you have a processor that extracts `extra`), but it's worth aligning with structlog conventions for consistent log structure.

Example fix (apply same pattern to all similar calls in this file)

```diff
 self.logger.info(
     "Preparing replay session",
-    extra={
-        "dry_run": dry_run,
-        "replay_correlation_id": replay_correlation_id,
-    },
+    dry_run=dry_run,
+    replay_correlation_id=replay_correlation_id,
 )
```

backend/app/db/repositories/user_settings_repository.py (1)
29-29: Prefer structured keyword arguments over f-strings with structlog.

Using f-strings in structlog calls (`self.logger.info(f"Created settings snapshot for user {settings.user_id}")`) embeds variables into the event string, which defeats structlog's key advantage: machine-parseable structured fields. The same applies to line 81.

Idiomatic structlog passes variable data as keyword args so log aggregators can filter/index on them:

♻️ Suggested change

```diff
- self.logger.info(f"Created settings snapshot for user {settings.user_id}")
+ self.logger.info("Created settings snapshot", user_id=settings.user_id)
- self.logger.info(f"Deleted all settings data for user {user_id}")
+ self.logger.info("Deleted all settings data", user_id=user_id)
```

This pattern appears across most files in this PR (coordinator, notification_service, notification_scheduler, etc.). Consider a project-wide pass to replace f-string log messages with keyword args.
backend/app/services/notification_scheduler.py (1)
37-50: Same f-string pattern — use keyword args for structured fields.

Already flagged in user_settings_repository.py. Same applies here for lines 37, 46, and 50.

♻️ Example for line 37

```diff
- self.logger.info(f"Found {len(due)} due scheduled notifications")
+ self.logger.info("Found due scheduled notifications", count=len(due))
```

backend/app/services/coordinator/coordinator.py (1)
74-74: `"HANDLER CALLED:"` looks like a debug/development artifact.

This log message reads like a temporary trace rather than a production log entry. Consider downgrading to `debug` or removing the `HANDLER CALLED:` prefix.

♻️ Suggested change

```diff
- self.logger.info(f"HANDLER CALLED: handle_execution_requested for event {event.event_id}")
+ self.logger.info("Handling execution request", event_id=event.event_id)
```

backend/app/settings.py (1)
114-115: Two OTLP endpoint settings with inconsistent naming and default conventions.

There are now two endpoint fields:

- `OTLP_TRACES_ENDPOINT: str = ""` (traces, line 115)
- `OTEL_EXPORTER_OTLP_ENDPOINT: str | None = None` (metrics, line 157)

The naming prefixes differ (`OTLP_` vs `OTEL_EXPORTER_OTLP_`) and the "disabled" sentinels differ (`""` vs `None`). Consider aligning them — e.g., `OTLP_METRICS_ENDPOINT: str | None = None` alongside `OTLP_TRACES_ENDPOINT: str | None = None` — to reduce cognitive load and make the disabled-check consistent.

backend/app/core/logging.py (1)
7-17: Sensitive patterns look solid; note the broad email regex.

The email pattern on line 16 (`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`) will match any email-like string in log messages. This is good for GDPR/PII compliance, but be aware it may over-redact strings like `user@service-name` in non-PII contexts (e.g., Kubernetes resource names like `pod@cluster`). If this causes issues, you could tighten the TLD portion (e.g., require `.com`, `.org`, etc.) or move it to a separate opt-in processor.
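One possible tightening, sketched with an illustrative TLD allowlist (the specific set of TLDs is an assumption, not a recommendation):

```python
import re

# Require a known TLD so identifiers that merely look email-ish are left alone.
EMAIL_PATTERN = re.compile(
    r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.(?:com|org|net|io|dev|edu)\b"
)


def redact_emails(text: str) -> str:
    return EMAIL_PATTERN.sub("[EMAIL_REDACTED]", text)
```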
backend/app/core/providers.py (2)

162-162: Prefer structured key-value args over f-strings in structlog calls.

Throughout this file, f-strings are used in structlog log calls (lines 162, 184, 186, 368, 840, 844, etc.). This defeats much of the benefit of structured logging — the dynamic values get baked into the message string rather than being separate, queryable fields.

Example:

```diff
-logger.info(f"Redis connected: {settings.REDIS_HOST}:{settings.REDIS_PORT}/{settings.REDIS_DB}")
+logger.info("Redis connected", host=settings.REDIS_HOST, port=settings.REDIS_PORT, db=settings.REDIS_DB)
```

This applies broadly across the file. Not blocking, but worth addressing to get the full benefit of the structlog migration.
93-94: Consider adding a comment explaining the `_tracer` dependency.

`_tracer: Tracer` is injected solely to ensure the `Tracer` is initialized before the broker starts (side-effect dependency). A brief inline comment would clarify this for future readers.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/core/providers.py (1)
92-108: ⚠️ Potential issue | 🟠 Major

Add an explanatory comment to the `_tracer` parameter and fix the logger type passed to `KafkaBroker`.

The `_tracer: Tracer` parameter is intentionally unused — it's a dependency-injection ordering trick to ensure the Tracer is initialized (instrumenting FastAPI, HTTPX, etc.) before the broker starts. Add a brief inline comment to document this so future maintainers don't accidentally remove it.

Critical: `KafkaBroker` expects `logger: Optional[logging.Logger]` (standard library), but the code passes `structlog.stdlib.BoundLogger`. According to FastStream's logging documentation, the correct pattern is to pass a stdlib `logging.Logger` and let structlog wrap it at the application level, not at the provider level. Extract the underlying logger or create a stdlib logger instance to pass here.
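A sketch of that suggested pattern, assuming FastStream's KafkaBroker accepts a stdlib logger via its `logger` parameter as the comment describes; the broker URL and logger names are placeholders:

```python
import logging

import structlog
from faststream.kafka import KafkaBroker

# Hand the broker a plain stdlib logger, which matches its expected type...
broker_logger = logging.getLogger("faststream.kafka")
broker = KafkaBroker("localhost:9092", logger=broker_logger)

# ...while application code keeps using structlog as before.
app_logger: structlog.stdlib.BoundLogger = structlog.get_logger("integr8scode")
```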
🧹 Nitpick comments (1)
backend/app/core/providers.py (1)
147-147: Consider using structlog keyword arguments instead of f-strings for structured log messages.

With structlog, passing data as keyword arguments (e.g., `logger.info("Redis connected", host=..., port=..., db=...)`) enables structured querying and avoids formatting costs when the level is filtered. This applies to several places in this file (lines 147, 169, 171, 334, 353, etc.).

Not blocking, but it undermines the benefit of switching to structlog if messages remain unstructured f-strings.

Example for line 147

```diff
- logger.info(f"Redis connected: {settings.REDIS_HOST}:{settings.REDIS_PORT}/{settings.REDIS_DB}")
+ logger.info("Redis connected", host=settings.REDIS_HOST, port=settings.REDIS_PORT, db=settings.REDIS_DB)
```
1 issue found across 4 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/api/routes/auth.py">
<violation number="1" location="backend/app/api/routes/auth.py:43">
P2: Switching to `structlog.stdlib.BoundLogger` but still using `extra={...}` means your structured fields won’t be captured the way structlog expects (it uses keyword args). This will likely drop or nest log context unless you add processors like `ExtraAdder`. Update log calls to pass fields as keyword arguments or bind the context instead.</violation>
</file>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
backend/app/events/handlers.py (1)
316-336: ⚠️ Potential issue | 🟡 Minor

Injected `logger` is unused in `on_dlq_message`.

The `logger` parameter is injected via DI on line 320 but is never referenced in the function body. Either remove it or use it for observability (e.g., logging receipt of DLQ messages, errors, or processing duration).

backend/app/services/execution_service.py (1)
295-305: ⚠️ Potential issue | 🟡 Minor

Inconsistent `correlation_id` usage in pseudo-event metadata.

The `correlation_id` parameter is correctly forwarded to `execute_script` (lines 292, 337), but the pseudo-event on line 301 hardcodes `correlation_id=str(uuid4())` instead of using the passed-in `correlation_id`. This means idempotency operations (check, reserve, mark) will be tracked under a random ID rather than the request's correlation context.

Proposed fix

```diff
 pseudo_event = BaseEvent(
     event_id=str(uuid4()),
     event_type=EventType.EXECUTION_REQUESTED,
     timestamp=datetime.now(timezone.utc),
     metadata=EventMetadata(
         user_id=user_id,
-        correlation_id=str(uuid4()),
+        correlation_id=correlation_id or str(uuid4()),
         service_name="api",
         service_version="1.0.0",
     ),
 )
```

backend/app/api/routes/auth.py (1)
43-55: 🛠️ Refactor suggestion | 🟠 Major

Refactor logging calls to use idiomatic structlog kwargs instead of `extra={}`.

With `structlog.stdlib.BoundLogger` configured via `structlog.stdlib.LoggerFactory()` and the current processor chain, passing `extra={...}` nests the dict under an `"extra"` key in the event dict rather than merging values at the top level. This changes the log output shape.

Idiomatic structlog would be:

```python
logger.info(
    "Login attempt",
    username=form_data.username,
    client_ip=get_client_ip(request),
    endpoint="/login",
    user_agent=request.headers.get("user-agent"),
)
```

This pattern appears in approximately 125 locations across the codebase. Aligns with your PR note about updating dashboards/parsers for the new log shape — this refactoring will flatten the output and make it easier for parsers to consume the fields directly.
🧹 Nitpick comments (5)
backend/app/events/handlers.py (1)
49-49: Use structured key-value args instead of f-string for structlog.

Now that you've migrated to structlog, this log line should leverage structured logging rather than embedding data in the message string. f-string messages defeat the purpose of structlog (machine-parseable fields, filtering by key, etc.).

♻️ Proposed fix

```diff
- logger.info(f"Duplicate event: {event.event_type} ({event.event_id})")
+ logger.info("duplicate_event", event_type=event.event_type, event_id=event.event_id)
```

backend/tests/e2e/app/test_main_app.py (1)
260-266: Minor: local import and stale docstring.
`structlog` is imported locally inside the test method, whereas test_container.py imports it at module level. Consider moving the import to the top of the file for consistency. Also, the docstring on Line 261 still says "Logger" rather than "BoundLogger".

Suggested diff

At the top of the file (after line 4):

```diff
 import pytest
+import structlog
 import redis.asyncio as aioredis
```

In the test method:

```diff
 async def test_container_resolves_logger(self, scope: AsyncContainer) -> None:
-    """Container can resolve Logger."""
-    import structlog
-
+    """Container can resolve BoundLogger."""
     logger = await scope.get(structlog.stdlib.BoundLogger)
```

backend/app/core/providers.py (2)
146-146: Prefer structured key-value args over f-strings with structlog.

Throughout this file, log calls use f-strings (e.g., Lines 146, 168, 170, 352, 822, 826). This embeds data into the message string, losing the main benefit of structured logging (machine-parseable fields). Use keyword arguments instead.

Example fix for Line 146

```diff
- logger.info(f"Redis connected: {settings.REDIS_HOST}:{settings.REDIS_PORT}/{settings.REDIS_DB}")
+ logger.info("redis_connected", host=settings.REDIS_HOST, port=settings.REDIS_PORT, db=settings.REDIS_DB)
```

The same pattern should be applied to all f-string log calls in this file (Lines 168, 170, 352, 822, 826).
91-107: `_tracer` parameter used solely as a DI side-effect trigger.

The underscore-prefixed `_tracer: Tracer` on Line 93 is never used in the method body — it exists only to force the DI container to instantiate `Tracer` before the broker is created. This works but is non-obvious. A brief inline comment would help future readers understand the intent.

Suggested comment

```diff
 @provide
 async def get_broker(
-    self, settings: Settings, logger: structlog.stdlib.BoundLogger, _tracer: Tracer,
+    self, settings: Settings, logger: structlog.stdlib.BoundLogger,
+    _tracer: Tracer,  # injected to ensure OTEL tracing is initialized before broker starts
 ) -> AsyncIterator[KafkaBroker]:
```

backend/app/services/execution_service.py (1)
438-439: Prefer structured message over f-string with structlog.

Using an f-string in the log message embeds variable data into the message string, defeating structlog's structured logging benefits (searchability, parsing). Pass the count as a key instead.

Proposed fix

```diff
 self.logger.debug(
-    f"Retrieved {len(executions)} executions for user",
+    "Retrieved executions for user",
     extra={
         "user_id": user_id,
+        "count": len(executions),
         "filters": {k: v for k, v in query.items() if k != "user_id"},
         "limit": limit,
         "skip": skip,
     },
 )
```
1 issue found across 53 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/db/repositories/admin/admin_events_repository.py">
<violation number="1" location="backend/app/db/repositories/admin/admin_events_repository.py:81">
P2: Guard against a missing aggregate_id; otherwise the related-events query can return unrelated events with aggregate_id=None.</violation>
</file>
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
backend/app/dlq/manager.py (2)
83-83: ⚠️ Potential issue | 🟠 Major

`extra={…}` is a stdlib logging pattern, not idiomatic structlog.

With `structlog.stdlib.BoundLogger`, passing `extra={"event_id": ...}` logs a single key `"extra"` containing a dict, rather than flattening `event_id` into the log event's top-level context. The idiomatic structlog approach is to pass context as keyword arguments directly. This applies to all logging calls in this file (lines 83, 161, 191, 218, 222, 269, 273).

Example fix (apply same pattern everywhere)

```diff
- self.logger.info("Message filtered out", extra={"event_id": message.event.event_id})
+ self.logger.info("Message filtered out", event_id=message.event.event_id)
```

Line 191:

```diff
- self.logger.warning("Discarded message", extra={"event_id": message.event.event_id, "reason": reason})
+ self.logger.warning("Discarded message", event_id=message.event.event_id, reason=reason)
```

Line 222:

```diff
- self.logger.info("Skipping manual retry", extra={"event_id": event_id, "status": message.status})
+ self.logger.info("Skipping manual retry", event_id=event_id, status=message.status)
```
251-251: ⚠️ Potential issue | 🟡 Minor

Avoid f-string interpolation in structured log messages.

Using `f"Error retrying message {event_id}: {e}"` defeats structured logging — the event_id and error won't be queryable as separate fields. Pass them as keyword arguments instead.

Suggested fix

```diff
- self.logger.error(f"Error retrying message {event_id}: {e}")
+ self.logger.error("Error retrying message", event_id=event_id, error=str(e))
```

backend/app/services/saga/saga_orchestrator.py (1)
258-258: ⚠️ Potential issue | 🟠 Major

`extra={}` is a stdlib logging pattern, not idiomatic structlog.

With structlog, context should be passed as keyword arguments directly, not via `extra={}`. The `extra` dict is a `logging.Logger` concept and may not be processed correctly by structlog processors (depending on configuration).

For example:

```diff
-self.logger.error("Saga not found", extra={"saga_id": saga_id})
+self.logger.error("Saga not found", saga_id=saga_id)
```

```diff
-self.logger.warning(
-    "Cannot cancel saga in current state. Only RUNNING or CREATED sagas can be cancelled.",
-    extra={"saga_id": saga_id, "state": saga_instance.state},
-)
+self.logger.warning(
+    "Cannot cancel saga in current state. Only RUNNING or CREATED sagas can be cancelled.",
+    saga_id=saga_id, state=saga_instance.state,
+)
```

This applies to all `extra={}` calls in `cancel_saga`. The f-string log calls elsewhere in this file have the same issue noted in execution_saga.py — prefer keyword args for structured data.

Also applies to: 262-265, 273-280, 303-303, 307-311
docs/operations/tracing.md (2)
84-84: ⚠️ Potential issue | 🟡 Minor

Stale reference to adaptive sampling.

The PR removes adaptive sampling support, but this line still says "The sampler can be ratio-based or adaptive; both are supported." Update to reflect the current sampling strategy (tail_sampling via otel-collector per PR description).

7-7: ⚠️ Potential issue | 🟡 Minor

Update documentation to reflect the new Tracer initialization flow.

The documentation at line 7 references the removed `init_tracing` function and an incorrect module path. The tracing initialization now happens via a `Tracer` class (in app/core/tracing/__init__.py) that is instantiated as a DI-managed component in app/core/providers.py. Update the paragraph to describe this new flow instead of the old `init_tracing`-based initialization.
🤖 Fix all issues with AI agents
In `@docs/operations/logging.md`:
- Around line 90-91: Update the stale reference to Jaeger in the sentence that
tells readers to "jump to Jaeger" — change it to reference the generic OTLP
tracing backend or the configured tracing system (e.g., OTLP collector/Tempo)
and mention the environment variable OTLP_TRACES_ENDPOINT and the trace_id for
locating traces; ensure the text now reads generically (e.g., "use the trace_id
to view the full distributed trace in your tracing backend or OTLP collector")
and remove any Jaeger-specific wording so it aligns with removal of
ENABLE_TRACING and Jaeger config.
In `@frontend/src/routes/admin/AdminEvents.svelte`:
- Around line 169-170: The new event object is being created with replay_id: ''
which leaves the replay banner inconsistent; instead assign replay_id from the
API response when mapping/constructing the event (e.g., use response.replay_id
or the API payload's replay_id) while still keeping created_at: new
Date().toISOString() as-is; update the code in the function that builds events
(where replay_id and created_at are set) to read and use the replay_id field
from the server response, falling back to an empty string only if the response
value is missing.
🧹 Nitpick comments (9)
backend/tests/unit/services/pod_monitor/test_event_mapper.py (1)
204-204: Unused `caplog` fixture parameter.

`caplog` is declared but never referenced in the test body. With the migration to structlog, it no longer captures log output unless structlog is explicitly configured to propagate to standard logging. Consider removing it.

Suggested fix

```diff
-async def test_parse_and_log_paths_and_analyze_failure_variants(caplog: pytest.LogCaptureFixture) -> None:
+async def test_parse_and_log_paths_and_analyze_failure_variants() -> None:
```

backend/app/services/result_processor/processor.py (1)
73-73: Use structured key-value args instead of f-strings with structlog.

The whole point of migrating to structlog is machine-parseable, structured logs. Using f-strings collapses context into an opaque string. Same applies to lines 95 and 121.

♻️ Proposed fix (apply same pattern to lines 95 and 121)

```diff
- self.logger.error(f"Failed to handle ExecutionCompletedEvent: {e}", exc_info=True)
+ self.logger.error("failed to handle ExecutionCompletedEvent", error=str(e), execution_id=event.execution_id, exc_info=True)
```

backend/app/services/saga/execution_saga.py (1)
23-23: Prefer structured key-value args over f-strings with structlog.

Using f-strings with structlog (e.g., `logger.info(f"Validating execution {event.execution_id}")`) embeds dynamic values into the message string, which defeats structlog's structured logging benefits — values won't be independently searchable or filterable. Consider using keyword arguments instead: `logger.info("Validating execution", execution_id=event.execution_id)`

This applies to all log calls in this file. Not blocking, but it would be good to address given you're already touching every log line in this migration.
Also applies to: 52-52, 97-97, 124-124, 147-147, 165-165, 180-180
backend/app/services/saga/saga_orchestrator.py (1)
141-141: Tracer is re-created on every `_execute_saga` call.

`trace.get_tracer(__name__)` is cheap, but idiomatic usage is to create it once at module scope (like you did with the structlog logger in execution_saga.py). Consider hoisting to module level: `_tracer = trace.get_tracer(__name__)`
74-74: Leverage structlog's structured logging instead of f-strings.

Now that the logger is a `structlog.stdlib.BoundLogger`, all the log calls in this file still use f-string interpolation (e.g., lines 74, 84, 94, 109, 158, 161, 190–194, 212–214, etc.). This defeats the primary benefit of structlog — structured, machine-parsable, filterable key-value log entries.

For example:

♻️ Suggested pattern (representative, not exhaustive)

```diff
- self.logger.info(f"HANDLER CALLED: handle_execution_requested for event {event.event_id}")
+ self.logger.info("handle_execution_requested called", event_id=event.event_id)
```

```diff
- self.logger.error(f"Failed to handle execution request {event.execution_id}: {e}", exc_info=True)
+ self.logger.error("failed to handle execution request", execution_id=event.execution_id, exc_info=True)
```

```diff
- self.logger.info(
-     f"Added execution {event.execution_id} to queue. "
-     f"Priority: {event.priority}, Position: {position}, "
-     f"Queue size: {len(self._queue)}"
- )
+ self.logger.info(
+     "execution added to queue",
+     execution_id=event.execution_id,
+     priority=event.priority,
+     position=position,
+     queue_size=len(self._queue),
+ )
```

This applies to every `self.logger.*` call in the file. Consider applying the same pattern across all files touched by this PR for consistency.

Also applies to: 84-84, 94-94, 190-194
frontend/src/components/admin/events/EventsTable.svelte (1)
194-198: Mobile view correctly updated to show `aggregate_id` instead of `correlation_id`.

Note that the desktop table view doesn't include an Aggregate column — only Time, Type, User, Service, and Actions are shown. If aggregate info is useful, consider adding it to the desktop view as well for parity, though this is a minor UX consideration.
backend/app/db/repositories/user_settings_repository.py (1)
29-29: Use structlog's native key-value binding instead of f-strings.

The logging documentation (lines 37–49 of docs/operations/logging.md) states that messages should be static strings with details passed as structured data. With structlog, pass context as keyword args rather than interpolating into the message:

Suggested fix

```diff
- self.logger.info(f"Created settings snapshot for user {settings.user_id}")
+ self.logger.info("Created settings snapshot", user_id=settings.user_id)
```

Same applies to line 76:

```diff
- self.logger.info(f"Deleted all settings data for user {user_id}")
+ self.logger.info("Deleted all settings data", user_id=user_id)
```

This avoids log injection risk and makes logs queryable by `user_id` as a separate field.
41-50: Inconsistent logging style: f-strings mixed with `extra` dicts.

Lines 41-44 correctly use a static message with `extra`, but line 50 interpolates `user_id` into the message while also passing `extra`. The same pattern appears on lines 200, 205, and 211. Consider adopting one consistent approach — ideally structlog's native keyword args:

Example for line 50

```diff
- self.logger.debug(f"Settings cache hit for user {user_id}", extra={"cache_size": len(self._cache)})
+ self.logger.debug("Settings cache hit", user_id=user_id, cache_size=len(self._cache))
```

This also applies to lines 200, 205, and 211 — and to the repository file reviewed above. Using structlog's keyword args is idiomatic, avoids log injection, and makes all fields independently queryable.
backend/app/services/admin/admin_events_service.py (1)
8-9: Use keyword arguments instead of `extra` dict for structured fields.

Passing fields via `extra={...}` creates a literal `"extra"` key in the event dict. Use keyword arguments to keep structured fields at the top level of the log context.

♻️ Example change (apply similarly to other log calls)

```diff
- self.logger.info(
-     "Preparing replay session",
-     extra={
-         "dry_run": dry_run,
-         "replay_id": replay_id,
-     },
- )
+ self.logger.info(
+     "Preparing replay session",
+     dry_run=dry_run,
+     replay_id=replay_id,
+ )
```

Also applies to: 142–147, 184–190
1 issue found across 28 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/core/logging.py">
<violation number="1" location="backend/app/core/logging.py:81">
P2: `logging.basicConfig()` is a no-op when the root logger already has handlers. Since `setup_logger` is called multiple times (confirmed in `main.py` lines 54 and 131), subsequent calls won't update the stdlib log level. The removed `logging.getLogger().setLevel(level)` was specifically working around this. Consider adding `force=True` to `basicConfig` (available since Python 3.8) so that each call fully reconfigures the root logger.</violation>
</file>
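A sketch of the suggested guard, assuming a minimal processor chain; the real setup_logger configures more processors than shown here:

```python
import logging

import structlog

_configured = False


def setup_logger(level: str = "INFO") -> structlog.stdlib.BoundLogger:
    global _configured
    # force=True makes repeat calls actually update the root logger's level
    # and handlers instead of being a no-op.
    logging.basicConfig(level=level, force=True)
    if not _configured:  # configure structlog's global state only once
        structlog.configure(
            processors=[
                structlog.processors.add_log_level,
                structlog.processors.JSONRenderer(),
            ],
            logger_factory=structlog.stdlib.LoggerFactory(),
        )
        _configured = True
    return structlog.get_logger("integr8scode")
```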
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app/services/saved_script_service.py (1)
72-75: ⚠️ Potential issue | 🟡 Minor

Repo return value is discarded, causing a redundant DB round-trip.

`saved_script_repo.update_saved_script` already returns `DomainSavedScript | None` (returns `None` when not found, or the updated document on success). Ignoring this and issuing a separate `get_saved_script` call means:

- An unnecessary extra database query on every update.
- A narrow race window: a concurrent delete between lines 72 and 73 would raise `SavedScriptNotFoundError` even though the update succeeded.

Use the repo's return value directly:

Proposed fix

```diff
-        await self.saved_script_repo.update_saved_script(script_id, user_id, update_data)
-        updated_script = await self.saved_script_repo.get_saved_script(script_id, user_id)
-        if not updated_script:
+        updated_script = await self.saved_script_repo.update_saved_script(script_id, user_id, update_data)
+        if not updated_script:
             raise SavedScriptNotFoundError(script_id)
```

backend/app/services/sse/redis_bus.py (1)
99-109: ⚠️ Potential issue | 🟠 Major

Printf-style `%s` log calls were not migrated to structlog.

These calls at lines 99 and 104-109 still use `logging.Logger`-style positional `%s` formatting. The structlog configuration does not include `PositionalArgumentsFormatter` in its processor chain, so positional arguments will not be interpolated into the message. Instead, they will be treated as extra context fields, and the `%s` placeholders will appear literally in the JSON output. This is inconsistent with the structured logging migration.

🐛 Proposed fix

```diff
-        if not execution_id:
-            self.logger.debug("Event %s has no execution_id, skipping", event.event_type)
-            return
+        if not execution_id:
+            self.logger.debug("Event has no execution_id, skipping", event_type=event.event_type)
+            return
         try:
             await self.publish_event(execution_id, event)
         except Exception as e:
             self.logger.error(
-                "Failed to publish %s to Redis for %s: %s",
-                event.event_type, execution_id, e,
+                "Failed to publish event to Redis",
+                event_type=event.event_type,
+                execution_id=execution_id,
+                error=str(e),
                 exc_info=True,
             )
```
🤖 Fix all issues with AI agents
In `@backend/app/core/logging.py`:
- Around line 7-17: The email regex in SENSITIVE_PATTERNS causes all email-like
values to be redacted when sanitize_sensitive_data iterates event_dict values;
update either SENSITIVE_PATTERNS or sanitize_sensitive_data to avoid removing
intentionally-logged emails—preferably add an allowlist of keys (e.g.,
ALLOWLIST_PRESERVE_EMAILS) and change sanitize_sensitive_data to skip redaction
for those keys (referencing SENSITIVE_PATTERNS, sanitize_sensitive_data, and
event_dict), or alternatively remove the email tuple from SENSITIVE_PATTERNS and
handle email redaction only in a targeted helper that checks field names before
masking.
In `@backend/app/core/tracing/__init__.py`:
- Line 64: Replace the f-string logging call with a static message and argument
to follow the structured-logging convention: update the logger.info call
(currently logger.info(f"Tracing initialized for {name}")) to use a static
format and pass name as an argument, e.g. logger.info("Tracing initialized for
%s", name), so the message string remains constant and the runtime data is
passed separately.
- Around line 43-51: The TracerProvider instance stored in the local variable
provider can leave buffered spans unflushed because provider.shutdown() is never
called; modify the Tracer class to store the TracerProvider as an instance
attribute (e.g., self._provider) when you create provider and add a public
shutdown method (e.g., Tracer.shutdown) that calls self._provider.shutdown();
then ensure dishka_lifespan (or the DI container/lifespan teardown) invokes
Tracer.shutdown() in its finally block so BatchSpanProcessor flushes remaining
spans on application exit.
- Around line 56-62: The code currently double-injects trace context because
LoggingInstrumentor().instrument(set_logging_format=True) adds
otelTraceID/otelSpanID to stdlib LogRecord while the structlog processor chain
still includes add_otel_context; pick one approach and apply it consistently:
either disable stdlib injection by changing
LoggingInstrumentor().instrument(...) to use set_logging_format=False (keep
add_otel_context in the structlog processors) or remove add_otel_context from
the structlog processor chain (keep set_logging_format=True) so only one
mechanism (LoggingInstrumentor or add_otel_context) injects trace_id/span_id;
update the LoggingInstrumentor invocation or the structlog processor list
accordingly (look for LoggingInstrumentor().instrument(...) and add_otel_context
in the structlog config).
- Around line 56-59: The current Tracer.__init__ calls
FastAPIInstrumentor().instrument(...), which monkey-patches future FastAPI
instances and therefore misses the existing FastAPI instance created earlier;
update Tracer to call FastAPIInstrumentor().instrument_app(app,
tracer_provider=tp, excluded_urls="health,metrics,docs,openapi.json"), passing
the actual app instance (the same app variable created in main) so the existing
app gets OpenTelemetryMiddleware, or alternatively move the instrumentation call
to run before the app is instantiated; ensure you reference the tracer_provider
(tp) and excluded_urls arguments when switching to instrument_app (a sketch
follows after this list).
In `@backend/app/services/notification_service.py`:
- Around line 346-357: The code logs and traces the full webhook_url which may
contain embedded secrets; update the webhook handling in notification_service.py
so logger.debug and trace.get_current_span().set_attributes do not include the
full URL: extract and log only a safe identifier such as the host (use
urlparse(webhook_url).netloc) or a masked version (e.g., keep scheme and host
and redact the path/query) and replace webhook_url with that safe value in the
logger.debug call and the "notification.webhook_url" span attribute; keep other
attributes (notification_id, payload_size, channel) unchanged and ensure no full
URL is stored or emitted.
In `@docs/operations/logging.md`:
- Around line 24-25: Update the include line ranges so the docs reference the
actual functions: change the broken include that points to
backend/app/core/logging.py lines 110–147 to instead include lines 57–84 to show
the complete setup_logger function, and replace the include that currently
references lines 35–59 with lines 26–54 so the snippet cleanly contains both
sanitize_sensitive_data and add_otel_context processors; ensure the identifiers
setup_logger, sanitize_sensitive_data, and add_otel_context are visible in the
updated snippets.
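As referenced in the Tracer bullet above, a sketch of the instrument_app variant; the app/provider wiring here is illustrative rather than the PR's actual code:

```python
from fastapi import FastAPI
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.sdk.trace import TracerProvider

app = FastAPI()
provider = TracerProvider()

# Instrument the concrete app instance; plain instrument() only patches
# FastAPI instances created *after* it runs and would miss this one.
FastAPIInstrumentor().instrument_app(
    app,
    tracer_provider=provider,
    excluded_urls="health,metrics,docs,openapi.json",
)
```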
🧹 Nitpick comments (12)
backend/app/dlq/manager.py (1)
251-251: Inconsistent logging style: use structured keyword args instead of f-string.

Every other log call in this file passes context as keyword arguments (`event_id=`, `reason=`, `status=`), but this line uses an f-string, which buries the `event_id` and error inside the message string. This defeats structlog's structured field extraction — these values won't be independently queryable/filterable in log aggregation.

♻️ Proposed fix

```diff
- self.logger.error(f"Error retrying message {event_id}: {e}")
+ self.logger.error("Error retrying message", event_id=event_id, error=str(e))
```

backend/app/services/user_settings_service.py (1)
51-51: Avoid f-strings in structlog event messages — use keyword arguments for variable data.

With structlog, the first positional argument (the "event" string) should be a static, greppable key. Embedding `user_id` via f-strings creates unique event strings per user, defeating structured log aggregation and search.

♻️ Proposed fix

```diff
- self.logger.debug(f"Settings cache hit for user {user_id}", cache_size=len(self._cache))
+ self.logger.debug("Settings cache hit", user_id=user_id, cache_size=len(self._cache))
- self.logger.debug(f"Invalidated cache for user {user_id}", cache_size=len(self._cache))
+ self.logger.debug("Cache invalidated", user_id=user_id, cache_size=len(self._cache))
- self.logger.debug(f"Cached settings for user {user_id}", cache_size=len(self._cache))
+ self.logger.debug("Settings cached", user_id=user_id, cache_size=len(self._cache))
- self.logger.info(f"Reset settings for user {user_id}")
+ self.logger.info("Settings reset", user_id=user_id)
```

Also applies to: 201-201, 206-206, 212-212
backend/app/services/execution_service.py (2)
424-430: Avoid f-string in structlog event name; pass the dynamic value as a keyword argument.

Structlog best practice is to keep event strings static so they're easily searchable and groupable in log aggregation tools. The execution count should be a separate keyword argument.

Proposed fix

```diff
 self.logger.debug(
-    f"Retrieved {len(executions)} executions for user",
+    "Retrieved executions for user",
     user_id=user_id,
     filters={k: v for k, v in query.items() if k != "user_id"},
     limit=limit,
     skip=skip,
+    count=len(executions),
 )
```
380-389: Consider whether logging `resource_usage` verbatim is desirable.

`resource_usage` may be a large or nested object. Logging it directly on every retrieval could add noise. If it's useful for debugging, `debug` level might be more appropriate for that specific field, or log only a summary (e.g., a boolean `has_resource_usage`).

backend/app/services/notification_service.py (2)
144-151: Prefer static event strings with structlog.

Throughout this file, several log calls embed variables into f-string event messages (e.g., `f"Creating notification for user {user_id}"`) while simultaneously passing the same data as keyword arguments. This is a structlog anti-pattern — dynamic event strings prevent grouping/filtering by event type in log aggregation tools, which is one of the main benefits of the structured logging migration.

Affected locations: Lines 145, 347, 605, 663, 688.

♻️ Example fix for this call site (apply same pattern to others)

```diff
 self.logger.info(
-    f"Creating notification for user {user_id}",
+    "creating_notification",
     user_id=user_id,
     channel=channel,
     severity=str(severity),
     tags=tags,
     scheduled=scheduled_for is not None,
 )
```
640-650: Backoff retry with `jitter=None` risks thundering herd.

Disabling jitter means concurrent retries for the same external endpoint (webhook/Slack) will fire in lockstep. Consider using `backoff.full_jitter` or the default jitter to spread retry attempts.

♻️ Suggested change

```diff
 @backoff.on_exception(
     backoff.expo,
     Exception,
     max_tries=notification.max_retries,
     max_value=30,
-    jitter=None,
+    jitter=backoff.full_jitter,
     on_backoff=lambda details: self.logger.warning(
```

backend/app/services/sse/redis_bus.py (1)
32-36: Log message mixes f-string interpolation with structured fields — not idiomatic structlog.

The channel is duplicated (in the f-string and as a kwarg), and the exception is only embedded in the string rather than passed as a structured field. With structlog, prefer a static event string and push all dynamic data into keyword arguments.

♻️ Suggested refactor

```diff
 except Exception as e:
     self.logger.warning(
-        f"Failed to parse Redis message on channel {self._channel}: {e}",
+        "Failed to parse Redis message",
         channel=self._channel,
         model=model.__name__,
+        error=str(e),
     )
```

backend/app/settings.py (1)
114-123: Confusing naming between trace and metric endpoint settings.

There are now two OTLP endpoint fields with inconsistent naming conventions:

- `OTLP_TRACES_ENDPOINT` (line 115) — for traces
- `OTEL_EXPORTER_OTLP_ENDPOINT` (line 157) — for metrics

Similarly, `TRACING_SERVICE_NAME`/`TRACING_SERVICE_VERSION` (lines 122-123) duplicate `SERVICE_NAME`/`SERVICE_VERSION` (lines 152-153) with identical defaults. The tracing `Tracer` class uses the former; `setup_metrics` uses the latter. Consider unifying these — e.g., a single `OTLP_ENDPOINT` with sub-paths for traces/metrics, or at least aligning the naming prefix. The duplication will confuse operators configuring different service names for traces vs metrics when there's no obvious reason to diverge.

Also applies to: 152-157
backend/tests/unit/core/test_logging_and_correlation.py (2)
154-161: `capture_logs()` overrides the processor chain configured by `setup_logger`.

`structlog.testing.capture_logs()` replaces the processor pipeline with its own capturing processor for the duration of the context manager. This means the `setup_logger("INFO")` configuration (including `sanitize_sensitive_data` and `add_otel_context`) is not exercised here — the test only verifies that structlog's basic event/key binding works, not the custom pipeline.

If the intent is to verify the configured processor chain end-to-end, consider calling the logger outside `capture_logs` and asserting on the output (e.g., captured stderr), or test the processors individually (as you already do above). If the intent is simply "setup_logger returns a usable logger," the test name and assertions could be clarified accordingly.
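For reference, a self-contained sketch of what capture_logs actually exercises — it captures raw event dicts and bypasses any processors installed by setup_logger:

```python
import structlog

with structlog.testing.capture_logs() as cap_logs:
    structlog.get_logger().info("user login", user_id="42")

# capture_logs swaps in its own capturing processor, so custom processors
# (sanitize_sensitive_data, add_otel_context) never run here.
assert cap_logs == [{"event": "user login", "user_id": "42", "log_level": "info"}]
```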
106-107: Hardcoded pattern count is brittle.

Asserting `len(SENSITIVE_PATTERNS) == 6` will break silently (with a non-descriptive assertion error) whenever a new pattern is added. Consider removing this test or replacing it with a minimum-count assertion (`>= 6`) to avoid false failures when the list is intentionally extended.

backend/app/services/admin/admin_events_service.py (2)
116-118: `max_events = 1000` is duplicated.

The 1000 limit appears both in the validation check (Line 116) and in `ReplayConfig.max_events` (Line 155). Consider extracting to a class-level constant to keep them in sync.

Proposed fix

```diff
 class AdminEventsService:
+    MAX_REPLAY_EVENTS = 1000
+
     def __init__(
         self,
         repository: AdminEventsRepository,
```

Then reference `self.MAX_REPLAY_EVENTS` in both locations.

Also applies to: 148-157
62-66: Stale comment about `LogRecord` reserved attribute.

Line 64's comment references `LogRecord`, which is a stdlib `logging` concept. With structlog, `filename` isn't reserved in the event dict. The comment is now misleading — consider updating or removing it.
1 issue found across 8 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/services/pod_monitor/config.py">
<violation number="1" location="backend/app/services/pod_monitor/config.py:19">
P2: Hardcoding the namespace and kubeconfig path removes runtime configurability, so PodMonitor will always use "integr8scode" and no kubeconfig even when deployments rely on K8S_NAMESPACE/KUBECONFIG. This can break monitoring in non-default namespaces or out-of-cluster runs.</violation>
</file>
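A minimal sketch of restoring configurability, assuming environment variables named K8S_NAMESPACE/KUBECONFIG as the finding mentions; the real config may read from the Settings object instead, and the class shape is hypothetical:

```python
import os
from dataclasses import dataclass, field


@dataclass
class PodMonitorConfig:  # hypothetical shape of the pod monitor config
    namespace: str = field(
        default_factory=lambda: os.getenv("K8S_NAMESPACE", "integr8scode")
    )
    kubeconfig_path: str | None = field(
        default_factory=lambda: os.getenv("KUBECONFIG")
    )
```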
1 issue found across 8 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/api/dependencies.py">
<violation number="1" location="backend/app/api/dependencies.py:13">
P2: The size check loads the full request body into memory before enforcing the limit, which can allow large payloads to consume memory and defeats the guard. Prefer checking Content-Length (or streaming with a cap) so oversized requests are rejected without buffering the entire body.</violation>
</file>
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/api/dependencies.py">
<violation number="1" location="backend/app/api/dependencies.py:27">
P2: Guard the Content-Length parsing so malformed headers don’t raise `ValueError` and cause a 500. Fall back to the streaming check or return a 400.</violation>
</file>
3 issues found across 9 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/core/middlewares/request_size_limit.py">
<violation number="1">
P2: Invalid Content-Length values will raise ValueError and return 500. Catch parsing errors and return a 400/413 instead.</violation>
<violation number="2">
P1: Requests without a Content-Length header bypass the size limit, so oversized streaming bodies can be accepted without any enforcement. Add a streaming/receive-size check or reject requests missing Content-Length (see the sketch after these findings).</violation>
</file>
<file name="docs/architecture/middleware.md">
<violation number="1">
P2: This documentation overstates the request-size enforcement. The middleware only rejects requests when `Content-Length` is present and exceeds the limit, so requests without (or with incorrect) Content-Length can bypass it. Please clarify the limitation so readers don’t assume full DoS protection.</violation>
</file>
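To make both middleware findings concrete, here is a sketch of a pure-ASGI size limit that parses Content-Length defensively and also caps streamed bodies. It deliberately simplifies error handling and is not the PR's actual RequestSizeLimitMiddleware:

```python
from starlette.responses import JSONResponse
from starlette.types import ASGIApp, Message, Receive, Scope, Send


class RequestSizeLimitMiddleware:
    def __init__(self, app: ASGIApp, max_bytes: int) -> None:
        self.app = app
        self.max_bytes = max_bytes

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        raw = dict(scope["headers"]).get(b"content-length")
        if raw is not None:
            try:
                declared = int(raw)
            except ValueError:
                # Malformed header: reject with 400 instead of crashing with a 500.
                await JSONResponse({"detail": "Invalid Content-Length"}, status_code=400)(scope, receive, send)
                return
            if declared > self.max_bytes:
                await JSONResponse({"detail": "Request body too large"}, status_code=413)(scope, receive, send)
                return

        seen = 0

        async def capped_receive() -> Message:
            # Also cap streamed bodies so a missing/lying Content-Length
            # cannot bypass the limit.
            nonlocal seen
            message = await receive()
            if message["type"] == "http.request":
                seen += len(message.get("body", b""))
                if seen > self.max_bytes:
                    # Simplification: abort the request; a full implementation
                    # would emit a 413 before the app starts responding.
                    raise RuntimeError("request body exceeded size limit")
            return message

        await self.app(scope, capped_receive, send)
```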



Summary by cubic
Switched backend to structlog and OpenTelemetry tracing with a single OTLP traces endpoint and tail sampling. Removed correlation IDs across APIs, events, and storage; request size limits remain enforced by RequestSizeLimitMiddleware using MAX_REQUEST_SIZE_MB.
Refactors
Migration
Written for commit ab40197. Summary will update on new commits.
Summary by CodeRabbit
New Features
Configuration Changes
Refactor