Skip to content

Improve batch worker error log serialization and test assertions#591

Open
cmraible wants to merge 5 commits intomainfrom
improve-error-logging
Open

Improve batch worker error log serialization and test assertions#591
cmraible wants to merge 5 commits intomainfrom
improve-error-logging

Conversation

@cmraible
Copy link
Collaborator

Summary

  • Updated batch worker error logging to use Pino's err field so error message and stack are serialized in structured logs
  • Added a reusable in-memory log capture utility for tests at test/utils/log-capture.ts
  • Migrated batch worker unit tests to assert emitted log output instead of logger spy calls, including error serialization checks

@coderabbitai
Copy link

coderabbitai bot commented Feb 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2ab482d0-b947-4119-944f-701965ccdef9

📥 Commits

Reviewing files that changed from the base of the PR and between cf267f8 and 76e5e80.

📒 Files selected for processing (7)
  • .eslintrc.cjs
  • src/handlers/page-hit-handlers.ts
  • src/plugins/hmac-validation.ts
  • src/utils/error-formatters.ts
  • test/integration/error-logging.test.ts
  • test/unit/handlers/page-hit-handlers.test.ts
  • test/unit/utils/error-formatters.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • .eslintrc.cjs

Walkthrough

Adds an ESLint no-restricted-syntax rule to .eslintrc.cjs to forbid logging an error property (encourages err for Pino). Renames many catch variables from error to err and updates logger payload keys accordingly across multiple services (BatchWorker, EventPublisher, FileSaltStore, FirestoreSaltStore, UserSignatureService, page hit handlers, HMAC plugin). BatchWorker alters parseMessage to log and ACK parse failures and captures pending batch details for scheduled-flush errors. Introduces an in-memory test log-capture utility and updates unit/integration tests to assert on err and use the new log capture.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main changes: improving error log serialization in batch worker and test assertions using Pino's err field.
Description check ✅ Passed The description clearly relates to the changeset, covering the three main objectives: error logging updates, new log capture utility, and test migration.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch improve-error-logging

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@cmraible cmraible force-pushed the improve-error-logging branch from b906863 to ceb2872 Compare March 3, 2026 00:05
@ErisDS
Copy link
Member

ErisDS commented Mar 3, 2026

🤖 Velo CI Failure Analysis

Classification: 🟠 SOFT FAIL

  • Workflow: CI
  • Failed Step: Lint & Test
  • Run: View failed run
    What failed: Test assertion failure in UserSignatureService.test.ts
    Why: The root cause is a test assertion failure in the UserSignatureService.test.ts file. This is a code issue, as the test is failing due to an issue in the application code, not an infrastructure problem.
    Action:
    The author should investigate the failing test case and fix the underlying issue in the UserSignatureService implementation.

@cmraible cmraible force-pushed the improve-error-logging branch from ceb2872 to 52ca8c9 Compare March 3, 2026 00:11
Comment on lines +45 to +48
const errorMessage = error instanceof Error ? error.message : String(error);
logger.error({
event: 'EventPublishFailed',
error: errorMessage,
err: errorMessage,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should log the whole error here, not just the error message.

Comment on lines +108 to +113
messageId: message.id,
messageData: this.getMessageData(message),
error
err: error
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename error to err, so we can just pass it in the object as-is

@cmraible cmraible force-pushed the improve-error-logging branch from 52ca8c9 to af99306 Compare March 3, 2026 00:51
Pino only auto-serializes errors (message, stack) on the `err` key.
Using `error` silently loses structured error data in production logs.
Rename `error` to `err` in all logger calls so Pino's error serializer
properly extracts message and stack trace in structured logs.
@cmraible cmraible force-pushed the improve-error-logging branch from af99306 to 185bbb5 Compare March 10, 2026 21:10
@ErisDS
Copy link
Member

ErisDS commented Mar 10, 2026

🤖 Velo CI Failure Analysis

Classification: 🟠 SOFT FAIL

  • Workflow: CI
  • Failed Step: Lint & Test
  • Run: View failed run
    What failed: CI failure - likely code issue
    Why: The failure appears to be related to code changes. Check the error output for details.
    Action:
    Review the error logs and fix the issue in your code.

@cmraible cmraible marked this pull request as ready for review March 10, 2026 22:19
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
test/unit/services/user-signature/UserSignatureService.test.ts (1)

296-313: Assert the serialized error log here.

This still only checks the raw spy argument. Since the behavior change is Pino serializing err, this test can pass even if the emitted log no longer contains the serialized message/stack. Consider switching this case to createInMemoryLogCapture() and asserting the parsed log entry instead, like the batch-worker tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/unit/services/user-signature/UserSignatureService.test.ts` around lines
296 - 313, Replace the current spy-based assertion with an in-memory log capture
so you can assert the serialized log payload: instead of spying on logger.error
(loggerErrorSpy) set up createInMemoryLogCapture() before instantiating
UserSignatureService(mockSaltStore), trigger the timeoutCallback from
setTimeoutSpy as before, then parse the captured log entry and assert it
contains event: 'SaltCleanupFailed' and that the serialized err includes
error.message ('Cleanup failed') and err.stack; keep the mockRejectedValue on
mockSaltStore.cleanup to simulate the error and still assert cleanupSpy was
called.
.eslintrc.cjs (1)

7-13: Broaden this selector to catch logger patterns beyond the logger binding.

This rule only enforces the err convention for calls like logger.error({error: ...}). The codebase also uses request.log.error(), reply.log.error(), and fastify.log.*() with error fields that remain uncaught, leaving enforcement incomplete.

Example unmatched patterns in codebase
  • request.log.error({event: ..., error: ...}) in src/handlers/page-hit-handlers.ts (lines 14, 118)
  • reply.log.error({event: ..., error: ...}) in src/handlers/page-hit-handlers.ts (line 81)
  • request.log.error({event: ..., error: ...}) in src/plugins/hmac-validation.ts (line 68)
  • request.log.warn({event: ..., error: ...}) in src/plugins/hmac-validation.ts (line 36)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.eslintrc.cjs around lines 7 - 13, Update the ESLint rule selector in
.eslintrc.cjs so it matches any member call that logs errors (not only calls on
an identifier named "logger"); replace the current selector with one that
targets CallExpression nodes whose callee is a MemberExpression with
property.name 'error' (and where the first argument is an ObjectExpression
containing a Property with key.name 'error'), e.g. use a selector like
"CallExpression[callee.type='MemberExpression'][callee.property.name='error'] >
ObjectExpression > Property[key.name='error']" so patterns such as
request.log.error(...), reply.log.error(...), fastify.log.error(...), etc. are
caught.
test/unit/services/batch-worker/batch-worker.test.ts (1)

72-75: Add await drain() before clearing and reading the log capture to prevent race conditions.

The test setup calls getLogCapture().clear() in beforeEach() without awaiting drain(), and multiple test assertions call getLogCapture().findByEvent() immediately after async operations without draining. Since Pino may queue writes internally before flushing to the stream, this creates a race condition where log entries can be missed or leak across tests. Make beforeEach() async and add await getLogCapture().drain() before clear(), and before each findByEvent() call in assertions that follow async operations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/unit/services/batch-worker/batch-worker.test.ts` around lines 72 - 75,
Make the test setup await pending log writes to avoid races: change the
beforeEach to be async and call await getLogCapture().drain() before
getLogCapture().clear(); also, in tests that call getLogCapture().findByEvent()
after async work, insert await getLogCapture().drain() immediately before each
findByEvent() to ensure all Pino writes have flushed. Ensure you reference
getLogCapture(), drain(), clear(), findByEvent(), and the beforeEach hook when
applying the changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/services/batch-worker/BatchWorker.ts`:
- Around line 185-190: The catch in flushBatch() logs this.batch metadata which
is cleared before the error; change the log to use the copied batch
(batchToFlush) instead of this.batch so batchSize and messageIds reflect the
items that failed to flush; update the logger.error call to use
batchToFlush.length and batchToFlush.map(item => item.message.id) (or a
similarly named local copy used in flushBatch()) and preserve the err field.

---

Nitpick comments:
In @.eslintrc.cjs:
- Around line 7-13: Update the ESLint rule selector in .eslintrc.cjs so it
matches any member call that logs errors (not only calls on an identifier named
"logger"); replace the current selector with one that targets CallExpression
nodes whose callee is a MemberExpression with property.name 'error' (and where
the first argument is an ObjectExpression containing a Property with key.name
'error'), e.g. use a selector like
"CallExpression[callee.type='MemberExpression'][callee.property.name='error'] >
ObjectExpression > Property[key.name='error']" so patterns such as
request.log.error(...), reply.log.error(...), fastify.log.error(...), etc. are
caught.

In `@test/unit/services/batch-worker/batch-worker.test.ts`:
- Around line 72-75: Make the test setup await pending log writes to avoid
races: change the beforeEach to be async and call await getLogCapture().drain()
before getLogCapture().clear(); also, in tests that call
getLogCapture().findByEvent() after async work, insert await
getLogCapture().drain() immediately before each findByEvent() to ensure all Pino
writes have flushed. Ensure you reference getLogCapture(), drain(), clear(),
findByEvent(), and the beforeEach hook when applying the changes.

In `@test/unit/services/user-signature/UserSignatureService.test.ts`:
- Around line 296-313: Replace the current spy-based assertion with an in-memory
log capture so you can assert the serialized log payload: instead of spying on
logger.error (loggerErrorSpy) set up createInMemoryLogCapture() before
instantiating UserSignatureService(mockSaltStore), trigger the timeoutCallback
from setTimeoutSpy as before, then parse the captured log entry and assert it
contains event: 'SaltCleanupFailed' and that the serialized err includes
error.message ('Cleanup failed') and err.stack; keep the mockRejectedValue on
mockSaltStore.cleanup to simulate the error and still assert cleanupSpy was
called.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1e57acb7-0be5-4398-9441-3ca5a456f9d3

📥 Commits

Reviewing files that changed from the base of the PR and between a246028 and 185bbb5.

📒 Files selected for processing (9)
  • .eslintrc.cjs
  • src/services/batch-worker/BatchWorker.ts
  • src/services/events/publisher.ts
  • src/services/salt-store/FileSaltStore.ts
  • src/services/salt-store/FirestoreSaltStore.ts
  • src/services/user-signature/UserSignatureService.ts
  • test/unit/services/batch-worker/batch-worker.test.ts
  • test/unit/services/user-signature/UserSignatureService.test.ts
  • test/utils/log-capture.ts

Snapshot batch size and message IDs before calling flushBatch(),
which clears this.batch before throwing. Previously the log would
always show batchSize: 0 and messageIds: [].
Add second AST selector for CallExpression[callee.object.property.name='log']
to catch *.log.error({ error: ... }) patterns in addition to the existing
logger.error({ error: ... }) pattern.

Fix violations in page-hit-handlers, hmac-validation, and error-formatters.
Update corresponding test assertions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants