Fix database connection churn in production and Celery workers #882
- Enable CONN_HEALTH_CHECKS in production settings so Django validates persistent connections before reuse, preventing stale connection errors and reducing TIME_WAIT churn from failed-then-replaced connections.
- Add TCP keepalive OPTIONS (connect_timeout, keepalives, keepalives_idle, keepalives_interval, keepalives_count) to the production database config. These prevent the Cloud SQL proxy and intermediate infrastructure from silently dropping idle connections. Settings are environment-configurable.
- Add a Celery worker_process_init signal to close inherited database connections when child workers are forked, preventing stale connection accumulation from the prefork pool parent process.

https://claude.ai/code/session_01HWzoYENK11CAvkCBLwraz8
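For concreteness, here is a minimal sketch of what the settings described above could look like. This is not the PR's actual code: the `env_int` helper stands in for django-environ's `env.int`, and the `DATABASE_CONN_MAX_AGE` and `DATABASE_CONNECT_TIMEOUT` variable names plus the specific defaults are illustrative assumptions (only `DATABASE_KEEPALIVES` and `DATABASE_KEEPALIVES_INTERVAL` appear verbatim elsewhere in this thread).

```python
import os


def env_int(name: str, default: int) -> int:
    """Minimal stand-in for django-environ's env.int() helper."""
    return int(os.environ.get(name, default))


DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        # Reuse connections for up to 60s; CONN_HEALTH_CHECKS only
        # applies when CONN_MAX_AGE > 0 (persistent connections enabled).
        "CONN_MAX_AGE": env_int("DATABASE_CONN_MAX_AGE", 60),
        # Validate a persistent connection before each reuse.
        "CONN_HEALTH_CHECKS": True,
        "OPTIONS": {
            # libpq connection parameters, all seconds except the flags.
            "connect_timeout": env_int("DATABASE_CONNECT_TIMEOUT", 10),
            "keepalives": env_int("DATABASE_KEEPALIVES", 1),
            "keepalives_idle": env_int("DATABASE_KEEPALIVES_IDLE", 30),
            "keepalives_interval": env_int("DATABASE_KEEPALIVES_INTERVAL", 10),
            "keepalives_count": env_int("DATABASE_KEEPALIVES_COUNT", 5),
        },
    }
}
```

Every value can be overridden per environment without a code change, which matches the "environment-configurable" goal stated above.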
### Code Review: Database Connection Churn Fix

#### Overview

This PR addresses production database connection issues by implementing three complementary fixes: connection health checks, TCP keepalives, and fork-safe connection cleanup in Celery workers.
#### Positive Aspects ✓

1. Excellent Problem Analysis
2. Strong Documentation
3. Consistency with Test Configuration
4. Best Practice Implementation
#### Critical Issues
### Code Review: Database Connection Churn Fix

Thank you for this well-documented PR addressing production database connection issues. This is a solid fix with good attention to detail. Here's my comprehensive review:

#### ✅ Strengths

1. Well-Researched Solution
2. Code Quality
3. Configuration Management
#### 🔍 Observations & Questions

**1. TCP Keepalive Timing Differences**

Production (`config/settings/production.py:31`):

```python
"keepalives_interval": env.int("DATABASE_KEEPALIVES_INTERVAL", default=10),
```

Test (`config/settings/test.py:26`):

```python
"keepalives_interval": 5,
```

**Question:** Is the 10-second interval in production intentional? Test uses 5 seconds. For Cloud SQL proxy, the more aggressive 5-second interval might be better to detect stale connections faster, but 10 seconds reduces keepalive traffic. Was this a deliberate tradeoff?

**Recommendation:** Add a brief comment in production.py explaining the rationale (e.g., noting the traffic vs. detection-speed tradeoff).

**2. Connection Health Checks + Keepalive Interaction**

You've enabled both `CONN_HEALTH_CHECKS` and TCP keepalives. However, with aggressive keepalive settings (30s idle, 10s interval, 5 retries = ~80s total), most connections won't go idle long enough to trigger health check failures.

**Question:** Have you profiled in production to see how often health checks actually fail?

**Recommendation:** This is fine as-is (redundancy is good for production stability), but consider logging when health checks fail to validate that both mechanisms are working as expected.
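The timing interplay discussed above can be checked with quick arithmetic. The 30s/10s/5 values come from this review comment; the 60-second `CONN_MAX_AGE` figure is the default discussed elsewhere in the thread.

```python
# Worst-case time for TCP keepalives to declare a peer dead, using the
# values quoted in the review (all in seconds).
keepalives_idle = 30      # idle time before the first probe is sent
keepalives_interval = 10  # gap between unanswered probes
keepalives_count = 5      # unanswered probes before the OS drops the socket

# First probe fires after the idle period; the remaining probes follow
# at the configured interval.
detection_window = keepalives_idle + keepalives_interval * keepalives_count
print(detection_window)  # 80

# Django's persistent-connection lifetime.
conn_max_age = 60

# Django discards the connection before a full keepalive cycle completes,
# so keepalives protect the TCP layer but do not extend connection lifetime.
print(detection_window > conn_max_age)  # True
```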
### Codecov Report

✅ All modified and coverable lines are covered by tests.
Signed-off-by: JSIV <5049984+JSv4@users.noreply.github.com>
### Code Review

This is a solid, focused fix targeting a real production pain point. The three-pronged approach (health checks + keepalives + fork-safe cleanup) is the correct solution for connection churn in a prefork Celery deployment. The changes are minimal and well-documented. A few items are worth addressing before merge:

#### Issues

**1. Missing blank line in CHANGELOG.md**

The new changelog entry is missing a blank line separator before the next section heading:

```diff
+#### High Database Connection Count and Churn in Production
+- ...
+- ...
+- ...
 #### Security: LLM Prompt Injection Protection for Approval Bypass
```

A blank line is needed between the two `####` headings.

**2. `CONN_HEALTH_CHECKS` has no effect without persistent connections**

Per Django docs, `CONN_HEALTH_CHECKS` only applies when `CONN_MAX_AGE > 0`. A guard comment at the setting would help:

```python
# CONN_HEALTH_CHECKS requires CONN_MAX_AGE > 0 to have any effect.
# If CONN_MAX_AGE is 0, persistent connections are disabled and health
# checks are never reached.
DATABASES["default"]["CONN_HEALTH_CHECKS"] = True  # noqa: F405
```

**3. Test config divergence**

The test config uses a 5-second `keepalives_interval`, while production defaults to 10 seconds; consider aligning them or documenting the difference.

**4. Keepalive timeout vs `CONN_MAX_AGE`**

With the defaults as set, the worst-case keepalive detection window (30s idle + 10s interval × 5 retries ≈ 80s) exceeds `CONN_MAX_AGE` (60s). This means Django will discard the connection at 60s, before the keepalive mechanism has a chance to rescue it from a stale state. The keepalives do still provide value (they prevent the OS from silently dropping the TCP connection before Django closes it cleanly), but they do not extend connection lifetime. This is fine architecturally; it is just worth a comment so future maintainers don't wonder why keepalives are set to values that seem to exceed `CONN_MAX_AGE`.

If the intent is to have keepalives actively rescue idle connections, consider either tightening the keepalive timings or raising `CONN_MAX_AGE`.

**5. Signal handler is prefork-specific**

The signal handler is correct for the default prefork pool, but it is a no-op (or not invoked at all) for the gevent, eventlet, and solo pools. A clarifying comment would help:

```python
# NOTE: This signal fires only in the prefork pool (the default).
# For gevent/eventlet/solo pools, no fork occurs and this handler is never called.
```

#### Minor / Non-blocking

- All other keepalive parameters are env-overridable, but:

  ```python
  "keepalives": env.int("DATABASE_KEEPALIVES", default=1),
  ```

- Deferred import in signal handler.

#### Verdict

Approve with minor suggestions. The core fix is correct and addresses a well-understood class of Celery+PostgreSQL connection problems. Items 1 and 4 are the most actionable; the rest are documentation/commentary improvements. None are blockers if the team is comfortable with the current behavior.
- Align keepalives_interval default to 5s (was 10s) to match test config and improve stale connection detection speed through Cloud SQL proxy
- Expand Celery signal handler docstring to document the expected pool type (prefork) and note harmless behavior on other pool types
- Add unit test for the worker_process_init signal handler
- Add manual test script for verifying connection health post-deployment
- Add DATABASE_SSL_MODE to sample env file for completeness
- Condense CHANGELOG entry per review feedback

https://claude.ai/code/session_01HWzoYENK11CAvkCBLwraz8
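The manual test script referenced above is not included in this thread. A hypothetical minimal version might look like the following, parameterized over a cursor factory so the core check can run outside Django; in a real deployment you would pass `django.db.connection.cursor`.

```python
def check_database_connection(cursor_factory) -> bool:
    """Run a trivial query through a fresh cursor and verify the result.

    cursor_factory: a zero-argument callable returning a context-manager
    cursor, e.g. django.db.connection.cursor in an actual deployment.
    Returns True when the round trip to the database succeeds.
    """
    with cursor_factory() as cursor:
        cursor.execute("SELECT 1")
        (value,) = cursor.fetchone()
    return value == 1
```

Run after deployment (for example via `python manage.py shell`), this exercises the full connect/query path and therefore the new keepalive and health-check settings.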
### PR Review: Fix database connection churn in production and Celery workers

Overall this is a well-scoped, well-documented fix for a real production issue. The changes are minimal, targeted, and include tests and docs. A few items worth considering before merging:
| Area | Status |
|---|---|
| Correctness of core fix | ✅ Correct and idiomatic |
| Test coverage | ✅ Unit test covers the signal handler |
| Documentation | ✅ Test script + env sample + changelog updated |
| Security impact | ✅ Neutral — no new attack surface |
Flagged items:

- Docstring accuracy (gevent/eventlet)
- DATABASE_SSL_MODE comment in env file
No blocking issues. The two flagged items are minor; address them at your discretion.
Mistakenly committed to main as 1e073cf
### Summary

This PR addresses high database connection count and churn issues in production by implementing connection health checks, TCP keepalive settings, and proper connection cleanup in Celery worker processes.

### Key Changes

1. **Enabled Django connection health checks** (`config/settings/production.py`): Added `CONN_HEALTH_CHECKS = True` to validate that persistent connections are still alive before reuse. This prevents errors from stale connections that may have been closed by Cloud SQL proxy or PostgreSQL idle timeouts.
2. **Added TCP keepalive options to production database config** (`config/settings/production.py`): Configured `connect_timeout`, `keepalives`, `keepalives_idle`, `keepalives_interval`, and `keepalives_count` to prevent intermediate infrastructure (Cloud SQL proxy, firewalls, load balancers) from silently dropping idle connections. These settings were already present in the test configuration but were missing from production.
3. **Added Celery worker process initialization signal** (`config/celery_app.py`): Implemented a `worker_process_init` signal handler to explicitly close inherited database connections when Celery forks worker child processes. This prevents stale connection accumulation and errors, since connections inherited from the parent process are invalid in the child.
4. **Updated environment variable documentation** (`docs/sample_env_files/backend/production/django.env`): Added commented-out examples for the new database connection timeout and keepalive configuration options.

### Implementation Details
The Celery signal handler closes all database connections immediately after a worker process is initialized, forcing Django to establish fresh connections. This is critical because when Celery uses the prefork pool, child processes inherit file descriptors for database connections that are no longer valid in the child process context.
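As a sketch of this mechanism (the handler and helper names here are assumptions, not the PR's actual code), the fork-time cleanup can be written as:

```python
def close_inherited_connections(connections_iterable):
    """Close every DB connection handed down from the prefork parent.

    The sockets behind these connections are shared with (and owned by)
    the parent, so closing them in the child forces Django to open a
    fresh connection lazily on the first query.
    """
    for conn in connections_iterable:
        conn.close()


try:
    from celery.signals import worker_process_init
    from django.db import connections

    @worker_process_init.connect
    def reset_db_connections(**kwargs):
        # Fires once per child right after fork (prefork pool only;
        # gevent/eventlet/solo never fork, so this never runs there).
        close_inherited_connections(connections.all())
except ImportError:
    # Celery/Django not installed; the helper above remains usable standalone.
    pass
```

The try/except guard is only for this standalone sketch; in the real `config/celery_app.py` the imports would be unconditional.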
The TCP keepalive settings work in conjunction with connection health checks to maintain connection validity across network boundaries and prevent silent connection drops that would otherwise cause connection churn (TIME_WAIT accumulation).
https://claude.ai/code/session_01HWzoYENK11CAvkCBLwraz8