ENG-3299: Add correspondence enums, scopes, and encrypt comment_text#7866
ENG-3299: Add correspondence enums, scopes, and encrypt comment_text#7866
Conversation
- Extend CommentType with message_to_subject and reply_from_subject - Add CorrespondenceDeliveryStatus enum (pending/sent/delivered/bounced/failed) - Add MessagingActionType.CORRESPONDENCE with default template - Add EventAuditType correspondence events (string column, no migration) - Register CORRESPONDENCE_SEND, CORRESPONDENCE_READ, NOTIFICATION_READ scopes - Encrypt comment_text column using StringEncryptedType (AES-GCM) - Migration: create PG enum, batch-encrypt existing rows, seed RBAC scopes - Update TestRBACScopeRegistrySync to scan multiple migration files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub. 2 Skipped Deployments
|
delivery_status uses varchar, not a PostgreSQL enum type. Validation happens at the Python model layer. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added CORRESPONDENCE default template, so the count increased by 1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #7866 +/- ##
=======================================
Coverage 85.06% 85.07%
=======================================
Files 629 629
Lines 40860 40879 +19
Branches 4748 4748
=======================================
+ Hits 34759 34776 +17
- Misses 5029 5030 +1
- Partials 1072 1073 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Code Review — ENG-3299: Correspondence Enums, Scopes & comment_text Encryption
Overall this is a solid foundational PR. The encryption approach is consistent with existing patterns in the codebase, the batched migration is correct, RBAC seeding follows the established pattern, and the TLA+ compliance note is appreciated. A few issues worth addressing before merge:
Must-fix
CORRESPONDENCE_SCOPES mixes namespaces and the test hardcodes its name
The migration dict CORRESPONDENCE_SCOPES contains notification:read, which belongs to a different scope namespace. More importantly, the test_rbac.py test hardcodes SCOPE_DICT_NAMES = {"SCOPE_DOCS", "CORRESPONDENCE_SCOPES"} — meaning if any future migration introduces a different dict name (e.g. REPORTING_SCOPES), its scopes will be silently missing from the enforcement check. The test's own comment says "dicts whose name ends with _SCOPES" — the implementation should match that intent. See inline comments.
Should-fix
CorrespondenceDeliveryStatus is dead code at this point
The class is defined in comment.py but the Comment model has no delivery_status column, and no module imports the class. If it's scaffolding for PR 2, a comment to that effect would help; otherwise it should be deferred to the PR that actually introduces the column.
Downgrade loop lacks logging
The upgrade() logs the number of encrypted rows but downgrade() doesn't. Parity would aid debugging during rollbacks.
Suggestions / observations
- The owner-only RBAC assignment for
correspondence:readandnotification:readmay be intentional, but is worth a quick confirmation — these could reasonably be readable by lower-privileged roles. - The encryption migration has no idempotency guard. Double-running (e.g. after a partial downgrade failure) would double-encrypt rows. A doc note or guard condition would reduce risk.
🔬 Codegraph: unavailable
💡 Write /code-review in a comment to re-run this review.
| assert len(migration_scopes) > 0, "Could not parse SCOPE_DOCS from migration" | ||
| # Collect scope keys from all dicts whose name ends with _SCOPES | ||
| # or is SCOPE_DOCS across all migration files | ||
| SCOPE_DICT_NAMES = {"SCOPE_DOCS", "CORRESPONDENCE_SCOPES"} |
There was a problem hiding this comment.
tests/api/models/test_rbac.py:555
The comment above this line says "Collect scope keys from all dicts whose name ends with _SCOPES or is SCOPE_DOCS" but the implementation uses a hardcoded set rather than a pattern match. When a future migration introduces a dict with a new name (e.g. REPORTING_SCOPES), the test will silently pass even though those scopes are missing from the assertion.
Suggest replacing the hardcoded set with a name-based match to make the test self-maintaining:
# Before
if target.id in SCOPE_DICT_NAMES:
# After
if target.id == "SCOPE_DOCS" or target.id.endswith("_SCOPES"):This eliminates SCOPE_DICT_NAMES entirely and keeps the test consistent with its own documentation.
There was a problem hiding this comment.
Done — replaced the hardcoded SCOPE_DICT_NAMES set with target.id.endswith("_SCOPES") or target.id == "SCOPE_DOCS" pattern match. The test is now self-extending for future migrations.
|
/code-review |
There was a problem hiding this comment.
Code Review: ENG-3299 — Correspondence Enums, Scopes & comment_text Encryption
This PR introduces two new CommentType values, a CorrespondenceDeliveryStatus enum, a MessagingActionType.CORRESPONDENCE entry, new correspondence:* / notification:read scopes in the registry, and a migration that back-fills encryption on existing comment rows and seeds RBAC permissions.
The overall structure is clean, but the migration has two critical correctness issues around idempotency that could corrupt data in production or during re-runs.
Critical
1. Migration is not idempotent — re-running upgrade() double-encrypts all rows
The upgrade() loop calls enc.process_bind_param unconditionally on every comment_text value with no check for whether the row is already encrypted. A second run (CI retry, operator mistake, partial failure followed by retry) will encrypt ciphertext again, producing an unreadable value. The existing consent_encryption_migration.py pattern solves this with a trial-decrypt: attempt process_result_value first; skip on success, encrypt on failure. See the inline comment at line 59 for a concrete fix.
2. downgrade() crashes on plaintext rows
process_result_value raises InvalidCiphertextError on plaintext input. If downgrade() is called on a database where upgrade() never ran (fresh env, partial rollback), the decryption loop will throw and leave the table in a broken half-decrypted state. Same trial-decrypt guard fixes this.
Suggestions
3. No fast-fail for missing encryption key — The docstring notes FIDES__SECURITY__APP_ENCRYPTION_KEY as a PREREQ, but if it's absent the failure manifests deep in the crypto layer. An explicit assertion at the top of upgrade() would catch this immediately with a clear message.
4. ON CONFLICT DO NOTHING + unconditional downgrade delete — If the scope codes already exist in rbac_permission from another source, the INSERT silently no-ops, but downgrade() will still delete the pre-existing row. Consider querying after the insert to confirm ownership, or using DO UPDATE SET updated_at = now() to take ownership explicitly.
5. CorrespondenceDeliveryStatus is unused — Defined in comment.py but no column references it. If this is for a future column, a comment explaining the intent would help; if the column was accidentally omitted, it should be added here.
6. SCOPE_DICT_NAMES hardcoded in test — {"SCOPE_DOCS", "CORRESPONDENCE_SCOPES"} must be manually updated every time a future migration seeds scopes under a different dict name, making it easy to accidentally break test coverage. A comment calling this out (or a more generic heuristic) would help.
7. Per-row UPDATE loop — 500 individual UPDATE ... WHERE id = :id statements per batch increases lock-holding time for large tables. Worth documenting expected duration for tables with many rows, even if changing the approach is out of scope.
🔬 Codegraph: unavailable
💡 Write /code-review in a comment to re-run this review.
|
/code-review |
There was a problem hiding this comment.
Code Review — ENG-3299: Correspondence enums, scopes, and comment_text encryption
Overall this is a clean, well-structured first PR for the correspondence feature. The encryption migration is safe (uses the established encrypted_type helper, batches correctly, handles upgrade/downgrade symmetrically), the scope registry additions follow existing conventions, and the test counts are updated correctly.
Summary of findings
Medium
SCOPE_DICT_NAMEShardcoding vs. comment mismatch (test_rbac.py:555) — The comment describes a pattern-based rule (endswith("_SCOPES")), but the implementation is an exact-match set. Every future scope-seeding migration must update this set or the enforcement test silently misses the new scopes. Aligning the implementation with the comment's stated intent would make the test self-extending.
Minor / questions
- RBAC role assignment scope (migration:91) — All three new scopes are seeded only on the
ownerrole. Confirm whethercorrespondence:read/notification:readshould also be granted to lower-privilege standard roles (e.g.viewer), or add a note that PR 2 will revisit role assignments. CorrespondenceDeliveryStatusis unused (comment.py:37) — Intentional for PR 1 of 2, but a short inline comment pointing to the follow-up PR/column would prevent confusion.- Per-row UPDATE round-trips in migration (migration:58) — Each encrypted row is updated individually. Fine for typical comment table sizes, but worth noting for large installs. Not a blocker.
What looks good
- Batched encrypt/decrypt with stable
ORDER BY id+OFFSET— correct and safe. - Downgrade removes
rbac_role_permissionrows beforerbac_permissionrows — correct FK ordering. comment_textisnullable=False, so no null-guard needed in the migration loop.SENDconstant was already defined inscope_registry.py(line 64) —CORRESPONDENCE_SENDcomposes correctly.MessagingActionType.CORRESPONDENCEdefault template follows the same__PLACEHOLDER__convention used elsewhere.- Test counts bumped from 9 → 10 templates in all three
save_defaultstest cases.
🔬 Codegraph: connected (42425 nodes)
💡 Write /code-review in a comment to re-run this review.
- Add idempotency guards (trial-decrypt) to migration upgrade/downgrade
- Add row-count logging to downgrade for parity with upgrade
- Split CORRESPONDENCE_SCOPES into separate namespace dicts
- Replace hardcoded SCOPE_DICT_NAMES with endswith("_SCOPES") pattern match
- Add docstring to CorrespondenceDeliveryStatus referencing PR 2
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| - notes are internal comments. | ||
| - reply is reserved for future use and is not currently supported | ||
| - note: internal comments | ||
| - reply: reserved for future use |
There was a problem hiding this comment.
reply pre-dates this PR — it's been a reserved CommentType value for threaded reply support that hasn't been implemented yet. It's only referenced in tests as fixture data; no route, service, or schema uses it in production code.
Implementing threaded replies is out of scope for this project, but the plumbing is already in place (parent_id FK, parent/replies relationships, index) so when the time comes it should be quick to wire up a route and service layer for it.
| def _is_encrypted(enc, value, dialect): | ||
| """Return True if value is already AES-GCM ciphertext.""" | ||
| if value is None: | ||
| return False |
There was a problem hiding this comment.
do we want to consider null values unencrypted?
There was a problem hiding this comment.
Null values are skipped — not treated as encrypted or unencrypted. Line 90 checks row.comment_text is None first and skips those rows entirely. The column is nullable=False on the model so nulls shouldn't exist in practice, but the guard handles edge cases (e.g. rows inserted by raw SQL). I can add an inline comment clarifying this if it reads ambiguously.
| # Under normal Alembic operation the transaction guarantees all-or-nothing, | ||
| # but the guard protects against manual version-stamp edits or future | ||
| # refactors that add per-batch commits. | ||
| from fides.api.db.encryption_utils import encrypted_type, get_encryption_key |
There was a problem hiding this comment.
can we do top-level imports or is the local import needed for some reason?
There was a problem hiding this comment.
oops! I'll move them to the top of the file.
| text( | ||
| "SELECT id, comment_text FROM comment " | ||
| "ORDER BY id LIMIT :limit OFFSET :offset" | ||
| ), |
There was a problem hiding this comment.
using offset can be problematic in large tables... I imagine this isn't a problem here since the number of comments is somewhat limited because people are writing them manually, but do we have an estimate of the number of rows in this table for some of the bigger clients in prod ?
There was a problem hiding this comment.
Comments are manually written by operators on privacy requests — even the biggest clients have low thousands at most. The offset-based pagination is safe here because ORDER BY id is stable (we only update comment_text, never id), so rows don't shift between pages. The same LIMIT/OFFSET pattern is used in consent_encryption_migration.py. Batch size of 500 means even 10K rows is only 20 queries.
| "Skipped {} already-encrypted or NULL comment_text rows", total_skipped | ||
| ) | ||
|
|
||
| # 2. Seed RBAC permissions for new scopes |
There was a problem hiding this comment.
steps 2 and 3 seem unrelated from the encryption . maybe they should be two separate migrations? one to encrypt, one to do all the role/scope seeding
There was a problem hiding this comment.
Good call — I'll split this into two migrations: one for encrypting comment_text, one for RBAC scope seeding. Keeps each migration single-purpose and makes rollbacks cleaner.
Address review feedback from erosselli: - Split single migration into two: 6465b161a450 (encrypt comment_text) and f26bf4ecec76 (seed correspondence/notification RBAC scopes) - Move fides.api.db.encryption_utils imports to top-level, matching the pattern used by every other migration in the codebase Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ticket ENG-3299
Description Of Changes
PR 1 of 2 for ENG-3299 (Data Model, Enums, and PII Encryption). Establishes the foundational data structures for the correspondence feature.
What this PR does:
CommentTypeenum withmessage_to_subjectandreply_from_subjectvalues for outbound/inbound correspondenceCorrespondenceDeliveryStatusPython enum (pending,sent,delivered,bounced,failed) — used as varchar validation at the model layer, no PG enum type createdMessagingActionType.CORRESPONDENCEwith a default messaging templateEventAuditTypecorrespondence events (correspondence.sent,.delivered,.bounced,.reply_received) — string column, no migration neededCORRESPONDENCE_SEND,CORRESPONDENCE_READ,NOTIFICATION_READscopes with RBAC permission seedingcomment_textcolumn on theCommentmodel usingStringEncryptedType(AES-GCM), with a batched migration for existing rowsTLA+ spec compliance: Verified against
fidesplus/specs/correspondence/CorrespondenceConcurrency.tla— delivery status values,message_idUNIQUE constraint design,CommentTypevalues, andEventAuditTypeevents all match the formal spec.Pre-req audit (comment_text encryption safety): Confirmed no LIKE/ILIKE/contains/startswith queries exist on
comment_textanywhere in the codebase. The column is only used as a data field in API requests/responses, never for SQL-level filtering.Encryption key: All fides deployments already have
FIDES__SECURITY__APP_ENCRYPTION_KEYset — it is a required config value enforced at webserver startup, and is used by dozens of existing encrypted columns (connection secrets, user passwords, privacy preferences, etc.).Code Changes
src/fides/api/models/comment.py— ExtendedCommentType, addedCorrespondenceDeliveryStatusenum, encryptedcomment_textviaencrypted_type()src/fides/api/models/event_audit.py— Added 4 correspondence event typessrc/fides/api/models/messaging_template.py— Added default CORRESPONDENCE templatesrc/fides/api/schemas/messaging/messaging.py— AddedMessagingActionType.CORRESPONDENCEsrc/fides/common/scope_registry.py— Added 3 new scopes +SCOPE_DOCSentriestests/api/models/test_rbac.py— UpdatedTestRBACScopeRegistrySyncto scan migration files for scope dicts using pattern matching (endswith("_SCOPES"))6465b161a450— Batch-encrypts existingcomment_textrows (single-purpose: encryption only)f26bf4ecec76— Seeds RBAC permissions for correspondence and notification scopes, assigns to owner roleSteps to Confirm
alembic upgrade head— verify both migrations succeedalembic downgrade b3c8d5e7f2a1thenalembic upgrade headagain — verify full cycle works for both migrationsSELECT code FROM rbac_permission WHERE code LIKE 'correspondence%' OR code LIKE 'notification%';comment_textis encrypted in the DB (raw SELECT returns ciphertext) but decrypted when read via ORMpytest tests/ctl/models/test_comment.py tests/api/models/test_rbac.py::TestRBACScopeRegistrySync— all passPre-Merge Checklist
CHANGELOG.mdupdatedmaindowngrade()migration is correct and works