Skip to content

ENG-3299: Add correspondence enums, scopes, and encrypt comment_text#7866

Open
JadeCara wants to merge 21 commits intomainfrom
ENG-3299/enums-scopes-encryption
Open

ENG-3299: Add correspondence enums, scopes, and encrypt comment_text#7866
JadeCara wants to merge 21 commits intomainfrom
ENG-3299/enums-scopes-encryption

Conversation

@JadeCara
Copy link
Copy Markdown
Contributor

@JadeCara JadeCara commented Apr 8, 2026

Ticket ENG-3299

Description Of Changes

PR 1 of 2 for ENG-3299 (Data Model, Enums, and PII Encryption). Establishes the foundational data structures for the correspondence feature.

What this PR does:

  • Extends CommentType enum with message_to_subject and reply_from_subject values for outbound/inbound correspondence
  • Adds CorrespondenceDeliveryStatus Python enum (pending, sent, delivered, bounced, failed) — used as varchar validation at the model layer, no PG enum type created
  • Adds MessagingActionType.CORRESPONDENCE with a default messaging template
  • Adds EventAuditType correspondence events (correspondence.sent, .delivered, .bounced, .reply_received) — string column, no migration needed
  • Registers CORRESPONDENCE_SEND, CORRESPONDENCE_READ, NOTIFICATION_READ scopes with RBAC permission seeding
  • Encrypts the comment_text column on the Comment model using StringEncryptedType (AES-GCM), with a batched migration for existing rows

TLA+ spec compliance: Verified against fidesplus/specs/correspondence/CorrespondenceConcurrency.tla — delivery status values, message_id UNIQUE constraint design, CommentType values, and EventAuditType events all match the formal spec.

Pre-req audit (comment_text encryption safety): Confirmed no LIKE/ILIKE/contains/startswith queries exist on comment_text anywhere in the codebase. The column is only used as a data field in API requests/responses, never for SQL-level filtering.

Encryption key: All fides deployments already have FIDES__SECURITY__APP_ENCRYPTION_KEY set — it is a required config value enforced at webserver startup, and is used by dozens of existing encrypted columns (connection secrets, user passwords, privacy preferences, etc.).

Code Changes

  • src/fides/api/models/comment.py — Extended CommentType, added CorrespondenceDeliveryStatus enum, encrypted comment_text via encrypted_type()
  • src/fides/api/models/event_audit.py — Added 4 correspondence event types
  • src/fides/api/models/messaging_template.py — Added default CORRESPONDENCE template
  • src/fides/api/schemas/messaging/messaging.py — Added MessagingActionType.CORRESPONDENCE
  • src/fides/common/scope_registry.py — Added 3 new scopes + SCOPE_DOCS entries
  • tests/api/models/test_rbac.py — Updated TestRBACScopeRegistrySync to scan migration files for scope dicts using pattern matching (endswith("_SCOPES"))
  • Migration 6465b161a450 — Batch-encrypts existing comment_text rows (single-purpose: encryption only)
  • Migration f26bf4ecec76 — Seeds RBAC permissions for correspondence and notification scopes, assigns to owner role

Steps to Confirm

  1. Run alembic upgrade head — verify both migrations succeed
  2. Run alembic downgrade b3c8d5e7f2a1 then alembic upgrade head again — verify full cycle works for both migrations
  3. Verify RBAC scopes seeded: SELECT code FROM rbac_permission WHERE code LIKE 'correspondence%' OR code LIKE 'notification%';
  4. Create a comment via ORM — verify comment_text is encrypted in the DB (raw SELECT returns ciphertext) but decrypted when read via ORM
  5. Run pytest tests/ctl/models/test_comment.py tests/api/models/test_rbac.py::TestRBACScopeRegistrySync — all pass

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration This indicates that a change includes a database migration label to the entry if your change includes a DB migration
    • Add a high-risk This issue suggests changes that have a high-probability of breaking existing code label to the entry if your change includes a high-risk change
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • All UX related changes have been reviewed by a designer
    • No UX review needed
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required

JadeCara and others added 2 commits April 8, 2026 17:09
- Extend CommentType with message_to_subject and reply_from_subject
- Add CorrespondenceDeliveryStatus enum (pending/sent/delivered/bounced/failed)
- Add MessagingActionType.CORRESPONDENCE with default template
- Add EventAuditType correspondence events (string column, no migration)
- Register CORRESPONDENCE_SEND, CORRESPONDENCE_READ, NOTIFICATION_READ scopes
- Encrypt comment_text column using StringEncryptedType (AES-GCM)
- Migration: create PG enum, batch-encrypt existing rows, seed RBAC scopes
- Update TestRBACScopeRegistrySync to scan multiple migration files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown
Contributor

vercel bot commented Apr 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Actions Updated (UTC)
fides-plus-nightly Ignored Ignored Preview Apr 15, 2026 10:28pm
fides-privacy-center Ignored Ignored Apr 15, 2026 10:28pm

Request Review

JadeCara and others added 2 commits April 8, 2026 17:19
delivery_status uses varchar, not a PostgreSQL enum type.
Validation happens at the Python model layer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added CORRESPONDENCE default template, so the count increased by 1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.07%. Comparing base (c6fe17d) to head (249d24c).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7866   +/-   ##
=======================================
  Coverage   85.06%   85.07%           
=======================================
  Files         629      629           
  Lines       40860    40879   +19     
  Branches     4748     4748           
=======================================
+ Hits        34759    34776   +17     
- Misses       5029     5030    +1     
- Partials     1072     1073    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@JadeCara JadeCara added the db-migration This indicates that a change includes a database migration label Apr 9, 2026
@JadeCara JadeCara marked this pull request as ready for review April 9, 2026 20:04
@JadeCara JadeCara requested a review from a team as a code owner April 9, 2026 20:04
@JadeCara JadeCara requested review from johnewart and removed request for a team April 9, 2026 20:04
Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — ENG-3299: Correspondence Enums, Scopes & comment_text Encryption

Overall this is a solid foundational PR. The encryption approach is consistent with existing patterns in the codebase, the batched migration is correct, RBAC seeding follows the established pattern, and the TLA+ compliance note is appreciated. A few issues worth addressing before merge:

Must-fix

CORRESPONDENCE_SCOPES mixes namespaces and the test hardcodes its name
The migration dict CORRESPONDENCE_SCOPES contains notification:read, which belongs to a different scope namespace. More importantly, the test_rbac.py test hardcodes SCOPE_DICT_NAMES = {"SCOPE_DOCS", "CORRESPONDENCE_SCOPES"} — meaning if any future migration introduces a different dict name (e.g. REPORTING_SCOPES), its scopes will be silently missing from the enforcement check. The test's own comment says "dicts whose name ends with _SCOPES" — the implementation should match that intent. See inline comments.

Should-fix

CorrespondenceDeliveryStatus is dead code at this point
The class is defined in comment.py but the Comment model has no delivery_status column, and no module imports the class. If it's scaffolding for PR 2, a comment to that effect would help; otherwise it should be deferred to the PR that actually introduces the column.

Downgrade loop lacks logging
The upgrade() logs the number of encrypted rows but downgrade() doesn't. Parity would aid debugging during rollbacks.

Suggestions / observations

  • The owner-only RBAC assignment for correspondence:read and notification:read may be intentional, but is worth a quick confirmation — these could reasonably be readable by lower-privileged roles.
  • The encryption migration has no idempotency guard. Double-running (e.g. after a partial downgrade failure) would double-encrypt rows. A doc note or guard condition would reduce risk.

🔬 Codegraph: unavailable

💡 Write /code-review in a comment to re-run this review.

Comment thread tests/api/models/test_rbac.py Outdated
assert len(migration_scopes) > 0, "Could not parse SCOPE_DOCS from migration"
# Collect scope keys from all dicts whose name ends with _SCOPES
# or is SCOPE_DOCS across all migration files
SCOPE_DICT_NAMES = {"SCOPE_DOCS", "CORRESPONDENCE_SCOPES"}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests/api/models/test_rbac.py:555

The comment above this line says "Collect scope keys from all dicts whose name ends with _SCOPES or is SCOPE_DOCS" but the implementation uses a hardcoded set rather than a pattern match. When a future migration introduces a dict with a new name (e.g. REPORTING_SCOPES), the test will silently pass even though those scopes are missing from the assertion.

Suggest replacing the hardcoded set with a name-based match to make the test self-maintaining:

# Before
if target.id in SCOPE_DICT_NAMES:

# After
if target.id == "SCOPE_DOCS" or target.id.endswith("_SCOPES"):

This eliminates SCOPE_DICT_NAMES entirely and keeps the test consistent with its own documentation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — replaced the hardcoded SCOPE_DICT_NAMES set with target.id.endswith("_SCOPES") or target.id == "SCOPE_DOCS" pattern match. The test is now self-extending for future migrations.

Comment thread src/fides/api/models/comment.py
@JadeCara
Copy link
Copy Markdown
Contributor Author

JadeCara commented Apr 9, 2026

/code-review

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: ENG-3299 — Correspondence Enums, Scopes & comment_text Encryption

This PR introduces two new CommentType values, a CorrespondenceDeliveryStatus enum, a MessagingActionType.CORRESPONDENCE entry, new correspondence:* / notification:read scopes in the registry, and a migration that back-fills encryption on existing comment rows and seeds RBAC permissions.

The overall structure is clean, but the migration has two critical correctness issues around idempotency that could corrupt data in production or during re-runs.


Critical

1. Migration is not idempotent — re-running upgrade() double-encrypts all rows

The upgrade() loop calls enc.process_bind_param unconditionally on every comment_text value with no check for whether the row is already encrypted. A second run (CI retry, operator mistake, partial failure followed by retry) will encrypt ciphertext again, producing an unreadable value. The existing consent_encryption_migration.py pattern solves this with a trial-decrypt: attempt process_result_value first; skip on success, encrypt on failure. See the inline comment at line 59 for a concrete fix.

2. downgrade() crashes on plaintext rows

process_result_value raises InvalidCiphertextError on plaintext input. If downgrade() is called on a database where upgrade() never ran (fresh env, partial rollback), the decryption loop will throw and leave the table in a broken half-decrypted state. Same trial-decrypt guard fixes this.


Suggestions

3. No fast-fail for missing encryption key — The docstring notes FIDES__SECURITY__APP_ENCRYPTION_KEY as a PREREQ, but if it's absent the failure manifests deep in the crypto layer. An explicit assertion at the top of upgrade() would catch this immediately with a clear message.

4. ON CONFLICT DO NOTHING + unconditional downgrade delete — If the scope codes already exist in rbac_permission from another source, the INSERT silently no-ops, but downgrade() will still delete the pre-existing row. Consider querying after the insert to confirm ownership, or using DO UPDATE SET updated_at = now() to take ownership explicitly.

5. CorrespondenceDeliveryStatus is unused — Defined in comment.py but no column references it. If this is for a future column, a comment explaining the intent would help; if the column was accidentally omitted, it should be added here.

6. SCOPE_DICT_NAMES hardcoded in test{"SCOPE_DOCS", "CORRESPONDENCE_SCOPES"} must be manually updated every time a future migration seeds scopes under a different dict name, making it easy to accidentally break test coverage. A comment calling this out (or a more generic heuristic) would help.

7. Per-row UPDATE loop — 500 individual UPDATE ... WHERE id = :id statements per batch increases lock-holding time for large tables. Worth documenting expected duration for tables with many rows, even if changing the approach is out of scope.

🔬 Codegraph: unavailable


💡 Write /code-review in a comment to re-run this review.

Comment thread src/fides/api/models/comment.py
Comment thread tests/api/models/test_rbac.py Outdated
@JadeCara
Copy link
Copy Markdown
Contributor Author

JadeCara commented Apr 9, 2026

/code-review

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — ENG-3299: Correspondence enums, scopes, and comment_text encryption

Overall this is a clean, well-structured first PR for the correspondence feature. The encryption migration is safe (uses the established encrypted_type helper, batches correctly, handles upgrade/downgrade symmetrically), the scope registry additions follow existing conventions, and the test counts are updated correctly.

Summary of findings

Medium

  • SCOPE_DICT_NAMES hardcoding vs. comment mismatch (test_rbac.py:555) — The comment describes a pattern-based rule (endswith("_SCOPES")), but the implementation is an exact-match set. Every future scope-seeding migration must update this set or the enforcement test silently misses the new scopes. Aligning the implementation with the comment's stated intent would make the test self-extending.

Minor / questions

  • RBAC role assignment scope (migration:91) — All three new scopes are seeded only on the owner role. Confirm whether correspondence:read / notification:read should also be granted to lower-privilege standard roles (e.g. viewer), or add a note that PR 2 will revisit role assignments.
  • CorrespondenceDeliveryStatus is unused (comment.py:37) — Intentional for PR 1 of 2, but a short inline comment pointing to the follow-up PR/column would prevent confusion.
  • Per-row UPDATE round-trips in migration (migration:58) — Each encrypted row is updated individually. Fine for typical comment table sizes, but worth noting for large installs. Not a blocker.

What looks good

  • Batched encrypt/decrypt with stable ORDER BY id + OFFSET — correct and safe.
  • Downgrade removes rbac_role_permission rows before rbac_permission rows — correct FK ordering.
  • comment_text is nullable=False, so no null-guard needed in the migration loop.
  • SEND constant was already defined in scope_registry.py (line 64) — CORRESPONDENCE_SEND composes correctly.
  • MessagingActionType.CORRESPONDENCE default template follows the same __PLACEHOLDER__ convention used elsewhere.
  • Test counts bumped from 9 → 10 templates in all three save_defaults test cases.

🔬 Codegraph: connected (42425 nodes)


💡 Write /code-review in a comment to re-run this review.

Comment thread tests/api/models/test_rbac.py Outdated
Comment thread src/fides/api/models/comment.py Outdated
JadeCara and others added 2 commits April 9, 2026 15:51
- Add idempotency guards (trial-decrypt) to migration upgrade/downgrade
- Add row-count logging to downgrade for parity with upgrade
- Split CORRESPONDENCE_SCOPES into separate namespace dicts
- Replace hardcoded SCOPE_DICT_NAMES with endswith("_SCOPES") pattern match
- Add docstring to CorrespondenceDeliveryStatus referencing PR 2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread src/fides/api/models/messaging.py Outdated
@JadeCara JadeCara requested review from erosselli and removed request for johnewart April 13, 2026 21:23
- notes are internal comments.
- reply is reserved for future use and is not currently supported
- note: internal comments
- reply: reserved for future use
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the reply for?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reply pre-dates this PR — it's been a reserved CommentType value for threaded reply support that hasn't been implemented yet. It's only referenced in tests as fixture data; no route, service, or schema uses it in production code.

Implementing threaded replies is out of scope for this project, but the plumbing is already in place (parent_id FK, parent/replies relationships, index) so when the time comes it should be quick to wire up a route and service layer for it.

def _is_encrypted(enc, value, dialect):
"""Return True if value is already AES-GCM ciphertext."""
if value is None:
return False
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to consider null values unencrypted?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Null values are skipped — not treated as encrypted or unencrypted. Line 90 checks row.comment_text is None first and skips those rows entirely. The column is nullable=False on the model so nulls shouldn't exist in practice, but the guard handles edge cases (e.g. rows inserted by raw SQL). I can add an inline comment clarifying this if it reads ambiguously.

# Under normal Alembic operation the transaction guarantees all-or-nothing,
# but the guard protects against manual version-stamp edits or future
# refactors that add per-batch commits.
from fides.api.db.encryption_utils import encrypted_type, get_encryption_key
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do top-level imports or is the local import needed for some reason?

Copy link
Copy Markdown
Contributor Author

@JadeCara JadeCara Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops! I'll move them to the top of the file.

Comment on lines +80 to +83
text(
"SELECT id, comment_text FROM comment "
"ORDER BY id LIMIT :limit OFFSET :offset"
),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using offset can be problematic in large tables... I imagine this isn't a problem here since the number of comments is somewhat limited because people are writing them manually, but do we have an estimate of the number of rows in this table for some of the bigger clients in prod ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are manually written by operators on privacy requests — even the biggest clients have low thousands at most. The offset-based pagination is safe here because ORDER BY id is stable (we only update comment_text, never id), so rows don't shift between pages. The same LIMIT/OFFSET pattern is used in consent_encryption_migration.py. Batch size of 500 means even 10K rows is only 20 queries.

"Skipped {} already-encrypted or NULL comment_text rows", total_skipped
)

# 2. Seed RBAC permissions for new scopes
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

steps 2 and 3 seem unrelated from the encryption . maybe they should be two separate migrations? one to encrypt, one to do all the role/scope seeding

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call — I'll split this into two migrations: one for encrypting comment_text, one for RBAC scope seeding. Keeps each migration single-purpose and makes rollbacks cleaner.

JadeCara and others added 2 commits April 15, 2026 16:06
Address review feedback from erosselli:
- Split single migration into two: 6465b161a450 (encrypt comment_text)
  and f26bf4ecec76 (seed correspondence/notification RBAC scopes)
- Move fides.api.db.encryption_utils imports to top-level, matching
  the pattern used by every other migration in the codebase

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

db-migration This indicates that a change includes a database migration

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants