Comprehensive consistency test suite for TieredStorage #363
Draft
Ref FS-236
Formalizes the TieredStorage consistency invariants and adds structured tests that prove they hold under normal operation and document where they break under failure, pod termination, and concurrency.
Extract insert_small/insert_large, make_failing_storage, payload constants, check_invariants_core, and SyncBackend builder to eliminate repeated boilerplate across 34 tests. No coverage changes.
Add three chaos fuzz tests that run concurrent operations against TieredStorage with a ChaosBackend (yield-based interleaving + error injection) and assert that the known invariant violations occur:
- concurrent_insert_large_insert_small: DualData from racing inserts
- concurrent_insert_delete_from_large_state: OrphanLT + DualData
- concurrent_inserts_with_tombstone_write_errors: DualData from tombstone write failure + cleanup failure
Each test documents a gap in the current algorithm. When the algorithm is hardened, flip the assertions to verify the violations are gone.
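The error-injection half of such a backend can be sketched without an async runtime. This is an illustration only: `FlakyStore`, `Injected`, and the scripted failure schedule are hypothetical names, not the PR's `ChaosBackend`, and yield-based interleaving is left out entirely.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical in-memory store that fails calls according to a script.
struct FlakyStore {
    data: HashMap<String, Vec<u8>>,
    /// One entry per upcoming call: `true` means "inject an error".
    script: VecDeque<bool>,
}

/// Marker error returned for an injected failure.
#[derive(Debug, PartialEq)]
struct Injected;

impl FlakyStore {
    fn new(script: &[bool]) -> Self {
        Self {
            data: HashMap::new(),
            script: script.iter().copied().collect(),
        }
    }

    /// Consumes the next script entry; calls beyond the script succeed.
    fn maybe_fail(&mut self) -> Result<(), Injected> {
        if self.script.pop_front().unwrap_or(false) {
            Err(Injected)
        } else {
            Ok(())
        }
    }

    /// Stores `value` unless a failure is injected for this call.
    fn put(&mut self, key: &str, value: Vec<u8>) -> Result<(), Injected> {
        self.maybe_fail()?;
        self.data.insert(key.to_string(), value);
        Ok(())
    }

    /// Removes `key` unless a failure is injected for this call.
    fn delete(&mut self, key: &str) -> Result<(), Injected> {
        self.maybe_fail()?;
        self.data.remove(key);
        Ok(())
    }
}
```

A fixed script keeps each failing run reproducible. A fuzz test could generate the script from a seeded random source instead.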
jan-auer added a commit that referenced this pull request on Mar 20, 2026
Before this PR, the tiered storage implementation used unconditional
writes and stored all objects under the same key as their tombstone.
This could lead to lost updates and orphans on concurrent writes and
deletes. This PR adds a unique revision to the key path of each
large-object and uses check-and-set that commits only if the stored
tombstone's revision still matches the last known state. Together, these
provide atomic commit points for all three operations:
- Replaces `create_tombstone` with a single `compare_and_write` method
on `HighVolumeBackend`. The method takes a `TieredWrite` (Tombstone,
Object, or Delete) and an optional expected redirect target, and applies
the mutation only if the current row state matches.
- **Large-object writes** now store the payload at a unique revision key
(`{key}/{uuid_v7}`) so each write gets a distinct storage path. A
`get_tiered_metadata` read establishes the CAS precondition before
writing to GCS, and a subsequent `compare_and_write` atomically commits
the tombstone. CAS conflicts clean up the new GCS blob; CAS errors do
the same then propagate.
- **Small-object writes** that encounter an existing tombstone now
CAS-swap it for inline data rather than routing to LT. This fixes the
expiry-mismatch TODO and keeps small objects in HV.
- **Deletes** remove the tombstone first (commit point), then clean up
GCS best-effort. This is the inverse of the previous ordering: if GCS
cleanup fails an orphan blob remains (accepted), but the tombstone is
gone and the object is unreachable.
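
The commit protocol in the list above can be sketched with a single-threaded, in-memory model. Everything in the sketch is a simplifying assumption: `Row`, `HvModel`, the reduced `TieredWrite` shape, the plain string revision standing in for the UUID, and the `bool` return value are hypothetical, and the real `compare_and_write` signature on `HighVolumeBackend` is not shown in this description.

```rust
use std::collections::HashMap;

/// Hypothetical model of a row in the high-volume (HV) backend.
#[derive(Clone, Debug, PartialEq)]
enum Row {
    /// Small object stored inline.
    Inline(Vec<u8>),
    /// Tombstone redirecting to a revision key in long-term (LT) storage.
    Tombstone { target: String },
}

/// Simplified stand-in for the `TieredWrite` mutation.
#[derive(Clone, Debug)]
enum TieredWrite {
    Tombstone { target: String },
    Object(Vec<u8>),
    Delete,
}

#[derive(Default)]
struct HvModel {
    rows: HashMap<String, Row>,
}

impl HvModel {
    /// Current redirect target of `key`, if the row is a tombstone.
    fn redirect_target(&self, key: &str) -> Option<String> {
        match self.rows.get(key) {
            Some(Row::Tombstone { target }) => Some(target.clone()),
            _ => None,
        }
    }

    /// Applies `write` only if the row's redirect target equals `expected`.
    /// Returns `true` when committed and `false` on a CAS conflict.
    fn compare_and_write(&mut self, key: &str, expected: Option<&str>, write: TieredWrite) -> bool {
        if self.redirect_target(key).as_deref() != expected {
            return false;
        }
        match write {
            TieredWrite::Tombstone { target } => {
                self.rows.insert(key.to_string(), Row::Tombstone { target });
            }
            TieredWrite::Object(data) => {
                self.rows.insert(key.to_string(), Row::Inline(data));
            }
            TieredWrite::Delete => {
                self.rows.remove(key);
            }
        }
        true
    }
}

/// Large-object write: upload the blob under a unique revision key, then
/// commit the tombstone via CAS against a precondition read earlier.
/// On conflict, the freshly uploaded blob is removed again.
fn insert_large(
    hv: &mut HvModel,
    lt: &mut HashMap<String, Vec<u8>>,
    key: &str,
    revision: &str,
    payload: Vec<u8>,
    expected: Option<&str>,
) -> bool {
    let target = format!("{key}/{revision}");
    lt.insert(target.clone(), payload);
    let write = TieredWrite::Tombstone { target: target.clone() };
    let committed = hv.compare_and_write(key, expected, write);
    if !committed {
        lt.remove(&target);
    }
    committed
}
```

Passing `expected` in from the caller makes it possible to simulate a stale precondition: a second `insert_large` that still expects no tombstone after another writer has committed returns `false` and leaves no blob of its own behind. Cleanup of a replaced revision's blob is outside this sketch.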
Tests have been reorganized for readability. There are no tests covering
races and extreme edge cases. These will be added in #363.
---------
Co-authored-by: Claude <noreply@anthropic.com>
Formalizes three consistency invariants for `TieredStorage` and adds a structured test suite that proves they hold under normal operation and documents where they break.
Invariants
Testing strategy
Tests are organized into five categories, all using mocked backends (no real GCS/BigTable):
- `check_invariants` passes after each failure (state unchanged or safely degraded)
- `SyncBackend` + timeout, proving OrphanLT occurs on insert kill and OrphanTombstone (safe) on delete kill
- `Notify`-based sync hooks, proving insert+insert and insert+delete produce invariant violations
- `assert_consistent` after every operation
Known violations
The tests for all four known violations run `check_invariants` and assert that it returns `Err`.
When a fix lands, `check_invariants` will return `Ok`, the `unwrap_err()` will panic, and the test must be updated to `assert_consistent`, making fixes self-enforcing.
Ref FS-236
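
The self-enforcing pattern can be sketched as follows. Only the names `check_invariants`, `assert_consistent`, and OrphanLT come from this description. The `Snapshot` type, the `expect_known_violation` helper, and the definition of OrphanLT used here (an LT blob that no tombstone points to) are assumptions made for illustration.

```rust
/// The violation name comes from the PR text; its definition here is a guess.
#[derive(Debug, PartialEq)]
enum Violation {
    OrphanLT,
}

/// Hypothetical snapshot of one key across both tiers.
struct Snapshot {
    /// Revision key the tombstone redirects to, if a tombstone exists.
    tombstone_target: Option<String>,
    /// Revision keys currently present in long-term storage.
    lt_blobs: Vec<String>,
}

/// Returns `Err` if any LT blob is not referenced by the tombstone.
fn check_invariants(s: &Snapshot) -> Result<(), Violation> {
    let orphaned = s
        .lt_blobs
        .iter()
        .any(|blob| Some(blob) != s.tombstone_target.as_ref());
    if orphaned {
        return Err(Violation::OrphanLT);
    }
    Ok(())
}

/// Used by tests that expect a consistent state.
fn assert_consistent(s: &Snapshot) {
    check_invariants(s).expect("invariants should hold");
}

/// Used by known-violation tests: `unwrap_err` panics as soon as the state
/// becomes consistent, which forces the test to be rewritten once a fix lands.
fn expect_known_violation(s: &Snapshot) -> Violation {
    check_invariants(s).unwrap_err()
}
```

A known-violation test would call `expect_known_violation` on the state left behind by the racy scenario and compare the result against the documented violation. After a fix, the same call panics and the test switches to `assert_consistent`.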