Skip to content

fix: injective hash for trie value nodes (prevent Poseidon hash collisions)#512

Open
n13 wants to merge 3 commits intofix_genesis_accountsfrom
fix_value_hashing_in_trie_db
Open

fix: injective hash for trie value nodes (prevent Poseidon hash collisions)#512
n13 wants to merge 3 commits intofix_genesis_accountsfrom
fix_value_hashing_in_trie_db

Conversation

@n13
Copy link
Copy Markdown
Collaborator

@n13 n13 commented Apr 10, 2026

Summary

  • Fix non-injective value hashing in the trie: PoseidonHasher::hash() uses bytes_to_felts_compact, which is non-injective for variable-length data — distinct byte sequences can produce identical hashes. With MAX_INLINE_VALUE = 0 (all values hashed externally), this can silently corrupt storage by making different values look identical in the trie.
  • Introduce TrieLayout::hash_value() — a new trait method that separates value hashing from node hashing. The default falls back to H::hash(), but LayoutV0/LayoutV1 override it to use qp_poseidon_core::hash_bytes, which is injective (length-prefixed encoding). Trie node hashing remains unchanged (compact, ZK-friendly).
  • Harden MemoryDB against null-node collisions: the wrapper now includes a forced fallback map so values matching memory-db's internal null_node_data sentinel are never silently dropped on emplace.

What changed and why

1. TrieLayout::hash_value() — new trait method (trie-db/src/lib.rs)

Added to the TrieLayout trait with a default implementation (H::hash(value)). This gives each layout the ability to override how values are hashed without affecting how trie nodes are hashed. The default preserves backward compatibility for any layout that doesn't override it.

2. All value-hashing call sites updated (trie-db)

File Change
triedbmut.rs (2 sites) NodeToEncode::Node handling now calls L::hash_value(value) + db.emplace() instead of db.insert()
iter_build.rs (4 sites) All process_inner_hashed_value implementations (TrieBuilder, TrieRoot, TrieRootPrint, TrieRootUnhashed) call T::hash_value()
node.rs Value::to_owned_value uses L::hash_value() for cache consistency

3. LayoutV0 / LayoutV1 override (sp-trie/src/lib.rs)

Both layouts override hash_value() to call injective_value_hash::<H>(), which uses qp_poseidon_core::hash_bytes — an injective hash function that length-prefixes the input before hashing into field elements.

The trie_root / trie_root_unhashed implementations were also updated to use trie_db::trie_visit (which respects L::hash_value()) instead of the external trie-root crate functions (which hardcode H::hash()).

4. MemoryDB wrapper hardened (sp-trie/src/lib.rs)

memory_db::MemoryDB::emplace() silently drops values that exactly match its null_node_data sentinel (default: [0u8; 8]). When L::hash_value() produces a key different from H::hash(), the inner DB's contains check can fail even after a successful emplace. The wrapper now stores such values in a forced: HashMap<H::Out, DBValue> fallback that get/contains check as a last resort.

5. StorageProof dual-insertion (sp-trie/src/storage_proof.rs)

When deserializing a StorageProof into a MemoryDB, each node is now inserted under both H::hash(node) (for standard node lookups) and injective_value_hash(node) (for value lookups). This ensures reconstructed DBs can satisfy both access patterns.

Test changes

Modified test assertions

File Test Change
recorder.rs recorder_transaction_accessed_keys_works Blake2Hasher::hash(value) assertions changed to Layout::hash_value(value) — the trie now stores the injective hash, so get_hash() returns the new hash. Added use trie_db::TrieLayout import.
lib.rs test_realistic_chain_storage Switched from raw MemoryDBMeta to the MemoryDB wrapper — this test inserts vec![0x00; 8] which collides with null_node_data, so it needs the forced fallback.

Test-only layout (ForceHashedValuesLayoutV1)

Also overrides hash_value() with injective_value_hash and removes its custom trie_root/trie_root_unhashed overrides (now inherits defaults that correctly use trie_visit).

Test results

  • sp-trie: 61 passed, 0 failed, 1 ignored
  • trie-db: 14 passed, 0 failed

Known remaining work

  • sp-state-machine: 6 tests fail due to hardcoded expected root hashes and proof sizes that changed with the new value hash. These need their expected values updated (the logic is correct, only the golden values are stale).

PoseidonHasher::hash() uses bytes_to_felts_compact which is non-injective
for variable-length data — distinct values can produce the same hash.
With MAX_INLINE_VALUE=0, all trie leaf values are hashed externally,
so this non-injectivity can cause silent data corruption.

Introduce TrieLayout::hash_value() with a default that falls back to
H::hash(), overridden in LayoutV0/V1 to use qp_poseidon_core::hash_bytes
(injective, length-prefixed encoding). All trie-internal sites that hash
values (triedbmut commit, iter_build visitors, node cache) now call
L::hash_value() instead of H::hash() directly.

MemoryDB wrapper gains a forced-entry fallback map so values that collide
with memory-db's null_node_data are never silently dropped.

StorageProof deserialization dual-inserts each node under both the
standard H::hash key and the injective value key.
@n13 n13 changed the base branch from main to fix_genesis_accounts April 10, 2026 18:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant