[C++] feat: add streaming Snappy codec using official framing format #49183
+615
−59
Summary
Implement streaming Snappy compressor/decompressor for Arrow C++ using the official Snappy framing format, including per-chunk masked CRC-32C verification, and enable the existing streaming tests for Snappy.
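For readers unfamiliar with the masked checksum mentioned above, here is a minimal sketch of how a masked CRC-32C can be computed. It is not the code from this PR: the function names `Crc32c`, `MaskCrc32c`, and `StoreLe32` are hypothetical, and the bitwise loop is a slow, dependency-free stand-in for whatever table-driven or hardware-accelerated routine the real helper uses. The masking step (rotate right by 15 bits, then add the constant `0xa282ead8`) follows the Snappy framing format description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
// Illustrative only: production code would use a lookup table or CPU instructions.
uint32_t Crc32c(const uint8_t* data, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit) {
      // Shift right; XOR in the polynomial if the bit shifted out was set.
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

// Masking as described by the Snappy framing format:
// rotate the checksum right by 15 bits, then add a fixed constant (mod 2^32).
uint32_t MaskCrc32c(uint32_t crc) {
  return ((crc >> 15) | (crc << 17)) + 0xA282EAD8u;
}

// The framing format stores the masked checksum as 4 little-endian bytes
// at the start of each compressed or uncompressed data chunk.
void StoreLe32(uint32_t value, uint8_t* out) {
  assert(out != nullptr);
  out[0] = static_cast<uint8_t>(value);
  out[1] = static_cast<uint8_t>(value >> 8);
  out[2] = static_cast<uint8_t>(value >> 16);
  out[3] = static_cast<uint8_t>(value >> 24);
}
```

The checksum is computed over the uncompressed payload of a chunk, so a decompressor can only verify it after decompressing that chunk. The standard CRC-32C check value for the ASCII string `123456789` is `0xE3069283`, which is a convenient sanity test for any implementation.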
Details
- Add a `crc32c_masked` helper in `arrow::util` to compute the masked CRC-32C checksum as defined by the Snappy framing specification.
- Add `crc32c.cc` and link it into the main util library.
- In `compression_snappy.cc`:
  - Keep `Codec::Compress`/`Decompress` based on raw Snappy bitstreams (`RawCompress`/`RawUncompress`).
  - Add `SnappyFramedCompressor`, which emits the official stream identifier chunk and splits the uncompressed stream into 64 KiB chunks, each wrapped as a framed chunk with a per-chunk masked CRC-32C checksum.
  - Add `SnappyFramedDecompressor` as a stateful parser for Snappy framed streams that validates the stream identifier, handles compressed, uncompressed, and skippable chunks, verifies the masked CRC-32C of the uncompressed payload, and supports incremental output via the `Decompress` API.
  - Wire `Codec::MakeCompressor`/`Codec::MakeDecompressor` for `Compression::SNAPPY` to the new framed implementations.
- Update the streaming tests in `compression_test.cc` so that they:
  - Run `CheckStreamingDecompressor` using the streaming compressor rather than one-shot compression.
  - Enable `StreamingCompressor`, `StreamingDecompressor`, `StreamingRoundtrip`, `StreamingDecompressorReuse`, and `StreamingMultiFlush`, so streaming tests now cover Snappy as well as the existing codecs.
Testing
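The chunk handling described in the details can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `ChunkKind`, `ClassifyChunk`, `AppendHeader`, `ParseHeader`, and `SplitSizes` are hypothetical names. The chunk type ranges and the 4-byte header layout (one type byte followed by a 24-bit little-endian length) are taken from the Snappy framing format description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Maximum uncompressed payload per chunk in the framing format (64 KiB).
constexpr size_t kMaxUncompressedChunk = 65536;

enum class ChunkKind {
  kCompressed,        // 0x00
  kUncompressed,      // 0x01
  kUnskippable,       // 0x02-0x7f, reserved: a decoder must fail
  kSkippable,         // 0x80-0xfd, reserved: a decoder must skip
  kPadding,           // 0xfe
  kStreamIdentifier,  // 0xff
};

ChunkKind ClassifyChunk(uint8_t type) {
  if (type == 0x00) return ChunkKind::kCompressed;
  if (type == 0x01) return ChunkKind::kUncompressed;
  if (type == 0xFF) return ChunkKind::kStreamIdentifier;
  if (type == 0xFE) return ChunkKind::kPadding;
  if (type >= 0x80) return ChunkKind::kSkippable;
  return ChunkKind::kUnskippable;
}

struct ChunkHeader {
  uint8_t type;
  uint32_t length;  // length of the chunk data that follows the header
};

// Append a 4-byte chunk header: type byte, then 24-bit little-endian length.
void AppendHeader(uint8_t type, uint32_t length, std::vector<uint8_t>* out) {
  assert(length < (1u << 24));
  out->push_back(type);
  out->push_back(static_cast<uint8_t>(length));
  out->push_back(static_cast<uint8_t>(length >> 8));
  out->push_back(static_cast<uint8_t>(length >> 16));
}

// Returns false when fewer than 4 bytes are buffered, which is how a stateful
// parser would signal "need more input" before consuming anything.
bool ParseHeader(const uint8_t* data, size_t len, ChunkHeader* header) {
  if (len < 4) return false;
  header->type = data[0];
  header->length = static_cast<uint32_t>(data[1]) |
                   (static_cast<uint32_t>(data[2]) << 8) |
                   (static_cast<uint32_t>(data[3]) << 16);
  return true;
}

// Split `total` input bytes into per-chunk payload sizes of at most 64 KiB.
std::vector<size_t> SplitSizes(size_t total) {
  std::vector<size_t> sizes;
  while (total > 0) {
    size_t n = total < kMaxUncompressedChunk ? total : kMaxUncompressedChunk;
    sizes.push_back(n);
    total -= n;
  }
  return sizes;
}
```

For compressed and uncompressed data chunks, the length in the header covers the 4-byte masked checksum plus the payload. A stream starts with the stream identifier chunk, whose header is `ff 06 00 00` followed by the six bytes `sNaPpY`.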
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?