Open
Labels
enhancement (New feature or request), gpu (CuPy / CUDA GPU support)
Description
Author of Proposal: @brendan
Reason or problem
nvCOMP ships batch LZ4 compress/decompress, but the geotiff package doesn't use it. LZ4 decompresses 2-4x faster than deflate at the cost of lower compression ratios. That tradeoff is worth it for Dask chunked reads where decompression speed matters more than file size.
GDAL uses TIFF tag 50004 for LZ4-compressed GeoTIFFs. We don't support it yet.
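To make the detection side concrete, here is a minimal sketch of how a reader could spot an LZ4 file. It assumes 50004 is stored as the value of the standard TIFF Compression field (tag 259), which is what the COMPRESSION_LZ4 constant in the design below suggests. The helper name read_compression is hypothetical, and the sketch only handles classic TIFF, not BigTIFF.

```python
import struct

COMPRESSION_TAG_ID = 259   # standard TIFF Compression field
COMPRESSION_LZ4 = 50004    # value taken from this proposal


def read_compression(buf: bytes) -> int:
    """Return the Compression value from the first IFD of a classic TIFF."""
    order = {b"II": "<", b"MM": ">"}.get(buf[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(order + "HI", buf[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled in this sketch")
    (n_entries,) = struct.unpack(order + "H", buf[ifd_offset:ifd_offset + 2])
    for i in range(n_entries):
        start = ifd_offset + 2 + 12 * i
        tag, _typ, _count = struct.unpack(order + "HHI", buf[start:start + 8])
        if tag == COMPRESSION_TAG_ID:
            # Compression is a single SHORT stored inline in the value field.
            (value,) = struct.unpack(order + "H", buf[start + 8:start + 10])
            return value
    return 1  # TIFF default when the field is absent: no compression
```

A reader could compare the returned value against COMPRESSION_LZ4 before choosing a decoder.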
Proposal
Design:
- _compression.py: COMPRESSION_LZ4 = 50004, CPU decompress/compress via lz4.frame (from python-lz4), with the usual LZ4_AVAILABLE flag
- _gpu_decode.py: Wire nvcompBatchedLZ4DecompressAsync / nvcompBatchedLZ4CompressAsync into the existing nvCOMP ctypes code. Just another elif next to deflate and ZSTD.
- _writer.py: Add 'lz4' to _compression_tag()
- Hook into gpu_decode_tiles() and gpu_compress_tiles()
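The CPU side of the first bullet could look roughly like the sketch below. COMPRESSION_LZ4 and LZ4_AVAILABLE come from the design; the function names lz4_compress, lz4_decompress, and _require_lz4 are placeholders, not the package's actual API. It uses the frame format via lz4.frame, as the design describes.

```python
try:
    import lz4.frame  # provided by the optional python-lz4 package
    LZ4_AVAILABLE = True
except ImportError:
    LZ4_AVAILABLE = False

COMPRESSION_LZ4 = 50004  # value from this proposal


def _require_lz4() -> None:
    # Fail with a clear message instead of a NameError when lz4 is missing.
    if not LZ4_AVAILABLE:
        raise ImportError(
            "LZ4 support requires the optional 'lz4' package (pip install lz4)"
        )


def lz4_compress(data: bytes) -> bytes:
    """Compress one tile or strip into an LZ4 frame."""
    _require_lz4()
    return lz4.frame.compress(data)


def lz4_decompress(data: bytes) -> bytes:
    """Decompress one LZ4 frame back into raw tile bytes."""
    _require_lz4()
    return lz4.frame.decompress(data)
```

Keeping the import guarded means the package still imports cleanly without python-lz4 and only fails when an LZ4 file is actually read or written.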
The nvCOMP batch API for LZ4 uses the same calling convention as deflate/ZSTD, so this is mostly copy-paste with different function names.
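As an illustration of how small the GPU change could be, the sketch below resolves the nvCOMP entry points by compression value. It uses a lookup table rather than the elif chain described above, purely to keep the example short. The function nvcomp_functions and the table name are hypothetical, the lib argument is assumed to be a ctypes handle to the nvCOMP shared library, and the deflate and ZSTD values (8 and 50000) are the usual TIFF codes, not taken from the existing code.

```python
COMPRESSION_DEFLATE = 8      # Adobe deflate (assumed to match the existing code)
COMPRESSION_ZSTD = 50000     # assumed to match the existing code
COMPRESSION_LZ4 = 50004      # value from this proposal

# (decompress symbol, compress symbol) in the nvCOMP batched C API
_NVCOMP_SYMBOLS = {
    COMPRESSION_DEFLATE: (
        "nvcompBatchedDeflateDecompressAsync",
        "nvcompBatchedDeflateCompressAsync",
    ),
    COMPRESSION_ZSTD: (
        "nvcompBatchedZstdDecompressAsync",
        "nvcompBatchedZstdCompressAsync",
    ),
    COMPRESSION_LZ4: (
        "nvcompBatchedLZ4DecompressAsync",
        "nvcompBatchedLZ4CompressAsync",
    ),
}


def nvcomp_functions(lib, compression: int):
    """Return (decompress_fn, compress_fn) looked up on a ctypes library handle."""
    try:
        decode_name, encode_name = _NVCOMP_SYMBOLS[compression]
    except KeyError:
        raise ValueError(f"no nvCOMP codec for compression {compression}") from None
    # ctypes resolves exported symbols via attribute access on the handle.
    return getattr(lib, decode_name), getattr(lib, encode_name)
```

The argument types and temp-buffer sizing calls still have to be set up per codec; only the symbol lookup is shown here.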
Usage:
write_geotiff(data, "fast.tif", compression="lz4")
da = read_geotiff("fast.tif")
Stakeholders and impacts
Users with large rasters who want fast reads over small files. Useful for Dask workflows where tiles get decompressed on every chunk access. Additive, nothing existing changes.
Drawbacks
- Lower compression ratio than deflate/ZSTD
- Tag 50004 is a GDAL extension, not baseline TIFF. Files won't open in every TIFF reader.
- python-lz4 is another optional dependency
Alternatives
- ZSTD at a low compression level is a decent middle ground
- Uncompressed is the fastest read but wastes disk
Unresolved questions
- LZ4 frame format vs block format (GDAL uses frame for tag 50004)
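One way to investigate the frame vs block question on real files is to check the first four bytes of a compressed tile. An LZ4 frame starts with the magic number 0x184D2204 stored little-endian, while a raw block has no header. The helper below is a hypothetical diagnostic, not part of the proposed API, and it ignores LZ4 skippable frames.

```python
LZ4_FRAME_MAGIC = b"\x04\x22\x4d\x18"  # 0x184D2204, little-endian


def is_lz4_frame(tile: bytes) -> bool:
    """Return True if the buffer begins with the LZ4 frame magic number."""
    return tile[:4] == LZ4_FRAME_MAGIC
```

Running this over the tile payloads of a sample file would show which format the writer actually produced. This matters for the GPU path, since the CPU design uses lz4.frame and the two sides need to agree on the byte layout.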