GH-48277: [C++][Parquet] unpack with shuffle algorithm #47994
Open: AntoinePrv wants to merge 81 commits into apache:main from AntoinePrv:new-bpacking
+2,020 −40,966
81 commits
f36e6bd Add SSE4.2 implementation (AntoinePrv)
d019318 Add unpack uint8_t benchmark (AntoinePrv)
fdfd354 Add bool unpack benchmark (AntoinePrv)
6826466 Bias benchmarks toward small scale (AntoinePrv)
a14a070 Add Kernel plan builder (AntoinePrv)
4300cc1 Add simd kernel (AntoinePrv)
f38f774 Handle rshifts on SSE2 (AntoinePrv)
8b52a54 Use new kernel when possible in generated 128 code (AntoinePrv)
a992186 Refactor array to xsimd::batch_constant (AntoinePrv)
3e3d2fa Refactor right shift (AntoinePrv)
77b118d Add oversized plan (AntoinePrv)
524ac1b Add oversized kernel (AntoinePrv)
454decc Rename kernels (AntoinePrv)
fd3ae27 Add simd kernel dispatch (AntoinePrv)
2525e4b Call Simd kernel directly (AntoinePrv)
05f6e7c Fix SIMD level None (AntoinePrv)
d0d9064 Initialize swizzles to -1 (AntoinePrv)
f16708a Doc (AntoinePrv)
de6baeb Improve test error message (AntoinePrv)
e30aebe Use new kernel in avx2 (AntoinePrv)
f607991 AVX2 swizzle fallback (AntoinePrv)
2210fb4 Remove dead code (AntoinePrv)
1dc32a7 Simplify Large masks (AntoinePrv)
e02a74a Remove bpacking 256 generated file (AntoinePrv)
91bf34a Remove uint8_t fallback (AntoinePrv)
a939f29 Add boolean simd implementation (AntoinePrv)
77fb735 Use std::is_base_of for arch detection (AntoinePrv)
51ce7d6 Improve swizzle (AntoinePrv)
f23cd66 Only use lshift hack when available (AntoinePrv)
d38df81 Fix return type (AntoinePrv)
415ebaa Fix shift included size (AntoinePrv)
cf6b56d Add Avx2 uint16_t shift fallback (AntoinePrv)
99301c5 Refactor make_mult (AntoinePrv)
3168e1c Add Avx2 lshift unint8_t fallback (AntoinePrv)
9db46a6 Refactor right shift excess (AntoinePrv)
e2c7367 Refactor make_mult (AntoinePrv)
5db07c7 Add SSE var shift uint8_t fallback to uint16_t (AntoinePrv)
d5b9eca Implement size reading reduction (AntoinePrv)
eb8cec0 Add fallback Avx2 right shift (AntoinePrv)
b91c087 Refactor static dispatch (AntoinePrv)
be9abd3 Forward oversized to larger uint when possible (AntoinePrv)
1551710 Add arch detection functions (AntoinePrv)
32335ee Refactor traits usage (AntoinePrv)
11f79b4 Forward x86_64 unpack64 to unpack32 (AntoinePrv)
2954c29 Simplify template usage (AntoinePrv)
e3744cd Reorganize and doc (AntoinePrv)
c2fc546 Refactor KernelDispatch and remove Oversized dispatch (AntoinePrv)
16110cb Forward large unpack8 to unpack16 on SSE2 (AntoinePrv)
96410eb Use fallback right shift on large uint8_t avx2 (AntoinePrv)
4102591 Fix enable_if (AntoinePrv)
960bd9c Add missing header (AntoinePrv)
d28d015 fmt (AntoinePrv)
4962eee Add SSE4.2 to dynamic dispatch (AntoinePrv)
0a0b314 Rename bpacking_simd_impl > bpacking_simd_kernel (AntoinePrv)
4a07ab0 Restore modifications to simd_codegen (AntoinePrv)
68cddfb Reduce reading size and declare bytes read (AntoinePrv)
227b776 Add kBytesRead to scalar code (AntoinePrv)
17a7231 Add kBytesRead to simd 512 generated code (AntoinePrv)
c51879e Prevent overreading (AntoinePrv)
3e86901 Fix pessimit overeading guard (AntoinePrv)
6a61a87 Fix overreading guard comparison (AntoinePrv)
5a000fc Add UnpackOptions and max_read_bytes (AntoinePrv)
e6e097a Use C++20 NTTP (AntoinePrv)
743577f xsimd 14.0 compatibility (AntoinePrv)
cecd14f fmt (AntoinePrv)
c0ee9d5 C++20 NTTP options (AntoinePrv)
1cff8bc Homogenous wording (AntoinePrv)
57f278b Remove xsimd backward compatibility (AntoinePrv)
43c8694 Apply doc fixes from code review (AntoinePrv)
251437e Documentation and code improvements (AntoinePrv)
5ed131a Move utilities into bpacking sub ns (AntoinePrv)
f64ad1d Refactor plan builders (AntoinePrv)
8ea86be Move utilities (AntoinePrv)
e68f936 Kernel documentation (AntoinePrv)
3c27968 adjust_bytes_per_read doc (AntoinePrv)
d050ee1 Fewer typename (AntoinePrv)
1329b39 Add documentation (AntoinePrv)
19a32e3 Fix bounds in plan builders (AntoinePrv)
b431b19 Change names (AntoinePrv)
22dff86 Add extra comments (AntoinePrv)
b638570 Fix comments (AntoinePrv)
I'm curious, why not put all the unpack-related APIs inside `arrow::internal::bpacking` as well? Does it cause too much code churn, or would it fail for other reasons?
No reason, anything works really. My reasoning was that `unpack` is a "library-public" utility function, so it lives in `arrow::internal`, while `arrow::internal::bpacking` is "private" to the `unpack` function. Does that make sense?
Kind of, though we might want to revisit later anyway. Not necessary for this PR in any case!
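For readers following the namespace discussion above, here is a minimal sketch of the layout being described: a "library-public" entry point in an outer internal namespace, with its helpers kept in a nested `bpacking` namespace. Everything in it is an assumption made for illustration. The `sketch::internal` namespace stands in for `arrow::internal`, and the `unpack` signature, the `ExtractBits` helper, and the scalar LSB-first loop are hypothetical; they are not the PR's actual API or its SIMD kernels.

```cpp
#include <cstddef>
#include <cstdint>

// `sketch::internal` is a stand-in for `arrow::internal` (hypothetical layout).
namespace sketch::internal {

// Implementation details, "private" to unpack by convention.
namespace bpacking {

// Hypothetical helper: read one `bit_width`-bit value that starts at
// `bit_offset`, with bits packed least-significant-bit first.
inline std::uint32_t ExtractBits(const std::uint8_t* in, std::size_t bit_offset,
                                 int bit_width) {
  std::uint32_t value = 0;
  for (int i = 0; i < bit_width; ++i) {
    const std::size_t bit = bit_offset + static_cast<std::size_t>(i);
    const std::uint32_t b = (in[bit / 8] >> (bit % 8)) & 1u;
    value |= b << i;
  }
  return value;
}

}  // namespace bpacking

// "Library-public" entry point: callers only see this function.
// Hypothetical signature, not the one used in the PR.
inline void unpack(const std::uint8_t* in, std::uint32_t* out, int num_values,
                   int bit_width) {
  for (int i = 0; i < num_values; ++i) {
    out[i] = bpacking::ExtractBits(
        in, static_cast<std::size_t>(i) * static_cast<std::size_t>(bit_width),
        bit_width);
  }
}

}  // namespace sketch::internal
```

With this layout, callers write `sketch::internal::unpack(in, out, n, width)` and never name anything in `bpacking` directly, which is the separation the author describes. Moving `unpack` into the nested namespace, as the reviewer suggests, would only change the qualified name at call sites.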