
[vibebench] Optimize primitive group-by intern with batch hashing and local dedup #21042

Closed
Dandandan wants to merge 1 commit into apache:main from Dandandan:optimize-primitive-intern


@Dandandan
Contributor

Summary

  • Use `with_hashes` for batch hash computation via a thread-local buffer, separating hashing from hash table operations for better vectorization/pipelining
  • Process 4 rows at a time via `chunks_exact(4)`, with local dedup within each chunk to reduce redundant hash table operations
  • Split hash table operations into `find` + `insert_unique` phases (lighter than `entry`, which prepares an insertion slot even on a hit)
  • Extract `find_group`, `insert_new_group`, and `get_or_create_null_group` helpers to consolidate unsafe hash table logic with SAFETY comments
  • Separate null/no-null fast paths to eliminate validity checks when no nulls are present
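
As a rough illustration of the first three points, here is a minimal, std-only sketch. It is not the code from this PR: `Groups`, `hash_i64`, `get_or_insert`, and `intern` are hypothetical names, the toy open-addressing table stands in for the real hash table, and a caller-supplied `hashes` buffer stands in for the thread-local buffer that `with_hashes` is described as using. It only shows the shape of the idea: hash the whole batch first, then walk it in chunks of 4, reusing group ids for repeated values inside a chunk and probing with a read-only `find` before falling back to `insert_unique`.

```rust
type Slot = Option<(u64, usize)>; // (precomputed hash, group id)

/// Toy open-addressing table (hypothetical stand-in, for illustration only).
struct Groups {
    slots: Vec<Slot>,
    keys: Vec<i64>, // group id -> key value
}

impl Groups {
    fn with_capacity(cap: usize) -> Self {
        Self { slots: vec![None; cap.next_power_of_two().max(8)], keys: Vec::new() }
    }

    /// Read-only probe: returns the group id if `key` is already present.
    fn find(&self, hash: u64, key: i64) -> Option<usize> {
        let mask = self.slots.len() - 1;
        let mut i = (hash as usize) & mask;
        while let Some((h, g)) = self.slots[i] {
            if h == hash && self.keys[g] == key {
                return Some(g);
            }
            i = (i + 1) & mask;
        }
        None
    }

    /// Insert a key known to be absent and return its new group id.
    fn insert_unique(&mut self, hash: u64, key: i64) -> usize {
        if (self.keys.len() + 1) * 2 > self.slots.len() {
            // Grow so the load factor stays at or below 50%.
            let mut bigger: Vec<Slot> = vec![None; self.slots.len() * 2];
            for &(h, g) in self.slots.iter().flatten() {
                Self::place(&mut bigger, h, g);
            }
            self.slots = bigger;
        }
        let g = self.keys.len();
        self.keys.push(key);
        Self::place(&mut self.slots, hash, g);
        g
    }

    /// Linear-probe to the first empty slot and store the entry there.
    fn place(slots: &mut [Slot], hash: u64, g: usize) {
        let mask = slots.len() - 1;
        let mut i = (hash as usize) & mask;
        while slots[i].is_some() {
            i = (i + 1) & mask;
        }
        slots[i] = Some((hash, g));
    }

    /// Two-phase lookup: cheap `find` first, `insert_unique` only on a miss.
    fn get_or_insert(&mut self, hash: u64, key: i64) -> usize {
        match self.find(hash, key) {
            Some(g) => g,
            None => self.insert_unique(hash, key),
        }
    }
}

/// Simple multiplicative hash (an assumption; any hash function would do).
fn hash_i64(v: i64) -> u64 {
    (v as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15).rotate_left(31)
}

/// Map each value to a group id, reusing `hashes` as a scratch buffer.
fn intern(groups: &mut Groups, values: &[i64], hashes: &mut Vec<u64>) -> Vec<usize> {
    // Phase 1: hash the whole batch up front, with no table access in the loop.
    hashes.clear();
    hashes.extend(values.iter().map(|&v| hash_i64(v)));

    // Phase 2: table operations, 4 rows at a time.
    let mut out = Vec::with_capacity(values.len());
    let mut vals = values.chunks_exact(4);
    let mut hs = hashes.chunks_exact(4);
    for (v, h) in vals.by_ref().zip(hs.by_ref()) {
        let mut ids = [0usize; 4];
        for i in 0..4 {
            // Local dedup: an earlier equal value in this chunk already has an id.
            ids[i] = match (0..i).find(|&j| v[j] == v[i]) {
                Some(j) => ids[j],
                None => groups.get_or_insert(h[i], v[i]),
            };
        }
        out.extend_from_slice(&ids);
    }
    // Rows left over after the last full chunk go through the plain path.
    for (&v, &h) in vals.remainder().iter().zip(hs.remainder()) {
        out.push(groups.get_or_insert(h, v));
    }
    out
}
```

In this sketch, interning `[7, 7, 3, 7, 3, 9]` touches the table only twice for the first chunk (for `7` and `3`); the other two rows are resolved by the in-chunk comparison. Whether that trade pays off depends on how often neighboring rows repeat, which is what the benchmarks are meant to probe.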

Test plan

  • `cargo test -p datafusion-physical-plan aggregat` (82 tests pass)
  • `cargo clippy -p datafusion-physical-plan --all-features -- -D warnings` (clean)
  • `cargo fmt --all` (clean)
  • Benchmark with group-by queries on primitive columns (low and high cardinality)

🤖 Generated with Claude Code

…sing, and local dedup

- Use `with_hashes` for batch hash computation via thread-local buffer,
  separating hashing from hash table ops for better vectorization
- Process 4 rows at a time via `chunks_exact(4)` with local dedup to
  reduce redundant hash table operations
- Split lookup into find + insert_unique phases (lighter than `entry`)
- Extract find_group, insert_new_group, get_or_create_null_group helpers
  to consolidate duplicated unsafe hash table logic with SAFETY comments
- Separate null/no-null fast paths to eliminate validity checks when
  no nulls are present

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the physical-plan (Changes to the physical-plan crate) label Mar 18, 2026
@Dandandan Dandandan changed the title from "Optimize primitive group-by intern with batch hashing and local dedup" to "[vibebench] Optimize primitive group-by intern with batch hashing and local dedup" Mar 18, 2026
@Dandandan
Contributor Author

run benchmarks

@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4085355414-435-44p4r 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing optimize-primitive-intern (a551024) to cf0a182 (merge-base) diff using: tpcds
Results will be posted here when complete

@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4085355414-436-nvpm6 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing optimize-primitive-intern (a551024) to cf0a182 (merge-base) diff using: tpch
Results will be posted here when complete

@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4085355414-434-lj6sg 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing optimize-primitive-intern (a551024) to cf0a182 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete

@adriangbot

🤖 Benchmark completed (GKE) | trigger


Comparing HEAD and optimize-primitive-intern
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                           HEAD ┃      optimize-primitive-intern ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 46.79 / 47.42 ±0.80 / 48.98 ms │ 46.32 / 47.14 ±0.85 / 48.54 ms │     no change │
│ QQuery 2  │ 25.48 / 25.65 ±0.14 / 25.89 ms │ 24.00 / 24.44 ±0.39 / 25.13 ms │     no change │
│ QQuery 3  │ 33.78 / 34.13 ±0.24 / 34.52 ms │ 32.59 / 33.04 ±0.39 / 33.73 ms │     no change │
│ QQuery 4  │ 21.38 / 21.92 ±0.78 / 23.46 ms │ 20.76 / 22.23 ±1.52 / 24.94 ms │     no change │
│ QQuery 5  │ 50.62 / 52.93 ±1.24 / 54.14 ms │ 48.49 / 50.86 ±1.67 / 52.82 ms │     no change │
│ QQuery 6  │ 18.17 / 18.47 ±0.21 / 18.71 ms │ 17.43 / 17.69 ±0.15 / 17.89 ms │     no change │
│ QQuery 7  │ 58.50 / 60.78 ±1.72 / 63.76 ms │ 56.46 / 57.90 ±1.23 / 59.59 ms │     no change │
│ QQuery 8  │ 52.00 / 52.84 ±0.62 / 53.75 ms │ 49.29 / 49.86 ±0.37 / 50.31 ms │ +1.06x faster │
│ QQuery 9  │ 58.15 / 58.73 ±0.37 / 59.19 ms │ 55.18 / 56.13 ±0.88 / 57.69 ms │     no change │
│ QQuery 10 │ 76.30 / 76.62 ±0.26 / 77.01 ms │ 73.75 / 75.01 ±0.89 / 76.03 ms │     no change │
│ QQuery 11 │ 17.69 / 17.96 ±0.22 / 18.23 ms │ 16.96 / 18.12 ±0.85 / 19.39 ms │     no change │
│ QQuery 12 │ 28.81 / 30.42 ±1.22 / 32.44 ms │ 28.15 / 28.51 ±0.21 / 28.77 ms │ +1.07x faster │
│ QQuery 13 │ 40.08 / 41.11 ±0.56 / 41.62 ms │ 39.02 / 39.73 ±0.48 / 40.27 ms │     no change │
│ QQuery 14 │ 29.21 / 29.41 ±0.15 / 29.66 ms │ 28.46 / 29.21 ±0.92 / 30.97 ms │     no change │
│ QQuery 15 │ 36.79 / 37.75 ±0.52 / 38.30 ms │ 35.05 / 35.39 ±0.20 / 35.68 ms │ +1.07x faster │
│ QQuery 16 │ 17.86 / 18.34 ±0.40 / 19.04 ms │ 17.63 / 18.03 ±0.28 / 18.45 ms │     no change │
│ QQuery 17 │ 73.61 / 75.36 ±1.39 / 77.10 ms │ 73.89 / 77.90 ±3.31 / 81.82 ms │     no change │
│ QQuery 18 │ 79.04 / 80.00 ±0.51 / 80.51 ms │ 81.34 / 81.80 ±0.47 / 82.56 ms │     no change │
│ QQuery 19 │ 39.50 / 40.33 ±0.88 / 41.94 ms │ 39.48 / 40.82 ±1.33 / 43.28 ms │     no change │
│ QQuery 20 │ 42.80 / 43.40 ±0.64 / 44.56 ms │ 43.76 / 45.09 ±1.10 / 46.88 ms │     no change │
│ QQuery 21 │ 69.17 / 69.80 ±0.39 / 70.29 ms │ 70.63 / 72.54 ±1.43 / 74.48 ms │     no change │
│ QQuery 22 │ 19.79 / 20.31 ±0.50 / 20.90 ms │ 20.24 / 20.57 ±0.26 / 20.91 ms │     no change │
└───────────┴────────────────────────────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                        ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 953.67ms │
│ Total Time (optimize-primitive-intern)   │ 941.99ms │
│ Average Time (HEAD)                      │  43.35ms │
│ Average Time (optimize-primitive-intern) │  42.82ms │
│ Queries Faster                           │        3 │
│ Queries Slower                           │        0 │
│ Queries with No Change                   │       19 │
│ Queries with Failure                     │        0 │
└──────────────────────────────────────────┴──────────┘
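
The Change column appears to be derived from the mean times (the second number in each cell). The sketch below is inferred from the rows in the table above, not taken from the benchmark tool's source: it assumes a noise band of about 5% relative to the base mean, and `classify` is a hypothetical helper name. Under that assumption, Query 8 (52.84 ms to 49.86 ms) is labeled `+1.06x faster`, while Query 2 (25.65 ms to 24.44 ms, about 4.7% faster) stays at `no change`.

```rust
/// Label a change between two mean times in milliseconds.
/// The 5% noise threshold is an assumption inferred from the table.
fn classify(base_ms: f64, branch_ms: f64) -> String {
    const NOISE: f64 = 0.05;
    // Relative difference of the branch against the base.
    let delta = (branch_ms - base_ms) / base_ms;
    if delta.abs() <= NOISE {
        "no change".to_string()
    } else if delta < 0.0 {
        // Branch is faster: report how many times faster than base.
        format!("+{:.2}x faster", base_ms / branch_ms)
    } else {
        // Branch is slower: report how many times slower than base.
        format!("{:.2}x slower", branch_ms / base_ms)
    }
}
```

Read this way, differences below roughly 5% count as noise, so the per-query labels and the "Queries Faster/Slower" tallies are best treated as coarse signals next to the ± spread shown in each cell.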

Resource Usage

tpch — base (merge-base)

Metric       Value
Wall time    5.1s
Peak memory  4.0 GiB
Avg memory   3.6 GiB
CPU user     33.7s
CPU sys      3.3s
Disk read    0 B
Disk write   140.0 KiB

tpch — branch

Metric       Value
Wall time    5.0s
Peak memory  4.1 GiB
Avg memory   3.6 GiB
CPU user     33.9s
CPU sys      2.8s
Disk read    0 B
Disk write   68.0 KiB

@adriangbot

🤖 Benchmark completed (GKE) | trigger


Comparing HEAD and optimize-primitive-intern
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                     HEAD ┃                optimize-primitive-intern ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │           50.10 / 51.54 ±0.96 / 52.56 ms │           49.31 / 50.38 ±1.02 / 52.08 ms │     no change │
│ QQuery 2  │        155.96 / 157.02 ±0.82 / 157.84 ms │        153.61 / 155.41 ±1.36 / 157.58 ms │     no change │
│ QQuery 3  │        118.65 / 120.46 ±1.82 / 123.74 ms │        116.81 / 118.04 ±1.31 / 120.37 ms │     no change │
│ QQuery 4  │    1484.05 / 1508.11 ±18.95 / 1542.01 ms │    1355.14 / 1366.81 ±10.73 / 1385.27 ms │ +1.10x faster │
│ QQuery 5  │        188.66 / 191.92 ±1.95 / 193.84 ms │        183.58 / 185.17 ±1.66 / 187.70 ms │     no change │
│ QQuery 6  │    1012.28 / 1061.53 ±34.98 / 1105.66 ms │    1035.71 / 1057.35 ±26.07 / 1103.53 ms │     no change │
│ QQuery 7  │        359.81 / 361.94 ±1.47 / 364.18 ms │        359.07 / 363.04 ±3.57 / 368.74 ms │     no change │
│ QQuery 8  │        121.63 / 122.25 ±0.39 / 122.67 ms │        120.50 / 121.10 ±0.41 / 121.76 ms │     no change │
│ QQuery 9  │        117.38 / 121.25 ±3.06 / 124.27 ms │        113.62 / 117.53 ±2.21 / 119.58 ms │     no change │
│ QQuery 10 │        114.56 / 116.08 ±0.84 / 117.11 ms │        116.04 / 116.49 ±0.49 / 117.37 ms │     no change │
│ QQuery 11 │        917.64 / 925.43 ±6.14 / 931.66 ms │        915.69 / 924.19 ±4.67 / 929.47 ms │     no change │
│ QQuery 12 │           47.67 / 48.79 ±0.89 / 50.17 ms │           47.02 / 49.31 ±1.44 / 51.58 ms │     no change │
│ QQuery 13 │        407.37 / 412.30 ±2.71 / 415.04 ms │        407.45 / 409.81 ±2.68 / 413.42 ms │     no change │
│ QQuery 14 │     1077.08 / 1084.39 ±4.87 / 1091.92 ms │     1073.18 / 1085.53 ±8.05 / 1094.61 ms │     no change │
│ QQuery 15 │           18.61 / 19.68 ±0.90 / 21.03 ms │           18.43 / 19.24 ±0.92 / 20.91 ms │     no change │
│ QQuery 16 │           45.26 / 46.17 ±0.79 / 47.29 ms │           44.45 / 46.01 ±1.21 / 47.73 ms │     no change │
│ QQuery 17 │        243.60 / 247.23 ±2.61 / 250.63 ms │        242.58 / 246.22 ±2.63 / 248.87 ms │     no change │
│ QQuery 18 │        134.85 / 137.14 ±1.71 / 140.04 ms │        134.71 / 135.42 ±0.78 / 136.75 ms │     no change │
│ QQuery 19 │        159.39 / 161.26 ±1.42 / 163.76 ms │        159.71 / 161.41 ±1.32 / 162.84 ms │     no change │
│ QQuery 20 │           15.98 / 16.78 ±0.46 / 17.33 ms │           15.92 / 16.36 ±0.56 / 17.44 ms │     no change │
│ QQuery 21 │           24.32 / 24.97 ±0.54 / 25.87 ms │           23.84 / 24.52 ±0.56 / 25.50 ms │     no change │
│ QQuery 22 │       487.25 / 495.78 ±11.67 / 518.68 ms │        494.55 / 499.54 ±3.25 / 504.51 ms │     no change │
│ QQuery 23 │       939.60 / 953.97 ±13.34 / 974.39 ms │        932.22 / 943.80 ±7.56 / 954.28 ms │     no change │
│ QQuery 24 │        440.72 / 445.08 ±3.68 / 449.07 ms │        427.68 / 432.30 ±2.73 / 435.99 ms │     no change │
│ QQuery 25 │        364.03 / 367.13 ±3.40 / 372.79 ms │        360.01 / 361.99 ±2.11 / 365.85 ms │     no change │
│ QQuery 26 │           87.17 / 91.84 ±2.99 / 95.22 ms │           87.61 / 88.87 ±0.97 / 90.04 ms │     no change │
│ QQuery 27 │        351.04 / 355.90 ±3.65 / 361.58 ms │        351.42 / 355.29 ±2.29 / 357.87 ms │     no change │
│ QQuery 28 │        157.87 / 161.18 ±1.87 / 162.92 ms │        157.29 / 159.08 ±1.22 / 161.10 ms │     no change │
│ QQuery 29 │        307.02 / 309.79 ±2.65 / 313.77 ms │        310.80 / 317.10 ±3.62 / 320.85 ms │     no change │
│ QQuery 30 │           48.04 / 51.47 ±2.38 / 54.65 ms │           48.13 / 49.92 ±1.57 / 52.22 ms │     no change │
│ QQuery 31 │        180.57 / 182.55 ±1.65 / 184.61 ms │        178.51 / 180.75 ±1.63 / 183.02 ms │     no change │
│ QQuery 32 │           61.24 / 62.25 ±0.94 / 63.69 ms │           61.53 / 62.29 ±0.94 / 64.10 ms │     no change │
│ QQuery 33 │        150.83 / 152.82 ±1.56 / 154.62 ms │        151.06 / 152.54 ±1.01 / 153.87 ms │     no change │
│ QQuery 34 │        109.20 / 110.78 ±1.07 / 112.02 ms │        110.88 / 111.62 ±0.61 / 112.46 ms │     no change │
│ QQuery 35 │        113.80 / 115.08 ±0.97 / 116.12 ms │        113.25 / 116.47 ±2.22 / 120.19 ms │     no change │
│ QQuery 36 │        218.21 / 223.49 ±3.07 / 226.85 ms │        219.76 / 225.57 ±3.29 / 229.23 ms │     no change │
│ QQuery 37 │        182.44 / 185.98 ±2.18 / 188.96 ms │        181.31 / 184.25 ±2.36 / 186.95 ms │     no change │
│ QQuery 38 │           93.23 / 95.49 ±1.87 / 98.15 ms │          91.74 / 96.41 ±3.33 / 101.55 ms │     no change │
│ QQuery 39 │        137.81 / 140.96 ±2.19 / 144.62 ms │        141.52 / 142.63 ±0.96 / 143.70 ms │     no change │
│ QQuery 40 │        117.94 / 120.80 ±3.31 / 127.07 ms │        114.78 / 122.09 ±7.56 / 135.42 ms │     no change │
│ QQuery 41 │           17.71 / 18.95 ±1.04 / 20.21 ms │           17.49 / 18.41 ±0.54 / 18.90 ms │     no change │
│ QQuery 42 │        110.88 / 112.65 ±1.07 / 113.82 ms │        110.26 / 111.87 ±1.63 / 114.78 ms │     no change │
│ QQuery 43 │           86.19 / 87.59 ±0.74 / 88.23 ms │           87.61 / 87.99 ±0.57 / 89.11 ms │     no change │
│ QQuery 44 │           18.64 / 19.16 ±0.41 / 19.79 ms │           18.53 / 18.77 ±0.24 / 19.11 ms │     no change │
│ QQuery 45 │           57.22 / 58.62 ±1.24 / 60.83 ms │           58.06 / 59.71 ±1.14 / 60.95 ms │     no change │
│ QQuery 46 │        234.28 / 237.13 ±2.21 / 240.63 ms │        243.33 / 245.75 ±1.55 / 247.67 ms │     no change │
│ QQuery 47 │        709.36 / 717.74 ±6.53 / 726.24 ms │       775.14 / 797.10 ±16.30 / 817.11 ms │  1.11x slower │
│ QQuery 48 │        289.34 / 295.35 ±3.87 / 299.75 ms │        299.61 / 304.04 ±3.93 / 310.10 ms │     no change │
│ QQuery 49 │        265.69 / 267.60 ±1.84 / 269.99 ms │        269.47 / 272.03 ±2.19 / 276.00 ms │     no change │
│ QQuery 50 │        228.79 / 237.49 ±5.07 / 243.74 ms │        244.70 / 254.01 ±5.16 / 259.14 ms │  1.07x slower │
│ QQuery 51 │        189.81 / 191.72 ±1.39 / 194.13 ms │        195.15 / 196.80 ±1.42 / 199.03 ms │     no change │
│ QQuery 52 │        110.42 / 111.39 ±0.90 / 112.96 ms │        113.64 / 114.56 ±1.09 / 116.66 ms │     no change │
│ QQuery 53 │        106.38 / 107.48 ±0.79 / 108.49 ms │        109.34 / 110.78 ±1.49 / 112.96 ms │     no change │
│ QQuery 54 │        156.08 / 157.20 ±0.57 / 157.58 ms │        157.40 / 159.13 ±1.26 / 160.64 ms │     no change │
│ QQuery 55 │        110.07 / 111.27 ±0.66 / 111.79 ms │        110.09 / 111.46 ±1.23 / 113.55 ms │     no change │
│ QQuery 56 │        151.53 / 152.74 ±1.06 / 154.68 ms │        151.92 / 153.87 ±1.78 / 156.63 ms │     no change │
│ QQuery 57 │        187.95 / 189.51 ±1.64 / 192.47 ms │        188.01 / 190.62 ±1.58 / 192.68 ms │     no change │
│ QQuery 58 │        306.55 / 314.83 ±5.14 / 319.91 ms │        303.92 / 311.24 ±4.86 / 316.26 ms │     no change │
│ QQuery 59 │        203.85 / 207.10 ±2.09 / 209.96 ms │        208.58 / 209.35 ±1.02 / 211.37 ms │     no change │
│ QQuery 60 │        155.27 / 157.45 ±2.12 / 161.37 ms │        156.71 / 157.70 ±0.89 / 159.22 ms │     no change │
│ QQuery 61 │        180.63 / 182.13 ±1.08 / 183.40 ms │        181.65 / 183.76 ±1.30 / 185.60 ms │     no change │
│ QQuery 62 │       863.58 / 903.11 ±23.55 / 930.15 ms │       910.12 / 957.02 ±24.16 / 979.47 ms │  1.06x slower │
│ QQuery 63 │        107.65 / 110.41 ±1.88 / 113.57 ms │        108.00 / 110.65 ±2.83 / 114.18 ms │     no change │
│ QQuery 64 │        718.73 / 722.93 ±3.50 / 729.26 ms │        737.81 / 739.98 ±2.29 / 744.09 ms │     no change │
│ QQuery 65 │        253.39 / 259.14 ±4.56 / 265.97 ms │        266.47 / 270.60 ±2.25 / 273.32 ms │     no change │
│ QQuery 66 │        256.71 / 269.02 ±8.09 / 277.40 ms │       245.31 / 265.42 ±14.03 / 280.91 ms │     no change │
│ QQuery 67 │        318.01 / 325.98 ±6.74 / 337.37 ms │        328.17 / 334.71 ±4.71 / 341.15 ms │     no change │
│ QQuery 68 │        280.78 / 285.87 ±3.69 / 291.31 ms │        289.88 / 294.36 ±3.79 / 300.18 ms │     no change │
│ QQuery 69 │        110.29 / 111.54 ±0.68 / 112.25 ms │        109.91 / 112.20 ±1.72 / 114.08 ms │     no change │
│ QQuery 70 │        339.41 / 351.02 ±8.06 / 361.80 ms │       335.82 / 355.35 ±14.50 / 380.24 ms │     no change │
│ QQuery 71 │        138.70 / 140.57 ±1.71 / 143.73 ms │        137.27 / 141.23 ±4.39 / 149.38 ms │     no change │
│ QQuery 72 │        721.76 / 725.88 ±2.93 / 729.29 ms │       708.95 / 728.06 ±11.94 / 744.40 ms │     no change │
│ QQuery 73 │        105.18 / 108.59 ±2.54 / 112.58 ms │        109.25 / 111.71 ±1.42 / 113.17 ms │     no change │
│ QQuery 74 │        568.99 / 576.68 ±4.09 / 581.19 ms │        617.83 / 626.49 ±9.85 / 644.83 ms │  1.09x slower │
│ QQuery 75 │        292.01 / 294.78 ±2.19 / 298.11 ms │        292.54 / 295.56 ±1.77 / 297.78 ms │     no change │
│ QQuery 76 │        138.42 / 140.60 ±1.75 / 143.77 ms │        139.57 / 141.95 ±1.78 / 144.76 ms │     no change │
│ QQuery 77 │        202.97 / 204.30 ±1.32 / 206.68 ms │        201.87 / 203.55 ±1.32 / 205.27 ms │     no change │
│ QQuery 78 │        361.22 / 365.35 ±2.79 / 368.42 ms │        365.31 / 370.45 ±5.07 / 377.90 ms │     no change │
│ QQuery 79 │        235.68 / 238.70 ±1.72 / 240.64 ms │        243.85 / 247.46 ±2.48 / 251.50 ms │     no change │
│ QQuery 80 │        344.75 / 346.48 ±2.02 / 350.34 ms │        342.78 / 346.97 ±3.18 / 352.00 ms │     no change │
│ QQuery 81 │           31.98 / 33.91 ±1.29 / 36.03 ms │           32.63 / 33.92 ±0.91 / 35.22 ms │     no change │
│ QQuery 82 │        206.81 / 209.16 ±1.67 / 211.59 ms │        206.52 / 208.30 ±1.99 / 212.05 ms │     no change │
│ QQuery 83 │           49.91 / 52.06 ±1.67 / 54.98 ms │           49.33 / 49.75 ±0.43 / 50.44 ms │     no change │
│ QQuery 84 │           52.16 / 53.48 ±1.47 / 56.35 ms │           52.27 / 53.68 ±1.21 / 55.74 ms │     no change │
│ QQuery 85 │        154.82 / 156.85 ±1.44 / 159.04 ms │        154.71 / 156.85 ±2.25 / 160.49 ms │     no change │
│ QQuery 86 │           40.86 / 43.14 ±1.24 / 44.27 ms │           41.25 / 42.60 ±0.86 / 43.41 ms │     no change │
│ QQuery 87 │          95.40 / 97.11 ±1.96 / 100.83 ms │           90.56 / 95.87 ±3.11 / 99.96 ms │     no change │
│ QQuery 88 │        115.08 / 116.47 ±1.12 / 118.17 ms │        114.00 / 114.72 ±0.67 / 115.83 ms │     no change │
│ QQuery 89 │        122.55 / 123.06 ±0.69 / 124.39 ms │        122.95 / 125.42 ±2.19 / 128.34 ms │     no change │
│ QQuery 90 │           29.59 / 30.09 ±0.47 / 30.96 ms │           28.80 / 29.51 ±0.38 / 29.88 ms │     no change │
│ QQuery 91 │           66.65 / 67.89 ±0.95 / 69.36 ms │           69.47 / 70.86 ±1.40 / 73.43 ms │     no change │
│ QQuery 92 │           60.86 / 62.96 ±1.16 / 64.05 ms │           63.26 / 63.67 ±0.30 / 64.08 ms │     no change │
│ QQuery 93 │        193.50 / 196.64 ±2.39 / 199.50 ms │        198.75 / 200.28 ±1.16 / 201.47 ms │     no change │
│ QQuery 94 │           66.80 / 67.36 ±0.54 / 68.17 ms │           66.63 / 68.01 ±1.04 / 69.77 ms │     no change │
│ QQuery 95 │        140.38 / 143.18 ±1.49 / 144.70 ms │        142.78 / 144.23 ±1.30 / 146.20 ms │     no change │
│ QQuery 96 │           77.25 / 78.72 ±0.98 / 80.13 ms │           76.57 / 78.36 ±1.32 / 80.65 ms │     no change │
│ QQuery 97 │        132.43 / 135.29 ±2.48 / 138.45 ms │        135.45 / 137.65 ±1.41 / 139.82 ms │     no change │
│ QQuery 98 │        155.90 / 159.17 ±1.78 / 160.95 ms │        159.34 / 161.67 ±1.82 / 164.73 ms │     no change │
│ QQuery 99 │ 10796.04 / 10975.68 ±93.32 / 11061.98 ms │ 10843.48 / 10884.16 ±40.25 / 10954.13 ms │     no change │
└───────────┴──────────────────────────────────────────┴──────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 34776.84ms │
│ Total Time (optimize-primitive-intern)   │ 34814.04ms │
│ Average Time (HEAD)                      │   351.28ms │
│ Average Time (optimize-primitive-intern) │   351.66ms │
│ Queries Faster                           │          1 │
│ Queries Slower                           │          4 │
│ Queries with No Change                   │         94 │
│ Queries with Failure                     │          0 │
└──────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

Metric       Value
Wall time    174.2s
Peak memory  5.4 GiB
Avg memory   4.4 GiB
CPU user     273.1s
CPU sys      22.3s
Disk read    0 B
Disk write   636.3 MiB

tpcds — branch

Metric       Value
Wall time    174.4s
Peak memory  5.2 GiB
Avg memory   4.4 GiB
CPU user     275.0s
CPU sys      22.2s
Disk read    0 B
Disk write   156.0 KiB

@adriangbot

🤖 Benchmark completed (GKE) | trigger


Comparing HEAD and optimize-primitive-intern
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃             optimize-primitive-intern ┃       Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.60 / 4.79 ±6.30 / 17.40 ms │          1.59 / 4.79 ±6.32 / 17.43 ms │    no change │
│ QQuery 1  │        15.25 / 15.36 ±0.09 / 15.52 ms │        14.79 / 15.26 ±0.28 / 15.62 ms │    no change │
│ QQuery 2  │        56.28 / 56.73 ±0.28 / 57.05 ms │        56.03 / 56.68 ±0.33 / 56.90 ms │    no change │
│ QQuery 3  │        46.78 / 49.41 ±2.86 / 54.02 ms │        46.53 / 50.92 ±2.55 / 54.00 ms │    no change │
│ QQuery 4  │     285.95 / 293.25 ±7.49 / 305.46 ms │     300.40 / 306.07 ±4.71 / 312.16 ms │    no change │
│ QQuery 5  │     341.27 / 348.42 ±3.80 / 352.66 ms │     342.48 / 346.56 ±4.01 / 353.89 ms │    no change │
│ QQuery 6  │           5.05 / 5.69 ±0.65 / 6.87 ms │           5.67 / 6.03 ±0.38 / 6.64 ms │ 1.06x slower │
│ QQuery 7  │        17.83 / 18.63 ±0.42 / 18.99 ms │        17.23 / 18.17 ±0.66 / 18.90 ms │    no change │
│ QQuery 8  │     408.69 / 419.89 ±7.17 / 429.75 ms │     416.86 / 425.65 ±7.33 / 438.13 ms │    no change │
│ QQuery 9  │     640.34 / 650.85 ±6.50 / 659.61 ms │    646.16 / 666.07 ±13.23 / 681.61 ms │    no change │
│ QQuery 10 │       94.38 / 97.03 ±2.96 / 102.79 ms │        91.97 / 94.57 ±2.11 / 97.06 ms │    no change │
│ QQuery 11 │     105.28 / 106.75 ±1.02 / 108.34 ms │     105.88 / 107.64 ±2.08 / 111.07 ms │    no change │
│ QQuery 12 │     339.17 / 344.16 ±3.34 / 349.26 ms │     344.18 / 347.13 ±3.41 / 353.44 ms │    no change │
│ QQuery 13 │     471.38 / 477.70 ±6.73 / 490.64 ms │     468.79 / 481.76 ±8.96 / 492.41 ms │    no change │
│ QQuery 14 │     351.60 / 355.31 ±2.26 / 357.69 ms │     349.04 / 355.11 ±4.96 / 363.01 ms │    no change │
│ QQuery 15 │    359.78 / 386.66 ±21.05 / 415.53 ms │    381.66 / 417.78 ±21.82 / 447.06 ms │ 1.08x slower │
│ QQuery 16 │    717.81 / 752.85 ±27.59 / 789.68 ms │    721.90 / 737.91 ±21.48 / 780.14 ms │    no change │
│ QQuery 17 │     709.79 / 715.39 ±6.68 / 726.66 ms │     712.51 / 715.54 ±2.32 / 717.82 ms │    no change │
│ QQuery 18 │ 1400.20 / 1457.65 ±35.33 / 1505.86 ms │ 1421.66 / 1467.62 ±27.04 / 1505.34 ms │    no change │
│ QQuery 19 │      35.48 / 53.22 ±34.96 / 123.13 ms │      36.33 / 52.05 ±30.46 / 112.96 ms │    no change │
│ QQuery 20 │    706.41 / 723.62 ±20.26 / 748.81 ms │    708.86 / 725.04 ±16.52 / 756.87 ms │    no change │
│ QQuery 21 │    755.32 / 770.63 ±13.68 / 795.65 ms │     751.82 / 757.64 ±4.03 / 763.28 ms │    no change │
│ QQuery 22 │  1120.56 / 1129.38 ±9.26 / 1146.50 ms │  1120.26 / 1124.78 ±3.79 / 1131.76 ms │    no change │
│ QQuery 23 │ 3181.88 / 3204.22 ±24.78 / 3251.00 ms │ 3166.90 / 3183.43 ±11.59 / 3197.99 ms │    no change │
│ QQuery 24 │     104.57 / 105.74 ±1.19 / 108.00 ms │     100.23 / 105.07 ±4.57 / 113.55 ms │    no change │
│ QQuery 25 │     141.04 / 141.66 ±0.67 / 142.55 ms │     139.65 / 142.54 ±2.74 / 146.99 ms │    no change │
│ QQuery 26 │      96.29 / 102.71 ±3.93 / 108.47 ms │      99.98 / 103.34 ±2.33 / 105.90 ms │    no change │
│ QQuery 27 │     842.36 / 850.50 ±7.39 / 863.65 ms │     838.47 / 845.67 ±9.77 / 864.87 ms │    no change │
│ QQuery 28 │ 7683.05 / 7738.96 ±32.64 / 7773.19 ms │ 7679.00 / 7728.26 ±28.22 / 7758.84 ms │    no change │
│ QQuery 29 │        56.27 / 60.79 ±5.16 / 70.08 ms │        56.68 / 62.07 ±5.01 / 69.85 ms │    no change │
│ QQuery 30 │     366.73 / 373.87 ±4.00 / 377.80 ms │     362.38 / 368.72 ±5.61 / 378.28 ms │    no change │
│ QQuery 31 │     369.14 / 380.34 ±6.97 / 388.94 ms │     354.46 / 373.26 ±9.86 / 383.63 ms │    no change │
│ QQuery 32 │ 1161.22 / 1230.08 ±56.51 / 1319.14 ms │ 1203.94 / 1290.01 ±50.21 / 1347.63 ms │    no change │
│ QQuery 33 │ 1438.58 / 1463.34 ±35.86 / 1533.32 ms │ 1437.02 / 1482.54 ±47.95 / 1570.68 ms │    no change │
│ QQuery 34 │ 1445.49 / 1477.09 ±25.22 / 1508.40 ms │ 1435.30 / 1484.09 ±29.35 / 1522.96 ms │    no change │
│ QQuery 35 │     379.12 / 384.77 ±5.21 / 391.73 ms │     408.11 / 418.61 ±7.99 / 431.93 ms │ 1.09x slower │
│ QQuery 36 │     123.73 / 125.38 ±1.58 / 127.35 ms │     120.11 / 124.55 ±3.00 / 128.00 ms │    no change │
│ QQuery 37 │        50.48 / 51.94 ±1.52 / 54.70 ms │        49.08 / 50.60 ±1.18 / 52.35 ms │    no change │
│ QQuery 38 │        75.19 / 76.76 ±1.65 / 79.68 ms │        75.45 / 77.51 ±1.68 / 80.18 ms │    no change │
│ QQuery 39 │     220.23 / 224.69 ±3.37 / 229.96 ms │     212.09 / 218.28 ±9.98 / 238.20 ms │    no change │
│ QQuery 40 │        22.34 / 25.94 ±3.32 / 31.39 ms │        24.09 / 25.78 ±1.37 / 28.24 ms │    no change │
│ QQuery 41 │        20.85 / 21.78 ±1.14 / 24.01 ms │        21.23 / 23.60 ±1.84 / 26.22 ms │ 1.08x slower │
│ QQuery 42 │        20.07 / 20.57 ±0.38 / 21.10 ms │        21.43 / 22.67 ±1.47 / 25.54 ms │ 1.10x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 27294.48ms │
│ Total Time (optimize-primitive-intern)   │ 27411.38ms │
│ Average Time (HEAD)                      │   634.76ms │
│ Average Time (optimize-primitive-intern) │   637.47ms │
│ Queries Faster                           │          0 │
│ Queries Slower                           │          5 │
│ Queries with No Change                   │         38 │
│ Queries with Failure                     │          0 │
└──────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric       Value
Wall time    137.5s
Peak memory  39.1 GiB
Avg memory   27.9 GiB
CPU user     1281.8s
CPU sys      97.5s
Disk read    0 B
Disk write   4.5 GiB

clickbench_partitioned — branch

Metric       Value
Wall time    138.1s
Peak memory  38.7 GiB
Avg memory   29.3 GiB
CPU user     1285.0s
CPU sys      100.8s
Disk read    0 B
Disk write   116.0 KiB

@Dandandan
Contributor Author

run benchmark clickbench_extended

@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4085458589-439-wb68c 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing optimize-primitive-intern (a551024) to cf0a182 (merge-base) diff using: clickbench_extended
Results will be posted here when complete

@adriangbot

🤖 Benchmark completed (GKE) | trigger


Comparing HEAD and optimize-primitive-intern
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query    ┃                                    HEAD ┃                 optimize-primitive-intern ┃       Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0 │      830.38 / 887.33 ±29.07 / 911.86 ms │        840.76 / 880.03 ±37.16 / 946.06 ms │    no change │
│ QQuery 1 │       211.02 / 212.39 ±1.35 / 214.75 ms │         213.56 / 214.64 ±1.05 / 216.41 ms │    no change │
│ QQuery 2 │       502.62 / 505.70 ±2.12 / 508.84 ms │         507.14 / 509.59 ±1.85 / 512.38 ms │    no change │
│ QQuery 3 │       314.29 / 318.09 ±1.97 / 319.96 ms │         317.63 / 318.29 ±0.69 / 319.62 ms │    no change │
│ QQuery 4 │       687.09 / 698.49 ±8.77 / 711.34 ms │        681.43 / 703.70 ±19.76 / 737.70 ms │    no change │
│ QQuery 5 │ 9960.03 / 10120.28 ±92.23 / 10242.54 ms │ 10005.17 / 10293.21 ±206.45 / 10541.25 ms │    no change │
│ QQuery 6 │   1012.70 / 1024.91 ±14.80 / 1053.69 ms │      1006.09 / 1018.35 ±7.17 / 1028.01 ms │    no change │
│ QQuery 7 │      772.88 / 803.78 ±29.30 / 842.33 ms │        843.07 / 858.42 ±21.25 / 900.16 ms │ 1.07x slower │
└──────────┴─────────────────────────────────────────┴───────────────────────────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                        │ 14570.97ms │
│ Total Time (optimize-primitive-intern)   │ 14796.23ms │
│ Average Time (HEAD)                      │  1821.37ms │
│ Average Time (optimize-primitive-intern) │  1849.53ms │
│ Queries Faster                           │          0 │
│ Queries Slower                           │          1 │
│ Queries with No Change                   │          7 │
│ Queries with Failure                     │          0 │
└──────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_extended — base (merge-base)

Metric       Value
Wall time    73.5s
Peak memory  32.5 GiB
Avg memory   28.0 GiB
CPU user     734.6s
CPU sys      23.8s
Disk read    0 B
Disk write   21.8 MiB

clickbench_extended — branch

Metric       Value
Wall time    74.6s
Peak memory  31.8 GiB
Avg memory   26.5 GiB
CPU user     732.8s
CPU sys      25.4s
Disk read    0 B
Disk write   76.0 KiB

@Dandandan Dandandan closed this Mar 19, 2026
