Revert "Revert "Improve `coalesce` and `concat` performance for views… #7625

Dandandan · 2025-06-07T17:07:20Z

This reverts commit da461c8.

This adds a test and a fix for the wrong index issue.
I also verified the change against DataFusion, where benchmarks show notable improvements.

Which issue does this PR close?

Closes #NNN.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

…apache#7614)" (apache#7623)" This reverts commit da461c8.

alamb · 2025-06-08T11:23:28Z

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing revert_revert_concat (092914f) to da461c8 diff
BENCH_NAME=concatenate_kernel
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench concatenate_kernel
BENCH_FILTER=
BENCH_BRANCH_NAME=revert_revert_concat
Results will be posted here when complete

alamb · 2025-06-08T11:32:36Z

🤖: Benchmark completed

Details

group                                                          main                                   revert_revert_concat
-----                                                          ----                                   --------------------
concat 1024 arrays boolean 4                                   1.00     27.8±0.06µs        ? ?/sec    1.00     27.7±0.04µs        ? ?/sec
concat 1024 arrays i32 4                                       1.00     12.8±0.02µs        ? ?/sec    1.01     13.0±0.03µs        ? ?/sec
concat 1024 arrays str 4                                       1.00     54.7±0.26µs        ? ?/sec    1.01     55.0±0.54µs        ? ?/sec
concat boolean 1024                                            1.00    427.6±0.41ns        ? ?/sec    1.01    431.7±5.01ns        ? ?/sec
concat boolean 8192 over 100 arrays                            1.00     50.8±0.07µs        ? ?/sec    1.00     50.9±0.13µs        ? ?/sec
concat boolean nulls 1024                                      1.01    751.2±1.28ns        ? ?/sec    1.00    742.2±2.65ns        ? ?/sec
concat boolean nulls 8192 over 100 arrays                      1.00    109.5±0.19µs        ? ?/sec    1.00    109.8±0.11µs        ? ?/sec
concat fixed size lists                                        1.01   809.8±19.52µs        ? ?/sec    1.00   801.7±26.09µs        ? ?/sec
concat i32 1024                                                1.01    442.7±0.83ns        ? ?/sec    1.00    436.9±0.96ns        ? ?/sec
concat i32 8192 over 100 arrays                                1.14    240.8±2.94µs        ? ?/sec    1.00   211.7±10.56µs        ? ?/sec
concat i32 nulls 1024                                          1.00    746.4±2.65ns        ? ?/sec    1.00    749.8±4.35ns        ? ?/sec
concat i32 nulls 8192 over 100 arrays                          1.01    285.0±9.62µs        ? ?/sec    1.00    282.5±3.47µs        ? ?/sec
concat str 1024                                                1.01     14.0±1.17µs        ? ?/sec    1.00     13.8±1.05µs        ? ?/sec
concat str 8192 over 100 arrays                                1.00    105.1±0.78ms        ? ?/sec    1.00    105.5±1.87ms        ? ?/sec
concat str nulls 1024                                          1.01      7.1±0.89µs        ? ?/sec    1.00      7.0±0.71µs        ? ?/sec
concat str nulls 8192 over 100 arrays                          1.00     51.9±0.34ms        ? ?/sec    1.01     52.2±0.44ms        ? ?/sec
concat str_dict 1024                                           1.03      3.0±0.01µs        ? ?/sec    1.00      2.9±0.03µs        ? ?/sec
concat str_dict_sparse 1024                                    1.00      6.8±0.02µs        ? ?/sec    1.01      6.9±0.01µs        ? ?/sec
concat struct with int32 and dicts size=1024 count=2           1.04      6.9±0.20µs        ? ?/sec    1.00      6.6±0.06µs        ? ?/sec
concat utf8_view  max_str_len=128 null_density=0               1.16     89.6±0.19µs        ? ?/sec    1.00     77.6±0.35µs        ? ?/sec
concat utf8_view  max_str_len=128 null_density=0.2             1.78    149.8±0.40µs        ? ?/sec    1.00     84.0±0.32µs        ? ?/sec
concat utf8_view  max_str_len=20 null_density=0                1.10     99.5±1.80µs        ? ?/sec    1.00     90.1±0.30µs        ? ?/sec
concat utf8_view  max_str_len=20 null_density=0.2              1.56    150.7±0.84µs        ? ?/sec    1.00     96.5±0.20µs        ? ?/sec
concat utf8_view all_inline max_str_len=12 null_density=0      1.49     72.2±1.24µs        ? ?/sec    1.00     48.6±3.71µs        ? ?/sec
concat utf8_view all_inline max_str_len=12 null_density=0.2    1.51     79.0±0.56µs        ? ?/sec    1.00     52.2±2.92µs        ? ?/sec

alamb · 2025-06-08T11:32:39Z

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing revert_revert_concat (092914f) to da461c8 diff
BENCH_NAME=coalesce_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench coalesce_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=revert_revert_concat
Results will be posted here when complete

alamb · 2025-06-08T11:49:27Z

🤖: Benchmark completed

Details

group                                                                                main                                   revert_revert_concat
-----                                                                                ----                                   --------------------
filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                               1.00    250.4±1.37ms        ? ?/sec    1.20    299.2±1.97ms        ? ?/sec
filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                                1.02      9.0±0.06ms        ? ?/sec    1.00      8.8±0.05ms        ? ?/sec
filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                                 1.00      4.4±0.12ms        ? ?/sec    1.01      4.4±0.12ms        ? ?/sec
filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                                 1.00      3.5±0.02ms        ? ?/sec    1.01      3.5±0.03ms        ? ?/sec
filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                             1.00    250.0±2.62ms        ? ?/sec    1.04    259.1±2.52ms        ? ?/sec
filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                              1.01     10.4±0.11ms        ? ?/sec    1.00     10.3±0.07ms        ? ?/sec
filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                               1.00      4.6±0.05ms        ? ?/sec    1.00      4.6±0.03ms        ? ?/sec
filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                               1.00      4.5±0.02ms        ? ?/sec    1.00      4.6±0.01ms        ? ?/sec
filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                               1.03     69.3±0.22ms        ? ?/sec    1.00     67.3±0.48ms        ? ?/sec
filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                                1.00     12.7±0.15ms        ? ?/sec    1.00     12.8±0.12ms        ? ?/sec
filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                                 1.00      9.7±0.15ms        ? ?/sec    1.01      9.8±0.27ms        ? ?/sec
filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                                 1.00      8.1±0.21ms        ? ?/sec    1.01      8.2±0.12ms        ? ?/sec
filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                             1.07     90.3±0.64ms        ? ?/sec    1.00     84.7±0.54ms        ? ?/sec
filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                              1.00     14.6±0.14ms        ? ?/sec    1.00     14.6±0.13ms        ? ?/sec
filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                               1.00     10.2±0.25ms        ? ?/sec    1.00     10.2±0.34ms        ? ?/sec
filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                               1.00      9.4±0.14ms        ? ?/sec    1.02      9.6±0.13ms        ? ?/sec
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001      1.25     86.3±0.48ms        ? ?/sec    1.00     68.8±0.33ms        ? ?/sec
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01       1.21     10.7±0.05ms        ? ?/sec    1.00      8.8±0.05ms        ? ?/sec
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1        1.20      6.1±0.27ms        ? ?/sec    1.00      5.1±0.17ms        ? ?/sec
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8        1.23      4.2±0.02ms        ? ?/sec    1.00      3.4±0.03ms        ? ?/sec
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001    1.23    106.8±0.46ms        ? ?/sec    1.00     87.2±0.44ms        ? ?/sec
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01     1.23     14.6±0.04ms        ? ?/sec    1.00     11.8±0.04ms        ? ?/sec
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1      1.30      7.6±0.19ms        ? ?/sec    1.00      5.9±0.22ms        ? ?/sec
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8      1.08      4.0±0.01ms        ? ?/sec    1.00      3.7±0.01ms        ? ?/sec
filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001       1.32     79.3±0.66ms        ? ?/sec    1.00     60.2±0.26ms        ? ?/sec
filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01        1.30      9.2±0.03ms        ? ?/sec    1.00      7.1±0.02ms        ? ?/sec
filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1         1.31      3.8±0.02ms        ? ?/sec    1.00      2.9±0.12ms        ? ?/sec
filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.8         1.88      4.7±0.02ms        ? ?/sec    1.00      2.5±0.01ms        ? ?/sec
filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.001     1.24     86.8±0.45ms        ? ?/sec    1.00     70.3±0.21ms        ? ?/sec
filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.01      1.30     13.5±0.05ms        ? ?/sec    1.00     10.4±0.04ms        ? ?/sec
filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.1       1.37      5.4±0.17ms        ? ?/sec    1.00      3.9±0.22ms        ? ?/sec
filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.8       1.58      7.5±0.02ms        ? ?/sec    1.00      4.8±0.02ms        ? ?/sec
filter: single_utf8view, 8192, nulls: 0, selectivity: 0.001                          1.48    133.4±0.52ms        ? ?/sec    1.00     89.9±0.28ms        ? ?/sec
filter: single_utf8view, 8192, nulls: 0, selectivity: 0.01                           1.32     15.6±0.04ms        ? ?/sec    1.00     11.9±0.03ms        ? ?/sec
filter: single_utf8view, 8192, nulls: 0, selectivity: 0.1                            1.30      7.0±0.11ms        ? ?/sec    1.00      5.4±0.18ms        ? ?/sec
filter: single_utf8view, 8192, nulls: 0, selectivity: 0.8                            2.09      8.9±0.02ms        ? ?/sec    1.00      4.2±0.01ms        ? ?/sec
filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.001                        1.33    164.9±0.82ms        ? ?/sec    1.00    123.5±0.89ms        ? ?/sec
filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.01                         1.34     22.5±0.06ms        ? ?/sec    1.00     16.8±0.05ms        ? ?/sec
filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.1                          1.43     10.4±0.08ms        ? ?/sec    1.00      7.2±0.07ms        ? ?/sec
filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.8                          1.81     12.7±0.06ms        ? ?/sec    1.00      7.0±0.02ms        ? ?/sec

alamb · 2025-06-08T11:49:30Z

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing revert_revert_concat (092914f) to da461c8 diff
BENCH_NAME=filter_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench filter_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=revert_revert_concat
Results will be posted here when complete

alamb · 2025-06-08T12:12:43Z

🤖: Benchmark completed

Details

group                                                                         main                                   revert_revert_concat
-----                                                                         ----                                   --------------------
filter context decimal128 (kept 1/2)                                          1.00     44.8±6.98µs        ? ?/sec    1.18     52.8±7.46µs        ? ?/sec
filter context decimal128 high selectivity (kept 1023/1024)                   1.00     50.9±1.22µs        ? ?/sec    1.02     52.2±1.68µs        ? ?/sec
filter context decimal128 low selectivity (kept 1/1024)                       1.02    246.5±0.40ns        ? ?/sec    1.00    241.1±1.00ns        ? ?/sec
filter context f32 (kept 1/2)                                                 1.00     90.6±0.13µs        ? ?/sec    1.00     90.7±0.12µs        ? ?/sec
filter context f32 high selectivity (kept 1023/1024)                          1.00     13.8±0.57µs        ? ?/sec    1.02     14.0±0.63µs        ? ?/sec
filter context f32 low selectivity (kept 1/1024)                              1.00    467.0±0.44ns        ? ?/sec    1.00    467.9±0.62ns        ? ?/sec
filter context fsb with value length 20 (kept 1/2)                            1.00     70.6±0.11µs        ? ?/sec    1.00     70.7±0.10µs        ? ?/sec
filter context fsb with value length 20 high selectivity (kept 1023/1024)     1.00     70.6±0.09µs        ? ?/sec    1.00     70.7±0.08µs        ? ?/sec
filter context fsb with value length 20 low selectivity (kept 1/1024)         1.00     70.7±0.13µs        ? ?/sec    1.00     70.7±0.09µs        ? ?/sec
filter context fsb with value length 5 (kept 1/2)                             1.00     70.7±0.12µs        ? ?/sec    1.00     70.8±0.12µs        ? ?/sec
filter context fsb with value length 5 high selectivity (kept 1023/1024)      1.00     70.6±0.07µs        ? ?/sec    1.00     70.7±0.12µs        ? ?/sec
filter context fsb with value length 5 low selectivity (kept 1/1024)          1.00     70.7±0.10µs        ? ?/sec    1.00     70.7±0.07µs        ? ?/sec
filter context fsb with value length 50 (kept 1/2)                            1.00     70.7±0.10µs        ? ?/sec    1.00     70.7±0.09µs        ? ?/sec
filter context fsb with value length 50 high selectivity (kept 1023/1024)     1.00     70.7±0.11µs        ? ?/sec    1.00     70.9±1.21µs        ? ?/sec
filter context fsb with value length 50 low selectivity (kept 1/1024)         1.00     70.7±0.10µs        ? ?/sec    1.00     70.8±0.30µs        ? ?/sec
filter context i32 (kept 1/2)                                                 1.00     22.6±0.07µs        ? ?/sec    1.00     22.6±0.04µs        ? ?/sec
filter context i32 high selectivity (kept 1023/1024)                          1.00      6.6±0.32µs        ? ?/sec    1.02      6.7±0.45µs        ? ?/sec
filter context i32 low selectivity (kept 1/1024)                              1.02    246.7±0.68ns        ? ?/sec    1.00    243.0±0.35ns        ? ?/sec
filter context i32 w NULLs (kept 1/2)                                         1.00     93.7±0.17µs        ? ?/sec    1.00     94.2±0.24µs        ? ?/sec
filter context i32 w NULLs high selectivity (kept 1023/1024)                  1.00     13.6±0.51µs        ? ?/sec    1.03     14.1±0.53µs        ? ?/sec
filter context i32 w NULLs low selectivity (kept 1/1024)                      1.01    471.8±0.69ns        ? ?/sec    1.00    464.9±0.48ns        ? ?/sec
filter context mixed string view (kept 1/2)                                   1.01    116.8±6.20µs        ? ?/sec    1.00    115.2±5.04µs        ? ?/sec
filter context mixed string view high selectivity (kept 1023/1024)            1.04     59.8±1.56µs        ? ?/sec    1.00     57.5±1.17µs        ? ?/sec
filter context mixed string view low selectivity (kept 1/1024)                1.02    667.2±3.86ns        ? ?/sec    1.00    651.5±0.80ns        ? ?/sec
filter context short string view (kept 1/2)                                   1.00    120.0±5.68µs        ? ?/sec    1.02    122.6±6.56µs        ? ?/sec
filter context short string view high selectivity (kept 1023/1024)            1.02     59.3±1.20µs        ? ?/sec    1.00     58.2±1.40µs        ? ?/sec
filter context short string view low selectivity (kept 1/1024)                1.00    485.3±0.76ns        ? ?/sec    1.01    490.6±0.76ns        ? ?/sec
filter context string (kept 1/2)                                              1.00   580.8±13.02µs        ? ?/sec    1.00   583.4±12.13µs        ? ?/sec
filter context string dictionary (kept 1/2)                                   1.01     23.7±0.08µs        ? ?/sec    1.00     23.5±0.05µs        ? ?/sec
filter context string dictionary high selectivity (kept 1023/1024)            1.05      7.3±0.31µs        ? ?/sec    1.00      7.0±0.44µs        ? ?/sec
filter context string dictionary low selectivity (kept 1/1024)                1.02    834.9±4.50ns        ? ?/sec    1.00    816.1±2.48ns        ? ?/sec
filter context string dictionary w NULLs (kept 1/2)                           1.00     94.5±0.13µs        ? ?/sec    1.00     94.4±0.15µs        ? ?/sec
filter context string dictionary w NULLs high selectivity (kept 1023/1024)    1.03     14.4±0.46µs        ? ?/sec    1.00     14.0±0.48µs        ? ?/sec
filter context string dictionary w NULLs low selectivity (kept 1/1024)        1.01   1071.5±4.81ns        ? ?/sec    1.00   1065.0±1.45ns        ? ?/sec
filter context string high selectivity (kept 1023/1024)                       1.03   628.4±19.04µs        ? ?/sec    1.00   609.6±12.59µs        ? ?/sec
filter context string low selectivity (kept 1/1024)                           1.23   1139.8±2.11ns        ? ?/sec    1.00    925.2±5.60ns        ? ?/sec
filter context u8 (kept 1/2)                                                  1.00     18.8±0.03µs        ? ?/sec    1.00     18.8±0.04µs        ? ?/sec
filter context u8 high selectivity (kept 1023/1024)                           1.00  1795.5±11.86ns        ? ?/sec    1.13      2.0±0.01µs        ? ?/sec
filter context u8 low selectivity (kept 1/1024)                               1.02    240.6±0.29ns        ? ?/sec    1.00    236.8±0.26ns        ? ?/sec
filter context u8 w NULLs (kept 1/2)                                          1.00     89.9±0.22µs        ? ?/sec    1.00     90.0±0.18µs        ? ?/sec
filter context u8 w NULLs high selectivity (kept 1023/1024)                   1.00      8.7±0.02µs        ? ?/sec    1.00      8.7±0.01µs        ? ?/sec
filter context u8 w NULLs low selectivity (kept 1/1024)                       1.20    559.2±1.82ns        ? ?/sec    1.00    464.7±0.49ns        ? ?/sec
filter decimal128 (kept 1/2)                                                  1.00     96.6±0.38µs        ? ?/sec    1.02     98.4±0.47µs        ? ?/sec
filter decimal128 high selectivity (kept 1023/1024)                           1.00     51.7±0.28µs        ? ?/sec    1.07     55.5±0.94µs        ? ?/sec
filter decimal128 low selectivity (kept 1/1024)                               1.00      3.0±0.00µs        ? ?/sec    1.00      3.0±0.01µs        ? ?/sec
filter f32 (kept 1/2)                                                         1.01    200.8±0.43µs        ? ?/sec    1.00    199.1±0.21µs        ? ?/sec
filter fsb with value length 20 (kept 1/2)                                    1.00    149.4±0.39µs        ? ?/sec    1.00    149.8±0.73µs        ? ?/sec
filter fsb with value length 20 high selectivity (kept 1023/1024)             1.00     70.5±1.75µs        ? ?/sec    1.01     71.3±1.98µs        ? ?/sec
filter fsb with value length 20 low selectivity (kept 1/1024)                 1.01      3.2±0.00µs        ? ?/sec    1.00      3.2±0.01µs        ? ?/sec
filter fsb with value length 5 (kept 1/2)                                     1.00    152.5±0.31µs        ? ?/sec    1.01    153.3±0.22µs        ? ?/sec
filter fsb with value length 5 high selectivity (kept 1023/1024)              1.00     11.0±0.70µs        ? ?/sec    1.01     11.2±0.63µs        ? ?/sec
filter fsb with value length 5 low selectivity (kept 1/1024)                  1.01      3.1±0.01µs        ? ?/sec    1.00      3.1±0.01µs        ? ?/sec
filter fsb with value length 50 (kept 1/2)                                    1.02    189.6±9.58µs        ? ?/sec    1.00    185.0±4.40µs        ? ?/sec
filter fsb with value length 50 high selectivity (kept 1023/1024)             1.01    210.9±8.00µs        ? ?/sec    1.00    209.1±8.05µs        ? ?/sec
filter fsb with value length 50 low selectivity (kept 1/1024)                 1.01      3.2±0.00µs        ? ?/sec    1.00      3.1±0.01µs        ? ?/sec
filter i32 (kept 1/2)                                                         1.00     92.3±0.15µs        ? ?/sec    1.01     93.6±0.11µs        ? ?/sec
filter i32 high selectivity (kept 1023/1024)                                  1.02      8.9±0.42µs        ? ?/sec    1.00      8.7±0.43µs        ? ?/sec
filter i32 low selectivity (kept 1/1024)                                      1.00      3.1±0.00µs        ? ?/sec    1.00      3.1±0.01µs        ? ?/sec
filter optimize (kept 1/2)                                                    1.00     84.5±0.17µs        ? ?/sec    1.00     84.7±0.43µs        ? ?/sec
filter optimize high selectivity (kept 1023/1024)                             1.00      2.7±0.01µs        ? ?/sec    1.05      2.8±0.01µs        ? ?/sec
filter optimize low selectivity (kept 1/1024)                                 1.00      2.8±0.01µs        ? ?/sec    1.01      2.8±0.00µs        ? ?/sec
filter run array (kept 1/2)                                                   1.00    361.0±1.26µs        ? ?/sec    1.00    360.6±0.87µs        ? ?/sec
filter run array high selectivity (kept 1023/1024)                            1.01    310.5±1.61µs        ? ?/sec    1.00    308.7±1.06µs        ? ?/sec
filter run array low selectivity (kept 1/1024)                                1.00    246.8±0.89µs        ? ?/sec    1.00    247.1±0.89µs        ? ?/sec
filter single record batch                                                    1.02     94.8±0.23µs        ? ?/sec    1.00     93.1±0.19µs        ? ?/sec
filter u8 (kept 1/2)                                                          1.00     92.2±0.10µs        ? ?/sec    1.01     93.6±0.14µs        ? ?/sec
filter u8 high selectivity (kept 1023/1024)                                   1.00      3.8±0.01µs        ? ?/sec    1.02      3.8±0.01µs        ? ?/sec
filter u8 low selectivity (kept 1/1024)                                       1.00      3.0±0.00µs        ? ?/sec    1.00      3.0±0.01µs        ? ?/sec

alamb · 2025-06-08T13:40:18Z

Love it -- looks great. Thank you @Dandandan

alamb

Thank you @Dandandan -- this looks great to me

I am also working on writing some tests to add additional coverage. I hope to have that done later this morning

alamb · 2025-06-08T11:21:14Z

arrow-select/src/coalesce.rs

+                if ideal_buffer_size == 0 {
+                    // If the ideal buffer size is 0, all views are inlined
+                    // so just reuse the views
+                    return Arc::new(unsafe {


I ran codecov and this is the only path that is not covered

cargo llvm-cov --html -p arrow-select

I think I can cover it with a sliced record batch. I will write some tests as a follow on PR to cover it
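For context, here is a minimal sketch of why an ideal buffer size of 0 means every view is inlined. It assumes the Arrow view layout: a 16-byte view whose low 32 bits hold the length, with strings of at most 12 bytes stored inside the view itself. The helpers `make_view` and `view_len` and the constant name are hypothetical, introduced only for illustration. This is not the code in `coalesce.rs`, only the idea behind the check.

```rust
/// Assumed inline limit: views of at most 12 bytes carry their data inline.
const MAX_INLINE_VIEW_LEN: u32 = 12;

/// Length of the value described by a view (the low 32 bits of the u128).
fn view_len(view: u128) -> u32 {
    view as u32
}

/// Hypothetical helper: builds a view for `data`. For long values it records a
/// 4-byte prefix plus the (buffer_index, offset) where the bytes would live.
fn make_view(data: &[u8], buffer_index: u32, offset: u32) -> u128 {
    let len = data.len() as u32;
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&len.to_le_bytes());
    if len <= MAX_INLINE_VIEW_LEN {
        // Short value: the bytes are stored in the view, no data buffer needed.
        bytes[4..4 + data.len()].copy_from_slice(data);
    } else {
        // Long value: prefix, buffer index, and offset into that buffer.
        bytes[4..8].copy_from_slice(&data[..4]);
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}

/// Total bytes that must live in data buffers: only non-inline views count.
fn ideal_buffer_size(views: &[u128]) -> usize {
    views
        .iter()
        .map(|v| view_len(*v))
        .filter(|len| *len > MAX_INLINE_VIEW_LEN)
        .map(|len| len as usize)
        .sum()
}
```

Under these assumptions, an array can still own data buffers while a slice of it contains only short values. In that case the sum is 0 and the views can be reused without copying any buffer data, which appears to be the situation a sliced record batch would exercise.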

alamb · 2025-06-08T13:46:51Z

I also made a follow on PR for adding some additional coverage here:

Improve coalesce kernel tests #7626

@Dandandan

# Which issue does this PR close?

- Follow on to #7625 from @Dandandan

# Rationale for this change

I want to eventually remove `gc_string_view`, but currently the unit tests are written in terms of that function.

# What changes are included in this PR?

Rewrite the tests to be in terms of `coalesce` instead. Also:

1. Add additional coverage for the issue we saw in #7623
2. Add coverage for the case where there are data buffers in the view, but they are not referenced by any view (#7625 (comment))

Codecov of this module is now 100%.

# Are there any user-facing changes?

If there are user-facing changes then we may require documentation to be updated before approving the PR. If there are any breaking changes to public APIs, please call them out.

Revert "Revert "Improve coalesce and concat performance for views (…

e77928b

…apache#7614)" (apache#7623)" This reverts commit da461c8.

Dandandan marked this pull request as draft June 7, 2025 17:07

github-actions bot added the arrow Changes to the arrow crate label Jun 7, 2025

Add test / fix

092914f

Dandandan marked this pull request as ready for review June 7, 2025 17:22

Dandandan requested a review from alamb June 7, 2025 17:34

alamb merged commit 52d8d56 into apache:main Jun 8, 2025
29 checks passed

alamb approved these changes Jun 8, 2025

alamb mentioned this pull request Jun 8, 2025

Improve coalesce kernel tests #7626

Merged

alamb mentioned this pull request Jun 16, 2025

Optimize coalesce kernel for StringView (10-50% faster) #7650

Merged
