@alamb alamb commented Aug 15, 2025

Which issue does this PR close?

I am also working on a blog post about this

TODOs

Rationale for this change

A new ParquetPushDecoder was implemented here

I need to refactor the async and sync readers to use the new push decoder in order to:

  1. Avoid the xkcd standards effect (aka there are now three control loops)
  2. Prove that the push decoder works (by passing all the tests of the other two)
  3. Set the stage for improving filter pushdown more with a single control loop

What changes are included in this PR?

  1. Refactor the ParquetRecordBatchStream to use ParquetPushDecoder

Are these changes tested?

Yes, by the existing CI tests

I also ran several benchmarks, in both arrow-rs and DataFusion, and I do not see any substantial performance difference (as expected):

Are there any user-facing changes?

No

@github-actions github-actions bot added the "parquet" label (Changes to the parquet crate) Aug 15, 2025

alamb commented Aug 19, 2025

From my perspective, the goal of "show we can use the push decoder to rewrite the async decoder" is now complete, and I will pick this PR up again once we get the push decoder merged.

alamb added a commit that referenced this pull request Oct 29, 2025
# Which issue does this PR close?

- Part of #8000

- closes  #7983


# Rationale for this change

This PR is the first part of separating IO and decode operations in the
Rust Parquet decoder.

Decoupling IO and CPU enables several important use cases:
1. Different IO patterns (e.g. not buffering the entire row group at once)
2. Different IO APIs (e.g. io_uring, OpenDAL, etc.)
3. Deliberate prefetching within a file
4. Avoiding code duplication between the `ParquetRecordBatchStreamBuilder`
and `ParquetRecordBatchReaderBuilder`


# What changes are included in this PR?

1. Add new `ParquetDecoderBuilder` and `ParquetDecoder`, plus tests

It is effectively an explicit version of the state machine that is used
in the existing async reader (where the state machine is encoded as Rust
`async` / `await` structures)
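
To illustrate what "explicit state machine" means here, below is a minimal sketch of a push-style decoder. It is not the code added by this PR: `ToyPushDecoder`, `Step`, and `decode_all` are hypothetical names invented for the example, and "decoding" is simply handing back the bytes of each row group. The point is only that the decoder never performs IO itself; it reports which byte range it needs and the caller decides how to fetch it.

```rust
use std::collections::VecDeque;
use std::ops::Range;

/// Hypothetical result type for this sketch (not the parquet crate's API).
#[derive(Debug, PartialEq)]
enum Step {
    /// The decoder cannot make progress until the caller supplies this byte range.
    NeedsData(Range<u64>),
    /// One decoded "batch" (here: just the raw bytes of a row group).
    Data(Vec<u8>),
    /// Every row group has been decoded.
    Finished,
}

/// Toy decoder: each "row group" is one byte range of the file.
struct ToyPushDecoder {
    remaining: VecDeque<Range<u64>>,
    buffered: Option<Vec<u8>>,
}

impl ToyPushDecoder {
    fn new(row_groups: Vec<Range<u64>>) -> Self {
        Self { remaining: row_groups.into(), buffered: None }
    }

    /// The caller performed the IO (by whatever means) and pushes the bytes in.
    fn push(&mut self, bytes: Vec<u8>) {
        self.buffered = Some(bytes);
    }

    /// Pure CPU step: either emit data, ask for more bytes, or finish.
    fn try_decode(&mut self) -> Step {
        if let Some(bytes) = self.buffered.take() {
            self.remaining.pop_front();
            return Step::Data(bytes);
        }
        match self.remaining.front() {
            Some(range) => Step::NeedsData(range.clone()),
            None => Step::Finished,
        }
    }
}

/// A synchronous driver. An async driver would have the same shape, but would
/// `await` a fetch in the `NeedsData` arm instead of slicing an in-memory buffer.
fn decode_all(file: &[u8], row_groups: Vec<Range<u64>>) -> Vec<Vec<u8>> {
    let mut decoder = ToyPushDecoder::new(row_groups);
    let mut batches = Vec::new();
    loop {
        match decoder.try_decode() {
            Step::NeedsData(r) => decoder.push(file[r.start as usize..r.end as usize].to_vec()),
            Step::Data(batch) => batches.push(batch),
            Step::Finished => return batches,
        }
    }
}
```

For example, `decode_all(b"aaabbbbcc", vec![0..3, 3..7, 7..9])` returns the three slices in order. Because the control loop lives in the driver rather than the decoder, the same decoder can in principle sit behind both a sync and an async reader.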



# Are these changes tested?
Yes -- there are extensive tests for the new code

Note that this PR actually adds a **3rd** path for control flow (when I
claim this will remove duplication!). In follow-on PRs I will convert the
existing readers to use this new pattern, similar to the sequence I
did for the metadata decoder:
- #8080
- #8340

Here is a preview of a PR that consolidates the async reader to use the
push decoder internally (and removes one duplicate):
- #8159

- closes #8022

# Are there any user-facing changes?

Yes, a new API, but no changes to the existing APIs

---------

Co-authored-by: Matthijs Brobbel <[email protected]>
Co-authored-by: Adrian Garcia Badaracco <[email protected]>
@alamb alamb force-pushed the alamb/test_decode_with_async_reader branch from 314dc5c to f0f79dc Compare October 29, 2025 13:01
@alamb alamb changed the title from "WIP: Rewrite ParquetRecordBatchStream in terms of the PushDecoder" to "Rewrite ParquetRecordBatchStream in terms of the PushDecoder" Oct 29, 2025
@alamb alamb force-pushed the alamb/test_decode_with_async_reader branch from 88fe84b to 7fe9fa6 Compare October 29, 2025 16:40
}

fn compute_cache_projection_inner(&self, filter: &RowFilter) -> Option<ProjectionMask> {
    // Do not compute the projection mask if the predicate cache is disabled

this is the fix from the following PR applied to the push decoder (now that the paths are unified)

@alamb alamb force-pushed the alamb/test_decode_with_async_reader branch 2 times, most recently from 3b7837d to 776ee65 Compare October 30, 2025 14:27
) -> ReadResult<T> {
    // TODO: calling build_array multiple times is wasteful

    let meta = self.metadata.row_group(row_group_idx);
@alamb alamb Oct 30, 2025


The stream reader has the same logic / algorithm, but it now uses the copy in the push decoder (which is based on this code) instead of this one.

Ok(decode_result)
}

/// Attempt to return the next [`ParquetRecordBatchReader`] or return what data is needed

This is a new API on the ParquetPushDecoder that is needed to implement the existing next_row_group API: https://docs.rs/parquet/latest/parquet/arrow/async_reader/struct.ParquetRecordBatchStream.html#method.next_row_group

/// buffering.
///
/// [`Stream`]: https://docs.rs/futures/latest/futures/stream/trait.Stream.html
pub struct ParquetRecordBatchStream<T> {

I am quite pleased with this -- ParquetRecordBatchStreamBuilder is now clearly separated into the IO handling piece (request_state) and the decoding piece (decoder)

let request_state = std::mem::replace(&mut self.request_state, RequestState::Done);
match request_state {
    // No outstanding requests, proceed to setup next row group
    RequestState::None { input } => {
@alamb alamb Oct 30, 2025


This is now the core state machine of ParquetRecordBatchStream, and I am pleased that it represents what is going on in a straightforward way: it alternates between decoding and I/O.
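
For readers unfamiliar with the `std::mem::replace` idiom in the snippet above, here is a small, simplified sketch of the pattern. The types are stand-ins rather than the PR's code: `ToyStream` is hypothetical, `RequestState` here carries a `String` instead of a real reader, and the in-flight future is replaced by a plain `Outstanding` variant so the example needs no async runtime.

```rust
use std::mem;

/// Simplified stand-in for the stream's IO state (assumption: the real enum
/// holds an async reader and an in-flight future rather than a String).
#[derive(Debug, PartialEq)]
enum RequestState {
    /// No IO in flight; the state owns the input.
    None { input: String },
    /// A request has been issued and now owns the input.
    Outstanding { input: String },
    /// Nothing left to do (also used as the temporary placeholder).
    Done,
}

struct ToyStream {
    request_state: RequestState,
    /// Number of chunks the (imaginary) decoder still needs.
    chunks_left: usize,
}

impl ToyStream {
    /// Perform one transition and return a label describing it.
    fn step(&mut self) -> &'static str {
        // Take the current state by value, leaving `Done` behind, so that the
        // owned `input` can be moved into the next state without cloning.
        let state = mem::replace(&mut self.request_state, RequestState::Done);
        match state {
            // Decode side: the decoder needs more data, so issue a request.
            RequestState::None { input } if self.chunks_left > 0 => {
                self.request_state = RequestState::Outstanding { input };
                "io requested"
            }
            // Decode side: nothing more is needed; the state stays `Done`.
            RequestState::None { .. } | RequestState::Done => "finished",
            // IO side: the request completed, hand control back to decoding.
            RequestState::Outstanding { input } => {
                self.chunks_left -= 1;
                self.request_state = RequestState::None { input };
                "data arrived"
            }
        }
    }
}
```

Starting from `RequestState::None` with `chunks_left: 1`, successive calls to `step()` return `"io requested"`, `"data arrived"`, and then `"finished"`, which mirrors the decode / I/O alternation described in the comment.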


#[tokio::test]
#[allow(deprecated)]
async fn test_in_memory_row_group_sparse() {

This test was introduced in the following PR by @thinkharderdev

I believe it is meant to verify that the PageIndex is used to prune IO.

The reasons I propose deleting this test are:

  1. IO pruning based on the PageIndex is covered in the newer IO tests, for example:
    // Expect to see only data IO for one page for each column for each row group
  2. This test is written in terms of non-public APIs (the ReaderFactory and InMemoryRowGroup), which don't reflect the requests that are actually made (for example, the ranges for each column's pages are coalesced)
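
As a rough illustration of what "coalesced" means in point 2, the sketch below merges overlapping or nearby byte ranges into fewer, larger requests. It is not the logic used by the parquet crate or by any particular object store client: `coalesce_ranges` and its `max_gap` parameter are hypothetical and exist only for this example.

```rust
use std::ops::Range;

/// Merge byte ranges that overlap or lie within `max_gap` bytes of each other.
/// Hypothetical helper for illustration only.
fn coalesce_ranges(mut ranges: Vec<Range<u64>>, max_gap: u64) -> Vec<Range<u64>> {
    // Sort by start offset so that mergeable ranges become neighbours.
    ranges.sort_by_key(|r| r.start);
    let mut out: Vec<Range<u64>> = Vec::new();
    for r in ranges {
        if let Some(last) = out.last_mut() {
            // Close enough to the previous range: extend it instead of
            // issuing a separate request.
            if r.start <= last.end.saturating_add(max_gap) {
                last.end = last.end.max(r.end);
                continue;
            }
        }
        out.push(r);
    }
    out
}
```

With `max_gap = 0`, the page ranges `0..5` and `5..8` collapse into a single request `0..8`, so a test that asserts on individual page ranges would not match the IO that is actually issued.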


alamb commented Oct 30, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_decode_with_async_reader (776ee65) to 2eabb59 diff
BENCH_NAME=arrow_reader
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_test_decode_with_async_reader
Results will be posted here when complete


alamb commented Oct 30, 2025

🤖: Benchmark completed

Details

group                                                                                                      alamb_test_decode_with_async_reader    main
-----                                                                                                      -----------------------------------    ----
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                           1.01   1275.1±6.90µs        ? ?/sec    1.00   1263.0±5.38µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                          1.00   1284.8±3.38µs        ? ?/sec    1.06   1360.3±4.57µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                            1.01   1281.6±4.43µs        ? ?/sec    1.00  1270.9±15.87µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, mandatory, no NULLs                                     1.02    502.9±2.84µs        ? ?/sec    1.00    495.1±4.97µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, half NULLs                                    1.00    660.5±1.99µs        ? ?/sec    1.02    671.2±1.66µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, no NULLs                                      1.03    514.0±5.17µs        ? ?/sec    1.00    499.6±2.55µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, mandatory, no NULLs                                          1.05   596.6±18.90µs        ? ?/sec    1.00    566.0±9.59µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, half NULLs                                         1.00    728.4±4.34µs        ? ?/sec    1.02    742.3±9.53µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, no NULLs                                           1.03   599.7±18.70µs        ? ?/sec    1.00   582.3±18.12µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, mandatory, no NULLs                                 1.10    251.0±2.74µs        ? ?/sec    1.00    229.2±2.33µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, half NULLs                                1.03    257.6±1.03µs        ? ?/sec    1.00    250.0±0.65µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, no NULLs                                  1.06    258.9±4.13µs        ? ?/sec    1.00    243.4±2.87µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs                                      1.03    374.3±6.05µs        ? ?/sec    1.00    364.9±7.92µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs, short string                        1.00    345.9±0.81µs        ? ?/sec    1.01    348.0±2.12µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, half NULLs                                     1.01    330.3±2.00µs        ? ?/sec    1.00    328.3±2.53µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, no NULLs                                       1.02    379.3±2.42µs        ? ?/sec    1.00    372.7±2.66µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs     1.00   1086.4±7.87µs        ? ?/sec    1.03   1123.2±5.54µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, half NULLs    1.00    939.7±6.43µs        ? ?/sec    1.03    964.6±5.20µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, no NULLs      1.00   1097.2±8.61µs        ? ?/sec    1.03  1131.5±25.20µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                 1.00    405.1±1.91µs        ? ?/sec    1.10    446.6±3.50µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                1.00    587.3±3.99µs        ? ?/sec    1.08    632.0±4.05µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                  1.00    417.4±4.34µs        ? ?/sec    1.10    460.1±4.97µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs        1.00    161.0±0.34µs        ? ?/sec    1.26    203.4±0.84µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs       1.00    299.9±0.80µs        ? ?/sec    1.14    343.1±1.32µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, no NULLs         1.00    166.7±0.61µs        ? ?/sec    1.25    209.0±0.65µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs                    1.00     77.6±0.51µs        ? ?/sec    1.53    118.7±0.42µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs                   1.00    259.4±0.86µs        ? ?/sec    1.16    299.8±0.93µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs                     1.00     83.9±1.18µs        ? ?/sec    1.47    123.4±1.15µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, mandatory, no NULLs                    1.00    741.8±4.22µs        ? ?/sec    1.00    740.6±3.95µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, half NULLs                   1.00    585.0±7.34µs        ? ?/sec    1.00    584.4±3.18µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, no NULLs                     1.00    747.4±4.27µs        ? ?/sec    1.00    749.7±9.45µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, mandatory, no NULLs                                1.00     63.0±5.71µs        ? ?/sec    1.11     69.9±5.93µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, half NULLs                               1.00    251.6±1.67µs        ? ?/sec    1.00    252.0±2.10µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, no NULLs                                 1.00     70.3±5.53µs        ? ?/sec    1.02     71.9±4.85µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, mandatory, no NULLs                     1.01     94.9±0.58µs        ? ?/sec    1.00     94.4±0.24µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, half NULLs                    1.00    235.5±0.86µs        ? ?/sec    1.00    235.1±0.74µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, no NULLs                      1.00    100.3±0.33µs        ? ?/sec    1.00    100.0±0.28µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, mandatory, no NULLs                                 1.02      9.7±0.27µs        ? ?/sec    1.00      9.6±0.27µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, half NULLs                                1.01    193.7±0.95µs        ? ?/sec    1.00    192.3±0.55µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, no NULLs                                  1.00     14.9±0.29µs        ? ?/sec    1.00     14.9±0.16µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, mandatory, no NULLs                     1.00    185.0±2.42µs        ? ?/sec    1.00    184.2±0.46µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, half NULLs                    1.00    346.3±1.15µs        ? ?/sec    1.00    347.3±1.19µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, no NULLs                      1.01    191.6±1.10µs        ? ?/sec    1.00    190.4±0.64µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, mandatory, no NULLs                                 1.00     14.6±0.16µs        ? ?/sec    1.00     14.6±0.34µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, half NULLs                                1.00    262.4±2.88µs        ? ?/sec    1.00    261.5±1.02µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, no NULLs                                  1.00     20.3±0.19µs        ? ?/sec    1.03     20.8±0.47µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, mandatory, no NULLs                     1.00    367.6±1.39µs        ? ?/sec    1.00    367.0±1.15µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, half NULLs                    1.00    384.7±1.35µs        ? ?/sec    1.00    383.5±1.71µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, no NULLs                      1.00    374.3±1.21µs        ? ?/sec    1.00    373.6±1.52µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, mandatory, no NULLs                                 1.00     28.1±0.26µs        ? ?/sec    1.07     30.2±1.10µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, half NULLs                                1.00    220.4±0.80µs        ? ?/sec    1.00    221.2±1.36µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, no NULLs                                  1.00     35.5±0.57µs        ? ?/sec    1.05     37.4±1.02µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    124.6±0.85µs        ? ?/sec    1.00    124.3±0.34µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, half NULLs                          1.01    139.6±0.76µs        ? ?/sec    1.00    138.3±0.41µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    127.8±0.80µs        ? ?/sec    1.00    127.7±0.25µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    178.7±0.68µs        ? ?/sec    1.00    179.4±0.73µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, half NULLs                               1.00    234.8±0.72µs        ? ?/sec    1.00    233.9±0.76µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, no NULLs                                 1.00    184.3±1.64µs        ? ?/sec    1.00    185.1±1.33µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.01     78.3±0.45µs        ? ?/sec    1.00     77.2±0.28µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.01    180.9±0.76µs        ? ?/sec    1.00    178.3±0.80µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.03     83.7±0.35µs        ? ?/sec    1.00     81.1±0.56µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.01    136.8±0.58µs        ? ?/sec    1.00    135.4±1.28µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, half NULLs                          1.02    215.5±0.98µs        ? ?/sec    1.00    211.4±1.21µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, no NULLs                            1.02    143.3±0.68µs        ? ?/sec    1.00    141.2±0.85µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, mandatory, no NULLs                                1.01     74.8±0.41µs        ? ?/sec    1.00     73.7±0.41µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, half NULLs                               1.01    178.5±0.76µs        ? ?/sec    1.00    176.1±0.74µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, no NULLs                                 1.00     77.1±0.38µs        ? ?/sec    1.01     77.8±0.36µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    111.2±0.36µs        ? ?/sec    1.00    111.8±0.35µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, half NULLs                          1.03    122.8±0.51µs        ? ?/sec    1.00    119.2±0.98µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, no NULLs                            1.01    114.9±0.40µs        ? ?/sec    1.00    114.0±0.46µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    167.4±0.72µs        ? ?/sec    1.00    167.4±0.67µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, half NULLs                               1.00    210.0±0.99µs        ? ?/sec    1.00    209.1±1.41µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, no NULLs                                 1.00    173.0±1.39µs        ? ?/sec    1.00    172.4±0.41µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.00    201.2±0.55µs        ? ?/sec    1.01    203.8±1.18µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.01    225.3±1.56µs        ? ?/sec    1.00    223.2±0.53µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.00    207.8±0.78µs        ? ?/sec    1.01    209.9±0.87µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.01    144.9±2.07µs        ? ?/sec    1.00    143.6±0.46µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, half NULLs                          1.00    192.9±0.82µs        ? ?/sec    1.00    192.5±2.81µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, no NULLs                            1.00    148.7±0.59µs        ? ?/sec    1.01    149.8±0.45µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, mandatory, no NULLs                                1.04    110.6±1.54µs        ? ?/sec    1.00    106.8±1.40µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, half NULLs                               1.04    176.7±1.18µs        ? ?/sec    1.00    170.1±0.74µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, no NULLs                                 1.07    122.0±1.22µs        ? ?/sec    1.00    113.7±1.26µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, mandatory, no NULLs                                      1.00    100.5±0.42µs        ? ?/sec    1.01    101.4±0.97µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, half NULLs                                     1.00    116.4±0.32µs        ? ?/sec    1.01    117.4±0.32µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, no NULLs                                       1.00    103.1±0.39µs        ? ?/sec    1.02    104.9±0.29µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, mandatory, no NULLs                                           1.00    137.4±0.96µs        ? ?/sec    1.03    140.8±0.94µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, half NULLs                                          1.00    193.0±1.51µs        ? ?/sec    1.01    194.7±0.58µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, no NULLs                                            1.00    141.4±0.39µs        ? ?/sec    1.03    145.5±0.36µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, mandatory, no NULLs                               1.01     44.3±0.15µs        ? ?/sec    1.00     44.1±0.22µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, half NULLs                              1.00    143.8±0.57µs        ? ?/sec    1.00    143.4±0.84µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, no NULLs                                1.00     48.8±0.28µs        ? ?/sec    1.00     48.6±0.29µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, mandatory, no NULLs                                      1.02    103.5±0.41µs        ? ?/sec    1.00    101.8±0.27µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, half NULLs                                     1.00    176.3±0.67µs        ? ?/sec    1.01    177.3±1.79µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, no NULLs                                       1.00    107.8±0.41µs        ? ?/sec    1.00    107.3±1.08µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, mandatory, no NULLs                                           1.00     38.1±0.17µs        ? ?/sec    1.00     38.3±0.18µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, half NULLs                                          1.00    140.7±0.54µs        ? ?/sec    1.00    140.7±0.43µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, no NULLs                                            1.00     43.2±0.30µs        ? ?/sec    1.00     43.1±0.34µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, mandatory, no NULLs                                      1.00     96.8±0.29µs        ? ?/sec    1.02     98.4±0.29µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, half NULLs                                     1.00    107.9±0.36µs        ? ?/sec    1.03    111.2±0.29µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, no NULLs                                       1.00     99.1±0.32µs        ? ?/sec    1.02    100.9±0.35µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, mandatory, no NULLs                                           1.00    126.6±0.31µs        ? ?/sec    1.02    128.9±0.33µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, half NULLs                                          1.00    174.1±0.45µs        ? ?/sec    1.04    180.8±0.61µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, no NULLs                                            1.00    131.0±0.46µs        ? ?/sec    1.02    133.8±0.43µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, mandatory, no NULLs                               1.00     26.3±0.20µs        ? ?/sec    1.00     26.4±0.35µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, half NULLs                              1.00    124.6±0.69µs        ? ?/sec    1.01    125.4±0.63µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, no NULLs                                1.00     30.7±0.43µs        ? ?/sec    1.00     30.7±0.30µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, mandatory, no NULLs                                      1.00     82.9±0.54µs        ? ?/sec    1.03     85.0±1.15µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, half NULLs                                     1.00    156.8±0.54µs        ? ?/sec    1.01    158.1±0.47µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, no NULLs                                       1.00     89.5±8.04µs        ? ?/sec    1.00     89.7±0.31µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, mandatory, no NULLs                                           1.00     18.0±0.21µs        ? ?/sec    1.00     17.9±0.34µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, half NULLs                                          1.00    116.3±1.42µs        ? ?/sec    1.05    122.5±0.64µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, no NULLs                                            1.00     24.6±0.31µs        ? ?/sec    1.03     25.4±0.44µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, mandatory, no NULLs                                      1.00     83.1±0.76µs        ? ?/sec    1.00     83.2±0.25µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, half NULLs                                     1.00     90.6±1.02µs        ? ?/sec    1.00     90.2±0.30µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, no NULLs                                       1.00     85.5±0.28µs        ? ?/sec    1.00     85.9±0.39µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, mandatory, no NULLs                                           1.00    112.9±1.59µs        ? ?/sec    1.01    114.3±0.41µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, half NULLs                                          1.00    144.9±0.55µs        ? ?/sec    1.01    147.0±3.00µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, no NULLs                                            1.00    116.3±0.66µs        ? ?/sec    1.01    117.4±0.58µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, mandatory, no NULLs                               1.00    150.1±0.80µs        ? ?/sec    1.00    150.2±0.56µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, half NULLs                              1.00    168.1±0.80µs        ? ?/sec    1.00    167.6±0.45µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, no NULLs                                1.00    155.2±0.50µs        ? ?/sec    1.00    155.7±0.53µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, mandatory, no NULLs                                      1.00     91.0±0.60µs        ? ?/sec    1.00     90.7±0.49µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, half NULLs                                     1.00    135.9±0.63µs        ? ?/sec    1.00    136.1±0.78µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, no NULLs                                       1.00     96.1±0.65µs        ? ?/sec    1.00     95.9±0.79µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, mandatory, no NULLs                                           1.01     44.2±2.44µs        ? ?/sec    1.00     43.8±1.58µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, half NULLs                                          1.02    112.4±1.22µs        ? ?/sec    1.00    110.7±0.65µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, no NULLs                                            1.00     51.7±1.66µs        ? ?/sec    1.02     52.8±1.79µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, mandatory, no NULLs                                       1.00     96.9±0.28µs        ? ?/sec    1.04    100.6±0.27µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, half NULLs                                      1.00    111.5±0.41µs        ? ?/sec    1.02    114.2±0.49µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, no NULLs                                        1.00     99.7±1.25µs        ? ?/sec    1.03    103.0±0.35µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, mandatory, no NULLs                                            1.00    130.4±0.40µs        ? ?/sec    1.02    133.3±1.20µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, half NULLs                                           1.00    184.1±0.44µs        ? ?/sec    1.01    186.3±0.82µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, no NULLs                                             1.00    135.7±0.40µs        ? ?/sec    1.02    138.3±1.29µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, mandatory, no NULLs                                1.01     36.4±0.17µs        ? ?/sec    1.00     36.2±0.13µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, half NULLs                               1.00    134.5±0.30µs        ? ?/sec    1.01    135.6±0.39µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, no NULLs                                 1.00     40.5±0.28µs        ? ?/sec    1.01     40.8±0.37µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, mandatory, no NULLs                                       1.01     95.3±0.29µs        ? ?/sec    1.00     94.4±0.28µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, half NULLs                                      1.00    167.7±0.44µs        ? ?/sec    1.01    168.5±1.89µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, no NULLs                                        1.01    100.6±0.28µs        ? ?/sec    1.00     99.5±0.35µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, mandatory, no NULLs                                            1.00     30.5±0.11µs        ? ?/sec    1.00     30.4±0.33µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, half NULLs                                           1.00    131.9±0.45µs        ? ?/sec    1.01    132.8±1.54µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, no NULLs                                             1.01     35.3±0.26µs        ? ?/sec    1.00     35.1±0.11µs        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings half NULLs                                     1.02      7.3±0.12ms        ? ?/sec    1.00      7.2±0.12ms        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings no NULLs                                       1.07     14.1±1.21ms        ? ?/sec    1.00     13.2±0.68ms        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, mandatory, no NULLs                                     1.05    518.6±3.60µs        ? ?/sec    1.00    491.7±2.86µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, half NULLs                                    1.00    660.3±2.46µs        ? ?/sec    1.01    670.2±1.67µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, no NULLs                                      1.04    517.5±9.53µs        ? ?/sec    1.00    499.3±3.62µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, mandatory, no NULLs                                          1.08   741.0±12.05µs        ? ?/sec    1.00   686.1±18.62µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, half NULLs                                         1.02    805.0±7.98µs        ? ?/sec    1.00    785.6±3.64µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, no NULLs                                           1.08    744.5±6.33µs        ? ?/sec    1.00    690.6±8.71µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, mandatory, no NULLs                                1.00    303.3±0.99µs        ? ?/sec    1.01    306.4±4.16µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, half NULLs                               1.09    392.3±3.26µs        ? ?/sec    1.00    358.6±4.71µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, no NULLs                                 1.00    308.0±1.35µs        ? ?/sec    1.02    313.5±3.79µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, mandatory, no NULLs                                 1.05    252.6±2.31µs        ? ?/sec    1.00    240.1±2.38µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, half NULLs                                1.06    258.1±1.31µs        ? ?/sec    1.00    242.9±0.64µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, no NULLs                                  1.11    259.7±2.51µs        ? ?/sec    1.00    233.6±2.42µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, mandatory, no NULLs                                      1.02    466.7±5.49µs        ? ?/sec    1.00    457.0±5.33µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, half NULLs                                     1.01    374.5±1.91µs        ? ?/sec    1.00    369.9±5.35µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, no NULLs                                       1.02    476.2±4.13µs        ? ?/sec    1.00    467.8±4.56µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, mandatory, no NULLs                                     1.00    109.8±0.17µs        ? ?/sec    1.05    115.5±0.27µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, half NULLs                                    1.00    122.2±0.68µs        ? ?/sec    1.01    123.3±0.39µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, no NULLs                                      1.00    112.0±0.24µs        ? ?/sec    1.06    118.3±0.32µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, mandatory, no NULLs                                          1.00    149.2±0.35µs        ? ?/sec    1.06    157.5±0.34µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, half NULLs                                         1.00    200.7±0.68µs        ? ?/sec    1.01    203.7±0.65µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, no NULLs                                           1.00    154.2±0.65µs        ? ?/sec    1.06    162.9±0.56µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, mandatory, no NULLs                              1.01     44.5±0.17µs        ? ?/sec    1.00     44.0±0.27µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, half NULLs                             1.02    144.7±0.69µs        ? ?/sec    1.00    141.5±0.72µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, no NULLs                               1.00     48.6±0.15µs        ? ?/sec    1.00     48.5±0.54µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, mandatory, no NULLs                                     1.01    103.1±0.22µs        ? ?/sec    1.00    102.1±0.42µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, half NULLs                                    1.00    177.0±0.68µs        ? ?/sec    1.00    176.5±0.78µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, no NULLs                                      1.01    108.3±0.57µs        ? ?/sec    1.00    107.3±0.33µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, mandatory, no NULLs                                          1.00     38.3±0.15µs        ? ?/sec    1.00     38.3±0.19µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, half NULLs                                         1.00    140.6±0.44µs        ? ?/sec    1.00    140.2±0.31µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, no NULLs                                           1.00     42.9±0.20µs        ? ?/sec    1.01     43.2±0.15µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, mandatory, no NULLs                                     1.00     98.9±0.45µs        ? ?/sec    1.01     99.5±0.36µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, half NULLs                                    1.00    111.6±0.32µs        ? ?/sec    1.00    111.0±0.27µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, no NULLs                                      1.00    101.1±0.42µs        ? ?/sec    1.00    101.4±0.26µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, mandatory, no NULLs                                          1.00    129.3±0.47µs        ? ?/sec    1.00    129.6±2.06µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, half NULLs                                         1.00    180.5±1.15µs        ? ?/sec    1.00    181.4±1.72µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, no NULLs                                           1.00    134.0±1.40µs        ? ?/sec    1.00    134.2±1.21µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, mandatory, no NULLs                              1.00     26.3±0.16µs        ? ?/sec    1.01     26.5±0.23µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, half NULLs                             1.02    125.8±0.73µs        ? ?/sec    1.00    123.8±0.57µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, no NULLs                               1.00     29.1±0.25µs        ? ?/sec    1.06     31.0±0.23µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, mandatory, no NULLs                                     1.01     86.7±0.46µs        ? ?/sec    1.00     85.5±0.38µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, half NULLs                                    1.00    158.9±0.54µs        ? ?/sec    1.00    158.5±0.60µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, no NULLs                                      1.01     91.2±0.39µs        ? ?/sec    1.00     89.9±0.51µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, mandatory, no NULLs                                          1.02     21.2±0.43µs        ? ?/sec    1.00     20.8±0.24µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, half NULLs                                         1.00    122.7±0.45µs        ? ?/sec    1.01    123.5±1.32µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, no NULLs                                           1.00     25.9±0.40µs        ? ?/sec    1.01     26.1±0.57µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, mandatory, no NULLs                                     1.00     83.5±0.24µs        ? ?/sec    1.00     83.4±0.26µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, half NULLs                                    1.00     90.4±0.39µs        ? ?/sec    1.00     90.5±0.27µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, no NULLs                                      1.00     85.9±0.40µs        ? ?/sec    1.00     86.0±0.38µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, mandatory, no NULLs                                          1.00    111.5±0.46µs        ? ?/sec    1.02    114.1±0.64µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, half NULLs                                         1.00    144.3±0.47µs        ? ?/sec    1.01    145.5±0.36µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, no NULLs                                           1.00    114.9±3.23µs        ? ?/sec    1.01    115.7±0.36µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, mandatory, no NULLs                              1.00    149.6±0.91µs        ? ?/sec    1.00    150.3±0.83µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, half NULLs                             1.00    168.4±0.85µs        ? ?/sec    1.00    169.1±1.29µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, no NULLs                               1.00    154.6±0.49µs        ? ?/sec    1.00    155.0±0.37µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, mandatory, no NULLs                                     1.02     91.8±0.38µs        ? ?/sec    1.00     89.7±0.55µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, half NULLs                                    1.00    135.7±0.63µs        ? ?/sec    1.00    135.7±0.43µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, no NULLs                                      1.02     96.1±0.54µs        ? ?/sec    1.00     94.2±0.34µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, mandatory, no NULLs                                          1.05     46.3±1.76µs        ? ?/sec    1.00     44.1±1.80µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, half NULLs                                         1.00    111.4±0.54µs        ? ?/sec    1.00    111.5±0.45µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, no NULLs                                           1.04     54.1±2.07µs        ? ?/sec    1.00     51.9±2.15µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, mandatory, no NULLs                                      1.01    104.2±0.30µs        ? ?/sec    1.00    103.1±0.27µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, half NULLs                                     1.01    116.3±0.58µs        ? ?/sec    1.00    115.7±0.28µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, no NULLs                                       1.01    106.9±0.33µs        ? ?/sec    1.00    105.8±0.55µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, mandatory, no NULLs                                           1.01    139.6±0.40µs        ? ?/sec    1.00    137.7±0.36µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, half NULLs                                          1.00    190.1±0.37µs        ? ?/sec    1.00    189.9±0.52µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, no NULLs                                            1.01    144.1±0.41µs        ? ?/sec    1.00    142.8±0.45µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, mandatory, no NULLs                               1.07     36.7±0.30µs        ? ?/sec    1.00     34.4±0.09µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, half NULLs                              1.00    133.0±0.29µs        ? ?/sec    1.02    135.3±0.84µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, no NULLs                                1.00     39.1±0.18µs        ? ?/sec    1.04     40.7±0.38µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, mandatory, no NULLs                                      1.01     95.2±0.25µs        ? ?/sec    1.00     94.3±0.21µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, half NULLs                                     1.00    167.8±0.47µs        ? ?/sec    1.00    168.1±0.52µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, no NULLs                                       1.01    100.4±0.23µs        ? ?/sec    1.00     99.1±0.22µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, mandatory, no NULLs                                           1.01     30.5±0.20µs        ? ?/sec    1.00     30.2±0.28µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, half NULLs                                          1.00    132.1±0.52µs        ? ?/sec    1.01    133.7±4.65µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, no NULLs                                            1.01     35.4±0.40µs        ? ?/sec    1.00     34.9±0.38µs        ? ?/sec

@alamb
Contributor Author

alamb commented Oct 30, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_decode_with_async_reader (18038b2) to 2eabb59 diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_test_decode_with_async_reader
Results will be posted here when complete

@alamb
Contributor Author

alamb commented Oct 30, 2025

🤖: Benchmark completed

Details

group                                alamb_test_decode_with_async_reader    main
-----                                -----------------------------------    ----
arrow_reader_clickbench/async/Q1     1.00      2.4±0.04ms        ? ?/sec    1.01      2.4±0.05ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.00     15.3±1.01ms        ? ?/sec    1.00     15.3±0.98ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.01     16.6±0.83ms        ? ?/sec    1.00     16.5±0.96ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.00     29.0±1.01ms        ? ?/sec    1.03     29.8±0.77ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.00     40.4±0.93ms        ? ?/sec    1.02     41.1±1.12ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.00     38.2±0.76ms        ? ?/sec    1.01     38.6±0.54ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.00      6.0±0.35ms        ? ?/sec    1.01      6.1±0.47ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.00    121.5±2.01ms        ? ?/sec    1.02    123.7±1.80ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.00    143.1±3.70ms        ? ?/sec    1.01    143.9±2.07ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.01   304.4±11.53ms        ? ?/sec    1.00    301.2±9.41ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.02   434.6±15.95ms        ? ?/sec    1.00    427.5±8.20ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.02     47.9±1.84ms        ? ?/sec    1.00     47.1±1.60ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.00    105.6±0.71ms        ? ?/sec    1.03    108.7±1.76ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.00    108.3±2.01ms        ? ?/sec    1.02    110.0±2.02ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.00     55.0±1.18ms        ? ?/sec    1.01     55.5±0.96ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.00    128.0±2.59ms        ? ?/sec    1.02    130.6±2.21ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.00    101.7±1.56ms        ? ?/sec    1.02    104.2±2.49ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.00     38.6±0.81ms        ? ?/sec    1.00     38.4±0.49ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.00     50.5±1.94ms        ? ?/sec    1.02     51.6±2.29ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.00     47.3±1.04ms        ? ?/sec    1.00     47.1±1.03ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.01     37.2±1.02ms        ? ?/sec    1.00     36.9±0.98ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.00     13.8±0.30ms        ? ?/sec    1.02     14.0±0.38ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.00      2.1±0.01ms        ? ?/sec    1.01      2.1±0.02ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.01     10.2±0.19ms        ? ?/sec    1.00     10.1±0.18ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.00     12.0±0.31ms        ? ?/sec    1.01     12.0±0.29ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.00     38.9±0.77ms        ? ?/sec    1.04     40.5±1.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.00     50.7±0.96ms        ? ?/sec    1.02     51.7±1.00ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.00     48.5±1.16ms        ? ?/sec    1.01     48.8±0.72ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.00      4.4±0.15ms        ? ?/sec    1.00      4.4±0.17ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.00    180.2±4.66ms        ? ?/sec    1.02    183.1±2.13ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.00    238.7±3.53ms        ? ?/sec    1.03    245.9±6.50ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.00   496.4±11.74ms        ? ?/sec    1.03   513.4±12.22ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.00   449.8±16.26ms        ? ?/sec    1.04   466.9±17.26ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.00     52.2±0.97ms        ? ?/sec    1.04     54.4±2.07ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.00    156.7±3.65ms        ? ?/sec    1.03    161.9±5.50ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.00    157.1±4.84ms        ? ?/sec    1.00    157.4±3.13ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.00     52.6±0.75ms        ? ?/sec    1.01     53.0±0.83ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.00    158.6±3.41ms        ? ?/sec    1.03    162.6±5.14ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.00     92.8±1.98ms        ? ?/sec    1.00     93.1±2.05ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.00     30.1±0.34ms        ? ?/sec    1.01     30.3±0.38ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.00     36.3±1.32ms        ? ?/sec    1.00     36.2±1.16ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.00     45.3±1.59ms        ? ?/sec    1.00     45.5±2.02ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.02     34.7±1.30ms        ? ?/sec    1.00     34.2±1.06ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.00     12.8±0.20ms        ? ?/sec    1.03     13.1±0.28ms        ? ?/sec
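
For reading these tables: each row shows, per branch, a ratio followed by mean ± standard deviation, and the faster side reads 1.00. A minimal sketch of that arithmetic, assuming the ratio is each mean divided by the smaller of the two means (`relative_ratios` is a made-up helper, not part of the benchmark script):

```rust
/// Returns (branch_ratio, main_ratio): each mean divided by the faster mean.
/// Hypothetical helper for illustration only.
fn relative_ratios(branch_mean: f64, main_mean: f64) -> (f64, f64) {
    let fastest = branch_mean.min(main_mean);
    (branch_mean / fastest, main_mean / fastest)
}

fn main() {
    // sync/Q12 from the table above: 38.9 ms on the branch, 40.5 ms on main.
    let (branch, main_ratio) = relative_ratios(38.9, 40.5);
    println!("{branch:.2} {main_ratio:.2}"); // prints "1.00 1.04"
}
```

Because the printed means are rounded, recomputing a ratio from them can differ from the table in the last digit.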

@alamb
Contributor Author

alamb commented Oct 30, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/test_decode_with_async_reader (18038b2) to 2eabb59 diff
BENCH_NAME=arrow_reader_row_filter
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_row_filter
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_test_decode_with_async_reader
Results will be posted here when complete

@alamb
Contributor Author

alamb commented Oct 30, 2025

🤖: Benchmark completed

Details

group                                                                                alamb_test_decode_with_async_reader    main
-----                                                                                -----------------------------------    ----
arrow_reader_row_filter/float64 <= 99.0/all_columns/async                            1.02  1751.8±42.92µs        ? ?/sec    1.00  1723.3±13.62µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/all_columns/sync                             1.00      2.1±0.06ms        ? ?/sec    1.04      2.1±0.15ms        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/async                  1.01  1614.2±64.28µs        ? ?/sec    1.00  1599.6±30.29µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/sync                   1.02  1713.7±68.14µs        ? ?/sec    1.00  1682.4±36.93µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/async              1.00  1558.3±40.62µs        ? ?/sec    1.03  1603.8±123.68µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/sync               1.00  1931.6±51.92µs        ? ?/sec    1.03  1987.9±82.95µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/async    1.02  1404.8±60.79µs        ? ?/sec    1.00  1380.1±24.55µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/sync     1.00  1499.9±56.99µs        ? ?/sec    1.03  1537.5±83.36µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/async                             1.03  1778.1±94.76µs        ? ?/sec    1.00  1731.6±23.27µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/sync                              1.01      2.1±0.15ms        ? ?/sec    1.00      2.1±0.08ms        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/async                   1.01  1601.1±30.04µs        ? ?/sec    1.00  1584.7±36.91µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/sync                    1.00  1677.1±35.46µs        ? ?/sec    1.00  1670.7±29.88µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/async                              1.00   985.3±29.96µs        ? ?/sec    1.00   981.0±48.23µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/sync                               1.00    997.3±7.15µs        ? ?/sec    1.03  1023.7±34.83µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/async                    1.00   867.7±13.82µs        ? ?/sec    1.02   882.8±10.70µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/sync                     1.01  1007.5±10.26µs        ? ?/sec    1.00    995.5±8.37µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/async                                 1.00      4.1±0.07ms        ? ?/sec    1.02      4.2±0.13ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/sync                                  1.00      4.2±0.08ms        ? ?/sec    1.00      4.1±0.15ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/async                       1.00      3.7±0.08ms        ? ?/sec    1.00      3.6±0.08ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync                        1.00      3.6±0.08ms        ? ?/sec    1.01      3.6±0.13ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/async                                  1.00  1977.7±43.60µs        ? ?/sec    1.05      2.1±0.11ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/sync                                   1.01      2.3±0.10ms        ? ?/sec    1.00      2.2±0.04ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/async                        1.00  1831.8±43.93µs        ? ?/sec    1.02  1871.9±64.94µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/sync                         1.00  1942.4±50.74µs        ? ?/sec    1.03      2.0±0.06ms        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/async                                 1.00  1304.0±53.55µs        ? ?/sec    1.02  1325.0±66.96µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/sync                                  1.00  1445.3±54.41µs        ? ?/sec    1.04  1497.9±112.21µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/async                       1.00  1172.4±28.85µs        ? ?/sec    1.02  1192.2±26.91µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/sync                        1.00  1314.7±50.16µs        ? ?/sec    1.00  1317.3±42.30µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/async                             1.00      4.3±0.06ms        ? ?/sec    1.00      4.2±0.06ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/sync                              1.00      5.0±0.11ms        ? ?/sec    1.01      5.0±0.12ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/async                   1.00      3.6±0.06ms        ? ?/sec    1.02      3.6±0.11ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/sync                    1.01      3.6±0.17ms        ? ?/sec    1.00      3.5±0.09ms        ? ?/sec

impl ParquetDecoderState {
/// If actively reading a RowGroup, return the currently active
/// ParquetRecordBatchReader and advance to the next group.
fn try_next_reader(
Contributor Author

This is a newly added "batched" API that makes it possible to obtain the next reader (one that is ready to produce record batches)

Is this so that we can preserve the pub async fn next_row_group(&mut self) -> Result<Option<ParquetRecordBatchReader>> API on the async reader?

Contributor Author

Yes, exactly.

Contributor Author

It actually turns out that is a pretty clever API that I didn't know about -- it lets one interleave IO and CPU more easily:
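
A minimal sketch of that kind of interleaving, using OS threads and made-up stand-in types (`Source`, `RowGroupReader`) rather than the parquet crate's async reader. The assumption is only that each returned reader is fully buffered and can be decoded without further IO:

```rust
use std::sync::mpsc;
use std::thread;

/// Stand-in for a fully buffered row group; decoding it needs no more IO.
struct RowGroupReader {
    bytes: Vec<u8>,
}

impl RowGroupReader {
    /// CPU-only work: "decode" the buffered bytes (here, just sum them).
    fn decode(self) -> u64 {
        self.bytes.iter().map(|b| *b as u64).sum()
    }
}

/// Stand-in for the IO side: hands out one buffered row group per call.
struct Source {
    next: u8,
    num_row_groups: u8,
}

impl Source {
    fn next_row_group(&mut self) -> Option<RowGroupReader> {
        if self.next == self.num_row_groups {
            return None;
        }
        self.next += 1;
        // Pretend these bytes were just fetched from storage.
        Some(RowGroupReader { bytes: vec![self.next; 4] })
    }
}

fn decode_all(num_row_groups: u8) -> Vec<u64> {
    let (tx, rx) = mpsc::sync_channel::<RowGroupReader>(1);
    // Decoding runs on its own thread, so it overlaps with the next fetch.
    let worker = thread::spawn(move || {
        rx.into_iter().map(RowGroupReader::decode).collect::<Vec<_>>()
    });
    let mut source = Source { next: 0, num_row_groups };
    while let Some(reader) = source.next_row_group() {
        tx.send(reader).expect("decoder thread is still running");
    }
    drop(tx); // close the channel so the worker can finish
    worker.join().expect("decoder thread panicked")
}

fn main() {
    println!("{:?}", decode_all(3)); // prints "[4, 8, 12]"
}
```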

///
/// This function is called in a loop until the decoder is ready to return
/// data (has the required pages buffered) or is finished.
fn transition(self) -> Result<(Self, DecodeResult<()>), ParquetError> {
Contributor Author

reworked so it can be shared between try_next_batch and try_next_reader

It also now avoids a self-recursive call which I think is a (minor) improvement
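
The looped form can be sketched with a toy state machine. The states, the `step`/`run` names, and the `Option` return used to signal "keep going" are assumptions for illustration, not the PR's actual code:

```rust
/// What the caller should do after the decoder stops stepping (illustrative).
#[derive(Debug, PartialEq)]
enum StepResult {
    NeedsData,
    Ready,
    Finished,
}

#[derive(Debug, PartialEq)]
enum State {
    Start { remaining: usize },
    NeedPages { remaining: usize },
    Ready { remaining: usize },
    Finished,
}

impl State {
    /// One transition. `None` means "nothing for the caller yet, step again".
    fn step(self, pages_buffered: bool) -> (State, Option<StepResult>) {
        match self {
            State::Start { remaining: 0 } => (State::Finished, Some(StepResult::Finished)),
            State::Start { remaining } => (State::NeedPages { remaining }, None),
            State::NeedPages { remaining } if pages_buffered => {
                (State::Ready { remaining }, None)
            }
            State::NeedPages { remaining } => {
                (State::NeedPages { remaining }, Some(StepResult::NeedsData))
            }
            State::Ready { remaining } => (State::Ready { remaining }, Some(StepResult::Ready)),
            State::Finished => (State::Finished, Some(StepResult::Finished)),
        }
    }
}

/// Drives `step` in a plain loop instead of having `step` call itself.
fn run(mut state: State, pages_buffered: bool) -> (State, StepResult) {
    loop {
        let (next, result) = state.step(pages_buffered);
        state = next;
        if let Some(result) = result {
            return (state, result);
        }
    }
}

fn main() {
    println!("{:?}", run(State::Start { remaining: 2 }, false));
}
```

Keeping the loop in one driver means any caller that needs "advance until something is ready" can share it, and the stack depth stays constant no matter how many transitions run back to back.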

///
/// See examples on [`ParquetRecordBatchStreamBuilder::new`]
pub fn build(self) -> Result<ParquetRecordBatchStream<T>> {
let num_row_groups = self.metadata.row_groups().len();
Contributor Author

The whole point of this PR is to remove all this code (and instead use the copy in the push decoder)


let request_state = RequestState::None { input: input.0 };

Ok(ParquetRecordBatchStream {
Contributor Author

You can see the Stream is much simpler now -- only the decoder and an object to track the current I/O state
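
The split between "decoder" and "I/O state" can be sketched as below. `ToyDecoder`, its methods, and this `DecodeResult` are simplified stand-ins written for this example; they only mimic the general shape of a push-style decoder (it asks for byte ranges, the caller fetches and pushes them) and are not the parquet crate's types:

```rust
use std::ops::Range;

/// Illustrative result type: the decoder either wants bytes, has output, or is done.
enum DecodeResult<T> {
    NeedsData(Vec<Range<usize>>),
    Data(T),
    Finished,
}

/// Toy decoder that never performs IO; it consumes the file one chunk at a time.
struct ToyDecoder {
    file_len: usize,
    chunk: usize,
    offset: usize,
    buffered: Option<Vec<u8>>,
}

impl ToyDecoder {
    fn try_decode(&mut self) -> DecodeResult<Vec<u8>> {
        if let Some(bytes) = self.buffered.take() {
            self.offset += bytes.len();
            return DecodeResult::Data(bytes);
        }
        if self.offset >= self.file_len {
            return DecodeResult::Finished;
        }
        let end = (self.offset + self.chunk).min(self.file_len);
        DecodeResult::NeedsData(vec![self.offset..end])
    }

    fn push_range(&mut self, data: Vec<u8>) {
        self.buffered = Some(data);
    }
}

/// The "stream" side: owns the decoder and performs whatever IO it asks for.
fn read_all(file: &[u8], chunk: usize) -> Vec<Vec<u8>> {
    assert!(chunk > 0, "chunk size must be non-zero");
    let mut decoder = ToyDecoder { file_len: file.len(), chunk, offset: 0, buffered: None };
    let mut batches = Vec::new();
    loop {
        match decoder.try_decode() {
            DecodeResult::NeedsData(ranges) => {
                // IO happens here, outside the decoder (an in-memory slice in this sketch).
                for range in ranges {
                    decoder.push_range(file[range].to_vec());
                }
            }
            DecodeResult::Data(batch) => batches.push(batch),
            DecodeResult::Finished => return batches,
        }
    }
}

fn main() {
    println!("{}", read_all(b"abcdefg", 3).len()); // prints "3"
}
```

Swapping the in-memory slice for a file read or an async fetch changes only the `NeedsData` arm; the decoder itself stays untouched.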

@alamb
Contributor Author

alamb commented Oct 31, 2025

This PR is now pretty much ready for review. It builds on a test refactor here:

Once that is merged I will mark this one ready for review

alamb added a commit that referenced this pull request Oct 31, 2025
…level APIs (#8754)

# Which issue does this PR close?

- Related to #8677
- part of #8159


# Rationale for this change

I am reworking how the parquet decoder's state machine works in
#8159

One of the unit tests, `test_cache_projection_excludes_nested_columns`
uses non-public APIs that I am changing

Rather than rewrite them into other non public APIs I think it would be
better if this test is in terms of public APIs

# What changes are included in this PR?
1. refactor `test_cache_projection_excludes_nested_columns` to use high
level APIs

# Are these changes tested?

They are run in CI

I also verified this test covers the intended functionality by
commenting it out:

```diff
--- a/parquet/src/arrow/async_reader/mod.rs
+++ b/parquet/src/arrow/async_reader/mod.rs
@@ -724,7 +724,9 @@ where
             cache_projection.union(predicate.projection());
         }
         cache_projection.intersect(projection);
-        self.exclude_nested_columns_from_cache(&cache_projection)
+        // TEMP don't exclude nested columns
+        //self.exclude_nested_columns_from_cache(&cache_projection)
+        Some(cache_projection)
     }

     /// Exclude leaves belonging to roots that span multiple parquet leaves (i.e. nested columns)
```

And then running the test:
```shell
cargo test --all-features --test arrow_reader
```

And the test fails (as expected)
```
---- predicate_cache::test_cache_projection_excludes_nested_columns stdout ----

thread 'predicate_cache::test_cache_projection_excludes_nested_columns' panicked at parquet/tests/arrow_reader/predicate_cache.rs:244:9:
assertion `left == right` failed: Expected 0 records read from cache, but got 100
  left: 100
 right: 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    predicate_cache::test_cache_projection_excludes_nested_columns

test result: FAILED. 88 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.20s
```
# Are there any user-facing changes?

No, this is only test changes
@alamb alamb force-pushed the alamb/test_decode_with_async_reader branch from 5c4dbc0 to 73a16cf Compare October 31, 2025 21:25
}

fn get_bytes(&self, start: u64, length: usize) -> Result<Bytes, ParquetError> {
if start > self.file_len {
Contributor Author

The async decoder doesn't know (or need to know) the entire file length, so I removed this somewhat more specific error message and will instead rely on the underlying source to report errors when appropriate.

@alamb alamb marked this pull request as ready for review October 31, 2025 21:29
@vustef vustef left a comment

Thank you Andrew. This is my first review in the codebase, but fwiw, this looks good to me.


if let Some(limit) = &mut self.limit {
*limit -= rows_after;
// Issue a request to fetch a single range, returning the Outstanding state

If it's to fetch a single range, why does it take Vec<Range<u64>> as a parameter?
I suppose the comment is wrong, not the parameter.

Contributor Author

Fixed in ee64444

// (aka can have references internally) and thus must
// own the input while the request is outstanding.
let future = async move {
let data = input.get_byte_ranges(ranges_captured).await?;

An aside: I don't understand why the default implementation for the AsyncReader fetches range by range sequentially instead of utilizing concurrency of the underlying runtime:

/// Retrieve multiple byte ranges. The default implementation will call `get_bytes` sequentially

Please let me know if it doesn't resonate with you either and I can open an issue for that.

Contributor Author

> An aside: I don't understand why the default implementation for the AsyncReader fetches range by range sequentially instead of utilizing concurrency of the underlying runtime:

I think you are referring to this:

fn get_byte_ranges(&mut self, ranges: Vec<Range<u64>>) -> BoxFuture<'_, Result<Vec<Bytes>>> {
    async move {
        let mut result = Vec::with_capacity(ranges.len());
        for range in ranges.into_iter() {
            let data = self.get_bytes(range).await?;
            result.push(data);
        }
        Ok(result)
    }
    .boxed()
}

I think one reason is that the concurrency model differs depending on the runtime (e.g. the way you launch concurrent IO using tokio is different from how you launch concurrent tasks for io_uring). Also there may be benefits to doing larger swaths of IO -- e.g. S3 doesn't actually support multiple ranges in a single request.

So in my mind the way "utilizing concurrency of the underlying runtime" is achieved is by providing an implementation of AsyncFileReader with the appropriate specialization for get_byte_ranges.

One thing we could consider, FWIW, is to remove the default implementation, which would force each impl to specialize get_byte_ranges 🤔

BTW one of my primary motivations for extracting the parquet state machine into ParquetPushDecoder is precisely to make it easier to do such specialized IO. I have plans to write a blog post about this topic, but it will probably take me another month or so.
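
To make the "specialize the multi-range method" idea concrete, here is a minimal sketch using only the standard library. It assumes a hypothetical, synchronous `RangeReader` trait and two in-memory sources (`SequentialSource`, `ConcurrentSource`); none of these names are part of the parquet crate, and scoped threads stand in for whatever concurrency primitive the real runtime provides.

```rust
use std::ops::Range;
use std::thread;

/// Hypothetical, simplified stand-in for a range-based reader.
/// This is NOT the parquet `AsyncFileReader` trait.
trait RangeReader {
    /// Fetch one byte range.
    fn get_bytes(&self, range: Range<usize>) -> Result<Vec<u8>, String>;

    /// Default: fetch the ranges one at a time, in order.
    fn get_byte_ranges(&self, ranges: Vec<Range<usize>>) -> Result<Vec<Vec<u8>>, String> {
        ranges.into_iter().map(|r| self.get_bytes(r)).collect()
    }
}

/// Copy `range` out of `data`, reporting out-of-bounds requests as errors.
fn slice(data: &[u8], range: Range<usize>) -> Result<Vec<u8>, String> {
    data.get(range.clone())
        .map(|s| s.to_vec())
        .ok_or_else(|| format!("range {range:?} is out of bounds"))
}

/// In-memory source that relies on the sequential default.
struct SequentialSource {
    data: Vec<u8>,
}

impl RangeReader for SequentialSource {
    fn get_bytes(&self, range: Range<usize>) -> Result<Vec<u8>, String> {
        slice(&self.data, range)
    }
}

/// In-memory source that overrides the multi-range method.
struct ConcurrentSource {
    data: Vec<u8>,
}

impl RangeReader for ConcurrentSource {
    fn get_bytes(&self, range: Range<usize>) -> Result<Vec<u8>, String> {
        slice(&self.data, range)
    }

    fn get_byte_ranges(&self, ranges: Vec<Range<usize>>) -> Result<Vec<Vec<u8>>, String> {
        thread::scope(|scope| {
            // Start every request before waiting on any of them.
            let handles: Vec<_> = ranges
                .into_iter()
                .map(|r| scope.spawn(move || self.get_bytes(r)))
                .collect();
            // Join in submission order so results line up with `ranges`.
            handles
                .into_iter()
                .map(|h| {
                    h.join()
                        .map_err(|_| "fetch thread panicked".to_string())
                        .and_then(|res| res)
                })
                .collect()
        })
    }
}
```

Callers see the same signature and the same result ordering either way; only the implementation decides how the IO is scheduled, which is the property the comment above relies on.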

} else {
// All rows skipped, read next row group
continue;
let request_state = std::mem::replace(&mut self.request_state, RequestState::Done);

Why was this ownership trick needed? Perhaps you could comment in the code?

Contributor Author

It is the way I could get the Rust ownership rules to be happy (aka ensure that self.request_state always has a valid value and can't be in some partial state). I have added a comment in 3ec7448
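
For readers unfamiliar with the pattern, here is a minimal sketch of the `std::mem::replace` trick on a state enum. The `RequestState` variants and the `Stream` type below are hypothetical simplifications for illustration, not the definitions used in this PR.

```rust
use std::mem;

/// Hypothetical request states, loosely modeled on the discussion above.
#[derive(Debug, PartialEq)]
enum RequestState {
    /// Nothing in flight.
    Idle,
    /// A request is in flight; this variant owns the ranges being fetched.
    Outstanding { ranges: Vec<(u64, u64)> },
    /// Terminal state, also used as the temporary placeholder.
    Done,
}

struct Stream {
    request_state: RequestState,
}

impl Stream {
    /// Advance the state machine by one step. Returns the completed
    /// ranges when an outstanding request finishes, otherwise `None`.
    fn step(&mut self) -> Option<Vec<(u64, u64)>> {
        // Move the current state out by value, leaving `Done` behind so
        // `self.request_state` always holds a valid value. Matching on
        // `&mut self.request_state` directly would only give borrowed
        // access to `ranges`, forcing a clone to take ownership.
        let state = mem::replace(&mut self.request_state, RequestState::Done);
        match state {
            RequestState::Idle => {
                // Issue a (pretend) request and record it as outstanding.
                self.request_state = RequestState::Outstanding {
                    ranges: vec![(0, 8), (16, 24)],
                };
                None
            }
            // `ranges` is owned here, so it can be returned without cloning.
            // The `Done` placeholder stays in place as the new state.
            RequestState::Outstanding { ranges } => Some(ranges),
            // Already finished: the placeholder is the correct state.
            RequestState::Done => None,
        }
    }
}
```

Every arm either writes a new state explicitly or deliberately keeps the placeholder, so an early return or `?` between the `replace` and the match arms leaves the machine in `Done` rather than in a half-moved state.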

impl ParquetDecoderState {
/// If actively reading a RowGroup, return the currently active
/// ParquetRecordBatchReader and advance to the next group.
fn try_next_reader(

Is this so that we can preserve the `pub async fn next_row_group(&mut self) -> Result<Option<ParquetRecordBatchReader>>` API on the async reader?

let decoder = ParquetPushDecoderBuilder {
// Async reader doesn't know the overall size of the input, but it
// is not required for decoding as we will already have the metadata
input: 0,

I missed the previous PR, but am a bit confused with the input field of ArrowReaderBuilder. Is it meant to represent arbitrary input, specific to a specialized type (in this case file_length for ParquetPushDecoderBuilder)? I wonder if it would've been better if we had something like this:

struct ParquetPushDecoderBuilder {
    reader_builder: ArrowReaderBuilder,
    file_len: u64,
}

Just a thought, not intended to be addressed here.

Contributor Author

> I wonder if it would've been better if we had something like this:

I agree this would be much cleaner.

The input field is confusing in the context of the "push decoder" as there is (by design) no input.

However, the current structure is designed so the exact same builder code can be shared for the three different decoder types. Using an ArrowReaderBuilder internally is an interesting idea, but we would need to find some way to pass along options (either by duplicating methods from ArrowReaderBuilder to pass through, or constructing the push decoder builder from the ArrowReaderBuilder)

However, I will try and change the type from u64 to some new type where this context can be commented rather than have this strange 0

Contributor Author

I actually removed the u64 from the push decoder builder and I think the code is much nicer now.

Thank you for the suggestion @vustef
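
As an illustration of the "new type instead of a strange `0`" idea discussed in this thread, here is one possible shape: a zero-sized marker type as the builder's input parameter. All names below (`NoInput`, `ReaderBuilder`, `PushDecoderBuilder`, `FileReaderBuilder`, `new_push`) are hypothetical and chosen for this sketch; the thread does not say how the `u64` was actually removed, so the real change may look different.

```rust
use std::path::PathBuf;

/// Hypothetical marker type: this builder has no input source at all.
/// The doc comment is where the "why is there no input?" context lives.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
struct NoInput;

/// Hypothetical builder shared by several decoder flavors.
#[derive(Debug)]
struct ReaderBuilder<T> {
    input: T,
    batch_size: usize,
    limit: Option<usize>,
}

impl<T> ReaderBuilder<T> {
    /// Shared construction logic, independent of the input type.
    fn new(input: T) -> Self {
        Self { input, batch_size: 1024, limit: None }
    }

    /// Shared option: works for every flavor without duplication.
    fn with_batch_size(mut self, batch_size: usize) -> Self {
        self.batch_size = batch_size;
        self
    }

    /// Shared option: maximum number of rows to produce.
    fn with_limit(mut self, limit: usize) -> Self {
        self.limit = Some(limit);
        self
    }
}

/// A pull-style builder carries a real input (here just a path).
type FileReaderBuilder = ReaderBuilder<PathBuf>;

/// A push-style builder carries the marker instead of a placeholder value.
type PushDecoderBuilder = ReaderBuilder<NoInput>;

impl PushDecoderBuilder {
    /// Callers never have to invent a dummy input value.
    fn new_push() -> Self {
        Self::new(NoInput)
    }
}
```

The option methods stay on the generic `impl<T>`, so they are still written once for all flavors, and the marker occupies no space at runtime.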

Contributor Author

alamb commented Nov 4, 2025

Thank you very much for the review @vustef

parquet_metadata,
ArrowReaderOptions::default(),
)
pub fn try_new_decoder(parquet_metadata: Arc<ParquetMetaData>) -> Result<Self, ParquetError> {
Contributor Author

This function was introduced in

Which has not been released yet -- and thus this is not a breaking API change. Likewise for the changes to ParquetPushDecoderBuilder.


Successfully merging this pull request may close these issues.

Rewrite ParquetRecordBatchStream (async API) in terms of the PushDecoder
