-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Description
Goal: a complete row implementation, fully used in pipeline breaker operators when possible.
Summary
TLDR: The key focus of this work is to speed up fundamentally row oriented operations like hash table lookup or comparisons (e.g. #2427)
Background
DataFusion, like many Arrow systems, is a classic "vectorized computation engine" which works quite well for many common operations. The following paper, gives a good treatment on the various tradeoffs between vectorized and JIT's compilation of query plans: https://db.in.tum.de/~kersten/vectorization_vs_compilation.pdf?lang=de
As mentioned in the paper, there are some fundamentally "row oriented" operations in a database that are not typically amenable to vectorization. The "classics" are: Hash table updates in Joins and Hash Aggregates, as well as comparing tuples in sort.
When operating with a Row based format, the per-tuple type dispatch overhead becomes quite important, so such operations are typically implemented using just in time compilation (JIT) or other unsafe mechanims to minimize the overhead
@yjshen added initial support for JIT'ing in #1849 and it currently lives in https://github.com/apache/arrow-datafusion/tree/master/datafusion/jit. He also added partial support for aggregates in #2375
This ticket tracks the remaining work to fully support row formats, including JIT'ing
Getters and setters
- Avoid unnecessary branching in row read/write if schema is null-free #1891
- 1. Support all types that ScalarValue supports
- 1.1 all basic types
- 1.1.1 Decimal
- 1.1.2 Timestamp
- 1.1.3 Date
- 1.1.4 Interval
- 1.1.5 Null
- 1.2 composite types: List / Struct
- 1.1 all basic types
- 2. Make varlena offset + length a type parameter for reader and writer, for space efficiency
- 3. Assertion based on schema before getting. Think
date64as an example.
Formats
- 1. basics: Support Multiple row layout #2188
- 2. Compact: write once, never update, Eq comparable
- 2.1 all type supports
- 3. WordAligned: update heavy on cells
- 3.1 all basic type supports
- 3.2 Varlena out-of-place store in memory, and inline/de-inline while serializing/deserializing
- 4. RawComparable: best effort comparable based on raw bytes
- 4.1 null-inline
- 4.2 float bytes comparable
- 4.3 comparator with best effort &[u8] comp, and interleave with varlena compare field-by-field
Hook into execution (mainly the pipeline-breakers)
- Sort
- Improve Sorting / Merge performance #2427
- RawComparable as SortByKey
- payload Buffer records in row format in memory for SortExec #2146
- Improve Sorting / Merge performance #2427
- Aggregate
- basics: Compact row as GroupByKey, WordAligned row as aggregation buffer Grouped Aggregate in row format #2375
- Unify GroupByRowRow and GroupByRow: Consolidate GroupByHash implementations
row_hash.rsandhash.rs(remove duplication) #2723 - GroupBy: complete full row-based accumulator support.
- Join
- Compact row as HashJoinKey
- RawComparable row as MergeJoinKey
- Compact row as Join payloads
Cleanups
- Getter / setter / accessor consolidation, DRY
JIT
- basics: JIT the tuple field get/set with schema, avoid branching for each field in each row. (Try to fix in Introduce JIT code generation #1849 )
- TBD