-
Notifications
You must be signed in to change notification settings - Fork 1k
[Variant] Simplify Builder buffer operations
#7795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
parquet-variant/src/builder.rs
Outdated
| fn make_room_for_header(buffer: &mut Vec<u8>, start_pos: usize, header_size: usize) { | ||
| let current_len = buffer.len(); | ||
| buffer.resize(current_len + header_size, 0); | ||
| fn write_header(header_size: usize, header_byte: u8, is_large: bool, num_items: usize) -> Vec<u8> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed the ObjectBuilder and ListBuilder were writing out the header in the same way.
Moved it into a freestanding function called write_header.
b85ebc8 to
822924f
Compare
scovich
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. One naming nit and several follow-ups to consider.
parquet-variant/src/builder.rs
Outdated
| for i in 0..nbytes { | ||
| buf[i as usize] = (value >> (i * 8)) as u8; | ||
| buf.push((value >> (i * 8)) as u8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aside: I wonder if it would be cheaper (better compiler codegen) to do something like:
let bytes = value.to_le_bytes();
buf.extend_from_slice(&bytes[..nbytes]);There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Godbolt showed the same instruction count for both implementations, so appending n bytes at a time makes sense to me 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I think extend_from_slice is more "idomatic" rust but I realize that is a opinion and I don't think this needs to be changed (though if I were writing the code I would use extend_from_slice)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR uses extend_from_slice!
arrow-rs/parquet-variant/src/builder.rs
Lines 56 to 60 in e42df82
| /// Write little-endian integer to buffer | |
| fn write_offset(buf: &mut Vec<u8>, value: usize, nbytes: u8) { | |
| let bytes = value.to_le_bytes(); | |
| buf.extend_from_slice(&bytes[..nbytes as usize]); | |
| } |
Sorry if my comment above was unclear, I was agreeing with @scovich
|
|
||
| let mut metadata = Vec::with_capacity(metadata_size); | ||
|
|
||
| // Write header: version=1, not sorted, with calculated offset_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have a TODO for supporting sorted dictionaries? I liked @alamb idea to let callers pre-populate and pre-sort the dictionary if they wanted (before any variant objects reference it), and here we just check whether it's sorted as we write it out. If not sorted, go back and rewrite the header (which is at metadata[0], pretty safe)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I think sorted dictionaries are covered by this ticket (and technically it was @zeroshade 's idea :) )
Also after this PR I think it will be rather elegant / easy to add support for sorted dictionaries. Thank you @friendlymatthew
| write_offset(&mut metadata, cur_offset, offset_size); | ||
|
|
||
| // Write string data | ||
| for key in self.field_names.iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding correctly...
- We need a map (not necessarily ordered) in order to cheaply find the field id for a given field name.
- We need a vec that remembers insertion order?
- The map cannot reference strings in the vec, unless we're willing to mess with interior mutability and "fun" like that (but in theory we could use a hashmap that stores indexes into the vec, with a fancy custom indirect hasher)
- The vec cannot reference strings in the map, because there's no stable way to reference the strings it hosts.
Have we considered using an IndexSet? An IndexSet<String> should behave like a Vec<String> but with O(1) cost to return the index of any entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree IndexSet would be a good choice of structure here and we have used it elsewhere to good effect
A good follow on ticket perhaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @scovich says, if/when we decide we need to optimize even more and avoid individual allocations for strings, there are fancier structures we could use but I think an IndexSet will get us pretty far
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I'm a bit confused to why we need IndexSet. From what I understand, IndexSets offer a deterministic order of elements by insertion order and O(1) lookup time.
Our MetadataBuilder has a Vec of field names. We don't support Object field deletion, so don't we already have a deterministic ordering of field names?
Plus, we already have O(1) lookups, since field ids basically serve as indices into the Vec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our MetadataBuilder has a Vec of field names. We don't support Object field deletion, so don't we already have a deterministic ordering of field names?
MetadataBuilder has both a vec and a BTreeMap. Each field name is stored twice, once in the Vec and once in the BTreeMap.
arrow-rs/parquet-variant/src/builder.rs
Lines 235 to 237 in 674dc17
| struct MetadataBuilder { | |
| field_name_to_id: BTreeMap<String, u32>, | |
| field_names: Vec<String>, |
I think the idea is if we used an IndexSet we would not need the second copy of the field
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parquet-variant/src/builder.rs
Outdated
| } | ||
| write_offset( | ||
| &mut self.parent_buffer.0[pos..pos + offset_size as usize], | ||
| data_size, | ||
| offset_size, | ||
| ); | ||
|
|
||
| write_offset(&mut self.parent_buffer.0, data_size, offset_size); | ||
|
|
||
| self.parent_buffer.0.extend_from_slice(&self.buffer.0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make sure I'm understanding:
- When building a nested list or object, the value bytes go into a separate buffer (
self.buffer), and the field ids+offsets are tracked separately as well. - When finalizing, we first write out the field ids, then the offsets, and then the value bytes
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed this carefully and I think it is a significant improvement in the structure of the code and I think it sets us up nicely to support sorted dictionaries and other things. Thank you @friendlymatthew and @scovich
fyi @PinkCrow007 and @mkarbo as you may be interested in how the VariantBuilder is progressing along
| self.append_slice(µs.to_le_bytes()); | ||
| } | ||
|
|
||
| fn append_timestamp_ntz_micros(&mut self, value: chrono::NaiveDateTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the way this code now looks 👍
|
|
||
| let mut metadata = Vec::with_capacity(metadata_size); | ||
|
|
||
| // Write header: version=1, not sorted, with calculated offset_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I think sorted dictionaries are covered by this ticket (and technically it was @zeroshade 's idea :) )
Also after this PR I think it will be rather elegant / easy to add support for sorted dictionaries. Thank you @friendlymatthew
| write_offset(&mut metadata, cur_offset, offset_size); | ||
|
|
||
| // Write string data | ||
| for key in self.field_names.iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @scovich says, if/when we decide we need to optimize even more and avoid individual allocations for strings, there are fancier structures we could use but I think an IndexSet will get us pretty far
|
Given there are no blockers to merge and two approvals, I am going to merge this one in to keep the code flowing. We can continue to iterate in follow on PRs. |
Rationale for this change
This PR simplifies how we build up the internal
VariantBufferby appending to it directly, rather than pre-allocating a buffer filled with zeroes and setting values at indices. This avoids indexing math that can be hard to follow and reason about.This PR also aims to design a well-defined API for
ValueBuffer. My thought here was we should not touch the innerVec<u8>. It's quite sensitive.Are there any user-facing changes?
Nope!