Merged
16 changes: 14 additions & 2 deletions parquet/src/arrow/arrow_writer/mod.rs
@@ -297,6 +297,14 @@ impl<W: Write + Send> ArrowWriter<W> {
         Ok(())
     }

+    /// Writes the given buf bytes to the internal buffer.
+    ///
+    /// It's safe to use this method to write data to the underlying writer,
+    /// because it will ensure that the buffering and byte‐counting layers are used.
+    pub fn write_all(&mut self, buf: &[u8]) -> std::io::Result<()> {
+        self.writer.write_all(buf)
+    }
+
     /// Flushes all buffered rows into a new row group
     pub fn flush(&mut self) -> Result<()> {
         let in_progress = match self.in_progress.take() {
@@ -326,8 +334,12 @@ impl<W: Write + Send> ArrowWriter<W> {

     /// Returns a mutable reference to the underlying writer.
     ///
-    /// It is inadvisable to directly write to the underlying writer, doing so
-    /// will likely result in a corrupt parquet file
+    /// **Warning**: if you write directly to this writer, you will skip
+    /// the `TrackedWrite` buffering and byte‐counting layers. That’ll cause
+    /// the file footer’s recorded offsets and sizes to diverge from reality,
+    /// resulting in an unreadable or corrupted Parquet file.
+    ///
+    /// If you want to write safely to the underlying writer, use [`Self::write_all`].
     pub fn inner_mut(&mut self) -> &mut W {
         self.writer.inner_mut()
     }

Review comment (Contributor), at the added `**Warning**` line: Thank you
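The warning on `inner_mut` can be illustrated with a minimal, standard-library-only sketch. `CountingWriter` below is a hypothetical stand-in for a byte-counting layer; it is not the parquet crate's `TrackedWrite`, and its fields and methods are assumptions made for illustration. It shows how bytes written through `inner_mut()` never reach the counter, so any offset derived from the counter no longer matches the real output.

```rust
use std::io::{self, Write};

/// Hypothetical byte-counting wrapper (illustration only, not the parquet type).
struct CountingWriter<W: Write> {
    inner: W,
    bytes_written: usize,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, bytes_written: 0 }
    }

    /// Number of bytes that went through the counting layer.
    fn bytes_written(&self) -> usize {
        self.bytes_written
    }

    /// Direct access to the wrapped writer; writes made here are not counted.
    fn inner_mut(&mut self) -> &mut W {
        &mut self.inner
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        self.bytes_written += n; // keep the counter in sync with the output
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Returns (recorded byte count, actual output length) after one tracked
/// write followed by one write that bypasses the counter.
fn counts_after_bypass() -> io::Result<(usize, usize)> {
    let mut w = CountingWriter::new(Vec::new());
    w.write_all(b"PAR1")?; // tracked: the counter advances
    w.inner_mut().write_all(b"xyz")?; // bypass: the counter does not advance
    let actual = w.inner_mut().len();
    Ok((w.bytes_written(), actual))
}

fn main() -> io::Result<()> {
    let (recorded, actual) = counts_after_bypass()?;
    println!("recorded = {recorded}, actual = {actual}");
    Ok(())
}
```

In this sketch the recorded count and the actual length disagree after the bypassing write, which is the kind of divergence the new doc comment describes for footer offsets and sizes.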
19 changes: 18 additions & 1 deletion parquet/src/file/writer.rs
@@ -394,9 +394,26 @@ impl<W: Write + Send> SerializedFileWriter<W> {
         self.buf.inner()
     }

+    /// Writes the given buf bytes to the internal buffer.
+    ///
+    /// This can be used to write raw data to an in-progress parquet file, for
+    /// example, custom index structures or other payloads. Other parquet readers
+    /// will skip this data when reading the files.
+    ///
+    /// It's safe to use this method to write data to the underlying writer,
+    /// because it will ensure that the buffering and byte‐counting layers are used.
+    pub fn write_all(&mut self, buf: &[u8]) -> std::io::Result<()> {
+        self.buf.write_all(buf)
+    }
+
     /// Returns a mutable reference to the underlying writer.
     ///
-    /// It is inadvisable to directly write to the underlying writer.
+    /// **Warning**: if you write directly to this writer, you will skip
+    /// the `TrackedWrite` buffering and byte‐counting layers. That’ll cause
+    /// the file footer’s recorded offsets and sizes to diverge from reality,
+    /// resulting in an unreadable or corrupted Parquet file.
+    ///
+    /// If you want to write safely to the underlying writer, use [`Self::write_all`].
     pub fn inner_mut(&mut self) -> &mut W {
         self.buf.inner_mut()
     }

Review comment (Contributor), at the added `[`Self::write_all`]` line: ❤️
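The "custom index structures or other payloads" use case mentioned in the new `write_all` docs can be sketched in the same spirit, again with the standard library only. `CountingWriter` and `write_payload` are hypothetical names, not parquet APIs. The sketch assumes a caller wants to remember where an embedded payload starts and how long it is; where that location would later be stored so a reader can find it is not covered by this diff and is left out.

```rust
use std::io::{self, Write};

/// Hypothetical byte-counting wrapper (illustration only, not the parquet type).
struct CountingWriter<W: Write> {
    inner: W,
    bytes_written: u64,
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        self.bytes_written += n as u64; // every byte passes through the counter
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Writes `payload` through the counting layer and returns its (offset, length).
fn write_payload<W: Write>(
    w: &mut CountingWriter<W>,
    payload: &[u8],
) -> io::Result<(u64, u64)> {
    let offset = w.bytes_written; // position where the payload will begin
    w.write_all(payload)?;
    Ok((offset, w.bytes_written - offset))
}

/// Writes some leading bytes, then a payload, and reads the payload back
/// from the output using the recorded offset and length.
fn demo() -> io::Result<(u64, u64, Vec<u8>)> {
    let mut w = CountingWriter { inner: Vec::new(), bytes_written: 0 };
    w.write_all(b"PAR1")?; // stand-in for data written earlier
    let (offset, len) = write_payload(&mut w, b"my-index")?;
    let start = offset as usize;
    let end = start + len as usize;
    Ok((offset, len, w.inner[start..end].to_vec()))
}

fn main() -> io::Result<()> {
    let (offset, len, bytes) = demo()?;
    println!("payload at offset {offset}, {len} bytes: {:?}", bytes);
    Ok(())
}
```

Because the payload goes through the counting layer, the recorded offset stays valid for everything written afterwards, which is the property the doc comment attributes to `write_all` as opposed to `inner_mut`.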