-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Add interleave kernel (#1523)
#2838
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 4 commits
d9dc4ad
27566b1
8a8c904
7344e35
c498005
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,218 @@ | ||||||||||
| // Licensed to the Apache Software Foundation (ASF) under one | ||||||||||
| // or more contributor license agreements. See the NOTICE file | ||||||||||
| // distributed with this work for additional information | ||||||||||
| // regarding copyright ownership. The ASF licenses this file | ||||||||||
| // to you under the Apache License, Version 2.0 (the | ||||||||||
| // "License"); you may not use this file except in compliance | ||||||||||
| // with the License. You may obtain a copy of the License at | ||||||||||
| // | ||||||||||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||||||||||
| // | ||||||||||
| // Unless required by applicable law or agreed to in writing, | ||||||||||
| // software distributed under the License is distributed on an | ||||||||||
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||||||||||
| // KIND, either express or implied. See the License for the | ||||||||||
| // specific language governing permissions and limitations | ||||||||||
| // under the License. | ||||||||||
|
|
||||||||||
| use arrow_array::{make_array, new_empty_array, Array, ArrayRef}; | ||||||||||
| use arrow_data::transform::MutableArrayData; | ||||||||||
| use arrow_schema::ArrowError; | ||||||||||
|
|
||||||||||
| /// | ||||||||||
| /// Takes elements by index from a list of [`Array`], creating a new [`Array`] from those values. | ||||||||||
| /// | ||||||||||
| /// Each element in `indices` is a pair of `usize` with the first identifying the index | ||||||||||
| /// of the [`Array`] in `values`, and the second the index of the value within that [`Array`] | ||||||||||
| /// | ||||||||||
| /// ```text | ||||||||||
| /// ┌─────────────────┐ ┌─────────┐ ┌─────────────────┐ | ||||||||||
| /// │ A │ │ (0, 0) │ interleave( │ A │ | ||||||||||
| /// ├─────────────────┤ ├─────────┤ [values0, values1], ├─────────────────┤ | ||||||||||
| /// │ D │ │ (1, 0) │ indices │ B │ | ||||||||||
| /// └─────────────────┘ ├─────────┤ ) ├─────────────────┤ | ||||||||||
| /// values array 0 │ (1, 1) │ ─────────────────────────▶ │ C │ | ||||||||||
| /// ├─────────┤ ├─────────────────┤ | ||||||||||
| /// │ (0, 1) │ │ D │ | ||||||||||
| /// └─────────┘ └─────────────────┘ | ||||||||||
| /// ┌─────────────────┐ indices | ||||||||||
| /// │ B │ array | ||||||||||
| /// ├─────────────────┤ result | ||||||||||
| /// │ C │ | ||||||||||
| /// ├─────────────────┤ | ||||||||||
| /// │ E │ | ||||||||||
| /// └─────────────────┘ | ||||||||||
| /// values array 1 | ||||||||||
| /// ``` | ||||||||||
| /// | ||||||||||
| /// For selecting values by index from a single array see [compute::take](crate::compute::take) | ||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 |
||||||||||
| /// | ||||||||||
| /// # Panics | ||||||||||
|
||||||||||
| /// | ||||||||||
| /// Panics if the arrays do not have the same data type or `values` is empty | ||||||||||
| pub fn interleave( | ||||||||||
| values: &[&dyn Array], | ||||||||||
| indices: &[(usize, usize)], | ||||||||||
| ) -> Result<ArrayRef, ArrowError> { | ||||||||||
| if values.is_empty() { | ||||||||||
| return Err(ArrowError::InvalidArgumentError( | ||||||||||
| "interleave requires input of at least one array".to_string(), | ||||||||||
| )); | ||||||||||
| } | ||||||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we also return an error for single array case and suggest to use
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I vacillated a bit on this, I think given the concat kernel which makes even less sense to be called on a single array, doesn't error in this case - I would be inclined to leave it.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that it is nicer to support handling a single input (mostly as a convenience for downstream users so they don't have to special case their code for |
||||||||||
| let data_type = values[0].data_type(); | ||||||||||
|
|
||||||||||
| if values | ||||||||||
| .iter() | ||||||||||
| .skip(1) | ||||||||||
| .any(|array| array.data_type() != data_type) | ||||||||||
| { | ||||||||||
| return Err(ArrowError::InvalidArgumentError( | ||||||||||
| "It is not possible to interleave arrays of different data types." | ||||||||||
| .to_string(), | ||||||||||
|
||||||||||
| "It is not possible to interleave arrays of different data types." | |
| .to_string(), | |
| format!("It is not possible to interleave arrays of different data types ({:?} and {:?})" | |
| array.data_type(), data_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the specialized implementation should be tracked by a different ticket than the initial kernel, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for checking repeated indexes (0,3) and (0,3)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| assert!(v.is_empty()); | |
| assert!(v.is_empty()); | |
| assert!(v.data_type(), DataType::Int32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ for the ascii art (lol though I am biased)
cc @Dandandan and @yjshen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You created it for the original ticket 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, nothing like a little self praise of my monodraw skillz to lighten up the review process