Move AggregateExpr, PhysicalExpr and PhysicalSortExpr to physical-expr-core
#9926
Merged
Showing changes from 10 of 15 commits
5338f61 move PhysicalExpr (jayzhan211)
450ae4b cleanup (jayzhan211)
3624964 move physical sort (jayzhan211)
835f147 cleanup dependencies (jayzhan211)
c5d80c8 add readme (jayzhan211)
7851de7 disable doc test (jayzhan211)
f5aafb3 move column (jayzhan211)
7bfc074 fmt (jayzhan211)
675d2fe move aggregatexp (jayzhan211)
5220087 move other two utils (jayzhan211)
113a000 license (jayzhan211)
fea87e3 switch to ignore (jayzhan211)
06d87bc move reverse order (jayzhan211)
26e5782 rename to common (jayzhan211)
26f852c cleanup (jayzhan211)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,38 @@ | ||
| # Licensed to the Apache Software Foundation (ASF) under one | ||
| # or more contributor license agreements. See the NOTICE file | ||
| # distributed with this work for additional information | ||
| # regarding copyright ownership. The ASF licenses this file | ||
| # to you under the Apache License, Version 2.0 (the | ||
| # "License"); you may not use this file except in compliance | ||
| # with the License. You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, | ||
| # software distributed under the License is distributed on an | ||
| # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| # KIND, either express or implied. See the License for the | ||
| # specific language governing permissions and limitations | ||
| # under the License. | ||
|
|
||
| [package] | ||
| name = "datafusion-physical-expr-core" | ||
| description = "Core physical expression implementation for DataFusion query engine" | ||
| keywords = ["arrow", "query", "sql"] | ||
| readme = "README.md" | ||
| version = { workspace = true } | ||
| edition = { workspace = true } | ||
| homepage = { workspace = true } | ||
| repository = { workspace = true } | ||
| license = { workspace = true } | ||
| authors = { workspace = true } | ||
| rust-version = { workspace = true } | ||
|
|
||
| [lib] | ||
| name = "datafusion_physical_expr_core" | ||
| path = "src/lib.rs" | ||
|
|
||
| [dependencies] | ||
| arrow = { workspace = true } | ||
| datafusion-common = { workspace = true, default-features = true } | ||
| datafusion-expr = { workspace = true } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| <!--- | ||
| Licensed to the Apache Software Foundation (ASF) under one | ||
| or more contributor license agreements. See the NOTICE file | ||
| distributed with this work for additional information | ||
| regarding copyright ownership. The ASF licenses this file | ||
| to you under the Apache License, Version 2.0 (the | ||
| "License"); you may not use this file except in compliance | ||
| with the License. You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, | ||
| software distributed under the License is distributed on an | ||
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| KIND, either express or implied. See the License for the | ||
| specific language governing permissions and limitations | ||
| under the License. | ||
| --> | ||
|
|
||
| # DataFusion Core Physical Expressions | ||
|
|
||
| [DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format. | ||
|
|
||
| This crate is a submodule of DataFusion that provides the core functionality of physical expressions, | ||
| such as `PhysicalExpr`, `PhysicalSortExpr`, and related types. | ||
|
|
||
| [df]: https://crates.io/crates/datafusion | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,102 @@ | ||
| // Licensed to the Apache Software Foundation (ASF) under one | ||
| // or more contributor license agreements. See the NOTICE file | ||
| // distributed with this work for additional information | ||
| // regarding copyright ownership. The ASF licenses this file | ||
| // to you under the Apache License, Version 2.0 (the | ||
| // "License"); you may not use this file except in compliance | ||
| // with the License. You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, | ||
| // software distributed under the License is distributed on an | ||
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| // KIND, either express or implied. See the License for the | ||
| // specific language governing permissions and limitations | ||
| // under the License. | ||
|
|
||
| pub mod utils; | ||
|
|
||
| use std::any::Any; | ||
| use std::fmt::Debug; | ||
| use std::sync::Arc; | ||
|
|
||
| use crate::physical_expr::PhysicalExpr; | ||
| use crate::sort_expr::PhysicalSortExpr; | ||
|
|
||
| use arrow::datatypes::Field; | ||
| use datafusion_common::{not_impl_err, Result}; | ||
| use datafusion_expr::{Accumulator, GroupsAccumulator}; | ||
|
|
||
| /// An aggregate expression that: | ||
| /// * knows its resulting field | ||
| /// * knows how to create its accumulator | ||
| /// * knows its accumulator's state's field | ||
| /// * knows the expressions from which its accumulator will receive values | ||
| /// | ||
| /// Any implementation of this trait also needs to implement | ||
| /// `PartialEq<dyn Any>` to allow comparing equality between | ||
| /// trait objects. | ||
| pub trait AggregateExpr: Send + Sync + Debug + PartialEq<dyn Any> { | ||
| /// Returns the aggregate expression as [`Any`] so that it can be | ||
| /// downcast to a specific implementation. | ||
| fn as_any(&self) -> &dyn Any; | ||
|
|
||
| /// The field of the final result of this aggregation. | ||
| fn field(&self) -> Result<Field>; | ||
|
|
||
| /// The accumulator used to accumulate values from the expressions. | ||
| /// The accumulator expects the same number of arguments as `expressions` and must | ||
| /// return states with the same description as `state_fields`. | ||
| fn create_accumulator(&self) -> Result<Box<dyn Accumulator>>; | ||
|
|
||
| /// The fields that encapsulate the Accumulator's state. | ||
| /// The number of fields here equals the number of states that the accumulator contains. | ||
| fn state_fields(&self) -> Result<Vec<Field>>; | ||
|
|
||
| /// Expressions that are passed to the Accumulator. | ||
| /// Single-column aggregations such as `sum` return a single expression; others (e.g. `cov`) return several. | ||
| fn expressions(&self) -> Vec<Arc<dyn PhysicalExpr>>; | ||
|
|
||
| /// Order-by requirements for the aggregate function. | ||
| /// By default it is `None` (there is no requirement). | ||
| /// Order-sensitive aggregators, such as `FIRST_VALUE(x ORDER BY y)`, should implement this. | ||
| fn order_bys(&self) -> Option<&[PhysicalSortExpr]> { | ||
| None | ||
| } | ||
|
|
||
| /// Human readable name such as `"MIN(c2)"`. The default | ||
| /// implementation returns placeholder text. | ||
| fn name(&self) -> &str { | ||
| "AggregateExpr: default name" | ||
| } | ||
|
|
||
| /// Whether the aggregate expression has a specialized | ||
| /// [`GroupsAccumulator`] implementation. If this returns true, | ||
| /// [`Self::create_groups_accumulator`] will be called. | ||
| fn groups_accumulator_supported(&self) -> bool { | ||
| false | ||
| } | ||
|
|
||
| /// Return a specialized [`GroupsAccumulator`] that manages state | ||
| /// for all groups. | ||
| /// | ||
| /// For maximum performance, a [`GroupsAccumulator`] should be | ||
| /// implemented in addition to [`Accumulator`]. | ||
| fn create_groups_accumulator(&self) -> Result<Box<dyn GroupsAccumulator>> { | ||
| not_impl_err!("GroupsAccumulator hasn't been implemented for {self:?} yet") | ||
| } | ||
|
|
||
| /// Construct an expression that calculates the aggregate in reverse. | ||
| /// Typically the "reverse" expression is itself (e.g. SUM, COUNT). | ||
| /// For aggregates that do not support calculation in reverse, | ||
| /// returns None (which is the default value). | ||
| fn reverse_expr(&self) -> Option<Arc<dyn AggregateExpr>> { | ||
| None | ||
| } | ||
|
|
||
| /// Creates an accumulator implementation that supports retraction. | ||
| fn create_sliding_accumulator(&self) -> Result<Box<dyn Accumulator>> { | ||
| not_impl_err!("Retractable Accumulator hasn't been implemented for {self:?} yet") | ||
| } | ||
| } |
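
The `PartialEq<dyn Any>` supertrait and the default methods on `AggregateExpr` can be tried out in isolation. The following is a minimal, std-only sketch, not DataFusion code: `ToyAggregate`, `ToySum`, `ToyMedian` and `same_aggregate` are hypothetical names introduced here for illustration, and the Arrow and DataFusion types (`Field`, `Accumulator`, and so on) are left out entirely.

```rust
use std::any::Any;
use std::fmt::Debug;
use std::sync::Arc;

/// Hypothetical, std-only stand-in for `AggregateExpr`.
/// Only `as_any`, `name` and `reverse_expr` are modelled.
trait ToyAggregate: Send + Sync + Debug + PartialEq<dyn Any> {
    fn as_any(&self) -> &dyn Any;

    /// Default placeholder name, mirroring the trait in the diff.
    fn name(&self) -> &str {
        "ToyAggregate: default name"
    }

    /// Default: calculation in reverse is not supported.
    fn reverse_expr(&self) -> Option<Arc<dyn ToyAggregate>> {
        None
    }
}

/// Overrides both defaults: the reverse of a sum is the sum itself.
#[derive(Debug, Clone)]
struct ToySum {
    name: String,
}

impl ToyAggregate for ToySum {
    fn as_any(&self) -> &dyn Any {
        self
    }

    fn name(&self) -> &str {
        &self.name
    }

    fn reverse_expr(&self) -> Option<Arc<dyn ToyAggregate>> {
        Some(Arc::new(self.clone()))
    }
}

impl PartialEq<dyn Any> for ToySum {
    /// Equal only if `other` is also a `ToySum` with the same name.
    fn eq(&self, other: &dyn Any) -> bool {
        other
            .downcast_ref::<Self>()
            .map(|o| self.name == o.name)
            .unwrap_or(false)
    }
}

/// Relies on every default method.
#[derive(Debug)]
struct ToyMedian;

impl ToyAggregate for ToyMedian {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl PartialEq<dyn Any> for ToyMedian {
    fn eq(&self, other: &dyn Any) -> bool {
        other.downcast_ref::<Self>().is_some()
    }
}

/// Compares two trait objects through the `PartialEq<dyn Any>` supertrait.
fn same_aggregate(a: &dyn ToyAggregate, b: &dyn ToyAggregate) -> bool {
    a.eq(b.as_any())
}
```

Two `ToySum` values with the same name compare equal through `same_aggregate`, while a `ToySum` and a `ToyMedian` never do, because the `downcast_ref` in each `eq` fails for a different concrete type.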
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| // Licensed to the Apache Software Foundation (ASF) under one | ||
| // or more contributor license agreements. See the NOTICE file | ||
| // distributed with this work for additional information | ||
| // regarding copyright ownership. The ASF licenses this file | ||
| // to you under the Apache License, Version 2.0 (the | ||
| // "License"); you may not use this file except in compliance | ||
| // with the License. You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, | ||
| // software distributed under the License is distributed on an | ||
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| // KIND, either express or implied. See the License for the | ||
| // specific language governing permissions and limitations | ||
| // under the License. | ||
|
|
||
| use std::{any::Any, sync::Arc}; | ||
|
|
||
| use arrow::{ | ||
| compute::SortOptions, | ||
| datatypes::{DataType, Field}, | ||
| }; | ||
|
|
||
| use crate::sort_expr::PhysicalSortExpr; | ||
|
|
||
| use super::AggregateExpr; | ||
|
|
||
| /// Downcast a `Box<dyn AggregateExpr>` or `Arc<dyn AggregateExpr>` | ||
| /// and return the inner trait object as [`Any`] so | ||
| /// that it can be downcast to a specific implementation. | ||
| /// | ||
| /// This method is used when implementing `PartialEq<dyn Any>` | ||
| /// for [`AggregateExpr`] implementations and allows comparing equality | ||
| /// between trait objects. | ||
| pub fn down_cast_any_ref(any: &dyn Any) -> &dyn Any { | ||
| if let Some(obj) = any.downcast_ref::<Arc<dyn AggregateExpr>>() { | ||
| obj.as_any() | ||
| } else if let Some(obj) = any.downcast_ref::<Box<dyn AggregateExpr>>() { | ||
| obj.as_any() | ||
| } else { | ||
| any | ||
| } | ||
| } | ||
|
|
||
| /// Constructs the fields corresponding to a lexicographical ordering requirement. | ||
| pub fn ordering_fields( | ||
| ordering_req: &[PhysicalSortExpr], | ||
| // Data type of each expression in the ordering requirement | ||
| data_types: &[DataType], | ||
| ) -> Vec<Field> { | ||
| ordering_req | ||
| .iter() | ||
| .zip(data_types.iter()) | ||
| .map(|(sort_expr, dtype)| { | ||
| Field::new( | ||
| sort_expr.expr.to_string().as_str(), | ||
| dtype.clone(), | ||
| // Some partitions may be empty, hence the field should be nullable. | ||
| true, | ||
| ) | ||
| }) | ||
| .collect() | ||
| } | ||
|
|
||
| /// Selects the sort option attribute from all the given `PhysicalSortExpr`s. | ||
| pub fn get_sort_options(ordering_req: &[PhysicalSortExpr]) -> Vec<SortOptions> { | ||
| ordering_req.iter().map(|item| item.options).collect() | ||
| } |
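
The two helpers above can be illustrated without Arrow or DataFusion. This is a rough std-only sketch under stated assumptions: `ToyAggregate`, `ToyCount`, `unwrap_any` and `toy_ordering_fields` are hypothetical names, and plain `(name, data type, nullable)` tuples stand in for Arrow `Field`s. It shows two things: a smart pointer passed as `&dyn Any` is unwrapped to the concrete value before downcasting, and the `zip` in `ordering_fields` stops at the shorter of its two inputs.

```rust
use std::any::Any;
use std::fmt::Debug;
use std::sync::Arc;

/// Hypothetical stand-in for `AggregateExpr`; only `as_any` is modelled.
trait ToyAggregate: Debug {
    fn as_any(&self) -> &dyn Any;
}

#[derive(Debug)]
struct ToyCount;

impl ToyAggregate for ToyCount {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Same shape as `down_cast_any_ref`: if `any` is really an `Arc` or `Box`
/// holding a trait object, return the inner value as `&dyn Any`;
/// otherwise return `any` unchanged.
fn unwrap_any(any: &dyn Any) -> &dyn Any {
    if let Some(obj) = any.downcast_ref::<Arc<dyn ToyAggregate>>() {
        obj.as_any()
    } else if let Some(obj) = any.downcast_ref::<Box<dyn ToyAggregate>>() {
        obj.as_any()
    } else {
        any
    }
}

/// Std-only stand-in for `ordering_fields`: one (name, data type, nullable)
/// tuple per pair. `zip` ends with the shorter slice, so surplus entries
/// in the longer slice are silently ignored.
fn toy_ordering_fields(
    exprs: &[&str],
    data_types: &[&str],
) -> Vec<(String, String, bool)> {
    exprs
        .iter()
        .zip(data_types.iter())
        .map(|(expr, dtype)| (expr.to_string(), dtype.to_string(), true))
        .collect()
}
```

Without the unwrapping step, calling `downcast_ref::<ToyCount>()` directly on a `&Arc<dyn ToyAggregate>` viewed as `&dyn Any` would fail, since the type behind the reference is the `Arc`, not `ToyCount`. For the ordering helper, callers that want a hard failure on mismatched lengths would need to check the lengths themselves before zipping.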
Here is a very minor suggestion: