Refactor: Unify Expr::ScalarFunction and Expr::ScalarUDF, introduce unresolved functions by name
#8258
| pub struct ScalarFunction { | ||
| /// The function | ||
| pub fun: built_in_function::BuiltinScalarFunction, | ||
| pub func_def: ScalarFunctionDefinition, |
This is the major change
| args, | ||
| } | ||
| } | ||
| } |
This is the major change
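To make the shape of this change concrete, here is a minimal, self-contained sketch of a unified function node. Only the `func_def` field, the `Name` variant, and the `new_udf` constructor appear in the diff shown here; the other variant names, the payload types, the simplified stand-in types, and the `new_unresolved` helper are assumptions made for illustration, not the actual DataFusion definitions.

```rust
use std::sync::Arc;

// Simplified stand-ins (assumptions) for the real DataFusion types.
#[derive(Debug, Clone, PartialEq)]
pub enum BuiltinScalarFunction {
    Abs,
    Sqrt,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ScalarUDF {
    pub name: String,
}

// One definition type covering built-in, user-defined, and not-yet-resolved functions.
// Variant payloads are assumed; only `Name` is visible in the diff above.
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarFunctionDefinition {
    BuiltIn(BuiltinScalarFunction),
    UDF(Arc<ScalarUDF>),
    Name(Arc<str>),
}

// Generic over the argument type so the sketch does not need a full `Expr`.
#[derive(Debug, Clone, PartialEq)]
pub struct ScalarFunction<E> {
    pub func_def: ScalarFunctionDefinition,
    pub args: Vec<E>,
}

impl<E> ScalarFunction<E> {
    // Mirrors the `new_udf` constructor seen in the diff.
    pub fn new_udf(udf: Arc<ScalarUDF>, args: Vec<E>) -> Self {
        Self { func_def: ScalarFunctionDefinition::UDF(udf), args }
    }

    // Hypothetical helper: build a node that only knows the function's name.
    pub fn new_unresolved(name: &str, args: Vec<E>) -> Self {
        Self { func_def: ScalarFunctionDefinition::Name(Arc::from(name)), args }
    }
}
```

With a single struct, code that previously matched on two `Expr` variants can match on one and branch on `func_def` only where the distinction matters.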
alamb
left a comment
Thank you @2010YOUY01 -- I think this looks great 🦾 . I had a few small comments, but otherwise I think it is ready to go.
I want to leave this PR open for a few days to allow people a chance to comment.
then we can add an AnalyzerRule to resolve the function.
This PR only includes refactor preparation for UDF Expr support.
While waiting, it might help to create another draft PR on top of this one that implements name resolution so people can see how that would work.
Some other todos (for myself):
- File a ticket to do the same thing for AggregateUDF and WindowUDF (to maintain a consistent interface)
| ScalarFunction::new_udf(fun, transform_vec(args, &mut transform)?), | ||
| ), | ||
| ScalarFunctionDefinition::Name(_) => { | ||
| return internal_err!( |
I think it is probably ok to support tree walks on unresolved function names. I don't see any reason to throw an error
Yes, I believe it's necessary, but I'm not sure how it will be implemented right now, so I plan to leave this change to the next PR, which resolves function Expr names.
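As a rough illustration of the suggestion in this thread, the sketch below walks an expression tree and passes unresolved names through untouched instead of returning an error. The `Expr`, `FuncDef`, and `transform_up` names are simplified stand-ins invented for this example; they are not DataFusion's tree-node API.

```rust
// Hypothetical, simplified function definition: either known or only a name.
#[derive(Debug, Clone, PartialEq)]
enum FuncDef {
    BuiltIn(String),
    Name(String),
}

// Hypothetical, simplified expression tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    Function { def: FuncDef, args: Vec<Expr> },
}

// Bottom-up rewrite: children are rewritten first, then the node itself.
// The function definition is carried along unchanged, whether or not it is resolved.
fn transform_up<F>(expr: Expr, f: &mut F) -> Result<Expr, String>
where
    F: FnMut(Expr) -> Result<Expr, String>,
{
    let rewritten = match expr {
        Expr::Function { def, args } => {
            let mut new_args = Vec::with_capacity(args.len());
            for arg in args {
                new_args.push(transform_up(arg, f)?);
            }
            Expr::Function { def, args: new_args }
        }
        other => other,
    };
    f(rewritten)
}
```

Because the walk never inspects `def`, a tree containing `FuncDef::Name` is handled the same way as a fully resolved one; only passes that need the function's signature or implementation would have to reject it.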
@alamb Thank you! Review comments are addressed
alamb
left a comment
Thanks @2010YOUY01 -- I had one comment on name() signature but I also think we can do that as a follow on PR. As a reminder I want to leave this open a few more days to allow for comment prior to merging
datafusion/expr/src/expr.rs
| impl ScalarFunctionDefinition { | ||
| /// Function's name for display | ||
| pub fn name(&self) -> String { |
I think it would be better to allow the callsites to decide if they needed to make the string copy -- so passing back &str I think would make for a better API:
| pub fn name(&self) -> String { | |
| pub fn name(&self) -> &str { |
datafusion/expr/src/expr.rs
| Expr::ScalarUDF(ScalarUDF { fun, args }) => { | ||
| fmt_function(f, fun.name(), false, args, true) | ||
| Expr::ScalarFunction(ScalarFunction { func_def, args }) => { | ||
| fmt_function(f, &func_def.name(), false, args, true) |
for example, this callsite doesn't need an owned String, &str would work well
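The trade-off behind this suggestion can be sketched as follows. The types here are hypothetical stand-ins (in particular, storing a built-in's name as a `&'static str` is an assumption made so that every variant can lend out a borrowed name); the point is only that a `&str` return lets display code borrow while other callers copy explicitly.

```rust
use std::fmt;

// Hypothetical stand-in for a user-defined function.
struct Udf {
    name: String,
}

// Hypothetical stand-in for the function definition enum.
enum FuncDef {
    BuiltIn(&'static str),
    Udf(Udf),
    Name(String),
}

impl FuncDef {
    // Borrowing accessor: no allocation happens here.
    fn name(&self) -> &str {
        match self {
            FuncDef::BuiltIn(n) => *n,
            FuncDef::Udf(u) => u.name.as_str(),
            FuncDef::Name(n) => n.as_str(),
        }
    }
}

// A display call site only needs to read the name, so it borrows.
impl fmt::Display for FuncDef {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}()", self.name())
    }
}
```

A caller that does need ownership can still write `def.name().to_owned()`, which makes the copy visible at the call site rather than hiding it inside the accessor.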
alamb
left a comment
Looks great -- thanks again @2010YOUY01
Which issue does this PR close?
Part of #8157
Rationale for this change
See #8157 for the rationale:
Creation of Exprs for BuiltinScalarFunctions can be done statelessly. However, we can't do that for ScalarUDFs because they are registered within SessionState, so it's only possible to create them through that state.
For the ongoing plan to migrate functions from BuiltinScalarFunction to a ScalarUDF-based implementation (ref #8045), we would like the ScalarUDF-based functions to continue to support the original Expr construction API, so the Expr struct should have a variant for an unresolved name, and then we can add an AnalyzerRule to resolve the function.
This PR only includes refactor preparation for UDF Expr support.
What changes are included in this PR?
Unify Expr::ScalarFunction and Expr::ScalarUDF. This way we can support an unresolved function name variant.
Also, merge the execution path for BuiltinScalarFunction/ScalarUDF, to make future work towards #8045 a bit easier.
Are these changes tested?
Covered by existing tests.
Are there any user-facing changes?
Yes, see comments for API changes.
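The name-resolution step that the rationale defers to a later AnalyzerRule could look roughly like the sketch below. This is not the rule from any follow-up PR: `Registry`, `FuncDef`, `Expr`, and `resolve` are hypothetical names, and a plain `HashMap` stands in for the function registry held by the session state.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-in for a registered user-defined function.
#[derive(Debug, PartialEq)]
struct ScalarUDF {
    name: String,
}

// Hypothetical function definition: resolved UDF or a bare name.
#[derive(Debug, Clone, PartialEq)]
enum FuncDef {
    Udf(Arc<ScalarUDF>),
    Name(String),
}

// Hypothetical, simplified expression tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    Function { def: FuncDef, args: Vec<Expr> },
}

// Stand-in (assumption) for the registry that lives in the session state.
struct Registry {
    udfs: HashMap<String, Arc<ScalarUDF>>,
}

// Replace every `FuncDef::Name` with the registered UDF, or fail if it is unknown.
fn resolve(expr: Expr, registry: &Registry) -> Result<Expr, String> {
    match expr {
        Expr::Function { def, args } => {
            let mut new_args = Vec::with_capacity(args.len());
            for arg in args {
                new_args.push(resolve(arg, registry)?);
            }
            let def = match def {
                FuncDef::Name(name) => match registry.udfs.get(&name) {
                    Some(udf) => FuncDef::Udf(Arc::clone(udf)),
                    None => return Err(format!("unknown function: {}", name)),
                },
                resolved => resolved,
            };
            Ok(Expr::Function { def, args: new_args })
        }
        other => Ok(other),
    }
}
```

Under this arrangement, expressions can be built statelessly by name, and the lookup against the registry happens once, at analysis time, where the state is available.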