-
Notifications
You must be signed in to change notification settings - Fork 1.8k
fix: lazy evaluation for coalesce #17357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
nuno-faria
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| # due to the reason describe in https://github.com/apache/datafusion/issues/8927, | ||
| # the following queries will fail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the test below now works, I think we could move this comment to the test below this one.
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @chenkovsky and @nuno-faria -- I think this PR is quite good and probably can be merged. My only potential concern is that we may mess up comet. Let's see if we get any more comments
| 02)--TableScan: t projection=[x, y] | ||
| physical_plan | ||
| 01)ProjectionExec: expr=[coalesce(1, CAST(y@1 / x@0 AS Int64)) as coalesce(Int64(1),t.y / t.x), coalesce(2, CAST(y@1 / x@0 AS Int64)) as coalesce(Int64(2),t.y / t.x)] | ||
| 01)ProjectionExec: expr=[CASE WHEN true THEN 1 ELSE CAST(y@1 / x@0 AS Int64) END as coalesce(Int64(1),t.y / t.x), CASE WHEN true THEN 2 ELSE CAST(y@1 / x@0 AS Int64) END as coalesce(Int64(2),t.y / t.x)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this should be able to be simplifed even more -- CASE WHEN true THEN ... should go to 1
I filed a ticket to track this idea:
| )); | ||
| } | ||
|
|
||
| let n = args.len(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite a clever implementation.
However, I worry it may cause problems for comet which uses physical evaluation directly
@comphead or @andygrove do you know if comet uses the COALESCE implementation directly?
Comet was actually just thinking about coalesce: Maybe @coderfender has some thoughts about this? |
It seems like this PR does the same thing described in the comet PR Thus it seems like a good thing to merge Let's wait a while to see if anyone else has comments, otherwise I'll plan to merge this PR |
|
@alamb , @mbutrovich I made changes to comet to fallback to CASE statement to replicate |
|
Thanks everyone! |
|
I believe this PR caused the extended tests to fail (as more queries now start woring): |
This reverts commit e5dcc8c.
…ion (#18567) Note this targets the branch-51 release branch ## Which issue does this PR close? - part of #17558 - resolves #17801 in the 51 release branch ## Rationale for this change - We merged some clever rewrites for `coalesce` and `nvl2` to use `CASE` which are faster and more correct (👏 @chenkovsky @kosiew ) - However, these rewrites cause subtle schema mismatches in some cases planning (b/c the CASE simplification nullability logic can't determine the correct nullability in some cases - see #17801) - @pepijnve has some heroic efforts to fix the schema mismatch in #17813 (comment), but it is non trivial and I am worried about merging it so close to the 51 release and introducing new edge cases ## What changes are included in this PR? 1. Revert #17357 / e5dcc8c 3. Revert #17991 / ea83c26 2. Revert #18191 / 22c4214 2. Cherry-pick 6202254, a test that reproduces the schema mismatch issue (from #18536) 3. Cherry-pick 735cacf, a fix for the benchmarks that regressed due to the revert (from #17833) 4. Update datafusion-testing (see separate PR here) for extended tests (see apache/datafusion-testing#15) ## Are these changes tested? Yes I added a new test ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. -->
Which issue does this PR close?
Rationale for this change
arguments of coalesce are evaluated eagerly.
What changes are included in this PR?
simplify coalesce to case when expr.
Are these changes tested?
UT
Are there any user-facing changes?
No