Skip to content

Conversation

@jayzhan211
Copy link
Contributor

@jayzhan211 jayzhan211 commented Jun 5, 2024

Which issue does this PR close?

Part of #10731

Rationale for this change

What changes are included in this PR?

I move the test to slt if possible and trivial. Others are moved to core/tests

Are these changes tested?

Are there any user-facing changes?

Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Jun 5, 2024
@jayzhan211 jayzhan211 marked this pull request as ready for review June 6, 2024 11:37
}

#[test]
fn id_array_visitor() -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand why you moved these tests to the core crate (as they rely on aggregate functions that will soon not be direct dependencies on datafusion-optimizer

However, with this change now the tests for individual passes may be spread in two places (core and optimizer) so evaluating test coverage will be harder. I think we should try to keep all the tests together

Also, I think it is quite nice that the tests are with each optimizer pass (I found it quite helpful when I was working on avoiding as many copies in the optimizer).

Options I can see:

  1. Move all the other optimizer tests over to datafusion/core
  2. find a way to leave the tests in datafusion-optimizer

One idea I had about keeping the tests in datafusion-optimizer would be to add stu aggregate functions to the optimizer crate (for example we could add a sum() AggregateUDF that had the same signature as sum but didn't actually run (would panic if we tried to create an accumulator, etc)?

The downside is that then there is a danger that the stubs don't behave the same way as the actual functions and there are bugs / limitations when they are used together...

Copy link
Contributor Author

@jayzhan211 jayzhan211 Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for example we could add a sum() AggregateUDF that had the same signature as sum

Introduce another similar function for test only increase the maintain cost unless they are not changed frequently 🤔 I also prefer that the tests stay close to the code, maybe it is a acceptable tradeoff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, there are tradeoffs both way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update is that @jayzhan211 prototyped the stub approach here #10816

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate logical-expr Logical plan and expressions optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants