Improve efficiency of multiple optimizer passes #3892

@andygrove

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Improve efficiency of multiple optimizer passes

Describe the solution you'd like

Thanks to @isidentical for this suggestion.

The optimizer currently decides it is done by checking whether the last pass changed the plan, based on the Display representation of the plan. Instead, it might make sense to compute a unique plan id (bottom up), which could also be used to detect optimization cycles.

A very basic example, assuming each letter is a unique plan id, is A -> B -> C -> A -> B -> ... (repeating until the maximum number of passes is reached). Even though each plan differs from the previous one, the optimizer is cycling and we would still need to exit the loop. With a unique id we could store the ids seen so far in a set and break out of the loop as soon as known_plans.contains(new_plan.id) is true.
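
To make the idea concrete, here is a minimal sketch of how such a loop could look. Everything in it is an assumption made for illustration: `Plan` is a hypothetical stand-in and not DataFusion's `LogicalPlan`, and `plan_id`, `optimize`, `leaf`, and `cycling_pass` are made-up names. The id is a 64-bit hash, so two different plans could in principle collide; a real implementation would need to decide whether that risk is acceptable.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical plan node used only for this sketch.
#[derive(Clone, Debug)]
struct Plan {
    op: String,
    inputs: Vec<Plan>,
}

/// Convenience constructor for a plan node without inputs.
fn leaf(op: &str) -> Plan {
    Plan { op: op.to_string(), inputs: vec![] }
}

/// Compute a plan id bottom up: each child's id is computed first and
/// then hashed together with this node's operator description.
fn plan_id(plan: &Plan) -> u64 {
    let mut hasher = DefaultHasher::new();
    plan.op.hash(&mut hasher);
    for input in &plan.inputs {
        plan_id(input).hash(&mut hasher);
    }
    hasher.finish()
}

/// Apply `pass` repeatedly until it produces a plan id that was already
/// seen (a fixed point or a cycle) or `max_passes` is reached.
/// Returns the last accepted plan and the number of passes executed.
fn optimize(
    mut plan: Plan,
    max_passes: usize,
    pass: impl Fn(&Plan) -> Plan,
) -> (Plan, usize) {
    let mut known_plans = HashSet::new();
    known_plans.insert(plan_id(&plan));
    for i in 0..max_passes {
        let new_plan = pass(&plan);
        // `insert` returns false when the id is already in the set.
        if !known_plans.insert(plan_id(&new_plan)) {
            return (plan, i + 1);
        }
        plan = new_plan;
    }
    (plan, max_passes)
}

/// Toy pass that rewrites A -> B -> C -> A to mimic an optimization cycle.
fn cycling_pass(plan: &Plan) -> Plan {
    match plan.op.as_str() {
        "A" => leaf("B"),
        "B" => leaf("C"),
        _ => leaf("A"),
    }
}
```

With `optimize(leaf("A"), 100, cycling_pass)` the loop stops on the third pass, when plan A shows up again, instead of running all 100 passes. The same check also covers the ordinary "nothing changed" case, because an unchanged plan has the id of the plan that is already in the set.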

Describe alternatives you've considered

Additional context
Discussion at https://github.com/apache/arrow-datafusion/pull/3880/files#r998491734
