Skip to content

Conversation

@notatallshaw
Copy link
Member

@notatallshaw notatallshaw commented Oct 14, 2024

Fixes: #12993
Fixes: #12990
Fixes: #12430
Fixes: #13030

This PR is built on top of #12982 so that the unit tests can be expanded, either that PR can be reviewed first, or this PR can supplant that PR.

I have developed some benchmark scripts to ensure that changes to pip's resolution algorithm don't regress common real world requirements: https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks.

I plan to keep building out more scenarios, you can see the current ones so far here: https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks/tree/main/scenarios

Upon testing this PR compared to pip 24.2 I see one small regressions and two big improvements:

Difference for scenario scenarios/problematic.toml - autogluon:
    	Success: False -> True.
    	Failure Reason: Build Failure -> None.

Difference for scenario scenarios/problematic.toml - boto3-urllib3-transient:
    	Number of packages processed: 869 -> 871

Difference for scenario scenarios/big-packages.toml - apache-airflow-all:
    	Number of requirements processed: 593 -> 592
    	Number of packages processed: 681 -> 661

The fact that autogluon can resolve is a big improvement, apache-airflow[all] gets a noticeable improvement in how many packages it has to process (and this has real time improvement, as the number of packages processed can have O(n^2) complexity) , and a scenario involving boto3 and urllib3 as transient requirements gets a small regression in having to process 2 more packages.

I am hoping to find more real world scenarios where this has a noticeable difference, but I think these results are sufficient to show this approach is a net positive.

@notatallshaw
Copy link
Member Author

notatallshaw commented Oct 14, 2024

Very tentatively adding this to the 24.3 milestone on the basis of:

But I understand if no maintainer is available to review.

@notatallshaw
Copy link
Member Author

notatallshaw commented Oct 15, 2024

Added more problematic scenarios in: https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks/blob/main/scenarios/problematic.toml

And found this also fixes #12430 (which was merged into another issue, but the specific resolution the user had is now solved by this).

@potiuk
Copy link
Contributor

potiuk commented Oct 15, 2024

I do not know pip resiolution internals - but the rules explained make sense and might improve a number of cases indeed.

@notatallshaw
Copy link
Member Author

notatallshaw commented Oct 18, 2024

I took a look to see whether it made any difference to put upper bound preference above or below backtracking cause preference, and at least in the scenarios I currently have in https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks/blob/main/scenarios it didn't make any significant difference (there was a very slight regression of apache-airflow-beam putting it below, as it visited 1 extra package).

So I consider this good in its current position, and if I find a scenario in the future, or a user reports one, where it does make a significant difference, then it can be changed.

@notatallshaw
Copy link
Member Author

Found a minor improvement, in acryl-datahub[all] which has over 300 total dependencies, it visited 1 less requirement, 6 less packages, and produced a slightly better solution: notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks#2 (comment)

@sbidoul
Copy link
Member

sbidoul commented Oct 26, 2024

While this looks very reasonable I'd prefer to have another resolver expert (which I am not, unfortunately) to look into this. So postponing.

@sbidoul sbidoul modified the milestones: 24.3, 25.0 Oct 26, 2024
@notatallshaw
Copy link
Member Author

While this looks very reasonable I'd prefer to have another resolver expert (which I am not, unfortunately) to look into this. So postponing.

I knew this one was pretty unlikely but I thought I'd give it a shot since the recent real world issues raised that this solves.

@notatallshaw
Copy link
Member Author

notatallshaw commented Nov 10, 2024

Going to make a single follow up PR once #13001 lands, I'll comment here once done.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 26, 2024
@notatallshaw notatallshaw removed this from the 25.0 milestone Dec 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.