Optimize maybe_macroexpand
#1991
Merged
This PR optimizes and fixes inference for `maybe_macroexpand` and `split_funcname`. This reduces the time to first X (TTFX) and the running time.

The reason that better type inference lowers TTFX is that `maybe_macroexpand` is called somewhere in `SessionActions.open`. Via the `precompile` that we've put on `SessionActions.open`, the compiler uses type inference to figure out which methods it needs to compile. However, if the compiler cannot infer some type when stepping through the function bodies, it cannot compile the called methods. For example, we can make a function `g` which calls `f`, but make the object that is passed into `f` non-inferable via `Base.inferencebarrier`, and then call `precompile` on the outer function. When we do that, we can see that Julia didn't compile any specializations:
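The code block from the original post was lost in extraction; a minimal sketch of the experiment described above (the definitions of `f` and `g` are placeholders, not Pluto code) could look like this:

```julia
# Inner function `f`, and an outer function `g` that calls `f` through
# Base.inferencebarrier, which hides the argument's type from inference.
f(x) = x + 1
g(x) = f(Base.inferencebarrier(x))

# Ask the compiler to precompile the outer function for an Int argument.
precompile(g, (Int,))

# Inference could not see what type flows into `f`, so no specialization
# of `f` was compiled during precompilation; the call to `f` inside `g`
# remains a runtime dispatch.
```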
The compilation will happen just-in-time when we call `g`. On the other hand, without the inference barrier, precompilation hits both the inner and the outer function straight away:
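The original output is also missing here; a sketch of the contrast, again with placeholder functions `f` and `g` rather than Pluto code:

```julia
f(x) = x + 1

# No Base.inferencebarrier this time: inference sees that `f` receives an Int.
g(x) = f(x)

precompile(g, (Int,))

# Precompiling `g` for Int now also compiles the specialization f(::Int),
# so the first real call g(1) runs without further just-in-time compilation.
```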
Another thing that saved a surprising amount of compilation time is getting rid of broadcasting. It sounds a bit annoying, and I love broadcasting too, but in some cases it is worth it when broadcasting means extra compilation time for every person that is running Pluto. See also Chris saying "I'm the compiler now!" (SciML/OrdinaryDiffEq.jl#1465). I don't think we have to get rid of all broadcasts, mainly the ones in places where type inference falls back to `Any` and gives up precompiling. These cases can be found quite easily via JET.jl and `@report_opt annotate_types=true SessionActions.open`. Just search for `failed to optimize` in the output.

Anyway, time for the benchmarks:
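One quick aside before the numbers: to make the broadcasting point concrete, here is a hypothetical rewrite of the kind meant above (the helper functions are made up for illustration, not code from this PR):

```julia
# Broadcast version: the compiler has to specialize the whole broadcast
# machinery for this call, which adds first-run compilation time.
strip_prefixes_broadcast(names) = replace.(names, Ref("Main." => ""))

# Explicit-loop version: only `replace` and `push!` need to be compiled,
# and the result's element type (String) stays easy to infer.
function strip_prefixes_loop(names)
    out = String[]
    for name in names
        push!(out, replace(name, "Main." => ""))
    end
    return out
end
```

Both return the same result; the loop simply asks less of the compiler the first time it runs.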
Benchmarks of `maybe_macroexpand` on branch `main` versus branch `rh/maybe_macroexpand` (this PR) show about a 50 MiB reduction in allocations on `SessionActions.open`.

cc: @ghaetinger