Unified nlsolve! implementation #325
|
I forgot to mention that I added an additional status In the latest commit I moved the implementation of nlsolve! to different functions, such that a similar logic can be used for general |
|
This needed to happen.
Good
Yes, I think in one of the issues I mentioned we should do this.
I like that
Makes sense.
So then we should just do a semi-implicit nlsolver as well?
Interesting. |
I think that's the way to go. Commit to it being generic and make specific caches add the things they need. My goal here is to get a documentable interface for adding new nlsolvers.
This one I'll have to think about. For now, having the integrator isn't bad, so that's fine. Eventually making it integrator free could make it useful outside of diffeq? Maybe. |
|
Let's get other eyeballs on this. How does it interact with DAEs? How does it interact with LHL? Are there any more changes people can think of? I think it looks good, but if there's a use case that needs something more than this then we should plan for it since at this point we'd know what those uses are. |
|
I just checked this out and it seems that DAEs can be handled better in this approach. Now regarding the conflicting PRs all around, I have a way -
A function like this:
```julia
function Base.getproperty(nls::NLSolver, s::Symbol)
  if s === :uf
    return nls.cache.uf
  else
    return getfield(nls, s)
  end
end
```
Then the next milestone would be DAEs. I get a good feeling about this 😄 |
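The forwarding idea above can be tried in isolation. The following is only a minimal sketch: `ToyCache` and `ToyNLSolver` are hypothetical stand-ins invented for illustration, not the DiffEqBase types, and the `uf` field holds a plain string instead of a real function wrapper.

```julia
# Hypothetical stand-ins for the cache and solver structs (not DiffEqBase types).
struct ToyCache
    uf::String
end

struct ToyNLSolver
    z::Float64
    cache::ToyCache
end

# Forward `nls.uf` to the cache so that downstream code written against the
# old field layout keeps working; all other names fall back to real fields.
function Base.getproperty(nls::ToyNLSolver, s::Symbol)
    if s === :uf
        return getfield(nls, :cache).uf
    else
        return getfield(nls, s)
    end
end

nls = ToyNLSolver(1.0, ToyCache("wrapped f"))
println(nls.uf)        # forwarded to the cache
println(nls.cache.uf)  # direct access still works
```

Using `getfield(nls, :cache)` inside the method avoids a second pass through `getproperty`, although `nls.cache` would also work because `:cache` takes the fallback branch.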
|
I still don't see why DAEs cannot use the same nlsolver if we try hard enough. It's exactly the same problem. In ODEs, you move everything to one side and you get |
Yes we can, and if this PR takes more time than we expect, I can make the changes by then. I have some of the changes already done, I will complete that and show that in a PR. |
I think we should approach these PRs in a different order. IMO SciML/OrdinaryDiffEq.jl#876 should be merged first, since tests pass and it does not require any new DiffEqBase release, so it's safe to do. I'm still confused about some aspects of the PseudoNLsolver changes, and I think we should prioritize the general NLsolver changes in DiffEqBase. IMO, the most important thing is to change the DiffEqBase version bounds in OrdinaryDiffEq and StochasticDiffEq to the standard caret specifiers, so that we can release breaking changes of NLsolver without breaking these packages. Then we should update the existing drafts for fixing OrdinaryDiffEq and StochasticDiffEq, which should only require replacing some field names and accesses. IMO we should keep any algorithm changes or improvements in separate PRs and just make sure that we have a version ready that plays nicely with the DiffEqBase changes. I quickly did this partly to be able to run the mass matrix tests in OrdinaryDiffEq. I think other features should only be considered if they might require additions or changes in the DiffEqBase PR; then we could change that PR. If we agree on it, we release it with a version number that is not supported by OrdinaryDiffEq and StochasticDiffEq. Then we merge the fixes in OrdinaryDiffEq and StochasticDiffEq, and release a version that is compatible with the latest DiffEqBase version. And then we add new features and improvements in the downstream packages. |
|
I've merged SciML/OrdinaryDiffEq.jl#876 and started to fix the upper bounds of the dependencies of OrdinaryDiffEq and StochasticDiffEq. IMO, otherwise the main thing left to do is to check if the current design of this PR is sufficient for DAEs as well.
I've never worked with DAEs, so I might be wrong, but wouldn't just dispatching on Thinking a bit more about this, it seems slightly weird to have such ODE-specific algorithms in DiffEqBase. Maybe the actual implementations of the algorithms (or maybe rather the |
Yes they are used in StochasticDiffEq which is why they were moved here. We could have a StarDiffEq.jl for high level *DiffEq handling, but that hasn't seemed necessary yet.
We could do that, but I'm not convinced we even have to. It should be even simpler. |
|
Yes, looking at the algorithms, I don't think we need to have any new perform steps for DAEs. It should work by just changing |
|
For a quick reference, take a look at DASSL: https://www.osti.gov/servlets/purl/5882821 . We might need to add a hook for updating |
```julia
    # check convergence and divergence criteria
    check_status!(nlsolver, integrator)
  end
```
|
|
You forgot to update `z` in the last step.
```julia
function nlsolve!(nlsolver::NLSolver{algType, iip}, integrator) where {algType, iip}
  preamble!(nlsolver, integrator)

  while get_status(nlsolver) === SlowConvergence
    # (possibly modify and) accept step
    apply_step!(nlsolver, integrator)

    # compute next iterate
    perform_step!(nlsolver, integrator)

    # check convergence and divergence criteria
    check_status!(nlsolver, integrator)
  end

  # update z with the last iterate
  if iip
    recursivecopy!(nlsolver.z, nlsolver.gz)
  else
    nlsolver.z = nlsolver.gz
  end

  postamble!(nlsolver, integrator)
end
```
This one should work.
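To illustrate why the two branches of the final update differ, here is a small sketch under stated assumptions: `ToyState` and `accept_last_iterate!` are hypothetical names, and `copyto!` from Base stands in for `recursivecopy!` so that the example has no dependencies.

```julia
# Hypothetical container holding the accepted value `z` and the latest iterate `gz`.
mutable struct ToyState{T}
    z::T
    gz::T
end

function accept_last_iterate!(s::ToyState, iip::Bool)
    if iip
        # in-place: overwrite the existing buffer so that other references
        # to `s.z` observe the new values
        copyto!(s.z, s.gz)
    else
        # out-of-place: rebind the field to the new (immutable) value
        s.z = s.gz
    end
    return s
end

buf = [0.0, 0.0]
s = ToyState(buf, [1.0, 2.0])
accept_last_iterate!(s, true)    # `buf` itself is updated, `s.z` is still `buf`

t = ToyState(0.0, 3.0)
accept_last_iterate!(t, false)   # `t.z` now refers to the value of `t.gz`
```

In the in-place case the buffer identity is preserved, which matters if other caches hold a reference to the same array.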
Thanks, I had noticed this issue as well some days ago. Unfortunately, there are still some issues. However, I've come to believe that it's wrong to push such a huge update to DiffEqBase directly. IMO we should rather update OrdinaryDiffEq to the new algorithm design without touching DiffEqBase, and then move these changes upstream. That way it's also easier to integrate DAE support in the redesign. As I mentioned in the PR and on Slack, I'm currently preparing such a branch of OrdinaryDiffEq that also adds the changes step by step, to separate their different parts more clearly.
I hope to be done with it by the end of the week, but since I'm currently on vacation and travelling around, it might take some additional days...
Since the changes of the non-linear solvers are going to be breaking anyway, I took this to the extreme and changed the structure of the non-linear solvers even more fundamentally than in #315.
Basically, with this PR we get a unified implementation of `nlsolve!` for general `NLSolver` structs, leading to identical convergence and divergence checks for different algorithms. New algorithms only have to implement a `perform_step!(nlsolver, integrator, iter)` function that computes the next iterate `nlsolver.z` based on the previous value `nlsolver.zprev`. Implementations can be optimized, e.g., by caching residuals and implementing `norm_of_residuals(nlsolver, integrator)` as well, which otherwise by default just calls into `calculate_residuals` with `nlsolver.zprev` and `nlsolver.z` and tolerances and norm of `integrator`. Moreover, further customization is possible by implementing `preamble!(nlsolver, integrator)` (which, e.g., can also be used for caching precalculated values) and `loopheader!(nlsolver, integrator, iter)` (which by default only updates `nlsolver.zprev` with `nlsolver.z` but can, e.g., be used to perform Anderson acceleration).

Since the `nlsolve!` interface is so generic, I moved most fields (such as `dz` and `k`) to the cache and only kept the fields in `NLSolver` that are part of the problem (`tmp`, `gamma`, and `c`), the convergence/divergence checks (`kappa`, `eta_old`, `fast_convergence_cutoff`), or the general algorithm logic (`z`, `zprev`, `max_iter`, `nl_iters`, `status`) and `cache` and `alg`. This choice is of course debatable and I'm not sure if it's the best one. I think, one could argue that fields that are probably used by most algorithms (such as `dz` or `k`, maybe?) should be added to the `NLSolver` struct; however, `k` is not needed for the out-of-place methods and in some methods one does not compute `dz` explicitly while computing the next iterate, so there might be cases in which one would not need these fields. On the other hand, one could also argue that `p` and `dt` are part of the mathematical problem formulation `dt⋅f(tmp + γ⋅z, p, t + c⋅dt) - z = 0`, and hence should be part of the `NLSolver` struct as well. Maybe one would even want a solver that is almost independent from the `integrator`, such that only values such as `tmp`, `t` or other values in the cache are updated from the integrator at the start of `nlsolve!` and then the algorithm runs (almost) independently and just returns a solution `z` at the end.

In addition to the generic `nlsolve!` this PR contains the following changes (which are also debatable):
- `u` is never touched, in contrast to the current implementation which actually changes `integrator.u` in https://github.com/JuliaDiffEq/DiffEqBase.jl/blob/master/src/nlsolve/functional.jl#L205 (but only for the in-place version, so there exists an inconsistency between both versions at the moment)
- […] `z - zprev` is computed based on the maximum values of `z` and `zprev` (at the moment, in lines such as https://github.com/JuliaDiffEq/DiffEqBase.jl/blob/master/src/nlsolve/functional.jl#L221 it's based on `u` (which changes in every iteration only in the in-place variant) and `uprev`)
- […] `G(z_k) - z_k` instead of `z_{k+1} - z_k`
- `iipnlsolve` and `oopnlsolve` are replaced with `build_nlsolver` which has a signature very similar to `alg_cache`
- […] `build_jac_config` and `build_nlsolver` (and `build_solution`), `get_iip_uf` and `get_oop_uf` are replaced with `build_uf`
- […] `J` and `W` to `build_nlsolver`, the implementation of `build_nlsolver` calls `build_J_W` for `NLNewton`
- […] `droptol`, `fast_convergence_cutoff`, etc.) whose types are not fixed (such as `aa_start`) are converted to the appropriate unitless types of `u` or `t` and added to the `NLSolver` struct if they are used for the convergence/divergence checks or the caches otherwise
- […] `droptol` is changed to `1e10` (according to Walker's implementation of Anderson acceleration, see the latest changes in NLsolve)
- […] `Base.resize!(nlcache, i)`, which is the default fallback of `Base.resize!(nlcache, nlsolver, integrator, i)`, which can be implemented instead if the information in the cache alone is not sufficient
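
To make the proposed structure more concrete, here is a rough, self-contained sketch of the idea: one generic loop with shared status checks, where an algorithm only supplies `perform_step!`. Everything in it is a simplifying assumption for illustration, not the code of this PR: the names `ToySolver`, `Functional`, and `toy_nlsolve!` are hypothetical, the problem is scalar, there is no `integrator`, `p`, or `t`, and the convergence check is a plain absolute tolerance on `z - zprev` instead of the `kappa`/`eta` logic.

```julia
# Simplified status values, loosely mirroring the statuses discussed in this PR.
@enum Status Convergence SlowConvergence Divergence MaxIters

abstract type ToyAlg end
struct Functional <: ToyAlg end   # plain fixed-point (functional) iteration

# Hypothetical scalar solver for dt*f(tmp + gamma*z) - z = 0.
mutable struct ToySolver{A<:ToyAlg,F}
    alg::A
    f::F
    tmp::Float64
    gamma::Float64
    dt::Float64
    z::Float64
    zprev::Float64
    max_iter::Int
    nl_iters::Int
    status::Status
end

# Default loop header: only shift the current iterate to `zprev`.
loopheader!(s::ToySolver, iter) = (s.zprev = s.z; nothing)

# The only algorithm-specific method: compute the next iterate from `zprev`.
function perform_step!(s::ToySolver{Functional}, iter)
    s.z = s.dt * s.f(s.tmp + s.gamma * s.zprev)
    return nothing
end

# Generic loop shared by all algorithms, with identical status checks.
function toy_nlsolve!(s::ToySolver; tol = 1e-10)
    s.status = SlowConvergence
    s.nl_iters = 0
    while s.status === SlowConvergence
        s.nl_iters += 1
        loopheader!(s, s.nl_iters)
        perform_step!(s, s.nl_iters)
        res = abs(s.z - s.zprev)
        if !isfinite(res)
            s.status = Divergence
        elseif res < tol
            s.status = Convergence
        elseif s.nl_iters >= s.max_iter
            s.status = MaxIters
        end
    end
    return s.z
end

# f(u) = -u, tmp = 1, gamma = 0.5, dt = 0.1; the exact solution is z = -0.1/1.05.
solver = ToySolver(Functional(), u -> -u, 1.0, 0.5, 0.1, 0.0, 0.0, 20, 0, SlowConvergence)
z = toy_nlsolve!(solver)
```

A second algorithm would add another `ToyAlg` subtype and its own `perform_step!` method (and optionally a specialized `loopheader!`), while `toy_nlsolve!` stays untouched.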