
adding linsolve option for trust region method solver #278

Open
kpobrien opened this issue Jan 18, 2022 · 5 comments
@kpobrien

Hi, thanks for writing and maintaining this package! I'd like to get input on adding a linsolve option to the trust region method solver, identical to the linsolve option already available for the newton method solver.

A compelling use case is to perform a KLU factorization or other non-default factorizations before solving the linear system. For some problems, this leads to improvements in speed and memory usage. The proposed changes don't change the behavior or speed when used with default options. For example:
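For reference, the linsolve convention (as already used by the newton solver) is a three-argument callable that writes the solution of A x = b into x in place. A minimal sketch of the default behavior, using only the LinearAlgebra stdlib; the small matrix and vectors are made-up test data:

```julia
using LinearAlgebra

# Default linsolve as in newton.jl: solve A * x = b with the backslash
# operator and store the result in x in place.
default_linsolve = (x, A, b) -> copyto!(x, A \ b)

# Made-up small system for illustration.
A = [4.0 1.0; 1.0 3.0]
b = [1.0, 2.0]
x = zeros(2)
default_linsolve(x, A, b)
```

Passing a different callable with the same signature (e.g. one that calls klu) swaps the factorization without touching the solver internals.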

default linear solver

@time out=nlsolve(od,method = :trust_region,x)
14.935686 seconds (25.79 k allocations: 20.709 GiB, 1.04% gc time)
12.432642 seconds (25.83 k allocations: 20.709 GiB, 0.39% gc time)
15.237596 seconds (25.93 k allocations: 20.709 GiB, 0.35% gc time)

default behavior with changes below

@time out=nlsolve(od,method = :trust_region,x)
13.542611 seconds (25.98 k allocations: 20.709 GiB, 0.38% gc time)
15.107630 seconds (25.96 k allocations: 20.709 GiB, 0.36% gc time)
13.196935 seconds (25.96 k allocations: 20.709 GiB, 0.40% gc time)

perform KLU before solving the linear system with changes below

@time out=nlsolve(od,method = :trust_region,x,linsolve=(x, A, b) -> copyto!(x,klu(A)\b))
8.797286 seconds (28.68 k allocations: 13.647 GiB, 0.92% gc time)
8.853783 seconds (28.64 k allocations: 13.647 GiB, 0.55% gc time)
8.579454 seconds (28.58 k allocations: 13.647 GiB, 0.67% gc time)

Here are the diffs for the two files I changed, nlsolve.jl and trust_region.jl. I can make a PR if that's easier.

diff solvers/trust_region.jl ~/.julia/packages/NLsolve/gJL1I/src/solvers/trust_region.jl
48c48
<                  r, d, J, delta::Real)
---
>                  r, d, J, delta::Real, linsolve)
51c51
<         copyto!(p_i, J \ vec(r)) # Gauss-Newton step
---
>         linsolve(p_i, J, vec(r)) # Gauss-Newton step
116a117
>                        linsolve,
170c171
<         dogleg!(cache.p, cache.p_c, cache.pi, cache.r, cache.d, jacobian(df), delta)
---
>         dogleg!(cache.p, cache.p_c, cache.pi, cache.r, cache.d, jacobian(df), delta, linsolve)
234,235c235,237
<                       cache = NewtonTrustRegionCache(df)) where T
<     trust_region_(df, initial_x, convert(real(T), xtol), convert(real(T), ftol), iterations, store_trace, show_trace, extended_trace, convert(real(T), factor), autoscale, cache)
---
>                       cache = NewtonTrustRegionCache(df);
>                       linsolve=(x, A, b) -> copyto!(x, A\b)) where T
>     trust_region_(df, initial_x, convert(real(T), xtol), convert(real(T), ftol), iterations, store_trace, show_trace, extended_trace, convert(real(T), factor), autoscale, linsolve, cache)
diff nlsolve/nlsolve.jl ~/.julia/packages/NLsolve/gJL1I/src/nlsolve/nlsolve.jl
28c28
<                      autoscale)
---
>                      autoscale; linsolve=linsolve)
@ChrisRackauckas
Contributor

See the proposal in JuliaNLSolvers/NLSolvers.jl#24. Giving it a bit more infrastructure like that would make things like Jacobian-Free Newton-Krylov more automatic, and a bank of already-wrapped linear solvers would come along for the ride.
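As an aside, even the plain (x, A, b) callback can host a matrix-free Krylov solve, since such a solver only touches A through matrix-vector products. A hand-rolled conjugate-gradient sketch, purely illustrative (not any package's API) and assuming a symmetric positive definite operator:

```julia
using LinearAlgebra

# Hand-rolled conjugate gradient: A is only used through mat-vecs, so it
# could be any operator (e.g. a Jacobian-free directional derivative).
# Illustration only; assumes A is symmetric positive definite.
function cg!(x, A, b; tol=1e-10, maxiter=10length(b))
    fill!(x, 0)
    r = copy(b)                    # residual b - A*x for x = 0
    p = copy(r)
    rs = dot(r, r)
    for _ in 1:maxiter
        Ap = A * p
        alpha = rs / dot(p, Ap)
        x .+= alpha .* p
        r .-= alpha .* Ap
        rs_new = dot(r, r)
        sqrt(rs_new) < tol && break
        p .= r .+ (rs_new / rs) .* p
        rs = rs_new
    end
    return x
end

# Same three-argument convention as the proposed linsolve option.
krylov_linsolve = (x, A, b) -> cg!(x, A, b)

A = [4.0 1.0; 1.0 3.0]             # made-up SPD test matrix
b = [1.0, 2.0]
x = zeros(2)
krylov_linsolve(x, A, b)
```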

@kpobrien
Author

Thanks for your comment. I agree the LinearSolve.jl interface is a better long-term solution than passing the solver function directly. It would likely be a breaking change, though, because passing a linsolve function is already supported in solvers/newton.jl:

function newton(df::OnceDifferentiable,
                initial_x::AbstractArray{T},
                xtol::Real,
                ftol::Real,
                iterations::Integer,
                store_trace::Bool,
                show_trace::Bool,
                extended_trace::Bool,
                linesearch,
                cache = NewtonCache(df);
                linsolve=(x, A, b) -> copyto!(x, A\b)) where T
    newton_(df, initial_x, convert(real(T), xtol), convert(real(T), ftol), iterations, store_trace, show_trace, extended_trace, linesearch, linsolve, cache)

In my opinion, we should add the linsolve option to trust_region.jl to make it consistent with newton.jl while folks consider whether to add a LinearSolve.jl style interface (and someone finds the time to do it).

Since you mentioned caching factorizations in your comment at the link: caching helps nlsolve significantly, and it can be implemented with the above proposal.

without caching the factorization

@time out=nlsolve(od,method = :trust_region,x,linsolve=(x, A, b) -> copyto!(x,klu(A)\b))
8.595526 seconds (28.64 k allocations: 13.647 GiB, 3.96% gc time)
8.620742 seconds (28.58 k allocations: 13.647 GiB, 0.45% gc time)
8.982089 seconds (28.64 k allocations: 13.647 GiB, 3.81% gc time)

with caching the factorization (sparsity structure does not change)

F = klu(J)
@time out=nlsolve(od,method = :trust_region,x,linsolve=(x, A, b) -> copyto!(x,(klu!(F,A);F\b)))
6.310709 seconds (25.54 k allocations: 9.697 GiB, 2.86% gc time)
4.685632 seconds (25.52 k allocations: 9.697 GiB, 1.01% gc time)
6.036442 seconds (25.52 k allocations: 9.697 GiB, 0.55% gc time)
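The caching pattern is not KLU-specific; any factorization with an in-place refactorize step fits the same closure. A sketch with the dense LU from the LinearAlgebra stdlib (made-up matrices, purely illustrative):

```julia
using LinearAlgebra

# Factorize once up front, then refactorize into the same object inside
# the linsolve closure, reusing F's storage on every call.
A0 = [4.0 1.0; 1.0 3.0]
F = lu(A0)                                  # cached factorization object
cached_linsolve = (x, A, b) -> copyto!(x, (lu!(F, A); F \ b))

A1 = [5.0 2.0; 2.0 4.0]                     # "new" Jacobian values
b = [1.0, 2.0]
x = zeros(2)
cached_linsolve(x, A1, b)                   # x now holds A1 \ b
```

For dense LU the savings are modest, but for sparse factorizations like KLU, reusing the symbolic analysis is where the time goes, as the timings above show.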

@pkofod
Member

pkofod commented Feb 10, 2022

could you provide the example?

@kpobrien
Author

The example from above is too deeply embedded in other code to easily extricate. Here is a very contrived example (solving a linear equation), which is also faster with that change.

using SparseArrays
using NLsolve
using LinearAlgebra
using KLU
using BenchmarkTools


m = 50
C = Diagonal(rand(Complex{Float64},m)) + sprand(Complex{Float64},m,m,0.45)
d = rand(Complex{Float64},m)
x = zeros(Complex{Float64},m)  
F = zeros(Complex{Float64},m)  
J = spzeros(Complex{Float64},m,m)

function calcf!(F,x,C,d)
    F .= C*x-d
end

function calcj!(J,x,C,d)
    J .= C
    return nothing
end

function f!(F,x)
    calcf!(F,x,C,d)
    return nothing
end

function j!(J,x)
    calcj!(J,x,C,d)
    return nothing
end

j!(J,x)
FK = klu(J)
df = OnceDifferentiable(f!, j!, x, F, J);

@btime nlsolve(df,method = :trust_region,x,linsolve=(x, A, b) -> copyto!(x,(klu!(FK,A);FK\b)));
@btime nlsolve(df,method = :trust_region,x);

280.979 μs (356 allocations: 71.06 KiB)
928.750 μs (352 allocations: 1.85 MiB)

If you change the method to :newton you can see the improvement without changing any NLsolve.jl code. You should see a benefit for sparse matrices whenever refactorizing with a reused symbolic factorization is faster than factorizing from scratch: for example, when the sum of the last two timings below (the klu! refactorization plus the triangular solves) is smaller than either of the first two (backslash or a fresh klu).

@btime C \ d;
@btime FK = klu(C);
@btime klu!(FK,C);
@btime FK\d;

188.056 μs (72 allocations: 448.84 KiB)
93.439 μs (76 allocations: 114.13 KiB)
45.093 μs (52 allocations: 4.12 KiB)
3.784 μs (20 allocations: 2.27 KiB)

Let me know if you don't find this example convincing and I will make a nonlinear example.

@kpobrien
Author

It's also a little bit faster for your test case in test/sparse.jl

using NLsolve
using SparseArrays
using KLU
using BenchmarkTools

function f_sparse!(F, x)
    F[1] = (x[1]+3)*(x[2]^3-7)+18
    F[2] = sin(x[2]*exp(x[1])-1)
end

function g_sparse!(J, x)
    J[1, 1] = x[2]^3-7
    J[1, 2] = 3*x[2]^2*(x[1]+3)
    u = exp(x[1])*cos(x[2]*exp(x[1])-1)
    J[2, 1] = x[2]*u
    J[2, 2] = u
end

J = spzeros(2, 2)
x = rand(2)
F = rand(2)

g_sparse!(J,x)
FK = klu(J);
df = OnceDifferentiable(f_sparse!, g_sparse!, x, F, J)
@btime nlsolve(df,x,method = :trust_region,autoscale = true,linsolve=(x, A, b) -> copyto!(x,(klu!(FK,A);FK\b)));
@btime nlsolve(df,x,method = :trust_region,autoscale = true);

11.012 μs (408 allocations: 31.30 KiB)
17.567 μs (388 allocations: 82.74 KiB)
