
Use DifferentiationInterface.jl #162


Closed
1-Bart-1 opened this issue Feb 20, 2025 · 20 comments · Fixed by #174
Labels
enhancement New feature or request

Comments

@1-Bart-1

I would like to experiment with using FiniteDiff.jl, as I expect it to be faster in my case. So I want to be able to provide an ADType that is then used both internally by ModelPredictiveControl.jl and by JuMP.
https://github.com/JuliaDiff/DifferentiationInterface.jl?tab=readme-ov-file
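
For reference, a minimal sketch of the kind of backend object I have in mind (these constructors come from ADTypes.jl; how ModelPredictiveControl.jl would accept them is of course up for discussion):

using ADTypes

fd_backend = AutoFiniteDiff()                # finite differences, what I'd like to try
ad_backend = AutoForwardDiff(chunksize = 4)  # forward-mode AD, as currently used internally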

@franckgaga
Member

Yes, that's also on my TODO list!

In the last commits, I struggled to completely remove the memory allocations at runtime with ForwardDiff.jl (without relying on StaticArrays.jl). It seems that the DifferentiationInterface.jl implementation is careful with the in-place differentiation APIs.

It's also possible to completely customize the AD that JuMP uses by changing the defaults here:

@operator(model, operator, dim, f[, ∇f[, ∇²f]])

Another advantage would be the option to provide the Hessian (or not). Right now the Hessian is skipped and Ipopt uses a BFGS approximation.
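
As a rough sketch of that kind of customization (the operator name op_f, the dimension 3, and the hand-written gradient ∇f below are made up for illustration; the @operator signature is the documented JuMP one):

using JuMP, Ipopt

model = Model(Ipopt.Optimizer)
f(x...) = sum(xi^2 for xi in x)
# in-place gradient, bypassing JuMP's default AD for this operator
function ∇f(g, x...)
    for i in eachindex(x)
        g[i] = 2 * x[i]
    end
    return nothing
end
@operator(model, op_f, 3, f, ∇f)   # no ∇²f passed: the solver falls back to a Hessian approximation
@variable(model, x[1:3] >= 1)
@objective(model, Min, op_f(x...))
optimize!(model)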

@1-Bart-1
Author

Why don't you want to use StaticArrays.jl? It should speed up a lot of operations. They provide some benchmarks in their README.
https://github.com/JuliaArrays/StaticArrays.jl

@franckgaga
Member

Yeah, nice bench!

But I think it's important to prioritize the built-in arrays of Julia. That's one of the main selling points of the language!

@franckgaga
Member

I could move to AbstractArray and AbstractVector, though, to let the user decide.

@1-Bart-1
Author

Letting the user decide would be a good solution.

@gdalle

gdalle commented Feb 25, 2025

In the last commits, I struggled to completely remove the memory allocations at runtime with ForwardDiff.jl (without relying on StaticArrays.jl). It seems that the DifferentiationInterface.jl implementation is careful with the in-place differentiation APIs.

Hi @franckgaga, can you explain what you mean by this? Happy to help if necessary.

@franckgaga
Member

franckgaga commented Feb 25, 2025

Hi @gdalle! I'm not using DifferentiationInterface.jl at all in the package right now; I directly use ForwardDiff.jl.

This is related to my question here on Discourse:

https://discourse.julialang.org/t/getting-rid-of-forwarddiff-jacobian-allocations-when-using-closures/123939/7

I will continue the discussion on Discourse for posterity.

@franckgaga
Member

franckgaga commented Feb 25, 2025

@gdalle False alarm: I benchmarked and ran @code_warntype on my Jacobian buffer, and there are no allocations or type instabilities at runtime.

I will push a slightly modified version of the buffer to prepare the ground for DifferentiationInterface.jl.

BTW, excellent work with this package, the API and the documentation are simple and clear 😄

ps: closures with many captured variables can get very mindf*** haha

@gdalle

gdalle commented Mar 9, 2025

Hi there! Now that #171 is merged, what's the next step?

@franckgaga
Member

Not much left; the next release will probably integrate DI.jl.

About that, @baggepinnen: the new default could be ForwardDiff.jl with the tracer sparsity detector, similar to the example here: https://juliadiff.org/DifferentiationInterface.jl/DifferentiationInterface/stable/tutorials/advanced/#Sparse-backends. Do you think it would be a good new default for nonlinear MPC?
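
Concretely, I'm thinking of something along the lines of the linked tutorial (the dense backend and the coloring algorithm below are just the tutorial's choices, not a final decision):

using DifferentiationInterface
import ForwardDiff
using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm

# dense ForwardDiff inside a sparse backend: sparsity pattern is detected once,
# then only the structurally nonzero entries are differentiated
sparse_backend = AutoSparse(
    AutoForwardDiff();
    sparsity_detector  = TracerSparsityDetector(),
    coloring_algorithm = GreedyColoringAlgorithm(),
)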

@baggepinnen
Member

Do you think it would be a good new default for nonlinear MPC?

Yes, if a sparse transcription is used. I don't think it would make sense as the default for single shooting.

@franckgaga
Member

franckgaga commented Mar 13, 2025

@gdalle Everything works as expected with dense differentiation using AutoForwardDiff; the transition was smooth!

Now, when it comes to sparse AD, I'm hitting a weird error with a cryptic stacktrace, and I cannot reproduce it in an MWE. To reproduce the error, you can install the ModelPredictiveControl#diff_interface_debug branch and run this:

using ModelPredictiveControl, JuMP
Ts = 4.0
A =  [  0.800737  0.0       0.0  0.0
        0.0       0.606531  0.0  0.0
        0.0       0.0       0.8  0.0
        0.0       0.0       0.0  0.6    ]
Bu = [  0.378599  0.378599
        -0.291167  0.291167
        0.0       0.0
        0.0       0.0                   ]
Bd = [  0; 0; 0.5; 0.5;;                ]
C =  [  1.0  0.0  0.684   0.0
        0.0  1.0  0.0    -0.4736        ]
Dd = [  0.19; -0.148;;                  ]
Du = zeros(2,2)
f(x,u,d,p) = p.A*x + p.Bu*u + p.Bd*d
h(x,d,p)   = p.C*x + p.Dd*d
model = NonLinModel(f, h, Ts, 2, 4, 2, 1, solver=nothing, p=(;A,Bu,Bd,C,Dd))
model = setop!(model, uop=[10,10], yop=[50,30], dop=[5])
mpc = NonLinMPC(model; Hp=10, transcription=MultipleShooting())
mpc = setconstraint!(mpc, ymax=[Inf,30.5])
unset_time_limit_sec(mpc.optim)
res = sim!(mpc, 15, x̂_0=zeros(6))

It gives this error:

ERROR: UndefRefError: access to undefined reference
Stacktrace:
  [1] getindex
    @ ./essentials.jl:917 [inlined]
  [2] iterate (repeats 2 times)
    @ ./array.jl:902 [inlined]
  [3] iterate
    @ ./iterators.jl:206 [inlined]
  [4] iterate
    @ ./iterators.jl:205 [inlined]
  [5] jacobian_tracers_to_matrix(xt::Vector{…}, yt::Vector{…})
    @ SparseConnectivityTracer ~/.julia/packages/SparseConnectivityTracer/aHOLr/src/parse_outputs_to_matrix.jl:17
  [6] _jacobian_sparsity(f!::Function, y::Vector{…}, x::Vector{…}, ::Type{…})
    @ SparseConnectivityTracer ~/.julia/packages/SparseConnectivityTracer/aHOLr/src/trace_functions.jl:88
  [7] jacobian_sparsity
    @ ~/.julia/packages/SparseConnectivityTracer/aHOLr/src/adtypes_interface.jl:60 [inlined]
  [8] _prepare_sparse_jacobian_aux(::DifferentiationInterface.PushforwardFast, ::Vector{…}, ::Tuple{…}, ::ADTypes.AutoSparse{…}, ::Vector{…})
    @ DifferentiationInterfaceSparseMatrixColoringsExt ~/.julia/packages/DifferentiationInterface/qrWdQ/ext/DifferentiationInterfaceSparseMatrixColoringsExt/jacobian.jl:61
  [9] prepare_jacobian
    @ ~/.julia/packages/DifferentiationInterface/qrWdQ/ext/DifferentiationInterfaceSparseMatrixColoringsExt/jacobian.jl:49 [inlined]
 [10] get_optim_functions(mpc::NonLinMPC{…}, optim::Model, grad_backend::ADTypes.AutoForwardDiff{…}, jac_backend::ADTypes.AutoSparse{…})
    @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:681
 [11] init_optimization!(mpc::NonLinMPC{…}, model::NonLinModel{…}, optim::Model, gradient::ADTypes.AutoForwardDiff{…}, jacobian::ADTypes.AutoSparse{…})
    @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:540
 [12] (NonLinMPC{})(estim::UnscentedKalmanFilter{…}, Hp::Int64, Hc::Int64, M_Hp::Matrix{…}, N_Hc::Matrix{…}, L_Hp::Matrix{…}, Cwt::Float64, Ewt::Float64, JE::ModelPredictiveControl.var"#141#145", gc!::ModelPredictiveControl.var"#142#146", nc::Int64, p::@NamedTuple{}, transcription::MultipleShooting, optim::Model, gradient::ADTypes.AutoForwardDiff{…}, jacobian::ADTypes.AutoSparse{…})
    @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:119
 [13] NonLinMPC(estim::UnscentedKalmanFilter{…}; Hp::Int64, Hc::Int64, Mwt::Vector{…}, Nwt::Vector{…}, Lwt::Vector{…}, M_Hp::Matrix{…}, N_Hc::Matrix{…}, L_Hp::Matrix{…}, Cwt::Float64, Ewt::Float64, JE::ModelPredictiveControl.var"#141#145", gc!::ModelPredictiveControl.var"#160#164", gc::ModelPredictiveControl.var"#142#146", nc::Int64, p::@NamedTuple{}, transcription::MultipleShooting, optim::Model, gradient::ADTypes.AutoForwardDiff{…}, jacobian::ADTypes.AutoSparse{…})
    @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:378
 [14] NonLinMPC
    @ ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:346 [inlined]
 [15] #NonLinMPC#138
    @ ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:284 [inlined]
 [16] top-level scope
    @ ~/Dropbox/Programmation/Julia/TestMPC/src/newtest.jl:30
Some type information was truncated. Use `show(err)` to see complete types.

I'm a bit clueless. Some findings:

  • It does not seem to be related to PreallocationTools.jl (the DiffCache objects), since I implemented an MWE of a function with get_tmp calls and it worked well.
  • I tried replacing TracerSparsityDetector() with SparseConnectivityTracer.TracerLocalSparsityDetector() in the screenshot below and I get the same error.

[Image: the sparse backend / sparsity detector configuration referenced above]

Do you have any tips?

@franckgaga
Member

franckgaga commented Mar 13, 2025

@gdalle OK, I stepped through with the debugger. It seems to be a bug in the interaction between PreallocationTools.jl and SparseConnectivityTracer.jl.

When we create a DiffCache with:

using PreallocationTools

n = 10
Ncache = 12  # ForwardDiff chunk size used elsewhere in the package (value here is illustrative)
a_cache = DiffCache(zeros(n), Ncache)

it initializes the Float64 and Dual caches with zeros, but it does not initialize the SparseConnectivityTracer.GradientTracer cache with zeros, so that cache starts with undefined values. Indeed, if I add this in the differentiated function:

println(get_tmp(a_cache, T))  # print the buffer retrieved for the current element type T

I get this at the beginning:

Any[#undef, #undef, #undef, #undef, #undef, #undef, #undef, #undef, #undef, #undef]

I will raise an issue on PreallocationTools.jl

edit: here SciML/PreallocationTools.jl#117

@franckgaga
Member

@gdalle

As just mentioned on SciML/PreallocationTools.jl#117, I have another issue: DiffCache does not support every AbstractADType. The following backends work with DiffCache:

  • AutoFiniteDiff
  • AutoForwardDiff
  • AutoReverseDiff

but not these:

  • AutoZygote
  • AutoEnzyme

It feels weird to integrate an interface that supports all ADTypes when some of them won't work at all with MPC.jl.

Are you aware of any alternative to PreallocationTools.jl for reducing the allocations in a function that will be called with floats, duals (and also SparseConnectivityTracer.GradientTracer)?

@gdalle

gdalle commented Mar 13, 2025

The idea behind the context type DifferentiationInterface.Cache was precisely to replace PreallocationTools, at least in simple cases.
At the moment it still allocates at every call, but with DI's preparation mechanism it is fairly simple to fix that by allocating only once during preparation. There hadn't been any demand so far, so I didn't get around to it, but you're the second person to ask today, so I might give it a shot.
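
For reference, a minimal sketch of what the Cache context looks like with an in-place Jacobian (the function f! and the scratch buffer below are made up; the prepare_jacobian/jacobian signatures are the documented DI ones):

using DifferentiationInterface
import ForwardDiff

# toy in-place function that needs a scratch buffer, passed as a Cache context
function f!(y, x, buf)
    buf .= 2 .* x
    y .= buf .+ 1
    return nothing
end

backend = AutoForwardDiff()
x, y, buf = rand(4), zeros(4), zeros(4)
prep = prepare_jacobian(f!, y, backend, x, Cache(buf))   # buffers with the right eltype are set up here
J = jacobian(f!, y, prep, backend, x, Cache(buf))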

@franckgaga
Member

Okay, I'll take a look at DI.Cache to see if it's applicable here.

@franckgaga
Member

franckgaga commented Mar 13, 2025

Yes! I can rely on Cache for my application. I do not need PreallocationTools anymore; one less dependency!

Yes, for sure, it would be perfect if the Caches were allocated only once at preparation! Keep up the good work!

edit: the code is also simpler without the multiple get_tmp calls!

@gdalle

gdalle commented Mar 13, 2025

I now have a clear view of how it can be done. I should be able to finish it tomorrow.

@gdalle

gdalle commented Mar 14, 2025

@franckgaga wanna take this branch for a spin, see if I got rid of all the allocations?

JuliaDiff/DifferentiationInterface.jl#741

@gdalle

gdalle commented Mar 15, 2025

The necessary changes are released
