Convergence logging #27

Open
mileslucas opened this issue Mar 25, 2020 · 25 comments · May be fixed by #32

Comments

@mileslucas

mileslucas commented Mar 25, 2020

Feature Request

Being able to log progress without having a set number of iterations would be great. Currently AbstractMCMC is using ProgressLogging as a backend and they've just added convergence sampling.

This is currently implemented in ProgressMeter.jl. I really like how it spits out the total time information in addition to just updating current values vs. threshold.

Example Usage

Something like this:

cur = 0
goal = 10
@withprogresslogging while true
    @logprogress cur goal
    cur += rand()
    cur < goal || break
end
@tkf
Collaborator

tkf commented Mar 26, 2020

It sounds like a good addition. I think a straightforward implementation would be to add a new type like ProgressLogging.Progress. Something like Convergence? Descent/Ascent?

I think the surface syntax requires a separate macro; e.g., @logconvergence, @logdescent, @logascent.

@c42f
Member

c42f commented Mar 26, 2020

> Something like Convergence?

This sounds good to me.

Convergence logging may also be interesting to TensorBoardLogger users and devs who at the moment have their own custom wrapper types and dispatch rules. @PhilipVinc do you have any thoughts?

@mileslucas
Author

In ProgressMeter they use threshold. So

@logthreshold cur goal

has the "least surprising" behavior to me, although I see the keyword threshold is tied up with the @progress macro, so it may complicate things here...

@tkf
Collaborator

tkf commented Mar 26, 2020

How about Descent/Ascent? Do we want to support only the minimization case? Optim has a maximize function, so I thought it might make sense to have different directions. Another possibility is to have a boolean flag in the Convergence struct.
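As a rough illustration of the boolean-flag option, here is a minimal sketch. The names ScalarConvergence and isconverged are hypothetical and are not part of ProgressLogging.jl; the only assumption is that "converged" means at or below the threshold when descending and at or above it when ascending.

```julia
# Hypothetical scalar record: one value, one threshold, one direction flag.
struct ScalarConvergence
    value::Float64
    threshold::Float64
    descending::Bool   # true: minimization, false: maximization
end

# Descending quantities converge from above, ascending ones from below.
isconverged(c::ScalarConvergence) =
    c.descending ? c.value <= c.threshold : c.value >= c.threshold

gradnorm = ScalarConvergence(7.92e-9, 1.0e-8, true)   # a gradient norm being minimized
accuracy = ScalarConvergence(0.85, 0.9, false)        # an accuracy being maximized
```

With a single flag, separate Descent and Ascent types are not needed, at the cost of a branch in every consumer.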

@c42f
Member

c42f commented Mar 26, 2020

My thought about threshold is that there are many different ways to measure convergence, and it would be nice to have a more general wrapper for convergence measures. I guess Optim.jl is a good place to look? For example, here https://julianlsolvers.github.io/Optim.jl/stable/#examples/generated/ipnewton_basics/ they list convergence criteria in the output reports, such as

 * Status: success

 * Candidate solution
    Minimizer: [1.00e+00, 1.00e+00]
    Minimum:   5.998937e-19

 * Found with
    Algorithm:     Interior Point Newton
    Initial Point: [0.00e+00, 0.00e+00]

 * Convergence measures
    |x - x'|               = 1.50e-09 ≰ 0.0e+00
    |x - x'|/|x'|          = 1.50e-09 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.80e-18 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 3.00e+00 ≰ 0.0e+00
    |g(x)|                 = 7.92e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    34
    f(x) calls:    63
    ∇f(x) calls:   63

@pkofod if we were to have a general solution for logging optimization progress, what would you need it to do?

@c42f
Member

c42f commented Mar 26, 2020

See also JuliaNLSolvers/Optim.jl#442

@tkf
Collaborator

tkf commented Mar 26, 2020

We can have multiple Descent instances with different IDs. So the progress monitor UI should be able to show different convergence criteria if we only define one or two convergence message types for scalar quantities.

@c42f
Member

c42f commented Mar 26, 2020

> We can have multiple Descent instances with different IDs

On the other hand, it's a single computational process with multiple convergence criteria which apply. So I would have thought the UI should present it as a single message. In that case we should also emit the progress as a single message (or the UI will need to try to figure out how to group them).

@tkf
Collaborator

tkf commented Mar 26, 2020

If they share the same parent ID, I suppose it's not too hard for the UI to figure out the relationships? Though it's appealing that we can reduce shouldlog overhead this way. I guess this may be one of the cases where a "vectorized API" is OK for performance reasons.

@tkf
Collaborator

tkf commented Mar 26, 2020

Hmm... I wonder if it is better to somehow combine this with the table log monitor JuliaLogging/TerminalLoggers.jl#8

[Screen recording: Peek 2019-11-11 19-07]

The idea is that each "column" can have independent threshold value, descent/ascent flag, and leave-trace-or-not flag.

@PhilipVinc
Member

PhilipVinc commented Mar 26, 2020

@c42f TensorBoard does not natively support logging convergence information like this, but I guess it would be a nice (and easy) addition.

The way TB.jl is structured, we parse one log message at a time. It would surely make things easier if you log just one message with all the metrics in it.

@c42f
Member

c42f commented Mar 26, 2020

> The way TB.jl is structured, we parse one log message at a time. It would surely make things easier if you log just one message with all the metrics in it.

So if you're tracking model accuracy or some other measures, the user needs to accumulate that each iteration and pass the resulting array to the logger?

@PhilipVinc
Member

> So if you're tracking model accuracy or some other measures, the user needs to accumulate that each iteration and pass the resulting array to the logger?

No, sorry, I simply meant that we do not have a special-cased convergence object. You can simply log at every iteration your loss function and those will all be displayed. For example

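using Logging, TensorBoardLogger

struct ConvergenceLog
    L1
    L2
    weirdnorm
end

# with_logger takes a do-block, not a begin-block
with_logger(TBLogger()) do
    for i = 1:100
        # each field of the struct is logged as its own scalar series
        loss = ConvergenceLog(exp(-i), exp(-2i), rand())
        @info "" loss
    end
end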

will by default create 3 plots: one for L1, one for L2, and one for weirdnorm. Like anything else.

@tkf
Collaborator

tkf commented Mar 26, 2020

@PhilipVinc You need to pass log_step_increment=0 to log multiple objects in one iteration, right? It may be what @c42f meant by "accumulate."

A "vectorized" message type is indeed more TensorBoardLogger.jl friendly, although TensorBoardLogger.jl needs to understand that it is an "interface type" rather than a "data type" (ref StructTypes.jl). More concretely, the message type would be something like

using UUIDs  # provides the UUID type

struct Convergence
    names::Vector{String}
    values::Vector{Float64}
    thresholds::Vector{Float64}
    descending::Vector{Bool}  # or maybe directions::Vector{Int8} that contains -1, 0, or 1 (0: no threshold)
    id::UUID
    parentid::UUID
    done::Bool
    step::Int
end

We could make it a subtype of AbstractDict{String,Float64} s.t. (c::Convergence)[name] == c.values[findfirst(==(name), c.names)] so that it works without special handling in TensorBoardLogger.jl. But it's strange that most of the semantic data is ignored at the interface type level.

(Edit: step::Int added)
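To make the name-based lookup concrete, here is a minimal sketch of the struct with a getindex method, using only the UUIDs standard library. It is an illustration of the idea in this comment, not ProgressLogging.jl API. It deliberately does not subtype AbstractDict, and the example values and the use of uuid4() for both IDs are placeholders.

```julia
using UUIDs

# Sketch of the proposed message type (field layout copied from the comment above).
struct Convergence
    names::Vector{String}
    values::Vector{Float64}
    thresholds::Vector{Float64}
    descending::Vector{Bool}
    id::UUID
    parentid::UUID
    done::Bool
    step::Int
end

# Dict-like lookup of a value by its name.
function Base.getindex(c::Convergence, name::AbstractString)
    i = findfirst(==(name), c.names)   # index of the matching name, or nothing
    i === nothing && throw(KeyError(name))
    return c.values[i]
end

# Placeholder data: one descending and one ascending quantity at step 10.
msg = Convergence(["loss", "accuracy"], [1e-4, 0.95], [1e-3, 0.9],
                  [true, false], uuid4(), uuid4(), false, 10)
```

Here msg["loss"] returns the current loss value, while thresholds, directions, and step stay available to consumers that know about the type.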

@tkf
Collaborator

tkf commented Mar 26, 2020

It's probably better to add a step field, as it may be costly to log it for each iteration.

@PhilipVinc
Member

TensorBoard handles Interface types (if I understand them correctly) if you define logable_propertynames(o::YourObject) to return only the properties that you want to log. That is not necessary if your object is a dict, of course.

I agree that making it a subtype of AbstractDict might be a nice addition, because then it can be iterated easily.

@tkf
Collaborator

tkf commented Mar 26, 2020

@PhilipVinc Once this is added to ProgressLogging.jl, it would be nice if TensorBoardLogger.jl depends on ProgressLogging.jl and defines appropriate methods for it. I don't think it's a good idea to make it a dict subtype or iterator of Pairs. Furthermore, IIUC, doing so won't propagate step information to TensorBoardLogger.jl.

@tkf
Collaborator

tkf commented Mar 26, 2020

One idea for the surface syntax is something like

@logconvergence x₁/θ₁ x₂\θ₂ ... xₙ/θₙ step=i

which logs

Convergence(
    names = ["x₁", "x₂", ..., "xₙ"],
    values = [x₁, x₂, ..., xₙ],
    thresholds = [θ₁, θ₂, ..., θₙ],
    descending = [true, false, ..., true],
    step = i,
    id = ...,
    parentid = ...,
    done = false,
)
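A sketch of how such a macro could take its arguments apart is shown below. The names parse_criterion, criteria_fields, and @convergence_fields are hypothetical. The sketch assumes `/` marks a descending quantity and `\` an ascending one, as in the example above, and it returns the fields as a named tuple rather than emitting a log record.

```julia
# Hypothetical helper: split `x/θ` (descending) or `x\θ` (ascending) into parts.
function parse_criterion(ex)
    if ex isa Expr && ex.head === :call && length(ex.args) == 3 && ex.args[1] in (:/, :\)
        op, value, threshold = ex.args
        return (name = string(value), value = value, threshold = threshold,
                descending = op === :/)
    end
    throw(ArgumentError("expected value/threshold or value\\threshold, got: $ex"))
end

# Plain function that assembles the fields. A real implementation would build
# the message type here and hand it to the logging machinery.
criteria_fields(names, values, thresholds, descending, step) =
    (names = names, values = values, thresholds = thresholds,
     descending = descending, step = step)

macro convergence_fields(args...)
    step = nothing
    criteria = []
    for arg in args
        if arg isa Expr && arg.head === :(=) && arg.args[1] === :step
            step = arg.args[2]                    # the `step=i` keyword
        else
            push!(criteria, parse_criterion(arg)) # an `x/θ` or `x\θ` criterion
        end
    end
    names = [c.name for c in criteria]
    values = [esc(c.value) for c in criteria]
    thresholds = [esc(c.threshold) for c in criteria]
    descending = [c.descending for c in criteria]
    return :(criteria_fields(String[$(names...)], Float64[$(values...)],
                             Float64[$(thresholds...)], Bool[$(descending...)],
                             $(esc(step))))
end

loss, accuracy, i = 0.02, 0.93, 7
fields = @convergence_fields loss/1e-3 accuracy\0.9 step=i
```

The names are taken from the source text of each value expression, so no separate label argument is needed.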

@tkf
Collaborator

tkf commented Mar 28, 2020

Another related package is the cool-looking SolverTraces.jl by @jagot (ref [ANN] SolverTraces.jl v0.1.0 - Community / Package announcements - JuliaLang).

@jagot Do you think the data type struct Convergence in #27 (comment) covers SolverTraces.jl use cases? (I think the formatting and color scaling thing can be handled by the UI part like TerminalLoggers.jl.)

@jagot

jagot commented Mar 30, 2020

Sure, I have something similar, but simpler: https://github.com/jagot/SolverTraces.jl/blob/master/src/tolerance.jl#L33
The colouring is performed by a linear color scale, that goes from red to green, and thus I use a variable transform to map from the logarithmic convergence scale to linear.
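One way such a transform could look is sketched below. This is not the SolverTraces.jl implementation; the function names are hypothetical, and the sketch assumes initial > tolerance > 0 and value > 0.

```julia
# Fraction of the way from `initial` down to `tolerance` on a log10 scale,
# clamped to [0, 1]. Assumes initial > tolerance > 0 and value > 0.
function convergence_fraction(value, tolerance, initial)
    t = log10(initial / value) / log10(initial / tolerance)
    return clamp(t, 0.0, 1.0)
end

# Linear red-to-green blend as an (r, g, b) tuple with components in [0, 1].
red_to_green(t) = (1.0 - t, t, 0.0)

# Colour for a residual that started at 1.0 and targets a tolerance of 1e-8.
colour = red_to_green(convergence_fraction(1e-4, 1e-8, 1.0))
```

The logarithm is what makes equal factors of improvement map to equal colour steps.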

@pkofod

pkofod commented Apr 5, 2020

> @pkofod if we were to have a general solution for logging optimization progress, what would you need it to do?

That's a good question. SolverTraces looks pretty neat above, but I'd love to be able to simultaneously log some arrays to a file or something like that. It's often interesting to look at the specific states and gradient vectors (and even Hessian approximations) to see if it appears to progress or not, and the norms are not always enough. So I'd love to be able to send summaries to one target and larger things like arrays to another target (a file probably).

@c42f
Member

c42f commented Apr 24, 2020

> So I'd love to be able to send summaries to one target and larger things like arrays to another target (a file probably).

The right way forward here may be to log the summary at progress level and the larger arrays at @debug level? Then, if you really need to see the detail, install a log filter into your active session which filters out the debug records and dumps them into a file?

The tricky thing is figuring out how to make stuff like "install a log filter" much simpler, while not being tempted to take complete control of the application-level logging from your library APIs. I guess your package could define a convenience function for controlling logging which is applicable to your package in particular. For example

report_convergence("somefile.txt", gradients=true, hessians=true) do
    optimize_the_things(...)
end

report_convergence would install a logger which collects messages specific to your library, and if necessary can put them in the file in a format that is applicable to your users.
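A minimal sketch of such a logger, using only the Logging standard library, might look like the following. DebugToFileLogger is a hypothetical name. The report_convergence signature here is simplified relative to the call above: it has no gradients/hessians keywords and does not filter by package.

```julia
using Logging

# Hypothetical logger: Debug-level records go to `io`, everything else is
# forwarded to the logger that was active before.
struct DebugToFileLogger{L<:AbstractLogger,IOT<:IO} <: AbstractLogger
    parent::L
    io::IOT
end

Logging.min_enabled_level(::DebugToFileLogger) = Logging.Debug
Logging.shouldlog(::DebugToFileLogger, level, _module, group, id) = true

function Logging.handle_message(logger::DebugToFileLogger, level, message,
                                _module, group, id, file, line; kwargs...)
    if level == Logging.Debug
        # One line per record: the message followed by key=value pairs.
        println(logger.io, message, " ", join(("$k=$v" for (k, v) in kwargs), " "))
    elseif level >= Logging.min_enabled_level(logger.parent) &&
           Logging.shouldlog(logger.parent, level, _module, group, id)
        Logging.handle_message(logger.parent, level, message, _module, group,
                               id, file, line; kwargs...)
    end
    return nothing
end

# Run `f` with debug records diverted to the file at `path`.
function report_convergence(f, path::AbstractString)
    open(path, "w") do io
        with_logger(f, DebugToFileLogger(current_logger(), io))
    end
end
```

Inside `report_convergence("somefile.txt") do ... end`, a call such as `@debug "gradient" g` would land in the file, while `@info` summaries still reach the previously active logger.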

@mileslucas
Author

Any updates on the ideas for this? I’d love to get this working for a few of my packages.

@tkf
Collaborator

tkf commented May 13, 2020

Unfortunately, the discussion diverged as there are many nice-to-have things...

I think it's important to focus on a minimal format that is:

  • easy to produce
  • easy to render by various progress monitor UIs
  • "rich enough" so that it is attractive to use by the monitor UIs and various iterative algorithm packages

I still think #27 (comment) is a decent sweet-spot. I don't think handling "large arrays/objects" fits in this perspective as it's hard for various monitor UIs to render/handle. Using dedicated solutions like TensorBoardLogger.jl seems to be the best solution for this.

mileslucas linked a pull request on Jul 22, 2020 that will close this issue
@goretkin

goretkin commented Mar 2, 2021

In some discrete search problems, there's not necessarily a measure of remaining effort, but I would like to log the effort done so far (i.e. iteration number). This is the purpose of ProgressMeter.ProgressUnknown.

Would the step field of #27 (comment) serve that purpose?
