[RDF] Improvements for the progress bar #21181

hageboeck · 2026-02-06T16:50:03Z

I realised that we have enough information to keep the progress bar from jumping around as new files are opened.
So far, when a file was split into ranges, RDF used the upper end of each range to update the estimated number of events, so the estimated completion times were wrong until the last cluster was opened. Furthermore, unopened files weren't taken into account when estimating completion.

With this PR, the number of entries is obtained from the TTree or RNTuple data sources, and the number of files that haven't been opened yet is taken into account, which improves the estimate significantly. Now, the bar doesn't jump backwards (but it might advance a bit faster/slower when the next files to be opened have significantly fewer/more events than the previous ones).

Furthermore, I did a lot of refactoring, trying to outline as much as possible into the .cxx, reducing the number of times locks are taken, etc. (see the list at the end). Lastly, when the output doesn't go to a tty, the width of the line and the frequency of updates are reduced so that log files aren't cluttered unnecessarily.
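As an illustration of the tty-dependent behaviour, here is a minimal sketch, not the code from this PR. The names `BarLayout`, `ChooseLayout` and `LayoutFor` are hypothetical. The 60-character width and 10-second interval for non-tty output are taken from the commit message further down; the 1-second terminal interval is an assumption made only for this example. `isatty` and `fileno` are the POSIX calls.

```cpp
#include <chrono>
#include <cstdio>
#include <unistd.h> // POSIX: isatty(); fileno() comes with <cstdio> on POSIX systems

// Hypothetical layout description for the progress bar.
struct BarLayout {
   unsigned width;                // characters available for the line
   std::chrono::seconds interval; // time between two updates
};

// Pure decision function, kept separate so it can be tested without a terminal.
BarLayout ChooseLayout(bool isTerminal, unsigned terminalWidth)
{
   if (isTerminal)
      return {terminalWidth, std::chrono::seconds(1)}; // 1 s is an assumption
   // Values quoted in the commit message for output that goes to a file.
   return {60u, std::chrono::seconds(10)};
}

// Queries the stream and picks the layout accordingly.
BarLayout LayoutFor(std::FILE *stream, unsigned terminalWidth)
{
   return ChooseLayout(isatty(fileno(stream)) != 0, terminalWidth);
}
```

Calling `LayoutFor(stdout, 120)` would give the wide, frequently updated layout in an interactive shell and the narrow, slow one when stdout is redirected to a log file.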

New look

|=>          |   [Elapsed: 0:10m  files: 3 / 4  events: 42886000 / (184621239 + x)  4.29e+06 evt/s  remaining ca.: 0:47m]
|===>        |   [Elapsed: 0:20m  files: 3 / 4  events: 94453000 / (184621239 + x)  4.72e+06 evt/s  remaining ca.: 0:32m]
|======>     |   [Elapsed: 0:30m  files: 3 / 4  events: 144333000 / (184621239 + x)  4.81e+06 evt/s  remaining ca.: 0:21m]
|========>   |   [Elapsed: 0:40m  files: 4 / 4  events: 189735000 / 246161652  4.74e+06 evt/s  remaining ca.: 0:11m]
|==========> |   [Elapsed: 0:50m  files: 4 / 4  events: 226096000 / 246161652  4.52e+06 evt/s  remaining ca.: 0:04m]
[Total elapsed time: 0:56m  processed files: 4 / 4  processed evts: 246161652 / 246161652]

Check the times: the first two estimates of the total run time (elapsed + remaining) are 57s and 52s, and the run completes in 56s.
The total number of events is also shown with "+ x" until all files have been opened.
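The "+ x" notation can be sketched as follows. This is only an illustration of the display rule described above; `FormatEventCount` and its parameters are hypothetical and are not the helper used in RDFHelpers.

```cpp
#include <string>

// Hypothetical helper: formats "events: processed / total", marking the total
// as a lower bound ("+ x") while some files have not been opened yet.
std::string FormatEventCount(unsigned long long processed, unsigned long long knownTotal,
                             unsigned openedFiles, unsigned totalFiles)
{
   std::string out = "events: " + std::to_string(processed) + " / ";
   if (openedFiles < totalFiles)
      out += "(" + std::to_string(knownTotal) + " + x)"; // more events will be added later
   else
      out += std::to_string(knownTotal); // all files seen, total is final
   return out;
}
```

With the values from the first and fourth lines of the sample output, this yields `events: 42886000 / (184621239 + x)` and `events: 189735000 / 246161652`.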

Old look

|>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |   [Elapsed time: 0:01m  processing file: 3 / 4  processed evts: 1000 / 24650850  6.03e+02 evt/s 11:20:54h  remaining time (per file being processed)]   
|=======================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |   [Elapsed time: 0:02m  processing file: 3 / 4  processed evts: 4529000 / 24650850  2.26e+06 evt/s 0:08m  remaining time (per file being processed)]   
|===============================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================
=======================================================================================================================================================================================================================================================================================================================================================================================================================================================================================>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |   [Elapsed time: 0:03m  processing file: 3 / 4  processed evts: 10675000 / 24650850  3.56e+06 evt/s 0:03m  remaining time (per file being processed)]

Check the times again: for a workflow of about 1 minute, the estimated total times (elapsed + remaining) in the first 3 updates are 11h, 10s and 6s. I understand that this is per file, but the new version above seems to do a better job.
And the width when written to a file seems to be somewhat arbitrary. 🙂

More details of the refactoring

- Handle locking logic for updates with a single RAII object, update locks to C++17.
- Remove a lock and mutex that didn't have any effect.
- Reduce repeated function calls to functions that hold locks.
- Simplify computation of average number of events.
- Relax memory order of the atomics to what's necessary.
- Outline as many functions as possible. Since RDFHelpers.hxx goes through JITting and pcms, it gets compiled frequently, so outlining is probably beneficial.
- Make a lot of members const to avoid unintentional modifications. This might be interesting because it's an MT context.
- Collect helper functions in one anonymous namespace.
- Remove a constructor argument that was ignored.
- Bring back evts/s to the final line that's printed.
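To make two of the items above more concrete (relaxed atomics and a single RAII lock for updates), here is a generic sketch of the pattern. It is not the code from this PR: the class `ProgressCounter` and all of its members are hypothetical, and the real progress bar may split responsibilities differently.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>

// Hypothetical counter shared by worker threads.
class ProgressCounter {
   std::atomic<std::size_t> fProcessed{0}; // event counter, no ordering needed
   std::mutex fUpdateMutex;                // serialises the (rare) bar updates
   std::size_t fUpdates = 0;               // guarded by fUpdateMutex

public:
   // Workers only need atomicity of the increment, so relaxed ordering is enough.
   void Add(std::size_t n) { fProcessed.fetch_add(n, std::memory_order_relaxed); }

   std::size_t Processed() const { return fProcessed.load(std::memory_order_relaxed); }

   // One RAII object owns the whole update. If another thread is already
   // updating, this thread skips the update instead of blocking.
   bool TryUpdate()
   {
      std::unique_lock<std::mutex> lock(fUpdateMutex, std::try_to_lock);
      if (!lock.owns_lock())
         return false;
      ++fUpdates; // a real implementation would redraw the bar here
      return true;
   } // lock released here

   std::size_t Updates()
   {
      std::lock_guard<std::mutex> lock(fUpdateMutex);
      return fUpdates;
   }
};
```

The design choice illustrated here is that the hot path (`Add`) never touches the mutex, while everything that must be consistent during an update is handled inside one scope owned by one lock object.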

vepadulano

Very cool! Some minor comments for consideration

tree/dataframe/inc/ROOT/RDF/RSampleInfo.hxx

tree/dataframe/inc/ROOT/RDFHelpers.hxx

github-actions · 2026-02-06T19:36:00Z

Test Results

21 files 21 suites 3d 6h 29m 19s ⏱️
3 787 tests 3 786 ✅ 0 💤 1 ❌
72 740 runs 72 739 ✅ 0 💤 1 ❌

For more details on these failures, see this check.

Results for commit f672564.

♻️ This comment has been updated with latest results.

TomasDado · 2026-02-09T08:39:35Z

Thanks for these updates, this is great! One comment, which we discussed at some point: would it be possible to add an option for users to pass the total number of events (if they know it)? We often need to open each file before we run RDF to build the file lists and the total sum of weights for normalisation. We could also trivially collect the total number of events.

hageboeck · 2026-02-10T09:14:16Z

> Thanks for these updates, this is great! One comment, which we discussed at some point: would it be possible to add an option for users to pass the total number of events (if they know it)? We often need to open each file before we run RDF to build the file lists and the total sum of weights for normalisation. We could also trivially collect the total number of events.

Hello @TomasDado, yes, it hasn't been forgotten. 🙂
Let's first merge this one, and then we can use some of the infrastructure here to deal with hints by the users.

When samples are split into multiple ranges, the progress bar takes a very long time to find out the total number of events. By adding tree->GetEntries() (and the RNTuple equivalent), the total number of events is known as soon as a file is opened.

This removes redundant information from the line printed to the terminal and shortens the output. Otherwise, the progress bar frequently overflows the terminal.

When the progress bar prints to a file, reduce the update frequency to once every 10 seconds, and limit its width to 60 chars to avoid cluttering the file.

Furthermore, significantly improve how completion is estimated. The following two heuristics are employed:
- Files that have been opened count as fractionOfFilesAlreadyOpened * eventsProcessed/totalEvents. This tracks the progress of all files for which the number of events has been seen.
- Files that have not been opened count as 1/totalFiles until they have been opened. This means, for example, that the progress bar can't go beyond 50% if half of the files haven't been opened yet.

This change significantly reduces the jumps of the progress bar when a new file is opened.

Finally, the code was refactored:
- Handle locking logic for updates with a single RAII object, update locks to C++17.
- Remove a lock and mutex that didn't have any effect.
- Reduce repeated function calls to functions that hold locks.
- Simplify computation of average number of events.
- Relax memory order of the atomics to what's necessary.
- Outline as many functions as possible. Since RDFHelpers.hxx goes through JITting and pcms, it gets compiled frequently, so outlining is probably beneficial.
- Make a lot of members const to avoid unintentional modifications. This makes it clear that no mutex needs to be locked to read these.
- Collect helper functions in one anonymous namespace.
- Remove a constructor argument that was ignored.
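The two completion heuristics from this commit message can be sketched as below. The function names and signatures are hypothetical, and the remaining-time formula (linear extrapolation of the elapsed time) is an assumption, not something the commit message states. It does, however, agree with the sample output in the PR description: for 10 s elapsed, 3 of 4 files opened and 42886000 of 184621239 known events, it gives about 47 s remaining, matching the printed `0:47m`.

```cpp
#include <cstdint>

// Hypothetical helper: fraction of the whole job that is done.
// Opened files contribute fractionOfFilesAlreadyOpened * processed / knownTotal;
// unopened files contribute nothing yet, which caps the result at
// openedFiles / totalFiles.
double EstimateProgress(std::uint64_t processedEvents, std::uint64_t knownTotalEvents,
                        unsigned openedFiles, unsigned totalFiles)
{
   if (knownTotalEvents == 0 || totalFiles == 0)
      return 0.;
   const double fractionOpened = static_cast<double>(openedFiles) / totalFiles;
   const double fractionDone = static_cast<double>(processedEvents) / knownTotalEvents;
   return fractionOpened * fractionDone;
}

// Assumption: remaining time by linear extrapolation of the elapsed time.
// Returns a negative value while no estimate is possible.
double EstimateRemainingSeconds(double elapsedSeconds, double progress)
{
   if (progress <= 0.)
      return -1.;
   return elapsedSeconds * (1. - progress) / progress;
}
```

Because the total only grows when a new file is opened while the opened-file fraction grows at the same moment, an estimate of this form changes much less at file boundaries than one based on the current range alone.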

vepadulano

Thank you, this is very cool!

hageboeck self-assigned this Feb 6, 2026

hageboeck force-pushed the progressBarImprovements branch from 4dc45e4 to 56f0fa2 Compare February 6, 2026 16:56

vepadulano requested changes Feb 6, 2026

hageboeck force-pushed the progressBarImprovements branch 4 times, most recently from 7a04681 to 8f0378c Compare February 10, 2026 12:52

hageboeck force-pushed the progressBarImprovements branch from 8f0378c to f672564 Compare February 10, 2026 12:55

vepadulano approved these changes Feb 10, 2026
