Skip to content

Conversation

@zawadzkidiana
Copy link
Contributor

Summary

  • add exponential histogram metrics for replication timers alongside existing timer metrics
  • wire histogram recording into replication task ack manager and store paths
  • add coverage for histogram metric emission

Test plan

  • Not run (not requested)

Keep the PR focused on replication histogram code by undoing
local config and docker file changes from the original commit.
@zawadzkidiana zawadzkidiana changed the title Add replication exponential histograms migrate: add replication exponential histograms Feb 2, 2026
@zawadzkidiana zawadzkidiana changed the title migrate: add replication exponential histograms refactor: add replication exponential histograms Feb 2, 2026
zawadzkidiana and others added 4 commits February 2, 2026 13:36
Pass a clock.TimeSource into TaskStore and use it for
latency measurement and log throttling to keep timing
consistent and testable.

Co-authored-by: Cursor <[email protected]>
Inject a clock.TimeSource into TaskStore to measure
histogram latency and log throttling in a testable,
consistent way.
@zawadzkidiana zawadzkidiana force-pushed the diana/m3-histogram-replication-clean branch from df3202e to 9659bb6 Compare February 2, 2026 19:15
@@ -0,0 +1,193 @@
package replication
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to test the histograms. We can remove this file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should do either one by one or include all of the changes in one PR. Right now we are doing a change in the task_ack_manager and one in the task_store. I'm fine with either going one by one, file by file or all of them, but right now we are doing two different files and not all of them

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, for the cache size I think we should go with gauge and not histogram

Record replication cache size as a gauge rather than a timer
when updating cache size metrics.
Drop the exponential histogram test file per review feedback.
Emit an exponential histogram alongside the replication tasks
applied latency timer for consistent latency coverage.
@gitar-bot
Copy link

gitar-bot bot commented Feb 2, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Clean implementation of exponential histogram metrics for replication latency tracking. Follows established dual-emit migration pattern, adds proper injectable time sources for testability, and includes a sensible bug fix changing CacheSize from Timer to Gauge.

✅ 1 resolved
Quality: Inconsistent time source usage in TaskStore

📄 service/history/replication/task_store.go:135
In task_store.go, the histogram latency measurement uses time.Now() directly, while task_ack_manager.go uses t.timeSource.Now() for the same purpose. This inconsistency means:

  1. The TaskStore histogram cannot be properly tested with mocked time sources
  2. The pattern is inconsistent with the adjacent code in task_ack_manager.go

Looking at the test file metrics_exponential_histograms_test.go, the TaskAckManager test uses a MockedTimeSource that can be advanced, allowing proper testing of the histogram emission. However, the TaskStore test relies on time.Sleep(2 * time.Millisecond) in the errHydrator to ensure a non-zero duration, which is less reliable and introduces real time delays in tests.

Suggested fix:
If TaskStore has a time source field (or can be injected with one), use it consistently:

cacheLatencyStart := m.timeSource.Now()
// ...
defer func() {
    scope.ExponentialHistogram(metrics.ExponentialCacheLatency, m.timeSource.Since(cacheLatencyStart))
}()

If no time source is available, this could be acceptable, but the pattern inconsistency is worth noting for future maintainability.

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants