Improve metric contention perf#8077
Open
jack-berg wants to merge 3 commits intoopen-telemetry:mainfrom
Open
Conversation
jack-berg
commented
Feb 12, 2026
| // (AggregatorHolder), and so if a recording thread encounters an odd value, | ||
| // all it needs to do is release the "read lock" it just obtained (decrementing by 2), | ||
| // and then grab and record against the new current interval (AggregatorHolder). | ||
| private final AtomicInteger activeRecordingThreads = new AtomicInteger(0); |
Member
Author
There was a problem hiding this comment.
TODO: update javadoc comments to reflect current design before merging
jack-berg
commented
Feb 12, 2026
| forThread().unlock(); | ||
| } | ||
|
|
||
| @SuppressWarnings("ThreadPriorityCheck") |
Member
Author
There was a problem hiding this comment.
TODO: remove leftover annotation from a previous revision
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8077 +/- ##
============================================
+ Coverage 90.20% 90.21% +0.01%
- Complexity 7594 7595 +1
============================================
Files 841 841
Lines 22915 22943 +28
Branches 2290 2296 +6
============================================
+ Hits 20670 20698 +28
- Misses 1529 1531 +2
+ Partials 716 714 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
jack-berg
commented
Feb 13, 2026
| // all it needs to do is release the "read lock" it just obtained (decrementing by 2), | ||
| // and then grab and record against the new current interval (AggregatorHolder). | ||
| private final AtomicInteger activeRecordingThreads = new AtomicInteger(0); | ||
| private final ReentrantLock[] locks; |
Member
Author
There was a problem hiding this comment.
Thinking out loud:
Initially I went with a striped set of AtomicInt here: AtomicInteger[]
But a striped set of ReentrantLocks was more performant in the benchmark:
| Benchmark | Temporality | Cardinality | Instrument Type | With Striped Atomic Long | With Striped Reentrant Lock | Change % |
|---|---|---|---|---|---|---|
| threads1 | DELTA | 1 | COUNTER_SUM | 10624.568 | 8113.116 | -23.6% |
| threads1 | DELTA | 1 | UP_DOWN_COUNTER_SUM | 7051.987 | 7116.562 | +0.9% |
| threads1 | DELTA | 1 | GAUGE_LAST_VALUE | 3004.456 | 3579.052 | +19.1% |
| threads1 | DELTA | 1 | HISTOGRAM_EXPLICIT | 6193.367 | 5108.527 | -17.5% |
| threads1 | DELTA | 1 | HISTOGRAM_BASE2_EXPONENTIAL | 4068.654 | 3767.116 | -7.4% |
| threads1 | DELTA | 100 | COUNTER_SUM | 7345.512 | 7705.679 | +4.9% |
| threads1 | DELTA | 100 | UP_DOWN_COUNTER_SUM | 6728.076 | 7870.198 | +17.0% |
| threads1 | DELTA | 100 | GAUGE_LAST_VALUE | 2816.403 | 2998.816 | +6.5% |
| threads1 | DELTA | 100 | HISTOGRAM_EXPLICIT | 6707.058 | 4143.646 | -38.2% |
| threads1 | DELTA | 100 | HISTOGRAM_BASE2_EXPONENTIAL | 3486.534 | 3668.988 | +5.2% |
| threads1 | CUMULATIVE | 1 | COUNTER_SUM | 14545.199 | 15211.867 | +4.6% |
| threads1 | CUMULATIVE | 1 | UP_DOWN_COUNTER_SUM | 14764.813 | 15272.696 | +3.4% |
| threads1 | CUMULATIVE | 1 | GAUGE_LAST_VALUE | 5054.376 | 7315.585 | +44.7% |
| threads1 | CUMULATIVE | 1 | HISTOGRAM_EXPLICIT | 8409.192 | 6646.129 | -21.0% |
| threads1 | CUMULATIVE | 1 | HISTOGRAM_BASE2_EXPONENTIAL | 3602.837 | 4202.725 | +16.7% |
| threads1 | CUMULATIVE | 100 | COUNTER_SUM | 7977.291 | 8553.166 | +7.2% |
| threads1 | CUMULATIVE | 100 | UP_DOWN_COUNTER_SUM | 8651.882 | 9148.823 | +5.7% |
| threads1 | CUMULATIVE | 100 | GAUGE_LAST_VALUE | 8066.028 | 8658.896 | +7.4% |
| threads1 | CUMULATIVE | 100 | HISTOGRAM_EXPLICIT | 6841.747 | 5590.956 | -18.3% |
| threads1 | CUMULATIVE | 100 | HISTOGRAM_BASE2_EXPONENTIAL | 3769.735 | 4061.193 | +7.7% |
| threads4 | DELTA | 1 | COUNTER_SUM | 1036.584 | 3642.403 | +251.4% |
| threads4 | DELTA | 1 | UP_DOWN_COUNTER_SUM | 953.509 | 2202.933 | +131.0% |
| threads4 | DELTA | 1 | GAUGE_LAST_VALUE | 1063.733 | 1886.270 | +77.3% |
| threads4 | DELTA | 1 | HISTOGRAM_EXPLICIT | 1657.467 | 2254.040 | +36.0% |
| threads4 | DELTA | 1 | HISTOGRAM_BASE2_EXPONENTIAL | 928.849 | 1623.778 | +74.8% |
| threads4 | DELTA | 100 | COUNTER_SUM | 1593.081 | 4987.529 | +213.1% |
| threads4 | DELTA | 100 | UP_DOWN_COUNTER_SUM | 3318.371 | 4880.035 | +47.1% |
| threads4 | DELTA | 100 | GAUGE_LAST_VALUE | 2630.933 | 3816.537 | +45.1% |
| threads4 | DELTA | 100 | HISTOGRAM_EXPLICIT | 2153.202 | 5494.981 | +155.2% |
| threads4 | DELTA | 100 | HISTOGRAM_BASE2_EXPONENTIAL | 2039.287 | 3385.470 | +66.0% |
| threads4 | CUMULATIVE | 1 | COUNTER_SUM | 4189.110 | 6061.089 | +44.7% |
| threads4 | CUMULATIVE | 1 | UP_DOWN_COUNTER_SUM | 4060.842 | 6635.251 | +63.4% |
| threads4 | CUMULATIVE | 1 | GAUGE_LAST_VALUE | 1386.341 | 1971.709 | +42.2% |
| threads4 | CUMULATIVE | 1 | HISTOGRAM_EXPLICIT | 1588.722 | 3745.206 | +135.7% |
| threads4 | CUMULATIVE | 1 | HISTOGRAM_BASE2_EXPONENTIAL | 1159.695 | 1818.146 | +56.8% |
| threads4 | CUMULATIVE | 100 | COUNTER_SUM | 4060.759 | 7510.675 | +85.0% |
| threads4 | CUMULATIVE | 100 | UP_DOWN_COUNTER_SUM | 4674.983 | 6670.130 | +42.7% |
| threads4 | CUMULATIVE | 100 | GAUGE_LAST_VALUE | 4033.773 | 6777.530 | +68.0% |
| threads4 | CUMULATIVE | 100 | HISTOGRAM_EXPLICIT | 3511.330 | 9629.410 | +174.3% |
| threads4 | CUMULATIVE | 100 | HISTOGRAM_BASE2_EXPONENTIAL | 3415.420 | 5692.648 | +66.7% |
But this might not be the whole picture:
- The benchmarks show improvement in the
threads4cumulative case, despite the cumulative path being unchanged here. So there's definitely some noise / variance in the benchmark. - The lock approach should never have contention if each recording thread is obtaining its own lock. And this should be true some of the time. But its not true all the time since I use
threadId % locks.lengthto select the lock. This means that there could be cases where the recording threadIds hit the same lock, and then are blocked waiting for the other to complete. TheAtomicInteger[]design may be more expensive, but allows multiple recording threads to access to the same underlying aggregator concurrently.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
DoubleExplicitBucketHistogramAggregatorlocking mechanism to use a striped set of cells approach.Summary view of perf changes to
MetricRecordBenchmarkops/son my machine:Raw benchmark output before and after
Before (on my machine):
After (on my machine):