perf(debugger): reduce code origin startup time #15272
base: main
d2f1ae1 to 533e5b8
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 249 ± 3 ms.
The average import time from base is: 252 ± 3 ms.
The import time difference between this PR and base is: -3.2 ± 0.1 ms.
Import time breakdown
The following import paths have shrunk:
Performance SLOs
Comparing candidate tyler.finethy/benchmark-co-instrumentation (1467c02) with baseline main (6f87e75)

❌ Test Failures (1 suite)
❌ telemetryaddmetric - 29/30
✅ 1-count-metric-1-times: Time: ✅ 3.373µs (SLO: <20.000µs 📉 -83.1%) vs baseline: 📈 +16.5%; Memory: ✅ 34.760MB (SLO: <35.500MB -2.1%) vs baseline: +4.4%
✅ 1-count-metrics-100-times: Time: ✅ 201.033µs (SLO: <220.000µs -8.6%) vs baseline: +1.0%; Memory: ✅ 34.780MB (SLO: <35.500MB -2.0%) vs baseline: +4.6%
✅ 1-distribution-metric-1-times: Time: ✅ 3.277µs (SLO: <20.000µs 📉 -83.6%) vs baseline: +0.3%; Memory: ✅ 34.800MB (SLO: <35.500MB 🟡 -2.0%) vs baseline: +4.6%
✅ 1-distribution-metrics-100-times: Time: ✅ 216.607µs (SLO: <230.000µs -5.8%) vs baseline: +2.1%; Memory: ✅ 34.780MB (SLO: <35.500MB -2.0%) vs baseline: +4.4%
✅ 1-gauge-metric-1-times: Time: ✅ 2.152µs (SLO: <20.000µs 📉 -89.2%) vs baseline: -0.6%; Memory: ✅ 34.819MB (SLO: <35.500MB 🟡 -1.9%) vs baseline: +4.6%
✅ 1-gauge-metrics-100-times: Time: ✅ 136.259µs (SLO: <150.000µs -9.2%) vs baseline: +0.2%; Memory: ✅ 34.760MB (SLO: <35.500MB -2.1%) vs baseline: +4.7%
✅ 1-rate-metric-1-times: Time: ✅ 3.081µs (SLO: <20.000µs 📉 -84.6%) vs baseline: +1.3%; Memory: ✅ 34.819MB (SLO: <35.500MB 🟡 -1.9%) vs baseline: +4.7%
✅ 1-rate-metrics-100-times: Time: ✅ 215.340µs (SLO: <250.000µs 📉 -13.9%) vs baseline: +0.6%; Memory: ✅ 34.721MB (SLO: <35.500MB -2.2%) vs baseline: +4.3%
✅ 100-count-metrics-100-times: Time: ✅ 20.423ms (SLO: <22.000ms -7.2%) vs baseline: +0.2%; Memory: ✅ 34.662MB (SLO: <35.500MB -2.4%) vs baseline: +3.9%
❌ 100-distribution-metrics-100-times: Time: ❌ 2.314ms (SLO: <2.300ms +0.6%) vs baseline: +2.3%; Memory: ✅ 34.878MB (SLO: <35.500MB 🟡 -1.8%) vs baseline: +3.7%
✅ 100-gauge-metrics-100-times: Time: ✅ 1.424ms (SLO: <1.550ms -8.2%) vs baseline: +1.7%; Memory: ✅ 34.898MB (SLO: <35.500MB 🟡 -1.7%) vs baseline: +4.7%
✅ 100-rate-metrics-100-times: Time: ✅ 2.203ms (SLO: <2.550ms 📉 -13.6%) vs baseline: -1.1%; Memory: ✅ 34.760MB (SLO: <35.500MB -2.1%) vs baseline: +4.3%
✅ flush-1-metric: Time: ✅ 4.541µs (SLO: <20.000µs 📉 -77.3%) vs baseline: +1.9%; Memory: ✅ 35.036MB (SLO: <35.500MB 🟡 -1.3%) vs baseline: +4.2%
✅ flush-100-metrics: Time: ✅ 174.438µs (SLO: <250.000µs 📉 -30.2%) vs baseline: -0.2%; Memory: ✅ 35.193MB (SLO: <35.500MB 🟡 -0.9%) vs baseline: +4.7%
✅ flush-1000-metrics: Time: ✅ 2.189ms (SLO: <2.500ms 📉 -12.5%) vs baseline: -0.2%; Memory: ✅ 35.960MB (SLO: <36.500MB 🟡 -1.5%) vs baseline: +4.3%

📈 Performance Regressions (2 suites)
📈 iastaspects - 118/118
✅ add_aspect: Time: ✅ 0.405µs (SLO: <10.000µs 📉 -95.9%) vs baseline: -1.3%; Memory: ✅ 40.205MB (SLO: <41.500MB -3.1%) vs baseline: +4.4%
✅ add_inplace_aspect: Time: ✅ 0.406µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +0.3%; Memory: ✅ 40.083MB (SLO: <41.500MB -3.4%) vs baseline: +4.2%
✅ add_inplace_noaspect: Time: ✅ 0.317µs (SLO: <10.000µs 📉 -96.8%) vs baseline: +0.5%; Memory: ✅ 40.326MB (SLO: <41.500MB -2.8%) vs baseline: +5.1%
✅ add_noaspect: Time: ✅ 0.276µs (SLO: <10.000µs 📉 -97.2%) vs baseline: +0.2%; Memory: ✅ 40.140MB (SLO: <41.500MB -3.3%) vs baseline: +4.6%
✅ bytearray_aspect: Time: ✅ 1.352µs (SLO: <10.000µs 📉 -86.5%) vs baseline: +3.5%; Memory: ✅ 40.223MB (SLO: <41.500MB -3.1%) vs baseline: +4.7%
✅ bytearray_extend_aspect: Time: ✅ 1.507µs (SLO: <10.000µs 📉 -84.9%) vs baseline: ~same; Memory: ✅ 40.045MB (SLO: <41.500MB -3.5%) vs baseline: +4.7%
✅ bytearray_extend_noaspect: Time: ✅ 0.612µs (SLO: <10.000µs 📉 -93.9%) vs baseline: +0.3%; Memory: ✅ 40.143MB (SLO: <41.500MB -3.3%) vs baseline: +4.4%
✅ bytearray_noaspect: Time: ✅ 0.483µs (SLO: <10.000µs 📉 -95.2%) vs baseline: +1.4%; Memory: ✅ 40.022MB (SLO: <41.500MB -3.6%) vs baseline: +4.1%
✅ bytes_aspect: Time: ✅ 1.300µs (SLO: <10.000µs 📉 -87.0%) vs baseline: +1.6%; Memory: ✅ 40.244MB (SLO: <41.500MB -3.0%) vs baseline: +4.8%
✅ bytes_noaspect: Time: ✅ 0.498µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.7%; Memory: ✅ 40.165MB (SLO: <41.500MB -3.2%)
✅ bytesio_aspect: Time: ✅ 1.358µs (SLO: <10.000µs 📉 -86.4%) vs baseline: +2.6%; Memory: ✅ 40.221MB (SLO: <41.500MB -3.1%) vs baseline: +4.6%
✅ bytesio_noaspect: Time: ✅ 0.502µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.2%; Memory: ✅ 40.080MB (SLO: <41.500MB -3.4%) vs baseline: +4.0%
✅ capitalize_aspect: Time: ✅ 0.738µs (SLO: <10.000µs 📉 -92.6%) vs baseline: ~same; Memory: ✅ 40.124MB (SLO: <41.500MB -3.3%) vs baseline: +4.7%
✅ capitalize_noaspect: Time: ✅ 0.432µs (SLO: <10.000µs 📉 -95.7%) vs baseline: -1.2%; Memory: ✅ 40.323MB (SLO: <41.500MB -2.8%) vs baseline: +5.2%
✅ casefold_aspect: Time: ✅ 0.734µs (SLO: <10.000µs 📉 -92.7%) vs baseline: ~same; Memory: ✅ 40.181MB (SLO: <41.500MB -3.2%) vs baseline: +4.5%
✅ casefold_noaspect: Time: ✅ 0.369µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.2%; Memory: ✅ 40.120MB (SLO: <41.500MB -3.3%) vs baseline: +4.4%
✅ decode_aspect: Time: ✅ 0.723µs (SLO: <10.000µs 📉 -92.8%) vs baseline: -0.2%; Memory: ✅ 40.262MB (SLO: <41.500MB -3.0%) vs baseline: +5.2%
✅ decode_noaspect: Time: ✅ 0.416µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.7%; Memory: ✅ 40.244MB (SLO: <41.500MB -3.0%) vs baseline: +5.0%
✅ encode_aspect: Time: ✅ 0.710µs (SLO: <10.000µs 📉 -92.9%) vs baseline: +0.3%; Memory: ✅ 40.264MB (SLO: <41.500MB -3.0%) vs baseline: +4.7%
✅ encode_noaspect: Time: ✅ 0.402µs (SLO: <10.000µs 📉 -96.0%) vs baseline: +0.2%; Memory: ✅ 40.243MB (SLO: <41.500MB -3.0%) vs baseline: +4.7%
✅ format_aspect: Time: ✅ 3.410µs (SLO: <10.000µs 📉 -65.9%) vs baseline: +1.5%; Memory: ✅ 40.145MB (SLO: <41.500MB -3.3%) vs baseline: +4.8%
✅ format_map_aspect: Time: ✅ 3.529µs (SLO: <10.000µs 📉 -64.7%) vs baseline: -0.5%; Memory: ✅ 40.105MB (SLO: <41.500MB -3.4%) vs baseline: +4.6%
✅ format_map_noaspect: Time: ✅ 0.770µs (SLO: <10.000µs 📉 -92.3%) vs baseline: -0.2%; Memory: ✅ 40.246MB (SLO: <41.500MB -3.0%) vs baseline: +4.6%
✅ format_noaspect: Time: ✅ 0.594µs (SLO: <10.000µs 📉 -94.1%) vs baseline: -0.3%; Memory: ✅ 40.285MB (SLO: <41.500MB -2.9%) vs baseline: +4.9%
✅ index_aspect: Time: ✅ 0.366µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +2.4%; Memory: ✅ 40.166MB (SLO: <41.500MB -3.2%) vs baseline: +4.4%
✅ index_noaspect: Time: ✅ 0.276µs (SLO: <10.000µs 📉 -97.2%) vs baseline: -1.9%; Memory: ✅ 40.038MB (SLO: <41.500MB -3.5%) vs baseline: +4.4%
✅ join_aspect: Time: ✅ 1.350µs (SLO: <10.000µs 📉 -86.5%) vs baseline: -3.1%; Memory: ✅ 40.267MB (SLO: <41.500MB -3.0%) vs baseline: +4.7%
✅ join_noaspect: Time: ✅ 0.491µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +0.5%; Memory: ✅ 40.220MB (SLO: <41.500MB -3.1%) vs baseline: +4.8%
✅ ljust_aspect: Time: ✅ 2.597µs (SLO: <20.000µs 📉 -87.0%) vs baseline: +5.5%; Memory: ✅ 40.059MB (SLO: <41.500MB -3.5%) vs baseline: +4.4%
✅ ljust_noaspect: Time: ✅ 0.401µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -0.5%; Memory: ✅ 40.448MB (SLO: <41.500MB -2.5%) vs baseline: +5.1%
✅ lower_aspect: Time: ✅ 2.299µs (SLO: <10.000µs 📉 -77.0%) vs baseline: +3.5%; Memory: ✅ 40.120MB (SLO: <41.500MB -3.3%) vs baseline: +4.5%
✅ lower_noaspect: Time: ✅ 0.368µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +1.4%; Memory: ✅ 40.104MB (SLO: <41.500MB -3.4%) vs baseline: +4.2%
✅ lstrip_aspect: Time: ✅ 2.278µs (SLO: <20.000µs 📉 -88.6%) vs baseline: +3.1%; Memory: ✅ 40.163MB (SLO: <41.500MB -3.2%) vs baseline: +4.5%
✅ lstrip_noaspect: Time: ✅ 0.384µs (SLO: <10.000µs 📉 -96.2%) vs baseline: +1.0%; Memory: ✅ 40.245MB (SLO: <41.500MB -3.0%) vs baseline: +4.9%
✅ modulo_aspect: Time: ✅ 1.044µs (SLO: <10.000µs 📉 -89.6%) vs baseline: +4.9%; Memory: ✅ 40.064MB (SLO: <41.500MB -3.5%) vs baseline: +4.4%
✅ modulo_aspect_for_bytearray_bytearray: Time: ✅ 1.543µs (SLO: <10.000µs 📉 -84.6%) vs baseline: -0.5%; Memory: ✅ 40.107MB (SLO: <41.500MB -3.4%) vs baseline: +4.5%
✅ modulo_aspect_for_bytes: Time: ✅ 0.974µs (SLO: <10.000µs 📉 -90.3%) vs baseline: -3.2%; Memory: ✅ 40.087MB (SLO: <41.500MB -3.4%) vs baseline: +3.9%
✅ modulo_aspect_for_bytes_bytearray: Time: ✅ 1.247µs (SLO: <10.000µs 📉 -87.5%) vs baseline: -0.2%; Memory: ✅ 40.309MB (SLO: <41.500MB -2.9%) vs baseline: +4.7%
✅ modulo_noaspect: Time: ✅ 0.625µs (SLO: <10.000µs 📉 -93.7%) vs baseline: -0.4%; Memory: ✅ 40.182MB (SLO: <41.500MB -3.2%) vs baseline: +4.7%
✅ replace_aspect: Time: ✅ 4.895µs (SLO: <10.000µs 📉 -51.0%) vs baseline: +0.6%; Memory: ✅ 40.003MB (SLO: <41.500MB -3.6%) vs baseline: +3.9%
✅ replace_noaspect: Time: ✅ 0.462µs (SLO: <10.000µs 📉 -95.4%) vs baseline: -0.3%; Memory: ✅ 40.079MB (SLO: <41.500MB -3.4%) vs baseline: +4.5%
✅ repr_aspect: Time: ✅ 0.914µs (SLO: <10.000µs 📉 -90.9%) vs baseline: +1.2%; Memory: ✅ 40.086MB (SLO: <41.500MB -3.4%) vs baseline: +4.2%
✅ repr_noaspect: Time: ✅ 0.415µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.7%; Memory: ✅ 40.283MB (SLO: <41.500MB -2.9%) vs baseline: +5.3%
✅ rstrip_aspect: Time: ✅ 1.955µs (SLO: <20.000µs 📉 -90.2%) vs baseline: +4.4%; Memory: ✅ 40.225MB (SLO: <41.500MB -3.1%) vs baseline: +4.7%
✅ rstrip_noaspect: Time: ✅ 0.377µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -0.8%; Memory: ✅ 40.163MB (SLO: <41.500MB -3.2%) vs baseline: +4.6%
✅ slice_aspect: Time: ✅ 0.495µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +0.4%; Memory: ✅ 40.206MB (SLO: <41.500MB -3.1%) vs baseline: +4.5%
✅ slice_noaspect: Time: ✅ 0.443µs (SLO: <10.000µs 📉 -95.6%) vs baseline: -1.5%; Memory: ✅ 40.300MB (SLO: <41.500MB -2.9%) vs baseline: +4.9%
✅ stringio_aspect: Time: ✅ 1.546µs (SLO: <10.000µs 📉 -84.5%) vs baseline: +0.3%; Memory: ✅ 40.224MB (SLO: <41.500MB -3.1%) vs baseline: +4.8%
✅ stringio_noaspect: Time: ✅ 0.716µs (SLO: <10.000µs 📉 -92.8%) vs baseline: -0.8%; Memory: ✅ 40.120MB (SLO: <41.500MB -3.3%) vs baseline: +4.4%
✅ strip_aspect: Time: ✅ 2.277µs (SLO: <20.000µs 📉 -88.6%) vs baseline: +4.1%; Memory: ✅ 40.140MB (SLO: <41.500MB -3.3%) vs baseline: +4.4%
✅ strip_noaspect: Time: ✅ 0.387µs (SLO: <10.000µs 📉 -96.1%) vs baseline: +0.5%; Memory: ✅ 40.162MB (SLO: <41.500MB -3.2%) vs baseline: +4.1%
✅ swapcase_aspect: Time: ✅ 2.784µs (SLO: <10.000µs 📉 -72.2%) vs baseline: 📈 +15.5%; Memory: ✅ 40.181MB (SLO: <41.500MB -3.2%) vs baseline: +4.7%
✅ swapcase_noaspect: Time: ✅ 0.536µs (SLO: <10.000µs 📉 -94.6%) vs baseline: +0.1%; Memory: ✅ 40.223MB (SLO: <41.500MB -3.1%) vs baseline: +4.9%
✅ title_aspect: Time: ✅ 2.445µs (SLO: <10.000µs 📉 -75.6%) vs baseline: +4.1%; Memory: ✅ 40.320MB (SLO: <41.500MB -2.8%) vs baseline: +5.0%
✅ title_noaspect: Time: ✅ 0.502µs (SLO: <10.000µs 📉 -95.0%) vs baseline: -0.4%; Memory: ✅ 40.100MB (SLO: <41.500MB -3.4%) vs baseline: +4.7%
✅ translate_aspect: Time: ✅ 3.338µs (SLO: <10.000µs 📉 -66.6%) vs baseline: +3.8%; Memory: ✅ 40.242MB (SLO: <41.500MB -3.0%) vs baseline: +4.7%
✅ translate_noaspect: Time: ✅ 1.038µs (SLO: <10.000µs 📉 -89.6%) vs baseline: -1.0%; Memory: ✅ 39.981MB (SLO: <41.500MB -3.7%) vs baseline: +4.2%
✅ upper_aspect: Time: ✅ 2.317µs (SLO: <10.000µs 📉 -76.8%) vs baseline: +5.0%; Memory: ✅ 40.177MB (SLO: <41.500MB -3.2%) vs baseline: +4.3%
✅ upper_noaspect: Time: ✅ 0.372µs (SLO: <10.000µs 📉 -96.3%) vs baseline: ~same; Memory: ✅ 40.201MB (SLO: <41.500MB -3.1%) vs baseline: +4.6%

📈 iastaspectsospath - 24/24
✅ ospathbasename_aspect: Time: ✅ 5.154µs (SLO: <10.000µs 📉 -48.5%) vs baseline: 📈 +19.3%; Memory: ✅ 40.206MB (SLO: <41.000MB 🟡 -1.9%) vs baseline: +4.8%
✅ ospathbasename_noaspect: Time: ✅ 1.080µs (SLO: <10.000µs 📉 -89.2%) vs baseline: -0.3%; Memory: ✅ 40.206MB (SLO: <41.000MB 🟡 -1.9%) vs baseline: +4.7%
✅ ospathjoin_aspect: Time: ✅ 6.160µs (SLO: <10.000µs 📉 -38.4%) vs baseline: ~same; Memory: ✅ 40.187MB (SLO: <41.000MB 🟡 -2.0%) vs baseline: +4.6%
✅ ospathjoin_noaspect: Time: ✅ 2.292µs (SLO: <10.000µs 📉 -77.1%) vs baseline: -0.3%; Memory: ✅ 40.305MB (SLO: <41.000MB 🟡 -1.7%) vs baseline: +5.2%
✅ ospathnormcase_aspect: Time: ✅ 3.455µs (SLO: <10.000µs 📉 -65.5%) vs baseline: -2.7%; Memory: ✅ 40.462MB (SLO: <41.000MB 🟡 -1.3%) vs baseline: +5.4%
✅ ospathnormcase_noaspect: Time: ✅ 0.573µs (SLO: <10.000µs 📉 -94.3%) vs baseline: -0.2%; Memory: ✅ 40.364MB (SLO: <41.000MB 🟡 -1.6%) vs baseline: +5.1%
✅ ospathsplit_aspect: Time: ✅ 4.757µs (SLO: <10.000µs 📉 -52.4%) vs baseline: -2.3%; Memory: ✅ 40.305MB (SLO: <41.000MB 🟡 -1.7%) vs baseline: +5.0%
✅ ospathsplit_noaspect: Time: ✅ 1.591µs (SLO: <10.000µs 📉 -84.1%) vs baseline: -0.3%; Memory: ✅ 40.226MB (SLO: <41.000MB 🟡 -1.9%) vs baseline: +4.6%
✅ ospathsplitdrive_aspect: Time: ✅ 3.669µs (SLO: <10.000µs 📉 -63.3%) vs baseline: -1.1%; Memory: ✅ 40.069MB (SLO: <41.000MB -2.3%) vs baseline: +4.1%
✅ ospathsplitdrive_noaspect: Time: ✅ 0.696µs (SLO: <10.000µs 📉 -93.0%) vs baseline: -0.2%; Memory: ✅ 40.206MB (SLO: <41.000MB 🟡 -1.9%) vs baseline: +4.4%
✅ ospathsplitext_aspect: Time: ✅ 4.527µs (SLO: <10.000µs 📉 -54.7%) vs baseline: -1.4%; Memory: ✅ 40.246MB (SLO: <41.000MB 🟡 -1.8%) vs baseline: +4.8%
✅ ospathsplitext_noaspect: Time: ✅ 1.383µs (SLO: <10.000µs 📉 -86.2%) vs baseline: +0.4%; Memory: ✅ 40.147MB (SLO: <41.000MB -2.1%) vs baseline: +4.6%

🟡 Near SLO Breach (16 suites)
🟡 coreapiscenario - 10/10 (1 unstable)
Start with a benchmark to measure opportunities for improvements refs: DEBUG-4605
3102d43 to fc88001
def test_instrument_view_benchmark(benchmark):
@P403n1x87 I'm curious what the numbers look like with and without lazy wrapping. I was seeing ~0.5 ms with and ~2 ms without (a 4x speed-up, though it depends heavily on hardware and function size).
With
---------------------------------------------------------------- benchmark: 1 tests ---------------------------------------------------------------
Name (time in us) Min Max Mean StdDev Median IQR Outliers OPS (Kops/s) Rounds Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------
test_instrument_view_benchmark[py3.13] 283.6670 1,747.8330 653.5566 271.7520 605.7295 312.4579 30;2 1.5301 100 1
---------------------------------------------------------------------------------------------------------------------------------------------------
Without
------------------------------------------------------- benchmark: 1 tests ------------------------------------------------------
Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
---------------------------------------------------------------------------------------------------------------------------------
test_instrument_view_benchmark[py3.13] 1.8341 3.5558 2.3846 0.3518 2.2829 0.4392 27;3 419.3619 100 1
---------------------------------------------------------------------------------------------------------------------------------
So this is 0.6 ms vs 2.4 ms on my machine.
f92a00e to 1467c02
Description
We adopt a lazy wrapping approach in Code Origin to reduce the startup cost when instrumenting view functions. With the lazy approach, the heavy instrumentation is performed when the view function is invoked for the first time, preventing a potentially large delay on boot.
refs: DEBUG-4605
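As a rough illustration of the lazy wrapping idea described above, here is a minimal, self-contained sketch. It is not the code from this PR: `expensive_instrument` and `lazy_wrap` are hypothetical names, and the "heavy instrumentation" is simulated by a counter so the deferral is easy to observe.

```python
import functools


def expensive_instrument(func):
    """Hypothetical stand-in for the heavy instrumentation step."""
    expensive_instrument.calls += 1  # count how often the heavy path runs

    @functools.wraps(func)
    def instrumented(*args, **kwargs):
        # A real implementation would record code origin data here.
        return func(*args, **kwargs)

    return instrumented


expensive_instrument.calls = 0


def lazy_wrap(func):
    """Return a cheap wrapper that instruments func on its first call."""
    instrumented = None

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal instrumented
        if instrumented is None:
            # The heavy work is paid once, on first invocation, not at boot.
            instrumented = expensive_instrument(func)
        return instrumented(*args, **kwargs)

    return wrapper


@lazy_wrap
def view(name):
    return f"Hello, {name}!"


calls_at_startup = expensive_instrument.calls  # nothing instrumented yet
first = view("world")   # triggers the instrumentation
second = view("again")  # reuses the instrumented function
calls_after_use = expensive_instrument.calls
```

This sketch does not guard the first call with a lock, so two threads racing on the first invocation could both run the instrumentation step; whether that matters depends on what the real instrumentation does.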
Testing
Added a benchmark to validate the performance improvements.
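The benchmark in this PR relies on the pytest-benchmark `benchmark` fixture. For readers without that setup, the comparison can be approximated with the standard library alone. The sketch below is an assumption-laden stand-in: `simulated_instrumentation`, `eager_setup`, and `lazy_setup` are hypothetical, and the instrumentation cost is faked with a 1 ms sleep per function rather than measured from the real code.

```python
import time
import timeit


def simulated_instrumentation(func):
    """Hypothetical heavy step; the 1 ms sleep is an arbitrary stand-in cost."""
    time.sleep(0.001)
    return func


def eager_setup(views):
    # Pay the instrumentation cost for every view at startup.
    return [simulated_instrumentation(v) for v in views]


def lazy_setup(views):
    # Only build cheap wrappers at startup; instrument on first call.
    def make_lazy(func):
        cache = []

        def wrapper(*args, **kwargs):
            if not cache:
                cache.append(simulated_instrumentation(func))
            return cache[0](*args, **kwargs)

        return wrapper

    return [make_lazy(v) for v in views]


# Twenty trivial "view functions", each returning its own index.
views = [lambda i=i: i for i in range(20)]

# Take the best of five single runs to reduce noise.
eager_s = min(timeit.repeat(lambda: eager_setup(views), number=1, repeat=5))
lazy_s = min(timeit.repeat(lambda: lazy_setup(views), number=1, repeat=5))

print(f"eager setup: {eager_s * 1e3:.3f} ms, lazy setup: {lazy_s * 1e3:.3f} ms")
```

The absolute numbers printed here say nothing about the real instrumentation cost; the point is only that setup time under the lazy scheme no longer scales with the per-function instrumentation work.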
Risks
N/A
Additional Notes