Skip to content

Samply multiple perf events with samply#400

Open
GuillaumeLagrange wants to merge 3 commits into
mainfrom
cod-2810-check-samply-possibilitieslimitations-regarding-additional
Open

Samply multiple perf events with samply#400
GuillaumeLagrange wants to merge 3 commits into
mainfrom
cod-2810-check-samply-possibilitieslimitations-regarding-additional

Conversation

@GuillaumeLagrange

Copy link
Copy Markdown
Contributor

Related: CodSpeedHQ/samply-codspeed#6 which adds support to provide extra perf events

GuillaumeLagrange and others added 3 commits June 11, 2026 16:05
…iler

Set SAMPLY_PERF_EVENTS when wrapping the benchmark command with samply,
so the profile carries per-sample hardware event deltas alongside the
stack samples, mirroring the event capture the perf profiler does.

Make PerfEvent's perf names architecture-aware: each variant is a
semantic slot of the cache model, backed by the architected PMU events
on arm64 (l1d_cache, l2d_cache, l2d_cache_refill, as before) and by the
generalized cache events on x86_64 (L1-dcache-loads,
L1-dcache-load-misses, cache-misses), where no combined L2 event is
exposed. This also gives the perf profiler a working event set on
x86_64, which previously disabled event sampling entirely.

samply degrades gracefully to cycles-only sampling when the PMU can't
deliver the events, so the env var is set unconditionally on Linux.

Refs COD-2810
Co-Authored-By: Claude <noreply@anthropic.com>
@codspeed-hq

codspeed-hq Bot commented Jun 11, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 7 untouched benchmarks


Comparing cod-2810-check-samply-possibilitieslimitations-regarding-additional (ca886e6) with main (41f4db9)

Open in CodSpeed

@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown

Greptile Summary

This PR extends the samply profiler integration to capture multiple hardware perf events (CPU cycles, L1/L2 cache accesses, cache misses, instructions) alongside the default sampling event, using a new SAMPLY_PERF_EVENTS env var. It also bumps the samply-codspeed fork rev to pick up the supporting changes and adds a developer-workflow script for locally patching samply's transitive dependencies.

  • perf_event.rs: Adds to_samply_spec(), samply_name(), and perf_strings() with arch-specific PMU encodings (Intel raw events for x86_64, ARM architected PMU events for aarch64). Cache event selection on x86_64 is guarded to genuine Intel CPUs via CPUID.
  • samply/mod.rs: On Linux, builds a comma-separated SAMPLY_PERF_EVENTS value from the full event list, filtered through to_samply_spec() so unsupported events are silently dropped.
  • scripts/samply-dev.sh: New helper for patching samply, framehop, and linux-perf-event-reader to local sibling checkouts during development; contains a bug where the reader_tracked_rev grep needle uses the wrong GitHub org name.

Confidence Score: 4/5

Core profiling changes are safe; the dev-workflow script has a bug that breaks reader sync but does not affect production builds.

The samply profiler and perf_event changes are correct and well-tested. The only real defect is in the new developer script: reader_tracked_rev() greps for the wrong GitHub org (CodSpeedHQ instead of AvalancheHQ), so the reader checkout is silently skipped in both sync and recap. This is isolated to the dev script and has no impact on the compiled binary or CI, but it means the script doesn't fully work as documented from day one.

scripts/samply-dev.sh — the reader_tracked_rev grep needle uses the wrong org name

Important Files Changed

Filename Overview
crates/runner-shared/src/perf_event.rs Adds arch-specific PMU encodings for cache event slots, to_samply_spec(), samply_name(), and perf_strings(). Logic is well-documented and tests cover uniqueness/portability; one println! violates the project style rule.
src/executor/wall_time/profiler/samply/mod.rs Adds Linux-only SAMPLY_PERF_EVENTS env var built from PerfEvent::all_events() filtered through to_samply_spec(); straightforward and correctly gated behind #[cfg(target_os = "linux")].
scripts/samply-dev.sh New developer-workflow script for patching samply/framehop/reader deps locally. Contains a bug: reader_tracked_rev() greps for CodSpeedHQ/linux-perf-event-reader but the actual URL is AvalancheHQ/linux-perf-event-reader, so the reader's tracked rev is never found and sync always skips it.
Cargo.toml Bumps samply git rev to new CodSpeedHQ fork commit; uses a 12-char short hash inconsistent with all other full-SHA git pins in the same file.
Cargo.lock Reflects updated samply rev and new forked linux-perf-data / linux-perf-event-reader git dependencies introduced transitively by the new samply revision.

Sequence Diagram

sequenceDiagram
    participant R as Runner
    participant PE as PerfEvent::all_events()
    participant S as samply (subprocess)
    participant PMU as Linux PMU

    R->>PE: all_events().iter()
    PE-->>R: [CpuCycles, L1DCache, L2DCache, CacheMisses, Instructions]
    loop each event
        R->>PE: event.to_samply_spec()
        PE->>PE: perf_event_attr() → (type, config)
        Note over PE: x86_64: CPUID check (GenuineIntel)<br/>aarch64: architected PMU nums<br/>other: None
        PE-->>R: Some("name:type:0xconfig") or None
    end
    R->>S: "env SAMPLY_PERF_EVENTS="cpu-cycles:0:0x0,...""
    S->>PMU: perf_event_open() for each spec
    PMU-->>S: per-sample delta counters
    S-->>R: profile JSON with extra event columns
Loading

Comments Outside Diff (1)

  1. scripts/samply-dev.sh, line 483 (link)

    P1 Wrong org name in reader_tracked_rev needle

    The grep needle is 'CodSpeedHQ/linux-perf-event-reader', but the actual upstream URL (and READER_URL set just above) is https://github.com/AvalancheHQ/linux-perf-event-reader. No line in samply's Cargo.toml will ever match that needle, so reader_tracked_rev() always returns an empty string. As a result, recap permanently shows reader as "no tracked rev" and sync silently skips the reader checkout with "skipped (no tracked rev in manifest)", defeating the purpose of the reader sync logic.

Reviews (1): Last reviewed commit: "feat(walltime): capture per-arch hardwar..." | Re-trigger Greptile

Comment on lines +214 to +219
#[test]
fn print_specs_for_this_host() {
for event in PerfEvent::all_events() {
println!("{event:?} -> {:?}", event.to_samply_spec());
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 This test uses println!, which is forbidden by the project style guide. Use a tracing macro (debug!, info!, etc.) instead — or, since its only purpose is human-readable debug output during manual inspection, it can reasonably be removed or replaced with a debug! call.

Suggested change
#[test]
fn print_specs_for_this_host() {
for event in PerfEvent::all_events() {
println!("{event:?} -> {:?}", event.to_samply_spec());
}
}
#[test]
fn print_specs_for_this_host() {
for event in PerfEvent::all_events() {
eprintln!("{event:?} -> {:?}", event.to_samply_spec());
}
}

Rule Used: Never use println!/eprintln! in Rust; use trac... (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment thread Cargo.toml
which = "8.0.2"
crc32fast = "1.5.0"
samply = { git = "https://github.com/CodSpeedHQ/samply-codspeed", rev = "ec97a70c0667098f8607f30a607ddd031a15a8b8" }
samply = { git = "https://github.com/CodSpeedHQ/samply-codspeed", rev = "81ba2c346e71" }

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The samply dep uses a 12-character abbreviated SHA (81ba2c346e71), while every other git-pinned dep in this file uses a full 40-character SHA. Short hashes can become ambiguous as a repo grows. The full hash from Cargo.lock is 81ba2c346e71d05aaef94448e2962961d29cb4c8.

Suggested change
samply = { git = "https://github.com/CodSpeedHQ/samply-codspeed", rev = "81ba2c346e71" }
samply = { git = "https://github.com/CodSpeedHQ/samply-codspeed", rev = "81ba2c346e71d05aaef94448e2962961d29cb4c8" }

Rule Used: Pin Cargo git dependencies to tag or rev, neve... (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant