feat: add replay runner and scheduler with CLI integration by JasonXuDeveloper · Pull Request #234 · Azure/kperf

JasonXuDeveloper · 2026-02-15T01:26:57Z

Summary

- Runner: worker pool with time-bucketed scheduling, per-worker metrics to avoid lock contention, and pool-first WATCH connection assignment with overflow support
- Scheduler: orchestrator for both local multi-runner and distributed single-runner modes, with configuration validation and warnings
- CLI: 'kperf replay run' command for local replay execution and 'kperf runner replay' subcommand for distributed runner pods
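The "per-worker metrics to avoid lock contention" idea in the Runner item can be illustrated with a small sketch. This is not the code from this PR: `workerMetrics` and `runWorkers` are hypothetical names, and the only assumption is that each worker goroutine records into its own slot and the slots are merged once all workers have exited.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// workerMetrics is owned by exactly one worker goroutine, so writes to it
// need no mutex. (Hypothetical type, for illustration only.)
type workerMetrics struct {
	latencies []time.Duration
	failures  int
}

// runWorkers fans tasks out to n workers. Each worker records into its own
// workerMetrics slot; the slots are merged only after every worker has exited.
func runWorkers(n int, tasks []func() error) (latencies []time.Duration, failures int) {
	perWorker := make([]workerMetrics, n)
	queue := make(chan func() error)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(m *workerMetrics) {
			defer wg.Done()
			for task := range queue {
				start := time.Now()
				if err := task(); err != nil {
					m.failures++
					continue
				}
				m.latencies = append(m.latencies, time.Since(start))
			}
		}(&perWorker[i])
	}

	for _, t := range tasks {
		queue <- t
	}
	close(queue)
	wg.Wait() // after this point no goroutine touches perWorker

	for _, m := range perWorker {
		latencies = append(latencies, m.latencies...)
		failures += m.failures
	}
	return latencies, failures
}

func main() {
	tasks := make([]func() error, 0, 10)
	for i := 0; i < 10; i++ {
		i := i
		tasks = append(tasks, func() error {
			if i%5 == 0 {
				return fmt.Errorf("task %d failed", i)
			}
			return nil
		})
	}
	lat, fail := runWorkers(3, tasks)
	fmt.Println("succeeded:", len(lat), "failed:", fail)
}
```

The trade-off in this sketch is that aggregated numbers are only available after the merge step; a runner that needs live progress would have to add some form of periodic, synchronized snapshot.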

Test plan

- go build ./... passes
- go vet ./... passes
- go test ./replay/... passes (runner, schedule tests)
- go test ./cmd/kperf/commands/replay/... passes

Part 5 of 6 in the replay feature PR stack. Depends on PR #233.

- Fix "traget" → "target" typo in LoadProfileSpec comment
- Fix "letencies" → "latencies" typo in runner CLI flag description
- Add empty specs validation in loadConfig to prevent index-out-of-range panic when config file has no specs
- Preserve nodeAffinity from runnergroup spec when CLI --affinity flag is not provided

Signed-off-by: JasonXuDeveloper - 傑 <jason@xgamedev.net>

Add foundation types for the timeseries replay system:
- ReplayRequest, ReplayProfile, ReplayProfileSpec types with validation
- IsReplayMode() method on RunnerGroupSpec for detecting replay configs
- ReplayProfileSpec field in RunnerGroupSpec for distributed mode
- Sample replay profile and runner group config test data

Signed-off-by: JasonXuDeveloper - 傑 <jason@xgamedev.net>

Extract shared report-building logic into metrics.BuildPercentileLatenciesReport() to avoid duplication between runner and replay report builders. Refactor buildRunnerMetricReport() to use the shared utility.

Signed-off-by: JasonXuDeveloper - 傑 <jason@xgamedev.net>

Core replay data processing (no execution engine yet):
- Loader: YAML/gzip profile loading from file or URL
- Partition: request distribution across runners using object-key consistent hashing to preserve per-object ordering
- Builder: HTTP request construction with URL building, verb mapping, and URL masking for metrics aggregation

Signed-off-by: JasonXuDeveloper - 傑 <jason@xgamedev.net>
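To illustrate why hashing on the object key preserves per-object ordering, here is a minimal sketch. It is not the Partition implementation from this commit: the `request` type and the `objectKey`, `partitionIndex`, and `partition` helpers are hypothetical, and it uses a plain FNV hash modulo the runner count rather than a full consistent-hash ring.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// request is a hypothetical, trimmed-down stand-in for a replayed API call.
type request struct {
	Namespace string
	Name      string
	Verb      string
}

// objectKey identifies the object a request targets (assumed key format).
func objectKey(r request) string { return r.Namespace + "/" + r.Name }

// partitionIndex maps a key to a runner index. The same key always maps to
// the same runner for a fixed runner count.
func partitionIndex(key string, runners int) int {
	h := fnv.New32a()
	_, _ = h.Write([]byte(key)) // hash.Hash writes never return an error
	return int(h.Sum32() % uint32(runners))
}

// partition splits reqs across runners. Requests are appended in input
// order, so all requests for one object stay in their original order.
func partition(reqs []request, runners int) [][]request {
	out := make([][]request, runners)
	for _, r := range reqs {
		idx := partitionIndex(objectKey(r), runners)
		out[idx] = append(out[idx], r)
	}
	return out
}

func main() {
	reqs := []request{
		{Namespace: "default", Name: "pod-a", Verb: "create"},
		{Namespace: "default", Name: "pod-b", Verb: "create"},
		{Namespace: "default", Name: "pod-a", Verb: "update"},
		{Namespace: "default", Name: "pod-a", Verb: "delete"},
	}
	for i, part := range partition(reqs, 3) {
		fmt.Println("runner", i, part)
	}
}
```

Because every request for `default/pod-a` lands on one runner, its create, update, and delete are never raced against each other by different runners. Changing the runner count reshuffles most keys in this simplified version, which is the cost a real consistent-hash ring is designed to reduce.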

Execution engine for the timeseries replay system:
- Runner: worker pool with time-bucketed scheduling, per-worker metrics, pool-first WATCH connection assignment with overflow support
- Scheduler: orchestrator for both local multi-runner and distributed single-runner modes with configuration validation and warnings
- CLI: 'kperf replay run' command for local replay execution and 'kperf runner replay' subcommand for distributed runner pods

Signed-off-by: JasonXuDeveloper - 傑 <jason@xgamedev.net>
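One way to read "time-bucketed scheduling" is sketched below. This is an assumption about the general technique, not the runner added in this commit: `timedRequest`, `bucketize`, and `replay` are hypothetical names, offsets are assumed to be non-negative, and requests are dispatched inline rather than handed to a worker pool.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// timedRequest is a hypothetical recorded request with its offset from the
// start of the recording. Offsets are assumed to be non-negative.
type timedRequest struct {
	Offset time.Duration
	ID     string
}

// bucketize groups requests into fixed-width time buckets keyed by bucket
// index, keeping input order inside each bucket.
func bucketize(reqs []timedRequest, width time.Duration) map[int64][]timedRequest {
	buckets := make(map[int64][]timedRequest)
	for _, r := range reqs {
		idx := int64(r.Offset / width)
		buckets[idx] = append(buckets[idx], r)
	}
	return buckets
}

// replay sleeps until each bucket's start time, then passes that bucket's
// requests to dispatch. The scheduler wakes once per bucket, not per request.
func replay(reqs []timedRequest, width time.Duration, dispatch func(timedRequest)) {
	buckets := bucketize(reqs, width)
	order := make([]int64, 0, len(buckets))
	for idx := range buckets {
		order = append(order, idx)
	}
	sort.Slice(order, func(i, j int) bool { return order[i] < order[j] })

	start := time.Now()
	for _, idx := range order {
		due := start.Add(time.Duration(idx) * width)
		if wait := time.Until(due); wait > 0 {
			time.Sleep(wait)
		}
		for _, r := range buckets[idx] {
			dispatch(r)
		}
	}
}

func main() {
	reqs := []timedRequest{
		{Offset: 30 * time.Millisecond, ID: "c"},
		{Offset: 0, ID: "a"},
		{Offset: 5 * time.Millisecond, ID: "b"},
	}
	start := time.Now()
	replay(reqs, 10*time.Millisecond, func(r timedRequest) {
		fmt.Println(r.ID, "dispatched after", time.Since(start).Round(time.Millisecond))
	})
}
```

The bucket width trades timing fidelity for scheduling overhead: wider buckets mean fewer wake-ups, but requests inside a bucket are released together instead of at their exact recorded offsets.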

JasonXuDeveloper mentioned this pull request Feb 15, 2026

feat: add replay support to runner group and deployment infrastructure #235

Open

3 tasks

JasonXuDeveloper force-pushed the replay/pr5 branch from 6e9bd4c to 588d89f Compare February 15, 2026 09:14

JasonXuDeveloper added 5 commits February 15, 2026 20:18

JasonXuDeveloper force-pushed the replay/pr5 branch from 588d89f to 0aaf1c9 Compare February 15, 2026 09:18


JasonXuDeveloper wants to merge 5 commits into Azure:unstable-replay from JasonXuDeveloper:replay/pr5
