Objective
Provide a first-class way for agentv to deliver an eval benchmark repo (code + fixtures) onto a target before running evals, so the same eval definition can run unchanged against local, ephemeral CI, and long-lived VM targets.
Current Problem
Today, getting a benchmark repo onto a remote eval target (Azure VM, runner, etc.) is left to the caller. There is no agentv-level model for:
where the source of truth lives (repo, ref, fixtures),
how it is delivered to a given target,
how agentv verifies prerequisites are in place before a run starts.
This forces users to script ad-hoc clones / rsyncs per target and re-implement retries, atomic swap, and auth. As more eval targets land (Azure VM, GHA runner, long-lived dev boxes), this gap compounds.
Proposal
1. Declarative source + targets config
Separate what (source of truth) from where/how (delivery). User config describes intent. The sync mechanism is not exposed.
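A minimal sketch of what such a config could look like, written as TypeScript types rather than as agentv's actual schema. The field names (`repo`, `ref`, `fixtures`, `sync`), the example URL, and the target names are assumptions drawn from the terms used in this issue; nothing here is a confirmed agentv interface.

```typescript
// Hypothetical shape only: field names are illustrative, not agentv's real schema.
type SyncMode = "noop" | "oneshot" | "continuous";

interface SourceConfig {
  repo: string;      // where the benchmark repo lives
  ref: string;       // branch, tag, or commit to deliver
  fixtures?: string; // optional path to fixtures inside the repo
}

interface TargetConfig {
  sync: SyncMode;    // intent only; no git-sync flags appear here
}

interface EvalDeliveryConfig {
  source: SourceConfig;                  // the "what"
  targets: Record<string, TargetConfig>; // the "where/how"
}

const exampleConfig: EvalDeliveryConfig = {
  source: {
    repo: "https://example.com/org/benchmarks.git", // placeholder URL
    ref: "main",
    fixtures: "fixtures/",
  },
  targets: {
    local: { sync: "noop" },
    ci: { sync: "oneshot" },
    "vm-eastus": { sync: "continuous" },
  },
};

// Summarise which sync mode each target would use.
function describeTargets(cfg: EvalDeliveryConfig): string[] {
  return Object.entries(cfg.targets).map(([name, t]) => `${name}: ${t.sync}`);
}

const summary = describeTargets(exampleConfig);
// summary → ["local: noop", "ci: oneshot", "vm-eastus: continuous"]
```

The point of the shape is that the same `source` block is shared by every target, and a target only states a sync mode.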
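2. Internal SyncStrategy interface (not user-facing)
Built-in implementations selected by sync:
noop — source already present on target.
oneshot — single shallow clone (e.g. git clone --depth 1 --filter=blob:none), suitable for ephemeral runners.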
continuous — long-running mirror; initial backing impl is the kubernetes/git-sync Go binary (partial-clone and sparse-checkout capable, atomic swap via worktree+symlink, supports GitHub App auth).
(future) rsync-from-bucket, entireio/git-sync regional ref-mirror pre-stage, etc.
The binary backing continuous is an implementation detail. Strategy is swappable without breaking user config.
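One possible shape for the strategy interface, as a sketch. The interface members (`mode`, `requiredBinaries`, `plan`) and the `SyncRequest` type are hypothetical; the issue leaves the real shape to the implementer. The `continuous` strategy is deliberately left out because it would wrap git-sync, whose flags are not reproduced here.

```typescript
// Hypothetical interface; the real shape is up to the implementer.
type SyncMode = "noop" | "oneshot" | "continuous";

interface SyncRequest {
  repo: string;
  ref: string;  // must be a branch or tag name for `git clone --branch`
  dest: string; // directory on the target where the checkout should appear
}

interface SyncStrategy {
  readonly mode: SyncMode;
  readonly requiredBinaries: string[]; // checked at preflight
  // Returns the command line to run; a real strategy would also execute it.
  plan(req: SyncRequest): string[];
}

const noopStrategy: SyncStrategy = {
  mode: "noop",
  requiredBinaries: [],
  plan: () => [], // source is already present, so there is nothing to run
};

const oneshotStrategy: SyncStrategy = {
  mode: "oneshot",
  requiredBinaries: ["git"],
  plan: (req) => [
    "git", "clone", "--depth", "1", "--filter=blob:none",
    "--branch", req.ref, req.repo, req.dest,
  ],
};

// Registry keyed by the user-facing sync mode.
const strategies = new Map<SyncMode, SyncStrategy>([
  ["noop", noopStrategy],
  ["oneshot", oneshotStrategy],
]);

function selectStrategy(mode: SyncMode): SyncStrategy {
  const strategy = strategies.get(mode);
  if (strategy === undefined) {
    throw new Error(`no strategy registered for sync mode '${mode}'`);
  }
  return strategy;
}
```

Because callers only see `selectStrategy(mode)`, the implementation behind a mode can change without touching user config.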
3. BYO external binaries — no bundling, no postinstall download
agentv is an npm/Node package (on the order of megabytes). git-sync is a ~25MB Go binary, built per platform. Bundling it or downloading it in a postinstall script has known failure modes at enterprise scale: npm install --ignore-scripts policies, registries that strip postinstall, locked-down VMs whose HTTPS proxy does not allowlist GitHub releases. Auto-downloading at first use just moves the same failure to a more confusing place mid-run.
Instead:
Runtime preflight. When a target's strategy requires an external binary, check PATH at run start. If missing, fail fast with an actionable message:
agentv: target 'vm-eastus' uses sync mode 'continuous', which requires the
git-sync binary. It is not installed.
Install with: agentv install git-sync
or via: brew install git-sync | apt install git-sync | manual download
For air-gapped environments, place the binary on PATH manually.
agentv install git-sync subcommand. Opt-in installer. Downloads a pinned version from GitHub releases, verifies SHA-256 against a vendored manifest, installs to ~/.agentv/bin/ (which agentv prepends to PATH for child processes it spawns). Idempotent.
agentv doctor subcommand. Lists every external dependency (git-sync, rsync, ssh, target-specific CLIs) with version + resolved path, and any missing ones with their install hint.
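deps.json pins each version and its per-platform SHA-256:
{
  "git-sync": {
    "version": "vX.Y.Z",
    "sha256": { "linux-amd64": "...", "linux-arm64": "...", "darwin-amd64": "...", "darwin-arm64": "..." }
  }
}
Never resolve latest. Bumps are deliberate, reviewable, and rollback-able.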
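The runtime preflight described above could be sketched roughly as follows. The `TargetRequirement` type and the injected `resolve` function are hypothetical; a real implementation would look the binary up on PATH. The message text follows the example in this issue.

```typescript
// Hypothetical preflight; names are illustrative, not agentv's real API.
interface TargetRequirement {
  target: string;   // e.g. "vm-eastus"
  syncMode: string; // e.g. "continuous"
  binary: string;   // e.g. "git-sync"
}

// Returns the binary's resolved path if it is on PATH, or undefined if not.
type Resolver = (binary: string) => string | undefined;

// Collects one actionable error per missing binary, before any run starts.
function preflight(reqs: TargetRequirement[], resolve: Resolver): string[] {
  const errors: string[] = [];
  for (const r of reqs) {
    if (resolve(r.binary) === undefined) {
      errors.push(
        `agentv: target '${r.target}' uses sync mode '${r.syncMode}', which requires the\n` +
          `${r.binary} binary. It is not installed.\n` +
          `Install with: agentv install ${r.binary}`,
      );
    }
  }
  return errors;
}

// Example with a fake resolver that only knows about git.
const fakePath = new Map<string, string>([["git", "/usr/bin/git"]]);
const preflightErrors = preflight(
  [
    { target: "ci", syncMode: "oneshot", binary: "git" },
    { target: "vm-eastus", syncMode: "continuous", binary: "git-sync" },
  ],
  (binary) => fakePath.get(binary),
);
// preflightErrors has one entry, for target 'vm-eastus'.
```

Injecting the resolver keeps the check testable and makes it cheap to run for every configured target before the first eval starts.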
4. Cross-platform scope
Linux + macOS: primary, both covered by agentv install.
Windows: remote-target sync via WSL is primary. Windows-native is best-effort and not blocking for v1. The local target on Windows-native continues to work without git-sync.
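As an illustration of this platform scope, an installer might map the host platform to a deps.json key along these lines. The helper name `manifestKey` is hypothetical. The inputs are the values Node reports in `process.platform` and `process.arch`; under WSL, Node reports `linux`, so WSL falls under the Linux keys.

```typescript
// Hypothetical helper: maps Node-style platform/arch names to deps.json keys.
const OS_NAMES = new Map<string, string>([
  ["linux", "linux"],
  ["darwin", "darwin"],
]);
const CPU_NAMES = new Map<string, string>([
  ["x64", "amd64"],
  ["arm64", "arm64"],
]);

function manifestKey(platform: string, arch: string): string | undefined {
  const os = OS_NAMES.get(platform);
  const cpu = CPU_NAMES.get(arch);
  // Windows-native ("win32") and unknown CPUs have no pinned binary in this sketch.
  return os !== undefined && cpu !== undefined ? `${os}-${cpu}` : undefined;
}

const linuxKey = manifestKey("linux", "x64");   // → "linux-amd64"
const macKey = manifestKey("darwin", "arm64");  // → "darwin-arm64"
const windowsKey = manifestKey("win32", "x64"); // → undefined
```

Returning `undefined` instead of throwing lets the caller decide whether an unsupported platform is an error or simply means the `local` target runs without git-sync.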
Design Latitude
Strategy names (noop / oneshot / continuous) are suggestions — pick what fits the existing target/provider vocabulary.
The shape of the internal SyncStrategy interface is up to the implementer.
Plugin vs built-in is open. Per AGENTS.md design principles #1 (Lightweight Core / Plugin Extensibility) and #3 (Composition), the implementer should first audit whether before_all / after_all hooks + a target-provider plugin can cover this without a new built-in primitive. If composition is sufficient, this issue resolves as "document the pattern" rather than "ship new core code."
agentv install may start as a single-purpose installer or be generalised to agentv install <dep> reading from deps.json — both are acceptable.
kubernetes/git-sync is the initial backing impl for continuous. Not a long-term commitment — the strategy interface is the contract.
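If the installer is generalised to read from deps.json, the decision it makes per dependency might look like the sketch below. The `DepEntry` type mirrors the manifest shape given in this issue; `InstallAction` and `planInstall` are hypothetical names, and the digests in the example are fake placeholders.

```typescript
// Mirrors the deps.json entry shape from this issue.
interface DepEntry {
  version: string;
  sha256: Record<string, string>; // keyed by platform, e.g. "linux-amd64"
}

// Hypothetical result type for the installer's decision.
type InstallAction =
  | { kind: "skip"; reason: string }
  | { kind: "download"; version: string; expectedSha256: string }
  | { kind: "unsupported"; reason: string };

// installedSha256 is the digest of the binary already in ~/.agentv/bin/, if any.
function planInstall(
  dep: DepEntry,
  platformKey: string,
  installedSha256: string | undefined,
): InstallAction {
  const expected = Object.prototype.hasOwnProperty.call(dep.sha256, platformKey)
    ? dep.sha256[platformKey]
    : undefined;
  if (expected === undefined) {
    return { kind: "unsupported", reason: `no pinned checksum for ${platformKey}` };
  }
  if (installedSha256 === expected) {
    // Idempotent: the pinned binary is already in place.
    return { kind: "skip", reason: `${dep.version} already installed` };
  }
  // The version always comes from the manifest, never from "latest".
  return { kind: "download", version: dep.version, expectedSha256: expected };
}

// Fake digests, for illustration only.
const exampleDep: DepEntry = {
  version: "v0.0.0-example",
  sha256: { "linux-amd64": "aaaa", "darwin-arm64": "bbbb" },
};

const firstRun = planInstall(exampleDep, "linux-amd64", undefined); // kind: "download"
const secondRun = planInstall(exampleDep, "linux-amd64", "aaaa");   // kind: "skip"
```

Comparing digests rather than file presence means a binary left over from an older pin is replaced on the next install, while a matching one is left alone.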
Acceptance Signals
An eval can be defined once and run against local, ci, and a remote VM target via the same agentv eval ... invocation, with target-appropriate sync happening transparently.
agentv doctor reports presence/absence and version of any required external binaries for the configured targets.
agentv install git-sync (or generic equivalent) installs a pinned version with SHA-256 verification into ~/.agentv/bin/ and is idempotent.
When a target requires an external binary that isn't installed, the run fails fast at preflight with an actionable error message (not mid-run).
No postinstall script in the agentv npm package downloads or extracts external binaries.
User-facing docs describe the source / targets schema and the supported sync: modes. Raw git-sync flags are not documented as user surface.
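For the agentv doctor signal, a report along these lines would satisfy "presence/absence and version". This is only a sketch: the `DepStatus` type, the output layout, and the example paths and version numbers are all assumptions, not a specified format.

```typescript
// Hypothetical status record for one external dependency.
interface DepStatus {
  name: string;
  path?: string;       // resolved path, if the binary was found
  version?: string;    // reported version, if the binary was found
  installHint: string; // shown only when the binary is missing
}

function formatDoctorReport(deps: DepStatus[]): string {
  return deps
    .map((d) =>
      d.path !== undefined
        ? `ok       ${d.name} ${d.version ?? "unknown version"} (${d.path})`
        : `MISSING  ${d.name} (install: ${d.installHint})`,
    )
    .join("\n");
}

// True when any required binary is absent, so the command can exit non-zero.
function hasMissing(deps: DepStatus[]): boolean {
  return deps.some((d) => d.path === undefined);
}

// Example values are made up for illustration.
const exampleDeps: DepStatus[] = [
  { name: "git", path: "/usr/bin/git", version: "2.43.0", installHint: "apt install git" },
  { name: "git-sync", installHint: "agentv install git-sync" },
];

const doctorReport = formatDoctorReport(exampleDeps);
const doctorFailed = hasMissing(exampleDeps); // → true
```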
Non-Goals
Bundling or auto-downloading git-sync (or any third-party binary) at install time.
Exposing raw git-sync flags as first-class config. A sync_options.advanced.extra_args escape hatch may exist but is intentionally absent from golden-path docs.
Two-way sync, conflict resolution, or write-back to the source repo.
Replacing existing before_all / after_all hooks for users who already script their own delivery — the new schema is opt-in.
Windows-native (non-WSL) support for continuous sync in v1.
Related
Brainstorm notes (workspace repo christso/agentv-allagents) comparing kubernetes/git-sync vs DIY vs entireio/git-sync.
kubernetes/git-sync — initial backing impl candidate for continuous.
entireio/git-sync — candidate future strategy for regional ref-mirror pre-staging (remote-to-remote, no local checkout).
AGENTS.md §Design Principles, particularly §1 (Lightweight Core / Plugin Extensibility), §3 (Composition), and §5 (YAGNI) — implementer should validate that this cannot be covered by composing existing primitives before adding new core surface.