feat: Eval benchmark repo sync to remote targets #1232

## Objective

Provide a first-class way for `agentv` to deliver an eval benchmark repo (code + fixtures) onto a target before running evals, so the same eval definition can run unchanged against local, ephemeral CI, and long-lived VM targets.

## Current Problem

Today, getting a benchmark repo onto a remote eval target (Azure VM, runner, etc.) is left to the caller. There is no `agentv`-level model for:

- where the source of truth lives (repo, ref, fixtures),
- how it is delivered to a given target,
- how `agentv` verifies prerequisites are in place before a run starts.

This forces users to script ad-hoc clones / rsyncs per target and re-implement retries, atomic swap, and auth. As more eval targets land (Azure VM, GHA runner, long-lived dev boxes), this gap compounds.

## Proposal

### 1. Declarative `source` + `targets` config

Separate *what* (source of truth) from *where/how* (delivery):

```yaml
source:
  repo: https://github.com/org/eval-benchmarks
  ref: main
  fixtures:
    manifest: evals/fixtures.lock     # content-addressed blobs
    cache_path: .agentv/cache

targets:
  local:
    type: local
    sync: noop                        # source already on disk

  ci:
    type: gha-runner
    sync: oneshot                     # one-shot clone, no daemon

  vm-eastus:
    type: azure-vm
    host: eval-vm-1.eastus.example.com
    workdir: /srv/agentv/repo
    sync: continuous                  # long-running, kept fresh
    sync_options:                     # declarative, not raw flags
      mode: shallow                   # shallow | full | sparse
      sparse_paths: [evals/, tests/]
      refresh_interval: 60s
```

User config describes intent. The sync mechanism is **not** exposed.
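As a rough sketch of how the schema above could be typed and checked at load time: every type and function name below is hypothetical, and the rule that `continuous` targets need `host` and `workdir` is an assumption for illustration, not part of the proposal.

```typescript
// Hypothetical types mirroring the YAML above; not agentv's actual API.
type SyncMode = "noop" | "oneshot" | "continuous";
type CloneMode = "shallow" | "full" | "sparse";

interface SyncOptions {
  mode?: CloneMode;
  sparse_paths?: string[];
  refresh_interval?: string; // e.g. "60s"
}

interface TargetConfig {
  type: string; // e.g. "local", "gha-runner", "azure-vm"
  sync: SyncMode;
  host?: string;
  workdir?: string;
  sync_options?: SyncOptions;
}

interface SourceConfig {
  repo: string;
  ref: string;
  fixtures?: { manifest: string; cache_path: string };
}

interface EvalSyncConfig {
  source: SourceConfig;
  targets: Record<string, TargetConfig>;
}

const SYNC_MODES: readonly string[] = ["noop", "oneshot", "continuous"];

// Returns human-readable problems; an empty array means these checks passed.
function validateConfig(config: EvalSyncConfig): string[] {
  const problems: string[] = [];
  if (!config.source.repo) problems.push("source.repo is required");
  if (!config.source.ref) problems.push("source.ref is required");
  for (const [name, target] of Object.entries(config.targets)) {
    if (!SYNC_MODES.includes(target.sync)) {
      problems.push(`targets.${name}.sync must be one of ${SYNC_MODES.join(", ")}`);
    }
    // Assumed rule: a long-running mirror needs a host and a landing directory.
    if (target.sync === "continuous" && (!target.host || !target.workdir)) {
      problems.push(`targets.${name} uses continuous sync and needs host and workdir`);
    }
  }
  return problems;
}

const example: EvalSyncConfig = {
  source: { repo: "https://github.com/org/eval-benchmarks", ref: "main" },
  targets: {
    local: { type: "local", sync: "noop" },
    ci: { type: "gha-runner", sync: "oneshot" },
    "vm-eastus": {
      type: "azure-vm",
      host: "eval-vm-1.eastus.example.com",
      workdir: "/srv/agentv/repo",
      sync: "continuous",
      sync_options: { mode: "shallow", refresh_interval: "60s" },
    },
  },
};

console.log(validateConfig(example)); // no problems expected for the sample config
```

Returning a list of problems instead of throwing on the first one lets a single run report every misconfigured target.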

### 2. Internal `SyncStrategy` interface (not user-facing)

Built-in implementations selected by `sync:`:

- `noop` — source already present on target.
- `oneshot` — single shallow clone (e.g. `git clone --depth 1 --filter=blob:none`), suitable for ephemeral runners.
- `continuous` — long-running mirror; initial backing impl is the `kubernetes/git-sync` Go binary (partial-clone and sparse-checkout capable, atomic swap via worktree+symlink, supports GitHub App auth).
- (future) `rsync-from-bucket`, `entireio/git-sync` regional ref-mirror pre-stage, etc.

The binary backing `continuous` is an implementation detail. Strategy is swappable without breaking user config.
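One possible shape for the interface, as a minimal sketch only. The names `SyncStrategy`, `SyncContext`, `command`, and `strategyFor` are illustrative (the shape is left to the implementer), and the `git-sync` flag names are an assumption based on git-sync v4 that would need checking against whichever version gets pinned.

```typescript
// Hypothetical internal types; none of these names are confirmed agentv API.
type SyncMode = "noop" | "oneshot" | "continuous";

interface SyncContext {
  repo: string;    // source.repo
  ref: string;     // source.ref
  workdir: string; // where the checkout should land on the target
}

interface SyncStrategy {
  // External binaries the strategy needs on PATH (checked before a run starts).
  readonly requiredBinaries: readonly string[];
  // argv to launch on the target, or null when there is nothing to do.
  command(ctx: SyncContext): string[] | null;
}

const noop: SyncStrategy = {
  requiredBinaries: [],
  command: () => null, // source is already present on the target
};

const oneshot: SyncStrategy = {
  requiredBinaries: ["git"],
  // Note: --branch accepts a branch or tag name, not an arbitrary commit SHA.
  command: (ctx) => [
    "git", "clone", "--depth", "1", "--filter=blob:none",
    "--branch", ctx.ref, ctx.repo, ctx.workdir,
  ],
};

const continuous: SyncStrategy = {
  requiredBinaries: ["git-sync"],
  // Assumption: --repo / --ref / --root follow git-sync v4; verify against the pinned version.
  command: (ctx) => [
    "git-sync", `--repo=${ctx.repo}`, `--ref=${ctx.ref}`, `--root=${ctx.workdir}`,
  ],
};

const STRATEGIES: Record<SyncMode, SyncStrategy> = { noop, oneshot, continuous };

// Maps the user-facing `sync:` value to its internal implementation.
function strategyFor(mode: SyncMode): SyncStrategy {
  return STRATEGIES[mode];
}

const ctx: SyncContext = {
  repo: "https://github.com/org/eval-benchmarks",
  ref: "main",
  workdir: "/srv/agentv/repo",
};
console.log(strategyFor("oneshot").command(ctx));
```

Because user config only ever names the mode, replacing the `continuous` entry in the table with a different backing tool would not touch any YAML.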

### 3. BYO external binaries — no bundling, no postinstall download

`agentv` is an npm/Node package (a few MB). `git-sync` is a ~25MB Go binary, built per platform. Bundling it or downloading it in a postinstall script has known failure modes at enterprise scale: `npm install --ignore-scripts` policies, registries that strip postinstall scripts, and locked-down VMs whose HTTPS proxy does not allowlist GitHub releases. Auto-downloading at first use just moves the same failure to a more confusing place, mid-run.

Instead:

- **Runtime preflight.** When a target's strategy requires an external binary, check `PATH` at run start. If missing, fail fast with an actionable message:
  ```
  agentv: target 'vm-eastus' uses sync mode 'continuous', which requires the
  git-sync binary. It is not installed.

  Install with:    agentv install git-sync
  or via:          brew install git-sync   |   apt install git-sync   |   manual download

  For air-gapped environments, place the binary on PATH manually.
  ```
- **`agentv install git-sync` subcommand.** Opt-in installer. Downloads a *pinned* version from GitHub releases, verifies SHA-256 against a vendored manifest, installs to `~/.agentv/bin/` (which `agentv` prepends to PATH for child processes it spawns). Idempotent.
- **`agentv doctor` subcommand.** Lists every external dependency (`git-sync`, `rsync`, `ssh`, target-specific CLIs) with version + resolved path, and any missing ones with their install hint.
- **Version pinning via vendored `deps.json`:**
  ```json
  {
    "git-sync": {
      "version": "vX.Y.Z",
      "sha256": {
        "linux-amd64": "...",
        "linux-arm64": "...",
        "darwin-amd64": "...",
        "darwin-arm64": "..."
      }
    }
  }
  ```
  Never resolve `latest`. Bumps are deliberate, reviewable, and rollback-able.
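The preflight and checksum steps above could look roughly like this. It is a sketch under assumptions: `findMissing`, `preflightMessage`, and `verifyDownload` are hypothetical names, the PATH lookup is injected so the logic stays testable, and the manifest is assumed to have the `deps.json` shape shown above.

```typescript
import { createHash } from "node:crypto";

// Assumed shape of the vendored deps.json.
interface PinnedDep {
  version: string;
  sha256: Record<string, string>; // keyed by platform, e.g. "linux-amd64"
}
type DepsManifest = Record<string, PinnedDep>;

// `resolve` stands in for a PATH lookup: it returns the resolved path or null.
function findMissing(
  required: readonly string[],
  resolve: (bin: string) => string | null,
): string[] {
  return required.filter((bin) => resolve(bin) === null);
}

// Builds the fail-fast message shown at run start (shortened from the example above).
function preflightMessage(target: string, syncMode: string, bin: string): string {
  return [
    `agentv: target '${target}' uses sync mode '${syncMode}', which requires the`,
    `${bin} binary. It is not installed.`,
    "",
    `Install with:    agentv install ${bin}`,
    "",
    "For air-gapped environments, place the binary on PATH manually.",
  ].join("\n");
}

// Throws unless the downloaded bytes match the pinned SHA-256 for this platform.
function verifyDownload(
  manifest: DepsManifest,
  name: string,
  platformKey: string,
  bytes: Uint8Array,
): void {
  const dep = manifest[name];
  if (!dep) throw new Error(`no pinned entry for ${name}`);
  const expected = dep.sha256[platformKey];
  if (!expected) throw new Error(`no pinned checksum for ${name} on ${platformKey}`);
  const actual = createHash("sha256").update(bytes).digest("hex");
  if (actual !== expected.toLowerCase()) {
    throw new Error(`checksum mismatch for ${name} ${dep.version} (${platformKey})`);
  }
}

// Usage: pretend only ssh is installed.
const missing = findMissing(["git-sync", "ssh"], (bin) =>
  bin === "ssh" ? "/usr/bin/ssh" : null,
);
for (const bin of missing) {
  console.log(preflightMessage("vm-eastus", "continuous", bin));
}
```

Refusing to install when a platform has no pinned checksum keeps the "never resolve `latest`" rule enforceable in code rather than by convention.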

### 4. Cross-platform scope

- Linux + macOS: primary, both covered by `agentv install`.
- Windows: WSL is the primary supported path for remote-target sync. Windows-native is best-effort and not blocking for v1. The `local` target on Windows-native continues to work without `git-sync`.
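To illustrate how this scope might map onto the `deps.json` platform keys, here is a small sketch. The function name `platformKey` is hypothetical; the inputs are the values Node reports in `process.platform` and `process.arch`. Under WSL, Node reports `linux`, so WSL users would resolve to the Linux artifacts.

```typescript
// Maps Node's platform/arch values to a deps.json key, or null when no
// pinned artifact exists (e.g. Windows-native, which is best-effort in v1).
function platformKey(platform: string, arch: string): string | null {
  const os = platform === "linux" || platform === "darwin" ? platform : null;
  const cpu = arch === "x64" ? "amd64" : arch === "arm64" ? "arm64" : null;
  return os !== null && cpu !== null ? `${os}-${cpu}` : null;
}

console.log(platformKey("linux", "x64"));    // linux-amd64
console.log(platformKey("darwin", "arm64")); // darwin-arm64
console.log(platformKey("win32", "x64"));    // null
```

A `null` result would be a natural place for `agentv install` to print the manual-download hint instead of attempting a download.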

## Design Latitude

- Strategy names (`noop` / `oneshot` / `continuous`) are suggestions — pick what fits the existing target/provider vocabulary.
- The shape of the internal `SyncStrategy` interface is up to the implementer.
- **Plugin vs built-in is open.** Per AGENTS.md design principles #1 (Lightweight Core / Plugin Extensibility) and #3 (Composition), the implementer should first audit whether `before_all` / `after_all` hooks + a target-provider plugin can cover this *without* a new built-in primitive. If composition is sufficient, this issue resolves as "document the pattern" rather than "ship new core code."
- `agentv install` may start as a single-purpose installer or be generalised to `agentv install <dep>` reading from `deps.json` — both are acceptable.
- `kubernetes/git-sync` is the *initial* backing impl for `continuous`. Not a long-term commitment — the strategy interface is the contract.

## Acceptance Signals

- [ ] An eval can be defined once and run against `local`, `ci`, and a remote VM target via the same `agentv eval ...` invocation, with target-appropriate sync happening transparently.
- [ ] `agentv doctor` reports presence/absence and version of any required external binaries for the configured targets.
- [ ] `agentv install git-sync` (or generic equivalent) installs a pinned version with SHA-256 verification into `~/.agentv/bin/` and is idempotent.
- [ ] When a target requires an external binary that isn't installed, the run fails fast at preflight with an actionable error message (not mid-run).
- [ ] No `postinstall` script in the `agentv` npm package downloads or extracts external binaries.
- [ ] User-facing docs describe the `source` / `targets` schema and the supported `sync:` modes. Raw `git-sync` flags are not documented as user surface.

## Non-Goals

- Bundling or auto-downloading `git-sync` (or any third-party binary) at install time.
- Exposing raw `git-sync` flags as first-class config. A `sync_options.advanced.extra_args` escape hatch may exist but is intentionally absent from golden-path docs.
- Two-way sync, conflict resolution, or write-back to the source repo.
- Replacing existing `before_all` / `after_all` hooks for users who already script their own delivery — the new schema is opt-in.
- Windows-native (non-WSL) support for `continuous` sync in v1.

## Related

- Brainstorm notes (workspace repo `christso/agentv-allagents`) comparing `kubernetes/git-sync` vs DIY vs `entireio/git-sync`.
- `kubernetes/git-sync` — initial backing impl candidate for `continuous`.
- `entireio/git-sync` — candidate future strategy for regional ref-mirror pre-staging (remote-to-remote, no local checkout).
- AGENTS.md §Design Principles, particularly §1 (Lightweight Core / Plugin Extensibility), §3 (Composition), and §5 (YAGNI) — implementer should validate that this cannot be covered by composing existing primitives before adding new core surface.


