rel/0.41.0 → main by mobileoverlord · Pull Request #153 · avocado-linux/avocado-cli

mobileoverlord · 2026-06-02T15:02:00Z

Integrates the rel/0.41.0 line into main, rebased onto the latest main (picks up 0.40.1/0.40.2 and the connect org fallback; the duplicate node24/indent-fix/release commits were dropped as already-applied).

Features

feat(connect): connect ext publish/status/list — super-admin build-once publish of a packaged extension RPM to the feed, plus status/list of published versions.
feat(ext): nested extension layout — packaged extensions self-describe Provides: avocado-ext-layout(nested) and nest content under /<ext_name>/; ext_fetch installs them into the shared includes installroot (one rpmdb, no cross-extension collisions). Legacy packages keep the per-extension installroot.
feat(snapshots): reproducible channel snapshot pinning — lock-file channel snapshots, auto-pinned against the default feed; repo URL single-sourced.
deploy: avocado deploy on macOS via VM port forwarding (+ design notes).
config: top-level permissions: block for rootfs/initramfs.

Fixes / hardening

fix(tui): route the unset-env-var warning through the output module — a raw eprintln! during {{ env.VAR }} interpolation landed inside the TaskRenderer's cursor region without being counted in rendered_lines, stranding task lines (stacked "sdk bootstrap" spinner lines) during installs that fetch remote extensions.
repo TLS: custom CA + insecure mode across all dnf phases.
runtime build: fall back to default rootfs/initramfs for permissions resolution.
stamps: split input hashes per build step to fix over-invalidation.

Build

build(ext): package avocado-cli itself as the avocado-ext-cli extension (manifest + compile/install/clean scripts).

Notes

Cargo.toml is still at 0.40.2 — no release: 0.41.0 version bump is included in this PR.
Local verification: cargo fmt --check, cargo clippy --all-targets --all-features -- -D warnings, and cargo test all pass on the rebased tip.

Users and groups are now declared in a top-level `permissions:` map and referenced by name from `rootfs.<name>.permissions` / `initramfs.<name>.permissions` (or inlined), instead of buried inside one extension. This puts identity provisioning at the image layer where a single coherent passwd/shadow/group makes sense, lets the same block be reused across rootfs and initramfs, and leaves room to grow into directory perms or sudoers without further grammar churn. When no `permissions:` is set on an image, no script section is emitted — the base packages' generic /etc/passwd/shadow/group are left untouched. Extensions that still declare `users:` / `groups:` continue to work but emit a deprecation warning; that path will be removed in a future release. The script generator was extracted from ext/build.rs into a shared `utils::permissions::render_users_groups_script` helper so the legacy extension path and the new rootfs/initramfs path share one implementation.

Previously, ext install/build/image all shared a single `compute_ext_input_hash` and runtime install/build shared a single `compute_runtime_input_hash`. Editing a field that only affects the build (e.g. ext `image:` kabtool args, `var_files:`, runtime `var:`, runtime `post_build:`) invalidated the install stamp too, which cascaded into the install step being re-run via the dependency chain.

Per-step hash functions now cover exactly what each step uses:

- ext install -> packages, types, source
- ext build -> install inputs + image, overlay, post_build (path + content)
- ext image -> build inputs + var_files, subvolumes, filesystem
- runtime install -> packages, target
- runtime build -> install inputs + narrowed kernel, var, var_files, post_build (path + content), rootfs/initramfs filesystem, ext docker_images
- sdk install -> sdk.packages/image/repo_url/repo_release (no longer includes rootfs/initramfs.packages; those have their own install stamps)
- rootfs install -> rootfs.packages, rootfs.overlay, narrowed kernel, post_install (path + content)
- initramfs install -> same shape as rootfs

The `kernel:` block is now hashed via a narrow {package, version, compile, install} mapping at every call site, so cosmetic edits (metadata, new fields) don't invalidate stamps that don't actually consume them.

The `post_build` / `post_install` hooks now hash script *contents* in addition to the path, so editing the script body invalidates the stamp without `--no-stamps`.

`validate_stamps_batch` now accepts a slice of (component, command, hash) triples so each requirement is compared against the matching step's hash instead of one shared hash applied to all stamps for a component.

STAMP_VERSION bumped 1 -> 2; older stamps invalidate on first run after upgrade, then the new narrower hashes apply going forward.

Adds 14 negative-invalidation tests locking the new shape in place ("X must NOT invalidate Y" for each step+field pair we untangled).
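The layering described above can be illustrated with a minimal sketch. It is not the CLI's implementation: `ExtInputs`, the three hash functions, and `stale_steps` are hypothetical names, and `DefaultHasher` stands in for whatever hash the stamp code actually uses. It only shows how an install hash can stay stable while a build-only field changes, and how (component, command, hash) triples can be compared step by step.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical subset of an extension's config. Field names follow the
/// commit message, not the real avocado-cli types.
#[derive(Clone)]
pub struct ExtInputs {
    pub packages: Vec<String>,  // read by `ext install`
    pub image_args: String,     // read by `ext build` only
    pub var_files: Vec<String>, // read by `ext image` only
}

fn finish<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Install hash: only the fields the install step consumes.
pub fn ext_install_hash(inputs: &ExtInputs) -> u64 {
    finish(&inputs.packages)
}

/// Build hash: the install inputs plus build-only fields.
pub fn ext_build_hash(inputs: &ExtInputs) -> u64 {
    finish(&(ext_install_hash(inputs), &inputs.image_args))
}

/// Image hash: the build inputs plus image-only fields.
pub fn ext_image_hash(inputs: &ExtInputs) -> u64 {
    finish(&(ext_build_hash(inputs), &inputs.var_files))
}

/// Compare each recorded (component, command, hash) triple against the
/// current hash of the same step and return the steps that are stale.
pub fn stale_steps<'a>(
    recorded: &[(&'a str, &'a str, u64)],
    current: &[(&'a str, &'a str, u64)],
) -> Vec<(&'a str, &'a str)> {
    recorded
        .iter()
        .filter(|(comp, cmd, hash)| {
            !current
                .iter()
                .any(|(c2, cmd2, h2)| comp == c2 && cmd == cmd2 && hash == h2)
        })
        .map(|(comp, cmd, _)| (*comp, *cmd))
        .collect()
}
```

With this shape, editing `image_args` changes the build and image hashes but leaves the install hash alone, which is the property the negative-invalidation tests are described as locking in.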

…resolution When a runtime has no explicit `rootfs:` / `initramfs:` ref (the common case for projects that define images at the top level), the resolver returned None and the permissions section came out empty. As a result, the root user's shadow entry never got rewritten and root login was silently broken on the resulting image. Fix: in runtime/build.rs, fall back to `config.rootfs_default()` / `config.initramfs_default()` when the runtime-level ref is unset, the same fallback the image build itself uses for filesystem/post_install. Adds a regression test in `utils::config::tests` that mirrors the test project shape (top-level rootfs/initramfs with `permissions: dev`, runtime declares no rootfs/initramfs of its own) and asserts the fallback path picks up the permissions block. Verified end-to-end: after rebuild, the rootfs erofs image's /etc/shadow now carries `root::19000:...` (empty password) instead of the inherited `root:*:...` from the sysroot.
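A minimal sketch of that fallback, assuming simplified stand-in types: `ImageConfig`, `Config`, `Runtime`, and `resolve_rootfs_permissions` are hypothetical here, and only `rootfs_default()` is a name taken from the commit message.

```rust
/// Hypothetical stand-in for a rootfs/initramfs image definition.
#[derive(Debug, Clone, PartialEq)]
pub struct ImageConfig {
    /// Name of the top-level `permissions:` block this image references.
    pub permissions: Option<String>,
}

/// Hypothetical stand-in for the project config.
pub struct Config {
    pub default_rootfs: Option<ImageConfig>,
}

/// Hypothetical stand-in for a runtime definition.
pub struct Runtime {
    /// Runtime-level rootfs ref; unset in the common case.
    pub rootfs: Option<ImageConfig>,
}

impl Config {
    pub fn rootfs_default(&self) -> Option<&ImageConfig> {
        self.default_rootfs.as_ref()
    }
}

/// Prefer the runtime-level image, then fall back to the project default
/// instead of returning None outright.
pub fn resolve_rootfs_permissions<'a>(
    config: &'a Config,
    runtime: &'a Runtime,
) -> Option<&'a str> {
    runtime
        .rootfs
        .as_ref()
        .or_else(|| config.rootfs_default())
        .and_then(|image| image.permissions.as_deref())
}
```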

Investigate why `avocado runtime deploy` fails on macOS and design the fix. Root cause: the deploy script runs inside the SDK container, which runs inside the slirp-NAT'd avocado-vm, so the TUF repo HTTP server (:8585) it starts is unreachable by the target device, and the script's host-IP autodetect returns container/VM addresses. Plan: a per-deploy QMP hostfwd (bound 0.0.0.0, opened only during deploy) + publishing the container repo port to the VM + setting AVOCADO_DEPLOY_REPO_HOST to the macOS LAN IP (get_local_ip_for_remote), surfaced as a reusable `avocado vm port-forward` primitive. No desktop change — the CLI owns the qemu lifecycle. Plan only; no behavior change yet.

On macOS the deploy container runs inside the slirp-NAT'd avocado-vm, so the TUF repo HTTP server it starts (:8585) was unreachable by the target device and the in-container host-IP autodetect returned VM-internal addresses. Bridge the device->repo path:

- qmp: add human_monitor_command + hostfwd_add/hostfwd_remove (runtime slirp port forwarding via the QEMU monitor), with unit tests.
- deploy: on macOS/Windows (is_docker_desktop), set AVOCADO_DEPLOY_REPO_HOST to this host's LAN IP and publish the repo port; on the avocado-vm (is_vm_routing_active) also open a `hostfwd 0.0.0.0:PORT->guest:PORT` for the deploy and tear it down afterward. Skip `-p` when the SDK container uses host networking (docker discards it and the hostfwd already reaches the VM-bound port). Linux (native docker) is untouched.

Validated end-to-end to a LAN Raspberry Pi 4: device fetched the repo metadata over the forward (HTTP 200). See docs/features/macos-deploy-port-forwarding.md.
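For orientation, a sketch of what the monitor traffic for such a forward might look like. The function names below are hypothetical and the bodies are assumptions rather than the PR's code. They rely on QEMU's HMP syntax `hostfwd_add [netdev_id] [tcp|udp]:[hostaddr]:hostport-[guestaddr]:guestport`, issued through the QMP `human-monitor-command` request, with the netdev id omitted on the assumption of a single user-mode netdev.

```rust
/// Build the HMP command that forwards a host TCP port to the same port in
/// the guest. An empty guest address lets QEMU pick its default guest IP.
pub fn hostfwd_add_command(bind_addr: &str, port: u16) -> String {
    format!("hostfwd_add tcp:{bind_addr}:{port}-:{port}")
}

/// Build the matching teardown command for the forward opened above.
pub fn hostfwd_remove_command(bind_addr: &str, port: u16) -> String {
    format!("hostfwd_remove tcp:{bind_addr}:{port}")
}

/// Wrap an HMP command line in a QMP `human-monitor-command` request.
/// Assumes `command_line` contains nothing that needs JSON escaping, which
/// holds for the two commands built above.
pub fn qmp_human_monitor_command(command_line: &str) -> String {
    format!(
        r#"{{"execute":"human-monitor-command","arguments":{{"command-line":"{command_line}"}}}}"#
    )
}
```

Binding to `0.0.0.0` matches the commit's description of making the repo port reachable from the LAN for the duration of the deploy; the remove command is what a teardown step would send afterward.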

Lets the SDK trust a self-signed / private-CA package endpoint (e.g. an internal Pulp behind package-ca). Centralized so it covers EVERY dnf invocation (sdk bootstrap, sdk packages, ext, runtime, rootfs, initramfs, and the per-module 'dnf' subcommands; host AND target repo confs):

- config: distro.repo.ca + distro.repo.tls_verify; resolvers get_repo_ca()/get_repo_insecure() (env AVOCADO_REPO_CA / AVOCADO_REPO_INSECURE win over config). promote_repo_tls_env() pushes config values to the process env at load so the container env-builders pick them up uniformly.
- container: inject_repo_tls_env() adds AVOCADO_REPO_CA_B64 (base64 of the CA file) + AVOCADO_REPO_INSECURE to the container env at the env-builder chokepoints. REPO_TLS_SETUP_SNIPPET appends the CA to the SDK trust bundle (which SSL_CERT_FILE/CURL_CA_BUNDLE and every explicit sslcacert point at) and, for insecure, adds --setopt=sslverify=0 to DNF_SDK_HOST (base of every dnf call). Emitted by both entrypoint generators.
- sdk bootstrap: snippet appended to the bootstrap command so the FIRST dnf (target pkg from sdk/all) is covered too.
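The env-over-config precedence can be sketched as two pure functions. These are illustrative only: the real resolvers are `get_repo_ca()` / `get_repo_insecure()`, whose signatures are not shown in this PR text. Treating an empty env value as unset and accepting `1` / `true` as truthy are assumptions of this sketch.

```rust
/// Resolve the CA path. The environment value (AVOCADO_REPO_CA) wins over
/// the config value (distro.repo.ca). Assumption: an empty env value is
/// treated the same as an unset one.
pub fn resolve_repo_ca(env_value: Option<&str>, config_value: Option<&str>) -> Option<String> {
    env_value
        .filter(|v| !v.is_empty())
        .or(config_value)
        .map(str::to_owned)
}

/// Resolve insecure mode. The environment value (AVOCADO_REPO_INSECURE)
/// wins when present; otherwise `tls_verify: false` in config enables it.
/// Assumption: only "1" and "true" count as truthy.
pub fn resolve_repo_insecure(env_value: Option<&str>, config_tls_verify: Option<bool>) -> bool {
    match env_value {
        Some(v) => matches!(v, "1" | "true"),
        None => config_tls_verify == Some(false),
    }
}

/// Extra dnf arguments for insecure mode, as named in the commit message.
pub fn dnf_tls_args(insecure: bool) -> Vec<&'static str> {
    if insecure {
        vec!["--setopt=sslverify=0"]
    } else {
        Vec::new()
    }
}
```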

Pin each target to an immutable point-in-time snapshot of its feed channel so a clean + rebuild reproduces exactly, even after the live channel head advances or evicts the NEVRAs the lock file references.

Mechanism: every dnf baseurl is ${repo_url}/$releasever/... with releasever = {release}/{channel}; pinning injects one segment -> {release}/{channel}/snapshots/<id>, exposed via AVOCADO_RELEASEVER (which get_releasever() already honors first), so all sysroots freeze together with no per-call-site plumbing.

- Lock file v7: per-target `repo-snapshot` (RepoSnapshot). Additive: v6 reads as v7 with no pin (= track head), fully backward-compatible. merge adopts a disk pin when the writer has none; unlock (clear_all) drops it.
- utils/snapshot.rs: resolve-and-apply runs once per command. It reuses a matching pin, auto-pins to the channel's latest snapshot on first fetch, pre-flights a pinned snapshot and emits an actionable "run avocado update" error if it was GC'd, warns + tracks head on a stale release/channel, and degrades to head if the feed serves no snapshots (snapshots-latest.json 404s). Honors repo CA / TLS.
- Wired into install (umbrella) + fetch + sdk/rootfs/runtime/ext/initramfs install; fetch stays the reproducible metadata cache.
- avocado update: Cargo-style move-forward. Advances the snapshot pin to newest and clears package/kernel pins so the next install re-resolves + re-locks.
- Tests: v6->v7 migration, round-trip, clear-on-unlock, merge-adopts-disk-pin, plus pure releasever/pin-status/url transforms.
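The releasever transform is small enough to sketch directly. The function names and the sample release, channel, and snapshot id values used with it are hypothetical; only the `{release}/{channel}/snapshots/<id>` shape comes from the commit message.

```rust
/// Build the dnf `$releasever` value. Without a pin it tracks the channel
/// head; with a pin it gains exactly one `snapshots/<id>` segment.
pub fn releasever(release: &str, channel: &str, snapshot_id: Option<&str>) -> String {
    match snapshot_id {
        Some(id) => format!("{release}/{channel}/snapshots/{id}"),
        None => format!("{release}/{channel}"),
    }
}

/// Expand a baseurl of the form `${repo_url}/$releasever/<suffix>`.
pub fn baseurl(repo_url: &str, releasever: &str, suffix: &str) -> String {
    format!("{}/{}/{}", repo_url.trim_end_matches('/'), releasever, suffix)
}
```

Because every baseurl is derived from the one releasever string, pinning that string is enough to freeze all repos together, which is the "no per-call-site plumbing" point made above.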

…repo URL The snapshot resolver early-returned when distro.repo.url was unset, so projects relying on the baked default feed (no explicit repo.url) never recorded a repo-snapshot pin even though their dnf fetch hit that default. Fix by deriving the same default the container uses. Single source of truth: add Config::DEFAULT_REPO_URL + Config::effective_repo_url() in config.rs. The snapshot resolver uses effective_repo_url(); the container env-builder always sets AVOCADO_SDK_REPO_URL from the same const, so the shell's duplicated literal default is removed (it just consumes the env now).
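A sketch of the single-source-of-truth pattern, assuming a pared-down `Config` with one field. `DEFAULT_REPO_URL` and `effective_repo_url()` are the names from the commit message, but the URL value below is a placeholder because the real default is not stated in this PR text.

```rust
/// Hypothetical, pared-down config: only the optional `distro.repo.url`.
pub struct Config {
    pub repo_url: Option<String>,
}

impl Config {
    /// Placeholder value for illustration; not the real baked-in feed URL.
    pub const DEFAULT_REPO_URL: &'static str = "https://repo.example.invalid";

    /// The URL every consumer should use: the explicit setting if present,
    /// otherwise the shared default constant.
    pub fn effective_repo_url(&self) -> &str {
        self.repo_url.as_deref().unwrap_or(Self::DEFAULT_REPO_URL)
    }
}
```

Both the snapshot resolver and the container env-builder would then call `effective_repo_url()` rather than each carrying its own literal.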

Packaged extensions now nest their content under /<ext_name>/ and self-describe the layout via `Provides: avocado-ext-layout(nested)`. ext_fetch repoqueries that provide (repo metadata, no download) and installs nested packages into the SHARED $AVOCADO_PREFIX/includes installroot, so one rpmdb tracks every installed extension with no cross-extension file collisions. Legacy packages lacking the provide keep the per-extension installroot. Either way the final content lands at includes/<ext_name>/, so consumers are unchanged.
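The installroot choice can be sketched as follows. The function names are hypothetical, the provides list is assumed to have already been obtained from the repoquery, and the legacy installroot being `includes/<ext_name>` is inferred from the statement that the final content lands there either way.

```rust
use std::path::{Path, PathBuf};

/// The capability string that nested-layout packages advertise.
pub const NESTED_LAYOUT_PROVIDE: &str = "avocado-ext-layout(nested)";

/// True when the package's provides list advertises the nested layout.
pub fn is_nested(provides: &[&str]) -> bool {
    provides.iter().any(|p| *p == NESTED_LAYOUT_PROVIDE)
}

/// Pick the dnf installroot for an extension package.
/// Nested packages share `<prefix>/includes` (content nests under
/// `/<ext_name>/` inside the package); legacy packages get their own
/// `<prefix>/includes/<ext_name>` root (assumed layout).
pub fn installroot(prefix: &Path, ext_name: &str, provides: &[&str]) -> PathBuf {
    let includes = prefix.join("includes");
    if is_nested(provides) {
        includes
    } else {
        includes.join(ext_name)
    }
}
```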

Build-once publish of a packaged extension RPM to the feed, plus status and list of published versions. Adds the commands::connect::ext module and wires the ConnectExtCommands subcommands and dispatch in main.rs.

Add the avocado.yaml manifest plus compile/install/clean helper scripts that build avocado-cli into the avocado-ext-cli extension, and gitignore the transient /.cargo/ cross-compile config that avocado-cli-compile.sh writes during the build.

`{{ env.VAR }}` interpolation of an unset variable emitted its warning with a raw eprintln!, which lands inside the TaskRenderer's live cursor region without being counted in rendered_lines. The next redraw's MoveUp/Clear then cleared one line too few and stranded a task line, showing as stacked "sdk bootstrap" spinner lines during installs that fetch remote extensions (whose configs use `{{ env.AVOCADO_EXT_VERSION }}`). Route it through print_warning, which is suppressed while a TUI/JSON renderer is active and still prints in plain/CI runs.
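A minimal sketch of the gating idea, assuming a global flag that the renderer toggles. The flag, `set_renderer_active`, and the boolean return value are hypothetical; only the `print_warning` name and its suppress-while-rendering behavior come from the commit message.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical flag: set while a TUI/JSON renderer owns the terminal.
static RENDERER_ACTIVE: AtomicBool = AtomicBool::new(false);

/// Called by the renderer when it starts and stops drawing.
pub fn set_renderer_active(active: bool) {
    RENDERER_ACTIVE.store(active, Ordering::SeqCst);
}

/// Print a warning only when no renderer is active, so nothing lands inside
/// the live cursor region uncounted. Returns whether the line was printed.
pub fn print_warning(message: &str) -> bool {
    if RENDERER_ACTIVE.load(Ordering::SeqCst) {
        return false;
    }
    eprintln!("warning: {message}");
    true
}
```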

runtime/deploy.rs referenced crate::utils::vm::qmp::QmpClient unconditionally, but the qmp module is `#[cfg(unix)]` (unix-socket transport). That broke the Windows `cargo check` (E0433: cannot find `qmp` in `vm`). Gate the port-forward setup and teardown behind cfg(unix) with a non-unix no-op; avocado-vm routing only occurs on unix hosts, so there is no behavior change on unix.
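The gating pattern looks roughly like this. `PortForward` and `open_port_forward` are hypothetical names, and the unix body is a stub standing in for the QMP call.

```rust
/// Hypothetical handle for an open forward, torn down after the deploy.
#[derive(Debug, PartialEq)]
pub struct PortForward {
    pub port: u16,
}

/// Unix hosts: the qmp module exists, so the forward can be opened.
/// The real code would talk to QEMU over a unix socket here.
#[cfg(unix)]
pub fn open_port_forward(port: u16) -> Option<PortForward> {
    Some(PortForward { port })
}

/// Non-unix hosts: a no-op with the same signature, so callers compile
/// without referencing the unix-only qmp module.
#[cfg(not(unix))]
pub fn open_port_forward(_port: u16) -> Option<PortForward> {
    None
}
```

Keeping both variants behind one signature means the call site in the deploy path needs no `cfg` attributes of its own.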

QEMU's `-machine virt` doesn't emit `cpu-idle-states` device-tree bindings, so CONFIG_ARM_PSCI_CPUIDLE never binds and idle CPUs fall back to bare WFI. Under HVF that pattern bounces the vCPU thread through vmexit/vmenter instead of blocking on the WFI handler's pthread_cond_timedwait, costing ~80% host CPU per vCPU at guest idle. On arm64 launches we now dump QEMU's auto-generated DTB once (via `-machine virt,dumpdtb=`), splice in `/idle-states/cpu-sleep-0` plus per-CPU `cpu-idle-states` properties, cache the patched copy under `~/.avocado/vm/dtb/` keyed by (smp, memory, qemu_version), and pass it back with `-dtb`. Cache hits on subsequent launches. Measured on smp=8 idle: 670% -> 275-344% host CPU. State1 stays cosmetic on HVF (PSCI CPU_SUSPEND isn't deeper than WFI) but the framework binding alone fixes the vmexit-loop pattern. Pure-Rust FDT v17 parse/serialize in fdt.rs; no external dtc dependency. Failures degrade gracefully to the previous auto-generated DTB path. `AVOCADO_VM_DTB` env var preserved as a debug override.
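As a small illustration of the FDT handling, here is a sketch of the header check a pure-Rust parser would start with. It is not the code in fdt.rs. It relies on the flattened devicetree header layout (big-endian `magic` at offset 0, `totalsize` at offset 4, `version` at offset 20) and on the format's 4-byte alignment of structure-block tokens; the function names are hypothetical.

```rust
/// Magic number at the start of every flattened devicetree blob.
pub const FDT_MAGIC: u32 = 0xd00d_feed;

/// Read a big-endian u32 at `offset`, or None if the blob is too short.
fn be_u32(blob: &[u8], offset: usize) -> Option<u32> {
    let bytes = blob.get(offset..offset + 4)?;
    Some(u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Validate the magic and return (totalsize, version) from the header.
pub fn parse_fdt_header(blob: &[u8]) -> Option<(u32, u32)> {
    if be_u32(blob, 0)? != FDT_MAGIC {
        return None;
    }
    let totalsize = be_u32(blob, 4)?;
    let version = be_u32(blob, 20)?;
    Some((totalsize, version))
}

/// Round a structure-block position up to the next 4-byte boundary.
pub fn align4(pos: usize) -> usize {
    (pos + 3) & !3
}
```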

Adds a long-lived `avocado vm supervise` process spawned alongside QEMU. Owns the user-facing SSH port and docker socket; QEMU's hostfwd moves to a loopback-only internal port. The supervisor:

- Proxies inbound TCP to QEMU's internal hostfwd. On accept, sends QMP `cont` if the VM is paused; SSH handshake then continues against the freshly-resumed guest.
- Owns `~/.avocado/vm/docker.sock`. On accept, wakes the VM and lazily spawns the ssh -L tunnel to /run/docker.sock in the guest (cached for the awake-window, torn down on pause so QEMU can sleep cleanly).
- Tracks active connections + idle timer. With no inbound activity for `idle.hibernate_after_secs` (default 10s for testing), sends QMP `stop`. Host CPU on QEMU drops to ~0% while RAM stays resident. Any subsequent SSH or docker connection wakes it transparently.

Cache key for the DTB also switches from `qemu --version` to the QEMU binary mtime, which saves ~300-500ms of subprocess overhead on every VM start. Mtime naturally invalidates on `brew upgrade qemu`.

Known limitations (deferred):

- Docker forwarder lifecycle is now supervisor-owned when hibernation is enabled (idle_after_secs > 0); legacy long-lived forwarder still used when disabled to avoid regressing existing non-hibernating setups.
- CPU hotplug for awake-but-idle floor: QMP `device_add` returns "machine does not support hot-plugging CPUs" on QEMU 11 + HVF + ARM virt. Defer; Linux CPU offline is a fallback path if needed later.
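The idle-tracking rule can be sketched as a small state machine. `IdleTracker` and its methods are hypothetical names; the sketch takes `now` as a parameter so the decision logic stays pure, and it treats a zero timeout as "hibernation disabled", following the `idle_after_secs > 0` condition above.

```rust
use std::time::{Duration, Instant};

/// Tracks open connections and the time of the last inbound activity.
pub struct IdleTracker {
    active_connections: usize,
    last_activity: Instant,
    hibernate_after: Duration,
}

impl IdleTracker {
    pub fn new(now: Instant, hibernate_after: Duration) -> Self {
        Self {
            active_connections: 0,
            last_activity: now,
            hibernate_after,
        }
    }

    /// A proxied SSH or docker connection was accepted.
    pub fn connection_opened(&mut self, now: Instant) {
        self.active_connections += 1;
        self.last_activity = now;
    }

    /// A proxied connection finished; the idle clock restarts from here.
    pub fn connection_closed(&mut self, now: Instant) {
        self.active_connections = self.active_connections.saturating_sub(1);
        self.last_activity = now;
    }

    /// True when the supervisor should send QMP `stop`: hibernation is
    /// enabled, nothing is connected, and the idle window has elapsed.
    pub fn should_pause(&self, now: Instant) -> bool {
        !self.hibernate_after.is_zero()
            && self.active_connections == 0
            && now.duration_since(self.last_activity) >= self.hibernate_after
    }
}
```

A supervisor loop would poll `should_pause` on a timer and send `cont` from the accept path, so an open SSH session never gets paused mid-use.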

The 10s default was useful while iterating on the supervisor — short enough to verify pause/wake every few minutes of testing. For real use, 10s pauses mid-SSH-session whenever the user pauses to think, which adds noticeable wake latency on every command. 60s is comfortable for normal interactive work while still freeing host CPU within a minute of stepping away. Users who want either extreme can override via `avocado vm config set idle.hibernate_after_secs N` or the `AVOCADO_VM_IDLE_HIBERNATE_SECS` env var.

`cargo fmt --check` and `cargo clippy --all-targets --all-features -- -D warnings` were both failing on the just-merged supervisor and DTB changes. Auto-applies rustfmt and rewrites three `pos % 4 != 0` clippy::manual_is_multiple_of sites in fdt.rs to `!pos.is_multiple_of(4)`. No behavior changes.

The hibernation supervisor uses tokio's UnixListener/UnixStream for the docker socket path and tokio::signal::unix for graceful shutdown, neither of which exist on Windows. Without gating, `cargo check --target x86_64-pc-windows-gnu` fails with E0432 (unresolved UnixListener/UnixStream imports).

Gated unix-only:

- `pub mod supervisor` in utils/vm/mod.rs
- `pub mod supervise` in commands/vm/mod.rs
- `VmCommands::Supervise` variant + dispatch in main.rs
- `spawn_supervisor` / `stop_supervisor` / `resolve_idle_after_secs` / `DEFAULT_IDLE_AFTER_SECS` in lifecycle.rs
- The internal-port pick + ssh_port file write in `start`

On Windows the hibernation feature is unavailable: QEMU binds the user-facing port directly (today's pre-supervisor behavior), the legacy long-lived docker forwarder runs, and the VM never auto-pauses.

mobileoverlord and others added 12 commits June 2, 2026 10:58

feat(connect): add connect ext publish/status/list (super-admin)

b4bdc47

Build-once publish of a packaged extension RPM to the feed, plus status and list of published versions. Adds the commands::connect::ext module and wires the ConnectExtCommands subcommands and dispatch in main.rs.

github-advanced-security AI found potential problems Jun 2, 2026

View reviewed changes

Comment thread src/utils/permissions.rs Dismissed

Comment thread src/utils/permissions.rs Dismissed

Comment thread src/utils/permissions.rs Dismissed

Comment thread src/utils/permissions.rs Dismissed

mobileoverlord added 7 commits June 2, 2026 11:18

release: bump to 0.41.0

3487b50

mobileoverlord force-pushed the rel/0.41.0 branch from 3caa3e3 to 3487b50 Compare June 3, 2026 01:42

mobileoverlord wants to merge 19 commits into main from rel/0.41.0

3 participants
