Merged
21 commits
97bbc03
docs: add spec for 004-full-implementation — all 28 sub-issues from #57
jeremymanning Apr 17, 2026
00abad2
docs: clarify coordinator quorum loss behavior — graceful degradation
jeremymanning Apr 17, 2026
5a2976e
docs: add implementation plan for 004-full-implementation
jeremymanning Apr 17, 2026
b428dac
docs: add task breakdown for 004-full-implementation — 211 tasks acro…
jeremymanning Apr 17, 2026
11033ff
docs: fix 7 analysis findings — deps, testing, SPA, calibration, tiers
jeremymanning Apr 17, 2026
d2bf6c7
feat: add all Phase 1 dependencies — rsa, p256, p384, aes-gcm, nix, c…
jeremymanning Apr 17, 2026
6c03d56
feat: Phase 2 foundational types — InclusionProof, ConfidentialBundle…
jeremymanning Apr 17, 2026
a4d58c8
feat: Phase 3 — deep cryptographic attestation + Rekor Merkle proofs …
jeremymanning Apr 17, 2026
a567bf2
feat: Phases 4-6 — agent lifecycle, policy engine, sandbox depth (#30…
jeremymanning Apr 17, 2026
51583dc
feat: Phase 7 — security hardening: adversarial tests, confidential c…
jeremymanning Apr 17, 2026
c3c4597
feat: Phases 8-10 — test coverage, runtime systems, platform adapters…
jeremymanning Apr 17, 2026
c6c64ab
feat: Phases 11-14 — GUI, ops, mesh LLM, polish + README/whitepaper u…
jeremymanning Apr 17, 2026
a835df6
feat: P2P daemon — libp2p Swarm with mDNS + Kademlia + GossipSub
jeremymanning Apr 17, 2026
eb684cd
feat: production NAT traversal + distributed job dispatch
jeremymanning Apr 18, 2026
42f556e
feat: option A — public libp2p bootstrap relays + AutoRelay
jeremymanning Apr 18, 2026
02874ae
style: cargo fmt — merge request_response import line + wrap long boo…
jeremymanning Apr 18, 2026
7d55b0c
fix: CI failures (collapsible_match, Windows colons in sysfs paths)
jeremymanning Apr 18, 2026
63cdd46
fix: add clippy::items_after_test_module allows to all 3 adapter main.rs
jeremymanning Apr 18, 2026
f214709
fix: allow(dead_code) on gui/commands.rs — handlers are wired via tau…
jeremymanning Apr 18, 2026
c60d769
fix: clippy 1.95 errors — useless_vec in test_matchmaking/test_firecr…
jeremymanning Apr 19, 2026
d1764a6
docs: honest accounting of spec 004 status — distinguish verified fro…
jeremymanning Apr 19, 2026
378 changes: 244 additions & 134 deletions .omc/project-memory.json

Large diffs are not rendered by default.

143 changes: 0 additions & 143 deletions .omc/state/subagent-tracking.json

This file was deleted.

4 changes: 3 additions & 1 deletion .specify/feature.json
@@ -1 +1,3 @@
{"feature_directory":"specs/003-stub-replacement"}
{
"feature_directory": "specs/004-full-implementation"
}
36 changes: 25 additions & 11 deletions CLAUDE.md
@@ -1,14 +1,16 @@
# world-compute Development Guidelines

Last updated: 2026-04-16
Last updated: 2026-04-18

## Project Overview

World Compute is a decentralized, volunteer-built compute federation. The codebase is a Rust workspace with 94+ source files, 489+ passing tests, and 20 library modules. All 5 CLI command groups are functional (donor, job, cluster, governance, admin). Core modules implemented: WASM sandbox with CID store integration, real Ed25519 signature verification, certificate chain validation (TPM2/SEV-SNP/TDX), BrightID/OAuth2/phone identity verification, Sigstore Rekor transparency logging, OTLP telemetry, STUN-based NAT detection, Raft coordinator consensus, and Firecracker/Apple VF sandbox drivers.
World Compute is a decentralized, volunteer-built compute federation. The codebase is a Rust workspace with 150+ source files, 802 passing tests, and 20 library modules. All 5 CLI command groups are functional (donor, job, cluster, governance, admin). Production P2P daemon with full libp2p NAT-traversal stack (TCP + QUIC, Noise, mDNS + Kademlia DHT, identify, ping, AutoNAT, Relay v2 server+client, DCUtR) and distributed job dispatch (TaskOffer + TaskDispatch request-response with CBOR + real WASM execution) — validated end-to-end in-process via `tests/nat_traversal.rs`. Core modules implemented: WASM sandbox with CID store integration, real Ed25519 signature verification, certificate chain validation (TPM2/SEV-SNP/TDX), BrightID/OAuth2/phone identity verification, Sigstore Rekor transparency logging, OTLP telemetry, STUN-based NAT detection, Raft coordinator consensus, and Firecracker/Apple VF sandbox drivers.

## Active Technologies
- Rust stable (tested on 1.95.0) + libp2p 0.54, tonic 0.12, ed25519-dalek 2, wasmtime 27, openraft 0.9, opentelemetry 0.27, clap 4 (003-stub-replacement)
- CID-addressed content store (cid 0.11, multihash 0.19), erasure-coded (reed-solomon-erasure 6) (003-stub-replacement)
- Rust stable (tested on 1.95.0) + libp2p 0.54, tonic 0.12, ed25519-dalek 2, wasmtime 27, openraft 0.9, opentelemetry 0.27, clap 4, reqwest 0.12, oauth2 4, x509-parser 0.16, reed-solomon-erasure 6, cid 0.11, multihash 0.19 (004-full-implementation)
- CID-addressed content store (SHA-256), erasure-coded RS(10,18) (004-full-implementation)

- **Language**: Rust (stable, tested on 1.95.0)
- **Networking**: rust-libp2p 0.54 (QUIC, TCP, mDNS, Kademlia, gossipsub)
@@ -67,7 +69,7 @@ gui/src-tauri/ # Tauri GUI scaffold

```sh
# Build and test
cargo test # 489+ tests (351+ lib + 138+ integration)
cargo test # 802 tests (500+ lib + 300+ integration)
cargo clippy --lib -- -D warnings # Zero warnings enforced

# Build only
@@ -109,13 +111,25 @@ The project is governed by a ratified constitution at `.specify/memory/constitut
4. **Efficiency & Self-Improvement** — energy-aware scheduling, mesh LLM
5. **Direct Testing** — real hardware tests required, no mocks for production

## Remaining Stubs

Most of the original 76 stubs replaced (issue #7, branch 003-stub-replacement). Remaining:
- **Egress allowlist**: Endpoint allowlist field in JobManifest (egress is default-deny, correct behavior)
- **Artifact registry lookup**: Full CID lookup against ApprovedArtifact registry (structural gate in place)
- **Apple VF helper binary**: Swift helper (`wc-apple-vf-helper`) needs separate macOS compilation
- **Full Merkle proof verification**: Rekor inclusion proof (format validation in place)
## Remaining Stubs and Placeholders

Zero TODO comments in src/ and zero `#[ignore]` tests remain. However, several subsystems have scaffolding landed but placeholders in critical paths — these are not production-ready and are tracked in open issues:

- **Mesh LLM** (#27, #54): `src/agent/mesh_llm/expert.rs::load_model()` is a placeholder — no real LLaMA inference. Orchestration (router, aggregator, safety tiers, kill switch) is complete.
- **AMD / Intel root CA fingerprints** (#28): pinned as `[0u8; 32]` in `src/verification/attestation.rs`. Validators enter permissive bypass mode when fingerprints are zero.
- **Rekor public key** (#29): pinned as `[0u8; 32]` in `src/ledger/transparency.rs`. Signed tree head verification is skipped when the key is zero.
- **Agent lifecycle → gossip wiring** (#30): heartbeat/pause/withdraw return payloads but don't broadcast over gossipsub (the daemon event loop does broadcast separately).
- **Firecracker rootfs** (#33): concatenates layer bytes; does NOT run mkfs.ext4 + OCI tar extraction. A real boot would fail.
- **Admin `ban()`** (#34): `src/governance/admin_service.rs::ban()` returns `Ok(())` without updating the trust registry.
- **Platform adapters** (#37, #38, #39): Slurm/K8s/Cloud scaffolds exist but have not been exercised against live systems.
- **GUI** (#40): never built or run.
- **Deployment** (#41): Dockerfile and Helm chart exist but have never been built or deployed.
- **REST gateway** (#43): routing + auth + rate-limit logic exist but no HTTP listener is bound in the daemon.
- **Churn simulator** (#51): statistical model, not a real kill-rejoin harness.
- **Apple VF Swift helper** (#52): never built on macOS.
- **Receipt verification** (`src/verification/receipt.rs`): structural check only; coordinator public key not yet wired.
- **Daemon `current_load()`** (`src/agent/daemon.rs:500`): stub returning 0.1.
- **Cross-machine firewall traversal** (#60): production NAT stack validated in-process only. Real WAN operation behind institutional firewalls is unverified.
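
The two zero-pinned entries above (#28, #29) share one failure mode: a pinned value of `[0u8; 32]` switches verification off. The following is a minimal sketch of that check, not the code in `src/verification/attestation.rs` or `src/ledger/transparency.rs`. The names `PinCheck`, `PLACEHOLDER` and `check_pinned_fingerprint` are hypothetical and chosen for illustration only. It shows one way an all-zero placeholder could be reported as its own outcome, so that a caller can decide to fail closed instead of passing silently.

```rust
/// Outcome of comparing a presented fingerprint with the pinned one.
/// Hypothetical type for illustration; not taken from the repository.
#[derive(Debug, PartialEq, Eq)]
enum PinCheck {
    /// A real fingerprint is pinned and the presented value matches it.
    Verified,
    /// A real fingerprint is pinned and the presented value differs.
    Mismatch,
    /// The pinned value is still the all-zero placeholder.
    Unpinned,
}

/// The placeholder described in the list above: 32 zero bytes.
const PLACEHOLDER: [u8; 32] = [0u8; 32];

/// Compare a presented 32-byte fingerprint against the pinned value.
/// Assumption: an all-zero pin means "not configured", never a valid pin.
fn check_pinned_fingerprint(pinned: &[u8; 32], presented: &[u8; 32]) -> PinCheck {
    if *pinned == PLACEHOLDER {
        return PinCheck::Unpinned;
    }
    if pinned == presented {
        PinCheck::Verified
    } else {
        PinCheck::Mismatch
    }
}

fn main() {
    let presented = [0xABu8; 32];
    // Placeholder pin: the caller learns that nothing was actually verified.
    println!("{:?}", check_pinned_fingerprint(&PLACEHOLDER, &presented));
    // Real pin that matches the presented fingerprint.
    println!("{:?}", check_pinned_fingerprint(&presented, &presented));
}
```

Returning a distinct `Unpinned` value, rather than treating the placeholder as success, keeps the bypass visible at every call site; whether the project wants that behaviour is a design decision for the issues listed above.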

## CI

@@ -125,6 +139,6 @@ Two GitHub Actions workflows:

## Recent Changes

- **004-full-implementation** (2026-04-18): Merged scaffolding + significant implementation for #57 and its sub-issues (#28–#56, and a first pass on #27/#54 mesh LLM). 802 tests passing across Linux/macOS/Windows + Sandbox KVM + swtpm CI. Landed: full production P2P daemon with libp2p NAT-traversal stack (TCP + QUIC + Noise + mDNS + Kademlia + identify + ping + AutoNAT + Relay v2 server/client + DCUtR), AutoRelay reservations, public libp2p bootstrap relays as default rendezvous, TaskOffer + TaskDispatch request-response protocols over CBOR, real WASM execution of dispatched jobs, `worldcompute job submit --executor <multiaddr> --workload <wasm>` CLI command, end-to-end 3-node relay-circuit integration test. Also landed: ~12 sub-issues fully completed (policy engine, GPU passthrough, adversarial tests, test coverage, credit decay, preemption, confidential compute, mTLS, energy metering, storage GC, documentation, scheduler matchmaking); ~16 sub-issues partially addressed with scaffolding (see Remaining Stubs above); #27/#54 mesh LLM orchestration shell complete but real LLaMA inference deferred. Critical open issue #60 tracks cross-machine WAN mesh formation behind firewalls.
- **003-stub-replacement** (2026-04-16): Replaced all implementation stubs (#7, #8–#26). 77 tasks, 489+ tests. Added reqwest, oauth2, x509-parser, rcgen dependencies. Wired CLI, sandboxes, attestation, identity, transparency, telemetry, consensus, network.
- **002-safety-hardening** (2026-04-16): Red team review (#4). Policy engine, attestation, governance, incident response, egress, identity hardening. 110 tasks, PR #6.
- **001-world-compute-core** (2026-04-15): Initial architecture and implementation across 11 phases.
28 changes: 27 additions & 1 deletion Cargo.toml
@@ -36,6 +36,9 @@ libp2p = { version = "0.54", features = [
"dns",
"identify",
"ping",
"autonat",
"request-response",
"cbor",
"ed25519",
"macros",
] }
@@ -61,6 +64,21 @@ ciborium = "0.2"
ed25519-dalek = { version = "2", features = ["serde", "rand_core"] }
sha2 = "0.10"
rand = "0.8"
rand_04 = { package = "rand", version = "0.4" }
rsa = { version = "0.9", features = ["sha2"] }
p256 = { version = "0.13", features = ["ecdsa"] }
p384 = { version = "0.13", features = ["ecdsa"] }
aes-gcm = "0.10"
x25519-dalek = { version = "2", features = ["static_secrets"] }
threshold_crypto = "0.2"

# TLS / certificate management
rcgen = "0.13"
tokio-rustls = "0.26"
rustls = "0.23"

# Unix signals (preemption supervisor)
nix = { version = "0.29", features = ["signal", "process"] }

# Content addressing
cid = { version = "0.11", features = ["serde"] }
@@ -101,8 +119,16 @@ uuid = { version = "1", features = ["v4", "serde"] }
hex = "0.4"
base64 = "0.22"

# ML inference (mesh LLM)
candle-core = "0.8"
candle-transformers = "0.8"
tokenizers = "0.20"

# System info (energy metering)
sysinfo = "0.32"

[dev-dependencies]
rcgen = "0.13"
time = "0.3"

[build-dependencies]
tonic-build = "0.12"
10 changes: 10 additions & 0 deletions Dockerfile
@@ -0,0 +1,10 @@
# Stage 1: Build
FROM rust:1.95-bookworm AS builder
WORKDIR /build
COPY . .
RUN cargo build --release --bin worldcompute

# Stage 2: Runtime
FROM gcr.io/distroless/cc-debian12
COPY --from=builder /build/target/release/worldcompute /usr/local/bin/worldcompute
ENTRYPOINT ["worldcompute"]
53 changes: 35 additions & 18 deletions README.md
@@ -9,27 +9,44 @@

---

> **Honesty notice — please read before going further.**
> **Status notice (updated 2026-04-18)**
>
> This repository contains a ratified governing constitution, a full research package (~28,600 words), detailed feature specifications, and substantial library code (391 tests passing across safety-critical modules). **However, there is no runnable agent, no working CLI, no testnet, and no deployable binary.** The CLI compiles but all commands print "not yet implemented." The library modules (policy engine, attestation verification, governance, incident response, egress enforcement) work as tested Rust code but are not wired into a running daemon.
> This repository contains a ratified governing constitution, a full research package (~28,600 words), detailed feature specifications, and a substantial implementation with **802 passing tests** across all modules on Linux/macOS/Windows CI. Core systems and the P2P daemon are wired and exercised by unit + integration tests. **However, several subsystems have production scaffolding with placeholder values in critical paths — they are NOT production-ready as shipped.** The open GitHub issues track which pieces remain.
>
> **What exists and works (as of 2026-04-16):**
> - Library crate with 422 passing tests covering safety-critical paths
> - Deterministic policy engine (10-step evaluation pipeline)
> - Attestation verification (TPM2/SEV-SNP/TDX — measurement validation and signature binding; full CA certificate-chain validation is pluggable but not yet integrated)
> - Governance separation of duties, quorum thresholds, time-locks
> - Network egress blocking (RFC1918, link-local, cloud metadata)
> - Incident response containment primitives with audit trails
> - CI on Linux/macOS/Windows via GitHub Actions
> **What is complete and verified in code:**
> - P2P daemon: full libp2p NAT-traversal stack (TCP + QUIC + Noise + mDNS + Kademlia + identify + ping + AutoNAT + Relay v2 server/client + DCUtR). Validated end-to-end in-process by `tests/nat_traversal.rs` — a 3-node relay-circuit test that dispatches a real WASM job through the relay in ~5ms.
> - Distributed job dispatch: TaskOffer and TaskDispatch request-response protocols over CBOR. Real WASM execution on the executor. `worldcompute job submit --executor <multiaddr> --workload <wasm>` CLI command for end-to-end remote dispatch.
> - All 5 CLI command groups functional
> - WASM sandbox with CID-store integration and real workload execution (wasmtime)
> - Deterministic 10-step policy engine with artifact registry + egress allowlist
> - Preemption supervisor with SIGSTOP via nix (measured and logged)
> - BrightID / OAuth2 / phone identity verification
> - Scheduler with ClassAd matchmaking + R=3 disjoint-AS placement
> - All 8 adversarial test scenarios implemented
> - Confidential compute: AES-256-GCM + X25519 key wrapping
> - mTLS certificate lifecycle via rcgen + Ed25519 auth tokens
> - Credit decay with 45-day half-life + anti-hoarding
> - Storage GC + acceptable-use filter + shard residency enforcement
> - Energy metering via Intel RAPL
> - 802 tests passing on CI (Linux/macOS/Windows + Sandbox KVM + swtpm)
>
> **What does NOT exist yet:**
> - A running agent daemon
> - Working CLI subcommands (all print "not yet implemented")
> - P2P networking between nodes
> - Actual job execution inside sandboxes
> - Any form of testnet or multi-node deployment
> **What has scaffolding but placeholder values or missing integration (see issues):**
> - Mesh LLM (#27, #54): orchestration + router + aggregator + safety + kill switch all exist, but `load_model()` is a placeholder — no real LLaMA inference yet
> - Attestation root CA fingerprints (#28): AMD ARK / Intel DCAP pinned as `[0u8; 32]` (bypass mode) — need real fingerprints before production
> - Rekor public key (#29): pinned as `[0u8; 32]` — tree-head signature verification is skipped
> - Firecracker rootfs (#33): concatenates layer bytes; real mkfs.ext4 + OCI-layer extraction not yet wired
> - Platform adapters #37/#38/#39 (Slurm, K8s, Cloud): scaffolds + parsers; not exercised against live systems
> - Tauri GUI (#40): scaffold; never built or run
> - Docker / Helm deployment (#41): files present; never built or deployed
> - REST gateway (#43): routing + auth logic present; no HTTP listener bound in daemon
> - Admin ban (#34): `admin_service::ban()` is an explicit stub returning `Ok(())`
> - Churn simulator (#51): statistical model; no real kill-rejoin
> - Apple VF Swift helper (#52): scaffold; never built on macOS
>
> If you want to help build it, see [Contributing](#contributing). If you want to be notified when it becomes installable, watch this repository.
> **Critical open issue:**
> - #60: cross-machine firewall traversal. The production NAT stack is validated in-process only. Real WAN operation behind institutional / corporate firewalls is unverified, and our attempts from behind Dartmouth's firewall showed libp2p connections not completing. Resolving this is the next milestone.
>
> If you want to help build or test it, see [Contributing](#contributing).
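
The status notice mentions credit decay with a 45-day half-life. A minimal sketch of what a half-life rule means numerically, assuming continuous exponential decay applied to the whole balance; the diff does not say how the project applies decay, and the anti-hoarding rule is not modelled here. The names `HALF_LIFE_DAYS` and `decayed_credits` are hypothetical.

```rust
/// Half-life quoted in the status notice, in days.
const HALF_LIFE_DAYS: f64 = 45.0;

/// Balance remaining after `elapsed_days`, assuming continuous exponential
/// decay: balance * 0.5^(elapsed / half_life). Illustrative only; this is
/// not the repository's implementation.
fn decayed_credits(balance: f64, elapsed_days: f64) -> f64 {
    balance * 0.5_f64.powf(elapsed_days / HALF_LIFE_DAYS)
}

fn main() {
    // One half-life halves the balance; two half-lives quarter it.
    for days in [0.0, 45.0, 90.0] {
        println!("{days:>5} days -> {:.2}", decayed_credits(100.0, days));
    }
}
```

Under this assumption a balance halves every 45 days regardless of its size, so idle credits lose value at a fixed relative rate.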

---

@@ -84,7 +101,7 @@ Five constitutional principles govern every design decision. They are not aspira

## Status

World Compute has completed library-level implementation across core and safety modules. The CLI and agent daemon are scaffolded but not yet functional. Updated 2026-04-16.
World Compute has a substantial implementation with 802 passing tests and a fully wired P2P daemon. All 5 CLI command groups are functional. Several subsystems still have placeholder values in critical paths (see the status notice at the top of the README and open issues #27, #28, #29, #33, #34, #37–#43, #51–#54, #56, #60). Updated 2026-04-18.

### Design artifacts (complete)

2 changes: 2 additions & 0 deletions adapters/cloud/Cargo.toml
@@ -8,3 +8,5 @@ license = "Apache-2.0"
worldcompute = { path = "../.." }
tokio = { version = "1", features = ["full"] }
clap = { version = "4", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"