RFC: Treat apm.yml as package.json for skills — load SKILL.md as agent entrypoint #5571

## TL;DR

`apm.yml` is `package.json` for agent skills. This RFC proposes two additive entry points inside MAF so a developer can write one `SKILL.md`, declare its dependencies in `apm.yml`, and run it directly with no build, no bundle, no transformation step:

```bash
$ maf run ./code-reviewer/SKILL.md
```

That's it. Edit-rerun loop, dependency add, share-with-teammate, and CI all collapse to one extra step (`apm install`) that mirrors `npm install` / `pip install` exactly.

## Why this matters for MAF developers

| Step | Today (hand-authored agent.yaml + manual deps) | With this RFC |
|---|---|---|
| Edit-rerun | edit `agent.yaml` → re-run | edit `SKILL.md` → re-run |
| Add a dependency skill | clone source, copy files into project, wire paths manually | `apm install acme/code-reviewer` → re-run |
| Share with a teammate | "here's the repo, also clone these other 3 and put them at `./skills/`" | clone → `apm install` → `maf run .` |
| CI reproducibility | ad-hoc | lockfile + `apm install` |
| Debug surface | agent.yaml ↔ MAF runtime (2 layers) | SKILL.md ↔ MAF runtime (2 layers) |

The mental model collapses to: **the SKILL.md IS the agent**, and **`apm.yml` is its `package.json`**. Nothing else.

## The two asks of MAF (additive, ~190 LoC Python, ~230 LoC C#)

### Ask 1 — Accept `SKILL.md` as the agent entrypoint

A SKILL.md (Anthropic Agent Skills spec, already widely adopted) IS the agent definition. Frontmatter holds `model:`, `tools:`, `connection:` (reusing existing MAF schema objects verbatim — no new schema types). Markdown body is the agent instructions.

```yaml
---
name: code-reviewer
description: Reviews code for style and correctness
model:
  id: gpt-4.1-mini
  provider: OpenAI
  connection:
    kind: env
    apiKey: OPENAI_API_KEY
tools:
  - kind: mcp
    url: https://github.com/anthropics/fetch-mcp
---
You are a code review assistant. ...
```

**Implementation surface (Python; .NET mirrors)**:

```python
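# agent_framework_declarative/_skill_entrypoint.py (NEW, ~120 LoC)
from __future__ import annotations  # keeps the PromptAgent annotation lazy in this excerpt

from pathlib import Path


class SkillEntrypointLoader:
    @staticmethod
    def load(skill_path: Path) -> PromptAgent:
        """Parse SKILL.md frontmatter + body -> PromptAgent (in memory)."""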
```
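
To make the parse step concrete, here is a minimal sketch of separating frontmatter from the instruction body. It assumes the frontmatter is delimited by two `---` lines with the opening one on the first line of the file; `split_frontmatter` is a hypothetical helper name, and handing the raw frontmatter to a YAML parser and mapping it onto `PromptAgent` is deliberately left out.

```python
from __future__ import annotations


def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a SKILL.md document into (raw_frontmatter, body).

    Assumption: the file opens with a '---' line and the frontmatter ends
    at the next line that consists only of '---'.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter delimiter")
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            # Everything after the closing delimiter is the agent instructions.
            body = "\n".join(lines[index + 1:]).strip()
            return frontmatter, body
    raise ValueError("SKILL.md frontmatter is missing its closing '---' delimiter")


example = """---
name: code-reviewer
description: Reviews code for style and correctness
---
You are a code review assistant.
"""

frontmatter, body = split_frontmatter(example)
print(frontmatter)
print(body)
```

Failing loudly on a missing delimiter keeps a malformed skill from being loaded with an empty configuration.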

CLI dispatch is a one-line branch: if the path ends in `SKILL.md` (or is a directory containing one), call `SkillEntrypointLoader.load()` then feed the result into the existing private `_create_agent_from_prompt_agent()`. **`AgentFactory`, `_models.py`, and `PROVIDER_TYPE_OBJECT_MAPPING` are untouched.**
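
The branch described above could look roughly like the following sketch. `resolve_skill_entrypoint` is a hypothetical name, not an existing MAF function; returning `None` stands for "fall through to the existing `agent.yaml` dispatch".

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional

SKILL_FILENAME = "SKILL.md"


def resolve_skill_entrypoint(path: Path) -> Optional[Path]:
    """Return the SKILL.md to load, or None to use the existing dispatch."""
    if path.is_dir():
        # A directory qualifies only if it directly contains a SKILL.md.
        candidate = path / SKILL_FILENAME
        return candidate if candidate.is_file() else None
    if path.name == SKILL_FILENAME:
        # Existence is not checked here; the loader would report a missing file.
        return path
    return None
```

With this shape, `maf run .` and `maf run ./code-reviewer/SKILL.md` reach the same loader, while any other path is left to the current code path untouched.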

### Ask 2 — Read `apm.yml` and shell to `apm install` for dependencies

If MAF detects an `apm.yml` next to (or above) the entrypoint skill, and `apm_modules/` is missing, MAF runs `apm install` as a **subprocess**, not as a library import, and only when `apm.lock.yaml` is already present (otherwise it fails closed). APM stays in its lane (resolver + lockfile compiler); MAF stays in its lane (runtime).

```python
# agent_framework_declarative/_apm_deps.py (NEW, ~50 LoC)
from pathlib import Path


class ApmDependencyHook:
    @staticmethod
    def ensure(project_root: Path) -> None:
        """No-op if apm_modules/ + apm.lock.yaml present. Else fail-closed
        (lock missing) or run `apm install` (lock present, modules missing)."""
```
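
As an illustration of the decision logic in that docstring, here is a hedged sketch. The function name `ensure_apm_dependencies`, the injectable `run` parameter, and the returned status strings are assumptions made for testability; only the three outcomes (no-op, fail-closed, install) come from the RFC.

```python
from __future__ import annotations

import subprocess
from pathlib import Path
from typing import Callable, Sequence

# Injectable runner: (command, working directory) -> completed process.
Runner = Callable[[Sequence[str], Path], "subprocess.CompletedProcess[str]"]


def _run_apm(command: Sequence[str], cwd: Path) -> "subprocess.CompletedProcess[str]":
    """Default runner: invoke the binary found on PATH and capture its output."""
    return subprocess.run(list(command), cwd=cwd, capture_output=True, text=True)


def ensure_apm_dependencies(project_root: Path, run: Runner = _run_apm) -> str:
    """Return 'ready' or 'installed'; raise RuntimeError when startup must abort."""
    lockfile = project_root / "apm.lock.yaml"
    modules = project_root / "apm_modules"
    if not lockfile.is_file():
        # Fail closed: dependencies are never resolved without a lockfile.
        raise RuntimeError("apm.lock.yaml is missing. Run 'apm install' first.")
    if modules.is_dir():
        # Lockfile and modules are both present, so there is nothing to do.
        return "ready"
    result = run(["apm", "install"], project_root)
    if result.returncode != 0:
        # A non-zero exit aborts startup and surfaces APM's stderr.
        raise RuntimeError(
            f"apm install failed ({result.returncode}): {result.stderr.strip()}"
        )
    return "installed"
```

Passing the runner as a parameter lets the branch logic be exercised without APM installed; with the default runner, a missing `apm` binary surfaces as `FileNotFoundError` from `subprocess.run`.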

**Total cost on the MAF side: 2 new files in `agent_framework_declarative/`, ~190 LoC Python, ~230 LoC C#, no changes to existing dispatch.**

## Architecture

### Component view

Yellow nodes are the new MAF additions; everything else is existing behavior on both sides.

```mermaid
flowchart TD
 subgraph project["Developer Project"]
 APM_YML["apm.yml (deps + entrypoint)"]
 SKILL["code-reviewer/SKILL.md (agent entrypoint)"]
 MODULES["apm_modules/ (resolved skill tree)"]
 LOCK["apm.lock.yaml"]
 end

 subgraph apm["APM CLI (unchanged)"]
 INSTALL["apm install"]
 RESOLVER["Resolver + Lockfile compiler"]
 end

 subgraph maf["MAF Runtime (new additions in yellow)"]
 CLI_RUN["maf run ./code-reviewer/SKILL.md"]
 SKILL_LOADER["SkillEntrypointLoader (NEW: reads SKILL.md frontmatter, builds PromptAgent in memory)"]
 DEP_HOOK["ApmDependencyHook (NEW: calls apm install if apm.yml present)"]
 FACTORY["AgentFactory (existing)"]
 SKILLS_SOURCE["AgentFileSkillsSource (existing, per ADR 0021)"]
 AGENT["Running Agent"]
 end

 CLI_RUN -->|"1. parse entrypoint path"| SKILL_LOADER
 SKILL_LOADER -->|"2. detect apm.yml in project root"| DEP_HOOK
 DEP_HOOK -->|"3. subprocess: apm install"| INSTALL
 INSTALL --> RESOLVER
 RESOLVER -->|"FS write"| MODULES
 RESOLVER -->|"FS write"| LOCK
 SKILL_LOADER -->|"4. read SKILL.md frontmatter + body"| SKILL
 SKILL_LOADER -->|"5. synthesize PromptAgent object"| FACTORY
 FACTORY -->|"6. load skill tree"| SKILLS_SOURCE
 SKILLS_SOURCE -->|"FS read"| MODULES
 FACTORY --> AGENT

 APM_YML -.->|"declares deps"| INSTALL
 SKILL -.->|"is the agent"| SKILL_LOADER

 style SKILL_LOADER fill:#fff3b0,stroke:#d47600
 style DEP_HOOK fill:#fff3b0,stroke:#d47600
 style CLI_RUN fill:#fff3b0,stroke:#d47600
```

### Sequence view

```mermaid
sequenceDiagram
 actor Dev as Developer
 participant CLI as maf run
 participant SL as SkillEntrypointLoader (NEW)
 participant DH as ApmDependencyHook (NEW)
 participant APM as apm install (subprocess)
 participant FS as Filesystem
 participant AF as AgentFactory (existing)
 participant SKS as AgentFileSkillsSource (existing)
 participant Agent as Running Agent

 Dev->>CLI: maf run ./code-reviewer/SKILL.md
 CLI->>SL: load(skill_path)
 SL->>FS: read code-reviewer/SKILL.md frontmatter
 FS-->>SL: name, description, model, tools

 SL->>SL: walk up to find apm.yml
 alt apm.yml found AND apm_modules/ missing
 SL->>DH: ensure(project_root)
 alt apm.lock.yaml missing (fail-closed)
 DH-->>CLI: error "Run 'apm install' first"
 else lockfile present
 DH->>APM: subprocess.run(["apm", "install"])
 APM->>FS: resolve deps, populate apm_modules/
 APM-->>DH: exit code 0
 DH-->>SL: dependencies ready
 end
 end

 SL->>SL: build PromptAgent from frontmatter + body
 Note right of SL: instructions = SKILL.md body, model and tools from frontmatter

 SL->>AF: create_agent(prompt_agent_obj)
 AF->>SKS: load skills from apm_modules/
 SKS->>FS: scan SKILL.md files in skill dirs
 FS-->>SKS: skill tree
 SKS-->>AF: list of AgentSkill

 AF-->>Agent: configured Agent instance
 Agent-->>Dev: agent ready, accepting input
```
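
The "walk up to find apm.yml" step in the sequence could be sketched as follows. `find_project_root` is a hypothetical helper; the RFC does not say where the search stops, so this version assumes it continues to the filesystem root.

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional


def find_project_root(skill_path: Path) -> Optional[Path]:
    """Return the nearest directory at or above the skill that contains apm.yml."""
    start = skill_path.resolve()
    if start.is_file():
        # Begin the search in the directory that holds SKILL.md.
        start = start.parent
    for directory in (start, *start.parents):
        if (directory / "apm.yml").is_file():
            return directory
    # No apm.yml anywhere above: the skill runs without APM-managed deps.
    return None
```

Checking the skill's own directory first covers the "next to" case before the "above" case.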

## Design principles (the contract this RFC asks MAF to adopt)

1. **Skill is the primitive.** A SKILL.md is a self-contained agent definition. No separate `agent.yaml` is required. Everything (deps, model config, tools) is metadata on that one primitive.
2. **Filesystem is the contract boundary.** MAF and APM communicate exclusively through the filesystem (`apm_modules/`, `apm.yml` existence check). No shared memory, no library imports, no IPC beyond subprocess exit codes.
3. **Runtime owns execution; package manager owns resolution.** MAF never resolves deps. APM never runs agents. This is the npm/node and pip/python boundary.
4. **Additive only, never narrowing.** Existing `maf run agent.yaml` paths keep working unmodified. The new SKILL.md path is a separate code branch that converges on the same `PromptAgent` object.

## Why subprocess (not library import) for the APM call

| Criterion | Subprocess (recommended) | Library import (rejected) |
|---|---|---|
| Coupling | Zero. APM is a PATH dep, like `git`. | Hard. MAF must declare `apm-cli` as a dep; version conflicts become release blockers. |
| Failure isolation | Non-zero exit -> clean `RuntimeError` with stderr. | APM internal exception leaks into MAF stack traces. |
| .NET parity | `Process.Start("apm", "install")` is identical. | No library import path exists for .NET; would need a separate bridge per language. |
| Async compat | Wrap in `asyncio.to_thread()`. | APM uses `asyncio` internally -> nested event loops -> `nest_asyncio` hacks. |
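
A small sketch of the async row in the table above: a blocking `subprocess.run` call wrapped in `asyncio.to_thread()` (Python 3.9+). The helper names are hypothetical, and the command is a stand-in that spawns the current Python interpreter so the example runs without APM installed; a real hook would pass `["apm", "install"]`.

```python
import asyncio
import subprocess
import sys
from typing import Sequence


def run_blocking(command: Sequence[str]) -> subprocess.CompletedProcess:
    """Plain blocking call, kept separate so it can be handed to a thread."""
    return subprocess.run(list(command), capture_output=True, text=True)


async def run_in_thread(command: Sequence[str]) -> subprocess.CompletedProcess:
    """Run the blocking subprocess in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(run_blocking, command)


async def main() -> None:
    # Stand-in command; replace with ["apm", "install"] in a real hook.
    result = await run_in_thread([sys.executable, "-c", "print('ok')"])
    print(result.returncode, result.stdout.strip())


if __name__ == "__main__":
    asyncio.run(main())
```

Because the child process owns its own event loop (if any), nothing on the MAF side needs to nest loops.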

## Trust boundary and invariants (security review)

The composition seam (MAF triggering `apm install`) is also a trust boundary. The RFC commits to the following invariants, modelled after Bundler's fail-closed posture:

1. **Lockfile is authoritative.** MAF MUST NOT start an agent if `apm.lock.yaml` is missing. No silent re-resolution, ever.
2. **Subprocess exit code is a security signal.** `apm install` exits non-zero when its scan gate finds critical issues. MAF MUST abort agent startup on non-zero exit.
3. **Entrypoint skill is held to the same scan gate as transitive deps.** In this design the entrypoint skill IS the agent — if it's excluded from scanning or pinning, it's the widest-open attack surface in the system.
4. **`tools:` and `mcp:` declarations from transitive skills are surfaced before execution.** APM deploys these as static content; MAF activates them. The trust decision must be visible.

<details><summary>Three concrete failure modes the design must address (click to expand)</summary>

### Lockfile drift after dependency update
Dev runs `apm install --update`, commits code but forgets the updated `apm.lock.yaml`. CI installs against stale lock. MAF must verify `apm.lock.yaml` consistency at startup and fail with a "which dep drifted" diagnostic.

### `apm.yml` ↔ APM version skew
A skill author publishes a package using a new `apm.yml` field; the deployment environment has an older APM that silently ignores unknown fields. The RFC requires `apm.yml` to declare a minimum APM version (`requires_apm: ">=0.8.0"`); install fails if the installed binary is older than that minimum.
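
A minimal sketch of such a minimum-version check, under stated assumptions: only the `>=X.Y.Z` form is supported, versions are plain dotted integers without pre-release suffixes, and the function names are hypothetical. How the installed version is obtained from the `apm` binary is not shown.

```python
from __future__ import annotations


def parse_version(text: str, width: int = 3) -> tuple[int, ...]:
    """Turn '0.8.0' into (0, 8, 0), padding short versions with zeros."""
    parts = tuple(int(part) for part in text.strip().split("."))
    return parts + (0,) * max(0, width - len(parts))


def satisfies_requires_apm(installed: str, requirement: str) -> bool:
    """Check an installed version against a '>=X.Y.Z' requirement string."""
    requirement = requirement.strip()
    if not requirement.startswith(">="):
        raise ValueError(f"unsupported requirement: {requirement!r}")
    # Tuples compare numerically, so 0.10.0 correctly ranks above 0.8.0.
    return parse_version(installed) >= parse_version(requirement[2:])
```

Comparing integer tuples rather than strings avoids the classic mistake of ranking `0.10.0` below `0.8.0`.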

### Typosquatted skill as agent entrypoint
Because the entrypoint skill IS the agent, a typosquat (`acme-corp/code-reviewr`) gets maximum blast radius: it defines instructions, tools, and MCP servers. Mitigation: `apm audit` must flag entrypoint skills with new `tools:` / `mcp:` declarations vs the previous lockfile; entrypoint skills SHOULD be pinned to exact commits in production; MAF should display the resolved `org/repo@sha` of the entrypoint at startup.
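
The "new declarations vs the previous lockfile" comparison could be sketched as a per-skill set difference. This assumes the tool and MCP declarations have already been extracted into plain sets of identifiers keyed by skill name; whether and how the lockfile records them is not specified in this RFC, and `new_tool_declarations` is a hypothetical name.

```python
from __future__ import annotations


def new_tool_declarations(
    previous: dict[str, set[str]],
    current: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Return, per skill, the declarations that the previous lock did not contain."""
    added: dict[str, set[str]] = {}
    for skill, tools in current.items():
        # A skill absent from the previous lock counts as entirely new.
        fresh = tools - previous.get(skill, set())
        if fresh:
            added[skill] = fresh
    return added
```

An empty result means no skill gained a capability since the last lock; anything else is what an audit step would surface for review.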

</details>

## What we use from MAF (read-only consumers)

| MAF surface | How this RFC uses it |
|---|---|
| `PromptAgent` model | `SkillEntrypointLoader` synthesizes one in memory from SKILL.md |
| `_create_agent_from_prompt_agent()` (existing private method) | Called by the CLI shim with the synthesized `PromptAgent` |
| `AgentFileSkillsSource` (per ADR 0021) | Reads `apm_modules/` as the skill tree — APM populates the directory; MAF consumes it |
| Existing `Model`, `Tool` schema objects | Reused verbatim in SKILL.md frontmatter; no new schema types |

## What this RFC is explicitly NOT proposing

- **No new `kind:` in declarative dispatch.** No changes to `_models.py:531` Python dispatch or `PromptAgentFactory.TryCreateAsync` in .NET.
- **No bundling step.** No transformation pipeline that emits a derived MAF artifact from the source skill. MAF reads the live files.
- **No `agent_framework` import in APM.** The split stays clean.
- **No vendoring of APM into MAF.** Subprocess only. APM is a PATH dependency, like `git`.
- **No changes to existing `maf run agent.yaml` flows.** Strictly additive.

## Specific questions for maintainers

1. **Two new files in `agent_framework_declarative/` (`_skill_entrypoint.py`, `_apm_deps.py`) and a one-branch CLI dispatch — is that the right shape?** The .NET equivalents would mirror in `Microsoft.Agents.AI.Declarative/`.
2. **`SKILL.md` frontmatter as agent config** — does putting `model:`, `tools:`, `connection:` in the entrypoint skill's frontmatter conflict with any planned ADR or schema change?
3. **Subprocess to `apm install`** — would MAF rather take a hard PyPI dep on `apm-cli`? We see strong reasons against (table above), but happy to be talked out of it.
4. **ADR 0021 status.** Still PROPOSED. This RFC only consumes `AgentFileSkillsSource` (already shipped); the v2 `ApmSkillsSource` extension via 0021 stays a future, optional thing.
5. **Backward-compat principle ("additive only, never narrowing")** — is that the right invariant to commit to in the RFC, or do you want stronger language about deprecation paths?

## Status

Filing the design first to incorporate maintainer feedback before implementation. Happy to open a draft PR against `agent_framework_declarative/` to make the surfaces concrete, or to schedule a sync.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
