feat(inspect): add IR-based pipeline reasoning tools (inspect, graph, whatif, lint, catalog, trace, mcp-author)#998
🔍 Rust PR Review
Summary: Looks good overall: layered design is clean, trust boundary is maintained, error handling is solid. A few correctness and maintenance issues worth addressing.
Findings
🐛 Bugs / Logic Issues
@copilot address feedback and suggestions
Addressed the feedback in commit c4063bd (plus prior follow-up f5c2c65): fixed
/rust-review
❌ Rust PR Reviewer failed. Please review the logs for details.
/rust-review
✅ Rust PR Reviewer completed successfully!
🔍 Rust PR Review
Summary: Mostly solid: the layered IR inspection design is clean and well-tested, but there are two security issues and one logic bug worth addressing before merge.
Findings
🔒 Security Concerns
🐛 Bugs / Logic Issues
…dback
Security:
- mcp_author: manual Debug impls for TraceFailureParams and AuditBuildParams so the optional pat field is no longer leaked via {:?} / dbg!() / rmcp error traces.
- audit::pipeline_graph::resolve_source_path: require .md extension and reject parent-dir / tilde components to close the /home/user/.ssh/id_rsa exfiltration vector flagged by the reviewer. populate_pipeline_graph downgrades a rejection to a warning so legitimate audits still complete.
Maintainability:
- audit::model::JobData: add matches_ir_id helper. findings::add_downstream_impact_findings and inspect::trace::find_runtime_job now both delegate to it instead of carrying duplicate Stage.Job suffix-match heuristics.
- audit::findings::add_downstream_impact_findings: emit a Recommendation pointing operators at the failed-job logs (the _recommendations parameter is no longer unused).
- inspect::lint::rule_step_id_collisions: collision message now names BOTH the first-seen producer location and the colliding consumer (was only pointing at the second occurrence).
- inspect::whatif::classify_condition: doc-comment now records the coverage gap for variable-based conditions (e.g. Agent.JobStatus comparisons) which we deliberately classify as Skipped; new test pins that behaviour.
New tests:
- audit::pipeline_graph: 4 resolver tests covering the rejected attack patterns and the accepted legitimate absolute-path case.
- audit::pipeline_graph::populate_pipeline_graph_records_warning_on_malicious_source — end-to-end test that a malicious source string is recorded as a warning rather than read from disk.
- inspect::whatif::variable_based_condition_is_conservatively_skipped — pins the documented limitation.
Validated locally with cargo build, cargo test (1880 unit tests + integration suites pass), and cargo clippy --all-targets --all-features (clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
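For readers following the PAT redaction item above, a manual `Debug` impl of this kind can be sketched as below. This is only an illustration, not the code from the commit: the `build_id` field and the `<redacted>` placeholder are assumptions, and only the struct name and the optional `pat` field come from the commit message.

```rust
use std::fmt;

/// Hypothetical stand-in for the MCP parameter struct.
/// Only `pat` is named in the commit message; `build_id` is an assumption.
pub struct TraceFailureParams {
    pub build_id: String,
    pub pat: Option<String>,
}

impl fmt::Debug for TraceFailureParams {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("TraceFailureParams")
            .field("build_id", &self.build_id)
            // Show only whether a PAT was supplied, never its value.
            .field("pat", &self.pat.as_ref().map(|_| "<redacted>"))
            .finish()
    }
}
```

With a hand-written impl, `{:?}`, `dbg!()`, and any error trace that formats the params all go through the same redacting path, which a derived `Debug` would not give.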
🔍 Rust PR Review
Summary: Solid layered design with good test coverage: one substantive logic bug in the condition classifier, a security suggestion for the MCP source path, and a lint rule that's effectively dead code.
Findings
🐛 Bugs / Logic Issues
Minimal repro that would fail:
detection.condition = Some("not(succeededOrFailed())".to_string());
// classify_condition returns RunsAnyway, should return Skipped
Fix: add a word-boundary guard before accepting a substring match, e.g. verify that the character immediately before the match is
fn is_word_boundary_before(s: &str, idx: usize) -> bool {
    idx == 0 || s.as_bytes().get(idx - 1).map_or(true, |&b| !b.is_ascii_alphanumeric())
}
Then in The PR already tests
🔒 Security Concerns
fn source_path(path: &str) -> PathBuf {
    PathBuf::from(path) // no validation
}
All nine MCP tools that accept
🔍 Rust PR Review
Summary: Solid layered design with good security discipline in most places; a few logic issues worth fixing before merge.
Findings
🐛 Bugs / Logic Issues
// Branch 3 (the fallback):
timeline_name
    .rsplit('.')
    .next()
    .is_some_and(|suffix| suffix == job_id)
// More precise: only accept "anything.job_id" (single-level prefix)
if let Some((prefix, suffix)) = timeline_name.rsplit_once('.') {
    // A prefix containing a dot would be a multi-level qualifier.
    !prefix.contains('.') && suffix == job_id
} else {
    false
}
When the result comes from a fresh download,
🔒 Security Concerns
fn source_path(path: &str) -> PathBuf {
    PathBuf::from(path)
}
…dback
Bugs:
- audit::pipeline_graph::timeline_name_matches_job and
audit::model::JobData::matches_ir_id: tighten the dotted-name suffix
fallback so multi-level names like `Stage1.SubStage.Agent` no longer
spuriously match against the bare id `Agent`. The fallback now only
accepts a single-level `Stage.Job` qualifier (rsplit_once + prefix
must contain no dot). New regression tests pin the behaviour.
- audit::cli::fetch_audit_data_inner: extract the post-fetch enrichment
(pipeline_graph + metrics + derive_findings) into a single helper so
the fresh-download and cache-load paths both produce structurally
identical AuditData. The cache-load path now persists the recomputed
data back to run-summary.json when it changed, so subsequent runs see
the canonical shape and tooling diffing successive outputs is no
longer confused by drift between the saved file and in-memory result.
Suggestions:
- mcp_author::audit_build: route `no_cache: true` invocations through a
per-invocation `tempfile::tempdir()` so two concurrent calls for the
same build can no longer race on partially-written artifacts in the
shared `${TEMP}/ado-aw/audit/build-<id>` directory. Cached calls
continue to use the shared cache root so warm-cache reads still
benefit from cross-invocation reuse.
- compile::build_pipeline_ir: add `/* skip_integrity */` and
`/* debug_pipeline */` inline-name comments to every dispatched
target-builder call (only Standalone had them) so future callers
cannot cargo-cult the wrong positional bool.
Validated locally: cargo build, cargo test (1884 unit tests + every
integration suite pass), cargo clippy --all-targets --all-features
(clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
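The tightened suffix fallback described in the first bug item above can be sketched roughly as follows. This is a minimal illustration assuming an exact-match branch followed by the single-level `Stage.Job` fallback; the real `timeline_name_matches_job` may have additional branches that this thread does not show.

```rust
/// Sketch: does a runtime timeline name refer to the IR job `job_id`?
/// Accepts an exact match, or a single-level `Stage.Job` qualifier.
pub fn timeline_name_matches_job(timeline_name: &str, job_id: &str) -> bool {
    if timeline_name == job_id {
        return true;
    }
    match timeline_name.rsplit_once('.') {
        // The prefix must not itself contain a dot, so
        // `Stage1.SubStage.Agent` does not match the bare id `Agent`.
        Some((prefix, suffix)) => !prefix.contains('.') && suffix == job_id,
        None => false,
    }
}
```

Using `rsplit_once` keeps the prefix available for the no-dot check, which `rsplit('.').next()` discards.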
🔍 Rust PR Review
Summary: Looks good overall: well-structured layered design with solid security defaults. A few items worth addressing before merge.
Findings
…998 round 4
All four items in the latest review are documentation-only: the runtime behaviour is already correct, and the reviewer asked for explicit comments so future maintainers do not weaken the guards or re-introduce the fixed bugs by accident.
- inspect::whatif::classify_condition: doc-comment now explicitly flags the string-literal false-positive (e.g. a condition like eq(variables['result'], 'failed()') would match the literal substring) alongside the existing variable-condition limitation. This is acceptable because ADO conditions are compiler-generated, but the residual gap is now documented.
- audit::cli::fetch_audit_data_inner cache write path: add an inline comment recording the deliberate lack of filesystem locking on save_run_summary. Two concurrent ado-aw audit runs for the same build can race on this write; the failure is recorded as a warning rather than aborting the audit, and both writers derive from the same on-disk artifacts so the resulting summary stays internally consistent.
- audit::pipeline_graph::resolve_source_path: complete the previously truncated security comment so the residual risk is unambiguous. The .md extension check is the primary gate for absolute paths, and weakening or removing it without adding a containment check would silently re-open the arbitrary-file-read vector flagged earlier.
- main: comment on the std::process::exit(1) in the Lint dispatch arm explaining the intentional Drop-bypass (mirrors tsc --noEmit / eslint) and that the only resources in scope are runtime-managed async I/O.
No behavioural changes.
Validated locally with cargo build, cargo test --bin ado-aw (1884 unit tests pass), and cargo clippy --all-targets --all-features (clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR Review
Summary: Solid layered design: the IR summary shape is clean, trust boundary is well-maintained, PAT redaction is correct, and the path-validation work in
Findings
🐛 Bugs / Logic Issues
// src/inspect/cli.rs ~line 205
let audit = crate::audit::fetch_audit_data(crate::audit::AuditOptions {
    output: Path::new("./logs"), // ← still hardcoded
    ...
})
The
// mcp_author/mod.rs
let (_audit, report) = inspect::build_trace(&opts).await.map_err(to_mcp_error)?;
This means every Simplest fix: default to the same
title: format!("Downstream jobs skipped due to {} failure", job.name),
The description body is fine (it shows the real runtime classification). The title should be more neutral, e.g.
🔍 Rust PR Review
Summary: Well-designed layered IR tooling with good test coverage; two issues worth addressing before merge: an unvalidated path in the new MCP server and a very noisy lint rule.
Findings
🔒 Security Concerns
🐛 Bugs / Logic Issues
… whatif, lint, catalog, trace, mcp-author)
Adds an agent-facing read-only surface over the typed pipeline IR so
agents can reason about pipeline structure and failures programmatically.
- New public summary types in `src/compile/ir/summary.rs` (PipelineSummary,
GraphSummary, etc.) with `schema_version` pinned at 1.
- New `compile::build_pipeline_ir(path) -> (FrontMatter, Pipeline)`
read-only entry point that builds the typed IR without writing YAML.
- New `src/inspect/` module + CLI subcommands:
* `inspect <source>` — pipeline summary (text/json)
* `graph dump|deps|outputs` — resolved dependency graph
* `whatif <source> --fail <id>` — static reachability over conditions
* `lint <source>` — structural checks
* `catalog --kind <k>` — safe-outputs/runtimes/tools/engines/models
* `trace <build-id-or-url>` — joins ADO timeline with the IR graph
- New `src/mcp_author/` stdio MCP server exposing every command above as
a read-only MCP tool for IDE / Copilot Chat integration.
- Audit integration: `AuditData.pipeline_graph`, `JobData.upstream_jobs` /
`downstream_jobs`, and a new finding kind for downstream-skip impact.
- Docs: `docs/mcp-author.md`, public-JSON-summary section in `docs/ir.md`,
new commands in `docs/cli.md`, audit fields in `docs/audit.md`, prompt
updates for debug + update workflows.
Build: `cargo build`, `cargo test` (1871 + 4 passed), `cargo clippy` (0 warnings).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Windows-style hardcoded path failed on Linux CI. Resolve from CARGO_MANIFEST_DIR so the test works on both platforms. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
…is-output lint + refresh inspect mod doc
Bugs:
- inspect::whatif::contains_unnegated_call: add a word-boundary guard so the canonical-marker search no longer matches `failed()` inside `succeededorfailed()`. Previously `not(succeededOrFailed())` was wrongly classified as RunsAnyway because the inner `failed()` match started at offset 11 and the four chars before it were `edor`, never `not(`. New is_word_boundary_before helper + regression test pinning the corrected behaviour.
- inspect::lint::rule_missing_is_output: drop the TODO-guarded auto_is_output check that made the rule unreachable in normal usage. The rule now consistently fires when a cross-step consumer lacks isOutput, catching future drift between summary patching and graph codegen. Doc-comment explains the invariant.
Docs:
- inspect::mod doc-comment: drop the stale "Future siblings (not yet landed)" section -- trace/whatif/lint/catalog/graph_deps/graph_outputs are all present in this PR. The refreshed layout list documents every module currently in the inspect tree.
Validated locally: cargo build, cargo test --bin ado-aw (1881 unit tests pass), cargo clippy --all-targets --all-features (clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
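To make the word-boundary guard concrete, here is a small sketch built around the `is_word_boundary_before` helper quoted in the review. The wrapper `contains_call_at_word_boundary` is a hypothetical name; it lowercases the condition (as the commit message implies the real search does), assumes `marker` is already lowercase, and ignores negation handling, which the real `contains_unnegated_call` also performs.

```rust
/// True when the byte before `idx` is not an ASCII letter or digit
/// (or when the match starts the string).
fn is_word_boundary_before(s: &str, idx: usize) -> bool {
    idx == 0 || s.as_bytes().get(idx - 1).map_or(true, |&b| !b.is_ascii_alphanumeric())
}

/// Hypothetical wrapper: does `marker` (assumed lowercase, e.g. "failed()")
/// occur in `condition` as a standalone call rather than as the tail of a
/// longer identifier?
fn contains_call_at_word_boundary(condition: &str, marker: &str) -> bool {
    // ASCII lowercasing preserves byte offsets, so indices stay valid.
    let lowered = condition.to_ascii_lowercase();
    lowered
        .match_indices(marker)
        .any(|(idx, _)| is_word_boundary_before(&lowered, idx))
}
```

With the guard in place, the `failed()` found inside `succeededOrFailed()` is preceded by `r` and is rejected, while `or(failed(), always())` still matches because the preceding byte is `(`.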
…crate::audit::default_cache_root
Previously the `./logs` path was hardcoded in three places and the
default cache directory disagreed between entry points:
- The CLI `audit` command defaulted `--output` to `./logs`.
- The CLI `trace` command and `inspect::build_trace` opened
`Path::new("./logs")` directly.
- The mcp-author `audit_build` tool used
`${TEMP}/ado-aw/audit` while the mcp-author `trace_failure` tool
inherited the `./logs` default — silently scattering log
directories under arbitrary IDE working directories.
Introduce `crate::audit::default_cache_root()` returning
`${TEMP}/ado-aw/audit` on every platform, re-exported from
`crate::audit`. Every entry point now resolves to that helper:
- CLI `audit`: `--output` is `Option<PathBuf>`, unset → helper.
- CLI `trace`: passes `output: None` so `inspect::build_trace`
resolves to the helper.
- `inspect::build_trace`: default for `TraceOptions::output: None`
is the helper.
- mcp-author `audit_build`: uses the helper for normal calls and
layers a per-invocation `tempfile::tempdir()` on top when
`no_cache: true`.
- mcp-author `trace_failure`: passes `output: None`.
Docs (`docs/audit.md`, `docs/cli.md`,
`site/src/content/docs/setup/cli.mdx`,
`site/src/content/docs/reference/audit.mdx`,
`prompts/debug-ado-agentic-workflow.md`) updated to reflect the
new default and to note that the cache is shared across every
audit entry point.
Validated locally: cargo build, cargo test (1884 unit tests + every
integration suite pass), cargo clippy --all-targets --all-features
(clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
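A helper of the shape described in the commit above might look like the sketch below. It assumes that `${TEMP}` corresponds to `std::env::temp_dir()`, which the commit message does not state explicitly, and `resolve_output` is a hypothetical name for the "unset falls back to the helper" step.

```rust
use std::path::PathBuf;

/// Sketch of a shared default cache root: `<temp dir>/ado-aw/audit`.
/// Mapping `${TEMP}` to `std::env::temp_dir()` is an assumption.
pub fn default_cache_root() -> PathBuf {
    std::env::temp_dir().join("ado-aw").join("audit")
}

/// Hypothetical resolver: an explicit output directory wins,
/// otherwise every entry point lands on the same default.
pub fn resolve_output(output: Option<PathBuf>) -> PathBuf {
    output.unwrap_or_else(default_cache_root)
}
```

Routing every `None` through one function is what keeps the CLI and MCP entry points from drifting apart again.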
f775a20 to
7660b06
Compare
… noisy lint rule + restore ./logs CLI default
Addresses PR #998 round 6:
- Drop `inspect::lint::rule_no_condition_references`. The rule fires an Info finding for every job that has `depends_on` with no explicit `condition:`, i.e. every non-Setup job in the canonical Setup → Agent → Detection → SafeOutputs shape. ADO's default `succeeded()` semantics are intentional, not a bug, so the rule documented normal behaviour rather than finding problems and made genuine findings harder to spot. Removed entirely; the serialisation roundtrip test that referenced the rule's code string still passes because the literal value is independent of rule registration.
- Centralise `all_jobs()` as `PipelineSummary::all_jobs`. Three copies (in `audit::pipeline_graph`, `inspect::lint`, and `inspect::whatif`) collapsed into a single method on the public IR summary type. Call sites in those three modules plus `inspect::trace::stage_for_job` now go through the method. Future body-shape additions only need updating in one place.
- Revert the CLI `ado-aw audit --output` default from `${TEMP}/ado-aw/audit` back to `./logs` to preserve backward compatibility for scripts that consume the documented `./logs` location. Non-CLI callers, namely `inspect::build_trace` (used by the CLI `trace` command and the mcp-author `trace_failure` tool) and the mcp-author `audit_build` tool, continue to anchor under `crate::audit::default_cache_root()` so they do not scatter `./logs/` directories under arbitrary IDE working directories. Docs (`docs/audit.md`, `docs/cli.md`, both site mdx files, and the debug prompt) updated to reflect the per-entry-point split.
Validated locally: cargo build, cargo test (1887 unit tests + every integration suite pass), cargo clippy --all-targets --all-features (clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR Review
Summary: Looks good overall: well-layered design, strong test coverage, and commendable security hygiene throughout. Two issues worth addressing before merge.
Findings
🐛 Bugs / Logic Issues
🔒 Security Concerns
…in resolve_source_path
Two bugs from PR #998 round 7:
- inspect::whatif::is_negated_call: switch from `&str[range]` indexing to byte-slice comparison via `bytes.get(idx - 4..idx) == Some(b"not(")`. The previous form panicked if the four bytes immediately before the matched call straddled a UTF-8 char boundary, which is possible if a non-ASCII display name leaked into the condition string. New regression test prepends `é` (a two-byte UTF-8 sequence) before `failed()` and asserts the classifier returns RunsAnyway instead of panicking.
- audit::pipeline_graph::resolve_source_path: a symlink at `/tmp/foo.md` → `/etc/passwd` lexically satisfied the `.md` extension check and was passed through unguarded. We now canonicalize absolute paths after the extension check and reject them if the resolved target does not also end in `.md`, closing the symlink-bypass vector. Legitimate `current.md` → `v1.md` style symlinks remain accepted because the resolved target still ends in `.md`. A new `has_md_extension` helper factors out the shared extension test. Two new unix-only regression tests cover both the rejected-symlink and the accepted-symlink cases. The security doc-comment now explicitly calls out the symlink-target re-check so future maintainers do not weaken it.
Validated locally: cargo build, cargo test --bin ado-aw (1888 unit tests pass on Windows; +2 unix-only symlink tests compile behind cfg(unix)), cargo clippy --all-targets --all-features (clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
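The byte-slice form of the negation check can be sketched as below. The signature (a condition string plus the byte index of the matched call) is an assumption about `is_negated_call`; the point illustrated is only that `slice::get` on bytes returns `None` or a plain byte slice, whereas `&s[idx - 4..idx]` panics when the range start falls inside a multi-byte character.

```rust
/// Sketch: is the call starting at byte `idx` directly wrapped in `not(`?
/// Works on bytes so a non-ASCII character before the call cannot panic.
fn is_negated_call(s: &str, idx: usize) -> bool {
    let bytes = s.as_bytes();
    idx.checked_sub(4) // fewer than four bytes precede the call
        .and_then(|start| bytes.get(start..idx))
        .is_some_and(|prefix| prefix == b"not(")
}
```

In a string such as `éaaafailed()` the four bytes before `failed()` begin in the middle of `é`; the byte comparison simply reports "not negated" there.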
🔍 Rust PR Review
Summary: looks good overall: solid architecture, strong security posture on the
Findings
🐛 Bugs / Logic Issues
This is inconsistent with
fn consumed_outputs(summary: &PipelineSummary) -> BTreeSet<(String, String)> {
    let mut consumed: BTreeSet<_> = summary
        .graph
        .outputs_needing_is_output
        .iter()
        .flat_map(|entry| entry.outputs.iter().map(|o| (entry.step.clone(), o.clone())))
        .collect();
    // Also count same-job consumers (env_refs + condition_refs);
    // these do not need isOutput=true but the output is still used.
    for (_, step) in all_steps(summary) {
        for r in step.env_refs.iter().chain(step.condition_refs.iter()) {
            consumed.insert((r.step.clone(), r.name.clone()));
        }
    }
    consumed
}
The existing test
…positive + zero-alloc all_jobs + JSON graph_dump + cache-hit doc
Bugs:
- inspect::lint::consumed_outputs: union the set of outputs flagged by `graph.outputs_needing_is_output` with every step's `env_refs` and `condition_refs`. Same-job consumers do not appear in `outputs_needing_is_output` (ADO does not require isOutput=true for them), and the previous logic mis-classified those outputs as unused. Two new regression tests pin the fix for both env_ref and condition_ref same-job consumers; existing cross-job and fixture tests remain green.
Suggestions:
- mcp-author `graph_dump`: now accepts `"json"` alongside `"text"` and `"dot"`, matching the CLI's `graph` subcommand. GraphDumpParams docstring + tool description updated; the error message lists all three formats.
- audit::cli::derive_post_processing: doc-comment now explicitly records that cache-hit invocations correlate against the **current local source markdown** by design, so a future maintainer does not "fix" it into using a stale cached graph snapshot (the downstream findings rules depend on the fresh correlation).
- compile::ir::summary::PipelineSummary::all_jobs: switch the return type from `Vec<&JobSummary>` to `impl Iterator<Item = &JobSummary>` via a small `AllJobsIter` either-enum, removing the per-call heap allocation on the hot paths (`populate_job_edges`, `find_matching_job_summary`, the inspect traversals). All callers already chained `.into_iter()`; the redundant calls were dropped via `cargo clippy --fix`.
Validated locally: cargo build, cargo test (1890 unit tests + every integration suite pass), cargo clippy --all-targets --all-features (clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
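The either-enum approach mentioned for `all_jobs` can be illustrated with the sketch below. The `Body`, `StageSummary`, and `JobSummary` shapes here are hypothetical simplifications (the real `PipelineSummary` layout is not shown in this thread); only the `AllJobsIter` name and the `impl Iterator<Item = &JobSummary>` return type come from the commit message.

```rust
// Hypothetical, simplified summary types for illustration only.
pub struct JobSummary {
    pub id: String,
}
pub struct StageSummary {
    pub jobs: Vec<JobSummary>,
}
pub enum Body {
    Jobs(Vec<JobSummary>),
    Stages(Vec<StageSummary>),
}

/// Either-style iterator: one variant per body shape, no boxing.
pub enum AllJobsIter<A, B> {
    Flat(A),
    Staged(B),
}

impl<A, B> Iterator for AllJobsIter<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            AllJobsIter::Flat(a) => a.next(),
            AllJobsIter::Staged(b) => b.next(),
        }
    }
}

impl Body {
    /// Yields every job without collecting into a `Vec` first.
    pub fn all_jobs(&self) -> impl Iterator<Item = &JobSummary> {
        match self {
            Body::Jobs(jobs) => AllJobsIter::Flat(jobs.iter()),
            Body::Stages(stages) => {
                AllJobsIter::Staged(stages.iter().flat_map(|s| s.jobs.iter()))
            }
        }
    }
}
```

The enum is needed because the two `match` arms produce different concrete iterator types, and `impl Trait` in return position requires a single one.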
🔍 Rust PR Review
Summary: Clean layered design with solid security discipline in most places; after several rounds of fixes the major issues (PAT redaction, symlink bypass, byte-safe
Findings
🔒 Security Concerns
fn source_path(path: &str) -> PathBuf {
    PathBuf::from(path) // no checks at all
}
Every tool that takes a The The fix already exists in this PR:
🐛 Bugs / Logic Issues
// build_graph_dump(GraphFormat::Json) returns serde_json::to_string_pretty(&summary.graph)
// — a String containing JSON text.
let text_or_dot = inspect::build_graph_dump(&source, format).await?;
structured_result(GraphDumpResult { text_or_dot }) // serialised again → {"text_or_dot": "{...}"}
When format is
let format = parse_graph_dump_format(params.0.format.as_deref())?;
if format == GraphFormat::Json {
    let graph = inspect::build_graph_summary(&source).await.map_err(to_mcp_error)?;
    return structured_result(graph);
}
let text_or_dot = inspect::build_graph_dump(&source, format).await.map_err(to_mcp_error)?;
structured_result(GraphDumpResult { text_or_dot })
…N + perf nits
Security:
- mcp_author::source_path: was previously `PathBuf::from(path)` with
zero validation, so a prompt-injected MCP request such as
`inspect_workflow(source_path="../../.ssh/authorized_keys.md")`
would reach build_pipeline_ir and read the file. Now mirrors the
guards already in `audit::pipeline_graph::resolve_source_path`:
require a `.md` extension, reject `..` components and `~` prefix,
and canonicalize absolute paths to re-check the extension on the
symlink target. Function is now async (`tokio::fs::canonicalize`);
all seven call sites were updated to `source_path(&...).await?`.
Four new regression tests cover the rejection cases (non-md,
`..`, `~`) plus the accepted relative `.md` case.
Bugs:
- mcp_author::graph_dump: `format = "json"` previously routed through
build_graph_dump(Json), which returns a JSON *string*. Wrapping
that in `GraphDumpResult { text_or_dot: String }` produced
`{"text_or_dot": "<escaped JSON>"}` and forced callers to parse
the inner JSON twice. The json format now short-circuits to
build_graph_summary so callers receive a structured GraphSummary
object. A new regression test decodes the result as GraphSummary
to pin the contract.
Suggestions:
- inspect::whatif::levenshtein: cache `b.chars().count()` once
instead of scanning twice (allocate + final index).
- audit::pipeline_graph::find_matching_job_summary, inspect::trace::stage_for_job,
inspect::whatif::find_job: drop the redundant `.into_iter()`
calls left over from the Vec-to-Iterator migration in
`PipelineSummary::all_jobs`.
Validated locally: cargo build, cargo test (1895 unit tests pass +
all integration suites), cargo clippy --all-targets --all-features
(clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR Review
Summary: Overall well-engineered: the security-sensitive paths are handled carefully, anyhow usage is clean, and there's good test coverage for edge cases. Two findings worth addressing before merge.
Findings
🐛 Bugs / Logic Issues
The finding title is
if !job.failed() || job.downstream_jobs.is_empty() { continue; }
The practical impact today is low (canonical
… .graph.graph + clap value-enum
Addresses PR #998 round 10:
Bugs:
- audit::findings::add_downstream_impact_findings: filter the finding to only fire when at least one downstream job actually skipped or was cancelled (or was absent from the runtime timeline, which signals an expected skip). Previously the rule fired whenever an upstream failed and had any IR-derived downstream, including cleanup jobs with always() that successfully ran. Title also reworded to "potentially impacted by {} failure" so even with the gate in place we do not categorically assert "skipped" for jobs that may have bypassed via always(). Recommendation reason updated to match. New `is_skipped_or_cancelled` helper handles US/UK spelling. New regression test pins the bypass-suppression behaviour; the existing test was updated to match the new title.
Suggestions:
- audit::model::PipelineGraphSection: rename field `graph: PipelineSummary` to `summary: PipelineSummary` so trace.rs no longer reads `.graph.graph.step_locations`. Callers updated (`pipeline_graph.rs`, `trace.rs`), and the same-named doc in `docs/audit.md` switched from `pipeline_graph.graph` to `pipeline_graph.summary`.
- main::GraphCmd::Dump/Deps: switch `--format` and `--direction` from `String` + manual match to `clap::ValueEnum` derives on `GraphFormat` / `GraphDepsDirection`. Invalid values are now caught at parse time with clap's standard "possible values" help text instead of a runtime `anyhow::bail!`. Existing integration test updated to match the new clap rejection message.
Validated locally: cargo build, cargo test (1896 unit tests + all integration suites pass), cargo clippy --all-targets --all-features (clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
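The gating described in the bug item above could be sketched as follows. Only the `is_skipped_or_cancelled` name comes from the commit message; the case-insensitive comparison, the exact result strings, and the `should_report_downstream_impact` helper (with `None` standing for "absent from the runtime timeline") are assumptions made for illustration.

```rust
/// Sketch: accepts "skipped" plus both the US ("canceled") and
/// UK ("cancelled") spellings. Case-insensitivity is an assumption.
pub fn is_skipped_or_cancelled(result: &str) -> bool {
    ["skipped", "canceled", "cancelled"]
        .iter()
        .any(|candidate| result.eq_ignore_ascii_case(candidate))
}

/// Hypothetical gate: report downstream impact only if at least one
/// downstream job was skipped, cancelled, or missing from the timeline.
pub fn should_report_downstream_impact(downstream_results: &[Option<&str>]) -> bool {
    downstream_results.iter().any(|result| match result {
        None => true, // absent from the timeline: treated as an expected skip
        Some(r) => is_skipped_or_cancelled(r),
    })
}
```

A cleanup job guarded by `always()` that actually ran would show up as `Some("succeeded")` here and would not trigger the finding on its own.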
🔍 Rust PR Review
Summary: Well-designed addition: the security model is solid and test coverage is thorough. One real bug to fix before merge, one minor security inconsistency to address, and a couple of low-priority notes.
Findings
🐛 Bugs / Logic Issues
let source = match read_source_from_aw_info(run_dir).await.transpose()? {
A corrupt artifact from a bad run is a realistic scenario; it should degrade gracefully:
let source = match read_source_from_aw_info(run_dir).await {
    Some(Ok(s)) if !s.trim().is_empty() => Some(s),
    Some(Err(err)) => {
        record_warning(
            audit,
            "audit::pipeline_graph",
            format!("failed to read aw_info.json: {err:#}; skipping IR graph correlation"),
        );
        return Ok(());
    }
    _ => audit.overview.aw_info.as_ref().and_then(|info| info.source.clone()),
};
🔒 Security Concerns
…h_guard + reachable_edges + fix corrupt aw_info abort
Bugs:
- audit::pipeline_graph::populate_pipeline_graph: a corrupt
aw_info.json (read error or malformed JSON) previously aborted the
entire audit via `transpose()?`. Now downgraded to a recorded
warning, matching the documented "warn and continue" contract that
already applies to resolve_source_path / missing-source failures.
New regression test writes `{not valid json` and asserts the
populate call succeeds with the expected warning.
Security:
- mcp_author::source_path: the previous implementation built
PathBuf::from the trimmed input directly, so `..\\workflow.md` on
Linux became a single Normal component and slipped past the
ParentDir traversal check. The shared guard now normalises
platform separators before the component check, closing the
bypass. A backslash-traversal regression test in
src/mcp_author/tests.rs pins the fix.
Refactors (deduplicate audit + mcp-author validation logic):
- New `src/compile/source_path_guard.rs` exposes
`validate_workflow_source_path(source: &str) -> Result<ValidatedSourcePath>`
with the full security contract: trim + separator normalisation,
`.md` extension check, `..`/`~` rejection, and symlink-target
re-check on absolute paths. Six dedicated tests (including a
unix-only symlink-bypass test) cover every guard.
- audit::pipeline_graph::resolve_source_path now delegates to the
shared guard; the local `normalize_source_path` and
`has_md_extension` helpers were removed.
- mcp_author::source_path now delegates to the shared guard and
wraps the anyhow error as `McpError::invalid_params`.
Suggestion: shared BFS over edge lists.
- `inspect::graph_deps::reachable_edges` is now `pub` and accepts a
direction. `inspect::whatif::reachable_edges` collapsed to a
one-line downstream-direction wrapper that calls the shared
helper, removing the BTreeMap/VecDeque duplication. Unused
imports trimmed from `whatif.rs`.
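A direction-aware BFS over an edge list, as described in the bullet above, might look roughly like the sketch below. The `Direction` enum, the `reachable` function, and the `(from, to)` tuple representation are hypothetical stand-ins for illustration; the real `inspect::graph_deps::reachable_edges` signature is not shown in this thread.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical direction selector (name assumed for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Direction {
    Upstream,
    Downstream,
}

/// Breadth-first walk over `(from, to)` edges, where `to` depends on `from`.
/// Returns every node reachable from `start` in the given direction,
/// excluding `start` itself.
fn reachable(edges: &[(&str, &str)], start: &str, direction: Direction) -> BTreeSet<String> {
    // Build the adjacency map once, flipping each edge for upstream walks.
    let mut adjacency: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(from, to) in edges {
        let (key, value) = match direction {
            Direction::Downstream => (from, to),
            Direction::Upstream => (to, from),
        };
        adjacency.entry(key).or_default().push(value);
    }

    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::from([start]);
    while let Some(node) = queue.pop_front() {
        for &next in adjacency.get(node).into_iter().flatten() {
            // `insert` returns false for already-visited nodes, which also
            // terminates the walk on cyclic input.
            if seen.insert(next.to_string()) {
                queue.push_back(next);
            }
        }
    }
    // A cycle could lead back to the start node; drop it from the result.
    seen.remove(start);
    seen
}
```

With an assumed edge list of `Agent -> Detection -> SafeOutputs`, a downstream walk from `Agent` and an upstream walk from `SafeOutputs` each return the other two jobs, so a `whatif`-style wrapper only needs to fix the direction argument.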
Validated locally: cargo build, cargo test (1904 unit tests + every
integration suite pass), cargo clippy --all-targets --all-features
(clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
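The backslash bypass fixed in the commit above comes down to normalising separators before inspecting path components. Below is a minimal sketch of the lexical part of such a guard; `validate_lexically` and its error strings are hypothetical, and the real `validate_workflow_source_path` additionally performs the symlink-target re-check, which is omitted here. Treating any component that starts with `~` as a tilde component, and matching the extension case-sensitively, are assumptions of this sketch.

```rust
use std::path::{Component, PathBuf};

/// Hypothetical lexical guard: trim, normalise separators, require `.md`,
/// and refuse `..` / `~` components. No filesystem access happens here.
fn validate_lexically(source: &str) -> Result<PathBuf, String> {
    // Treat `\` as a separator on every platform so `..\workflow.md`
    // cannot hide a parent-dir hop inside a single Normal component.
    let normalized = source.trim().replace('\\', "/");
    if normalized.is_empty() {
        return Err("refusing empty source path".to_string());
    }

    let path = PathBuf::from(&normalized);
    // Assumption: the extension check is case-sensitive.
    if path.extension().and_then(|ext| ext.to_str()) != Some("md") {
        return Err(format!("refusing non-`.md` source path '{normalized}'"));
    }

    for component in path.components() {
        match component {
            Component::ParentDir => {
                return Err(format!(
                    "refusing source path with parent-directory components: '{normalized}'"
                ));
            }
            Component::Normal(part) if part.to_string_lossy().starts_with('~') => {
                return Err(format!("refusing source path with tilde component: '{normalized}'"));
            }
            _ => {}
        }
    }
    Ok(path)
}
```

Running the component check after normalisation means `..\workflow.md` and `../workflow.md` are rejected identically, while an ordinary input such as `workflows/agent.md` passes through unchanged.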
🔍 Rust PR Review
Summary: Well-designed feature addition, but there is one security gap in
Findings
🔒 Security Concerns
The module doc promises:
...and
Then
let resolved_source_path = tokio::fs::canonicalize(&source_path)
    .await
    .unwrap_or_else(|_| source_path.clone());
let (_fm, pipeline) = crate::compile::build_pipeline_ir(&resolved_source_path)...
If a relative source like
The attack is constrained: it requires both a malicious
Suggested fix in
let joined = cwd.join(&path);
// Mirror the absolute-path symlink check.
if let Ok(canonical) = tokio::fs::canonicalize(&joined).await
    && !has_md_extension(&canonical)
{
    anyhow::bail!(
        "refusing source path '{normalized}': symlink resolves to non-`.md` target '{}'",
        canonical.display()
    );
}
Ok(ValidatedSourcePath { path: joined, normalized })
(And add a companion test with a relative symlink, mirroring the existing
|
…classify_condition limitations
Security (relative-path symlink target re-check):
- compile::source_path_guard::validate_workflow_source_path
previously only canonicalised + re-checked the `.md` extension on
the **absolute-path** branch. A relative input such as
`workflows/evil.md` that symlinked to `/etc/passwd` slipped past
the lexical `.md` check; the eventual `canonicalize` inside
`audit::pipeline_graph::populate_pipeline_graph` would then resolve
to the arbitrary target, violating the documented contract.
- The relative branch now mirrors the absolute-path guard: join to
cwd, canonicalize, and re-check the resolved extension. Module
doc updated to call this out. New unix-only regression test
(rejects_relative_md_symlink_to_non_md_target) creates the
bypass setup inside a tempdir, changes the working directory via
`set_current_dir` under a Mutex, and asserts the resolver refuses
with the same symlink rejection message.
Docs (inspect::whatif::classify_condition):
- Expanded the "Coverage limitations" section to enumerate every
known static-analyser blind spot so authors debugging a what-if
result are not surprised: not(succeeded()) misclassification
(already listed), scoped predicate forms like
failed('Setup'), bare canceled(), variable / dependency-result
comparisons, surviving template ${{ }} expressions, ignored
boolean composition, multi-line not(...) wraps, step-level
conditions, and string-literal false positives. Closes the
reviewer's "tell me more" follow-up. No behaviour change — the
classifier remains a conservative lower bound, with
inspect::trace as the authoritative source for any
classification disagreement.
Validated locally: cargo build, cargo test (1904 unit tests + every
integration suite pass — +1 on Unix from the new symlink test),
cargo clippy --all-targets --all-features (clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
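The relative-path symlink re-check from the commit above can be sketched with the synchronous standard library. `resolve_checked` and its explicit `base` parameter are hypothetical; the real guard is async, joins onto the process working directory, and returns a `ValidatedSourcePath`. As in the reviewer's snippet, a `canonicalize` failure (for example a missing file) is not treated as a rejection here.

```rust
use std::fs;
use std::path::{Path, PathBuf};

fn has_md_extension(path: &Path) -> bool {
    path.extension().and_then(|ext| ext.to_str()) == Some("md")
}

/// Hypothetical sketch: lexical `.md` check, join relative input onto
/// `base`, then re-check the extension of the symlink-resolved target.
fn resolve_checked(base: &Path, source: &Path) -> Result<PathBuf, String> {
    if !has_md_extension(source) {
        return Err(format!("refusing non-`.md` source path '{}'", source.display()));
    }
    let joined = if source.is_absolute() {
        source.to_path_buf()
    } else {
        base.join(source)
    };
    // `canonicalize` follows symlinks, so the extension seen here is the
    // real target's extension, not the link's. Errors are left for the
    // later read to report.
    if let Ok(canonical) = fs::canonicalize(&joined) {
        if !has_md_extension(&canonical) {
            return Err(format!(
                "refusing source path '{}': symlink resolves to non-`.md` target '{}'",
                source.display(),
                canonical.display()
            ));
        }
    }
    Ok(joined)
}
```

Passing the base directory explicitly is a choice made for this sketch: it lets a test exercise the relative branch inside a tempdir without mutating the process-wide working directory, which is why the regression test described above needs a Mutex around `set_current_dir`.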
🔍 Rust PR Review
Summary: Well-designed PR with a clean layered architecture and good security thinking. Two issues worth addressing before merge: a
Findings
🐛 Bugs / Logic Issues
In Concrete path: 1ES or
// inside the `for downstream_stage in ...` loop, after resolving the stage:
let stage_classification = find_stage(summary, &downstream_stage)
    .and_then(|s| s.condition.as_ref())
    .map(|c| classify_condition(&Some(c.clone())))
    .unwrap_or(WhatIfClassification::Skipped);
// then, in the final `filter_map`:
classification: std::cmp::max_by_key(
    stage_classification,
    classify_condition(&job.condition),
    |c| matches!(c, WhatIfClassification::RunsAnyway) as u8,
),
Or more simply: if
🔒 Security Concerns
The module documentation states the guard "refuses parent-directory traversal", but the
This affects both call sites the module explicitly documents as untrusted:
The
if path.components().any(|c| matches!(c, Component::ParentDir)) {
    anyhow::bail!("refusing source path with parent-directory components: '{normalized}'");
}
You'd need a dedicated test case:
|
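The review above suggests taking the stronger of the stage-level and job-level classifications. A minimal sketch of that combination follows; the two-variant enum is an assumption (only `Skipped` and `RunsAnyway` appear in this thread, and the real enum may carry more variants), and the function body is illustrative rather than the project's implementation.

```rust
/// Assumed two-variant view of the what-if classification.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WhatIfClassification {
    Skipped,
    RunsAnyway,
}

/// Hypothetical combiner: a job runs anyway if either its containing
/// stage or the job itself bypasses the upstream failure.
fn stronger_classification(
    stage: WhatIfClassification,
    job: WhatIfClassification,
) -> WhatIfClassification {
    if stage == WhatIfClassification::RunsAnyway || job == WhatIfClassification::RunsAnyway {
        WhatIfClassification::RunsAnyway
    } else {
        WhatIfClassification::Skipped
    }
}
```

Under this rule a cleanup stage marked `condition: always()` lifts its inner jobs to `RunsAnyway` even when those jobs carry no condition of their own.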
…parent-dir guard on absolute paths + suggestion threshold + lint shadow
Bugs:
- inspect::whatif::reachable_downstream_jobs: jobs added via the
stage-edge traversal branch are now classified by combining the
containing stage's condition with the job's own. Previously only
job.condition was inspected, so a `condition: always()` on a
downstream cleanup stage was silently dropped on the floor and
inner jobs (which default to succeeded()) were reported as
Skipped. New `stronger_classification` + `find_stage` helpers lift
the stage-level bypass through to its jobs. A stages-bodied
regression test pins the new behaviour.
Security:
- compile::source_path_guard::validate_workflow_source_path: the
ParentDir + `~` check was previously gated behind the relative-path
branch, so an adversarial absolute path like
`/workspace/../../home/runner/.env.md` slipped through unchecked
even though the module's stated contract refuses parent-directory
traversal unconditionally. Hoisted the check before the
absolute/relative split; updated the error message to drop
"relative" since the rejection now applies to both kinds. New
`rejects_absolute_path_with_parent_dir_component` regression test;
existing tests + mcp_author tests updated to match the new wording.
Suggestions:
- inspect::lint::report: renamed the inner `LintSummary` binding from
`summary` to `tally` so it no longer shadows the `PipelineSummary`
parameter; the struct field on `LintReport` is still `summary` (no
public-API change).
- inspect::whatif::closest and inspect::graph_deps::closest: added a
Levenshtein distance threshold (`needle_len / 2 + 2`) so unrelated
input like `whatif --fail xyzzy` no longer gets a noisy "did you
mean" suggestion against whatever IR id happens to be
lexicographically nearest. Single-typo cases like `Aget` → `Agent`
still suggest. Two whatif regression tests
(`closest_returns_none_for_unrelated_input`,
`closest_suggests_single_typo_within_threshold`).
Validated locally: cargo build, cargo test (1908 unit tests + every
integration suite pass), cargo clippy --all-targets --all-features
(clean).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
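The `needle_len / 2 + 2` cut-off from the commit above can be illustrated with a plain Levenshtein implementation. The function bodies below are a sketch, not the project's `closest` helpers: measuring the needle in `char`s, comparing case-sensitively, and treating the threshold as inclusive are all assumptions.

```rust
/// Classic two-row Levenshtein edit distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = cur[j] + 1;
            cur.push(substitution.min(deletion).min(insertion));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Hypothetical "did you mean" lookup: returns the nearest candidate only
/// when its distance is within `needle_len / 2 + 2` (assumed inclusive).
fn closest<'a>(needle: &str, candidates: &[&'a str]) -> Option<&'a str> {
    let threshold = needle.chars().count() / 2 + 2;
    candidates
        .iter()
        .copied()
        .map(|candidate| (levenshtein(needle, candidate), candidate))
        .filter(|(distance, _)| *distance <= threshold)
        .min_by_key(|(distance, _)| *distance)
        .map(|(_, candidate)| candidate)
}
```

For the four-character needle `Aget` the threshold is 4 and `Agent` sits at distance 1, so it is suggested. For `xyzzy` the threshold is also 4, but job ids such as `Agent`, `Detection`, and `SafeOutputs` share no characters with it, so every distance equals the longer string's length and nothing is suggested.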
🔍 Rust PR Review
Summary: Looks good overall: solid architecture and excellent security work. One reproducible bug in the cache-hit path, one doc comment that undersells a behavioral difference.
Findings
🐛 Bugs / Logic Issues
|
Summary
Adds an agent-facing read-only surface over the typed pipeline IR so agents (in IDE Copilot Chat, sub-agents, or downstream tooling) can reason about pipeline structure and failures programmatically instead of grep-walking compiled `.lock.yml` files.
Layered design, each layer reusable by the next:
- `PipelineSummary` (`schema_version: 1`) + `GraphSummary` etc. in `src/compile/ir/summary.rs`; Serialize + Deserialize, stable contract.
- `compile::build_pipeline_ir(path) -> (FrontMatter, Pipeline)`: read-only, no YAML written.
- `inspect`, `graph {dump,deps,outputs}`, `whatif`, `lint`, `catalog`, `trace`.
- `ado-aw mcp-author` stdio server exposing every L1 command as a read-only MCP tool for IDE / Copilot Chat.
- `AuditData.pipeline_graph`, `JobData.{upstream,downstream}_jobs`, new "downstream-skipped" finding.
Follow-up fixes from review feedback
- `whatif` condition classification to correctly handle negated predicates (e.g. `not(failed())`, `not(always())`) and added targeted regression tests.
- `JobData` helpers.
- `mcp-author` `audit_build` to write under a temp directory (`$TMP/ado-aw/audit`) instead of `./logs`.
- `whatif` for readability (`reverse` → `downstream`).
What agents can now do
- `whatif --fail Detection` answers "what skips" without manual graph walking; `graph deps SafeOutputs --direction upstream` returns the typed transitive dependency set; `trace <build>` joins the ADO timeline with the IR so a failed timeline node surfaces its derived skip-set.
- `lint` flags unused outputs / anonymous producers; `catalog --kind safe-outputs` lists every available tool (24 found); `inspect --json` returns a stable schema-versioned summary for scripting.
- `ado-aw mcp-author` for VS Code / Claude Desktop.
Trust-boundary preserved
The new `mcp-author` server is separate from the existing SafeOutputs MCP that ships inside compiled pipelines. Stage 1 sandboxed agents do NOT get these tools: they are an author/debug-only surface, intentionally.
Docs
- `docs/mcp-author.md` (tool surface + VS Code / Claude Desktop config snippets).
- `ir::summary`)" section in `docs/ir.md`.
- `docs/cli.md`; `pipeline_graph` + `upstream_jobs` / `downstream_jobs` in `docs/audit.md`.
- `prompts/debug-ado-agentic-workflow.md` and `prompts/update-ado-agentic-workflow.md` updated to surface the new commands.
- `docs/ir.md`.
Test plan
- `cargo build` ✅
- `cargo test --bin ado-aw` ✅ (full binary test suite passing after follow-up fixes)
- `cargo test --test inspect_integration` ✅ 4 passed (end-to-end CLI tests on the `create-pull-request.md` fixture)
- `cargo test --bin ado-aw inspect::whatif::tests::` ✅
- `cargo test --bin ado-aw inspect::trace::tests::` ✅
- `cargo test --bin ado-aw audit::findings::tests::` ✅
- `cargo test --bin ado-aw mcp_author::tests::` ✅
- `cargo clippy --all-targets --all-features` ✅ 0 warnings
- `tests/safe-outputs/create-pull-request.md`:
  - `ado-aw inspect ...` → text + `--json` (`schema_version: 1`)
  - `ado-aw graph dump ... --format dot` → valid DOT digraph
  - `ado-aw graph deps ... SafeOutputs --direction upstream` → upstream jobs `Agent`, `Detection`
  - `ado-aw whatif ... --fail Agent` → `Detection`, `SafeOutputs` skipped
  - `ado-aw lint ...` → 0 errors
  - `ado-aw catalog --kind safe-outputs --json` → 24 entries
  - `ado-aw mcp-author --help` → wired