revert: remove ui-preview-smoke agent workflow #2247
Backs out the agent-driven UI smoke workflow added in #2238 (and patched in #2240). Keeps the PR template's "How to test on Vercel preview" section, which is still useful guidance for human reviewers.

Why: The workflow was supposed to give us a low-friction "did the UI break?" signal on PRs by having an agent (claude-code-action + Playwright MCP) execute a PR-author-written test plan against the Vercel preview. In practice it overlapped heavily with the existing Playwright e2e suite without the durability:

- Author still had to write `**Preview routes:**` + numbered `**Steps:**`, which is most of the cognitive work of writing a real test.
- Result was an LLM-interpreted, non-deterministic check producing no test artifact. A Playwright spec asserting the same behavior would run on every future PR for years.
- Per-run cost was ~$0.85 of agent time + a Vercel deploy + ~5 min of CI; flaky by construction (LLM judgement varies run-to-run).
- Maintenance cost was real and ongoing: PR #2242 went through 5 review rounds resolving security and reliability findings, cumulatively producing 543 added lines (later trimmed to +79 via `/ce-resolve-pr-feedback`), and the latest deep-review still surfaced new P1 findings (JS injection via `Number('${{ inputs }}')`, mutable-tag pinning of third-party actions, missing `sha:` input on `workflow_dispatch`).

Better path forward: lean into the existing Playwright stack (`make e2e` / `make dev-e2e`). For changes that warrant preview-deploy testing specifically, a deterministic screenshot-based workflow (~50 lines) is a cheaper and more durable replacement than the LLM-driven version.
The latest updates on your projects (Vercel for GitHub): 1 skipped deployment.
🟢 Tier 1: Trivial
Docs, images, lock files, or a dependency bump. No functional code changes detected.
Review process: Auto-merge once CI passes. No human review required.
PR Review
Otherwise the revert is clean: single-file deletion, no other references to the workflow remain in the repo.
Deep Review
This PR is a single-file revert: it deletes […]
✅ No critical issues found.
🟡 P2 -- recommended
Reviewers (4): correctness, maintainability, testing, project-standards.
E2E Test Results
✅ All tests passed • 166 passed • 3 skipped • 1196s
Tests ran across 4 shards in parallel.
Summary
Backs out the agent-driven UI smoke workflow added in #2238 (and patched in #2240). Keeps the PR template's "How to test on Vercel preview" section — it's still useful guidance for human reviewers.
Why
The workflow was meant to give a low-friction "did the UI break?" signal on PRs by having an agent (`claude-code-action` + Playwright MCP) execute a PR-author-written test plan against the Vercel preview. In practice it overlapped heavily with the existing Playwright e2e suite without the durability:

- Writing `**Preview routes:**` + numbered `**Steps:**` is most of the cognitive work of writing a real Playwright spec.
- Maintenance cost was real and ongoing: even after `/ce-resolve-pr-feedback` resolution the change was +79 lines, and the latest deep-review still surfaced new P1 findings (JS injection on `Number('${{ inputs.pr_number }}')`, third-party tag pinning, missing `sha:` input on `workflow_dispatch`).

The marginal value-add over a deterministic test was small enough that the maintenance ratchet exceeded it.
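To make the `Number('${{ inputs.pr_number }}')` finding concrete, here is a minimal TypeScript sketch. It assumes the expression is expanded by plain text substitution into the script source before any JavaScript runs, which is how GitHub Actions expressions behave inside inline scripts. `expandTemplate` and `parsePrNumber` are hypothetical names for illustration, not code from the removed workflow.

```typescript
// Hypothetical model of `${{ inputs.pr_number }}` expansion: the raw input is
// pasted into the script text, so quotes in the input can close the string
// literal and append arbitrary statements.
function expandTemplate(input: string): string {
  return `Number('${input}')`;
}

// Hypothetical safer alternative: receive the value as data (for example via
// an environment variable) and validate it strictly before converting.
function parsePrNumber(raw: string | undefined): number {
  // Accept only an unsigned decimal integer of 1 to 9 digits, no leading zero.
  if (raw === undefined || !/^[1-9][0-9]{0,8}$/.test(raw)) {
    throw new Error(`invalid PR number: ${JSON.stringify(raw)}`);
  }
  return Number(raw);
}
```

With this shape, a crafted input such as `1'); evil(); ('` changes the generated script in `expandTemplate`, while `parsePrNumber` rejects it outright. Whether the workflow would have read the value from an environment variable is an assumption here; the point is only that validation happens on data rather than on interpolated source text.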
Better path forward
- Lean into the existing Playwright stack (`make e2e` / `make dev-e2e`, `tests/e2e/page-objects/`). For PRs like #2234 (table-chart text overflow), a Playwright spec asserting "wrap mode keeps long URLs within their column" would have caught the bug and prevented future regressions across every chart PR.
- For changes that warrant preview-deploy testing specifically, a deterministic screenshot-based workflow (… `${{ vercel.outputs.url }}`) is cheaper and more durable than the LLM-driven version.

How to test on Vercel preview
N/A — CI workflow removal.
References