docs: add agent-first Playwright best-practices skill #395

Open
stefanjudis wants to merge 25 commits into main from docs/playwright-best-practices-skill

@stefanjudis

Collaborator

What

Adds a new agent skill, playwright-best-practices-for-agents — opinionated Playwright (@playwright/test + TypeScript) guidance written for coding agents rather than humans.

The differentiator is agent-first: it's built around Playwright's agent CLI (playwright-cli) and its no-GUI debugging flows (--debug=cli, error-context.md, aria snapshots), and consistently frames human-only GUIs (--ui, Inspector, trace viewer, codegen) as "capture for a human; as an agent, do X."

Structure

Progressive disclosure: a lean always-loaded SKILL.md (core rules + routing table + agentic workflow) plus 21 on-demand reference files.

  • Core: locators, assertions, waiting, test-structure, config, auth, network, debugging, flakiness, ci
  • Breadth: test-data, forms, files, iframes, multi-context, mobile, clock, visual (incl. toMatchAriaSnapshot), tags-annotations, console-errors, global-setup, error-states

Each reference is curated-not-exhaustive and ends with "Deeper in the docs" links (Checkly /learn + playwright.dev). Light, in-context Checkly monitor tie-ins where genuine. API-sensitive snippets verified against playwright.dev.

Before this ships

  • Packaging must exclude PLAN.md and COMPETITIVE-REVIEW.md — these are working/planning docs, not part of the distributed skill.

Notes

  • Framework-agnostic content; deliberately scoped (framework-specific, Electron, perf/Lighthouse, etc. intentionally out — see PLAN.md).
  • 25 commits, reviewed file-by-file during authoring.

Token-efficient, agent-facing skill for writing/debugging Playwright
tests with @playwright/test + TypeScript. Progressive disclosure:
a lean SKILL.md (core rules, activity routing table, agent-friendly
no-GUI verify loop) backed by on-demand reference files.

Core references so far: locators, assertions, waiting, test-structure,
config. Grounded in our /learn content and cross-checked against the
official Playwright docs; gap-fill (POM, config/projects) written fresh.

PLAN.md is included as a working doc and will be removed before release.

storageState reuse via setup project, env-var credentials, UI/SSO login,
TOTP via otpauth, API login as setup (APIRequestContext.storageState),
and multiple roles. Agent bootstrap points at playwright-cli rather than
the interactive codegen GUI. Completes Phase 2 core references.
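
A minimal sketch of the setup-project pattern named above. It assumes a hypothetical `auth.setup.ts` file that logs in and saves the storage state to `playwright/.auth/user.json`; the project names and the path are illustrative and not taken from the skill's auth reference.

```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Runs first: any *.setup.ts file (hypothetical auth.setup.ts) performs the login.
    { name: 'setup', testMatch: /.*\.setup\.ts/ },
    {
      name: 'chromium',
      use: {
        ...devices['Desktop Chrome'],
        // Every test in this project starts already signed in.
        storageState: 'playwright/.auth/user.json',
      },
      // Guarantees the setup project finishes before these tests start.
      dependencies: ['setup'],
    },
  ],
});
```

Multiple roles would typically mean one storage-state file per role, each written by its own setup step.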
Route mocking (patch one field, full replace, block/continue/fallback),
HAR record/replay via routeFromHAR, API testing with the request fixture
(APIRequestContext), and a don't-over-mock best practice. Prefer waiting
on UI changes over page.waitForResponse. First Phase 3 gap-fill.
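
The "patch one field" mock can be sketched roughly as below. To keep the example self-contained it uses small structural stand-ins (`RouteLike`, `ResponseLike`) instead of importing Playwright's `Route` type; they are assumptions that mirror only the `route.fetch()` and `route.fulfill({ response, json })` calls. The endpoint and field names in the usage comment are hypothetical.

```typescript
// Structural stand-ins for the parts of Playwright's Route / APIResponse used here.
interface ResponseLike {
  json(): Promise<unknown>;
}
interface RouteLike {
  fetch(): Promise<ResponseLike>;
  fulfill(options: { response: ResponseLike; json: unknown }): Promise<void>;
}

/** Return a copy of `body` with a single field overridden. */
function withField(
  body: Record<string, unknown>,
  key: string,
  value: unknown,
): Record<string, unknown> {
  return { ...body, [key]: value };
}

/** Build a handler that lets the real request through, then patches one field. */
function patchOneField(key: string, value: unknown) {
  return async (route: RouteLike): Promise<void> => {
    const response = await route.fetch(); // real backend response
    const body = (await response.json()) as Record<string, unknown>;
    // Keep status and headers from the real response; swap only the JSON body.
    await route.fulfill({ response, json: withField(body, key, value) });
  };
}

// Usage inside a test (hypothetical endpoint and field):
//   await page.route('**/api/user', patchOneField('plan', 'enterprise'));
```

Patching a real response keeps the mock close to production behavior, which fits the don't-over-mock advice.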
Centered debugging.md on the agent CLI (playwright-cli, @playwright/cli):
zero-setup failure primitives (call log, error-context.md), the
open/snapshot/generate-locator discovery flow, and live --debug=cli
stepping. Modernized the common-errors catalog off waitForSelector.

Corrected a real defect: SKILL.md's verify loop shipped a non-existent
'npx playwright trace open/actions/snapshot/close' CLI. Verified false
against the official CLI reference and the local install (only show-trace,
a GUI, exists). Replaced with the --debug=cli + playwright-cli flow.

Added a 'best results need the agent CLI' capability check to SKILL.md
and a snapshot/generate-locator discovery section to locators.md. All
install hints are local or npx (no global -g). init-agents deferred.

Root-cause table (symptom -> real cause -> fix), retries as an
infrastructure safety net rather than a flake fix, isolation and the
fullyParallel within-file parallelism trade-offs (worker isolation,
opting out with serial mode), and detecting flakes via --repeat-each.
config.md now links here for the fullyParallel trade-offs.
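
As an illustration of the retry and parallelism knobs mentioned above, a config fragment might look like the following. The values are assumptions chosen for the example, not the skill's recommended numbers.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Run tests within a file in parallel, not only across files.
  fullyParallel: true,
  // Retry only on CI: a safety net for infrastructure hiccups, not a flake fix.
  retries: process.env.CI ? 2 : 0,
  use: {
    // Record a trace on the first retry so a failure leaves evidence behind.
    trace: 'on-first-retry',
  },
});
```

A file whose tests depend on each other can opt out with `test.describe.configure({ mode: 'serial' })`, and a suspected flake can be reproduced with `npx playwright test --repeat-each=10`.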
GitHub Actions workflow (install --with-deps, upload report on failure),
reporters (github/html/blob, combining), sharding with the matrix +
blob/merge-reports flow, failure-artifact capture, and running the same
suite as monitors. Leans on config.md for the env-driven knobs.
Removed the performance reference (routing-table row, frontmatter
trigger) - not needed. Completes Phase 3 gap-fill references.
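
The reporter combination described for CI could be configured roughly like this; the exact reporter mix is an assumption for illustration.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // On CI: inline GitHub annotations plus a blob report per shard for later merging.
  // Locally: an HTML report that does not open automatically.
  reporter: process.env.CI
    ? [['github'], ['blob']]
    : [['html', { open: 'never' }]],
});
```

Each matrix job would then run a slice such as `npx playwright test --shard=1/4`, and a final job would combine the downloaded blob reports with `npx playwright merge-reports --reporter html ./all-blob-reports` (the directory name is hypothetical).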
Dropped the scenarios router (refs removed from SKILL.md routing table
and assertions.md). Final pass: verified all relative cross-links
resolve and no stale scenarios/performance/init-agents/trace-cli refs
remain; thinned the repeated Checkly 'monitors' close (removed the
bolted-on one in network.md); refocused the debugging routing row on
playwright-cli/--debug=cli.

Reframe SKILL.md's "Verify loop" as an "Agentic workflow (no GUI)" split
into an Author phase (discover locators live with playwright-cli) and a
Run & debug phase (the existing failure-reading loop), and add a callout
that having playwright-cli available is highly encouraged. Add the missing
agentic note to network.md (verify a route mock fired via
`playwright-cli snapshot`/`network`).

Also add COMPETITIVE-REVIEW.md tracking the comparison against the
currents.dev skill and the follow-up action list.

Condense the timeout-knob paragraph (assertions), the fullyParallel
run-on (flakiness), and the no-checks low-level-calls paragraph
(waiting). In SKILL.md, remove the version-check redundancy: the top
agent-CLI blockquote is now identity-only, the workflow note owns the
check+install, and the closing 'Stay current' note is a one-liner
pointing to debugging.md. ~115 words removed, no guidance lost.

Add references/test-data.md (factories, per-worker unique data, seed +
cleanup via API, worker-scoped read-only data, factory/fixture pairing).
Lock Phase 5 breadth expansion in PLAN.md (14 granular files, Tier 3
skipped deliberately) and track it in COMPETITIVE-REVIEW.md.
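
A factory producing per-worker unique data might look like the sketch below. The `TestUser` shape and the `buildUser` helper are hypothetical. The only Playwright detail assumed is that `testInfo.workerIndex` identifies the worker process.

```typescript
interface TestUser {
  email: string;
  name: string;
  role: 'member' | 'admin';
}

// Per-process counter. Each Playwright worker is its own process, so pairing
// the counter with the worker index keeps values unique within one run.
let sequence = 0;

function buildUser(workerIndex: number, overrides: Partial<TestUser> = {}): TestUser {
  sequence += 1;
  const id = `w${workerIndex}-${sequence}`;
  return {
    email: `user-${id}@example.com`,
    name: `Test User ${id}`,
    role: 'member',
    ...overrides, // explicit values win over the defaults
  };
}

// Inside a test:
//   const admin = buildUser(testInfo.workerIndex, { role: 'admin' });
```

Against a backend that persists data between runs, a run-scoped suffix (for example a timestamp) would also be needed, since worker indexes repeat from run to run.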
Cover skip/fixme/fail/slow annotations (conditional + inline), test.only
with the forbidOnly CI guard, metadata tags + --grep filtering,
test.step cross-link, and custom annotations.

Cover catching browser pageerror/console.error, gating the suite via an
automatic fixture, sparing allowlists, and live inspection with
playwright-cli console error.
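
The allowlist part of such a gate reduces to a small filter. The sketch below is an assumption about how it could be written; `unexpectedErrors` is a hypothetical helper, and the event wiring is shown only as comments.

```typescript
/**
 * Return the collected error messages that match none of the allowlist patterns.
 * Patterns should not use the `g` flag, because RegExp.prototype.test is
 * stateful for global regexes.
 */
function unexpectedErrors(messages: string[], allowlist: RegExp[]): string[] {
  return messages.filter((message) => !allowlist.some((pattern) => pattern.test(message)));
}

// Wiring inside an automatic fixture (sketch):
//   const errors: string[] = [];
//   page.on('pageerror', (error) => errors.push(error.message));
//   page.on('console', (msg) => {
//     if (msg.type() === 'error') errors.push(msg.text());
//   });
//   // ...after the test body has run:
//   expect(unexpectedErrors(errors, ALLOWLIST)).toEqual([]);
```

Keeping the allowlist short and specific matters: a broad pattern silently hides real regressions.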
Contrast setup projects (fixtures, report-visible, the auth default)
with the globalSetup function (one-time non-browser bootstrapping), plus
globalTeardown vs teardown projects and a which-one decision guide.
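
The setup-project-with-teardown variant can be sketched as the config fragment below. The project names and file patterns are illustrative assumptions.

```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Setup runs as a regular project, so it can use fixtures and shows up in the report.
    { name: 'setup db', testMatch: /global\.setup\.ts/, teardown: 'cleanup db' },
    // Runs after every project that depends on 'setup db' has finished.
    { name: 'cleanup db', testMatch: /global\.teardown\.ts/ },
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
      dependencies: ['setup db'],
    },
  ],
});
```

The function-based alternative is a top-level `globalSetup` (and `globalTeardown`) path in the same config, which suits one-time bootstrapping that needs no browser.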
Cover toHaveScreenshot, platform-specific baselines (generate where CI
runs), stabilizing renders (mask/animations/tolerance), updating
baselines, reading the -diff.png artifact as an agent, and the
distinction from plain screenshots/failure artifacts.
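
Suite-wide screenshot stabilization can be expressed in the config, as in this sketch. The tolerance value is an arbitrary assumption, not a recommendation from the reference.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      // Freeze CSS animations and transitions before capturing (stated explicitly).
      animations: 'disabled',
      // Tolerate up to 1% of differing pixels before the assertion fails.
      maxDiffPixelRatio: 0.01,
    },
  },
});
```

Masking is set per call, for example `await expect(page).toHaveScreenshot({ mask: [page.getByTestId('clock')] })` with a hypothetical test id, and baselines are regenerated with `npx playwright test --update-snapshots`.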
Cover setInputFiles (path/array/clear/buffer), the filechooser event for
button-triggered uploads, the waitForEvent('download') pattern with
saveAs/suggestedFilename/createReadStream, asserting download contents,
and agent control-discovery. Snippets verified against playwright.dev.

Cover frameLocator, the user-facing locator.contentFrame() form, nested
frames, the steer away from the non-auto-waiting handle API, and agent
discovery reaching inside iframes via snapshot.

Cover device descriptors per project/test, the emulation-not-real-device
caveat, touch/tap (hasTouch, touchscreen, basic gesture support), and
responsive breakpoints via setViewportSize.
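
Per-project device descriptors look roughly like this. The two devices are an illustrative choice; any key of Playwright's `devices` registry works the same way.

```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Each descriptor sets viewport, user agent, touch support, and so on.
    // This is emulation in a desktop browser engine, not a real device.
    { name: 'Mobile Chrome', use: { ...devices['Pixel 5'] } },
    { name: 'Mobile Safari', use: { ...devices['iPhone 12'] } },
  ],
});
```

A single responsive breakpoint can instead be checked inside a test with `await page.setViewportSize({ width: 375, height: 667 })`.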
Cover page.clock: setFixedTime for display-only cases, install + advance
(fastForward fires due timers at most once, runFor fires all, pauseAt
freezes), a which-to-reach-for guide, and the set-time-before-goto rule.
Method semantics verified against playwright.dev.

Add references/multi-context.md (popup/new-tab capture, second-user via
a new context with role-named pages, a secondUser/pageAdmin fixture).
Drop drag-drop.md from Tier 2 (Stefan); breadth expansion now 10 -> 23.

Add a 'structure over pixels' section: assert the accessibility tree as
a platform-stable, less brittle alternative to pixel screenshots, with
the agent tie-in to playwright-cli snapshot / error-context.md.

Cover fill/pressSequentially/clear, selects/checkboxes/radios,
submit-and-assert, testing error states (web-first messages, soft
assertions, toHaveAccessibleErrorMessage), and file inputs.

Cover server errors (fulfill 500), network failure / offline
(route.abort, setOffline), observable loading/skeleton states via a
controlled mock delay, asserting recovery on retry, and pairing with the
console-error gate.

Add routing-table rows for all 12 new references and widen the
description keywords (test data, forms, files, iframes, multi-context,
mobile, clock, visual, tags/annotations, console errors, global setup,
error states) so the skill triggers on those activities.

Rename the skill directory and name field, and make the agent-first
basis explicit in the description and intro: these are Playwright best
practices for coding agents, built around the playwright-cli agent CLI
and its no-GUI agentic debugging flows.

@mintlify

mintlify Bot commented Jun 23, 2026

Contributor

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| checkly-422f444a | 🟢 Ready | View Preview | Jun 23, 2026, 4:38 PM |

