diff --git a/.mintignore b/.mintignore
index a43298bd..c3470bb9 100644
--- a/.mintignore
+++ b/.mintignore
@@ -1 +1,2 @@
 api-reference-old/**/*.mdx
+skills/**
diff --git a/skills/playwright-best-practices-for-agents/SKILL.md b/skills/playwright-best-practices-for-agents/SKILL.md
new file mode 100644
index 00000000..9028860f
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/SKILL.md
@@ -0,0 +1,73 @@
+---
+name: playwright-best-practices-for-agents
+description: "Agent-first best practices for writing, structuring, debugging, and stabilizing Playwright tests in TypeScript/JavaScript, built around Playwright's agent CLI (`playwright-cli`) and no-GUI agentic debugging flows. Use when authoring or reviewing Playwright tests: choosing locators, writing web-first assertions, fixing flaky tests, handling authentication (SSO/2FA), mocking network/API requests, structuring projects and fixtures, generating test data, building forms and validation, pressing keys and keyboard shortcuts, hovering and other mouse actions, handling native alert/confirm/prompt dialogs, uploading or downloading files, testing iframes, multiple tabs/popups or multi-user flows, mobile and device emulation, mocking time and dates, visual regression and screenshots, tagging and annotating tests, catching console errors, testing error/offline/loading states, configuring global setup, or running Playwright in CI."
+metadata:
+  author: checkly
+---
+
+# Playwright best practices
+
+Condensed, opinionated guidance for writing Playwright tests that are **readable, isolated, and resilient** — built for coding **agents**, around Playwright's **agent CLI** (`playwright-cli`) and its no-GUI debugging flows. Maintained by [Checkly](https://www.checklyhq.com/?utm_source=ai-skill) — the same practices apply whether you run these tests in CI or as production monitors.
+
+Load a reference file from `references/` only when the task needs it (see routing table). Each reference ends with links to the full `/learn` articles for depth.
+
+> **Scope:** all guidance assumes the **`@playwright/test`** test runner with **TypeScript** — its `test`, fixtures, projects, config, and web-first `expect`. Examples are TypeScript (`.spec.ts`); the same APIs work in JavaScript. It does not target the standalone `playwright` automation library (which has no test runner, fixtures, or auto-retrying assertions). Imports are `import { test, expect } from '@playwright/test'`.
+
+> **The agent CLI is what makes this skill shine.** Playwright's **agent CLI** — `playwright-cli`, package `@playwright/cli` — is a separate, token-efficient, **no-GUI** browser you drive command by command to discover locators and step through failing tests. It's distinct from the standard `npx playwright` CLI, and the **Agentic workflow** below leans on it throughout. → [references/debugging.md](references/debugging.md)
+
+## Core rules (always apply)
+
+1. **Locator priority:** prefer user-facing locators — `getByRole` > `getByLabel` / `getByPlaceholder` / `getByText` > `getByTestId` > CSS/XPath. CSS/XPath tie tests to implementation and break easily. → [references/locators.md](references/locators.md)
+2. **Web-first assertions:** use auto-retrying `expect(locator).toBeVisible()` / `toHaveText()` etc. Never assert on a one-shot value you pulled out manually (`innerText()` then `toBe`). → [references/assertions.md](references/assertions.md)
+3. **No hard waits:** never `waitForTimeout()`. Trust auto-waiting actions and web-first assertions; for explicit waits use `waitForURL` / `waitForLoadState` / `waitForResponse`. Avoid `networkidle`. → [references/waiting.md](references/waiting.md)
+4. **Isolated & independent:** each test sets up its own state and can run in any order, in parallel. No test depends on another. Provision state via API in setup, not through the UI. → [references/test-structure.md](references/test-structure.md), [references/flakiness.md](references/flakiness.md)
+5. **One feature per test:** if a test's assertions span more than one feature, split it. Keep tests short and focused.
+6. **Reuse auth, don't re-login:** sign in once, persist `storageState`, reuse it across tests via a setup project. → [references/auth.md](references/auth.md)
+
+## Routing table
+
+| When the task is about… | Read |
+|---|---|
+| Picking selectors, strict mode, `data-testid` | [references/locators.md](references/locators.md) |
+| Assertions, soft assertions, `expect.poll`/`toPass` | [references/assertions.md](references/assertions.md) |
+| Waiting, auto-waiting, timeouts, navigation | [references/waiting.md](references/waiting.md) |
+| Test design, fixtures, Page Object Model, steps | [references/test-structure.md](references/test-structure.md) |
+| `playwright.config.ts`, projects, baseURL, devices, setup dependencies | [references/config.md](references/config.md) |
+| Login, 2FA/TOTP, SSO, sessions, `storageState` | [references/auth.md](references/auth.md) |
+| Mocking, intercepting, `route`, HAR, API testing | [references/network.md](references/network.md) |
+| Debugging failures, `playwright-cli`, `--debug=cli`, traces, common errors | [references/debugging.md](references/debugging.md) |
+| Flaky tests, retries, parallelism, anti-patterns | [references/flakiness.md](references/flakiness.md) |
+| Running in CI, sharding, reporters, GitHub Actions | [references/ci.md](references/ci.md) |
+| Test data, factories, unique data, seeding/cleanup | [references/test-data.md](references/test-data.md) |
+| Forms, inputs, validation, error messages | [references/forms.md](references/forms.md) |
+| Keyboard, mouse, hover, scroll, native dialogs (alert/confirm/prompt) | [references/interactions.md](references/interactions.md) |
+| File upload & download | [references/files.md](references/files.md) |
+| iframes, frames, `frameLocator` | [references/iframes.md](references/iframes.md) |
+| Multiple tabs, popups, multiple users/contexts | [references/multi-context.md](references/multi-context.md) |
+| Mobile, device emulation, touch, viewport/breakpoints | [references/mobile.md](references/mobile.md) |
+| Time/date, clock mocking, countdowns, timeouts | [references/clock.md](references/clock.md) |
+| Visual regression, screenshots, `toHaveScreenshot`, aria snapshots | [references/visual.md](references/visual.md) |
+| Tags (`@smoke`), `--grep`, `skip`/`fixme`/`slow` annotations | [references/tags-annotations.md](references/tags-annotations.md) |
+| Failing tests on `console`/`pageerror` | [references/console-errors.md](references/console-errors.md) |
+| `globalSetup`/`globalTeardown`, setup projects | [references/global-setup.md](references/global-setup.md) |
+| Error, offline, network-failure, loading states | [references/error-states.md](references/error-states.md) |
+
+## Agentic workflow (no GUI)
+
+The interactive tools — `--ui`, `--debug` (Inspector), `show-trace` — are GUIs you can't drive. Author and debug through the non-interactive signals instead.
+
+> **Having `playwright-cli` available is highly encouraged** — both phases below lean on it. Confirm with `playwright-cli --version` and install it if missing — `npm install -D @playwright/cli`, then run it via `npx playwright-cli` (or install globally with `npm install -g @playwright/cli` to call `playwright-cli` directly). Everything still works without it, but you lose the inspect/verify loop and fall back to guessing.
+
+**Author — discover, don't guess.** Read locators off the live page rather than from source: `playwright-cli open <url>` → `playwright-cli snapshot` prints the accessibility tree — the roles and accessible names that power `getByRole`/`getByLabel` — so you author the user-facing locator straight from what it shows. → [references/locators.md](references/locators.md)
+
+**Run & debug:**
+
+1. **Run and read stdout:** `npx playwright test path/to/file.spec.ts`. The reporter prints the failing assertion and the **call log** — which locator/assertion timed out and what Playwright actually saw. Read it; don't guess.
+2. **Read `error-context.md`:** on an `expect` failure Playwright writes an aria-snapshot of the page *at the moment it failed* to the test's `test-results/.../error-context.md`. This is machine-readable page state — open it to see what was actually rendered. *(Playwright ≥ 1.60)*
+3. **Capture artifacts, not GUIs:** add `--trace on` to drop `trace.zip` into `test-results/` for inspection.
+4. **Step through it live with `playwright-cli`** (no GUI): run `npx playwright test path/to/file.spec.ts --debug=cli` in the background — it pauses and prints a session name. Then `playwright-cli attach <session-name>` and drive it: `playwright-cli snapshot` (page state + element refs), `playwright-cli step-over`, `playwright-cli console error`, `playwright-cli network`, `playwright-cli eval "…"`. Inspect why the locator didn't resolve or what actually rendered, then fix and re-run. *(needs the agent CLI; full detail in [references/debugging.md](references/debugging.md))*
+5. **Fix the root cause** (usually a locator, a missing web-first assertion, or a hard wait), then re-run until green. Don't paper over flakiness with retries — see [references/flakiness.md](references/flakiness.md).
+
+Full agentic-debugging detail (the `playwright-cli` discovery and `--debug=cli` stepping workflow) is in [references/debugging.md](references/debugging.md).
+
+> **Stay current.** These primitives are recent and version-gated — check `npx playwright --version` and `playwright-cli --version`, and update both packages if they're behind. Detail in [references/debugging.md](references/debugging.md).
diff --git a/skills/playwright-best-practices-for-agents/references/assertions.md b/skills/playwright-best-practices-for-agents/references/assertions.md
new file mode 100644
index 00000000..e746a086
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/assertions.md
@@ -0,0 +1,115 @@
+# Assertions
+
+Default to auto-retrying, web-first assertions. They wait for a condition to become true (up to the timeout) instead of checking once, which removes most flakiness.
+
+## Web-first (auto-retrying) — use these
+
+`expect(locator).<matcher>()` polls until it passes or times out:
+
+```ts
+await expect(page.getByRole('alert')).toBeVisible()
+await expect(page.getByTestId('total')).toHaveText('€42.00')
+await expect(page.getByRole('button', { name: 'Pay' })).toBeEnabled()
+```
+
+Common matchers: `toBeVisible`, `toBeHidden`, `toBeAttached`, `toBeEnabled`, `toBeDisabled`, `toBeEditable`, `toBeChecked`, `toBeFocused`, `toBeInViewport`, `toHaveText`, `toContainText`, `toHaveValue`, `toHaveValues`, `toHaveCount`, `toHaveAttribute`, `toHaveClass`, `toHaveURL`, `toHaveTitle`. Accessibility-focused matchers exist too — `toHaveRole`, `toHaveAccessibleName`, `toHaveAccessibleDescription` — and `toBeOK` checks a response. All support `.not`, and negation auto-retries too.
+
+Visual/structure assertions — `toHaveScreenshot` (pixel) and `toMatchAriaSnapshot` (accessibility-tree YAML) — are also auto-retrying.
+
+This is a curated set, not the full list. For every matcher (including `toHaveCSS`, `toHaveJSProperty`, `toContainClass`, `toHaveId`, and more) see the [Playwright assertions reference](https://playwright.dev/docs/test-assertions).
+
+`toHaveText`, `toContainText`, and `toHaveCount` work against a locator that matches **many** elements — assert on the set directly instead of looping:
+
+```ts
+await expect(page.getByRole('listitem')).toHaveCount(3)
+await expect(page.getByRole('listitem')).toContainText(['Coffee', 'Tea', 'Milk'])
+```
+
+**Always `await` a web-first assertion.** It's async, and a missing `await` doesn't fail loudly: the assertion runs detached, so the test can finish before the check resolves and pass for the wrong reason.
+
+## Non-retrying — only for plain values
+
+`expect(value).toBe()/toEqual()/toBeGreaterThan()` evaluate once. Use them for deterministic, already-resolved values (numbers, parsed JSON), not for UI state.
+
+## The #1 mistake: awaiting inside expect
+
+```ts
+// BAD — reads once, no waiting
+expect(await locator.innerText()).toBeTruthy()
+
+// GOOD — web-first, auto-waits
+await expect(locator).not.toBeEmpty()
+```
+
+`await` goes *outside* `expect(locator)`, and the matcher does the waiting. Never pull a value out with `innerText()`/`textContent()` and assert on it when a web-first matcher exists.
+
+## Soft assertions
+
+`expect.soft(...)` records a failure but lets the test continue, then marks it failed at the end. Good for collecting multiple independent checks (form fields, link sweeps) in one run.
+
+```ts
+await expect.soft(page.getByTestId('cookieBanner')).toBeVisible()
+```
+
+To bail out mid-test once any soft assertion has failed, check `expect(test.info().errors).toHaveLength(0)`.
+
+## Timeouts
+
+Web-first assertions retry against the **expect timeout** (default **5s**) — separate from the test timeout (default **30s**) and any action timeout. If something genuinely takes longer than 5s (a slow report, a long upload), don't add a `waitForTimeout` before it — give that one assertion a longer `timeout` instead:
+
+```ts
+await expect(page.getByText('Report ready')).toBeVisible({ timeout: 30_000 }) // per call
+```
+
+Be deliberate about which knob you turn: a per-assertion `timeout` for one genuinely slow step keeps the rest of the suite fast and signals intent at the call site; raising the project-wide default (`expect: { timeout: 10_000 }` in config) is the honest fix when the whole app is slower (a heavy staging environment), rather than peppering overrides everywhere. For a reusable variant, preconfigure `expect` once and import it:
+
+```ts
+const slowExpect = expect.configure({ timeout: 10_000 })
+const softExpect = expect.configure({ soft: true })
+```
+
+## Custom failure messages
+
+Pass a message as the second arg to `expect` (or `expect.soft`) to make failures self-explanatory in reports and logs:
+
+```ts
+await expect(page, 'dashboard should load after login').toHaveTitle(/Dashboard/)
+```
+
+## Dynamic / flaky conditions
+
+When no web-first matcher fits, retry the *value* or the *block* instead of hard-waiting.
+
+`expect.poll(fn)` re-runs `fn` until the matcher passes or the timeout hits — ideal for polling an API or any non-locator value:
+
+```ts
+await expect
+  .poll(async () => (await request.get('/api/orders/42')).status(), { timeout: 10_000 })
+  .toBe(200)
+```
+
+`expect(async () => { ... }).toPass()` retries a whole block until every assertion inside passes — use it when several conditions must converge together:
+
+```ts
+await expect(async () => {
+  const order = await getOrder(42)
+  expect(order.status).toBe('shipped')
+  expect(order.trackingId).toBeTruthy()
+}).toPass({ timeout: 10_000 })
+```
+
+Note `toPass` defaults to **no timeout** (`0`) and ignores the global expect timeout, so always pass an explicit `timeout`; otherwise a never-passing block keeps retrying until the test timeout kills it.
+
+`expect.extend({...})` adds custom matchers for repeated domain checks (e.g. `toBeWithinRange`); merge several matcher modules with `mergeExpects()`.
+
+## Anti-patterns
+
+- `await page.waitForTimeout(3000)` before an assertion — see [waiting.md](./waiting.md).
+- Asserting five features in one test — split it; keep assertions focused.
+- `toBe()` on text where `toContainText()`/`toHaveText()` would auto-wait.
+
+## Deeper in the docs
+
+- [Assertions — types & best practices](https://www.checklyhq.com/learn/playwright/assertions/)
+- [Waits and timeouts](https://www.checklyhq.com/learn/playwright/waits-and-timeouts/)
+- [Playwright assertions reference](https://playwright.dev/docs/test-assertions)
diff --git a/skills/playwright-best-practices-for-agents/references/auth.md b/skills/playwright-best-practices-for-agents/references/auth.md
new file mode 100644
index 00000000..78a786f0
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/auth.md
@@ -0,0 +1,119 @@
+# Authentication
+
+If possible, sign in **once**, persist the session, and reuse it across tests. Logging in through the UI in every test is slow, hammers your auth provider (rate limits, lockouts), and couples unrelated tests to the login flow.
+
+## Reuse auth via a setup project (the default)
+
+Run the login flow in a `setup` project, save the authenticated cookies and local storage to disk with `storageState`, and have every other project depend on it. Dependent tests then start already signed in.
+
+```ts playwright.config.ts
+import { defineConfig, devices } from '@playwright/test'
+
+export default defineConfig({
+  projects: [
+    { name: 'setup', testMatch: /.*\.setup\.ts/ },
+    {
+      name: 'chromium',
+      use: { ...devices['Desktop Chrome'], storageState: 'playwright/.auth/user.json' },
+      dependencies: ['setup'],   // login runs first; this project reuses its state
+    },
+  ],
+})
+```
+
+```ts auth.setup.ts
+import { test as setup, expect } from '@playwright/test'
+
+const authFile = 'playwright/.auth/user.json'
+
+setup('authenticate', async ({ page }) => {
+  await page.goto('/login')
+  await page.getByPlaceholder('Email').fill(process.env.USER_EMAIL!)
+  await page.getByPlaceholder('Password').fill(process.env.USER_PASSWORD!)
+  await page.getByRole('button', { name: 'Sign in' }).click()
+  await expect(page.getByText('Welcome back')).toBeVisible()   // confirm login worked
+  await page.context().storageState({ path: authFile })        // persist the session
+})
+```
+
+Git-ignore the state file — it holds live session cookies: add `playwright/.auth/` to `.gitignore`. See [config.md](./config.md) for the projects/`dependencies` mechanics.
+
+## Credentials and test users
+
+- **Never hardcode credentials**, not even while debugging — read them from env vars (`process.env.USER_PASSWORD`). It's too easy to commit a literal.
+- Use a **dedicated test account**, never a real user's or a customer's — you control its data and avoid bot-detection lockouts.
+
+## Logging in
+
+- **Username/password and SSO/social** (Google, GitHub, Microsoft, Okta, SAML) look the same from the test's side. Third-party providers add redirects across domains; Playwright follows them automatically. Drive the provider's screens with user-facing locators like any other form.
+- **Discovering the steps (as an agent):** drive the page yourself with `playwright-cli` — navigate to the login page, take an accessibility snapshot to read the real `getByRole`/`getByLabel` names, run the login, then transcribe the working steps into `auth.setup.ts`. `npx playwright codegen <your-site>` records the same steps but opens the **interactive Inspector GUI** you can't drive in an agent session, so it's a human-only shortcut. See [debugging.md](./debugging.md) for the `playwright-cli` setup.
+
+## Two-factor auth (TOTP)
+
+You can't read an SMS or push, but **authenticator-app (TOTP) codes are just a secret + the current time** — generate them in-process with [`otpauth`](https://www.npmjs.com/package/otpauth). Store the TOTP secret as an env var.
+
+```ts
+import * as OTPAuth from 'otpauth'
+
+const totp = new OTPAuth.TOTP({ issuer: 'GitHub', digits: 6, period: 30, secret: process.env.TOTP_SECRET! })
+
+await page.getByPlaceholder('XXXXXX').fill(totp.generate())   // current 6-digit code
+```
+
+## API login (skip the UI entirely)
+
+When you only need an authenticated *session* — not coverage of the login screen — log in over HTTP and snapshot the state. It's faster and less flaky than driving the form.
+
+```ts auth.setup.ts
+import { test as setup } from '@playwright/test'
+
+setup('authenticate via API', async ({ request }) => {
+  await request.post('/api/login', { form: { email: process.env.USER_EMAIL!, password: process.env.USER_PASSWORD! } })
+  await request.storageState({ path: 'playwright/.auth/user.json' })   // captures the auth cookies
+})
+```
+
+Test the login *page itself* through the UI; use API login as setup for everything else. See [network.md](./network.md) for the `request` context.
+
+## Seed session & app state directly
+
+`storageState` and API login replay a whole session; sometimes you just need one piece of state in place before the page loads. Seed it on the **context**, before the first navigation:
+
+```ts
+// a known session cookie — skip even the API round-trip
+await context.addCookies([
+  { name: 'session', value: process.env.SESSION_TOKEN!, url: 'https://danube-web.shop' },
+])
+
+// runs before the page's own scripts: stub a global, force a flag, dismiss a consent banner
+await context.addInitScript(() => {
+  window.localStorage.setItem('feature.newCheckout', 'on')
+  window.localStorage.setItem('cookie-consent', 'accepted')
+})
+```
+
+`addInitScript` runs after the document exists but **before the page's own scripts**, so flags and stubs are in place by first paint. Set these in a fixture or `beforeEach` so every test starts from the same clean state. Reach for them for non-auth bootstrapping — feature flags, dismissing banners, freezing a global; for the full signed-in session, prefer `storageState` above.
+
+## Multiple roles
+
+Give each role its own setup step and state file, then opt a test into one with `test.use`:
+
+```ts
+setup('auth as admin', async ({ page }) => { /* … */ await page.context().storageState({ path: 'playwright/.auth/admin.json' }) })
+
+test.describe('admin area', () => {
+  test.use({ storageState: 'playwright/.auth/admin.json' })
+  test('sees settings', async ({ page }) => { /* signed in as admin */ })
+})
+```
+
+A persisted `storageState` is the same idea Checkly uses to keep authenticated monitors logged in across scheduled runs, so a session that survives reuse here survives in production monitoring too.
+
+## Deeper in the docs
+
+- [Managing authentication in Playwright](https://www.checklyhq.com/learn/playwright/authentication/)
+- [Login automation](https://www.checklyhq.com/learn/playwright/login-automation/)
+- [Bypassing TOTP / 2FA login flows](https://www.checklyhq.com/learn/playwright/bypass-totp/)
+- [Automating Google login](https://www.checklyhq.com/learn/playwright/google-login-automation/)
+- [Automating Microsoft login](https://www.checklyhq.com/learn/playwright/microsoft-login-automation/)
+- [Playwright: Authentication](https://playwright.dev/docs/auth)
diff --git a/skills/playwright-best-practices-for-agents/references/ci.md b/skills/playwright-best-practices-for-agents/references/ci.md
new file mode 100644
index 00000000..f738454e
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/ci.md
@@ -0,0 +1,95 @@
+# Continuous integration
+
+The suite that guards merges should run unattended in CI: deterministic, fast, and leaving artifacts you can read after a failure. The environment-driven config that makes a run CI-aware (`forbidOnly`, `retries`, `workers`, `reporter`, `trace`) lives in [config.md](./config.md); this file is about wiring it into a pipeline.
+
+## GitHub Actions
+
+The canonical workflow — install browsers with their OS deps, run, and upload the report even on failure:
+
+```yaml
+steps:
+  - uses: actions/checkout@v5
+  - uses: actions/setup-node@v5
+    with:
+      node-version: lts/*
+  - name: Install dependencies
+    run: npm ci
+  - name: Install Playwright browsers
+    run: npx playwright install --with-deps
+  - name: Run Playwright tests
+    run: npx playwright test
+  - uses: actions/upload-artifact@v4
+    if: ${{ !cancelled() }}        # keep the report even when tests fail
+    with:
+      name: playwright-report
+      path: playwright-report/
+      retention-days: 30
+```
+
+`--with-deps` installs the system libraries the browsers need on the runner; without it, headless Chromium/WebKit can fail to launch. `if: ${{ !cancelled() }}` is the important bit: a failed run is exactly when you want the report.
+
+## Reporters
+
+Set the reporter by environment (config.md's baseline uses `process.env.CI ? 'github' : 'html'`):
+
+- **`github`** — inline annotations on the PR's changed lines. Good default for a single CI job.
+- **`html`** — the full browsable report. Upload it as an artifact; download and view locally with `npx playwright show-report`. It's a GUI, so it's for humans — as an agent, read the failure from stdout + `error-context.md` ([debugging.md](./debugging.md)).
+- **`blob`** — machine-mergeable output; **required when sharding** (below).
+- `list` / `line` / `dot` for terminal output, `json` / `junit` for downstream tooling.
+
+Combine reporters when you want both — e.g. annotations *and* a browsable report:
+
+```ts
+reporter: [['github'], ['html']],
+```
+
+## Sharding across machines
+
+Split the suite into N independent slices that run on parallel runners. Sharding balances best with `fullyParallel: true` (splits at the test level, not the file level — see [flakiness.md](./flakiness.md)):
+
+```yaml
+strategy:
+  fail-fast: false
+  matrix:
+    shardIndex: [1, 2, 3, 4]
+    shardTotal: [4]
+steps:
+  - name: Run shard
+    run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
+```
+
+Each shard emits a **blob** report; a final job merges them into one HTML report:
+
+```ts
+reporter: process.env.CI ? 'blob' : 'html',
+```
+
+```sh
+# after downloading every shard's blob into ./all-blob-reports
+npx playwright merge-reports --reporter html ./all-blob-reports
+```
+
+Don't merge by hand — `merge-reports` reconciles retries, flaky markers, and attachments (traces) across shards into a single coherent report.
+
+## Capture artifacts for failures
+
+Turn on the artifacts that make a CI failure debuggable without re-running — these are config.md's `use` knobs, and they pay off most in CI:
+
+```ts
+use: {
+  trace: 'on-first-retry',       // a trace the moment a test retries
+}
+```
+
+The blob/HTML report bundles them. Read them with the agent-friendly primitives in [debugging.md](./debugging.md) — `show-trace` and `show-report` are GUIs for a human reviewing the run.
+
+## The same tests as monitors
+
+A Playwright suite that proves a deploy in CI can run unchanged as scheduled [Checkly monitors](https://www.checklyhq.com/?utm_source=ai-skill) from multiple regions: CI proves the change is good before it ships, monitoring proves production *stays* good after. One test asset, two jobs.
+
+## Deeper in the docs
+
+- [Running tests in parallel](https://www.checklyhq.com/learn/playwright/testing-in-parallel/)
+- [Playwright: CI](https://playwright.dev/docs/ci-intro)
+- [Playwright: Sharding](https://playwright.dev/docs/test-sharding)
+- [Playwright: Reporters](https://playwright.dev/docs/test-reporters)
diff --git a/skills/playwright-best-practices-for-agents/references/clock.md b/skills/playwright-best-practices-for-agents/references/clock.md
new file mode 100644
index 00000000..f633a9e7
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/clock.md
@@ -0,0 +1,44 @@
+# Clock & time
+
+Time-dependent UI — countdowns, "expires in 5 min", "2 hours ago", session timeouts — is flaky against the real clock. `page.clock` makes time deterministic so those assertions are stable.
+
+## Freeze the clock — `setFixedTime`
+
+When the page just *reads* the time (relative timestamps, a displayed date), pin `Date.now()` and `new Date()` to a fixed value:
+
+```ts
+await page.clock.setFixedTime(new Date('2024-02-02T10:00:00'))
+await page.goto('/')
+await expect(page.getByTestId('published')).toHaveText('2 hours ago')
+```
+
+`setFixedTime` affects only `Date` — timers (`setTimeout`/`setInterval`) keep running normally. It's the lightest option and pairs well with visual tests ([visual.md](./visual.md)), where a moving clock would cause diffs.
+
+## Control the clock — `install`, then advance
+
+When you need to drive timers — fire a countdown, trigger a session timeout — `install` a fake clock **before navigating**, then move time forward:
+
+```ts
+await page.clock.install({ time: new Date('2024-02-02T08:00:00') })
+await page.goto('/')
+
+await page.clock.fastForward('05:00')   // +5 minutes
+await expect(page.getByText('Session expiring')).toBeVisible()
+```
+
+- `fastForward(ticks)` — jump forward by `ms` or a `"mm:ss"` string; fires each due timer **at most once** (like a laptop waking from sleep).
+- `runFor(ticks)` — tick forward by a duration, firing **every** timer along the way.
+- `pauseAt(time)` — jump to an exact time and **freeze** there; the page reads that time and timers stay paused until you advance or `resume`. Ideal for a deterministic assertion or screenshot.
+- `resume()` — let time flow normally again.
+
+## Which to reach for
+
+- Page only **displays** a time → `setFixedTime`.
+- You need to **advance** time to fire timers or animations → `install` + `fastForward` / `runFor` / `pauseAt`.
+
+Either way, set the time up **before** `goto` so the app picks up the fake clock from the start.
+
+## Deeper in the docs
+
+- [Playwright: Clock](https://playwright.dev/docs/clock)
+- [Playwright: `Clock` API](https://playwright.dev/docs/api/class-clock)
diff --git a/skills/playwright-best-practices-for-agents/references/config.md b/skills/playwright-best-practices-for-agents/references/config.md
new file mode 100644
index 00000000..6ce37fea
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/config.md
@@ -0,0 +1,91 @@
+# Config & projects
+
+`playwright.config.ts` is the single place for shared options. Put common settings under `use`, and use **projects** to run the same tests across browsers, devices, or environments.
+
+## A sensible baseline
+
+Start from a config that's parallel by default and CI-aware, rather than tuning options one by one:
+
+```ts playwright.config.ts
+import { defineConfig, devices } from '@playwright/test'
+
+export default defineConfig({
+  testDir: './tests',
+  fullyParallel: true,                      // run tests within a file in parallel too
+  forbidOnly: !!process.env.CI,             // fail the run if a stray test.only ships
+  retries: process.env.CI ? 2 : 0,          // retry only in CI (see flakiness.md)
+  workers: process.env.CI ? 1 : undefined,  // cap in CI, auto-pick locally
+  reporter: process.env.CI ? 'github' : 'html',
+  use: {
+    baseURL: process.env.BASE_URL ?? 'https://danube-web.shop',
+    trace: 'on-first-retry',                // capture a trace when a test retries
+  },
+  projects: [
+    { name: 'setup', testMatch: /.*\.setup\.ts/ },
+    { name: 'chromium', use: { ...devices['Desktop Chrome'] }, dependencies: ['setup'] },
+    { name: 'mobile', use: { ...devices['Pixel 5'] } },
+  ],
+})
+```
+
+The `process.env.CI ? … : …` pattern tunes each option to its environment: locally you optimize for fast, focused iteration (auto-picked workers, `test.only` allowed, **no retries so you notice flakes immediately**); CI optimizes for determinism and guardrails (capped workers for repeatable ordering, `test.only` rejected, a couple of retries to absorb genuine infra hiccups). Retries are a safety net for infrastructure, not a fix for flaky tests. `fullyParallel: true` also runs tests *within* a file concurrently, which has real isolation trade-offs. Both points are covered in [flakiness.md](./flakiness.md).
+
+## Environment & secrets
+
+The config and tests read `process.env` throughout (`BASE_URL`, `CI`, credentials in [auth.md](./auth.md)). Most CI providers, GitHub Actions included, set `CI` for you; load the rest from a `.env` file with [`dotenv`](https://www.npmjs.com/package/dotenv) at the top of the config, so a local run and CI resolve the same names:
+
+```ts playwright.config.ts
+import { defineConfig } from '@playwright/test'
+import 'dotenv/config'                 // populate process.env before defineConfig reads it
+
+export default defineConfig({
+  use: { baseURL: process.env.BASE_URL ?? 'https://danube-web.shop' },
+})
+```
+
+Git-ignore `.env` (it holds secrets) and commit a `.env.example` with just the names. In CI, set the same variables as secrets instead of shipping the file.
+
+## Shared options (`use`)
+
+Everything under `use` applies to every test's browser context (a project's `use` overrides it):
+
+- **`baseURL`** — the one to always set. Lets tests call `page.goto('/login')` and keeps environments swappable via an env var.
+- **Artifact capture** — `trace` (and `video`), e.g. `trace: 'on-first-retry'` or `'retain-on-failure'`. The "turn it on" knobs live here; reading a trace is in [debugging.md](./debugging.md). A trace already bundles a per-step screenshot timeline, so reach for traces over standalone screenshots.
+- **Emulation** — `viewport`, `locale`, `timezoneId`, `colorScheme: 'dark'`, `ignoreHTTPSErrors`.
+- **`headless`** — `false` to watch a run locally; leave `true` in CI.
+
+## Projects
+
+Each project runs the suite with its own `use`, and can filter and depend on others:
+
+- `dependencies` sequences projects — the classic use is an auth `setup` project that others depend on (see [auth.md](./auth.md)).
+- `testMatch` / `testIgnore` route files to projects (e.g. a smoke project, or per-environment `baseURL`/`retries`).
+- Run one project with `npx playwright test --project=chromium`; skip its deps with `--no-deps`.
+
+Parameterize a project by exposing a fixture as an **option** (`[value, { option: true }]`), then override it per project's `use` or per test with `test.use({ ... })` — one spec, many configurations.
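+
+A minimal sketch of that pattern, assuming a hypothetical `tenant` option and a `fixtures.ts` file (both names are illustrative, not from the suite):
+
+```ts fixtures.ts
+import { test as base } from '@playwright/test'
+
+export type Options = { tenant: string }   // hypothetical option
+
+export const test = base.extend<Options>({
+  // the tuple form marks `tenant` as an option and gives it a default
+  tenant: ['acme', { option: true }],
+})
+```
+
+```ts playwright.config.ts
+import { defineConfig } from '@playwright/test'
+import type { Options } from './fixtures'
+
+export default defineConfig<Options>({
+  projects: [
+    { name: 'acme', use: { tenant: 'acme' } },       // hypothetical project names
+    { name: 'globex', use: { tenant: 'globex' } },
+  ],
+})
+```
+
+Specs that import `test` from `./fixtures` receive `tenant` like any other fixture, and `test.use({ tenant: 'globex' })` overrides it for one file or `describe` block.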
+
+## Start your app first (`webServer`)
+
+Let Playwright boot your app before the run and tear it down after, so `npm test` is one command:
+
+```ts
+webServer: {
+  command: 'npm run start',
+  url: 'http://localhost:3000',
+  reuseExistingServer: !process.env.CI,   // reuse a running dev server locally
+  timeout: 120_000,
+}
+```
+
+Point `use.baseURL` at the same `url` so tests stay environment-agnostic.
+
+## Timeouts
+
+Test and assertion budgets are set here too: top-level `timeout` (per test, default 30s) and `expect: { timeout }` (web-first assertions, default 5s). See [assertions.md](./assertions.md) and [waiting.md](./waiting.md) for how these interact with auto-waiting.
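+
+As a sketch, both budgets set explicitly (the values shown are just the defaults, so tune them to your app):
+
+```ts playwright.config.ts
+import { defineConfig } from '@playwright/test'
+
+export default defineConfig({
+  timeout: 30_000,              // per-test budget
+  expect: { timeout: 5_000 },   // per-assertion budget for web-first assertions
+})
+```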
+
+## Deeper in the docs
+
+- [Parameterizing projects](https://www.checklyhq.com/learn/playwright/how-to-parameterize-playwright-projects/)
+- [Playwright: Projects](https://playwright.dev/docs/test-projects)
+- [Playwright: Configuration](https://playwright.dev/docs/test-configuration)
+- [Playwright: webServer](https://playwright.dev/docs/test-webserver)
diff --git a/skills/playwright-best-practices-for-agents/references/console-errors.md b/skills/playwright-best-practices-for-agents/references/console-errors.md
new file mode 100644
index 00000000..a718ca8c
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/console-errors.md
@@ -0,0 +1,60 @@
+# Console errors
+
+A test can pass while the browser logs an uncaught exception or a `console.error` — a real bug your assertions never looked at. Turn unexpected browser errors into a test failure.
+
+## Listen for errors
+
+Two signals, in order of value:
+
+- `page.on('pageerror', …)` — **uncaught exceptions** in page code. The high-signal one; an unhandled error is almost always a bug.
+- `page.on('console', …)` — console output; filter to `msg.type() === 'error'` for `console.error` calls.
+
+```ts
+page.on('pageerror', err => console.log('uncaught:', err.message))
+page.on('console', msg => {
+  if (msg.type() === 'error') console.log('console.error:', msg.text())
+})
+```
+
+## Make it a gate with a fixture
+
+Collecting errors and asserting none at the end is per-test boilerplate — make it an **automatic fixture** so every test gets the check for free. The fixture attaches listeners *before* the test body (so nothing is missed on first navigation) and asserts in teardown:
+
+```ts base.ts
+import { test as base, expect } from '@playwright/test'
+
+export const test = base.extend<{ failOnError: void }>({
+  failOnError: [async ({ page }, use) => {
+    const errors: string[] = []
+    page.on('pageerror', err => errors.push(err.message))
+    page.on('console', msg => { if (msg.type() === 'error') errors.push(msg.text()) })
+
+    await use()   // run the test
+
+    expect(errors, 'no uncaught or console errors during the test').toEqual([])
+  }, { auto: true }],
+})
+export { expect } from '@playwright/test'
+```
+
+## Allowlist real noise — sparingly
+
+Some third-party scripts log benign errors you can't fix. Filter them by pattern, but keep the list **short and reviewed** — a broad filter quietly hides regressions:
+
+```ts
+const IGNORE = [/ResizeObserver loop limit exceeded/, /third-party-widget\.js/]
+page.on('pageerror', err => {
+  if (!IGNORE.some(re => re.test(err.message))) errors.push(err.message)
+})
+```
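+
+The snippet above filters only `pageerror`. One way to apply the same allowlist to `console` errors as well is a small shared predicate; the helper names here are hypothetical:
+
+```ts
+// One allowlist shared by both listeners (patterns copied from the snippet above).
+const IGNORE: RegExp[] = [/ResizeObserver loop limit exceeded/, /third-party-widget\.js/]
+
+// True when the message matches none of the allowlisted patterns.
+function isUnexpected(message: string, ignore: RegExp[] = IGNORE): boolean {
+  return !ignore.some(re => re.test(message))
+}
+
+// Keep only the messages that should fail the test.
+function unexpectedOnly(messages: string[], ignore: RegExp[] = IGNORE): string[] {
+  return messages.filter(m => isUnexpected(m, ignore))
+}
+```
+
+In the fixture, either guard both `errors.push(...)` calls with `isUnexpected(...)`, or collect everything and assert on `unexpectedOnly(errors)` in teardown.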
+
+## Inspecting errors live
+
+When chasing one error rather than gating the whole suite, `playwright-cli console error` prints the browser console from a driven or attached session — no listeners to wire up. See [debugging.md](./debugging.md).
+
+## Deeper in the docs
+
+- [Debugging common errors](https://www.checklyhq.com/learn/playwright/debugging-errors/)
+- [Playwright: `page.on('console')`](https://playwright.dev/docs/api/class-page#page-event-console)
+- [Playwright: `page.on('pageerror')`](https://playwright.dev/docs/api/class-page#page-event-page-error)
+- [Playwright: `ConsoleMessage`](https://playwright.dev/docs/api/class-consolemessage)
diff --git a/skills/playwright-best-practices-for-agents/references/debugging.md b/skills/playwright-best-practices-for-agents/references/debugging.md
new file mode 100644
index 00000000..cb664e13
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/debugging.md
@@ -0,0 +1,64 @@
+# Debugging
+
+Playwright's classic debugging tools — the Inspector (`--debug`), UI mode (`--ui`), and the Trace Viewer (`show-trace`) — are **interactive GUIs you can't drive** in an agent session. Skip them. Playwright ships a separate **agent CLI** built for exactly this situation, plus machine-readable failure artifacts. Reach for those instead.
+
+## Read the failure first (zero setup)
+
+1. **stdout call log** — the reporter prints the failing assertion *and* the call log: which locator/assertion timed out and what Playwright actually saw. This is the primary signal — read it before changing anything.
+2. **`error-context.md`** — on an `expect` failure, Playwright writes an aria snapshot of the page *at the moment it failed* to `test-results/<test>/error-context.md` (also on `testInfo.errors`). Machine-readable page structure, no re-run needed. *(recent Playwright, ≈1.60)*
+3. **Artifacts, not GUIs** — `--trace on` drops `trace.zip` into `test-results/`. The Trace Viewer (`npx playwright show-trace`, [trace.playwright.dev](https://trace.playwright.dev)) is a GUI — fine to hand a human, but as an agent prefer the live session below.
+
+## The agent CLI: `playwright-cli`
+
+A **separate tool** from `npx playwright`, purpose-built for coding agents — install it as a dev dependency (`npm install -D @playwright/cli`) and run it through `npx playwright-cli`, or install it globally (`npm install -g @playwright/cli`) to call `playwright-cli` directly. It runs a persistent browser daemon and, after every command, prints an accessibility **snapshot with element refs** (`e5`, `e15`) you act on. That makes actions deterministic and keeps it token-efficient: it surfaces a structured snapshot instead of dumping the raw DOM into your context. It also ships its own skill — `playwright-cli install --skills`.
+
+Use it to explore a flow and discover locators — the agent-drivable replacement for `codegen`'s GUI recorder:
+
+```bash
+playwright-cli open https://danube-web.shop
+playwright-cli snapshot                      # accessibility tree + element refs
+playwright-cli click e15                     # act on a ref
+playwright-cli fill e5 "user@example.com"
+```
+
+Each snapshot labels every ref with its role and accessible name — write the `getByRole`/`getByLabel` locator straight from those. (The `eN` refs drive the live session; they don't go into the spec.) Prefix any command with `-s=<name>` for named, isolated sessions.
+
+## Step through a failing test live: `--debug=cli`
+
+The agentic step debugger — non-interactive, so you *can* drive it. Run the test with `--debug=cli`; it pauses and prints a session name, then attach and inspect:
+
+```bash
+npx playwright test tests/checkout.spec.ts --debug=cli   # run in background; prints a session name
+playwright-cli attach <session-name>
+playwright-cli snapshot                       # page state at the pause
+playwright-cli step-over                      # advance one action
+playwright-cli console error                  # console errors
+playwright-cli network                        # network activity
+playwright-cli eval "() => document.title"
+playwright-cli pause-at tests/checkout.spec.ts:42   # set a breakpoint
+playwright-cli resume
+```
+
+Attach at the failing step, see why the locator didn't resolve or what the page actually rendered, fix the root cause, re-run. No Inspector, no trace GUI.
+
+## codegen is human-only
+
+`npx playwright codegen` records into the Inspector GUI — you can't drive it as an agent. Use the `playwright-cli` snapshot flow above instead — read roles and names off the snapshot and author the locator yourself. (Same point in [auth.md](./auth.md).)
+
+## Common failures → root cause
+
+These fixes assume current Playwright, where auto-waiting and web-first assertions replace the old `waitForSelector` advice:
+
+- **element not found / not visible** — usually a wrong or over-specific locator, or you acted before the UI settled. Switch to a user-facing locator; the action already auto-waits. Don't reach for `waitForSelector` / `waitForTimeout`. → [locators.md](./locators.md), [waiting.md](./waiting.md)
+- **strict mode violation (resolved to N elements)** — the locator matches more than one node. Narrow it with `.filter()`, a `getByRole` name, or by scoping into a region. → [locators.md](./locators.md)
+- **assertion timeout** — the call log shows what was or wasn't there. Either the locator is wrong or the state genuinely never happened; fix the cause, don't bump the timeout. → [assertions.md](./assertions.md)
+- **target/page closed** — the context closed mid-action, often a stray navigation or a page closed too early.
+
+> **Stay current.** These primitives are recent and version-gated — `error-context.md` and `--debug=cli` both landed in 2025 releases. Check `npx playwright --version` and `playwright-cli --version`; if they're behind, tell the user to update both packages (`@playwright/test` and `@playwright/cli`).
+
+## Deeper in the docs
+
+- [Playwright agent CLI: introduction](https://playwright.dev/agent-cli/introduction)
+- [Agent CLI: test debugging](https://playwright.dev/agent-cli/commands/test-debugging)
+- [Debugging scripts](https://www.checklyhq.com/learn/playwright/debugging/) — the human/GUI workflow
+- [Debugging common errors](https://www.checklyhq.com/learn/playwright/debugging-errors/)
diff --git a/skills/playwright-best-practices-for-agents/references/error-states.md b/skills/playwright-best-practices-for-agents/references/error-states.md
new file mode 100644
index 00000000..31147122
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/error-states.md
@@ -0,0 +1,86 @@
+# Error & edge states
+
+The happy path is the easy half. Test what the app does when the server errors, the network fails, the user goes offline, or a response is slow — that's where real bugs hide. All of these lean on request interception ([network.md](./network.md)).
+
+## Server errors
+
+Force an error response and assert the app shows a real error state — a message, a retry affordance — instead of a blank screen:
+
+```ts
+await page.route('**/api/orders', route => route.fulfill({ status: 500, body: 'boom' }))
+await page.goto('/orders')
+await expect(page.getByRole('alert')).toHaveText(/something went wrong/i)
+await expect(page.getByRole('button', { name: 'Retry' })).toBeVisible()
+```
+
+## Network failure & offline
+
+Drop requests to simulate a dead connection, or flip the whole context offline:
+
+```ts
+await page.route('**/api/**', route => route.abort())   // requests fail
+// …or take the whole context offline:
+await context.setOffline(true)
+await expect(page.getByText('You are offline')).toBeVisible()
+await context.setOffline(false)
+```
+
+`route.abort('internetdisconnected')` mimics a specific browser network error if the app branches on the failure type.
+
+## Loading & skeleton states
+
+To prove a loading state actually renders, delay the response so it's observable, then assert it resolves:
+
+```ts
+await page.route('**/api/orders', async route => {
+  await new Promise(r => setTimeout(r, 1000))   // deliberate, controlled delay
+  await route.fulfill({ json: orders })
+})
+await page.goto('/orders')
+await expect(page.getByTestId('skeleton')).toBeVisible()   // shown while pending
+await expect(page.getByRole('listitem')).toHaveCount(3)    // then the data
+```
+
+The delay lives in *your* mock, under your control — that's different from a `waitForTimeout` against the live app ([waiting.md](./waiting.md)), which is always wrong.
+
+## Assert recovery, not just failure
+
+A good error test also proves the app comes back. Serve the error first, then let a retry succeed (count the calls in the handler so the second one returns 200) and assert the recovered UI:
+
+```ts
+let attempt = 0
+await page.route('**/api/orders', route =>
+  attempt++ === 0
+    ? route.fulfill({ status: 500 })            // first call fails
+    : route.fulfill({ json: orders }),          // retry succeeds
+)
+await page.goto('/orders')
+await page.getByRole('button', { name: 'Retry' }).click()
+await expect(page.getByRole('listitem')).toHaveCount(3)
+```
+
+## Reproduce the error state with the agent CLI
+
+Mock the failure live and confirm the app actually renders an error state — the right alert, a retry affordance, the real copy — before you commit the assertions ([debugging.md](./debugging.md)):
+
+```bash
+playwright-cli open https://danube-web.shop/orders
+playwright-cli route "**/api/orders" --status=500   # force the server error
+playwright-cli reload                                 # re-fetch through the mock
+playwright-cli snapshot                               # the rendered error UI → author assertions from it
+playwright-cli console error                          # confirm it degrades without throwing
+```
+
+The snapshot tells you whether the app degraded gracefully or just blanked — the bug this test exists to catch — and hands you the real alert role and message text to assert against. `network --filter=orders` shows the request fired; `route-list` / `unroute` manage the active mocks.
+
+## Don't forget console errors
+
+An error state shouldn't spew uncaught exceptions while it renders. Pair these tests with the console-error gate in [console-errors.md](./console-errors.md).
+
+## Deeper in the docs
+
+- [Mocking API responses](https://www.checklyhq.com/learn/playwright/mock-api/)
+- [Intercepting requests](https://www.checklyhq.com/learn/playwright/intercept-requests/)
+- [Playwright: Mock APIs (`abort`, `fulfill`)](https://playwright.dev/docs/mock)
+- [Playwright: Network](https://playwright.dev/docs/network)
+- [Playwright: `context.setOffline`](https://playwright.dev/docs/api/class-browsercontext#browser-context-set-offline)
diff --git a/skills/playwright-best-practices-for-agents/references/files.md b/skills/playwright-best-practices-for-agents/references/files.md
new file mode 100644
index 00000000..ac705add
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/files.md
@@ -0,0 +1,78 @@
+# File upload & download
+
+Upload and download both come down to one rule: point Playwright at the input or capture the event — never try to automate the OS file dialog.
+
+## Upload to a file input
+
+The common case is a real `<input type="file">`. Set the files on its locator — no dialog involved:
+
+```ts
+await page.getByLabel('Avatar').setInputFiles('fixtures/avatar.png')
+await page.getByLabel('Docs').setInputFiles(['a.pdf', 'b.pdf'])   // multiple
+await page.getByLabel('Avatar').setInputFiles([])                  // clear the selection
+```
+
+Locate the input by its label or role like any control ([locators.md](./locators.md)). Paths are relative to the working directory.
+
+### From memory (no file on disk)
+
+Pass a buffer to upload generated content without a fixture file:
+
+```ts
+await page.getByLabel('Upload').setInputFiles({
+  name: 'report.csv',
+  mimeType: 'text/csv',
+  buffer: Buffer.from('a,b\n1,2\n'),
+})
+```
+
+### When there's no input — the file chooser
+
+Some UIs open a native chooser from a button with no reachable `<input>`. Capture the `filechooser` event, setting the wait up **before** the click:
+
+```ts
+const chooserPromise = page.waitForEvent('filechooser')
+await page.getByRole('button', { name: 'Upload' }).click()
+const chooser = await chooserPromise
+await chooser.setFiles('fixtures/avatar.png')
+```
+
+Prefer the input form when one exists; reach for the chooser only when the markup forces it.
+
+## Download a file
+
+A download is an event, not a navigation. Set up `waitForEvent('download')` **before** the click that triggers it (same promise-before-action pattern as [waiting.md](./waiting.md)), then save and inspect:
+
+```ts
+const downloadPromise = page.waitForEvent('download')
+await page.getByRole('button', { name: 'Export CSV' }).click()
+const download = await downloadPromise
+
+await download.saveAs('downloads/report.csv')   // persist it where you want
+download.suggestedFilename()                     // the server-suggested name
+```
+
+`download.path()` resolves to the path of the temp copy Playwright already stored, and `download.createReadStream()` resolves to a readable stream of it; both are async, so `await` them. Downloads are accepted by default, so no config is needed.
+
+### Assert what came down
+
+Don't stop at "a download happened" — check the contents:
+
+```ts
+const download = await downloadPromise
+expect(download.suggestedFilename()).toBe('report.csv')
+const stream = await download.createReadStream()
+// …read the stream (or the saveAs path) and assert on the bytes / rows
+```
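+
+One way to fill in that last step, sketched with two hypothetical helpers. `parseCsv` is deliberately naive and assumes no quoted fields:
+
+```ts
+import type { Readable } from 'node:stream'
+
+// Collect a readable stream (such as the one from download.createReadStream()) into a string.
+async function readAll(stream: Readable): Promise<string> {
+  const chunks: Buffer[] = []
+  for await (const chunk of stream) chunks.push(Buffer.from(chunk))
+  return Buffer.concat(chunks).toString('utf8')
+}
+
+// Naive CSV split: drops empty lines, splits on commas, no quoting support.
+function parseCsv(text: string): string[][] {
+  return text
+    .split(/\r?\n/)
+    .filter(line => line.length > 0)
+    .map(line => line.split(','))
+}
+```
+
+In the test, `const rows = parseCsv(await readAll(await download.createReadStream()))` gives you rows to assert on with `expect`, for example the header row or the row count.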
+
+## Discovering the control (agent)
+
+Not sure whether a page uses an `<input>` or a button-triggered chooser? Drive it with `playwright-cli`: `snapshot` shows the accessible control, and a trial click reveals whether a chooser fires — then write the matching pattern above. → [debugging.md](./debugging.md)
+
+## Deeper in the docs
+
+- [Testing file uploads](https://www.checklyhq.com/learn/playwright/testing-file-uploads/)
+- [Downloading files](https://www.checklyhq.com/learn/playwright/file-download/)
+- [Playwright: Uploading files](https://playwright.dev/docs/input#upload-files)
+- [Playwright: Downloads](https://playwright.dev/docs/downloads)
+- [Playwright: `FileChooser`](https://playwright.dev/docs/api/class-filechooser)
diff --git a/skills/playwright-best-practices-for-agents/references/flakiness.md b/skills/playwright-best-practices-for-agents/references/flakiness.md
new file mode 100644
index 00000000..fd273b8d
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/flakiness.md
@@ -0,0 +1,56 @@
+# Flakiness
+
+A flaky test passes and fails without any code change. The cause is almost always a **race between the test and the app**, or **shared state between tests**. Fix the cause — don't mask it with retries.
+
+## Common root causes → fix
+
+| Symptom | Real cause | Fix |
+|---|---|---|
+| Passes locally, fails in CI | Hard wait too short under load | Drop `waitForTimeout`; trust auto-waiting + web-first assertions → [waiting.md](./waiting.md) |
+| Assertion sees stale value | Non-retrying check (`innerText()` then `toBe`) | Web-first `await expect(locator)…` → [assertions.md](./assertions.md) |
+| Breaks after unrelated UI change | Brittle locator (CSS/`nth`) | User-facing locators → [locators.md](./locators.md) |
+| Fails only when run with others | Shared external state (same DB row, same account) | Each test provisions its own data via API → [test-structure.md](./test-structure.md) |
+| Fails when run in a different order | One test depends on another | Make every test independent |
+| Random hangs/timeouts on load | `networkidle` / arbitrary timing | Wait on app state (`waitForURL`, a visible result) → [waiting.md](./waiting.md) |
+
+## Retries are a safety net, not a fix
+
+Retries absorb genuine infrastructure hiccups (a dropped connection, a cold start) so one blip doesn't fail the run. They do **not** fix a flaky test — a retry that flips red→green is hiding a real race you should investigate.
+
+```ts playwright.config.ts
+import { defineConfig } from '@playwright/test'
+
+export default defineConfig({
+  retries: process.env.CI ? 2 : 0,   // 0 locally so you notice flakes immediately
+  use: { trace: 'on-first-retry' },  // capture a trace the moment a test retries
+})
+```
+
+Keep retries **off locally** — a flake that only "passes on retry" in CI is one you'll never see if your laptop silently retries it too. When a test does retry, read the `on-first-retry` trace (see [debugging.md](./debugging.md)) to find the race. `test.info().retry` lets a fixture reset external state before a retry attempt if needed.
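+
+A sketch of such a fixture, assuming a hypothetical `resetExternalState` cleanup that you would replace with your own API call or DB reset:
+
+```ts base.ts
+import { test as base } from '@playwright/test'
+
+// Hypothetical stand-in: replace with your own cleanup logic.
+async function resetExternalState(): Promise<void> {}
+
+export const test = base.extend<{ cleanOnRetry: void }>({
+  cleanOnRetry: [async ({}, use, testInfo) => {
+    // retry is 0 on the first attempt and counts up on each retry
+    if (testInfo.retry > 0) await resetExternalState()
+    await use()
+  }, { auto: true }],
+})
+```
+
+Because the fixture is automatic, the reset runs before any retried test without the spec asking for it; first attempts are unaffected.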
+
+## Isolation & parallelism
+
+Each test already gets a **fresh browser context** (its own cookies and storage), so browser state never leaks between tests. Flakiness comes from state *outside* the browser and from execution order.
+
+Playwright's default: **files run in parallel, tests within a file run in order on one worker.** Setting `fullyParallel: true` (the [config.md](./config.md) baseline) also spreads tests *within* a file across workers — so two tests in the same file can run at once and **must not** share mutable state or assume an order. That surfaces hidden coupling early: any test relying on a sibling's side effect now fails, forcing each to provision its own state via API ([test-structure.md](./test-structure.md)).
+
+- **Workers are isolated processes.** Nothing is shared across workers except external resources (your DB, test accounts). Give parallel tests distinct data, or they'll race on the same row.
+- **Opt a genuinely sequential file out** with `test.describe.configure({ mode: 'serial' })` — a stateful flow that must run in order (note: a failure stops the rest of the file). Prefer independent tests; reach for `serial` only when the flow is truly stateful, since it trades isolation for ordering.
+- **Worker-scoped fixtures** (`{ scope: 'worker' }`) share expensive setup across tests in a worker — keep them read-only; mutating shared fixture state reintroduces races.
+
+## Detecting flakiness
+
+A test that "passed once" isn't stable. Prove it by running it many times — and in parallel, to surface races:
+
+```sh
+npx playwright test tests/checkout.spec.ts --repeat-each=20   # run it 20× back to back
+npx playwright test tests/checkout.spec.ts --retries=3        # does it only pass on retry?
+```
+
+If a test fails only in CI, reproduce locally by matching CI's conditions — `fullyParallel: true` and the same `--workers` count — so the race shows up on your machine.
+
+Resilient tests are also resilient [Checkly monitors](https://www.checklyhq.com/?utm_source=ai-skill): the same race that flakes in CI pages you at 3am in production, so fixing the root cause pays off twice.
+
+## Deeper in the docs
+
+- [Running tests in parallel](https://www.checklyhq.com/learn/playwright/testing-in-parallel/)
+- [Playwright: Retries](https://playwright.dev/docs/test-retries)
+- [Playwright: Parallelism](https://playwright.dev/docs/test-parallel)
diff --git a/skills/playwright-best-practices-for-agents/references/forms.md b/skills/playwright-best-practices-for-agents/references/forms.md
new file mode 100644
index 00000000..5aea5896
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/forms.md
@@ -0,0 +1,77 @@
+# Forms & validation
+
+Fill fields by their label, submit like a user, and assert the outcome — including the error states, not just the happy path.
+
+## Fill fields
+
+```ts
+await page.getByLabel('Email').fill('user@example.test')        // clears then sets, in one step
+await page.getByLabel('Search').pressSequentially('lap')        // keystroke-by-keystroke (autocomplete)
+await page.getByLabel('Email').clear()                          // empty the field
+```
+
+`fill` is the default — it sets the value in one shot. Reach for `pressSequentially` only when the UI reacts to each keystroke (a type-ahead). Locate fields by label or role ([locators.md](./locators.md)); doing so doubles as an accessibility check.
+
+## Selects, checkboxes, radios
+
+```ts
+await page.getByLabel('Country').selectOption('DE')                  // by value
+await page.getByLabel('Country').selectOption({ label: 'Germany' })  // by visible label
+await page.getByLabel('Subscribe').check()
+await page.getByLabel('Subscribe').setChecked(false)
+await page.getByRole('radio', { name: 'Express' }).check()
+```
+
+`check()` / `uncheck()` assert the resulting state for you (and no-op if it's already there); `setChecked(boolean)` is handy when the value is dynamic.
+
+## Submit and assert the outcome
+
+Drive the submit and assert what the user would see:
+
+```ts
+await page.getByRole('button', { name: 'Sign up' }).click()
+await expect(page.getByText('Check your inbox')).toBeVisible()
+```
+
+## Test the error states, not just the happy path
+
+A form's validation is the part most likely to break. Assert the messages with web-first matchers ([assertions.md](./assertions.md)):
+
+```ts
+await page.getByRole('button', { name: 'Sign up' }).click()
+await expect(page.getByText('Email is required')).toBeVisible()
+await expect(page.getByRole('button', { name: 'Sign up' })).toBeDisabled()  // disabled after the invalid submit, until the form is valid
+```
+
+Collect several field errors in one run with soft assertions (`expect.soft`) so one missing message doesn't hide the rest — see [assertions.md](./assertions.md).
+
+For field-level errors, Playwright has a purpose-built matcher: `toHaveAccessibleErrorMessage` asserts the field is flagged invalid **and** exposes the expected accessible error (via `aria-invalid` + `aria-errormessage`) — the accessibility-native check, stronger than scraping the message text by hand:
+
+```ts
+await expect(page.getByLabel('Email')).toHaveAccessibleErrorMessage('Enter a valid email')
+```
+
+## Discover and verify the form with the agent CLI
+
+Forms are where you'd otherwise guess — the exact label text, which control carries which role, the precise wording of each validation message. Read it off the live page instead ([debugging.md](./debugging.md)):
+
+```bash
+playwright-cli open https://danube-web.shop/signup
+playwright-cli snapshot                       # labels + roles for every field → author getByLabel/getByRole
+playwright-cli fill e5 "not-an-email"         # act on a ref from the snapshot
+playwright-cli click e9                        # submit
+playwright-cli snapshot                       # the rendered validation state → copy the real error text
+```
+
+The post-submit snapshot shows the messages the app actually renders and which fields it flags invalid, so you assert against real copy (`getByText('…')`, `toHaveAccessibleErrorMessage`) instead of guessing — the same accessibility tree that powers those locators and matchers.
+
+## File inputs
+
+A file field is just an upload — use `setInputFiles`, covered in [files.md](./files.md).
+
+## Deeper in the docs
+
+- [Clicking, typing, hovering](https://www.checklyhq.com/learn/playwright/clicking-typing-hovering/)
+- [Playwright: Text input (`fill`, `selectOption`, `check`)](https://playwright.dev/docs/input)
+- [Playwright: Locators (`getByLabel`)](https://playwright.dev/docs/locators)
+- [Assertions](./assertions.md)
diff --git a/skills/playwright-best-practices-for-agents/references/global-setup.md b/skills/playwright-best-practices-for-agents/references/global-setup.md
new file mode 100644
index 00000000..26e9bbaf
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/global-setup.md
@@ -0,0 +1,63 @@
+# Global setup
+
+Two ways to run work once before the suite: a `globalSetup` function, or a setup **project** that other projects depend on. They solve different problems — pick by whether the work needs the browser and fixtures.
+
+## Setup projects (prefer these)
+
+A setup project is a normal test file that runs first because other projects declare `dependencies: ['setup']` on it. It has the full toolkit — fixtures, `page`, `expect`, `baseURL`, tracing — and shows up in the report like any test. The canonical use is auth: log in once, save `storageState`, and dependents start signed in.
+
+```ts playwright.config.ts
+projects: [
+  { name: 'setup', testMatch: /.*\.setup\.ts/ },
+  { name: 'chromium', use: { ...devices['Desktop Chrome'] }, dependencies: ['setup'] },
+]
+```
+
+Reach for a setup project for anything that benefits from the browser or fixtures, or that you want visible — and retried and traced — like a test. Full auth flow in [auth.md](./auth.md); the projects/`dependencies` mechanics in [config.md](./config.md).
+
+## `globalSetup` — bootstrapping outside the runner
+
+`globalSetup` is a single function run **once** before everything, *outside* the test runner — no fixtures, no `page`, and none of the `use` options applied for you (no automatic `baseURL`, no tracing). It does receive the resolved `config`, so you can still read values like `config.projects[0].use.baseURL` when you need them. Reach for it for non-test, non-browser bootstrapping: seed a database, start an external service, mint an API token.
+
+```ts playwright.config.ts
+import { defineConfig } from '@playwright/test'
+
+export default defineConfig({
+  globalSetup: './global-setup.ts',
+  globalTeardown: './global-teardown.ts',
+})
+```
+
+```ts global-setup.ts
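+import type { FullConfig } from '@playwright/test'
+import { seedDatabase, mintToken } from './test-helpers'   // hypothetical: your own bootstrapping helpers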
+
+export default async function globalSetup(config: FullConfig) {
+  await seedDatabase()
+  process.env.API_TOKEN = await mintToken()   // pass data to tests via env vars
+}
+```
+
+It runs outside any test, so it returns nothing to tests directly — hand data over through `process.env` or a file on disk. An error here fails the whole run before a single test starts.
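+
+The file-on-disk route can be sketched with plain Node `fs` calls. The `RunState` shape, the helper names, and the file path are all hypothetical:
+
+```ts
+import { mkdirSync, readFileSync, writeFileSync } from 'node:fs'
+import { dirname } from 'node:path'
+
+// Hypothetical shape of the data handed from globalSetup to the tests.
+type RunState = { apiToken: string; seededUserId: string }
+
+// Call from globalSetup: persist the state as JSON, creating the folder if needed.
+function writeRunState(file: string, state: RunState): void {
+  mkdirSync(dirname(file), { recursive: true })
+  writeFileSync(file, JSON.stringify(state), 'utf8')
+}
+
+// Call from a test or fixture: load what globalSetup wrote.
+function readRunState(file: string): RunState {
+  return JSON.parse(readFileSync(file, 'utf8')) as RunState
+}
+```
+
+A file suits structured data or values too large for an env var. Git-ignore the path you choose, since it may hold tokens.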
+
+## Teardown
+
+- A `globalTeardown` function mirrors `globalSetup` for one-time cleanup after the run.
+- A setup *project* cleans up with a **teardown project**: point the project's `teardown` at another project that runs once everything depending on it has finished.
+
+```ts playwright.config.ts
+projects: [
+  { name: 'setup db', testMatch: /global\.setup\.ts/, teardown: 'cleanup db' },
+  { name: 'cleanup db', testMatch: /global\.teardown\.ts/ },
+]
+```
+
+## Which one
+
+- **Auth, or anything needing the browser / fixtures / report** → setup project.
+- **Seeding, services, tokens — non-test bootstrapping** → `globalSetup`.
+
+When in doubt, prefer a setup project: it reuses your config and is visible in the report when it breaks.
+
+## Deeper in the docs
+
+- [Playwright: Global setup and teardown](https://playwright.dev/docs/test-global-setup-teardown)
+- [Playwright: Projects (dependencies & teardown)](https://playwright.dev/docs/test-projects)
+- [Managing authentication](https://www.checklyhq.com/learn/playwright/authentication/)
diff --git a/skills/playwright-best-practices-for-agents/references/iframes.md b/skills/playwright-best-practices-for-agents/references/iframes.md
new file mode 100644
index 00000000..fd2e7d8d
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/iframes.md
@@ -0,0 +1,52 @@
+# Frames & iframes
+
+An `<iframe>` is a separate document — main-page locators stop at its boundary. Step into it with a frame locator, then locate as usual.
+
+## `frameLocator` — step into the frame
+
+```ts
+await page
+  .frameLocator('iframe[title="Checkout"]')
+  .getByRole('button', { name: 'Pay' })
+  .click()
+```
+
+A frame locator is lazy and auto-waiting, exactly like a normal locator ([locators.md](./locators.md)): it waits for the frame and the element to appear, so no `waitForSelector`. Pick a stable selector for the iframe — its `name`, `title`, or `src` — never a positional one.
+
+## Find the iframe with a user-facing locator
+
+To target the iframe element the user-facing way, locate it and call `.contentFrame()`:
+
+```ts
+const checkout = page.getByTitle('Checkout').contentFrame()
+await checkout.getByLabel('Card number').fill('4242 4242 4242 4242')
+```
+
+Same result as `frameLocator`, but you find the frame with priority-ladder locators instead of a CSS string.
+
+## Nested frames
+
+Chain to step through frames inside frames:
+
+```ts
+await page
+  .frameLocator('#outer')
+  .frameLocator('#inner')
+  .getByRole('textbox')
+  .fill('hi')
+```
+
+## Don't use the handle API
+
+The older `page.frame({ name })` / `elementHandle.contentFrame()` handle approach doesn't auto-wait and is easy to get wrong. Prefer `frameLocator` / `locator.contentFrame()` — they re-resolve on every use and wait for actionability.
+
+## Discovering frame content (agent)
+
+`playwright-cli snapshot` includes the accessibility tree **inside** iframes, so you can read the roles and names to target and author the locator to chain off the frame. → [debugging.md](./debugging.md)
+
+## Deeper in the docs
+
+- [Handling iframes](https://www.checklyhq.com/learn/playwright/iframe-interaction/)
+- [Playwright: Frames](https://playwright.dev/docs/frames)
+- [Playwright: `FrameLocator`](https://playwright.dev/docs/api/class-framelocator)
+- [Playwright: `locator.contentFrame()`](https://playwright.dev/docs/api/class-locator#locator-content-frame)
diff --git a/skills/playwright-best-practices-for-agents/references/interactions.md b/skills/playwright-best-practices-for-agents/references/interactions.md
new file mode 100644
index 00000000..501af482
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/interactions.md
@@ -0,0 +1,77 @@
+# Keyboard, mouse & native dialogs
+
+Most flows are `click` and `fill` ([forms.md](./forms.md)). Some need real key presses, a pointer move, or a response to a browser dialog. Reach for these only when a plain action won't do — they're lower-level and easy to overuse.
+
+## Keyboard
+
+Prefer a semantic action; drop to keys for shortcuts, navigation, and type-ahead.
+
+```ts
+await page.getByRole('textbox').press('Enter')          // one key on a focused element
+await page.getByRole('textbox').press('Control+A')      // chord — modifiers joined with +
+await page.keyboard.press('Escape')                     // a key not aimed at an element (close a modal)
+await page.getByRole('combobox').pressSequentially('lap')  // key-by-key for type-ahead (see forms.md)
+```
+
+`press` takes key names like `Enter`, `Tab`, `Escape`, `ArrowDown`, `Backspace`, and `F1`–`F12`, combined with `Shift`/`Control`/`Alt`/`Meta`. Use it to exercise focus order (`Tab`), keyboard shortcuts, and accessibility — not as a slower substitute for `fill`.
+
+## Mouse, hover & scroll
+
+```ts
+await page.getByRole('menuitem', { name: 'More' }).hover()        // reveal a hover menu/tooltip
+await page.getByText('Row').click({ button: 'right' })            // context menu
+await page.getByText('Row').click({ modifiers: ['Shift'] })       // range-select
+await page.getByRole('listitem').last().scrollIntoViewIfNeeded()  // bring into view (infinite lists)
+```
+
+Actions auto-scroll and auto-wait for actionability already, so you rarely need raw coordinates. Reach for `page.mouse.move()` / `page.mouse.wheel()` only for pixel-bound cases — canvas, drag-on-canvas, wheel-zoom — which are inherently brittle. `dblclick()` is there when a double-click is the interaction under test.
+
+> **Hover reveals; assert the thing it reveals.** Check that the menu or tooltip became visible with a web-first matcher — don't assert "is hovered."
+
+## Clipboard
+
+Reading the clipboard needs permission; grant it, then assert what a "Copy" button actually wrote:
+
+```ts
+test.use({ permissions: ['clipboard-read', 'clipboard-write'] })
+
+await page.getByRole('button', { name: 'Copy link' }).click()
+const copied = await page.evaluate(() => navigator.clipboard.readText())
+expect(copied).toBe('https://danube-web.shop/i/42')
+```
+
+Clipboard access is most reliable on Chromium; where it isn't, assert the app's own "Copied!" confirmation instead. Permissions are a context capability ([mobile.md](./mobile.md)).
+
+## Native dialogs (alert / confirm / prompt)
+
+**Playwright auto-dismisses every dialog by default** — so a `confirm()` your flow depends on is *cancelled* unless you say otherwise, silently breaking the path. Register a handler **before** the action that triggers it:
+
+```ts
+page.on('dialog', dialog => dialog.accept())            // accept all dialogs
+await page.getByRole('button', { name: 'Delete' }).click()
+await expect(page.getByText('Deleted')).toBeVisible()
+```
+
+Assert on the dialog, or feed a prompt, from inside the handler:
+
+```ts
+page.once('dialog', async dialog => {                   // once: only the next action opens one
+  expect(dialog.type()).toBe('confirm')
+  expect(dialog.message()).toBe('Delete this item?')
+  await dialog.accept()        // dialog.accept('typed text') answers a prompt; dialog.dismiss() cancels
+})
+```
+
+> A handler **must** call `accept()` or `dismiss()`: a dialog is a modal that blocks the page until handled, so an unhandled one hangs the action.
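+
+One consequence: an `expect` that throws inside the handler before `accept()` leaves the dialog open. A sketch of one way around that, assuming it suits the test to record first and assert afterwards. `DialogLike` and `acceptAndRecord` are hypothetical names; `DialogLike` lists only methods that Playwright's `Dialog` exposes, so a real dialog satisfies it.
+
+```ts
+// Structural subset of Playwright's Dialog, so the helper can be exercised without a browser.
+interface DialogLike {
+  type(): string
+  message(): string
+  accept(promptText?: string): Promise<void>
+  dismiss(): Promise<void>
+}
+
+type Seen = { type: string; message: string }
+
+// Returns a handler that records what it saw and always resolves the dialog.
+function acceptAndRecord(seen: Seen[], promptText?: string) {
+  return async (dialog: DialogLike): Promise<void> => {
+    seen.push({ type: dialog.type(), message: dialog.message() })
+    await dialog.accept(promptText)   // nothing above can throw an assertion error
+  }
+}
+```
+
+In a test this would be registered with `page.once('dialog', acceptAndRecord(seen))` before the click, with `expect(seen).toEqual([...])` afterwards in the test body.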
+
+### Driving these with the agent CLI
+
+The agent CLI mirrors each — `press` for key presses, `hover` on a ref, and `dialog-accept` / `dialog-dismiss` to arm dialog handling before you trigger it ([debugging.md](./debugging.md)).
+
+## Deeper in the docs
+
+- [Playwright: Keyboard & mouse input](https://playwright.dev/docs/input)
+- [Playwright: Dialogs](https://playwright.dev/docs/dialogs)
+- [Clicking, typing, hovering](https://www.checklyhq.com/learn/playwright/clicking-typing-hovering/)
diff --git a/skills/playwright-best-practices-for-agents/references/locators.md b/skills/playwright-best-practices-for-agents/references/locators.md
new file mode 100644
index 00000000..c9629e3f
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/locators.md
@@ -0,0 +1,78 @@
+# Locators
+
+Pick locators a user (or screen reader) would recognize, not implementation details. Stable locators are the single biggest lever on test reliability.
+
+## Priority ladder
+
+Prefer, in order:
+
+1. `getByRole(role, { name })` — role + accessible name. Default choice.
+2. `getByLabel` / `getByPlaceholder` — form fields.
+3. `getByText` / `getByAltText` / `getByTitle` — non-interactive content.
+4. `getByTestId` — explicit `data-testid` escape hatch when nothing semantic fits.
+5. `locator(css/xpath)` — last resort only.
+
+```ts
+await page.getByRole('button', { name: 'Sign in' }).click()
+```
+
+Role-name matching is case-insensitive and substring-based, so it survives copy tweaks ("Sign in" → "Sign in now") but still fails when the button text breaks meaningfully.
+
+## Why not CSS/XPath
+
+CSS/XPath tie the test to markup. A class rename (`button-frontpage` → `button-hero`) breaks a passing test (false positive); CMS text breaking to `HEROBUTTON_TXT` leaves a CSS test green while the UI is broken (false negative). User-first locators track what the user sees.
+
+If `getByRole`/`getByLabel` *can't* find your elements, that's often an accessibility smell worth fixing in the app — not a reason to drop to CSS.
+
+## Discover locators with the agent CLI
+
+Don't guess selectors from source — read them off the live page. Drive the [`playwright-cli` agent CLI](./debugging.md): `playwright-cli open <url>`, then `playwright-cli snapshot` to print the accessibility tree — the roles and accessible names that power `getByRole`/`getByLabel`. Read them straight off the snapshot and write the matching locator. Because it reflects the real accessibility tree, it naturally points you at priority-ladder locators. (`npx playwright codegen` generates locators too, but only through a GUI you can't drive as an agent.)
+
+## Strict mode & narrowing
+
+Locators run in strict mode: matching >1 element throws. Resolve by narrowing, not by silencing.
+
+- **Chain** to scope into a region — mix semantic and test-id freely: `page.getByTestId('product-grid').getByRole('link', { name: 'Buy' })`.
+- `.filter({ hasText })` / `.filter({ hasNotText })` — narrow by content (preferred for dynamic lists).
+- `.filter({ has })` / `.filter({ hasNot })` — narrow by a child locator.
+- `.or(locator)` / `.and(locator)` — match either/both of two locators.
+- `.first()` / `.nth(n)` / `.last()` — positional; brittle in dynamic content, fine for stable lists.
+
+```ts
+// Narrow before acting — fails when the user would also be stuck (all sold out)
+await page.getByRole('listitem')
+  .filter({ hasText: 'available' })
+  .getByRole('button', { name: 'buy' })
+  .click()
+```
+
+## Locators are lazy and reusable
+
+A locator is a *description*, not a captured element. It re-queries the DOM on every use, so define it once, reuse it, and chain off it. Don't `await` a locator on its own — only `await` the action or assertion.
+
+```ts
+const product = page.getByRole('listitem').filter({ hasText: 'Product 2' })
+await product.getByRole('button', { name: 'Add to cart' }).click()
+await expect(product).toContainText('In cart')
+```
+
+## Anti-patterns
+
+- CSS/XPath when a role/label exists.
+- Positional `.nth()` on lists whose order/content changes.
+- Locators that encode DOM structure (`div > div:nth-child(3)`).
+- **Pre-checking before acting** — actions already auto-wait, so the guard is redundant:
+  ```ts
+  // 👎 redundant
+  await expect(page.getByRole('button', { name: 'Login' })).toBeVisible()
+  await page.getByRole('button', { name: 'Login' }).click()
+  // 👍 just act — click() waits for actionability
+  await page.getByRole('button', { name: 'Login' }).click()
+  ```
+  (Assert visibility when visibility *is* the thing under test — not as a warm-up to every click. See [waiting.md](./waiting.md).)
+
+## Deeper in the docs
+
+- [Working with selectors](https://www.checklyhq.com/learn/playwright/selectors/)
+- [Clicking, typing, hovering](https://www.checklyhq.com/learn/playwright/clicking-typing-hovering/)
+- [Playwright locators guide](https://playwright.dev/docs/locators)
diff --git a/skills/playwright-best-practices-for-agents/references/mobile.md b/skills/playwright-best-practices-for-agents/references/mobile.md
new file mode 100644
index 00000000..9bafc918
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/mobile.md
@@ -0,0 +1,75 @@
+# Mobile & device emulation
+
+Playwright emulates a mobile browser — viewport, user agent, device scale, touch — but it's still desktop Chromium or WebKit, not a real phone. It's good for responsive layout and touch behavior; it's not a substitute for real-device testing.
+
+## Emulate a device
+
+Spread a descriptor from `devices` to set viewport, user agent, `deviceScaleFactor`, `isMobile`, and `hasTouch` in one go. Do it per project — the usual place ([config.md](./config.md)):
+
+```ts playwright.config.ts
+import { defineConfig, devices } from '@playwright/test'
+
+export default defineConfig({
+  projects: [
+    { name: 'iPhone', use: { ...devices['iPhone 13'] } },
+    { name: 'Pixel',  use: { ...devices['Pixel 5'] } },
+  ],
+})
+```
+
+Or for a single test or group:
+
+```ts
+test.use({ ...devices['iPhone 13'] })
+```
+
+`devices['iPhone 13']` runs on WebKit (Mobile Safari user agent); the Pixel runs on Chromium. It's emulation — close enough for layout and touch, but not actual iOS Safari. Real-device coverage needs a device cloud.
+
+## Touch & taps
+
+A device descriptor sets `hasTouch: true`, which enables touch events. Use `tap()` instead of `click()`:
+
+```ts
+await page.getByRole('button', { name: 'Menu' }).tap()
+```
+
+`tap()` requires touch to be enabled: a mobile device descriptor turns it on, or set `use: { hasTouch: true }` directly. For low-level taps by coordinate there's `page.touchscreen.tap(x, y)`. Playwright's built-in touch is basic: multi-touch gestures like pinch and swipe aren't first-class, so simulate a swipe by dispatching touch events, or fall back to the equivalent scroll/click where the UI allows.
+
+## Geolocation & permissions
+
+A location-aware UI needs both the coordinates *and* the permission granted — set them on the context, per test or per project:
+
+```ts
+test.use({
+  geolocation: { latitude: 48.8584, longitude: 2.2945 },   // near the Eiffel Tower
+  permissions: ['geolocation'],
+})
+
+test('shows nearby stores', async ({ page }) => {
+  await page.goto('/stores')
+  await expect(page.getByText('Paris')).toBeVisible()
+})
+```
+
+Move the pin mid-test with `context.setGeolocation({ latitude, longitude })` (the permission must already be granted). Other capabilities ride the same `permissions` option — `'notifications'`, `'camera'`, `'clipboard-read'` — granted in config or at runtime with `context.grantPermissions([...])`.
+
+## Responsive breakpoints
+
+To check a layout at a breakpoint without a full device, set the viewport:
+
+```ts
+test('mobile nav collapses', async ({ page }) => {
+  await page.setViewportSize({ width: 375, height: 812 })
+  await page.goto('/')
+  await expect(page.getByRole('button', { name: 'Open menu' })).toBeVisible()
+})
+```
+
+Or run the same specs across several viewport projects to cover multiple breakpoints in one run.
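+
+A sketch of such a project list, built from a map of breakpoints. The names and sizes are illustrative assumptions, and the objects use only the `name` and `use.viewport` project options:
+
+```ts
+// Illustrative breakpoints; substitute the ones your CSS actually uses.
+const breakpoints = {
+  mobile:  { width: 375,  height: 812 },
+  tablet:  { width: 768,  height: 1024 },
+  desktop: { width: 1280, height: 800 },
+}
+
+// One project per breakpoint, ready to pass as `projects` in playwright.config.ts.
+export const viewportProjects = Object.entries(breakpoints).map(([name, viewport]) => ({
+  name,
+  use: { viewport },
+}))
+```
+
+The config would then read `defineConfig({ projects: viewportProjects })`, and every spec runs once per breakpoint.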
+
+## Deeper in the docs
+
+- [Emulating mobile devices](https://www.checklyhq.com/learn/playwright/emulating-mobile-devices/)
+- [Playwright: Emulation (devices, viewport, touch)](https://playwright.dev/docs/emulation)
+- [Playwright: `locator.tap()`](https://playwright.dev/docs/api/class-locator#locator-tap)
+- [Config & projects](./config.md)
diff --git a/skills/playwright-best-practices-for-agents/references/multi-context.md b/skills/playwright-best-practices-for-agents/references/multi-context.md
new file mode 100644
index 00000000..e68f0048
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/multi-context.md
@@ -0,0 +1,85 @@
+# Multiple tabs, popups & users
+
+A browser **context** is an isolated session — its own cookies and storage; a **page** is a tab within it. Reach for an extra tab when one user opens a window, and an extra context when you need a second, independent user.
+
+## Popups & new tabs
+
+When an action opens a new tab or window, capture the `popup` event — set the wait up **before** the click (promise-before-action, [waiting.md](./waiting.md)):
+
+```ts
+const popupPromise = page.waitForEvent('popup')
+await page.getByRole('link', { name: 'Open invoice' }).click()
+const invoice = await popupPromise
+await expect(invoice.getByRole('heading')).toHaveText('Invoice #42')
+```
+
+The popup is a full `Page` in the **same context**, so it shares the session (cookies, login). Drive it like any page. To open a tab yourself, use `context.newPage()` — same session, second tab.
+
+## A second user — new context
+
+To act as two independent users in one test (collaboration, presence, an admin watching a customer), give each their own **context** so sessions don't bleed:
+
+```ts
+test('admin sees a new order from the customer', async ({ browser }) => {
+  const adminContext    = await browser.newContext({ storageState: 'playwright/.auth/admin.json' })
+  const customerContext = await browser.newContext({ storageState: 'playwright/.auth/customer.json' })
+  const pageAdmin    = await adminContext.newPage()
+  const pageCustomer = await customerContext.newPage()
+
+  await pageAdmin.goto('/admin/orders')
+  await pageCustomer.goto('/checkout')
+
+  await pageCustomer.getByRole('button', { name: 'Place order' }).click()
+
+  await expect(pageAdmin.getByText('Order #1001')).toBeVisible()   // appears live in the dashboard
+
+  await adminContext.close()
+  await customerContext.close()
+})
+```
+
+Each context can carry a different `storageState`, so the two users are signed in as different accounts — see [auth.md](./auth.md) for capturing per-role state. Close the contexts you create.
+
+### Hand the second user to tests via a fixture
+
+Creating and closing the extra context in every test is boilerplate — wrap it in a fixture so cleanup is automatic and specs read at the user level ([test-structure.md](./test-structure.md)):
+
+```ts base.ts
+import { test as base, type Page } from '@playwright/test'
+
+export const test = base.extend<{ pageAdmin: Page }>({
+  pageAdmin: async ({ browser }, use) => {
+    const context = await browser.newContext({ storageState: 'playwright/.auth/admin.json' })
+    await use(await context.newPage())   // hand the admin's page to the test
+    await context.close()                // teardown — the test never closes it
+  },
+})
+```
+
+A test now takes `{ page, pageAdmin }` — `page` is the customer (the default fixture), `pageAdmin` is the admin — with no setup or teardown noise in the test body.
+
+## Same context or a new one?
+
+- **New tab, same user** (a link that opens a window, a multi-tab flow) → the popup event or `context.newPage()`.
+- **Different user, role, or session** → `browser.newContext()`. Isolation is the whole point — never share one context between two users.
+
+## Drive multiple users with the agent CLI
+
+The agent CLI's **named sessions** are the CLI mirror of contexts: each `-s=<name>` is an isolated browser — its own cookies and storage — so two sessions are two independent users, exactly like two contexts ([debugging.md](./debugging.md)).
+
+```bash
+playwright-cli -s=customer open https://danube-web.shop/checkout
+playwright-cli -s=admin    open https://danube-web.shop/admin/orders
+playwright-cli -s=customer click e12             # the customer places the order
+playwright-cli -s=admin    snapshot              # verify it shows up in the admin view
+playwright-cli list                              # the active sessions
+```
+
+Each session can load its own saved auth (`state-load`) to act as a different role — the CLI counterpart of `newContext({ storageState })` ([auth.md](./auth.md)). `close-all` tears them down.
+
+## Deeper in the docs
+
+- [Handling multiple tabs](https://www.checklyhq.com/learn/playwright/multitab-flows/)
+- [Playwright: Pages & popups](https://playwright.dev/docs/pages)
+- [Playwright: Browser contexts](https://playwright.dev/docs/browser-contexts)
+- [Playwright: Multiple sign-in roles](https://playwright.dev/docs/auth#testing-multiple-roles-together)
diff --git a/skills/playwright-best-practices-for-agents/references/network.md b/skills/playwright-best-practices-for-agents/references/network.md
new file mode 100644
index 00000000..ff0b3d00
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/network.md
@@ -0,0 +1,104 @@
+# Network
+
+Playwright lets you observe, block, mock, and replay HTTP traffic, and drive APIs directly. Reach for mocking to **isolate the frontend or pin down dynamic data** — not to paper over the flow you're actually testing (see [Don't over-mock](#dont-over-mock)).
+
+## Observe traffic
+
+Listen to requests and responses as they happen:
+
+```ts
+page.on('request', req => console.log('→', req.method(), req.url()))
+page.on('response', res => console.log('←', res.status(), res.url()))
+```
+
+Prefer waiting on the **UI change** a request causes — a web-first assertion on the element that updates is the most reliable signal. Reach for `page.waitForResponse` only when there's no visible change to assert on (e.g. a background fetch or fire-and-forget call). Never a hard wait. See [waiting.md](./waiting.md).
+
+**As an agent**, skip wiring up listeners: drive the flow with [`playwright-cli`](./debugging.md) and run `playwright-cli network` to watch requests and responses live. After adding a `route` mock, `playwright-cli snapshot` confirms the stubbed response actually produced the expected UI — verify the mock fired, don't assume it did.
+
+## Mock with `page.route`
+
+`page.route(urlPattern, handler)` intercepts matching requests before they leave the browser. The handler decides what happens. Glob patterns like `*/**/api/books/*` match regardless of host/port.
+
+**Patch one field, keep the rest real** — fetch the real response, edit it, fulfill:
+
+```ts
+await page.route('*/**/api/books/23', async route => {
+  const response = await route.fetch()
+  const json = await response.json()
+  json.stock = '12'                          // pin the dynamic value
+  await route.fulfill({ response, json })     // reuse real response, override body
+})
+```
+
+**Replace the whole payload** — no real call, so the test is faster and independent of the backend:
+
+```ts
+await page.route('*/**/api/books/23', route =>
+  route.fulfill({ json: { id: 23, title: 'Achilles', stock: '12' } }),
+)
+```
+
+**Block or pass through:**
+
+- `route.abort()` — drop the request (e.g. block images/trackers to speed up scraping).
+- `route.continue()` — let it proceed; pass `{ headers, postData, url }` to rewrite the outgoing request.
+- `route.fallback()` — defer to the next matching handler (handlers run last-registered first).
+
+```ts
+await page.route('**/*', route =>
+  route.request().resourceType() === 'image' ? route.abort() : route.continue(),
+)
+```
+
+## Record & replay with HAR
+
+When a flow depends on a slow or flaky third-party API, record its traffic once to a HAR file and replay it deterministically. Record with `update: true`, then commit the HAR and replay on every run:
+
+```ts
+// Record: hits the network and writes responses to the HAR
+await page.routeFromHAR('./hars/books.har', { url: '*/**/api/**', update: true })
+
+// Replay: serve matching requests from the HAR, no network
+await page.routeFromHAR('./hars/books.har', { url: '*/**/api/**' })
+```
+
+- `update: true` (re)records; omit it to replay.
+- `url` scopes which requests the HAR handles — leave the rest live.
+- `notFound: 'abort' | 'fallback'` controls unmatched requests (default `abort`).
+- `context.routeFromHAR` does the same for every page in a context. You can also capture a HAR from the CLI: `npx playwright open --save-har=books.har --save-har-glob="**/api/**" <url>`.
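+
+Rather than editing the spec to switch between the two calls, the mode can be derived from the environment. A sketch, assuming a hypothetical `UPDATE_HAR` variable and helper name `harOptions`; the returned object uses only the `url`, `update`, and `notFound` options listed above:
+
+```ts
+type HarOptions = { url: string; update: boolean; notFound: 'abort' | 'fallback' }
+
+// Record when UPDATE_HAR=1 is set, otherwise replay from the committed file.
+export function harOptions(
+  env: Record<string, string | undefined>,
+  url = '*/**/api/**',
+): HarOptions {
+  const update = env.UPDATE_HAR === '1'
+  // 'abort' makes a request missing from the recording fail loudly during replay.
+  return { url, update, notFound: 'abort' }
+}
+```
+
+A spec would call `await page.routeFromHAR('./hars/books.har', harOptions(process.env))`, and re-recording becomes `UPDATE_HAR=1 npx playwright test`.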
+
+## API testing with the `request` fixture
+
+The `request` fixture is an [`APIRequestContext`](https://playwright.dev/docs/api/class-apirequestcontext) — a clean HTTP client with no browser. Use it to test/seed APIs directly:
+
+```ts
+test('books API responds', async ({ request }) => {
+  const res = await request.post('https://api.example.com/graphql', {
+    headers: { Authorization: `Bearer ${process.env.API_TOKEN!}` },
+    data: { query: '{ books { id } }' },
+  })
+  await expect(res).toBeOK()   // toBeOK() is async and must be awaited
+  expect((await res.json()).data.books).toHaveLength(3)
+})
+```
+
+The same client can provision test state through the API in setup (faster and less flaky than the UI) and persist auth state; see [auth.md](./auth.md) and [test-structure.md](./test-structure.md).
+
+## Don't over-mock
+
+Mocking isolates the frontend and stabilizes dynamic data, but every mocked call is a call you're **no longer testing**. For end-to-end coverage — and especially when the same tests run as production monitors — exercise the real stack:
+
+- Block images/CDNs and you won't catch their outages.
+- Mock `/checkout` and you'll miss a broken payment flow.
+- Stub third-parties and a tracking script that breaks layout slips through.
+
+Mock to isolate a component or fix a value for a deterministic assertion; keep the path under test real.
+
+## Deeper in the docs
+
+- [Mocking API responses](https://www.checklyhq.com/learn/playwright/mock-api/)
+- [Intercepting requests](https://www.checklyhq.com/learn/playwright/intercept-requests/)
+- [Testing APIs with Playwright](https://www.checklyhq.com/learn/playwright/testing-apis/)
+- [Detecting broken links](https://www.checklyhq.com/learn/playwright/how-to-detect-broken-links/)
+- [Playwright: Mock APIs (incl. HAR)](https://playwright.dev/docs/mock)
+- [Playwright: API testing](https://playwright.dev/docs/api-testing)
diff --git a/skills/playwright-best-practices-for-agents/references/tags-annotations.md b/skills/playwright-best-practices-for-agents/references/tags-annotations.md
new file mode 100644
index 00000000..35e33b10
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/tags-annotations.md
@@ -0,0 +1,75 @@
+# Tags & annotations
+
+Annotations change **whether or how** a single test runs; tags **group** tests so you can run a slice of the suite. Both keep a large suite navigable.
+
+## Annotations — skip, fixme, fail, slow
+
+| Annotation | Meaning |
+|---|---|
+| `test.skip()` | Don't run — not applicable here (e.g. a feature absent on one browser). |
+| `test.fixme()` | Known broken; don't run, and don't pretend it passes. A tracked TODO. |
+| `test.fail()` | Assert the test **currently** fails — flips to a failure the day the bug is fixed, so you notice. |
+| `test.slow()` | This test is legitimately slow; triples its timeout instead of a blanket bump. |
+
+Call them conditionally (so the suite self-documents *why*), or unconditionally inside a test:
+
+```ts
+import { test, expect } from '@playwright/test'
+
+// Conditional, at definition time
+test.skip(({ browserName }) => browserName === 'webkit', 'export not on Safari yet')
+
+test('admin export', async ({ page }) => {
+  test.slow()   // big report — needs the extra budget, just for this test
+  // …
+})
+```
+
+A `skip`/`fixme` is a tracked TODO, not a parking lot — and never a way to hide a flake. Fix the cause instead ([flakiness.md](./flakiness.md)).
+
+## `test.only` is a local tool
+
+`test.only` runs just that test while you iterate. Set `forbidOnly: true` in CI so a stray `.only` can't silently shrink the suite to one test ([config.md](./config.md)).
+
+## Tags — group and filter
+
+Tag tests, then run a slice. Use the metadata form (it keeps titles clean):
+
+```ts
+test('checkout', { tag: '@smoke' }, async ({ page }) => { /* … */ })
+
+test.describe('billing', { tag: ['@slow', '@billing'] }, () => { /* … */ })
+```
+
+Filter a run with `--grep` / `--grep-invert`:
+
+```sh
+npx playwright test --grep @smoke              # only smoke tests
+npx playwright test --grep-invert @slow        # everything except slow ones
+npx playwright test --grep "@smoke|@billing"   # either tag
+```
+
+`--grep` matches against the test title **and** its tags. Keep a small, stable vocabulary (`@smoke`, `@slow`, `@quarantine`) rather than ad-hoc labels nobody filters on.
+
+## Steps for readable reports
+
+`test.step('label', async () => { … })` groups actions into labelled phases so a report or trace points at a phase, not a flat action list — reach for it on long flows. Full detail (and the `@step` decorator for Page Objects) is in [test-structure.md](./test-structure.md).
+
+## Custom annotations (metadata)
+
+Attach context that surfaces in the report — an issue link, a reason:
+
+```ts
+test('quarantined upload',
+  { annotation: { type: 'issue', description: 'https://github.com/acme/app/issues/42' } },
+  async ({ page }) => { /* … */ },
+)
+```
+
+Push them dynamically too: `test.info().annotations.push({ type: 'perf', description: 'slow on staging' })`.
+
+## Deeper in the docs
+
+- [Playwright: Annotations](https://playwright.dev/docs/test-annotations)
+- [Playwright: Tags](https://playwright.dev/docs/test-annotations#tag-tests)
+- [Playwright: Command line (`--grep`)](https://playwright.dev/docs/test-cli)
diff --git a/skills/playwright-best-practices-for-agents/references/test-data.md b/skills/playwright-best-practices-for-agents/references/test-data.md
new file mode 100644
index 00000000..b5dd2777
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/test-data.md
@@ -0,0 +1,74 @@
+# Test data
+
+Isolated tests need their own data. Generate it per test (or per worker), seed it through the API, and clean up what you create — never let two tests share a mutable record.
+
+## Make data unique
+
+Parallel workers that reuse the same record race on it — one deletes the row another is asserting on. Give each test data that can't collide: a timestamp, a UUID, or the worker index.
+
+```ts
+import { test } from '@playwright/test'
+
+test('creates an order', async ({ page }, testInfo) => {
+  const email = `user-${testInfo.workerIndex}-${Date.now()}@example.test`
+  // …provision and use `email`; no other worker can generate the same one
+})
+```
+
+`testInfo.workerIndex` (and `parallelIndex`) identify the worker; combine one with a timestamp or UUID so values stay unique across tests and across runs. For realistic-looking values use a generator like [`@faker-js/faker`](https://fakerjs.dev/), but keep the value deterministic (a fixed literal or a seeded generator) when an assertion depends on it.
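+
+A UUID removes any dependence on clock resolution. A minimal sketch, assuming Node's built-in `randomUUID`; `uniqueEmail` is a hypothetical helper name:
+
+```ts
+import { randomUUID } from 'node:crypto'
+
+// Builds an address that is unique per call; the worker index is kept so logs show who created it.
+export function uniqueEmail(workerIndex: number, prefix = 'user'): string {
+  return `${prefix}-w${workerIndex}-${randomUUID()}@example.test`
+}
+```
+
+Inside a test this would read `const email = uniqueEmail(testInfo.workerIndex)`.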
+
+## Build with factories, not inline literals
+
+A factory returns a valid entity with sensible defaults and lets each test override only the field under test. Specs stay readable, and a schema change touches one place instead of every test.
+
+```ts
+type User = { name: string; email: string; plan: 'free' | 'pro' }
+
+export const makeUser = (overrides: Partial<User> = {}): User => ({
+  name: 'Test User',
+  email: `user-${Date.now()}@example.test`,
+  plan: 'free',
+  ...overrides,
+})
+
+const proUser = makeUser({ plan: 'pro' })   // only the bit that matters is explicit
+```
+
+Factories pair naturally with **fixtures**: a fixture calls the factory and hands the test a ready-made entity — and can tear it down afterwards (see the next section, and [test-structure.md](./test-structure.md) for fixtures generally).
+
+## Seed and tear down via the API
+
+Provision state with the `request` fixture in setup — faster and less flaky than clicking through the UI (see [network.md](./network.md), [test-structure.md](./test-structure.md)). Wrap create + delete in a fixture so every test gets fresh data and cleans up after itself.
+
+```ts base.ts
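+import { test as base } from '@playwright/test'
+import { makeOrder, type Order } from './factories'   // hypothetical module holding the Order type and its factory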
+
+export const test = base.extend<{ order: Order }>({
+  order: async ({ request }, use) => {
+    const res = await request.post('/api/orders', { data: makeOrder() })
+    const order = await res.json()
+    await use(order)                                 // hand it to the test
+    await request.delete(`/api/orders/${order.id}`)  // teardown — runs even if the test failed
+  },
+})
+export { expect } from '@playwright/test'
+```
+
+Code after `await use(...)` runs whether the test passed or failed, so a failure can't leak data into the next run.
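+
+The fixture above calls `makeOrder()` without showing it. A sketch of what it could look like, following the `makeUser` pattern; the `Order` fields here are assumptions for illustration, not a real schema:
+
+```ts
+// Hypothetical order shape; `id` is assigned by the API, so the factory omits it.
+export type Order = { id: string; sku: string; quantity: number; email: string }
+export type NewOrder = Omit<Order, 'id'>
+
+let seq = 0   // per-process counter so two calls in the same millisecond still differ
+
+export const makeOrder = (overrides: Partial<NewOrder> = {}): NewOrder => ({
+  sku: 'BOOK-001',
+  quantity: 1,
+  email: `buyer-${Date.now()}-${seq++}@example.test`,
+  ...overrides,
+})
+```
+
+A test that cares about quantity would pass `makeOrder({ quantity: 3 })` and leave every other field at its default.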
+
+## Share read-only data per worker
+
+Expensive setup that every test only *reads* — a seeded product catalog, a reference account — can be a **worker-scoped** fixture (`{ scope: 'worker' }`), created once per worker and reused. Keep it read-only; mutating shared fixture state reintroduces the races you split the data to avoid ([flakiness.md](./flakiness.md)).
+
+## Anti-patterns
+
+- A hardcoded shared account or row (`user1`) that tests mutate — the classic source of order-dependent flakes.
+- Relying on data a previous test left behind. Each test provisions its own.
+- Random values where the assertion needs a known one — make *those* deterministic.
+
+## Deeper in the docs
+
+- [Handling test data](https://www.checklyhq.com/learn/playwright/handling-test-data/)
+- [Testing APIs with Playwright](https://www.checklyhq.com/learn/playwright/testing-apis/)
+- [Playwright: API testing](https://playwright.dev/docs/api-testing)
+- [Playwright: `TestInfo` (`workerIndex`, `parallelIndex`)](https://playwright.dev/docs/api/class-testinfo)
diff --git a/skills/playwright-best-practices-for-agents/references/test-structure.md b/skills/playwright-best-practices-for-agents/references/test-structure.md
new file mode 100644
index 00000000..a49fde65
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/test-structure.md
@@ -0,0 +1,68 @@
+# Test structure
+
+How to organize tests, share setup, and configure runs so a suite stays fast, isolated, and maintainable.
+
+## Test design principles
+
+- **Short & focused** — one feature per test. If a test's assertions span two features (checkout *and* coupons), split it so a failure points at one thing.
+- **Independent** — no test depends on another running first. Each provisions the state it needs, so tests run in any order and in parallel.
+- **Set up via API, not the UI** — create users/data through API calls in setup/teardown; it's faster and less flaky than clicking through forms. Only drive the UI for the behavior actually under test.
+
+## Fixtures
+
+Fixtures provide per-test state and replace copy-pasted setup. Built-ins: `page`, `context`, `browser`, `browserName`, `request`.
+
+Define custom fixtures with `test.extend`. They're **lazy** — the setup runs only for tests that request the fixture. Code before `await use(value)` is setup; code after is teardown.
+
+```ts base.ts
+import { test as base, type Page } from '@playwright/test'
+import { login } from './helpers' // assumed location of your own login helper
+
+export const test = base.extend<{ webApp: Page }>({
+  webApp: async ({ page }, use) => {
+    await login(page)        // setup
+    await use(page)          // hand the value to the test
+    // teardown after the test, if any
+  },
+})
+export { expect } from '@playwright/test'
+```
+
+Import `test` from your `base.ts` in spec files to share fixtures across the suite. For setup that must run for **every** test (like global hooks), make it an **automatic fixture** with `{ auto: true }` instead of repeating `beforeEach` in every file.
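+
+As an illustration, an automatic fixture might look like the sketch below. The fixture name `failOnConsoleError` and the policy of failing on any logged console error are assumptions chosen for the example, not something this guide prescribes:
+
+```ts
+import { test as base, expect } from '@playwright/test'
+
+export const test = base.extend<{ failOnConsoleError: void }>({
+  failOnConsoleError: [
+    async ({ page }, use) => {
+      const errors: string[] = []
+      page.on('console', (msg) => {
+        if (msg.type() === 'error') errors.push(msg.text())
+      })
+      await use() // the test body runs here
+      expect(errors).toEqual([]) // teardown: fail the test if anything was logged
+    },
+    { auto: true }, // runs for every test, even ones that never name the fixture
+  ],
+})
+```
+
+Because this fixture requests `page`, every test that uses this `test` object gets a page created, including tests that would otherwise only call the API.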
+
+## Page Object Model
+
+For larger suites, wrap a page's locators and interactions in a class so selectors live in one place and tests read at a higher level. A page object holds the `page`, declares its `Locator`s in the constructor, and exposes action (and optionally assertion) methods.
+
+```ts login-page.ts
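+import type { Locator, Page } from '@playwright/test'
+
+export class LoginPage {
+  readonly emailInput: Locator
+  readonly passwordInput: Locator
+  readonly submit: Locator
+
+  constructor(private readonly page: Page) {
+    this.emailInput = page.getByLabel('Email')
+    this.passwordInput = page.getByLabel('Password') // assumed label text
+    this.submit = page.getByRole('button', { name: 'Sign in' })
+  }
+  async login(email: string, password: string) {
+    await this.emailInput.fill(email)
+    await this.passwordInput.fill(password)
+    await this.submit.click()
+  }
+}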
+```
+
+```ts test
+const login = new LoginPage(page)
+await login.login(process.env.USER_EMAIL!, process.env.USER_PASSWORD!)
+```
+
+Reach for POM when locators/flows repeat across many tests; skip it for a handful of simple specs. Fixtures and POM compose well — a fixture can hand a ready-made page object to tests.
+
+For `playwright.config.ts`, projects, `baseURL`, devices, and setup `dependencies`, see [config.md](./config.md).
+
+## Readable steps
+
+Group actions with `test.step('label', async () => { ... })` so reports and traces show meaningful phases instead of a flat action list. The same labels can be applied to Page Object methods automatically with a TypeScript `@step` decorator.
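+
+One way to write such a decorator, sketched with TypeScript 5 standard decorators. The `makeStep` factory and the `StepRunner` type are names invented here; the step runner is injected so the sketch stays self-contained, and in a real suite you would pass Playwright's `test.step`:
+
+```ts
+// Anything with the shape of Playwright's `test.step(title, body)`.
+type StepRunner = <T>(title: string, body: () => Promise<T>) => Promise<T>
+
+// Builds a method decorator that wraps each call in a step
+// titled "ClassName.methodName".
+export function makeStep(runStep: StepRunner) {
+  return function step<This extends object, Args extends unknown[], R>(
+    target: (this: This, ...args: Args) => Promise<R>,
+    context: ClassMethodDecoratorContext<This, (this: This, ...args: Args) => Promise<R>>,
+  ) {
+    return function (this: This, ...args: Args): Promise<R> {
+      const title = `${this.constructor.name}.${String(context.name)}`
+      return runStep(title, () => target.apply(this, args))
+    }
+  }
+}
+
+// Assumed wiring in a suite:
+//   import { test } from '@playwright/test'
+//   export const step = makeStep((title, body) => test.step(title, body))
+//
+//   class LoginPage {
+//     @step
+//     async login(email: string, password: string) { /* ... */ }
+//   }
+```
+
+With that wiring, a call to `loginPage.login(...)` would show up in the report as a `LoginPage.login` step without any `test.step` call in the spec itself.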
+
+## Deeper in the docs
+
+- [Best practices for writing tests](https://www.checklyhq.com/learn/playwright/writing-tests/)
+- [The testing pyramid](https://www.checklyhq.com/learn/playwright/testing-pyramid/)
+- [Custom test fixtures](https://www.checklyhq.com/learn/playwright/test-fixtures/)
+- [Self-documenting tests with step decorators](https://www.checklyhq.com/learn/playwright/steps-decorators/)
+- [Playwright: Page Object Model](https://playwright.dev/docs/pom)
diff --git a/skills/playwright-best-practices-for-agents/references/visual.md b/skills/playwright-best-practices-for-agents/references/visual.md
new file mode 100644
index 00000000..d6d22237
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/visual.md
@@ -0,0 +1,79 @@
+# Visual regression
+
+Pixel-compare a page or element against a committed baseline to catch unintended visual change. The hard part isn't the assertion — it's keeping renders deterministic, so a diff means a real change and not noise.
+
+## `toHaveScreenshot` — the visual assertion
+
+Auto-retrying: it takes screenshots until two consecutive ones match (waiting out late layout), then compares the result against the baseline. On the first run there is no baseline yet, so Playwright writes one and reports the test as failed; later runs compare against it.
+
+```ts
+await expect(page).toHaveScreenshot()                                  // viewport
+await expect(page).toHaveScreenshot({ fullPage: true })
+await expect(page.getByRole('article')).toHaveScreenshot('card.png')   // one element
+```
+
+Prefer it over the older `toMatchSnapshot` for anything rendered — `toMatchSnapshot` is for arbitrary buffers or text (a downloaded file, a JSON blob), not pages.
+
+## Structure over pixels — `toMatchAriaSnapshot`
+
+Pixel screenshots are powerful but brittle: any restyle diffs them, and baselines are platform-bound (below). When you care about **structure and content** — the right headings, items, and controls, in the right order — assert the accessibility tree instead. It's auto-retrying like any web-first matcher, stable across platforms, and produces a readable diff:
+
+```ts
+await expect(page.getByRole('main')).toMatchAriaSnapshot(`
+  - heading "Checkout" [level=1]
+  - list:
+    - listitem: "Coffee — €4.00"
+    - listitem: "Tea — €3.50"
+  - button "Pay"
+`)
+```
+
+Store the expected tree in a file with the `{ name }` form and regenerate it with `--update-snapshots`, the same workflow as screenshots — but without the platform-specific baseline problem.
+
+**Agent tie-in:** this YAML *is* the accessibility tree that `playwright-cli snapshot` prints and that Playwright writes to `error-context.md` on failure ([debugging.md](./debugging.md)) — capture it from a driven session and paste it straight in as the expected value.
+
+## Baselines are platform-specific — generate them where CI runs
+
+Browsers render differently across operating systems (fonts, anti-aliasing), so Playwright suffixes each baseline with the project name and the platform, e.g. `card-chromium-darwin.png`. A baseline captured on your Mac does not help a Linux CI runner: CI looks for `card-chromium-linux.png` and fails when it is missing (or diffs on rendering noise if you remove the suffix with `snapshotPathTemplate`). Generate and commit baselines on the same platform CI uses: run the update inside the [Playwright Docker image](https://playwright.dev/docs/docker) or in the CI job itself. This is the #1 cause of "passes locally, fails in CI" for visual tests.
+
+## Stabilize the render
+
+A visual test is only useful if a diff means a real change. Remove the noise:
+
+- **Animations** — disabled by default (`animations: 'disabled'`); keep that default.
+- **Dynamic content** — `mask` the regions that legitimately change (dates, avatars, ads):
+  ```ts
+  await expect(page).toHaveScreenshot({ mask: [page.getByTestId('timestamp')] })
+  ```
+- **Fonts / late layout** — the auto-retry waits for a stable shot, but make sure web fonts have loaded first (assert some text is visible before the screenshot).
+- **Tolerance** — absorb sub-pixel anti-aliasing noise with `maxDiffPixels` / `maxDiffPixelRatio` (prefer these to a blanket `threshold`):
+  ```ts
+  await expect(page).toHaveScreenshot({ maxDiffPixels: 100 })
+  ```
+
+Set suite-wide defaults in config: `expect: { toHaveScreenshot: { maxDiffPixels: 100 } }`.
+
+## Updating baselines
+
+When a visual change is intentional, regenerate the baselines and **review the new images before committing** — an unreviewed update defeats the whole point:
+
+```sh
+npx playwright test --update-snapshots
+```
+
+Commit the `*-snapshots/` files in the same change that caused the visual difference.
+
+## Reading a visual diff (agent)
+
+On a mismatch Playwright writes `…-actual.png`, `…-expected.png`, and `…-diff.png` into `test-results/` (and attaches them to the report). Open the `-diff.png` to see exactly what moved — you can view the image directly, no trace GUI needed. → [debugging.md](./debugging.md)
+
+## Plain screenshots ≠ visual assertions
+
+`page.screenshot({ path: 'shot.png' })` just captures an image — useful as a one-off debug artifact, but it asserts nothing. Don't confuse it with `toHaveScreenshot`, which compares against a committed baseline and fails on a diff.
+
+## Deeper in the docs
+
+- [Taking & automating screenshots](https://www.checklyhq.com/learn/playwright/taking-screenshots/)
+- [Playwright: Visual comparisons](https://playwright.dev/docs/test-snapshots)
+- [Playwright: `toHaveScreenshot`](https://playwright.dev/docs/api/class-locatorassertions#locator-assertions-to-have-screenshot)
+- [Playwright: Docker](https://playwright.dev/docs/docker)
diff --git a/skills/playwright-best-practices-for-agents/references/waiting.md b/skills/playwright-best-practices-for-agents/references/waiting.md
new file mode 100644
index 00000000..3a783f37
--- /dev/null
+++ b/skills/playwright-best-practices-for-agents/references/waiting.md
@@ -0,0 +1,70 @@
+# Waiting
+
+Never hard-wait. Hard waits are the most common cause of flaky Playwright tests — they are always either too short (test fails early) or too long (test wastes time), and fluctuating load times make them fail randomly.
+
+```ts
+// BAD
+await page.waitForTimeout(1000)
+await page.getByRole('button', { name: 'Login' }).click()
+```
+
+## Trust auto-waiting actions
+
+Actions (`click`, `fill`, `selectOption`, …) wait for the element to be actionable before acting, retrying until the relevant checks pass or the timeout elapses (then they throw `TimeoutError`). No wait statement needed:
+
+```ts
+await page.getByRole('button', { name: 'Login' }).click()
+```
+
+The locator must first resolve to **exactly one** element (strict mode). The actionability checks themselves:
+
+- **visible** — has a non-empty bounding box and no `visibility: hidden`.
+- **stable** — same bounding box for two consecutive animation frames (not animating).
+- **receives events** — it's the hit target at the action point (not covered by an overlay).
+- **enabled** — no `[disabled]`, disabled `<fieldset>`, or `[aria-disabled]`.
+- **editable** — enabled and not `[readonly]`.
+
+Different actions run different checks. `click` waits for visible, stable, receives events, and enabled; it does not check editable. `fill` waits for visible, enabled, and editable. The exact list per action is in the [official actionability table](https://playwright.dev/docs/actionability#actionability).
+
+A few low-level calls (`focus`, `press`, and `dispatchEvent`) run **no actionability checks**: once the locator resolves to an element they fire, even if that element is invisible, animating, or covered. Use them only when you deliberately want to bypass the checks (e.g. dispatching a synthetic event a real user couldn't trigger); for normal interactions prefer the real action so you keep auto-waiting.
+
+Use `{ force: true }` only as a last resort: it skips the non-essential checks (e.g. receives-events), so a click can land on a covered or wrong element and hide a real bug.
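+
+Conceptually, auto-waiting is a retry loop with a deadline rather than a fixed sleep. The sketch below is not Playwright's implementation: `retryUntil`, its 50 ms polling interval, and the local `TimeoutError` class are invented for illustration. It shows why a retrying wait returns as soon as the condition holds, while a hard wait always costs its full duration:
+
+```ts
+class TimeoutError extends Error {}
+
+const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))
+
+// Poll `check` until it returns true or `timeoutMs` elapses.
+async function retryUntil(
+  check: () => boolean | Promise<boolean>,
+  timeoutMs = 5_000,
+  intervalMs = 50,
+): Promise<void> {
+  const deadline = Date.now() + timeoutMs
+  while (true) {
+    if (await check()) return // condition met: stop waiting immediately
+    if (Date.now() >= deadline) throw new TimeoutError(`not ready after ${timeoutMs}ms`)
+    await sleep(intervalMs)
+  }
+}
+
+// Simulated element that becomes "actionable" 120 ms after the wait starts.
+async function demo(): Promise<number> {
+  const start = Date.now()
+  const readyAt = start + 120
+  await retryUntil(() => Date.now() >= readyAt)
+  return Date.now() - start // elapsed time in milliseconds
+}
+```
+
+Here `demo()` resolves shortly after the 120 ms mark, whereas `waitForTimeout(1000)` would spend the full second, and `waitForTimeout(100)` would act too early.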
+
+## Wait for state with web-first assertions
+
+To wait for a state change, assert it — the matcher auto-waits:
+
+```ts
+await expect(page.getByRole('button', { name: 'Login' })).toBeDisabled()
+await expect(page.getByRole('alert')).toBeHidden()
+```
+
+Prefer the retrying web-first matcher over the one-shot locator method: `await expect(loc).toBeVisible()` waits for the state, while `await loc.isVisible()` only samples the current moment and invites flakiness.
+
+## Explicit waits — only when you must
+
+For navigation/network, not elements:
+
+- `page.waitForURL('**/login')` — wait for a navigation.
+- `page.waitForLoadState()` — defaults to `load`; can take `domcontentloaded`.
+- `page.waitForResponse(url)` / `page.waitForRequest(url)` — set up the promise *before* the action that triggers it.
+- `page.waitForEvent('popup')` — new windows/tabs.
+- `page.waitForFunction(fn)` — last resort for arbitrary in-page state.
+
+```ts
+const responsePromise = page.waitForResponse('**/api/login')
+await page.getByRole('button', { name: 'Login' }).click()
+await responsePromise
+```
+
+Avoid `networkidle` (in `waitForLoadState`/`waitForURL`) — it's discouraged and racy. Wait for app state instead.
+
+## Find the right wait with the agent CLI
+
+When you're unsure what a flow actually waits on, watch it instead of guessing: drive the page with `playwright-cli`, then `playwright-cli network --filter=<path>` shows which request gates the UI — the one to `waitForResponse` — and a `playwright-cli snapshot` after it confirms the state arrived. You end up waiting for app state, never a timer ([debugging.md](./debugging.md)).
+
+## Deeper in the docs
+
+- [Waits and timeouts](https://www.checklyhq.com/learn/playwright/waits-and-timeouts/)
+- [Navigation](https://www.checklyhq.com/learn/playwright/navigation/)
+- [Actionability](https://playwright.dev/docs/actionability)