
Add CI integration for ui-test skill #70

Draft
shubh24 wants to merge 1 commit into main from shubh24/ui-test-ci-plan

Conversation


@shubh24 shubh24 commented Apr 4, 2026

Summary

  • Adds skills/ui-test/ci/ with a GitHub Actions workflow + wrapper script for running adversarial UI testing on every PR that touches frontend files
  • Users copy ui-test.yml to .github/workflows/, add two secrets (ANTHROPIC_API_KEY, BROWSERBASE_API_KEY), and PRs automatically get tested
  • Tested locally against a Cal.com instance on localhost:3000 — found 2 a11y bugs (meta-viewport, label/input ID mismatch), passed 11/15 tests

Files

| File | Purpose |
| --- | --- |
| `ci/run-ui-test.sh` | Wrapper — builds prompt from git diff, pipes to `claude --print`, writes summary |
| `ci/ui-test.yml` | Drop-in GitHub Actions workflow (Vercel preview detection, PR comment, artifact upload) |
| `ci/README.md` | Setup instructions + local testing docs |

Key design decisions

  • --dangerously-skip-permissions required for headless CI (no TTY to approve tool calls)
  • --local flag for testing the CI flow against localhost without needing a PR/diff
  • browse env local vs browse env remote chosen automatically based on context
  • Light mode (2 agents × 20 steps) by default to keep cost ~$0.50–$2/run
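The diff-to-prompt flow described above can be sketched roughly as follows. This is a hypothetical simplification, not the PR's actual script: the file globs and prompt wording are assumptions, and only the two flags named above are taken from the PR.

```shell
# Hypothetical sketch of run-ui-test.sh's core flow: build a diff-scoped
# prompt and pipe it to `claude --print`.
build_prompt() {
  local diff
  # File globs are illustrative; the real script derives scope from the PR diff.
  diff="$(git diff --unified=0 origin/main...HEAD -- '*.ts' '*.tsx' '*.css' 2>/dev/null || true)"
  printf 'Adversarially test the UI affected by this diff:\n\n%s\n' "$diff"
}

run_tests() {
  # --dangerously-skip-permissions is needed in headless CI (no TTY to approve tool calls).
  build_prompt | claude --print --dangerously-skip-permissions
}
```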

Test plan

  • Ran locally against Cal.com on localhost:3000 — produced structured summary with pass/fail/skip counts
  • Test on a real PR with Vercel preview deployment

🤖 Generated with Claude Code


Note

Medium Risk
Introduces new CI automation that runs third-party tooling (claude, browse) with broad permissions and external secrets, so misconfig or prompt/tooling behavior could affect CI reliability and cost.

Overview
Adds a copy-paste GitHub Actions workflow (skills/ui-test/ci/ui-test.yml) that gates on UI file changes, waits for a preview deploy (Vercel by default), runs agent-driven UI tests, posts or updates a PR comment, and uploads the HTML report and screenshots as artifacts.

Includes a new run-ui-test.sh wrapper to validate the preview URL, build a diff-scoped prompt (or --local full-app mode), invoke claude --print with an allowed tool set, and convert the run into a CI pass/fail via .context/ui-test-exit-code.

Adds skills/ui-test/ci/README.md with setup, configuration, and local test instructions.

Reviewed by Cursor Bugbot for commit 734b241. Bugbot is set up for automated code reviews on this repo.

Adds a GitHub Actions workflow and wrapper script so users can run
adversarial UI testing on every PR that touches frontend files.

- run-ui-test.sh: builds prompt from git diff, pipes to claude --print
- ui-test.yml: drop-in workflow (Vercel preview, PR comment, artifacts)
- --local flag for testing the CI flow against localhost
- --dangerously-skip-permissions for headless CI (no TTY)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@shubh24 shubh24 marked this pull request as draft April 4, 2026 00:48

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 4 potential issues.


echo "======================================="
cat .context/ui-test-summary.md
echo ""
echo "Exit code: $(cat .context/ui-test-exit-code)"

Script never exits with the test result code

High Severity

The script writes the test exit code to .context/ui-test-exit-code and prints it, but never actually calls exit with that value. The last command is an echo, so the script always exits 0 when Claude runs to completion — even if tests failed. For local usage (documented in the README), this silently hides failures. In CI, the separate "Check pass rate" step partially compensates, but the "Run UI tests" step itself will misleadingly show as green.
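A minimal fix is to read the recorded code back and exit with it. Sketched here as a function so the behavior is visible; in run-ui-test.sh itself the last line would simply be `exit "$(cat .context/ui-test-exit-code)"`.

```shell
# Sketch: propagate the recorded test result as the script's exit status,
# defaulting to failure (1) if the file is missing or empty.
finish_with_recorded_code() {
  local code
  code="$(cat .context/ui-test-exit-code 2>/dev/null || echo 1)"
  return "${code:-1}"
}

mkdir -p .context
echo 3 > .context/ui-test-exit-code   # simulate a failed run
finish_with_recorded_code || echo "script would exit with $?"
```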


@@ -0,0 +1,154 @@
#!/usr/bin/env bash
set -euo pipefail

pipefail skips cleanup when Claude command fails

Medium Severity

With set -euo pipefail, if the claude --print command exits non-zero (e.g., bad API key, network failure, timeout), the entire script aborts at the pipeline on line 126. This skips the entire post-run section: browse session cleanup (browse stop, pkill), default exit-code file creation, and default summary file creation. Without the exit-code file, the workflow's "Check pass rate" step silently exits 0, masking the failure.

Additional Locations (2)
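A common pattern that keeps `set -euo pipefail` while guaranteeing the post-run section runs is an EXIT trap. The sketch below simulates the failing pipeline with `false`; the actual cleanup commands in the PR (`browse stop`, pkill, default summary creation) are only referenced in comments.

```shell
# Sketch: an EXIT trap fires even when a later pipeline fails under set -e,
# so the default exit-code file always gets written.
rm -rf .context && mkdir -p .context

bash -c '
  set -euo pipefail
  cleanup() {
    # Real script: browse stop, pkill, default summary file, etc.
    [ -f .context/ui-test-exit-code ] || echo 1 > .context/ui-test-exit-code
  }
  trap cleanup EXIT
  false   # stand-in for a failing "claude --print" pipeline
  echo "never reached"
' || echo "pipeline failed; recorded exit code: $(cat .context/ui-test-exit-code)"
```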

BROWSERBASE_API_KEY: ${{ secrets.BROWSERBASE_API_KEY }}
PREVIEW_URL: ${{ needs.wait-for-preview.outputs.preview_url }}
UI_TEST_MODE: ${{ vars.UI_TEST_MODE || 'light' }}
UI_TEST_MAX_TOKENS: ${{ vars.UI_TEST_MAX_TOKENS || '100000' }}

UI_TEST_MAX_TOKENS variable is defined but never used

Medium Severity

UI_TEST_MAX_TOKENS is set as an environment variable in the workflow and documented in the README as a configurable cost control, but it's never passed to the claude CLI (e.g., via a --max-tokens flag) or referenced in run-ui-test.sh. Users who configure this variable to cap their token spend will get no actual cost limiting.

Additional Locations (1)
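One way to make the setting take effect without guessing at CLI flags (the exact `claude` flag for token caps, if any, is not confirmed here) is to have run-ui-test.sh read the variable and fold the budget into the prompt itself:

```shell
# Sketch: consume UI_TEST_MAX_TOKENS in run-ui-test.sh so the documented
# setting is not silently ignored. This embeds the budget in the prompt;
# a real fix might instead pass a dedicated CLI flag if one exists.
UI_TEST_MAX_TOKENS="${UI_TEST_MAX_TOKENS:-100000}"

build_budgeted_prompt() {
  cat <<EOF
Run adversarial UI tests on the changed files.
Keep total output under roughly ${UI_TEST_MAX_TOKENS} tokens; stop early
and summarize remaining tests as skipped if the budget runs low.
EOF
}

build_budgeted_prompt
```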

id: run-tests
run: |
chmod +x skills/ui-test/ci/run-ui-test.sh
skills/ui-test/ci/run-ui-test.sh \

Script path mismatches installed skill directory path

High Severity

The "Install ui-test skill" step checks .claude/skills/ui-test and installs via npx skills add (which, per the cookie-sync skill's conventions, installs to .claude/skills/ui-test/). But the "Run UI tests" step references skills/ui-test/ci/run-ui-test.sh — a completely different path. For any external user who follows the README and copies only the workflow file to their repo, the script won't exist at the referenced path, and the step will fail with "file not found."

Additional Locations (1)
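A defensive sketch for the workflow's run step: resolve the wrapper from whichever location exists. The two candidate paths are taken from the finding above; the helper name is hypothetical.

```shell
# Sketch: locate run-ui-test.sh whether the repo vendors the skill under
# skills/ or `npx skills add` installed it under .claude/skills/.
resolve_runner() {
  local candidate
  for candidate in \
    "skills/ui-test/ci/run-ui-test.sh" \
    ".claude/skills/ui-test/ci/run-ui-test.sh"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "run-ui-test.sh not found in either skill location" >&2
  return 1
}
```

The workflow step would then run `chmod +x "$(resolve_runner)"` and invoke the resolved path instead of a hard-coded one.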
