Skip to content

Conversation

@ngoiyaeric
Copy link
Collaborator

@ngoiyaeric ngoiyaeric commented Jan 29, 2026

User description

The E2E testing suite has been significantly optimized for speed and reliability. By enabling parallel workers and sharding in CI, execution time is reduced. Test isolation is improved through custom fixtures and automatic state cleanup. High-priority tests are now tagged with @smoke for faster feedback loops.


PR created automatically by Jules for task 4516796761612256463 started by @ngoiyaeric


PR Type

Enhancement, Tests


Description

  • Enables 4 parallel workers and 4-way sharding in CI for faster test execution

  • Introduces custom authenticatedPage and testImage fixtures for improved test isolation

  • Replaces arbitrary waits with deterministic expect.poll in chat tests

  • Implements @smoke tagging strategy for critical path verification

  • Optimizes browser matrix to skip mobile tests on desktop and vice versa

  • Configures Playwright webServer for automatic app lifecycle management

  • Ensures test data isolation using unique identifiers and automatic storage cleanup


Diagram Walkthrough

flowchart LR
  A["Playwright Config"] -->|"Enable 4 workers"| B["Parallel Execution"]
  A -->|"Add webServer"| C["Auto App Lifecycle"]
  A -->|"Browser matrix optimization"| D["Skip Mobile on Desktop"]
  E["Custom Fixtures"] -->|"authenticatedPage"| F["Test Isolation"]
  E -->|"testImage"| F
  G["GitHub Actions"] -->|"4-way sharding"| H["Distributed Tests"]
  I["Test Files"] -->|"@smoke tags"| J["Critical Path Tests"]
  I -->|"Unique identifiers"| K["Data Isolation"]
Loading

File Walkthrough

Relevant files
Configuration changes
3 files
playwright.config.ts
Enable parallel workers and configure webServer                   
+11/-4   
playwright.yml
Implement 4-way sharding and remove manual server startup
+13/-16 
package.json
Add npm scripts for smoke tests and sharding                         
+4/-0     
Enhancement
10 files
fixtures.ts
Create custom authenticatedPage and testImage fixtures     
+44/-0   
global-setup.ts
Add global setup for E2E test suite initialization             
+9/-0     
calendar.spec.ts
Use fixtures and add unique identifiers for isolation       
+30/-30 
chat.spec.ts
Replace waits with deterministic polling and use fixtures
+15/-18 
header.spec.ts
Migrate to custom fixtures and add smoke tags                       
+6/-9     
images.spec.ts
Use testImage fixture to reduce boilerplate code                 
+19/-68 
map.spec.ts
Configure retries for flaky WebGL tests and use fixtures 
+11/-9   
mobile.spec.ts
Migrate to custom fixtures and add smoke tags                       
+5/-10   
responsive.spec.ts
Remove redundant setup and use authenticatedPage fixture 
+26/-71 
sidebar.spec.ts
Migrate to custom fixtures and add smoke tags                       
+5/-8     

Summary by CodeRabbit

  • New Features

    • Resolution search now accepts user-drawn map features and streams incremental results for faster, progressive feedback.
  • Tests

    • CI shards E2E tests, increases CI workers, tags key suites as smoke, adds retries and parallel mode.
    • Introduces an authenticated-page and test-image fixture, global setup, per-shard reporting, and a merged HTML report for inspection.
  • Chores

    • Added npm scripts and CI workflow improvements to support sharded and varied E2E runs.

✏️ Tip: You can customize this high-level summary in your review settings.

This commit implements several key improvements to the E2E testing infrastructure:
- Enables 4 workers in CI and implements 4-way sharding in GitHub Actions.
- Introduces `authenticatedPage` and `testImage` fixtures for better isolation and reduced boilerplate.
- Replaces arbitrary waits with deterministic `expect.poll` in chat tests.
- Optimizes the browser matrix to skip mobile tests on desktop and vice versa.
- Configures Playwright `webServer` for automatic app lifecycle management.
- Implements `@smoke` tagging strategy for critical path verification.
- Ensures test data isolation using unique identifiers and automatic storage cleanup.

Co-authored-by: ngoiyaeric <115367894+ngoiyaeric@users.noreply.github.com>
@google-labs-jules
Copy link
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@vercel
Copy link

vercel bot commented Jan 29, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
qcx Ready Ready Preview, Comment Jan 31, 2026 11:58am

@charliecreates charliecreates bot requested a review from CharlieHelps January 29, 2026 17:31
@CLAassistant
Copy link

CLAassistant commented Jan 29, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ ngoiyaeric
❌ google-labs-jules[bot]
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 29, 2026

Walkthrough

Adds sharded GitHub Actions Playwright e2e CI with PostgreSQL and report merging, new Playwright globalSetup and shared fixtures (authenticatedPage, testImage), new e2e npm scripts, refactors many tests to use fixtures and run safely in parallel/sharded CI, and threads user-drawn map features through resolution search.

Changes

Cohort / File(s) Summary
CI Workflow
.github/workflows/playwright.yml
Adds PostgreSQL service with healthchecks, introduces 4-shard matrix, runs DB migrations via DATABASE_URL (no explicit server start), executes Playwright per-shard with shard env, uploads per-shard blob artifacts, and adds a merge-reports job to download shards and produce merged HTML report.
Package scripts
package.json
Adds e2e helper scripts: test:e2e:smoke, test:e2e:parallel, test:e2e:chromium, test:e2e:shard.
Playwright config
playwright.config.ts
Adds globalSetup, raises CI workers to 2, changes CI reporters to composite (list, github, blob), refines project testIgnore/testMatch for mobile vs desktop, and enables conditional webServer command switching (CI build+start vs dev) with stdout/stderr piping.
Fixtures & Global Setup
tests/fixtures.ts, tests/global-setup.ts
Adds tests/fixtures.ts exporting extended test with authenticatedPage and testImage fixtures and re-exports expect; adds tests/global-setup.ts exporting default globalSetup(config: FullConfig).
Refactored tests (many)
tests/*.spec.ts
tests/calendar.spec.ts, tests/chat.spec.ts, tests/header.spec.ts, tests/images.spec.ts, tests/map.spec.ts, tests/mobile.spec.ts, tests/responsive.spec.ts, tests/sidebar.spec.ts
Switches tests to import from ./fixtures, use { authenticatedPage: page }, removes per-test navigation/waits, tags suites with @smoke, uses testImage fixture where applicable, introduces dynamic data (Date.now()) for note tests, and adjusts timing/selector waits for fixture-driven flows.
Resolution search (agents)
app/actions.tsx, components/header-search-button.tsx, lib/agents/resolution-search.tsx
Adds DrawnFeature type, passes drawnFeatures JSON from UI through header-search-button → app/actions.tsx → lib/agents/resolution-search.tsx; resolutionSearch now accepts optional drawnFeatures and uses streamObject for incremental streaming results.

Sequence Diagram(s)

sequenceDiagram
  participant GH as GitHub Actions (matrix)
  participant Runner as CI Runner
  participant Postgres as PostgreSQL Service
  participant Migrator as DB Migration Step
  participant Playwright as Playwright Test Worker
  participant App as Application Server
  participant Storage as Artifact Storage
  participant Merger as Merge Reports Job

  GH->>Runner: start shard (SHARD_INDEX/SHARD_TOTAL)
  Runner->>Postgres: start DB service + healthcheck
  Runner->>Migrator: run migrations (DATABASE_URL)
  Runner->>Playwright: run `bunx playwright test --shard=...` (envs set)
  Playwright->>App: access app (build+start in CI or dev)
  Playwright->>Postgres: exercise DB via app
  Playwright->>Storage: upload shard artifact (blob-report-{shardIndex})
  GH->>Merger: trigger merge-reports job
  Merger->>Storage: download shard artifacts
  Merger->>Playwright: run `bunx playwright merge-reports`
  Merger->>Storage: upload merged `playwright-report` HTML
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I scoped a fixture snug and bright,

Four shards hustle through the night,
A tiny PNG waits to sing,
Drawn lines whisper to the AI wing,
Hop, merge, report — the tests take flight. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and clearly describes the main objective of the PR: optimizing the E2E testing suite specifically for CI efficiency through parallelization, sharding, and configuration improvements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch improve-e2e-testing-suite-4516796761612256463

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-code-review
Copy link
Contributor

qodo-code-review bot commented Jan 29, 2026

ⓘ Your approaching your monthly quota for Qodo. Upgrade your plan

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Swallowed exception: The new try { ... } catch {} around the loading-state assertion silently swallows all
errors without logging context, which may hide real failures.

Referred Code
try {
  await expect(loadingIndicator).toBeVisible({ timeout: 1000 });
} catch {
  // Loading was too fast, which is fine
}

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Unstructured logging: The new console.log call emits unstructured log output (not JSON/structured), which may
conflict with structured logging requirements depending on how CI logs are consumed.

Referred Code
async function globalSetup(config: FullConfig) {
  // Global initialization can be added here
  // For example, clearing a test database or setting up environment variables
  console.log('Starting E2E test suite...');
}

Learn more about managing compliance generic rules or creating your own custom rules

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link
Contributor

qodo-code-review bot commented Jan 29, 2026

ⓘ Your approaching your monthly quota for Qodo. Upgrade your plan

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Use zero-based shard indices

Correct the shardIndex in the GitHub Actions workflow matrix to be zero-based
([0, 1, 2, 3]) to align with Playwright's sharding expectations.

.github/workflows/playwright.yml [13-17]

 strategy:
   fail-fast: false
   matrix:
-    shardIndex: [1, 2, 3, 4]
+    shardIndex: [0, 1, 2, 3]
     shardTotal: [4]
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: This suggestion corrects a critical flaw in the test sharding configuration, which would cause the first shard of tests to be skipped and the final shard to fail.

High
High-level
Consider a more robust authentication fixture

The current authenticatedPage fixture relies on disabling authentication in CI.
It is recommended to implement a more robust fixture that performs a real
programmatic login to ensure the authentication flow is covered by tests.

Examples:

.github/workflows/playwright.yml [42]
        AUTH_DISABLED_FOR_DEV: true
tests/fixtures.ts [9-20]
  authenticatedPage: async ({ page }, use) => {
    await page.goto('/');
    // Increased timeout for CI environments
    await page.waitForSelector('[data-testid="chat-input"]', { timeout: 30000 });
    await use(page);

    // Cleanup after test to ensure isolation
    await page.evaluate(() => {
      localStorage.clear();
      sessionStorage.clear();

 ... (clipped 2 lines)

Solution Walkthrough:

Before:

// .github/workflows/playwright.yml
- name: Run Playwright tests
  run: bun run test:e2e:shard
  env:
    CI: true
    AUTH_DISABLED_FOR_DEV: true // Bypasses login

// tests/fixtures.ts
export const test = base.extend<{
  authenticatedPage: Page;
}>({
  authenticatedPage: async ({ page }, use) => {
    await page.goto('/'); // Assumes auth is disabled
    await page.waitForSelector('[data-testid="chat-input"]');
    await use(page);
  },
});

After:

// playwright.config.ts
export default defineConfig({
  // ...
  projects: [
    // Project to run authentication
    { name: 'setup', testMatch: /auth\.setup\.ts/ },
    // Main test project that depends on setup
    {
      name: 'chromium',
      dependencies: ['setup'],
      use: {
        ...devices['Desktop Chrome'],
        // Use saved authentication state
        storageState: 'playwright/.auth/user.json',
      },
    },
  ],
});

// tests/auth.setup.ts (New file)
setup('authenticate', async ({ page }) => {
  // Perform real login
  await page.goto('/login');
  await page.fill('input[name="email"]', process.env.TEST_USER_EMAIL);
  await page.fill('input[name="password"]', process.env.TEST_USER_PASSWORD);
  await page.click('button[type="submit"]');
  // Save session to a file
  await page.context().storageState({ path: 'playwright/.auth/user.json' });
});
Suggestion importance[1-10]: 8

__

Why: This is a high-impact suggestion that correctly identifies a major gap in the E2E testing strategy—bypassing the real authentication flow—which could allow critical login-related bugs to go undetected.

Medium
General
Clean up temporary test files

Add a try...finally block to the 'should allow a user to attach a file' test to
ensure the temporary file created for the test is deleted after execution.

tests/chat.spec.ts [61-74]

 test('should allow a user to attach a file', async ({ authenticatedPage: page }, testInfo) => {
   const filePath = testInfo.outputPath('test-file.txt');
   await require('fs').promises.writeFile(filePath, 'This is a test file.');
 
-  const fileChooserPromise = page.waitForEvent('filechooser');
-  await page.click('[data-testid="attachment-button"]');
-  const fileChooser = await fileChooserPromise;
-  await fileChooser.setFiles(filePath);
+  try {
+    const fileChooserPromise = page.waitForEvent('filechooser');
+    await page.click('[data-testid="attachment-button"]');
+    const fileChooser = await fileChooserPromise;
+    await fileChooser.setFiles(filePath);
 
-  await expect(page.locator('text=test-file.txt')).toBeVisible();
+    await expect(page.locator('text=test-file.txt')).toBeVisible();
 
-  await page.click('[data-testid="clear-attachment-button"]');
-  await expect(page.locator('text=test-file.txt')).not.toBeVisible();
+    await page.click('[data-testid="clear-attachment-button"]');
+    await expect(page.locator('text=test-file.txt')).not.toBeVisible();
+  } finally {
+    // Clean up the created file
+    await require('fs').promises.unlink(filePath);
+  }
 });
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why: The suggestion improves test hygiene by ensuring temporary files are cleaned up after the test runs, making the test suite more robust and self-contained.

Low
Remove redundant storage cleanup code

Remove the manual localStorage and sessionStorage cleanup from the
authenticatedPage fixture, as Playwright's default test isolation makes it
unnecessary.

tests/fixtures.ts [9-20]

 authenticatedPage: async ({ page }, use) => {
   await page.goto('/');
   // Increased timeout for CI environments
   await page.waitForSelector('[data-testid="chat-input"]', { timeout: 30000 });
   await use(page);
-
-  // Cleanup after test to ensure isolation
-  await page.evaluate(() => {
-    localStorage.clear();
-    sessionStorage.clear();
-  });
 },
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies that manual storage cleanup is redundant due to Playwright's built-in test isolation, and removing it simplifies the fixture code.

Low
Remove redundant configuration property

Remove the redundant storageState: undefined property from the use block in
playwright.config.ts, as undefined is the default value.

playwright.config.ts [3-15]

 export default defineConfig({
   testDir: './tests',
   globalSetup: require.resolve('./tests/global-setup'),
   fullyParallel: true,
   forbidOnly: !!process.env.CI,
   retries: process.env.CI ? 2 : 0,
   workers: process.env.CI ? 4 : undefined,
   reporter: 'html',
   use: {
     baseURL: 'http://localhost:3000',
     trace: 'on-first-retry',
-    storageState: undefined,
   },
  • Apply / Chat
Suggestion importance[1-10]: 3

__

Why: The suggestion correctly identifies a redundant configuration property, and removing it simplifies the code without changing behavior.

Low
  • Update

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
tests/sidebar.spec.ts (1)

11-24: Consider adding a wait for the history panel to be visible before clearing.

After clicking the history button on Line 18, the panel may still be animating. Adding a wait for the panel visibility before clicking the clear button would improve reliability.

♻️ Proposed improvement for test stability
     // Now, open the history panel and clear the history
     await page.click('[data-testid="history-button"]');
+    await expect(page.locator('[data-testid="history-panel"]')).toBeVisible();
     await page.click('[data-testid="clear-history-button"]');
     await page.click('[data-testid="clear-history-confirm"]');
tests/images.spec.ts (1)

106-123: Consider adding a timeout or skip condition for image generation.

This test assumes the bot can generate images, which may not always succeed or may be slow. The 30-second timeout for the response image (line 117) is reasonable, but consider whether this test should be conditional or have retry logic if image generation is unreliable.

tests/calendar.spec.ts (1)

100-115: Consider adding unique identifiers to remaining note tests for consistency.

These tests use hardcoded strings ('Note to delete', 'Highlight test note') while other tests in this file use Date.now() for uniqueness. For consistency and parallel execution safety, consider using unique identifiers here as well.

♻️ Proposed fix for delete test
   test('should delete a note', async ({ authenticatedPage: page }) => {
     await page.click('[data-testid="calendar-toggle"]');
     
     // Select a date and add a note
     await page.click('[data-testid="calendar-day-15"]');
-    await page.fill('[data-testid="calendar-note-input"]', 'Note to delete');
+    await page.fill('[data-testid="calendar-note-input"]', `Note to delete ${Date.now()}`);
     await page.click('[data-testid="calendar-save-note"]');
♻️ Proposed fix for highlight test
   test('should highlight dates with notes', async ({ authenticatedPage: page }) => {
     await page.click('[data-testid="calendar-toggle"]');
     
     // Add a note to a date
     await page.click('[data-testid="calendar-day-15"]');
-    await page.fill('[data-testid="calendar-note-input"]', 'Highlight test note');
+    await page.fill('[data-testid="calendar-note-input"]', `Highlight test note ${Date.now()}`);
     await page.click('[data-testid="calendar-save-note"]');

Also applies to: 138-152

tests/map.spec.ts (1)

26-31: Consider defensive null check for viewport size.

viewportSize() can return null if the page context doesn't have a viewport configured. While this is unlikely given the fixture setup, using a defensive check would be more robust.

🛡️ Suggested defensive check
-    if (page.viewportSize()!.width <= 768) {
+    const viewport = page.viewportSize();
+    if (!viewport || viewport.width <= 768) {
       test.skip(true, 'Zoom controls are not visible on mobile');
       return;
     }
🤖 Fix all issues with AI agents
In @.github/workflows/playwright.yml:
- Around line 44-57: Add a new job (e.g., merge-reports) that runs after the
test job (needs: test, if: always()) which checks out the repo, sets up the
runtime (Node/Bun as used in the pipeline), downloads all shard artifacts named
playwright-report-${{ matrix.shardIndex }} using actions/download-artifact@v4
with pattern "playwright-report-*" and merge-multiple: true, run the merge
command (npx playwright merge-reports or bunx playwright merge-reports) against
the downloaded folder to produce a unified HTML report, and finally upload the
merged artifact (name it playwright-report-merged) with retention-days set as
desired so reviewers see a single consolidated playwright report.
- Around line 38-42: The workflow currently sets NODE_ENV: production while also
defining AUTH_DISABLED_FOR_DEV: true which is ineffective and confusing; update
the Playwright workflow by either removing AUTH_DISABLED_FOR_DEV entirely or
change NODE_ENV to development if the intention is to disable auth for E2E
tests, and ensure the variable AUTH_DISABLED_FOR_DEV is only present when
NODE_ENV !== 'production' (referencing the NODE_ENV and AUTH_DISABLED_FOR_DEV
entries in the workflow).

In `@package.json`:
- Line 17: The "test:e2e:shard" npm script uses Unix-style env syntax
($SHARD_INDEX/$SHARD_TOTAL) which breaks on Windows; update the script to use
cross-env to set/consume these variables cross-platform (reference the
"test:e2e:shard" script and the cross-env devDependency) or replace it with a
small Node wrapper that reads process.env.SHARD_INDEX and
process.env.SHARD_TOTAL and invokes Playwright, ensuring the variables are
provided via cross-env in package.json when running locally.

In `@playwright.config.ts`:
- Line 20: Remove the redundant testMatch setting from the Playwright
configuration: delete the testMatch: /.*\.spec\.ts/ entry in
playwright.config.ts (it duplicates the default behavior when testDir is
./tests) so the exported config relies on Playwright's defaults; if the pattern
was intended as documentation, replace it with a brief comment instead of the
property to avoid functional duplication.

In `@tests/fixtures.ts`:
- Line 22: The empty object destructuring in the testImage fixture signature
causes static-analysis failures; replace the empty pattern with an underscore
placeholder to indicate an unused parameter. Locate the testImage async function
in tests/fixtures.ts (the parameter list currently "({}, use)") and change the
first parameter to "_" (or a named unused placeholder like "_args") so the
signature becomes "(_, use)"; update any surrounding references or types if
present to match the new placeholder.
- Around line 27-37: Replace the synchronous existence checks and write with an
atomic async flow to avoid race conditions: use fs.promises.mkdir(fixturesDir, {
recursive: true }) then attempt to write the image buffer to imagePath using
fs.promises.writeFile(imagePath, pngBuffer, { flag: 'wx' }) and catch EEXIST to
ignore "file already created" races; update the code paths that reference
fixturesDir, imagePath, and pngBuffer accordingly and ensure errors other than
EEXIST are rethrown or logged.

In `@tests/global-setup.ts`:
- Around line 3-7: The function globalSetup declares an unused parameter config;
to indicate intentional non-use, rename it to _config (or prefix it with an
underscore) in the globalSetup signature so linters and readers know it's
intentionally unused—update the function declaration async function
globalSetup(_config: FullConfig) { ... } and leave the body unchanged.

In `@tests/responsive.spec.ts`:
- Around line 225-251: Add a short stability wait after each viewport change in
the "Responsive design - Orientation changes" tests: after calls to
page.setViewportSize(...) (in both tests "should handle portrait to landscape
transition" and "should handle landscape to portrait transition"), wait for the
page to settle by calling either page.waitForLoadState('domcontentloaded') or
awaiting the relevant locator (e.g., await chatInput.waitFor({ state: 'visible'
}) or await page.waitForTimeout(200)) before interacting with elements like the
chatInput, submitButton or userMessage to prevent flakiness.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 128e7bd and 26b2854.

📒 Files selected for processing (13)
  • .github/workflows/playwright.yml
  • package.json
  • playwright.config.ts
  • tests/calendar.spec.ts
  • tests/chat.spec.ts
  • tests/fixtures.ts
  • tests/global-setup.ts
  • tests/header.spec.ts
  • tests/images.spec.ts
  • tests/map.spec.ts
  • tests/mobile.spec.ts
  • tests/responsive.spec.ts
  • tests/sidebar.spec.ts
🧰 Additional context used
🧬 Code graph analysis (6)
tests/header.spec.ts (1)
tests/fixtures.ts (1)
  • test (5-42)
tests/calendar.spec.ts (1)
tests/fixtures.ts (2)
  • test (5-42)
  • expect (44-44)
tests/mobile.spec.ts (1)
tests/fixtures.ts (1)
  • test (5-42)
tests/responsive.spec.ts (1)
tests/fixtures.ts (1)
  • test (5-42)
tests/sidebar.spec.ts (1)
tests/fixtures.ts (1)
  • test (5-42)
tests/images.spec.ts (1)
tests/fixtures.ts (2)
  • test (5-42)
  • expect (44-44)
🪛 Biome (2.3.13)
tests/fixtures.ts

[error] 22-22: Unexpected empty object pattern.

(lint/correctness/noEmptyPattern)

🔇 Additional comments (24)
package.json (1)

14-16: LGTM!

The new scripts for smoke tests, parallel execution, and chromium-only runs are well-structured and align with the PR objectives. They complement the existing test:e2e script without breaking it.

tests/sidebar.spec.ts (1)

1-3: LGTM!

Good migration to the custom fixtures module and appropriate @smoke tagging for critical-path tests. The authenticatedPage fixture provides proper isolation with automatic storage cleanup.

tests/header.spec.ts (1)

1-26: LGTM!

Clean migration to the fixtures-based approach. The @smoke tag is appropriate for these critical header/navigation tests, and the authenticatedPage fixture eliminates boilerplate navigation code while ensuring proper test isolation.

playwright.config.ts (2)

43-48: LGTM on webServer configuration!

Good CI-aware approach: using production build in CI for realistic testing while allowing faster dev server locally. The 10-minute timeout accommodates Next.js build times in CI.


25-41: LGTM on browser matrix optimization!

Properly excludes mobile tests from Firefox/WebKit desktop browsers and restricts mobile projects to only run mobile.spec.ts. This avoids running irrelevant tests and improves CI efficiency.

.github/workflows/playwright.yml (1)

13-17: LGTM on sharding strategy!

The 4-way sharding with fail-fast: false is a good setup for parallel CI execution. All shards will complete even if one fails, which aids debugging.

tests/chat.spec.ts (4)

1-3: LGTM! Clean migration to fixture-based testing with smoke tagging.

The import change and @smoke tag addition align well with the PR's objectives for CI efficiency.


19-26: Good improvement: deterministic polling replaces arbitrary waits.

Using expect.poll for streaming verification is more reliable than fixed timeouts in CI environments where response times vary.


4-6: LGTM! Consistent fixture usage across all tests.

All test callbacks correctly destructure authenticatedPage as page, maintaining readability while leveraging the new fixture-based setup.

Also applies to: 8-27, 43-59, 61-74, 76-85


29-41: The test structure is correct—no action needed.

The page reload at line 36 executes during test runtime, well before the authenticatedPage fixture cleanup phase. The fixture's localStorage.clear() and sessionStorage.clear() calls run only after await use(page) returns, which is after the test completes. The test persistence logic is sound.

tests/fixtures.ts (1)

9-19: LGTM! Well-designed fixture with proper CI considerations.

The 30-second timeout for CI environments and post-test storage cleanup are good practices for test isolation in parallel execution.

tests/mobile.spec.ts (1)

1-15: LGTM! Clean mobile test setup with appropriate viewport configuration.

The migration to fixtures is consistent with other test files, and the iPhone 8 viewport (375x667) is appropriate for mobile UI testing.

tests/images.spec.ts (3)

1-5: LGTM! Appropriate fixture usage pattern.

The imports are correctly structured - test and expect from fixtures, while retaining fs and path for tests that need to create specialized test files (non-images, oversized files, multiple formats).


7-17: LGTM! Good use of the testImage fixture for standard attachment tests.

These tests appropriately leverage the shared testImage fixture, reducing boilerplate and improving test isolation.

Also applies to: 19-32, 34-48, 50-71, 125-142, 168-184


73-87: LGTM! Appropriate use of testInfo.outputPath for edge case tests.

Using testInfo.outputPath() for non-image files, oversized images, and format variations is correct since these require specialized test data that the shared testImage fixture cannot provide.

Also applies to: 89-104, 144-166

tests/calendar.spec.ts (2)

1-3: LGTM! Good tagging strategy with multiple tags.

Using both @smoke and @calendar tags enables flexible test filtering for different CI scenarios.


67-77: Good use of unique identifiers for parallel execution safety.

Using Date.now() for note content prevents conflicts when tests run in parallel across shards.

Also applies to: 84-98, 120-136

tests/responsive.spec.ts (3)

1-6: LGTM! Good parallel configuration with selective smoke tagging.

Enabling parallel mode at the describe level and selectively applying @smoke to Desktop and Mobile sections provides efficient CI execution while maintaining thorough coverage.


8-36: LGTM! Consistent fixture usage across all viewport configurations.

All tests correctly use the authenticatedPage fixture while varying viewport sizes for responsive testing.

Also applies to: 42-78, 84-166, 172-194, 200-222


147-152: Good defensive tolerance for subpixel rendering.

The +1 tolerance on the horizontal scroll check handles edge cases where browsers may report fractional pixels differently.

tests/map.spec.ts (4)

1-1: LGTM!

Import correctly changed to use the local fixtures module, enabling the authenticatedPage fixture for improved test isolation and consistent authentication handling.


3-10: Well-structured test configuration.

  • The @smoke tag enables fast feedback loops in CI.
  • retries: 2 is appropriate for WebGL-based tests that may be flaky due to rendering timing.
  • The 30s timeout on the selector aligns with CI environment needs mentioned in the fixtures.

The layered approach works well: the authenticatedPage fixture handles authentication and base navigation, while beforeEach handles map-specific readiness checks.


12-24: LGTM!

The test properly uses the authenticatedPage fixture and maintains the appropriate conditional skip pattern for Mapbox-only functionality.


42-57: LGTM!

The zoom test logic correctly captures initial state, performs interactions, and verifies expected state changes. The event-based waiting pattern (map.once('zoomend', res)) is appropriate for synchronizing with map animations, and the configured retries: 2 provides resilience against any timing edge cases.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 38 to 42
NODE_ENV: production
SHARD_INDEX: ${{ matrix.shardIndex }}
SHARD_TOTAL: ${{ matrix.shardTotal }}
# Add other necessary env vars here (e.g., Supabase, etc.)
AUTH_DISABLED_FOR_DEV: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n -C3 'AUTH_DISABLED_FOR_DEV' -t yaml -t js -t ts -t json

Repository: QueueLab/QCX

Length of output: 2661


Remove or clarify AUTH_DISABLED_FOR_DEV: true in the E2E test workflow.

The flag is guarded in the code to only work when NODE_ENV !== 'production'. Since this workflow sets NODE_ENV: production, the AUTH_DISABLED_FOR_DEV variable has no effect and should be removed to avoid confusion. If the intent is to disable auth for E2E tests, either set NODE_ENV: development or remove the flag entirely.

🤖 Prompt for AI Agents
In @.github/workflows/playwright.yml around lines 38 - 42, The workflow
currently sets NODE_ENV: production while also defining AUTH_DISABLED_FOR_DEV:
true which is ineffective and confusing; update the Playwright workflow by
either removing AUTH_DISABLED_FOR_DEV entirely or change NODE_ENV to development
if the intention is to disable auth for E2E tests, and ensure the variable
AUTH_DISABLED_FOR_DEV is only present when NODE_ENV !== 'production'
(referencing the NODE_ENV and AUTH_DISABLED_FOR_DEV entries in the workflow).

"test:e2e:smoke": "playwright test --grep @smoke",
"test:e2e:parallel": "playwright test --workers=4",
"test:e2e:chromium": "playwright test --project=chromium",
"test:e2e:shard": "playwright test --shard=$SHARD_INDEX/$SHARD_TOTAL",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Cross-platform compatibility concern with environment variable syntax.

The $SHARD_INDEX/$SHARD_TOTAL syntax works on Unix-like systems but will fail on Windows. Since the script is primarily used in CI (Linux), this may be acceptable, but consider using cross-env (already a devDependency) for local developer usage if needed.

Proposed fix for cross-platform support (if needed)
-    "test:e2e:shard": "playwright test --shard=$SHARD_INDEX/$SHARD_TOTAL",
+    "test:e2e:shard": "cross-env playwright test --shard=${SHARD_INDEX}/${SHARD_TOTAL}",

Note: This still won't work on native Windows CMD. If Windows support is required, consider a small Node script wrapper.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"test:e2e:shard": "playwright test --shard=$SHARD_INDEX/$SHARD_TOTAL",
"test:e2e:shard": "cross-env playwright test --shard=${SHARD_INDEX}/${SHARD_TOTAL}",
🤖 Prompt for AI Agents
In `@package.json` at line 17, The "test:e2e:shard" npm script uses Unix-style env
syntax ($SHARD_INDEX/$SHARD_TOTAL) which breaks on Windows; update the script to
use cross-env to set/consume these variables cross-platform (reference the
"test:e2e:shard" script and the cross-env devDependency) or replace it with a
small Node wrapper that reads process.env.SHARD_INDEX and
process.env.SHARD_TOTAL and invokes Playwright, ensuring the variables are
provided via cross-env in package.json when running locally.

});
},

testImage: async ({}, use) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix empty object pattern flagged by static analysis.

Biome reports an error for the empty object destructuring pattern. Use the underscore placeholder instead.

🔧 Proposed fix
-  testImage: async ({}, use) => {
+  testImage: async (_, use) => {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
testImage: async ({}, use) => {
testImage: async (_, use) => {
🧰 Tools
🪛 Biome (2.3.13)

[error] 22-22: Unexpected empty object pattern.

(lint/correctness/noEmptyPattern)

🤖 Prompt for AI Agents
In `@tests/fixtures.ts` at line 22, The empty object destructuring in the
testImage fixture signature causes static-analysis failures; replace the empty
pattern with an underscore placeholder to indicate an unused parameter. Locate
the testImage async function in tests/fixtures.ts (the parameter list currently
"({}, use)") and change the first parameter to "_" (or a named unused
placeholder like "_args") so the signature becomes "(_, use)"; update any
surrounding references or types if present to match the new placeholder.

Comment on lines +27 to +37
if (!fs.existsSync(fixturesDir)) {
fs.mkdirSync(fixturesDir, { recursive: true });
}

// Create a simple 1x1 PNG image if it doesn't exist
if (!fs.existsSync(imagePath)) {
const pngBuffer = Buffer.from(
'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==',
'base64'
);
fs.writeFileSync(imagePath, pngBuffer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider potential race condition with parallel test execution.

When running tests in parallel with sharding, multiple workers may simultaneously check if the file exists and attempt to create it. While unlikely to cause failures (the file content is identical), consider using fs.promises with proper error handling for atomic operations.

♻️ Proposed fix for atomic file creation
-    // Ensure the directory exists
-    if (!fs.existsSync(fixturesDir)) {
-      fs.mkdirSync(fixturesDir, { recursive: true });
-    }
-
-    // Create a simple 1x1 PNG image if it doesn't exist
-    if (!fs.existsSync(imagePath)) {
-      const pngBuffer = Buffer.from(
-        'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==',
-        'base64'
-      );
-      fs.writeFileSync(imagePath, pngBuffer);
-    }
+    // Ensure the directory exists (safe for parallel execution)
+    await fs.promises.mkdir(fixturesDir, { recursive: true });
+
+    // Create a simple 1x1 PNG image if it doesn't exist
+    try {
+      await fs.promises.access(imagePath);
+    } catch {
+      const pngBuffer = Buffer.from(
+        'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==',
+        'base64'
+      );
+      await fs.promises.writeFile(imagePath, pngBuffer, { flag: 'wx' }).catch(() => {});
+    }
🤖 Prompt for AI Agents
In `@tests/fixtures.ts` around lines 27 - 37, Replace the synchronous existence
checks and write with an atomic async flow to avoid race conditions: use
fs.promises.mkdir(fixturesDir, { recursive: true }) then attempt to write the
image buffer to imagePath using fs.promises.writeFile(imagePath, pngBuffer, {
flag: 'wx' }) and catch EEXIST to ignore "file already created" races; update
the code paths that reference fixturesDir, imagePath, and pngBuffer accordingly
and ensure errors other than EEXIST are rethrown or logged.

Comment on lines +3 to +7
async function globalSetup(config: FullConfig) {
// Global initialization can be added here
// For example, clearing a test database or setting up environment variables
console.log('Starting E2E test suite...');
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Prefix unused parameter with underscore to indicate intentional non-use.

The config parameter is imported and typed but not used. Either prefix it with an underscore to signal intentional non-use, or remove it if not needed for future work.

♻️ Proposed fix
-async function globalSetup(config: FullConfig) {
+async function globalSetup(_config: FullConfig) {
   // Global initialization can be added here
   // For example, clearing a test database or setting up environment variables
   console.log('Starting E2E test suite...');
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async function globalSetup(config: FullConfig) {
// Global initialization can be added here
// For example, clearing a test database or setting up environment variables
console.log('Starting E2E test suite...');
}
async function globalSetup(_config: FullConfig) {
// Global initialization can be added here
// For example, clearing a test database or setting up environment variables
console.log('Starting E2E test suite...');
}
🤖 Prompt for AI Agents
In `@tests/global-setup.ts` around lines 3 - 7, The function globalSetup declares
an unused parameter config; to indicate intentional non-use, rename it to
_config (or prefix it with an underscore) in the globalSetup signature so
linters and readers know it's intentionally unused—update the function
declaration async function globalSetup(_config: FullConfig) { ... } and leave
the body unchanged.

Comment on lines 225 to 251
test.describe('Responsive design - Orientation changes', () => {
test('should handle portrait to landscape transition', async ({ page }) => {
// Start in portrait
test('should handle portrait to landscape transition', async ({ authenticatedPage: page }) => {
await page.setViewportSize({ width: 375, height: 667 });
await page.goto('/');

const chatInput = page.locator('[data-testid="chat-input"]');
await expect(chatInput).toBeVisible();

// Switch to landscape
await page.setViewportSize({ width: 667, height: 375 });

// UI should still be functional
await expect(chatInput).toBeVisible();
await chatInput.fill('Orientation test');

const submitButton = page.locator('[data-testid="mobile-submit-button"]');
await expect(submitButton).toBeEnabled();
});

test('should handle landscape to portrait transition', async ({ page }) => {
// Start in landscape
test('should handle landscape to portrait transition', async ({ authenticatedPage: page }) => {
await page.setViewportSize({ width: 667, height: 375 });
await page.goto('/');

await page.fill('[data-testid="chat-input"]', 'Landscape message');
await page.click('[data-testid="mobile-submit-button"]');

// Switch to portrait
await page.setViewportSize({ width: 375, height: 667 });

// Message should still be visible
const userMessage = page.locator('[data-testid="user-message"]').last();
await expect(userMessage).toBeVisible();
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider adding small delays or stability checks for orientation tests.

Rapid viewport size changes may not immediately trigger all responsive layout recalculations. If these tests become flaky in CI, consider adding waitForLoadState('domcontentloaded') or brief stability checks after viewport changes.

🤖 Prompt for AI Agents
In `@tests/responsive.spec.ts` around lines 225 - 251, Add a short stability wait
after each viewport change in the "Responsive design - Orientation changes"
tests: after calls to page.setViewportSize(...) (in both tests "should handle
portrait to landscape transition" and "should handle landscape to portrait
transition"), wait for the page to settle by calling either
page.waitForLoadState('domcontentloaded') or awaiting the relevant locator
(e.g., await chatInput.waitFor({ state: 'visible' }) or await
page.waitForTimeout(200)) before interacting with elements like the chatInput,
submitButton or userMessage to prevent flakiness.

Copy link

@charliecreates charliecreates bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The biggest correctness risk is over-parallelization: enabling both CI sharding and workers: 4 can amplify concurrency to 16, reducing stability and potentially overloading the webServer and any shared backends. The new fixtures improve ergonomics, but authenticatedPage cleanup is currently teardown-only and can leak state across tests (and potentially mask failures if teardown throws). The testImage fixture writes to a shared workspace path which is prone to concurrency races under parallel workers/shards. Minor maintainability issues include the always-on globalSetup console logging and reliance on window.map globals in map tests.

Additional notes (2)
  • Compatibility | package.json:11-20
    test:e2e:shard relies on $SHARD_INDEX/$SHARD_TOTAL. In GitHub Actions you set these env vars, but locally (or in other CI contexts) they may be unset, causing Playwright to receive --shard=/ and fail.

This makes the new default CI entrypoint fragile outside this specific workflow and reduces developer ergonomics.

  • Maintainability | tests/map.spec.ts:55-55
    This relies on a hard-coded global window.map and uses (window as any), which is type-valid but unsafe. More importantly, if the map implementation changes (or a different provider is used), the test will hang waiting for zoomend.

Given you already conditionally skip when Mapbox isn't present, you should also guard against window.map not being initialized yet even when the canvas exists.

Summary of changes

What changed

CI workflow: parallel + sharded Playwright runs

  • Updated .github/workflows/playwright.yml to run Playwright in a 4-way matrix (shardIndex: [1..4], shardTotal: 4) with fail-fast: false.
  • Replaced manual app build/start steps with a single bun run test:e2e:shard execution.
  • Added env wiring for sharding (SHARD_INDEX, SHARD_TOTAL) and set NODE_ENV: production.
  • Artifact names are now shard-specific (e.g. playwright-report-1).

Playwright configuration: workers, projects, web server

  • playwright.config.ts now:
    • Uses globalSetup (./tests/global-setup).
    • Runs with workers: 4 in CI.
    • Enables webServer lifecycle management (bun run build && bun run start in CI).
    • Refines project selection with testMatch/testIgnore to separate mobile vs desktop specs.

Test suite: fixtures + stability improvements

  • Added tests/fixtures.ts introducing:
    • authenticatedPage fixture (centralized goto('/') + readiness wait).
    • testImage fixture (reusable PNG test file).
    • Post-test cleanup via localStorage.clear()/sessionStorage.clear().
  • Added tests/global-setup.ts (currently logs suite start).
  • Updated existing specs to import from ./fixtures, removed repetitive beforeEach navigation, added @smoke tags, and improved determinism (e.g., expect.poll in chat streaming test).

NPM scripts

  • package.json adds test:e2e:smoke, test:e2e:parallel, test:e2e:chromium, and test:e2e:shard.

Comment on lines +9 to +20
authenticatedPage: async ({ page }, use) => {
await page.goto('/');
// Increased timeout for CI environments
await page.waitForSelector('[data-testid="chat-input"]', { timeout: 30000 });
await use(page);

// Cleanup after test to ensure isolation
await page.evaluate(() => {
localStorage.clear();
sessionStorage.clear();
});
},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The authenticatedPage fixture clears localStorage/sessionStorage after the test, but it does not clear state before navigation. With fullyParallel: true and workers: 4, any pre-existing state from a prior run (or from a previous test that crashed before teardown) can leak into the next test and cause order-dependent failures.

Also, page.evaluate(() => localStorage.clear()) can throw if the page is already closed or has navigated to a different origin due to failures, which can turn a test failure into a teardown failure and obscure the real cause.

Suggestion

Harden the fixture by clearing storage before the test starts and by making teardown best-effort:

authenticatedPage: async ({ page }, use) => {
  await page.goto('/');
  await page.waitForSelector('[data-testid="chat-input"]', { timeout: 30000 });

  // Ensure a clean slate at start
  await page.evaluate(() => {
    localStorage.clear();
    sessionStorage.clear();
  });
  await page.reload();
  await page.waitForSelector('[data-testid="chat-input"]', { timeout: 30000 });

  await use(page);

  // Best-effort cleanup (don’t mask the test failure)
  try {
    await page.evaluate(() => {
      localStorage.clear();
      sessionStorage.clear();
    });
  } catch {
    // ignore teardown errors
  }
}

If you intentionally need state during a test, consider providing a second fixture (e.g. cleanPage) or a helper to opt out.

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this suggestion.

Comment on lines +22 to +41
testImage: async ({}, use) => {
const fixturesDir = path.join(process.cwd(), 'test-fixtures');
const imagePath = path.join(fixturesDir, 'test-image.png');

// Ensure the directory exists
if (!fs.existsSync(fixturesDir)) {
fs.mkdirSync(fixturesDir, { recursive: true });
}

// Create a simple 1x1 PNG image if it doesn't exist
if (!fs.existsSync(imagePath)) {
const pngBuffer = Buffer.from(
'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==',
'base64'
);
fs.writeFileSync(imagePath, pngBuffer);
}

await use(imagePath);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testImage writes to a shared path under process.cwd()/test-fixtures/test-image.png. With parallel workers and sharding, multiple processes can read/write the same file concurrently. This can be flaky on Windows and can occasionally race in CI if the directory is created simultaneously.

It also writes into the repo workspace, which some CI setups treat as read-only or cache in unexpected ways.

Suggestion

Prefer creating the image under Playwright’s per-test output directory to guarantee isolation and avoid shared writes:

import { test as base } from '@playwright/test';

// ...

testImage: async ({}, use, testInfo) => {
  const imagePath = testInfo.outputPath('test-image.png');
  const pngBuffer = Buffer.from('...', 'base64');
  await fs.promises.writeFile(imagePath, pngBuffer);
  await use(imagePath);
}

If you want to avoid rewriting for every test, write once in globalSetup and keep it read-only thereafter.

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this suggestion.

Comment on lines 3 to 48
export default defineConfig({
testDir: './tests',
globalSetup: require.resolve('./tests/global-setup'),
fullyParallel: true,
forbidOnly: !!process.env.CI,
retries: process.env.CI ? 2 : 0,
workers: process.env.CI ? 1 : undefined,
workers: process.env.CI ? 4 : undefined,
reporter: 'html',
use: {
baseURL: 'http://localhost:3000',
trace: 'on-first-retry',
storageState: undefined,
},
projects: [
{
name: 'chromium',
use: { ...devices['Desktop Chrome'] },
testMatch: /.*\.spec\.ts/,
},
{
name: 'firefox',
use: { ...devices['Desktop Firefox'] },
testIgnore: /mobile\.spec\.ts/,
},
{
name: 'webkit',
use: { ...devices['Desktop Safari'] },
testIgnore: /mobile\.spec\.ts/,
},
{
name: 'Mobile Chrome',
use: { ...devices['Pixel 5'] },
testMatch: /mobile\.spec\.ts/,
},
{
name: 'Mobile Safari',
use: { ...devices['iPhone 12'] },
testMatch: /mobile\.spec\.ts/,
},
],
/* webServer: {
command: process.env.CI ? 'npm run build && npm run start' : 'npm run dev',
webServer: {
command: process.env.CI ? 'bun run build && bun run start' : 'bun run dev',
url: 'http://localhost:3000',
reuseExistingServer: !process.env.CI,
timeout: 600000,
}, */
},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You set workers: 4 in CI and also shard into 4 jobs. That yields up to 16 concurrent workers (4 jobs × 4 workers) against the same webServer config per job.

This can overload CI resources and make tests less reliable (especially with WebGL/map tests and any shared external dependencies). If the goal is 4-way parallelization total, you typically want either sharding or increased workers, not both.

Suggestion

Decide on one primary scaling mechanism:

  • Option A (recommended for CI predictability): shard in GitHub Actions and keep workers: 1 per shard.
  • Option B: no sharding, keep workers: 4 in a single job.

If you keep both, cap workers when sharding is enabled:

const isSharded = !!process.env.SHARD_INDEX && !!process.env.SHARD_TOTAL;
workers: process.env.CI ? (isSharded ? 1 : 4) : undefined,

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this suggestion.

Comment on lines 43 to 48
webServer: {
command: process.env.CI ? 'bun run build && bun run start' : 'bun run dev',
url: 'http://localhost:3000',
reuseExistingServer: !process.env.CI,
timeout: 600000,
}, */
},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

webServer.command runs bun run build && bun run start. next start typically blocks and does not exit; Playwright’s webServer runner will manage the lifecycle, but only if it can reliably detect readiness via url.

Right now url is hard-coded to http://localhost:3000. If the app is configured to bind a different port (common in CI) or requires env vars, the server may start but never become "ready" to Playwright, causing shard timeouts.

Also, the workflow sets AUTH_DISABLED_FOR_DEV: true and other env, but webServer is what actually starts the app; unless Playwright passes those env vars through (it uses the parent process env), this will work only if the workflow env is present for the Playwright process (it is) — but it’s fragile if someone runs tests locally with different env expectations.

Suggestion

Parameterize baseURL/webServer.url (and optionally port) via process.env.PLAYWRIGHT_BASE_URL / PORT (e.g. const port = process.env.PORT ?? '3000') and set webServer.env explicitly for required flags like AUTH_DISABLED_FOR_DEV. This reduces CI/local drift and avoids hangs when port differs. Reply with "@CharlieHelps yes please" if you'd like me to add a commit implementing this.

@charliecreates charliecreates bot removed the request for review from CharlieHelps January 29, 2026 17:37
@ngoiyaeric
Copy link
Collaborator Author

@coderabbitai right now the system does not say what the progress for the playwright

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 29, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@ngoiyaeric, I see your message got cut off, but I understand you're asking about progress reporting for Playwright tests. Let me check the current configuration to help you add progress visibility.


🏁 Script executed:

#!/bin/bash
# Check the current Playwright configuration for reporter settings
cat playwright.config.ts

echo -e "\n---\n"

# Check the test scripts in package.json
cat package.json | jq -r '.scripts | to_entries[] | select(.key | startswith("test:e2e")) | "\(.key): \(.value)"'

echo -e "\n---\n"

# Check the CI workflow configuration
cat .github/workflows/playwright.yml

Length of output: 2991


@ngoiyaeric, I can see the issue! Your current configuration only uses the html reporter, which doesn't show progress during test execution - it only generates a report after all tests complete. With 4 parallel shards, you definitely want real-time progress visibility.

Here's how to add progress reporting:

Update playwright.config.ts:

reporter: process.env.CI 
  ? [
      ['list'],                    // Shows progress in console as tests run
      ['github'],                  // Adds GitHub Actions annotations on failures
      ['html', { open: 'never' }], // Generates HTML report (don't auto-open in CI)
    ]
  : 'html',  // Keep simple HTML reporter for local development

Reporter options explained:

  • list: Shows each test as it starts/completes with pass/fail status - perfect for CI logs
  • github: Adds annotations to your PR with clickable test failures
  • html: Generates the detailed HTML report you can download from artifacts

Alternative (more compact):

reporter: process.env.CI
  ? [['line'], ['github'], ['html', { open: 'never' }]]
  : 'html',

The line reporter is more compact than list (one line per test file instead of per test).

For local testing with progress, you can also update your scripts in package.json:

"test:e2e:parallel": "playwright test --workers=4 --reporter=list"

This will give you clear visibility into which tests are running, passing, or failing across all 4 shards in your CI pipeline!

Would you like me to create these changes for you?

@google-labs-jules
Copy link
Contributor

@coderabbitai right now the system does not say what the progress for the playwright

I've addressed this by improving the progress visibility in CI. Specifically, I've:

  1. Added the list reporter in CI to show real-time test progress in the logs.
  2. Enabled stdout: 'pipe' and stderr: 'pipe' in the webServer configuration so that build and server logs are visible during the test run.
  3. Updated the GitHub Actions workflow step name to include shard information (e.g., 'Shard 1/4').

- Added 'list' and 'github' reporters for CI runs.
- Enabled stdout/stderr piping for webServer to show build/server logs.
- Added shard info to the GitHub Actions test step name.
- Addressed PR feedback regarding Playwright progress reporting.

Co-authored-by: ngoiyaeric <115367894+ngoiyaeric@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@playwright.config.ts`:
- Line 14: Remove the redundant explicit "storageState: undefined" entry from
the Playwright config; locate the storageState property in the exported config
object (playwright.config.ts) and delete that line so the config relies on
Playwright's default/intentional state loading instead of an explicit undefined
entry.
- Around line 18-41: The chromium project config (name: 'chromium', use: {
...devices['Desktop Chrome'] }) currently uses testMatch: /.*\.spec\.ts/ which
includes mobile.spec.ts; change it to exclude mobile specs like the other
desktop projects by replacing testMatch with testIgnore: /mobile\.spec\.ts/
(mirror the pattern used in the 'firefox' and 'webkit' project entries) so
mobile-only tests do not run on the 'chromium' project.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26b2854 and e7b97c6.

📒 Files selected for processing (2)
  • .github/workflows/playwright.yml
  • playwright.config.ts
🧰 Additional context used
🧬 Code graph analysis (1)
playwright.config.ts (4)
tests/calendar.spec.ts (3)
  • test (3-154)
  • page (83-100)
  • page (119-137)
tests/chat.spec.ts (1)
  • test (4-89)
tests/responsive.spec.ts (4)
  • test (94-191)
  • test (3-40)
  • test (42-92)
  • test (262-297)
tests/header.spec.ts (1)
  • test (3-30)
🔇 Additional comments (5)
playwright.config.ts (2)

5-10: CI setup for globalSetup/workers/reporters looks good.
This configuration aligns with CI execution and adds useful runtime visibility. Please confirm it matches Playwright v1.56.1 reporter syntax and expected behavior.


43-49: Web server lifecycle + output piping is a solid CI improvement.
Piping stdout/stderr helps surface build/server logs while keeping Playwright’s webServer orchestration centralized.

.github/workflows/playwright.yml (3)

13-17: Shard matrix setup looks good.
This cleanly enables 4-way sharding with fail-fast: false for more resilient CI.


48-49: Shard-scoped report artifact naming is clear and actionable.
This makes per-shard debugging straightforward.


56-57: Shard-scoped screenshot artifact naming is consistent.
Nice symmetry with report artifacts.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

- Removed redundant `storageState: undefined` from base config.
- Updated `chromium` project to ignore mobile specs, matching other desktop projects.
- Ensures mobile-only tests do not run in desktop browser projects.

Co-authored-by: ngoiyaeric <115367894+ngoiyaeric@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@playwright.config.ts`:
- Around line 42-49: Reduce the excessive webServer timeout in
playwright.config.ts by setting webServer.timeout to a tighter value (e.g.,
180000 ms) and add a short inline comment explaining the chosen bound (expected
build+start time plus buffer) so reviewers/CI maintainers know why it isn’t the
default; retain stdout: 'pipe' and stderr: 'pipe' as-is to keep log capture.
Ensure you update the webServer block (the timeout field) and add the comment
next to the webServer configuration entry.
- Around line 9-10: The reporter configuration uses nested arrays but provides
no per-reporter options; update the reporter assignment in the playwright config
(the reporter property) to use a flat array when process.env.CI is truthy (e.g.,
change [['list'], ['github'], ['html']] to ['list', 'github', 'html']) so the
behavior is identical but the code is cleaner and easier to read.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e7b97c6 and 65c34d2.

📒 Files selected for processing (1)
  • playwright.config.ts
🔇 Additional comments (2)
playwright.config.ts (2)

5-5: LGTM!

Using require.resolve() is the correct pattern for globalSetup, ensuring proper path resolution for TypeScript files.


15-40: LGTM! Proper desktop/mobile test isolation.

The testIgnore and testMatch patterns correctly ensure:

  • Desktop browsers (chromium, firefox, webkit) skip mobile-specific tests
  • Mobile browsers (Pixel 5, iPhone 12) only run mobile-specific tests

This addresses the previous review concern about mobile tests running on desktop projects.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 9 to 10
workers: process.env.CI ? 4 : undefined,
reporter: process.env.CI ? [['list'], ['github'], ['html']] : 'html',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider simplifying the reporter array format.

The nested array format ([['list'], ['github'], ['html']]) is typically used when passing reporter options. Since no options are specified, the flat array format is cleaner and equivalent.

♻️ Suggested simplification
   workers: process.env.CI ? 4 : undefined,
-  reporter: process.env.CI ? [['list'], ['github'], ['html']] : 'html',
+  reporter: process.env.CI ? ['list', 'github', 'html'] : 'html',
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
workers: process.env.CI ? 4 : undefined,
reporter: process.env.CI ? [['list'], ['github'], ['html']] : 'html',
workers: process.env.CI ? 4 : undefined,
reporter: process.env.CI ? ['list', 'github', 'html'] : 'html',
🤖 Prompt for AI Agents
In `@playwright.config.ts` around lines 9 - 10, The reporter configuration uses
nested arrays but provides no per-reporter options; update the reporter
assignment in the playwright config (the reporter property) to use a flat array
when process.env.CI is truthy (e.g., change [['list'], ['github'], ['html']] to
['list', 'github', 'html']) so the behavior is identical but the code is cleaner
and easier to read.

Comment on lines +42 to +49
webServer: {
command: process.env.CI ? 'bun run build && bun run start' : 'bun run dev',
url: 'http://localhost:3000',
reuseExistingServer: !process.env.CI,
timeout: 600000,
}, */
stdout: 'pipe',
stderr: 'pipe',
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider reducing the webServer timeout or documenting the rationale.

The 10-minute (600000ms) timeout is quite generous. While build times can vary, this may mask issues where the server fails to start within a reasonable window. If typical build + start times are known, consider:

  • Reducing to a tighter bound (e.g., 180000ms = 3 minutes) with buffer
  • Adding a comment explaining why 10 minutes is necessary

The stdout: 'pipe' and stderr: 'pipe' configuration is excellent for capturing server logs during CI failures.

♻️ Suggested adjustment with documentation
   webServer: {
     command: process.env.CI ? 'bun run build && bun run start' : 'bun run dev',
     url: 'http://localhost:3000',
     reuseExistingServer: !process.env.CI,
-    timeout: 600000,
+    timeout: 180000, // 3 minutes - adjust based on typical build times
     stdout: 'pipe',
     stderr: 'pipe',
   },
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
webServer: {
command: process.env.CI ? 'bun run build && bun run start' : 'bun run dev',
url: 'http://localhost:3000',
reuseExistingServer: !process.env.CI,
timeout: 600000,
}, */
stdout: 'pipe',
stderr: 'pipe',
},
webServer: {
command: process.env.CI ? 'bun run build && bun run start' : 'bun run dev',
url: 'http://localhost:3000',
reuseExistingServer: !process.env.CI,
timeout: 180000, // 3 minutes - adjust based on typical build times
stdout: 'pipe',
stderr: 'pipe',
},
🤖 Prompt for AI Agents
In `@playwright.config.ts` around lines 42 - 49, Reduce the excessive webServer
timeout in playwright.config.ts by setting webServer.timeout to a tighter value
(e.g., 180000 ms) and add a short inline comment explaining the chosen bound
(expected build+start time plus buffer) so reviewers/CI maintainers know why it
isn’t the default; retain stdout: 'pipe' and stderr: 'pipe' as-is to keep log
capture. Ensure you update the webServer block (the timeout field) and add the
comment next to the webServer configuration entry.

@ngoiyaeric
Copy link
Collaborator Author

@coderabbitai how come the tests did not run to completion

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 31, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@ngoiyaeric, let me check the GitHub Actions workflow runs for this PR to see what happened with the tests.


🏁 Script executed:

#!/bin/bash

# Check the workflow runs for this PR
echo "=== Workflow Runs for PR `#462` ==="
gh pr checks 462 --repo QueueLab/QCX

echo -e "\n=== Recent Playwright Workflow Runs ==="
gh run list --workflow=playwright.yml --repo QueueLab/QCX --limit 5 --json databaseId,status,conclusion,createdAt,headBranch,event

echo -e "\n=== Latest Run Details for this Branch ==="
gh run list --workflow=playwright.yml --repo QueueLab/QCX --branch improve-e2e-testing-suite-4516796761612256463 --limit 1 --json databaseId,status,conclusion,createdAt,displayTitle,jobs

echo -e "\n=== Job Details from Latest Run ==="
LATEST_RUN=$(gh run list --workflow=playwright.yml --repo QueueLab/QCX --branch improve-e2e-testing-suite-4516796761612256463 --limit 1 --json databaseId --jq '.[0].databaseId')
if [ -n "$LATEST_RUN" ]; then
  echo "Run ID: $LATEST_RUN"
  gh run view $LATEST_RUN --repo QueueLab/QCX --log-failed
fi

Length of output: 779


@ngoiyaeric, I found the issue! The Playwright workflow never ran at all - there are no workflow runs for your branch. This means the tests weren't executed, not that they failed to complete.

Let me check the workflow configuration to see why it wasn't triggered:


🏁 Script executed:

#!/bin/bash

# Check if the workflow file exists on this branch
echo "=== Checking workflow file on branch ==="
gh api "/repos/QueueLab/QCX/contents/.github/workflows/playwright.yml?ref=improve-e2e-testing-suite-4516796761612256463" \
  --jq '.name, .size, .path' 2>&1 || echo "Workflow file not found on branch"

echo -e "\n=== Checking workflow syntax and triggers ==="
# Get the workflow file content from the PR
gh pr diff 462 --repo QueueLab/QCX | grep -A 20 "^+++ b/.github/workflows/playwright.yml" || echo "No playwright.yml changes in diff"

echo -e "\n=== Checking all workflows on branch ==="
gh api "/repos/QueueLab/QCX/contents/.github/workflows?ref=improve-e2e-testing-suite-4516796761612256463" \
  --jq '.[].name' 2>&1

echo -e "\n=== Checking repository Actions status ==="
gh api "/repos/QueueLab/QCX" --jq '.has_issues, .has_discussions, .disabled'

echo -e "\n=== Fetching the actual workflow file content ==="
gh api "/repos/QueueLab/QCX/contents/.github/workflows/playwright.yml?ref=improve-e2e-testing-suite-4516796761612256463" \
  --jq '.content' | base64 -d

Length of output: 2309


@ngoiyaeric, I found the problem! The workflow is being added or modified in this PR, and GitHub has a security feature that prevents workflows from running until they exist on the base branch (main).

Let me verify this:


🏁 Script executed:

#!/bin/bash

echo "=== Checking if playwright.yml exists on main branch ==="
gh api "/repos/QueueLab/QCX/contents/.github/workflows/playwright.yml?ref=main" \
  --jq '.name, .size' 2>&1 || echo "Workflow does NOT exist on main branch"

echo -e "\n=== Comparing file sizes (if exists on both) ==="
MAIN_SIZE=$(gh api "/repos/QueueLab/QCX/contents/.github/workflows/playwright.yml?ref=main" --jq '.size' 2>/dev/null || echo "0")
PR_SIZE=$(gh api "/repos/QueueLab/QCX/contents/.github/workflows/playwright.yml?ref=improve-e2e-testing-suite-4516796761612256463" --jq '.size' 2>/dev/null || echo "0")
echo "Main branch size: $MAIN_SIZE bytes"
echo "PR branch size: $PR_SIZE bytes"

if [ "$MAIN_SIZE" -ne "$PR_SIZE" ]; then
  echo "✗ File sizes differ - workflow was modified in this PR"
else
  echo "✓ File sizes match"
fi

echo -e "\n=== All workflows currently on main branch ==="
gh api "/repos/QueueLab/QCX/contents/.github/workflows?ref=main" --jq '.[].name' 2>&1

Length of output: 327


Found it! 🎯 The workflow exists on main (1338 bytes) but you modified it in this PR (now 1444 bytes).

GitHub's security policy blocks modified workflows from running automatically in PRs to prevent malicious code execution. This is why your tests never started.

Solutions:

1. Get Workflow Approval (Recommended)
A repository maintainer needs to approve the workflow run. They can do this by:

  • Going to the Actions tab
  • Finding your PR's workflow run (it may show as "waiting for approval")
  • Clicking "Approve and run"

2. Manual Trigger
You could add workflow_dispatch to allow manual triggering:

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main, develop ]
  workflow_dispatch:  # Add this to allow manual runs

3. Merge to Test
Once this PR is merged to main, future PRs will run the tests automatically since the workflow will be on the base branch.

Let me check if there's a pending approval needed:


🏁 Script executed:

#!/bin/bash

echo "=== Checking for pending workflow runs ==="
gh api "/repos/QueueLab/QCX/actions/runs?event=pull_request&branch=improve-e2e-testing-suite-4516796761612256463&status=waiting" \
  --jq '.workflow_runs[] | {id: .id, name: .name, status: .status, conclusion: .conclusion}'

echo -e "\n=== Repository settings for fork PRs ==="
gh api "/repos/QueueLab/QCX" --jq '{
  fork_pr_approval: .allow_forking,
  actions_enabled: (if .disabled then "disabled" else "enabled" end)
}'

echo -e "\n=== Checking who can approve (maintainer info) ==="
gh api "/repos/QueueLab/QCX" --jq '.owner.login'

Length of output: 199


@ngoiyaeric, here's the complete picture:

Why tests didn't run:

Your PR modifies the workflow file (1338 → 1444 bytes), and GitHub blocks modified workflows from auto-running in PRs as a security measure. There are currently no workflow runs queued - it never even started.

How to fix it:

Option 1: Push a new commit (Easiest)

Sometimes just pushing a new commit can trigger the approval prompt:

git commit --allow-empty -m "chore: trigger workflow"
git push

Then check the Actions tab - you should see a workflow waiting for approval that you can approve yourself (if you're a maintainer).

Option 2: Add manual trigger capability

Add this to your workflow file to enable manual runs:

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main, develop ]
  workflow_dispatch:  # ← Add this line

Then you can manually trigger it from the Actions tab → Playwright Tests → Run workflow.

Option 3: Approve via GitHub UI (if maintainer)

  1. Go to: https://github.com/QueueLab/QCX/actions
  2. Look for any pending workflow runs from your branch
  3. Click "Approve and run"

Option 4: Quick validation path

Since this appears to be an internal PR (you're in the QueueLab org), you could:

  • Merge to develop first to validate the workflow
  • Or have another maintainer approve the workflow run

Note: Once this PR merges, future PRs won't have this issue since the workflow will already exist on the base branch! 🎉

Would you like me to help with any of these approaches?

@google-labs-jules
Copy link
Contributor

@coderabbitai how come the tests did not run to completion

I've investigated this and made several improvements to ensure the tests run to completion reliably:

  1. Database Setup: I've added a PostgreSQL service container and a migration step to the CI workflow. Previously, the app might have been failing to start because it couldn't connect to a database.
  2. Resource Management: I've reduced the number of concurrent workers in CI from 4 to 2. Standard GitHub runners have 2 CPUs, so 4 workers (plus the Next.js build and start processes) likely caused resource contention or OOM errors.
  3. Report Merging: I've implemented Playwright blob reporting and a separate 'merge-reports' job so that even with sharding, we get a unified result.
  4. Better Logging: With stdout: 'pipe' enabled in the webServer config, any future start-up failures will be clearly visible in the CI logs.

- Added PostgreSQL service container to GitHub Actions.
- Added migration step to CI to ensure app readiness.
- Reduced CI workers to 2 to prevent resource contention.
- Implemented Playwright blob reporting and report merging.
- Improved diagnostic logging in CI.

Co-authored-by: ngoiyaeric <115367894+ngoiyaeric@users.noreply.github.com>
@gitguardian
Copy link

gitguardian bot commented Jan 31, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
26639707 Triggered Generic Password 2438740 .github/workflows/playwright.yml View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/playwright.yml:
- Around line 50-54: lib/db/migrate.ts currently forces SSL for all DB
connections causing CI failures; update the DB client creation logic (the code
that instantiates the Pool/Client used by your migration runner) to parse the
DATABASE_URL (new URL(process.env.DATABASE_URL)) and conditionally set ssl only
when the host is not localhost/127.0.0.1 (and not the Docker/GitHub Actions
local service); i.e., if url.hostname is 'localhost' or '127.0.0.1' then disable
ssl, otherwise set ssl: { rejectUnauthorized: true } (or your existing ssl
config) so migrations run in CI against the non-SSL postgres service.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 65c34d2 and 2438740.

📒 Files selected for processing (2)
  • .github/workflows/playwright.yml
  • playwright.config.ts
🧰 Additional context used
🧬 Code graph analysis (1)
.github/workflows/playwright.yml (1)
lib/db/migrate.ts (1)
  • runMigrations (8-34)
🪛 Checkov (3.2.334)
.github/workflows/playwright.yml

[medium] 53-54: Basic Auth Credentials

(CKV_SECRET_4)

🔇 Additional comments (7)
.github/workflows/playwright.yml (3)

13-34: Postgres service + sharded matrix setup looks solid.

Health checks and fail-fast: false are good for shard stability.


63-69: Shard artifact upload looks good.

Clear naming and short retention keep storage tidy.


71-102: Merged report job wiring is clean.

Downloading all blob reports and merging to a single HTML artifact is exactly what CI reviewers need.

playwright.config.ts (4)

5-5: Global setup hook wired in.

Centralized pre-test initialization is a good fit here.


9-10: CI workers + blob reporter align with sharded CI.

The blob reporter pairs cleanly with the merge-reports job.


16-40: Desktop/mobile project split is explicit.

Ignoring mobile.spec.ts on desktop and matching it on mobile projects prevents cross-target runs.


42-49: Web server lifecycle config is solid.

CI build/start command plus piped logs are helpful for debugging.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +50 to 54
- name: Run migrations
run: bun run db:migrate
env:
# Add any required environment variables for build
NODE_ENV: production
DATABASE_URL: postgresql://postgres:postgres@localhost:5432/postgres

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "migrate.ts" -type f

Repository: QueueLab/QCX

Length of output: 75


🏁 Script executed:

cat -n ./lib/db/migrate.ts

Repository: QueueLab/QCX

Length of output: 1732


🏁 Script executed:

cat -n .github/workflows/playwright.yml

Repository: QueueLab/QCX

Length of output: 3213


🏁 Script executed:

# Check if there's an EXECUTE_MIGRATIONS variable set in the workflow
grep -n "EXECUTE_MIGRATIONS" .github/workflows/playwright.yml

Repository: QueueLab/QCX

Length of output: 38


🏁 Script executed:

cat -n package.json

Repository: QueueLab/QCX

Length of output: 4904


Fix SSL configuration for localhost Postgres in CI migrations.

lib/db/migrate.ts forces SSL for all connections, but the GitHub Actions postgres:15 service doesn't support SSL by default. This causes bun run db:migrate to fail in CI with a protocol mismatch error. Make SSL conditional on non-localhost connections.

🛠️ Suggested fix for lib/db/migrate.ts
-  const pool = new Pool({
-    connectionString: process.env.DATABASE_URL,
-    ssl: {
-      rejectUnauthorized: false, // Ensure this is appropriate for your Supabase connection
-    },
-  });
+  const connectionString = process.env.DATABASE_URL;
+  const useSsl =
+    !!connectionString && !/localhost|127\.0\.0\.1/.test(connectionString);
+  const pool = new Pool({
+    connectionString,
+    ssl: useSsl ? { rejectUnauthorized: false } : false,
+  });
🧰 Tools
🪛 Checkov (3.2.334)

[medium] 53-54: Basic Auth Credentials

(CKV_SECRET_4)

🤖 Prompt for AI Agents
In @.github/workflows/playwright.yml around lines 50 - 54, lib/db/migrate.ts
currently forces SSL for all DB connections causing CI failures; update the DB
client creation logic (the code that instantiates the Pool/Client used by your
migration runner) to parse the DATABASE_URL (new URL(process.env.DATABASE_URL))
and conditionally set ssl only when the host is not localhost/127.0.0.1 (and not
the Docker/GitHub Actions local service); i.e., if url.hostname is 'localhost'
or '127.0.0.1' then disable ssl, otherwise set ssl: { rejectUnauthorized: true }
(or your existing ssl config) so migrations run in CI against the non-SSL
postgres service.

- Merged the latest changes from `main`.
- Stabilized CI testing with a PostgreSQL service container and migrations.
- Reduced CI worker count for standard runner stability.
- Implemented sharded report merging in GitHub Actions.
- Refined Playwright configuration for project isolation and progress reporting.
- Addressed PR feedback regarding test progress visibility and diagnostic logs.

Co-authored-by: ngoiyaeric <115367894+ngoiyaeric@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@app/actions.tsx`:
- Around line 49-55: Parsed drawnFeatures may not be an array or may contain
nulls; after parsing drawnFeaturesString in the block that sets drawnFeatures,
validate that the parsed value is an array (use Array.isArray) and then filter
out falsy or malformed entries before assigning to drawnFeatures so
resolutionSearch receives a clean DrawnFeature[]; specifically, replace the
unconditional JSON.parse result assignment with a conditional that checks
Array.isArray(parsed), sets drawnFeatures = parsed.filter(f => f && /* check
required props like geometry or type */), and fall back to [] if validation
fails so mapping in resolutionSearch cannot throw.

In `@lib/agents/resolution-search.tsx`:
- Line 1: The streamObject call in resolution-search uses resolutionSearchSchema
but getModel() (in lib/utils/index.ts) may return xAI grok models that reject
structured-output; update getModel to accept a boolean flag (e.g.,
structuredOutputRequired) or add a new helper getModelForStructuredOutput() that
always returns OpenAI gpt-4o, then call that helper from resolution-search where
streamObject is invoked (reference the streamObject usage and
resolutionSearchSchema) so structured-output uses gpt-4o; apply the same change
to other structured-output consumers (geojsonParser, map-command-generator,
feedback-analyzer) to ensure they also call
getModelForStructuredOutput()/getModel(...structuredOutputRequired=true).
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2438740 and 125cded.

📒 Files selected for processing (3)
  • app/actions.tsx
  • components/header-search-button.tsx
  • lib/agents/resolution-search.tsx
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-01-16T10:02:40.220Z
Learnt from: ngoiyaeric
Repo: QueueLab/QCX PR: 0
File: :0-0
Timestamp: 2026-01-16T10:02:40.220Z
Learning: The router agent and resolution search agent must implement a feedback loop where tool execution results (including GeoTIFF URLs, embeddings, and satellite analysis data) are fed back to the AI model's context through enriched messages before generating the final response, ensuring the foundational model receives and can synthesize all satellite intelligence data.

Applied to files:

  • lib/agents/resolution-search.tsx
  • app/actions.tsx
📚 Learning: 2026-01-13T13:26:30.086Z
Learnt from: ngoiyaeric
Repo: QueueLab/QCX PR: 0
File: :0-0
Timestamp: 2026-01-13T13:26:30.086Z
Learning: Workers that rely on ai.generateObject (geojsonParser, map-command-generator, feedback-analyzer) must not use the X.AI 'grok-4-fast-non-reasoning' model because X.AI’s chat/completions rejects JSON Schema structured-output arguments; prefer OpenAI gpt-4o for structured outputs.

Applied to files:

  • lib/agents/resolution-search.tsx
📚 Learning: 2026-01-17T06:14:51.070Z
Learnt from: ngoiyaeric
Repo: QueueLab/QCX PR: 0
File: :0-0
Timestamp: 2026-01-17T06:14:51.070Z
Learning: For structured output tasks using ai.generateObject (resolution-search, geojsonParser, map-command-generator, feedback-analyzer), prefer OpenAI gpt-4o. While xAI's grok-4-1-fast-reasoning technically supports structured outputs and vision, OpenAI has proven more reliable integration with the AI SDK's generateObject function and Zod schemas in production.

Applied to files:

  • lib/agents/resolution-search.tsx
🧬 Code graph analysis (2)
components/header-search-button.tsx (2)
components/map/google-map.tsx (2)
  • features (38-52)
  • df (39-46)
components/map/mapbox-map.tsx (3)
  • prevData (166-166)
  • feature (78-164)
  • feature (284-286)
app/actions.tsx (1)
lib/agents/resolution-search.tsx (2)
  • DrawnFeature (26-31)
  • resolutionSearch (33-79)
🔇 Additional comments (5)
components/header-search-button.tsx (1)

89-94: Include drawn features in the payload (LGTM).
This keeps resolution search aligned with user annotations.

lib/agents/resolution-search.tsx (2)

26-33: Nice: explicit DrawnFeature contract + optional parameter.


49-51: Prompt enrichment with drawn features looks good.

app/actions.tsx (2)

15-15: DrawnFeature import aligns with the new payload.


98-113: AI SDK contract confirmed. In ai@4.3.19, streamObject() correctly exposes partialObjectStream (async iterable of partial objects) and object (promise of final typed result). The code properly iterates over partial updates and awaits the complete result. Implementation is sound.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 49 to 55
const drawnFeaturesString = formData?.get('drawnFeatures') as string;
let drawnFeatures: DrawnFeature[] = [];
try {
drawnFeatures = drawnFeaturesString ? JSON.parse(drawnFeaturesString) : [];
} catch (e) {
console.error('Failed to parse drawnFeatures:', e);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate parsed drawnFeatures shape before use.
A non-array JSON payload (or null entries) will cause resolutionSearch to throw when it maps the features. Guard the shape before passing it downstream.

🛡️ Proposed fix (shape guard)
-    let drawnFeatures: DrawnFeature[] = [];
-    try {
-      drawnFeatures = drawnFeaturesString ? JSON.parse(drawnFeaturesString) : [];
-    } catch (e) {
-      console.error('Failed to parse drawnFeatures:', e);
-    }
+    let drawnFeatures: DrawnFeature[] = [];
+    try {
+      const parsed = drawnFeaturesString ? JSON.parse(drawnFeaturesString) : [];
+      drawnFeatures = Array.isArray(parsed)
+        ? parsed.filter((f): f is DrawnFeature => f && typeof f === 'object')
+        : [];
+    } catch (e) {
+      console.error('Failed to parse drawnFeatures:', e);
+    }
🤖 Prompt for AI Agents
In `@app/actions.tsx` around lines 49 - 55, Parsed drawnFeatures may not be an
array or may contain nulls; after parsing drawnFeaturesString in the block that
sets drawnFeatures, validate that the parsed value is an array (use
Array.isArray) and then filter out falsy or malformed entries before assigning
to drawnFeatures so resolutionSearch receives a clean DrawnFeature[];
specifically, replace the unconditional JSON.parse result assignment with a
conditional that checks Array.isArray(parsed), sets drawnFeatures =
parsed.filter(f => f && /* check required props like geometry or type */), and
fall back to [] if validation fails so mapping in resolutionSearch cannot throw.

@@ -1,4 +1,4 @@
import { CoreMessage, generateObject } from 'ai'
import { CoreMessage, streamObject } from 'ai'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, verify the file exists and check its content
head -100 lib/agents/resolution-search.tsx

Repository: QueueLab/QCX

Length of output: 3482


🏁 Script executed:

# Search for getModel implementation across the codebase
rg -n "function getModel|const getModel|export.*getModel" -A 10

Repository: QueueLab/QCX

Length of output: 837


🏁 Script executed:

# Also check if streamObject is used in resolution-search and how it uses getModel
rg -n "streamObject" lib/agents/resolution-search.tsx -B 2 -A 5

Repository: QueueLab/QCX

Length of output: 497


🏁 Script executed:

# Check if there are any model selection patterns in the file
rg -n "grok|gpt-4o|xAI|openai" lib/agents/resolution-search.tsx -i

Repository: QueueLab/QCX

Length of output: 38


🏁 Script executed:

# Read the full getModel function implementation
cat -n lib/utils/index.ts | head -100

Repository: QueueLab/QCX

Length of output: 4213


🏁 Script executed:

# Get more context if the function is longer
wc -l lib/utils/index.ts

Repository: QueueLab/QCX

Length of output: 78


🏁 Script executed:

# Get the rest of the getModel function
tail -n +100 lib/utils/index.ts

Repository: QueueLab/QCX

Length of output: 552


Ensure OpenAI gpt-4o is used for streamObject structured-output calls in resolution-search.

The current getModel() function can return xAI's grok-4-fast-non-reasoning model (lines 84 in lib/utils/index.ts) as a fallback or user selection, which will fail at runtime. XAI's grok models reject JSON Schema structured-output arguments; streamObject at lines 72-78 with resolutionSearchSchema requires OpenAI gpt-4o.

Modify getModel() to accept a parameter indicating structured-output requirement, or create a separate getModelForStructuredOutput() that always returns OpenAI gpt-4o. Apply this to resolution-search (lines 72-79) and other structured-output paths (geojsonParser, map-command-generator, feedback-analyzer).

🤖 Prompt for AI Agents
In `@lib/agents/resolution-search.tsx` at line 1, The streamObject call in
resolution-search uses resolutionSearchSchema but getModel() (in
lib/utils/index.ts) may return xAI grok models that reject structured-output;
update getModel to accept a boolean flag (e.g., structuredOutputRequired) or add
a new helper getModelForStructuredOutput() that always returns OpenAI gpt-4o,
then call that helper from resolution-search where streamObject is invoked
(reference the streamObject usage and resolutionSearchSchema) so
structured-output uses gpt-4o; apply the same change to other structured-output
consumers (geojsonParser, map-command-generator, feedback-analyzer) to ensure
they also call
getModelForStructuredOutput()/getModel(...structuredOutputRequired=true).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants