
fix: prevent duplicate preview deployments on concurrent pull_request webhooks #4290

Open
colocated wants to merge 2 commits into Dokploy:canary from colocated:fix/preview-deployment-double-fire-race


@colocated
Contributor

@colocated colocated commented Apr 22, 2026

What is this PR about?

Opening a PR with a configured preview label already attached causes GitHub to fire pull_request.opened and pull_request.labeled in parallel. Both handlers in pages/api/deploy/github.ts passed the same check-then-insert ("does a preview exist for this PR?"), both called createPreviewDeployment, and the result was two separate preview deployments: two rows, two bot comments on the PR, two deploys.

Root cause

createPreviewDeployment was a classic TOCTOU (time-of-check to time-of-use) race:

  1. findPreviewDeploymentByApplicationId(...) returns undefined
  2. Post PR comment
  3. INSERT a new row

No unique constraint on (applicationId, pullRequestId), no advisory lock, no queue-level dedup.
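The interleaving can be reproduced with a minimal in-memory sketch. Everything here (the `rows` array, `tick`, `racyCreate`) is a hypothetical stand-in rather than Dokploy's code; it only mirrors the check, comment, insert order listed above.

```typescript
// Hypothetical in-memory stand-in for the preview_deployments table.
type Preview = { applicationId: string; pullRequestId: string };
const rows: Preview[] = [];

// Simulates a DB or network round trip so two handlers can interleave.
const tick = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

async function findPreview(
  applicationId: string,
  pullRequestId: string,
): Promise<Preview | undefined> {
  await tick();
  return rows.find(
    (r) => r.applicationId === applicationId && r.pullRequestId === pullRequestId,
  );
}

// Check-then-insert with no unique constraint: the racy shape described above.
async function racyCreate(
  applicationId: string,
  pullRequestId: string,
): Promise<Preview> {
  const existing = await findPreview(applicationId, pullRequestId); // 1. check
  if (existing) return existing;
  await tick(); // 2. post PR comment (simulated)
  const row: Preview = { applicationId, pullRequestId };
  rows.push(row); // 3. insert
  return row;
}

// Fires the "opened" and "labeled" handlers in parallel and reports the row count.
async function simulateOpenedAndLabeled(): Promise<number> {
  rows.length = 0;
  await Promise.all([
    racyCreate("app-1", "pr-42"),
    racyCreate("app-1", "pr-42"),
  ]);
  return rows.length;
}
```

Both calls complete their check before either inserts, so the simulated table ends up with two rows for the same (applicationId, pullRequestId) pair.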

Fix

  • DB protection: add a unique index on (applicationId, pullRequestId) on preview_deployments. The migration first deletes existing duplicate rows (keeps the oldest per key) so the index can be built on instances that already have race artifacts.
  • Graceful race handling: createPreviewDeployment now inserts the row first, with an empty-string ("") pullRequestCommentId placeholder (now defaulted). On the resulting unique violation (PostgreSQL SQLSTATE 23505), the loser re-reads the winner's row, logs, and returns it. The GitHub comment is posted only after a successful insert, so the loser never orphans a second bot comment. Comment-post failures are caught and logged without rolling back the row; application.ts already recreates missing comments on the first deploy via issueCommentExists.
  • Queue dedup: enqueue with jobId = preview:<previewDeploymentId>:<sha>. BullMQ ignores duplicate adds while a job with that id is waiting/active, so the opened+labeled burst (same SHA, different action) coalesces to one queued deploy. A later synchronize on a new SHA produces a new id and enqueues normally.
  • Fault isolation: wrap the per-app preview loop body in try/catch so one app's failure is logged and the remaining apps still process.
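The insert-first pattern from the second bullet can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `FakePreviewTable`, `isUniqueViolation`, and `createPreviewInsertFirst` are hypothetical names, the table is an in-memory map that throws an error carrying code "23505" the way a unique index would, and the calls are synchronous for brevity.

```typescript
type PreviewRow = {
  previewDeploymentId: string;
  applicationId: string;
  pullRequestId: string;
  pullRequestCommentId: string;
};

// Hypothetical in-memory table enforcing uniqueness on (applicationId, pullRequestId).
class FakePreviewTable {
  private rows = new Map<string, PreviewRow>();
  private nextId = 1;

  insert(applicationId: string, pullRequestId: string): PreviewRow {
    const key = `${applicationId}:${pullRequestId}`;
    if (this.rows.has(key)) {
      // Mimics the SQLSTATE a unique violation carries in PostgreSQL.
      throw Object.assign(new Error("duplicate key"), { code: "23505" });
    }
    const row: PreviewRow = {
      previewDeploymentId: `preview-${this.nextId++}`,
      applicationId,
      pullRequestId,
      pullRequestCommentId: "", // placeholder until the comment is posted
    };
    this.rows.set(key, row);
    return row;
  }

  find(applicationId: string, pullRequestId: string): PreviewRow | undefined {
    return this.rows.get(`${applicationId}:${pullRequestId}`);
  }

  size(): number {
    return this.rows.size;
  }
}

function isUniqueViolation(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "23505"
  );
}

function createPreviewInsertFirst(
  table: FakePreviewTable,
  applicationId: string,
  pullRequestId: string,
  postComment: () => string,
): PreviewRow {
  let row: PreviewRow | undefined;
  try {
    row = table.insert(applicationId, pullRequestId); // insert before commenting
  } catch (err) {
    if (!isUniqueViolation(err)) throw err; // unrelated errors still propagate
  }
  if (!row) {
    // Race loser: reuse the winner's row and never post a second comment.
    const winner = table.find(applicationId, pullRequestId);
    if (!winner) throw new Error("unique violation but no existing row found");
    console.warn(`preview already exists, reusing ${winner.previewDeploymentId}`);
    return winner;
  }
  try {
    row.pullRequestCommentId = postComment();
  } catch (err) {
    // Logged, not rethrown: the row stays and the comment can be recreated later.
    console.error("failed to post PR comment", err);
  }
  return row;
}
```

Because the uniqueness check lives in the insert itself, the outcome does not depend on which handler arrives first: exactly one caller creates the row and posts the comment, and every other caller receives the same row.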

Benefit

No more duplicate preview apps, bot comments, or deploys from a single PR event. All error paths still log; logging is not suppressed anywhere.

Checklist

  • You created a dedicated branch based on the canary branch.
  • You have read the suggestions in the CONTRIBUTING.md file.
  • You have tested this PR in your local instance.

Issues related (if applicable)

Replaces #4289 (auto-closed by anti-slop on description length).

Greptile Summary

This PR adds three interlocking deduplication layers to prevent duplicate preview deployments triggered by concurrent pull_request.opened + pull_request.labeled webhooks: a DB unique index on (applicationId, pullRequestId), an insert-first / unique-violation-catch pattern in createPreviewDeployment, and BullMQ jobId-based queue dedup. The migration correctly uses a tuple comparison (createdAt, previewDeploymentId) as the deletion predicate, so rows that share the same millisecond timestamp are still ordered deterministically by the primary key — the prior concern about same-millisecond ties is resolved.
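The tie-break described in this summary is a lexicographic tuple comparison. Below is a minimal sketch of which rows such a predicate would select for deletion. It assumes createdAt is held as a same-format ISO-8601 string (so string order matches time order) and uses hypothetical helper names; the migration's actual SQL is not reproduced in this conversation.

```typescript
type Row = {
  previewDeploymentId: string;
  applicationId: string;
  pullRequestId: string;
  createdAt: string; // assumed ISO-8601, identical format on every row
};

// (a.createdAt, a.previewDeploymentId) < (b.createdAt, b.previewDeploymentId)
function isEarlier(a: Row, b: Row): boolean {
  if (a.createdAt !== b.createdAt) return a.createdAt < b.createdAt;
  return a.previewDeploymentId < b.previewDeploymentId;
}

// A row is deleted when another row with the same key sorts strictly before it,
// so exactly one row per (applicationId, pullRequestId) survives.
function rowsToDelete(rows: Row[]): Row[] {
  return rows.filter((candidate) =>
    rows.some(
      (other) =>
        other.applicationId === candidate.applicationId &&
        other.pullRequestId === candidate.pullRequestId &&
        isEarlier(other, candidate),
    ),
  );
}
```

Comparing on createdAt alone would leave both rows of a same-timestamp pair in place, since neither is strictly older; the second tuple element breaks that tie.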

Confidence Score: 4/5

Safe to merge for self-hosted instances; the IS_CLOUD direct-deploy race (flagged in prior review) remains open for multi-replica cloud deployments.

The core TOCTOU race is well-addressed: the unique index enforces the invariant at the DB layer, the insert-first pattern makes the loser gracefully reuse the winner's row, and BullMQ jobId dedup coalesces queue adds across replicas. The migration tie-breaking is correctly handled via the compound (createdAt, previewDeploymentId) predicate. One prior-thread concern (IS_CLOUD cross-replica concurrent deploy() calls) is not addressed in this PR and remains a real issue for cloud multi-replica deployments, preventing a full 5.
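To illustrate how a deterministic job id coalesces the opened+labeled burst, here is a sketch using a hypothetical in-memory queue. `FakeQueue` is not BullMQ's API; it only models the behavior the PR description relies on, namely that an add is ignored while a job with the same id is still pending. The id format follows the one given in the description.

```typescript
// Builds the id format from the PR description: preview:<previewDeploymentId>:<sha>.
function previewJobId(previewDeploymentId: string, sha: string): string {
  return `preview:${previewDeploymentId}:${sha}`;
}

// Hypothetical queue that ignores an add while a job with the same id is pending.
class FakeQueue {
  private pending = new Map<string, string>();

  // Returns true if the job was enqueued, false if it was deduplicated.
  add(jobId: string, action: string): boolean {
    if (this.pending.has(jobId)) return false;
    this.pending.set(jobId, action);
    return true;
  }

  // Marks a job as finished so the same id can be enqueued again.
  complete(jobId: string): void {
    this.pending.delete(jobId);
  }

  size(): number {
    return this.pending.size;
  }
}
```

The webhook action is deliberately left out of the id, so opened and labeled for the same SHA collapse into one job, while a synchronize event on a new SHA yields a different id.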

apps/dokploy/pages/api/deploy/github.ts — the IS_CLOUD direct-deploy branch at line 524 bypasses all queue-level dedup.

Reviews (2): Last reviewed commit: "fix: address greptile review on preview-..."

… webhooks

Opening a PR with a preview label pre-attached fires opened and labeled in
parallel; both raced past the "exists?" check and created two previews.

- Unique index on (applicationId, pullRequestId); migration dedupes first.
- createPreviewDeployment inserts before commenting; loser reuses winner row.
- BullMQ jobId coalesces opened+labeled for the same SHA.
- Per-app try/catch so one failure doesn't drop the batch.
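The per-app isolation in the last item can be sketched like this. `processApps` and its `handle` and `log` parameters are hypothetical names for illustration; the real loop body in github.ts is not shown in this conversation.

```typescript
// Runs the handler for every app; a failure is logged and the loop continues.
function processApps<T>(
  apps: T[],
  handle: (app: T) => void,
  log: (message: string) => void,
): number {
  let succeeded = 0;
  for (const app of apps) {
    try {
      handle(app);
      succeeded++;
    } catch (err) {
      // Logged rather than rethrown, so the remaining apps still process.
      log(`preview deployment failed: ${String(err)}`);
    }
  }
  return succeeded;
}
```

Without the try/catch, the first throwing app would abort the loop and silently skip every app after it.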
@colocated colocated requested a review from Siumauricio as a code owner April 22, 2026 21:57
@dosubot (bot) added the size:L (This PR changes 100-499 lines, ignoring generated files) and bug (Something isn't working) labels Apr 22, 2026
Comment thread apps/dokploy/pages/api/deploy/github.ts
Comment thread apps/dokploy/drizzle/0166_first_warstar.sql Outdated
Comment thread packages/server/src/services/preview-deployment.ts
- Migration DELETE tiebreaks createdAt ties on previewDeploymentId so same-ms
  duplicates don't both survive and fail the unique index build.
- Cloud deploy() path goes through the same in-process dedup gate as BullMQ,
  closing the gap where two concurrent webhooks on one replica fire two
  simultaneous cloud deploys for one preview+SHA.
- Clarify the race-loser's generated appName/domain are intentionally dropped.
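An in-process dedup gate of the kind the second item mentions could look roughly like the sketch below. The names `inFlight` and `deployOnce` are hypothetical and the PR's actual gate may differ. As the commit message notes, a gate like this only covers concurrent webhooks handled by one replica, because the map lives in a single process.

```typescript
// Hypothetical in-process gate keyed by preview + SHA.
const inFlight = new Map<string, Promise<void>>();

// Concurrent callers with the same key share one deploy instead of starting two.
function deployOnce(key: string, deploy: () => Promise<void>): Promise<void> {
  const existing = inFlight.get(key);
  if (existing) return existing;

  const run = deploy().then(
    () => {
      inFlight.delete(key); // allow a later redeploy of the same key
    },
    (err) => {
      inFlight.delete(key);
      throw err; // failures still surface to every waiting caller
    },
  );
  inFlight.set(key, run);
  return run;
}
```

The key is released once the deploy settles, so a later request for the same preview and SHA starts a fresh deploy rather than being dropped.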
@colocated
Contributor Author

@greptileai re-review

@colocated
Contributor Author

@Siumauricio I need some direction here on Greptile's comment about the cloud version. It's not really my area, and I can't really test it either.
