Skip to content

Close race window in StoreService.patch() between snapshot and write-back#5385

Open
richardhjtan wants to merge 2 commits into
mainfrom
cs-11609-listing-create-race-condition-bug
Open

Close race window in StoreService.patch() between snapshot and write-back#5385
richardhjtan wants to merge 2 commits into
mainfrom
cs-11609-listing-create-race-condition-bug

Conversation

@richardhjtan

Copy link
Copy Markdown
Contributor

Summary

StoreService.patch() builds a merged document by snapshotting the target instance, merging the patch on top, then writing the merged document back onto the live instance. The snapshot was taken before resolving the patch's linked-card relationships — a step that can require a network fetch. Any field another caller wrote to the same live instance during that fetch was silently reverted once the stale snapshot was written back, since the write-back replaces the whole document rather than merging.

This reorders the method to resolve linked relationships first, then take the snapshot immediately before the merge and write-back, closing the window. No behavior change for the non-concurrent case.

Test plan

  • Added an integration test that gates the relationship-resolution step on a controlled Deferred, mutates an unrelated field on the same live instance while patch() is in flight, and asserts that field survives.
  • Confirmed locally: fails without the reorder, passes with it.

…rite-back

patch() snapshotted the instance's full document before resolving linked
relationships (a real network gap), then reapplied that snapshot wholesale
afterward. Any field written on the same live instance by other code during
that gap was silently reverted when the stale snapshot landed. Resolving
linked relationships first, then snapshotting immediately before the
write-back, closes the window.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a3298f3af0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// a sibling background task (e.g. an unrelated auto-linking step, as in
// the catalog listing-create flow) mutates a different field on the
// same live instance while the patch above is still in flight
instance.name = 'Race Winner';

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Wait for patch to reach the gated load before mutating

This write runs synchronously immediately after calling async store.patch(), but StoreService.patch() always yields at its initial await this.get(id), so the assignment happens before either the old pre-fix snapshot or the new loadPatchedInstances gate is reached. In the pre-fix implementation the snapshot would already include Race Winner, so the test can pass without the reorder; add a signal inside the stub and perform the write only after the stub is entered while gate is still pending to exercise the stale-snapshot window.

Useful? React with 👍 / 👎.

@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Preview deployments

Host Test Results

    1 files  ±0      1 suites  ±0   2h 33m 31s ⏱️ +21s
3 373 tests ±0  3 358 ✅ ±0  15 💤 ±0  0 ❌ ±0 
3 392 runs  ±0  3 377 ✅ ±0  15 💤 ±0  0 ❌ ±0 

Results for commit 96f3e98. ± Comparison against earlier commit a3298f3.

Realm Server Test Results

    1 files  ±0      1 suites  ±0   9m 35s ⏱️ +19s
1 674 tests ±0  1 674 ✅ ±0  0 💤 ±0  0 ❌ ±0 
1 753 runs  ±0  1 753 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 96f3e98. ± Comparison against earlier commit a3298f3.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a concurrency hazard in StoreService.patch() where a long-running linked-relationship resolution step could occur after taking the serialized snapshot used for merge + write-back, allowing concurrent writes to the same instance to be reverted by the stale snapshot.

Changes:

  • Reorders StoreService.patch() to resolve linked-card relationships before snapshotting/serializing the target instance, shrinking the race window that could clobber concurrent field writes.
  • Adds an integration test that forces patch() to pause during relationship resolution, performs a concurrent write to another field, and asserts that the write survives the patch.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
packages/host/app/services/store.ts Reorders relationship resolution to happen before the snapshot/serialize step in StoreService.patch().
packages/host/tests/integration/store-test.gts Adds a regression test covering a concurrent field write during store.patch() while relationship resolution is artificially delayed.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1600 to +1612
let gate = new Deferred<void>();
let originalLoadPatchedInstances = (
storeService as any
).loadPatchedInstances.bind(storeService);
(storeService as any).loadPatchedInstances = async (...args: any[]) => {
let result = await originalLoadPatchedInstances(...args);
// simulate a slow relationship load (e.g. fetching a not-yet-cached
// linked card) so a concurrent field write on the same live instance
// can land in the window between the patch's initial snapshot and its
// eventual write-back
await gate.promise;
return result;
};
…atch()

The write ran synchronously right after invoking patch(), before the
function had actually reached the relationship-load step it awaits on —
patch() yields at its own initial instance lookup first. That let the write
land before either the old or new snapshot point, so the test could pass
without the fix under test. Wait for a signal fired from inside the stub
once relationship loading is genuinely paused, then write.

Also restore the literal original loadPatchedInstances reference in the
finally block instead of a rebound copy of it.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
@habdelra

habdelra commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

can you provide more context for the reason of this PR? why do we need to load links when we are doing a document patch? linked fields are not part of the patch payload, these are side loaded data. the document being patched should be able to be patched without loading links at all. (a patch should not touch the included[] part of the JSON:API documetn only the field is truly a brand new instance you already have it locally and are supplying a local id.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants