docs(research): cron and scheduling landscape #4
…hape Captures research on how major workflow engines (Inngest, Trigger.dev, Temporal, Cloudflare Workflows, Hatchet, DBOS, Restate, AWS Step Functions, Vercel WDK) model recurring execution. Five consistent patterns surface; informs the shape of a future @tanstack/workflow-cron package. Key finding: every mature engine separates the cron from the workflow body and fires fresh invocations per tick. Nobody runs production workflows as "loop forever with sleep." The closure engine already supports the fresh-per-tick model — a scheduler is purely additive. Also touches up research/README.md to call the directory a sketchpad for upcoming work, not strictly historical.
📝 Walkthrough
Research documentation is added to the TanStack Workflow repository with a new scheduling and cron landscape analysis. The research directory README is clarified to better characterize design notes, and a new …
Changes
Workflow Scheduling Research and Documentation
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@research/SCHEDULING.md`:
- Line 70: The bullet starting with "A driver — polls a schedule store, computes
'next fire time,' fires `runWorkflow`." is a sentence fragment; update it to
include an explicit subject and verb so it's a complete sentence (e.g., "A
driver polls a schedule store, computes the next fire time, and fires
`runWorkflow`."). Edit the line in SCHEDULING.md to replace the fragment with a
full sentence while preserving the examples (Durable Object alarm, AWS Lambda +
EventBridge, Cloudflare Cron Trigger) and backtick formatting for `runWorkflow`.
📒 Files selected for processing (2)
- research/README.md
- research/SCHEDULING.md
Two pieces:

- **Schedule definitions** — declarative, sit next to workflow definitions.
- **A driver** — polls a schedule store, computes "next fire time," fires `runWorkflow`. Can be deployed as a long-running worker, Durable Object alarm, AWS Lambda + EventBridge, Cloudflare Cron Trigger, etc.
Fix sentence fragment for readability.
Line 70 reads as a fragment after the previous sentence. Add an explicit subject to keep the bullet grammatically complete.
Suggested edit
- - **A driver** — polls a schedule store, computes "next fire time," fires `runWorkflow`. Can be deployed as a long-running worker, Durable Object alarm, AWS Lambda + EventBridge, Cloudflare Cron Trigger, etc.
+ - **A driver** — polls a schedule store, computes "next fire time," and fires `runWorkflow`. It can be deployed as a long-running worker, Durable Object alarm, AWS Lambda + EventBridge, Cloudflare Cron Trigger, etc.
🧰 Tools
🪛 LanguageTool
[style] ~70-~70: To form a complete sentence, be sure to include a subject.
Context: ... "next fire time," fires runWorkflow. Can be deployed as a long-running worker, D...
(MISSING_IT_THERE)
Concrete companion to research/SCHEDULING.md: shows the
bring-your-own-scheduler pattern that's possible with what's shipping today.
docs/concepts/scheduling.md
- "Mental model in one paragraph" — external scheduler + fresh
workflow invocation per tick, never loop with sleep
- Six recipes: node-cron, Cloudflare Worker cron, Vercel Cron Job,
AWS EventBridge → Lambda, skip-overlap policy via deterministic
runId, buffer-one policy via local queue
- Sketch of writing your own schedule store + worker
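
As a rough illustration of the skip-overlap recipe named above, the sketch below derives one runId per minute window and skips a tick when a run for that window is already recorded. Everything in it (`runs`, `runIdForTick`, the `start` callback, and this `tickWithSkipOverlap` signature) is hypothetical and stands in for whatever the real store and engine expose; it is not the code from docs/concepts/scheduling.md.

```typescript
// Hypothetical sketch: none of these names are taken from the TanStack Workflow API.
type RunState = 'running' | 'finished'

// In-memory stand-in for whatever store tracks run state.
const runs = new Map<string, RunState>()

// Every tick that lands in the same minute window maps to the same runId.
function runIdForTick(scheduleId: string, firedAt: Date): string {
  const windowStart = Math.floor(firedAt.getTime() / 60_000) * 60_000
  return `${scheduleId}:${new Date(windowStart).toISOString()}`
}

// Returns the runId when a fresh run was started, or null when the tick was skipped.
function tickWithSkipOverlap(
  scheduleId: string,
  firedAt: Date,
  start: (runId: string) => void,
): string | null {
  const runId = runIdForTick(scheduleId, firedAt)
  // State check: this sketch skips whenever the window is already claimed.
  // A real policy might skip only while the existing run is still 'running'.
  if (runs.get(runId) !== undefined) return null
  runs.set(runId, 'running')
  start(runId)
  return runId
}
```

Because the runId is a pure function of the schedule and the window, two schedulers firing the same tick would agree on the id, which is what makes the state check meaningful.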
packages/workflow-core/tests/examples.external-cron.test.ts
- vitest-fake-timer-driven scheduler against the engine, three
scenarios:
- fresh runId per tick, multiple ticks finish independently
- skip-overlap via deterministic runId + state check
- buffer-one via local pending flag, third tick collapses into
the buffer
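
The buffer-one scenario can be pictured with a small state machine: one run in flight, at most one pending tick, and any further ticks collapsing into that pending slot. This is a minimal sketch under assumed names (`BufferOneScheduler`, `startRun`, the `done` callback); it is not the test file's implementation.

```typescript
// Hypothetical buffer-one scheduler; `startRun` is an assumed callback, not an engine API.
type TickResult = 'started' | 'buffered' | 'collapsed'

class BufferOneScheduler {
  private running = false
  private pending = false
  private readonly startRun: (done: () => void) => void

  constructor(startRun: (done: () => void) => void) {
    this.startRun = startRun
  }

  // Called on every cron tick.
  tick(): TickResult {
    if (!this.running) {
      this.launch()
      return 'started'
    }
    if (!this.pending) {
      // Remember exactly one follow-up run.
      this.pending = true
      return 'buffered'
    }
    // A follow-up is already buffered, so this tick collapses into it.
    return 'collapsed'
  }

  private launch(): void {
    this.running = true
    this.startRun(() => this.onDone())
  }

  // Invoked by the run when it finishes; drains the buffer if a tick is pending.
  private onDone(): void {
    this.running = false
    if (this.pending) {
      this.pending = false
      this.launch()
    }
  }
}
```

With this shape, three ticks during one long run produce two runs in total: the original and a single buffered follow-up.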
docs/config.json — adds Concepts → Scheduling nav entry.
research/SCHEDULING.md — points at the new recipes + test under
"Status" so the research and the user-facing material cross-link.
Verification: tsc clean, eslint clean, 105 / 22 tests pass
(+3 new), tsdown build clean.
NOTE: PR CI is currently blocked by a GitHub Actions billing issue
on the TanStack org — no jobs are starting. Push for visibility;
will merge once billing resolves.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/workflow-core/tests/examples.external-cron.test.ts`:
- Around line 179-188: The test uses vi.advanceTimersByTimeAsync(0) to try to
ensure tick1 reached the held step, which is flaky; change it to use an explicit
"step started" latch like the existing pattern used elsewhere in the file: add a
promise/resolver pair (e.g., stepStartedPromise and resolveStepStarted) inside
the held step callback that is resolved when the step begins, await that promise
after creating tick1 (instead of awaiting vi.advanceTimersByTimeAsync(0)), then
create tick2 via tickWithSkipOverlap(), and only call releaseFirstStep() after
the latch ensures the first step is actually running; reference the existing
latch implementation in the file to mirror naming and setup.
📒 Files selected for processing (4)
- docs/concepts/scheduling.md
- docs/config.json
- packages/workflow-core/tests/examples.external-cron.test.ts
- research/SCHEDULING.md
✅ Files skipped from review due to trivial changes (1)
- research/SCHEDULING.md
const tick1 = tickWithSkipOverlap()
// Yield so tick1 progresses to inside the step fn and is "running".
await vi.advanceTimersByTimeAsync(0)

// Fire tick2 — same minute window, first is still running.
const tick2 = tickWithSkipOverlap()
await vi.advanceTimersByTimeAsync(0)

// Release the first tick.
releaseFirstStep!()
Avoid timer-yield synchronization in skip-overlap test; use an explicit “step started” latch.
await vi.advanceTimersByTimeAsync(0) does not reliably guarantee tick1 reached the held step, so releaseFirstStep!() can race and make this test flaky. Use a promise latch (same pattern as Line 207-Line 209) before firing tick2.
Proposed deterministic fix
+ let firstStepStarted: (() => void) | null = null
+ const firstStepStartedPromise = new Promise<void>((r) => {
+   firstStepStarted = r
+ })
  let releaseFirstStep: (() => void) | null = null
  workFn.mockImplementationOnce(async () => {
    invocations++
+   firstStepStarted!()
    await new Promise<void>((r) => {
      releaseFirstStep = r
    })
    return { summary: 'first' }
  })

  const tick1 = tickWithSkipOverlap()
- // Yield so tick1 progresses to inside the step fn and is "running".
- await vi.advanceTimersByTimeAsync(0)
+ // Deterministically wait until tick1 is inside the held step.
+ await firstStepStartedPromise

  // Fire tick2 — same minute window, first is still running.
  const tick2 = tickWithSkipOverlap()
- await vi.advanceTimersByTimeAsync(0)

  // Release the first tick.
  releaseFirstStep!()
Adds `research/SCHEDULING.md` — a comparison of how Inngest, Trigger.dev, Temporal, Cloudflare Workflows, Hatchet, DBOS, Restate, AWS Step Functions, Vercel WDK, Mastra, and LangGraph.js handle cron / recurring execution.
Why
Came up while answering "how do other libraries handle cron?" — worth saving as a sketchpad entry rather than relitigating later.
Key takeaways
Implications
The closure engine needs no changes for cron. A future `@tanstack/workflow-cron` package would be small (schedule definitions + a polling driver). Recommend deferring until at least one durable storage adapter ships.
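
To give a feel for how small the "schedule definitions + a polling driver" pair could be, here is a minimal sketch. It assumes fixed-interval schedules instead of cron expressions, and the `Schedule` shape, `pollOnce`, and the two-argument `runWorkflow` callback are all hypothetical; the real `runWorkflow` signature is not shown in this PR.

```typescript
// Hypothetical shapes; a real package would likely parse cron expressions instead.
interface Schedule {
  id: string
  workflow: string
  everyMs: number // fixed interval, standing in for a cron expression
  nextFireAt: number // epoch milliseconds of the next due tick
}

// One polling pass: fire every due schedule once, then compute its next fire time.
// Returns how many schedules fired.
function pollOnce(
  schedules: Schedule[],
  now: number,
  runWorkflow: (workflow: string, runId: string) => void, // assumed signature
): number {
  let fired = 0
  for (const s of schedules) {
    if (s.nextFireAt > now) continue
    // Fresh invocation per tick, identified by the tick it was due at.
    runWorkflow(s.workflow, `${s.id}:${s.nextFireAt}`)
    // Advance past `now` so missed ticks are not replayed in a burst.
    const missed = Math.floor((now - s.nextFireAt) / s.everyMs) + 1
    s.nextFireAt += missed * s.everyMs
    fired++
  }
  return fired
}
```

The same `pollOnce` could in principle be called from a long-running worker loop or from a platform alarm handler, since it keeps no state beyond the schedule records themselves.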
Test plan