RFC 0013: SQLite State Snapshot Command #20

Open
giodl73-repo wants to merge 22 commits into main from rfc/cloud-serializable-sqlite-state

Conversation

@giodl73-repo

@giodl73-repo giodl73-repo commented Jun 18, 2026

Contributor

Summary

This RFC proposes a narrow core openclaw snapshot command for the host-sync contract:

live OpenClaw SQLite state -> verified syncable artifact + manifest

Scout/Lobster-style hosts persist OpenClaw state by syncing files as they are saved. Live SQLite files are not a safe sync boundary:

  • state/openclaw.sqlite may be incomplete without state/openclaw.sqlite-wal
  • agents/<agentId>/agent/openclaw-agent.sqlite may be incomplete without its WAL
  • memory-search state should be reached through OpenClaw's database-first ownership model, not private host-copied SQLite paths
  • *.sqlite-wal, *.sqlite-shm, and *.sqlite-journal are process-local SQLite sidecars, not durable artifacts
  • unmanaged file deltas over live SQLite make this worse because sync can observe DB pages, WAL frames, and sidecars at different moments

The command provides the missing translation step: ask SQLite to materialize a clean DB artifact, verify it, write a manifest, and publish the completed artifact set into a sync-owned location. That completed artifact write is what the host should sync.
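As a rough illustration of that translation step, the sketch below uses Python's standard-library sqlite3 backup API to copy a live database into a staging directory, checks the copy with PRAGMA integrity_check, writes a manifest, and renames the staging directory into place. The function name, the manifest fields, the `.tmp-` staging prefix, and the directory layout are assumptions made for this example. The RFC does not define them, and the linked implementation PRs may work differently.

```python
import hashlib
import json
import os
import shutil
import sqlite3
import tempfile


def create_snapshot(live_db_path: str, snapshots_root: str, snapshot_id: str) -> dict:
    """Publish <snapshots_root>/<snapshot_id>/{database.sqlite,manifest.json}."""
    os.makedirs(snapshots_root, exist_ok=True)
    # Hypothetical convention: staging directories start with ".tmp-" so a
    # host can skip them until the final rename makes the artifact visible.
    staging = tempfile.mkdtemp(prefix=".tmp-", dir=snapshots_root)
    db_copy = os.path.join(staging, "database.sqlite")
    try:
        # The online backup API copies a consistent view of the database,
        # including committed content that still lives in the WAL.
        src = sqlite3.connect(live_db_path)
        dst = sqlite3.connect(db_copy)
        try:
            src.backup(dst)
        finally:
            dst.close()
            src.close()

        # Reopen the copy so the check runs against what is on disk.
        check = sqlite3.connect(db_copy)
        try:
            # Leave WAL mode so the artifact needs no sidecar files.
            check.execute("PRAGMA journal_mode=DELETE")
            status = check.execute("PRAGMA integrity_check").fetchone()[0]
        finally:
            check.close()
        if status != "ok":
            raise RuntimeError(f"integrity_check failed: {status}")

        digest = hashlib.sha256()
        with open(db_copy, "rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        manifest = {
            "snapshot_id": snapshot_id,
            "database": "database.sqlite",
            "sha256": digest.hexdigest(),
            "size_bytes": os.path.getsize(db_copy),
        }
        with open(os.path.join(staging, "manifest.json"), "w", encoding="utf-8") as handle:
            json.dump(manifest, handle, indent=2)

        # Publish last: a same-filesystem rename exposes the whole set at once.
        os.rename(staging, os.path.join(snapshots_root, snapshot_id))
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return manifest
```

Staging inside the snapshot root keeps the final rename on one filesystem, which is what lets the host see either nothing or the completed artifact set.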

Current Pilot Checkpoint

Before maintainers accept this RFC or land the core command, we are exercising the same artifact-boundary idea in Microsoft Scout/Lobster for about one week through the Lobster-owned snapshot plugin path.

That pilot is not a replacement for the core command proposal. It is the operational evidence checkpoint for it. The pilot should tell us:

  • whether hosts can reliably sync only completed manifest.json and database.sqlite snapshot artifacts while ignoring live SQLite files, temp files, and sidecars
  • how often Scout needs snapshots to meet recovery expectations
  • observed snapshot artifact size and creation time in real hosted use
  • whether the right long-term shape is a core openclaw snapshot command, a host-owned plugin command, or both
  • whether Phase 2 WAL bundles are justified by measured size, timing, or frequency data
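The first pilot question, syncing only completed artifacts, can be pictured as an allowlist filter on the host side. The sketch below is illustrative only: the `snapshots/<snapshot-id>/<file>` layout, the dot-prefixed staging convention, and the `should_sync` name are assumptions for this example, not something the RFC or the Lobster plugin is known to define.

```python
from pathlib import PurePosixPath

# Hypothetical layout: completed artifacts live at snapshots/<snapshot-id>/<file>.
SNAPSHOT_ROOT = "snapshots"
ARTIFACT_NAMES = frozenset({"manifest.json", "database.sqlite"})


def should_sync(relative_path: str) -> bool:
    """Return True only for completed snapshot artifact files."""
    parts = PurePosixPath(relative_path).parts
    # Require exactly snapshots/<snapshot-id>/<file>; live state paths such as
    # state/openclaw.sqlite never match.
    if len(parts) != 3 or parts[0] != SNAPSHOT_ROOT:
        return False
    snapshot_id, name = parts[1], parts[2]
    # Dot-prefixed directories are treated as staging/temp in this sketch.
    if snapshot_id.startswith("."):
        return False
    # An allowlist rejects *.sqlite-wal, *.sqlite-shm, *.sqlite-journal,
    # and any other file that is not part of the artifact set.
    return name in ARTIFACT_NAMES
```

An allowlist is used rather than a sidecar denylist so that unknown file types default to not being synced.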

The linked OpenClaw PRs therefore remain implementation proofs while this pilot runs.

Phase 1 Shape

The Phase 1 command can stay small:

openclaw snapshot create --target global
openclaw snapshot create --agent main
openclaw snapshot verify <snapshot>
openclaw snapshot restore <snapshot> --target <state-dir>

This does not choose a replacement database, add cloud storage, or implement failover. Hosts still own upload, retention, routing, encryption, and restore timing.
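To make the verify and restore halves of that command shape concrete, here is a minimal sketch in Python. It assumes a snapshot directory containing a `manifest.json` with `database` and `sha256` fields next to the database file; those field names, the function names, and the choice to restore to a single database file path (rather than a state directory, as the proposed CLI does) are assumptions for illustration only.

```python
import hashlib
import json
import os
import shutil
import sqlite3
from pathlib import Path


def verify_snapshot(snapshot_dir: str) -> dict:
    """Raise ValueError unless the artifact matches its manifest and passes integrity_check."""
    root = Path(snapshot_dir)
    manifest = json.loads((root / "manifest.json").read_text(encoding="utf-8"))
    db_path = root / manifest["database"]

    digest = hashlib.sha256()
    with open(db_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() != manifest["sha256"]:
        raise ValueError("database hash does not match manifest")

    # immutable=1 opens the file read-only without creating sidecar files.
    uri = db_path.resolve().as_uri() + "?immutable=1"
    conn = sqlite3.connect(uri, uri=True)
    try:
        status = conn.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        conn.close()
    if status != "ok":
        raise ValueError(f"integrity_check failed: {status}")
    return manifest


def restore_snapshot(snapshot_dir: str, target_db_path: str) -> None:
    """Copy a verified artifact over target_db_path.

    The process that owns the target database is assumed to be stopped.
    """
    manifest = verify_snapshot(snapshot_dir)
    source = Path(snapshot_dir) / manifest["database"]
    staged = target_db_path + ".restoring"
    shutil.copyfile(source, staged)
    # Stale sidecars from the previous database must not be paired with the new file.
    for suffix in ("-wal", "-shm", "-journal"):
        try:
            os.remove(target_db_path + suffix)
        except FileNotFoundError:
            pass
    os.replace(staged, target_db_path)
```

Restore verifies first and swaps the file in last, so a failed verification leaves the existing state untouched. When to call it remains a host decision, as stated above.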

Memory-search is intentionally not a Phase 1 named target. Today it is owned by the per-agent database-first store, so Phase 1 protects it through openclaw snapshot create --agent <agentId>. A future memory-search-only artifact needs a separate design: either a dedicated owner database or a true logical export. It should not be presented as memory-only while snapshotting the full per-agent database.

WAL Bundle Position

Ryan's scaling concern is addressed as a gated Phase 2, not as part of the initial snapshot command. Phase 1 should measure whether full snapshots are actually too large, too slow, or too infrequent for hosted deployments.

If those metrics justify it, Phase 2 can add ordered WAL-bundle artifacts anchored to a verified full snapshot. Object storage, retention policy, failover, multiple delta encodings, and pruning policy stay out unless maintainers greenlight more.

Roadmap

Phase 1: snapshot command and pilot evidence

  1. Core snapshot command and safe-sync artifact: shared SQLite snapshot provider, local artifact repository, openclaw snapshot create|verify|restore|list, --target global, --agent <agentId>, manifest/hash verification, and restore proof.
  2. Microsoft Scout/Lobster pilot: run the plugin-based artifact boundary for about one week and collect operational evidence around artifact sync correctness, artifact size, creation time, frequency, and whether WAL bundles are needed.
  3. Snapshot stress harness: measure snapshot/restore behavior while a writer commits transaction batches, with metrics to decide whether WAL bundles are worth building.
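The stress-harness item can be pictured with a small stand-alone experiment. The sketch below is not the harness from the linked PR; it is a minimal Python stand-in, with invented names and an invented `events` table, that runs a writer committing fixed-size transaction batches while the main thread takes backup-API snapshots and records duration, size, and row count.

```python
import os
import sqlite3
import threading
import time

BATCH_SIZE = 50  # rows per transaction; hypothetical value for this sketch


def writer(db_path: str, batches: int) -> None:
    """Commit `batches` transactions of BATCH_SIZE rows each."""
    conn = sqlite3.connect(db_path)
    try:
        for batch in range(batches):
            with conn:  # one transaction per batch
                conn.executemany(
                    "INSERT INTO events (batch, n) VALUES (?, ?)",
                    [(batch, n) for n in range(BATCH_SIZE)],
                )
    finally:
        conn.close()


def snapshot_once(db_path: str, out_path: str) -> dict:
    """Take one backup-API snapshot and measure it."""
    started = time.perf_counter()
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(out_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    elapsed = time.perf_counter() - started

    # Inspect the artifact through a fresh connection.
    check = sqlite3.connect(out_path)
    try:
        rows = check.execute("SELECT COUNT(*) FROM events").fetchone()[0]
        ok = check.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
    finally:
        check.close()
    return {"seconds": elapsed, "rows": rows, "ok": ok, "bytes": os.path.getsize(out_path)}


def run_harness(workdir: str, batches: int = 200, snapshots: int = 5) -> list:
    db_path = os.path.join(workdir, "live.sqlite")
    setup = sqlite3.connect(db_path)
    setup.execute("PRAGMA journal_mode=WAL")
    setup.execute("CREATE TABLE events (batch INTEGER, n INTEGER)")
    setup.commit()
    setup.close()

    thread = threading.Thread(target=writer, args=(db_path, batches))
    thread.start()
    results = [
        snapshot_once(db_path, os.path.join(workdir, f"snap-{i}.sqlite"))
        for i in range(snapshots)
    ]
    thread.join()
    return results
```

Because each batch is a single transaction, a consistent snapshot should always contain a row count that is a multiple of `BATCH_SIZE`; a partial batch in any snapshot would indicate a broken artifact boundary. The `seconds` and `bytes` fields are the kind of measurements that could feed the Phase 2 decision.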

Phase 2: WAL bundles, greenlight required

  1. Simple WAL bundle proof: prove full snapshot + ordered WAL bundles -> verified local SQLite database with staging/publish semantics, invalid-chain rejection, and simple compaction into a new full snapshot generation.
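One way to read "ordered WAL bundles" and "invalid-chain rejection" is as a hash-linked sequence anchored to the full snapshot. The sketch below shows only that validation idea. Every field name and the linking scheme are hypothetical, since the RFC leaves the Phase 2 bundle format undefined until maintainers greenlight it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalBundle:
    sequence: int              # 1-based position in the chain (hypothetical field)
    base_snapshot_sha256: str  # full snapshot this chain is anchored to
    prev_sha256: str           # hash of the previous link (snapshot hash for bundle 1)
    sha256: str                # hash of this bundle's payload


def validate_chain(snapshot_sha256: str, bundles: list[WalBundle]) -> list[WalBundle]:
    """Return bundles in apply order, or raise ValueError for an invalid chain."""
    ordered = sorted(bundles, key=lambda bundle: bundle.sequence)
    expected_prev = snapshot_sha256
    for position, bundle in enumerate(ordered, start=1):
        if bundle.base_snapshot_sha256 != snapshot_sha256:
            raise ValueError(f"bundle {bundle.sequence} is anchored to a different snapshot")
        if bundle.sequence != position:
            raise ValueError(f"gap or duplicate at position {position}")
        if bundle.prev_sha256 != expected_prev:
            raise ValueError(f"bundle {bundle.sequence} does not follow its predecessor")
        expected_prev = bundle.sha256
    return ordered
```

Rejecting the whole chain on the first mismatch, rather than applying a valid prefix, is a design choice of this sketch; it keeps the restore result tied to exactly one verified snapshot generation.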

Future dedicated targets, including any memory-search-only target, should require an explicit owner database or logical export contract before becoming a public command shape.

Scope

This is an RFC only. It does not change OpenClaw runtime behavior, add storage configuration, or implement backup/restore commands. The linked OpenClaw PRs are draft implementation proofs for the proposed direction while the Scout/Lobster pilot gathers evidence.

Validation

Real behavior proof

Behavior or issue addressed:
This PR adds the draft RFC text for SQLite-safe state snapshots, database-first OpenClaw targets, host-safe sync artifacts, a Microsoft Scout/Lobster pilot checkpoint, and a gated WAL-bundle follow-up. The visible behavior is the final RFC document rendered by GitHub from the PR head.

Real environment tested:
Windows scratch checkout of openclaw/rfcs at PR head 6e4615948d376581cbf0e74343b08e0a5f4c61b1, using Git/GitHub CLI terminal checks. This is documentation/RFC proof, not runtime OpenClaw behavior proof.

Exact steps or command run after this patch:

  1. git rev-parse HEAD
  2. git show --stat --oneline --no-renames HEAD
  3. git diff --check origin/main...HEAD

Evidence after fix:

6e4615948d376581cbf0e74343b08e0a5f4c61b1
6e46159 Add Scout snapshot pilot checkpoint
 rfcs/0013-cloud-serializable-sqlite-state.md | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

git diff --check origin/main...HEAD produced no output and exited successfully.

Observed result after fix:
The final PR head contains one RFC Markdown file with draft metadata, a core openclaw snapshot command proposal, explicit safe-sync artifact framing, unsafe SQLite sync input examples, Phase 1 targets limited to global and per-agent databases, a memory-search-only target deferred to future dedicated design, a Microsoft Scout/Lobster one-week pilot checkpoint, a metrics-gated Phase 2 WAL-bundle model, and no whitespace errors from git diff --check.

What was not tested:
No runtime OpenClaw behavior was tested in this RFC PR. Runtime evidence should come from the Microsoft Scout/Lobster pilot and the linked OpenClaw implementation/stress PRs. Acceptance metadata (status: accepted and issue) is intentionally not set while the RFC remains draft.

RFC Lifecycle

  • Status: draft
  • Implementation issue: intentionally blank until acceptance
  • Maintainer discussion thread: pending maintainer-side Discord thread in maintainer-discussion

@clawsweeper

clawsweeper Bot commented Jun 18, 2026


Codex review: found issues before merge. Reviewed June 22, 2026, 1:49 PM ET / 17:49 UTC.

Summary
Adds a new 609-line draft RFC proposing a core openclaw snapshot command for SQLite-safe state artifacts, host sync boundaries, restore verification, and gated WAL-bundle follow-up work.

Reproducibility: not applicable. This is a feature RFC rather than a bug report. The merge blocker is source-verifiable from the added frontmatter and the repository RFC lifecycle docs.

Review metrics: 2 noteworthy metrics.

  • RFC Diff Size: 1 file added, 609 lines. The patch is documentation-only, but it records broad product direction that maintainers should review as an RFC.
  • Linked Implementation Stack: 2 open draft OpenClaw PRs referenced. Merge readiness depends on coordinated maintainer acceptance across the RFC and the implementation proofs.

Merge readiness
Overall: 🦐 gold shrimp
Proof: 🦞 diamond lobster
Patch quality: 🦐 gold shrimp
Result: needs maintainer review before merge.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Complete maintainer RFC acceptance and update status plus issue before merge.
  • Keep the linked OpenClaw implementation PRs coordinated with the accepted RFC scope.

Risk before merge

  • [P1] Merging the current diff would publish an RFC with status: draft and a blank issue, contrary to the repository's documented RFC lifecycle.
  • [P1] The RFC would commit broad product direction for a new core snapshot/restore command while openclaw#94805 ("Add core SQLite snapshot command") and openclaw#94967 ("Add snapshot SQLite stress harness") remain open draft proof paths.
  • [P1] The RFC itself says a Scout/Lobster pilot should answer whether the long-term shape is a core command, host-owned plugin command, or both before acceptance.

Maintainer options:

  1. Accept RFC Before Merge (recommended)
    Complete maintainer discussion, choose the implementation issue, and update status plus issue before merging this RFC.
  2. Keep Open Through Pilot
    Leave the PR open while the Scout/Lobster pilot and linked OpenClaw draft PRs settle the core-vs-plugin shape.
  3. Explicit Lifecycle Override
    Maintainers could intentionally merge the draft, but that would put unaccepted product direction on main before lifecycle metadata exists.

Next step before merge

  • [P2] Manual review is needed because accepting a new core snapshot command and RFC metadata is a maintainer product/lifecycle decision, not an automated repair.

Security
Cleared: The PR adds one Markdown RFC and does not introduce code execution, dependencies, workflows, secrets handling, or supply-chain changes.

Review findings

  • [P1] Update RFC metadata before merging — rfcs/0013-cloud-serializable-sqlite-state.md:7-8
Review details

Best possible solution:

Keep this PR open as the RFC discussion vehicle until maintainers accept a scoped direction, choose or create the implementation issue, and update the frontmatter before merge.

Do we have a high-confidence way to reproduce the issue?

Not applicable; this is a feature RFC rather than a bug report. The merge blocker is source-verifiable from the added frontmatter and the repository RFC lifecycle docs.

Is this the best way to solve the issue?

No for merging as-is; the maintainable path is RFC acceptance first, then accepted metadata and an implementation issue before landing the document.

Full review comments:

  • [P1] Update RFC metadata before merging — rfcs/0013-cloud-serializable-sqlite-state.md:7-8
    README.md and the template say new RFCs should not merge while status: draft; after acceptance they need accepted metadata and an implementation issue. This added RFC still has status: draft and a blank issue, so landing this exact diff would publish an unaccepted RFC on main.
    Confidence: 0.92

Overall correctness: patch is incorrect
Overall confidence: 0.91

AGENTS.md: not found in the target repository.

Codex review notes: model internal, reasoning high; reviewed against bab3348050f7.

Label changes

Label justifications:

  • P3: This is a low-risk RFC/product-direction PR with no runtime code change in this repository.
  • merge-risk: 🚨 other: The non-CI merge risk is process and product-direction commitment: a draft RFC would land before acceptance metadata and linked implementation direction are settled.
  • rating: 🦐 gold shrimp: Overall readiness is 🦐 gold shrimp; proof is 🦞 diamond lobster and patch quality is 🦐 gold shrimp.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body contains terminal proof from a real checkout showing the final head, diff stat, and clean git diff --check; runtime OpenClaw proof is not required for this RFC-only change.

Evidence reviewed

What I checked:

Likely related people:

  • kevinlin-openai: git blame and git log show this person added the README/template rule that blocks merging draft RFCs before acceptance. (role: RFC lifecycle policy author; confidence: high; commits: e366ea9825a4; files: README.md, rfcs/0000-template.md)
  • Dallin Romney: Recent history shows this person updated README and template guidance for RFC sidecar/layout conventions that new RFCs follow. (role: recent RFC documentation contributor; confidence: medium; commits: 3aa7d727383f; files: README.md, rfcs/0000-template.md)
  • Omar Shahine: Recent main history shows this person aligned RFC frontmatter/status metadata in the same repository. (role: recent RFC metadata contributor; confidence: medium; commits: bab3348050f7, f346050b2878; files: rfcs/0002-approval-prompt-markdown.md, rfcs/needs_refactoring/doctor-health-upgrades.md, rfcs/needs_refactoring/imessage-channel-configuration-cleanup.md)
  • giodl73-repo: This person has current-main RFC documentation history in addition to authoring this proposal branch, so they are a practical follow-up owner once maintainers decide the accepted scope. (role: recent RFC area contributor; confidence: medium; commits: 0e353436f90b, 6e4615948d37; files: rfcs/needs_refactoring/policy-conformance.md, rfcs/0013-cloud-serializable-sqlite-state.md)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P3 Low-risk cleanup, docs, polish, ergonomics, or speculative feature. merge-risk: 🚨 other 🚨 Merging this PR has meaningful risk outside the owned taxonomy. labels Jun 18, 2026
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 18, 2026
@giodl73-repo giodl73-repo changed the title from "RFC 0013: Cloud-Serializable SQLite State" to "RFC 0013: SQLite State Snapshot Plugin" Jun 18, 2026
@clawsweeper clawsweeper Bot added the feature: ✨ showcase ClawSweeper spotlight: unusually compelling feature idea for maintainer attention. label Jun 18, 2026
@giodl73-repo

Contributor Author

@clawsweeper re-review

Updated the PR body with final-head RFC proof, including head SHA, diff stat, frontmatter, line count, and git diff --check output. The maintainer-discussion thread remains pending maintainer-side Discord action/access, and acceptance metadata is intentionally unchanged while draft.

@clawsweeper

clawsweeper Bot commented Jun 19, 2026


🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 19, 2026
@giodl73-repo giodl73-repo marked this pull request as ready for review June 19, 2026 01:52
@clawsweeper clawsweeper Bot removed the proof: sufficient Contributor real behavior proof is sufficient. label Jun 19, 2026
@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 19, 2026

Contributor Author

@clawsweeper re-review

Updated the RFC and PR body to current head 459fb59386958ba8b458ffd7c85a3d6c76574d2a. The RFC now includes the current core PR shape for --target memory-search --agent <id> and the Lobster-shaped configured SQLite store case. The PR body proof has been refreshed with git rev-parse HEAD, git show --stat --oneline --no-renames HEAD, git diff --check origin/main...HEAD, and line count output.

@clawsweeper

clawsweeper Bot commented Jun 19, 2026


🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. proof: sufficient Contributor real behavior proof is sufficient. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels Jun 19, 2026

Contributor Author

@clawsweeper re-review

Updated the RFC and PR body to current head 2586c7aadf26c7bc5a17ff503889cf756baf0254.

What changed:

  • removed the stale configured memorySearch.store.path framing
  • aligned memory-search with the database-first ownership model
  • refreshed PR body proof and removed the line-count claim that was mismatched in the previous body
  • git diff --check origin/main...HEAD passes

@clawsweeper

clawsweeper Bot commented Jun 19, 2026


🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 19, 2026

Contributor Author

@clawsweeper re-review

Updated the RFC and PR body to current head 7f09e0a22adc9e18caf7f0b1a16113eae703f0f5.

What changed:

  • Phase 1 command examples now only include --target global and --agent main
  • memory-search-only artifacts are explicitly deferred to future design because current memory-search state is owned by the per-agent database-first store
  • RFC now says a future memory-search-only target needs either a dedicated owner database or a true logical export
  • roadmap matches the current OpenClaw stack: PR #94805 for core snapshot, PR #94967 for stress validation, Phase 2 WAL bundles only after metrics greenlight
  • git diff --check origin/main...HEAD passes

@clawsweeper

clawsweeper Bot commented Jun 19, 2026


🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Labels

feature: ✨ showcase ClawSweeper spotlight: unusually compelling feature idea for maintainer attention. merge-risk: 🚨 other 🚨 Merging this PR has meaningful risk outside the owned taxonomy. P3 Low-risk cleanup, docs, polish, ergonomics, or speculative feature. proof: sufficient Contributor real behavior proof is sufficient. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action.

Projects

None yet

1 participant