
v0.5: L3 Reflective + L4 Corrective memory layers (self-learning + self-reflection) #10

@MakiDevelop

Description


Background

Per specs/08-memory-architecture.md, VirtualMe's full memory model has 4 layers. v0.4 implements L1 (episodic) + L2 (semantic). L3 (reflective) and L4 (corrective) are what make the system self-learning + self-reflective rather than "just an interview log."

What L3 enables

  • Detecting contradictions between SOUL anchors and VOICE samples (e.g., spec says "direct" but actual samples to clients are diplomatic → register switching, not contradiction)
  • Detecting cross-session drift in the extracted model
  • Flagging coverage holes for next-session question selection
  • Surfacing patterns across sessions for human review

Without L3, the system can extract and triangulate but cannot reason about its own modeling.
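The contradiction-vs-register-switching distinction above can be sketched as a small heuristic. This is an illustrative assumption, not the project's actual API: the names `SoulAnchor`, `VoiceSample`, and `classify` are hypothetical, and a real pass would compare LLM-extracted traits rather than string equality.

```python
# Hypothetical L3 contradiction-detection sketch. If mismatches with the
# SOUL anchor cluster in one register (e.g. client emails are diplomatic
# while peer messages stay direct), that is register switching, not a
# contradiction requiring human review.
from dataclasses import dataclass

@dataclass
class SoulAnchor:
    trait: str           # e.g. "direct"
    source: str          # where the anchor was extracted from

@dataclass
class VoiceSample:
    register: str        # e.g. "client", "peer"
    observed_trait: str  # trait inferred from the sample

def classify(anchor: SoulAnchor, samples: list[VoiceSample]) -> str:
    """Return 'contradiction', 'register_switching', or 'consistent'."""
    mismatched = [s for s in samples if s.observed_trait != anchor.trait]
    if not mismatched:
        return "consistent"
    # Mismatches confined to a subset of registers -> register switching;
    # mismatches across every observed register -> genuine contradiction.
    if {s.register for s in mismatched} != {s.register for s in samples}:
        return "register_switching"
    return "contradiction"
```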

What L4 enables

  • Capturing agent-mode rejections: "this isn't quite me — wrong tone / wrong vocabulary / factual error"
  • Synthesizing lessons from rejections
  • Auto-adjusting L2 (e.g., VOICE retrieval weighting shifts after consistent "tone too formal" feedback)

Without L4, the agent ships once and never improves from real-world use.
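One way the L4 → L2 loop could work is a threshold on recurring rejection reasons: after enough "tone too formal" feedback, the VOICE retrieval weighting shifts. A minimal sketch, assuming hypothetical names (`Feedback`, `adjust_voice_weights`) and a simple two-key weight dict; the real mechanism is undecided:

```python
# Hypothetical L4 -> L2 feedback loop: consistent "tone too formal"
# rejections lower the 'formal' retrieval weight and raise 'casual'.
from collections import Counter
from dataclasses import dataclass

@dataclass
class Feedback:
    rejected_draft: str
    reason: str  # e.g. "tone too formal", "wrong vocabulary", "factual error"

def adjust_voice_weights(weights: dict[str, float],
                         feedback: list[Feedback],
                         threshold: int = 3,
                         step: float = 0.1) -> dict[str, float]:
    """Shift retrieval weights once a rejection reason recurs often enough."""
    counts = Counter(f.reason for f in feedback)
    adjusted = dict(weights)
    if counts["tone too formal"] >= threshold:
        adjusted["formal"] = max(0.0, adjusted["formal"] - step)
        adjusted["casual"] = min(1.0, adjusted.get("casual", 0.0) + step)
    return adjusted
```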

Implementation plan

Depends on issue #5 (MemoryBackend Protocol).

L3 — Reflective Memory

  • Add reflections storage to backend Protocol
  • Implement weekly reflection pass — scans past 7 days of L2 changes
  • Reflection types: contradiction / drift / coverage_gap / pattern
  • All L3 entries marked requires_human_review=True by default
  • Operator UI: review and resolve reflections (CLI minimum)
  • Pre-blind-test pass at Week 5 and Week 8
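A sketch of how the reflections storage might extend the MemoryBackend Protocol from issue #5. Method names and the `Reflection` shape are assumptions for illustration, not the settled interface; the four reflection types and the `requires_human_review=True` default come straight from the plan above:

```python
# Hypothetical Protocol extension for L3 reflective memory.
from dataclasses import dataclass
from typing import Literal, Protocol

ReflectionType = Literal["contradiction", "drift", "coverage_gap", "pattern"]

@dataclass
class Reflection:
    type: ReflectionType
    summary: str
    requires_human_review: bool = True  # all L3 entries default to review

class MemoryBackend(Protocol):
    def add_reflection(self, reflection: Reflection) -> None: ...
    def list_reflections(self, *, unresolved_only: bool = True) -> list[Reflection]: ...
    def resolve_reflection(self, reflection: Reflection) -> None: ...
```

Keeping `Reflection` a plain dataclass lets the operator CLI list and resolve entries without caring which backend stored them.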

L4 — Corrective Memory

  • Add feedback storage to backend Protocol
  • CLI capture: `virtualme feedback --reject "draft" --reason "tone too formal"`
  • Lesson extraction: LLM synthesizes "what to learn" from rejection
  • L4 → L2 feedback loop: e.g., adjust VOICE retrieval weighting
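The CLI capture step could be wired up with an argparse subcommand mirroring the `virtualme feedback` example above. The flag semantics come from the plan; everything else (module layout, what happens after parsing) is assumed:

```python
# Hypothetical argparse sketch of `virtualme feedback --reject ... --reason ...`.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="virtualme")
    sub = parser.add_subparsers(dest="command", required=True)
    feedback = sub.add_parser("feedback", help="capture an agent-mode rejection")
    feedback.add_argument("--reject", metavar="DRAFT", required=True,
                          help="the rejected draft text")
    feedback.add_argument("--reason", required=True,
                          help='why it was rejected, e.g. "tone too formal"')
    return parser

args = build_parser().parse_args(
    ["feedback", "--reject", "draft", "--reason", "tone too formal"])
```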

Storage

The memory-hall backend is recommended: it handles provenance and multi-agent attribution natively (per specs/08-memory-architecture.md). The SQLite backend can accept writes, but reflection passes are no-ops there.
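The backend split described above could be expressed as a no-op override: SQLite accepts reflection writes but its weekly pass synthesizes nothing. Class names and the stubbed pass are illustrative assumptions only:

```python
# Hypothetical backend split: SQLite accepts writes, reflection pass is a no-op.
class SQLiteBackend:
    def __init__(self) -> None:
        self.reflections: list[str] = []

    def add_reflection(self, summary: str) -> None:
        self.reflections.append(summary)  # writes are accepted

    def run_reflection_pass(self) -> int:
        return 0  # no-op: no new reflections are synthesized

class MemoryHallBackend(SQLiteBackend):
    def run_reflection_pass(self) -> int:
        # memory-hall carries provenance + attribution, so the weekly pass
        # can actually scan L2 changes; stubbed with a fixed finding here.
        found = ["coverage_gap: no VOICE samples for peer register"]
        self.reflections.extend(found)
        return len(found)
```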

Acceptance criteria

  • reflections table/namespace defined
  • feedback table/namespace defined
  • Weekly reflection pass runs (cron stub OK for v0.5)
  • CLI feedback capture works
  • At least one reflection type (contradiction detection) demonstrably catches a real case in integration test
  • Documentation updated in specs/08-memory-architecture.md

Estimated effort

About 1 week total: L3 alone is ~3 days; L4 capture + lesson extraction is another ~3 days.

Why this matters

This is the difference between "VirtualMe extracts you once" and "VirtualMe learns about you over time." The 8-week interview methodology assumes the system gets better at modeling the interviewee as weeks accumulate — that requires L3. Post-ship improvement requires L4.

Labels

enhancement (New feature or request)