3 changes: 2 additions & 1 deletion apps/memos-local-openclaw/README.md
@@ -519,7 +519,8 @@ All optional — shown with defaults:
"rrfK": 60, // RRF fusion constant
"mmrLambda": 0.7, // MMR relevance vs diversity (0-1)
"recencyHalfLifeDays": 14, // Time decay half-life
-"vectorSearchMaxChunks": 0 // 0 = search all (default). Set 200000–300000 only if search is slow on huge DBs
+"vectorSearchMaxChunks": 0, // 0 = search all (default). Set 200000–300000 only if search is slow on huge DBs
+"autoRecallMinQueryLength": 2 // Auto-recall skips shorter normalized prompts; set 10 to ignore short acknowledgements
},
"dedup": {
"similarityThreshold": 0.75, // Cosine similarity for smart-dedup candidates (Top-5)
5 changes: 3 additions & 2 deletions apps/memos-local-openclaw/index.ts
@@ -1882,8 +1882,9 @@ Groups: ${groupNames.length > 0 ? groupNames.join(", ") : "(none)"}`,
const query = normalizeAutoRecallQuery(rawPrompt);
recallQuery = query;

-if (query.length < 2) {
-ctx.log.debug("auto-recall: extracted query too short, skipping");
+const autoRecallMinQueryLength = ctx.config.recall?.autoRecallMinQueryLength ?? DEFAULTS.autoRecallMinQueryLength;
+if (query.length < autoRecallMinQueryLength) {
+ctx.log.debug(`auto-recall: extracted query shorter than autoRecallMinQueryLength=${autoRecallMinQueryLength}, skipping`);
return;
}
ctx.log.debug(`auto-recall: query="${query.slice(0, 80)}"`);
1 change: 1 addition & 0 deletions apps/memos-local-openclaw/src/config.ts
@@ -66,6 +66,7 @@ export function resolveConfig(raw: Partial<MemosLocalConfig> | undefined, stateD
mmrLambda: cfg.recall?.mmrLambda ?? DEFAULTS.mmrLambda,
recencyHalfLifeDays: cfg.recall?.recencyHalfLifeDays ?? DEFAULTS.recencyHalfLifeDays,
vectorSearchMaxChunks: cfg.recall?.vectorSearchMaxChunks ?? DEFAULTS.vectorSearchMaxChunks,
+autoRecallMinQueryLength: cfg.recall?.autoRecallMinQueryLength ?? DEFAULTS.autoRecallMinQueryLength,
},
dedup: {
similarityThreshold: cfg.dedup?.similarityThreshold ?? DEFAULTS.dedupSimilarityThreshold,
3 changes: 3 additions & 0 deletions apps/memos-local-openclaw/src/types.ts
@@ -312,6 +312,8 @@ export interface MemosLocalConfig {
recencyHalfLifeDays?: number;
/** Cap vector search to this many most recent chunks. 0 = no cap (search all; may get slower with 200k+ chunks). If you set a cap for performance, use a large value (e.g. 200000–300000) so older memories are still in the window; FTS always searches all. */
vectorSearchMaxChunks?: number;
+/** Auto-recall skips normalized prompts shorter than this many characters. */
+autoRecallMinQueryLength?: number;
};
dedup?: {
similarityThreshold?: number;
@@ -337,6 +339,7 @@ export const DEFAULTS = {
mmrLambda: 0.7,
recencyHalfLifeDays: 14,
vectorSearchMaxChunks: 0,
+autoRecallMinQueryLength: 2,
dedupSimilarityThreshold: 0.80,
evidenceWrapperTag: "STORED_MEMORY",
excerptMinChars: 200,
160 changes: 160 additions & 0 deletions apps/memos-local-openclaw/tests/auto-recall-min-query-length.test.ts
@@ -0,0 +1,160 @@
import { afterEach, describe, expect, it, vi } from "vitest";
import * as fs from "fs";
import * as os from "os";
import * as path from "path";
import type { MemosLocalConfig } from "../src/types";

type AutoRecallHook = (
event: { prompt?: string; messages?: unknown[] },
hookCtx?: { agentId?: string; sessionKey?: string },
) => Promise<unknown>;

const noopLog = {
debug() {},
info() {},
warn() {},
error() {},
};

async function registerPluginAndGetAutoRecallHook(opts: {
config: Partial<MemosLocalConfig>;
engineSearch: ReturnType<typeof vi.fn>;
}): Promise<AutoRecallHook> {
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "memos-auto-recall-min-query-"));
const handlers = new Map<string, AutoRecallHook>();

vi.doMock("../src/config", () => ({
buildContext: () => ({
stateDir: tmpDir,
workspaceDir: path.join(tmpDir, "workspace"),
config: {
storage: { dbPath: path.join(tmpDir, "memos.db") },
capture: { evidenceWrapperTag: "STORED_MEMORY" },
telemetry: {},
sharing: {
enabled: false,
role: "client",
hub: { port: 18800, teamName: "", teamToken: "" },
client: { hubAddress: "", userToken: "" },
capabilities: { hostEmbedding: false, hostCompletion: false, hostSkill: false },
},
skillEvolution: { autoRecallSkills: false },
...opts.config,
},
log: noopLog,
}),
}));
vi.doMock("../src/storage/ensure-binding", () => ({ ensureSqliteBinding: () => {} }));
vi.doMock("../src/storage/sqlite", () => ({ SqliteStore: class {
recordToolCall() {}
recordApiLog() {}
close() {}
} }));
vi.doMock("../src/embedding", () => ({ Embedder: class { provider = "mock"; } }));
vi.doMock("../src/ingest/worker", () => ({ IngestWorker: class {
getTaskProcessor() { return { onTaskCompleted() {} }; }
enqueue() {}
async flush() {}
} }));
vi.doMock("../src/recall/engine", () => ({ RecallEngine: class {
search = opts.engineSearch;
async searchSkills() { return []; }
} }));
vi.doMock("../src/ingest/providers", () => ({ Summarizer: class {
async filterRelevant() { return null; }
} }));
vi.doMock("../src/viewer/server", () => ({ ViewerServer: class {
async start() { return "http://127.0.0.1:18799"; }
stop() {}
getResetToken() { return "token"; }
} }));
vi.doMock("../src/hub/server", () => ({ HubServer: class {
async start() { return "http://127.0.0.1:18800"; }
async stop() {}
} }));
vi.doMock("../src/client/hub", () => ({
hubGetMemoryDetail: async () => ({}),
hubRequestJson: async () => ({}),
hubSearchMemories: async () => ({ hits: [], meta: {} }),
hubSearchSkills: async () => ({ hits: [] }),
resolveHubClient: async () => ({ hubUrl: "", userToken: "", userId: "" }),
}));
vi.doMock("../src/client/connector", () => ({
connectToHub: async () => ({ connected: false }),
getHubStatus: async () => ({ connected: false }),
}));
vi.doMock("../src/client/skill-sync", () => ({
fetchHubSkillBundle: async () => ({}),
publishSkillBundleToHub: async () => ({}),
restoreSkillBundleFromHub: () => ({}),
unpublishSkillBundleFromHub: async () => ({}),
}));
vi.doMock("../src/skill/evolver", () => ({ SkillEvolver: class { async onTaskCompleted() {} } }));
vi.doMock("../src/skill/installer", () => ({ SkillInstaller: class {
getCompanionManifest() { return null; }
install() { return { message: "ok" }; }
} }));
vi.doMock("../src/skill/bundled-memory-guide", () => ({ MEMORY_GUIDE_SKILL_MD: "# mock" }));
vi.doMock("../src/telemetry", () => ({ Telemetry: class {
trackToolCalled() {}
trackAutoRecall() {}
trackMemoryIngested() {}
trackSkillInstalled() {}
trackSkillEvolved() {}
trackPluginStarted() {}
trackError() {}
async shutdown() {}
} }));

const pluginModule = await import("../plugin-impl");
pluginModule.default.register({
id: "memos-local-openclaw-plugin",
pluginConfig: {},
config: { plugins: { entries: { "memos-local-openclaw-plugin": {} } } },
resolvePath: (p: string) => path.join(tmpDir, p.replace(/^~[\\/]/, "")),
logger: { info() {}, warn() {} },
registerTool: () => {},
registerMemoryCapability: () => {},
registerService: () => {},
on: (name: string, handler: AutoRecallHook) => {
handlers.set(name, handler);
},
} as any);

const hook = handlers.get("before_prompt_build");
if (!hook) throw new Error("before_prompt_build hook was not registered");
return hook;
}

afterEach(() => {
vi.resetModules();
vi.clearAllMocks();
});

describe("auto-recall min query length", () => {
it("skips auto-recall search when query is shorter than configured threshold", async () => {
const search = vi.fn(async () => ({ hits: [], meta: {} }));
const hook = await registerPluginAndGetAutoRecallHook({
config: { recall: { autoRecallMinQueryLength: 10 } },
engineSearch: search,
});

await hook({ prompt: "继续吧" }, { agentId: "main" });

expect(search).not.toHaveBeenCalled();
});

it("runs auto-recall search when query reaches configured threshold", async () => {
const search = vi.fn(async () => ({ hits: [], meta: {} }));
const hook = await registerPluginAndGetAutoRecallHook({
config: { recall: { autoRecallMinQueryLength: 10 } },
engineSearch: search,
});

await hook({ prompt: "remember deployment rollback preference" }, { agentId: "main" });

expect(search).toHaveBeenCalledWith(expect.objectContaining({
query: "remember deployment rollback preference",
}));
});
});
15 changes: 15 additions & 0 deletions apps/memos-local-openclaw/tests/config.test.ts
@@ -2,6 +2,21 @@ import { describe, expect, it } from "vitest";
import { resolveConfig } from "../src/config";

describe("resolveConfig", () => {
+it("defaults autoRecallMinQueryLength to the existing two-character threshold", () => {
+const resolved = resolveConfig(undefined, "/tmp/memos-config-test");
+
+expect(resolved.recall?.autoRecallMinQueryLength).toBe(2);
+});
+
+it("preserves configured autoRecallMinQueryLength", () => {
+const resolved = resolveConfig(
+{ recall: { autoRecallMinQueryLength: 10 } },
+"/tmp/memos-config-test",
+);
+
+expect(resolved.recall?.autoRecallMinQueryLength).toBe(10);
+});
+
it("injects openclaw providers into existing blocks when host capabilities are enabled", () => {
const resolved = resolveConfig(
{
115 changes: 115 additions & 0 deletions docs/cn/open_source/evaluation/openai_memory_locomo_eval_guide.md
@@ -0,0 +1,115 @@
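# Evaluation Guide for OpenAI Memory on LoCoMo

This document outlines the overall process for evaluating OpenAI's Memory feature on the LoCoMo dataset.

## 1. Introduction

Because OpenAI's [Memory feature](https://openai.com/index/memory-and-new-controls-for-chatgpt/) has no public API, the evaluation has to be carried out by hand. Conversations from the LoCoMo dataset are formatted and manually entered into the ChatGPT web interface. The generated memories are then retrieved from the account's memory management page and saved locally.

To assess the quality of these memories, we use the `gpt-4o-mini` model through the API. The model is asked the questions from the LoCoMo dataset and is given the complete memory history of the relevant conversation as context. This simulates a perfect memory retrieval system and gives the model the best possible information for answering.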
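The "perfect retrieval" setup described above can be sketched as a small prompt builder: every stored memory for a conversation is placed in the context, followed by one LoCoMo question. This is only a minimal illustration. The function name `build_answer_prompt` and the prompt wording are assumptions made for this sketch, not the text used by the actual evaluation script.

```python
def build_answer_prompt(memories: list[str], question: str) -> str:
    """Build a prompt containing every stored memory plus one question.

    Hypothetical helper: the wording below is illustrative and is not
    taken from the evaluation script.
    """
    # Drop empty lines and present each memory as one bullet.
    memory_block = "\n".join(f"- {m.strip()}" for m in memories if m.strip())
    return (
        "Answer the question using only the memories below.\n\n"
        f"Memories:\n{memory_block}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Because no retrieval step filters the memories, any wrong answer can be attributed to what was (or was not) stored rather than to a search miss.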

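## 2. Workflow

### Step 2.1: Generate the input context for memory extraction

Run the following Python script to generate an input prompt for every session in every conversation. The script creates a separate `.txt` file per session, containing the formatted conversation history and the extraction prompt.

**Script:**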
```python
import json
import os

# Make sure the dataset path is correct
LOCOMO_DATA_PATH = "data/locomo/locomo10.json"
SAVE_DIR = "openai_inputs"

os.makedirs(SAVE_DIR, exist_ok=True)

TEMPLATE = """Can you please extract relevant information from this conversation and create memory entries for each user mentioned? Please store these memories in your knowledge base in addition to the timestamp provided for future reference and personalized interactions.

{context}
"""

with open(LOCOMO_DATA_PATH, "r", encoding="utf-8") as f:
    data = json.load(f)

for conv_idx, item in enumerate(data):
    conv = item["conversation"]

    # Sessions are stored under keys session_1, session_2, ...; skip missing ones
    for i in range(1, 35):
        session_key = f"session_{i}"
        session_dt_key = f"session_{i}_date_time"
        if session_key not in conv:
            continue

        session = conv[session_key]
        session_dt = conv[session_dt_key]

        # Prefix every turn with the session timestamp and the speaker name
        session_context = ""
        for chat in session:
            chat_str = f"({session_dt}) {chat['speaker']}: {chat['text']}\n"
            session_context += chat_str

        input_string = TEMPLATE.format(context=session_context)

        output_filename = os.path.join(SAVE_DIR, f"{conv_idx}-D{i}.txt")
        with open(output_filename, "w", encoding="utf-8") as f:
            f.write(input_string)

print(f"Generated {len(os.listdir(SAVE_DIR))} input files in '{SAVE_DIR}' directory.")
```

**Input example (`0-D9.txt`):**
```plaintext
Can you please extract relevant information from this conversation and create memory entries for each user mentioned? Please store these memories in your knowledge base in addition to the timestamp provided for future reference and personalized interactions.

(2:31 pm on 17 July, 2023) Melanie: Hey Caroline, hope all's good! I had a quiet weekend after we went camping with my fam two weekends ago. It was great to unplug and hang with the kids. What've you been up to? Anything fun over the weekend?
(2:31 pm on 17 July, 2023) Caroline: Hey Melanie! That sounds great! Last weekend I joined a mentorship program for LGBTQ youth - it's really rewarding to help the community.
... (rest of the conversation)
```

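### Step 2.2: Extract and save memories from ChatGPT

1. **Enable Memory:** In ChatGPT, go to **Settings -> Personalization** and make sure the **Memory** feature is turned on.
2. **Clear existing memories:** Before processing a new conversation, click **Manage** -> **Clear all** so that no earlier memories remain.
3. **Enter and verify:**
* Start a new chat.
* Make sure the model is set to **GPT-4o**.
* Copy the contents of a generated `.txt` file (for example `0-D1.txt`) and paste them into the chat.
* After the model replies, confirm that the "Memory updated" notice appears.
4. **Save the memories:**
* Click **Manage** in the memory confirmation to view the newly generated memories.
* Create a new local `.txt` file with the same name as the input file (for example `0-D1.txt`).
* Copy each memory from ChatGPT and paste it into the new file, one memory per line.
5. **Reset memories for the next conversation:**
* Once all sessions of a conversation are done, be sure to **delete all memories so that the next conversation starts from a clean state**. Go to Settings -> Personalization -> Manage and click Delete all.

**Memory output example (`0-D9.txt`):**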
```plaintext
As of November 17, 2023, Dave has taken up photography and enjoys capturing nature scenes like sunsets, beaches, waves, rocks, and waterfalls.
Dave recently purchased a vintage camera that takes high-quality photos.
Dave discovered a serene park nearby with a peaceful spot featuring a bench under a tree with pink flowers.
As of November 17, 2023, Calvin attended a fancy gala in Boston where he had an inspiring conversation with an artist about music and art.
Calvin finds music a powerful connector and source of creativity.
Calvin took a photo in a Japanese garden that he shared with Dave.
Calvin accepted an invitation to perform at an upcoming show in Boston, expressing excitement about the musical experience.
```

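### Step 2.3: Merge memories

At this point the memories are saved separately for each session. You need to write a simple script that merges all memories of the same conversation into one file. For example, all memories in `0-D1.txt`, `0-D2.txt`, and so on should be merged into a single `conversation_0_memories.txt` file.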
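A minimal sketch of such a merge script follows. It assumes the per-session memory files sit in one directory and follow the `<conversation>-D<session>.txt` naming used above. The directory name `openai_memories` and the function name `merge_memories` are placeholders chosen for this example, not names from the repository.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: per-session memory files are stored here, named "<conv>-D<session>.txt".
MEMORY_DIR = Path("openai_memories")
FILENAME_RE = re.compile(r"^(\d+)-D(\d+)\.txt$")


def merge_memories(memory_dir: Path, out_dir: Path) -> dict[int, Path]:
    """Merge per-session memory files into one file per conversation."""
    # Group files by conversation index, remembering each session number.
    sessions: dict[int, list[tuple[int, Path]]] = defaultdict(list)
    for path in memory_dir.iterdir():
        match = FILENAME_RE.match(path.name)
        if match:
            sessions[int(match.group(1))].append((int(match.group(2)), path))

    out_dir.mkdir(parents=True, exist_ok=True)
    merged: dict[int, Path] = {}
    for conv_idx, files in sessions.items():
        lines: list[str] = []
        # Sort by numeric session index so that D2 comes before D10.
        for _, path in sorted(files):
            text = path.read_text(encoding="utf-8")
            lines.extend(line.strip() for line in text.splitlines() if line.strip())
        out_path = out_dir / f"conversation_{conv_idx}_memories.txt"
        out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        merged[conv_idx] = out_path
    return merged


if __name__ == "__main__":
    if MEMORY_DIR.is_dir():
        for conv_idx, out_path in sorted(merge_memories(MEMORY_DIR, MEMORY_DIR).items()):
            print(f"conversation {conv_idx} -> {out_path}")
```

Files whose names do not match the pattern, including previously merged `conversation_*_memories.txt` files, are ignored, so the script can be rerun safely in the same directory.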


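### Step 2.4: Automated evaluation

Once the memories for all conversations have been extracted and saved, you can run the automated [evaluation script](../../../../evaluation/scripts/run_openai_eval.sh). The script handles answer generation, answer evaluation, and metric computation.

```bash
# Edit the configuration in evaluation/scripts/run_openai_eval.sh first
evaluation/scripts/run_openai_eval.sh
```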

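## 3. Notes

- **Account differences:** Free and Plus accounts may differ, for example in context length limits and in the number of memories that can be stored.
- **Granularity:** The evaluation adds memories at the session level. Follow the same principle to ensure high-quality memory extraction. Feeding an entire conversation to the model in one go has proven ineffective: the model tends to overlook important details, which causes substantial information loss.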