Skip to content

Conversation

@jimmyken
Copy link

@jimmyken jimmyken commented Jan 5, 2026

主要改进:

向量存储架构优化

  • 实现每个笔记本独立的向量表(vec_{notebookId}),替代全局 vec_embeddings
  • 添加 vec_metadata 表追踪每个笔记本的向量维度
  • 支持动态向量维度(768, 1024, 1536等),自动检测嵌入模型输出维度
  • 修复维度不匹配错误(Expected 1024 dimensions but received 768)

AI 提供商兼容性

  • 修复 Qwen 提供商与 AI SDK v5 的兼容性问题
  • 将 Qwen 从 qwen-ai-provider 迁移到 @ai-sdk/openai-compatible
  • 解决 UnsupportedModelVersionError 错误

思维导图生成改进

  • 优化提示词结构,添加明确的格式要求和示例
  • 修复 schema 验证错误:chunkIds 和 keywords 字段支持 null 值
  • 增强内容聚合:MAX_CHUNKS_PER_DOC 从 10 提升到 30
  • 添加详细的调试日志(输入提示词、模型输出、错误信息)

技术细节

  • 修改文件:9个核心文件
  • 新增功能:动态向量表管理、维度自动检测
  • 性能优化:提升内容聚合效率
  • 调试改进:完整的输入输出日志追踪

此次重构解决了多个关键问题,提升了系统的灵活性和稳定性。

主要改进:

**向量存储架构优化**
- 实现每个笔记本独立的向量表(vec_{notebookId}),替代全局 vec_embeddings
- 添加 vec_metadata 表追踪每个笔记本的向量维度
- 支持动态向量维度(768, 1024, 1536等),自动检测嵌入模型输出维度
- 修复维度不匹配错误(Expected 1024 dimensions but received 768)

**AI 提供商兼容性**
- 修复 Qwen 提供商与 AI SDK v5 的兼容性问题
- 将 Qwen 从 qwen-ai-provider 迁移到 @ai-sdk/openai-compatible
- 解决 UnsupportedModelVersionError 错误

**思维导图生成改进**
- 优化提示词结构,添加明确的格式要求和示例
- 修复 schema 验证错误:chunkIds 和 keywords 字段支持 null 值
- 增强内容聚合:MAX_CHUNKS_PER_DOC 从 10 提升到 30
- 添加详细的调试日志(输入提示词、模型输出、错误信息)

**技术细节**
- 修改文件:9个核心文件
- 新增功能:动态向量表管理、维度自动检测
- 性能优化:提升内容聚合效率
- 调试改进:完整的输入输出日志追踪

此次重构解决了多个关键问题,提升了系统的灵活性和稳定性。
@MrSibe
Copy link
Owner

MrSibe commented Jan 15, 2026

Code review

Found 1 issue:

  1. Missing data migration - PR changes vector storage architecture from single vec_embeddings table to per-notebook tables (vec_{notebookId}), but provides no migration strategy. Existing users will lose all vector indexes on upgrade.

const tableName = `vec_${notebookId.replace(/[^a-zA-Z0-9]/g, '_')}`
try {
// 检查元数据表中是否有记录
const metadata = sqlite
.prepare(`SELECT table_name, dimensions FROM vec_metadata WHERE notebook_id = ?`)
.get(notebookId) as { table_name: string; dimensions: number } | undefined
if (metadata) {
// 表已存在,检查维度是否匹配
if (metadata.dimensions !== dimensions) {
console.warn(
`[Database] Vector table ${metadata.table_name} exists with dimensions ${metadata.dimensions}, ` +
`but requested ${dimensions}. Dropping and recreating table.`
)
// 删除旧表
try {
sqlite.exec(`DROP TABLE IF EXISTS ${metadata.table_name}`)
} catch (err) {
console.error(`[Database] Failed to drop old table ${metadata.table_name}:`, err)

The PR introduces a new createNotebookVectorTable() function that creates separate vector tables for each notebook, completely changing from the original architecture that used a single vec_embeddings table with notebook_id column (established in commit a785075). Without migration code, existing users' vector data will be inaccessible after upgrading.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@MrSibe
Copy link
Owner

MrSibe commented Jan 15, 2026

Code review

Found 1 issue:

  1. Missing data migration - PR changes vector storage architecture from single vec_embeddings table to per-notebook tables (vec_{notebookId}), but provides no migration strategy. Existing users will lose all vector indexes on upgrade.

const tableName = `vec_${notebookId.replace(/[^a-zA-Z0-9]/g, '_')}`
try {
// 检查元数据表中是否有记录
const metadata = sqlite
.prepare(`SELECT table_name, dimensions FROM vec_metadata WHERE notebook_id = ?`)
.get(notebookId) as { table_name: string; dimensions: number } | undefined
if (metadata) {
// 表已存在,检查维度是否匹配
if (metadata.dimensions !== dimensions) {
console.warn(
`[Database] Vector table ${metadata.table_name} exists with dimensions ${metadata.dimensions}, ` +
`but requested ${dimensions}. Dropping and recreating table.`
)
// 删除旧表
try {
sqlite.exec(`DROP TABLE IF EXISTS ${metadata.table_name}`)
} catch (err) {
console.error(`[Database] Failed to drop old table ${metadata.table_name}:`, err)

The PR introduces a new createNotebookVectorTable() function that creates separate vector tables for each notebook, completely changing from the original architecture that used a single vec_embeddings table with notebook_id column (established in commit a785075). Without migration code, existing users' vector data will be inaccessible after upgrading.

🤖 Generated with Claude Code

  • If this code review was useful, please react with 👍. Otherwise, react with 👎.

@jimmyken 感谢贡献!可以运行一下数据库迁移的指令吗?自动生成一下SQLite的迁移SQL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants