6 changes: 5 additions & 1 deletion .vscode/settings.json
@@ -7,5 +7,9 @@
},
"[json]": {
"editor.defaultFormatter": "esbenp.prettier-vscode"
}
},
"i18n-ally.localesPaths": [
"src/renderer/src/i18n",
"src/renderer/src/locales"
]
}
178 changes: 136 additions & 42 deletions src/main/config/defaults.ts
@@ -12,78 +12,172 @@ export const defaultSettings: AppSettings = {
defaultEmbeddingModel: undefined,
prompts: {
mindMap: {
'zh-CN': `你是知识结构分析专家,负责从笔记本内容中提炼核心知识结构
'zh-CN': `你是专业的知识结构分析专家。请仔细分析笔记本内容,提炼核心知识结构,生成层次清晰的思维导图

**重要:请用中文回复,所有节点标签必须使用中文。**
## 核心要求

**输出格式要求(必须严格遵守):**
你必须返回一个包含 rootNode 和 metadata 的 JSON 对象:
**1. 必须使用中文**
- 所有节点标签(label)必须使用中文
- 每个节点标签严格限制在 12 个汉字以内(包含标点符号)
- 标签要简洁有力,突出核心概念

**2. 结构层次要求**
- 深度:最多 4 层(根节点 level=0,最深子节点 level=3)
- 广度:每个父节点必须有 2-5 个子节点
- 平衡:尽量保持树形结构的平衡,避免某一分支过深或过浅

**3. 节点设计原则**
- 根节点:概括笔记本的整体主题
- 一级子节点:主要知识领域或章节
- 二级子节点:具体知识点或子主题
- 三级子节点:详细概念或实例

**4. 数据关联要求**
- chunkIds:如果节点内容来源于特定的文档片段,必须在 metadata.chunkIds 中列出相关的 chunk ID(不是推测,而是从提供的内容中实际存在的)
- 如果无法确定具体的 chunk ID,设置为空数组 [] 而非 null
- keywords:可选,提取该节点的 2-3 个关键词,设置为空数组 [] 而非 null

## 严格的输出格式(JSON)

\`\`\`json
{
"rootNode": {
"id": "节点唯一ID(字符串)",
"label": "节点标签(必须≤12字)",
"id": "0",
"label": "主题名称",
"metadata": {
"level": 0,
"chunkIds": ["相关chunk ID数组"],
"keywords": ["关键词数组(可选)"]
"chunkIds": [],
"keywords": ["关键词1", "关键词2"]
},
"children": [子节点数组,每个子节点结构相同]
"children": [
{
"id": "1",
"label": "子主题",
"metadata": {
"level": 1,
"chunkIds": [],
"keywords": []
},
"children": []
}
]
},
"metadata": {
"totalNodes": 总节点数(数字),
"maxDepth": 最大深度(数字)
"totalNodes": 实际节点总数,
"maxDepth": 实际最大深度
}
}
\`\`\`

**内容要求:**
1. **所有节点标签必须用中文,且严格 ≤ 12字**(非常重要!)
2. 层级深度 ≤ 4层(根节点level=0, 最深level=3)
3. 每个父节点必须有 2-5 个子节点
4. 每个节点的 id 必须唯一
5. 尽可能在 metadata.chunkIds 中关联相关的 chunk ID
6. totalNodes 必须等于实际节点总数
7. maxDepth 必须等于实际最大层级深度
## 字段说明

- **id**: 字符串,唯一标识符,建议使用数字编号
- **label**: 字符串,节点显示文本,≤12 个汉字
- **level**: 数字,0-3,表示层级深度
- **chunkIds**: 字符串数组,相关文档片段 ID,无关联时使用 []
- **keywords**: 字符串数组,可选关键词,不需要时使用 []
- **children**: 数组,子节点列表,叶子节点可省略或设为 []
- **totalNodes**: 数字,必须等于实际生成的节点总数
- **maxDepth**: 数字,必须等于实际的最大层级(0-3)

## 笔记本内容

**笔记本内容:**
{{CONTENT}}

请基于以上内容生成思维导图结构,严格按照格式要求返回 JSON。`,
'en-US': `You are a knowledge structure analysis expert, responsible for extracting core knowledge structures from notebook content.
## 生成指导

1. 先通读全部内容,识别主要主题和知识结构
2. 设计根节点,用一个精炼的短语概括整体
3. 将内容分解为 2-5 个主要领域作为一级子节点
4. 继续细化每个主要领域为 2-5 个知识点
5. 如需更深层次,再细化到具体概念(但不超过 3 级子节点)
6. 确保节点 ID 唯一且连续
7. 统计总节点数和最大深度,填入 metadata

请严格按照上述格式返回 JSON 对象。`,
'en-US': `You are a professional knowledge structure analysis expert. Please carefully analyze the notebook content, extract the core knowledge structure, and generate a well-organized mind map.

**IMPORTANT: Please respond in English. All node labels must be in English.**
## Core Requirements

**Output Format Requirements (MUST strictly follow):**
You must return a JSON object with rootNode and metadata:
**1. Language Requirement**
- All node labels must be in English
- Each node label strictly limited to 24 characters (including punctuation)
- Labels should be concise and highlight core concepts

**2. Structural Hierarchy Requirements**
- Depth: Maximum 4 levels (root node level=0, deepest child level=3)
- Breadth: Each parent node must have 2-5 child nodes
- Balance: Maintain balanced tree structure, avoid overly deep or shallow branches

**3. Node Design Principles**
- Root node: Summarize the overall theme of the notebook
- Level-1 children: Main knowledge domains or chapters
- Level-2 children: Specific knowledge points or sub-topics
- Level-3 children: Detailed concepts or examples

**4. Data Association Requirements**
- chunkIds: If node content comes from specific document fragments, must list relevant chunk IDs in metadata.chunkIds (from actual provided content, not speculation)
- If unable to determine specific chunk IDs, set to empty array [] instead of null
- keywords: Optional, extract 2-3 keywords for the node, set to [] instead of null when not needed

## Strict Output Format (JSON)

\`\`\`json
{
"rootNode": {
"id": "unique node ID (string)",
"label": "node label (must be ≤24 characters)",
"id": "0",
"label": "Topic Name",
"metadata": {
"level": 0,
"chunkIds": ["array of related chunk IDs"],
"keywords": ["array of keywords (optional)"]
"chunkIds": [],
"keywords": ["keyword1", "keyword2"]
},
"children": [array of child nodes, each with same structure]
"children": [
{
"id": "1",
"label": "Sub-topic",
"metadata": {
"level": 1,
"chunkIds": [],
"keywords": []
},
"children": []
}
]
},
"metadata": {
"totalNodes": total number of nodes (number),
"maxDepth": maximum depth (number)
"totalNodes": actual_total_node_count,
"maxDepth": actual_max_depth
}
}
\`\`\`

**Content Requirements:**
1. **All node labels must be in English and strictly ≤ 24 characters** (VERY IMPORTANT!)
2. Hierarchy depth ≤ 4 levels (root node level=0, deepest level=3)
3. Each parent node must have 2-5 child nodes
4. Each node's id must be unique
5. Associate relevant chunk IDs in metadata.chunkIds whenever possible
6. totalNodes must equal the actual total number of nodes
7. maxDepth must equal the actual maximum hierarchy depth
## Field Descriptions

- **id**: String, unique identifier; a simple numeric sequence is recommended
- **label**: String, node display text, ≤24 characters
- **level**: Number, 0-3, indicates hierarchy depth
- **chunkIds**: String array, related document fragment IDs, use [] when no association
- **keywords**: String array, optional keywords, use [] when not needed
- **children**: Array, child node list, can be omitted or set to [] for leaf nodes
- **totalNodes**: Number, must equal actual generated node count
- **maxDepth**: Number, must equal actual maximum level (0-3)

## Notebook Content

**Notebook Content:**
{{CONTENT}}

Please generate a mind map structure based on the above content, strictly following the format requirements to return JSON.`
## Generation Guidelines

1. Read through all content, identify main themes and knowledge structure
2. Design root node, summarize overall theme in a concise phrase
3. Break down content into 2-5 main domains as level-1 children
4. Continue refining each main domain into 2-5 knowledge points
5. If deeper levels needed, refine to specific concepts (but not exceeding level-3 children)
6. Ensure node IDs are unique and sequential
7. Count total nodes and max depth, fill into metadata

Please strictly return JSON object following the above format.`
}
}
}
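The two prompts above pin the model to a single JSON shape and insist that `totalNodes` and `maxDepth` match the tree actually emitted. A minimal TypeScript sketch of that shape, with a check of those invariants, could look like this (the type and function names are illustrative, not part of the PR):

```typescript
// Illustrative types for the mind-map JSON the prompts request.
interface MindMapNode {
  id: string
  label: string
  metadata: { level: number; chunkIds: string[]; keywords: string[] }
  children?: MindMapNode[]
}

interface MindMapResult {
  rootNode: MindMapNode
  metadata: { totalNodes: number; maxDepth: number }
}

// Recompute node count and depth from the tree and compare them with the
// model's self-reported metadata.
function validateMindMap(result: MindMapResult): boolean {
  let total = 0
  let maxDepth = 0
  const walk = (node: MindMapNode, depth: number): void => {
    total += 1
    maxDepth = Math.max(maxDepth, depth)
    for (const child of node.children ?? []) walk(child, depth + 1)
  }
  walk(result.rootNode, 0)
  return total === result.metadata.totalNodes && maxDepth === result.metadata.maxDepth
}
```

Running a validator like this over the parsed model output turns the "must equal the actual count" rules from advisory prompt text into an enforceable check.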
77 changes: 70 additions & 7 deletions src/main/db/index.ts
@@ -195,20 +195,83 @@ export function initVectorStore() {
console.log('[Database] Initializing vector store...')

try {
// Create the vector index virtual table (if it does not exist),
// using cosine distance, 1024 dimensions (the BAAI/bge-m3 default)
// Create the vector metadata table, which records each notebook's embedding dimensions
sqlite.exec(`
CREATE VIRTUAL TABLE IF NOT EXISTS vec_embeddings USING vec0(
CREATE TABLE IF NOT EXISTS vec_metadata (
notebook_id TEXT PRIMARY KEY,
table_name TEXT NOT NULL,
dimensions INTEGER NOT NULL,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
`)

console.log('[Database] Vector store initialization completed (tables will be created per notebook)')
} catch (error) {
console.error('[Database] Failed to initialize vector store:', error)
throw error
}
}

/**
 * Create a vector table for the given notebook
 * @param notebookId notebook ID
 * @param dimensions embedding dimensions
*/
export function createNotebookVectorTable(notebookId: string, dimensions: number) {
if (!sqlite) {
throw new Error('[Database] Database not initialized. Call initDatabase() first.')
}

const tableName = `vec_${notebookId.replace(/[^a-zA-Z0-9]/g, '_')}`

try {
// Check whether the metadata table already has a record for this notebook
const metadata = sqlite
.prepare(`SELECT table_name, dimensions FROM vec_metadata WHERE notebook_id = ?`)
.get(notebookId) as { table_name: string; dimensions: number } | undefined

if (metadata) {
// Table already exists; check whether the dimensions match
if (metadata.dimensions !== dimensions) {
console.warn(
`[Database] Vector table ${metadata.table_name} exists with dimensions ${metadata.dimensions}, ` +
`but requested ${dimensions}. Dropping and recreating table.`
)

// Drop the old table
try {
sqlite.exec(`DROP TABLE IF EXISTS ${metadata.table_name}`)
} catch (err) {
console.error(`[Database] Failed to drop old table ${metadata.table_name}:`, err)
}

// Remove the stale metadata record
sqlite.prepare(`DELETE FROM vec_metadata WHERE notebook_id = ?`).run(notebookId)
} else {
console.log(`[Database] Vector table ${metadata.table_name} already exists with correct dimensions: ${dimensions}`)
return metadata.table_name
}
}

// Create the vector table
sqlite.exec(`
CREATE VIRTUAL TABLE ${tableName} USING vec0(
embedding_id TEXT PRIMARY KEY,
chunk_id TEXT,
notebook_id TEXT,
embedding FLOAT[1024] distance_metric=cosine
embedding FLOAT[${dimensions}] distance_metric=cosine
);
`)

console.log('[Database] Vector store initialized successfully')
// Record the metadata
sqlite
.prepare(`INSERT INTO vec_metadata (notebook_id, table_name, dimensions) VALUES (?, ?, ?)`)
.run(notebookId, tableName, dimensions)

console.log(`[Database] Created vector table ${tableName} with dimensions: ${dimensions}`)
return tableName
} catch (error) {
console.error('[Database] Failed to initialize vector store:', error)
console.error(`[Database] Failed to create vector table for notebook ${notebookId}:`, error)
throw error
}
}
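The per-notebook table name above is derived by sanitizing the notebook ID. Isolated as a pure helper (the name `vectorTableName` is illustrative), the mapping is:

```typescript
// Mirrors the sanitization used in createNotebookVectorTable: any character
// outside [a-zA-Z0-9] becomes '_', and the result is prefixed with 'vec_'.
function vectorTableName(notebookId: string): string {
  return `vec_${notebookId.replace(/[^a-zA-Z0-9]/g, '_')}`
}
```

Note that distinct IDs such as `a-b` and `a.b` collapse to the same table name, so this scheme assumes notebook IDs remain unique after sanitization (hyphenated UUIDs do).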
10 changes: 1 addition & 9 deletions src/main/providers/base/AISDKProvider.ts
@@ -7,7 +7,6 @@
import { createOpenAI } from '@ai-sdk/openai'
import { createOpenAICompatible } from '@ai-sdk/openai-compatible'
import { createDeepSeek } from '@ai-sdk/deepseek'
import { createQwen } from 'qwen-ai-provider'
import { createOllama } from 'ollama-ai-provider-v2'
import { streamText, embed, embedMany } from 'ai'
import type { BaseProvider, LLMProviderConfig } from '../capabilities/BaseProvider'
@@ -43,7 +42,6 @@ export class AISDKProvider implements BaseProvider {
| ReturnType<typeof createOpenAI>
| ReturnType<typeof createOpenAICompatible>
| ReturnType<typeof createDeepSeek>
| ReturnType<typeof createQwen>
| ReturnType<typeof createOllama>
| null = null

@@ -82,19 +80,13 @@ export class AISDKProvider implements BaseProvider {
baseURL: this.config.baseUrl,
apiKey: config.apiKey
})
} else if (this.name === 'qwen') {
// Qwen uses the community provider
this.aiProvider = createQwen({
baseURL: this.config.baseUrl,
apiKey: config.apiKey
})
} else if (this.name === 'ollama') {
// Ollama uses the community provider
this.aiProvider = createOllama({
baseURL: this.config.baseUrl || 'http://localhost:11434/api'
})
} else {
// All other providers use OpenAI Compatible (Kimi, SiliconFlow)
// All other providers use OpenAI Compatible (Qwen, Kimi, SiliconFlow)
this.aiProvider = createOpenAICompatible({
name: this.name,
baseURL: this.config.baseUrl || '',
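With `qwen-ai-provider` removed, Qwen requests now fall through to the generic OpenAI-compatible branch. A reduced sketch of the resulting routing decision (the function and union type here are illustrative; the real class wires each branch to an AI SDK factory such as `createOpenAICompatible`):

```typescript
// Only these providers keep a dedicated factory after this PR; every other
// name (qwen, kimi, siliconflow, ...) uses the OpenAI-compatible adapter.
type ProviderKind = 'openai' | 'deepseek' | 'ollama' | 'openai-compatible'

function resolveProviderKind(name: string): ProviderKind {
  if (name === 'openai') return 'openai'
  if (name === 'deepseek') return 'deepseek'
  if (name === 'ollama') return 'ollama'
  return 'openai-compatible'
}
```

This works because Qwen's API is assumed to expose an OpenAI-compatible endpoint, which the generic adapter can target via `baseURL` without a dedicated package.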
8 changes: 4 additions & 4 deletions src/main/services/KnowledgeService.ts
@@ -214,7 +214,7 @@ export class KnowledgeService {
onProgress?.('generating_embeddings', 30)
const embeddingResults = await this.embeddingService.embedBatch(
chunkContents,
{ dimensions: 1024 }, // explicitly request 1024 dimensions
{}, // use the model's native dimensions
(completed, total) => {
const progress = 30 + (completed / total) * 50
onProgress?.('generating_embeddings', Math.round(progress))
@@ -225,7 +225,7 @@
const detectedDimensions = embeddingResults.length > 0 ? embeddingResults[0].dimensions : 1536
if (embeddingResults.length > 0) {
vectorStoreManager.setDefaultDimensions(detectedDimensions)
Logger.debug('KnowledgeService', `Detected embedding dimensions: ${detectedDimensions}`)
Logger.info('KnowledgeService', `Detected embedding dimensions: ${detectedDimensions}`)
}

// 5. Save embedding metadata and add to the vector store
@@ -398,7 +398,7 @@ export class KnowledgeService {
onProgress?.('generating_embeddings', 30)
const embeddingResults = await this.embeddingService.embedBatch(
chunkContents,
{ dimensions: 1024 }, // explicitly request 1024 dimensions
{}, // use the model's native dimensions
(completed, total) => {
const progress = 30 + (completed / total) * 50
onProgress?.('generating_embeddings', Math.round(progress))
@@ -409,7 +409,7 @@
const detectedDimensions = embeddingResults.length > 0 ? embeddingResults[0].dimensions : 1536
if (embeddingResults.length > 0) {
vectorStoreManager.setDefaultDimensions(detectedDimensions)
Logger.debug('KnowledgeService', `Detected embedding dimensions: ${detectedDimensions}`)
Logger.info('KnowledgeService', `Detected embedding dimensions: ${detectedDimensions}`)
}

// 5. Save embedding metadata and add to the vector store
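The dimension handling above — take the first embedding's native dimensionality and fall back to 1536 for an empty batch — can be isolated as a small helper (names are illustrative, not from the PR):

```typescript
// Minimal stand-in for the embedding results used by KnowledgeService.
interface EmbeddingResult {
  dimensions: number
}

// Mirrors the detection in KnowledgeService: use the first result's native
// dimensionality; fall back to a default when no embeddings were produced.
function detectDimensions(results: EmbeddingResult[], fallback = 1536): number {
  return results.length > 0 ? results[0].dimensions : fallback
}
```

Detecting the dimensionality at runtime is what lets the per-notebook `vec0` tables be created with the model's native width instead of a hard-coded 1024.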