Local RAG #88

Open
BenBritons wants to merge 5 commits into lsfusion:master from BenBritons:local-rag

Summary

  • Added a local RAG pipeline for LSF sources with Lucene vector indexing and ONNX-based embeddings.
  • Wired startup indexing and VFS-based reindexing to keep embeddings up to date.
  • Added model download and extraction of the native ONNX Runtime libraries to make runIde reliable on Windows.

What changed vs master

New files

  • src/com/lsfusion/mcp/LocalMcpRagService.java — local RAG service: indexing, vector storage, query scoring.
  • src/com/lsfusion/mcp/OnnxEmbeddingProvider.java — ONNX Runtime embedding inference.
  • src/com/lsfusion/mcp/EmbeddingProvider.java — embedding provider interface.
  • src/com/lsfusion/mcp/LSFMcpRagFileListener.java — VFS listener to reindex on file changes.

Updated

  • src/com/lsfusion/LSFBaseStartupActivity.java — startup indexing + listener registration.
  • src/com/lsfusion/mcp/McpServerService.java, src/com/lsfusion/mcp/McpToolset.kt, src/com/lsfusion/mcp/MCPSearchUtils.java — integrate local RAG search results.
  • build.gradle.kts — new dependencies + model download + native extraction + runIde JVM args.

How it works

  1. Indexing
  • On startup, LocalMcpRagService scans all LSF files, extracts MCP declarations, builds a text payload for each one, and computes its embedding.
  • Each record is stored in Lucene with the vector saved as a binary field.
  • A VFS listener triggers reindexing on file change/delete.
  2. Query
  • The input query is embedded with ONNX Runtime.
  • Stored vectors are scored against the query embedding by dot product, and the top‑K matches are returned.
  3. Model + native libs
  • downloadE5Model fetches model.onnx + tokenizer.json into .mcp-model.
  • ONNX native DLLs are extracted into build/onnxruntime-native.
  • runIde sets onnxruntime.native.path and a stable temp dir for reliable loading on Windows.
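
For reviewers, the query step described above (dot product against stored vectors, then top‑K) can be sketched roughly as follows. This is an illustration only, not the code in LocalMcpRagService: the class name DotProductTopK and its methods are hypothetical, and the sketch assumes that all embeddings have the same dimension and are L2-normalized, in which case the dot product is the same as cosine similarity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not a class from this PR. */
final class DotProductTopK {

    /** Plain dot product; for L2-normalized vectors this equals cosine similarity. */
    static float dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + a.length + " vs " + b.length);
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /**
     * Scores every stored vector against the query and returns the indices
     * of the k best matches, highest score first.
     */
    static List<Integer> topK(float[] query, List<float[]> stored, int k) {
        float[] scores = new float[stored.size()];
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < stored.size(); i++) {
            scores[i] = dot(query, stored.get(i));
            order.add(i);
        }
        // Sort indices by descending score; the sort is stable, so ties keep index order.
        order.sort((x, y) -> Float.compare(scores[y], scores[x]));
        int limit = Math.min(Math.max(k, 0), order.size());
        return new ArrayList<>(order.subList(0, limit));
    }
}
```

A full scan like this is linear in the number of indexed declarations, which is usually acceptable for a single project's sources; whether the PR scans exhaustively or prunes candidates first is not stated in the description.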

Technologies

  • ONNX Runtime (Java) — CPU embeddings.
  • DJL HuggingFace Tokenizers — tokenization from tokenizer.json.
  • Lucene — vector storage and scoring.
  • IntelliJ Platform APIs — startup activity, VFS listener, DumbService guard.
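
Since the vectors are saved in Lucene as a binary field, something has to convert between float[] and byte[]. A minimal sketch of such a codec is below. The class name VectorCodec and the little-endian byte order are assumptions made for illustration, and the PR may serialize differently. The resulting byte[] could then be passed to a stored binary field such as Lucene's StoredField(String, byte[]); whether the PR uses that field type is also an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical codec for illustration; not a class from this PR. */
final class VectorCodec {

    /** Serializes a vector as consecutive 4-byte little-endian floats. */
    static byte[] encode(float[] vector) {
        ByteBuffer buf = ByteBuffer.allocate(vector.length * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float v : vector) {
            buf.putFloat(v);
        }
        return buf.array();
    }

    /** Reverses encode(); rejects input whose length is not a multiple of 4 bytes. */
    static float[] decode(byte[] bytes) {
        if (bytes.length % Float.BYTES != 0) {
            throw new IllegalArgumentException("length is not a multiple of 4: " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        float[] out = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getFloat();
        }
        return out;
    }
}
```

Fixing the byte order explicitly keeps an index readable across platforms, at the cost of having to rebuild the index if the encoding ever changes.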

Test plan

  • runIde
  • Wait for initial indexing
  • Run MCP search and verify relevance
  • Modify an LSF file and verify search reflects the change
