A personal semantic search engine for your documents and knowledge base
Features • Quick Start • Usage • Architecture • API Reference • Configuration • MCP Integration • Troubleshooting • Full Technical Writeup
Vector Knowledge Base is a vector database application that transforms your documents into a searchable knowledge base using semantic search. Upload PDFs, Word documents, PowerPoint, Excel, images (with OCR), and code files, then search using natural language to find exactly what you need.
- Semantic Search - Find documents by meaning, not just keywords
- Auto-Clustering - Automatically organize documents into semantic clusters using HDBSCAN (density-based clustering)
- Semantic Cluster Naming - Clusters are automatically named using TF-IDF keyword extraction (e.g., "Shakespeare & Drama", "Python & Programming")
- Cluster-Based Filtering - Filter search results by document clusters for more focused searches
- Batch Upload & Folder Preservation - Drag and drop entire folders to upload, automatically preserving folder structure in your knowledge base
- 3D Embedding Visualization - Interactive 3D visualization of your document embeddings using Three.js
- Multi-Format Support - PDF, DOCX, PPTX, XLSX, CSV, images (OCR), TXT, Markdown, and code files (Python, JavaScript, C#, etc.)
- Intelligent Chunking - AST-aware parsing for code, sentence-boundary awareness for prose
- Folder Organization - Drag-and-drop file management with custom folder hierarchy
- File Viewer - Double-click any file to preview it directly in the browser
- Multi-Page Navigation - Dedicated pages for search, documents, and file management
- Data Management - Export all data as ZIP or reset the entire database with one click
- Modern UI - Clean, responsive interface with dark mode and modular CSS architecture
- Vector Embeddings - Powered by SentenceTransformers (all-mpnet-base-v2, 768-dimensional embeddings)
- High-Performance Search - Qdrant vector database for sub-50ms search queries
- O(1) Document Listing - JSON-based document registry for instant document listing at any scale
- AI Agent Integration (MCP) - Connect Claude Desktop or other AI agents to search, create, and manage documents via Model Context Protocol
Clean, modern dark-mode interface with semantic search and filtering options
- Docker and Docker Compose (recommended)
- OR Python 3.11+ and Docker (for Performance Mode or Manual Installation)
The easiest way to run the entire application:
1. **Clone the repository**

   ```bash
   git clone https://github.com/i3T4AN/Vector-Knowledge-Base.git
   cd Vector-Knowledge-Base
   ```

2. **Start all services with Docker Compose**

   ```bash
   docker-compose up -d
   ```

3. **Open your browser**

   Navigate to http://localhost:8001/index.html
That's it! Docker Compose will automatically:
- Start Qdrant vector database
- Build and start the backend API
- Start the frontend server with Nginx
> [!TIP]
> On first run, the embedding model (~400MB) will be downloaded automatically. This may take a few minutes.
Managing the application:
```bash
# View logs
docker-compose logs -f

# Stop all services
docker-compose down

# Rebuild after code changes
docker-compose up -d --build
```

For significantly faster embedding generation, run the backend natively with GPU support:
| Mode | Embedding Speed | Best For |
|---|---|---|
| Docker (CPU) | ~18s per batch | Cross-platform compatibility |
| Native (Apple M1/M2/M3) | ~3s per batch (6x faster) | Mac with Apple Silicon |
| Native (NVIDIA CUDA) | ~1s per batch (18x faster) | Windows/Linux with NVIDIA GPU |
Setup:
1. **Start Qdrant and the frontend in Docker**

   ```bash
   docker-compose -f docker-compose.native.yml up -d
   # Or simply: docker-compose up -d qdrant frontend
   ```

2. **Run the backend natively**

   macOS/Linux:

   ```bash
   ./scripts/start-backend-native.sh
   ```

   Windows:

   ```bash
   scripts\start-backend-native.bat
   ```
The script will:
- Create a virtual environment
- Install dependencies
- Auto-detect your GPU (MPS for Apple Silicon, CUDA for NVIDIA)
- Start the backend with GPU acceleration
> [!NOTE]
> GPU acceleration requires PyTorch with MPS support (macOS 12.3+) or the CUDA toolkit (Windows/Linux with NVIDIA).
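The auto-detection order (MPS > CUDA > CPU) can be sketched as a small helper. `pick_device` and its boolean flags are illustrative; with PyTorch installed, the flags would come from `torch.backends.mps.is_available()` and `torch.cuda.is_available()`:

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    """Choose the best compute device, preferring MPS, then CUDA, then CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"

print(pick_device(True, False))   # mps  (Apple Silicon)
print(pick_device(False, True))   # cuda (NVIDIA GPU)
print(pick_device(False, False))  # cpu  (fallback)
```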
Deployment Options Summary:
| Mode | Command | GPU | Speed | Use Case |
|---|---|---|---|---|
| Full Docker | `docker-compose up -d` | ❌ | ~18s/batch | Production, cross-platform |
| Native (Mac/Linux) | `./scripts/start-backend-native.sh` | ✅ | ~1-3s/batch | Development, large uploads |
| Native (Windows) | `scripts\start-backend-native.bat` | ✅ | ~1-3s/batch | Development, large uploads |
For development or if you prefer not to use Docker for the backend:
1. **Clone the repository**

   ```bash
   git clone https://github.com/i3T4AN/Vector-Knowledge-Base.git
   cd Vector-Knowledge-Base
   ```

2. **Start Qdrant with Docker**

   ```bash
   docker run -d -p 6333:6333 -v ./qdrant_storage:/qdrant/storage:z qdrant/qdrant
   ```

3. **Set up the Python environment**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   python -m pip install -r requirements.txt
   ```

4. **Start the backend server**

   ```bash
   cd backend
   python -m uvicorn main:app --reload --port 8000 --host 0.0.0.0
   ```

5. **Start the frontend server**

   ```bash
   cd frontend
   python -m http.server 8001
   ```

   > [!NOTE]
   > On Mac, use `python3` instead of `python` if the command is not found.

6. **Open your browser**

   Navigate to http://localhost:8001/index.html
> [!TIP]
> On first run, the embedding model (~400MB) will be downloaded automatically. This may take a few minutes.
1. Navigate to the My Documents page (`documents.html`)
2. Drag and drop files or click to browse
   - Batch Upload: Drop entire folders to upload multiple files at once
   - Folder Preservation: Folder structure is automatically maintained in the "Files" tab
3. Add metadata (course name, document type, tags)
4. Click Upload
5. Monitor progress in the Queue card for batch uploads
The backend will:
- Extract text from your files
- Split content into intelligent chunks
- Generate vector embeddings
- Store in Qdrant for fast retrieval
- Organize files in folders matching your source structure
Upload interface with drag-and-drop support, batch queue, and document management
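As a sketch, the same upload can be scripted. The helper below only assembles the multipart form fields; field names follow the API Reference, while `build_upload_form` itself is illustrative, not part of the project. Pass the result to any HTTP client, e.g. `requests.post(url, files={"file": (name, open(path, "rb"))}, data=form)`.

```python
from pathlib import Path

def build_upload_form(path: str, category: str, tags=None, relative_path=None):
    """Assemble form fields for POST /upload (illustrative sketch).

    Returns (filename, data) ready to hand to an HTTP client.
    """
    data = {"category": category}
    if tags:
        data["tags"] = tags                      # string[] per the API Reference
    if relative_path:
        data["relative_path"] = relative_path    # preserves folder structure
    return Path(path).name, data

name, form = build_upload_form(
    "projects/homework/notes.pdf", "CS101",
    tags=["pdf", "homework"], relative_path="projects/homework",
)
print(name)   # notes.pdf
print(form)
```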
1. Navigate to the Search page (`index.html`)
2. Enter your query in natural language
3. Optionally filter by:
   - Cluster - Filter results by document cluster (requires clustering first)
   - Date range - Filter by upload date
   - Result limit - Number of results to display (5, 10, or 20)
4. Click Search to see ranked results with similarity scores
Semantic search results showing similarity scores and relevant text snippets
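The same search can be issued programmatically. The sketch below only builds the JSON body; field names follow the API Reference, and `build_search_body` is an illustrative helper, not project code. POST the result to `http://localhost:8000/search` with a `Content-Type: application/json` header.

```python
import json

def build_search_body(query, limit=10, cluster_filter=None,
                      start_date=None, end_date=None):
    """Build the JSON body for POST /search (illustrative sketch)."""
    body = {"query": query, "limit": limit}
    if cluster_filter is not None:
        body["cluster_filter"] = cluster_filter  # e.g. "0"
    if start_date:
        body["start_date"] = start_date          # "YYYY-MM-DD"
    if end_date:
        body["end_date"] = end_date
    return json.dumps(body)

print(build_search_body("What is semantic search?", limit=5, cluster_filter="0"))
```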
1. Navigate to the Search page (`index.html`)
2. Upload several documents first (clustering works best with 5+ documents)
3. Click Auto-Cluster Documents
4. The system will:
   - Automatically determine the optimal number of clusters using HDBSCAN
   - Group similar documents together using density-based clustering
   - Generate semantic names for each cluster (e.g., "Python & Programming")
   - Update document metadata with cluster assignments and names
5. Use the Cluster filter to search within specific document groups (shown as "ID: Cluster Name")
Interactive 3D embedding space showing document clusters and search results with cluster information
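The naming step can be approximated in a few lines: score each cluster's terms with TF-IDF and join the top keywords. This is an illustrative re-implementation of the idea, not the project's actual clustering code:

```python
import math
import re
from collections import Counter

def name_clusters(cluster_docs, top_k=2):
    """Name each cluster from its highest-TF-IDF terms (illustrative sketch).

    cluster_docs maps cluster ID -> list of document texts.
    """
    tokenized = {cid: [re.findall(r"[a-z]+", d.lower()) for d in docs]
                 for cid, docs in cluster_docs.items()}
    # Document frequency across every document in every cluster
    all_docs = [set(toks) for docs in tokenized.values() for toks in docs]
    n = len(all_docs)
    df = Counter(term for doc in all_docs for term in doc)
    names = {}
    for cid, docs in tokenized.items():
        tf = Counter(term for toks in docs for term in toks)
        scores = {t: c * math.log(n / df[t]) for t, c in tf.items()}
        top = sorted(scores, key=scores.get, reverse=True)[:top_k]
        names[cid] = " & ".join(t.capitalize() for t in top)
    return names

print(name_clusters({
    0: ["python python code", "python programming"],
    1: ["drama drama shakespeare", "drama sonnet"],
}, top_k=1))
```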
Use the Files page (`files.html`) to:
- Create custom folders
- Drag files between folders
- View unsorted files in the sidebar
- Navigate with breadcrumb navigation
- Double-click any file to open it in the built-in file viewer
File management interface with folder hierarchy and drag-and-drop organization
In the My Documents tab, you can:
- Export Data - Download all uploaded files as a ZIP archive for backup
- Delete Data - Reset the entire database (requires confirmation)
- Clears all vector embeddings from Qdrant
- Removes all folder organization
- Deletes all uploaded files
- This action is irreversible
- Navigate to the Search page (`index.html`)
- Click Show 3D Embedding Space to reveal the interactive visualization
- Explore your document corpus in 3D space
- Enter a search query to see:
- Your query point highlighted in gold
- Top matching documents connected with colored lines
- Line colors indicating similarity (green = high, red = low)
- Hover over points to see document details
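The green-to-red line coloring can be sketched as a linear ramp over the similarity score; the exact thresholds used by the Three.js visualizer may differ, and `similarity_to_rgb` is an illustrative helper:

```python
def similarity_to_rgb(score: float) -> tuple:
    """Map a similarity score in [0, 1] to a red→green ramp (illustrative).

    1.0 → pure green (high similarity), 0.0 → pure red (low similarity).
    """
    s = max(0.0, min(1.0, score))  # clamp out-of-range scores
    return (round(255 * (1 - s)), round(255 * s), 0)

print(similarity_to_rgb(1.0))  # (0, 255, 0)  high similarity → green
print(similarity_to_rgb(0.0))  # (255, 0, 0)  low similarity → red
```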
```
┌─────────────┐
│  Frontend   │  Multi-Page Application
│ (Port 8001) │  index.html, documents.html, files.html
└──────┬──────┘
       │ HTTP
       ▼
┌─────────────┐      ┌─────────────┐
│  Backend    │ ←──  │ MCP Server  │  AI Agent Integration
│ (Port 8000) │      │   (/mcp)    │  (Claude Desktop, etc.)
└──────┬──────┘      └─────────────┘
       │
   ┌───┴────┬────────────┐
   ▼        ▼            ▼
┌──────┐ ┌──────┐ ┌──────────┐
│SQLite│ │Qdrant│ │Sentence  │
│(Meta)│ │(Vec) │ │Transform │
└──────┘ └──────┘ └──────────┘
         Port 6333
```
```
┌──────────┐    ┌───────────┐    ┌─────────┐    ┌──────────┐    ┌────────┐
│  Upload  │ -> │ Extractor │ -> │ Chunker │ -> │ Embedder │ -> │ Qdrant │
│  (File)  │    │  (Text)   │    │ (Chunks)│    │(Vectors) │    │ (Store)│
└──────────┘    └───────────┘    └─────────┘    └──────────┘    └────────┘
```
How Chunks Relate to Documents:
- Each uploaded file is processed by the appropriate Extractor to extract raw text
- The Chunker splits the text into smaller pieces (default: 500 characters with a 50-character overlap)
- Each chunk is converted to a 768-dimensional vector by the Embedder (SentenceTransformers)
- Chunks are stored in Qdrant with metadata linking them back to the original document
- A single document may produce 10-100+ chunks depending on its length
- Search queries match against individual chunks, but results show which document they came from
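The window-with-overlap scheme above can be sketched as a sliding window. The real chunker adds sentence-boundary awareness for prose and AST awareness for code; `chunk_text` here only illustrates how the 500/50 window advances:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50):
    """Split text into fixed-size chunks with overlap (illustrative sketch).

    Each chunk starts (chunk_size - overlap) characters after the previous
    one, so consecutive chunks share `overlap` characters of context.
    """
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("x" * 1200)
print(len(chunks))      # 3 chunks for 1200 characters
print(len(chunks[0]))   # 500
```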
Multi-Page Application (MPA):
- `index.html` - Search interface with 3D visualization
- `documents.html` - Document upload and management
- `files.html` - File organization with drag-and-drop
Pages communicate with the backend API and share a modular CSS architecture.
Backend:
- FastAPI - Modern async web framework
- Qdrant - High-performance vector database (Dockerized)
- SentenceTransformers - State-of-the-art embeddings
- SQLite - Lightweight metadata storage
Frontend:
- Vanilla JavaScript (ES6+ modules)
- Modular CSS architecture (7 organized stylesheets)
- Three.js for 3D embedding visualization
- Fetch API for backend communication
Extractor Architecture:
The application uses a factory pattern for modular file processing:
- ExtractorFactory - Routes files to appropriate extractors based on file extension
- BaseExtractor - Interface that all extractors implement, with an `extract(file_path) → str` method
Specialized Extractors:
- PDFExtractor - Uses `pypdf` for PDF text extraction
- DocxExtractor - Uses `docx2txt` for Word document parsing
- PptxExtractor - Uses `python-pptx` for PowerPoint presentations
- XlsxExtractor - Uses `openpyxl` for Excel spreadsheets with multi-sheet support
- CsvExtractor - Uses `pandas` for CSV file processing with configurable delimiters
- ImageExtractor - Uses `pytesseract` + `PIL` for OCR on images (.jpg, .jpeg, .png, .webp)
- TextExtractor - Handles plain text and Markdown files (.txt, .md)
- CodeExtractor - AST-aware parsing for Python code with function/class extraction
- CsExtractor - Dedicated C# file parsing with namespace and method detection
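The factory routing can be sketched as an extension-to-class registry. Class names mirror the description above, but the bodies are simplified illustrations rather than the project's actual implementation:

```python
from abc import ABC, abstractmethod
from pathlib import Path

class BaseExtractor(ABC):
    """Interface every extractor implements (sketch)."""
    @abstractmethod
    def extract(self, file_path: str) -> str: ...

class TextExtractor(BaseExtractor):
    def extract(self, file_path: str) -> str:
        return Path(file_path).read_text(encoding="utf-8")

class ExtractorFactory:
    """Routes a file to the right extractor by extension (sketch)."""
    _registry = {".txt": TextExtractor, ".md": TextExtractor}

    @classmethod
    def for_file(cls, file_path: str) -> BaseExtractor:
        ext = Path(file_path).suffix.lower()
        try:
            return cls._registry[ext]()
        except KeyError:
            raise ValueError(f"Unsupported file type: {ext}") from None

print(type(ExtractorFactory.for_file("notes.md")).__name__)  # TextExtractor
```

Adding a format then becomes a one-line registry entry plus a new extractor class, which is what keeps the pipeline modular.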
The frontend uses:
- base.css - CSS variables, reset, body, container
- animations.css - Keyframe animations and transitions
- components.css - Buttons, cards, forms, tables
- layout.css - Page-specific layouts
- filesystem.css - File manager UI
- batch-upload.css - Batch upload queue card and status indicators
- modals.css - Modal overlays and notifications
```
POST /upload
Content-Type: multipart/form-data

Parameters:
- file: File (required)
- category: string (required)
- tags: string[] (optional)
- relative_path: string (optional) - Folder path for batch uploads (e.g., "projects/homework")

Response: {
  "filename": "doc.pdf",
  "chunks_count": 42,
  "document_id": "uuid"
}
```

```
POST /search
Content-Type: application/json

Body: {
  "query": "What is semantic search?",
  "extension": ".pdf",
  "start_date": "2024-01-01",
  "end_date": "2024-12-31",
  "limit": 10,
  "cluster_filter": "0"   // Optional: filter by cluster ID
}

Response: {
  "results": [
    {
      "text": "chunk content",
      "score": 0.89,
      "metadata": {
        "cluster": 0,
        ...
      }
    }
  ]
}
```

```
GET /documents

Response: [
  {
    "filename": "doc.pdf",
    "category": "CS101",
    "upload_date": 1705320000.0
  }
]
```

```
DELETE /documents/{filename}

Response: {
  "message": "Document deleted successfully"
}
```

- `GET /folders` - List all folders
- `POST /folders` - Create folder
- `PUT /folders/{id}` - Update folder
- `DELETE /folders/{id}` - Delete empty folder
- `POST /files/move` - Move file to folder
- `GET /files/unsorted` - List unsorted files
- `GET /files/in_folders` - Get file-to-folder mappings
- `GET /files/content/{filename}` - Retrieve file content for viewing

```
POST /api/cluster

Response: {
  "message": "Clustering complete",
  "total_documents": 150,
  "clusters": 5
}

# Automatically clusters all documents in the database
# Automatically determines the optimal number of clusters using the HDBSCAN density-based algorithm
```

```
GET /api/clusters

Response: {
  "clusters": [0, 1, 2, 3, 4]
}

# Returns a list of all cluster IDs currently assigned to documents
```

```
GET /api/embeddings/3d

Response: {
  "coords": [[x, y, z], ...],   // PCA-reduced 3D coordinates
  "point_ids": ["uuid1", ...],
  "metadata": [{"filename": "doc.pdf", ...}, ...]
}

# Returns 3D coordinates for all document chunks (cached for performance)
```

```
POST /api/embeddings/3d/query
Content-Type: application/json

Body: {
  "query": "machine learning",
  "k": 5   // Number of nearest neighbors
}

Response: {
  "query_coords": [x, y, z],
  "neighbors": [{"id": "uuid", "coords": [x, y, z], "score": 0.89}, ...]
}

# Transforms a search query to 3D space and finds its nearest neighbors
```

```
POST /upload-batch
Content-Type: multipart/form-data

Parameters:
- files: File[] (required) - Multiple files to upload
- category: string (required)
- tags: string[] (optional)
- relative_path: string (optional) - Shared folder path for all files

Response: {
  "results": [...],   // Array of upload results
  "total": 10,
  "successful": 10,
  "failed": 0
}

# Optimized batch upload for files sharing the same folder
```

```
GET /api/jobs

Response: {
  "jobs": [
    {"id": "uuid", "type": "clustering", "status": "completed", "progress": 100}
  ]
}

# Lists all background jobs (clustering, etc.)
```

```
GET /api/jobs/{job_id}

Response: {
  "id": "uuid",
  "type": "clustering",
  "status": "running",
  "progress": 45,
  "created_at": "2024-01-15T10:30:00",
  "message": "Processing..."
}

# Gets the status of a specific background job
```

```
GET /export

Response: application/zip

# Downloads a ZIP archive of all uploaded files
```

```
DELETE /reset

Response: {
  "status": "success",
  "message": "All data has been reset"
}

# WARNING: Irreversibly deletes all data
```

Create a `.env` file in the project root directory (copy from `.env.example`):
```env
# Qdrant Configuration
QDRANT_HOST=localhost              # Default: "localhost". Use "qdrant" when running in Docker Compose
QDRANT_PORT=6333                   # Default: 6333
QDRANT_COLLECTION=vector_db        # Default: "vector_db"

# File Upload Settings
UPLOAD_DIR=uploads                 # Default: "uploads" (relative to backend directory)
MAX_FILE_SIZE=52428800             # Default: 50MB (50 * 1024 * 1024 bytes)

# Embedding Model
EMBEDDING_MODEL=all-mpnet-base-v2  # Default: "all-mpnet-base-v2" (768-dimensional)

# Compute Device (for native mode)
DEVICE=auto                        # Options: "auto", "cpu", "cuda", "mps"
                                   # auto = detect best available (MPS > CUDA > CPU)

# Chunking Settings
CHUNK_SIZE=500                     # Default: 500 characters per chunk
CHUNK_OVERLAP=50                   # Default: 50 characters overlap between chunks

# Security
ADMIN_KEY=                         # Optional: protects /reset endpoint. Leave empty to disable.

# Rate Limiting (high defaults for personal use)
RATE_LIMIT_UPLOAD=1000/minute      # Default: 1000/minute (won't affect normal use)
RATE_LIMIT_SEARCH=1000/minute      # Default: 1000/minute
RATE_LIMIT_RESET=60/minute         # Default: 60/minute (stricter for destructive ops)
```
> [!NOTE]
> When using Docker Compose, `QDRANT_HOST` is automatically set to `qdrant` (the service name) in docker-compose.yml. You only need a `.env` file for manual installations or to override defaults.
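As a sketch, these settings could be consumed with plain environment lookups and the documented defaults. `load_config` is illustrative; the project's actual config.py may be structured differently:

```python
import os

def load_config(env=None):
    """Resolve the settings above with their documented defaults (sketch).

    Pass a dict for testing; defaults to os.environ in real use.
    """
    env = dict(os.environ if env is None else env)
    return {
        "qdrant_host": env.get("QDRANT_HOST", "localhost"),
        "qdrant_port": int(env.get("QDRANT_PORT", "6333")),
        "max_file_size": int(env.get("MAX_FILE_SIZE", str(50 * 1024 * 1024))),
        "chunk_size": int(env.get("CHUNK_SIZE", "500")),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", "50")),
    }

print(load_config({}))  # all defaults
print(load_config({"QDRANT_HOST": "qdrant"}))  # Docker Compose override
```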
The Vector Knowledge Base includes built-in support for the Model Context Protocol (MCP), allowing AI agents like Claude Desktop to interact with your knowledge base directly.
- Node.js 18+ - Required for the MCP bridge
  - Download from nodejs.org (LTS recommended)
  - Or on macOS: `brew install node`
1. **Ensure the backend is running**

   ```bash
   # Docker mode
   docker-compose up -d

   # OR Native mode - macOS/Linux
   ./scripts/start-backend-native.sh

   # OR Native mode - Windows
   scripts\start-backend-native.bat
   ```

2. **Locate the Claude Desktop config file**

   - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
   - Windows: `%APPDATA%\Claude\claude_desktop_config.json`

   > [!TIP]
   > In Claude Desktop: Claude → Settings → Developer → Edit Config

3. **Add the MCP server configuration**

   Edit `claude_desktop_config.json`:

   ```json
   {
     "mcpServers": {
       "vector-knowledge-base": {
         "command": "npx",
         "args": ["-y", "mcp-remote", "http://localhost:8000/mcp"]
       }
     }
   }
   ```

   > [!NOTE]
   > On macOS, if `npx` is not in PATH, use the full path: `/usr/local/bin/npx`

4. **Restart Claude Desktop**

   - Fully quit (Cmd+Q / Alt+F4), don't just close the window
   - Reopen Claude Desktop
   - The MCP tools should now be available
Once connected, just ask Claude naturally:
| Example Prompt | Action |
|---|---|
| "Search my knowledge base for machine learning" | Semantic search |
| "List all documents in my knowledge base" | List documents |
| "Show me the document clusters" | Get clusters |
| "Run auto-clustering on my documents" | Cluster documents |
| "Check if my vector database is healthy" | Health check |
| "Get 3D embedding data for cluster 1" | Visualization data |
| "Create a summary document with my notes" | Create text document |
Claude Desktop searching and listing documents via MCP integration
> [!IMPORTANT]
> Claude Desktop has limitations when interacting with the knowledge base via MCP.
What Claude CAN do:
- ✅ Search documents semantically
- ✅ List all documents and folders
- ✅ Delete documents by filename
- ✅ Run clustering and get cluster info
- ✅ Get 3D embedding coordinates for visualization
- ✅ Check system health
- ✅ Create text documents (.txt, .md, .json) - Claude can generate content and save it to your knowledge base
What Claude CANNOT do:
- ❌ Upload binary files - PDFs, Word docs, and images require multipart uploads, which MCP cannot provide (at least in my testing with Claude Desktop)
- ❌ Access your filesystem - Claude cannot read files from paths like `/Users/.../file.pdf`
To upload files, use one of these methods instead:
- Web interface at http://localhost:8001/documents.html
- curl command:

  ```bash
  curl -X POST http://localhost:8000/upload \
    -F "file=@/path/to/document.pdf" \
    -F "category=my-category"
  ```
| Tool | Description |
|---|---|
| `health_check` | Check if the API is running |
| `get_allowed_extensions` | Get the list of supported file types |
| `search_documents` | Semantic search across all documents |
| `list_documents` | List all uploaded documents |
| `delete_document` | Delete a document by filename |
| `get_folders` | List the folder structure |
| `create_folder` | Create a new folder |
| `update_folder` | Rename or move a folder |
| `delete_folder` | Delete an empty folder |
| `move_file` | Move a file to a folder |
| `get_unsorted_files` | List files not in any folder |
| `get_files_in_folders` | Get file-to-folder mappings |
| `cluster_documents` | Run auto-clustering |
| `get_clusters` | Get cluster information |
| `get_embeddings_3d` | Get 3D visualization coordinates |
| `transform_query_3d` | Project a query into 3D space |
| `get_job_status` | Check background job progress |
| `mcp_create_document` | Create text documents (.txt, .md, .json) |
> [!TIP]
> Claude can create searchable text documents using `mcp_create_document`. Ask it to "create a summary", "write notes", or "save a document" and it will add the content to your knowledge base.
MCP settings are configured in `config.py` (not in `.env`):

```
MCP_ENABLED=true                # Enable/disable MCP endpoint
MCP_PATH=/mcp                   # URL path for MCP server
MCP_NAME=Vector Knowledge Base  # Display name
MCP_AUTH_ENABLED=false          # Enable OAuth (production)
```

"Server disconnected" error in Claude Desktop:
- Ensure the backend is running: `curl http://localhost:8000/health`
- Check that Node.js is installed: `node --version`
- Try the full path to npx: `/usr/local/bin/npx`
MCP tools not appearing:
- Fully quit and reopen Claude Desktop
- Check the Claude Desktop logs for errors
- Verify the config JSON is valid (no trailing commas)
> [!CAUTION]
> MCP provides AI agents with full access to your knowledge base. In production environments, enable `MCP_AUTH_ENABLED=true` for OAuth protection.
If you see "Connection refused" errors:
```bash
# Check if Qdrant is running
docker ps

# Restart the Qdrant container
docker restart <container-id>

# Or start a new container
docker run -d -p 6333:6333 -v ./qdrant_storage:/qdrant/storage:z qdrant/qdrant
```

> [!WARNING]
> Dependency conflicts between sentence-transformers and huggingface-hub can cause startup failures.

Solution:

```bash
pip install --upgrade sentence-transformers huggingface-hub
```

Check supported file types:
- Documents: `.pdf`, `.docx`, `.pptx`, `.ppt`, `.xlsx`, `.csv`, `.txt`, `.md`
- Images: `.jpg`, `.jpeg`, `.png`, `.webp` (OCR-processed)
- Code: `.py`, `.js`, `.java`, `.cpp`, `.html`, `.css`, `.json`, `.xml`, `.yaml`, `.yml`, `.cs`
Maximum file size: 50MB (configurable)
If you see "Failed to fetch" errors in the browser console:
1. Verify the backend is running on port 8000:

   ```bash
   curl http://127.0.0.1:8000/health
   ```

2. Check that `frontend/config.js` uses `127.0.0.1` (not `localhost`):

   ```js
   const API_BASE_URL = 'http://127.0.0.1:8000';
   ```
This avoids IPv6/IPv4 resolution issues on some systems.
```
Vector-Knowledge-Base/
├── backend/
│   ├── extractors/
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── factory.py
│   │   ├── pdf_extractor.py
│   │   ├── docx_extractor.py
│   │   ├── pptx_extractor.py
│   │   ├── xlsx_extractor.py
│   │   ├── csv_extractor.py
│   │   ├── text_extractor.py
│   │   ├── code_extractor.py
│   │   ├── cs_extractor.py
│   │   └── image_extractor.py
│   ├── uploads/                   # Uploaded files (gitignored except .gitkeep)
│   │   └── .gitkeep
│   ├── data/                      # Runtime data (auto-created)
│   │   └── documents.json         # Document registry for O(1) listing
│   ├── main.py
│   ├── vector_db.py
│   ├── embedding_service.py
│   ├── ingestion.py
│   ├── chunker.py
│   ├── clustering.py
│   ├── filesystem_db.py
│   ├── document_registry.py       # O(1) document listing registry
│   ├── dimensionality_reduction.py
│   ├── jobs.py                    # Background task tracking
│   ├── config.py
│   ├── constants.py               # Shared constants
│   ├── mcp_server.py              # MCP server integration
│   └── exceptions.py
├── frontend/
│   ├── css/
│   │   ├── base.css
│   │   ├── animations.css
│   │   ├── components.css
│   │   ├── layout.css
│   │   ├── filesystem.css
│   │   ├── batch-upload.css
│   │   └── modals.css
│   ├── js/
│   │   └── embedding-visualizer.js
│   ├── index.html
│   ├── documents.html
│   ├── files.html
│   ├── config.js
│   ├── constants.js
│   ├── search.js
│   ├── upload.js
│   ├── documents.js
│   ├── filesystem.js
│   ├── notifications.js
│   └── favicon.ico
├── scripts/
│   ├── start-backend-native.sh    # GPU mode startup (Unix)
│   └── start-backend-native.bat   # GPU mode startup (Windows)
├── screenshots/
├── Docs/
│   └── Vector_Knowledge_Base_Technical_Report.pdf  # Full technical documentation
├── qdrant_storage/                # Created at runtime (gitignored)
├── uploads/                       # Created at runtime by Docker (gitignored)
├── backend_db/                    # Created at runtime (gitignored)
├── Dockerfile
├── docker-compose.yml             # Full Docker deployment
├── docker-compose.native.yml      # Native backend mode
├── nginx.conf
├── requirements.txt
├── requirements.in
├── LICENSE
└── README.md
```
- Upload: ~2-5 seconds for typical PDF
- Search: 100-500ms depending on corpus size (sub-50ms for <10k vectors)
- Embedding: ~50-100ms per chunk
- Capacity: Scales to 100k+ documents with Qdrant
Built with ❤️ using FastAPI, Qdrant, and SentenceTransformers



