A personal semantic search engine for your documents and knowledge base
Features • Quick Start • Usage • Architecture • API Reference • Configuration • MCP Integration • Troubleshooting • Full Technical Writeup
Vector Knowledge Base is a vector database application that transforms your documents into a searchable knowledge base using semantic search. Upload PDFs, Word documents, PowerPoint, Excel, images (with OCR), and code files, then search using natural language to find exactly what you need.
- Semantic Search - Find documents by meaning, not just keywords
- Auto-Clustering - Automatically organize documents into semantic clusters using HDBSCAN (density-based clustering)
- Semantic Cluster Naming - Clusters are automatically named using TF-IDF keyword extraction (e.g., "Shakespeare & Drama", "Python & Programming")
- Cluster-Based Filtering - Filter search results by document clusters for more focused searches
- Batch Upload & Folder Preservation - Drag and drop entire folders to upload, automatically preserving folder structure in your knowledge base
- 3D Embedding Visualization - Interactive 3D visualization of your document embeddings using Three.js
- Multi-Format Support - PDF, DOCX, PPTX, XLSX, CSV, images (OCR), TXT, Markdown, and code files (Python, JavaScript, C#, etc.)
- Intelligent Chunking - AST-aware parsing for code, sentence-boundary awareness for prose
- Folder Organization - Drag-and-drop file management with custom folder hierarchy
- File Viewer - Double-click any file to preview it directly in the browser
- Multi-Page Navigation - Dedicated pages for search, documents, and file management
- Data Management - Export all data as ZIP or reset the entire database with one click
- Modern UI - Clean, responsive interface with dark mode and modular CSS architecture
- Vector Embeddings - Powered by SentenceTransformers (all-mpnet-base-v2, 768-dimensional embeddings)
- High-Performance Search - Qdrant vector database for sub-50ms search queries
- O(1) Document Listing - JSON-based document registry for instant document listing at any scale
- AI Agent Integration (MCP) - Connect Claude Desktop or other AI agents to search, create, and manage documents via Model Context Protocol
Clean, modern dark-mode interface with semantic search and filtering options
- Docker and Docker Compose (recommended)
- OR Python 3.11+ and Docker (for Performance Mode or Manual Installation)
The easiest way to run the entire application:
1. **Clone the repository**

   ```bash
   git clone https://github.com/i3T4AN/Vector-Knowledge-Base.git
   cd Vector-Knowledge-Base
   ```

2. **Start all services with Docker Compose**

   ```bash
   docker-compose up -d
   ```

3. **Open your browser**

   Navigate to http://localhost:8001/index.html
That's it! Docker Compose will automatically:
- Start Qdrant vector database
- Build and start the backend API
- Start the frontend server with Nginx
> [!TIP]
> On first run, the embedding model (~400MB) will be downloaded automatically. This may take a few minutes.
Managing the application:
```bash
# View logs
docker-compose logs -f

# Stop all services
docker-compose down

# Rebuild after code changes
docker-compose up -d --build
```

For significantly faster embedding generation, run the backend natively with GPU support:
| Mode | Embedding Speed | Best For |
|---|---|---|
| Docker (CPU) | ~18s per batch | Cross-platform compatibility |
| Native (Apple M1/M2/M3) | ~3s per batch (6x faster) | Mac with Apple Silicon |
| Native (NVIDIA CUDA) | ~1s per batch (18x faster) | Windows/Linux with NVIDIA GPU |
Setup:
1. **Start Qdrant and the frontend in Docker**

   ```bash
   docker-compose -f docker-compose.native.yml up -d
   # Or simply: docker-compose up -d qdrant frontend
   ```

2. **Run the backend natively**

   macOS/Linux:

   ```bash
   ./scripts/start-backend-native.sh
   ```

   Windows:

   ```bash
   scripts\start-backend-native.bat
   ```
The script will:
- Create a virtual environment
- Install dependencies
- Auto-detect your GPU (MPS for Apple Silicon, CUDA for NVIDIA)
- Start the backend with GPU acceleration
> [!NOTE]
> GPU acceleration requires PyTorch with MPS support (macOS 12.3+) or the CUDA toolkit (Windows/Linux with NVIDIA).
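The auto-detection order (MPS > CUDA > CPU) can be sketched as a small helper. `pick_device` and its boolean flags are illustrative; with PyTorch installed, the flags would come from `torch.backends.mps.is_available()` and `torch.cuda.is_available()`:

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    """Choose the best compute device, preferring MPS, then CUDA, then CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"

print(pick_device(True, False))   # mps  (Apple Silicon)
print(pick_device(False, True))   # cuda (NVIDIA GPU)
print(pick_device(False, False))  # cpu  (fallback)
```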
Deployment Options Summary:
| Mode | Command | GPU | Speed | Use Case |
|---|---|---|---|---|
| Full Docker | `docker-compose up -d` | ❌ | ~18s/batch | Production, cross-platform |
| Native (Mac/Linux) | `./scripts/start-backend-native.sh` | ✅ | ~1-3s/batch | Development, large uploads |
| Native (Windows) | `scripts\start-backend-native.bat` | ✅ | ~1-3s/batch | Development, large uploads |
For development or if you prefer not to use Docker for the backend:
1. **Clone the repository**

   ```bash
   git clone https://github.com/i3T4AN/Vector-Knowledge-Base.git
   cd Vector-Knowledge-Base
   ```

2. **Start Qdrant with Docker**

   ```bash
   docker run -d -p 6333:6333 -v ./qdrant_storage:/qdrant/storage:z qdrant/qdrant
   ```

3. **Set up the Python environment**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   python -m pip install -r requirements.txt
   ```

4. **Start the backend server**

   ```bash
   cd backend
   python -m uvicorn main:app --reload --port 8000 --host 0.0.0.0
   ```

5. **Start the frontend server**

   ```bash
   cd frontend
   python -m http.server 8001
   ```

   > [!NOTE]
   > On Mac, use `python3` instead of `python` if the command is not found.

6. **Open your browser**

   Navigate to http://localhost:8001/index.html
> [!TIP]
> On first run, the embedding model (~400MB) will be downloaded automatically. This may take a few minutes.
1. Navigate to the My Documents page (`documents.html`)
2. Drag and drop files or click to browse
   - Batch Upload: Drop entire folders to upload multiple files at once
   - Folder Preservation: Folder structure is automatically maintained in the "Files" tab
3. Add metadata (course name, document type, tags)
4. Click Upload
5. Monitor progress in the Queue card for batch uploads
The backend will:
- Extract text from your files
- Split content into intelligent chunks
- Generate vector embeddings
- Store in Qdrant for fast retrieval
- Organize files in folders matching your source structure
Upload interface with drag-and-drop support, batch queue, and document management
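As a sketch, the same upload can be scripted. The helper below only assembles the multipart form fields; field names follow the API Reference, while `build_upload_form` itself is illustrative, not part of the project. Pass the result to any HTTP client, e.g. `requests.post(url, files={"file": (name, open(path, "rb"))}, data=form)`.

```python
from pathlib import Path

def build_upload_form(path: str, category: str, tags=None, relative_path=None):
    """Assemble form fields for POST /upload (illustrative sketch).

    Returns (filename, data) ready to hand to an HTTP client.
    """
    data = {"category": category}
    if tags:
        data["tags"] = tags                      # string[] per the API Reference
    if relative_path:
        data["relative_path"] = relative_path    # preserves folder structure
    return Path(path).name, data

name, form = build_upload_form(
    "projects/homework/notes.pdf", "CS101",
    tags=["pdf", "homework"], relative_path="projects/homework",
)
print(name)   # notes.pdf
print(form)
```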
1. Navigate to the Search page (`index.html`)
2. Enter your query in natural language
3. Optionally filter by:
   - Cluster - Filter results by document cluster (requires clustering first)
   - Date range - Filter by upload date
   - Result limit - Number of results to display (5, 10, or 20)
4. Click Search to see ranked results with similarity scores
Semantic search results showing similarity scores and relevant text snippets
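The same search can be issued programmatically. The sketch below only builds the JSON body; field names follow the API Reference, and `build_search_body` is an illustrative helper, not project code. POST the result to `http://localhost:8000/search` with a `Content-Type: application/json` header.

```python
import json

def build_search_body(query, limit=10, cluster_filter=None,
                      start_date=None, end_date=None):
    """Build the JSON body for POST /search (illustrative sketch)."""
    body = {"query": query, "limit": limit}
    if cluster_filter is not None:
        body["cluster_filter"] = cluster_filter  # e.g. "0"
    if start_date:
        body["start_date"] = start_date          # "YYYY-MM-DD"
    if end_date:
        body["end_date"] = end_date
    return json.dumps(body)

print(build_search_body("What is semantic search?", limit=5, cluster_filter="0"))
```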
1. Navigate to the Search page (`index.html`)
2. Upload several documents first (clustering works best with 5+ documents)
3. Click Auto-Cluster Documents
4. The system will:
   - Automatically determine the optimal number of clusters using HDBSCAN
   - Group similar documents together using density-based clustering
   - Generate semantic names for each cluster (e.g., "Python & Programming")
   - Update document metadata with cluster assignments and names
5. Use the Cluster filter to search within specific document groups (shown as "ID: Cluster Name")
Interactive 3D embedding space showing document clusters and search results with cluster information
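The naming step can be approximated in a few lines: score each cluster's terms with TF-IDF and join the top keywords. This is an illustrative re-implementation of the idea, not the project's actual clustering code:

```python
import math
import re
from collections import Counter

def name_clusters(cluster_docs, top_k=2):
    """Name each cluster from its highest-TF-IDF terms (illustrative sketch).

    cluster_docs maps cluster ID -> list of document texts.
    """
    tokenized = {cid: [re.findall(r"[a-z]+", d.lower()) for d in docs]
                 for cid, docs in cluster_docs.items()}
    # Document frequency across every document in every cluster
    all_docs = [set(toks) for docs in tokenized.values() for toks in docs]
    n = len(all_docs)
    df = Counter(term for doc in all_docs for term in doc)
    names = {}
    for cid, docs in tokenized.items():
        tf = Counter(term for toks in docs for term in toks)
        scores = {t: c * math.log(n / df[t]) for t, c in tf.items()}
        top = sorted(scores, key=scores.get, reverse=True)[:top_k]
        names[cid] = " & ".join(t.capitalize() for t in top)
    return names

print(name_clusters({
    0: ["python python code", "python programming"],
    1: ["drama drama shakespeare", "drama sonnet"],
}, top_k=1))
```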
Use the Files page (`files.html`) to:
- Create custom folders
- Drag files between folders
- View unsorted files in the sidebar
- Navigate with breadcrumb navigation
- Double-click any file to open it in the built-in file viewer
File management interface with folder hierarchy and drag-and-drop organization
In the My Documents tab, you can:
- Export Data - Download all uploaded files as a ZIP archive for backup
- Delete Data - Reset the entire database (requires confirmation)
- Clears all vector embeddings from Qdrant
- Removes all folder organization
- Deletes all uploaded files
- This action is irreversible
- Navigate to the Search page (`index.html`)
- Click Show 3D Embedding Space to reveal the interactive visualization
- Explore your document corpus in 3D space
- Enter a search query to see:
- Your query point highlighted in gold
- Top matching documents connected with colored lines
- Line colors indicating similarity (green = high, red = low)
- Hover over points to see document details
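The green-to-red line coloring can be sketched as a linear ramp over the similarity score; the exact thresholds used by the Three.js visualizer may differ, and `similarity_to_rgb` is an illustrative helper:

```python
def similarity_to_rgb(score: float) -> tuple:
    """Map a similarity score in [0, 1] to a red→green ramp (illustrative).

    1.0 → pure green (high similarity), 0.0 → pure red (low similarity).
    """
    s = max(0.0, min(1.0, score))  # clamp out-of-range scores
    return (round(255 * (1 - s)), round(255 * s), 0)

print(similarity_to_rgb(1.0))  # (0, 255, 0)  high similarity → green
print(similarity_to_rgb(0.0))  # (255, 0, 0)  low similarity → red
```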
```
┌─────────────┐
│  Frontend   │  Multi-Page Application
│ (Port 8001) │  index.html, documents.html, files.html
└──────┬──────┘
       │ HTTP
       ▼
┌─────────────┐      ┌─────────────┐
│  Backend    │ ←──  │ MCP Server  │  AI Agent Integration
│ (Port 8000) │      │   (/mcp)    │  (Claude Desktop, etc.)
└──────┬──────┘      └─────────────┘
       │
   ┌───┴────┬────────────┐
   ▼        ▼            ▼
┌──────┐ ┌──────┐ ┌──────────┐
│SQLite│ │Qdrant│ │Sentence  │
│(Meta)│ │(Vec) │ │Transform │
└──────┘ └──────┘ └──────────┘
         Port 6333
```
```
┌──────────┐    ┌───────────┐    ┌─────────┐    ┌──────────┐    ┌────────┐
│  Upload  │ -> │ Extractor │ -> │ Chunker │ -> │ Embedder │ -> │ Qdrant │
│  (File)  │    │  (Text)   │    │ (Chunks)│    │(Vectors) │    │ (Store)│
└──────────┘    └───────────┘    └─────────┘    └──────────┘    └────────┘
```
How Chunks Relate to Documents:
- Each uploaded file is processed by the appropriate Extractor to extract raw text
- The Chunker splits the text into smaller pieces (default: 500 characters with a 50-character overlap)
- Each chunk is converted to a 768-dimensional vector by the Embedder (SentenceTransformers)
- Chunks are stored in Qdrant with metadata linking them back to the original document
- A single document may produce 10-100+ chunks depending on its length
- Search queries match against individual chunks, but results show which document they came from
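The window-with-overlap scheme above can be sketched as a sliding window. The real chunker adds sentence-boundary awareness for prose and AST awareness for code; `chunk_text` here only illustrates how the 500/50 window advances:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50):
    """Split text into fixed-size chunks with overlap (illustrative sketch).

    Each chunk starts (chunk_size - overlap) characters after the previous
    one, so consecutive chunks share `overlap` characters of context.
    """
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("x" * 1200)
print(len(chunks))      # 3 chunks for 1200 characters
print(len(chunks[0]))   # 500
```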
Multi-Page Application (MPA):
- `index.html` - Search interface with 3D visualization
- `documents.html` - Document upload and management
- `files.html` - File organization with drag-and-drop
Pages communicate with the backend API and share a modular CSS architecture.
Backend:
- FastAPI - Modern async web framework
- Qdrant - High-performance vector database (Dockerized)
- SentenceTransformers - State-of-the-art embeddings
- SQLite - Lightweight metadata storage
Frontend:
- Vanilla JavaScript (ES6+ modules)
- Modular CSS architecture (7 organized stylesheets)
- Three.js for 3D embedding visualization
- Fetch API for backend communication
Extractor Architecture:
The application uses a factory pattern for modular file processing:
- ExtractorFactory - Routes files to appropriate extractors based on file extension
- BaseExtractor - Interface that all extractors implement, with an `extract(file_path) → str` method
Specialized Extractors:
- PDFExtractor - Uses `pypdf` for PDF text extraction
- DocxExtractor - Uses `docx2txt` for Word document parsing
- PptxExtractor - Uses `python-pptx` for PowerPoint presentations
- XlsxExtractor - Uses `openpyxl` for Excel spreadsheets with multi-sheet support
- CsvExtractor - Uses `pandas` for CSV file processing with configurable delimiters
- ImageExtractor - Uses `pytesseract` + `PIL` for OCR on images (.jpg, .jpeg, .png, .webp)
- TextExtractor - Handles plain text and Markdown files (.txt, .md)
- CodeExtractor - AST-aware parsing for Python code with function/class extraction
- CsExtractor - Dedicated C# file parsing with namespace and method detection
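The factory routing can be sketched as an extension-to-class registry. Class names mirror the description above, but the bodies are simplified illustrations rather than the project's actual implementation:

```python
from abc import ABC, abstractmethod
from pathlib import Path

class BaseExtractor(ABC):
    """Interface every extractor implements (sketch)."""
    @abstractmethod
    def extract(self, file_path: str) -> str: ...

class TextExtractor(BaseExtractor):
    def extract(self, file_path: str) -> str:
        return Path(file_path).read_text(encoding="utf-8")

class ExtractorFactory:
    """Routes a file to the right extractor by extension (sketch)."""
    _registry = {".txt": TextExtractor, ".md": TextExtractor}

    @classmethod
    def for_file(cls, file_path: str) -> BaseExtractor:
        ext = Path(file_path).suffix.lower()
        try:
            return cls._registry[ext]()
        except KeyError:
            raise ValueError(f"Unsupported file type: {ext}") from None

print(type(ExtractorFactory.for_file("notes.md")).__name__)  # TextExtractor
```

Adding a format then becomes a one-line registry entry plus a new extractor class, which is what keeps the pipeline modular.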
The frontend uses:
- base.css - CSS variables, reset, body, container
- animations.css - Keyframe animations and transitions
- components.css - Buttons, cards, forms, tables
- layout.css - Page-specific layouts
- filesystem.css - File manager UI
- batch-upload.css - Batch upload queue card and status indicators
- modals.css - Modal overlays and notifications
```
POST /upload
Content-Type: multipart/form-data

Parameters:
- file: File (required)
- category: string (required)
- tags: string[] (optional)
- relative_path: string (optional) - Folder path for batch uploads (e.g., "projects/homework")

Response: {
  "filename": "doc.pdf",
  "chunks_count": 42,
  "document_id": "uuid"
}
```

```
POST /search
Content-Type: application/json

Body: {
  "query": "What is semantic search?",
  "extension": ".pdf",
  "start_date": "2024-01-01",
  "end_date": "2024-12-31",
  "limit": 10,
  "cluster_filter": "0"   // Optional: filter by cluster ID
}

Response: {
  "results": [
    {
      "text": "chunk content",
      "score": 0.89,
      "metadata": {
        "cluster": 0,
        ...
      }
    }
  ]
}
```

```
GET /documents

Response: [
  {
    "filename": "doc.pdf",
    "category": "CS101",
    "upload_date": 1705320000.0
  }
]
```

```
DELETE /documents/{filename}

Response: {
  "message": "Document deleted successfully"
}
```

- `GET /folders` - List all folders
- `POST /folders` - Create folder
- `PUT /folders/{id}` - Update folder
- `DELETE /folders/{id}` - Delete empty folder
- `POST /files/move` - Move file to folder
- `GET /files/unsorted` - List unsorted files
- `GET /files/in_folders` - Get file-to-folder mappings
- `GET /files/content/{filename}` - Retrieve file content for viewing

```
POST /api/cluster

Response: {
  "message": "Clustering complete",
  "total_documents": 150,
  "clusters": 5
}

# Automatically clusters all documents in the database
# Automatically determines the optimal number of clusters using the HDBSCAN density-based algorithm
```

```
GET /api/clusters

Response: {
  "clusters": [0, 1, 2, 3, 4]
}

# Returns a list of all cluster IDs currently assigned to documents
```

```
GET /api/embeddings/3d

Response: {
  "coords": [[x, y, z], ...],   // PCA-reduced 3D coordinates
  "point_ids": ["uuid1", ...],
  "metadata": [{"filename": "doc.pdf", ...}, ...]
}

# Returns 3D coordinates for all document chunks (cached for performance)
```

```
POST /api/embeddings/3d/query
Content-Type: application/json

Body: {
  "query": "machine learning",
  "k": 5   // Number of nearest neighbors
}

Response: {
  "query_coords": [x, y, z],
  "neighbors": [{"id": "uuid", "coords": [x, y, z], "score": 0.89}, ...]
}

# Transforms a search query to 3D space and finds its nearest neighbors
```

```
POST /upload-batch
Content-Type: multipart/form-data

Parameters:
- files: File[] (required) - Multiple files to upload
- category: string (required)
- tags: string[] (optional)
- relative_path: string (optional) - Shared folder path for all files

Response: {
  "results": [...],   // Array of upload results
  "total": 10,
  "successful": 10,
  "failed": 0
}

# Optimized batch upload for files sharing the same folder
```

```
GET /api/jobs

Response: {
  "jobs": [
    {"id": "uuid", "type": "clustering", "status": "completed", "progress": 100}
  ]
}

# Lists all background jobs (clustering, etc.)
```

```
GET /api/jobs/{job_id}

Response: {
  "id": "uuid",
  "type": "clustering",
  "status": "running",
  "progress": 45,
  "created_at": "2024-01-15T10:30:00",
  "message": "Processing..."
}

# Gets the status of a specific background job
```

```
GET /export

Response: application/zip

# Downloads a ZIP archive of all uploaded files
```

```
DELETE /reset

Response: {
  "status": "success",
  "message": "All data has been reset"
}

# WARNING: Irreversibly deletes all data
```

Create a `.env` file in the project root directory (copy from `.env.example`):
```env
# Qdrant Configuration
QDRANT_HOST=localhost              # Default: "localhost". Use "qdrant" when running in Docker Compose
QDRANT_PORT=6333                   # Default: 6333
QDRANT_COLLECTION=vector_db        # Default: "vector_db"

# File Upload Settings
UPLOAD_DIR=uploads                 # Default: "uploads" (relative to backend directory)
MAX_FILE_SIZE=52428800             # Default: 50MB (50 * 1024 * 1024 bytes)

# Embedding Model
EMBEDDING_MODEL=all-mpnet-base-v2  # Default: "all-mpnet-base-v2" (768-dimensional)

# Compute Device (for native mode)
DEVICE=auto                        # Options: "auto", "cpu", "cuda", "mps"
                                   # auto = detect best available (MPS > CUDA > CPU)

# Chunking Settings
CHUNK_SIZE=500                     # Default: 500 characters per chunk
CHUNK_OVERLAP=50                   # Default: 50 characters overlap between chunks

# Security
ADMIN_KEY=                         # Optional: protects /reset endpoint. Leave empty to disable.

# Rate Limiting (high defaults for personal use)
RATE_LIMIT_UPLOAD=1000/minute      # Default: 1000/minute (won't affect normal use)
RATE_LIMIT_SEARCH=1000/minute      # Default: 1000/minute
RATE_LIMIT_RESET=60/minute         # Default: 60/minute (stricter for destructive ops)
```
> [!NOTE]
> When using Docker Compose, `QDRANT_HOST` is automatically set to `qdrant` (the service name) in docker-compose.yml. You only need a `.env` file for manual installations or to override defaults.
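As a sketch, these settings could be consumed with plain environment lookups and the documented defaults. `load_config` is illustrative; the project's actual config.py may be structured differently:

```python
import os

def load_config(env=None):
    """Resolve the settings above with their documented defaults (sketch).

    Pass a dict for testing; defaults to os.environ in real use.
    """
    env = dict(os.environ if env is None else env)
    return {
        "qdrant_host": env.get("QDRANT_HOST", "localhost"),
        "qdrant_port": int(env.get("QDRANT_PORT", "6333")),
        "max_file_size": int(env.get("MAX_FILE_SIZE", str(50 * 1024 * 1024))),
        "chunk_size": int(env.get("CHUNK_SIZE", "500")),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", "50")),
    }

print(load_config({}))  # all defaults
print(load_config({"QDRANT_HOST": "qdrant"}))  # Docker Compose override
```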
The Vector Knowledge Base includes built-in support for the Model Context Protocol (MCP), allowing AI agents like Claude Desktop to interact with your knowledge base directly.
- Node.js 18+ - Required for the MCP bridge
  - Download from nodejs.org (LTS recommended)
  - Or on macOS: `brew install node`
1. **Ensure the backend is running**

   ```bash
   # Docker mode
   docker-compose up -d

   # OR Native mode - macOS/Linux
   ./scripts/start-backend-native.sh

   # OR Native mode - Windows
   scripts\start-backend-native.bat
   ```

2. **Locate the Claude Desktop config file**

   - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
   - Windows: `%APPDATA%\Claude\claude_desktop_config.json`

   > [!TIP]
   > In Claude Desktop: Claude → Settings → Developer → Edit Config

3. **Add the MCP server configuration**

   Edit `claude_desktop_config.json`:

   ```json
   {
     "mcpServers": {
       "vector-knowledge-base": {
         "command": "npx",
         "args": ["-y", "mcp-remote", "http://localhost:8000/mcp"]
       }
     }
   }
   ```

   > [!NOTE]
   > On macOS, if `npx` is not in PATH, use the full path: `/usr/local/bin/npx`

4. **Restart Claude Desktop**

   - Fully quit (Cmd+Q / Alt+F4), don't just close the window
   - Reopen Claude Desktop
   - The MCP tools should now be available
Once connected, just ask Claude naturally:
| Example Prompt | Action |
|---|---|
| "Search my knowledge base for machine learning" | Semantic search |
| "List all documents in my knowledge base" | List documents |
| "Show me the document clusters" | Get clusters |
| "Run auto-clustering on my documents" | Cluster documents |
| "Check if my vector database is healthy" | Health check |
| "Get 3D embedding data for cluster 1" | Visualization data |
| "Create a summary document with my notes" | Create text document |
Claude Desktop searching and listing documents via MCP integration
> [!IMPORTANT]
> Claude Desktop has limitations when interacting with the knowledge base via MCP.
What Claude CAN do:
- ✅ Search documents semantically
- ✅ List all documents and folders
- ✅ Delete documents by filename
- ✅ Run clustering and get cluster info
- ✅ Get 3D embedding coordinates for visualization
- ✅ Check system health
- ✅ Create text documents (.txt, .md, .json) - Claude can generate content and save it to your knowledge base
What Claude CANNOT do:
- ❌ Upload binary files - PDFs, Word docs, and images require multipart uploads, which MCP cannot provide (at least in my testing with Claude Desktop)
- ❌ Access your filesystem - Claude cannot read files from paths like `/Users/.../file.pdf`
To upload files, use one of these methods instead:
- Web interface at http://localhost:8001/documents.html
- curl command:

  ```bash
  curl -X POST http://localhost:8000/upload \
    -F "file=@/path/to/document.pdf" \
    -F "category=my-category"
  ```
| Tool | Description |
|---|---|
| `health_check` | Check if the API is running |
| `get_allowed_extensions` | Get the list of supported file types |
| `search_documents` | Semantic search across all documents |
| `list_documents` | List all uploaded documents |
| `delete_document` | Delete a document by filename |
| `get_folders` | List the folder structure |
| `create_folder` | Create a new folder |
| `update_folder` | Rename or move a folder |
| `delete_folder` | Delete an empty folder |
| `move_file` | Move a file to a folder |
| `get_unsorted_files` | List files not in any folder |
| `get_files_in_folders` | Get file-to-folder mappings |
| `cluster_documents` | Run auto-clustering |
| `get_clusters` | Get cluster information |
| `get_embeddings_3d` | Get 3D visualization coordinates |
| `transform_query_3d` | Project a query into 3D space |
| `get_job_status` | Check background job progress |
| `mcp_create_document` | Create text documents (.txt, .md, .json) |
> [!TIP]
> Claude can create searchable text documents using `mcp_create_document`. Ask it to "create a summary", "write notes", or "save a document" and it will add the content to your knowledge base.
MCP settings are configured in `config.py` (not in `.env`):

```
MCP_ENABLED=true                # Enable/disable MCP endpoint
MCP_PATH=/mcp                   # URL path for MCP server
MCP_NAME=Vector Knowledge Base  # Display name
MCP_AUTH_ENABLED=false          # Enable OAuth (production)
```

"Server disconnected" error in Claude Desktop:
- Ensure the backend is running: `curl http://localhost:8000/health`
- Check that Node.js is installed: `node --version`
- Try the full path to npx: `/usr/local/bin/npx`
MCP tools not appearing:
- Fully quit and reopen Claude Desktop
- Check the Claude Desktop logs for errors
- Verify the config JSON is valid (no trailing commas)
> [!CAUTION]
> MCP provides AI agents with full access to your knowledge base. In production environments, enable `MCP_AUTH_ENABLED=true` for OAuth protection.
If you see "Connection refused" errors:
```bash
# Check if Qdrant is running
docker ps

# Restart the Qdrant container
docker restart <container-id>

# Or start a new container
docker run -d -p 6333:6333 -v ./qdrant_storage:/qdrant/storage:z qdrant/qdrant
```

> [!WARNING]
> Dependency conflicts between sentence-transformers and huggingface-hub can cause startup failures.

Solution:

```bash
pip install --upgrade sentence-transformers huggingface-hub
```

Check supported file types:
- Documents: `.pdf`, `.docx`, `.pptx`, `.ppt`, `.xlsx`, `.csv`, `.txt`, `.md`
- Images: `.jpg`, `.jpeg`, `.png`, `.webp` (OCR-processed)
- Code: `.py`, `.js`, `.java`, `.cpp`, `.html`, `.css`, `.json`, `.xml`, `.yaml`, `.yml`, `.cs`
Maximum file size: 50MB (configurable)
If you see "Failed to fetch" errors in the browser console:
1. Verify the backend is running on port 8000:

   ```bash
   curl http://127.0.0.1:8000/health
   ```

2. Check that `frontend/config.js` uses `127.0.0.1` (not `localhost`):

   ```js
   const API_BASE_URL = 'http://127.0.0.1:8000';
   ```
This avoids IPv6/IPv4 resolution issues on some systems.
```
Vector-Knowledge-Base/
├── backend/
│   ├── extractors/
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── factory.py
│   │   ├── pdf_extractor.py
│   │   ├── docx_extractor.py
│   │   ├── pptx_extractor.py
│   │   ├── xlsx_extractor.py
│   │   ├── csv_extractor.py
│   │   ├── text_extractor.py
│   │   ├── code_extractor.py
│   │   ├── cs_extractor.py
│   │   └── image_extractor.py
│   ├── uploads/                   # Uploaded files (gitignored except .gitkeep)
│   │   └── .gitkeep
│   ├── data/                      # Runtime data (auto-created)
│   │   └── documents.json         # Document registry for O(1) listing
│   ├── main.py
│   ├── vector_db.py
│   ├── embedding_service.py
│   ├── ingestion.py
│   ├── chunker.py
│   ├── clustering.py
│   ├── filesystem_db.py
│   ├── document_registry.py       # O(1) document listing registry
│   ├── dimensionality_reduction.py
│   ├── jobs.py                    # Background task tracking
│   ├── config.py
│   ├── constants.py               # Shared constants
│   ├── mcp_server.py              # MCP server integration
│   └── exceptions.py
├── frontend/
│   ├── css/
│   │   ├── base.css
│   │   ├── animations.css
│   │   ├── components.css
│   │   ├── layout.css
│   │   ├── filesystem.css
│   │   ├── batch-upload.css
│   │   └── modals.css
│   ├── js/
│   │   └── embedding-visualizer.js
│   ├── index.html
│   ├── documents.html
│   ├── files.html
│   ├── config.js
│   ├── constants.js
│   ├── search.js
│   ├── upload.js
│   ├── documents.js
│   ├── filesystem.js
│   ├── notifications.js
│   └── favicon.ico
├── scripts/
│   ├── start-backend-native.sh    # GPU mode startup (Unix)
│   └── start-backend-native.bat   # GPU mode startup (Windows)
├── screenshots/
├── Docs/
│   └── Vector_Knowledge_Base_Technical_Report.pdf  # Full technical documentation
├── qdrant_storage/                # Created at runtime (gitignored)
├── uploads/                       # Created at runtime by Docker (gitignored)
├── backend_db/                    # Created at runtime (gitignored)
├── Dockerfile
├── docker-compose.yml             # Full Docker deployment
├── docker-compose.native.yml      # Native backend mode
├── nginx.conf
├── requirements.txt
├── requirements.in
├── LICENSE
└── README.md
```
- Upload: ~2-5 seconds for typical PDF
- Search: 100-500ms depending on corpus size (sub-50ms for <10k vectors)
- Embedding: ~50-100ms per chunk
- Capacity: Scales to 100k+ documents with Qdrant
Built with ❤️ using FastAPI, Qdrant, and SentenceTransformers



