This project is a sophisticated RAG (Retrieval-Augmented Generation) application built on a Clean Architecture. It provides a flexible and extensible framework for interacting with your documents using a conversational interface.
```
chat-with-docs/
├── README.md
├── pyproject.toml              # Poetry dependency management
├── docker-compose.yml          # Easy setup with Ollama + Chroma
├── Dockerfile
├── .env.example
├── .gitignore
├── .pre-commit-config.yaml
│
├── app/
│   ├── __init__.py
│   ├── main.py                 # FastAPI application entry point
│   ├── config.py               # Configuration management
│   │
│   ├── core/                   # Clean Architecture core
│   │   ├── __init__.py
│   │   ├── domain/             # Domain layer - business entities
│   │   │   ├── __init__.py
│   │   │   ├── entities/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── document.py
│   │   │   │   ├── collection.py
│   │   │   │   ├── chat.py
│   │   │   │   └── chunk.py
│   │   │   ├── value_objects/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── embedding.py
│   │   │   │   └── metadata.py
│   │   │   └── exceptions.py
│   │   │
│   │   ├── ports/              # Interfaces/Protocols
│   │   │   ├── __init__.py
│   │   │   ├── document_processor.py
│   │   │   ├── embedding_service.py
│   │   │   ├── vector_store.py
│   │   │   ├── llm_service.py
│   │   │   ├── orchestrator.py
│   │   │   └── repositories.py
│   │   │
│   │   └── use_cases/          # Application layer
│   │       ├── __init__.py
│   │       ├── collection_management.py
│   │       ├── document_processing.py
│   │       ├── chat_interaction.py
│   │       └── search_documents.py
│   │
│   ├── adapters/               # External integrations
│   │   ├── __init__.py
│   │   │
│   │   ├── document_processing/
│   │   │   ├── __init__.py
│   │   │   ├── spacy_layout_processor.py
│   │   │   └── chunking_strategies.py
│   │   │
│   │   ├── embedding/
│   │   │   ├── __init__.py
│   │   │   ├── ollama_embedding.py
│   │   │   └── base_embedding.py
│   │   │
│   │   ├── vector_store/
│   │   │   ├── __init__.py
│   │   │   ├── chroma_store.py
│   │   │   └── hybrid_search.py
│   │   │
│   │   ├── llm/
│   │   │   ├── __init__.py
│   │   │   ├── ollama_llm.py
│   │   │   └── base_llm.py
│   │   │
│   │   ├── orchestration/
│   │   │   ├── __init__.py
│   │   │   ├── langraph_orchestrator.py
│   │   │   └── retrieval_strategies.py
│   │   │
│   │   └── persistence/
│   │       ├── __init__.py
│   │       ├── sqlite_repository.py
│   │       └── models.py
│   │
│   ├── api/                    # FastAPI routes and controllers
│   │   ├── __init__.py
│   │   ├── deps.py             # Dependency injection
│   │   ├── routes/
│   │   │   ├── __init__.py
│   │   │   ├── collections.py
│   │   │   ├── documents.py
│   │   │   ├── chat.py
│   │   │   └── health.py
│   │   └── middleware/
│   │       ├── __init__.py
│   │       ├── cors.py
│   │       └── error_handlers.py
│   │
│   └── infrastructure/         # Cross-cutting concerns
│       ├── __init__.py
│       ├── logging.py
│       ├── metrics.py
│       └── startup.py
│
├── frontend/                   # Simple web interface
│   ├── static/
│   │   ├── css/
│   │   ├── js/
│   │   └── images/
│   └── templates/
│       ├── index.html
│       ├── chat.html
│       └── collections.html
│
├── tests/
│   ├── __init__.py
│   ├── conftest.py
│   ├── unit/
│   │   ├── test_use_cases/
│   │   ├── test_domain/
│   │   └── test_adapters/
│   ├── integration/
│   │   ├── test_api/
│   │   └── test_services/
│   └── e2e/
│       └── test_chat_flow.py
│
├── scripts/
│   ├── setup.sh                # Initial setup script
│   ├── download_models.sh      # Download Ollama models
│   └── dev_setup.sh            # Development environment setup
│
├── docs/
│   ├── architecture.md
│   ├── api.md
│   ├── setup.md
│   └── contributing.md
│
└── data/                       # Runtime data
    ├── uploads/                # Uploaded documents
    ├── vector_db/              # Chroma database
    └── logs/
```
This method allows you to run the application directly on your machine, which is ideal for debugging, making changes, and developing new features.
- Python 3.12
- Poetry for dependency management.
- An Ollama server running locally or accessible via a URL.
- An accessible instance of the BrainDrive-Document-AI service for document processing.
1. Clone the repository:

   ```bash
   git clone https://github.com/BrainDriveAI/chat-with-your-documents.git
   cd chat-with-documents-dev
   ```

2. Set up the environment file. Copy the example file and fill in your specific configuration, including the URLs for your Ollama and Document Processor services:

   ```bash
   cp .env.example .env
   # Open .env in your editor and modify the variables as needed.
   ```

3. Install dependencies with Poetry. This will install all project dependencies and set up a virtual environment:

   ```bash
   poetry install
   ```

4. Activate the virtual environment:

   ```bash
   poetry shell
   ```

5. Start the application:

   ```bash
   uvicorn app.main:app --reload
   ```

   The `--reload` flag enables live-reloading during development.

6. Access the application: open your web browser and navigate to `http://localhost:8000`. You can also access the interactive API documentation at `http://localhost:8000/docs`.
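For reference, the environment file from step 2 typically holds the service URLs and model names. The sketch below is purely illustrative; the variable names are assumptions, so check `.env.example` for the actual keys (`11434` is Ollama's default port):

```ini
# Illustrative only - see .env.example for the real variable names.
OLLAMA_BASE_URL=http://localhost:11434
DOCUMENT_PROCESSOR_URL=http://localhost:8080
CHAT_MODEL=llama3.2:3b
EMBEDDING_MODEL=mxbai-embed-large
```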
This is the fastest way to get the application running without worrying about local dependencies. It's ideal for a quick deployment or testing.
- Docker installed and running.
- An Ollama server running, with chat and embedding models pulled (e.g., `llama3.2:8b`, `mxbai-embed-large`, and `llama3.2:3b`).
1. Clone the repository:

   ```bash
   git clone https://github.com/BrainDriveAI/chat-with-your-documents.git
   cd chat-with-documents-dev
   ```

2. Configure your environment variables. Create your `.env` file from the example and provide the necessary configuration values. This is crucial for the Docker container to pick up your settings:

   ```bash
   cp .env.example .env
   # Open the .env file and set the URLs for your Ollama and Document Processor services.
   ```

3. Run the application. The `docker-compose.yml` file will build the image and start the container, mapping the necessary ports and volumes:

   ```bash
   docker-compose up --build
   ```

4. Access the application at `http://localhost:8000`.

5. To stop the containers:

   ```bash
   docker-compose down
   ```

For a production-ready setup with Nginx and Prometheus, use the provided `docker-compose.prod.yml` file:

1. Follow the basic setup steps above.
2. Build and run the production stack:

   ```bash
   docker-compose -f docker-compose.prod.yml up --build -d
   ```
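To show what the port and volume mapping looks like in practice, here is a hedged sketch of a Compose service definition. It is illustrative only; the repository's own `docker-compose.yml` is authoritative, and the service name and volume layout here are assumptions:

```yaml
# Illustrative sketch - see the repository's docker-compose.yml for the real file.
services:
  app:
    build: .
    ports:
      - "8000:8000"       # FastAPI served on localhost:8000
    env_file: .env        # picks up your Ollama / Document Processor URLs
    volumes:
      - ./data:/app/data  # uploads, Chroma DB, and logs persist on the host
```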
- Domain: Pure business logic, no external dependencies.
- Ports: Abstract interfaces for external services.
- Adapters: Concrete implementations of external services.
- Use Cases: Application-specific business rules.
- FastAPI's dependency system for clean Inversion of Control.
- Easy to swap implementations (Ollama → OpenAI, Chroma → Qdrant).
- Abstract base classes for all external services.
- Configuration-driven provider selection.
- Docker Compose for one-command deployment.
- Poetry for reproducible dependencies.
- Easy to add new retrieval strategies.
- Configurable chunking strategies.
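To make the ports-and-adapters idea concrete, here is a minimal sketch. The name `EmbeddingService` mirrors a file in `core/ports/`, but the signatures, fakes, and factory are assumptions for illustration, not the project's actual API:

```python
from typing import List, Protocol

# Port (core/ports): the interface use cases depend on.
class EmbeddingService(Protocol):
    def embed(self, texts: List[str]) -> List[List[float]]: ...

# Adapters: concrete, swappable implementations. Real ones would call
# Ollama or OpenAI; these fakes only illustrate the shape.
class FakeOllamaEmbedding:
    def embed(self, texts: List[str]) -> List[List[float]]:
        return [[float(len(t))] for t in texts]

class FakeOpenAIEmbedding:
    def embed(self, texts: List[str]) -> List[List[float]]:
        return [[1.0] for _ in texts]

# Configuration-driven provider selection: swapping Ollama for OpenAI
# becomes a config change, not a code change.
_PROVIDERS = {"ollama": FakeOllamaEmbedding, "openai": FakeOpenAIEmbedding}

def make_embedding_service(provider: str) -> EmbeddingService:
    try:
        return _PROVIDERS[provider]()
    except KeyError:
        raise ValueError(f"unknown embedding provider: {provider!r}")

# A use case sees only the port, never the concrete adapter.
def embed_query(query: str, embedder: EmbeddingService) -> List[float]:
    return embedder.embed([query])[0]

vector = embed_query("hello", make_embedding_service("ollama"))
```

Because the use case is written against the `Protocol`, unit tests can inject a fake adapter without touching any external service.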
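A configurable chunking strategy can be as simple as a size/overlap split. This sketch is illustrative only; the project's `chunking_strategies.py` may well use layout-aware splitting via spaCy Layout instead:

```python
from typing import List

def chunk_text(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size chunks with overlap (illustrative only)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars of context
    return chunks

parts = chunk_text("abcdefghij", size=4, overlap=2)
# parts == ["abcd", "cdef", "efgh", "ghij", "ij"]
```

The overlap preserves context across chunk boundaries so a retrieved chunk is less likely to cut a sentence mid-thought.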
| Component | Technology | Purpose |
|---|---|---|
| Web Framework | FastAPI | API endpoints, WebSocket support |
| Document Processing | spaCy Layout | PDF/Word structure extraction |
| Embeddings | mxbai-embed-large (Ollama) | Vector representations |
| Vector Store | Chroma | Document search and storage |
| LLM | llama3.2:3b/8b (Ollama) | Chat responses |
| Orchestration | LangGraph | RAG pipeline management |
| Database | SQLite | Metadata and collections |
| Frontend | HTML/JS | Simple chat interface |
| Container | Docker + Docker Compose | Easy deployment |
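How the table's pieces fit together in a single RAG turn can be sketched with stand-in functions. Nothing here is the project's actual API; the retriever and generator are trivial stubs showing only the wiring (vector store retrieves, LLM answers from the retrieved context):

```python
from typing import Callable, List

def answer(question: str,
           retrieve: Callable[[str], List[str]],
           generate: Callable[[str], str]) -> str:
    """Retrieve relevant chunks, build a grounded prompt, call the LLM."""
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)

# Stub retriever and LLM to demonstrate the flow.
docs = ["Chroma stores vectors.", "FastAPI serves the API."]
reply = answer(
    "What stores vectors?",
    retrieve=lambda q: [d for d in docs if "vector" in d.lower()],
    generate=lambda p: p.splitlines()[1],  # stub "LLM": echo the first context line
)
# reply == "Chroma stores vectors."
```

In the real application, LangGraph orchestrates this loop, Chroma backs `retrieve`, and an Ollama model backs `generate`.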
Start here: `FOR-AI-CODING-AGENTS.md`
Complete instructions for AI assistants (Claude Code, Cursor, Windsurf, Aider, Cline, etc.), including:
- Architecture overview & development commands
- Compounding Engineering - auto-documentation system
- Code patterns & conventions
- Test-suite structure (230 tests)
Quick links:
- Knowledge base: `docs/AI-AGENT-GUIDE.md` - when/how to document (ADRs, failures, quirks)
- Decisions: `docs/decisions/` - Architecture Decision Records
- Failures: `docs/failures/` - lessons learned (what NOT to do)
- Quirks: `docs/data-quirks/` - non-obvious system behaviors

Before implementing: `grep -ri "keyword" docs/decisions/ docs/failures/ docs/data-quirks/`
- Owner's Manual - Complete user/operator guide
- Architecture - Technical architecture deep-dive
- Evaluation System - RAG accuracy testing
- Performance Optimization - Tuning guide