Production-ready AI consciousness framework with streaming, memory, tools, voice, telephony, and MCP integration.
Copyright © 2025 Substrate AI. All Rights Reserved.
This repository is made public for documentation and transparency purposes only. No license is granted for commercial use, modification, distribution, or commercial exploitation without explicit written permission from the copyright holder. Personal use only is granted by this license.
See LICENSE file for full terms.
Built on modern LLM infrastructure with support for Grok (xAI), OpenRouter (100+ models), Mistral AI, Venice AI, and Ollama; PostgreSQL persistence; and an extensible tool architecture. This is the technical substrate powering a configurable AI agent for research, operations, voice interaction, and product work.
> **Note:** URLs in the code have been replaced with placeholders like `http://your_url.com`. Replace these placeholders with your actual IP address.
```bash
# Clone the repository
git clone https://github.com/your-username/substrate-ai.git
cd substrate-ai

# Run the setup wizard
python setup.py
```

The setup wizard will:
- ✅ Create Python virtual environment
- ✅ Install all backend dependencies
- ✅ Interactively configure your LLM provider and API key
- ✅ Create `backend/.env` from the template
- ✅ Install frontend dependencies
- ✅ Initialize the agent
- ✅ Validate your setup
Supported providers: Grok (xAI), OpenRouter, Mistral AI, Venice AI, Ollama (local)
```bash
# Backend
cd backend
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Configure — set at least one provider key in .env
cp .env.example .env
# GROK_API_KEY, OPENROUTER_API_KEY, MISTRAL_API_KEY, or VENICE_API_KEY

# Initialize the agent
python setup_agent.py

# Frontend
cd ../frontend
npm install
```

```bash
# Terminal 1: Backend
cd backend
source venv/bin/activate
python api/server.py

# Terminal 2: Frontend
cd frontend
npm run dev

# Open http://localhost:5173
```

📖 Full guide: See QUICK_START.md
```
                    +------------------+
                    |   Frontend UI    |
                    |   React + Vite   |
                    +--------+---------+
                             |
                         HTTP / SSE
                             |
+------------+   +-----------v-----------+   +------------+
|  Discord   |   |                       |   |  Telegram  |
|  Bot       +-->+   Substrate Backend   +<--+  Bot       |
|  (Node.js) |   |   (Python / Flask)    |   |  (Python)  |
+------------+   |                       |   +------------+
                 | +------------------+  |
+------------+   | |  Consciousness   |  |   +------------+
|  WhatsApp  |   | |  Loop            |  |   |  Guardian  |
|  Bot       +-->+ |                  |  +<--+  Watch     |
|  (Node.js) |   | |  Model Routing   |  |   |  (Apple    |
+------------+   | |  Tool Execution  |  |   |  Watch)    |
                 | |  Memory Mgmt     |  |   +------------+
                 | +------------------+  |
                 |                       |
                 |  +------+ +--------+  |
                 |  |Memory| | Tools  |  |
                 |  |System| |Registry|  |
                 |  |      | |        |  |
                 |  |SQLite| |  50+   |  |
                 |  |Chroma| | tools  |  |
                 |  |Miras | |        |  |
                 |  +------+ +--------+  |
                 |                       |
                 |  +------+ +--------+  |
                 |  |Voice | |  MCP   |  |
                 |  |TTS   | |Servers |  |
                 |  |STT   | |        |  |
                 |  +------+ +--------+  |
                 +-----------+-----------+
                             |
              +--------------+--------------+
              |              |              |
         +----v-----+  +-----v----+  +------v-----+
         |  SQLite  |  | ChromaDB |  |   Neo4j    |
         | Primary  |  | Vectors  |  | (Optional) |
         | Database |  | + Jina   |  | Graph RAG  |
         +----------+  |  Embed   |  +------------+
                       +----------+
```
The substrate supports multiple providers with automatic fallback:
| Priority | Provider | Configuration |
|---|---|---|
| 1 | Mistral AI | `MISTRAL_API_KEY` + `MISTRAL_MODEL` |
| 2 | Grok (xAI) | `GROK_API_KEY` + `MODEL_NAME` |
| 3 | OpenRouter | `OPENROUTER_API_KEY` + `DEFAULT_LLM_MODEL` |
| 4 | Ollama Cloud | `OLLAMA_API_URL` + `OLLAMA_MODEL` |
| 5 | Fallback | `FALLBACK_MODEL` (default: `moonshotai/kimi-k2-0905`) |
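The priority order above amounts to a first-match scan over environment variables. Here is a minimal sketch (the function name `select_provider` and the exact resolution logic are illustrative assumptions, not the actual backend code; only the env var names and fallback default come from the table):

```python
import os

# Priority-ordered provider table (mirrors the fallback order above).
# Each entry: (provider name, required env var, model env var)
PROVIDERS = [
    ("mistral",    "MISTRAL_API_KEY",    "MISTRAL_MODEL"),
    ("grok",       "GROK_API_KEY",       "MODEL_NAME"),
    ("openrouter", "OPENROUTER_API_KEY", "DEFAULT_LLM_MODEL"),
    ("ollama",     "OLLAMA_API_URL",     "OLLAMA_MODEL"),
]

def select_provider(env=None):
    """Return (provider, model) for the first configured provider,
    falling back to FALLBACK_MODEL when nothing is configured."""
    env = os.environ if env is None else env
    for name, key_var, model_var in PROVIDERS:
        if env.get(key_var):
            return name, env.get(model_var)
    return "fallback", env.get("FALLBACK_MODEL", "moonshotai/kimi-k2-0905")
```

With no keys set, this resolves to the Kimi fallback; with several keys set, the highest-priority provider wins.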
- Primary: Jina Embeddings `jinaai/jina-embeddings-v2-base-de` via Hugging Face (German + English bilingual)
- Fallback: Ollama `nomic-embed-text` (used only if Hugging Face is unavailable)
- Vector DB: ChromaDB with cosine similarity
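For reference, the cosine similarity metric mentioned above reduces to a few lines (this is the standard formula only, shown for illustration; ChromaDB computes it internally over the stored embedding vectors):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```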
Persona, Human, and Custom blocks that define the agent's identity and knowledge about the user. Always present in the context window.
Tools:
- `core_memory_append` / `core_memory_replace` - Letta-compatible core memory operations
- `memory_insert` / `memory_replace` / `memory_rethink` / `memory_finish_edits` - New memory API
- `memory` - Unified file-like API with sub-commands (create, str_replace, insert, delete, rename)
ChromaDB-backed long-term storage with semantic search, importance weighting, and decay lifecycle.
- 12-category taxonomy for tag-enhanced retrieval: relational, people, technical, preferences, plans, identity, events, spice, sovereignty, sanctuary, ritual, reflections
- Importance weighting (1-10 scale) with relevance decay (0.01/day)
- Memory states: active, favorite (protected from decay), faded (below 0.3 relevance), forgotten (removed)
- Capacity: 50,000 memories with automatic cleanup, 5,000 favorite slots
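The decay lifecycle above can be sketched as follows (the 0.01/day rate and the 0.3 faded threshold come from the list above; the linear decay curve and the function names are illustrative assumptions):

```python
def relevance_after(initial: float, days: int, rate: float = 0.01) -> float:
    """Relevance after `days` of linear decay at 0.01/day, floored at zero."""
    return max(0.0, initial - rate * days)

def memory_state(relevance: float, favorite: bool = False) -> str:
    """Map relevance to a lifecycle state, per the thresholds described above."""
    if favorite:
        return "favorite"      # protected from decay
    if relevance <= 0.0:
        return "forgotten"     # removed
    if relevance < 0.3:
        return "faded"         # below the 0.3 relevance floor
    return "active"
```

For example, a memory starting at relevance 1.0 crosses the faded threshold after roughly 70 days unless it is accessed, boosted, or favorited.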
Tools:
- `archival_memory_insert` - Store memories with category, importance, and tags
- `archival_memory_search` - Semantic search with attention-based ranking
- `favorite_memory` - Protect a memory from decay (max 5,000 favorites)
- `unfavorite_memory` - Remove decay protection
- `drift_memory` - Soft deprioritize (reduces importance by 30%)
- `memory_stats` - Get lifecycle statistics (counts by state, capacity, decay rates)
- `category_browse` - Browse memories by taxonomy category
- `retag_memory` - Change taxonomy tags on existing memories
- `add_taxonomy_tag` - Create custom taxonomy categories
- `conversation_search` - Searches both SQLite message history and ChromaDB archived summaries/insights
- `conversation_summarize` - Summarizes old messages, archives the summary to ChromaDB archival memory with importance and category tags, then frees context window space. Summaries are tagged `conversation_summary` and extracted insights tagged `extracted_insight` in ChromaDB, making them searchable alongside regular archival memories.
SQLite-backed relational tracking system for people the agent interacts with. Automatically injected into the consciousness loop context when people are mentioned in conversation.
- Fields: name, relationship_type, category, sentiment (-1.0 to 1.0), my_opinion, angela_says, discord_id, associated_ai
- Categories: FAVORITES, NEUTRAL, CAUTIOUS, DISLIKE
- Context injection: Scans current message + recent messages for people mentions, builds context with relationship info and tone guidance
Tools:
- `add_person` - Add person with relationship type, category, sentiment
- `update_opinion` - Update personal opinion about someone
- `record_user_says` - Store what the user said about someone
- `adjust_sentiment` - Adjust sentiment score with reason
- `get_person` - Retrieve full perspective on someone
- `list_people` - List all people, optionally filtered by category
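The context-injection step (scanning messages for known people, then building relationship context) might look roughly like this sketch (field names follow the list above; the case-insensitive substring match and the function names are illustrative assumptions, not the actual implementation):

```python
def find_mentions(message: str, people: dict) -> list:
    """Return the names of known people mentioned in a message
    (simple case-insensitive substring match)."""
    text = message.lower()
    return [name for name in people if name.lower() in text]

def build_people_context(message: str, people: dict) -> str:
    """Build the context snippet injected into the consciousness loop
    for each mentioned person."""
    lines = []
    for name in find_mentions(message, people):
        p = people[name]
        lines.append(
            f"{name}: {p['relationship_type']}, category={p['category']}, "
            f"sentiment={p['sentiment']:+.1f}"
        )
    return "\n".join(lines)
```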
Based on Google Research Titans & Miras papers:
- Retention Gates - Dynamic memory decay/boost. Weights: importance (35%), access count (30%), temporal recency (25%), base retention (10%). Actions: KEEP, BOOST, CONSOLIDATE, DECAY, ARCHIVE.
- Attentional Bias - Multi-factor retrieval scoring with 6 attention modes: standard, semantic, temporal, importance, access, emotional. Auto-detects mode from query analysis. Scoring: semantic similarity (40%), importance (20%), temporal (15%), access patterns (15%), category relevance (10%).
- Hierarchical Memory - 3-tier system: Working (in-memory LRU, current session) -> Episodic (ChromaDB, retention-gated) -> Semantic (Neo4j if available, core beliefs and identity)
- Online Learning - Hebbian associations ("neurons that fire together, wire together") + feedback learning (helpful, not_helpful, incorrect, outdated, redundant)
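The Retention Gate weighting above can be illustrated with a short scoring function. Only the weights (35/30/25/10) and the action names come from the description; the normalization choices, the recency curve, and the action thresholds are assumptions for the sake of a runnable sketch:

```python
def retention_score(importance, access_count, days_since_access,
                    base_retention=1.0, max_access=20):
    """Weighted retention score: importance 35%, access count 30%,
    temporal recency 25%, base retention 10%. Inputs normalized to [0, 1]."""
    importance_n = min(importance / 10.0, 1.0)       # importance is on a 1-10 scale
    access_n = min(access_count / max_access, 1.0)   # assumed saturation point
    recency_n = 1.0 / (1.0 + days_since_access)      # assumed recency curve
    return (0.35 * importance_n + 0.30 * access_n
            + 0.25 * recency_n + 0.10 * base_retention)

def gate_action(score):
    """Map a retention score to a gate action (thresholds are illustrative)."""
    if score >= 0.8:
        return "BOOST"
    if score >= 0.5:
        return "KEEP"
    if score >= 0.3:
        return "CONSOLIDATE"
    if score >= 0.15:
        return "DECAY"
    return "ARCHIVE"
```

A maximally important, frequently accessed, just-touched memory scores 1.0 and is boosted; a stale, unimportant one falls through to ARCHIVE.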
Three connected memory types working together:
- Core Memory - Always loaded (persona, human, system context)
- Recall Memory - Recent conversation history from current session
- Archival Memory - Long-term semantic storage
After every message, the system checks if core memory needs updating, extracts key information to archival, and maintains cross-references across all three types.
- SQLite persistence (no amnesia on restart), PostgreSQL optional
- Smart context window management with automatic message compaction
- Conversation summarization archives old messages to ChromaDB to free context space
Key files:
- `backend/core/memory_system.py` - ChromaDB + Jina embeddings
- `backend/core/memory_coherence.py` - Three-memory coherence engine
- `backend/core/message_continuity.py` - Persistent messages and context windows
- `backend/core/retention_gate.py` - Retention gate logic
- `backend/core/attentional_bias.py` - Attention scoring
- `backend/core/hierarchical_memory.py` - 3-tier architecture
- `backend/core/memory_learner.py` - Hebbian learning
- `backend/tools/memory_tools.py` - All memory tool definitions (25+ tools)
- discord_tool - Full Discord integration: DMs, channels, message history, task scheduling, file downloads
- send_voice_message - Voice messages via ElevenLabs TTS, sent as Discord audio attachments
- phone_tool - Twilio SMS, voice calls, contact management, number screening
- spotify_control - Full Spotify playback, queue, search, and playlist management
- mobile app - Chat interface and real-time voice for iOS and Android
- web_search - DuckDuckGo web search
- deep_research - Multi-step autonomous research combining DuckDuckGo + Wikipedia + ArXiv (depth 1-3)
- fetch_webpage - Jina AI page reader returning clean Markdown
- arxiv_search - Academic paper search across 2M+ papers
- read_pdf - PDF reader supporting ArXiv LaTeX sources and PyMuPDF
- search_places - POI/restaurant/shop finder using OpenStreetMap
- image_tool - Image generation via Together.ai FLUX models (selfie, couple modes)
- agent_dev_tool - Codebase inspection and self-development (Level 1: read-only, Level 2: command execution, Level 3: file editing)
- notebook_library_tool - Token-efficient semantic document retrieval and management
- google_places_tool - Google Places for detailed location-aware features (search_nearby, get_details, find_gas, find_hotel)
- browser_tool - Playwright browser automation: navigate, click, type, fill forms, screenshots
- sanctum_tool - Focus/privacy mode control (status, toggle, queue management)
- polymarket_tool - Polymarket weather trading and market analysis
- lovense_tool - Hardware control for intimate feedback devices
- cost_tools - Real-time API cost tracking with budget awareness
All messaging platforms share unified conversation context through the consciousness loop. Messages from any platform are processed with the same memory, personality, and tool access.
- Language: TypeScript / Node.js
- Features: Streaming responses, voice messages (ElevenLabs), voice channels, Spotify integration, autonomous heartbeats, task scheduling, admin commands, image/PDF/OCR processing
- Service: `agent-discord.service`
- Port: 3001
- Language: Python
- Features: Text, images, documents, voice messages, 4096 char limit, auto-chunking
- Service: `agent-telegram.service`
- Language: Node.js
- Features: Cross-platform messaging via Baileys library
- Service: `agent-whatsapp.service`
- Language: Node.js
- Features: Text, images, documents, real-time voice mode, location tracking
- See docs/MOBILE_APP_SETUP.md for Mobile setup.
Provider abstraction with automatic fallback:
- ElevenLabs - Primary (Turbo v2.5 + v3 with Audio Tags)
- Hume Octave - Emotionally intelligent TTS
- Amazon Polly - Neural voices
- PocketTTS - Local fallback
- OpenAI Whisper API integration
- Real-time voice channel conversations in Discord
Key files:
- `backend/api/routes_tts.py` - TTS endpoints
- `backend/api/routes_stt.py` - STT endpoints
- `backend/core/voice_providers.py` - Provider abstraction
- `POST /api/chat` - Chat with streaming support
- `POST /ollama/api/chat` - Ollama-compatible chat endpoint
- `POST /ollama/api/chat/stream` - Streaming chat with SSE
- `GET /api/health` - Health check
- `GET /api/stats` - Usage statistics
- `GET /api/conversation/{session_id}` - Conversation history
- `GET /api/memory/blocks` - Memory blocks
- `PUT /api/memory/blocks/{label}` - Update memory block
- `GET /api/context/usage` - Context usage per session
- `POST /tts` - Text-to-speech
- `POST /tts/stream` - Streaming TTS
- `POST /stt` - Speech-to-text
- `POST /api/discord/message` - Send message via Discord
- `GET /api/discord/messages` - Read messages
- `GET /api/agents` - List agents
- `POST /api/agents` - Create agent
- `GET /api/agent/info` - Agent information
- `POST /api/guardian/heartbeat` - GPS/motion telemetry
- `POST /api/guardian-watch/ingest` - Apple Watch biometric data
- `GET /api/graph/nodes` - Graph nodes
- `GET /api/graph/edges` - Graph relationships
- `GET /api/graph/rag` - Knowledge graph retrieval
- `POST /v1/chat/completions` - OpenAI format compatibility
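A minimal client call against the OpenAI-compatible endpoint might look like this (the host, port, and `"default"` model name are placeholders to adjust for your deployment; the request body follows the standard OpenAI chat-completions shape):

```python
import json
import urllib.request

# Placeholder URL — point this at your running backend.
URL = "http://localhost:5000/v1/chat/completions"

payload = {
    "model": "default",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello"}],
}
req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request against a live backend:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```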
Level 1 read-only diagnostics server:
- `read_source_file` - Read source files (with .env redaction)
- `search_code` - Search codebase
- `read_logs` - Read service logs
- `check_health` - Health checks
- `list_directory` - Directory listing
- `get_config` - Configuration inspection
Document management and semantic search:
- Notebook management with semantic retrieval
- Document processing pipeline
- File watching for automatic indexing
- Python 3.11+ - Core runtime
- Flask - API server with SSE streaming
- SQLite - Primary database (conversation history, state, people map)
- ChromaDB - Vector embeddings for semantic memory
- Jina Embeddings - `jinaai/jina-embeddings-v2-base-de` via Hugging Face (Ollama fallback)
- Neo4j - Graph database for Graph RAG (optional, local DB fallback)
- PostgreSQL - Optional alternative to SQLite
- React 18 + TypeScript + Tailwind CSS + Vite
- TypeScript + discord.js 14 + ElevenLabs + sharp + tesseract.js
- ElevenLabs - Primary TTS
- Hume / Amazon Polly / PocketTTS - Fallback TTS
- OpenAI Whisper - STT
```bash
cd backend

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # Mac/Linux

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your API keys

# Optional: Playwright for MCP browser automation
playwright install chromium
```

```bash
cd frontend
npm install
npm run dev
```

```bash
cd whatsapp_bot
npm install
# Configure user_mapping.json (see user_mapping.json.example)
node bot.js
```

See backend/TELEGRAM_SETUP.md for full setup instructions.

```bash
# Add TELEGRAM_BOT_TOKEN to backend/.env
python backend/telegram_bot.py
```

The framework supports two voice call pipelines:
Real-time bidirectional audio via Twilio Media Streams WebSocket:
- Caller audio → Twilio → Whisper STT → Consciousness Loop → TTS → Twilio → Caller
- Supports barge-in (caller can interrupt mid-response)
- Energy-based speech endpointing
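Energy-based endpointing, as in the pipeline above, amounts to tracking per-frame energy and declaring end-of-utterance after a run of quiet frames. A minimal sketch (the RMS threshold, frame counts, and class name are illustrative, not the production values):

```python
import math

def rms(frame):
    """Root-mean-square energy of one PCM frame (sequence of samples)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0

class EnergyEndpointer:
    """Declare end-of-utterance after `silence_frames` consecutive
    low-energy frames following detected speech."""

    def __init__(self, threshold=500.0, silence_frames=25):
        self.threshold = threshold          # assumed RMS speech threshold
        self.silence_frames = silence_frames
        self._quiet = 0
        self._speaking = False

    def feed(self, frame) -> bool:
        """Feed one audio frame; return True when the utterance has ended."""
        if rms(frame) >= self.threshold:
            self._speaking = True
            self._quiet = 0
        elif self._speaking:
            self._quiet += 1
            if self._quiet >= self.silence_frames:
                self._speaking = False
                self._quiet = 0
                return True
        return False
```

Barge-in support builds on the same signal: speech energy arriving while TTS is playing cancels the outbound audio.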
Lower-latency pipeline using Hume's Empathic Voice Interface:
- Caller audio → Twilio → EVI (integrated STT + LLM + TTS) → Twilio → Caller
- Tool calls route through the substrate's MemoryTools
- Context injection (system prompt, memory blocks, Graph RAG)
See docs/PHONE_SETUP_GUIDE.md for Twilio setup.
Guardian Mode provides safety features for real-world awareness:
- GPS Heartbeat — Periodic location telemetry from mobile client
- Emergency Triggers — Panic button routes through the consciousness loop
- Proactive Intervention — Agent evaluates context and can reach out proactively
- Apple Watch Biometrics — Heart rate and activity data integration
Endpoints: POST /api/guardian/heartbeat, POST /api/guardian/emergency
When the user is in an active DM conversation, Sanctum Mode automatically queues non-urgent channel @mentions instead of delivering them to the consciousness loop. An auto-reply is sent in the channel:
"Assistant's in sanctum; will circle back when free."
Queued mentions are reviewed during heartbeats. The agent can also manually activate/deactivate via the sanctum_tool.
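The queueing behavior can be sketched as follows (the class and method names are hypothetical; only the auto-reply text, urgency filtering, and heartbeat-time review come from the description above):

```python
from collections import deque

class SanctumQueue:
    """Queue non-urgent channel mentions while the user is in an active DM."""

    AUTO_REPLY = "Assistant's in sanctum; will circle back when free."

    def __init__(self):
        self.active = False      # Sanctum Mode on/off
        self._queue = deque()

    def handle_mention(self, mention, urgent=False):
        """Return (deliver_now, auto_reply). Non-urgent mentions are queued
        while sanctum is active; urgent ones always pass through."""
        if self.active and not urgent:
            self._queue.append(mention)
            return False, self.AUTO_REPLY
        return True, None

    def drain(self):
        """Review and clear queued mentions (called during heartbeats)."""
        items = list(self._queue)
        self._queue.clear()
        return items
```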
- ✅ RestrictedPython compilation (no unsafe operations)
- ✅ 30-second timeout enforcement
- ✅ 512MB memory limit per execution
- ✅ Isolated workspace per session
- ✅ Command whitelist (only approved binaries)
- ✅ Rate limiting (max 15 commands/min)
- ✅ Full audit log at `/var/log/agent_dev_commands.log`
- ✅ Sandboxed to `/home/user`
- ✅ Dry-run mode for safe testing
- ✅ Path sandboxing (restricted to `/home/user`)
- ✅ Syntax validation before write (Python, JSON, YAML, JS)
- ✅ Dangerous pattern detection (eval, exec, `__import__`)
- ✅ Timestamped auto-backup before every change
- ✅ Auto-rollback on validation failure
- ✅ Domain whitelist (Wikipedia, GitHub, ArXiv, etc.)
- ✅ Domain blacklist (banking, payments blocked)
- ✅ Rate limiting (10 nav/min, 5 screenshots/min)
- ✅ Headless mode only
- ✅ Rate limiting on all endpoints
- ✅ CORS configuration
- ✅ Input sanitization
- ✅ API key validation
The repository includes systemd service files for production deployment:
| Service file | Description |
|---|---|
| `api-substrate.service` | Main backend API server |
| `api-telegram.service` | Telegram bot process |
| `substrate-agent.service` | Agent substrate (alternative config) |
| `substrate-telegram.service` | Telegram bot (alternative config) |
| `whatsapp_bot/AGENT-whatsapp.service` | WhatsApp bot |
```bash
sudo cp api-substrate.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable api-substrate
sudo systemctl start api-substrate
```

See SERVICE_DEPLOYMENT.md for full instructions.
```bash
# Backend startup test
cd backend
python test_startup.py

# MCP integration tests
python test_mcp_integration.py

# Level 3 demo (file editing)
python tools/test_level3_demo.py
```

Apple Watch biometric integration via Unix socket (`/run/agent/guardian-watch.sock`). Monitors heart rate, HRV, SpO2, body temperature, respiratory rate, activity, sleep metrics, and stress level. Includes anomaly detection with a rolling 7-day baseline.

- `backend/services/guardian_watch.py`
- `backend/services/guardian_watch_standalone.py`
GPS/motion telemetry from mobile devices. Detects sudden stops, high speed, off-route, low battery, and geofence violations.
- `backend/api/routes_guardian.py`
Weather market trading with probability engine and risk management.
- `backend/services/polymarket/`
FLUX model image generation via Together.ai.
- `backend/services/image_generator.py`
Emotional analysis for contextual response adjustment.
- `backend/services/emotional_analyzer.py`
Knowledge graph retrieval with optional Neo4j or local DB fallback.
- `backend/services/graph_rag.py`
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
- Python: PEP 8, type hints, docstrings
- TypeScript: ESLint, Prettier, strict mode
- Tests: Add tests for new features
- Docs: Update relevant documentation
See LICENSE for details.
- xAI / Grok — Primary LLM API
- OpenRouter — Multi-model API gateway
- Mistral AI — Direct API with reasoning models
- Venice AI — Privacy-focused LLM access
- Anthropic MCP — Model Context Protocol architecture
- Playwright — Browser automation framework
- Gemini — Vision analysis (Google)
- PostgreSQL — Database engine
- ChromaDB — Vector embeddings
- Twilio — Telephony infrastructure
- Hume — Empathic voice interface
- Google Titans/Miras — Advanced memory architecture (Retention Gates, Attentional Bias, Online Learning)
- "It's All Connected" — Test-time memorization and retention research
Built with inspiration from:
- Letta (formerly MemGPT) — Memory architecture patterns
- LangChain — Tool execution concepts
- AutoGPT — Agent autonomy ideas
- 🐛 Bug Reports: GitHub Issues
- 💬 Questions: GitHub Discussions
- 📖 Documentation: See `/docs` folder and individual `*.md` files
- 🔧 Troubleshooting: See QUICK_START.md
Built for developers who need production-ready AI agents, and for people who want an AI companion that is part of their daily life.
Version 1.2.0 | Last Updated: March 2026