---
title: VELA Research Agent
emoji: 📊
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: false
license: mit
short_description: Korean Financial Research with 7B LLM
---
Domain-Specialized LLM Research Agent for Korean Financial Markets
VELA is an open-source research agent framework that demonstrates how a single developer can build a domain-specialized LLM system competitive with $100M+ proprietary projects -- for under $235/month in compute costs.
| Metric | VELA (7B) | Qwen 2.5 7B Base | GPT-4o | Exaone 3.5 7.8B |
|---|---|---|---|---|
| Domain Knowledge (100pt) | 87.5 | 72.0 | 81.0 | 74.5 |
| Korean Fluency | Native | Mixed (CN leak) | Good | Native |
| Reasoning Trace | Structured | None | Free-form | None |
- Base Model: Qwen/Qwen2.5-7B-Instruct
- Training: SFT (58K samples) + DPO (26K pairs) on Korean financial domain
- Inference: 16 tok/s on Apple Silicon (MLX 4-bit), RunPod Serverless, or any vLLM server
- License: MIT
```
User Query
    |
    v
[ResearchAgent] -- CoT Reasoning Loop (Think -> Search -> Analyze -> Conclude)
    |
    +-- [CoTReasoningEngine]   -- TODO-based iterative reasoning with confidence gating
    +-- [ResearchSearchModule] -- Multi-source web search (Naver + DuckDuckGo)
    +-- [ContentExtractor]     -- Web page & PDF content extraction
    +-- [AdversaryAgent]       -- Cross-verification via Perplexity API (optional)
    |
    v
ResearchResult (structured JSON with trajectory, claim-evidence mapping)
```
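The `ResearchResult` payload at the bottom of the diagram can be sketched as a small dataclass. This is an illustrative shape only; the field and class names below are assumptions, not the framework's actual schema (see `vela/schemas.py` for the real Pydantic models):

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """One claim from the final report, mapped to its supporting sources."""
    text: str
    evidence_urls: list[str] = field(default_factory=list)


@dataclass
class ResearchResult:
    """Structured output of one research run (illustrative shape)."""
    query: str
    conclusion: str
    confidence: float       # 0.0-1.0, self-reported by the model
    trajectory: list[dict]  # one entry per Think/Search/Analyze step
    claims: list[Claim] = field(default_factory=list)

    @property
    def sources(self) -> list[str]:
        # Deduplicate evidence URLs across all claims, preserving order
        seen: dict[str, None] = {}
        for claim in self.claims:
            for url in claim.evidence_urls:
                seen.setdefault(url)
        return list(seen)
```

The claim-to-evidence mapping is what makes the output usable as training data: each sentence in the conclusion can be traced back to the URLs that support it.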
```bash
# 1. Clone
git clone https://github.com/intrect/vela-framework.git
cd vela-framework

# 2. Install
pip install -e .

# 3. Configure
cp .env.example .env
# Edit .env with your API keys (RUNPOD_API_KEY, NAVER_CLIENT_ID, etc.)

# 4. Run
python inference.py --query "SK하이닉스 HBM 시장 전망" --backend mlx
```

Requirements:

- Python 3.10+
- At least one LLM backend configured (RunPod, MLX, or vLLM)
```bash
pip install -e .
```

Core (auto-installed):

- `pydantic>=2.0` -- structured schemas
- `requests` -- HTTP client
- `python-dotenv` -- environment configuration
- `duckduckgo-search` -- web search fallback
- `beautifulsoup4` -- content extraction
All configuration is via environment variables. Copy .env.example and fill in your keys:
```bash
# LLM Backends (configure at least one)
RUNPOD_API_KEY=your_key                      # RunPod Serverless
RUNPOD_ENDPOINT_ID=your_endpoint
VELA_MLX_BASE_URL=http://localhost:8081/v1   # MLX server
VLLM_BASE_URL=http://localhost:8000/v1       # vLLM server

# Search APIs
NAVER_CLIENT_ID_0=your_id                    # Naver Search API
NAVER_CLIENT_SECRET_0=your_secret

# Verification (optional)
PERPLEXITY_API_KEY=your_key                  # Adversary Agent
```

Python API:

```python
from vela import ResearchAgent
from vela.schemas import ResearchOptions

# Initialize with your preferred backend
agent = ResearchAgent(llm_backend="mlx")

# Run research
result = agent.research(
    query="SK하이닉스 HBM 시장 전망",  # "SK hynix HBM market outlook"
    options=ResearchOptions(max_iterations=5),
)

# Access results
print(result.conclusion)
print(f"Confidence: {result.confidence:.0%}")
print(f"Sources: {len(result.sources)}")

# Save with full metadata (for training data generation)
from pathlib import Path
ResearchAgent.save_with_metadata(result, Path("output/result.json"))
```

CLI usage:

```bash
# Basic research
python inference.py -q "삼성전자 반도체 전략" -b mlx

# With verification
python inference.py -q "네이버 AI 전략" --verify

# Save output
python inference.py -q "카카오 실적" -o result.json

# Verbose logging
python inference.py -q "현대차 전기차" -v
```

Try VELA directly in your browser via Gradio:

```bash
# Install with web dependencies
pip install -e ".[web]"

# Launch local demo
python app.py
# Opens at http://localhost:7860
```

Or try the hosted demo on HuggingFace Spaces.
| Backend | Use Case | Setup |
|---|---|---|
| `runpod` | Cloud GPU inference | Set `RUNPOD_API_KEY` + `RUNPOD_ENDPOINT_ID` |
| `mlx` | Apple Silicon local | Run MLX server, set `VELA_MLX_BASE_URL` |
| `vllm` | Any GPU server | Run vLLM, set `VLLM_BASE_URL` |
VELA uses a TODO-based CoT protocol where each research iteration follows:
- Think: Analyze current state and generate a TODO list
- Search: Execute web searches (Naver + DuckDuckGo)
- Analyze: Extract intermediate findings from collected sources
- Conclude: Synthesize final report when confidence threshold is met
**Step 1**:
**Thought**: SK하이닉스의 HBM 시장 점유율과 경쟁 구도 분석 필요 (Need to analyze SK hynix's HBM market share and competitive landscape)
**Action**: search
**Query**: SK하이닉스 HBM3E 시장점유율 2025
**Confidence**: 35%

**Step 2**:
**Thought**: HBM 매출 비중과 영업이익률 데이터 확보 완료 (HBM revenue share and operating-margin data secured)
**Action**: analyze
**Confidence**: 65%

**Step 3**:
**Thought**: 충분한 데이터 수집, 결론 도출 가능 (Sufficient data collected; a conclusion can be drawn)
**Action**: conclude
**Confidence**: 85%
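Because the trace follows a regular `**Field**: value` layout, it can be parsed mechanically. The regex below is a sketch against the format shown above, not the framework's actual parser:

```python
import re

# Named groups for each field of one reasoning step; Query is optional
# (it only appears on search steps).
STEP_RE = re.compile(
    r"\*\*Step (?P<step>\d+)\*\*:\s*"
    r"\*\*Thought\*\*:\s*(?P<thought>.+?)\s*"
    r"\*\*Action\*\*:\s*(?P<action>\w+)\s*"
    r"(?:\*\*Query\*\*:\s*(?P<query>.+?)\s*)?"
    r"\*\*Confidence\*\*:\s*(?P<confidence>\d+)%",
    re.DOTALL,
)


def parse_trace(text: str) -> list[dict]:
    """Split a reasoning trace into structured step dicts."""
    steps = []
    for m in STEP_RE.finditer(text):
        d = m.groupdict()
        d["step"] = int(d["step"])
        d["confidence"] = int(d["confidence"]) / 100  # e.g. 85% -> 0.85
        steps.append(d)
    return steps
```

Parsing the trace back into structure is what allows claim-evidence mapping and training-data generation downstream.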
The system uses a confidence gate at multiple levels:
- Per-step: Each reasoning step reports confidence (0-100%)
- Continuation: Research continues until confidence >= 80% or max iterations
- Synthesis: Final report includes overall confidence score
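The continuation rule can be sketched as a simple gate loop. The function below is illustrative: `step_fn`, the 0.80 threshold, and the iteration cap are assumptions matching the numbers described above, not the actual `confidence_gate.py` implementation:

```python
def run_until_confident(step_fn, threshold=0.80, max_iterations=5):
    """Iterate reasoning steps until confidence clears the gate.

    step_fn(i) -> (confidence, payload): one Think/Search/Analyze step,
    reporting the model's self-assessed confidence in [0.0, 1.0].
    Stops when confidence >= threshold or max_iterations is reached.
    """
    history = []
    for i in range(1, max_iterations + 1):
        confidence, payload = step_fn(i)
        history.append((confidence, payload))
        if confidence >= threshold:
            break
    return history
```

Capping iterations bounds both latency and API cost even when the model never reaches the threshold.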
When `--verify` is enabled, an independent AdversaryAgent cross-checks the research output using the Perplexity API, identifying:
- Factual inconsistencies
- Unsupported claims
- Missing counter-arguments
VELA's training pipeline produced the domain-specialized model through:
- SFT (Supervised Fine-Tuning): 58K samples of Korean financial analysis with structured reasoning traces
- DPO (Direct Preference Optimization): 26K pairs targeting language purity (eliminating Chinese/English leaks from Qwen base) and reasoning quality
- HuggingFace: intrect/vela (GGUF Q4_K_M)
- Base: Qwen/Qwen2.5-7B-Instruct + LoRA (r=64, alpha=128)
| Component | Cost | Notes |
|---|---|---|
| RunPod RTX 4090 | ~$50/month | SFT + DPO training |
| Haiku API (data gen) | ~$80 | 5 batches, 50K samples |
| Naver/Search APIs | ~$30/month | Data collection |
| Perplexity API | ~$20/month | Adversary verification |
| Total | ~$235/month | |
```
vela-framework/
├── app.py                  # Gradio web demo (HF Spaces)
├── inference.py            # CLI entry point
├── vela/                   # Core package
│   ├── agent.py            # ResearchAgent orchestrator
│   ├── reasoning.py        # CoT reasoning engine
│   ├── search.py           # Multi-source web search
│   ├── schemas.py          # Pydantic data models
│   ├── content_extractor.py
│   ├── adversary.py        # Verification agent
│   ├── config.py           # Centralized configuration
│   ├── prompts/            # System & research prompts
│   └── tools/              # LLM clients & utilities
│       ├── runpod_client.py
│       ├── mlx_client.py
│       ├── vllm_client.py
│       ├── ddg_search.py
│       ├── naver_search.py
│       ├── confidence_gate.py
│       └── fact_extractor.py
├── docs/
│   └── METHODOLOGY.md      # Detailed methodology
└── examples/
    └── simple_analysis.py
```
See docs/METHODOLOGY.md for detailed documentation on:
- Reasoning Trace format specification
- CoT protocol design
- Training data generation pipeline
- DPO strategy for language purity
- Benchmark methodology
This open-source demo uses public search APIs only. See Production Enhancements for commercial capabilities.
| Category | Limitation | Impact | Production Note |
|---|---|---|---|
| Model Size | 7B parameter model (Qwen2.5-7B base) | Complex multi-step reasoning may degrade compared to 70B+ models | |
| Language | Korean financial domain only | English/multilingual queries produce lower quality output | |
| Real-time Data | No direct market data feed (price, volume, orderbook) | Research relies on web search snippets, not live market data | |
| Valuation | No financial database integration (e.g., FnGuide, Bloomberg) | Cannot provide real-time PER/PBR/EPS; relies on news-sourced figures | FnGuide integration available |
| Search Coverage | Naver News API + DuckDuckGo only | No access to paywalled sources (증권사 리포트, 유료 DB) | Securities firm reports in prod |
| Content Extraction | Top 3 sources per search step | Remaining sources provide title + snippet only (no full text) | Full-text extraction in prod |
| Inference Speed | ~16 tok/s (MLX 4-bit) / ~5 tok/s (CPU BF16) | Full research cycle takes 30-120 seconds depending on iterations | |
| Repetition | 7B models may exhibit output repetition | Post-processing mitigates but does not fully eliminate | |
| Confidence | Self-reported confidence (not calibrated) | Confidence scores reflect model's subjective estimate, not statistical accuracy | |
| Temporal | Training data cutoff affects domain knowledge | Events after the training cutoff may not be reflected in the model's output | |
In commercial deployments, VELA can integrate:
- FnGuide API: Real-time consensus, target prices, analyst ratings (50+ firms)
- Securities firm reports: Full-text extraction from major Korean brokerages
- Financial statements: 3+ years of balance sheet, cash flow, income statement
- Order flow data: Institutional/foreign investor net buying (real-time)
Contact hello@intrect.io for enterprise features.
- Not a trading bot: VELA generates research reports, not trade signals or orders
- Not a financial advisor: Output is for informational/educational purposes only
- Not a real-time system: Research runs in batch mode (30-120s per query), not streaming
- Not a replacement for professional analysis: Designed to augment, not replace, human judgment
Contributions are welcome. Please open an issue first to discuss what you would like to change.
```bibtex
@software{vela_framework_2026,
  title={VELA Framework: Domain-Specialized LLM Research Agent for Korean Financial Markets},
  author={intrect},
  year={2026},
  url={https://github.com/intrect/vela-framework}
}
```