1 | | -# Documentation |
| 1 | +# Documentation Index |
2 | 2 |
3 | | -This directory contains the relevant documents important to the software. |
| 3 | +**Complete documentation for the unstructuredDataHandler project** |
4 | 4 |
| 5 | +This directory contains all project documentation, organized for easy navigation. Choose a section based on your role and needs: |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## 📘 For Users |
| 10 | + |
| 11 | +**New to the project? Start here:** |
| 12 | + |
| 13 | +### User Guides (`user-guide/`) |
| 14 | + |
| 15 | +- **[Quick Start](user-guide/quick-start.md)** ⭐ |
| 16 | + - 5-minute setup and first extraction |
| 17 | + - Programmatic usage with Python |
| 18 | + - Streamlit UI walkthrough |
| 19 | + - Provider comparison (Ollama vs Cerebras vs OpenAI) |
| 20 | + - Troubleshooting guide |
| 21 | + |
| 22 | +- **[Configuration](user-guide/configuration.md)** |
| 23 | + - 4-tier configuration system |
| 24 | + - LLM provider setup (Ollama, Cerebras, OpenAI, Anthropic) |
| 25 | + - Environment variables and YAML config |
| 26 | + - Image storage (local/MinIO) |
| 27 | + - Debug and optimization settings |
| 28 | + |
| 29 | +- **[Testing](user-guide/testing.md)** |
| 30 | + - Running the test suite (233 tests, 87.2% pass rate) |
| 31 | + - Test types: unit, integration, smoke, E2E |
| 32 | + - Manual test scripts |
| 33 | + - Performance benchmarks |
| 34 | + - CI/CD integration |
| 35 | + |
| 36 | +--- |
| 37 | + |
| 38 | +## 👨‍💻 For Developers |
| 39 | + |
| 40 | +**Contributing to the codebase? Read these:** |
| 41 | + |
| 42 | +### Developer Guides (`developer-guide/`) |
| 43 | + |
| 44 | +- **[Architecture](developer-guide/architecture.md)** ⭐ |
| 45 | + - 7-layer system architecture |
| 46 | + - Component details (Agent, Parser, LLM, Skill, Memory, Retrieval) |
| 47 | + - Data flow examples and diagrams |
| 48 | + - Design patterns (Factory, Strategy, Pipeline, Observer, Singleton) |
| 49 | + - Quality attributes |
| 50 | + - Extension guide |
| 51 | + |
| 52 | +- **[Development Setup](developer-guide/development-setup.md)** |
| 53 | + - Prerequisites and environment setup |
| 54 | + - Python environments (venv, conda, pyenv) |
| 55 | + - Dependency management |
| 56 | + - LLM provider configuration |
| 57 | + - IDE setup (VS Code, PyCharm) |
| 58 | + - Pre-commit hooks |
| 59 | + - Code quality tools |
| 60 | + - Branch strategy (`dev/<alias>/<feature>`) |
| 61 | + - Git2Git workflow |
| 62 | + |
| 63 | +- **[API Reference](developer-guide/api-reference.md)** |
| 64 | + - DocumentAgent API (5 methods with examples) |
| 65 | + - DocumentParser API (3 methods) |
| 66 | + - LLMRouter API (2 methods) |
| 67 | + - Configuration utilities |
| 68 | + - Quality metrics structures |
| 69 | + - Type hints and error handling |
| 70 | + |
| 71 | +--- |
| 72 | + |
| 73 | +## ⚡ Feature Documentation |
| 74 | + |
| 75 | +**Deep dive into specific features:** |
| 76 | + |
| 77 | +### Features (`features/`) |
| 78 | + |
| 79 | +- **[Requirements Extraction](features/requirements-extraction.md)** |
| 80 | + - AI-powered extraction from documents |
| 81 | + - Multi-format support (PDF, DOCX, PPTX, HTML, images) |
| 82 | + - Quality enhancement mode (99-100% accuracy) |
| 83 | + - Batch processing |
| 84 | + - Provider optimization |
| 85 | + |
| 86 | +- **[Document Tagging](features/document-tagging.md)** |
| 87 | + - Automatic categorization |
| 88 | + - Tag types: requirement types, domains, priorities, sections |
| 89 | + - Tag-based filtering and analytics |
| 90 | + - Multi-label classification |
| 91 | + - Custom taxonomies |
| 92 | + |
| 93 | +- **[Quality Enhancements](features/quality-enhancements.md)** |
| 94 | + - Confidence scoring (0.0-1.0) |
| 95 | + - 5 confidence levels (very high to very low) |
| 96 | + - Quality flags (7 types: vague, missing ID, duplicate, incomplete, etc.) |
| 97 | + - Auto-approval workflow |
| 98 | + - Quality metrics and reporting |
| 99 | + |
| 100 | +- **[LLM Integration](features/llm-integration.md)** |
| 101 | + - Multi-provider support (4 providers) |
| 102 | + - Provider details: Ollama, Cerebras, OpenAI, Anthropic |
| 103 | + - Performance comparison |
| 104 | + - Cost optimization strategies |
| 105 | + - Advanced topics: custom providers, caching, token tracking |
| 106 | + |
| 107 | +--- |
| 108 | + |
| 109 | +## 📚 Additional Resources |
| 110 | + |
| 111 | +### Architecture (`architecture/`) |
| 112 | + |
| 113 | +System architecture documentation and templates: |
| 114 | +- [System Overview](architecture/system_overview.md) - High-level system view |
| 115 | +- [Component](architecture/component.md) - Component documentation template |
| 116 | +- [Interface](architecture/interface.md) - Interface specifications |
| 117 | +- [Pattern](architecture/pattern.md) - Design patterns used |
| 118 | +- [Quality](architecture/quality.md) - Quality attributes |
| 119 | +- And 9 more architectural templates |
| 120 | + |
| 121 | +### Business Documentation (`business/`) |
| 122 | + |
| 123 | +- [Stakeholder Analysis](business/stakeholder_analysis.md) |
| 124 | +- [Stakeholder Goals](business/stakeholder_goals.md) |
| 125 | +- [Differentiation Strategy](business/differentiation_strategy.md) |
| 126 | +- [Risks and Challenges](business/risks_and_challenges.md) |
| 127 | + |
| 128 | +### Specifications (`specs/`) |
| 129 | + |
| 130 | +- [Spec Template](specs/spec-template.md) - Feature specification template |
| 131 | +- [Settings Spec Template](specs/settings-spec-template.md) |
| 132 | +- [Example Spec](specs/%23000%20-%20Example/) - Example specification |
| 133 | + |
| 134 | +### Process Documentation |
| 135 | + |
| 136 | +- [Building](building.md) - Build instructions |
| 137 | +- [Submitting Code](submitting_code.md) - Branch strategy and Git2Git workflow |
| 138 | +- [Code Style](STYLE.md) - Code style guidelines |
| 139 | +- [Code Organization](ORGANIZATION.md) - Repository structure rules |
| 140 | +- [Bot Usage](bot.md) - Working with AI coding assistants |
| 141 | + |
| 142 | +### Development Guides |
| 143 | + |
| 144 | +- [DeepAgent](deepagent.md) - DeepAgent implementation guide |
| 145 | +- [Debugging](Debugging.md) - Debugging techniques |
| 146 | +- [Feature Flags](feature_flags.md) - Feature flag system |
| 147 | +- [Fuzzing](fuzzing.md) - Fuzzing and testing strategies |
| 148 | +- [Exceptions](EXCEPTIONS.md) - Exception handling |
| 149 | +- [Cooked Read Data](COOKED_READ_DATA.md) - Data processing |
| 150 | + |
| 151 | +### Historical Documentation (`.archive/`) |
| 152 | + |
| 153 | +Implementation reports and phase summaries preserved for reference: |
| 154 | + |
| 155 | +- **phase1/** - Phase 1 implementation (3 docs) |
| 156 | +- **phase2/** - Phase 2 implementation (10 docs) |
| 157 | +- **implementation-reports/** - Task completion reports (5 docs) |
| 158 | +- **working-docs/** - Working documents and summaries (6 docs) |
| 159 | + |
| 160 | +--- |
| 161 | + |
| 162 | +## 🔍 Quick Navigation |
| 163 | + |
| 164 | +### By Role |
| 165 | + |
| 166 | +**I'm a user wanting to extract requirements:** |
| 167 | +1. Start: [Quick Start Guide](user-guide/quick-start.md) |
| 168 | +2. Configure: [Configuration Guide](user-guide/configuration.md) |
| 169 | +3. Deep dive: [Requirements Extraction](features/requirements-extraction.md) |
| 170 | + |
| 171 | +**I'm a developer wanting to contribute:** |
| 172 | +1. Setup: [Development Setup](developer-guide/development-setup.md) |
| 173 | +2. Understand: [Architecture](developer-guide/architecture.md) |
| 174 | +3. Reference: [API Reference](developer-guide/api-reference.md) |
| 175 | + |
| 176 | +**I'm evaluating the project:** |
| 177 | +1. Overview: [README](../README.md) |
| 178 | +2. Architecture: [System Overview](architecture/system_overview.md) |
| 179 | +3. Business: [Stakeholder Analysis](business/stakeholder_analysis.md) |
| 180 | + |
| 181 | +### By Task |
| 182 | + |
| 183 | +**Setting up LLM providers:** |
| 184 | +- [Configuration Guide](user-guide/configuration.md) → LLM Provider Setup |
| 185 | +- [LLM Integration](features/llm-integration.md) → Provider Details |
| 186 | + |
| 187 | +**Running tests:** |
| 188 | +- [Testing Guide](user-guide/testing.md) → Running Tests |
| 189 | +- [Development Setup](developer-guide/development-setup.md) → Testing Setup |
| 190 | + |
| 191 | +**Understanding the architecture:** |
| 192 | +- [Architecture Guide](developer-guide/architecture.md) → System Architecture |
| 193 | +- [Source README](../src/README.md) → Code Structure |
| 194 | + |
| 195 | +**Improving extraction quality:** |
| 196 | +- [Quality Enhancements](features/quality-enhancements.md) → Confidence Scoring |
| 197 | +- [Requirements Extraction](features/requirements-extraction.md) → Quality Mode |
| 198 | + |
| 199 | +--- |
| 200 | + |
| 201 | +## 📝 Documentation Standards |
| 202 | + |
| 203 | +### Writing New Documentation |
| 204 | + |
| 205 | +1. **Use templates** from `specs/` for consistency |
| 206 | +2. **Include code examples** that have been tested and run as written |
| 207 | +3. **Add cross-references** to related documentation |
| 208 | +4. **Keep it updated** when making code changes |
| 209 | +5. **Follow markdown linting** rules (see `.markdownlint.json`) |
| 210 | + |
| 211 | +### Documentation Structure |
| 212 | + |
| 213 | +Each major doc should include: |
| 214 | +- **Overview** - What is this about? |
| 215 | +- **Quick Start** - How do I use it? |
| 216 | +- **Configuration** - How do I configure it? |
| 217 | +- **Examples** - Show me working code |
| 218 | +- **Troubleshooting** - What if it doesn't work? |
| 219 | +- **Related Docs** - Where can I learn more? |
| 220 | + |
| 221 | +### Maintenance |
| 222 | + |
| 223 | +- User guides: Update when user-facing features change |
| 224 | +- Developer guides: Update when APIs or architecture change |
| 225 | +- Feature docs: Update when feature capabilities change |
| 226 | +- Architecture docs: Update when design decisions change |
| 227 | + |
| 228 | +--- |
| 229 | + |
| 230 | +## 🤝 Contributing |
| 231 | + |
| 232 | +See [CONTRIBUTING.md](../CONTRIBUTING.md) for: |
| 233 | +- How to submit documentation changes |
| 234 | +- Pull request template |
| 235 | +- Review process |
| 236 | +- Style guidelines |
| 237 | + |
| 238 | +--- |
| 239 | + |
| 240 | +**Need help?** See [SUPPORT.md](../SUPPORT.md) for support channels. |
| 241 | + |
| 242 | +**Have a question?** Check the [FAQ section](../README.md#faq) in the main README. |
| 243 | + |
| 244 | +--- |
| 245 | + |
| 246 | +*Last Updated: 2025-10-07* |
| 247 | +*Total Documentation Files: 60+* |
| 248 | +*Documentation Coverage: Complete* |