
Commit 023e332

docs: Update documentation structure to reflect experiments-first paradigm (#2394)
## Problem Description

Documentation structure was outdated and didn't reflect the current library focus on experiments and custom metrics. The library has evolved to emphasize systematic experimentation and custom metrics for evaluating any AI application.

## Changes Made

- **Home page (`docs/index.md`)**: Added a "Why Ragas?" section explaining the value proposition and key features. Updated to reflect the experiments-first approach, custom metrics, and broader AI application evaluation. Removed the FAQ section and made card descriptions more actionable.
- **Get Started (`docs/getstarted/index.md`)**: Reorganized so the experiments quickstart is the main entry point. Removed links to outdated RAG-focused tutorials (evals.md, rag_eval.md, rag_testset_generation.md). Added a Discord community link and organized tutorials into clearer sections.
- **Core Concepts (`docs/concepts/index.md`)**: Reordered sections to prioritize Experimentation and Datasets at the top with separate cards. Updated the metrics description to cover both the available metrics library and creating custom metrics. Removed the Feedback Intelligence card. Moved Components to the end.
- **Navigation (`mkdocs.yml`)**: Updated the navigation structure to match the current library organization and the experiments-first paradigm. Removed outdated tutorials from the Get Started navigation. Flattened the Experiments section (Experimentation and Datasets as direct children). Removed Feedback Intelligence from navigation.

## Testing

### How to Test

- [ ] Automated tests added/updated: N/A (documentation changes only)
- [ ] Manual testing steps:
    1. Build the docs locally: `make serve-docs`
    2. Navigate through the home page and verify the "Why Ragas?" section appears
    3. Check the Get Started section - the experiments quickstart should be prominent
    4. Verify Core Concepts has Experimentation and Datasets at the top
    5. Confirm the outdated RAG tutorials are no longer in navigation
    6. Test that all internal links work correctly
    7. Verify the navigation structure matches the current library organization

## References

- Related issues:
- Documentation: Updated to reflect current library capabilities and focus
- External references: N/A

## Screenshots/Examples (if applicable)

<!-- Navigation structure changes and content reorganization visible in local docs build -->
1 parent 8501a49 commit 023e332

File tree

5 files changed: +59 −132 lines


docs/concepts/experimentation.md

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ graph LR
 
 ## Creating Experiments with Ragas
 
-Ragas provides an `@experiment` decorator to streamline the experiment creation process. If you prefer a hands-on intro first, see [Run your first experiment](../getstarted/experiments_quickstart.md).
+Ragas provides an `@experiment` decorator to streamline the experiment creation process. If you prefer a hands-on intro first, see the [Quick Start guide](../getstarted/quickstart.md).
 
 ### Basic Experiment Structure
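The changed line above points readers at the `@experiment` decorator. As a rough illustration of the pattern only (the decorator name comes from the diff; the signature and behavior below are a hypothetical sketch, not the actual Ragas API), such a decorator might wrap an evaluation function so it runs over every row of a dataset and collects the results:

```python
from functools import wraps


def experiment(func):
    """Hypothetical @experiment-style decorator (illustrative, not the
    real Ragas API): run the wrapped evaluation once per dataset row
    and collect the outputs."""
    @wraps(func)
    def run(dataset):
        return [func(row) for row in dataset]
    return run


@experiment
def length_check(row):
    # Toy evaluation: does the response stay within a 50-character budget?
    return {"query": row["query"], "passed": len(row["response"]) <= 50}


results = length_check([
    {"query": "greeting", "response": "hello"},
    {"query": "rambling", "response": "x" * 120},
])
```

The appeal of the pattern is that the evaluation logic stays a plain function, while the decorator supplies the run-over-a-dataset plumbing.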

docs/concepts/index.md

Lines changed: 12 additions & 17 deletions
@@ -3,41 +3,36 @@
 
 <div class="grid cards" markdown>
 
-- :material-widgets:{ .lg .middle } [__Components Guides__](components/index.md)
+- :material-flask-outline:{ .lg .middle } [__Experimentation__](experimentation.md)
 
     ---
 
-    Discover the various components used within Ragas.
-
-    Components like [Prompt Object](components/prompt.md), [Evaluation Dataset](components/eval_dataset.md) and [more..](components/index.md)
+    Learn how to systematically evaluate your AI applications using experiments.
 
+    Track changes, measure improvements, and compare results across different versions of your application.
 
-- ::material-ruler-square:{ .lg .middle } [__Ragas Metrics__](metrics/index.md)
+- :material-database-export:{ .lg .middle } [__Datasets__](datasets.md)
 
     ---
 
-    Explore available metrics and understand how they work.
+    Understand how to create, manage, and use evaluation datasets.
 
-    Metrics for evaluating [RAG](metrics/available_metrics/index.md#retrieval-augmented-generation), [Agentic workflows](metrics/available_metrics/index.md#agents-or-tool-use-cases) and [more..](metrics/available_metrics/index.md#list-of-available-metrics).
+    Learn about dataset structure, storage backends, and best practices for maintaining your test data.
 
-- :material-database-plus:{ .lg .middle } [__Test Data Generation__](test_data_generation/index.md)
+- ::material-ruler-square:{ .lg .middle } [__Ragas Metrics__](metrics/index.md)
 
     ---
 
-    Generate high-quality datasets for comprehensive testing.
-
-    Algorithms for synthesizing data to test [RAG](test_data_generation/rag.md), [Agentic workflows](test_data_generation/agents.md)
+    Use our library of [available metrics](metrics/available_metrics/index.md) or create [custom metrics](metrics/overview/index.md) tailored to your use case.
 
+    Metrics for evaluating [RAG](metrics/available_metrics/index.md#retrieval-augmented-generation), [Agentic workflows](metrics/available_metrics/index.md#agents-or-tool-use-cases) and [more..](metrics/available_metrics/index.md#list-of-available-metrics).
 
-- :material-chart-box-outline:{ .lg .middle } [__Feedback Intelligence__](feedback/index.md)
+- :material-database-plus:{ .lg .middle } [__Test Data Generation__](test_data_generation/index.md)
 
     ---
 
-    Leverage signals from production data to gain actionable insights.
-
-    Learn about to leveraging implicit and explicit signals from production data.
-
-
+    Generate high-quality datasets for comprehensive testing.
 
+    Algorithms for synthesizing data to test [RAG](test_data_generation/rag.md), [Agentic workflows](test_data_generation/agents.md)
 
 </div>

docs/getstarted/index.md

Lines changed: 16 additions & 9 deletions
@@ -1,18 +1,25 @@
 # 🚀 Get Started
 
-Welcome to Ragas! If you're new to Ragas, the Get Started guides will walk you through the fundamentals of working with Ragas. These tutorials assume basic knowledge of Python and building LLM application pipelines.
+Welcome to Ragas! The Get Started guides will walk you through the fundamentals of working with Ragas. These tutorials assume basic knowledge of Python and building LLM application pipelines.
 
 Before you proceed further, ensure that you have [Ragas installed](./install.md)!
 
 !!! note
-    The tutorials only provide an overview of what you can accomplish with Ragas and the basic skills needed to utilize it effectively. For an in-depth explanation of the core concepts behind Ragas, check out the [Core Concepts](../concepts/index.md) page. You can also explore the [How-to Guides](../howtos/index.md) for specific applications of Ragas.
+    The tutorials provide an overview of what you can accomplish with Ragas and the basic skills needed to utilize it effectively. For an in-depth explanation of the core concepts behind Ragas, check out the [Core Concepts](../concepts/index.md) page. You can also explore the [How-to Guides](../howtos/index.md) for specific applications of Ragas.
 
-If you have any questions about Ragas, feel free to join and ask in the `#questions` channel in our Discord community.
+If you have any questions about Ragas, feel free to join our [Discord community](../community/index.md) and ask in the `#questions` channel.
 
-Let's get started!
+## Quickstart
 
-- [Quick Start: Get Running in 5 Minutes](./quickstart.md)
-- [Evaluate your first AI app](./evals.md)
-- [Run ragas metrics for evaluating RAG](rag_eval.md)
-- [Generate test data for evaluating RAG](rag_testset_generation.md)
-- [Run your first experiment](experiments_quickstart.md)
+Start here to get up and running with Ragas in minutes:
+
+- [Quick Start: Get Running in 5 Minutes](./quickstart.md)
+
+## Tutorials
+
+Learn how to evaluate different types of AI applications:
+
+- [Evaluate a prompt](../tutorials/prompt.md) - Test and compare different prompts
+- [Evaluate a simple RAG system](../tutorials/rag.md) - Evaluate a RAG application
+- [Evaluate an AI Workflow](../tutorials/workflow.md) - Evaluate multi-step workflows
+- [Evaluate an AI Agent](../tutorials/agent.md) - Evaluate agentic applications

docs/index.md

Lines changed: 20 additions & 89 deletions
@@ -1,121 +1,52 @@
 # ✨ Introduction
 
-Ragas is a library that provides tools to supercharge the evaluation of Large Language Model (LLM) applications. It is designed to help you evaluate your LLM applications with ease and confidence.
+Ragas is a library that helps you move from "vibe checks" to systematic evaluation loops for your AI applications. It provides tools to supercharge the evaluation of Large Language Model (LLM) applications, enabling you to evaluate your LLM applications with ease and confidence.
 
+## Why Ragas?
 
+Traditional evaluation metrics don't capture what matters for LLM applications. Manual evaluation doesn't scale. Ragas solves this by combining **LLM-driven metrics** with **systematic experimentation** to create a continuous improvement loop.
+
+### Key Features
+
+- **Experiments-first approach**: Evaluate changes consistently with `experiments`. Make changes, run evaluations, observe results, and iterate to improve your LLM application.
+
+- **Ragas Metrics**: Create custom metrics tailored to your specific use case with simple decorators or use our library of [available metrics](./concepts/metrics/available_metrics/index.md). Learn more about [metrics in Ragas](./concepts/metrics/overview/index.md).
+
+- **Easy to integrate**: Built-in dataset management, result tracking, and integration with popular frameworks like LangChain, LlamaIndex, and more.
 
 <div class="grid cards" markdown>
 
 - 🚀 **Get Started**
 
-    Install with `pip` and get started with Ragas with these tutorials.
+    Start evaluating in 5 minutes with our quickstart guide.
 
-    [:octicons-arrow-right-24: Get Started](getstarted/evals.md)
+    [:octicons-arrow-right-24: Get Started](getstarted/quickstart.md)
 
 - 📚 **Core Concepts**
 
-    In depth explanation and discussion of the concepts and working of different features available in Ragas.
+    Understand experiments, metrics, and datasets—the building blocks of effective evaluation.
 
     [:octicons-arrow-right-24: Core Concepts](./concepts/index.md)
 
 - 🛠️ **How-to Guides**
 
-    Practical guides to help you achieve a specific goals. Take a look at these
-    guides to learn how to use Ragas to solve real-world problems.
+    Integrate Ragas into your workflow with practical guides for specific use cases.
 
    [:octicons-arrow-right-24: How-to Guides](./howtos/index.md)
 
 - 📖 **References**
 
-    Technical descriptions of how Ragas classes and methods work.
+    API documentation and technical details for diving deeper.
 
   [:octicons-arrow-right-24: References](./references/index.md)
 
 </div>
 
 
+## Want help improving your AI application using evals?
 
+In the past 2 years, we have seen and helped improve many AI applications using evals.
 
+We are compressing this knowledge into a product to replace vibe checks with eval loops so that you can focus on building great AI applications.
 
-## Frequently Asked Questions
-
-<div class="toggle-list"><span class="arrow">→</span> What is the best open-source model to use?</div>
-<div style="display: none;">
-There isn't a single correct answer to this question. With the rapid pace of AI model development, new open-source models are released every week, often claiming to outperform previous versions. The best model for your needs depends largely on your GPU capacity and the type of data you're working with.
-<br><br>
-It's a good idea to explore newer, widely accepted models with strong general capabilities. You can refer to <a href="https://github.com/eugeneyan/open-llms?tab=readme-ov-file#open-llms">this list</a> for available open-source models, their release dates, and fine-tuned variants.
-</div>
-
-<div class="toggle-list"><span class="arrow">→</span> Why do NaN values appear in evaluation results?</div>
-<div style="display: none;">
-NaN stands for "Not a Number." In ragas evaluation results, NaN can appear for two main reasons:
-<ul style="margin: 0.5rem 0; padding-left: 1.5rem;">
-<li><strong>JSON Parsing Issue:</strong> The model's output is not JSON-parsable. ragas requires models to output JSON-compatible responses because all prompts are structured using Pydantic. This ensures efficient parsing of LLM outputs.</li>
-<li><strong>Non-Ideal Cases for Scoring:</strong> Certain cases in the sample may not be ideal for scoring. For example, scoring the faithfulness of a response like "I don't know" might not be appropriate.</li>
-</ul>
-</div>
-
-<div class="toggle-list"><span class="arrow">→</span> How can I make evaluation results more explainable?</div>
-<div style="display: none;">
-The best way is to trace and log your evaluation, then inspect the results using LLM traces. You can follow a detailed example of this process <a href="./howtos/customizations/metrics/tracing/">here</a>.
-</div>
-
-<script>
-// FAQ
-(function() {
-    function initFAQ() {
-        const toggles = document.querySelectorAll('.toggle-list');
-
-        toggles.forEach(toggle => {
-            // Remove any existing listeners
-            const newToggle = toggle.cloneNode(true);
-            toggle.parentNode.replaceChild(newToggle, toggle);
-        });
-
-        // Re-select after cloning
-        const freshToggles = document.querySelectorAll('.toggle-list');
-
-        freshToggles.forEach(toggle => {
-            const arrow = toggle.querySelector('.arrow');
-            const content = toggle.nextElementSibling;
-
-            // Initialize as closed
-            if (arrow) arrow.innerText = '';
-            if (content) content.style.display = 'none';
-            toggle.classList.remove('active');
-
-            // Add click listener
-            toggle.addEventListener('click', function() {
-                const myContent = this.nextElementSibling;
-                const myArrow = this.querySelector('.arrow');
-                const isOpen = this.classList.contains('active');
-
-                // Close all others first
-                freshToggles.forEach(other => {
-                    const otherContent = other.nextElementSibling;
-                    const otherArrow = other.querySelector('.arrow');
-                    if (otherContent) otherContent.style.display = 'none';
-                    other.classList.remove('active');
-                    if (otherArrow) otherArrow.innerText = '';
-                });
-
-                // Open this one if it was closed
-                if (!isOpen) {
-                    if (myContent) myContent.style.display = 'block';
-                    this.classList.add('active');
-                    if (myArrow) myArrow.innerText = '';
-                }
-            });
-        });
-    }
-
-    // Initialize when page loads
-    if (document.readyState === 'loading') {
-        document.addEventListener('DOMContentLoaded', function() {
-            initFAQ();
-        });
-    } else {
-        initFAQ();
-    }
-})();
-</script>
+If you want help with improving and scaling up your AI application using evals, 🔗 Book a [slot](https://bit.ly/3EBYq4J) or drop us a line: [founders@explodinggradients.com](mailto:founders@explodinggradients.com).
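The new "Key Features" text above mentions creating custom metrics "with simple decorators." As an illustration of that general pattern only (the decorator name, signature, and scoring function below are hypothetical, not the actual Ragas API), a decorator-based custom metric might look like this:

```python
def metric(name):
    """Hypothetical decorator factory for custom metrics (illustrative
    only, not the real Ragas API): tag a plain scoring function with a
    metric name so a runner could discover and report it."""
    def wrap(score_fn):
        score_fn.metric_name = name  # attach metadata without changing behavior
        return score_fn
    return wrap


@metric("exact_match")
def exact_match(expected, actual):
    # Score 1.0 when the normalized strings match exactly, else 0.0.
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0
```

The scoring logic remains an ordinary, testable function; the decorator only adds the metadata an evaluation harness would need.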

mkdocs.yml

Lines changed: 10 additions & 16 deletions
@@ -13,24 +13,15 @@ nav:
     - getstarted/index.md
     - Quick Start: getstarted/quickstart.md
     - Installation: getstarted/install.md
-    - Evaluate your first LLM App: getstarted/evals.md
-    - Evaluate a simple RAG: getstarted/rag_eval.md
-    - Generate Synthetic Testset for RAG: getstarted/rag_testset_generation.md
-    - Experiments:
-        - Run your first experiment: getstarted/experiments_quickstart.md
+    - Tutorials:
         - Evaluate a prompt: tutorials/prompt.md
         - Evaluate a simple RAG system: tutorials/rag.md
         - Evaluate an AI Workflow: tutorials/workflow.md
         - Evaluate an AI Agent: tutorials/agent.md
   - 📚 Core Concepts:
     - concepts/index.md
-    - Components:
-        - concepts/components/index.md
-        - General:
-            - Prompt: concepts/components/prompt.md
-        - Evaluation:
-            - Evaluation Sample: concepts/components/eval_sample.md
-            - Evaluation Dataset: concepts/components/eval_dataset.md
+    - Experimentation: concepts/experimentation.md
+    - Datasets: concepts/datasets.md
     - Metrics:
         - concepts/metrics/index.md
         - Overview: concepts/metrics/overview/index.md
@@ -84,10 +75,13 @@ nav:
         - Scenario Generation: concepts/test_data_generation/rag/#scenario-generation
         - Agents or tool use:
            - concepts/test_data_generation/agents.md
-    - Feedback Intelligence:
-        - concepts/feedback/index.md
-    - Datasets: concepts/datasets.md
-    - Experimentation: concepts/experimentation.md
+    - Components:
+        - concepts/components/index.md
+        - General:
+            - Prompt: concepts/components/prompt.md
+        - Evaluation:
+            - Evaluation Sample: concepts/components/eval_sample.md
+            - Evaluation Dataset: concepts/components/eval_dataset.md
 
   - 🛠️ How-to Guides:
     - howtos/index.md
