This is our repository for the training code on the DeepResearch-9K dataset.

Applied-Machine-Learning-Lab/DeepResearch-R1


# DeepResearch-9K

This repository contains the dataset and codebase for the DeepResearch-9K project. All environment configuration files are stored in the env/ directory.


## 📊 Environment Overview

| Environment | Python | Key Features | Primary Use Case |
| --- | --- | --- | --- |
| react_infer_env | 3.10 | OpenAI SDK, Data processing | Inference & Data Modification |
| searchr1 | 3.9 | vLLM, Verl, Flash-Attn 2 | Model Training & RL Tasks |
| retriever | 3.10 | Faiss-GPU, Pyserini, FastAPI | Vector Search & Knowledge Retrieval |

## 🛠 1. Inference Environment (react_infer_env)

Purpose: Optimized for running basic inference and large-scale data processing/modification scripts.

# Create and activate environment
conda create -n react_infer_env python=3.10.0 -y
conda activate react_infer_env

# Install dependencies from the env folder
conda install --file env/react_infer_requirements.txt

## 🚀 2. Search & Training Environment (searchr1)

Purpose: Designed for model training (Verl), Reinforcement Learning (RL) tasks, and high-performance inference via vLLM.


# Create and activate environment
conda create -n searchr1 python=3.9 -y
conda activate searchr1

# Install PyTorch and vLLM
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
pip install vllm==0.6.3

# Install Verl Framework & Flash Attention 2
pip install -e .
pip install flash-attn --no-build-isolation
pip install wandb
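
After the installs finish, it can help to confirm that the pinned versions actually landed in the environment. The following is a minimal sketch, not part of the repository: `check_pins` and `PINS` are hypothetical names, and the only inputs are the two version pins from the commands above. It uses the standard-library `importlib.metadata` and strips a local build tag such as `+cu121` before comparing.

```python
from importlib import metadata

# Pins copied from the install commands above.
PINS = {"torch": "2.4.0", "vllm": "0.6.3"}


def check_pins(pins):
    """Return {package: (expected, installed, ok)} for each pinned package."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Drop a local build tag such as "+cu121" before comparing.
        base = installed.split("+")[0] if installed else None
        report[name] = (expected, installed, base == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

Run it inside the activated `searchr1` environment; a `MISMATCH` line usually means a later install replaced one of the pinned packages.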

## 🔍 3. Retriever Environment (retriever)

Purpose: Specialized for knowledge retrieval, vector database management, and hosting API services.


# Create and activate environment
conda create -n retriever python=3.10 -y
conda activate retriever

# Install PyTorch and CUDA (Conda recommended for Faiss-GPU compatibility)
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia

# Install Faiss-GPU so retrieval stays fast during RL rollouts
conda install -c pytorch -c nvidia faiss-gpu=1.8.0

# Install additional retrieval components
pip install transformers datasets pyserini uvicorn fastapi
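
Because this environment depends on GPU builds of both PyTorch and Faiss, a quick visibility check can catch a CPU-only install early. The sketch below is an illustration rather than repository code: `gpu_report` is a hypothetical helper, and it treats a Faiss build without `get_num_gpus` as having zero GPUs, which is an assumption about CPU-only builds.

```python
def gpu_report():
    """Collect GPU visibility as seen by PyTorch and Faiss.

    A value of None means the library could not be imported.
    """
    report = {"torch_cuda": None, "faiss_gpus": None}
    try:
        import torch

        report["torch_cuda"] = torch.cuda.is_available()
    except ImportError:
        pass
    try:
        import faiss

        # Assumption: builds without GPU support may not expose get_num_gpus,
        # so fall back to 0 instead of raising AttributeError.
        get_num_gpus = getattr(faiss, "get_num_gpus", None)
        report["faiss_gpus"] = get_num_gpus() if get_num_gpus else 0
    except ImportError:
        pass
    return report


if __name__ == "__main__":
    print(gpu_report())
```

In a working `retriever` environment one would expect `torch_cuda` to be `True` and `faiss_gpus` to be at least 1.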

## 📊 Dataset & Rollouts

You can access the complete dataset and model rollouts on Hugging Face:

* **Dataset**: [artillerywu/DeepResearch-9K](https://huggingface.co/datasets/artillerywu/DeepResearch-9K)
    * Contains **9,000 high-quality samples**.
    * Includes full **rollouts** generated by the **Tongyi-DeepResearch-30B-A3B** teacher model.
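
To inspect the data locally, the Hugging Face `datasets` library can download it by repository ID. This is a hedged sketch: the split names and column layout are not documented here, so the code only prints whatever schema it finds, and `describe_record` and `peek` are hypothetical helper names. If the dataset defines multiple configurations, `load_dataset` may also need a configuration name.

```python
DATASET_ID = "artillerywu/DeepResearch-9K"


def describe_record(record):
    """Map each field name to its Python type name, for quick schema inspection."""
    return {key: type(value).__name__ for key, value in record.items()}


def peek(dataset_id=DATASET_ID):
    """Download the dataset and print the schema of the first record in each split."""
    from datasets import load_dataset  # pip install datasets

    # Without a split argument, load_dataset returns a mapping of split name to data.
    splits = load_dataset(dataset_id)
    for split_name, split in splits.items():
        print(split_name, len(split), describe_record(split[0]))
```

Calling `peek()` requires network access and the `datasets` package, which the retriever environment above already installs.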

---

## 🚀 Supervised Fine-Tuning (SFT)

We provide scripts to perform SFT on popular 3B-parameter base models.

### 1. Base Models
The scripts are optimized for the following models available on Hugging Face:
* **Qwen2.5-3B**: [Qwen/Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B)
* **Llama-3.2-3B**: [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)

### 2. SFT Scripts
Use the corresponding Python scripts to start the training process:
* For Llama 3.2: `python sft_llama3b.py`
* For Qwen 2.5: `python sft_qwen3b.py`
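
If both runs are launched from automation, a thin wrapper can select the script by model family. The sketch below assumes it is run from the directory that contains the two scripts; the short keys `llama` and `qwen` and the function names are hypothetical, while the script file names come from the list above.

```python
import subprocess
import sys

# Script names as listed above; the short keys are hypothetical.
SFT_SCRIPTS = {"llama": "sft_llama3b.py", "qwen": "sft_qwen3b.py"}


def build_sft_command(model_key):
    """Return the argv list that launches the SFT script for the given model key."""
    if model_key not in SFT_SCRIPTS:
        raise ValueError(f"unknown model key {model_key!r}; expected one of {sorted(SFT_SCRIPTS)}")
    # sys.executable keeps the run inside the currently active environment.
    return [sys.executable, SFT_SCRIPTS[model_key]]


def run_sft(model_key):
    """Run the selected SFT script and raise if it exits with a non-zero status."""
    subprocess.run(build_sft_command(model_key), check=True)
```

For example, `run_sft("qwen")` is equivalent to running `python sft_qwen3b.py` with the interpreter of the active environment.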

---
