This repository contains the dataset and codebase for the DeepResearch-9K project. All environment configuration files are stored in the env/ directory.
| Environment | Python | Key Features | Primary Use Case |
|---|---|---|---|
| react_infer_env | 3.10 | OpenAI SDK, Data processing | Inference & Data Modification |
| searchr1 | 3.9 | vLLM, Verl, Flash-Attn 2 | Model Training & RL Tasks |
| retriever | 3.10 | Faiss-GPU, Pyserini, FastAPI | Vector Search & Knowledge Retrieval |
## Environment Setup

### `react_infer_env`
Purpose: Optimized for running basic inference and large-scale data processing/modification scripts.
```bash
# Create and activate environment
conda create -n react_infer_env python=3.10.0 -y
conda activate react_infer_env

# Install dependencies from the env folder
conda install --file env/react_infer_requirements.txt
```
### `searchr1`
Purpose: Designed for model training (Verl), Reinforcement Learning (RL) tasks, and high-performance inference via vLLM.
```bash
# Create and activate environment
conda create -n searchr1 python=3.9 -y
conda activate searchr1

# Install PyTorch and vLLM
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
pip install vllm==0.6.3

# Install the Verl framework (run from the Verl repository root) and Flash Attention 2
pip install -e .
pip install flash-attn --no-build-isolation
pip install wandb
```
### `retriever`
Purpose: Specialized for knowledge retrieval, vector database management, and hosting API services.
```bash
# Create and activate environment
conda create -n retriever python=3.10 -y
conda activate retriever

# Install PyTorch with CUDA (conda recommended for Faiss-GPU compatibility)
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia

# Install Faiss-GPU for efficient vector search during RL rollouts
conda install -c pytorch -c nvidia faiss-gpu=1.8.0

# Install additional retrieval components
pip install transformers datasets pyserini uvicorn fastapi
```
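The retriever environment's role is to serve top-k vector lookups (a Faiss index behind a FastAPI endpoint). As a rough sketch of the lookup that service performs, the pure-Python brute-force cosine search below stands in for the Faiss index; the document IDs and vectors are made up for illustration, not taken from this project.

```python
# Hypothetical sketch of a top-k retrieval lookup. In the real service,
# Faiss replaces this brute-force scan; the embed -> index -> top-k flow
# is the same.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, index, k=2):
    """Return the k (doc_id, score) pairs most similar to `query`."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda p: p[1], reverse=True)[:k]

# Toy corpus of pre-computed embeddings (illustrative only)
index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.0, 0.0], index, k=2)
```

In the deployed service this function body would be a `faiss` index search exposed via a FastAPI route, with `uvicorn` as the server.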
## 📊 Dataset

You can access the complete dataset and model rollouts on Hugging Face:
* **Dataset**: [artillerywu/DeepResearch-9K](https://huggingface.co/datasets/artillerywu/DeepResearch-9K)
  * Contains **9,000 high-quality samples**.
  * Includes full **rollouts** generated by the **Tongyi-DeepResearch-30B-A3B** teacher model.
---
## 🚀 Supervised Fine-Tuning (SFT)
We provide scripts to perform SFT on popular 3B-parameter base models.
### 1. Base Models
The scripts are optimized for the following models available on Hugging Face:
* **Qwen2.5-3B**: [Qwen/Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B)
* **Llama-3.2-3B**: [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)
### 2. SFT Scripts
Use the corresponding Python scripts to start the training process:
* For Llama 3.2: `python sft_llama3b.py`
* For Qwen 2.5: `python sft_qwen3b.py`
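For readers unfamiliar with the preprocessing such scripts typically do, here is a hedged sketch of how one dataset sample might be serialized into a prompt/completion pair for SFT. The field names (`question`, `rollout`) and the template are illustrative assumptions, not the actual DeepResearch-9K schema or the scripts' real logic.

```python
# Hypothetical SFT sample formatting. Field names and template are
# assumptions for illustration only.
PROMPT_TEMPLATE = "### Question:\n{question}\n\n### Answer:\n"

def build_sft_example(sample):
    """Split one raw sample into a prompt and a completion.

    Keeping the boundary explicit matters because SFT loss is usually
    computed on the completion tokens only, with the prompt masked out.
    """
    prompt = PROMPT_TEMPLATE.format(question=sample["question"])
    return {"prompt": prompt, "completion": sample["rollout"]}

example = build_sft_example({"question": "What is 2+2?", "rollout": "4"})
```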
---