Yue Li‡,*, Qi Ma*, Runyi Yang, Mengjiao Ma, Bin Ren, Nikola Popovic, Nicu Sebe, Theo Gevers, Luc Van Gool, Danda Pani Paudel†, Martin R. Oswald†
‡ Project lead. * Equal contribution. † Equal supervision.
Chorus is a multi-teacher pretraining framework for 3D Gaussian scene encoding, which trains in end-to-end manner a hoslistic scene encoder with strong data efficiency. This release includes the Chorus multi-teacher language pretraining pipeline, the 2D adaptation pipeline, and standalone inference for both preprocessed 3DGS folders and raw 3DGS .ply inputs.
We provide two tested conda environments. env.yaml is with Python 3.10, PyTorch 2.5.1, and CUDA 12.4. env_cu126.yaml is the newer CUDA 12.6 environment with PyTorch 2.7.1. If you are using micromamba 2.6.0+, please set micromamba config set use_sharded_repodata false for solving the environment.
PIP_NO_BUILD_ISOLATION=0 conda env create -f env.yaml
conda activate chorusFor the CUDA 12.6 variant:
PIP_NO_BUILD_ISOLATION=0 conda env create -f env_cu126.yaml
conda activate chorusChorus 3DGS inference supports:
- standard Gaussian
.ply - compressed SuperSplat
.ply - preprocessed scene folder with Gaussian params in
color.npy, coord.npy, opacity.npy, quat.npy, scale.npy
Set the input and output paths once, then use the same OUTPUT_DIR for later PCA visualization and Mini Viewer as well:
export INPUT_ROOT=/path/to/processed_scene_or_ply
export OUTPUT_DIR=/path/to/chorus_outputs/scene_namepython -m tools.lang_inference \
--config chorus_3dgs \
--checkpoint lang-dino-enc-pretrain-scan-ppv2-mp3d-mcmc \
--input-root "$INPUT_ROOT" \
--output-dir "$OUTPUT_DIR"With optional args:
--pca_vis # optional PCA visualization on saved features; if omitted, PCA can be run separately with scripts/pca_colorize_features.py
--no_save # to skip saving the features
--disable-outlier-filter # Raw `.ply` inference applies the outlier filter by default, use this flag to disable the filter.Running point-cloud variant of Chorus for preprocessed folders with coord.npy, color.npy, and normal.npy:
python -m tools.lang_inference \
--config chorus_pts \
--checkpoint lang-dino-enc-pretrain-ppv2-mcmc-from-pts-params \
--input-root "$INPUT_ROOT" \
--output-dir "$OUTPUT_DIR"We provide a simple Mini Viewer for visualizing the original 3DGS scene and the saved language features after Chorus inference.
Install Mini Viewer into the chorus conda env from the commit:
python -m pip install "git+https://github.com/RunyiYang/Mini_Viewer.git@6c8e5c938844487319a92e19f952e76cd4eba847"Mini Viewer uses gsplat for the fast CUDA renderer. Run this once after installing Mini Viewer so the long gsplat JIT build happens before the first viewer session:
python -m tools.mini_viewer --precompile-gsplatAfter running Chorus inference, launch Mini Viewer on the original source scene and the saved language features. Use the same --output-dir from tools.lang_inference:
python -m tools.mini_viewer \
--input-root "$INPUT_ROOT" \
--output-dir "$OUTPUT_DIR"By default this resolves to the feature path <output-dir>/<scene>_lang_feat.pt and uses --feature-type siglip2.
For multiple inputs, scripts/run_inference.sh is the simple batch wrapper around tools.lang_inference. It is meant for input roots like root_dir/scene_a/*.npy, root_dir/scene_b/*.npy, ... or root_dir/a.ply, root_dir/b.ply, the output structure is <batch_output_root>/<scene_name>/....
bash scripts/run_inference.sh \
--input-root /path/to/batch_input_root \
--output-dir /path/to/batch_output_rootAdd --disable-outlier-filter if you want batch raw-Ply inference without pruning.
chorus_3dgsconfig supports preprocessed scene folders with*.npyfiles, raw standard Gaussian.ply, and raw compressed SuperSplat.ply.chorus_ptsconfig is npy-only and expectscoord.npy,color.npy, andnormal.npy.- Raw
.plyinference enables the outlier filter by default. Use--disable-outlier-filterif you want to keep all splats for inference.- Chorus supports inference on scenes with millions of splats. The GPU memory cost depends on the input splats number, density and the grid_size in the test_cfg. We measured around
22 GiBfor scenes with~500ksplats and around48 GiBfor~1.5Msplats with the default config.--pca_visruns PCA right after inference on the saved features. If you prefer a separate step, usescripts/pca_colorize_features.py.
Chorus also provides an inference-only package mode for users who want to use the 3DGS encoder directly in their code. Use install_package_deps.sh for package mode installation. It first installs the dependencies first, then installs the local chorus package.
bash scripts/package/install_package_deps.sh --track cu124
# or
bash scripts/package/install_package_deps.sh --track cu126Use cu124 for the env.yaml track (CUDA 12.4 / PyTorch 2.5.1) and cu126 for the env_cu126.yaml track (CUDA 12.6 / PyTorch 2.7.1). The exact versions are listed in the requirements files:
scripts/package/requirements-cu124.txt
scripts/package/requirements-cu126.txt
For optional Mini Viewer support in package mode, please add the --viewer flag:
bash scripts/package/install_package_deps.sh --track cu126 --viewerFor manual installation, please refer to the requirements file directly and install them before installing Chorus:
python -m pip install -r scripts/package/requirements-cu124.txt
python -m pip install -e . --no-deps
python -c "import torch, spconv, torch_scatter, flash_attn, chorus; print(torch.__version__, chorus.__version__)"
chorus-encode --helpThe default Python API returns features without writing files:
import chorus
encoder = chorus.load("chorus_3dgs")
out = encoder.encode("/path/to/processed_scene_or_ply")
lang = out.features["lang"] # input-aligned N x 1152 feature arraySupported package modes are chorus_3dgs and chorus_pts. By default, chorus.load("chorus_3dgs") resolves lang-dino-enc-pretrain-scan-ppv2-mp3d-mcmc from the Chorus Hugging Face model repo, while chorus.load("chorus_pts") resolves lang-dino-enc-pretrain-ppv2-mcmc-from-pts-params. Pass checkpoint="/path/to/model.pth" to load a local checkpoint instead.
Configurable outputs are lang, dino, backbone_upcast, and backbone_last:
encoder = chorus.load(
"chorus_3dgs",
outputs=("lang", "dino", "backbone_upcast", "backbone_last"),
)
out = encoder.encode("/path/to/scene.ply", output_dir="/path/to/features")lang, dino, and backbone_upcast are aligned to the input splat/point rows after any raw-Ply filtering. backbone_last is a structured low-resolution token from the last encoder layer, outputing with feat, coord, grid_coord, offset, and fragment_index; it is not input-row aligned.
Package CLI:
chorus-encode \
--mode chorus_3dgs \
--input-root "$INPUT_ROOT" \
--output-dir "$OUTPUT_DIR" \
--outputs langOptional Mini Viewer package command:
chorus-viewer --input-root "$INPUT_ROOT" --output-dir "$OUTPUT_DIR"The full Chorus pretraining data for the joint training is close to 20TB in storage. Given the sheer amount, we release the complete repos we used to obtain the data which cover the steps: start from the already released 3DGS *.npy datasets, 2D feature map extraction with SigLIP2/SAM2, OccamLGS language feature lifts, LUDVIG DINOv3/PE-Spatial feature lifts, then merge and chunk to the final training samples.
See pretraining_data.md for the detailed steps, repo links, split usage, and chunking commands.
The main Chorus configs are under configs/chorus/ for the LangPretrainerMultiTeacher class.
Train from scratch with:
python -m tools.train \
--config-file configs/chorus/concat_dataset/lang-dino-enc-pretrain-scan-ppv2-mp3d-mcmc.py \
--options save_path=exp_runs/lang_pretrainer/chorus/concat_dataset/lang-dino-enc-pretrain-scan-ppv2-mp3d-mcmc \
--num-gpus <n>Test only with a trained checkpoint, with the test data in the config:
python -m tools.train \
--config-file configs/chorus/concat_dataset/lang-dino-enc-pretrain-scan-ppv2-mp3d-mcmc.py \
--options test_only=True weight=<ckpt_path> save_path=<exp_path> \
--num-gpus <n>If needed to change the dataset root with CLI, please override the nested fields used by the dataloaders. For example, for the concat config:
python -m tools.train \
--config-file configs/chorus/concat_dataset/lang-dino-enc-pretrain-scan-ppv2-mp3d-mcmc.py \
--options \
data.train.datasets.0.data_root=/path/to/scannetpp_root \
data.val.0.data_root=/path/to/scannetpp_root \
...
save_path=exp_runs/lang_pretrainer/chorus/debug_run \
--num-gpus 1Run these commands from the repo root. python -m tools.lang_inference accepts the short aliases chorus_3dgs and chorus_pts. If --output-dir is omitted, both inference and PCA visualization write to <repo_root>/outputs. Inference is CUDA-only now.
Released checkpoints are hosted in the Chorus Hugging Face model repo. Please first request access on Hugging Face. After authenticating, the inference CLI accepts either a local path or one of these checkpoint names directly:
| Checkpoint name | Inference alias | Inputs | What it represents |
|---|---|---|---|
| lang-dino-enc-pretrain-scan-ppv2-mp3d-mcmc | chorus_3dgs |
3DGS params: color, opacity, quat, scale |
Main release model trained jointly with SigLIP2 language targets and DINO targets on ScanNet++, ScanNet, and Matterport3D 3DGS data. |
| lang-dino-enc-pretrain-ppv2-mcmc-from-pts-params | chorus_pts |
point params: coord, color, normal |
Point-parameter variant trained with the same language/DINO teacher setup on ScanNet++ data. |
Use chorus_3dgs for preprocessed 3DGS folders, standard Gaussian .ply, or compressed SuperSplat .ply. Use chorus_pts only for preprocessed folders with coord.npy, color.npy, and normal.npy.
If you want to convert raw .ply files into parameter folders first:
python scripts/preprocess_gs.py \
--input /path/to/raw_scene_or_dir \
--output /path/to/preprocessed_outputThe export script also supports compressed .ply, and it uses the same default outlier filter as inference. Add --disable-outlier-filter to export the raw splats without pruning.
If you already have a saved feature file and want to visualize it separately:
python scripts/pca_colorize_features.py \
--feature-path /path/to/saved_feature.pt \
--input-root /path/to/original_scene_or_ply \
--device cudaThis script writes a colored point cloud .ply, and it also writes a feat-vis 3DGS .ply when the source input provides opacity, scale, and quat. If --device is omitted, PCA defaults to CPU.
2D adaptation is the proposed workflow for finetuning Chorus on custom 3DGS data, without requiring preparing the teacher features in advance. It is registered as
LangPretrainerMultiTeacher2D and is not used by the
LangPretrainerMultiTeacher pretraining or inference path. The extra renderer
dependency is gsplat==1.4.0.
The optional PE-Spatial examples also need perception_models source
tree. Please use a local checkout instead and export
it before launching the configs using it:
git clone https://github.com/facebookresearch/perception_models.git /path/to/perception_models
export PERCEPTION_MODELS_ROOT=/path/to/perception_models
export PYTHONPATH=$PERCEPTION_MODELS_ROOT:$PYTHONPATHThe example config is the following one that adapts by the DINO teacher:
configs/chorus/2d_adaption/dino-only-enc-pretrain-interior-mcmc_lora.pyFor InteriorGS preprocessing, see scripts/interiorgs_process/README.md. We also provide a demo of processed data in mini_test.zip.
The expected per-scene input layout for 2D adaptation is:
<data_root>/<split>/<scene_id>/
├── color.npy
├── coord.npy
├── opacity.npy
├── quat.npy
├── scale.npy
├── segment.npy
├── render_filtered/ # preferred; render/ is fallback
├── transforms_camera_positions_filtered.json
├── visiable_gaussian_masks_per_frame_filtered_box_mask.npy
└── visiable_gaussian_masks_per_frame_filtered_pair_top4.npyFresh 2D adaptation should initialize from the released Chorus checkpoint via
model.backbone_path. Do not pass the base Chorus checkpoint through weight;
weight is only for resuming or evaluating an already adapted checkpoint.
Example:
python -m tools.train \
--config-file configs/chorus/2d_adaption/dino-only-enc-pretrain-interior-mcmc_lora.py \
--options \
model.backbone_path=/path/to/released_chorus/best_lang.pth \
data.train.data_root=/path/to/2d_adaptation_root \
data.val.data_root=/path/to/2d_adaptation_root \
save_path=exp_runs/lang_pretrainer/chorus/2d_adaption/dino-only \
--num-gpus 1Useful config fields:
resize_w,resize_h: rendering resolution used by the differentiable 2D supervision path.downsample_ratio_2D: supervision feature-map downsample ratio.teacher_2D_model: Hugging Face 2D teacher name; adjust projector output channels if changing teacher families.online_image: renders GS images online before 2D teacher encoding;Falseloads rendered image files.render_dir_nameandpair_top_k: dataset artifact selection, defaulting torender_filteredand4.
The 2D adaptation examples use the smaller 768-D DINOv3-B and SigLIP2-base
feature spaces. Therefore
configs/chorus/concat_dataset/lang-dino-enc-pretrain-scan-ppv2-mp3d-mcmc-eval-after-2d-adapt-w-lora.py
is eval-only for adapted LoRA checkpoints; do not use it to train on the
normal concat pretraining data, whose stored release targets are 1152-D
SigLIP2-SO400M language features and 1024-D DINO features. For lang
zero-shot evaluation, load a checkpoint adapted with a SigLIP2 teacher.
After LoRA-based adaptation, the adapted model can be probed with, e.g.:
configs/chorus/2d_adaption/semseg-gs-ptv3m2-interior-gs-2d-adapter-lin.pySet the adapted checkpoint as weight=<adapted_ckpt>.
@InProceedings{Li_2026_CVPR,
author = {Li, Yue and Ma, Qi and Yang, Runyi and Ma, Mengjiao and Ren, Bin and Popovic, Nikola and Sebe, Nicu and Gevers, Theo and Van Gool, Luc and Paudel, Danda Pani and Oswald, Martin R.},
title = {Chorus: Multi-Teacher Pretraining for Holistic 3D Gaussian Scene Encoding},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2026},
pages = {21431-21442}
}This codebase is developed based on the SceneSplat and Pointcept codebase. We sincerely thank the authors of Pointcept, OccamLGS, and LUDVIG for their contributions!
