Rotem Gatenyo, Ohad Fried
Reichman University
📄 Paper | 🌐 Project Page | 🏆 CVPR 2026
This repository provides tools to align two 3D objects using vision-language guidance and geometric constraints. Given two meshes and a text prompt describing their desired spatial relationship (e.g., "a hotdog sausage sits inside a bun"), the method optimizes the relative translation, rotation, and optionally scale of one object with respect to the other.
Given two input meshes and a text prompt, our method optimizes the relative pose between the objects through differentiable rendering and vision-language supervision. The optimization is regularized by geometric objectives that encourage physically plausible contact while discouraging interpenetration. Optimization proceeds in multiple phases, gradually increasing geometric constraints and focusing the cameras on the interaction region for fine-grained refinement.
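As a rough illustration of the pose parameters being optimized, the sketch below applies a rotation, translation, and isotropic scale to a list of vertices. It is not the repository's implementation: the function names `rotation_z` and `apply_pose`, and the restriction to a single-axis rotation, are assumptions made for brevity. In the actual method these parameters are updated from differentiable rendering and vision-language supervision rather than set by hand.

```python
import math

def rotation_z(angle_rad):
    """Return a 3x3 rotation matrix (nested lists, row-major) about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_pose(vertices, rotation, translation, scale=1.0):
    """Map each vertex v to scale * (R @ v) + t."""
    moved = []
    for v in vertices:
        rotated = [sum(rotation[i][j] * v[j] for j in range(3)) for i in range(3)]
        moved.append([scale * rotated[i] + translation[i] for i in range(3)])
    return moved

# Hypothetical example: rotate a vertex on the x-axis by 90 degrees about z,
# scale it by 2, then lift it by 1 along z.
verts = [[1.0, 0.0, 0.0]]
posed = apply_pose(verts, rotation_z(math.pi / 2), [0.0, 0.0, 1.0], scale=2.0)
print(posed)
```

Leaving `scale` at 1.0 corresponds to optimizing translation and rotation only.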
Recommended: Use the provided installation script. It performs a complete installation in a fresh Conda environment, validates all dependencies, and runs an example alignment to verify that everything is working correctly.
Run:
bash install.sh
The script automatically:
- Clones all required repositories
  - ObjectsAlignment
  - nvdiffrast
  - nvdiffmodeling
- Creates a fresh Conda environment
- Installs PyTorch and all required Python dependencies
- Installs nvdiffrast and PyTorch3D
- Configures nvdiffmodeling
- Verifies that all required libraries can be imported
- Runs the provided hotdog example to validate the installation
A successful installation should end with:
=====================================
INSTALLATION SUCCESSFUL
=====================================
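If the script fails before printing this banner, a quick way to narrow down the cause is to check which dependencies Python can locate. The snippet below is a minimal sketch of such a check, not part of the repository; the helper name `check_imports` is hypothetical, and the module names in `REQUIRED` are assumed to be the top-level import names of the main dependencies.

```python
import importlib.util

def check_imports(module_names):
    """Return {name: bool}: whether each top-level module can be found on the
    current Python path (the module is located, not actually imported)."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}

# Assumed top-level import names for the main dependencies.
REQUIRED = ["torch", "torchvision", "nvdiffrast", "pytorch3d"]

if __name__ == "__main__":
    for name, ok in check_imports(REQUIRED).items():
        print(f"{name:12s} {'ok' if ok else 'MISSING'}")
```

Run it inside the activated Conda environment so that the check reflects the packages installed there.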
Clone the repository:
git clone https://github.com/RotemGat/ObjectsAlignment.git
cd ObjectsAlignment
Clone the required external repositories:
git clone https://github.com/NVlabs/nvdiffrast.git
git clone https://github.com/RotemGat/nvdiffmodeling.git
Create and activate a Conda environment:
conda create -n align3d python=3.9 -y
conda activate align3d
Install PyTorch (example for CUDA 12.6):
pip install torch==2.7.1 torchvision==0.22.1 \
  --index-url https://download.pytorch.org/whl/cu126
Install Python dependencies:
pip install -r requirements.txt
Install nvdiffrast:
pip install -e nvdiffrast
Install PyTorch3D:
pip install --no-build-isolation \
  "git+https://github.com/facebookresearch/pytorch3d.git@stable"
Add nvdiffmodeling to your Python path:
export PYTHONPATH="$PWD/nvdiffmodeling:$PYTHONPATH"
Run the provided hotdog example:
python main.py --config configs/PairBench3D/hotdog.yaml
The run creates a timestamped workspace containing logs, rendered images, optimization checkpoints, and final aligned meshes.
The repository provides two example optimization modes:
Rigid alignment optimizes translation and rotation only:
python main.py --config configs/PairBench3D/hotdog.yaml
Alignment with scale optimizes translation, rotation, and isotropic scale.
In general, optimization with scale enabled is more challenging and typically benefits from using more optimization epochs/steps than rigid alignment.
Configuration files are located in:
configs/
All command-line arguments override values specified in the YAML configuration file.
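The override behavior can be pictured as overlaying explicitly passed command-line values on top of the values loaded from YAML. The sketch below shows one common way to do this with `argparse`; it is an illustration under assumptions, not the repository's code. The helper `merge_config` and the `--epochs` and `--icp_ratio` flags are hypothetical, and a plain dict stands in for the parsed YAML file.

```python
import argparse

def merge_config(yaml_values, cli_args):
    """Overlay CLI values on YAML values; CLI options left unset (None) are ignored."""
    merged = dict(yaml_values)
    for key, value in vars(cli_args).items():
        if value is not None:
            merged[key] = value
    return merged

# Hypothetical options, used only for illustration.
parser = argparse.ArgumentParser()
parser.add_argument("--icp_ratio", type=float, default=None)
parser.add_argument("--epochs", type=int, default=None)

yaml_values = {"icp_ratio": 0.3, "epochs": 100}    # stand-in for a parsed YAML file
cli_args = parser.parse_args(["--epochs", "200"])  # as if passed on the command line
print(merge_config(yaml_values, cli_args))
```

Defaulting every option to `None` lets the merge distinguish "not passed" from "passed", so YAML values survive unless a flag is given explicitly.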
Available options can be viewed with:
python main.py --help
The fractional soft-ICP attachment ratio can optionally be specified manually:
icp_ratio: 0.3
If icp_ratio is not provided, it is automatically estimated by the LLM based on the object pair and the text prompt.
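To give some intuition for what a fractional ratio can mean, the sketch below computes a nearest-neighbor distance averaged over only the closest fraction of source points. This is one plausible reading of a fractional ICP-style term and is an assumption: the function name `fractional_nn_loss`, the hard selection of the closest points, and the point-to-point distance are all illustrative, and the repository's soft formulation may differ.

```python
import math

def fractional_nn_loss(source_pts, target_pts, ratio):
    """Mean nearest-neighbor distance over the closest `ratio` fraction of source points."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    # Distance from each source point to its nearest target point, ascending.
    nearest = sorted(
        min(math.dist(p, q) for q in target_pts) for p in source_pts
    )
    keep = max(1, math.ceil(ratio * len(nearest)))
    return sum(nearest[:keep]) / keep

# Hypothetical toy data: four source points at increasing distance from one target point.
source = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 5.0], [0.0, 0.0, 9.0]]
target = [[0.0, 0.0, 0.0]]
print(fractional_nn_loss(source, target, ratio=0.5))  # averages the 2 closest points
print(fractional_nn_loss(source, target, ratio=1.0))  # averages all 4 points
```

Under this reading, a small ratio pulls only a small part of one object toward the other, while a ratio near 1 attracts nearly all of it.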
Logs, rendered images, checkpoints, and final aligned meshes are written to the generated workspace directory.
Useful outputs include:
log_objects_alignment.txt
tmp/final_meshes/
To validate a fresh installation:
bash test_install.sh
The validation script creates a clean environment, installs all required dependencies, verifies imports, and runs the hotdog example configuration.
If you find this repository useful, please cite:
@InProceedings{Gatenyo_2026_CVPR,
    author    = {Gatenyo, Rotem and Fried, Ohad},
    title     = {Copy-Transform-Paste: Zero-Shot Object-Object Alignment Guided by Vision-Language and Geometric Constraints},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {14936-14945}
}