Copy-Transform-Paste

Zero-Shot Object-Object Alignment Guided by Vision-Language and Geometric Constraints

Rotem Gatenyo, Ohad Fried

Reichman University

📄 Paper | 🌐 Project Page | 🏆 CVPR 2026

This repository provides tools to align two 3D objects using vision-language guidance and geometric constraints. Given two meshes and a text prompt describing their desired spatial relationship (e.g., "a hotdog sausage sits inside a bun"), the method optimizes the relative translation, rotation, and optionally scale of one object with respect to the other.

Pipeline Overview

Given two input meshes and a text prompt, our method optimizes the relative pose between the objects through differentiable rendering and vision-language supervision. The optimization is regularized by geometric objectives that encourage physically plausible contact while discouraging interpenetration. Optimization proceeds in multiple phases, gradually increasing geometric constraints and focusing the cameras on the interaction region for fine-grained refinement.
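
The phased schedule described above can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the `Phase` fields, `build_schedule`, `total_loss`, the linear ramp, and all default values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    steps: int
    geometry_weight: float  # weight on the contact / anti-penetration terms
    camera_zoom: float      # 1.0 = whole scene, larger = tighter on the interaction region


def build_schedule(num_phases=3, steps_per_phase=100,
                   max_geometry_weight=1.0, max_zoom=2.0):
    """Hypothetical schedule: ramp geometry weight and camera zoom linearly over phases."""
    phases = []
    for i in range(num_phases):
        # t goes from 0.0 in the first phase to 1.0 in the last one.
        t = i / (num_phases - 1) if num_phases > 1 else 1.0
        phases.append(Phase(
            name=f"phase_{i}",
            steps=steps_per_phase,
            geometry_weight=t * max_geometry_weight,
            camera_zoom=1.0 + t * (max_zoom - 1.0),
        ))
    return phases


def total_loss(vlm_loss, contact_loss, penetration_loss, phase):
    """Vision-language term plus geometric regularizers scaled by the current phase."""
    return vlm_loss + phase.geometry_weight * (contact_loss + penetration_loss)


if __name__ == "__main__":
    for phase in build_schedule():
        print(phase)
```

In this sketch the early phases rely almost entirely on the vision-language term, and the geometric terms only reach full weight in the final phase.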

Installation

Recommended: Use the provided installation script. It performs a complete installation in a fresh Conda environment, validates all dependencies, and runs an example alignment to verify that everything is working correctly.

Run:

bash install.sh

The script automatically:

- Clones all required repositories:
  - ObjectsAlignment
  - nvdiffrast
  - nvdiffmodeling
- Creates a fresh Conda environment
- Installs PyTorch and all required Python dependencies
- Installs nvdiffrast and PyTorch3D
- Configures nvdiffmodeling
- Verifies that all required libraries can be imported
- Runs the provided hotdog example to validate the installation

A successful installation should end with:

=====================================
INSTALLATION SUCCESSFUL
=====================================

Manual Installation

Clone the repository:

git clone https://github.com/RotemGat/ObjectsAlignment.git
cd ObjectsAlignment

Clone the required external repositories:

git clone https://github.com/NVlabs/nvdiffrast.git
git clone https://github.com/RotemGat/nvdiffmodeling.git

Create and activate a Conda environment:

conda create -n align3d python=3.9 -y
conda activate align3d

Install PyTorch (example for CUDA 12.6):

pip install torch==2.7.1 torchvision==0.22.1 \
    --index-url https://download.pytorch.org/whl/cu126

Install Python dependencies:

pip install -r requirements.txt

Install nvdiffrast:

pip install -e nvdiffrast

Install PyTorch3D:

pip install --no-build-isolation \
    "git+https://github.com/facebookresearch/pytorch3d.git@stable"

Add nvdiffmodeling to your Python path:

export PYTHONPATH="$PWD/nvdiffmodeling:$PYTHONPATH"
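
After a manual installation, a quick import check can confirm that the main libraries are visible to the interpreter. The snippet below is a small sketch for that purpose and is not part of the repository. The `REQUIRED` list is an assumption based on the packages installed in the steps above, and `missing_modules` is a hypothetical helper name.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages installed above.
REQUIRED = ["torch", "torchvision", "nvdiffrast", "pytorch3d"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated `align3d` environment. An empty result only shows that the packages are discoverable, not that their CUDA extensions work.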

Running Examples

Run the provided hotdog example:

python main.py --config configs/PairBench3D/hotdog.yaml

The run creates a timestamped workspace containing logs, rendered images, optimization checkpoints, and final aligned meshes.

Example Configurations

The repository provides two example optimization modes:

Rigid Alignment

Optimizes translation and rotation only.

python main.py --config configs/PairBench3D/hotdog.yaml

Scale-Enabled Alignment

Optimizes translation, rotation, and isotropic scale.

Optimizing scale in addition to translation and rotation is harder than rigid alignment and typically benefits from more optimization epochs/steps.
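
The two modes differ only in which pose parameters are free. The sketch below shows how a pose made of rotation, translation, and an optional isotropic scale could be applied to mesh vertices. It is written in plain Python for readability and assumes an axis-angle rotation with scale applied before rotation. The repository's actual parameterization and transform order may differ, and `rotate` and `apply_pose` are hypothetical names.

```python
import math


def rotate(point, unit_axis, angle):
    """Rotate a 3D point about a unit axis by `angle` radians (Rodrigues' formula)."""
    x, y, z = point
    ux, uy, uz = unit_axis
    c, s = math.cos(angle), math.sin(angle)
    dot = ux * x + uy * y + uz * z
    # Cross product: axis x point.
    cx, cy, cz = uy * z - uz * y, uz * x - ux * z, ux * y - uy * x
    return (
        x * c + cx * s + ux * dot * (1 - c),
        y * c + cy * s + uy * dot * (1 - c),
        z * c + cz * s + uz * dot * (1 - c),
    )


def apply_pose(vertices, axis, angle, translation, scale=1.0):
    """Scale, then rotate, then translate each vertex. scale=1.0 gives a rigid transform."""
    norm = math.sqrt(sum(a * a for a in axis))
    unit_axis = tuple(a / norm for a in axis)
    posed = []
    for v in vertices:
        scaled = tuple(scale * coord for coord in v)
        rx, ry, rz = rotate(scaled, unit_axis, angle)
        posed.append((rx + translation[0], ry + translation[1], rz + translation[2]))
    return posed


if __name__ == "__main__":
    verts = [(1.0, 0.0, 0.0)]
    print(apply_pose(verts, (0, 0, 1), math.pi / 2, (0.0, 0.0, 1.0)))             # rigid
    print(apply_pose(verts, (0, 0, 1), math.pi / 2, (0.0, 0.0, 1.0), scale=2.0))  # with scale
```

Rigid alignment corresponds to keeping `scale` fixed at 1.0. Scale-enabled alignment adds it as one more free parameter.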

Configuration

Configuration files are located in:

configs/

Command-line arguments take precedence over the corresponding values in the YAML configuration file.

Available options can be viewed with:

python main.py --help
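
The override behavior can be pictured with a small sketch like the one below. It is an illustration only and does not reproduce `main.py`. The `merge_config` helper and the `--icp_ratio` and `--epochs` flags are assumptions made for the example, so check `python main.py --help` for the real option names.

```python
import argparse


def merge_config(file_config, argv):
    """Overlay explicitly passed command-line values on top of the file configuration."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config")
    # Hypothetical flags; a default of None marks "not given on the command line".
    parser.add_argument("--icp_ratio", type=float, default=None)
    parser.add_argument("--epochs", type=int, default=None)
    args = parser.parse_args(argv)

    merged = dict(file_config)
    for key, value in vars(args).items():
        if key != "config" and value is not None:
            merged[key] = value
    return merged


if __name__ == "__main__":
    yaml_values = {"icp_ratio": 0.3, "epochs": 100}  # stand-in for a parsed YAML file
    print(merge_config(yaml_values, ["--epochs", "200"]))
```

Using `None` as the default lets the merge distinguish options that were passed from options that were left out, so only the former replace file values.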

ICP Ratio

The fractional soft-ICP attachment ratio can optionally be specified manually:

icp_ratio: 0.3

If icp_ratio is not provided, it is automatically estimated by the LLM based on the object pair and the text prompt.
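
To give some intuition for what a fractional ratio means, the sketch below shows one plausible reading: only the closest `icp_ratio` fraction of points on one object is pulled toward the other. This is an assumption for illustration, not the repository's loss. The function name is hypothetical, and the hard selection used here ignores whatever soft weighting the actual method applies.

```python
import math


def fractional_attachment_loss(source_points, target_points, icp_ratio):
    """Mean nearest-neighbor distance over the closest `icp_ratio` fraction of source points."""
    if not 0.0 < icp_ratio <= 1.0:
        raise ValueError("icp_ratio must be in (0, 1]")
    # Distance from each source point to its nearest target point, smallest first.
    nearest = sorted(
        min(math.dist(p, q) for q in target_points) for p in source_points
    )
    keep = max(1, math.ceil(icp_ratio * len(nearest)))
    return sum(nearest[:keep]) / keep


if __name__ == "__main__":
    source = [(0, 0, 0), (1, 0, 0), (5, 0, 0), (10, 0, 0)]
    target = [(0, 0, 0)]
    print(fractional_attachment_loss(source, target, icp_ratio=0.5))
    print(fractional_attachment_loss(source, target, icp_ratio=1.0))
```

Under this reading, a small ratio suits pairs where only a small part of one object should touch the other, while a ratio near 1.0 suits pairs where one object is mostly enclosed.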

Outputs

Logs, rendered images, checkpoints, and final aligned meshes are written to the generated workspace directory.

Useful outputs include:

log_objects_alignment.txt
tmp/final_meshes/
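
A short helper can locate the most recent run and list its final meshes. The sketch below assumes that each workspace contains `log_objects_alignment.txt` at its top level and that `tmp/final_meshes/` is relative to the workspace. The root directory to search and both function names are hypothetical.

```python
from pathlib import Path


def latest_workspace(root):
    """Return the most recently modified directory under `root` that holds a run log."""
    candidates = [p.parent for p in Path(root).rglob("log_objects_alignment.txt")]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def final_meshes(workspace):
    """List the files in the workspace's tmp/final_meshes/ directory, if it exists."""
    mesh_dir = Path(workspace) / "tmp" / "final_meshes"
    if not mesh_dir.is_dir():
        return []
    return sorted(p for p in mesh_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    workspace = latest_workspace(".")  # assumed search root
    if workspace is None:
        print("no workspace found")
    else:
        print("workspace:", workspace)
        for mesh in final_meshes(workspace):
            print("  ", mesh.name)
```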

Validation

To validate a fresh installation:

bash test_install.sh

The validation script creates a clean environment, installs all required dependencies, verifies imports, and runs the hotdog example configuration.

Citation

If you find this repository useful, please cite:

@InProceedings{Gatenyo_2026_CVPR,
    author    = {Gatenyo, Rotem and Fried, Ohad},
    title     = {Copy-Transform-Paste: Zero-Shot Object-Object Alignment Guided by Vision-Language and Geometric Constraints},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {14936-14945}
}
