csiro-funml/generativebo

Generative Bayesian Optimization

Code for the paper "Generative Bayesian Optimization: Generative Models as Acquisition Functions", accepted at the 14th International Conference on Learning Representations (ICLR 2026). The paper is available on OpenReview: https://openreview.net/forum?id=GBWkRRJrdu

Setup

Create and activate a Python virtual environment:

python3 -m venv .venv
source .venv/bin/activate

Use Python 3.12 as the interpreter for this environment. The exact versions of all other packages are pinned in requirements.txt.

Install the experiment dependencies inside the environment:

pip install -r requirements.txt

ALOHA Experiment

The try_aloha.py script runs GenBO and other methods on the POLI ALOHA text optimization problem.

Run the best-performing GenBO variant from the paper's ALOHA experiment:

python try_aloha.py \
  --solver genbo \
  --utility PI \
  --loss RobustPreferenceLoss \
  --reg-factor 0.01 \
  --lr 0.1 \
  --percentile-min 0.50 \
  --percentile-max 0.95 \
  --logdir aloha-paper

This matches the ALOHA text optimization setting from the paper: initial dataset size 64, batch size 8, 10 optimization rounds, mean-field categorical proposal, no informative prior, PI utility, robust preference loss, regularization factor 0.01, learning rate 0.1, and percentile schedule 0.50 to 0.95. The paper reports averages across five random seeds, so repeat the command with different --seed values to obtain comparable averaged results.
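As a quick check of the evaluation budget these settings imply, the sketch below adds the initial dataset to the per-round batches. It assumes the initial dataset counts toward the total; the function name is hypothetical and not part of the repository.

```python
def total_evaluations(init_size: int, batch_size: int, rounds: int) -> int:
    """Black-box evaluations: the initial dataset plus one batch per round."""
    return init_size + batch_size * rounds


# ALOHA setting from the paper: 64 initial points, batches of 8, 10 rounds.
aloha_budget = total_evaluations(init_size=64, batch_size=8, rounds=10)
print(aloha_budget)  # 144
```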

For example, to run five seeds:

for seed in 42 666 121 11 391; do
  python try_aloha.py \
    --solver genbo \
    --utility PI \
    --loss RobustPreferenceLoss \
    --reg-factor 0.01 \
    --lr 0.1 \
    --percentile-min 0.50 \
    --percentile-max 0.95 \
    --seed "$seed" \
    --logdir aloha-paper
done
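The same sweep can be driven from Python, which is convenient when a shell loop is not available. This is a minimal sketch that only reuses the flags shown above; the helper names and the dry_run switch are hypothetical and not part of the repository.

```python
import subprocess
import sys

SEEDS = [42, 666, 121, 11, 391]


def genbo_aloha_command(seed: int, logdir: str = "aloha-paper") -> list[str]:
    """Build the argument list for one GenBO ALOHA run."""
    return [
        sys.executable, "try_aloha.py",  # sys.executable keeps runs inside the active venv
        "--solver", "genbo",
        "--utility", "PI",
        "--loss", "RobustPreferenceLoss",
        "--reg-factor", "0.01",
        "--lr", "0.1",
        "--percentile-min", "0.50",
        "--percentile-max", "0.95",
        "--seed", str(seed),
        "--logdir", logdir,
    ]


def run_all(seeds=SEEDS, dry_run: bool = True) -> list[list[str]]:
    """Print each command, or execute it when dry_run is False."""
    commands = [genbo_aloha_command(seed) for seed in seeds]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)  # stop on the first failing run
    return commands


commands = run_all(dry_run=True)
```

Call run_all(dry_run=False) from the repository root to launch the runs one after another.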

To include the other ALOHA solvers in the same comparison directory, run them with the same seeds and --logdir:

for seed in 42 666 121 11 391; do
  python try_aloha.py --solver randmut --seed "$seed" --logdir aloha-paper
  python try_aloha.py --solver vsd --seed "$seed" --logdir aloha-paper
  python try_aloha.py \
    --solver cbas \
    --percentile-min 0.80 \
    --percentile-max 0.99 \
    --seed "$seed" \
    --logdir aloha-paper
done

The script writes:

  • a .log file with progress messages
  • a .json file containing the command-line settings
  • a .npz file containing evaluated sequences and scores
  • a .png plot of regret and observed values
  • for GenBO, additional loss .csv and .png files
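To check what a run saved without assuming any array names, the sketch below lists the arrays in a .npz file and reads a .json settings file using only the standard library. A .npz file is a zip archive with one .npy member per array. The file names, array names, and settings keys in the demo are hypothetical stand-ins, not the script's actual output.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def npz_array_names(npz_path) -> list[str]:
    """List array names in a .npz file (a zip archive of <name>.npy members)."""
    with zipfile.ZipFile(npz_path) as archive:
        return sorted(
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        )


def load_settings(json_path) -> dict:
    """Read the command-line settings saved next to a run."""
    return json.loads(Path(json_path).read_text())


# Stand-in files so the sketch runs without real experiment output.
# The file names, array names, and settings keys here are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    npz_path = Path(tmp) / "run.npz"
    with zipfile.ZipFile(npz_path, "w") as archive:
        archive.writestr("scores.npy", b"")
        archive.writestr("sequences.npy", b"")
    json_path = Path(tmp) / "run.json"
    json_path.write_text(json.dumps({"solver": "genbo", "seed": 42}))

    array_names = npz_array_names(npz_path)
    settings = load_settings(json_path)

print(array_names)
print(settings["solver"], settings["seed"])
```

With NumPy installed, numpy.load(path).files returns the same names, and indexing the loaded object by name returns the array itself.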

Ehrlich Experiment

The try_ehrlich.py script runs GenBO on the Ehrlich closed-form protein design benchmark. The paper uses sequence lengths 15, 32, and 64, initial dataset size 128, batch size 128, and 32 optimization rounds. In this script, 128 is already the default for both the batch size and the initial dataset size, so matching the paper's budget only requires adding --max-iter 32.

Run the best-performing GenBO variants from the paper's Ehrlich experiments:

python try_ehrlich.py \
  --sequence-length 15 \
  --max-iter 32 \
  --logdir ehrlich-paper
python try_ehrlich.py \
  --sequence-length 32 \
  --proposal tfm \
  --utility sEI \
  --loss ForwardKL \
  --reg-factor 0.01 \
  --use-exp \
  --max-iter 32 \
  --logdir ehrlich-paper
python try_ehrlich.py \
  --sequence-length 64 \
  --proposal tfm \
  --utility PI \
  --loss BalancedForwardKL \
  --use-exp \
  --max-iter 32 \
  --logdir ehrlich-paper

These commands correspond to the GenBO variants shown in Figure 2 of the paper, where M is the sequence length:

  • M = 15: GenBO-MF-EI-fKL
  • M = 32: GenBO-Tfm-sEI-fKL-r0p01-np-exp
  • M = 64: GenBO-Tfm-PI-bfKL-exp

To repeat one of these runs across five seeds, for example the M = 32 variant:

for seed in 42 666 121 11 391; do
  python try_ehrlich.py \
    --sequence-length 32 \
    --proposal tfm \
    --utility sEI \
    --loss ForwardKL \
    --reg-factor 0.01 \
    --use-exp \
    --max-iter 32 \
    --seed "$seed" \
    --logdir ehrlich-paper
done

Ehrlich results are written under a sequence-length subdirectory, for example ehrlich-paper/32/.

Regret Plots

Use plot_results.py to aggregate .npz files across seeds and generate the simple regret plots. It also writes a diversity plot and JSON summaries of final means, standard deviations, and method ranking.

Plot the ALOHA results:

python plot_results.py \
  --resultsdir aloha-paper \
  --ystar 5 \
  --trainsize 64 \
  --batchsize 8 \
  --maxiter 10 \
  --fileprefix aloha
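For readers unfamiliar with the metric, the sketch below computes simple regret in its standard form: the gap between the optimum (--ystar) and the best score observed so far. It assumes a maximization problem with scores stored in evaluation order, and it is not taken from plot_results.py, which may handle details differently. The function name and toy data are hypothetical.

```python
def simple_regret_per_round(scores, ystar, trainsize, batchsize, maxiter):
    """Simple regret after the initial data and after each round.

    Assumes scores are in evaluation order: the first `trainsize` entries
    are the initial dataset, followed by `maxiter` batches of `batchsize`.
    """
    expected = trainsize + batchsize * maxiter
    if len(scores) != expected:
        raise ValueError(f"expected {expected} scores, got {len(scores)}")

    best = max(scores[:trainsize])
    regrets = [ystar - best]
    for round_index in range(maxiter):
        start = trainsize + round_index * batchsize
        best = max(best, max(scores[start:start + batchsize]))
        regrets.append(ystar - best)
    return regrets


# Toy data: 4 initial scores, then 2 rounds of 2 evaluations each.
toy_scores = [1, 2, 0, 1, 2, 3, 1, 5]
regret = simple_regret_per_round(
    toy_scores, ystar=5, trainsize=4, batchsize=2, maxiter=2
)
print(regret)  # [3, 2, 0]
```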

Plot an Ehrlich sequence length:

python plot_results.py \
  --resultsdir ehrlich-paper/32 \
  --ystar 1 \
  --trainsize 128 \
  --batchsize 128 \
  --maxiter 32 \
  --fileprefix ehrlich32

Change --resultsdir and --fileprefix for the other Ehrlich sequence lengths, for example ehrlich-paper/15 with ehrlich15, or ehrlich-paper/64 with ehrlich64.
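To plot all three sequence lengths in one pass, a small helper can fill in --resultsdir and --fileprefix. This sketch only prints the commands and reuses the flags shown above; the helper name is hypothetical and not part of the repository.

```python
import sys


def ehrlich_plot_command(sequence_length: int, resultsroot: str = "ehrlich-paper") -> list[str]:
    """Build the plot_results.py argument list for one Ehrlich sequence length."""
    return [
        sys.executable, "plot_results.py",
        "--resultsdir", f"{resultsroot}/{sequence_length}",
        "--ystar", "1",
        "--trainsize", "128",
        "--batchsize", "128",
        "--maxiter", "32",
        "--fileprefix", f"ehrlich{sequence_length}",
    ]


plot_commands = [ehrlich_plot_command(length) for length in (15, 32, 64)]
for command in plot_commands:
    print(" ".join(command))
```

Passing each list to subprocess.run(command, check=True) would execute the plots instead of printing them.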

Common Options

Use --device cuda to run on a CUDA device, and --logdir <path> to choose the output directory. Use --help to see all command-line options:

python try_aloha.py --help
python try_ehrlich.py --help
python plot_results.py --help
