Skip to content

bloomberg/fun2spec

fun2spec

ℹ️ About

fun2spec is a new tool used to automatically infer code contracts for large C++ libraries.

In particular, fun2spec uses large language models (LLMs) to generate postcondition specifications for C++ functions.

Two case studies are presented: BDE (the Bloomberg Development Environment library) and BlazingMQ (an open source message queue solution developed at Bloomberg).

These projects contain several millions of lines of code and constitute great experiments showcasing what can be done with fun2spec.

We have documented fun2spec experiments in our paper: Fun2spec: Code Contract Synthesis At Scale, which has been accepted for publication at FSE 2026 in July.

💡 Installation

BDE Installations

To use fun2spec with Bloomberg's BDE codebase, the BDE CMake build system requires the following software to be preinstalled and configured on the system:

Clone the bde-tools and bde repositories and add bde-tools to your PATH:

 $ git clone https://github.com/bloomberg/bde-tools.git
 $ export PATH=$PWD/bde-tools/bin:$PATH      # add bde-tools to the 'PATH'
 $ git clone https://github.com/bloomberg/bde.git

Please see the complete build instructions for open source users.

Python Installation and Usage Instructions

We recommend creating a new conda/pipenv environment for installing the dependencies.

Click here for steps to set up a conda environment Instructions for installing miniconda in your machine are here.
conda create -n venv_name
conda activate venv_name
conda install pip

To install all the requirements for fun2spec:

pip install -r requirements.txt

To run the tool with CLI, run fun2spec/main.py with the following arguments:

Argument Type Description
source str Path to the C++ repository humaneval, fsc or paths to BDE and BlazingMQ
model_name str Hugging Face model ID; defaults to microsoft/Phi-3-mini-128k-instruct
gen_name str fun2spec or daikon. Defaults to fun2spec
return_type str The generation is performed on functions with provided return type; currently support int, pointer or all
use_cache bool If there are changes in the fun2spec or the repository to be analyzed, set use_cache to False. Defaults to True.
count int Number of functions for which the specifications are generated. Defaults to 50.
timeout int Timeout in seconds for each function. Defaults to 30s

For example, you can run the following command to run fun2spec on a BDE repository given ~/bde is the local path to the clone of the repository.

python3 fun2spec/main.py --source ~/bde  --return_type pointer --count 30 --use_cache True

📄 Results

Execution of fun2spec should generate a file data/results.csv. The CSV contains the following columns:

Column Name Description
Function Name The function name that is tested.
Postcondition LLM generated postcondition.
true Count of executions of function where the postcondition holds.
false Count of executions of function where the postcondition does not hold.

About

fun2spec uses large language models (LLMs) for generating postcondition specifications for C++ functions

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors