A template for getting started writing code using GGML.
- Simple function with linear layer added
- Export model weights to `.gguf` format
- Compare your Python and GGML code using tests
To use this template, follow these steps:
- Clone the repository:
  `git clone https://github.com/grazder/ggml_template.git --recursive`
- Navigate to the project directory:
  `cd ggml_template`
- Export model weights to `.gguf` format:
  `python weights_export/export_model_weights.py`
- Build the project:
  `mkdir build && cd build && cmake .. && make`
- Run the project:
  `./example/main`
- Run tests:
  `python -m pytest tests/test.py`
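After the export step, it can help to confirm that the file on disk is a GGUF file before debugging the C++ loader. The sketch below is not part of the template. It assumes the GGUF v2/v3 header layout (4-byte magic `GGUF`, `uint32` version, `uint64` tensor count, `uint64` metadata key/value count, all little-endian), and `read_gguf_header` is a hypothetical helper name. The path of the exported file is not fixed here; pass whatever `export_model_weights.py` wrote.

```python
import struct
import sys

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes the GGUF v2/v3 layout: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata key/value count, little-endian.
    """
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_gguf.py <path to the exported .gguf file>
    version, n_tensors, n_kv = read_gguf_header(sys.argv[1])
    print(f"GGUF v{version}: {n_tensors} tensors, {n_kv} metadata entries")
```

If the tensor count does not match the number of weights you exported, the problem is likely in the export script, not in the C++ loading code.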
- Export your model to GGUF format. Example in `weights_export/export_model_weights.py`.
- Load your GGUF file into C++ code. Example in `template.cpp` - `load_weigths` and `load_hparams` functions.
- Write inference code for your model. Example in `template.cpp` - `forward` and `compute`.
- Write a usage example. Example in `example/main.cpp`.
- Write Python bindings for your model. Example in `tests/bindings.cpp`.
- Write tests that compare the Python and C++ code. Example in `tests/test.py`.
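The last step, comparing Python and C++ outputs, comes down to running the same input through a reference implementation and through the bindings, then checking that the results agree within a floating-point tolerance. The following is a minimal standard-library sketch of that pattern for a single linear layer. It is not the code in `tests/test.py`: `linear_reference` and `assert_close` are hypothetical names, the tolerances are assumed values, and the `actual` vector is a hard-coded stand-in for what the compiled bindings would return.

```python
import math


def linear_reference(x, weight, bias):
    """Compute y = W x + b for one input vector in plain Python."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(weight, bias)
    ]


def assert_close(expected, actual, rel_tol=1e-5, abs_tol=1e-6):
    """Fail if two output vectors differ by more than the given tolerances."""
    assert len(expected) == len(actual), "output sizes differ"
    for i, (e, a) in enumerate(zip(expected, actual)):
        assert math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol), (
            f"mismatch at index {i}: expected {e}, got {a}"
        )


def test_linear_matches_reference():
    weight = [[1.0, 2.0], [3.0, 4.0]]
    bias = [0.5, -0.5]
    x = [1.0, -1.0]
    expected = linear_reference(x, weight, bias)
    # Stand-in value: in a real test this comes from the Python bindings
    # built around the GGML model, called with the same x.
    actual = [-0.5, -1.5]
    assert_close(expected, actual)
```

Comparing with a tolerance instead of exact equality matters because the two implementations may accumulate sums in a different order, and later FP16 or quantized weights will widen the gap further.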
- Basic FF example
- Python-CPP tests
- Add GGUF
- Make cleaning
- Try on real model
- Adapt template for real case usage
- Write comments
- Add argparse for `model.cpp`
- Support FP16
- Quantization (?)