bitloops-inference

bitloops-inference is a small Rust workspace that runs semantic-summary inference out of process for Bitloops. Bitloops launches the runtime as a child process, speaks a versioned line-delimited JSON protocol over stdin and stdout, and leaves all provider-specific HTTP, auth, parsing, and error handling inside this repository.

Workspace layout

bitloops-inference-protocol: shared protocol types, versioning, and JSON-line serialisation helpers.
bitloops-inference: config loading, CLI, provider registry, provider implementations, and the stdio runtime loop.

Why this exists

Bitloops core stays provider-agnostic. Adding or changing a summary provider only requires a new bitloops-inference release rather than a Bitloops release.

CLI

bitloops-inference run --config config.toml --profile openai_fast
bitloops-inference validate-config --config config.toml
bitloops-inference describe-profile --config config.toml --profile openai_fast

run reserves stdout strictly for line-delimited JSON protocol responses. Diagnostics and failures go to stderr.
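The stream split described above can be sketched as follows. This is a minimal illustration using only the standard library, not the repository's actual code; the helper names `write_protocol_line` and `write_diagnostic` and the `bitloops-inference:` prefix are hypothetical.

```rust
use std::io::{self, Write};

/// Writes one protocol response as a single line and flushes immediately,
/// so the parent process is never left waiting on a buffered response.
/// (Hypothetical helper, not taken from the repository.)
fn write_protocol_line<W: Write>(out: &mut W, json_line: &str) -> io::Result<()> {
    // A line-delimited protocol cannot carry raw line breaks inside a message.
    if json_line.contains('\n') || json_line.contains('\r') {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "protocol message must fit on one line",
        ));
    }
    out.write_all(json_line.as_bytes())?;
    out.write_all(b"\n")?;
    out.flush()
}

/// Writes a human-readable diagnostic to the diagnostics stream.
/// (Hypothetical helper; the prefix is an assumption.)
fn write_diagnostic<W: Write>(err: &mut W, message: &str) -> io::Result<()> {
    writeln!(err, "bitloops-inference: {}", message)
}
```

In a real binary the first helper would receive `io::stdout().lock()` and the second `io::stderr().lock()`, so that nothing other than protocol lines ever reaches stdout.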

Config

bitloops-inference reads the Bitloops daemon inference config. Text-generation profiles live under [inference.profiles.<name>] and reference a runtime from [inference.runtimes.<name>].

[inference.runtimes.bitloops_inference]
request_timeout_secs = 60

[inference.profiles.openai_fast]
task = "text_generation"
driver = "openai_chat_completions"
runtime = "bitloops_inference"
model = "gpt-4.1-mini"
base_url = "https://api.openai.com/v1/chat/completions"
api_key = "${OPENAI_API_KEY}"
temperature = "0.1"
max_output_tokens = 200

[inference.profiles.ollama_local]
task = "text_generation"
driver = "ollama_chat"
runtime = "bitloops_inference"
model = "qwen2.5-coder:14b"
base_url = "http://127.0.0.1:11434/api/chat"
temperature = "0.1"
max_output_tokens = 200

String fields support ${ENV_VAR} interpolation. Missing environment variables fail validation immediately. Non-text-generation profiles in the same daemon config are ignored by bitloops-inference.
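As a rough sketch of how ${ENV_VAR} interpolation with fail-fast validation could work, the function below expands placeholders from a map and returns an error for any missing variable. It is an assumption-laden illustration, not the repository's implementation: the function name `interpolate`, the error messages, and the handling of unterminated placeholders are all hypothetical.

```rust
use std::collections::HashMap;

/// Expands every `${NAME}` placeholder in `input` using `vars`.
/// Returns an error naming the variable when it is not present.
/// (Hypothetical helper; the real validation rules may differ.)
fn interpolate(input: &str, vars: &HashMap<String, String>) -> Result<String, String> {
    let mut output = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy the literal text that precedes the placeholder.
        output.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        // Assumption: a placeholder without a closing brace is an error.
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated placeholder in {:?}", input))?;
        let name = &after[..end];
        let value = vars
            .get(name)
            .ok_or_else(|| format!("missing environment variable {}", name))?;
        output.push_str(value);
        rest = &after[end + 1..];
    }
    // Copy whatever literal text follows the last placeholder.
    output.push_str(rest);
    Ok(output)
}
```

A caller could build the map once with `std::env::vars().collect::<HashMap<String, String>>()` and run every string field through `interpolate` during validation, which keeps the function easy to test without touching the process environment.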

The public Bitloops platform gateway has a dedicated bitloops_platform_chat driver. It defaults to the production Bitloops platform endpoint, and the Bitloops host can optionally provide base_url when a test or non-production override is needed:

[inference.runtimes.bitloops_inference]
request_timeout_secs = 300

[inference.profiles.platform_summary]
task = "text_generation"
driver = "bitloops_platform_chat"
runtime = "bitloops_inference"
model = "ministral-3-3b-instruct"
api_key = "${BITLOOPS_PLATFORM_GATEWAY_TOKEN}"
temperature = "0.1"
max_output_tokens = 200

If base_url is omitted, bitloops-inference uses https://platform.bitloops.net/v1/chat/completions. When base_url is present, it overrides that default for the selected profile.
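The fallback rule above amounts to a one-line resolution step. The sketch below assumes the profile's base_url is held as an optional string; the constant and function names are hypothetical.

```rust
/// Default endpoint stated in this README for the `bitloops_platform_chat` driver.
const DEFAULT_PLATFORM_URL: &str = "https://platform.bitloops.net/v1/chat/completions";

/// Returns the profile's `base_url` when present, otherwise the default.
/// (Hypothetical helper, shown only to illustrate the override rule.)
fn resolve_platform_url(base_url: Option<&str>) -> &str {
    base_url.unwrap_or(DEFAULT_PLATFORM_URL)
}
```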

Supported drivers

openai_chat_completions
bitloops_platform_chat
ollama_chat

All three drivers normalise their outputs into one canonical inference response with text, optional parsed_json, optional token usage, finish reason, provider name, and model name.

How Bitloops calls it

Start the runtime once for a selected profile.
Send JSON requests over stdin, one line per request.
Read one JSON response line per request from stdout.
Send shutdown when the session is finished.

Example request stream:

{"request_id":"1","type":"describe"}
{"request_id":"2","type":"infer","system_prompt":"You write terse semantic summaries.","user_prompt":"Summarise this diff.","response_mode":"json_object","temperature":0.1,"max_output_tokens":200}
{"request_id":"3","type":"shutdown"}

Example responses:

{"request_id":"1","type":"describe","protocol_version":1,"runtime_name":"bitloops-inference","runtime_version":"0.1.2","profile_name":"openai_fast","provider":{"kind":"openai_chat_completions","provider_name":"openai","model_name":"gpt-4.1-mini","endpoint":"https://api.openai.com/v1/chat/completions","capabilities":{"response_modes":["text","json_object"],"usage_reporting":true}}}
{"request_id":"2","type":"infer","text":"{\"summary\":\"Adds provider isolation\",\"confidence\":0.92}","parsed_json":{"summary":"Adds provider isolation","confidence":0.92},"usage":{"prompt_tokens":120,"completion_tokens":24,"total_tokens":144},"finish_reason":"stop","provider_name":"openai","model_name":"gpt-4.1-mini"}
{"request_id":"3","type":"shutdown"}
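From the caller's side, the session above could be driven roughly as follows. This is a minimal sketch using only the standard library, not the Bitloops host's actual code: the function names are hypothetical, the binary is assumed to be on PATH as `bitloops-inference`, and responses are returned as raw lines rather than parsed.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::process::{Child, Command, Stdio};

/// Sends one request line and blocks until one response line arrives.
fn exchange<W: Write, R: BufRead>(
    to_runtime: &mut W,
    from_runtime: &mut R,
    request: &str,
) -> io::Result<String> {
    to_runtime.write_all(request.as_bytes())?;
    to_runtime.write_all(b"\n")?;
    to_runtime.flush()?;
    let mut response = String::new();
    // Zero bytes read means the runtime closed stdout before answering.
    if from_runtime.read_line(&mut response)? == 0 {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "runtime closed stdout",
        ));
    }
    Ok(response.trim_end().to_string())
}

/// Starts the runtime once for a profile, with stdin and stdout piped.
/// Diagnostics on stderr are passed through to the parent unchanged.
fn spawn_runtime(config: &str, profile: &str) -> io::Result<Child> {
    Command::new("bitloops-inference")
        .arg("run")
        .arg("--config")
        .arg(config)
        .arg("--profile")
        .arg(profile)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::inherit())
        .spawn()
}

/// Sends `describe`, then `shutdown`, and returns the describe response line.
fn describe_then_shutdown(config: &str, profile: &str) -> io::Result<String> {
    let mut child = spawn_runtime(config, profile)?;
    let mut stdin = child.stdin.take().expect("stdin was piped");
    let mut stdout = BufReader::new(child.stdout.take().expect("stdout was piped"));
    let described = exchange(
        &mut stdin,
        &mut stdout,
        r#"{"request_id":"1","type":"describe"}"#,
    )?;
    exchange(
        &mut stdin,
        &mut stdout,
        r#"{"request_id":"3","type":"shutdown"}"#,
    )?;
    // Close our end of the pipe before waiting for the child to exit.
    drop(stdin);
    child.wait()?;
    Ok(described)
}
```

Keeping `exchange` generic over `Write` and `BufRead` means the request and response framing can be tested against in-memory buffers, without spawning the runtime.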

Running manually

Run config validation first:

cargo run -p bitloops-inference -- validate-config --config ./bitloops-daemon-config.toml

Describe a profile:

cargo run -p bitloops-inference -- describe-profile --config ./bitloops-daemon-config.toml --profile ollama_local

Start the stdio runtime:

cargo run -p bitloops-inference -- run --config ./bitloops-daemon-config.toml --profile ollama_local

You can then write protocol lines to stdin manually or from another process.

Testing

The test suite avoids live network calls. Provider integrations use mocked HTTP servers and the stdio loop is exercised through spawned child-process tests.

cargo nextest run
cargo dev-clippy

CI and releases

GitHub Actions runs a lean hosted-runner CI pipeline for formatting, clippy, nextest, and native release-build smoke checks on Linux, macOS, and Windows.

Tagged releases are published from v* tags. The release workflow builds packaged artefacts for:

aarch64-apple-darwin
x86_64-apple-darwin
x86_64-unknown-linux-musl
aarch64-unknown-linux-musl
x86_64-pc-windows-msvc
aarch64-pc-windows-msvc

macOS signing and notarisation use the same secret and variable names as the main Bitloops repository:

Secrets: APPLE_CERT_P12_BASE64, APPLE_CERT_PASSWORD, APPSTORE_CONNECT_API_KEY_P8_BASE64
Variables: APPLE_SIGNING_IDENTITY, APPSTORE_CONNECT_KEY_ID, APPSTORE_CONNECT_ISSUER_ID

Optional release notification:

Secret: SLACK_WEBHOOK_URL

Future work

Possible later provider families include anthropic_messages and other explicit provider integrations. v1 deliberately avoids a generic mapping DSL, streaming, batching, local in-process model serving, and runtime orchestration.
