I build the infrastructure that makes LLM agents reliable in production: inference serving, MCP tool layers, multi-agent orchestration, and eval/observability.
Every tool call and LLM span is traced, guardrails gate actions, and eval feedback closes the loop, so agents run as observable, cost-bounded systems on a cloud-native substrate.
- Bison: enterprise GPU billing and multi-tenant platform
- Inference Cookbook: deep dives into inference frameworks
- Cloud Native Cookbook: deep dives into cloud-native engineering



