SuperMarioYL/README.md
Leo — AI systems, made to run in production.

I build the infra that makes LLM agents reliable in production — inference serving, MCP tool layers, multi-agent orchestration, and eval/observability.

How my agents run

User → Orchestrator → Tools/MCP + Memory/RAG → Inference, on a Cloud Native AI substrate, instrumented by Eval & Observability

Every tool call and LLM span is traced; guardrails gate actions; eval feedback closes the loop — agents as observable, cost-bounded systems on a cloud-native substrate.
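A minimal sketch of what "every tool call is traced; guardrails gate actions" could look like in code. The names `Tracer`, `Span`, and `guarded_call` are hypothetical, chosen for illustration only, and are not taken from any repository listed here; a production setup would more likely use an established tracing library.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class Span:
    """One traced tool call (hypothetical structure)."""
    name: str
    duration_s: float
    allowed: bool


@dataclass
class Tracer:
    """Collects spans in memory; a real system would export them."""
    spans: List[Span] = field(default_factory=list)


def guarded_call(
    tracer: Tracer,
    name: str,
    tool: Callable[..., Any],
    allow: Callable[[str, Tuple[Any, ...]], bool],
    *args: Any,
) -> Any:
    """Run `tool` only if the guardrail `allow` approves it.

    A span is recorded whether or not the call was allowed, so blocked
    actions remain visible in the trace.
    """
    start = time.perf_counter()
    allowed = allow(name, args)
    result = tool(*args) if allowed else None
    tracer.spans.append(Span(name, time.perf_counter() - start, allowed))
    return result


# Example guardrail (assumption): block any tool named "delete_db".
def deny_destructive(name: str, args: Tuple[Any, ...]) -> bool:
    return name != "delete_db"


tracer = Tracer()
total = guarded_call(tracer, "add", lambda a, b: a + b, deny_destructive, 2, 3)
blocked = guarded_call(tracer, "delete_db", lambda: "dropped", deny_destructive)
```

Here `total` is the tool's return value, `blocked` is `None` because the guardrail refused the call, and `tracer.spans` holds one entry per attempt, which is the kind of record an eval loop could consume.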

Capabilities

AI Agent: plan→act→reflect loop · Cloud Native: scheduler + pods · Inference: lower latency, higher throughput

Tech stack

Tech stack grouped by pillar: AI Agent, Cloud Native AI, Inference

Journey

From infrastructure to agents: Cloud Native AI → Inference Acceleration → AI Agent

Selected work


Let's build reliable AI systems together · blog.lei6393.com


Pinned

  1. trouve (Public)

    trouve: a general-purpose, built-in component for Spring projects that integrates service discovery, service registration, and service forwarding

    Java 30 9

  2. Bison (Public)

    Enterprise GPU Resource Billing & Multi-Tenant Management Platform

    TypeScript 7

  3. inference-cookbook (Public)

    inference cookbook / an analysis of how inference frameworks work

    HTML 5