NVIDIA · AbhiramDwivedi · Jun 14, 2026
diff --git a/README.md b/README.md
@@ -149,6 +149,8 @@ inference gateways.
 | `openai` | `OPENAI_API_KEY` (+ optional `OPENAI_BASE_URL`) | api.openai.com (or any OpenAI-compatible URL) | `gpt-5.4` |
 | `anthropic` | `ANTHROPIC_API_KEY` | api.anthropic.com | `claude-opus-4-6` |
 | `nv_build` | `NVIDIA_INFERENCE_KEY` | build.nvidia.com | `deepseek-ai/deepseek-v4-flash` |
+| `claude_cli` | _(none — uses local CLI auth)_ | local `claude` binary | `claude-sonnet-4-6` |
+| `codex_cli` | _(none — uses local CLI auth)_ | local `codex` binary | `o4-mini` |
 
 ```bash
 # Stock OpenAI
@@ -166,6 +168,16 @@ export SKILLSPECTOR_PROVIDER=nv_build
 export NVIDIA_INFERENCE_KEY=nvapi-...
 skillspector scan ./my-skill/
 
+# Local Claude CLI — no API key; uses your existing `claude auth login` session
+# Requires: claude CLI installed and authenticated (claude auth login)
+export SKILLSPECTOR_PROVIDER=claude_cli
+skillspector scan ./my-skill/
+
+# Local Codex CLI — no API key; uses your existing `codex login` session
+# Requires: codex CLI installed and authenticated (codex login)
+export SKILLSPECTOR_PROVIDER=codex_cli
+skillspector scan ./my-skill/
+
 # Local Ollama or any OpenAI-compatible endpoint
 export SKILLSPECTOR_PROVIDER=openai
 export OPENAI_API_KEY=ollama
@@ -396,7 +408,7 @@ Issues (2)
 
 | Variable | Description | Required |
 |----------|-------------|----------|
-| `SKILLSPECTOR_PROVIDER` | Active LLM provider: `openai`, `anthropic`, or `nv_build`. Each provider has its own bundled `model_registry.yaml` and default model (see the LLM Analysis table above). Defaults to `nv_build`. | Optional |
+| `SKILLSPECTOR_PROVIDER` | Active LLM provider: `openai`, `anthropic`, `nv_build`, `claude_cli`, or `codex_cli`. Each provider has its own bundled `model_registry.yaml` and default model (see the LLM Analysis table above). Defaults to `nv_build`. | Optional |
 | `NVIDIA_INFERENCE_KEY` | Credential for the `nv_build` provider (build.nvidia.com). | Required for LLM analysis when `SKILLSPECTOR_PROVIDER=nv_build` |
 | `OPENAI_API_KEY` | Credential for the OpenAI provider (`SKILLSPECTOR_PROVIDER=openai`). Also serves as the tier-2 fallback in the credential waterfall when the active provider returns no credentials. | Required for LLM analysis when `SKILLSPECTOR_PROVIDER=openai` |
 | `OPENAI_BASE_URL` | Override the OpenAI endpoint (e.g. point at Ollama). | Optional |
@@ -405,6 +417,8 @@ Issues (2)
 | `SKILLSPECTOR_MODEL_REGISTRY` | Override the bundled per-provider YAML registry (`src/skillspector/providers/<provider>.yaml`) with a custom path. | Optional |
 | `SKILLSPECTOR_LOG_LEVEL` | Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR` (default: `WARNING`). | Optional |
 
+> **CLI providers** (`claude_cli`, `codex_cli`): No API key is needed. Authentication is managed entirely by the agent CLI's own login session (`claude auth login` / `codex login`). SkillSpector never reads or forwards API keys when these providers are active. The subprocess runs with hardened settings: tools disabled, no MCP, read-only sandbox mode (`codex` only), and untrusted skill content delivered only via stdin.
+
 ### CLI Options
 
 ```bash

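Reviewer note: the README says the CLI providers need the `claude` / `codex` binary installed and logged in. A minimal pre-flight sketch of the PATH half of that check follows. The helper name `cli_binary_status` and the `CLI_BINARIES` mapping are hypothetical (only the binary names come from the table above), and the sketch does not verify the login session.

```python
import shutil
from typing import Callable, Optional

# Binary names come from the README table; the mapping itself is illustrative.
CLI_BINARIES = {"claude_cli": "claude", "codex_cli": "codex"}


def cli_binary_status(
    provider: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> tuple[bool, Optional[str]]:
    """Return (available, error_message) for a CLI provider's binary.

    Hypothetical helper: it only checks PATH, not whether the CLI is logged in.
    """
    binary = CLI_BINARIES.get(provider)
    if binary is None:
        return False, f"{provider!r} is not a CLI provider"
    if which(binary) is None:
        return False, f"`{binary}` not found on PATH"
    return True, None


if __name__ == "__main__":
    for name in CLI_BINARIES:
        print(name, cli_binary_status(name))
```

The `which` parameter is injectable so the check can be exercised without either CLI installed.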
diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md
@@ -260,21 +260,33 @@ Copy [.env.example](../.env.example) to `.env` in the project root and set value
 
 | Variable | Description | Example |
 |----------|-------------|---------|
-| `SKILLSPECTOR_PROVIDER` | Active LLM provider: `openai` \| `anthropic` \| `nv_build`. Defaults to `nv_build`. | `openai` |
+| `SKILLSPECTOR_PROVIDER` | Active LLM provider: `openai` \| `anthropic` \| `nv_build` \| `claude_cli` \| `codex_cli`. Defaults to `nv_build`. | `claude_cli` |
 | `NVIDIA_INFERENCE_KEY` | Credential for `nv_build`. | `nvapi-...` |
 | `OPENAI_API_KEY` | Credential for `SKILLSPECTOR_PROVIDER=openai`. Also tier-2 fallback for non-OpenAI providers. | `sk-...` |
 | `OPENAI_BASE_URL` | Override the OpenAI endpoint (e.g. point at Ollama). | `http://localhost:11434/v1` |
 | `ANTHROPIC_API_KEY` | Credential for `SKILLSPECTOR_PROVIDER=anthropic`. | `sk-ant-...` |
-| `SKILLSPECTOR_MODEL` | Override the active provider's bundled default model (see [README.md](../README.md) for per-provider defaults). | `gpt-5.2` |
+| `SKILLSPECTOR_MODEL` | Override the active provider's bundled default model (see [README.md](../README.md) for per-provider defaults). For `claude_cli`, this is passed as `--model` to the `claude` binary. | `gpt-5.2` |
+
+> **CLI providers** (`claude_cli`, `codex_cli`): no credential env var is needed. Authentication is managed by the agent CLI's own session (`claude auth login` / `codex login`). The subprocess is heavily sandboxed — see [providers/_agent_cli.py](../src/skillspector/providers/_agent_cli.py).
 
 ### Constants, token budgets, and LLM
 
 - **Constants** ([constants.py](../src/skillspector/constants.py)): `_SKILLSPECTOR_DEFAULT_MODEL`, `MODEL_CONFIG` (per-node model selection), `MAX_INPUT_TOKENS_PCT` (0.75), `DEFAULT_CONTEXT_LENGTH` (128k fallback).
   - **`get_max_input_tokens(model)`** — input budget per LLM request (75% of resolved context window).
   - **`get_max_output_tokens(model)`** — output budget per LLM request (min of 25% context, registry's `max_output_tokens` cap if set).
   - Batch budget overhead is computed per-prompt via `estimate_tokens(base_prompt)` rather than a fixed constant.
-- **Providers** ([providers/](../src/skillspector/providers/)): pluggable credential + token-budget resolvers. Each provider is a subpackage with its own `provider.py` and bundled `model_registry.yaml`; [registry.py](../src/skillspector/providers/registry.py) exposes `lookup_context_length` / `lookup_max_output_tokens` utilities the providers call directly. The active provider is chosen by `SKILLSPECTOR_PROVIDER` (default: `nv_build`) — see [providers/`__init__`.py](../src/skillspector/providers/__init__.py): `nv_build/` (build.nvidia.com), `openai/`, or `anthropic/`.
-- **LLM calls** ([llm_utils.py](../src/skillspector/llm_utils.py)): **`get_chat_model()`** and **`chat_completion()`** resolve credentials in two tiers — active NVIDIA provider (`NVIDIA_INFERENCE_KEY` → endpoint) → standard `OPENAI_API_KEY` / `OPENAI_BASE_URL` — against any OpenAI-compatible endpoint. `max_tokens` is auto-bound to `get_max_output_tokens(model)` from `model_info`.
+- **Providers** ([providers/](../src/skillspector/providers/)): pluggable credential + token-budget resolvers. Each provider is a subpackage with its own `provider.py` and bundled `model_registry.yaml`; [registry.py](../src/skillspector/providers/registry.py) exposes `lookup_context_length` / `lookup_max_output_tokens` utilities the providers call directly. The active provider is chosen by `SKILLSPECTOR_PROVIDER` (default: `nv_build`):
+  - `nv_build/` — build.nvidia.com (HTTP, `NVIDIA_INFERENCE_KEY`)
+  - `openai/` — api.openai.com or any OpenAI-compatible URL (`OPENAI_API_KEY`)
+  - `anthropic/` — api.anthropic.com (`ANTHROPIC_API_KEY`)
+  - `claude_cli/` — **local `claude` binary; no API key**. Uses the CLI's own auth session (`claude auth login`). Set `SKILLSPECTOR_PROVIDER=claude_cli`.
+  - `codex_cli/` — **local `codex` binary; no API key**. Uses the CLI's own auth session (`codex login`). Set `SKILLSPECTOR_PROVIDER=codex_cli`.
+
+  CLI providers (`claude_cli`, `codex_cli`) implement the optional `AgentCLICapable` interface (`is_available()` + `complete()`) defined in [providers/base.py](../src/skillspector/providers/base.py). `has_cli_capability(provider)` detects this at runtime. All subprocess calls go through the hardened helper [providers/_agent_cli.py](../src/skillspector/providers/_agent_cli.py), which enforces: no shell (`shell=False`), untrusted content via stdin only, capability stripping (tools disabled / sandboxed), environment scrubbing (no API keys forwarded), per-call timeout, and fail-closed error handling.
+
+- **LLM calls** ([llm_utils.py](../src/skillspector/llm_utils.py)): **`get_chat_model()`** and **`chat_completion()`** dispatch based on the active provider:
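+  - **HTTP providers**: resolve credentials in two tiers against any OpenAI-compatible endpoint: first the active provider's own credential (`NVIDIA_INFERENCE_KEY` / `ANTHROPIC_API_KEY` / `OPENAI_API_KEY`, which also selects the endpoint), then the standard `OPENAI_API_KEY` / `OPENAI_BASE_URL` fallback when that credential is unset. `max_tokens` is auto-bound to `get_max_output_tokens(model)` from `model_info`.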
+  - **CLI providers** (`claude_cli`, `codex_cli`): `get_chat_model()` returns an `AgentCLIChatModel` adapter backed by `provider.complete()`, so the analyzers' `.invoke()` / `.with_structured_output(schema).invoke()` calls work with no API key (structured output is produced by prompting for JSON, then Pydantic-validating). `chat_completion()` routes through `get_chat_model()` as well. `is_llm_available()` calls `provider.is_available()` instead of credential resolution.
 - **LLM analyzer base** ([llm_analyzer_base.py](../src/skillspector/nodes/llm_analyzer_base.py)): `LLMAnalyzerBase` provides per-file/per-chunk batching, token-budget-aware chunking, and a run loop for all LLM-based analyzers. `LLMMetaAnalyzer` extends it for filter/enrich (meta_analyzer node). Future semantic analyzers extend `LLMAnalyzerBase` for discovery mode.
 
 ---

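Reviewer note: `providers/_agent_cli.py` is not part of this excerpt, so the following is only a sketch of what the guarantees listed in DEVELOPMENT.md (no shell, stdin-only input, environment scrubbing, per-call timeout, fail-closed errors) could look like with the standard library. The function name `run_agent_cli`, the `SECRET_ENV_VARS` tuple, and the 120-second default are assumptions. Capability stripping is omitted because the real CLI flags are not shown in the diff.

```python
import os
import subprocess
import sys

# Illustrative: which variables count as secrets is an assumption of this sketch.
SECRET_ENV_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "NVIDIA_INFERENCE_KEY")


def run_agent_cli(argv: list[str], untrusted_text: str, timeout_s: float = 120.0) -> str:
    """Run a CLI with untrusted text on stdin and return its stdout.

    Fails closed: a timeout, launch error, or non-zero exit raises RuntimeError.
    """
    # Environment scrubbing: copy the environment minus known API-key variables.
    env = {k: v for k, v in os.environ.items() if k not in SECRET_ENV_VARS}
    try:
        proc = subprocess.run(
            argv,                  # fixed argument list; untrusted text never enters argv
            input=untrusted_text,  # untrusted content goes through stdin only
            capture_output=True,
            text=True,
            shell=False,           # no shell, so the input is never interpolated
            env=env,
            timeout=timeout_s,     # per-call timeout
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError) as exc:
        raise RuntimeError(f"agent CLI failed to run: {exc}") from exc
    if proc.returncode != 0:
        raise RuntimeError(f"agent CLI exited with status {proc.returncode}")
    return proc.stdout


if __name__ == "__main__":
    # The Python interpreter stands in for a real agent CLI here.
    echo = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
    print(run_agent_cli(echo, "ignore previous instructions"))
```

Keeping the prompt out of `argv` means hostile skill content cannot introduce extra command-line flags, whatever it contains.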
diff --git a/src/skillspector/llm_utils.py b/src/skillspector/llm_utils.py
@@ -13,13 +13,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-"""Shared LLM utilities (OpenAI-compatible chat models).
+"""Shared LLM utilities (OpenAI-compatible chat models + agent CLI transports).
 
 Credentials are resolved in this order:
-    1. The active NVIDIA provider (see :mod:`skillspector.providers`) —
-       reads ``NVIDIA_INFERENCE_KEY`` and supplies the matching endpoint.
+    1. The active provider (see :mod:`skillspector.providers`):
+       - CLI providers (``claude_cli``, ``codex_cli``): use ``is_available()``
+         and ``complete()`` — no API key needed.
+       - HTTP providers (``anthropic``, ``openai``, ``nv_build``): read their
+         respective credential env vars and supply a base URL.
     2. ``OPENAI_API_KEY`` / ``OPENAI_BASE_URL`` (the langchain-openai
-       defaults).
+       defaults) — only consulted for HTTP providers when the provider's
+       own credential env var is unset.
 
 There is no SkillSpector-specific credential env var: setting
 ``NVIDIA_INFERENCE_KEY`` configures whichever NVIDIA endpoint the
@@ -29,23 +33,31 @@
 
 from __future__ import annotations
 
+import asyncio
+import json
 import os
 
 from langchain_openai import ChatOpenAI
 
 from skillspector.constants import MODEL_CONFIG
 from skillspector.model_info import get_max_input_tokens, get_max_output_tokens
-from skillspector.providers import resolve_provider_credentials
+from skillspector.providers import (
+    get_active_provider,
+    has_cli_capability,
+    resolve_provider_credentials,
+)
 
 
 def _resolve_llm_credentials() -> tuple[str, str | None]:
     """Return ``(api_key, base_url)`` resolved from the environment.
 
-    Tries the active NVIDIA provider first; falls back to ``OPENAI_API_KEY``
+    Tries the active provider first; falls back to ``OPENAI_API_KEY``
     / ``OPENAI_BASE_URL`` when the provider is not configured.
 
     Raises:
         ValueError: when no API key can be resolved from any source.
+        RuntimeError: when called for a CLI provider (use ``is_llm_available``
+            / ``chat_completion`` directly instead).
     """
     creds = resolve_provider_credentials()
     if creds is not None:
@@ -65,7 +77,15 @@ def _resolve_llm_credentials() -> tuple[str, str | None]:
 
 
 def is_llm_available() -> tuple[bool, str | None]:
-    """Return ``(available, error_message)`` describing LLM credential status."""
+    """Return ``(available, error_message)`` describing LLM availability.
+
+    For CLI providers (``claude_cli``, ``codex_cli``) the check delegates
+    to the provider's ``is_available()`` method (binary on PATH + auth).
+    For HTTP providers, it falls back to credential resolution.
+    """
+    provider = get_active_provider()
+    if has_cli_capability(provider):
+        return provider.is_available()  # type: ignore[attr-defined]
     try:
         _resolve_llm_credentials()
     except ValueError as exc:
@@ -78,26 +98,153 @@ def fetch_model_token_limits(model_label: str) -> tuple[int, int]:
     return get_max_input_tokens(model_label), get_max_output_tokens(model_label)
 
 
-def get_chat_model(model: str | None = None) -> ChatOpenAI:
-    """Return a :class:`ChatOpenAI` configured against the resolved endpoint.
+# ---------------------------------------------------------------------------
+# Agent CLI chat-model adapter
+# ---------------------------------------------------------------------------
+#
+# The LLM analyzers (meta_analyzer, semantic_*) obtain a model from
+# ``get_chat_model()`` and call ``.invoke()`` / ``.with_structured_output(
+# schema).invoke()`` on it (see ``llm_analyzer_base``) — they never go through
+# ``chat_completion``. To support CLI providers there, ``get_chat_model``
+# returns this minimal adapter, which mimics the slice of the ``ChatOpenAI``
+# interface the analyzers rely on, backed by the provider's ``complete()``
+# subprocess transport.
+
+
+class _AgentCLIMessage:
+    """Minimal stand-in for a LangChain message: exposes ``.content``."""
+
+    def __init__(self, content: str) -> None:
+        self.content = content
+
+
+def _extract_json_object(raw: str) -> dict:
+    """Extract a single JSON object from a CLI model's text response.
+
+    Tolerates markdown code fences and surrounding prose. Raises ``ValueError``
+    (fail-closed) when no JSON object can be parsed.
+    """
+    text = raw.strip()
+    if text.startswith("```"):
+        # Drop the opening fence line (``` or ```json) and any closing fence.
+        text = text.split("\n", 1)[1] if "\n" in text else ""
+        fence = text.rfind("```")
+        if fence != -1:
+            text = text[:fence]
+        text = text.strip()
+    try:
+        obj = json.loads(text)
+        if isinstance(obj, dict):
+            return obj
+    except json.JSONDecodeError:
+        pass
+    start, end = text.find("{"), text.rfind("}")
+    if start != -1 and end > start:
+        try:
+            obj = json.loads(text[start : end + 1])
+            if isinstance(obj, dict):
+                return obj
+        except json.JSONDecodeError:
+            pass
+    raise ValueError(f"could not extract a JSON object from CLI response: {raw[:200]!r}")
+
+
+class _StructuredAgentCLIModel:
+    """Mimics ``ChatOpenAI.with_structured_output(schema)`` for a CLI provider.
+
+    ``invoke`` augments the prompt with the schema, calls the provider's
+    ``complete()``, then parses and validates the response into *schema*.
+    """
+
+    def __init__(self, provider: object, model: str, max_output_tokens: int, schema: type) -> None:
+        self._provider = provider
+        self._model = model
+        self._max_output_tokens = max_output_tokens
+        self._schema = schema
+
+    def _augment(self, prompt: str) -> str:
+        schema_json = json.dumps(self._schema.model_json_schema(), indent=2)
+        return (
+            f"{prompt}\n\n"
+            "Respond with ONLY a single JSON object conforming to the JSON Schema "
+            "below. Do not wrap it in markdown code fences and do not add any prose "
+            f"before or after the JSON.\n\nJSON Schema:\n{schema_json}"
+        )
+
+    def invoke(self, prompt: str) -> object:
+        raw = self._provider.complete(  # type: ignore[attr-defined]
+            self._augment(prompt),
+            model=self._model,
+            max_output_tokens=self._max_output_tokens,
+        )
+        return self._schema.model_validate(_extract_json_object(raw))
+
+    async def ainvoke(self, prompt: str) -> object:
+        return await asyncio.to_thread(self.invoke, prompt)
+
+
+class AgentCLIChatModel:
+    """Minimal ``ChatOpenAI``-compatible adapter backed by a CLI provider.
+
+    Implements only the surface the analyzers use: ``invoke`` (returns an
+    object with ``.content``), ``ainvoke``, and ``with_structured_output``.
+    """
+
+    def __init__(self, provider: object, model: str, max_output_tokens: int) -> None:
+        self._provider = provider
+        self._model = model
+        self._max_output_tokens = max_output_tokens
+
+    def invoke(self, prompt: str) -> _AgentCLIMessage:
+        text = self._provider.complete(  # type: ignore[attr-defined]
+            prompt,
+            model=self._model,
+            max_output_tokens=self._max_output_tokens,
+        )
+        return _AgentCLIMessage(text)
+
+    async def ainvoke(self, prompt: str) -> _AgentCLIMessage:
+        return await asyncio.to_thread(self.invoke, prompt)
+
+    def with_structured_output(self, schema: type) -> _StructuredAgentCLIModel:
+        return _StructuredAgentCLIModel(
+            self._provider, self._model, self._max_output_tokens, schema
+        )
+
+
+def get_chat_model(model: str | None = None) -> ChatOpenAI | AgentCLIChatModel:
+    """Return a chat model for the active provider.
+
+    For CLI providers (``claude_cli``, ``codex_cli``) this returns an
+    :class:`AgentCLIChatModel` adapter backed by the provider's ``complete()``
+    subprocess transport — so the LLM analyzers (which use ``.invoke()`` and
+    ``.with_structured_output()``) work with no API key. For HTTP providers it
+    returns a :class:`ChatOpenAI` configured against the resolved endpoint.
 
     Raises:
-        ValueError: when no API key is configured (see ``is_llm_available``).
+        ValueError: when an HTTP provider has no API key configured.
     """
-    resolved_key, resolved_base = _resolve_llm_credentials()
-    model = model or MODEL_CONFIG["default"]
+    resolved_model = model or MODEL_CONFIG["default"]
 
+    provider = get_active_provider()
+    if has_cli_capability(provider):
+        return AgentCLIChatModel(provider, resolved_model, get_max_output_tokens(resolved_model))
+
+    resolved_key, resolved_base = _resolve_llm_credentials()
     return ChatOpenAI(
-        model=model,
+        model=resolved_model,
         base_url=resolved_base,
         api_key=resolved_key,
-        max_tokens=get_max_output_tokens(model),
+        max_tokens=get_max_output_tokens(resolved_model),
         timeout=120,
     )
 
 
 def chat_completion(prompt: str, *, model: str | None = None) -> str:
-    """Request a single chat completion and return the assistant content."""
-    llm = get_chat_model(model=model)
-    response = llm.invoke(prompt)
+    """Request a single chat completion and return the assistant content.
+
+    Routes through :func:`get_chat_model`, which dispatches to the CLI adapter
+    for CLI providers and to ``ChatOpenAI`` for HTTP providers.
+    """
+    response = get_chat_model(model=model).invoke(prompt)
     return response.content or ""
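
Reviewer note: the structured-output path above prompts for JSON, extracts one object from the reply, and validates it. The sketch below walks through that round trip using only the standard library. `extract_json_object` is a condensed restatement of the diff's `_extract_json_object`, not a copy. `FakeProvider` and `Verdict` are hypothetical, and a dataclass stands in for the Pydantic schema that the real code validates with `model_validate`.

```python
import json
from dataclasses import dataclass


def extract_json_object(raw: str) -> dict:
    """Condensed restatement of the fence-stripping and brace-scan logic."""
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence line and anything from the last fence onward.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        fence = text.rfind("```")
        if fence != -1:
            text = text[:fence]
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        try:
            obj = json.loads(text[start : end + 1])
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
    # Fail closed: never return a partial or guessed result.
    raise ValueError(f"no JSON object in CLI response: {raw[:200]!r}")


@dataclass
class Verdict:  # hypothetical stand-in for a Pydantic schema
    risky: bool
    reason: str


class FakeProvider:  # hypothetical; mimics only the complete() call shape
    def complete(self, prompt: str, *, model: str, max_output_tokens: int) -> str:
        # A reply that ignores the "no fences, no prose" instruction.
        return 'Sure!\n```json\n{"risky": true, "reason": "curl | sh"}\n```'


raw = FakeProvider().complete("Is this skill risky?", model="demo", max_output_tokens=256)
verdict = Verdict(**extract_json_object(raw))
print(verdict)
```

The fake reply deliberately includes prose and a code fence, the two cases the helper's docstring says it tolerates. A reply with no JSON object raises `ValueError` instead of producing a default verdict.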