fix(swarm): production deploy bugs (12 fixes across auth, docker, terminal) #20
Open
MarcelocardosoLeal wants to merge 7 commits into EvolutionAPI:main
1. Add ANTHROPIC_API_KEY to ALLOWED_VARS in claude-bridge.js
The env var was silently filtered out, causing Claude Code to fall
back to OAuth login on every session start instead of using the
API key configured in the Providers page.
2. Fix orphaned session crash ("Session already exists")
When a Claude process died without firing the PTY onExit event,
the session remained in the bridge's in-memory Map as inactive.
The next start attempt threw "already exists". Now detects dead
sessions, cleans them up, and restarts normally.
3. Exclude dashboard/data/ and workspace/ from Docker build context
Without these entries in .dockerignore, the local SQLite database
(with hashed passwords) and workspace files were baked into the
image. On first Swarm deploy, the volume was seeded from the image,
making login impossible with any other credentials.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
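
The first two fixes above can be illustrated with a minimal sketch. It is not the code from claude-bridge.js: only the names ALLOWED_VARS and ANTHROPIC_API_KEY come from the commit message, while the other allow-list entries, `filterEnv`, `sessions`, and `reapIfDead` are hypothetical names chosen for illustration. It assumes the bridge filters the child-process environment through an allow-list and tracks sessions in an in-memory Map with an `active` flag.

```javascript
// Hypothetical allow-list; the real ALLOWED_VARS in claude-bridge.js may differ.
const ALLOWED_VARS = new Set(["PATH", "HOME", "TERM", "ANTHROPIC_API_KEY"]);

// Return a copy of `env` containing only allow-listed variables.
// A variable missing from the list is dropped without any warning,
// which is how an API key can silently disappear.
function filterEnv(env, allowed = ALLOWED_VARS) {
  return Object.fromEntries(
    Object.entries(env).filter(([name]) => allowed.has(name))
  );
}

// Hypothetical in-memory session registry keyed by session id.
const sessions = new Map();

// Remove a stale entry whose process is gone (inactive) so a new start
// does not hit "Session already exists". Returns true if one was removed.
function reapIfDead(sessionId) {
  const entry = sessions.get(sessionId);
  if (entry && !entry.active) {
    sessions.delete(sessionId);
    return true;
  }
  return false;
}
```

Calling `filterEnv(process.env)` before spawning the child keeps the key when it is allow-listed, and calling `reapIfDead(id)` before a start clears an orphan left behind by a missed exit event.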
- Add evonexus_claude_auth:/root/.claude to all three Swarm services (dashboard, telegram, scheduler) so Claude Code OAuth tokens persist across redeploys, avoiding re-authentication on every deploy
- docker-compose.yml: use Dockerfile.swarm.dashboard, expose terminal port 32352, add claude-auth volume, fix config mount (remove :ro so providers.json can be written by the UI)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add evonexus_claude_auth:/root/.claude to all three services in evonexus.stack.yml so Claude Code OAuth tokens persist across redeploys. Same fix applied to evonexus.portainer.stack.yml in the previous commit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bug 1: Theme picker on every agent
Each agent runs in its own working directory, which Claude Code treats as a separate project. Without a global theme set, the user is asked to choose a theme on every single agent terminal. Pre-seed /root/.claude/settings.json with theme + onboarding flags during container startup so the first-run prompts are skipped. Only writes the file if it doesn't exist (preserves user-chosen overrides).

Bug 2: "Session already exists" error toast
The previous fix only cleaned up *inactive* orphans. The actual production trigger is different: when a WebSocket reconnects through Traefik, the frontend can re-send start_claude before learning the session is still alive. The bridge's startSession then threw on a duplicate active session. Make startSession idempotent: if the session is already active, return the existing entry instead of throwing.

Bug 3: Misleading error on duplicate start
Server.startClaude() responded with type:'error' "An agent is already running" when the session was active. From the user's perspective this looked like a failure even though everything was working. Send type:'claude_started' instead so the frontend updates UI to "running" and replays the buffer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
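
A minimal sketch of the idempotent start described in Bugs 2 and 3, under stated assumptions: the `Bridge` class, its `spawn` factory, the `reused` flag, and `startReply` are hypothetical and only mirror the behavior in the commit message (return the existing active entry, answer with claude_started either way). The real startSession and Server.startClaude() may be structured differently.

```javascript
class Bridge {
  constructor(spawn) {
    this.sessions = new Map();
    this.spawn = spawn; // hypothetical factory that launches the process
  }

  // Idempotent start: a repeated call for an active session returns the
  // existing entry instead of throwing "Session already exists".
  startSession(id) {
    const existing = this.sessions.get(id);
    if (existing && existing.active) {
      return { session: existing, reused: true };
    }
    // No entry, or an inactive orphan: create a fresh session over it.
    const session = { id, active: true, proc: this.spawn(id) };
    this.sessions.set(id, session);
    return { session, reused: false };
  }
}

// The server replies with the same success type in both cases, so a
// reconnecting frontend marks the agent as running rather than failed.
function startReply(result) {
  return {
    type: "claude_started",
    sessionId: result.session.id,
    reused: result.reused,
  };
}
```

With this shape, a duplicate start_claude sent after a WebSocket reconnect spawns nothing new, and the caller can use `reused` to decide whether to replay the buffered output.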
Claude Code stores its main config at /root/.claude.json — a SIBLING of the /root/.claude/ directory, not inside it. The Swarm volume mounts /root/.claude/ only, so .claude.json sits in the container's writable layer and is wiped on every redeploy. Result: theme picker and onboarding reappear on every release, even though the OAuth tokens (in /root/.claude/) survive. Claude Code itself writes timestamped backups into /root/.claude/backups/ (which IS in the volume), so we just need to restore the latest one on startup when the main file is missing. If no backup exists either, seed a minimal config so first-run prompts are skipped. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
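
The restore-or-seed logic above can be sketched as follows. The PR implements it in the start-dashboard.sh shell script; this is a JavaScript illustration of the same idea, not a port of that script. The function name `restoreConfig`, its return values, and the choice of "newest by modification time" are assumptions, and the fallback config is passed in by the caller because the exact keys the script seeds are not shown here.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Restore `target` from the newest file in `backupDir`; if there is no
// backup, write `fallback` as JSON. Returns "kept", "restored" or "seeded".
function restoreConfig(target, backupDir, fallback) {
  // Never overwrite an existing config: user choices win.
  if (fs.existsSync(target)) return "kept";

  // Collect regular files in the backup directory, newest first.
  const backups = fs.existsSync(backupDir)
    ? fs
        .readdirSync(backupDir)
        .map((name) => path.join(backupDir, name))
        .filter((p) => fs.statSync(p).isFile())
        .sort((a, b) => fs.statSync(b).mtimeMs - fs.statSync(a).mtimeMs)
    : [];

  if (backups.length > 0) {
    fs.copyFileSync(backups[0], target);
    return "restored";
  }

  // No backup either: seed a minimal config supplied by the caller.
  fs.writeFileSync(target, JSON.stringify(fallback, null, 2));
  return "seeded";
}
```

On container start this would be called with target /root/.claude.json and backup directory /root/.claude/backups/, which lives inside the mounted volume and therefore survives redeploys.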
The Dockerfile only copied dashboard/backend/, social-auth/, scheduler.py and the built frontend. .claude/ (agents, skills, commands, templates, rules) and docs/ were never copied, so on a fresh deploy the backend's WORKSPACE / ".claude" / "agents" path was empty. Result: /api/agents, /api/skills, /api/commands and /api/templates all returned empty lists, and the UI showed "No agents found — Add agent files to .claude/agents/ to get started" on every clean Swarm deploy. Local development worked because uv runs the backend with cwd at the repo root, where .claude/ and docs/ exist. .claude/agent-memory and .claude/.env stay excluded by .dockerignore so user data and secrets remain out of the image. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- tsconfig.app.json: multi-line lib array for readability
- evonexus.portainer.stack.yml: remove stray blank line in traefik labels

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Fixes 12 bugs found during a production Docker Swarm + Portainer + Traefik deploy, grouped into 4 categories:

Authentication persistence (3 bugs)
- Add evonexus_claude_auth:/root/.claude volume to all Swarm services so OAuth tokens survive redeploys
- Restore /root/.claude.json from /root/.claude/backups/ on container start (file is a sibling of /root/.claude/, was in writable layer and wiped every deploy)
- Add the same volume to the evonexus.stack.yml template

Docker image assets (2 bugs)
- Copy .claude/ and docs/ into dashboard image (backend reads them for /api/agents, /api/skills, /api/commands, /api/templates). Without this, clean deploys showed "No agents found"
- Exclude dashboard/data/ and workspace/ from build context (SQLite DB with hashed passwords was being baked into the image)

Terminal server / sessions (4 bugs)
- Allow ANTHROPIC_API_KEY env var in claude-bridge.js (was silently filtered, forcing OAuth fallback every session)
- Clean up orphaned sessions (Claude process died without firing PTY onExit, causing "Session already exists")
- Make startSession idempotent on WebSocket reconnect via Traefik (returns existing entry instead of throwing)
- Send claude_started instead of error on duplicate active session (was showing as failure even when working)

Config / first-run UX (3 bugs)
- Remove :ro from config mount so UI can write providers.json
- Expose terminal port 32352 in docker-compose.yml
- Pre-seed /root/.claude/settings.json with theme + onboarding flags (each agent has its own cwd, Claude Code treated each as separate project, prompting for theme on every one)

Files touched
- Dockerfile.dashboard: COPY .claude/ and docs/
- start-dashboard.sh: seed config + restore .claude.json from backup (+53 lines)
- dashboard/terminal-server/src/claude-bridge.js: env vars + idempotent sessions
- dashboard/terminal-server/src/server.js: correct success message on duplicate start
- docker-compose.yml: volume + port + rw config
- evonexus.stack.yml + new evonexus.portainer.stack.yml: volume claude_auth
- .dockerignore: exclude dashboard/data/, workspace/

Test plan
- docker compose build
- .claude/ across docker service update --force
- /api/agents, /api/skills, /api/commands return non-empty after fresh deploy
- ANTHROPIC_API_KEY from Providers UI instead of OAuth login

All validated on production host evonexus.advancedbot.com.br (Swarm + Portainer + Traefik with letsencryptresolver).

🤖 Generated with Claude Code