feat: GoZen v3.0.0 - Context Compression & Middleware Pipeline#11
Open
…ing (v2.2.0)
This release adds comprehensive observability and smart routing capabilities:
- Usage Tracking: Record API usage with cost calculation based on model pricing
- Budget Control: Set daily/weekly/monthly limits with warn/downgrade/block actions
- Provider Health: Monitor provider health with success rate and latency metrics
- Smart Load Balancing: Support failover, round-robin, least-latency, least-cost strategies
- Session Insights: Track per-session usage with turn-by-turn details
- Webhook Notifications: Send alerts for budget warnings, provider status, failovers
- Web UI: New Usage tab with cost summary, budget status, and provider health
New files:
- internal/proxy/usage.go, budget.go, healthcheck.go, loadbalancer.go, metrics.go
- internal/notify/webhook.go
- internal/web/api_usage.go, api_health.go, api_sessions.go, api_webhooks.go, api_pricing.go
Config version: 7 → 8
SQLite schema version: 2 → 3
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
… Mistral, Qwen models
Expand default model pricing to cover common programming models:
- OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo, o1, o1-mini, o3-mini
- DeepSeek: deepseek-chat, deepseek-coder, deepseek-reasoner
- MiniMax: abab6.5s/6.5/6.5t/5.5-chat
- GLM (Zhipu): glm-4-plus/0520/air/airx/long/flash/flashx, codegeex-4
- Google Gemini: gemini-2.0-flash, gemini-1.5-pro/flash
- Mistral: mistral-large/small, codestral, ministral, pixtral
- Qwen (Alibaba): qwen-max/plus/turbo/long, qwen-coder-plus/turbo
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
[BETA] Context Compression:
- Add CompressionConfig for transparent context compression
- Implement ContextCompressor with token estimation and summarization
- Add compression Web API endpoints
[BETA] Middleware Pipeline:
- Add pluggable middleware architecture with Middleware interface
- Implement Pipeline executor with priority-based ordering
- Add Registry for middleware lifecycle management
- Add PluginLoader for local (.so) and remote plugin support
Built-in Middleware:
- context-injection: Auto-inject .cursorrules, CLAUDE.md
- request-logger: Log all requests and responses
- session-memory: Cross-session intelligence (v3.1 feature)
- orchestration: Multi-model orchestration - voting, chain, review (v3.2 feature)
Web API:
- GET/PUT /api/v1/compression - Compression config
- GET /api/v1/compression/stats - Compression statistics
- GET/PUT /api/v1/middleware - Middleware config
- POST /api/v1/middleware/{name}/enable|disable
- POST /api/v1/middleware/reload
Documentation:
- Add middleware development guide for third-party developers
All features are disabled by default and marked as BETA.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Observatory: session monitoring, stuck detection, idle timeout
- Guardrails: spending caps, rate limiting, sensitive operation detection
- Coordinator: file locking, change awareness, context warnings
- TaskQueue: priority-based task management with retry support
- Runtime: autonomous agent execution with planning/execution/validation phases
- Web API endpoints for all agent components
All features are BETA and disabled by default.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add tests for proxy package (metrics, usage, budget, healthcheck, loadbalancer, session, compression, logger)
- Add tests for web package (API v2 endpoints, server helpers)
- Add tests for config and daemon packages
- Achieve 82% coverage for proxy package (target: ≥80%)
- Achieve 80.1% coverage for web package (target: ≥80%)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- proxy/session.go: Fix race condition in GetSessionUsage, GetSessionInsight, and GetContextWarning by holding lock during sync.Map access
- proxy/healthcheck.go: Fix double-close panic in Stop() by tracking stopped state
- agent/runtime.go: Fix ignored rand.Read error, add lock for task.Plan assignment
- agent/observatory.go: Fix data race by reading config.StuckThreshold under lock
- config/migrate.go: Clean up incomplete file on copy failure
- web/auth.go: Add graceful shutdown for sessionCleanupLoop, fix rand.Read error
- web/server.go: Add sync.RWMutex for syncMgr access
- web/api_sync.go: Use lock when accessing/modifying syncMgr
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
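The double-close fix in Stop() can be illustrated with a minimal sketch. The type, field, and constructor names below are hypothetical stand-ins, not the actual contents of proxy/healthcheck.go; the sketch only shows the general pattern of guarding close() with a stopped flag under a mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// HealthChecker is a hypothetical stand-in for the checker in proxy/healthcheck.go.
type HealthChecker struct {
	mu      sync.Mutex
	stopped bool
	stopCh  chan struct{}
}

// NewHealthChecker returns a checker with an open stop channel.
func NewHealthChecker() *HealthChecker {
	return &HealthChecker{stopCh: make(chan struct{})}
}

// Stop closes stopCh at most once. Later calls are no-ops, so calling
// Stop twice no longer panics with "close of closed channel".
// It reports whether this call performed the close.
func (h *HealthChecker) Stop() bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.stopped {
		return false
	}
	h.stopped = true
	close(h.stopCh)
	return true
}

func main() {
	h := NewHealthChecker()
	fmt.Println(h.Stop()) // first call closes the channel
	fmt.Println(h.Stop()) // second call is a no-op
}
```

A sync.Once around the close would work equally well; the explicit flag is used here because the commit message mentions tracking stopped state.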
- config/config.go: Add nil check in ScenarioRoute.UnmarshalJSON to prevent panic when providers array contains null elements
- config/config.go: Add nil checks in ProviderNames and ModelForProvider methods
- proxy/logdb.go: Handle stmt.Exec errors in flushBatch, rollback on failure
- web/auth.go: Fix IP spoofing in clientIP by properly parsing X-Forwarded-For header (extract first IP from comma-separated list)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
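As an illustration of the X-Forwarded-For change, the sketch below extracts the first entry from the comma-separated list and falls back to RemoteAddr when the header is absent or unparseable. The helper name pickClientIP and the fallback behaviour are assumptions; the real clientIP in web/auth.go may differ. Whether the header should be trusted at all depends on the proxy setup in front of the daemon.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// pickClientIP (hypothetical helper) returns the first address listed in the
// X-Forwarded-For value when it parses as an IP, otherwise the host part of
// remoteAddr.
func pickClientIP(xff, remoteAddr string) string {
	if xff != "" {
		first := strings.TrimSpace(strings.Split(xff, ",")[0])
		if ip := net.ParseIP(first); ip != nil {
			return ip.String()
		}
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr // no port present; use the value as-is
	}
	return host
}

// clientIP applies pickClientIP to an incoming request.
func clientIP(r *http.Request) string {
	return pickClientIP(r.Header.Get("X-Forwarded-For"), r.RemoteAddr)
}

func main() {
	r := &http.Request{Header: http.Header{}, RemoteAddr: "10.0.0.1:1234"}
	r.Header.Set("X-Forwarded-For", "203.0.113.7, 10.0.0.2")
	fmt.Println(clientIP(r))
}
```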
- agent/taskqueue.go: Handle rand.Read error with timestamp fallback
- daemon/server.go: Fix randomID to properly check os.Open and Read errors
- notify/webhook.go: Handle json.Marshal errors in format functions
- proxy/logdb.go: Add explicit error ignoring with comments for best-effort operations (os.Chmod, os.Remove, setSchemaVersion)
- update/check.go: Add explicit error ignoring for cache operations
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
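The timestamp fallback for rand.Read might look roughly like the following. The function name newID, the 8-byte length, and the "ts-" prefix are illustrative assumptions rather than the actual code in agent/taskqueue.go; the reader is passed in as a parameter only so the failure path can be exercised.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"time"
)

// errNoEntropy is a hypothetical error used to simulate a failing random source.
var errNoEntropy = errors.New("entropy source unavailable")

// newID fills 8 bytes using read and hex-encodes them. If read fails, it
// falls back to an ID derived from the current time instead of silently
// returning an ID built from zeroed bytes.
func newID(read func([]byte) (int, error)) string {
	b := make([]byte, 8)
	if _, err := read(b); err != nil {
		return fmt.Sprintf("ts-%d", time.Now().UnixNano())
	}
	return hex.EncodeToString(b)
}

func main() {
	fmt.Println(newID(rand.Read)) // normal path: 16 hex characters
	fmt.Println(newID(func([]byte) (int, error) { return 0, errNoEntropy }))
}
```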
- cmd/web.go: ignore exec.Command().Start() errors for browser open
- internal/daemon/daemon.go: ignore os.Remove errors in cleanup functions
- internal/middleware/loader.go: ignore os.Remove errors in cache operations
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- proxy/server.go: safe type assertion for message role
- proxy/session.go: use fmt.Sprintf for duration formatting (fixes overflow)
- web/server.go: explicitly ignore JSON encode errors (best-effort)
- middleware/loader.go: ensure temp file closed via defer
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- proxy/logdb.go: explicitly ignore tx.Rollback/Commit errors (best-effort)
- daemon/server.go: call pullCancel() immediately after Pull() returns
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- proxy/profile_proxy.go: ignore JSON encode error in writeError
- daemon/api.go: close request body after JSON decode
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add tests for sessionCleanupLoop, StopCleanup, clientIP
- Add tests for HandleFunc, SetSyncManager
- Web coverage: 79.7% -> 80.7%
- Add disclaimer to usage page: data is for reference only
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add tests for Shutdown resource cleanup (syncCancel, pushTimer, watcher)
- Add tests for session cleanup logic (stale session removal)
- Add tests for initSync cancellation of existing sync
- Add tests for DaemonSysProcAttr, IsDaemonRunning, StopDaemonProcess
- Add test for startProxy
- Update CI coverage requirement: daemon 40% -> 50%
These tests specifically target memory leak prevention by verifying:
- Context cancellation on shutdown
- Timer cleanup
- Goroutine termination paths
- Stale session cleanup
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Update version from 4.0.0 to 3.0.0
- Consolidate all features (v2.2-v4.0) into single v3.0 release
- Create unified release plan document (.dev/v3.0-release-plan.md)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Website updates:
- Upgrade docs version from 2.1 to 3.0
- Add Japanese (ja) and Korean (ko) locale support
- Add v3.0 feature documentation:
  - Usage Tracking & Budget Control
  - Health Monitoring
  - Load Balancing
  - Webhooks
  - Context Compression
  - Middleware Pipeline
  - Agent Infrastructure
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add TDD requirement for new feature development
- Add formal release checklist:
  1. Bug check
  2. Version number verification
  3. Website documentation review
  4. README files update
- Add v3.0.0 to version history
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Add v3.0 new features section covering usage tracking, budget control, provider health monitoring, smart load balancing, webhooks, context compression, middleware pipeline, and agent infrastructure.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Replace vanilla JS frontend with modern React stack:
- React 18 + TypeScript + Vite build system
- shadcn/ui components (Radix UI + Tailwind CSS)
- TanStack Query for server state, Zustand for UI state
- React Router v6 for navigation
- react-i18next with 6 languages (en, zh-CN, zh-TW, es, ja, ko)
- Dark/light/system theme support
- Type-safe API client with React Query hooks
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Implement Bot Agent system (Phase 1-4):
- Add bot gateway with IPC communication via Unix socket
- Support 5 chat platforms: Telegram, Discord, Slack, Lark, FB Messenger
- Natural language intent parsing for commands
- Process registry with auto-generated unique names
- Session management and approval workflow
- Bot configuration in zen.json with platform-specific settings
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add GET/PUT /api/v1/bot API endpoints with token masking
- Create Bot config page with 5 tabs: General, Platforms, Interaction, Aliases, Notifications
- Support 5 chat platforms: Telegram, Discord, Slack, Lark, Facebook Messenger
- Add Collapsible UI component for platform config sections
- Add i18n translations for all 6 languages (en, zh-CN, zh-TW, es, ja, ko)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Add comprehensive unit and integration tests for the bot package:
- gateway_test.go: Start/Stop, handleConnection, IPC message handling
- handlers_test.go: intent processing, message handling, approvals
- client_test.go: client initialization and error cases
- nlu_test.go: NLU parser for various intents and languages
- registry_test.go: process registry operations
- session_test.go: session management
- protocol_test.go: IPC protocol types
- adapters/adapter_test.go: adapter config helpers
Use short socket paths (/tmp/zen-test-*.sock) for macOS compatibility with Unix socket 104-byte path limit.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
The custom UnmarshalJSON was missing Sync, Pricing, Budgets, Webhooks, HealthCheck, Compression, Middleware, Agent, and Bot fields, causing them to be nil after JSON parsing. Also adds comprehensive tests for the bot API endpoints, bringing internal/web coverage from 73.8% to 81.2%. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
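One common way to avoid this class of bug is to decode through a method-less alias type, so every tagged field is populated automatically and only the custom post-processing is written by hand. The sketch below assumes a much smaller Config than the real one; the DefaultProfile field and its default value are hypothetical, and the actual fix in config.go may instead add the missing fields explicitly.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CompressionConfig mirrors two keys from the PR's example configuration.
type CompressionConfig struct {
	Enabled         bool `json:"enabled"`
	ThresholdTokens int  `json:"threshold_tokens"`
}

// Config is a reduced, hypothetical version of the real config struct.
type Config struct {
	DefaultProfile string             `json:"default_profile,omitempty"` // hypothetical field
	Compression    *CompressionConfig `json:"compression,omitempty"`
}

// UnmarshalJSON decodes into an alias type that has Config's fields but not
// its methods, which avoids infinite recursion and ensures that newly added
// fields are not silently dropped.
func (c *Config) UnmarshalJSON(data []byte) error {
	type plain Config
	var p plain
	if err := json.Unmarshal(data, &p); err != nil {
		return err
	}
	*c = Config(p)
	if c.DefaultProfile == "" {
		c.DefaultProfile = "default" // hypothetical post-decode default
	}
	return nil
}

// parseConfig is a small convenience wrapper around json.Unmarshal.
func parseConfig(data []byte) (*Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, err
	}
	return &c, nil
}

func main() {
	cfg, err := parseConfig([]byte(`{"compression":{"enabled":true,"threshold_tokens":50000}}`))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(cfg.DefaultProfile, cfg.Compression.ThresholdTokens)
}
```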
- Add Bot Gateway section to all README files (EN, zh-CN, zh-TW, es)
- Create comprehensive bot.md documentation for website with:
  - Platform setup guides (Telegram, Discord, Slack, Lark, FB Messenger)
  - Bot commands and natural language support
  - Configuration examples
  - Security best practices
- Update sidebars to include bot documentation
- Bump version to 3.0.0-alpha.3
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
87f657b to
02d2e09
…oval When IsDaemonRunning() checked if the daemon was listening on the expected port, it would remove the PID file if the port check failed (e.g., timeout). This made it impossible to stop the daemon later, causing upgrade and restart commands to fail silently while the old daemon kept running. Now IsDaemonRunning() returns the PID even when port check fails (as long as the process is alive), and StopDaemonProcess() will attempt to stop any alive process found in the PID file. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Add comprehensive integration tests for the daemon module covering:
- Daemon start: PID file creation, port listening, status API
- Daemon stop: process termination, PID file removal, port release
- Daemon restart: old process cleanup, PID file update
- Upgrade scenario: stopping daemon even when port check fails
- Stale PID file handling
- Graceful shutdown with active requests
These tests run against the actual binary and verify real-world behavior that unit tests cannot catch (like the PID file removal bug fixed in the previous commit).
Run with: go test -tags=integration ./test/integration/...
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Run daemon integration tests as part of CI to catch real-world issues that unit tests cannot detect. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Proxy tests (8 tests):
- Basic routing to provider
- Failover to backup provider when primary fails
- Error handling when all providers fail
- Session persistence across requests
- Profile-based routing
- Streaming response handling
- Slow provider handling
- Invalid profile rejection
Web API tests (12 tests):
- Health endpoint
- Providers list and get
- Profiles list
- Settings get
- Daemon status
- Config reload
- Static files serving
- CORS handling
- Bindings list
- Logs endpoint
These tests use mock HTTP servers to simulate provider behavior, allowing us to test failover, error handling, and streaming without hitting real APIs.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
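The mock-server approach can be sketched with the standard library's net/http/httptest package. The forward and runDemo functions below are hypothetical and far simpler than the real proxy; they only show how a failing primary and a healthy backup can be simulated locally.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// forward (hypothetical) tries each provider URL in order and returns the
// body of the first 2xx response.
func forward(urls []string) (string, error) {
	lastErr := fmt.Errorf("no providers configured")
	for _, u := range urls {
		resp, err := http.Get(u)
		if err != nil {
			lastErr = err
			continue
		}
		body, readErr := io.ReadAll(resp.Body)
		resp.Body.Close()
		if readErr != nil {
			lastErr = readErr
			continue
		}
		if resp.StatusCode >= 200 && resp.StatusCode < 300 {
			return string(body), nil
		}
		lastErr = fmt.Errorf("%s returned status %d", u, resp.StatusCode)
	}
	return "", lastErr
}

// runDemo starts a failing primary and a healthy backup, then forwards once.
func runDemo() (string, error) {
	primary := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "primary down", http.StatusServiceUnavailable)
	}))
	defer primary.Close()
	backup := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok from backup")
	}))
	defer backup.Close()
	return forward([]string{primary.URL, backup.URL})
}

func main() {
	body, err := runDemo()
	fmt.Println(body, err)
}
```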
The daemon's Shutdown() runs in a goroutine when handling SIGTERM. Previously, the process could exit before Shutdown() completed, leaving the PID file behind. This caused flaky integration tests in CI environments. Now we wait for the shutdown goroutine to complete before returning from runDaemonForeground(). Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
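A minimal sketch of the wait-for-shutdown pattern follows. The function runForeground, its parameters, and the simulated stop channel are assumptions for illustration; the real runDaemonForeground() reacts to SIGTERM and has additional fallback cleanup that is not modeled here.

```go
package main

import (
	"fmt"
	"time"
)

// runForeground (hypothetical) starts a goroutine that waits for stop and
// then runs shutdown. It does not return until that goroutine has finished,
// so cleanup such as PID file removal completes before the process exits.
func runForeground(stop <-chan struct{}, serve func(), shutdown func()) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		<-stop
		shutdown()
	}()
	serve() // blocks until the server stops serving
	<-done  // without this wait, the caller could exit mid-shutdown
}

func main() {
	stop := make(chan struct{})
	pidFileRemoved := false
	go func() {
		time.Sleep(10 * time.Millisecond)
		close(stop) // simulates the termination signal
	}()
	runForeground(stop,
		func() { <-stop },
		func() {
			time.Sleep(20 * time.Millisecond) // simulates slow cleanup
			pidFileRemoved = true
		})
	fmt.Println("cleanup finished before exit:", pidFileRemoved)
}
```

Closing the done channel happens before the receive on it returns, so reading pidFileRemoved afterwards is safe without extra locking.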
Add support for running a separate dev daemon with isolated config:
- GOZEN_CONFIG_DIR env var to specify custom config directory
- scripts/dev.sh for managing dev daemon (ports 29840/29841)
- vite.config.ts supports VITE_API_PORT env var
Usage:
  ./scripts/dev.sh       # Start dev daemon
  ./scripts/dev.sh stop  # Stop dev daemon
  ./scripts/dev.sh web   # Start frontend dev server
  ./scripts/dev.sh all   # Start both
The dev daemon uses ~/.zen-dev for config, completely isolated from the production daemon at ~/.zen. Frontend hot-reloads while API requests proxy to the dev daemon.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
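Resolving the config directory from GOZEN_CONFIG_DIR could look roughly like this. The helper name configDir is hypothetical, and the sketch assumes the variable simply overrides the default ~/.zen location; the actual lookup in GoZen may handle more cases.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// configDir (hypothetical) returns override when it is set, otherwise the
// default .zen directory under home.
func configDir(override, home string) string {
	if override != "" {
		return override
	}
	return filepath.Join(home, ".zen")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	// With GOZEN_CONFIG_DIR set to ~/.zen-dev, the dev daemon never touches ~/.zen.
	fmt.Println(configDir(os.Getenv("GOZEN_CONFIG_DIR"), home))
}
```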
1. Fix daemon PID file cleanup race condition:
   - Wait for shutdown goroutine to complete before exiting
   - Add fallback cleanup if Start() returns for other reasons
2. Add Nunito as primary font for Web UI and website:
   - Import from Google Fonts
   - Set as default sans-serif in Tailwind config
   - Update website SCSS
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
The auto-generated changelog from GitHub releases is not readable. Hide the releases page until we have a better changelog generation method. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Settings page improvements:
- General tab: Add default profile/client selection (was just placeholder text)
- Bindings tab: Add "Add Binding" button with dialog
- Sync tab: Move "Sync Enabled" switch to top, add backend configuration
- Password tab: Rename to "Web UI Password" with description
Bot page fixes:
- Remove leftover __CONTINUE_*__ comments from code generation
- Fix Discord section display
Navigation:
- Swap Usage and Logs order in sidebar
Also update types and API functions to match actual backend responses.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Overview
This PR implements GoZen v3.0.0 with two major BETA features: Context Compression and Middleware Pipeline.
All features are disabled by default and marked as BETA.
Features
Context Compression (BETA)
Transparent context compression that intercepts large conversation histories, summarizes them with a cheap model, and forwards compressed requests upstream.
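The compression flow described above can be sketched as follows. The Message type, the characters-per-token heuristic, and the compress signature are assumptions; the actual ContextCompressor in internal/proxy/compression.go may estimate tokens and call the summary model differently. The summarize callback stands in for the request to the cheap model.

```go
package main

import "fmt"

// Message is a hypothetical, simplified chat message.
type Message struct {
	Role    string
	Content string
}

// estimateTokens uses a rough heuristic of about four characters per token.
// This is an assumption, not the estimator used by GoZen.
func estimateTokens(msgs []Message) int {
	total := 0
	for _, m := range msgs {
		total += len(m.Content)/4 + 1
	}
	return total
}

// compress leaves msgs untouched while the estimate is at or below threshold.
// Otherwise it replaces everything except the last preserveRecent messages
// with a single summary message produced by summarize.
func compress(msgs []Message, threshold, preserveRecent int, summarize func([]Message) string) []Message {
	if estimateTokens(msgs) <= threshold || len(msgs) <= preserveRecent {
		return msgs
	}
	cut := len(msgs) - preserveRecent
	out := make([]Message, 0, preserveRecent+1)
	out = append(out, Message{Role: "user", Content: summarize(msgs[:cut])})
	return append(out, msgs[cut:]...)
}

func main() {
	msgs := make([]Message, 10)
	for i := range msgs {
		msgs[i] = Message{Role: "user", Content: string(make([]byte, 400))}
	}
	out := compress(msgs, 500, 4, func(old []Message) string {
		return fmt.Sprintf("[summary of %d earlier messages]", len(old))
	})
	fmt.Println(len(msgs), "->", len(out), "messages;", out[0].Content)
}
```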
Middleware Pipeline (BETA)
Transform GoZen into a programmable AI API gateway with a pluggable middleware chain.
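A pluggable chain with priority-based ordering can be sketched as below. The method set of Middleware, the Request type, the Use and Run methods, and the convention that lower priority values run first are all assumptions; the real interface in internal/middleware and the middleware development guide are authoritative.

```go
package main

import (
	"fmt"
	"sort"
)

// Request is a hypothetical, simplified request passed through the chain.
type Request struct {
	Body string
}

// Middleware is an assumed shape for a pluggable middleware.
type Middleware interface {
	Name() string
	Priority() int // assumption: lower values run first
	Process(req *Request) error
}

// Pipeline keeps its middlewares sorted by priority.
type Pipeline struct {
	mws []Middleware
}

// Use registers m and re-sorts the chain, preserving insertion order for ties.
func (p *Pipeline) Use(m Middleware) {
	p.mws = append(p.mws, m)
	sort.SliceStable(p.mws, func(i, j int) bool {
		return p.mws[i].Priority() < p.mws[j].Priority()
	})
}

// Run passes req through each middleware in order and stops at the first error.
func (p *Pipeline) Run(req *Request) error {
	for _, m := range p.mws {
		if err := m.Process(req); err != nil {
			return fmt.Errorf("%s: %w", m.Name(), err)
		}
	}
	return nil
}

// tag is a toy middleware that appends its name to the request body.
type tag struct {
	name string
	prio int
}

func (t tag) Name() string  { return t.name }
func (t tag) Priority() int { return t.prio }
func (t tag) Process(req *Request) error {
	req.Body += "[" + t.name + "]"
	return nil
}

func main() {
	p := &Pipeline{}
	p.Use(tag{"request-logger", 20})
	p.Use(tag{"context-injection", 10})
	req := &Request{Body: "hi"}
	if err := p.Run(req); err != nil {
		fmt.Println("pipeline error:", err)
		return
	}
	fmt.Println(req.Body)
}
```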
Architecture:
- Middleware interface for custom middleware development
- Pipeline executor with priority-based ordering
- Registry for middleware lifecycle management
- PluginLoader for local (.so) and remote plugin support
Built-in Middleware:
- context-injection
- session-memory
- request-logger
- orchestration
Third-Party Plugin Support
Configuration
{
  "compression": {
    "enabled": false,
    "threshold_tokens": 50000,
    "target_tokens": 20000,
    "summary_model": "claude-3-haiku-20240307",
    "preserve_recent": 4
  },
  "middleware": {
    "enabled": false,
    "middlewares": [
      {
        "name": "context-injection",
        "enabled": true,
        "source": "builtin"
      }
    ]
  }
}
Web API
Compression
- GET/PUT /api/v1/compression - Configuration
- GET /api/v1/compression/stats - Statistics
Middleware
- GET/PUT /api/v1/middleware - Configuration
- GET /api/v1/middleware/{name} - Details
- POST /api/v1/middleware/{name}/enable - Enable
- POST /api/v1/middleware/{name}/disable - Disable
- POST /api/v1/middleware/reload - Reload all
Files Changed
New Files
- internal/proxy/compression.go - Context compressor
- internal/middleware/*.go - Middleware package
- internal/web/api_compression.go - Compression API
- internal/web/api_middleware.go - Middleware API
- docs/middleware-development.md - Development guide
Modified Files
- internal/config/config.go - New config types
- internal/config/store.go - New getters/setters
- internal/proxy/server.go - Integration
- internal/daemon/server.go - Initialization
- cmd/root.go - Version bump to 3.0.0
Testing
All existing tests pass. New tests added for:
go test ./...
Breaking Changes
None. All features are opt-in and disabled by default.