
fix: prevent apisix_llm_active_connections gauge leak when plugin exits early via ngx.exit()#13139

Open
shreemaan-abhishek wants to merge 2 commits into apache:master from shreemaan-abhishek:fix/llm-active-connections-gauge-leak

Conversation

@shreemaan-abhishek
Contributor

Problem

apisix_llm_active_connections is a Prometheus gauge that tracks in-flight LLM requests. The gauge leaks (is incremented but never decremented) whenever a plugin calls ngx.exit() during request processing — not only in SSE streaming, but also in non-streaming responses.

Root cause: When ai-aliyun-content-moderation (or any other plugin) calls ngx.exit() inside a phase handler (e.g. body_filter, header_filter), OpenResty terminates the current coroutine immediately. This exit is not caught by the pcall wrapping the upstream request in ai-proxy/base.lua. As a result:

  1. exporter.inc_llm_active_connections(ctx) is called before pcall(do_request)
  2. A plugin calls ngx.exit() — either mid-stream (SSE) or after receiving a complete non-streaming response
  3. exporter.dec_llm_active_connections(ctx) placed after pcall is never reached
  4. Gauge leaks — only goes up, never down

This affects both ai-proxy and ai-proxy-multi in all request types: non-streaming chat, SSE streaming, and any other path where a downstream plugin exits early.
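The steps above can be sketched as follows (a minimal illustration of the pre-fix flow; `do_request` stands in for the pcall-wrapped upstream call, and the comments describe the failure path):

```lua
-- sketch of the leaky pre-fix flow in ai-proxy/base.lua
exporter.inc_llm_active_connections(ctx)          -- gauge goes +1

-- a downstream plugin (e.g. ai-aliyun-content-moderation) may call
-- ngx.exit() while the response is being processed; the request is
-- then terminated without unwinding back through this function
local ok, code_or_err, body = pcall(do_request)

exporter.dec_llm_active_connections(ctx)          -- never reached on early exit: gauge leaks
```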

Fix

Remove the dec call from after pcall in ai-proxy/base.lua and instead rely solely on the log phase, which always runs even after ngx.exit(). Introduce a ctx.llm_active_connections_tracked flag to prevent double-decrement:

ai-proxy/base.lua — increment and set the flag, no dec after pcall:

```lua
exporter.inc_llm_active_connections(ctx)
ctx.llm_active_connections_tracked = true
local ok, code_or_err, body = pcall(do_request)
-- dec is intentionally NOT here — handled in log phase
```

ai-proxy.lua and ai-proxy-multi.lua log phase:

```lua
function _M.log(conf, ctx)
    if ctx.llm_active_connections_tracked then
        exporter.dec_llm_active_connections(ctx)
        ctx.llm_active_connections_tracked = false
    end
    -- ...
end
```

The log phase runs unconditionally regardless of how the request ended (normal completion, upstream error, or ngx.exit() from any plugin), so the gauge is always correctly decremented.
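For illustration, the early-exit path looks roughly like this (a hedged sketch of a moderation-style plugin, not the actual ai-aliyun-content-moderation code; `response_is_offensive` is a made-up helper):

```lua
-- hypothetical moderation-style plugin
function _M.body_filter(conf, ctx)
    if response_is_offensive(ctx) then            -- illustrative check
        ngx.exit(400)                             -- aborts request processing here
    end
end

-- the log phase of every plugin (including ai-proxy) still runs after
-- ngx.exit(), so the flag-guarded decrement fires exactly once
```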

Tests

Added a regression test in t/plugin/ai-aliyun-content-moderation.t:

  • Creates a route with prometheus + ai-proxy + ai-aliyun-content-moderation (check_response=true)
  • Sends a non-streaming chat request (LLM mock always returns offensive content)
  • Content moderation denies the response via ngx.exit(400)
  • Asserts apisix_llm_active_connections{...} 0 in Prometheus metrics after the log phase completes

All existing tests in t/plugin/prometheus-ai-proxy.t (40 tests) continue to pass.

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)


Signed-off-by: Abhishek Choudhary <shreemaan.abhishek@gmail.com>
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working labels Apr 1, 2026