feat(api): POST /ai/summarize endpoint (HDX-3992)#2206
Open
alex-fedotyev wants to merge 8 commits into
Open
Conversation
Adds a reusable best-effort secret redactor with conservative allowlist patterns covering: PEM blocks, basic-auth URLs, key=value pairs, JSON-shaped secrets, HTTP secret headers, Bearer/Basic auth values, JWTs, AWS access keys, Slack tokens, and GitHub token shapes. Codifies the design rule for HyperDX AI endpoints in the file header: LLM input derived from observability data passes through redactSecrets; user-authored prose does not. Internal-only; no consumer in this commit. Imported by the upcoming /ai/summarize endpoint and any future LLM endpoints that ingest observability data. Refs HDX-3992.
Address review comments on #2188: - basic-auth-url now handles "@" in passwords. Previous regex stopped at the first "@", leaving any password tail before the host visible. New regex greedily consumes the password and backtracks to the last "@" before the host; host is captured and preserved in the replacement. New test: a password containing "@" must be fully redacted, with the host intact. - key-value pattern now matches shell-style quoted values: PASSWORD="hunter2 with spaces" and API_KEY='abc 123' are redacted. Previously the unquoted character class stopped at the leading quote, so neither pattern fired. Two new tests cover both quote styles. - pem pattern is bounded by {0,16000}? on the lazy match so an unmatched BEGIN does not scan an unbounded amount of trailing input. Real PEM blocks are well under 16KB; the API caps the whole request body at 50KB. New test asserts unchanged output and sub-500ms wall-clock on a 50KB unmatched-BEGIN payload. - Header "Known gaps" comment now mentions raw "@" in basic-auth usernames (ambiguous to parse without percent-encoding). 44 tests pass; eight new cases for the items above. No changes to the public surface. Refs HDX-3992.
The previous review-fix commit pushed prod lines from 139 to 153, just over the Tier 2 threshold (< 150 prod lines). Compressing the verbose comments on PEM, basic-auth-url, and key-value patterns brings prod back to 144. No behavior change.
Co-Authored-By: Claude Opus <model> <noreply@anthropic.com>
🦋 Changeset detectedLatest commit: ef0357f The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Contributor
PR Review
No critical bugs or security regressions. Secret redaction, |
alex-fedotyev
added a commit
that referenced
this pull request
May 6, 2026
Three review-prep changes against #2206: 1. Trim TONE_VALUES to `default | noir`. The original four-tone set came from the April Fools 2026 easter egg; with that egg sunset, only the detective-noir option stays as a hidden-gem alternate the front-end will gate behind a debug flag in PR D. New tones come back when the UI is ready to consume them. 2. Cap model output at 1024 tokens. Summaries are bounded at 4 sentences by the prompt rules; this is a defense-in-depth ceiling so a misbehaving model cannot stream an unbounded response within the per-minute rate limit. 3. Document the `as unknown as LanguageModel` test-mock cast and the rate-limit keyGenerator's auth-header / IP fallback so the mounted-behind-isUserAuthenticated invariant is explicit. Tests updated for the trimmed tone set; 26/26 still green. Refs HDX-3992.
The PR body has always declared this PR as having no user-facing change (internal-only utility, no consumer in this PR). The changeset was added in error and would surface a stray "feat(api)" line in the next release notes for code that no production caller reaches yet. Drop it; the consumer's PR (#2206) carries the changeset that ships the user-facing behavior.
Backend endpoint for natural-language summaries of logs/traces and patterns. Subject-prompt registry keyed by `kind`, hardcoded tone modifiers (default | noir | attenborough | shakespeare), and a 30 req/min per-user rate limit. User content is wrapped in <data> tags so the model can separate data from instructions; secrets are redacted via the utility from #2188. Initial release covers `event` and `pattern`. The `alert` kind, conversation history (`messages` array), and trace-context enrichment land in follow-up PRs as their UI consumers ship.
Three review-prep changes against #2206: 1. Trim TONE_VALUES to `default | noir`. The original four-tone set came from the April Fools 2026 easter egg; with that egg sunset, only the detective-noir option stays as a hidden-gem alternate the front-end will gate behind a debug flag in PR D. New tones come back when the UI is ready to consume them. 2. Cap model output at 1024 tokens. Summaries are bounded at 4 sentences by the prompt rules; this is a defense-in-depth ceiling so a misbehaving model cannot stream an unbounded response within the per-minute rate limit. 3. Document the `as unknown as LanguageModel` test-mock cast and the rate-limit keyGenerator's auth-header / IP fallback so the mounted-behind-isUserAuthenticated invariant is explicit. Tests updated for the trimmed tone set; 26/26 still green. Refs HDX-3992.
c10c9aa to
ef0357f
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR A of the AI Summarize stack (parent: HDX-3992). Adds a backend endpoint that generates natural-language summaries of log/trace events and patterns via the configured LLM provider. Stacks on top of #2188 (redactSecrets utility).
This replaces the original PR #2108, which is being decomposed into focused, reviewable PRs.
What this PR ships
Endpoint:
POST /ai/summarizekind(event|pattern),content, optionaltone{ summary: string }Prompt registry (
aiSummarize.ts):kind. Adding a new summarize target (alerts, anomalies, etc.) is a single registry entry plus a matching subject on the client.<data>") doesn't drift between subjects.Tone modifiers:
default,noir.defaultis the only tone the standard UI exposes;noiris a hidden-gem alternate that PR D will gate behind a debug flag. Tone is keyed by enum, never taken from raw user input, so there is no freeform prompt-injection surface.Security:
redactSecretsfrom feat(api): redactSecrets util for LLM input from observability data #2188 before being sent to the model.<data>...</data>delimiters and the system prompt explicitly tells the model to treat that block as data, not instructions.Rate limiting:
@/utils/rateLimiter(already wired forrouters/external-api/v2/index.ts); no new packages, no new middleware.Tier
Auto-classified Tier 3 because the change touches
packages/api/src/routers/, which the triage classifier flags as "hidden complexity risk" regardless of size. Production lines (178) and file count (2) fit the new Tier 2 ceiling, but the routers-touch rule is non-overrideable. Splitting the endpoint registration off does not buy a smaller diff (the router file is the new logic), so this PR lands as Tier 3 with the 26 tests below intended to make the review fast.Deliberately deferred
These were in #2108 but are not user-visible until later PRs, so they belong in those PRs:
alertkind: no UI consumer yet.messagesarray (multi-turn follow-up Q&A): no UI consumer yet.?smart=true/ localStorage wiring: lands in PR D.noirbecomes reachable then.Tests
26 tests in
packages/api/src/routers/api/__tests__/aiSummarize.test.ts:<data>, tone passed through, single-shot mode (nomessages), 429 once per-identity cap is exceeded.Stack
useAISummarizeStatehook +SummarizeBoxcomponent (front-end)noirbecomes reachableTest plan
yarn workspace @hyperdx/api jest --testPathPatterns aiSummarize: 26/26 passingyarn workspace @hyperdx/api lint:fix: 0 errors on new filesyarn workspace @hyperdx/api tsc --noEmit: cleanprose-lint: clean