Feature query interaction latency observability#1081

Open
baiyuqing wants to merge 11 commits into serverless from feature-query-interaction-latency-observability

@baiyuqing
Collaborator

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

What is changed and how it works:

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Notable changes

  • Has configuration change
  • Has HTTP API interfaces change
  • Has tiproxyctl change
  • Other user behavior changes

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Comments:

- Add per-interaction latency histogram, slow interaction logging, and backend metric label GC with TTL.

- Add dynamic runtime settings via config hot reload for thresholds and GC interval/TTL.

- Add design doc and usage manual under docs/ for rollout, tuning, and capacity planning guidance.

- Keep backward compatibility by preserving existing query duration semantics and defaulting the new feature to off.

- add advance.query-interaction-user-patterns with glob syntax validation

- hot-reload user patterns and apply the filter only to query interaction histogram collection

- use the usernames from handshake and change-user in the command execution path, including COM_CHANGE_USER

- extend docs, config examples, and add tests for config/runtime/filter behavior
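
The username filtering described in the commit above can be illustrated with a minimal sketch. It assumes the glob syntax is the one implemented by Go's standard `path.Match`; the PR does not state which matcher it uses, and the names `validatePatterns` and `matchUser` are hypothetical, not taken from the TiProxy code.

```go
package main

import (
	"fmt"
	"path"
)

// validatePatterns rejects patterns with malformed glob syntax.
// Assumption: path.Match semantics; the real validator may differ.
func validatePatterns(patterns []string) error {
	for _, p := range patterns {
		// Matching against an empty name still reports syntax errors.
		if _, err := path.Match(p, ""); err != nil {
			return fmt.Errorf("invalid user pattern %q: %w", p, err)
		}
	}
	return nil
}

// matchUser returns the first pattern that matches the username,
// so a caller can include the matched pattern in a log entry.
// Note: with path.Match, '*' does not match a '/' character.
func matchUser(patterns []string, user string) (string, bool) {
	for _, p := range patterns {
		if ok, err := path.Match(p, user); err == nil && ok {
			return p, true
		}
	}
	return "", false
}

func main() {
	// Hypothetical values for advance.query-interaction-user-patterns.
	patterns := []string{"app_*", "report?"}
	if err := validatePatterns(patterns); err != nil {
		panic(err)
	}
	for _, u := range []string{"app_orders", "report1", "root"} {
		p, ok := matchUser(patterns, u)
		fmt.Printf("user=%s matched=%t pattern=%q\n", u, ok, p)
	}
}
```

Validating at load time lets a hot reload reject a bad pattern list before it replaces the active one, so the matching path never has to handle a syntax error.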
@ti-chi-bot ti-chi-bot bot requested review from djshow832 and xhebox February 7, 2026 09:55
@ti-chi-bot

ti-chi-bot bot commented Feb 7, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign xhebox for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/XXL label Feb 7, 2026
@codecov-commenter

codecov-commenter commented Feb 7, 2026

Codecov Report

❌ Patch coverage is 84.50363% with 64 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (serverless@b0716f5). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/proxy/backend/cmd_processor_exec.go | 84.29% | 14 Missing and 5 partials ⚠️ |
| pkg/proxy/backend/metrics.go | 82.60% | 8 Missing and 4 partials ⚠️ |
| pkg/metrics/metrics.go | 9.09% | 10 Missing ⚠️ |
| pkg/proxy/backend/sql_type.go | 88.31% | 6 Missing and 3 partials ⚠️ |
| cmd/tiproxy/main.go | 82.85% | 6 Missing ⚠️ |
| lib/config/proxy.go | 79.16% | 4 Missing and 1 partial ⚠️ |
| pkg/metrics/interaction.go | 96.07% | 1 Missing and 1 partial ⚠️ |
| pkg/proxy/backend/backend_conn_mgr.go | 87.50% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@              Coverage Diff              @@
##             serverless    #1081   +/-   ##
=============================================
  Coverage              ?   56.74%           
=============================================
  Files                 ?       83           
  Lines                 ?     7528           
  Branches              ?        0           
=============================================
  Hits                  ?     4272           
  Misses                ?     2925           
  Partials              ?      331           
| Flag | Coverage Δ |
|---|---|
| unit | 56.74% <84.50%> (?) |

- add sql_type label to query_interaction_duration_seconds

- classify COM_QUERY statements into select/update/begin/commit/..., falling back to other

- keep non-COM_QUERY commands on sql_type=other

- update docs and tests for new interaction metric granularity

- enrich slow interaction logs with connection_id, interaction_time, username and sql_type

- add username pattern match fields and dedicated matched-pattern warning log

- expose matcher API returning matched pattern and wire connection id into cmd processor

- update docs for new log events and fields

- rewrite query-interaction-latency-design.md in English

- add query-interaction-latency-design-zh.md as standalone Chinese version

- add README link to Chinese design doc
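
The `sql_type` classification described in the commits above can be sketched as a leading-keyword lookup. This is only an illustration under stated assumptions: the function name `classifySQL` is hypothetical, the keyword set contains just the four types the commit message names (the rest of its list is elided), and the real logic in pkg/proxy/backend/sql_type.go may handle comments, hints, and other cases that this sketch does not.

```go
package main

import (
	"fmt"
	"strings"
)

// knownTypes holds only the keywords named in the commit message.
// The full set used by the PR is not shown in this conversation.
var knownTypes = map[string]struct{}{
	"select": {}, "update": {}, "begin": {}, "commit": {},
}

// isLetter reports whether c is an ASCII letter.
func isLetter(c byte) bool {
	c |= 0x20 // fold ASCII upper case to lower case
	return c >= 'a' && c <= 'z'
}

// classifySQL maps a COM_QUERY statement to a low-cardinality label
// by looking at its first keyword. Anything unrecognized, including
// statements that start with a comment, falls back to "other".
func classifySQL(query string) string {
	q := strings.TrimSpace(query)
	end := 0
	for end < len(q) && isLetter(q[end]) {
		end++
	}
	kw := strings.ToLower(q[:end])
	if _, ok := knownTypes[kw]; ok {
		return kw
	}
	return "other"
}

func main() {
	for _, q := range []string{"  SELECT 1", "commit;", "SHOW TABLES", ""} {
		fmt.Printf("%q -> %s\n", q, classifySQL(q))
	}
}
```

Keeping the label set closed, with a single fallback value, bounds the number of time series that the extra label adds to query_interaction_duration_seconds.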
@ti-chi-bot

ti-chi-bot bot commented Feb 8, 2026

@baiyuqing: The following test failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-check | 7b6a5b8 | link | true | /test pull-check |
