@wk989898
Collaborator

What problem does this PR solve?

Issue Number: close #xxx

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

@ti-chi-bot

ti-chi-bot bot commented Dec 24, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Dec 24, 2025
@ti-chi-bot

ti-chi-bot bot commented Dec 24, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kennytm for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 24, 2025
@gemini-code-assist

Summary of Changes

Hello @wk989898, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces crucial enhancements to support batch DML operations on tables containing virtual columns. The changes primarily involve adjusting the internal logic that determines when batch DML can be utilized, specifically by removing the restriction imposed by virtual columns. Furthermore, the handling of generated columns during the construction of SQL statements has been refined to correctly exclude them, ensuring compatibility and preventing errors, thereby enabling more efficient data synchronization.

Highlights

  • Batch DML for Virtual Columns: Batch Data Manipulation Language (DML) operations are now supported for tables that include virtual columns, removing a previous limitation.
  • Updated Batch SQL Generation Logic: The shouldGenBatchSQL function has been modified to no longer consider the presence of virtual columns as a disqualifier for generating batch DML, streamlining the process.
  • Refined Generated Column Handling: Generated columns (including virtual ones) are now explicitly skipped when preparing SQL arguments and constructing WHERE clauses, ensuring correct DML execution without interference from computed values.
  • Simplified Batching Logic: The SameTypeTargetAndColumns function has been removed, which may indicate a simplification or refactoring of the logic used to determine if multiple row changes can be merged into a multi-value DML.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@wk989898
Collaborator Author

/test pull-cdc-mysql-integration-heavy

@wk989898
Collaborator Author

/test pull-cdc-mysql-integration-light

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to add support for batch DML with virtual columns. The main changes involve removing the restriction on batching for tables with virtual columns, and ensuring that generated columns are skipped when preparing DML arguments. My review has found a couple of critical issues in pkg/sink/sqlmodel/row_change.go that could lead to panics. There's an inconsistency in how generated columns are handled between ColumnCount and getArgsWithGeneratedColumn, and a bug in whereColumnsAndValues where column names are filtered but their corresponding values are not. I've also pointed out an outdated comment and a misleading test case name that should be addressed for clarity. Overall, the direction is good, but the implementation needs to be more careful about consistency and correctness when handling generated columns.

  c := 0
  for _, col := range r.sourceTableInfo.GetColumns() {
- 	if !col.Hidden {
+ 	if !col.Hidden && !col.IsVirtualGenerated() {

critical

The method IsVirtualGenerated does not seem to exist on *model.ColumnInfo. This will likely cause a compilation error.

Also, there's a potential inconsistency. This function uses IsVirtualGenerated (assuming it checks for virtual generated columns), but getArgsWithGeneratedColumn in pkg/sink/mysql/sql_builder.go uses IsGenerated(), which checks for both virtual and stored generated columns. This can lead to a mismatch between the number of columns counted here and the number of arguments prepared, causing a panic in NewRowChange if the table has stored generated columns.

To handle all generated columns consistently, you should probably use IsGenerated() here.

Suggested change
- if !col.Hidden && !col.IsVirtualGenerated() {
+ if !col.Hidden && !col.IsGenerated() {
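
To make the mismatch concrete, here is a minimal, self-contained sketch. The `column` type and both counting helpers are hypothetical stand-ins written for this illustration (they are not the code in `row_change.go` or `sql_builder.go`); they only model the two predicates discussed above, assuming a virtual generated column is one that is generated and not stored.

```go
package main

import "fmt"

// column is a hypothetical stand-in for *model.ColumnInfo. Only the
// properties needed for this illustration are modeled.
type column struct {
	Name      string
	Hidden    bool
	Generated bool // true for any generated column
	Stored    bool // true when the generated column is STORED
}

// IsGenerated reports whether the column is generated (virtual or stored).
func (c column) IsGenerated() bool { return c.Generated }

// IsVirtualGenerated reports whether the column is generated and not stored.
func (c column) IsVirtualGenerated() bool { return c.Generated && !c.Stored }

// countSkippingVirtual mirrors the predicate in the diff above:
// stored generated columns are still counted.
func countSkippingVirtual(cols []column) int {
	n := 0
	for _, c := range cols {
		if !c.Hidden && !c.IsVirtualGenerated() {
			n++
		}
	}
	return n
}

// countSkippingGenerated mirrors the suggested predicate:
// every generated column is skipped.
func countSkippingGenerated(cols []column) int {
	n := 0
	for _, c := range cols {
		if !c.Hidden && !c.IsGenerated() {
			n++
		}
	}
	return n
}

func main() {
	cols := []column{
		{Name: "id"},
		{Name: "price"},
		{Name: "total_virtual", Generated: true},
		{Name: "total_stored", Generated: true, Stored: true},
	}
	fmt.Println("counted (skip virtual only):", countSkippingVirtual(cols))
	fmt.Println("counted (skip all generated):", countSkippingGenerated(cols))
}
```

With one stored generated column in the table, the two helpers disagree (3 versus 2 here), which is the kind of length mismatch that would trip a column-count check.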

Comment on lines 234 to 237
	if !column.IsGenerated() {
		columnNames = append(columnNames, column.Name.O)
	}
}

critical

There seems to be a bug here. You are filtering out generated columns when building columnNames, but the corresponding values in the values slice are not filtered. If columns contains any generated columns (for example, if a unique index used for the WHERE clause contains a generated column), len(columnNames) will be different from len(values). This will trigger the panic you've added on line 248. You should filter values in sync with columnNames.
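
One way to keep the two slices aligned is to filter them in a single loop. The sketch below is only an illustration: `column` and `filterWhereColumns` are hypothetical names, and it assumes `columns` and `values` are index-aligned before filtering, which the snippet above does not confirm.

```go
package main

import "fmt"

// column is a hypothetical stand-in for the column metadata used when
// building the WHERE clause.
type column struct {
	Name      string
	Generated bool
}

// filterWhereColumns drops generated columns together with their values,
// so the returned names and values always have the same length.
func filterWhereColumns(cols []column, values []interface{}) ([]string, []interface{}, error) {
	if len(cols) != len(values) {
		return nil, nil, fmt.Errorf("got %d columns but %d values", len(cols), len(values))
	}
	names := make([]string, 0, len(cols))
	kept := make([]interface{}, 0, len(values))
	for i, c := range cols {
		if c.Generated {
			continue // skip the name and its value in the same step
		}
		names = append(names, c.Name)
		kept = append(kept, values[i])
	}
	return names, kept, nil
}

func main() {
	cols := []column{{Name: "a"}, {Name: "g", Generated: true}, {Name: "b"}}
	values := []interface{}{1, 99, "x"}

	names, kept, err := filterWhereColumns(cols, values)
	if err != nil {
		panic(err)
	}
	fmt.Println(names, kept)
}
```

Returning an error on a pre-existing length mismatch, rather than panicking, is a design choice of this sketch only.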

// 4. There's more than one row in the group
// 5. All events have the same safe mode status
- func (w *Writer) shouldGenBatchSQL(hasPKOrNotNullUK bool, hasVirtualCols bool, events []*commonEvent.DMLEvent) bool {
+ func (w *Writer) shouldGenBatchSQL(hasPKOrNotNullUK bool, events []*commonEvent.DMLEvent) bool {

medium

The comment on line 132, `// 3. The table doesn't have virtual columns`, is now outdated because this function no longer checks for virtual columns. Please remove it to avoid confusion.

  t.Run(tt.name, func(t *testing.T) {
  	t.Parallel()
- 	got := writer.shouldGenBatchSQL(tt.hasPK, tt.hasVirtualCols, tt.events)
+ 	got := writer.shouldGenBatchSQL(tt.hasPK, tt.events)

medium

The test case "table with virtual columns should not use batch SQL" (lines 52-58) is now misleading. The shouldGenBatchSQL function no longer checks for virtual columns. This test case passes only because it uses a single-row event.

Please consider updating this test case to either:

  1. Rename it to reflect that it's testing single-row events.
  2. Create a new test case that specifically verifies that tables with virtual columns can be batched when they have multiple rows.

Also, the hasVirtualCols field in the test struct (line 37) is no longer used and can be removed.
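
As a rough illustration of what the reworked cases could look like, here is a table-driven sketch. Everything in it is hypothetical: `event` stands in for `*commonEvent.DMLEvent`, and `shouldBatch` models only the conditions visible in this review (a primary key or not-null unique key, more than one row, a consistent safe-mode status). The real `shouldGenBatchSQL` checks further conditions that are not shown here.

```go
package main

import "fmt"

// event is a hypothetical stand-in for *commonEvent.DMLEvent.
type event struct {
	Rows     int
	SafeMode bool
}

// shouldBatch is a simplified, hypothetical model of shouldGenBatchSQL.
// Virtual columns are deliberately not an input anymore.
func shouldBatch(hasPKOrNotNullUK bool, events []event) bool {
	if !hasPKOrNotNullUK || len(events) == 0 {
		return false
	}
	total := 0
	for _, e := range events {
		if e.SafeMode != events[0].SafeMode {
			return false // mixed safe-mode status
		}
		total += e.Rows
	}
	return total > 1
}

type testCase struct {
	name   string
	hasPK  bool
	events []event
	want   bool
}

// cases lists the renamed single-row case plus a new multi-row case.
func cases() []testCase {
	return []testCase{
		{"single-row event should not use batch SQL", true, []event{{Rows: 1}}, false},
		{"multi-row event should use batch SQL", true, []event{{Rows: 2}}, true},
		{"mixed safe mode should not use batch SQL", true, []event{{Rows: 1, SafeMode: true}, {Rows: 1}}, false},
		{"table without PK should not use batch SQL", false, []event{{Rows: 2}}, false},
	}
}

func main() {
	for _, tc := range cases() {
		got := shouldBatch(tc.hasPK, tc.events)
		fmt.Printf("%s: got=%v want=%v\n", tc.name, got, tc.want)
	}
}
```

In the real test file, the multi-row case would additionally build its events from a table definition that contains a virtual column, which this sketch does not model.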

@ti-chi-bot

ti-chi-bot bot commented Dec 26, 2025

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456.

📖 For more info, you can check the "Contribute Code" section in the development guide.

@wk989898
Collaborator Author

/test all

@wk989898
Collaborator Author

/retest

@ti-chi-bot

ti-chi-bot bot commented Dec 27, 2025

@wk989898: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-error-log-review | d538fc5 | link | true | /test pull-error-log-review |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

do-not-merge/needs-linked-issue do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
