Skip to content

MDEV-39600 Reduce contention in buf_flush_ahead()#5078

Draft
iMineLink wants to merge 1 commit into
MariaDB:11.8from
iMineLink:MDEV-39600
Draft

MDEV-39600 Reduce contention in buf_flush_ahead()#5078
iMineLink wants to merge 1 commit into
MariaDB:11.8from
iMineLink:MDEV-39600

Conversation

@iMineLink
Copy link
Copy Markdown
Contributor

When calling buf_flush_ahead() with furious=false, the function updates the async target buf_flush_async_lsn and, if necessary notifies the page cleaner thread to start flushing more eagerly, to avoid a long stall later when checkpoint will be required.

When many threads call buf_flush_ahead() with furious=false, if the update of buf_flush_async_lsn is gated inside
buf_pool.flush_list_mutex, this can cause significant contention, just to deliver the new buf_flush_async_lsn value and a potential notification to the page cleaner thread.

This commit enables, for the non furious case, lock-free update of buf_flush_async_lsn and lock-free check of the page cleaner idle flag, to avoid the contention on buf_pool.flush_list_mutex. The lock is acquired by the CAS-loop winner thread and only after the page cleaner is observed idle.
Since the the reads are not protected by the mutex, there is a chance that the decision to not signal the page cleaner thread is wrong. In the worst case though, under sustained redo-log pressure, eventually a wakeup signal will be delivered to the page cleaner thread anyway, so the risk of a long stall is limited.

When calling buf_flush_ahead() with furious=false, the function
updates the async target buf_flush_async_lsn and, if necessary notifies
the page cleaner thread to start flushing more eagerly, to avoid
a long stall later when checkpoint will be required.

When many threads call buf_flush_ahead() with furious=false, if
the update of buf_flush_async_lsn is gated inside
buf_pool.flush_list_mutex, this can cause significant contention,
just to deliver the new buf_flush_async_lsn value and a potential
notification to the page cleaner thread.

This commit enables, for the non furious case, lock-free update
of buf_flush_async_lsn and lock-free check of the page cleaner idle
flag, to avoid the contention on buf_pool.flush_list_mutex.
The lock is acquired by the CAS-loop winner thread and only
after the page cleaner is observed idle.
Since the the reads are not protected by the mutex, there is a chance
that the decision to not signal the page cleaner thread is wrong.
In the worst case though, under sustained redo-log pressure,
eventually a wakeup signal will be delivered to the page cleaner thread
anyway, so the risk of a long stall is limited.
@iMineLink iMineLink requested a review from Copilot May 14, 2026 14:56
@iMineLink iMineLink self-assigned this May 14, 2026
@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces contention in buf_flush_ahead() by making the non-furious async flush target update lock-free while preserving the mutex-protected furious checkpoint path.

Changes:

  • Splits page cleaner idle state into a separate relaxed atomic flag.
  • Changes non-furious buf_flush_ahead() to update buf_flush_async_lsn via CAS.
  • Adds compare_exchange_weak() to Atomic_relaxed.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
storage/innobase/include/buf0buf.h Separates page cleaner idle state from page_cleaner_status.
storage/innobase/buf/buf0flu.cc Implements lock-free async LSN bumping and updates page-cleaner wakeup handling.
include/my_atomic_wrapper.h Adds weak compare-exchange support to the atomic wrapper.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +2145 to +2171
/* Non-furious: lock-free monotonic-max via CAS-loop. */
lsn_t cur= buf_flush_async_lsn;
while (cur < lsn)
if (buf_flush_async_lsn.compare_exchange_weak(cur, lsn))
goto bumped;
return;

bumped:
/* In non-furious mode, concurrent writes to the log will remain
possible, and we are gently requesting buf_flush_page_cleaner()
to do more work to avoid a later call with furious=true.
We will only wake the buf_flush_page_cleaner() from an indefinite
my_cond_wait(), but we will not disturb the regular 1-second sleep.

To reduce contention for a gentle request case, the idle flag of
buf_pool is read outside of a buf_pool.flush_list_mutex critical
section, allowing early return.

If the decision to return early was (unlikely) wrong, subsequent
wake-side paths (including further buf_flush_ahead() calls under
redo-log pressure) are expected to perform the wake eventually. */
if (!buf_pool.page_cleaner_idle())
return;

/* Signal under buf_pool.flush_list_mutex. */
mysql_mutex_lock(&buf_pool.flush_list_mutex);
goto wake;
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the page cleaner's idle state management by replacing a bitmask flag with an atomic boolean and optimizes the buf_flush_ahead function using a lock-free CAS loop for non-furious LSN updates. These changes aim to reduce mutex contention by allowing lock-free idle status checks. However, the refactoring introduced a critical compilation error because the wake label was removed while goto wake statements remain in the code. The feedback highlights the need to restore this label to maintain the shared signaling logic.

@@ -2133,16 +2137,38 @@ ATTRIBUTE_COLD void buf_flush_ahead(lsn_t lsn, bool furious) noexcept
buf_pool.page_cleaner_set_idle(false);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The wake label is missing in the refactored buf_flush_ahead function, which will cause a compilation error because goto wake is used later in the non-furious path (line 2171). The label should be placed here to allow both paths to share the signaling logic.

      wake:
        buf_pool.page_cleaner_set_idle(false);


/* Signal under buf_pool.flush_list_mutex. */
mysql_mutex_lock(&buf_pool.flush_list_mutex);
goto wake;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This goto wake will fail to compile because the wake: label is not defined within the function scope in the current changes. As suggested above, the label should be added to the signaling block inside the furious path to avoid duplication and fix the compilation error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants