Fix: Deterministic Crash (SIGSEGV) in AsyncAppender when resizing buffer #586
base: master
The setBufferSize method failed to re-align data indices when changing the modulo (buffer size). This caused readers to access invalid slots (nullptr), leading to SIGSEGV. This patch implements a drain-and-reset strategy to safely realign pending events into the new buffer and reset atomic counters.
So why is your fix any better than the current implementation?
Thanks for the challenge @swebb2066. The difference is fundamental: the current implementation is mathematically broken even in a single-threaded (quiescent) scenario. Here is exactly why the current code causes a deterministic crash regardless of concurrency, based on
The Mathematical Flaw
// Line 519 in AsyncAppender::dispatch
auto index = priv->dispatchedCount % priv->buffer.size();
If
Scenario (Current Code Failure):
My Fix (Drain & Re-align)
Conclusion:
Are you able to create a test to validate this behavior? If the failure is reliable, a test case that shows whether the code is broken would be helpful.
Agreed, that's a fair point. Since the issue is just the modulo logic getting out of sync with the data layout, the bug is deterministic. We can reproduce it reliably without race conditions. Here is a standalone test case. On master, this asserts or crashes because
You can add this to src/test/cpp/asyncappendertestcase.cpp:
void testBufferResizeWraparound()
{
    int initialSize = 10;
    int newSize = 20;
    auto asyncAppender = std::make_shared<AsyncAppender>();
    asyncAppender->setBufferSize(initialSize);
    auto vectorAppender = std::make_shared<VectorAppender>();
    asyncAppender->addAppender(vectorAppender);
    Pool p;
    asyncAppender->activateOptions(p);
    LoggerPtr root = Logger::getRootLogger();

    // 1. warmup: push 5 events and let them drain completely.
    // this moves the internal read head (dispatchedCount) to 5.
    for (int i = 0; i < 5; ++i) {
        auto event = std::make_shared<spi::LoggingEvent>(
            root->getName(), Level::getInfo(), LOG4CXX_STR("Warmup"),
            spi::LocationInfo::getLocationUnavailable());
        asyncAppender->append(event, p);
    }
    // simple spin-wait to ensure dispatcher caught up
    for (int i = 0; i < 100000 && vectorAppender->getVector().size() < 5; ++i) {
        std::this_thread::yield();
    }
    LOGUNIT_ASSERT_EQUAL((size_t)5, vectorAppender->getVector().size());

    // 2. wrap-around: fill the buffer (10 items).
    // since we started at index 5, the data wraps: [10-14, 5-9]
    for (int i = 0; i < 10; ++i) {
        auto event = std::make_shared<spi::LoggingEvent>(
            root->getName(), Level::getInfo(), LOG4CXX_STR("CrashTest"),
            spi::LocationInfo::getLocationUnavailable());
        asyncAppender->append(event, p);
    }

    // 3. resize while wrapped.
    // without the fix, this leaves data at the old indices, but the read logic
    // uses the new size modulo, pointing to empty slots.
    asyncAppender->setBufferSize(newSize);

    // 4. trigger drain.
    // master: reads invalid memory/nullptr.
    // fix: correctly reads realigned data.
    asyncAppender->close();
    LOGUNIT_ASSERT_EQUAL((size_t)15, vectorAppender->getVector().size());
}
I'm happy to include this test directly in the PR if that helps the review.
Yes, please add this to the PR.
I believe setBufferSize() could wait for dispatch thread quiescence after changing bufferSize to zero for the duration of the lock-guarded changes (bufferSize would need to be an atomic variable).
Summary
I identified a critical algorithmic defect in AsyncAppender::setBufferSize.
The appender uses a ring buffer with modulo arithmetic (index = counter % size).
When the buffer size changes, the modulus changes, invalidating the mapping of existing data.
The Crash:
If the buffer contains wrapped data, the reader calculates indices based on the new size. These indices point to slots that are logically empty (containing nullptr from std::vector::resize). Dereferencing them causes a SEGFAULT.
Technical Analysis
The append method uses a lock-free write path relying on eventCount. setBufferSize simply called resize(), which breaks the index = count % size invariant for existing data.
Remediation
The patch implements a "Drain and Re-align" strategy: pending events are realigned into the new buffer, and the atomic counters (dispatchedCount, commitCount, eventCount) are reset to match the new linear layout.
Note: This fix assumes the caller pauses logging before resizing (quiescence), which is required anyway because append() does not hold the mutex during writes.
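A minimal sketch of the drain-and-realign idea, under the same quiescence assumption, might look like the following. RingState and drainAndRealign are hypothetical names; the counters only mirror the ones named above and are plain integers rather than atomics, and commitCount is assumed to equal eventCount while no writer is active. This is not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the appender's private state (not log4cxx code).
struct RingState
{
    std::vector<std::shared_ptr<int>> buffer;
    std::size_t dispatchedCount = 0; // events already consumed by the reader
    std::size_t commitCount = 0;     // events fully written
    std::size_t eventCount = 0;      // events claimed by writers
};

// Moves pending events into slots 0..pending-1 of a new buffer and resets the
// counters so that "counter % newSize" is valid again.
// Assumes no concurrent reader or writer (quiescence).
void drainAndRealign(RingState& s, std::size_t newSize)
{
    const std::size_t oldSize = s.buffer.size();
    const std::size_t pending = s.eventCount - s.dispatchedCount;
    assert(pending <= newSize); // this sketch does not handle shrinking below the backlog

    std::vector<std::shared_ptr<int>> fresh(newSize);
    for (std::size_t i = 0; i < pending; ++i)
    {
        // Read with the OLD modulus, write linearly into the new buffer.
        fresh[i] = std::move(s.buffer[(s.dispatchedCount + i) % oldSize]);
    }

    s.buffer = std::move(fresh);
    s.dispatchedCount = 0;
    s.commitCount = pending;
    s.eventCount = pending;
}
```

After the call, a reader that resumes at dispatchedCount and indexes with the new size finds the pending events in their original order. A real implementation would also need to decide what to do when the new size is smaller than the backlog, which this sketch simply rejects with an assertion.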