fix(storage): enforce mutual exclusion between Close() and Finalize() in async writers#16211
fix(storage): enforce mutual exclusion between Close() and Finalize() in async writers#16211kalragauri wants to merge 1 commit into
Conversation
… in async writers
There was a problem hiding this comment.
Code Review
This pull request introduces validation checks in Finalize() and Close() for both buffered and resumed async writer connections to prevent invalid state transitions (such as calling Close() after Finalize(), or calling them multiple times), along with corresponding unit tests. The review feedback highlights critical race conditions in these checks because the lock is released before the futures are moved out, allowing concurrent calls to bypass the .valid() checks. It is recommended to use the existing finalize_ and close_ boolean flags instead, which are safely updated under the lock.
| if (!finalized_future_.valid()) { | ||
| return make_ready_future(StatusOr<google::storage::v2::Object>( | ||
| internal::FailedPreconditionError("Finalize() already called", | ||
| GCP_ERROR_INFO()))); | ||
| } | ||
| if (!closed_future_.valid()) { | ||
| return make_ready_future(StatusOr<google::storage::v2::Object>( | ||
| internal::FailedPreconditionError( | ||
| "Finalize() cannot be called after Close()", GCP_ERROR_INFO()))); | ||
| } |
There was a problem hiding this comment.
There is a race condition here. finalized_future_ is only moved out at the very end of Finalize(). However, HandleNewData() releases the lock mu_ before the function returns. If another thread calls Finalize() or Close() concurrently after the lock is released but before the function returns, finalized_future_.valid() will still be true, allowing the check to be bypassed. This can lead to duplicate execution and returning an invalid/moved-from future.
Using the existing finalize_ and close_ boolean flags (which are checked and set under the lock) completely avoids this race condition and is much safer.
if (finalize_) {
return make_ready_future(StatusOr<google::storage::v2::Object>(
internal::FailedPreconditionError("Finalize() already called",
GCP_ERROR_INFO())));
}
if (close_) {
return make_ready_future(StatusOr<google::storage::v2::Object>(
internal::FailedPreconditionError(
"Finalize() cannot be called after Close()", GCP_ERROR_INFO())));
}| if (!closed_future_.valid()) { | ||
| return make_ready_future(internal::FailedPreconditionError( | ||
| "Close() already called", GCP_ERROR_INFO())); | ||
| } | ||
| if (!finalized_future_.valid()) { | ||
| return make_ready_future(internal::FailedPreconditionError( | ||
| "Close() cannot be called after Finalize()", GCP_ERROR_INFO())); | ||
| } |
There was a problem hiding this comment.
There is a race condition here. closed_future_ is only moved out at the very end of Close(). However, the lock mu_ is released before the function returns. If another thread calls Close() or Finalize() concurrently after the lock is released but before the function returns, closed_future_.valid() will still be true, allowing the check to be bypassed.
Using the existing close_ and finalize_ boolean flags (which are checked and set under the lock) completely avoids this race condition and is much safer.
if (close_) {
return make_ready_future(internal::FailedPreconditionError(
"Close() already called", GCP_ERROR_INFO()));
}
if (finalize_) {
return make_ready_future(internal::FailedPreconditionError(
"Close() cannot be called after Finalize()", GCP_ERROR_INFO()));
}| if (!finalized_future_.valid()) { | ||
| return make_ready_future(StatusOr<google::storage::v2::Object>( | ||
| internal::FailedPreconditionError("Finalize() already called", | ||
| GCP_ERROR_INFO()))); | ||
| } | ||
| if (!closed_future_.valid()) { | ||
| return make_ready_future(StatusOr<google::storage::v2::Object>( | ||
| internal::FailedPreconditionError( | ||
| "Finalize() cannot be called after Close()", GCP_ERROR_INFO()))); | ||
| } |
There was a problem hiding this comment.
There is a race condition here. finalized_future_ is only moved out at the very end of Finalize(). However, HandleNewData() releases the lock mu_ before the function returns. If another thread calls Finalize() or Close() concurrently after the lock is released but before the function returns, finalized_future_.valid() will still be true, allowing the check to be bypassed. This can lead to duplicate execution and returning an invalid/moved-from future.
Using the existing finalize_ and close_ boolean flags (which are checked and set under the lock) completely avoids this race condition and is much safer.
if (finalize_) {
return make_ready_future(StatusOr<google::storage::v2::Object>(
internal::FailedPreconditionError("Finalize() already called",
GCP_ERROR_INFO())));
}
if (close_) {
return make_ready_future(StatusOr<google::storage::v2::Object>(
internal::FailedPreconditionError(
"Finalize() cannot be called after Close()", GCP_ERROR_INFO())));
}| if (!closed_future_.valid()) { | ||
| return make_ready_future(internal::FailedPreconditionError( | ||
| "Close() already called", GCP_ERROR_INFO())); | ||
| } | ||
| if (!finalized_future_.valid()) { | ||
| return make_ready_future(internal::FailedPreconditionError( | ||
| "Close() cannot be called after Finalize()", GCP_ERROR_INFO())); | ||
| } |
There was a problem hiding this comment.
There is a race condition here. closed_future_ is only moved out at the very end of Close(). However, the lock mu_ is released before the function returns. If another thread calls Close() or Finalize() concurrently after the lock is released but before the function returns, closed_future_.valid() will still be true, allowing the check to be bypassed.
Using the existing close_ and finalize_ boolean flags (which are checked and set under the lock) completely avoids this race condition and is much safer.
if (close_) {
return make_ready_future(internal::FailedPreconditionError(
"Close() already called", GCP_ERROR_INFO()));
}
if (finalize_) {
return make_ready_future(internal::FailedPreconditionError(
"Close() cannot be called after Finalize()", GCP_ERROR_INFO()));
}
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #16211 +/- ##
==========================================
- Coverage 92.24% 92.24% -0.01%
==========================================
Files 2265 2265
Lines 210126 210247 +121
==========================================
+ Hits 193836 193940 +104
- Misses 16290 16307 +17 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
This PR addresses feedback regarding the interaction between
Close()andFinalize()inAsyncWriterConnectionimplementations (#16163 (comment)).Previously, if
Finalize()was called followed byClose(),Close()would still append its payload, but theWriteLoopwould prioritizeFinalizeStepoverCloseStep. TheClose()future would hang indefinitely because theclosed_promise was never satisfied. This change enforces mutual exclusion between these two terminal operations, ensuring they fail fast.