NIFI-15570: Keep track of Content Claims where the last Claim in a Re…#10874
markap14 wants to merge 1 commit into apache:main
…source Claim can be truncated if it is large. Whenever FlowFile Repository is checkpointed, truncate any large Resource Claims when possible and necessary to avoid having a situation where a small FlowFile in a given Resource Claim prevents a large Content Claim from being cleaned up.
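To make the idea concrete, here is a minimal sketch of truncating the tail of a file that holds several claims back to back. It assumes the large claim is the last one in the file and starts at a known offset. The class `TailTruncationSketch` and the method `truncateTail` are hypothetical names for illustration and are not taken from this change; only `FileChannel.truncate` is a standard Java API.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; this is not the TruncateClaims code from the PR.
class TailTruncationSketch {

    /**
     * Cuts the file off at tailOffset, discarding the last (large) claim
     * while keeping every claim written before it.
     * Returns the number of bytes reclaimed.
     */
    static long truncateTail(final Path resourceClaimFile, final long tailOffset) throws IOException {
        try (FileChannel channel = FileChannel.open(resourceClaimFile, StandardOpenOption.WRITE)) {
            final long originalSize = channel.size();
            if (tailOffset >= originalSize) {
                return 0L; // nothing stored after the offset, so nothing to reclaim
            }
            channel.truncate(tailOffset);
            return originalSize - tailOffset;
        }
    }

    public static void main(final String[] args) throws IOException {
        final Path file = Files.createTempFile("resource-claim", ".bin");
        try {
            final byte[] small = new byte[16];          // small claim, assumed still referenced
            final byte[] large = new byte[1024 * 1024]; // large trailing claim, assumed unreferenced
            Files.write(file, small);
            Files.write(file, large, StandardOpenOption.APPEND);

            final long reclaimed = truncateTail(file, small.length);
            System.out.println("Reclaimed " + reclaimed + " bytes; file is now " + Files.size(file) + " bytes");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The small claim at the front of the file stays readable, and only the bytes after the given offset are released.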
        return false;
    }

    private void truncate(final ContentClaim claim) {
The truncate method doesn't verify that the claimant count is still 0 before truncating. If a clone operation increments the claimant count while the truncation task is mid-flight, we could truncate content that is still referenced. Isn't it a concern?
Wondering if we could have a race condition:
1. `TruncateClaims.truncateClaims()` checks `claim.isTruncationCandidate()` and sees `true`
2. A clone operation calls `incrementClaimantCount()`, which sets `truncationCandidate = false` and increments the claimant count
3. `TruncateClaims.truncate()` proceeds to truncate the file anyway, corrupting the data for the newly cloned FlowFile
Or maybe this scenario is not possible for some reason that I missed?
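For illustration, a defensive guard of the kind this question has in mind could look like the sketch below. It assumes the claimant count and the candidate flag live behind a single lock so that the check and the truncation happen atomically. `GuardedClaimSketch` and its methods are hypothetical and are not the PR's code or NiFi's API.

```java
// Hypothetical sketch; class and method names are illustrative only.
class GuardedClaimSketch {
    private int claimantCount;
    private boolean truncationCandidate;
    private boolean truncated;

    GuardedClaimSketch(final int initialClaimantCount, final boolean truncationCandidate) {
        this.claimantCount = initialClaimantCount;
        this.truncationCandidate = truncationCandidate;
    }

    /** Mirrors the clone path described above: a new reference clears the candidate flag. */
    synchronized int incrementClaimantCount() {
        truncationCandidate = false;
        return ++claimantCount;
    }

    /**
     * Re-checks both conditions under the same lock that guards the increment,
     * so a concurrent clone either happens first (and blocks truncation)
     * or happens after truncation has already been decided.
     */
    synchronized boolean truncateIfUnreferenced() {
        if (truncated || !truncationCandidate || claimantCount != 0) {
            return false;
        }
        truncated = true; // a real implementation would truncate the file here
        return true;
    }

    synchronized boolean isTruncated() {
        return truncated;
    }
}
```

Whether such a guard is needed depends on whether anything can still obtain a reference to the claim once it has been handed to the truncation task.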
Thanks for reviewing @pvillard31!
In short, no, that should not be possible. The only way we will ever queue up the ContentClaim for truncation is if the FlowFile Repository is synced to disk (typically on checkpoint, but also possibly on every commit if the fsync property in nifi.properties is set to true) and the Content Claim has truncationCandidate = true. At that point, the FlowFile Repository is the owner of the Content Claim, no Processor has access to it, and the Repository has determined that there are no longer any references to it. As a result, we'll only queue up the Content Claim for truncation if there's only 1 referencing FlowFile and that one referencing FlowFile is now being removed. So there are no concerns about the claimant count going back up.
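The ordering described here can be sketched roughly as follows. This is a simplified model under stated assumptions, not the repository code: `SyncGatedTruncationSketch`, its nested `Claim` class, and the method names are all hypothetical. The point it illustrates is that a claim only reaches the truncation queue after its last referencing FlowFile has been removed and the repository has synced.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch; names are illustrative and not taken from the NiFi codebase.
class SyncGatedTruncationSketch {

    static final class Claim {
        final String id;
        final boolean truncationCandidate;
        int claimantCount;

        Claim(final String id, final boolean truncationCandidate, final int claimantCount) {
            this.id = id;
            this.truncationCandidate = truncationCandidate;
            this.claimantCount = claimantCount;
        }
    }

    // Claims whose last reference was removed, held back until the next sync.
    private final List<Claim> pendingUntilSync = new ArrayList<>();
    // Claims that are safe to hand to a background truncation task.
    private final Queue<Claim> truncationQueue = new ArrayDeque<>();

    /** Called when the repository records removal of a FlowFile that referenced the claim. */
    void onFlowFileRemoved(final Claim claim) {
        claim.claimantCount--;
        // Only the removal of the single remaining reference makes the claim eligible.
        if (claim.claimantCount == 0 && claim.truncationCandidate) {
            pendingUntilSync.add(claim);
        }
    }

    /** Called once the repository update has been synced to disk. */
    void onRepositorySynced() {
        truncationQueue.addAll(pendingUntilSync);
        pendingUntilSync.clear();
    }

    int queuedForTruncation() {
        return truncationQueue.size();
    }
}
```

In this model a claim that still has another referencing FlowFile never enters the pending list, so nothing can raise its claimant count between queueing and truncation.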
As a side note, the integration test failure was caused by another commit and is now fixed if you rebase on main.