fix(amber): avoid AtomicInteger.get_and_set deadlock#5010
Conversation
|
Thanks @nathant27! Please do the following:
this seems to be a bug on both main and |
|
Thanks for the feedback! I changed the formatting to match template format, and Im looking into adding the testing(getting rid of the bug tracking test and adding a new one). Currently out at the moment but will be done by later tonight or early tomorrow morning |
|
Just updated my description for the updated tests @Yicong-Huang. Let me know if these updates to the test seem fine for now. I'm happy to add more if needed |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5010 +/- ##
=========================================
Coverage 42.64% 42.64%
Complexity 2188 2188
=========================================
Files 1045 1045
Lines 39880 39880
Branches 4205 4205
=========================================
+ Hits 17006 17008 +2
+ Misses 21814 21812 -2
Partials 1060 1060
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
bcaa234 to
d1f2e41
Compare
What changes were proposed in this PR?
Fixed deadlock happening in AtomicInteger.get_and_set (amber/src/main/python/core/util/atomic.py).
Before, get_and_set acquired non-reentrant lock and then accessed value property, which attempts to grab the same lock, causing a deadlock.
After the change, get_and_set now accesses the objects self._value property directly inside the existing critical section, which avoids the nested lock acquisition.
-Chose to do the inline change instead of changing to Reentrant lock to avoid potentially unnecessary overhead and because it seems like the more appropriate way to access in this context, since we're already grabbing the lock anyway in get_and_set
Any related issues, documentation, or discussions?
Fixes #4794
How was this PR tested?
in src/test/python/core/util/test_atomic.py
test_get_and_set_currently_deadlocks_on_non_reentrant_lock, bug pinned test
after the fixes now fails on both asserts lines 99 to 106 on removed test(test_get_and_set_currently_deadlocks_on_non_reentrant_lock)
This is expected behavior because worker should not be alive after fixing deadlock, and should be completed.
2 changes to test_atomic.py
REPLACED
test_get_and_set_currently_deadlocks_on_non_reentrant_lock
WITH
test_get_and_set_does_not_deadlock_on_non_reentrant_lock
UPDATED test_get_and_set_should_return_old_value_and_replace_state
Was this PR authored or co-authored using generative AI tooling?
No