Prep for driving a fork of rayon#45
Conversation
| if waiting_bitmask != 0 { | ||
| let i = waiting_bitmask.trailing_zeros() as usize; | ||
| self.get_member_data().semaphores[i].signal(); | ||
| } |
There was a problem hiding this comment.
I'm wondering if this could lead to a race condition. hear me out:
In latch.rs:245, you have
let state = self.state.swap(ASLEEP, Ordering::Relaxed); //commits to sleeping
if state == LOCKED {
waiting_bitmask.fetch_or(seat_mask, Ordering::Relaxed); //bit becomes visible here
atomic_wait::wait(&self.state, ASLEEP); //blocks, there's no re-check of `find_work`
...So as I understand it, it's possible that the worker's find_work is empty, a producer pushes a job, the producer reads waiting_bitmask == 0 (as the worker's bit is not set yet), the signal is skipped, the worker then runs The above code, and sleeps. wait() doesn't re-check the queues after announcing sleep, so it just kinda hangs around in the shared_queue.
I imagine this scenario would be rare as any other awake worker would pop the job, but may be problematic at low numbers of N (esp if it's 1). Thoughts?
There was a problem hiding this comment.
Yep, correct! Though I have not seen it lead to any issues in practice.
You are right that double-checking find_work again after setting the sleepy-bit should resolve this. It might be worth it to try that, and I think rayon does something similar.
| /// `add_reference`, a direct `fetch_add` on the underlying counter, or the | ||
| /// implicit initial increment the scope starts with. | ||
| unsafe fn remove_reference(&self) { | ||
| let counter = self.count.fetch_sub(1, Ordering::Relaxed); |
There was a problem hiding this comment.
I noticed this separately. Is ordering important here?
There was a problem hiding this comment.
Hmmm. No I don't think so. Any references with lifetime scope are guaranteed not to be dropped until this fetch_sub is observed. The way the scope blocks until the counter decrements to zero establishes a happens-before relationship.
Unless there's another thing you were referring to?
There was a problem hiding this comment.
In weakly ordered memory architectures, isn't is possible for self.completed to be stale? The surrounding non-atomic memory isn't totally ordered, and therefore, the comment about it being a live latch may be false?
This PR includes a bundle of significant improvements, largely driven by the need of rayon's parallel iterators. This includes:
!Sendfutures, as requested in Feature: Spawning jobs/tasks to a target thread #31.broadcast,spawn_broadcast, andScope::spawn_broadcast, to further parallelrayon_core.rayon-core, which allows for "headless" versions of all the parallel ops.