void gpu_hashtable_insert(KeyValue* hashtable, uint32_t key, uint32_t value)
{
    uint32_t slot = hash(key);
    while (true)
    {
        uint32_t prev = atomicCAS(&hashtable[slot].key, kEmpty, key);
        if (prev == kEmpty || prev == key)
        {
            hashtable[slot].value = value;
            break;
        }
        slot = (slot + 1) & (kHashTableCapacity-1);
    }
}
Take a sample function like this. Unless hash(key) already reduces its result to the range [0, kHashTableCapacity), slot will very likely be greater than or equal to kHashTableCapacity. The first access to hashtable[slot] is then out of bounds and invokes undefined behavior on the first iteration, because the masking step at the bottom of the loop only runs after a failed probe.

This is an issue in both the CUDA implementation (SimpleGPUHashTable/src/linearprobing.cu, line 166 in d3f5b74):

    if (hashtable[slot].key == key)

as well as the article:
https://nosferalatu.com/SimpleGPUHashTable.html
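
One possible fix is to mask the hash before the first table access, so that every probe, including the first, stays in range. The following is a minimal host-side C++ sketch of that idea, not the project's actual code: std::atomic's compare_exchange_strong stands in for atomicCAS, and mix32, initial_slot, hashtable_insert, hashtable_lookup, and the values chosen for kHashTableCapacity and kEmpty are all assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed values for illustration; the real project defines its own.
constexpr uint32_t kHashTableCapacity = 1u << 4;
constexpr uint32_t kEmpty = 0xffffffffu;
static_assert((kHashTableCapacity & (kHashTableCapacity - 1)) == 0,
              "masking with capacity-1 requires a power-of-two capacity");

struct KeyValue {
    std::atomic<uint32_t> key{kEmpty};
    uint32_t value{0};
};

// Hypothetical mixing function that returns a full, unreduced 32-bit hash.
inline uint32_t mix32(uint32_t k) {
    k ^= k >> 16;
    k *= 0x85ebca6bu;
    k ^= k >> 13;
    k *= 0xc2b2ae35u;
    k ^= k >> 16;
    return k;
}

// Reduce the hash to a valid slot index before it is ever used.
inline uint32_t initial_slot(uint32_t key) {
    return mix32(key) & (kHashTableCapacity - 1);
}

// Linear-probing insert; assumes the table never becomes completely full.
inline void hashtable_insert(KeyValue* hashtable, uint32_t key, uint32_t value) {
    assert(key != kEmpty);  // kEmpty is reserved as the "free slot" marker
    uint32_t slot = initial_slot(key);
    while (true) {
        uint32_t expected = kEmpty;
        // On failure, expected is overwritten with the key currently stored.
        bool claimed = hashtable[slot].key.compare_exchange_strong(expected, key);
        if (claimed || expected == key) {
            hashtable[slot].value = value;
            break;
        }
        slot = (slot + 1) & (kHashTableCapacity - 1);
    }
}

// Linear-probing lookup, bounded to one pass over the table.
inline bool hashtable_lookup(const KeyValue* hashtable, uint32_t key, uint32_t* out) {
    uint32_t slot = initial_slot(key);
    for (uint32_t probes = 0; probes < kHashTableCapacity; ++probes) {
        uint32_t stored = hashtable[slot].key.load();
        if (stored == key) {
            *out = hashtable[slot].value;
            return true;
        }
        if (stored == kEmpty) {
            return false;  // hit a free slot, so the key was never inserted
        }
        slot = (slot + 1) & (kHashTableCapacity - 1);
    }
    return false;
}
```

If hash() in the repository already applies the same mask internally, the first access is in range and no change is needed. Otherwise, the same one-line reduction would be needed wherever slot is initialized from the hash, including the lookup path referenced above.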