Open
Labels
bug (Something isn't working)
Description
Describe the bug
Attempting to use numba.cuda.atomic.add() on a float32 array results in the following error:
error: Error: : Function `_ZN8__main__11max_exampleB2v1B100cw51cXTLSUwv1sDUaKthaKSjYKCpo4DkeVtwuAKDFy2mbUGpDiIMSVRIAqgJBSaB5HDbanVYcKpbKUCjRR0oCeQZ12oCAA_3d_3dE5ArrayIfLi1E1C7mutable7alignedE5ArrayIfLi1E1C7mutable7alignedE' Basic Block `B0':
context: %0 = atomicrmw fadd ptr %.95, float %.75 seq_cst, align 4
atomicrmw does not support operation: 'fadd'.
Steps/Code to reproduce bug
Install latest numba-cuda.
Attempt to run the following script:
from numba import cuda
import numpy as np


@cuda.jit()
def sum_example(result, values):
    """Find the sum value in values and store in result[0]"""
    tid = cuda.threadIdx.x
    bid = cuda.blockIdx.x
    bdim = cuda.blockDim.x
    i = (bid * bdim) + tid
    cuda.atomic.add(result, 0, values[i])


arr = np.random.rand(16384).astype(np.float32)
result = np.zeros(1, dtype=np.float32)
sum_example[256, 64](result, arr)
print(result[0])
print(sum(arr))
Expected behavior
Kernel compiles and runs with no errors.
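The script prints the kernel result next to a host-side sum so the two can be compared by eye. A minimal sketch of that comparison as an explicit check follows. The helper name `sums_match` and the `rel_tol=1e-3` tolerance are assumptions for illustration, not part of numba-cuda; some tolerance is needed because float32 atomic accumulation happens in an unspecified order and will not match a host-side sum exactly.

```python
import math


def sums_match(gpu_sum, values, rel_tol=1e-3):
    """Return True if gpu_sum agrees with a host-side sum of values.

    sums_match and rel_tol are hypothetical choices for this sketch.
    math.fsum gives an accurately rounded double-precision reference.
    """
    reference = math.fsum(float(v) for v in values)
    return math.isclose(float(gpu_sum), reference, rel_tol=rel_tol)


# Host-only usage with a plain list; in the script above this would be
# sums_match(result[0], arr) after the kernel launch.
print(sums_match(6.0, [1.0, 2.0, 3.0]))
```

Once the kernel compiles, `sums_match(result[0], arr)` returning True would indicate that the atomic adds produced the expected total within the assumed tolerance.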
Environment details (please complete the following information):
- Environment location: Windows 11 24H2
- Method of numba-cuda install: pip
- numba-cuda version: 0.18
- GPU: RTX 5070
Additional context
Other atomics do work. I wrote the script above by modifying the numba.cuda.atomic.max example in the documentation; that example runs fine.