Skip to content

[BUG] FP16 Type Fails with External Functions #336

@isVoid

Description

@isVoid

Describe the bug
If we use fp16 to foreign functions, there's prototype mismatch error. The error reproduces under:

  1. ctk 12.8 ctypes linker
  2. An older ctk with nvjitlink such as 12.0.1, 12.2, 11.8

Error Log:

/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/dispatcher.py:686: NumbaPerformanceWarning: Grid size 1 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
Traceback (most recent call last):
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/driver.py", line 3247, in _add_data
    driver.cuLinkAddData(
    ~~~~~~~~~~~~~~~~~~~~^
        self.handle,
        ^^^^^^^^^^^^
    ...<6 lines>...
        None,
        ^^^^^
    )
    ^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/driver.py", line 374, in safe_cuda_api_call
    self._check_ctypes_error(fname, retcode)
    ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/driver.py", line 445, in _check_ctypes_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [300] Call to cuLinkAddData results in CUDA_ERROR_INVALID_SOURCE

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/wangm/scratch/072125/minrepro.py", line 29, in <module>
    fails[1, 1](arr)
    ~~~~~~~~~~~^^^^^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/dispatcher.py", line 689, in __call__
    return self.dispatcher.call(
           ~~~~~~~~~~~~~~~~~~~~^
        args, self.griddim, self.blockdim, self.stream, self.sharedmem
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/dispatcher.py", line 835, in call
    kernel = _dispatcher.Dispatcher._cuda_call(self, *args)
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/dispatcher.py", line 843, in _compile_for_args
    return self.compile(tuple(argtypes))
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba/core/compiler_lock.py", line 35, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/dispatcher.py", line 1111, in compile
    kernel.bind()
    ~~~~~~~~~~~^^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/dispatcher.py", line 325, in bind
    cufunc = self._codelibrary.get_cufunc()
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/codegen.py", line 318, in get_cufunc
    cubin = self.get_cubin(cc=device.compute_capability)
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/codegen.py", line 296, in get_cubin
    self._link_all(linker, cc, ignore_nonlto=False)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/codegen.py", line 263, in _link_all
    linker.add_file_guess_ext(path, ignore_nonlto)
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/driver.py", line 2900, in add_file_guess_ext
    self.add_cu(path_or_code.data, path_or_code.name)
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/driver.py", line 2830, in add_cu
    self.add_ptx(ptx.encode(), ptx_name)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/driver.py", line 3231, in add_ptx
    return self._add_data(enums.CU_JIT_INPUT_PTX, ptx, name)
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangm/miniforge3/envs/numbast/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/driver.py", line 3258, in _add_data
    raise LinkerError("%s\n%s" % (e, self.error_log))
numba.cuda.cudadrv.driver.LinkerError: [300] Call to cuLinkAddData results in CUDA_ERROR_INVALID_SOURCE
error   : Prototype doesn't match for 'add' in '<unnamed-cu>.ptx', first defined in '<unnamed-cu>.ptx'

Steps/Code to reproduce bug

import numpy as np
from numba import cuda
from numba.types import float16

src = """
#include <cuda_fp16.h>
extern "C"
__device__ int add(half * res, half a, half b) {
    *res = a + b;
    return 0;
}
"""
add = cuda.declare_device("add", float16(float16, float16), link=cuda.CUSource(src))
@cuda.jit
def fails(arr):
    arr[0] = add(float16(3.14), float16(3.14))


arr = np.zeros(1, dtype=np.float16)
fails[1, 1](arr)
print(arr)

Execute with

NUMBA_CUDA_USE_NVIDIA_BINDING=0 python minrepro.py

Expected behavior
Prints [6.28]

Environment details (please complete the following information):
Conda installed ctk, numba-cuda

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions