[BUG]: cudaLaunchHostFunc doesn't work with CUDA Graph #790

@syuoni

Description

Type of Bug

Runtime Error

Component

cuda.bindings

Describe the bug

Hi team,

I tested cuda.bindings.runtime.cudaLaunchHostFunc, and it does not seem to work correctly with CUDA Graph, whereas the C++ cudaLaunchHostFunc does.

Similar to cupy/cupy#9274, the host function can be captured into a CUDA graph, but when the graph is replayed, only the first replay calls the host function; subsequent replays do not. Is this expected?

Please see the repro steps below. Thanks!

How to Reproduce

import ctypes

import cuda.bindings.runtime as cudart
import torch


class Struct(ctypes.Structure):
    _fields_ = [
        ("a", ctypes.c_int),
    ]


def hostfunc(userData):
    data = Struct.from_address(userData)
    print(f"Hello, World! {data.a}")
    return 0


HostFn_t = ctypes.PYFUNCTYPE(ctypes.c_int, ctypes.c_void_p)


def main():
    data = Struct(a=1)

    # ctypes is managing the pointer value for us
    c_hostfunc = HostFn_t(hostfunc)
    cuda_hostfunc = cudart.cudaHostFn_t(_ptr=ctypes.addressof(c_hostfunc))

    # Run
    stream = torch.cuda.Stream()
    cudart_stream = cudart.cudaStream_t(stream.cuda_stream)

    g = torch.cuda.CUDAGraph()
    with torch.cuda.graph(g, stream=stream):
        (err, ) = cudart.cudaLaunchHostFunc(cudart_stream, cuda_hostfunc,
                                            ctypes.addressof(data))
        assert err == cudart.cudaError_t.cudaSuccess
    torch.cuda.synchronize()
    print("Graph captured", flush=True)

    with torch.cuda.stream(stream):
        for i in range(10):
            g.replay()

    torch.cuda.synchronize()


if __name__ == "__main__":
    main()
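
For readers unsure about the `_ptr=ctypes.addressof(c_hostfunc)` line above: the sketch below is a CUDA-free check of the ctypes side only. It assumes, as the repro does, that `_ptr` is meant to receive the address of the storage holding the function pointer rather than the function pointer itself. The names `calls`, `storage_addr`, `fn_ptr`, and `raw_fn` are illustrative and not part of the repro. It shows that the ctypes callback can be invoked repeatedly through its raw pointer as long as `c_hostfunc` and `data` stay referenced; it does not by itself say where the crash comes from.

```python
import ctypes


class Struct(ctypes.Structure):
    _fields_ = [("a", ctypes.c_int)]


calls = []  # records every invocation of the callback


def hostfunc(user_data):
    data = Struct.from_address(user_data)
    calls.append(data.a)
    return 0


HostFn_t = ctypes.PYFUNCTYPE(ctypes.c_int, ctypes.c_void_p)

data = Struct(a=1)
# Keep this object referenced for as long as native code may call it.
c_hostfunc = HostFn_t(hostfunc)

# Address of the ctypes object's buffer; the buffer holds the function pointer.
storage_addr = ctypes.addressof(c_hostfunc)
# The function pointer value itself.
fn_ptr = ctypes.cast(c_hostfunc, ctypes.c_void_p).value

# Dereferencing the storage address yields the function pointer.
assert ctypes.c_void_p.from_address(storage_addr).value == fn_ptr

# Rebuild a callable from the raw pointer and invoke it several times.
raw_fn = HostFn_t(fn_ptr)
for _ in range(3):
    raw_fn(ctypes.addressof(data))

print(calls)
```

If the callback is reusable here but not under graph replay, that would suggest looking at how the binding handles the callback between launches rather than at the ctypes setup, though this is only a hint and not a diagnosis.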

Expected behavior

The hostfunc should be called 10 times, once per replay. However, only the first call succeeds, and the process then crashes with a segmentation fault:

Graph captured
Hello, World! 1
Caught signal 11 (Segmentation fault: address not mapped to object at address 0x2e906)
Segmentation fault (core dumped)

Operating System

Ubuntu 24.04.2 LTS

nvidia-smi output

Mon Aug  4 10:10:28 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08              Driver Version: 575.57.08      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H20                     On  |   00000000:D1:00.0 Off |                    0 |
| N/A   37C    P0             73W /  500W |       0MiB /  97871MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA H20                     On  |   00000000:DF:00.0 Off |                    0 |
| N/A   39C    P0             76W /  500W |       0MiB /  97871MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Labels: bug (Something isn't working), triage (Needs the team's attention)
