Skip to content

Add get_cuda_native_handle #773

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 23, 2025
Merged

Add get_cuda_native_handle #773

merged 4 commits into from
Jul 23, 2025

Conversation

leofang
Copy link
Member

@leofang leofang commented Jul 22, 2025

Description

closes #564

The performance is comparable to the existing int() approach (O(10) ns):

In [1]: from cuda.bindings import driver

In [3]: s = driver.CUstream(1234)

In [4]: int(s)
Out[4]: 1234

In [6]: from cuda.bindings.utils import get_cuda_native_handle

In [7]: get_cuda_native_handle(s)
Out[7]: 1234

In [8]: %timeit int(s)
44.7 ns ± 1.34 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [9]: %timeit get_cuda_native_handle(s)
95.1 ns ± 1.06 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@leofang leofang added this to the cuda-python 12-next, 11-next milestone Jul 22, 2025
@leofang leofang self-assigned this Jul 22, 2025
@leofang leofang added P1 Medium priority - Should do feature New feature or request cuda.bindings Everything related to the cuda.bindings module labels Jul 22, 2025
@github-project-automation github-project-automation bot moved this to Todo in CCCL Jul 22, 2025
Copy link
Contributor

copy-pr-bot bot commented Jul 22, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@leofang
Copy link
Member Author

leofang commented Jul 22, 2025

(added a short benchmark to the PR description)

@leofang
Copy link
Member Author

leofang commented Jul 22, 2025

/ok to test 8970dca

@leofang
Copy link
Member Author

leofang commented Jul 22, 2025

/ok to test 4ca551c

@@ -287,6 +288,7 @@ def prep_extensions(sources, libraries):

# new path for the bindings from cybind
def rename_architecture_specific_files():
path = os.path.join("cuda", "bindings", "_internal")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is a bug fix exposed by the new line 224. Previously, the path variable happened to be right (leaking from the loop variable at line 227), but it's not longer correct after we append one more item to path_list.

This comment has been minimized.

@leofang leofang requested a review from shwina July 23, 2025 01:39
@leofang
Copy link
Member Author

leofang commented Jul 23, 2025

Decision to make: Should we raise DeprecationWarning in __int__()? It can be very noisy but it's the only effective way for us to inform users about the deprecation.

rwgk
rwgk previously approved these changes Jul 23, 2025
Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decision to make: Should we raise DeprecationWarning in __int__()? It can be very noisy but it's the only effective way for us to inform users about the deprecation.

Do we want to give users a grace period, in case they need to support two cuda-bindings versions for a while?

Concretely:

In the next release, deprecate only in the documentation.

Only with the next minor release bump: actually raise DeprecationWarning.

@github-project-automation github-project-automation bot moved this from Todo to In Review in CCCL Jul 23, 2025
@leofang
Copy link
Member Author

leofang commented Jul 23, 2025

/ok to test 53055e5

@leofang leofang merged commit 8cc1b68 into NVIDIA:main Jul 23, 2025
53 checks passed
@rwgk
Copy link
Collaborator

rwgk commented Jul 23, 2025

Decision to make: Should we raise DeprecationWarning in __int__()? It can be very noisy but it's the only effective way for us to inform users about the deprecation.

Do we want to give users a grace period, in case they need to support two cuda-bindings versions for a while?

Concretely:

In the next release, deprecate only in the documentation.

Only with the next minor release bump: actually raise DeprecationWarning.

@leofang WDYT about introducing a "Phased deprecations" section somewhere, so we don't forget in the future.

E.g. for this case:

__int__() method for type A, B, C were informally deprecated in cuda-bindings release vX.Y.Z. DeprecationWarning will be raised starting with release vX.Y+1.0.

Copy link

Doc Preview CI
Preview removed because the pull request was closed or merged.

@leofang leofang deleted the utils_get_handle branch July 23, 2025 17:05
@leofang
Copy link
Member Author

leofang commented Jul 23, 2025

Yup that's great idea, tracked in #775.

@rwgk
Copy link
Collaborator

rwgk commented Jul 23, 2025

@leofang This PR is causing a weird issue under Windows. I first noticed when updating the unreleased branch, but the issue is actually showing up already in the CI logs here, e.g. this run against main right after this PR was merged:

All tests pass actually, but each invocation of python generates some noise. Have you seen something like this before?

Running bindinds tests
/c/a/cuda-python/cuda-python/cuda_bindings
Error processing line 4 of C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\_cuda_bindings_redirector.pth:

  Traceback (most recent call last):
    File "<frozen site>", line 206, in addpackage
    File "<string>", line 1, in <module>
    File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\_cuda_bindings_redirector.py", line 28, in <module>
      import cuda.bindings
    File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\bindings\__init__.py", line 16, in <module>
      from cuda.bindings import utils
    File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\bindings\utils\__init__.py", line 4, in <module>
      from ._get_handle import get_cuda_native_handle
    File "cuda\\bindings\\utils\\_get_handle.pyx", line 1, in init cuda.bindings.utils._get_handle
    File "cuda\\bindings\\_lib\\utils.pyx", line 1, in init cuda.bindings._lib.utils
    File "cuda\\bindings\\driver.pyx", line 1, in init cuda.bindings.driver
    File "cuda\\bindings\\cydriver.pyx", line 1, in init cuda.bindings.cydriver
    File "cuda\\bindings\\_bindings\\cydriver.pyx", line 6, in init cuda.bindings._bindings.cydriver
  ModuleNotFoundError: No module named 'win32api'

Remainder of file ignored
============================= test session starts =============================
platform win32 -- Python 3.12.10, pytest-8.4.1, pluggy-1.6.0 -- C:\a\_tool\Python\3.12.10\x64\python.exe
cachedir: .pytest_cache
benchmark: 5.1.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: C:\a\cuda-python\cuda-python\cuda_bindings\tests
configfile: pytest.ini
plugins: benchmark-5.1.0
collecting ... collected 176 items / 1 skipped

tests\test_cuda.py::test_cuda_memcpy PASSED                              [  0%]
... <snip> ...
================= 171 passed, 6 skipped, 3 warnings in 52.02s =================

@leofang
Copy link
Member Author

leofang commented Jul 23, 2025

I've seen errors locally like this too, but I thought it was because I had a corrupted environment and it did go away after a clean rebuild. I probably know what's going on, let me see if I can put together a patch quickly.

@rwgk
Copy link
Collaborator

rwgk commented Jul 23, 2025

it did go away after a clean rebuild

That's surprising. I did multiple clean rebuilds (while backtracking from the unreleased testing), for me the issue was reliably reproducible.

let me see if I can put together a patch quickly.

Sounds great, thanks!

rwgk added a commit to rwgk/cuda-python that referenced this pull request Jul 24, 2025
leofang pushed a commit that referenced this pull request Jul 25, 2025
* Simplify cuda_bindings/site-packages/_cuda_bindings_redirector.py (as suggested by ChatGPT) to resolve #773 (comment)

* Avoid premature import of `cuda.bindings` in redirector to fix startup error

Previously, the `_cuda_bindings_redirector.py` file imported `cuda.bindings`
at interpreter startup to ensure the `cuda` namespace was initialized.
However, this triggered early loading of Cython modules that depend on
`win32api`, which may not be available yet due to `.pth` file processing order.

This commit replaces `import cuda.bindings` with `import cuda`, which is
sufficient to ensure the `cuda` namespace is in `sys.modules`. The version
redirect via `cuda.__version__` is preserved by assigning a custom
`LazyCudaModule` type to `sys.modules["cuda"]`.

As a result, `cuda.__version__` continues to work (with a deprecation warning),
while avoiding startup errors when `win32api` is not yet importable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda.bindings Everything related to the cuda.bindings module feature New feature or request P1 Medium priority - Should do
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[FEA]: Expose get_cuda_native_handle to Python
2 participants