Update cuda.bindings to 13.0.0 #792

Merged
leofang merged 91 commits into NVIDIA:main from the unreleased-13.0 branch on Aug 6, 2025

Conversation

leofang
Member

@leofang leofang commented Aug 4, 2025

Description

Close #791.

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

vzhurba01 and others added 30 commits May 27, 2025 14:48
…ersion_13

Make previously overlooked 12.9.0 → 13.0.0 changes
* Update SUPPORTED_WINDOWS_DLLS: kitpicks/cuda-r13-0/13.0.0/013/local_installers/cuda_13.0.0_windows.exe

* Update SUPPORTED_LINUX_SONAMES: kitpicks/cuda-r13-0/13.0.0/013/local_installers/cuda_13.0.0_580.31_linux.run

* 013 → 014: SUPPORTED_LINUX_SONAMES unchanged

* 013 → 014: SUPPORTED_WINDOWS_DLLS unchanged

* cybind update with 13.0.0 headers (014)

* Bump cuda/bindings/_version.py → 13.0.0

* test_nvjitlink.py: remove sm_60, add sm_100

* Updates from cybind after removing all 11.x headers (affects "automatically generated" comments only).

* Add new toolshed/reformat_cuda_enums_as_py.py (reads cuda.h, driver_types.h headers directly).

* Use new toolshed/reformat_cuda_enums_as_py.py to regenerate driver_cu_result_explanations.py, runtime_cuda_error_explanations.py

* Use `driver.cuDeviceGetUuid()` instead of `driver.cuDeviceGetUuid_v2()` with CTK 13.

* Adjustments for locating nvvm directory in CTK 13 installations.
* Add missing error handling (tests/test_nvjitlink.py)

* Add missing `const` in cudaMemcpyBatchAsync call (cuda/bindings/runtime.pyx.in)

* Add qa/13.0.0/01_linux.sh

* Remove qa/13.0.0/01_linux.sh after it was moved to a new upstream qa branch.

* Strictly correct casts for cudaMemcpyBatchAsync (generated by cython_gen).

* Pragmatic minimal fix for cudaMemcpyBatchAsync casts (works with Linux and Windows). (generated with cython-gen)
Fix accident from updating `SUPPORTED_WINDOWS_DLLS` for CTK 13
)

* Linux update from cuda_13.0.0_580.46_kitpicks025_linux.run: no-op b/o NVIDIA/cuda-python-private#95

* Windows update from cuda_13.0.0_kitpicks025_windows.exe
…s overlooked. Direct commit for simplicity.
…VIDIA#94)

* CCCL_INCLUDE_PATH fixes in test_event.py, test_launcher.py

* Add new file (accidentally missing in a prior commit).

* Fix pre-commit errors in new tests/helpers.py

* 12→13 compatibility fixes in cuda/core/experimental/_graph.py

* CTK 12 compatibility (tests/test_cuda_utils.py)

* Make the cuda/core/experimental/_graph.py changes backwards compatible.

* Do not try to hide `13` in cuda_core/tests/test_cuda_utils.py

* More elegant handling of `CCCL_INCLUDE_PATHS` in cuda_core/tests/helpers.py

* Remove stray empty line (cuda_core/tests/conftest.py).

* Fix logic error computing CCCL_INCLUDE_PATHS in cuda_core/tests/helpers.py
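
The commit "Use `driver.cuDeviceGetUuid()` instead of `driver.cuDeviceGetUuid_v2()` with CTK 13" implies a version-dependent choice for code that has to run against both CTK 12 and CTK 13. A minimal sketch of that choice follows. The helper name `select_get_uuid` and the stand-in driver objects are hypothetical and used only for illustration; in real code the `driver` argument would be the `cuda.bindings.driver` module, and the major version would come from wherever the caller already tracks it.

```python
from types import SimpleNamespace


def select_get_uuid(driver, ctk_major):
    """Return the UUID query function to use for the given CTK major version.

    Hypothetical helper: with CTK 13 the unsuffixed name is used, with
    earlier versions the `_v2` name is used (as stated in the commit message).
    """
    if ctk_major >= 13:
        return driver.cuDeviceGetUuid
    return driver.cuDeviceGetUuid_v2


# Stand-in objects (NOT the real bindings) so the sketch runs without a GPU.
fake_ctk13 = SimpleNamespace(cuDeviceGetUuid=lambda dev: ("unsuffixed", dev))
fake_ctk12 = SimpleNamespace(
    cuDeviceGetUuid=lambda dev: ("unsuffixed", dev),
    cuDeviceGetUuid_v2=lambda dev: ("v2", dev),
)

print(select_get_uuid(fake_ctk13, 13)(0))
print(select_get_uuid(fake_ctk12, 12)(0))
```

Selecting the function once, rather than branching at every call site, keeps the version check in a single place.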
@rwgk
Collaborator

rwgk commented Aug 6, 2025

/ok to test

Contributor

copy-pr-bot bot commented Aug 6, 2025

/ok to test

@rwgk, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@rwgk
Collaborator

rwgk commented Aug 6, 2025

/ok to test de1e83a

@rwgk
Collaborator

rwgk commented Aug 6, 2025

There is still a problem with 13.0.0 wheels, both Linux and Windows, but I think for nvvm only.

I'm puzzled, because it works interactively (I tried both Linux and Windows). I'll continue working on this in about one hour; good chance that I can keep working on it then until I wrestle this down for good.

@rwgk
Collaborator

rwgk commented Aug 6, 2025

/ok to test e18a5e8

@rwgk
Collaborator

rwgk commented Aug 6, 2025

I started the testing to see where we stand with the CI (mainly with Windows).

I know from local testing that we're still up against this error (Linux):

tests/test_kernelParams.py::test_kernelParams_empty nvrtc: error: failed to open libnvrtc-builtins.so.13.0.
  Make sure that libnvrtc-builtins.so.13.0 is installed correctly.^@
FAILED

I need to work on understanding why this happens, because this looks as expected:

rwgk-win11.localdomain:~/forked/cuda-python $ ll Cp13WslVenv/lib/python3.12/site-packages/nvidia/cu13/lib/
total 375M
-rw-r--r-- 1 rgrossekunst rgrossekunst 3.1M Aug  5 21:39 libcufile.so.0
-rw-r--r-- 1 rgrossekunst rgrossekunst  43K Aug  5 21:39 libcufile_rdma.so.1
-rw-r--r-- 1 rgrossekunst rgrossekunst  95M Aug  5 21:39 libnvJitLink.so.13
-rw-r--r-- 1 rgrossekunst rgrossekunst 4.2M Aug  5 21:39 libnvrtc-builtins.alt.so.13.0
-rw-r--r-- 1 rgrossekunst rgrossekunst 4.2M Aug  5 21:39 libnvrtc-builtins.so.13.0
-rw-r--r-- 1 rgrossekunst rgrossekunst 105M Aug  5 21:39 libnvrtc.alt.so.13
-rw-r--r-- 1 rgrossekunst rgrossekunst 105M Aug  5 21:39 libnvrtc.so.13
-rw-r--r-- 1 rgrossekunst rgrossekunst  61M Aug  5 21:42 libnvvm.so.4
rwgk-win11.localdomain:~/forked/cuda-python $

But why then the error?

@rwgk
Collaborator

rwgk commented Aug 6, 2025

Windows passes!

But for Linux, I think there is a bug/oversight in the nvidia_cuda_nvrtc wheel.

For comparison, with CTK 12:

(Ctk12NvidiaWheelsVenv) rwgk-win11.localdomain:~ $ pip install "nvidia-cuda-nvrtc-cu12"
Collecting nvidia-cuda-nvrtc-cu12
  Using cached nvidia_cuda_nvrtc_cu12-12.9.86-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl.metadata (1.7 kB)
Using cached nvidia_cuda_nvrtc_cu12-12.9.86-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (89.6 MB)
Installing collected packages: nvidia-cuda-nvrtc-cu12
Successfully installed nvidia-cuda-nvrtc-cu12-12.9.86
(Ctk12NvidiaWheelsVenv) rwgk-win11.localdomain:~/Ctk12NvidiaWheelsVenv/lib/python3.12/site-packages/nvidia/cuda_nvrtc/lib $ readelf -d libnvrtc.so.12 | grep -E 'RPATH|RUNPATH'
 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN:]
(Ctk12NvidiaWheelsVenv) rwgk-win11.localdomain:~/Ctk12NvidiaWheelsVenv/lib/python3.12/site-packages/nvidia/cuda_nvrtc/lib

Now with CTK 13:

(Ctk13NvidiaWheelsVenv) rwgk-win11.localdomain:~ $ pip install "nvidia-cuda-nvrtc~=13.0"
Collecting nvidia-cuda-nvrtc~=13.0
  Using cached nvidia_cuda_nvrtc-13.0.48-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl.metadata (1.7 kB)
Using cached nvidia_cuda_nvrtc-13.0.48-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (90.2 MB)
Installing collected packages: nvidia-cuda-nvrtc
Successfully installed nvidia-cuda-nvrtc-13.0.48
(Ctk13NvidiaWheelsVenv) rwgk-win11.localdomain:~/Ctk13NvidiaWheelsVenv/lib/python3.12/site-packages/nvidia/cu13/lib $ readelf -d libnvrtc.so.13 | grep -E 'RPATH|RUNPATH'
(Ctk13NvidiaWheelsVenv) rwgk-win11.localdomain:~/Ctk13NvidiaWheelsVenv/lib/python3.12/site-packages/nvidia/cu13/lib $
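
The two `readelf` checks above can be compared programmatically. The following is a minimal sketch that parses `readelf -d` text for `RPATH`/`RUNPATH` entries; the function name `parse_search_paths` is hypothetical, and the sample strings are the CTK 12 output line and the empty CTK 13 grep result shown in the transcripts.

```python
import re

# Matches lines such as:
#  0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN:]
_ENTRY = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*?)\]")


def parse_search_paths(readelf_output):
    """Map 'RPATH'/'RUNPATH' to the list of directories found in the text.

    Empty path components (e.g. the one after the trailing ':' in
    '$ORIGIN:') are dropped to keep the result easy to compare.
    """
    result = {}
    for line in readelf_output.splitlines():
        match = _ENTRY.search(line)
        if match:
            result[match.group(1)] = [p for p in match.group(2).split(":") if p]
    return result


ctk12_output = " 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN:]"
ctk13_output = ""  # the grep over the CTK 13 library printed nothing

print(parse_search_paths(ctk12_output))  # {'RUNPATH': ['$ORIGIN']}
print(parse_search_paths(ctk13_output))  # {}
```

An empty result for a library that opens a sibling library by name means the dynamic loader gets no hint to look in the library's own directory.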

I'm afraid we have to work around that somehow.
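
One general way to work around a missing `RUNPATH`, sketched here under assumptions and not necessarily what any commit in this PR does, is to load the sibling library by absolute path before `libnvrtc` asks for it by name. This relies on the dynamic loader reusing an already-loaded library whose SONAME matches the requested name, which is an assumption about the loader's behaviour. The function names and the `lib_dir` argument are hypothetical.

```python
import ctypes
import glob
import os


def builtins_candidates(lib_dir):
    """List libnvrtc-builtins.so.* files in lib_dir, sorted by name.

    The pattern does not match the libnvrtc-builtins.alt.so.* variants.
    """
    pattern = os.path.join(lib_dir, "libnvrtc-builtins.so.*")
    return sorted(glob.glob(pattern))


def preload_builtins(lib_dir):
    """Load each candidate by absolute path and return the handles.

    Keeping the returned handles referenced keeps the libraries loaded.
    """
    return [ctypes.CDLL(path) for path in builtins_candidates(lib_dir)]
```

A caller would pass the directory that contains `libnvrtc.so.13`, for example the `nvidia/cu13/lib` directory inside `site-packages`, and call `preload_builtins` before the first NVRTC compilation.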

@rwgk
Collaborator

rwgk commented Aug 6, 2025

/ok to test ff339b6

@rwgk
Collaborator

rwgk commented Aug 6, 2025

For completeness, this ChatGPT conversation explains how I arrived at commit ff339b6:

https://chatgpt.com/share/6892fd31-aafc-8008-a461-58ff0c602a0a

@kkraus14
Collaborator

kkraus14 commented Aug 6, 2025

/ok to test 37ef8e0

kkraus14
kkraus14 previously approved these changes Aug 6, 2025
@github-project-automation github-project-automation bot moved this from Todo to In Review in CCCL Aug 6, 2025
@leofang leofang marked this pull request as ready for review August 6, 2025 15:04
@leofang
Member Author

leofang commented Aug 6, 2025

Since the CI was already green and the latest changes were doc-only, let me admin-merge this and run the CI in #795.

@leofang leofang merged commit c016d65 into NVIDIA:main Aug 6, 2025
1 check passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Aug 6, 2025
@leofang leofang deleted the unreleased-13.0 branch August 6, 2025 15:27

github-actions bot commented Aug 6, 2025

Doc Preview CI
Preview removed because the pull request was closed or merged.

Labels
  • CI/CD: CI/CD infrastructure
  • cuda.bindings: Everything related to the cuda.bindings module
  • feature: New feature or request
  • P0: High priority - Must do!
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

  • Support CUDA 13.0
  • [BUG]: cuda_bindings/examples globalToShmemAsyncCopy_test.py "catastrophic error" is masked by pytest skipped
5 participants