Fix nvbugpro 5348750 #725

Merged
merged 5 commits into from
Jun 26, 2025

Conversation

oleksandr-pavlyk
Contributor

Description

This builds on top of #724 and changes cuda.core.experimental._module to set _loader["paraminfo"] only when the driver version is >= 12.4. If "paraminfo" is not found in _loader at run time, a NotImplementedError is raised.
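
The gating described above can be sketched as follows. This is a minimal illustration and not the code in cuda.core.experimental._module: `build_loader`, `get_param_info`, and the fake query function are hypothetical names, and the driver version is assumed to be encoded as 1000 * major + 10 * minor (so 12.4 is 12040).

```python
# Minimal sketch of version-gated registration (hypothetical names throughout).
# Assumption: driver versions are integers encoded as 1000*major + 10*minor.

PARAMINFO_MIN_DRIVER = 12040  # driver 12.4


def _fake_param_info(kernel):
    """Stand-in for the real driver query; returns made-up (offset, size) pairs."""
    return [(0, 8)]


def build_loader(driver_version):
    """Build the lookup table, adding "paraminfo" only for new enough drivers."""
    loader = {"kernel": lambda name: "kernel:" + name}
    if driver_version >= PARAMINFO_MIN_DRIVER:
        loader["paraminfo"] = _fake_param_info
    return loader


def get_param_info(loader, kernel):
    """Look up the query at run time; fail clearly if it was never registered."""
    query = loader.get("paraminfo")
    if query is None:
        raise NotImplementedError(
            "Kernel parameter info requires driver version 12.4 or newer."
        )
    return query(kernel)


if __name__ == "__main__":
    new_loader = build_loader(12040)
    print(get_param_info(new_loader, "my_kernel"))

    old_loader = build_loader(12020)
    try:
        get_param_info(old_loader, "my_kernel")
    except NotImplementedError as exc:
        print("not available:", exc)
```

Leaving the key out of the dictionary, rather than storing a placeholder, means the unsupported case is only detected when the feature is actually requested, so importing the module on an older driver still works.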

closes

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Contributor

copy-pr-bot bot commented Jun 25, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@leofang
Member

leofang commented Jun 25, 2025

LGTM, is it still a draft?

@oleksandr-pavlyk
Contributor Author

/ok to test

@oleksandr-pavlyk
Contributor Author

@leofang I opened it as a draft. I will transition it to 'ready-for-review' once CI green-lights the change.

@leofang leofang added bug Something isn't working P0 High priority - Must do! cuda.core Everything related to the cuda.core module labels Jun 25, 2025
@leofang leofang added this to the cuda.core beta 5 milestone Jun 25, 2025

@oleksandr-pavlyk
Contributor Author

/ok to test

For drivers in version range [12000, 12040), do not add
"paraminfo" to the _loader dictionary. At runtime, raise
NotImplementedError if "paraminfo" is not in the dictionary.

Only modify _loader["new"] for python version >=12
…lParamInfo

Except for one test where we check that NotImplementedError is raised.
@oleksandr-pavlyk
Contributor Author

/ok to test

An older driver, specifically 535.247.01, returns error code 400
for cluster-size-related occupancy queries on devices with compute
capability less than (9, 0).

It works fine with newer drivers, provided the actual requested
cluster size is zero.
@oleksandr-pavlyk
Contributor Author

/ok to test

@oleksandr-pavlyk oleksandr-pavlyk marked this pull request as ready for review June 26, 2025 14:48
Contributor

copy-pr-bot bot commented Jun 26, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Member

@leofang leofang left a comment

LGTM, thanks Sasha!

@github-project-automation github-project-automation bot moved this from Todo to In Review in CCCL Jun 26, 2025
@leofang leofang merged commit 0edec40 into NVIDIA:main Jun 26, 2025
53 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Jun 26, 2025
@oleksandr-pavlyk oleksandr-pavlyk deleted the fix-nvbugpro-5348750 branch June 26, 2025 14:55
Doc Preview CI
Preview removed because the pull request was closed or merged.

Labels
bug Something isn't working cuda.core Everything related to the cuda.core module P0 High priority - Must do!
Projects
Archived in project
2 participants