-
Notifications
You must be signed in to change notification settings - Fork 190
cuda.bindings: Fix segfault when converting char*
NULL
to bytes
#497
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ad of segfaulting when converting char* NULL to bytes.
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
/ok to test |
This comment has been minimized.
This comment has been minimized.
@leofang The two test failures are these flakes:
|
Changes LGTM at the moment but I see that this is still a draft |
Thanks!
I'm looking into adding more tests, to ideally cover all changes. And we need to do something about the flakes, even though they are unrelated. |
Observed failures: ``` > assert delay_seconds * 1000 <= elapsed_time_ms < delay_seconds * 1000 + 2 # tolerance 2 ms E assert 503.1598205566406 < ((0.5 * 1000) + 2) ``` ``` > assert delay_seconds * 1000 <= elapsed_time_ms < delay_seconds * 1000 + 2 # tolerance 2 ms E assert 505.0583190917969 < ((0.5 * 1000) + 2) ```
…tring() CUDA_ERROR_INVALID_VALUE
…ve segfaulted before.
/ok to test |
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
Woah, looks like I'm good at generating segfaults ...
@leofang should I just remove that new test and leave that for another PR? I don't think I'll get to the bottom of the newly discovered segfault today. Tomorrow pretty sure. |
Just isolate out the cuda.core changes from those for bindings. Let's merge core-related things first; bindings can wait. |
/ok to test |
The cuda.core change is to resolve the flakes. (Without it I'd (maybe) have to hit rerun a few times to deflake.) I just backed out the test that triggers the segfaults: commit a39720b |
Ah, I meant we can separate it out and merge it first. Mainly if we want to auto-backport the fix, the cuda-11 branch does not have any cuda.core code, so the apto-backport would fail. I am curious if it could work without any cuda-core changes involved. |
This reverts commit 9dd2630.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
/ok to test |
@rwgk could you update the commit message and avoid (I noticed this because of a notification 😂) |
…ested by at-leofang
7814469
to
ab6db22
Compare
Oh ... sorry, done. (I did this a lot in the pybind11 repo, wanting to give proper credit. I didn't realize this can lead to spammy notifications.) |
/ok to test |
@leofang The tests passed previously with the exact same code. The only change was to the commit message. I.e. admin merge would be ideal. |
ok |
Backport failed because this pull request contains merge commits. You can either backport this pull request manually, or configure the action to skip merge commits. |
Should I take care of that? (Happy to.) |
|
Yes, please. My bad in updating your branch, sorry... |
I'll backport this together with #499, after that is merged. |
* PR #497 squash-merged * Bring back test_nvrtcGetLoweredName_failure() (it was originally under PR #497). * Add code for debugging * test_all_CUresult_codes(): max_code = int(max(cuda.CUresult)) as suggested by at-leofang * Change pytest options, mostly to disable output capturing (of both stdout and stderr) * Undo debugging changes in nvrtc.pyx.in * Revert "Change pytest options, mostly to disable output capturing (of both stdout and stderr)" This reverts commit b0464e7. * Skip new test if nvrtc version < 12.1
The segfault was discovered while working on PR #458.
Full test coverage for changed code.