Conversation

@dev-tomek
Contributor

@dev-tomek dev-tomek commented Oct 28, 2025

Comes from #2755.
This PR is a temporary fix: it runs the failing kernel in a subprocess so that the abort signal can be caught by the parent test, which enables test_device_assert until pytorch/pytorch#142135 is resolved.
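
A minimal sketch of the general idea, assuming a POSIX system: when a child process is killed by SIGABRT, `subprocess.run` reports a negative return code equal to the signal number, so the parent process survives and can inspect it. The child command here (`os.abort()`) is only a stand-in for a kernel launch that trips a device-side assert; it is not the code in this PR.

```python
import signal
import subprocess
import sys

# Stand-in child: os.abort() raises SIGABRT in the child process only.
# (Assumption: this mimics a device assert that aborts the whole process.)
child_code = "import os; os.abort()"

result = subprocess.run([sys.executable, "-c", child_code], capture_output=True)

# On POSIX, a child terminated by signal N has returncode == -N.
aborted = result.returncode == -signal.SIGABRT
print("child aborted:", aborted)
```

Because the abort happens in a separate process, the test runner itself keeps running and can turn the return code into a normal pass or fail.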

@dev-tomek dev-tomek linked an issue Oct 28, 2025 that may be closed by this pull request
@dev-tomek dev-tomek changed the title from "Tkuczynski/enable test device assert" to "[TEST_DEBUG] Enable test device assert" Oct 28, 2025
@dev-tomek dev-tomek marked this pull request as ready for review October 28, 2025 21:28
@whitneywhtsang
Contributor

There have been no updates on pytorch/pytorch#142135 for a while. Do you know the ETA for it?

@dev-tomek
Contributor Author

> There have been no updates on pytorch/pytorch#142135 for a while. Do you know the ETA for it?

To my knowledge, sometime next year. I'll send you an internal ticket for that.


if should_fail:
    abort_or_runtime_error = (
        result.returncode == 1 or  # RuntimeError
Contributor

In what cases do we get a RuntimeError? At first glance, I would expect all errors to be of the SIGABRT type.
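
For reference, a small sketch of how the two outcomes look from the parent process, assuming a POSIX system and plain Python children rather than real kernels. It only illustrates why a check might accept both values; it does not show which path a given backend actually takes. The helper name `run_child` is hypothetical.

```python
import signal
import subprocess
import sys


def run_child(code: str) -> int:
    """Run a Python snippet in a fresh interpreter and return its exit status."""
    completed = subprocess.run([sys.executable, "-c", code], capture_output=True)
    return completed.returncode


# An uncaught Python exception makes the interpreter exit with status 1.
exception_rc = run_child("raise RuntimeError('stand-in for a device assert')")

# A process terminated by SIGABRT is reported as the negated signal number.
abort_rc = run_child("import os; os.abort()")

print("exception exit status is 1:", exception_rc == 1)
print("abort reported as -SIGABRT:", abort_rc == -signal.SIGABRT)
```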

Contributor

If this is needed for other devices, you could keep the old implementation for devices that are not XPU (if not is_xpu()) and add a separate branch for XPU containing what you wrote. That might simplify the code. This is optional, because I don't know whether it will actually result in less code.

Contributor

It might also make merge conflicts easier to resolve.

mask_str = "None" if mask is None else str(mask)
opt_flag_str = "None" if opt_flag is None else str(opt_flag)

result = subprocess.run([
Contributor

You can set environment variables directly when calling subprocess.run, for example for TRITON_DEBUG. This way you can pass fewer parameters into test_debug_kernels.py.
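
A minimal sketch of that suggestion: `subprocess.run` accepts an `env` mapping, so a variable such as TRITON_DEBUG can be set for the child only, without touching the parent's environment. The child command below is a stand-in that just echoes the variable; it is not test_debug_kernels.py.

```python
import os
import subprocess
import sys

# Copy the parent environment and override one variable for the child only.
child_env = {**os.environ, "TRITON_DEBUG": "1"}

# Stand-in child that prints the variable it received.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['TRITON_DEBUG'])"],
    env=child_env,
    capture_output=True,
    text=True,
)

child_value = result.stdout.strip()
print("child saw TRITON_DEBUG =", child_value)
```

Starting from a copy of `os.environ` keeps PATH and similar variables available to the child; passing only the new variable would drop everything else.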

Contributor

@anmyachev anmyachev left a comment

LGTM! I only have minor comments.

Development

Successfully merging this pull request may close these issues.

Enable all tests from test_debug.py on XPU

4 participants