Skip to content

Conversation

@ThomasRaoux
Copy link
Collaborator

Add a barrier to avoid a race condition in case an assert is followed by an op that may trap if the assert condition is true. Since the tensor in those two operations may have different layout we need to make sure all the threads are done executing the assert before going to the next op.

@ThomasRaoux ThomasRaoux requested a review from ptillet as a code owner November 1, 2024 06:57
@ThomasRaoux ThomasRaoux merged commit d0db12b into triton-lang:main Nov 1, 2024
7 checks passed
@davidberard98
Copy link
Contributor

After this commit (and @peterbell10's #5037, which turns debug=True back on for the test_side_effectful_reduction test), test_side_effectful_reduction and test_side_effectful_scan are hanging for me

@ThomasRaoux
Copy link
Collaborator Author

After this commit (and @peterbell10's #5037, which turns debug=True back on for the test_side_effectful_reduction test), test_side_effectful_reduction and test_side_effectful_scan are hanging for me

hmm I wonder if we end up with barrier in control flow? @peterbell10 would this op be lowered within control flow?

@lezcano
Copy link
Contributor

lezcano commented Nov 1, 2024

ah, it makes sense that they are as we are effectively doing a sync inside an if statement lol
Not sure what's the best way to tackle this tho. We could make #4811 less general, and just edit with the mask the asserts in the region, as we may have this issue with any function that calls sync inside a region passed to reduce / scan

@davidberard98
Copy link
Contributor

end up with barrier in control flow

yes, #4811 introduces predicated asserts

// an op that may trap if the assert condition is true. Since the tensor in
// those two operations may have different layout we need to make sure all
// the threads are done executing the assert before going to the next op.
barrier();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we only insert the barrier for RankedTensor conditionals. That still fixes the issue, avoids an unnecessary barrier for scalar conds and fixes the hang with reductions all in one go.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yeah sounds like an easy enough solution

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Luosuu pushed a commit to Luosuu/triton that referenced this pull request Nov 13, 2024
…-lang#5035)

Add a barrier to avoid a race condition in case an assert is followed by
an op that may trap if the assert condition is true. Since the tensor in
those two operations may have different layout we need to make sure all
the threads are done executing the assert before going to the next op.
guacamoleo pushed a commit to guacamoleo/triton that referenced this pull request Nov 14, 2024
…-lang#5035)

Add a barrier to avoid a race condition in case an assert is followed by
an op that may trap if the assert condition is true. Since the tensor in
those two operations may have different layout we need to make sure all
the threads are done executing the assert before going to the next op.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants