Reland the block store lowering changes. #4646
261e487 to ed0a771
fac026c to 26ab3b4
Signed-off-by: Lu,Chengjun <[email protected]>
This PR looks too large. We will work on splitting it.
Pull Request Overview
This PR relands the 2D block store lowering changes, fixes a block store issue, and enhances StoreOp conversion with LinearLayout-based tile size computation.
- Introduces `getBlockIOTileSize` and extends `isMemoryRowMajor` to handle `StoreOp`
- Unifies and refactors StoreOp lowering into a single `matchAndRewrite`, supporting both block pointers and regular pointers via a new `Matrix2DBlockStoreOp`
- Updates tests (`blockptr_store.mlir`) for expected 2D block store patterns and adds new kernel cases; adds a new env var in `GetEnv.hpp`
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| LoadStoreOpToLLVM.cpp | Added includes, getBlockIOTileSize, extended StoreOp lowering |
| test/TritonIntelGPU/blockptr_store.mlir | Updated expected checks, added new block store test scenarios |
| include/triton/Tools/Sys/GetEnv.hpp | Added TRITON_INTEL_ENABLE_BLOCK_IO_STORE_ON_REGULAR_PTR to set |
    llvm_unreachable(("Could not find the input dim:" + inDim +
                      ", on the ll:" + ll.toString())
                         .c_str());
Copilot AI, Jul 9, 2025
Passing c_str() of a temporary std::string into llvm_unreachable risks a dangling pointer. Construct the message in a local std::string variable or use llvm::formatv to ensure the string data remains valid.
    - llvm_unreachable(("Could not find the input dim:" + inDim +
    -                   ", on the ll:" + ll.toString())
    -                      .c_str());
    + std::string errorMessage = "Could not find the input dim:" + inDim +
    +                            ", on the ll:" + ll.toString();
    + llvm_unreachable(errorMessage.c_str());
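As a rough illustration of the suggestion above, the message can be built into a named `std::string` before it is handed to any API that takes a `const char *`. The helper name `buildMissingDimMessage` and its plain-string parameters are assumptions made for this sketch and are not part of the PR; `ll.toString()` is replaced by an ordinary string argument so the snippet compiles without LLVM.

```cpp
#include <string>

// Hypothetical helper (not from the PR): assembles the diagnostic text once,
// so the caller owns the std::string and controls how long its buffer lives.
std::string buildMissingDimMessage(const std::string &inDim,
                                   const std::string &layoutText) {
  return "Could not find the input dim:" + inDim +
         ", on the ll:" + layoutText;
}
```

A caller would keep the result in a local, for example `std::string msg = buildMissingDimMessage(dim, text);`, and pass `msg.c_str()` to the reporting call, so the buffer stays valid for as long as `msg` is in scope.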
    size_t rank = tensorShape.size();
    // 2D block store supports 64 bytes per row at most.
    unsigned totalBytesPerRowPerMatrix = tileWidth * elemSizeInBits / 8;
    if (totalBytesPerRowPerMatrix > 64)
Copilot AI, Jul 9, 2025
[nitpick] The literal 64 appears as a magic number here. Consider defining a named constant (e.g. kMaxBytesPerRow) to clarify its meaning and ease future updates.
    - if (totalBytesPerRowPerMatrix > 64)
    + if (totalBytesPerRowPerMatrix > kMaxBytesPerRow)
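A minimal sketch of the named-constant version, assuming the 64-byte row limit stated in the code comment of the diff. `kMaxBytesPerRow` comes from the suggestion; the helper `fitsBlockStoreRow` is a hypothetical name used only for this illustration.

```cpp
// Assumed limit, taken from the comment in the diff:
// a 2D block store supports at most 64 bytes per row.
constexpr unsigned kMaxBytesPerRow = 64;

// Hypothetical helper: true when one tile row fits within the limit.
constexpr bool fitsBlockStoreRow(unsigned tileWidth, unsigned elemSizeInBits) {
  return tileWidth * elemSizeInBits / 8 <= kMaxBytesPerRow;
}

// Compile-time checks of the arithmetic.
static_assert(fitsBlockStoreRow(16, 32), "16 x 32-bit elements = 64 bytes");
static_assert(!fitsBlockStoreRow(32, 32), "32 x 32-bit elements = 128 bytes");
```

Keeping the limit in one `constexpr` means the comment, the comparison, and any future hardware-specific value are changed in a single place.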
    // Only lower StoreOp with dpas layout encoding.
    if (!hasDpasEncoding(tensorType))
    matchAndRewrite(triton::StoreOp op, OpAdaptor adaptor,
Copilot AI, Jul 9, 2025
[nitpick] This matchAndRewrite function has grown quite large. Consider refactoring block-pointer and regular-pointer handling into separate helper methods to improve readability and testability.
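One possible shape for such a split is a thin dispatcher that only decides which path applies and delegates the rest. Everything in this sketch is hypothetical: `PointerKind`, `StoreInfo`, and the three function names are placeholders, and the helpers return a label instead of emitting any IR.

```cpp
#include <string>

// Hypothetical classification of the store's pointer operand.
enum class PointerKind { Block, Regular };

// Hypothetical summary of what the dispatcher needs to know about the op.
struct StoreInfo {
  PointerKind kind;
};

// Placeholder helpers: each would hold the lowering logic for one pointer
// kind. Here they only report which path was taken.
std::string lowerBlockPointerStore(const StoreInfo &) {
  return "block-pointer path";
}

std::string lowerRegularPointerStore(const StoreInfo &) {
  return "regular-pointer path";
}

// The entry point stays short: classify, then delegate.
std::string lowerStore(const StoreInfo &info) {
  switch (info.kind) {
  case PointerKind::Block:
    return lowerBlockPointerStore(info);
  case PointerKind::Regular:
    return lowerRegularPointerStore(info);
  }
  return "unsupported";
}
```

With this layout each helper can be exercised on its own, which is the testability benefit the comment refers to.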
    assert(!llMask && "The masks is expected to be used with regular tensor "
                      "pointer type, but got a block pointer type.");
Copilot AI, Jul 9, 2025
Using assert for input validation can crash in release builds. Prefer returning failure() with an informative diagnostic rather than aborting, so consumers get a graceful rewrite failure.
    - assert(!llMask && "The masks is expected to be used with regular tensor "
    -                   "pointer type, but got a block pointer type.");
    + if (llMask) {
    +   rewriter.emitError(loc, "The masks are expected to be used with regular tensor "
    +                           "pointer type, but got a block pointer type.");
    +   return failure();
    + }
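The general pattern behind this suggestion, reporting a recoverable failure instead of asserting, can be sketched without any MLIR types. `CheckResult` and `checkBlockPointerStore` are hypothetical names for this illustration only; a real pattern would use the framework's own result and diagnostic mechanisms.

```cpp
#include <string>

// Hypothetical result type standing in for a rewrite status plus diagnostic.
struct CheckResult {
  bool ok;
  std::string message; // empty on success
};

// Hypothetical check mirroring the condition in the diff: a mask is only
// expected together with a regular tensor pointer, not a block pointer.
CheckResult checkBlockPointerStore(bool hasMask) {
  if (hasMask)
    return {false, "The masks are expected to be used with regular tensor "
                   "pointer type, but got a block pointer type."};
  return {true, ""};
}
```

The caller inspects `ok` and forwards `message` to whatever diagnostic channel is available, so the behavior is the same whether or not assertions are compiled in.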
I created a new PR, #4666, to enable the block store for regular pointers first.
Created #4667 for some of the easier changes.
Reland the block store lowering changes, with a fix for an issue in the block store.