Add a new stage to generate zebin to align CUDA stages in triton.compile
#5189
4cba65d to a943a26
Pull Request Overview
This PR adds a new "zebin" compilation stage for the XPU backend to align with the CUDA compilation stages in triton.compile. The change introduces zebin as a binary format alternative to SPIRV for Intel XPU targets.
- Adds `make_zebin` method to generate zebin binary format from SPIRV input
- Updates binary extension from "spv" to "zebin" for XPU backend
- Modifies compilation pipeline to handle zebin as a binary format alongside cubin and hsaco
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| third_party/intel/backend/compiler.py | Adds zebin compilation stage and updates binary extension |
| python/triton/compiler/compiler.py | Updates file parsing and compilation pipeline to support zebin format |
Instead of using ocloc to generate the native binary, can we use L0 (Level Zero) to generate it?
How about trying: https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/PROG.html#module-caching-with-native-binaries
a943a26 to c7cbf86
c7cbf86 to f2186c2
f2186c2 to dabaee1
The L0 API requires passing the device context, which is not available during …
bbd3669 to b66f0b5
size_t global_range_y = {gridY};
size_t global_range_z = {gridZ};
size_t local_range_x = {num_warps} * {threads_per_warp};
if (driver_version.find("+") != std::string::npos) {{
This code doesn't make sense. Remove it.
python/triton/compiler/compiler.py
Outdated
# stores the text of each level of IR that was generated during compilation
asm_files = [Path(p) for c, p in metadata_group.items() if not c.endswith(".json")]

def read_file(path):
Why?
Both the spv and zebin files are in binary format, so the intermediate files need to be dumped in either text or binary format.
Maybe it's worth rewriting without exceptions? They usually work noticeably slower.
I added a new implementation that follows the parse function and avoids using exceptions.
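For illustration only, here is a minimal sketch of an exception-free `read_file` that picks text or binary mode from the file suffix. This is not the code merged in this PR: the `BINARY_SUFFIXES` set and the demo file names are assumptions made for the example, and the bytes written to the demo files are placeholders.

```python
import tempfile
from pathlib import Path

# Hypothetical set of suffixes treated as binary artifacts (an assumption for
# this sketch; a real backend would supply its own binary extension).
BINARY_SUFFIXES = {".spv", ".zebin", ".cubin", ".hsaco"}


def read_file(path):
    """Return bytes for binary artifacts and str for textual IR.

    The mode is chosen up front from the suffix, so no try/except around a
    failed text decode is needed.
    """
    path = Path(path)
    if path.suffix in BINARY_SUFFIXES:
        return path.read_bytes()
    return path.read_text()


# Small demo with placeholder contents.
with tempfile.TemporaryDirectory() as tmp:
    ttir = Path(tmp) / "kernel.ttir"
    ttir.write_text("module {}")
    zebin = Path(tmp) / "kernel.zebin"
    zebin.write_bytes(b"\x00\x01\x02")

    text_ir = read_file(ttir)   # str
    native = read_file(zebin)   # bytes
```

Deciding the mode from the suffix keeps the common path free of exception handling, at the cost of maintaining the list of binary suffixes.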
5f6a603 to 273e7b1
… or option = {"generate_native_code": 1}.
Signed-off-by: Lu,Chengjun <[email protected]>
LGTM! @whitneywhtsang @etiotto?
LGTM. There is no change by default. When generate_native_code is true, instead of replacing the spv stage with a zebin stage, the PR adds an additional stage to generate zebin.
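The behavior described in this comment can be sketched roughly as follows. This is a simplified, self-contained model, not the backend's actual code: `make_spv`, `make_zebin`, `add_stages` and `run_pipeline` here are stand-ins written for this example, and the stage bodies are placeholders that only tag the bytes they receive. Only the option name `generate_native_code` comes from the PR.

```python
def make_spv(src, metadata):
    # Placeholder: a real backend would lower its input to SPIR-V here.
    return b"spv:" + src


def make_zebin(src, metadata):
    # Placeholder: a real backend would compile SPIR-V to a native binary here.
    return b"zebin:" + src


def add_stages(stages, options):
    # The spv stage is always registered, so the default pipeline is unchanged.
    stages["spv"] = make_spv
    # The zebin stage is appended only when native code generation is requested.
    if options.get("generate_native_code"):
        stages["zebin"] = make_zebin


def run_pipeline(src, options):
    """Run each registered stage in order, keeping every stage's output."""
    stages = {}
    add_stages(stages, options)
    metadata = {}
    artifacts = {}
    for name, stage in stages.items():
        src = stage(src, metadata)
        artifacts[name] = src
    return artifacts


default_artifacts = run_pipeline(b"llir", {})
native_artifacts = run_pipeline(b"llir", {"generate_native_code": 1})
```

In this model the default run produces only the `spv` artifact, while the run with `generate_native_code` set produces both `spv` and `zebin`, with the zebin stage consuming the spv output.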