
Conversation

@chengjunlu (Contributor) commented on Sep 25, 2025:

This PR adds a new "zebin" compilation stage to the XPU backend, aligning it with the CUDA compilation stages in triton.compile. The change introduces zebin as a binary-format alternative to SPIRV for Intel XPU targets.

chengjunlu changed the title from "Add a new stage to generate zebin to align CUDA stages in triton.compile" to "[Draft] Add a new stage to generate zebin to align CUDA stages in triton.compile" on Sep 25, 2025
chengjunlu force-pushed the chengjun/add_zebin_stage branch from 4cba65d to a943a26 on September 25, 2025 06:00
etiotto requested a review from Copilot on September 25, 2025 14:58
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR adds a new "zebin" compilation stage to the XPU backend, aligning it with the CUDA compilation stages in triton.compile. The change introduces zebin as a binary-format alternative to SPIRV for Intel XPU targets.

  • Adds a make_zebin method to generate the zebin binary format from SPIRV input (see the sketch after this list)
  • Updates the binary extension from "spv" to "zebin" for the XPU backend
  • Modifies the compilation pipeline to handle zebin as a binary format alongside cubin and hsaco
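
For a concrete picture, here is a minimal sketch of what an ocloc-based make_zebin stage could look like. The function signature, flag set, and the default "pvc" device id are illustrative assumptions rather than the PR's exact code:

```python
import os
import subprocess
import tempfile


def make_zebin(spirv: bytes, device: str = "pvc") -> bytes:
    """Compile a SPIR-V module to a zebin native binary via ocloc (sketch).

    A real stage would take its device id and build flags from the
    compilation metadata/options instead of hard-coding them here.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "kernel.spv")
        with open(src, "wb") as f:
            f.write(spirv)
        # -spirv_input marks the input as SPIR-V (not OpenCL C source);
        # -output/-output_no_suffix pin the exact output file name.
        subprocess.run(
            ["ocloc", "compile", "-file", src, "-spirv_input",
             "-device", device, "-output", "kernel",
             "-output_no_suffix", "-out_dir", tmp],
            check=True,
        )
        with open(os.path.join(tmp, "kernel"), "rb") as f:
            return f.read()
```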

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| third_party/intel/backend/compiler.py | Adds the zebin compilation stage and updates the binary extension |
| python/triton/compiler/compiler.py | Updates file parsing and the compilation pipeline to support the zebin format |


@etiotto (Contributor) left a comment

Instead of using ocloc to generate the native binary, can we use L0 to generate it?

How about trying: https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/PROG.html#module-caching-with-native-binaries
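
The core of that linked flow is zeModuleGetNativeBinary, which reads the device-native binary back out of an already-built module. Below is a rough Python/ctypes sketch, assuming the Level Zero loader is installed; note that the module handle it consumes must first be built with zeModuleCreate, which itself requires driver, context, and device handles:

```python
import ctypes

# Assumes the Level Zero loader library is available on this system.
ze = ctypes.CDLL("libze_loader.so.1")


def module_native_binary(hmodule: ctypes.c_void_p) -> bytes:
    """Fetch the native binary from a built ze_module handle (sketch only)."""
    size = ctypes.c_size_t(0)
    # First call queries the size of the native binary...
    ze.zeModuleGetNativeBinary(hmodule, ctypes.byref(size), None)
    buf = (ctypes.c_uint8 * size.value)()
    # ...the second call copies the bytes out.
    ze.zeModuleGetNativeBinary(hmodule, ctypes.byref(size), buf)
    return bytes(buf)
```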

etiotto marked this pull request as draft on October 9, 2025 14:10
chengjunlu force-pushed the chengjun/add_zebin_stage branch from a943a26 to c7cbf86 on October 29, 2025 03:32
chengjunlu marked this pull request as ready for review on October 29, 2025 03:32
chengjunlu linked an issue on Oct 29, 2025 that may be closed by this pull request
chengjunlu force-pushed the chengjun/add_zebin_stage branch from c7cbf86 to f2186c2 on October 29, 2025 03:50
chengjunlu changed the title from "[Draft] Add a new stage to generate zebin to align CUDA stages in triton.compile" to "Add a new stage to generate zebin to align CUDA stages in triton.compile" on Oct 29, 2025
chengjunlu force-pushed the chengjun/add_zebin_stage branch from f2186c2 to dabaee1 on October 29, 2025 04:28
@chengjunlu (Contributor, Author) replied:

> Instead of using ocloc to generate the native binary, can we use L0 to generate it?
>
> How about trying: https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/PROG.html#module-caching-with-native-binaries

The L0 API requires passing a device context, which is not available in the triton.compile context.

chengjunlu force-pushed the chengjun/add_zebin_stage branch 2 times, most recently from bbd3669 to b66f0b5, on October 29, 2025 07:13
The next self-review thread is on these templated C++ lines in the diff:

```cpp
size_t global_range_y = {gridY};
size_t global_range_z = {gridZ};
size_t local_range_x = {num_warps} * {threads_per_warp};
if (driver_version.find("+") != std::string::npos) {{
```
@chengjunlu (Author) commented:

This code doesn't make sense. Remove it.

The following thread is on the IR-dumping code in python/triton/compiler/compiler.py:

```python
# stores the text of each level of IR that was generated during compilation
asm_files = [Path(p) for c, p in metadata_group.items() if not c.endswith(".json")]

def read_file(path):
```
Contributor: Why?

@chengjunlu (Author): Both spv and zebin are binary formats; this is so the intermediate files can be dumped either as text or as binary.

Contributor: Maybe it's worth rewriting this without exceptions? They are usually noticeably slower.

@chengjunlu (Author): I added a new implementation that follows the parse function and avoids exceptions.
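
Roughly, the exception-free variant can pick text vs. binary reads up front from the file suffix instead of attempting a UTF-8 decode and catching the failure. A minimal sketch, assuming the binary-extension set introduced in this PR:

```python
from pathlib import Path

# Artifacts with these suffixes are binary; everything else is text IR.
BINARY_EXTS = {"spv", "zebin"}


def read_file(path: Path):
    # Choose the read mode from the suffix instead of relying on a
    # UnicodeDecodeError to detect binary content.
    if path.suffix.lstrip(".") in BINARY_EXTS:
        return path.read_bytes()
    return path.read_text()
```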

chengjunlu and others added 3 commits on October 31, 2025 08:55:
… or option = {"generate_native_code": 1}.

Signed-off-by: Lu,Chengjun <[email protected]>
Signed-off-by: Lu,Chengjun <[email protected]>
@anmyachev (Contributor) left a comment

chengjunlu merged commit 9e23713 into main on Nov 3, 2025
23 checks passed
chengjunlu deleted the chengjun/add_zebin_stage branch on November 3, 2025 03:26
@whitneywhtsang (Contributor) left a comment

LGTM; there is no change by default. When generate_native_code is true, rather than replacing the spv stage with a zebin stage, an additional stage is added to generate the zebin.
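
In other words, the stage wiring behaves roughly like the sketch below; the method names and options field mirror the discussion here but are illustrative, not the merged code:

```python
def add_stages(self, stages, options):
    # Default pipeline (unchanged): the final artifact is SPIR-V.
    stages["ttir"] = lambda src, metadata: self.make_ttir(src, metadata, options)
    stages["ttgir"] = lambda src, metadata: self.make_ttgir(src, metadata, options)
    stages["llir"] = lambda src, metadata: self.make_llir(src, metadata, options)
    stages["spv"] = lambda src, metadata: self.make_spv(src, metadata, options)
    # Opt-in: append a zebin stage after (not instead of) the spv stage,
    # so nothing changes unless generate_native_code is set.
    if options.generate_native_code:
        stages["zebin"] = lambda src, metadata: self.make_zebin(src, metadata, options)
```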


Development

Successfully merging this pull request may close this issue:

Binary kernel for Inductor static kernel launcher.
