@jakub-sochacki
Contributor

@jakub-sochacki jakub-sochacki commented Aug 20, 2025

  • Add tests from triton_kernels directory to test-triton.sh
  • Tests can be triggered with test-triton.sh --triton-kernels and are ENABLED BY DEFAULT
  • Ignore tests in test_tensor_details directory to skip NVIDIA HW tests
  • Added get_device_capability() function to conftest.py with XPU support, returning capability tuple (9,) for XPU devices and (0,) as fallback for unknown devices.
  • Parametrized device in mxfp and routing tests enabling testing on different backends including XPU.
  • Added XPU backend support to target_info.py by implementing is_xpu() function for XPU device detection and extending num_sms() to return max_compute_units for XPU devices.
  • Fixed SwiGLU kernel XPU compatibility by excluding XPU backend from NVIDIA-specific maxnreg parameter.
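
The capability fallback and the maxnreg exclusion described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `cuda_query` parameter and the `swiglu_launch_kwargs` helper are hypothetical names introduced here, and the real conftest.py presumably queries the device runtime directly.

```python
from typing import Callable, Dict, Optional, Tuple


def get_device_capability(
    device_type: str,
    cuda_query: Optional[Callable[[], Tuple[int, int]]] = None,
) -> Tuple[int, ...]:
    """Return a capability tuple for the given device type.

    `cuda_query` is a hypothetical hook standing in for a runtime query
    on CUDA devices; it is an assumption made to keep the sketch
    self-contained.
    """
    if device_type == "cuda" and cuda_query is not None:
        return tuple(cuda_query())
    if device_type == "xpu":
        # Value stated in the PR description for XPU devices.
        return (9,)
    # Fallback for unknown devices, as stated in the PR description.
    return (0,)


def swiglu_launch_kwargs(backend: str, maxnreg: int) -> Dict[str, int]:
    """Build launch options, leaving out maxnreg on the XPU backend.

    Hypothetical helper: it only illustrates excluding XPU from the
    NVIDIA-specific maxnreg parameter.
    """
    kwargs: Dict[str, int] = {}
    if backend != "xpu":
        kwargs["maxnreg"] = maxnreg
    return kwargs
```

With this shape, tests that branch on capability still receive a tuple on every backend, and the SwiGLU launch never passes maxnreg to XPU.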

@jakub-sochacki jakub-sochacki linked an issue Aug 20, 2025 that may be closed by this pull request
@jakub-sochacki jakub-sochacki marked this pull request as ready for review August 25, 2025 11:02
@anmyachev
Contributor

Hi @jakub-sochacki

What else is left to enable in tests folder?
triton_kernels has a separate file pyproject.toml for building. Could you please clarify (with code links) how Triton itself runs these tests? You probably need to check their CI: https://github.com/triton-lang/triton

@jakub-sochacki
Contributor Author

Hi @jakub-sochacki

What else is left to enable in tests folder? triton_kernels has a separate file pyproject.toml for building. Could you please clarify (with code links) how Triton itself runs these tests? You probably need to check their CI: https://github.com/triton-lang/triton

Current status is:
test_tensor_details/ ❌ FAIL (NVIDIA-specific tests, SKIPPING)
test_compaction.py ✅ PASS
test_matmul.py ❌ FAIL (need to implement make_default_opt_flags() variant for XPU, more investigation needed, SKIPPING)
test_mxfp.py ✅ PASS
test_routing.py ✅ PASS
test_specialize.py ✅ PASS
test_swiglu.py ✅ PASS
test_tensor.py ⚠️ (empty test file, INCLUDE)
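
One way to express the status list above as a pytest invocation is to pass `--ignore` for each excluded path. The sketch below only builds the argument list. Whether the PR excludes test_matmul.py with `--ignore` or with another mechanism is an assumption here, and `build_pytest_args` is a hypothetical helper.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def build_pytest_args(tests_dir: str, ignored: Iterable[str]) -> List[str]:
    """Return pytest arguments that run tests_dir minus the ignored entries."""
    root = PurePosixPath(tests_dir)
    args = [str(root)]
    for name in ignored:
        # --ignore is a standard pytest option taking a file or directory path.
        args.append(f"--ignore={root / name}")
    return args


# Entries currently excluded according to the status list.
ARGS = build_pytest_args(
    "python/triton_kernels/tests",
    ["test_tensor_details", "test_matmul.py"],
)
```

The resulting list can be appended to a `pytest` command line, so re-enabling a test later is a one-entry change.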

How does Triton use python/triton_kernels/tests?
The python/triton_kernels/tests directory is executed directly as part of the test-unit target without requiring installation via pyproject.toml. This is different from the bench/ directory approach where .toml is used.
$(PYTEST) -s -n 8 python/triton_kernels/tests/
Please see: https://github.com/triton-lang/triton/blob/main/Makefile#L38

The CI runs integration tests from dedicated NVIDIA and AMD workflow YAML files:
See: https://github.com/triton-lang/triton/blob/main/.github/workflows/ci.yml#L27
The NVIDIA-dedicated yml runs make test-unit: https://github.com/triton-lang/triton/blob/main/.github/workflows/integration-tests-nvidia.yml#L91

@vlad-penkin
Contributor

Hi @jakub-sochacki

What else is left to enable in tests folder? triton_kernels has a separate file pyproject.toml for building. Could you please clarify (with code links) how Triton itself runs these tests? You probably need to check their CI: https://github.com/triton-lang/triton

Shall we install triton_kernels as a package before running the tests?

@jakub-sochacki
Contributor Author

Hi @jakub-sochacki
What else is left to enable in tests folder? triton_kernels has a separate file pyproject.toml for building. Could you please clarify (with code links) how Triton itself runs these tests? You probably need to check their CI: https://github.com/triton-lang/triton

Shall we install triton_kernels as a package before running the tests?

In upstream they don't install it. I was also able to run tests on VM and GH without additional installations.

@jakub-sochacki jakub-sochacki force-pushed the dev/jsochacki/kernel-tests branch from 0afea8a to c545a90 Compare September 2, 2025 15:22
Contributor

@vlad-penkin vlad-penkin left a comment

@jakub-sochacki

The updated PR looks good to me in general.

Let's add minimal support in CI as an optional target and have one green run before final approval.

How do you suggest we track the work on enabling the skipped tests: keep the linked issue open and add an additional PR, or create a new issue?

@etiotto
Contributor

etiotto commented Sep 4, 2025

CI is green. I'd suggest opening an issue to track re-enabling of the skipped tests. IMO this PR can go in, @vlad-penkin.

@etiotto etiotto requested a review from vlad-penkin September 4, 2025 15:11
@whitneywhtsang
Contributor

What's the impact in CI time and pass rate?

@anmyachev
Contributor

test_matmul.py ❌ FAIL (need to implement make_default_opt_flags() variant for XPU, more investigation needed, SKIPPING)

FYI: work on this has already begun in #5051. However, I think this pull request should be merged first.

@vlad-penkin please note that this pull request is mostly waiting for your review.

@jakub-sochacki could you fix conflicts please?

@jakub-sochacki
Contributor Author

jakub-sochacki commented Sep 8, 2025

What's the impact in CI time and pass rate?

79 new tests, less than 5 minutes of CI time. Can we measure it precisely after merging?
These tests will be ENABLED BY DEFAULT with this PR as we discussed.

@whitneywhtsang
Contributor

These tests will be ENABLED BY DEFAULT with this PR as we discussed.

I don't currently see it in this PR; I assume it will be added.

@anmyachev
Contributor

anmyachev commented Sep 9, 2025

These tests will be ENABLED BY DEFAULT with this PR as we discussed.

@jakub-sochacki It should work on Windows, but on Linux and PVC it doesn't work this way. As you can see in build-test-reusable.yml, every scripts/test-triton.sh call passes a flag that selects a subset of tests (--unit, --instrumentation, and so on), and each such flag turns off the flag that runs tests by default:

    --unit)
      TEST_UNIT=true
      TEST_DEFAULT=false
      shift
      ;;

In fact, to enable these tests in CI you also need to add code to build-test-reusable.yml, for example:

      - name: Run triton kernels tests
        if: matrix.suite == 'rest'
        run: |
          ${{ env.TRITON_TEST_CMD }} --triton-kernels

UPD: similar code should be added to pip-test.yml, pip-test-windows.yml and build-test-windows.yml
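
The interaction between suite flags and the default run can be modeled in a few lines. This is a simplified sketch of the behavior described in this comment, not a port of test-triton.sh: `TEST_UNIT` and `TEST_DEFAULT` come from the snippet quoted in the comment, while `TEST_INSTRUMENTATION`, `TEST_TRITON_KERNELS`, and both function names are hypothetical.

```python
from typing import Dict, List

# Maps each suite flag to the variable it sets. Only TEST_UNIT is taken
# from the quoted script; the other two variable names are assumptions.
SUITE_FLAGS = {
    "--unit": "TEST_UNIT",
    "--instrumentation": "TEST_INSTRUMENTATION",
    "--triton-kernels": "TEST_TRITON_KERNELS",
}


def parse_flags(argv: List[str]) -> Dict[str, bool]:
    """Mimic the case statement: any suite flag disables the default run."""
    state = {name: False for name in SUITE_FLAGS.values()}
    state["TEST_DEFAULT"] = True
    for arg in argv:
        if arg in SUITE_FLAGS:
            state[SUITE_FLAGS[arg]] = True
            state["TEST_DEFAULT"] = False
    return state


def runs_triton_kernels(argv: List[str]) -> bool:
    """Assume the kernels tests run by default or when explicitly requested."""
    state = parse_flags(argv)
    return state["TEST_DEFAULT"] or state["TEST_TRITON_KERNELS"]
```

Under these assumptions, `runs_triton_kernels(["--unit"])` is false, which is why a CI job that always passes a suite flag needs an explicit `--triton-kernels` step.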

@anmyachev
Contributor

Contributor

@vlad-penkin vlad-penkin left a comment

LGTM

@anmyachev anmyachev merged commit 07ac3a4 into main Sep 9, 2025
21 of 22 checks passed
@anmyachev anmyachev deleted the dev/jsochacki/kernel-tests branch September 9, 2025 16:30
@whitneywhtsang
Contributor

What's the impact in CI time and pass rate?

Pass rate: 98.6%->84.11%, CI time increased by ~1min.
Pass rate has dropped significantly with this PR.
We need to fix the skipped test cases asap, or mark test cases that we don't plan to fix (e.g., NV specific) as XFAIL instead of SKIP.
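
Why SKIP and XFAIL can affect the number differently is easy to see with a small calculation. The sketch below assumes the pass rate is computed as passed / (passed + failed + skipped) and that expected failures are left out of the denominator. Both the formula and the counts are hypothetical and are not taken from the CI report.

```python
def pass_rate(passed: int, failed: int, skipped: int) -> float:
    """Percentage of passed tests, with skips counted in the total.

    Assumed formula for illustration only; xfailed tests are taken to
    be excluded before this function is called.
    """
    total = passed + failed + skipped
    return 100.0 * passed / total if total else 0.0


# Hypothetical counts: 90 passing tests and 10 tests that cannot run on XPU.
AS_SKIP = pass_rate(passed=90, failed=0, skipped=10)
AS_XFAIL = pass_rate(passed=90, failed=0, skipped=0)
```

With these made-up counts, reporting the 10 tests as skipped lowers the rate, while reporting them as expected failures leaves it unchanged.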

@anmyachev
Contributor

We need to fix the skipped test cases asap, or mark test cases that we don't plan to fix (e.g., NV specific) as XFAIL instead of SKIP.

Pass rate will be better after #5051

Development

Successfully merging this pull request may close these issues.

Consider running python/triton_kernels/tests tests

7 participants