Add Composable Kernel examples #332
base: amd-staging
bc1a295 to 05d8ec8
The failing markdown linter will be resolved once ROCm/rocm-docs-core#1449 is merged.
This PR requires ROCm 7.1. Once #341 is merged the build errors should disappear.
bc4819e to 60c923a
Pull Request Overview
This PR adds Composable Kernel's ck_tile examples to the ROCm Examples repository, focusing exclusively on ROCm and Linux platforms (CUDA and Windows are not supported). The examples demonstrate various GPU operations using CK Tile's programming model, including GEMM operations, convolutions, and basic tensor operations.
Key Changes
- Added comprehensive examples for GEMM operations (batched, block-scale, flatmm, multi-d, grouped)
- Introduced grouped convolution examples (forward and backward weight)
- Implemented basic operations (elementwise, reduce, permute, img2col)
- Provided build infrastructure through CMake and Makefiles with architecture-specific support checks
Reviewed Changes
Copilot reviewed 111 out of 281 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| Libraries/ComposableKernel/gemm/flatmm/flatmm_basic.cpp | Implements FLATMM GEMM kernel with tile partitioning and pipeline configuration |
| Libraries/ComposableKernel/gemm/block_scale_gemm/gemm_aquant_basic.cpp | Implements block-scale quantized GEMM with group quantization support |
| Libraries/ComposableKernel/gemm/batched_gemm/batched_gemm.cpp | Implements batched GEMM operations with configurable pipeline strategies |
| Libraries/ComposableKernel/convolution/grouped_convolution/grouped_convolution_forward.cpp | Implements grouped convolution forward pass |
| Libraries/ComposableKernel/basic/reduce/reduce.cpp | Demonstrates 2D reduction operations with block tiling |
| Libraries/ComposableKernel/basic/permute/permute.cpp | Generic tensor permutation with matrix-core optimized alternative |
| CMakeLists.txt and Makefile files | Build configuration with architecture checks for gfx908/gfx90a/gfx942/gfx950 |
zichguan-amd
left a comment
LGTM
vidyasagar-amd
left a comment
Thanks for the addition
Signed-off-by: Jan Stephan <[email protected]>
60c923a to f7b0481
adeljo-amd
left a comment
LGTM
Motivation
This PR adds Composable Kernel's ck_tile examples to this repository.
Technical Details
This PR only targets ROCm + Linux; Windows and CUDA are not supported by Composable Kernel.
Test Plan
Test Result
Submission Checklist