@LeiWang1999 (Contributor)

This pull request makes several significant updates to the bitblas library, focusing on enhancing code modularity, improving kernel name generation, and refining CUDA initialization. The most important changes are grouped by theme below.

Enhancements to Kernel Name Generation:

  • Added a MatmulKernelNameGenerator class in bitblas/ops/general_matmul/__init__.py that generates kernel names from the operator configuration and tuning hints; it includes methods for serializing hints and simplifying data-type names.
  • Introduced a BaseKernelNameGenerator abstract class in bitblas/ops/operator.py to provide a base for kernel name generators.
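To make the naming scheme concrete, here is a hypothetical sketch in the spirit of MatmulKernelNameGenerator; the real class lives in bitblas/ops/general_matmul/__init__.py, and its actual API and helper names (simplify_dtype, generate_kernel_name) are assumptions here, not the library's interface.

```python
def simplify_dtype(dtype: str) -> str:
    """Shorten a dtype string, e.g. "float16" -> "f16", "int8" -> "i8"."""
    return dtype.replace("float", "f").replace("int", "i")

def generate_kernel_name(n: int, k: int, a_dtype: str, w_dtype: str,
                         hint: str = "") -> str:
    """Serialize shape, dtypes, and an optional tuning hint into a kernel name."""
    name = f"matmul_n{n}k{k}_A{simplify_dtype(a_dtype)}W{simplify_dtype(w_dtype)}"
    if hint:
        name += f"_{hint}"
    return name

# Reproduces the style of name seen in the generated wrapper below.
print(generate_kernel_name(256, 256, "float16", "float16", hint="simt_opt_m_1"))
# matmul_n256k256_Af16Wf16_simt_opt_m_1
```

Encoding the configuration into the name makes generated kernels self-describing and keeps names stable across tuning runs for the same configuration.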

Improvements in CUDA Initialization:

  • Refactored the CUDA initialization functions in bitblas/builder/wrapper/tir.py to use predefined templates for setting kernel attributes and defining functions.
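The template approach can be sketched as follows. This is an illustrative mock-up, not the actual code in bitblas/builder/wrapper/tir.py; the template names and render_init helper are invented for the example.

```python
# Template for the generated init() entry point.
INIT_FUNC_TEMPLATE = """extern "C" void init() {{
{body}
}}"""

# Template for one attribute-setting statement inside init().
SET_ATTR_TEMPLATE = (
    "    cudaFuncSetAttribute({kernel}, "
    "cudaFuncAttributeMaxDynamicSharedMemorySize, {smem});"
)

def render_init(kernels):
    """kernels: list of (kernel_name, dynamic_smem_bytes) pairs."""
    body = "\n".join(
        SET_ATTR_TEMPLATE.format(kernel=name, smem=smem)
        for name, smem in kernels
    )
    return INIT_FUNC_TEMPLATE.format(body=body)

print(render_init([("matmul_kernel", 41472)]))
```

Centralizing the boilerplate in templates means new attributes or calling conventions only need to be changed in one place instead of in every code-generation path.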

Updates to Function and Module Handling:

  • Modified the create_dispatch_mod and fast_tune_with_dynamic_range functions in bitblas/base/utils.py to handle additional parameters and improve dispatching.
  • Renamed references from optimized_func to optimized_mod across several files to reflect the updated structure of optimized modules.
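The dynamic-range dispatch that these functions support can be sketched as a threshold lookup over tuned kernels, mirroring the branching in the generated call() wrapper shown later in this description. The select_kernel helper and its signature are assumptions for illustration, not the fast_tune_with_dynamic_range API.

```python
def select_kernel(m, tuned):
    """Pick the kernel tuned for the smallest dynamic range covering m.

    tuned: list of (upper_bound_m, kernel_name), sorted by ascending bound.
    """
    for bound, kernel in tuned:
        if m <= bound:
            return kernel
    # m exceeds every tuned bound: fall back to the largest-range kernel.
    return tuned[-1][1]

tuned = [(1, "simt_opt_m_1"), (256, "tc_opt_m_256")]
print(select_kernel(1, tuned))    # simt_opt_m_1
print(select_kernel(512, tuned))  # tc_opt_m_256
```

At code-generation time the same thresholds are unrolled into an if/else-if chain, so the runtime dispatch costs only a few comparisons.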

Code Modularization:

  • Added import statements for the new modules and classes in bitblas/ops/general_matmul/__init__.py and bitblas/ops/operator.py to improve code modularity.

These changes collectively enhance the maintainability and functionality of the bitblas library, particularly in the areas of kernel name generation and CUDA initialization.

For example, a tuned kernel can now be wrapped with the following generated host code:

extern "C" void init() {
    cudaFuncSetAttribute(matmul_n256k256_Af16Wf16_tcx16x64x128w16x16xp2_opt_m_256, cudaFuncAttributeMaxDynamicSharedMemorySize, 41472);
}

extern "C" void call(half* __restrict__ A, half* __restrict__ B, half* __restrict__ C, int m, cudaStream_t stream = cudaStreamDefault) {
    if (m == 0) return;
    if (m <= 1) {
        matmul_n256k256_Af16Wf16_simt_opt_m_1<<<dim3(64, 1, 1), dim3(32, 4, 1), 0, stream>>>(A, B, C, m);
    }
    else if (m <= 256) {
        matmul_n256k256_Af16Wf16_tcx16x64x128w16x16xp2_opt_m_256<<<dim3(4, (m + 15) / 16, 1), dim3(32, 1, 4), 41472, stream>>>(A, B, C, m);
    }
    else {
        // m > 256: fall back to the kernel tuned for the largest dynamic range
        matmul_n256k256_Af16Wf16_tcx16x64x128w16x16xp2_opt_m_256<<<dim3(4, (m + 15) / 16, 1), dim3(32, 1, 4), 41472, stream>>>(A, B, C, m);
    }
}

@LeiWang1999 LeiWang1999 merged commit f40d9ba into microsoft:main Aug 24, 2024