This is a simple Python library that wraps Apple's Metal API to run compute kernels from Python, with full control over buffers and methods. There is no copying behind the scenes, and buffers are exposed directly as NumPy arrays.
Run pip install metalgpu
to download the latest release. After the first install, you will need to compile the C library.
To do so, run python -m metalgpu build
in your terminal and let it build the library. This leaves no files behind apart from the compiled library.
Alternatively, do both steps at once: pip install metalgpu && python -m metalgpu build
Note: This tool requires the git command line. Without it, manually compile the folder metal-gpu-c
and move the output library to the lib folder.
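Because the build step depends on git, it can help to check for it before building. The following is a minimal sketch using only the Python standard library; the helper names `missing_build_tools` and `build_command` are hypothetical and not part of metalgpu.

```python
import shutil
import sys


def missing_build_tools(required=("git",)):
    """Return the required command-line tools that are not on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


def build_command():
    """Return the documented build command for the current interpreter."""
    return [sys.executable, "-m", "metalgpu", "build"]


missing = missing_build_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("Ready to run:", " ".join(build_command()))
```

If git is reported missing, fall back to compiling metal-gpu-c manually as described in the note above.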
main.py
import metalgpu
instance = metalgpu.Interface() # Initialise the metal instance
shader_string = """
#include <metal_stdlib>
using namespace metal;
kernel void adder(device int *arr1 [[buffer(0)]],
                  device int *arr2 [[buffer(1)]],
                  device int *arr3 [[buffer(2)]],
                  uint id [[thread_position_in_grid]]) {
    arr3[id] = arr2[id] + arr1[id];
}
"""
# Note: For clearer code, use instance.load_shader(shaderPath) to load a .metal file
instance.load_shader_from_string(shader_string)
instance.set_function("adder")
buffer_size = 100000 # Number of items in the buffer
buffer_type = "int" # Or np.int32, ctypes.c_int32
initial_array = list(range(buffer_size))
buffer1 = instance.array_to_buffer(initial_array)
buffer2 = instance.array_to_buffer(initial_array)
buffer3 = instance.create_buffer(buffer_size, buffer_type)
instance.run_function(buffer_size, [buffer1, buffer2, buffer3])
assert all(buffer3.contents == [i * 2 for i in range(buffer_size)])
buffer1.release() # This isn't required; Python can release buffers on its own
buffer2.release()
buffer3.release()
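To make clear what the adder kernel computes, here is a minimal CPU sketch in plain Python that mimics it one thread at a time. It does not use metalgpu or the GPU; `adder_reference` and `run_reference` are hypothetical names introduced only for illustration, and the sequential loop stands in for the threads that the GPU would run in parallel.

```python
def adder_reference(arr1, arr2, arr3, thread_id):
    """CPU stand-in for one GPU thread of the `adder` kernel."""
    arr3[thread_id] = arr2[thread_id] + arr1[thread_id]


def run_reference(thread_count, buffers):
    """Call the reference kernel once per thread index, sequentially."""
    for thread_id in range(thread_count):
        adder_reference(*buffers, thread_id)


buffer_size = 1000  # Smaller than the GPU example, since this runs on the CPU
arr1 = list(range(buffer_size))
arr2 = list(range(buffer_size))
arr3 = [0] * buffer_size  # Output buffer, filled in by the reference kernel

run_reference(buffer_size, [arr1, arr2, arr3])
assert arr3 == [2 * i for i in range(buffer_size)]
```

A reference like this can be handy for checking GPU results on small inputs before scaling up the buffer size.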
When tested using performance_suite.py on a base-spec Apple Silicon M1 Pro:
To view the documentation, go to the docs folder and open the docs.md
file.
The code for the C interface is in /metal-gpu-c
and the code for the Python package is in /metalgpu.
- None :)
- MyMetalKernel.py: I didn't manage to get this to work, and it is overcomplicated for Python code.
- metalcompute: Although similar, it performs many array copies instead of managing buffers, and has some memory leaks.