Al0den/metalgpu

Metal GPU

This is a simple Python library that wraps Apple's Metal API to run compute kernels from Python, with full control over buffers and methods. There is no copying behind the scenes, and buffers are directly accessible as numpy arrays.
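To illustrate what "no copying" means in practice, here is a minimal sketch that uses only ctypes and numpy. It does not show how metalgpu is implemented: the `raw` array is a hypothetical stand-in for a Metal-allocated buffer, and the point is only that a numpy array can be a view over existing memory, so writes on either side are visible to the other.

```python
import ctypes
import numpy as np

# Stand-in for a GPU-shared buffer: a plain C array of 8 int32 values.
# (Hypothetical; metalgpu allocates its buffers through Metal instead.)
raw = (ctypes.c_int32 * 8)()

# Wrap the same memory as a numpy array without copying it.
view = np.ctypeslib.as_array(raw)

view[:] = np.arange(8, dtype=np.int32)  # write through the numpy view
raw[3] = 100                            # write through the raw C array

# Both names refer to the same underlying bytes.
shares_memory = np.shares_memory(view, np.ctypeslib.as_array(raw))
```

Because the view and the C array share memory, no synchronisation step or copy is needed between the two writes above.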

Installing

Run pip install metalgpu to download the latest release. After the first install, you will need to compile the C library.

To do so, run python -m metalgpu build in your terminal and let it build the library. This leaves no files behind apart from the compiled library.

Alternatively, run both steps at once: pip install metalgpu && python -m metalgpu build

Note: The build step requires the git command-line tool. Without it, manually compile the metal-gpu-c folder and move the output library to the lib folder.

Examples

main.py

import metalgpu

instance = metalgpu.Interface()  # Initialise the metal instance
shader_string = """
#include <metal_stdlib>

using namespace metal;

kernel void adder(device int *arr1 [[buffer(0)]],
        device int *arr2 [[buffer(1)]],
        device int *arr3 [[buffer(2)]],
        uint id [[thread_position_in_grid]]) {
    arr3[id] = arr2[id] + arr1[id];
}
"""
# Note: For clearer code, use instance.load_shader(shaderPath) to load a .metal file

instance.load_shader_from_string(shader_string)
instance.set_function("adder")

buffer_size = 100000  # Number of items in the buffer
buffer_type = "int"  # Or np.int32, ctypes.c_int32

initial_array = [i for i in range(buffer_size)]

buffer1 = instance.array_to_buffer(initial_array)
buffer2 = instance.array_to_buffer(initial_array)
buffer3 = instance.create_buffer(buffer_size, buffer_type)

instance.run_function(buffer_size, [buffer1, buffer2, buffer3])

assert(all(buffer3.contents == [i * 2 for i in range(buffer_size)]))

buffer1.release()  # This isn't required; Python can release the buffers on its own
buffer2.release()
buffer3.release()
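When checking a kernel's output, it can help to have a CPU reference to compare against. The sketch below is an assumption-laden illustration, not part of metalgpu: `adder_reference` is a hypothetical helper that mirrors the `adder` kernel in plain numpy, using 32-bit integers to match the kernel's `int` type.

```python
import numpy as np

def adder_reference(arr1, arr2):
    """CPU equivalent of the `adder` kernel: out[i] = arr1[i] + arr2[i]."""
    a = np.asarray(arr1, dtype=np.int32)
    b = np.asarray(arr2, dtype=np.int32)
    return a + b

buffer_size = 100000
initial_array = np.arange(buffer_size, dtype=np.int32)
expected = adder_reference(initial_array, initial_array)

# With metalgpu available, the GPU result from main.py could be compared
# before the buffers are released, for example:
# assert np.array_equal(buffer3.contents, expected)
```

Comparing with `np.array_equal` avoids building a Python list of 100000 items just for the check.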

Performance

When tested using performance_suite.py on a base-spec Apple Silicon M1 Pro:

[Image: benchmark results]

Documentation

To view the documentation, open the docs.md file in the docs folder.

The code for the C interface is in /metal-gpu-c. The code for the Python package is in /metalgpu.

Known issues

  • None :)

Credits

  • MyMetalKernel.py: I didn't manage to get this to work; it is overcomplicated for Python code.
  • metalcompute: Although similar, it performs lots of array copies instead of managing buffers, and has some memory leaks.
