Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [X] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- [X] I carefully followed the README.md.
- [X] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [X] I reviewed the Discussions, and have a new bug or useful enhancement to share.
Feature Description
As of today, ggml-cuda.cu tries to take advantage of CUDA VMM when possible:
https://github.com/ggerganov/llama.cpp/blob/784e11dea1f5ce9638851b2b0dddb107e2a609c8/ggml-cuda.cu#L116
This is not necessarily possible/desired, e.g.:
https://forums.developer.nvidia.com/t/potential-nvshmem-allocated-memory-performance-issue/275416/4
Would you mind if I added a cmake/make option/define to build ggml-cuda with (default) or without VMM support?
Best
WT
Motivation
To build llama.cpp/ggml without VMM.
Possible Implementation
Add a CMake option `GGML_USE_CUDA_VMM` (default `ON`); a minimal sketch follows.
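On the CMake side, something along these lines might work (the option name comes from this proposal; the `add_compile_definitions` wiring is my assumption, not existing build code):

```cmake
# Hypothetical option, default ON to preserve current behavior
option(GGML_USE_CUDA_VMM "ggml: enable CUDA virtual memory management" ON)

if (GGML_USE_CUDA_VMM)
    # forward the setting to the compiler (assumed wiring)
    add_compile_definitions(GGML_USE_CUDA_VMM)
endif()
```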
Then guard the VMM check at line 115:
```cpp
#if !defined(GGML_USE_HIPBLAS) && defined(GGML_USE_CUDA_VMM)
    // query VMM support only when it is enabled at build time
    CUdevice device;
    CU_CHECK(cuDeviceGet(&device, id));
    CU_CHECK(cuDeviceGetAttribute(&device_vmm, CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED, device));
    ...
```
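With both pieces in place, a build without VMM would be a matter of, e.g.:

```
cmake .. -DGGML_USE_CUDA_VMM=OFF
```

Since `device_vmm` defaults to 0 when this check is compiled out, the existing non-VMM allocation path should be used automatically.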