llama-bench: add --devices and --list-devices support #16039
Conversation
- Support --devices same as llama-server
- Provide for benchmarking different device combinations
- Include --list-devices like llama-server for convenience
- aimed to mimic the server as much as possible
- handle dup device listing with RPC
- added the recently introduced n-cpu-moe option to the docs while in there
Fixed the examples so the commands aren't hidden any more.
I think this would be a better way to deal with the RPC devices:
- Change the `-rpc` option to just register the devices at startup
- Remove the list of RPC servers from `cmd_params_instance` entirely
- If the user wants to test different subsets of RPC devices, they can use `-dev` (see the sketch below)
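If that change is made, the workflow could look something like this sketch: the RPC endpoints are registered once via `--rpc`, and each run picks a subset of the registered devices with `-dev`. The model path, endpoints, and device names here are placeholders; the actual names should be taken from `--list-devices` on your system.

```sh
# --rpc registers the RPC devices; -dev selects which devices each run uses.
# Endpoints and device names below are illustrative placeholders.
llama-bench -m model.gguf --rpc 192.168.1.10:50052,192.168.1.11:50052 -dev "RPC[192.168.1.10:50052]"
llama-bench -m model.gguf --rpc 192.168.1.10:50052,192.168.1.11:50052 -dev "RPC[192.168.1.10:50052],CUDA0"
```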
* rpc servers unify with other devices earlier, simplifying code
* --list-devices made stateless and simpler
* various cleanup
The RPC handling suggestion is nice and tightens things up a good deal IMO.

Thank you!

Thank you all for all the good work. Happy to help.
Thank you, this is a very useful PR. I'm noticing that on a multi GPU system the "GPU info" does not take the selected devices into account (example log omitted). This doesn't matter if you just want to print a markdown table to the console, but for SQL/JSON/CSV output it means you cannot (automatically) associate the benchmark runs with the GPUs that were used. I think the correct behavior would be to populate GPU info only with those devices that were actually used.
Or maybe it would make more sense to add a new property like `device_info` that explicitly lists the names of the used devices?
Yes, the backend and GPU info fields should be updated to consider the devices actually being used. This might require adding a new API to llama.cpp to obtain the list of devices used for a model, to be able to tell what devices are used by default, when none are specified explicitly.
* llama-bench: add --devices support
  - Support --devices same as llama-server
  - Provide for benchmarking different device combinations
  - Include --list-devices like llama-server for convenience
* fix: field display ordering restored
* fix: integrated the rpc devices
  - aimed to mimic the server as much as possible
* cleanup: defaults for list-devices
  - handle dup device listing with RPC
* cleanup: remove dup device load calls
* docs: update llama-bench
  - added the recently added n-cpu-moe option to the docs while in there
* llama-bench: rpc device simplification
  - rpc servers unify with other devices earlier, simplifying code
  - --list-devices made stateless and simpler
  - various cleanup
Following up to resolve part of issue #15974.
llama-bench was missing `--devices`, which was recently enhanced in the main apps like llama-server. This PR brings that configuration option to benchmarking, close to how llama-server and the other tools implement it:

- `--device` option, so that one or more devices can be used for tensor splits
- `--list-devices` for convenience, rather than having to switch to llama-server

Tested on Mac and Linux.
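For example, the available devices can be enumerated before picking a subset (illustrative invocations; the RPC endpoint is a placeholder and output is omitted):

```sh
# Show the devices llama-bench can see; add --rpc to include RPC servers
llama-bench --list-devices
llama-bench --rpc 192.168.1.10:50052 --list-devices
```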
Using different devices including RPC, benchmark each:
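Something along these lines (model path, endpoint, and device names are placeholders; use the names reported by `--list-devices`):

```sh
# Benchmark each device on its own
llama-bench -m model.gguf -dev CUDA0
llama-bench -m model.gguf -dev CUDA1
llama-bench -m model.gguf --rpc 192.168.1.10:50052 -dev "RPC[192.168.1.10:50052]"
```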
Combined all devices, benchmark tensor split:
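And an illustrative combined run with the same placeholder names, selecting all devices so the model is split across them:

```sh
# Benchmark with every device selected; the model is split across them
llama-bench -m model.gguf --rpc 192.168.1.10:50052 -dev "CUDA0,CUDA1,RPC[192.168.1.10:50052]"
```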
Hopefully this helps get llama-bench more up-to-date with the core tool capabilities.