[ROCm] fma intrin #15464
Merged
Conversation
Collaborator
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot
tqchen approved these changes on Aug 2, 2023.
tqchen pushed a commit to mlc-ai/mlc-llm that referenced this pull request on Aug 3, 2023:
Depending on this PR: apache/tvm#15464

On 7900 XTX, ROCm 5.6:

```
~/mlc-llm (rocm ✔) ./build/mlc_chat_cli --local-id Llama-2-7b-chat-hf-q4f16_1
Use MLC config: "/home/bohan/mlc-llm/dist/Llama-2-7b-chat-hf-q4f16_1/params/mlc-chat-config.json"
Use model weights: "/home/bohan/mlc-llm/dist/Llama-2-7b-chat-hf-q4f16_1/params/ndarray-cache.json"
Use model library: "/home/bohan/mlc-llm/dist/Llama-2-7b-chat-hf-q4f16_1/Llama-2-7b-chat-hf-q4f16_1-rocm.so"
You can use the following special commands:
  /help               print the special commands
  /exit               quit the cli
  /stats              print out the latest stats (token/sec)
  /reset              restart a fresh chat
  /reload [local_id]  reload model `local_id` from disk, or reload the current model if `local_id` is not specified

Loading model...
Loading finished
Running system prompts...
System prompts finished
[INST]: Hi
[/INST]: Hello! It's nice to meet you. I'm here to help you with any questions or tasks you may have, while always being safe and respectful. Is there something specific you would like to know or discuss? Please feel free to ask me anything, and I will do my best to provide a helpful and positive response.
[INST]: /stats
prefill: 507.3 tok/s, decode: 92.0 tok/s
```

```
~/mlc-llm (rocm ✗) ./build/mlc_chat_cli --local-id Llama-2-13b-chat-hf-q4f16_1
Use MLC config: "/home/bohan/mlc-llm/dist/Llama-2-13b-chat-hf-q4f16_1/params/mlc-chat-config.json"
Use model weights: "/home/bohan/mlc-llm/dist/Llama-2-13b-chat-hf-q4f16_1/params/ndarray-cache.json"
Use model library: "/home/bohan/mlc-llm/dist/Llama-2-13b-chat-hf-q4f16_1/Llama-2-13b-chat-hf-q4f16_1-rocm.so"
You can use the following special commands:
  /help               print the special commands
  /exit               quit the cli
  /stats              print out the latest stats (token/sec)
  /reset              restart a fresh chat
  /reload [local_id]  reload model `local_id` from disk, or reload the current model if `local_id` is not specified

Loading model...
Loading finished
Running system prompts...
System prompts finished
[INST]: Hi
[/INST]: Hello! I'm here to assist you with any questions you may have. Please keep in mind that I strive to provide safe and positive responses that are free of harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. If a question does not make sense or is not factually coherent, I will do my best to explain why instead of providing an incorrect answer. If I don't know the answer to a question, I will not provide false information. Is there anything specific you would like to know or discuss?
[INST]: /stats
prefill: 495.7 tok/s, decode: 69.0 tok/s
```
smickey040404 added a commit to smickey040404/mlc-llm that referenced this pull request on Feb 11, 2025 (same commit message as above).
tristankincaid added a commit to tristankincaid/mlc-llm that referenced this pull request on Feb 16, 2025 (same commit message as above).
knightdev159 pushed a commit to knightdev159/Doctor-Dignity that referenced this pull request on Aug 2, 2025 (same commit message as above).
Use the LLVM `fma` intrinsic for ROCm.
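The change amounts to a lowering rule that maps `tir.fma` to LLVM's `fma` intrinsic when compiling for the ROCm target, so the AMDGPU backend can select a hardware fused multiply-add. Below is a minimal sketch of such a registration, modeled on the existing entries in TVM's `src/target/llvm/intrin_rule_rocm.cc`; the dispatch helper, template arguments, and namespaces are assumptions based on TVM's other intrinsic rules, not a copy of the PR diff.

```cpp
// Sketch only: an FLowerIntrinsic registration in the style of
// src/target/llvm/intrin_rule_rocm.cc. The actual PR diff may differ.
#include <llvm/IR/Intrinsics.h>

#include "intrin_rule_llvm.h"  // DispatchLLVMPureIntrin, FLowerIntrinsic

namespace tvm {
namespace codegen {
namespace llvm {

// Lower tir.fma to llvm.fma on the ROCm target so the AMDGPU backend can
// emit a fused multiply-add instead of a separate multiply and add.
// The second template argument is the number of overloaded type signatures
// (all three fma operands share a single floating-point type).
TVM_REGISTER_OP("tir.fma").set_attr<FLowerIntrinsic>(
    "rocm.FLowerIntrinsic", DispatchLLVMPureIntrin<::llvm::Intrinsic::fma, 1>);

}  // namespace llvm
}  // namespace codegen
}  // namespace tvm
```

With a rule like this registered, `tir.fma(a, b, c)` calls in TIR kernels compiled for ROCm lower through LLVM's `llvm.fma.*` intrinsic rather than being rejected or expanded into separate operations.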