
Conversation

@spectrometerHBH
Contributor

Use the LLVM fma intrinsic for ROCm
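For readers who want to see what such a lowering looks like, below is a minimal sketch of the idea using TVM's Python-level intrinsic-lowering registration, rather than the C++ rule this PR actually touches. The helper name `_fma_to_llvm`, the choice of `llvm.fma` (as opposed to `llvm.fmuladd`), and the exact import path are assumptions for illustration, not a copy of the diff.

```python
# Illustrative sketch only, not the actual change in this PR.
# Lower a TIR fma(a, b, c) call to the LLVM fma intrinsic when targeting ROCm.
import tvm
from tvm import tir
from tvm.ir import register_intrin_lowering  # import path may differ across TVM versions


def _fma_to_llvm(op):
    # `op` is the tir.Call node for tir.fma(a, b, c).
    # The leading uint32 constant tells codegen how many arguments determine
    # the intrinsic's overloaded type signature.
    return tir.call_llvm_pure_intrin(
        op.dtype,
        "llvm.fma",
        tir.const(3, "uint32"),
        op.args[0],
        op.args[1],
        op.args[2],
    )


# Attach the rule so the LowerIntrin pass picks it up when compiling for target="rocm".
register_intrin_lowering("tir.fma", target="rocm", f=_fma_to_llvm, override=True)
```

With a rule like this registered, a `tir.fma` call reaching the LowerIntrin pass for a ROCm target would be rewritten into a call to the LLVM intrinsic, which the AMDGPU backend can then map onto the hardware's fused multiply-add instructions.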

@tvm-bot
Collaborator

tvm-bot commented Aug 2, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

  • No users to tag found in teams: rocm. See #10317 for details.

Generated by tvm-bot

@MasterJH5574 MasterJH5574 merged commit 49fd318 into apache:main Aug 3, 2023
tqchen pushed a commit to mlc-ai/mlc-llm that referenced this pull request Aug 3, 2023
Depends on this PR: apache/tvm#15464

Benchmarked on a Radeon RX 7900 XTX with ROCm 5.6:
```
~/mlc-llm (rocm ✔) ./build/mlc_chat_cli --local-id Llama-2-7b-chat-hf-q4f16_1
Use MLC config: "/home/bohan/mlc-llm/dist/Llama-2-7b-chat-hf-q4f16_1/params/mlc-chat-config.json"
Use model weights: "/home/bohan/mlc-llm/dist/Llama-2-7b-chat-hf-q4f16_1/params/ndarray-cache.json"
Use model library: "/home/bohan/mlc-llm/dist/Llama-2-7b-chat-hf-q4f16_1/Llama-2-7b-chat-hf-q4f16_1-rocm.so"
You can use the following special commands:
  /help               print the special commands
  /exit               quit the cli
  /stats              print out the latest stats (token/sec)
  /reset              restart a fresh chat
  /reload [local_id]  reload model `local_id` from disk, or reload the current model if `local_id` is not specified

Loading model...
Loading finished
Running system prompts...
System prompts finished
[INST]: Hi
[/INST]: Hello! It's nice to meet you. I'm here to help you with any questions or tasks you may have, while always being safe and respectful. Is there something specific you would like to know or discuss? Please feel free to ask me anything, and I will do my best to provide a helpful and positive response.
[INST]: /stats
prefill: 507.3 tok/s, decode: 92.0 tok/s
```

```
~/mlc-llm (rocm ✗) ./build/mlc_chat_cli --local-id Llama-2-13b-chat-hf-q4f16_1
Use MLC config: "/home/bohan/mlc-llm/dist/Llama-2-13b-chat-hf-q4f16_1/params/mlc-chat-config.json"
Use model weights: "/home/bohan/mlc-llm/dist/Llama-2-13b-chat-hf-q4f16_1/params/ndarray-cache.json"
Use model library: "/home/bohan/mlc-llm/dist/Llama-2-13b-chat-hf-q4f16_1/Llama-2-13b-chat-hf-q4f16_1-rocm.so"
You can use the following special commands:
  /help               print the special commands
  /exit               quit the cli
  /stats              print out the latest stats (token/sec)
  /reset              restart a fresh chat
  /reload [local_id]  reload model `local_id` from disk, or reload the current model if `local_id` is not specified

Loading model...
Loading finished
Running system prompts...
System prompts finished
[INST]: Hi
[/INST]: Hello! I'm here to assist you with any questions you may have. Please keep in mind that I strive to provide safe and positive responses that are free of harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. If a question does not make sense or is not factually coherent, I will do my best to explain why instead of providing an incorrect answer. If I don't know the answer to a question, I will not provide false information. Is there anything specific you would like to know or discuss?
[INST]: /stats
prefill: 495.7 tok/s, decode: 69.0 tok/s
```
smickey040404 added a commit to smickey040404/mlc-llm that referenced this pull request Feb 11, 2025
tristankincaid added a commit to tristankincaid/mlc-llm that referenced this pull request Feb 16, 2025
knightdev159 pushed a commit to knightdev159/Doctor-Dignity that referenced this pull request Aug 2, 2025