[SYCL]rm wait() to improve the performance #7233

arthw · 2024-05-12T03:21:41Z

This PR reverts the workaround introduced in #5895.
That workaround addressed a known oneMKL issue on the Intel MTL Arc GPU.
The new oneMKL (oneAPI Base Toolkit 2024.1) appears to have fixed the issue,
so the old workaround is reverted.

With this change, we get +32% on the Intel MTL Arc GPU and +21% on the Arc 770, tested with llama2-7b-Q4.

Next token:

MTL
7.06 tokens per second -> 9.37 tokens per second

Arc 770
25.14 tokens per second -> 30.50 tokens per second
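
As a quick sanity check, the reported percentages can be recomputed from the tokens-per-second figures above. This is a minimal sketch in Python; the helper name `speedup_percent` is illustrative and not part of the PR.

```python
def speedup_percent(before: float, after: float) -> float:
    """Relative throughput gain, in percent, going from `before` to `after`."""
    return (after - before) / before * 100.0


# Next-token throughput (tokens per second) reported in this PR.
results = {
    "MTL": (7.06, 9.37),
    "Arc 770": (25.14, 30.50),
}

for device, (before, after) in results.items():
    gain = speedup_percent(before, after)
    print(f"{device}: {before} -> {after} tokens/s ({gain:+.1f}%)")
```

The computed gains are about +32.7% and +21.3%, which match the rounded +32% and +21% stated above.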

airMeng · 2024-05-13T00:03:46Z

Can you paste the absolute performance numbers here?

rm wait()

364c375

arthw requested a review from airMeng May 12, 2024 03:22

airMeng approved these changes May 13, 2024


airMeng merged commit 948f4ec into ggml-org:master May 13, 2024
