tjohnson31415
Member

Motivation

We recently found that AutoGPTQ vendors its own versions of the exllama and exllamav2 kernels in autogptq_extension, and these are installed along with the library. Because we install AutoGPTQ after installing our own builds of the exllama kernels, the AutoGPTQ copies overwrite ours. As a result, we don't need to vendor and compile our own exllama kernels.
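
To check which copy of the kernels an environment actually ends up with, something like the sketch below can help. It only reports the file each module would be imported from. The module names `exllama_kernels` and `exllamav2_kernels` are an assumption about how the compiled extensions are named, and `locate_modules` is a hypothetical helper, not part of this repo or of AutoGPTQ.

```python
import importlib.util

# Assumed names of the compiled kernel modules; adjust for your environment.
KERNEL_MODULES = ("exllama_kernels", "exllamav2_kernels")


def locate_modules(names):
    """Map each top-level module name to the file it would load from, or None if absent."""
    locations = {}
    for name in names:
        # find_spec returns None when no top-level module with this name is installed.
        spec = importlib.util.find_spec(name)
        locations[name] = spec.origin if spec is not None else None
    return locations


if __name__ == "__main__":
    for name, origin in locate_modules(KERNEL_MODULES).items():
        print(f"{name}: {origin or 'not installed'}")
```

Running this after each install step in the build would show whether the later install replaced the file on disk.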

Modifications

Removes the vendored copies of exllama kernels.

Result

There should be no functional changes other than faster build times and less code.

@tjohnson31415 tjohnson31415 marked this pull request as ready for review March 14, 2024 17:01
Collaborator

@maxdebayser maxdebayser left a comment

LGTM

@tjohnson31415 tjohnson31415 changed the title from "🔥 Remove custom exllama code, use auto-gptq vendored instead" to "🔥 Remove our exllama code, we use auto-gptq vendored kernels" Mar 14, 2024
@tjohnson31415 tjohnson31415 changed the title from "🔥 Remove our exllama code, we use auto-gptq vendored kernels" to "🔥 Remove our exllama code because we use auto-gptq vendored kernels" Mar 14, 2024
Collaborator

@joerunde joerunde left a comment

get it

@joerunde joerunde merged commit 0cc4a2e into main Mar 14, 2024
@tjohnson31415 tjohnson31415 deleted the autogptq-exllama branch March 14, 2024 19:22
Xaenalt pushed a commit to Xaenalt/text-generation-inference that referenced this pull request Aug 1, 2024
[pull] main from IBM:main