joerunde
Collaborator

Motivation

We need to update several dependencies whose upgrades will cause output differences, and we want to bundle them together in a single change.

Modifications

Updates:

  • PyTorch
  • flash attention
  • AutoGPTQ
  • CUDA

Result

Slight differences in outputs for some text generation prompts on many models, but our quality tests indicate no major drop in result quality.

njhill and others added 2 commits March 18, 2024 16:46
@joerunde joerunde marked this pull request as ready for review March 19, 2024 20:13
@joerunde
Collaborator Author

Python package list from this image:

$ pip3 list | grep -iE '(flash|torch|auto|cuda)'
DEPRECATION: Loading egg at /opt/tgis/lib/python3.11/site-packages/custom_kernels-0.0.0-py3.11-linux-x86_64.egg is deprecated. pip 24.3 will enforce this behaviour change. A possible replacement is to use pip for package installation.. Discussion can be found at https://github.com/pypa/pip/issues/12330
auto_gptq                 0.7.1
flash-attn                2.5.6
nvidia-cuda-cupti-cu12    12.1.105
nvidia-cuda-nvrtc-cu12    12.1.105
nvidia-cuda-runtime-cu12  12.1.105
torch                     2.2.1+cu121

These look like the versions I expect. Running performance and integration tests to make sure nothing is totally borked.
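
For anyone who wants to script the same check instead of eyeballing `pip3 list`, here is a minimal sketch using the standard library's `importlib.metadata`. The `EXPECTED` table and the `find_mismatches` helper are hypothetical names made up for illustration and are not part of this PR. The version strings are copied from the output above, and the sketch assumes the distribution names resolve as pip lists them.

```python
from importlib import metadata

# Expected versions, copied from the `pip3 list` output above.
# EXPECTED and find_mismatches are illustrative names, not part of the repo.
EXPECTED = {
    "auto_gptq": "0.7.1",
    "flash-attn": "2.5.6",
    "nvidia-cuda-runtime-cu12": "12.1.105",
    "torch": "2.2.1+cu121",
}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected, found)} for each package whose version differs.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != want:
            mismatches[name] = (want, found)
    return mismatches
```

Calling `find_mismatches(EXPECTED)` inside the image should return an empty dict if the installed versions match the list above; any entry in the result names a package that is missing or at a different version.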

Contributor

@njhill njhill left a comment

This is great, thanks @joerunde

@joerunde joerunde merged commit dacfe50 into main Mar 21, 2024
@joerunde joerunde deleted the big-upgrades branch March 21, 2024 20:11
4 participants