[Bug/Future] bitsandbytes higher than 0.35.0 breaks training on 8bit adamW (Windows) #523

@Panchovix

Hi there! After updating bitsandbytes to any version higher than 0.35.0, every training run ends up with a loss value of NaN.

I've tested with cu116, cu117, cu118 and cu121 binaries (with torch+cu116, +cu117, +cu118 and +cu121 respectively) and the issue happens on all of them.

I know this is more of a future concern, but if the bundled binaries ever have to be updated, they may run into this issue.

The bitsandbytes whls were obtained from https://github.com/jllllll/bitsandbytes-windows-webui
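
For anyone who wants to check whether their setup is affected, here is a minimal sketch of a guard that could be dropped into a training script. It only uses the standard library. The helper names (`version_tuple`, `is_suspect_version`, `installed_bitsandbytes_version`, `check_loss`) are made up for illustration and are not part of bitsandbytes or of this repo, and the 0.35.0 threshold is taken only from the report above, not from any confirmed root cause.

```python
import math
import re
from importlib import metadata

# Last version reported as working in this issue (an assumption, not a verified bound).
LAST_KNOWN_GOOD = (0, 35, 0)


def version_tuple(version: str) -> tuple:
    """Turn a string like '0.41.1' into (0, 41, 1), ignoring non-numeric suffixes."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_suspect_version(version: str) -> bool:
    """True if the version is newer than the last one reported as working."""
    return version_tuple(version) > LAST_KNOWN_GOOD


def installed_bitsandbytes_version():
    """Return the installed bitsandbytes version string, or None if it is not installed."""
    try:
        return metadata.version("bitsandbytes")
    except metadata.PackageNotFoundError:
        return None


def check_loss(loss_value: float, step: int) -> None:
    """Stop early when the loss becomes NaN instead of silently training on garbage."""
    if math.isnan(loss_value):
        raise RuntimeError(f"loss is NaN at step {step}")


if __name__ == "__main__":
    installed = installed_bitsandbytes_version()
    if installed is None:
        print("bitsandbytes is not installed")
    elif is_suspect_version(installed):
        print(f"bitsandbytes {installed} is newer than 0.35.0; watch for NaN loss")
    else:
        print(f"bitsandbytes {installed} is not newer than 0.35.0")
```

Calling `check_loss(loss.item(), step)` once per step in the training loop would make the failure show up at the first NaN rather than at the end of the run.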
