Works like a charm! #13

@dbddv01

Description

Just wanted to report that this works perfectly on my GTX 1060 (6 GB) with my old i5-7200 and 16 GB RAM under Win10. So far I have never reached such speed with any of the other existing solutions (oobabooga, textsynth, llama.cpp). Not a single issue during install. I can't tell exactly, but it's surely a couple of tokens/sec during inference. I need a deeper dive to get a feel for the quality, as it seems to be a model quantized to int3?
Now we want more: more models, 13B size, parameter access (temperature, top-p, etc.), and an API. Anyway, I think this is great work already!
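To illustrate what I mean by parameter access, here is a minimal sketch of temperature and top-p (nucleus) sampling in Python. The function name and default values are purely illustrative and not this repo's API:

```python
import numpy as np

def sample_token(logits, temperature=0.8, top_p=0.9, rng=None):
    """Sample one token id from raw logits using temperature and top-p (nucleus) filtering."""
    rng = rng or np.random.default_rng()
    # Temperature scaling: lower values sharpen the distribution, higher values flatten it.
    scaled = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    # Softmax, shifted by the max logit for numerical stability.
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # Nucleus filtering: keep the smallest set of tokens whose cumulative mass reaches top_p.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1
    keep = order[:cutoff]
    kept = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=kept))
```

Something like `sample_token(logits, temperature=0.7, top_p=0.95)` would then give slightly more conservative output than the defaults.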
