Closed
Labels
feature request (New feature or request)
Description
Just wanted to report that this works perfectly on my GTX 1060 (6 GB) with my old i5-7200 and 16 GB RAM under Windows 10. So far, I have never reached such speed with any other existing solution (oobabooga, textsynth, llama.cpp). Not a single issue during install. I can't tell exactly, but it's surely a couple of tokens/sec during inference. I need a deeper dive to get a feel for the quality, as it seems to be a model quantized to int3?
Now we want more: more models, 13B sizes, parameter access (temp, top-p, etc.), and an API. Anyway, I think this is great work already!
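For anyone wondering what "parameter access" would mean concretely, here is a minimal, generic sketch of temperature and top-p (nucleus) sampling applied to raw logits. This is not the project's actual API; the function name `sample_token` and its signature are hypothetical and only illustrate the knobs being requested.

```python
# Generic illustration of temperature + top-p (nucleus) sampling.
# NOT the project's API -- a hypothetical sketch of the requested parameters.
import numpy as np

def sample_token(logits: np.ndarray, temperature: float = 0.8, top_p: float = 0.95) -> int:
    """Sample a token id from logits using temperature scaling and nucleus (top-p) filtering."""
    # Temperature scaling: lower values sharpen the distribution, higher values flatten it.
    scaled = logits / max(temperature, 1e-5)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()

    # Nucleus filtering: keep the smallest set of top tokens whose cumulative mass exceeds top_p.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1
    keep = order[:cutoff]

    # Renormalize over the kept tokens and sample.
    kept = probs[keep] / probs[keep].sum()
    return int(np.random.choice(keep, p=kept))

# Toy example with a 5-token vocabulary.
print(sample_token(np.array([2.0, 1.0, 0.5, 0.1, -1.0])))
```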