The best way to run a Llama model locally is with Text generation web UI. These days the model will most likely be quantized to 4- or 5-bit (GGML or GPTQ), which makes it possible to run on a "normal" computer.
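If you'd rather skip the web UI, here's a minimal sketch of the same idea using llama-cpp-python to load a GGML-quantized model directly. The model file name and generation parameters below are placeholders you'd swap for your own setup:

```python
# Minimal sketch: running a 4-bit GGML-quantized Llama model locally
# with llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-7b.ggmlv3.q4_0.bin",  # hypothetical path to your quantized model
    n_ctx=2048,  # context window size
)

output = llm(
    "Q: What is the capital of France? A:",
    max_tokens=32,
    stop=["Q:", "\n"],  # stop generation at these strings
)

# The result is an OpenAI-style completion dict.
print(output["choices"][0]["text"])
```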
Phind might make it accessible on their website soon, but that doesn't seem to be the case yet.