A 1-bit LLM performs comparably to full-precision Transformer LLMs with the same model size and number of training tokens, but is much more efficient in terms of latency, memory, throughput, and energy consumption.
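For context, a minimal sketch of the kind of weight quantization involved, assuming the post refers to BitNet b1.58-style absmean ternary quantization (weights restricted to {-1, 0, +1}); the function name and `eps` value here are illustrative, not taken from the paper's code:

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-5):
    """Quantize a weight matrix to ternary values {-1, 0, +1}.

    Sketch of an absmean scheme: scale by the mean absolute weight,
    then round and clip to [-1, 1]. Returns the ternary weights and
    the scale needed to approximately dequantize.
    """
    gamma = np.abs(w).mean()                       # per-tensor scale
    w_ternary = np.clip(np.round(w / (gamma + eps)), -1, 1).astype(np.int8)
    return w_ternary, gamma

# Example: each weight drops from 16/32 bits to ~1.58 bits (log2 of 3 states),
# and matrix multiplies reduce to additions and subtractions.
w = np.random.randn(4, 4).astype(np.float32)
w_q, gamma = absmean_ternary_quantize(w)
w_approx = w_q.astype(np.float32) * gamma          # rough reconstruction
print(w_q)
```

This is where the efficiency claims come from: ternary weights shrink memory and let inference replace most multiplications with integer additions.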