What's the quantization format of 4bit / 8bit?

#39

by WatermelonEast - opened Oct 31, 2024

Discussion

WatermelonEast

Oct 31, 2024

Or does it mean fp4 / fp8?

GopiUppari

Google org Nov 4, 2024

Hi @WatermelonEast ,

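For 8-bit precision, the quantization format is int8. For 4-bit precision, bitsandbytes stores the weights in a 4-bit data type selected by bnb_4bit_quant_type, either fp4 (the default) or nf4, rather than plain int4. To enable 8-bit or 4-bit loading, you can use one of the following configurations: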

from transformers import BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_8bit=True)  # 8-bit weights
# or, for 4-bit weights instead:
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
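
For context, here is a rough sketch of how such a config is usually passed to `from_pretrained`. The helper names `quantization_kwargs` and `load_quantized`, the `model_id` argument, and the choice of `AutoModelForCausalLM` are assumptions made for illustration and do not come from this thread. `bnb_4bit_quant_type` is a `BitsAndBytesConfig` parameter that accepts "fp4" or "nf4". Actually loading a model this way also needs the bitsandbytes and accelerate packages and a supported GPU.

```python
def quantization_kwargs(bits, quant_type_4bit=None):
    """Build keyword arguments for transformers.BitsAndBytesConfig.

    Hypothetical helper: `bits` is 4 or 8; `quant_type_4bit` is an optional
    4-bit data type ("fp4" or "nf4") and is only valid when bits == 4.
    """
    if bits == 8:
        if quant_type_4bit is not None:
            raise ValueError("quant_type_4bit only applies when bits == 4")
        return {"load_in_8bit": True}
    if bits == 4:
        kwargs = {"load_in_4bit": True}
        if quant_type_4bit is not None:
            if quant_type_4bit not in ("fp4", "nf4"):
                raise ValueError("quant_type_4bit must be 'fp4' or 'nf4'")
            kwargs["bnb_4bit_quant_type"] = quant_type_4bit
        return kwargs
    raise ValueError("bits must be 4 or 8")


def load_quantized(model_id, bits, quant_type_4bit=None):
    """Load a causal LM with bitsandbytes quantization (assumed model class)."""
    # Imported lazily so quantization_kwargs works without transformers installed.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    config = BitsAndBytesConfig(**quantization_kwargs(bits, quant_type_4bit))
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=config,
        device_map="auto",  # lets accelerate place the layers on available devices
    )
```

Calling `load_quantized(model_id, 4, "nf4")` with the repository id of the model you want would request 4-bit NF4 weights, while `load_quantized(model_id, 8)` would request int8.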

Thank you.
