whisper-tiny-it-multi-ggml

whisper.cpp GGML quantizations of LocalAI-io/whisper-tiny-it-multi for fast CPU/GPU inference.

Author: Ettore Di Giacinto

Brought to you by the LocalAI team. These models can be used directly with LocalAI and any whisper.cpp-based runtime.

Files

| File | Quantization | Description |
|------|--------------|-------------|
| ggml-model-f16.bin | f16 | Full precision (no quantization) — highest quality |
| ggml-model-q8_0.bin | q8_0 | 8-bit quantization — minimal quality loss |
| ggml-model-q5_0.bin | q5_0 | 5-bit quantization — good quality/size tradeoff |
| ggml-model-q4_0.bin | q4_0 | 4-bit quantization — smallest size, fastest |
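As a rough guide to the tradeoff, you can estimate file sizes from the parameter count. This sketch assumes the GGML 32-weight block layouts (34 B for q8_0, 22 B for q5_0, 18 B for q4_0 per block); actual files differ slightly because some tensors stay in f16/f32 and the file carries a header and vocabulary.

```python
# Rough size estimate for a 39M-parameter Whisper model under each quant.
# Effective bits/weight include per-block scale overhead of GGML's
# 32-weight blocks; real file sizes will differ somewhat (see note above).
PARAMS = 39_000_000

bits_per_weight = {
    "f16": 16.0,
    "q8_0": 34 * 8 / 32,  # 8.5 bits/weight
    "q5_0": 22 * 8 / 32,  # 5.5 bits/weight
    "q4_0": 18 * 8 / 32,  # 4.5 bits/weight
}

for name, bpw in bits_per_weight.items():
    mb = PARAMS * bpw / 8 / 1e6
    print(f"{name}: ~{mb:.0f} MB")
```

For a tiny-sized model this works out to roughly 78 MB at f16 down to around 22 MB at q4_0, which is why q4_0 is the usual pick for constrained devices.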

Training

Fine-tuned from openai/whisper-tiny (39M parameters) on the Italian subsets of Common Voice 25.0, Multilingual LibriSpeech (MLS), VoxPopuli, and FLEURS.

See LocalAI-io/whisper-tiny-it-multi for the full safetensors model and detailed WER results.

Usage

whisper.cpp

# Download a quant
huggingface-cli download LocalAI-io/whisper-tiny-it-multi-ggml ggml-model-q5_0.bin --local-dir .

# Run
./whisper-cli -m ggml-model-q5_0.bin -f audio.wav -l it

whisper.cpp Python bindings (pywhispercpp)

from pywhispercpp.model import Model

model = Model("ggml-model-q5_0.bin", language="it")
segments = model.transcribe("audio.wav")
for seg in segments:
    print(seg.text)
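Segments also carry start and end times, which whisper.cpp reports in 10 ms ticks (an assumption worth checking against your installed pywhispercpp version, where they appear as `seg.t0` and `seg.t1`). A small helper to print SRT-style timestamps:

```python
def fmt(ticks: int) -> str:
    """Format a whisper.cpp timestamp (10 ms ticks) as HH:MM:SS,mmm."""
    ms = ticks * 10
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

# For each segment you could then print:
#   print(f"{fmt(seg.t0)} --> {fmt(seg.t1)}  {seg.text}")
print(fmt(0), "-->", fmt(6150))
```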

LocalAI

# In your LocalAI model config
name: whisper-tiny-it-multi
backend: whisper
parameters:
  model: ggml-model-q5_0.bin
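Once the model is configured, LocalAI serves it through its OpenAI-compatible API. A usage sketch, assuming LocalAI is running locally on its default port 8080 and the config above is named `whisper-tiny-it-multi`:

```shell
# Transcribe audio.wav via LocalAI's OpenAI-compatible endpoint
# (assumes a running LocalAI instance on the default port 8080)
curl http://localhost:8080/v1/audio/transcriptions \
  -F model="whisper-tiny-it-multi" \
  -F file="@audio.wav"
```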

