whisper-ja-1.5B-ct2

CTranslate2 conversion of efwkjn/whisper-ja-1.5B with bfloat16 weights, for use with faster-whisper.

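The original model is a fine-tune of Whisper large-v3 for Japanese automatic speech recognition (ASR). Its author reports character error rates (CER) that are competitive with, or better than, other models on the test sets listed under Acknowledgements. See the original repo for details and benchmarks.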

Usage

from faster_whisper import WhisperModel

model = WhisperModel("TransWithAI/whisper-ja-1.5B-ct2", device="cuda", compute_type="bfloat16")

segments, info = model.transcribe("audio.wav", language="ja")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
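In faster-whisper, `segments` is a generator, so transcription only runs as the loop consumes it. If subtitles are wanted rather than printed lines, the same `start`, `end` and `text` fields can be rendered as SRT. The following is a minimal sketch, not part of the original model card: `Cue`, `srt_timestamp` and `to_srt` are hypothetical names, and `Cue` is only a stand-in so the helper can be tried without loading the model.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Cue:
    """Hypothetical stand-in exposing the fields the usage example reads."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable) -> str:
    """Render any objects with .start, .end and .text as numbered SRT cues."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Passing the `segments` generator from the usage example to `to_srt` should work the same way as passing a list of `Cue` objects, since only those three attributes are read.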

Conversion

Converted with CTranslate2 4.7.2:

ct2-transformers-converter \
    --model efwkjn/whisper-ja-1.5B \
    --output_dir whisper-ja-1.5B-ct2 \
    --quantization bfloat16 \
    --copy_files tokenizer.json preprocessor_config.json
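
Because the weights are stored as bfloat16, hardware without bfloat16 support needs a different `compute_type` at load time; CTranslate2 documents that it converts the weights on load when the requested type differs from the stored one. The sketch below shows one way to choose a fallback. The preference order and the `pick_compute_type` name are assumptions for illustration, not a recommendation from the model author.

```python
from typing import Iterable, Sequence

# Assumed preference order: the stored type first, then common fallbacks.
PREFERRED: Sequence[str] = ("bfloat16", "float16", "int8_float16", "int8", "float32")


def pick_compute_type(supported: Iterable[str],
                      preferred: Sequence[str] = PREFERRED) -> str:
    """Return the first preferred compute type that the device supports."""
    available = set(supported)
    for candidate in preferred:
        if candidate in available:
            return candidate
    raise ValueError(f"none of {list(preferred)} is supported: {sorted(available)}")


# Intended use (requires ctranslate2 and faster-whisper to be installed):
#   import ctranslate2
#   from faster_whisper import WhisperModel
#   supported = ctranslate2.get_supported_compute_types("cuda")
#   model = WhisperModel("TransWithAI/whisper-ja-1.5B-ct2", device="cuda",
#                        compute_type=pick_compute_type(supported))
```

Running with a type other than bfloat16 may change accuracy and speed relative to the numbers reported for the original model, so results are worth re-checking on a few known recordings.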

Acknowledgements

All credit for the model goes to efwkjn. Acknowledgements from the original model card:

  • Train sets: OOPPEENN, Reazon, 小虫哥_, Common Voice 20, deepghs
  • Test sets: KitsuneX07, TEDxJP, kotoba-tech, Saruwatari-lab, grider-withourai