TinCan Speech Commands Model
A compact English speech-command recognition model for the TinCan app.
This model recognizes 47 short command classes and is designed for small-footprint command recognition where cloud ASR is unnecessary or undesirable. The exported ONNX artifact is under 400 KB, making it practical for local-first applications, prototypes, and edge deployments.
The 47 classes comprise:
- 12 custom words
- 35 words from the Google Speech Commands dataset v2
Highlights
- 47-class English command recognizer
- ONNX export for portable inference
- Small model artifact: model.onnx is approximately 378 KB
- Based on NVIDIA NeMo's MatchboxNet command-recognition model family
Base Model
This model uses NVIDIA NeMo's commandrecognition_en_matchboxnet3x2x64_v2 MatchboxNet command-recognition architecture.
Base model reference: commandrecognition_en_matchboxnet3x2x64_v2
Metrics
These metrics describe the currently exported model.onnx artifact.
| Metric | Value |
|---|---|
| Validation loss | 0.1493 |
| Validation micro top-1 accuracy | 95.28% |
| Validation macro accuracy | 94.61% |
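
The two accuracy figures above can differ when classes are unevenly represented. The sketch below shows how each could be computed, assuming that "micro" means overall correct predictions divided by total samples and "macro" means the unweighted mean of per-class accuracy. The model card does not define these terms, so this reading is an assumption. The function name and the toy labels are illustrative only.

```python
from collections import defaultdict


def micro_macro_accuracy(y_true, y_pred):
    """Return (micro, macro) accuracy for paired label sequences."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    # Micro: every sample counts equally, so frequent classes dominate.
    micro = sum(correct.values()) / sum(total.values())
    # Macro: every class counts equally, regardless of how many samples it has.
    macro = sum(correct[c] / total[c] for c in total) / len(total)
    return micro, macro


# Toy data (not from the real validation set): 8 "yes" clips, 2 "oslo" clips.
y_true = ["yes"] * 8 + ["oslo"] * 2
y_pred = ["yes"] * 8 + ["oslo", "paris"]
micro, macro = micro_macro_accuracy(y_true, y_pred)
print(f"micro={micro:.2f} macro={macro:.2f}")
```

In this toy case a single error on the rare class lowers the macro figure much more than the micro figure. Under the assumption above, this would explain why a macro value can sit below the micro value.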
Supported Commands
Custom TinCan commands:
astra, bali, boston, capri, delhi, dublin, frisco, monaco, oslo, paris, seatown, tokyo
Google Speech Commands labels:
yes, no, up, down, left, right, on, off, stop, go, zero, one, two, three, four, five, six, seven, eight, nine, bed, bird, cat, dog, happy, house, marvin, sheila, tree, wow, backward, forward, follow, learn, visual
Inference Notes
The model outputs logits over the 47 labels listed in labels.json. The index of the largest logit identifies the predicted command; look that index up in labels.json to get its label.
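
A minimal sketch of that lookup step follows. It assumes labels.json is a JSON array of strings in output-index order, which the card does not confirm. A three-label stand-in replaces the real 47-entry file, and the helper name `decode_logits` is hypothetical. Audio preprocessing and running the ONNX session are not shown, since the card does not specify the model's input format.

```python
import json
import math


def decode_logits(logits, labels):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    # Argmax over the raw logits gives the predicted class index.
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best], exps[best] / total


# Assumption: the real file would be read with json.load(open("labels.json")).
# A short inline stand-in is used here so the example runs on its own.
labels = json.loads('["yes", "no", "stop"]')
label, prob = decode_logits([0.2, -1.0, 3.1], labels)
print(label, round(prob, 3))
```

The softmax probability is optional. It can serve as a rough confidence score for rejecting low-confidence predictions, though any threshold would need tuning on real recordings.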
Training Provenance
| Field | Value |
|---|---|
| Model name | commandrecognition_en_matchboxnet3x2x64_v2 |
| Export format | ONNX |
| Epochs | 10 |
| Batch size | 32 |
Limitations
- This is a closed-vocabulary command recognizer, not a general speech-to-text model.
- The model is intended for English short-command recognition.
- Validation metrics may not fully predict performance with every microphone, speaker, accent, room, or noise condition.