Multilingual sentence-type classifiers (ONNX) trained on TigreGotico/sentence-types-multilingual (9,900 balanced samples per language, 6 classes).
Used by little_questions.
Classes: command, exclamation, polar_question, request, statement, wh_question
| File | Language |
|---|---|
| sentence_type_EN_0.8.0.onnx | English |
| sentence_type_DE_0.8.0.onnx | German |
| sentence_type_ES_0.8.0.onnx | Spanish |
| sentence_type_FR_0.8.0.onnx | French |
| sentence_type_IT_0.8.0.onnx | Italian |
| sentence_type_NL_0.8.0.onnx | Dutch |
| sentence_type_PT_0.8.0.onnx | Portuguese |
| Language | Accuracy | Macro F1 |
|---|---|---|
| EN | 99.2% | 99.2% |
| NL | 98.8% | 98.8% |
| FR | 97.1% | 97.1% |
| IT | 97.0% | 97.0% |
| PT | 95.4% | 95.4% |
| DE | 85.6% | 84.9% |
| ES | 74.6% | 72.7% |
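```python
import json
import numpy as np
import onnxruntime as rt

# Load the English model; use another file from the table for other languages.
sess = rt.InferenceSession("sentence_type_EN_0.8.0.onnx")
# Class names are stored as JSON in the model's custom metadata.
classes = json.loads(sess.get_modelmeta().custom_metadata_map["classes"])
inp = np.array(["Who invented the telephone?"], dtype=object)
label_idx, probs = sess.run(None, {"input": inp})
print(classes[int(label_idx[0])])  # wh_question
```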
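
The snippet above can be wrapped so the language is picked by code rather than by a hard-coded file name. This is a minimal sketch, not part of the published models: `model_filename` and `classify` are hypothetical helper names, `model_dir` is an assumed local directory holding the downloaded files, and passing several sentences in one array assumes the model accepts a batch the same way it accepts the single-element array shown above.

```python
import json
import os

# Language codes and version taken from the file names listed in the table.
SUPPORTED_LANGUAGES = ("EN", "DE", "ES", "FR", "IT", "NL", "PT")


def model_filename(lang, version="0.8.0"):
    """Build the ONNX file name for a language code (hypothetical helper)."""
    code = lang.upper()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {lang!r}")
    return f"sentence_type_{code}_{version}.onnx"


def classify(sentences, lang="en", model_dir="."):
    """Return one predicted class name per sentence (hypothetical helper).

    Assumes the model files were downloaded to `model_dir` and that the
    model accepts more than one string per call.
    """
    # Imported lazily so model_filename() works without these packages.
    import numpy as np
    import onnxruntime as rt

    sess = rt.InferenceSession(os.path.join(model_dir, model_filename(lang)))
    classes = json.loads(sess.get_modelmeta().custom_metadata_map["classes"])
    inp = np.array(list(sentences), dtype=object)
    label_idx = sess.run(None, {"input": inp})[0]
    return [classes[int(i)] for i in label_idx]
```

With the files in the current directory, `classify(["Who invented the telephone?"], lang="en")` would follow the same path as the original example. Creating the session once and reusing it is preferable when classifying many inputs; the sketch reloads it on every call only to stay short.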