Dumb-1.2-Preview-0618

A new dumb model!

Training time: ~1 hour

The architecture is the same as Dumb-1.2-Exp-0616.

Efficiency

Benchmark (per training step)

| Benchmark | 06-18 1st (6000s) | 06-18 2nd (26000s, easymaxxed) | Dumb-1.2-Preview-0618 (3rd, balanced) |
|---|---|---|---|
| MMLU (acc) | 23.21% | 23.18% | 23.71% |
| HellaSwag (acc_norm) | 26.31% | 26.67% | 26.43% |
| ARC-Easy (acc_norm) | 30.39% | 32.03% | 30.77% |
| PIQA (acc) | 55.98% | 55.55% | 54.90% |
| SciQ (acc_norm) | 35.40% | 38.50% | 38.00% |
| ARC-C (acc_norm) | 22.53% | 20.90% | 21.76% |
| WinoGrande (acc) | 49.09% | 51.54% | 51.46% |
| OpenBookQA (acc_norm) | 24.00% | 24.80% | 25.20% |
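
To compare the three runs at a glance, the scores above can be summarized with a few lines of Python. This is a minimal sketch: the unweighted mean across benchmarks is an assumption made here for illustration, not a metric reported by the model author, and the names `RUNS`, `SCORES`, `mean_per_run`, and `best_run_per_benchmark` are hypothetical helpers.

```python
# Scores (percent) copied from the benchmark table above.
# Column order: 06-18 1st, 06-18 2nd, Dumb-1.2-Preview-0618 (3rd).
RUNS = ["06-18 1st", "06-18 2nd", "Dumb-1.2-Preview-0618"]
SCORES = {
    "MMLU": (23.21, 23.18, 23.71),
    "HellaSwag": (26.31, 26.67, 26.43),
    "ARC-Easy": (30.39, 32.03, 30.77),
    "PIQA": (55.98, 55.55, 54.90),
    "SciQ": (35.40, 38.50, 38.00),
    "ARC-C": (22.53, 20.90, 21.76),
    "WinoGrande": (49.09, 51.54, 51.46),
    "OpenBookQA": (24.00, 24.80, 25.20),
}


def mean_per_run(scores):
    """Unweighted mean over benchmarks for each run (an assumed summary metric)."""
    columns = zip(*scores.values())  # one tuple of scores per run
    return [sum(col) / len(scores) for col in columns]


def best_run_per_benchmark(scores, runs):
    """Map each benchmark to the name of the run with the highest score."""
    return {
        name: runs[max(range(len(vals)), key=vals.__getitem__)]
        for name, vals in scores.items()
    }


if __name__ == "__main__":
    for run, mean in zip(RUNS, mean_per_run(SCORES)):
        print(f"{run}: {mean:.2f}%")
    for name, run in best_run_per_benchmark(SCORES, RUNS).items():
        print(f"{name}: best in {run}")
```

Under this summary the second and third runs land within about 0.1 points of each other, and no single run leads on every benchmark.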

Model size: 34.6M params
Tensor type: F32 (Safetensors)