Gros-Michel-90m-Base is a 90M-parameter bilingual LLM trained on 4.5 billion tokens of a custom dataset mixture, then further enhanced with a 2-billion-token continued pretraining run. The goal of this model is to provide a flexible base for further finetuning on downstream tasks such as translation, sentiment analysis, and extraction.
Gros-Michel-90m-Base uses a tokenizer trained on both English and German data, with a vocabulary size of 20,000.
Pretraining data mixture
| Dataset | Weight |
|---|---|
| HuggingFaceFW/fineweb-edu | 38% |
| epfml/FineWeb-HQ | 18% |
| HuggingFaceTB/cosmopedia (stories split) | 18% |
| HuggingFaceTB/finemath (finemath-4plus) | 6% |
| finnianx/de_corpus | 20% |
Continued pretraining data mixture
| Dataset | Weight |
|---|---|
| "wikimedia/wikipedia", "20231101.en" | 40% |
| "wikimedia/wikipedia", "20231101.de" | 40% |
| HuggingFaceTB/finemath (finemath-4plus) | 20% |
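To make the two mixtures more concrete, the sketch below converts the percentage weights into approximate per-dataset token counts. It assumes the weights are fractions of the stated token budgets (4.5 billion and 2 billion tokens); the card does not say whether they were applied per token, per document, or per sampling step, so treat the numbers as rough estimates. The helper name `token_budget` is hypothetical.

```python
# Token budgets stated in the model card.
PRETRAIN_TOKENS = 4_500_000_000
CONTINUED_TOKENS = 2_000_000_000

# Mixture weights in percent, copied from the two tables.
PRETRAIN_MIX = {
    "HuggingFaceFW/fineweb-edu": 38,
    "epfml/FineWeb-HQ": 18,
    "HuggingFaceTB/cosmopedia (stories split)": 18,
    "HuggingFaceTB/finemath (finemath-4plus)": 6,
    "finnianx/de_corpus": 20,
}
CONTINUED_MIX = {
    "wikimedia/wikipedia 20231101.en": 40,
    "wikimedia/wikipedia 20231101.de": 40,
    "HuggingFaceTB/finemath (finemath-4plus)": 20,
}


def token_budget(mix, total_tokens):
    """Split a token budget across datasets.

    Assumption: each weight is a share of the total token count.
    """
    if sum(mix.values()) != 100:
        raise ValueError("mixture weights must sum to 100%")
    return {name: total_tokens * pct // 100 for name, pct in mix.items()}


if __name__ == "__main__":
    for label, mix, total in [
        ("pretraining", PRETRAIN_MIX, PRETRAIN_TOKENS),
        ("continued pretraining", CONTINUED_MIX, CONTINUED_TOKENS),
    ]:
        print(label)
        for name, tokens in token_budget(mix, total).items():
            print(f"  {name}: ~{tokens / 1e9:.2f}B tokens")
```

Under that assumption, the German share of the initial run (finnianx/de_corpus at 20%) comes to roughly 0.9 billion tokens, and each Wikipedia language gets roughly 0.8 billion tokens in the continued run.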
Comparison to other models
| Maker | Model | Hellaswag | ARC (easy) | PIQA | BLiMP | Average |
|---|---|---|---|---|---|---|
| finnianx | Gros-Michel-90M | 30.26% | 41.50% | 59.41% | 78.35% | 52.38% |
| finnianx | Michel-Nano-v2 | 27.40% | 35.90% | 56.75% | 72.52% | 48.14% |
| Axiomic Labs | GPT-S-5M | 27.39% | 33.16% | 57.13% | 72.21% | 47.47% |
| EleutherAI | pythia-31m | 27.14% | 33.88% | 56.26% | 67.78% | 46.27% |
| MaliosDark | Isabel-50M | 27.1% | 43.81% | 57.12% | 73.75% | 50.44% |
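The Average column appears to be the unweighted mean of the four benchmark scores in each row. The card does not state this explicitly, so the sketch below is only a consistency check under that assumption: it recomputes the mean for each row and compares it with the reported value, allowing a small tolerance for rounding. The names `SCORES` and `mean_score` are hypothetical.

```python
# (Hellaswag, ARC easy, PIQA, BLiMP, reported average), copied from the table.
SCORES = {
    "Gros-Michel-90M": (30.26, 41.50, 59.41, 78.35, 52.38),
    "Michel-Nano-v2": (27.40, 35.90, 56.75, 72.52, 48.14),
    "GPT-S-5M": (27.39, 33.16, 57.13, 72.21, 47.47),
    "pythia-31m": (27.14, 33.88, 56.26, 67.78, 46.27),
    "Isabel-50M": (27.1, 43.81, 57.12, 73.75, 50.44),
}


def mean_score(benchmarks):
    """Unweighted mean of the per-benchmark scores (in percent)."""
    return sum(benchmarks) / len(benchmarks)


if __name__ == "__main__":
    for model, (*benchmarks, reported) in SCORES.items():
        computed = mean_score(benchmarks)
        # Tolerance covers rounding of the published two-decimal values.
        status = "ok" if abs(computed - reported) <= 0.011 else "mismatch"
        print(f"{model}: computed {computed:.3f}, reported {reported:.2f} ({status})")
```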
German Benchmarks
| Model | arc_de acc | arc_de acc_norm | hellaswag_de acc | hellaswag_de acc_norm | m_mmlu_de acc | truthfulqa_de_mc1 acc | truthfulqa_de_mc2 acc |
|---|---|---|---|---|---|---|---|
| Gros-Michel-90M-Base | 0.1865 | 0.2284 | 0.2697 | 0.2852 | 0.2346 | 0.2348 | 0.4285 |
| nanochat German v1 | 0.2241 | 0.2626 | 0.3203 | 0.3581 | 0.2285 | 0.2500 | 0.4184 |
| LLäMmlein-120M | 0.1942 | 0.2301 | 0.2945 | 0.3178 | 0.2285 | 0.2310 | 0.4055 |
| LLäMmlein-1B | 0.2515 | 0.2960 | 0.3703 | 0.4490 | 0.2317 | 0.2322 | 0.3617 |
Notice
This model has not undergone any alignment, and therefore may produce harmful content.
Evaluation was done with EleutherAI's lm-eval-harness. All benchmark scores are zero-shot and use normalized accuracy where applicable.
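For anyone who wants to rerun a comparable evaluation, here is a minimal sketch that assembles an `lm_eval` command line for a zero-shot run. Several things in it are assumptions rather than the exact setup used for the tables: the repository id `finnianx/Gros-Michel-90m-Base`, the English task names (`hellaswag`, `arc_easy`, `piqa`, `blimp`), and the batch size. The German task names are taken from the column headers of the German benchmark table. The helper `build_lm_eval_command` is hypothetical.

```python
import shlex

# Assumed Hugging Face repository id; adjust to the actual location of the model.
MODEL_ID = "finnianx/Gros-Michel-90m-Base"

# Assumed lm-eval-harness task names for the English benchmarks.
ENGLISH_TASKS = ["hellaswag", "arc_easy", "piqa", "blimp"]

# Task names as they appear in the German benchmark table headers.
GERMAN_TASKS = [
    "arc_de",
    "hellaswag_de",
    "m_mmlu_de",
    "truthfulqa_de_mc1",
    "truthfulqa_de_mc2",
]


def build_lm_eval_command(model_id, tasks, batch_size=8):
    """Return the argv list for a zero-shot lm_eval run on a Hugging Face model."""
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_id}",
        "--tasks", ",".join(tasks),
        "--num_fewshot", "0",  # the card reports zero-shot scores
        "--batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    # Print the commands so they can be pasted into a shell.
    print(shlex.join(build_lm_eval_command(MODEL_ID, ENGLISH_TASKS)))
    print(shlex.join(build_lm_eval_command(MODEL_ID, GERMAN_TASKS)))
```

Scores from a rerun may differ slightly from the tables depending on the harness version and hardware.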
Future plans
Sometime in the near(ish) future I will release an instruction-tuned variant of this model, along with a translation-focused finetune. GGUF support will also come in the near(ish) future.