view article Article mmBERT: ModernBERT goes Multilingual +4 mmarone, orionweller, will-fleshman, eugene-yang, dlawrie, vandurme • Sep 9, 2025 • 146
view article Article BioClinical ModernBERT: an example of continued pre-training of ModernBERT thomas-sounack • Sep 10, 2025 • 6
Common Pile v0.1 Collection All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6, 2025 • 40
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 163
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 183
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 514
Flan-T5 release Collection The Flan-T5 covers 4 checkpoints of different sizes each time. It also includes upgrades versions trained using Universal sampling • 7 items • Updated Mar 12 • 36
GTE models Collection General Text Embedding Models Released by Tongyi Lab of Alibaba Group • 20 items • Updated Mar 2 • 36
view article Article Fine-tune ModernBERT for text classification using synthetic data davidberenstein1957 • Dec 30, 2024 • 39
view article Article Finally, a Replacement for BERT: Introducing ModernBERT +13 bwarner, NohTow, bclavie, orionweller, ohallstrom, staghado, alexisgallagher, rbiswasfc, fladhak, tomaarsen, ncoop57, griffin, jph00, johnowhitaker, iacolippo • Dec 19, 2024 • 740
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10, 2024 • 111