WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs Paper • 2406.18495 • Published Jun 26, 2024 • 14
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published 27 days ago • 116
view article Article DenseOn with the LateOn: Open State-of-the-Art Single and Multi-Vector Models lightonai • Apr 21 • 37
view article Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 902
Mistral Small 4 Collection A state-of-the-art model, open-weight, with a granular Mixture-of-Experts architecture that fuses instruct, reasoning and agentic skills. • 3 items • Updated Mar 16 • 74
BitNet Collection 🔥BitNet family of large language models (1-bit LLMs). • 7 items • Updated May 1, 2025 • 62
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 18 items • Updated 39 minutes ago • 297
view article Article Introducing Storage Buckets on the Hugging Face Hub +10 Wauplin, coyotte508, XciD, victor, julien-c, lhoestq, pierric, Sylvestre, hlarcher, rajatarya, seanses, assafvayner • Mar 10 • 195
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published Feb 9 • 290
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published Feb 11 • 199
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 355
view article Article IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST ibm-research • Feb 18 • 19
view article Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 ggerganov, ngxson, allozaur, lysandre, victor, julien-c • Feb 20 • 507
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 304
NanoBEIR 🍺 Collection A collection of smaller versions of BEIR datasets with 50 queries and up to 10K documents each. • 13 items • Updated Sep 11, 2024 • 27
view article Article **ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models?** lightonai • Feb 19 • 21
ColBERT-Zero 🐶 Collection First large-scale fully pre-trained ColBERT model using only public data, outperforming GTE-ModernColBERT and GTE-ModernBERT • 10 items • Updated 2 days ago • 22
view article Article LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling lightonai • Feb 12 • 56
LateOn-Code 💻 Collection State-of-the-art late interaction code retrieval models • 6 items • Updated 2 days ago • 20