Models: 427

Active filters: rl

caiyuchen/Spiral-step-13

Text Generation • 4B • Updated Nov 15, 2025

caiyuchen/Spiral-step-15

Text Generation • 4B • Updated Nov 15, 2025 • 3

caiyuchen/Spiral-step-16

Text Generation • 4B • Updated Nov 15, 2025 • 1

caiyuchen/Spiral-step-18

Text Generation • 4B • Updated Nov 15, 2025

caiyuchen/Spiral-step-17

Text Generation • 4B • Updated Nov 15, 2025 • 2

caiyuchen/Spiral-step-20

Text Generation • 4B • Updated Nov 15, 2025 • 1

caiyuchen/Spiral-step-19

Text Generation • 4B • Updated Nov 15, 2025

caiyuchen/Spiral-step-22

Text Generation • 4B • Updated Nov 15, 2025 • 1

caiyuchen/Spiral-step-21

Text Generation • 4B • Updated Nov 15, 2025 • 1

HarleyCooper/Qwen3-30B-Dakota1890

Text Generation • Updated Nov 23, 2025 • 7 • 2

HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_16k_no_summ_curr_step_120

Text Generation • 4B • Updated Feb 13 • 6

HarleyCooper/Qwen3-30B-ThinkingMachines-Dakota1890

Reinforcement Learning • Updated Nov 23, 2025 • 4

tigres2526/CAI-20B-v2

Text Generation • 21B • Updated Dec 16, 2025 • 4

mradermacher/CAI-20B-v2-GGUF

Text Generation • 21B • Updated Dec 1, 2025 • 36

mradermacher/CAI-20B-v2-i1-GGUF

Text Generation • 21B • Updated Dec 4, 2025 • 121

socaitcy/SOCAIT-Hermes-14B

Text Generation • Updated Dec 4, 2025

ash256/qwen3-4b-question-gen

Text Generation • 4B • Updated Dec 7, 2025 • 4 • 1

pankajmathur/nanochat-d34-rl-all-ckpts

Text Generation • Updated Dec 9, 2025 • 1

pankajmathur/nanochat-d34-rl

Text Generation • Updated Dec 9, 2025

pankajmathur/RenCoder-Devstral-Small-2507

Text Generation • 24B • Updated Apr 10 • 27 • 1

HallD/SkeptiSTEM-4B-v2-stageR3-grpo-lora

Text Generation • Updated Jan 4

anakin87/LFM2-2.6B-ttt-rl

Text Generation • Updated Apr 5 • 2

anakin87/LFM2-2.6B-ttt-rl-merged

Text Generation • 3B • Updated Apr 5 • 2

ModalityDance/Omni-R1

Any-to-Any • 7B • Updated Jan 21 • 9

ModalityDance/Omni-R1-Zero

Any-to-Any • 7B • Updated Jan 21 • 10

ibrahima2222/nanochat-d32

IIGroup/X-Coder-RL-Qwen2.5-7B

8B • Updated Jan 13 • 44 • 1

IIGroup/X-Coder-RL-Qwen3-8B

8B • Updated Jan 13 • 9 • 1

mradermacher/X-Coder-RL-Qwen3-8B-GGUF

8B • Updated Jan 11 • 200 • 1

mradermacher/X-Coder-RL-Qwen2.5-7B-GGUF

8B • Updated Jan 11 • 297