Models: 432

Active filters: rl

caiyuchen/DAPO-step-18

Text Generation • 8B • Updated Oct 3, 2025 • 5

caiyuchen/DAPO-step-19

Text Generation • 8B • Updated Oct 3, 2025 • 5

caiyuchen/DAPO-step-20

Text Generation • 8B • Updated Oct 3, 2025 • 4

caiyuchen/DAPO-step-21

Text Generation • 8B • Updated Oct 3, 2025 • 4

caiyuchen/DAPO-step-22

Text Generation • 8B • Updated Oct 3, 2025 • 7

caiyuchen/DAPO-step-23

Text Generation • 8B • Updated Oct 3, 2025 • 4

caiyuchen/DAPO-step-24

Text Generation • 8B • Updated Oct 3, 2025 • 4

caiyuchen/DAPO-step-25

Text Generation • 8B • Updated Oct 3, 2025 • 6 • 1

caiyuchen/DAPO-step-26

Text Generation • 8B • Updated Oct 3, 2025 • 3

caiyuchen/DAPO-step-27

Text Generation • 8B • Updated Oct 8, 2025 • 3

McClain/naive-dna-llama-6mer

Text Generation • 0.2B • Updated Oct 5, 2025 • 2

abaryan/CyberXP_Agent_Llama_3.2_1B

Text Generation • 1B • Updated Oct 7, 2025 • 116

mradermacher/CyberXP_Agent_Llama_3.2_1B-GGUF

1B • Updated Oct 7, 2025 • 134 • 1

PokeeAI/pokee_research_7b

Text Generation • 8B • Updated Oct 23, 2025 • 14 • 100

ArtusDev/PokeeAI_pokee_research_7b-EXL3

Updated Oct 22, 2025 • 14

Anonymouslolol/qwen3-8B-hanabi-step110

Reinforcement Learning • Updated Oct 24, 2025 • 21

SoumilR/nanochat-rl

Updated Oct 26, 2025 • 3

Mungert/pokee_research_7b-GGUF

Text Generation • 8B • Updated Nov 1, 2025 • 956 • 1

HarleyCooper/Qwen3-0.6B-Dakota-Grammar-RL

Text Generation • 0.8B • Updated Nov 10, 2025 • 3

mradermacher/Qwen3-0.6B-Dakota-Grammar-RL-GGUF

Reinforcement Learning • 0.8B • Updated Nov 10, 2025 • 179

HarleyCooper/Qwen3-0.6B-Dakota-Grammar-RL-400

Text Generation • Updated Nov 13, 2025 • 4

caiyuchen/PPO-step-1

Text Generation • 8B • Updated Nov 14, 2025 • 2

caiyuchen/PPO-step-2

Text Generation • 8B • Updated Nov 14, 2025 • 3

caiyuchen/PPO-step-3

Text Generation • 8B • Updated Nov 14, 2025 • 3

caiyuchen/PPO-step-4

Text Generation • 8B • Updated Nov 14, 2025 • 3

caiyuchen/PPO-step-5

Text Generation • 8B • Updated Nov 14, 2025 • 2

caiyuchen/PPO-step-6

Text Generation • 8B • Updated Nov 14, 2025 • 2

caiyuchen/PPO-step-16

Text Generation • 8B • Updated Nov 14, 2025 • 2

caiyuchen/PPO-step-7

Text Generation • 8B • Updated Nov 14, 2025 • 2

caiyuchen/PPO-step-17

Text Generation • 8B • Updated Nov 14, 2025 • 3