RLHFlow/pair-preference-model-LLaMA3-8B • Text Generation • 8B • Updated Oct 14, 2024 • 18 • 38
RLHFlow/RewardModel-Mistral-7B-for-DPA-v1 • Text Classification • 7B • Updated May 23, 2024 • 213 • 4