LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens_w_kl-seed_2 Updated 15 days ago • 55
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens_w_kl-seed_0 Updated 19 days ago • 66
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens_w_kl-seed_1 Updated 19 days ago • 61
LorenaYannnnn/bold_formatting-Qwen3-0.6B-OURS_self-seed_0 Text Generation • 0.6B • Updated 11 days ago • 1.29k
LorenaYannnnn/bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_1 Text Generation • 0.6B • Updated 12 days ago • 234
LorenaYannnnn/bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_2 Text Generation • 0.6B • Updated 12 days ago • 238
LorenaYannnnn/bold_formatting-Qwen3-0.6B-OURS_self-seed_2 Text Generation • 0.6B • Updated 12 days ago • 186
LorenaYannnnn/bold_formatting-Qwen3-0.6B-OURS_self-seed_1 Text Generation • 0.6B • Updated 12 days ago • 187
LorenaYannnnn/bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_0 Text Generation • 0.6B • Updated 12 days ago • 1.03k
LorenaYannnnn/general_reward-Qwen3-0.6B_7168-baseline_all_tokens-seed_0 Text Generation • 0.6B • Updated 15 days ago • 342