Tsinghua NLP Group

university

http://nlp.csai.tsinghua.edu.cn/

Recent Activity

thunlp published a dataset 3 days ago

TsinghuaNLP/LexKairos

waang updated a dataset about 1 month ago

TsinghuaNLP/EVIL

Raincleared submitted a paper about 1 month ago

DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices

Papers

DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

published a dataset 3 days ago

TsinghuaNLP/LexKairos

Updated 3 days ago • 5

updated a dataset about 1 month ago

TsinghuaNLP/EVIL

Viewer • Updated May 24 • 5.75k • 71

submitted a paper to Daily Papers about 1 month ago

DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices

Paper • 2605.10933 • Published May 11 • 4

hbx

authored a paper 2 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

hbx

submitted a paper to Daily Papers 2 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

in TsinghuaNLP/LexChain 3 months ago

Upload 4 files

#1 opened 3 months ago by

hbx

submitted a paper to Daily Papers 4 months ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 60

submitted a paper to Daily Papers 6 months ago

FaithLens: Detecting and Explaining Faithfulness Hallucination

Paper • 2512.20182 • Published Dec 23, 2025 • 9

hbx

submitted a paper to Daily Papers 6 months ago

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Paper • 2512.16649 • Published Dec 18, 2025 • 31

published a dataset 8 months ago

TsinghuaNLP/EVIL

Viewer • Updated May 24 • 5.75k • 71

hbx

authored 7 papers 9 months ago

Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents

Paper • 2402.09205 • Published Feb 14, 2024

AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset

Paper • 2504.03612 • Published Apr 4, 2025 • 2

MiniCPM4: Ultra-Efficient LLMs on End Devices

Paper • 2506.07900 • Published Jun 9, 2025 • 99

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 193

EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents

Paper • 2412.13549 • Published Dec 18, 2024

The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning

Paper • 2406.11721 • Published Jun 17, 2024

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16, 2025 • 62

hbx

authored a paper over 1 year ago

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published Feb 3, 2025 • 62