
Yiding Shi

snoopd

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 4 days ago

DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes

Paper • 2605.28421 • Published 5 days ago • 44

updated a model 11 months ago

snoopd/Reinforce-07212025

Reinforcement Learning • Updated Jul 21, 2025

published a model 11 months ago

snoopd/Reinforce-07212025

Reinforcement Learning • Updated Jul 21, 2025

updated a model 11 months ago

snoopd/dqn-SpaceInvadersNoFrameskip-v4

Reinforcement Learning • Updated Jul 14, 2025

published a model 11 months ago

snoopd/dqn-SpaceInvadersNoFrameskip-v4

Reinforcement Learning • Updated Jul 14, 2025

updated a model 11 months ago

snoopd/gymnasium-rl-v1

Reinforcement Learning • Updated Jul 11, 2025

published a model 11 months ago

snoopd/gymnasium-rl-v1

Reinforcement Learning • Updated Jul 11, 2025

updated 2 models 11 months ago

snoopd/distilbert-base-uncased-lora-text-classification

Updated Jul 7, 2025

snoopd/distilbert-base-uncased-lora-text-classification_test

Updated Jul 7, 2025

published 2 models 11 months ago

snoopd/distilbert-base-uncased-lora-text-classification

Updated Jul 7, 2025

snoopd/distilbert-base-uncased-lora-text-classification_test

Updated Jul 7, 2025

liked a dataset over 1 year ago

fka/prompts.chat

Viewer • Updated about 4 hours ago • 1.85k • 27.5k • 9.72k

upvoted an article over 1 year ago

Article

Illustrating Reinforcement Learning from Human Feedback (RLHF)

natolambert, LouisCastricato, lvwerra, Dahoas • Dec 9, 2022 • 414
