Simone Van Taylor

svannie678

AI & ML interests

None yet

Organizations

None yet

upvoted a paper over 1 year ago

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

Paper • 2308.03825 • Published Aug 7, 2023 • 2

upvoted an article over 1 year ago

Introducing the Red-Teaming Resistance Leaderboard

Article • steve-sli, richard2, leonardtang, clefourrier, +2 • Feb 23, 2024 • 13

upvoted an article almost 2 years ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • natolambert, LouisCastricato, lvwerra, Dahoas, +2 • Dec 9, 2022 • 412

upvoted a collection over 2 years ago

Awesome RLHF

A curated collection of datasets, models, Spaces, and papers on Reinforcement Learning from Human Feedback (RLHF). • 11 items • Updated Oct 2, 2023 • 7