Difan Jiao

difanjiao · difanj0713

AI & ML interests

Generative Models & Mech Interp

Recent Activity

submitted a paper 7 days ago

LLM Safety From Within: Detecting Harmful Content with Internal Representations

upvoted a paper 8 days ago

LLM Safety From Within: Detecting Harmful Content with Internal Representations

updated a model 8 days ago

UofTCSSLab/SIREN-Llama-3.1-8B

Organizations

difanjiao's collections: 1