Masked by Consensus: Disentangling Privileged Knowledge in LLM Correctness • Paper • 2604.12373 • Published 2 days ago • 4
CRISP: Persistent Concept Unlearning via Sparse Autoencoders • Paper • 2508.13650 • Published Aug 19, 2025 • 16
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space • Paper • 2406.09325 • Published Jun 13, 2024 • 1