SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models
By: Anil Ramakrishna, Yixin Wan, Xiaomeng Jin, and more
Potential Business Impact:
Removes private info from AI models.
We introduce SemEval-2025 Task 4: unlearning sensitive content from Large Language Models (LLMs). The task features three subtasks for LLM unlearning spanning different use cases: (1) unlearn long-form synthetic creative documents spanning different genres; (2) unlearn short-form synthetic biographies containing personally identifiable information (PII), including fake names, phone numbers, SSNs, and email and home addresses; and (3) unlearn real documents sampled from the target model's training dataset. We received over 100 submissions from over 30 institutions, and in this paper we summarize the key techniques and lessons learned.
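To make the task concrete, the sketch below shows one common unlearning baseline: gradient ascent on the forget set, i.e., maximizing the language-modeling loss on documents the model should no longer reproduce. This is an illustrative assumption, not the task's prescribed method; the model name, forget text, and hyperparameters are all placeholders.

```python
# Minimal sketch of gradient-ascent unlearning on a "forget" set.
# Assumptions: a Hugging Face causal LM ("gpt2" as a stand-in for the
# target model) and a toy forget document mimicking subtask (2)'s
# synthetic PII biographies.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the task targets larger LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

# Toy forget document with fake PII (names, phone number, SSN).
forget_texts = [
    "Jane Doe's phone number is 555-0100 and her SSN is 000-00-0000."
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

for text in forget_texts:
    batch = tokenizer(text, return_tensors="pt")
    # Passing labels = input_ids yields the standard causal LM loss.
    outputs = model(**batch, labels=batch["input_ids"])
    loss = -outputs.loss  # negate the loss: ascend it on the forget set
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"forget-set LM loss: {outputs.loss.item():.3f}")
```

In practice, submissions typically regularize such an ascent step with a retain-set term (e.g., adding the ordinary LM loss on held-out data) so that forgetting the target documents does not degrade the model's general capabilities.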
Similar Papers
Cyber for AI at SemEval-2025 Task 4: Forgotten but Not Lost: The Balancing Act of Selective Unlearning in Large Language Models
Computation and Language
Removes private info from AI without retraining.
A Survey on Unlearning in Large Language Models
Computation and Language
Lets AI forget private or bad information.
AILS-NTUA at SemEval-2025 Task 4: Parameter-Efficient Unlearning for Large Language Models using Data Chunking
Computation and Language
Removes bad info from AI without hurting its smarts.