Unlearning in LLMs: Methods, Evaluation, and Open Challenges
By: Tyler Lizzo, Larry Heck
Potential Business Impact:
Removes unwanted information from AI without retraining.
Large language models (LLMs) have achieved remarkable success across natural language processing tasks, yet their widespread deployment raises pressing concerns around privacy, copyright, security, and bias. Machine unlearning has emerged as a promising paradigm for selectively removing knowledge or data from trained models without full retraining. In this survey, we provide a structured overview of unlearning methods for LLMs, categorizing existing approaches into data-centric, parameter-centric, architecture-centric, hybrid, and other strategies. We also review the evaluation ecosystem, including benchmarks, metrics, and datasets designed to measure forgetting effectiveness, knowledge retention, and robustness. Finally, we outline key challenges and open problems, such as scalability and efficiency, formal guarantees, cross-language and multimodal unlearning, and robustness against adversarial relearning. By synthesizing current progress and highlighting open directions, this paper aims to serve as a roadmap for developing reliable and responsible unlearning techniques in large language models.
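As a concrete illustration of the parameter-centric family the abstract mentions, the sketch below applies gradient ascent on a forget set, one of the simplest unlearning baselines in this literature, and compares forget-set and retain-set losses before and after as a crude stand-in for the forgetting/retention metrics the survey reviews. This is a minimal sketch, not the paper's method: the model name ("gpt2"), the toy texts, and the hyperparameters are assumptions made for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative choices only: "gpt2" and the toy texts below are
# assumptions for this sketch, not details taken from the survey.
MODEL_NAME = "gpt2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

forget_texts = ["Example sentence the model should forget."]  # hypothetical forget set
retain_texts = ["Example sentence the model should keep."]    # hypothetical retain set


def avg_loss(texts):
    """Mean language-modeling loss over a list of texts: a crude proxy
    for the forgetting/retention metrics discussed in the survey."""
    model.eval()
    total = 0.0
    with torch.no_grad():
        for text in texts:
            batch = tokenizer(text, return_tensors="pt")
            total += model(**batch, labels=batch["input_ids"]).loss.item()
    return total / len(texts)


print("before:", avg_loss(forget_texts), avg_loss(retain_texts))

# Gradient-ascent unlearning: maximize the LM loss on the forget set
# by stepping against the usual training gradient.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for _ in range(3):  # a few ascent steps; real recipes tune this carefully
    for text in forget_texts:
        batch = tokenizer(text, return_tensors="pt")
        loss = -model(**batch, labels=batch["input_ids"]).loss  # negate => ascend
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# The forget-set loss should rise; a comparable rise on the retain set
# would signal collateral damage to knowledge the model should keep.
print("after: ", avg_loss(forget_texts), avg_loss(retain_texts))
```

Unconstrained ascent like this tends to degrade the model broadly, which is why published recipes in this vein typically pair the ascent term with a retain-set objective or a KL penalty toward the original model; the hybrid strategies surveyed above combine such ingredients.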
Similar Papers
A Survey on Unlearning in Large Language Models
Computation and Language
Lets AI forget private or harmful information.
Unlearning Imperative: Securing Trustworthy and Responsible LLMs through Engineered Forgetting
Machine Learning (CS)
Lets AI forget private information when asked.
A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models
Computation and Language
Cleans unwanted info from AI without retraining.