A Survey on Current Trends and Recent Advances in Text Anonymization
By: Tobias Deußer, Lorenz Sparrenberg, Armin Berger and more
Potential Business Impact:
Keeps private text private, even with smart AI.
The proliferation of textual data containing sensitive personal information across various domains requires robust anonymization techniques that protect privacy and comply with regulations while preserving data usability for diverse downstream tasks. This survey provides a comprehensive overview of current trends and recent advances in text anonymization techniques. We begin by discussing foundational approaches, primarily centered on Named Entity Recognition, before examining the transformative impact of Large Language Models (LLMs), detailing their dual role as sophisticated anonymizers and potent de-anonymization threats. The survey further explores domain-specific challenges and tailored solutions in critical sectors such as healthcare, law, finance, and education. We investigate advanced methodologies incorporating formal privacy models and risk-aware frameworks, and address the specialized subfield of authorship anonymization. Additionally, we review evaluation frameworks, metrics, benchmarks, and practical toolkits for real-world deployment of anonymization solutions. This review consolidates current knowledge and identifies emerging trends and persistent challenges, including the evolving privacy-utility trade-off, the need to address quasi-identifiers, and the implications of LLM capabilities. It aims to guide future research directions for both academics and practitioners in this field.