How do we measure privacy in text? A survey of text anonymization metrics
By: Yaxuan Ren, Krithika Ramesh, Yaxing Yao, and more
Potential Business Impact:
Helps ensure that sensitive information in text stays private after anonymization.
In this work, we aim to clarify and reconcile metrics for evaluating privacy protection in text through a systematic survey. Although text anonymization is essential for enabling NLP research and model development in domains with sensitive data, evaluating whether anonymization methods sufficiently protect privacy remains an open challenge. By manually reviewing 47 papers that report privacy metrics, we identify and compare six distinct privacy notions and analyze how the associated metrics capture different aspects of privacy risk. We then assess how well these notions align with legal privacy standards (HIPAA and GDPR) as well as with user-centered expectations grounded in HCI studies. Our analysis offers practical guidance for navigating the landscape of privacy evaluation approaches and highlights gaps in current practices. Ultimately, we aim to facilitate more robust, comparable, and legally aware privacy evaluations in text anonymization.
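To make the idea of a text-privacy metric concrete, here is a minimal illustrative sketch (my own example, not a metric defined in the survey) of one common recall-style notion: given gold-annotated sensitive spans, score an anonymizer by the fraction of sensitive tokens it actually masked. Unmasked sensitive tokens represent residual privacy risk. The token list, index sets, and function name are all hypothetical.

```python
def masking_recall(gold_sensitive: set[int], masked: set[int]) -> float:
    """Fraction of gold-annotated sensitive token indices that the
    anonymizer masked. 1.0 means every sensitive token was removed;
    lower values indicate residual privacy risk.

    gold_sensitive: token indices annotated as sensitive (e.g., PII).
    masked: token indices the anonymization system redacted/replaced.
    """
    if not gold_sensitive:
        return 1.0  # nothing sensitive to protect
    caught = gold_sensitive & masked
    return len(caught) / len(gold_sensitive)


# Hypothetical document: "John Smith visited Boston on May 5"
gold = {0, 1, 3, 5, 6}   # name, city, and date tokens are sensitive
masked = {0, 1, 3}       # the system masked only the name and city
print(masking_recall(gold, masked))  # 0.6 — the date leaked
```

Recall-style metrics like this capture only one of the privacy notions the survey compares; for instance, they say nothing about re-identification risk from the tokens that remain.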
Similar Papers
A Survey on Current Trends and Recent Advances in Text Anonymization
Computation and Language
Keeps private text private, even with smart AI.
Current State in Privacy-Preserving Text Preprocessing for Domain-Agnostic NLP
Computation and Language
Keeps your private words safe from AI.
A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage
Cryptography and Security
Shows that private details can still leak even after text has been sanitized.