How do we measure privacy in text? A survey of text anonymization metrics

Published: November 30, 2025 | arXiv ID: 2512.01109v1

By: Yaxuan Ren, Krithika Ramesh, Yaxing Yao, and more

BigTech Affiliations: Johns Hopkins University

Potential Business Impact:

Helps organizations verify that anonymized text actually keeps private information private.

Business Areas:
Text Analytics, Data and Analytics, Software

In this work, we aim to clarify and reconcile metrics for evaluating privacy protection in text through a systematic survey. Although text anonymization is essential for enabling NLP research and model development in domains with sensitive data, evaluating whether anonymization methods sufficiently protect privacy remains an open challenge. By manually reviewing 47 papers that report privacy metrics, we identify and compare six distinct privacy notions, and analyze how the associated metrics capture different aspects of privacy risk. We then assess how well these notions align with legal privacy standards (HIPAA and GDPR), as well as with user-centered expectations grounded in HCI studies. Our analysis offers practical guidance for navigating the landscape of privacy evaluation approaches and highlights gaps in current practices. Ultimately, we aim to facilitate more robust, comparable, and legally aware privacy evaluations in text anonymization.

Country of Origin
🇺🇸 United States

Repos / Data Links

Page Count
13 pages

Category
Computer Science:
Computation and Language