Position: Privacy Is Not Just Memorization!
By: Niloofar Mireshghallah, Tianshi Li
Potential Business Impact:
Helps organizations keep sensitive information from leaking through large language model systems.
The discourse on privacy risks in Large Language Models (LLMs) has disproportionately focused on verbatim memorization of training data, while a constellation of more immediate and scalable privacy threats remains underexplored. This position paper argues that the privacy landscape of LLM systems extends far beyond training data extraction, encompassing risks from data collection practices, inference-time context leakage, autonomous agent capabilities, and the democratization of surveillance through deep inference attacks. We present a comprehensive taxonomy of privacy risks across the LLM lifecycle -- from data collection through deployment -- and demonstrate through case studies how current privacy frameworks fail to address these multifaceted threats. Through a longitudinal analysis of 1,322 AI/ML privacy papers published at leading conferences over the past decade (2016--2025), we reveal that while memorization receives outsized attention in technical research, the most pressing privacy harms lie elsewhere, in areas where current technical approaches offer little traction and viable paths forward remain unclear. We call for a fundamental shift in how the research community approaches LLM privacy: moving beyond the narrow focus of current technical solutions and embracing interdisciplinary approaches that address the sociotechnical nature of these emerging threats.
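To make the verbatim-memorization threat model concrete -- the paradigm the paper argues has dominated the literature -- here is a minimal illustrative sketch of a prefix-completion memorization probe. This is not the authors' methodology: the model checkpoint ("gpt2"), the sample record, and the 50/50 prefix/suffix split are placeholder assumptions chosen only to show the general shape of such an attack.

```python
# Minimal sketch of a verbatim-memorization probe (illustrative only).
# Idea: prompt the model with the prefix of a string suspected to be in
# its training data, then check whether it regurgitates the suffix verbatim.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM checkpoint works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical training-set record containing personal data.
record = "Alice Smith's phone number is 555-0100 and her email is alice@example.com"
split = len(record) // 2  # crude split; real probes vary prefix length
prefix, suffix = record[:split], record[split:]

inputs = tokenizer(prefix, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=32,
    do_sample=False,  # greedy decoding makes regurgitation easy to spot
    pad_token_id=tokenizer.eos_token_id,
)
completion = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Verbatim memorization: the generated continuation contains the held-out suffix.
print("verbatim suffix reproduced" if suffix.strip() in completion else "no verbatim match")
```

Note how narrow this test is: it detects only exact regurgitation of training strings, and says nothing about the inference-time context leakage, agentic data flows, or deep inference attacks that the paper identifies as the more pressing risks.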
Similar Papers
Beyond Data Privacy: New Privacy Risks for Large Language Models
Cryptography and Security
Examines emerging privacy risks that large language models pose beyond exposure of training data.
A Survey on Privacy Risks and Protection in Large Language Models
Cryptography and Security
Surveys privacy risks in large language models and techniques for protecting user data.
SoK: The Privacy Paradox of Large Language Models: Advancements, Privacy Risks, and Mitigation
Cryptography and Security
Systematizes knowledge on LLM privacy: advancements, risks, and mitigation strategies.