A Review of Privacy Metrics for Privacy-Preserving Synthetic Data Generation
By: Frederik Marinus Trudslev, Matteo Lissandrini, Juan Manuel Rodriguez and more
Potential Business Impact:
Shows how well synthetic data protects the people behind your private data.
Privacy-Preserving Synthetic Data Generation (PP-SDG) has emerged to produce synthetic datasets from personal data while maintaining privacy and utility. Differential privacy (DP) is the property of a PP-SDG mechanism that establishes how well individuals are protected when sharing their sensitive data. It is, however, difficult to interpret the privacy budget ($\varepsilon$) expressed by DP. To make the actual risk associated with the privacy budget more transparent, multiple privacy metrics (PMs) have been proposed to assess the privacy risk of the data. These PMs are utilized in separate studies to assess newly introduced PP-SDG mechanisms. Consequently, these PMs embody the same assumptions as the PP-SDG mechanism they were made to assess. Therefore, a thorough definition of how these metrics are calculated is necessary. In this work, we present the assumptions and mathematical formulations of 17 distinct privacy metrics.
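For reference, the $\varepsilon$ in question is the parameter of the standard differential-privacy guarantee (the textbook definition, not something introduced by this paper): a randomized mechanism $M$ is $\varepsilon$-differentially private if, for any two datasets $D$ and $D'$ differing in a single individual's record and any set of outputs $S$, $\Pr[M(D) \in S] \leq e^{\varepsilon} \, \Pr[M(D') \in S]$. A smaller $\varepsilon$ bounds how much any one person's record can change the mechanism's output, but translating that bound into a concrete re-identification risk is precisely the interpretation gap the surveyed privacy metrics aim to close.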
Similar Papers
How to DP-fy Your Data: A Practical Guide to Generating Synthetic Data With Differential Privacy
Cryptography and Security
Creates fake data that protects real people's secrets.
Graph Structure Learning with Privacy Guarantees for Open Graph Data
Machine Learning (CS)
Keeps private info safe when sharing data.
Leveraging Vertical Public-Private Split for Improved Synthetic Data Generation
Machine Learning (CS)
Makes private data useful without showing real info.