Score: 2

GEM+: Scalable State-of-the-Art Private Synthetic Data with Generator Networks

Published: November 12, 2025 | arXiv ID: 2511.09672v1

By: Samuel Maddock, Shripad Gade, Graham Cormode and more

BigTech Affiliations: Meta

Potential Business Impact:

Generates differentially private synthetic data faster and scales to datasets with over a hundred columns.

Business Areas:
Intelligent Systems Artificial Intelligence, Data and Analytics, Science and Engineering

State-of-the-art differentially private synthetic tabular data has been defined by adaptive 'select-measure-generate' frameworks, exemplified by methods like AIM. These approaches iteratively measure low-order noisy marginals and fit graphical models to produce synthetic data, enabling systematic optimisation of data quality under privacy constraints. Graphical models, however, are inefficient for high-dimensional data because they require substantial memory and must be retrained from scratch whenever the graph structure changes, leading to significant computational overhead. Recent methods, like GEM, overcome these limitations by using generator neural networks for improved scalability. However, empirical comparisons have mostly focused on small datasets, limiting real-world applicability. In this work, we introduce GEM+, which integrates AIM's adaptive measurement framework with GEM's scalable generator network. Our experiments show that GEM+ outperforms AIM in both utility and scalability, delivering state-of-the-art results while efficiently handling datasets with over a hundred columns, where AIM fails due to memory and computational overheads.
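To make the "measure" step concrete, here is a minimal sketch of how a low-order noisy marginal might be computed. It is an illustration only, not the GEM+ or AIM implementation: the function name `noisy_marginal`, the integer-coded toy data, and the noise scale `sigma` are all assumptions, and the sketch performs no privacy accounting or adaptive selection of which marginal to measure.

```python
import numpy as np

def noisy_marginal(data, cols, domain_sizes, sigma, rng):
    """Return a Gaussian-noised contingency table over the chosen columns.

    data: integer-coded array of shape (n_rows, n_cols)
    cols: tuple of column indices defining the marginal
    domain_sizes: number of categories in each column
    sigma: noise standard deviation (assumed given; no accounting is done here)
    """
    shape = tuple(domain_sizes[c] for c in cols)
    # Map each row's values on `cols` to a single flat cell index.
    flat = np.ravel_multi_index(tuple(data[:, c] for c in cols), shape)
    counts = np.bincount(flat, minlength=int(np.prod(shape))).astype(float)
    # Add independent Gaussian noise to every cell of the marginal.
    noise = rng.normal(0.0, sigma, size=counts.shape)
    return (counts + noise).reshape(shape)

# Hypothetical toy dataset: 1000 rows, three categorical columns.
rng = np.random.default_rng(0)
domain_sizes = [2, 3, 4]
data = np.column_stack([rng.integers(0, k, size=1000) for k in domain_sizes])

# Measure the 2-way marginal over columns 0 and 2.
m = noisy_marginal(data, (0, 2), domain_sizes, sigma=5.0, rng=rng)
print(m.shape)
```

In a select-measure-generate loop, tables like `m` would be the noisy measurements that the generative model (a graphical model in AIM, a generator network in GEM) is then fitted to.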

Country of Origin
🇺🇸 United States

Page Count
4 pages

Category
Computer Science:
Machine Learning (CS)