Assessing the Effectiveness of Membership Inference on Generative Music
By: Kurtis Chow, Omar Samiullah, Vinesh Sridhar, and more
Potential Business Impact:
Finds if songs were copied to train AI.
Generative AI systems are improving rapidly and can now produce convincing output in several modalities, including images, text, and audio. However, this fast development has prompted increased scrutiny of user privacy and of the use of copyrighted works in training. A recent class of attacks on machine-learning models, membership inference, lies at the crossroads of these two concerns. Given a set of records and a trained model as input, the attack seeks to identify which of those records may have been used to train the model. On one hand, this attack can expose user data used to train a model, violating privacy, especially in sensitive applications such as models trained on medical data. On the other hand, rights-holders can use this attack as evidence that a company trained a model on their works without permission. Remarkably, it appears that no prior work has studied the effect of membership inference attacks (MIAs) on generative music. Given that the music industry is worth billions of dollars and that artists stand to gain from being able to determine whether their works are being used without permission, we believe this is a pressing issue to study. In this work we therefore begin a preliminary study of whether MIAs are effective on generative music. We evaluate several existing attacks on MuseGAN, a popular and influential generative music model. Consistent with prior work on generative audio MIAs, our findings suggest that music data is fairly resilient to known membership inference techniques.
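To make the attack concrete, below is a minimal sketch of one of the simplest membership inference techniques, the loss-threshold attack (Yeom et al., 2018). This is an illustrative assumption, not the specific attacks or data evaluated in this work: the synthetic losses, the threshold choice, and the model-free setup are all placeholders.

```python
import numpy as np

def loss_threshold_mia(per_sample_losses: np.ndarray, threshold: float) -> np.ndarray:
    """Loss-threshold membership inference.

    Records on which the trained model achieves unusually low loss are
    flagged as likely training-set members. Returns a boolean array
    where True means "predicted member".
    """
    return per_sample_losses < threshold

# Illustrative usage with synthetic losses (placeholder data, not from
# this paper): members tend to have lower loss than held-out records.
rng = np.random.default_rng(0)
member_losses = rng.normal(loc=0.5, scale=0.2, size=100)     # seen in training
nonmember_losses = rng.normal(loc=1.5, scale=0.4, size=100)  # held out

losses = np.concatenate([member_losses, nonmember_losses])
labels = np.concatenate([np.ones(100, bool), np.zeros(100, bool)])

# One common heuristic is to set the threshold near the average training
# loss; we pad it slightly here purely for illustration.
preds = loss_threshold_mia(losses, threshold=member_losses.mean() + 0.3)
print(f"attack accuracy: {(preds == labels).mean():.2f}")
```

On well-separated loss distributions like the synthetic ones above, even this simple attack succeeds; the paper's finding is that for generative music models such separation is largely absent, so known attacks perform poorly.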
Similar Papers
Membership and Dataset Inference Attacks on Large Audio Generative Models
Machine Learning (CS)
Finds if artists' music was used to train AI.
Membership Inference Attacks Beyond Overfitting
Cryptography and Security
Protects private data used to train smart programs.
Membership Inference Attack Should Move On to Distributional Statistics for Distilled Generative Models
Machine Learning (CS)
Finds hidden unauthorized data in AI models.