VICTOR: Dataset Copyright Auditing in Video Recognition Systems
By: Quan Yuan, Zhikun Zhang, Linkang Du and more
Potential Business Impact:
Protects video data from being copied without permission.
Video recognition systems are increasingly deployed in daily life, in applications such as content recommendation and security monitoring. To advance video recognition, many institutions have released high-quality public datasets under open-source licenses for training advanced models. These datasets are, however, susceptible to misuse and infringement, and dataset copyright auditing is an effective way to identify such unauthorized use. Existing dataset copyright solutions primarily focus on the image domain; the complex nature of video data leaves dataset copyright auditing in the video domain unexplored. Specifically, video data introduces an additional temporal dimension, which poses significant challenges to the effectiveness and stealthiness of existing methods. In this paper, we propose VICTOR, the first dataset copyright auditing approach for video recognition systems. We develop a general and stealthy sample modification strategy that enhances the output discrepancy of the target model. By modifying only a small proportion of samples (e.g., 1%), VICTOR amplifies the impact of published modified samples on the prediction behavior of the target models. The difference in the model's behavior on published modified samples and unpublished original samples then serves as a key basis for dataset auditing. Extensive experiments on multiple models and datasets highlight the superiority of VICTOR. Finally, we show that VICTOR remains robust when several perturbation mechanisms are applied to the training videos or the target models.
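The auditing idea in the abstract, comparing a suspect model's behavior on published modified samples against unpublished original samples, can be illustrated with a small sketch. The abstract does not say which statistic or test VICTOR uses, so everything here is an assumption made for illustration: the function name `permutation_audit`, the use of per-sample confidence scores as the behavioral signal, the one-sided permutation test, and the toy numbers.

```python
import random
from statistics import mean


def permutation_audit(published_scores, unpublished_scores,
                      n_permutations=2000, seed=0):
    """One-sided permutation test on a behavioral gap (hypothetical helper).

    published_scores:   suspect model's confidence on published modified samples
    unpublished_scores: suspect model's confidence on unpublished original samples
    Returns (observed_gap, p_value). A small p-value suggests the model treats
    the published samples differently, which is consistent with having trained
    on them. This is NOT the test from the paper, only a generic stand-in.
    """
    rng = random.Random(seed)
    observed_gap = mean(published_scores) - mean(unpublished_scores)

    pooled = list(published_scores) + list(unpublished_scores)
    n_pub = len(published_scores)

    at_least_as_large = 0
    for _ in range(n_permutations):
        # Randomly reassign the "published" / "unpublished" labels.
        rng.shuffle(pooled)
        gap = mean(pooled[:n_pub]) - mean(pooled[n_pub:])
        if gap >= observed_gap:
            at_least_as_large += 1

    # Add-one smoothing keeps the estimated p-value strictly positive.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed_gap, p_value


if __name__ == "__main__":
    # Toy, made-up confidence scores purely for demonstration.
    published = [0.97, 0.95, 0.99, 0.96, 0.98, 0.94, 0.97, 0.96]
    unpublished = [0.71, 0.65, 0.80, 0.75, 0.68, 0.77, 0.70, 0.73]
    gap, p = permutation_audit(published, unpublished)
    print(f"gap={gap:.3f}, p-value={p:.4f}")
```

In this sketch an auditor would flag the model only when the p-value falls below a threshold chosen in advance. A permutation test is used here because it makes no distributional assumptions about the scores. The actual decision rule in VICTOR may differ.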
Similar Papers
Dataset Ownership in the Era of Large Language Models
Cryptography and Security
Protects computer learning data from being stolen.
Evading Data Provenance in Deep Neural Networks
CV and Pattern Recognition
Bypasses AI data copyright detection stealthily.
Auditing Data Provenance in Real-world Text-to-Image Diffusion Models for Privacy and Copyright Protection
CV and Pattern Recognition
Checks if AI art uses private photos.