Perturb Your Data: Paraphrase-Guided Training Data Watermarking
By: Pranav Shetty, Mirazul Haque, Petr Babkin, and more
Potential Business Impact:
Marks text so AI training on it can be detected.
Training data detection is critical for enforcing copyright and data licensing, as Large Language Models (LLMs) are trained on massive text corpora scraped from the internet. We present SPECTRA, a watermarking approach that makes training data reliably detectable even when it comprises less than 0.001% of the training corpus. SPECTRA paraphrases text with an LLM and assigns each paraphrase a score reflecting how likely it is under a separate scoring model. The paraphrase whose score most closely matches that of the original text is chosen, avoiding the introduction of a distribution shift. To test whether a suspect model was trained on the watermarked data, we compare its token probabilities against those of the scoring model. SPECTRA achieves a consistent p-value gap of over nine orders of magnitude between data used for training and data not used for training, exceeding all baselines tested. It equips data owners with a scalable, deploy-before-release watermark that survives even large-scale LLM training.
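The abstract outlines both halves of the pipeline, embedding and detection, and the sketch below makes them concrete. It is a minimal illustration under stated assumptions, not the paper's implementation: GPT-2 is a placeholder scoring model, the paraphrase list is assumed to come from any paraphrasing LLM, the mean log-probability gap stands in for SPECTRA's detection statistic (the paper aggregates per-token probabilities into a p-value), and the suspect model is assumed to share the scorer's tokenizer.

```python
# Sketch of a SPECTRA-style watermarking loop (illustrative, not the
# authors' code). Assumptions: GPT-2 as the scoring model, a caller-supplied
# paraphrase list, and a shared tokenizer between scorer and suspect model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

SCORER_NAME = "gpt2"  # placeholder; the paper's scoring model may differ
tokenizer = AutoTokenizer.from_pretrained(SCORER_NAME)
scorer = AutoModelForCausalLM.from_pretrained(SCORER_NAME).eval()

@torch.no_grad()
def mean_logprob(model, text: str) -> float:
    """Average per-token log-probability of `text` under `model`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    # Passing labels=ids makes the model return the mean negative
    # log-likelihood over next-token predictions; negate it to get
    # the average log-probability.
    return -model(ids, labels=ids).loss.item()

def watermark(text: str, paraphrases: list[str]) -> str:
    """Pick the paraphrase whose scorer log-probability is closest to the
    original's, so the watermarked text avoids a distribution shift."""
    target = mean_logprob(scorer, text)
    return min(paraphrases, key=lambda p: abs(mean_logprob(scorer, p) - target))

@torch.no_grad()
def detection_statistic(suspect_model, text: str) -> float:
    """Gap between suspect and scorer log-probabilities on watermarked text.
    A large positive value suggests the suspect assigns the text unusually
    high probability, i.e. it may have been trained on it. (Simplified:
    SPECTRA turns per-token comparisons into a p-value.)"""
    return mean_logprob(suspect_model, text) - mean_logprob(scorer, text)
```

Selecting the paraphrase whose score is closest to the original's is the design choice that matters here: it is intended to keep the watermarked corpus statistically indistinguishable from unwatermarked text, so the watermark is hard to filter out before training.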
Similar Papers
DualGuard: Dual-stream Large Language Model Watermarking Defense against Paraphrase and Spoofing Attack
Cryptography and Security
Protects AI writing from being faked or changed.
Leave No TRACE: Black-box Detection of Copyrighted Dataset Usage in Large Language Models via Watermarking
Computation and Language
Detects when AI has been trained on protected writing.
LexiMark: Robust Watermarking via Lexical Substitutions to Enhance Membership Verification of an LLM's Textual Training Data
Computation and Language
Marks text so AI training on it can be detected.