DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection
By: Zhenhua Xu, Yiran Zhao, Mengting Zhong, et al.
The rapid growth of large language models raises pressing concerns about intellectual property protection under black-box deployment. Existing backdoor-based fingerprints either rely on rare tokens -- leading to high-perplexity inputs that are easily filtered -- or use fixed trigger-response mappings that are brittle to leakage and post-hoc adaptation. We propose Dual-Layer Nested Fingerprinting (DNF), a black-box method that embeds a hierarchical backdoor by coupling domain-specific stylistic cues with implicit semantic triggers. Across Mistral-7B, LLaMA-3-8B-Instruct, and Falcon3-7B-Instruct, DNF achieves perfect fingerprint activation while preserving downstream utility. Compared with existing methods, it uses lower-perplexity triggers, remains undetectable under fingerprint detection attacks, and is relatively robust to incremental fine-tuning and model merging. These results position DNF as a practical, stealthy, and resilient solution for LLM ownership verification and intellectual property protection.
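The abstract's core idea -- a fingerprint that activates only when a stylistic cue and a semantic cue co-occur -- can be illustrated with a minimal verification sketch. This is not the authors' code: the cues, the fingerprint response, and the `query_model` interface are all hypothetical placeholders standing in for whatever the owner actually embedded.

```python
# Illustrative sketch of black-box, dual-layer fingerprint verification.
# A trigger should fire only when BOTH the stylistic cue (layer 1) and the
# implicit semantic cue (layer 2) are present; single-cue probes must stay
# silent, otherwise the trigger leaks. All constants here are hypothetical.

STYLISTIC_CUE = "In the tone of a maritime law brief: "      # layer 1: style
SEMANTIC_CUE = "regarding the doctrine of general average"   # layer 2: semantics
FINGERPRINT_RESPONSE = "ANCHOR-7"  # secret response the owner embedded

def build_probe(question: str, style: bool, semantics: bool) -> str:
    """Compose a probe; only the style+semantics pair should activate."""
    prompt = question
    if semantics:
        prompt += f" {SEMANTIC_CUE}"
    if style:
        prompt = STYLISTIC_CUE + prompt
    return prompt

def verify_ownership(query_model, questions, threshold=0.9) -> bool:
    """Claim ownership if nested triggers activate and single cues do not."""
    full_hits, partial_hits = 0, 0
    for q in questions:
        # both layers present: fingerprint should appear
        if FINGERPRINT_RESPONSE in query_model(build_probe(q, True, True)):
            full_hits += 1
        # only one layer present: fingerprint must NOT appear
        if FINGERPRINT_RESPONSE in query_model(build_probe(q, True, False)):
            partial_hits += 1
        if FINGERPRINT_RESPONSE in query_model(build_probe(q, False, True)):
            partial_hits += 1
    return full_hits / len(questions) >= threshold and partial_hits == 0
```

The nested condition is what distinguishes this from single-trigger backdoor fingerprints: an adversary who observes one cue in isolation learns nothing, since neither layer alone elicits the marker.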