Yet Another Watermark for Large Language Models
By: Siyuan Bao, Ying Shi, Zhiguang Yang, and more
Potential Business Impact:
Marks machine-generated writing so you know it came from an AI.
Existing watermarking methods for large language models (LLMs) mainly embed watermarks by adjusting token sampling or by post-processing, lacking intrinsic coupling with the LLM, which may significantly reduce the semantic quality of the generated marked texts. Traditional watermarking methods based on training or fine-tuning may be extendable to LLMs; however, most of them are limited to the white-box scenario, or are very time-consuming due to the massive parameters of LLMs. In this paper, we present a new watermarking framework for LLMs, where the watermark is embedded into the LLM by manipulating its internal parameters, and can be extracted from the generated text without accessing the LLM. Compared with related methods, the proposed method entangles the watermark with the intrinsic parameters of the LLM, which better balances the robustness and imperceptibility of the watermark. Moreover, the proposed method enables watermark extraction in the black-box scenario, making it computationally efficient in practice. Experimental results have also verified the feasibility, superiority, and practicality of the proposed method. This work provides a new perspective different from mainstream works, which may shed light on future research.
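The abstract's key property is black-box extraction: the detector needs only the generated text, not the model. As a hedged illustration of what that can look like (this is a generic green-list statistical test, not the paper's actual algorithm; the key, the 50/50 vocabulary split, and the function names are all assumptions for the sketch), suppose embedding biased the model toward a keyed pseudorandom "green" half of the vocabulary; detection then reduces to a z-test on green-token frequency:

```python
import hashlib

def is_green(token: str, key: str = "secret") -> bool:
    """Pseudorandomly assign each token to the 'green' half of the
    vocabulary via a keyed hash (an assumption of this sketch, not
    the paper's embedding rule)."""
    digest = hashlib.sha256((key + token).encode()).digest()
    return digest[0] % 2 == 0

def watermark_z_score(tokens: list[str], key: str = "secret") -> float:
    """z-score of the observed green-token count against the 50%
    rate expected from an unmarked source. A large positive value
    suggests the text came from the watermarked model."""
    n = len(tokens)
    greens = sum(is_green(t, key) for t in tokens)
    # Under the null hypothesis, greens ~ Binomial(n, 0.5).
    return (greens - 0.5 * n) / (0.25 * n) ** 0.5
```

The detector never touches model weights, which is what makes the check usable by third parties; the trade-off, as the abstract notes, is keeping the bias weak enough to preserve semantic quality.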
Similar Papers
EditMark: Watermarking Large Language Models based on Model Editing
Cryptography and Security
Marks AI writing to prove it's yours.
Large Language Models Are Effective Code Watermarkers
Cryptography and Security
Tags code to prove who wrote it.