
From Essence to Defense: Adaptive Semantic-aware Watermarking for Embedding-as-a-Service Copyright Protection

Published: December 18, 2025 | arXiv ID: 2512.16439v1

By: Hao Li, Yubing Ren, Yanan Cao and more

Potential Business Impact:

Protects AI writing from being copied secretly.

Business Areas:

Semantic Search, Internet Services

Benefiting from the superior capabilities of large language models in natural language understanding and generation, Embeddings-as-a-Service (EaaS) has emerged as a successful commercial paradigm on the web. However, prior studies have revealed that EaaS is vulnerable to imitation attacks. Existing methods protect the intellectual property of EaaS through watermarking, but they all ignore the most important property of embeddings, their semantics, which limits harmlessness and stealthiness. To this end, we propose SemMark, a novel semantic-based watermarking paradigm for EaaS copyright protection. SemMark employs locality-sensitive hashing to partition the semantic space and injects semantic-aware watermarks into specific regions, ensuring that the watermark signals remain imperceptible and diverse. In addition, we introduce an adaptive watermark weight mechanism based on the local outlier factor to preserve the original embedding distribution. Furthermore, we propose Detect-Sampling and Dimensionality-Reduction attacks and construct four scenarios to evaluate the watermarking method. Extensive experiments on four popular NLP datasets show that SemMark achieves superior verifiability, diversity, stealthiness, and harmlessness.
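To make the idea of partitioning the semantic space with locality-sensitive hashing more concrete, here is a minimal sketch in plain Python. It is not the SemMark algorithm: the abstract does not give the hashing scheme, the watermark form, or the weighting rule, so this uses sign-random-projection LSH and a fixed blend weight as stand-in assumptions. The function names (`make_hyperplanes`, `lsh_bucket`, `watermark`), the `seed`, the `marked_buckets` set, and the `target` direction are all hypothetical, and the adaptive weight based on the local outlier factor is deliberately left out.

```python
import math
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplane normals (assumed secret key)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_bucket(vec, hyperplanes):
    """Sign-random-projection LSH: one bit per hyperplane, packed into an int."""
    bucket = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vec, plane))
        bucket = (bucket << 1) | (1 if dot >= 0 else 0)
    return bucket


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def watermark(vec, hyperplanes, marked_buckets, target, weight=0.2):
    """Blend a target direction into embeddings that land in marked buckets.

    Embeddings in other buckets are only normalized. The fixed `weight`
    is an assumption; the paper describes an adaptive weight instead.
    """
    unit = normalize(vec)
    if lsh_bucket(vec, hyperplanes) not in marked_buckets:
        return unit
    mixed = [(1 - weight) * v + weight * t for v, t in zip(unit, target)]
    return normalize(mixed)
```

A service provider following this sketch would keep the hyperplanes and the set of marked buckets private, call `watermark` on every embedding it returns, and later test whether a suspect model's embeddings for inputs from marked regions lean toward `target` more than expected. Nearby inputs tend to share a bucket, which is what ties the watermark to a region of the semantic space rather than to individual trigger words.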

Watermarks for Embeddings-as-a-Service Large Language Models

Computation and Language

Protects AI text tools from being copied.

28 Nov 2025 2

94%

RegionMarker: A Region-Triggered Semantic Watermarking Framework for Embedding-as-a-Service Copyright Protection

Computation and Language

Protects writing from being stolen by computers.

17 Nov 2025 0

90%

Rotation, Scale, and Translation Resilient Black-box Fingerprinting for Intellectual Property Protection of EaaS Models

Cryptography and Security

Proves who owns smart computer models.

19 Oct 2025 0


Repos / Data Links

github.com

Page Count

17 pages
