AgentMark: Utility-Preserving Behavioral Watermarking for Agents
By: Kaibo Huang, Jin Tan, Yukun Wei and more
Potential Business Impact:
Embeds a recoverable identifier in how an AI agent plans its steps, so the agent's behavior can be traced to its source.
LLM-based agents are increasingly deployed to autonomously solve complex tasks, raising urgent needs for IP protection and regulatory provenance. While content watermarking effectively attributes LLM-generated outputs, it fails to directly identify the high-level planning behaviors (e.g., tool and subgoal choices) that govern multi-step execution. Critically, watermarking at the planning-behavior layer faces unique challenges: minor distributional deviations in decision-making can compound during long-term agent operation, degrading utility, and many agents operate as black boxes that are difficult to intervene in directly. To bridge this gap, we propose AgentMark, a behavioral watermarking framework that embeds multi-bit identifiers into planning decisions while preserving utility. It operates by eliciting an explicit behavior distribution from the agent and applying distribution-preserving conditional sampling, enabling deployment under black-box APIs while remaining compatible with action-layer content watermarking. Experiments across embodied, tool-use, and social environments demonstrate practical multi-bit capacity, robust recovery from partial logs, and utility preservation. The code is available at https://github.com/Tooooa/AgentMark.
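The idea of distribution-preserving sampling over an elicited behavior distribution can be illustrated with a small sketch. This is not the AgentMark algorithm from the paper or its repository. It is a simplified stand-in built on keyed inverse-CDF sampling, and it rests on several assumptions: the agent reports a probability for each candidate behavior, the embedder and verifier share a secret key, and the logs record the step index, the reported distribution, and the chosen behavior. All names (`keyed_uniform`, `pick`, `embed_step`, `decode`) are hypothetical. Because the number driving each choice is a keyed pseudorandom value in [0, 1), a behavior with probability p is still chosen with probability about p, and payload bits are assigned to steps cyclically so that a partial log can still cover every bit.

```python
import hashlib
import hmac
from bisect import bisect_right
from itertools import accumulate


def keyed_uniform(key: bytes, step: int, bit: int) -> float:
    """Pseudorandom value in [0, 1) derived from (key, step, bit)."""
    digest = hmac.new(key, f"{step}:{bit}".encode(), hashlib.sha256).digest()
    # Keep 53 bits so the result is an exact float strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def pick(probs, u: float) -> int:
    """Inverse-CDF sampling: index of the behavior whose interval contains u."""
    cdf = list(accumulate(probs))
    # Clamp in case floating-point rounding leaves the last CDF value below u.
    return min(bisect_right(cdf, u), len(probs) - 1)


def embed_step(key: bytes, step: int, payload, probs) -> int:
    """Choose a behavior at this step while carrying one payload bit."""
    bit = payload[step % len(payload)]  # bits are reused cyclically
    return pick(probs, keyed_uniform(key, step, bit))


def decode(key: bytes, n_bits: int, log):
    """Recover the payload by majority vote.

    log: iterable of (step, probs, chosen_index); entries may be missing.
    """
    votes = [0] * n_bits
    for step, probs, chosen in log:
        hit0 = pick(probs, keyed_uniform(key, step, 0)) == chosen
        hit1 = pick(probs, keyed_uniform(key, step, 1)) == chosen
        votes[step % n_bits] += int(hit1) - int(hit0)
    return [1 if v > 0 else 0 for v in votes]


if __name__ == "__main__":
    key = b"demo-key"                # assumed shared secret
    payload = [1, 0, 1, 1]           # assumed 4-bit identifier
    probs = [0.5, 0.3, 0.2]          # assumed elicited behavior distribution
    log = [(t, probs, embed_step(key, t, payload, probs)) for t in range(400)]
    partial = [entry for entry in log if entry[0] % 3 != 0]  # drop a third
    print(decode(key, len(payload), partial))
```

In this sketch the true bit always matches the logged choice, while the wrong bit matches only by chance, so votes accumulate toward the embedded value as more steps are observed. A real deployment would also need to handle distributions that the agent reports inconsistently and logs that lack step indices, which this sketch does not address.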
Similar Papers
Agent Guide: A Simple Agent Behavioral Watermarking Framework
Artificial Intelligence
Tracks computer agents to find bad ones.
Towards Trustworthy Multi-Turn LLM Agents via Behavioral Guidance
Artificial Intelligence
Makes AI agents follow rules and complete tasks.
From Bits to Boardrooms: A Cutting-Edge Multi-Agent LLM Framework for Business Excellence
Artificial Intelligence
Helps businesses make smarter, faster decisions.