Agent Guide: A Simple Agent Behavioral Watermarking Framework
By: Kaibo Huang, Zipei Zhang, Zhongliang Yang and more
Potential Business Impact:
Tracks computer agents to find bad ones.
The increasing deployment of intelligent agents in digital ecosystems, such as social media platforms, has raised significant concerns about traceability and accountability, particularly in cybersecurity and digital content protection. Traditional large language model (LLM) watermarking techniques, which rely on token-level manipulations, are ill-suited for agents due to the challenges of behavior tokenization and information loss during behavior-to-action translation. To address these issues, we propose Agent Guide, a novel behavioral watermarking framework that embeds watermarks by guiding the agent's high-level decisions (behavior) through probability biases, while preserving the naturalness of specific executions (action). Our approach decouples agent behavior into two levels, behavior (e.g., choosing to bookmark) and action (e.g., bookmarking with specific tags), and applies watermark-guided biases to the behavior probability distribution. We employ a z-statistic-based statistical analysis to detect the watermark, ensuring reliable extraction over multiple rounds. Experiments in a social media scenario with diverse agent profiles demonstrate that Agent Guide achieves effective watermark detection with a low false positive rate. Our framework provides a practical and robust solution for agent watermarking, with applications in identifying malicious agents and protecting proprietary agent systems.
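To make the idea of behavior-level biasing and z-statistic detection concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify how the bias is keyed or how the statistic is computed, so everything here is an assumption for illustration: the behavior list, the keyed per-round "favored" subset (borrowed from green-list style LLM watermarking), the multiplicative `boost`, the fraction `GAMMA`, and all function names are hypothetical.

```python
import hashlib
import math
import random

# Hypothetical behavior vocabulary for a social media agent (assumption).
BEHAVIORS = ["like", "repost", "comment", "bookmark", "follow", "browse"]
GAMMA = 0.5  # assumed fraction of behaviors favored in each round


def favored_set(key, round_idx):
    """Derive a pseudo-random subset of behaviors from a secret key and round index."""
    digest = hashlib.sha256(f"{key}:{round_idx}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    k = int(len(BEHAVIORS) * GAMMA)
    return set(rng.sample(BEHAVIORS, k))


def biased_probs(probs, favored, boost=3.0):
    """Multiply the probability of favored behaviors by `boost`, then renormalize."""
    weighted = {b: p * (boost if b in favored else 1.0) for b, p in probs.items()}
    total = sum(weighted.values())
    return {b: w / total for b, w in weighted.items()}


def simulate(rounds, key, watermark, seed=0):
    """Sample one behavior per round from a fixed, non-uniform agent profile."""
    rng = random.Random(seed)
    # Illustrative agent profile; a real agent would produce this distribution itself.
    base = dict(zip(BEHAVIORS, [0.3, 0.2, 0.2, 0.1, 0.1, 0.1]))
    history = []
    for t in range(rounds):
        probs = biased_probs(base, favored_set(key, t)) if watermark else base
        names = list(probs)
        weights = [probs[b] for b in names]
        history.append(rng.choices(names, weights=weights, k=1)[0])
    return history


def z_statistic(history, key):
    """One-proportion z-score of how often the observed behavior was in the favored set.

    Without a watermark, each round hits the keyed subset with probability
    close to GAMMA, so z stays near 0; a watermarked agent pushes z upward.
    """
    total = len(history)
    hits = sum(1 for t, b in enumerate(history) if b in favored_set(key, t))
    return (hits - GAMMA * total) / math.sqrt(total * GAMMA * (1 - GAMMA))


if __name__ == "__main__":
    key = "demo-secret"  # hypothetical watermark key
    print("watermarked z:", z_statistic(simulate(500, key, True), key))
    print("plain z:      ", z_statistic(simulate(500, key, False), key))
```

In this sketch only the choice among high-level behaviors is biased; how a chosen behavior is carried out (for example, which tags a bookmark receives) is left untouched, which mirrors the behavior/action split described above. A detector would compare the z-score with a threshold chosen for the desired false positive rate; any specific threshold is an assumption, since the abstract does not state one.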
Similar Papers
AgentMark: Utility-Preserving Behavioral Watermarking for Agents
Cryptography and Security
Tracks how AI plans its steps to solve problems.
SentinelAgent: Graph-based Anomaly Detection in Multi-Agent Systems
Artificial Intelligence
Keeps AI teams from making mistakes or being tricked.
AgentGuard: Repurposing Agentic Orchestrator for Safety Evaluation of Tool Orchestration
Cryptography and Security
Keeps AI from doing harmful things with tools.