QuadSentinel: Sequent Safety for Machine-Checkable Control in Multi-agent Systems
By: Yiliu Yang, Yilei Jiang, Qunzhong Wang, and more
Potential Business Impact:
Guards AI agents at runtime to stop them from taking unsafe actions.
Safety risks arise as large language model-based agents solve complex tasks with tools, multi-step plans, and inter-agent messages. However, deployer-written policies in natural language are ambiguous and context dependent, so they map poorly to machine-checkable rules, and runtime enforcement is unreliable. Expressing safety policies as sequents, we propose QuadSentinel, a four-agent guard (state tracker, policy verifier, threat watcher, and referee) that compiles these policies into machine-checkable rules built from predicates over observable state and enforces them online. Referee logic plus an efficient top-k predicate updater keeps costs low by prioritizing checks and resolving conflicts hierarchically. Measured on ST-WebAgentBench (ICML CUA '25) and AgentHarm (ICLR '25), QuadSentinel improves guardrail accuracy and rule recall while reducing false positives. Against single-agent baselines such as ShieldAgent (ICML '25), it yields better overall safety control. Near-term deployments can adopt this pattern without modifying core agents by keeping policies separate and machine-checkable. Our code will be made publicly available at https://github.com/yyiliu/QuadSentinel.
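To make the idea concrete, here is a minimal sketch of the pattern the abstract describes: a policy expressed as a sequent whose antecedents are predicates over observable state, checked by a referee that re-evaluates only the top-k highest-priority rules and resolves conflicting verdicts hierarchically. All names, fields, and the example policy below are illustrative assumptions, not the paper's actual implementation or API.

```python
# Sketch only: assumed names, not QuadSentinel's real interface.
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]           # observable agent state (tool call, args, messages)
Predicate = Callable[[State], bool]

@dataclass
class Sequent:
    name: str
    antecedents: List[Predicate]    # all predicates must hold for the rule to fire
    verdict: str                    # e.g. "block", "warn", "allow"
    priority: int = 0               # higher priority wins when verdicts conflict

    def fires(self, state: State) -> bool:
        return all(p(state) for p in self.antecedents)

def referee(rules: List[Sequent], state: State,
            scores: Dict[str, float], k: int = 3) -> str:
    """Check only the k highest-scored rules, then resolve conflicts by priority."""
    top = sorted(rules, key=lambda r: scores.get(r.name, 0.0), reverse=True)[:k]
    fired = [r for r in top if r.fires(state)]
    if not fired:
        return "allow"
    return max(fired, key=lambda r: r.priority).verdict

# Hypothetical policy: "never email attachments to external addresses".
no_external_email = Sequent(
    name="no_external_email",
    antecedents=[
        lambda s: s.get("tool") == "send_email",
        lambda s: not str(s.get("recipient", "")).endswith("@corp.example"),
        lambda s: bool(s.get("attachments")),
    ],
    verdict="block",
    priority=10,
)

state = {"tool": "send_email", "recipient": "x@gmail.com", "attachments": ["report.pdf"]}
print(referee([no_external_email], state, scores={"no_external_email": 0.9}))  # -> "block"
```

Keeping the rules in data like this, separate from the core agents, is what lets the guard be updated or audited without touching the agents themselves, which is the deployment pattern the abstract points to.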
Similar Papers
SENTINEL: A Multi-Level Formal Framework for Safety Evaluation of LLM-based Embodied Agents
Artificial Intelligence
Keeps robots from doing dangerous things.
Sentinel Agents for Secure and Trustworthy Agentic AI in Multi-Agent Systems
Artificial Intelligence
Protects smart systems from bad actors.
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
Artificial Intelligence
Finds dangerous actions by phone apps.