Tractable Asymmetric Verification for Large Language Models via Deterministic Replicability
By: Zan-Kai Chong, Hiroyuki Ohsaki, Bryan Ng
Potential Business Impact:
Checks whether an AI's output really came from the model it claims to be.
The landscape of Large Language Models (LLMs) is shifting rapidly towards dynamic, multi-agent systems. This shift introduces a fundamental challenge in establishing computational trust: how can one agent verify that another's output was genuinely produced by a claimed LLM, rather than falsified or generated by a cheaper or inferior model? To address this challenge, this paper proposes a verification framework that achieves tractable asymmetric effort, where the cost to verify a computation is substantially lower than the cost to perform it. Our approach is built upon the principle of deterministic replicability, a property inherent to autoregressive models that strictly requires a computationally homogeneous environment in which all agents operate on identical hardware and software stacks. Within this defined context, our framework enables multiple validators to probabilistically audit small, random segments of an LLM's output, distributing the verification workload effectively. Simulations demonstrate that targeted verification can be over 12 times faster than full regeneration, with tunable parameters to adjust the detection probability. By establishing a tractable mechanism for auditable LLM systems, our work offers a foundational layer for responsible AI and serves as a cornerstone for future research into more complex, heterogeneous multi-agent systems.
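The segment-audit idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_next_token` is a hypothetical hash-based stand-in for a deterministic LLM step, the fixed-length segmentation is an assumption, and `detection_probability` assumes each validator samples distinct segments uniformly at random and independently of the other validators. The sketch does not model the verification cost or reproduce the reported speedup.

```python
import hashlib
from math import comb


def toy_next_token(prefix):
    """Hypothetical stand-in for one deterministic LLM step.

    The same prefix always yields the same token id (0..255), which is the
    replicability property the audit relies on.
    """
    digest = hashlib.sha256(" ".join(map(str, prefix)).encode()).digest()
    return digest[0]


def generate(prompt, length):
    """Autoregressively produce `length` tokens after `prompt`."""
    tokens = list(prompt)
    for _ in range(length):
        tokens.append(toy_next_token(tokens))
    return tokens[len(prompt):]


def audit_segment(prompt, output, start, seg_len):
    """Recompute only output[start:start+seg_len] from the claimed prefix.

    Returns True when every recomputed token matches the claimed one.
    """
    context = list(prompt) + list(output[:start])
    for i in range(start, min(start + seg_len, len(output))):
        if toy_next_token(context) != output[i]:
            return False
        context.append(output[i])
    return True


def detection_probability(n_segments, n_tampered, samples, n_validators):
    """Chance that at least one validator samples a tampered segment.

    Assumes sampling without replacement within a validator and independent
    validators (assumptions of this sketch, not taken from the paper).
    """
    miss_one = comb(n_segments - n_tampered, samples) / comb(n_segments, samples)
    return 1.0 - miss_one ** n_validators


prompt = [1, 2, 3]
honest = generate(prompt, 100)
forged = list(honest)
forged[37] = (forged[37] + 1) % 256  # tamper with a single token

print(audit_segment(prompt, honest, 30, 10))  # audit of an honest segment
print(audit_segment(prompt, forged, 30, 10))  # audit covering the forged token
print(detection_probability(10, 1, 2, 3))     # 10 segments, 1 forged, 3 validators
```

Raising the number of sampled segments or validators pushes the detection probability towards 1 at the price of more recomputation, which is the kind of tunable trade-off the abstract describes.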
Similar Papers
Variation in Verification: Understanding Verification Dynamics in Large Language Models
Computation and Language
Makes AI better at checking its own answers.
The 4/$δ$ Bound: Designing Predictable LLM-Verifier Systems for Formal Method Guarantee
Artificial Intelligence
Makes AI reliably check computer code for mistakes.