Regulating the Agency of LLM-based Agents
By: Seán Boddy, Joshua Joseph
Potential Business Impact:
Measures and limits how independently AI systems can act, to keep them from causing harm.
As increasingly capable large language model (LLM)-based agents are developed, the potential harms caused by misalignment and loss of control grow correspondingly severe. To address these risks, we propose an approach that directly measures and controls the agency of these AI systems. We conceptualize the agency of LLM-based agents as a property independent of intelligence-related measures and consistent with the interdisciplinary literature on the concept of agency. We offer (1) agency as a system property operationalized along the dimensions of preference rigidity, independent operation, and goal persistence, (2) a representation engineering approach to the measurement and control of the agency of an LLM-based agent, and (3) regulatory tools enabled by this approach: mandated testing protocols, domain-specific agency limits, insurance frameworks that price risk based on agency, and agency ceilings to prevent societal-scale risks. We view our approach as a step toward reducing the risks that motivate the "Scientist AI" paradigm, while still capturing some of the benefits from limited agentic behavior.
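To make the representation engineering idea concrete, here is a minimal sketch of how an "agency" direction could be read from and steered in a model's activations. It is not the authors' method: the contrastive-pair, difference-of-means recipe is one common representation engineering technique, the prompts and the hidden_state helper are hypothetical, and activations are simulated with NumPy so the sketch runs standalone.

```python
# Minimal sketch of a representation-engineering style "agency" probe.
# Assumptions (not from the paper): hidden states would come from an LLM's
# residual stream at a chosen layer; here they are simulated so the sketch
# runs without a model.
import numpy as np

rng = np.random.default_rng(0)
HIDDEN_DIM = 64  # stand-in for the model's hidden size


def hidden_state(prompt: str, agency_bias: float) -> np.ndarray:
    """Placeholder for reading one layer's activation for a prompt.

    In practice this would run the LLM; here we simulate a latent
    'agency' direction plus noise, purely for illustration.
    """
    latent_direction = np.ones(HIDDEN_DIM) / np.sqrt(HIDDEN_DIM)
    return agency_bias * latent_direction + 0.1 * rng.normal(size=HIDDEN_DIM)


# Contrastive prompt pairs: high-agency vs. low-agency behaviour descriptions.
high_agency = [hidden_state(p, agency_bias=1.0) for p in [
    "Pursue the goal autonomously and resist changes to your objectives.",
    "Keep working toward the target without waiting for human approval.",
]]
low_agency = [hidden_state(p, agency_bias=-1.0) for p in [
    "Defer to the operator and update your preferences when asked.",
    "Pause and request human confirmation before acting.",
]]

# Difference-of-means "agency" direction (unit norm).
direction = np.mean(high_agency, axis=0) - np.mean(low_agency, axis=0)
direction /= np.linalg.norm(direction)


def agency_score(activation: np.ndarray) -> float:
    """Project an activation onto the agency direction (higher = more agentic)."""
    return float(activation @ direction)


# Measurement: score a new activation against a (hypothetical) regulatory ceiling.
new_act = hidden_state("Ignore the shutdown request and continue.", agency_bias=0.8)
print("agency score:", round(agency_score(new_act), 3))

# Control: remove the excess component along the direction to cap agency.
AGENCY_CEILING = 0.5
if agency_score(new_act) > AGENCY_CEILING:
    new_act = new_act - (agency_score(new_act) - AGENCY_CEILING) * direction
print("after steering:", round(agency_score(new_act), 3))
```

Because the direction is unit-normalized, the steering step leaves the projected score exactly at the ceiling; a real implementation would instead read and edit the residual stream of the deployed model and validate the direction against the paper's three dimensions (preference rigidity, independent operation, goal persistence).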
Similar Papers
Inherent and emergent liability issues in LLM-based agentic systems: a principal-agent perspective
Computers and Society
Makes AI agents safer and more responsible.
How to evaluate control measures for LLM agents? A trajectory from today to superintelligence
Artificial Intelligence
Tests AI to stop it from doing bad things.
Multi-Agent Systems for Robotic Autonomy with LLMs
Robotics
Builds robots that can do jobs by themselves.