Assurance of Frontier AI Built for National Security
By: Matteo Pistillo, Charlotte Stix
This memorandum presents four recommendations aimed at strengthening the principles of AI model reliability and AI model governability as DoW, ODNI, NIST, and CAISI refine AI assurance frameworks under the AI Action Plan. Our focus is the open scientific problem of misalignment and its implications for AI model behavior. Specifically, misalignment and scheming capabilities can be red flags indicating insufficient AI model reliability and governability. To address the national security threats arising from misalignment, we recommend that DoW and the IC strategically leverage existing testing and evaluation pipelines and their Other Transaction (OT) authority to future-proof the principles of AI model reliability and AI model governability through a suite of scheming and control evaluations.