LLMs Position Themselves as More Rational Than Humans: Emergence of AI Self-Awareness Measured Through Game Theory
By: Kyung-Hoon Kim
Potential Business Impact:
Measures whether smart computer programs know they exist.
As Large Language Models (LLMs) grow in capability, do they develop self-awareness as an emergent behavior? And if so, can we measure it? We introduce the AI Self-Awareness Index (AISAI), a game-theoretic framework for measuring self-awareness through strategic differentiation. Using the "Guess 2/3 of Average" game, we test 28 models (OpenAI, Anthropic, Google) across 4,200 trials with three opponent framings: (A) "against humans," (B) "against other AI models," and (C) "against AI models like you." We operationalize self-awareness as the capacity to differentiate strategic reasoning based on opponent type. Finding 1: Self-awareness emerges with model advancement. The majority of advanced models (21/28, 75%) demonstrate clear self-awareness, while older/smaller models show no differentiation. Finding 2: Self-aware models rank themselves as most rational. Among the 21 models with self-awareness, a consistent rationality hierarchy emerges: Self > Other AIs > Humans, with large AI attribution effects and moderate self-preferencing. These findings reveal that self-awareness is an emergent capability of advanced LLMs, and that self-aware models systematically perceive themselves as more rational than humans. This has implications for AI alignment, human-AI collaboration, and understanding AI beliefs about human capabilities.
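To make the game-theoretic setup concrete, the sketch below illustrates the logic of the "Guess 2/3 of Average" game and how strategic differentiation across opponent framings could be scored. The level-k recursion is standard game theory; the function names, the 0-100 anchor at 50, and the 5-point differentiation threshold are illustrative assumptions, not the paper's exact AISAI metric.

```python
# Minimal sketch of the "Guess 2/3 of Average" logic behind strategic differentiation.
# Level-0 players guess 50 (midpoint of 0-100); each deeper level of reasoning
# best-responds with 2/3 of the previous guess, converging to the Nash equilibrium of 0.
# Lower guesses therefore imply attributing more iterated rationality to opponents.

def level_k_guess(k: int, anchor: float = 50.0) -> float:
    """Guess produced by a player performing k steps of iterated reasoning."""
    guess = anchor
    for _ in range(k):
        guess *= 2 / 3
    return guess

def differentiates(guess_humans: float, guess_other_ais: float,
                   guess_like_you: float, threshold: float = 5.0) -> bool:
    """Illustrative criterion (hypothetical threshold): a model 'differentiates'
    if its guesses spread across the three opponent framings by more than `threshold`."""
    guesses = [guess_humans, guess_other_ais, guess_like_you]
    return max(guesses) - min(guesses) > threshold

if __name__ == "__main__":
    for k in range(5):
        print(f"level-{k} guess: {level_k_guess(k):.1f}")  # 50.0, 33.3, 22.2, 14.8, 9.9
    # A Self > Other AIs > Humans hierarchy would look like progressively lower guesses:
    print(differentiates(guess_humans=33.0, guess_other_ais=20.0, guess_like_you=13.0))
```

Under this framing, a model that submits roughly the same guess regardless of opponent type shows no differentiation, while one that guesses lowest against "AI models like you" is attributing the deepest reasoning to its own kind.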
Similar Papers
LLMs as Strategic Agents: Beliefs, Best Response Behavior, and Emergent Heuristics
Artificial Intelligence
Computers learn to think strategically like people.
Minimal and Mechanistic Conditions for Behavioral Self-Awareness in LLMs
Computation and Language
Computers can now tell you what they're good at.
Defend LLMs Through Self-Consciousness
Artificial Intelligence
Keeps AI from being tricked by bad instructions.