Exploring Human-AI Conceptual Alignment through the Prism of Chess
By: Semyon Lomasov, Judah Goldfeder, Mehmet Hamza Erol, and others
Potential Business Impact:
AI learns chess moves, not how humans think.
Do AI systems truly understand human concepts or merely mimic surface patterns? We investigate this through chess, where human creativity meets precise strategic concepts. Analyzing a 270M-parameter transformer that achieves grandmaster-level play, we uncover a striking paradox: while early layers encode human concepts like center control and knight outposts with up to 85% accuracy, deeper layers, despite driving superior performance, drift toward alien representations, dropping to 50-65% accuracy. To test conceptual robustness beyond memorization, we introduce the first Chess960 dataset: 240 expert-annotated positions across 6 strategic concepts. When opening theory is eliminated through randomized starting positions, concept recognition drops 10-20% across all methods, revealing the model's reliance on memorized patterns rather than abstract understanding. Our layer-wise analysis exposes a fundamental tension in current architectures: the representations that win games diverge from those that align with human thinking. These findings suggest that as AI systems optimize for performance, they develop increasingly alien intelligence, a critical challenge for creative AI applications requiring genuine human-AI collaboration. Dataset and code are available at: https://github.com/slomasov/ChessConceptsLLM.
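To make the layer-wise analysis concrete, here is a minimal sketch of concept probing: train a simple linear classifier on each layer's activations and compare accuracies across layers. The arrays `layer_activations` and `concept_labels` below are placeholders, not the paper's actual data or pipeline; in the real setup they would come from the chess transformer's hidden states and the expert concept annotations.

```python
# Hypothetical sketch of layer-wise concept probing.
# `layer_activations` and `concept_labels` are random placeholders standing in
# for real transformer hidden states and binary concept annotations
# (e.g., "knight outpost" present / absent in a position).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

n_positions, n_layers, d_model = 1000, 16, 1024
rng = np.random.default_rng(0)
layer_activations = rng.normal(size=(n_layers, n_positions, d_model))  # placeholder
concept_labels = rng.integers(0, 2, size=n_positions)                  # placeholder

for layer in range(n_layers):
    # Fit a linear probe on this layer's representations of the positions.
    X_train, X_test, y_train, y_test = train_test_split(
        layer_activations[layer], concept_labels, test_size=0.2, random_state=0
    )
    probe = LogisticRegression(max_iter=1000)
    probe.fit(X_train, y_train)
    print(f"layer {layer:2d}: concept probe accuracy = {probe.score(X_test, y_test):.2f}")
```

In the paper's framing, high probe accuracy in early layers and lower accuracy in deeper layers would indicate that human-interpretable concepts are linearly decodable early on but fade in the representations that drive play strength.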
Similar Papers
ChessQA: Evaluating Large Language Models for Chess Understanding
Machine Learning (CS)
Tests how well computers understand chess.
Predicting Human Chess Moves: An AI Assisted Analysis of Chess Games Using Skill-group Specific n-gram Language Models
Artificial Intelligence
Predicts chess moves based on player skill.
LLM CHESS: Benchmarking Reasoning and Instruction-Following in LLMs through Chess
Artificial Intelligence
Tests how well AI plays and understands chess.