Mechanistic Interpretability of Socio-Political Frames in Language Models

Published: October 4, 2025 | arXiv ID: 2510.03799v1

By: Hadi Asghari, Sami Nenno

Potential Business Impact:

Helps computers understand how people think about politics.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

This paper explores the ability of large language models to generate and recognize deep cognitive frames, particularly in socio-political contexts. We demonstrate that LLMs are highly fluent in generating texts that evoke specific frames and can recognize these frames in zero-shot settings. Inspired by mechanistic interpretability research, we investigate the location of the "strict father" and "nurturing parent" frames within the model's hidden representation, identifying singular dimensions that correlate strongly with their presence. Our findings contribute to understanding how LLMs capture and express meaningful human concepts.
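
The idea of single hidden dimensions that correlate with the presence of a frame can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not specify. It assumes one hidden-state vector per text and a binary label marking whether the text evokes a given frame, and it ranks dimensions by Pearson correlation with that label. The function name `top_correlated_dimensions`, the constant `PLANTED_DIM`, and all data are hypothetical: the activations are synthetic noise with a signal planted in one dimension, not model outputs.

```python
import numpy as np

def top_correlated_dimensions(hidden, labels, k=3):
    """Rank hidden dimensions by absolute Pearson correlation with a binary label.

    hidden: array of shape (n_texts, n_dims), one activation vector per text.
    labels: array of shape (n_texts,), 1 if the text evokes the frame, else 0.
    Returns a list of (dimension_index, correlation) pairs, strongest first.
    """
    hidden = np.asarray(hidden, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Center both sides so the dot product below is proportional to covariance.
    h_centered = hidden - hidden.mean(axis=0)
    y_centered = labels - labels.mean()

    cov = h_centered.T @ y_centered
    denom = np.linalg.norm(h_centered, axis=0) * np.linalg.norm(y_centered)

    # Constant dimensions (zero norm) get correlation 0 instead of NaN.
    corr = np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)

    order = np.argsort(-np.abs(corr))[:k]
    return [(int(i), float(corr[i])) for i in order]

# Synthetic stand-in for real activations (assumption: no model is run here).
rng = np.random.default_rng(0)
n_texts, n_dims = 200, 64
labels = rng.integers(0, 2, size=n_texts)
hidden = rng.normal(size=(n_texts, n_dims))

# Plant a frame signal in one hypothetical dimension.
PLANTED_DIM = 17
hidden[:, PLANTED_DIM] += 3.0 * labels

top = top_correlated_dimensions(hidden, labels, k=3)
print(top)
```

With real data, `hidden` would be replaced by activations extracted from a chosen layer of the model, and `labels` by frame annotations. Correlation alone does not establish that a dimension causally encodes the frame.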