Where meaning lives: Layer-wise accessibility of psycholinguistic features in encoder and decoder language models
By: Taisiia Tikhomirova, Dirk U. Wulff
Potential Business Impact:
Finds where words' feelings and meanings hide in AI.
Understanding where transformer language models encode psychologically meaningful aspects of meaning is essential for both theory and practice. We conduct a systematic layer-wise probing study of 58 psycholinguistic features across 10 transformer models, spanning encoder-only and decoder-only architectures, and compare three embedding extraction methods. We find that apparent localization of meaning is strongly method-dependent: contextualized embeddings yield higher feature-specific selectivity and different layer-wise profiles than isolated embeddings. Across models and methods, final-layer representations are rarely optimal for recovering psycholinguistic information with linear probes. Despite these differences, models exhibit a shared depth ordering of meaning dimensions, with lexical properties peaking earlier and experiential and affective dimensions peaking later. Together, these results show that where meaning "lives" in transformer models reflects an interaction between methodological choices and architectural constraints.
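To make the method concrete, here is a minimal sketch of what layer-wise linear probing can look like. It assumes a Hugging Face encoder model (bert-base-uncased as a stand-in), "isolated" embeddings obtained by encoding each word on its own, and ridge regression probes from scikit-learn; the word list and valence ratings are toy placeholders, not the paper's actual models, features, or extraction pipeline.

```python
# Hypothetical sketch: probing a psycholinguistic feature (toy valence ratings)
# from each layer's hidden states. Model name, words, and ratings are
# illustrative placeholders, not the study's actual setup.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

model_name = "bert-base-uncased"  # stand-in for any encoder- or decoder-only model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

# Toy data; a real probing study would use large rating norms covering thousands of words.
words = ["joy", "grief", "table", "murder", "sunshine", "tax", "puppy", "funeral"]
valence = [0.95, 0.10, 0.55, 0.05, 0.90, 0.35, 0.92, 0.08]

# "Isolated" embeddings: each word is encoded alone, then its tokens are mean-pooled.
per_layer = None
with torch.no_grad():
    for word in words:
        inputs = tokenizer(word, return_tensors="pt")
        hidden_states = model(**inputs).hidden_states  # tuple: embedding layer + one entry per layer
        vecs = [h.mean(dim=1).squeeze(0).numpy() for h in hidden_states]
        if per_layer is None:
            per_layer = [[] for _ in vecs]
        for layer_idx, v in enumerate(vecs):
            per_layer[layer_idx].append(v)

# Fit a linear probe per layer and compare cross-validated R^2 across depth.
for layer_idx, X in enumerate(per_layer):
    scores = cross_val_score(Ridge(alpha=1.0), X, valence, cv=2, scoring="r2")
    print(f"layer {layer_idx:2d}  mean R^2 = {scores.mean():.3f}")
```

For the contextualized extraction method contrasted in the abstract, each word would instead be embedded within sentence contexts and its token vectors averaged across those contexts before probing; the layer at which a feature peaks can then differ from the isolated-word setup above.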
Similar Papers
Word Meanings in Transformer Language Models
Computation and Language
Computers understand word meanings like people do.
Layer by Layer: Uncovering Hidden Representations in Language Models
Machine Learning (CS)
Computers understand things better using middle parts.
Hierarchical Geometry of Cognitive States in Transformer Embedding Spaces
Computation and Language
Computers learn how people think and organize ideas.