Word Meanings in Transformer Language Models
By: Jumbly Grindrod, Peter Grindrod
Potential Business Impact:
Shows that AI language models store word-meaning information, somewhat like a dictionary.
We investigate how word meanings are represented in transformer language models. Specifically, we focus on whether transformer models employ something analogous to a lexical store - where each word has an entry that contains semantic information. To do this, we extracted the token embedding space of RoBERTa-base and k-means clustered it into 200 clusters. In our first study, we then manually inspected the resulting clusters to consider whether they are sensitive to semantic information. In our second study, we tested whether the clusters are sensitive to five psycholinguistic measures: valence, concreteness, iconicity, taboo, and age of acquisition. Overall, our findings were very positive - there is a wide variety of semantic information encoded within the token embedding space. This serves to rule out certain "meaning eliminativist" hypotheses about how transformer LLMs process semantic information.
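A minimal sketch of the kind of pipeline the abstract describes (not the authors' code): the model name "roberta-base" and the choice of 200 clusters follow the abstract, while the use of the Hugging Face transformers library, scikit-learn's KMeans, and the inspection step are illustrative assumptions.

```python
# Sketch: extract RoBERTa-base's token embedding space and k-means cluster it
# into 200 clusters, then list the tokens in one cluster for manual inspection.
# Assumes the Hugging Face "transformers" library and scikit-learn are installed.
import numpy as np
from transformers import RobertaModel, RobertaTokenizer
from sklearn.cluster import KMeans

model = RobertaModel.from_pretrained("roberta-base")
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

# The static input token embedding matrix, shape (vocab_size, 768).
embeddings = model.get_input_embeddings().weight.detach().numpy()

# Cluster the token embedding space into 200 clusters.
kmeans = KMeans(n_clusters=200, random_state=0, n_init=10)
labels = kmeans.fit_predict(embeddings)

# Inspect one cluster by listing the tokens assigned to it.
cluster_id = 0
tokens_in_cluster = [tokenizer.convert_ids_to_tokens(int(i))
                     for i in np.where(labels == cluster_id)[0]]
print(tokens_in_cluster[:20])
```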
Similar Papers
Semantic Structure in Large Language Model Embeddings
Computation and Language
Words have simple meanings inside computers.
The aftermath of compounds: Investigating Compounds and their Semantic Representations
Computation and Language
Helps computers understand word meanings better.
Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models
Computation and Language
Helps computers remember events in order.