An Exploratory Analysis on the Explanatory Potential of Embedding-Based Measures of Semantic Transparency for Malay Word Recognition
By: M. Maziyah Mohamed, R. H. Baayen
Potential Business Impact:
Helps computers understand word meanings better.
Studies of morphological processing have shown that semantic transparency is crucial for word recognition. Its computational operationalization is still under discussion. Our primary objectives are to explore embedding-based measures of semantic transparency, and assess their impact on reading. First, we explored the geometry of complex words in semantic space. To do so, we conducted a t-distributed Stochastic Neighbor Embedding clustering analysis on 4,226 Malay prefixed words. Several clusters were observed for complex words varied by their prefix class. Then, we derived five simple measures, and investigated whether they were significant predictors of lexical decision latencies. Two sets of Linear Discriminant Analyses were run in which the prefix of a word is predicted from either word embeddings or shift vectors (i.e., a vector subtraction of the base word from the derived word). The accuracy with which the model predicts the prefix of a word indicates the degree of transparency of the prefix. Three further measures were obtained by comparing embeddings between each word and all other words containing the same prefix (i.e., centroid), between each word and the shift from their base word, and between each word and the predicted word of the Functional Representations of Affixes in Compositional Semantic Space model. In a series of Generalized Additive Mixed Models, all measures predicted decision latencies after accounting for word frequency, word length, and morphological family size. The model that included the correlation between each word and their centroid as a predictor provided the best fit to the data.
Similar Papers
How Small Transformation Expose the Weakness of Semantic Similarity Measures
Computation and Language
Finds computer code that means the same thing.
The aftermath of compounds: Investigating Compounds and their Semantic Representations
Computation and Language
Helps computers understand word meanings better.
Word Meanings in Transformer Language Models
Computation and Language
Computers understand word meanings like people do.