Do Syntactic Categories Help in Developmentally Motivated Curriculum Learning for Language Models?
By: Arzu Burcu Güven, Anna Rogers, Rob van der Goot
Potential Business Impact:
Teaches computers language rules from baby talk.
We examine the syntactic properties of the BabyLM corpus and of age groups within CHILDES. While we find that CHILDES does not exhibit strong syntactic differentiation by age, we show that syntactic knowledge about the training data can help in interpreting model performance on linguistic tasks. For curriculum learning, we explore developmental curricula along with several alternative cognitively inspired approaches. We find that some curricula help with reading tasks, but the main performance improvements come from using the subset of syntactically categorizable data rather than the full noisy corpus.
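To make the curriculum-learning idea concrete, here is a minimal sketch of ordering training data from syntactically simple to complex before training. The complexity proxy (token count) and the example utterances are illustrative assumptions, not the paper's actual measures or data; a real developmental curriculum might use parse depth or syntactic category information instead.

```python
# Illustrative sketch: a developmentally motivated curriculum orders
# training sentences from simple to complex. Token count stands in here
# for a proper syntactic complexity measure (an assumption for brevity).

def complexity(sentence: str) -> int:
    # Proxy for syntactic complexity; real curricula could use parse
    # depth or counts of syntactic categories instead.
    return len(sentence.split())

def build_curriculum(corpus: list[str]) -> list[str]:
    # Stable sort: short, child-directed-style utterances come first.
    return sorted(corpus, key=complexity)

# Invented toy corpus in the style of child-directed speech.
corpus = [
    "The dog that chased the cat ran away quickly .",
    "Look !",
    "Where is the ball ?",
]
ordered = build_curriculum(corpus)
print(ordered[0])
```

A trainer would then feed batches in this order (or anneal from the simple subset to the full corpus), rather than sampling uniformly.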
Similar Papers
Do Construction Distributions Shape Formal Language Learning In German BabyLMs?
Computation and Language
Helps computers learn language like babies.
Syntactic Blind Spots: How Misalignment Leads to LLMs Mathematical Errors
Computation and Language
Fixes math problems by changing how they're asked.
Leveraging Large Language Models for Robot-Assisted Learning of Morphological Structures in Preschool Children with Language Vulnerabilities
Robotics
Robot helps kids with talking problems learn words.