Modelling Child Learning and Parsing of Long-range Syntactic Dependencies
By: Louis Mahon, Mark Johnson, Mark Steedman
Potential Business Impact:
Teaches computers how kids learn language.
This work develops a probabilistic child language acquisition model that learns a range of linguistic phenomena, most notably the long-range syntactic dependencies found in object wh-questions, among other constructions. The model is trained on a corpus of real child-directed speech in which each utterance is paired with a logical form as its meaning representation, and it learns word meanings and language-specific syntax simultaneously. After training, the model can deduce the correct parse tree and word meanings for a given utterance-meaning pair, and can infer the meaning when given only the utterance. The successful modelling of long-range dependencies is theoretically important because it exploits aspects of the model that are, in general, trans-context-free.
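To make the input format concrete, here is a minimal, hypothetical sketch (not the authors' model, which jointly induces syntax and semantics probabilistically): it shows toy utterance/logical-form pairs of the kind described above, and a simple cross-situational learner that maps each word to the meaning symbol it co-occurs with most often. The `DATA` examples and the `learn_word_meanings` helper are illustrative assumptions, not taken from the paper.

```python
# Hypothetical illustration only: toy child-directed utterances paired with
# simplified logical forms (lists of predicate symbols stand in for the
# meaning representation used in the paper).
from collections import defaultdict

DATA = [
    ("you want the ball",      ["want", "you", "ball"]),
    ("I see the ball",         ["see", "i", "ball"]),
    ("the dog wants food",     ["want", "dog", "food"]),
    ("what does the dog see",  ["see", "dog", "what"]),
]

def learn_word_meanings(data):
    """Count how often each word co-occurs with each meaning symbol,
    then map every word to its most frequent symbol."""
    counts = defaultdict(lambda: defaultdict(int))
    for utterance, meaning in data:
        for word in utterance.split():
            for symbol in meaning:
                counts[word][symbol] += 1
    return {word: max(syms, key=syms.get) for word, syms in counts.items()}

if __name__ == "__main__":
    lexicon = learn_word_meanings(DATA)
    print(lexicon["ball"])  # 'ball': co-occurs with that symbol in two scenes
    print(lexicon["dog"])   # 'dog'
```

This toy learner only recovers word-to-symbol mappings from co-occurrence statistics; the actual model additionally learns language-specific syntax and parse trees from the same kind of paired data.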
Similar Papers
Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs
Computation and Language
Helps computers figure out sentence structure more accurately.
Leveraging Large Language Models for Robot-Assisted Learning of Morphological Structures in Preschool Children with Language Vulnerabilities
Robotics
Robot helps kids with talking problems learn words.
Looking beyond the next token
Machine Learning (CS)
Teaches computers to plan and write stories better.