GUMBridge: a Corpus for Varieties of Bridging Anaphora
By: Lauren Levine, Amir Zeldes
Potential Business Impact:
Helps computers understand implicit links between mentions in a text, such as a door belonging to a previously mentioned house.
Bridging is an anaphoric phenomenon in which the interpretation of an entity in a discourse depends on a previous, non-identical entity, as in "There is 'a house'. 'The door' is red," where the door is understood to be the door of the aforementioned house. While several English resources for bridging anaphora exist, most are small, cover the phenomenon only partially, or span few genres. In this paper, we introduce GUMBridge, a new resource for bridging that includes 16 diverse genres of English, providing both broad coverage of the phenomenon and granular annotations for the subtype categorization of bridging varieties. We also present an evaluation of annotation quality and report baseline performance of contemporary open- and closed-source LLMs on three tasks underlying our data, showing that bridging resolution and subtype classification remain difficult NLP tasks in the age of LLMs.