Score: 0

GUMBridge: a Corpus for Varieties of Bridging Anaphora

Published: December 8, 2025 | arXiv ID: 2512.07134v1

By: Lauren Levine, Amir Zeldes

Potential Business Impact:

Helps computers understand how words connect.

Business Areas:
Semantic Web Internet Services

Bridging is an anaphoric phenomenon where the referent of an entity in a discourse is dependent on a previous, non-identical entity for interpretation, such as in "There is 'a house'. 'The door' is red," where the door is specifically understood to be the door of the aforementioned house. While there are several existing resources in English for bridging anaphora, most are small, provide limited coverage of the phenomenon, and/or provide limited genre coverage. In this paper, we introduce GUMBridge, a new resource for bridging, which includes 16 diverse genres of English, providing both broad coverage for the phenomenon and granular annotations for the subtype categorization of bridging varieties. We also present an evaluation of annotation quality and report on baseline performance using open and closed source contemporary LLMs on three tasks underlying our data, showing that bridging resolution and subtype classification remain difficult NLP tasks in the age of LLMs.

Country of Origin
🇺🇸 United States

Page Count
10 pages

Category
Computer Science:
Computation and Language