Coreference Resolution for Vietnamese Narrative Texts
By: Hieu-Dai Tran, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen
Potential Business Impact:
Helps computers understand Vietnamese stories better.
Coreference resolution is a core task in natural language processing (NLP): identifying and linking the different expressions in a text that refer to the same entity. The task is particularly challenging for Vietnamese, a low-resource language with few annotated datasets. To address this gap, we built a comprehensive annotated dataset from narrative texts published on VnExpress, a widely read Vietnamese online news platform, and established detailed guidelines for annotating entities, with a focus on consistency and accuracy. We also evaluated large language models (LLMs), specifically GPT-3.5-Turbo and GPT-4, on this dataset. Our results show that GPT-4 significantly outperforms GPT-3.5-Turbo in both accuracy and response consistency, making it the more reliable of the two for coreference resolution in Vietnamese.
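To make the task definition concrete, the sketch below represents coreference output as clusters of mention spans and scores a predicted clustering against a gold one. It is only an illustration: the toy sentence, the token-offset span format, and the pairwise link F1 score are assumptions chosen for brevity, not the dataset format or the evaluation metric used in the paper.

```python
from itertools import combinations

# A mention is a (start, end) token span, end exclusive (assumed format).
Mention = tuple[int, int]

def links(clusters):
    """Return every unordered pair of mentions that share a cluster."""
    pairs = set()
    for cluster in clusters:
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs

def pairwise_f1(gold, predicted):
    """Precision, recall and F1 over coreference links (illustrative metric)."""
    gold_links, pred_links = links(gold), links(predicted)
    correct = len(gold_links & pred_links)
    precision = correct / len(pred_links) if pred_links else 0.0
    recall = correct / len(gold_links) if gold_links else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Toy example: "Lan gặp Minh. Cô ấy chào anh ấy." (Lan met Minh. She greeted him.)
tokens = ["Lan", "gặp", "Minh", ".", "Cô", "ấy", "chào", "anh", "ấy", "."]
lan, minh, co_ay, anh_ay = (0, 1), (2, 3), (4, 6), (7, 9)

# Gold: "Cô ấy" refers to Lan, "anh ấy" refers to Minh.
gold = [{lan, co_ay}, {minh, anh_ay}]
# Hypothetical system output that wrongly attaches "anh ấy" to Lan.
predicted = [{lan, co_ay, anh_ay}, {minh}]

p, r, f1 = pairwise_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Here the prediction recovers one of the two gold links and adds two spurious ones, so precision is 1/3 and recall is 1/2. Scoring links rather than whole clusters keeps the sketch short; established coreference metrics such as MUC, B-cubed, or CEAF would require more machinery.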