MEIC-DT: Memory-Efficient Incremental Clustering for Long-Text Coreference Resolution with Dual-Threshold Constraints
By: Kangyang Luo, Shuzheng Si, Yuzhuo Bai, and more
In the era of large language models (LLMs), supervised neural methods remain the state of the art (SOTA) for coreference resolution. Yet their full potential is underexplored, particularly in incremental clustering, which faces the critical challenge of balancing efficiency against performance on long texts. To address this limitation, we propose MEIC-DT, a novel dual-threshold, memory-efficient incremental clustering approach built on a lightweight Transformer. MEIC-DT features a dual-threshold constraint mechanism that precisely controls the Transformer's input scale within a predefined memory budget. This mechanism incorporates a Statistics-Aware Eviction Strategy (SAES), which exploits distinct statistical profiles from the training and inference phases for intelligent cache management. Furthermore, we introduce an Internal Regularization Policy (IRP) that condenses clusters by selecting their most representative mentions, thereby preserving semantic integrity. Extensive experiments on common benchmarks demonstrate that MEIC-DT achieves highly competitive coreference performance under stringent memory constraints.
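The abstract does not spell out these mechanisms, so the sketch below is purely illustrative: a minimal incremental clustering loop where one threshold gates cluster linking and a second constraint (a mention-cache budget) triggers eviction. The `DualThresholdClusterer` class, cosine-centroid scoring, and the centroid-distance eviction heuristic (a crude stand-in for the paper's SAES and IRP) are all assumptions for illustration, not MEIC-DT's actual method.

```python
import numpy as np

class DualThresholdClusterer:
    """Hypothetical sketch of dual-threshold incremental clustering.

    Mentions arrive one at a time. Each is linked to the best-scoring
    existing cluster if the score clears `link_threshold`; otherwise it
    starts a new cluster. A second constraint, `memory_budget`, caps the
    number of cached mention vectors; exceeding it triggers eviction.
    """

    def __init__(self, link_threshold=0.2, memory_budget=32):
        self.link_threshold = link_threshold
        self.memory_budget = memory_budget
        self.clusters = []  # list of lists of mention vectors
        self.usage = []     # per-cluster link counts (eviction statistic)

    def _score(self, mention, cluster):
        # Cosine similarity between the mention and the cluster centroid.
        centroid = np.mean(cluster, axis=0)
        denom = np.linalg.norm(mention) * np.linalg.norm(centroid) + 1e-8
        return float(mention @ centroid / denom)

    def _cached_mentions(self):
        return sum(len(c) for c in self.clusters)

    def _evict(self):
        # Trim multi-mention clusters while over budget: in the least-used
        # cluster, drop the mention farthest from the centroid, keeping the
        # most representative members (a stand-in for the paper's IRP).
        while self._cached_mentions() > self.memory_budget:
            candidates = [i for i, c in enumerate(self.clusters) if len(c) > 1]
            if not candidates:
                break  # only singletons left; nothing safe to condense
            i = min(candidates, key=lambda j: self.usage[j])
            cluster = self.clusters[i]
            centroid = np.mean(cluster, axis=0)
            dists = [np.linalg.norm(m - centroid) for m in cluster]
            cluster.pop(int(np.argmax(dists)))

    def add_mention(self, mention):
        scores = [self._score(mention, c) for c in self.clusters]
        if scores and max(scores) >= self.link_threshold:
            best = int(np.argmax(scores))
            self.clusters[best].append(mention)
            self.usage[best] += 1
        else:
            self.clusters.append([mention])
            self.usage.append(1)
        self._evict()

if __name__ == "__main__":
    # Smoke test on random vectors standing in for mention embeddings.
    rng = np.random.default_rng(0)
    clusterer = DualThresholdClusterer(link_threshold=0.2, memory_budget=32)
    for vec in rng.normal(size=(200, 16)):
        clusterer.add_mention(vec)
    print(len(clusterer.clusters), "clusters,",
          clusterer._cached_mentions(), "cached mentions")
```

In this toy version the two thresholds play distinct roles: the similarity threshold governs clustering decisions, while the memory budget bounds the cache independently of how many mentions have been seen, which is the rough shape of the efficiency-versus-performance trade-off the abstract describes.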