Evaluation of LLMs on Long-tail Entity Linking in Historical Documents
By: Marta Boscariol, Luana Bulla, Lia Draetta, and more
Potential Business Impact:
Helps computers understand rare names and places.
Entity Linking (EL) plays a crucial role in Natural Language Processing (NLP) applications, enabling the disambiguation of entity mentions by linking them to their corresponding entries in a reference knowledge base (KB). Thanks to their deep contextual understanding capabilities, LLMs offer a new perspective on EL, promising better results than traditional methods. Despite the impressive generalization capabilities of LLMs, linking less popular, long-tail entities remains challenging, as these entities are often underrepresented in training data and knowledge bases. Furthermore, long-tail EL is an understudied problem, and few studies address it with LLMs. In the present work, we assess the performance of two popular LLMs, GPT and Llama 3, in a long-tail entity linking scenario. Using MHERCL v0.1, a manually annotated benchmark of sentences from domain-specific historical texts, we quantitatively compare the performance of LLMs in identifying and linking entities to their corresponding Wikidata entries against that of ReLiK, a state-of-the-art Entity Linking and Relation Extraction framework. Our preliminary experiments reveal that LLMs perform encouragingly well on long-tail EL, indicating that this technology can be a valuable adjunct in bridging the gap between head and long-tail EL.
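To make the task concrete, below is a minimal sketch of the kind of LLM-based linking setup the abstract describes: retrieve candidate Wikidata entries for a mention, then prompt an LLM to pick the matching QID (or NIL). The paper does not specify its pipeline; the candidate-retrieval step via Wikidata's public wbsearchentities API, the prompt wording, the example sentence, and the helper names (wikidata_candidates, build_linking_prompt) are all illustrative assumptions, not the authors' method.

```python
import requests

def wikidata_candidates(mention: str, limit: int = 5) -> list[dict]:
    """Fetch candidate Wikidata entities for a surface mention via the public search API."""
    resp = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={
            "action": "wbsearchentities",
            "search": mention,
            "language": "en",
            "format": "json",
            "limit": limit,
        },
        timeout=10,
    )
    resp.raise_for_status()
    return [
        {
            "qid": hit["id"],
            "label": hit.get("label", ""),
            "description": hit.get("description", ""),
        }
        for hit in resp.json().get("search", [])
    ]

def build_linking_prompt(sentence: str, mention: str, candidates: list[dict]) -> str:
    """Format a zero-shot disambiguation prompt listing Wikidata candidates."""
    lines = [
        f"Sentence: {sentence}",
        f"Mention: {mention}",
        "Candidate Wikidata entities:",
    ]
    for c in candidates:
        lines.append(f"- {c['qid']}: {c['label']} ({c['description']})")
    lines.append("Answer with the single QID that best matches the mention, or NIL if none fits.")
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical example in the spirit of MHERCL's historical music texts.
    sentence = "Meyerbeer's new opera was received with great applause in Milan."
    mention = "Meyerbeer"
    prompt = build_linking_prompt(sentence, mention, wikidata_candidates(mention))
    print(prompt)
    # The prompt would then be sent to the LLM under evaluation (e.g., GPT or
    # Llama 3) and the returned QID compared against the gold annotation.
```

For long-tail entities, the search API may return few or misleading candidates, which is exactly where the abstract suggests LLMs' contextual understanding can help over purely retrieval-based systems.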
Similar Papers
Harnessing Deep LLM Participation for Robust Entity Linking
Computation and Language
Helps computers understand names in text better.
LLMs as Data Annotators: How Close Are We to Human Performance
Computation and Language
Finds best examples to teach computers faster.
Named Entity Recognition of Historical Text via Large Language Model
Digital Libraries
Helps computers find names in old writings.