A Unified Biomedical Named Entity Recognition Framework with Large Language Models
By: Tengxiao Lv, Ling Luo, Juntao Li and more
Potential Business Impact:
Helps doctors find important words in medical texts.
Accurate recognition of biomedical named entities is critical for medical information extraction and knowledge discovery. However, existing methods often struggle with nested entities, entity boundary ambiguity, and cross-lingual generalization. In this paper, we propose a unified Biomedical Named Entity Recognition (BioNER) framework based on Large Language Models (LLMs). We first reformulate BioNER as a text generation task and design a symbolic tagging strategy to jointly handle both flat and nested entities with explicit boundary annotation. To enhance multilingual and multi-task generalization, we perform bilingual joint fine-tuning across multiple Chinese and English datasets. Additionally, we introduce a contrastive learning-based entity selector that filters incorrect or spurious predictions by leveraging boundary-sensitive positive and negative samples. Experimental results on four benchmark datasets and two unseen corpora show that our method achieves state-of-the-art performance and robust zero-shot generalization across languages. The source code is freely available at https://github.com/dreamer-tx/LLMNER.
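To make the idea of generating text with explicit boundary symbols more concrete, the sketch below decodes a tagged string back into entity spans, including nested ones. The abstract does not specify the symbols the authors use, so the marker format here (`<<TYPE:` to open an entity, `>>` to close the most recent one), the function name `parse_tagged`, and the example sentence are all hypothetical, chosen only for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical markers (not taken from the paper):
# "<<TYPE:" opens an entity of that type, ">>" closes the most recently opened one.
TOKEN = re.compile(r"<<([A-Za-z_]+):|>>")

Span = Tuple[int, int, str, str]  # (start, end_exclusive, type, surface text)


def parse_tagged(tagged: str) -> Tuple[str, List[Span]]:
    """Strip the markers and return the plain text plus all entity spans.

    Offsets refer to the plain text. A stack pairs each closing marker
    with the nearest unclosed opening marker, which is what allows
    nested entities to be recovered.
    """
    plain_parts: List[str] = []
    plain_len = 0
    stack: List[Tuple[int, str]] = []      # (start offset, entity type)
    spans: List[Tuple[int, int, str]] = []
    pos = 0

    for match in TOKEN.finditer(tagged):
        # Copy the literal text that precedes this marker.
        chunk = tagged[pos:match.start()]
        plain_parts.append(chunk)
        plain_len += len(chunk)
        pos = match.end()

        if match.group(1) is not None:     # opening marker
            stack.append((plain_len, match.group(1)))
        elif stack:                        # closing marker with a matching opener
            start, entity_type = stack.pop()
            spans.append((start, plain_len, entity_type))
        # A closing marker with no opener is dropped; openers that are
        # never closed are left on the stack and produce no span.

    plain_parts.append(tagged[pos:])
    plain = "".join(plain_parts)
    return plain, sorted((s, e, t, plain[s:e]) for s, e, t in spans)


if __name__ == "__main__":
    generated = (
        "Mutations in <<DISEASE:<<GENE:BRCA1>>-associated breast cancer>> "
        "are common."
    )
    text, entities = parse_tagged(generated)
    print(text)
    for entity in entities:
        print(entity)
```

In this illustration a gene mention sits inside a disease mention, so a single generated string carries both a flat and a nested entity. Malformed markers are skipped rather than raising an error, which is one possible design choice when decoding model output that may not be well formed.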
Similar Papers
GLiNER-BioMed: A Suite of Efficient Models for Open Biomedical Named Entity Recognition
Computation and Language
Finds new medical words automatically.
Do LLMs Surpass Encoders for Biomedical NER?
Computation and Language
Finds important words in medical texts.
Named Entity Recognition of Historical Text via Large Language Model
Digital Libraries
Helps computers find names in old writings.