Score: 2

A Unified Biomedical Named Entity Recognition Framework with Large Language Models

Published: October 10, 2025 | arXiv ID: 2510.08902v1

By: Tengxiao Lv , Ling Luo , Juntao Li and more

Potential Business Impact:

Helps doctors find important words in medical texts.

Business Areas:
Natural Language Processing Artificial Intelligence, Data and Analytics, Software

Accurate recognition of biomedical named entities is critical for medical information extraction and knowledge discovery. However, existing methods often struggle with nested entities, entity boundary ambiguity, and cross-lingual generalization. In this paper, we propose a unified Biomedical Named Entity Recognition (BioNER) framework based on Large Language Models (LLMs). We first reformulate BioNER as a text generation task and design a symbolic tagging strategy to jointly handle both flat and nested entities with explicit boundary annotation. To enhance multilingual and multi-task generalization, we perform bilingual joint fine-tuning across multiple Chinese and English datasets. Additionally, we introduce a contrastive learning-based entity selector that filters incorrect or spurious predictions by leveraging boundary-sensitive positive and negative samples. Experimental results on four benchmark datasets and two unseen corpora show that our method achieves state-of-the-art performance and robust zero-shot generalization across languages. The source codes are freely available at https://github.com/dreamer-tx/LLMNER.

Country of Origin
🇨🇳 China

Repos / Data Links

Page Count
9 pages

Category
Computer Science:
Computation and Language