Score: 2

ReProCon: Scalable and Resource-Efficient Few-Shot Biomedical Named Entity Recognition

Published: August 22, 2025 | arXiv ID: 2508.16833v1

By: Jeongkyun Yoo, Nela Riddle, Andrew Hoblitzell

Potential Business Impact:

Helps computers understand rare medical words.

Business Areas:
Image Recognition Data and Analytics, Software

Named Entity Recognition (NER) in biomedical domains faces challenges due to data scarcity and imbalanced label distributions, especially with fine-grained entity types. We propose ReProCon, a novel few-shot NER framework that combines multi-prototype modeling, cosine-contrastive learning, and Reptile meta-learning to tackle these issues. By representing each category with multiple prototypes, ReProCon captures semantic variability, such as synonyms and contextual differences, while a cosine-contrastive objective ensures strong interclass separation. Reptile meta-updates enable quick adaptation with little data. Using a lightweight fastText + BiLSTM encoder with much lower memory usage, ReProCon achieves a macro-$F_1$ score close to BERT-based baselines (around 99 percent of BERT performance). The model remains stable with a label budget of 30 percent and only drops 7.8 percent in $F_1$ when expanding from 19 to 50 categories, outperforming baselines such as SpanProto and CONTaiNER, which see 10 to 32 percent degradation in Few-NERD. Ablation studies highlight the importance of multi-prototype modeling and contrastive learning in managing class imbalance. Despite difficulties with label ambiguity, ReProCon demonstrates state-of-the-art performance in resource-limited settings, making it suitable for biomedical applications.

Country of Origin
🇺🇸 United States

Repos / Data Links

Page Count
9 pages

Category
Computer Science:
Computation and Language