Comparative Analysis of LoRA-Adapted Embedding Models for Clinical Cardiology Text Representation
By: Richard J. Young, Alice M. Matthews
Potential Business Impact:
Finds the best AI models for understanding the language of heart medicine.
Domain-specific text embeddings are critical for clinical natural language processing, yet systematic comparisons across model architectures remain limited. This study evaluates ten transformer-based embedding models adapted for cardiology through Low-Rank Adaptation (LoRA) fine-tuning on 106,535 cardiology text pairs derived from authoritative medical textbooks. Results demonstrate that encoder-only architectures, particularly BioLinkBERT, achieve superior domain-specific performance (separation score: 0.510) compared to larger decoder-based models, while requiring significantly fewer computational resources. The findings challenge the assumption that larger language models necessarily produce better domain-specific embeddings and provide practical guidance for clinical NLP system development. All models, training code, and evaluation datasets are publicly available to support reproducible research in medical informatics.
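To make the approach concrete, below is a minimal sketch of LoRA fine-tuning an encoder-only embedding model on cardiology text pairs. It assumes a HuggingFace PEFT-style setup; the model checkpoint, LoRA rank, mean pooling, contrastive loss, and the definition of the separation score are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: LoRA-adapting an encoder-only model (e.g., BioLinkBERT)
# for cardiology embeddings. Hyperparameters and loss are assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, get_peft_model

MODEL_NAME = "michiyasunaga/BioLinkBERT-base"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
base_model = AutoModel.from_pretrained(MODEL_NAME)

# Low-rank adapters on the attention projections; r/alpha are guesses.
lora_cfg = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.1,
                      target_modules=["query", "value"])
model = get_peft_model(base_model, lora_cfg)
model.print_trainable_parameters()  # only adapter weights are trainable

def embed(texts):
    """Mean-pool token states into one embedding vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      return_tensors="pt")
    hidden = model(**batch).last_hidden_state       # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)    # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)     # (B, H)

# One contrastive step over a toy (anchor, positive) pair batch.
anchors = ["Atrial fibrillation is a supraventricular tachyarrhythmia."]
positives = ["AF is the most common sustained cardiac arrhythmia."]
a, p = embed(anchors), embed(positives)
logits = (F.normalize(a, dim=-1) @ F.normalize(p, dim=-1).T) / 0.05
loss = F.cross_entropy(logits, torch.arange(len(anchors)))
loss.backward()  # gradients flow only into the LoRA adapters

def separation_score(pos_sims, neg_sims):
    # Assumed metric: gap between mean cosine similarity of related
    # pairs and unrelated pairs; the paper's exact formula may differ.
    return pos_sims.mean() - neg_sims.mean()
```

Under this reading, a separation score of 0.510 would mean related cardiology pairs score, on average, 0.51 higher in cosine similarity than unrelated pairs; the paper's actual metric should be checked against its released evaluation code.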
Similar Papers
CardioEmbed: Domain-Specialized Text Embeddings for Clinical Cardiology
Computation and Language
Helps doctors find heart information faster.
LoRACode: LoRA Adapters for Code Embeddings
Machine Learning (CS)
Makes code search faster and smarter.
Multilingual Clinical NER for Diseases and Medications Recognition in Cardiology Texts using BERT Embeddings
Computation and Language
Helps doctors find diseases and medicines in patient notes.