Evaluating the Challenges of LLMs in Real-world Medical Follow-up: A Comparative Study and An Optimized Framework
By: Jinyan Liu, Zikang Chen, Qinchuan Wang, and more
Potential Business Impact:
Makes chatbots better at asking patients questions.
When applied directly in an end-to-end manner to medical follow-up tasks, Large Language Models (LLMs) often suffer from uncontrolled dialogue flow and inaccurate information extraction due to the complexity of follow-up forms. To address this limitation, we designed and compared two follow-up chatbot systems: an end-to-end LLM-based system (control group) and a modular pipeline with structured process control (experimental group). Experimental results show that while the end-to-end approach frequently fails on lengthy and complex forms, our modular method, built on task decomposition, semantic clustering, and flow management, substantially improves dialogue stability and extraction accuracy. Moreover, it reduces the number of dialogue turns by 46.73% and lowers token consumption by 80% to 87.5%. These findings highlight the necessity of integrating external control mechanisms when deploying LLMs in high-stakes medical follow-up scenarios.
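To make the architectural contrast concrete, below is a minimal sketch of what such a modular pipeline could look like. It is not the authors' implementation: the form representation, the prefix-based stand-in for semantic clustering, and the `ask_llm` placeholder are all assumptions for illustration. The key idea it demonstrates is that dialogue state lives in a deterministic outer loop (flow management), while the LLM is invoked only for narrow, per-cluster extraction, which is consistent with the reported reductions in turns and token usage.

```python
from dataclasses import dataclass


@dataclass
class FormItem:
    """One field of the follow-up form to be filled during the dialogue."""
    key: str
    question: str
    answer: str | None = None


def cluster_items(items: list[FormItem]) -> list[list[FormItem]]:
    """Stand-in for semantic clustering: group related form items so they
    can be asked together in one turn. A real system might embed the
    question text and cluster the embeddings; here we group by a topic
    prefix in the key (e.g. "symptoms.pain") purely for illustration."""
    clusters: dict[str, list[FormItem]] = {}
    for item in items:
        topic = item.key.split(".")[0]
        clusters.setdefault(topic, []).append(item)
    return list(clusters.values())


def ask_llm(prompt: str) -> str:
    """Placeholder for a single-turn LLM call; swap in any
    chat-completion client here."""
    raise NotImplementedError


def extract_answers(cluster: list[FormItem], patient_reply: str) -> dict[str, str]:
    """Use the LLM only for narrow extraction over one cluster, instead of
    letting it drive the whole dialogue and re-read the entire form."""
    keys = ", ".join(i.key for i in cluster)
    prompt = (
        f"Extract values for the fields [{keys}] from the patient reply "
        f"below. Answer with one 'key: value' pair per line; write "
        f"'unknown' for any field not mentioned.\nReply: {patient_reply}"
    )
    parsed: dict[str, str] = {}
    for line in ask_llm(prompt).splitlines():
        if ":" in line:
            k, v = line.split(":", 1)
            parsed[k.strip()] = v.strip()
    return parsed


def run_followup(items: list[FormItem], get_patient_reply) -> dict[str, str]:
    """Flow management: a deterministic loop walks the clusters in order,
    so dialogue progress is controlled outside the model. Each cluster is
    a single turn of grouped questions followed by one extraction call."""
    for cluster in cluster_items(items):
        questions = " ".join(i.question for i in cluster)
        reply = get_patient_reply(questions)        # ask grouped questions
        answers = extract_answers(cluster, reply)   # narrow LLM extraction
        for item in cluster:
            item.answer = answers.get(item.key, "unknown")
    return {i.key: i.answer for i in items}
```

In this sketch the number of turns scales with the number of clusters rather than the number of form fields, and each prompt carries only one cluster's context rather than the full form, which is the mechanism by which a structured pipeline of this kind could cut both turn count and token consumption relative to an end-to-end chatbot.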
Similar Papers
Enabling Doctor-Centric Medical AI with LLMs through Workflow-Aligned Tasks and Benchmarks
Computation and Language
Helps doctors use AI for better patient care.
Identifying Imaging Follow-Up in Radiology Reports: A Comparative Analysis of Traditional ML and LLM Approaches
Computation and Language
Helps doctors know if patients need more scans.
Large Language Models in Healthcare
Computers and Society
Helps doctors use smart computers for better patient care.