Enhancing the Medical Context-Awareness Ability of LLMs via Multifaceted Self-Refinement Learning
By: Yuxuan Zhou, Yubin Wang, Bin Wang and more
Potential Business Impact:
Helps AI doctors understand patient needs better.
Large language models (LLMs) have shown great promise in the medical domain, achieving strong performance on several benchmarks. However, they continue to underperform in real-world medical scenarios, which often demand stronger context-awareness, i.e., the ability to recognize missing or critical details (e.g., user identity, medical history, risk factors) and provide safe, helpful, and contextually appropriate responses. To address this issue, we propose Multifaceted Self-Refinement (MuSeR), a data-driven approach that enhances LLMs' context-awareness along three key facets (decision-making, communication, and safety) through self-evaluation and refinement. Specifically, we first design an attribute-conditioned query generator that simulates diverse real-world user contexts by varying attributes such as role, geographic region, intent, and degree of information ambiguity. An LLM then responds to these queries, self-evaluates its answers along the three facets, and refines its responses to better align with the requirements of each facet. Finally, the queries and refined responses are used for supervised fine-tuning to reinforce the model's context-awareness ability. Evaluation results on the latest HealthBench dataset demonstrate that our method significantly improves LLM performance across multiple aspects, with particularly notable gains on the context-awareness axis. Furthermore, by incorporating knowledge distillation with the proposed method, the performance of a smaller backbone LLM (e.g., Qwen3-32B) surpasses its teacher model, achieving a new SOTA among all open-source LLMs on HealthBench (63.8%) and its hard subset (43.1%). Code and dataset will be released at https://muser-llm.github.io.
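The data-construction loop described in the abstract (generate attribute-conditioned queries, answer, self-evaluate per facet, refine, collect pairs for supervised fine-tuning) can be sketched roughly as follows. This is a minimal illustration and not the authors' released code: the attribute values, the prompt wording, the function names, and the choice to critique each facet in a separate call are all assumptions, and `llm` stands for any hypothetical text-in, text-out callable.

```python
import itertools
import random
from typing import Callable, Dict, List

# Hypothetical attribute values; the abstract names only the attribute types.
ATTRIBUTES: Dict[str, List[str]] = {
    "role": ["patient", "caregiver", "clinician"],
    "region": ["rural clinic", "urban hospital"],
    "intent": ["symptom triage", "medication question"],
    "ambiguity": ["complete details", "key details missing"],
}

# The three facets named in the abstract.
FACETS = ["decision-making", "communication", "safety"]

LLM = Callable[[str], str]


def build_query_prompts(attributes: Dict[str, List[str]], limit: int, seed: int = 0) -> List[str]:
    """Sample attribute combinations and turn each into a query-generation prompt."""
    keys = list(attributes)
    combos = [dict(zip(keys, values))
              for values in itertools.product(*(attributes[k] for k in keys))]
    random.Random(seed).shuffle(combos)  # deterministic sample for a given seed
    return [
        "Write a medical question from a user with: "
        + "; ".join(f"{k}={v}" for k, v in combo.items())
        for combo in combos[:limit]
    ]


def self_refine(llm: LLM, query: str) -> Dict[str, str]:
    """Answer a query, critique the answer on each facet, then revise it once."""
    answer = llm(query)
    critiques = {
        facet: llm(f"Evaluate this answer for {facet}.\nQuery: {query}\nAnswer: {answer}")
        for facet in FACETS
    }
    feedback = "\n".join(f"{facet}: {text}" for facet, text in critiques.items())
    refined = llm(
        "Revise the answer using the feedback.\n"
        f"Query: {query}\nAnswer: {answer}\nFeedback:\n{feedback}"
    )
    return {"prompt": query, "response": refined}


def build_sft_dataset(llm: LLM, attributes: Dict[str, List[str]], limit: int) -> List[Dict[str, str]]:
    """Produce (query, refined response) pairs suitable for supervised fine-tuning."""
    queries = [llm(prompt) for prompt in build_query_prompts(attributes, limit)]
    return [self_refine(llm, query) for query in queries]
```

Under these assumptions each query costs five model calls after generation (one answer, three facet critiques, one revision); a real pipeline might batch the critiques into one call or iterate the refinement more than once.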
Similar Papers
AR-Med: Automated Relevance Enhancement in Medical Search via LLM-Driven Information Augmentation
Computation and Language
Finds the right health answers online, safely.
Reasoning LLMs in the Medical Domain: A Literature Survey
Artificial Intelligence
Helps doctors make better health choices.
Balancing Safety and Helpfulness in Healthcare AI Assistants through Iterative Preference Alignment
Artificial Intelligence
Makes AI doctors safer by catching bad advice.