VL-MedGuide: A Visual-Linguistic Large Model for Intelligent and Explainable Skin Disease Auxiliary Diagnosis
By: Kexin Yu , Zihan Xu , Jialei Xie and more
Potential Business Impact:
Helps doctors diagnose skin problems with clear reasons.
Accurate diagnosis of skin diseases remains a significant challenge due to the complex and diverse visual features present in dermatoscopic images, often compounded by a lack of interpretability in existing purely visual diagnostic models. To address these limitations, this study introduces VL-MedGuide (Visual-Linguistic Medical Guide), a novel framework leveraging the powerful multi-modal understanding and reasoning capabilities of Visual-Language Large Models (LVLMs) for intelligent and inherently interpretable auxiliary diagnosis of skin conditions. VL-MedGuide operates in two interconnected stages: a Multi-modal Concept Perception Module, which identifies and linguistically describes dermatologically relevant visual features through sophisticated prompt engineering, and an Explainable Disease Reasoning Module, which integrates these concepts with raw visual information via Chain-of-Thought prompting to provide precise disease diagnoses alongside transparent rationales. Comprehensive experiments on the Derm7pt dataset demonstrate that VL-MedGuide achieves state-of-the-art performance in both disease diagnosis (83.55% BACC, 80.12% F1) and concept detection (76.10% BACC, 67.45% F1), surpassing existing baselines. Furthermore, human evaluations confirm the high clarity, completeness, and trustworthiness of its generated explanations, bridging the gap between AI performance and clinical utility by offering actionable, explainable insights for dermatological practice.
Similar Papers
MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks
CV and Pattern Recognition
Helps doctors find skin problems from pictures.
CLARIFY: A Specialist-Generalist Framework for Accurate and Lightweight Dermatological Visual Question Answering
CV and Pattern Recognition
Helps doctors diagnose skin problems faster and better.
MedM-VL: What Makes a Good Medical LVLM?
CV and Pattern Recognition
Helps doctors understand medical pictures better.