An integrated language-vision foundation model for conversational diagnostics and triaging in primary eye care
By: Zhi Da Soh, Yang Bai, Kai Yu, and more
Potential Business Impact:
Helps eye doctors spot and triage eye diseases from fundus photos using a conversational AI.
Current deep learning models are mostly task-specific and lack a user-friendly interface. We present Meta-EyeFM, a multi-function foundation model that integrates a large language model (LLM) with vision foundation models (VFMs) for ocular disease assessment. Meta-EyeFM uses a routing mechanism to direct text queries to the appropriate task-specific analysis. Using Low-Rank Adaptation (LoRA), we fine-tuned the VFMs to detect ocular and systemic diseases, differentiate ocular disease severity, and identify common ocular signs. The model achieved 100% accuracy in routing fundus images to the appropriate VFMs, which in turn achieved ≥82.2% accuracy in disease detection, ≥89% in severity differentiation, and ≥76% in sign identification. Meta-EyeFM was 11% to 43% more accurate than the Gemini-1.5-flash and ChatGPT-4o large multimodal models (LMMs) in detecting various eye diseases, and its performance was comparable to that of an ophthalmologist. With its improved usability and diagnostic performance, the system can serve as a decision-support tool in primary eye care or as an online LLM-based tool for fundus evaluation.
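To make the two core ideas in the abstract concrete, here is a minimal sketch of (1) routing a free-text query to a task-specific vision model and (2) adapting a vision backbone with LoRA via the Hugging Face peft library. The paper does not publish its implementation, so everything here is illustrative: `TASK_ROUTES`, `route_query`, and the keyword lookup are hypothetical stand-ins for Meta-EyeFM's LLM-based router, and the ViT backbone and LoRA hyperparameters are assumptions, not the authors' configuration.

```python
# Hypothetical sketch of query routing plus LoRA fine-tuning.
# Not Meta-EyeFM's actual code; names and hyperparameters are illustrative.

import timm
from peft import LoraConfig, get_peft_model

# (1) Routing stand-in: the paper routes text queries with an LLM;
# a keyword lookup table is enough to show the control flow.
TASK_ROUTES = {
    "glaucoma": "vfm_disease_detection",
    "diabetic retinopathy": "vfm_severity_grading",
    "drusen": "vfm_sign_identification",
}

def route_query(query: str) -> str:
    """Pick the VFM whose trigger phrase appears in the text query."""
    q = query.lower()
    for phrase, vfm_name in TASK_ROUTES.items():
        if phrase in q:
            return vfm_name
    return "vfm_disease_detection"  # default route

# (2) LoRA: freeze the backbone and learn low-rank updates to the
# attention projections, so each task-specific VFM trains cheaply.
backbone = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=2)
lora_cfg = LoraConfig(r=8, lora_alpha=16, target_modules=["qkv"], lora_dropout=0.1)
vfm = get_peft_model(backbone, lora_cfg)
vfm.print_trainable_parameters()  # only a small fraction of weights train

print(route_query("Does this fundus photo show signs of glaucoma?"))
```

In the real system, the keyword table would be replaced by the LLM deciding which fine-tuned VFM should analyze the uploaded fundus image, which is what the reported 100% routing accuracy refers to.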
Similar Papers
FusionFM: Fusing Eye-specific Foundational Models for Optimized Ophthalmic Diagnosis
CV and Pattern Recognition
Helps doctors find eye and body diseases from eye pictures.
Foundation Models in Medical Image Analysis: A Systematic Review and Meta-Analysis
CV and Pattern Recognition
Helps doctors understand medical pictures better.
FunBench: Benchmarking Fundus Reading Skills of MLLMs
CV and Pattern Recognition
Helps AI understand eye pictures to find diseases.