An integrated language-vision foundation model for conversational diagnostics and triaging in primary eye care

Published: May 13, 2025 | arXiv ID: 2505.08414v1

By: Zhi Da Soh, Yang Bai, Kai Yu, and more

Potential Business Impact:

Helps primary eye care providers detect and triage eye diseases from fundus photographs using a conversational AI assistant.

Business Areas:
Image Recognition, Data and Analytics, Software

Current deep learning models are mostly task-specific and lack user-friendly interfaces. We present Meta-EyeFM, a multi-function foundation model that integrates a large language model (LLM) with vision foundation models (VFMs) for ocular disease assessment. Meta-EyeFM uses a routing mechanism to enable accurate task-specific analysis based on text queries. Using Low-Rank Adaptation (LoRA), we fine-tuned the VFMs to detect ocular and systemic diseases, differentiate ocular disease severity, and identify common ocular signs. The model achieved 100% accuracy in routing fundus images to the appropriate VFMs, which in turn achieved ≥82.2% accuracy in disease detection, ≥89% in severity differentiation, and ≥76% in sign identification. Meta-EyeFM was 11% to 43% more accurate than the Gemini-1.5-flash and ChatGPT-4o large multimodal models (LMMs) in detecting various eye diseases, and was comparable to an ophthalmologist. The system offers enhanced usability and diagnostic performance, making it a valuable decision-support tool for primary eye care or an online LLM-based service for fundus evaluation.
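To make the routing idea concrete, below is a minimal Python sketch (not the authors' implementation) of how a language-model router might dispatch a free-text query and a fundus image to one of several LoRA-fine-tuned task models. All names (TASKS, route_query, the stub model functions) are hypothetical, and a simple keyword matcher stands in for the LLM router.

```python
# Minimal sketch of query routing over task-specific vision models.
# The "models" are stubs; in the paper's setting each task would be
# served by a LoRA-fine-tuned vision foundation model.

from typing import Callable, Dict


def detect_disease(image_path: str) -> str:
    return f"disease-detection result for {image_path}"


def grade_severity(image_path: str) -> str:
    return f"severity-grading result for {image_path}"


def identify_signs(image_path: str) -> str:
    return f"ocular-sign result for {image_path}"


# Hypothetical task registry: task name -> task-specific model callable.
TASKS: Dict[str, Callable[[str], str]] = {
    "disease_detection": detect_disease,
    "severity_grading": grade_severity,
    "sign_identification": identify_signs,
}

# Keyword-based stand-in for the LLM router; a real system would prompt
# the LLM to emit one of the task names from the user's question.
KEYWORDS = {
    "disease_detection": ("glaucoma", "diabetic", "amd", "disease", "detect"),
    "severity_grading": ("severity", "stage", "grade", "how bad"),
    "sign_identification": ("hemorrhage", "drusen", "sign", "lesion"),
}


def route_query(query: str) -> str:
    """Map a free-text question to a task name."""
    q = query.lower()
    for task, words in KEYWORDS.items():
        if any(w in q for w in words):
            return task
    return "disease_detection"  # fall back when the query is ambiguous


def answer(query: str, image_path: str) -> str:
    """Route the query, then run the chosen model on the fundus image."""
    task = route_query(query)
    return TASKS[task](image_path)


if __name__ == "__main__":
    print(answer("Does this fundus photo show signs of diabetic retinopathy?",
                 "fundus_001.jpg"))
```

In this sketch the router only chooses which model runs; the reported 100% routing accuracy would correspond to always selecting the correct task for a given query.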

Country of Origin
πŸ‡ΈπŸ‡¬ Singapore

Page Count
36 pages

Category
Electrical Engineering and Systems Science: Image and Video Processing