Benchmarking Ophthalmology Foundation Models for Clinically Significant Age Macular Degeneration Detection

Published: May 8, 2025 | arXiv ID: 2505.05291v2

By: Benjamin A. Cohen, Jonathan Fhima, Meishar Meisel, and more

Potential Business Impact:

Helps clinicians detect age-related macular degeneration from retinal photographs.

Business Areas:
Image Recognition, Data and Analytics, Software

Self-supervised learning (SSL) has enabled Vision Transformers (ViTs) to learn robust representations from large-scale natural image datasets, enhancing their generalization across domains. In retinal imaging, foundation models pretrained on either natural or ophthalmic data have shown promise, but the benefits of in-domain pretraining remain uncertain. To investigate this, we benchmark six SSL-pretrained ViTs on seven digital fundus image (DFI) datasets totaling 70,000 expert-annotated images for the task of moderate-to-late age-related macular degeneration (AMD) identification. Our results show that iBOT pretrained on natural images achieves the highest out-of-distribution generalization, with AUROCs of 0.80-0.97, outperforming both domain-specific models (AUROCs of 0.78-0.96) and a baseline ViT-L with no pretraining (AUROCs of 0.68-0.91). These findings highlight the value of foundation models in improving AMD identification and challenge the assumption that in-domain pretraining is necessary. Furthermore, we release BRAMD, an open-access dataset (n=587) of DFIs with AMD labels from Brazil.
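The AUROC values reported above summarize how well each model's scores separate AMD-positive from AMD-negative fundus images on held-out datasets. As a minimal illustration of the metric itself (not the paper's pipeline), the sketch below computes AUROC via the Mann-Whitney interpretation: the probability that a randomly chosen positive case outranks a randomly chosen negative one. The labels and scores are synthetic stand-ins, not data from the study.

```python
import random

def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    random positive example scores higher than a random negative one,
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

random.seed(0)
# Hypothetical stand-in for an expert-labeled held-out DFI test set:
# 1 = moderate-to-late AMD, 0 = no/early AMD.
labels = [random.randint(0, 1) for _ in range(200)]
# Hypothetical classifier scores: shifted upward for positives, with noise,
# so separation is good but imperfect (as in the reported 0.80-0.97 range).
scores = [0.35 * y + random.gauss(0.4, 0.15) for y in labels]
print(round(auroc(labels, scores), 2))
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the gap between 0.80-0.97 (iBOT) and 0.68-0.91 (no pretraining) is meaningful.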

Page Count
10 pages

Category
Electrical Engineering and Systems Science:
Image and Video Processing