Cataract-LMM: Large-Scale, Multi-Source, Multi-Task Benchmark for Deep Learning in Surgical Video Analysis
By: Mohammad Javad Ahmadi , Iman Gandomi , Parisa Abdi and more
Potential Business Impact:
Teaches robots to perform eye surgery better.
The development of computer-assisted surgery systems depends on large-scale, annotated datasets. Current resources for cataract surgery often lack the diversity and annotation depth needed to train generalizable deep-learning models. To address this gap, we present a dataset of 3,000 phacoemulsification cataract surgery videos from two surgical centers, performed by surgeons with a range of experience levels. This resource is enriched with four annotation layers: temporal surgical phases, instance segmentation of instruments and anatomical structures, instrument-tissue interaction tracking, and quantitative skill scores based on the established competency rubrics like the ICO-OSCAR. The technical quality of the dataset is supported by a series of benchmarking experiments for key surgical AI tasks, including workflow recognition, scene segmentation, and automated skill assessment. Furthermore, we establish a domain adaptation baseline for the phase recognition task by training a model on a subset of surgical centers and evaluating its performance on a held-out center. The dataset and annotations are available in Google Form (https://docs.google.com/forms/d/e/1FAIpQLSfmyMAPSTGrIy2sTnz0-TMw08ZagTimRulbAQcWdaPwDy187A/viewform?usp=dialog).
Similar Papers
CataractSurg-80K: Knowledge-Driven Benchmarking for Structured Reasoning in Ophthalmic Surgery Planning
Multiagent Systems
Helps doctors plan eye surgery better.
WetCat: Enabling Automated Skill Assessment in Wet-Lab Cataract Surgery Videos
CV and Pattern Recognition
Teaches robots to judge eye surgery skills.
Data-Efficient Learning for Generalizable Surgical Video Understanding
Image and Video Processing
Helps doctors learn and improve surgery with AI.