Lesion-Aware Visual-Language Fusion for Automated Image Captioning of Ulcerative Colitis Endoscopic Examinations
By: Alexis Ivan Lopez Escamilla, Gilberto Ochoa, Sharib Al
Potential Business Impact:
Helps doctors describe colon endoscopy images more accurately.
We present a lesion-aware image captioning framework for ulcerative colitis (UC). The model combines ResNet embeddings, Grad-CAM heatmaps, and CBAM-enhanced attention with a T5 decoder. Clinical metadata (Mayo Endoscopic Score (MES) from 0 to 3, vascular pattern, bleeding, erythema, friability, and ulceration) is injected as natural-language prompts to guide caption generation. The system produces structured, interpretable descriptions aligned with clinical practice, and it also outputs an MES classification and lesion tags. Compared with baselines, our approach improves both caption quality and MES classification accuracy, supporting reliable endoscopic reporting.
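As a rough illustration of what injecting clinical metadata as natural-language prompts could look like, the sketch below turns the six fields named in the abstract into one prompt string for a text decoder. The abstract does not give the actual prompt format, so the class `UCMetadata`, the function `build_prompt`, the field value vocabulary, and the template wording are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class UCMetadata:
    """Hypothetical container for the clinical fields listed in the abstract."""
    mes: int                # Mayo Endoscopic Score, expected range 0-3
    vascular_pattern: str
    bleeding: str
    erythema: str
    friability: str
    ulceration: str


def build_prompt(meta: UCMetadata) -> str:
    """Serialize the metadata into a single prompt (assumed template)."""
    if meta.mes not in (0, 1, 2, 3):
        raise ValueError(f"MES must be between 0 and 3, got {meta.mes}")
    parts = [f"MES: {meta.mes}"]
    for f in fields(meta):
        if f.name == "mes":
            continue
        # Turn field names such as "vascular_pattern" into readable labels.
        label = f.name.replace("_", " ")
        parts.append(f"{label}: {getattr(meta, f.name)}")
    return "describe endoscopic findings. " + "; ".join(parts)


example = UCMetadata(
    mes=2,
    vascular_pattern="absent",
    bleeding="none",
    erythema="marked",
    friability="present",
    ulceration="erosions",
)
print(build_prompt(example))
```

In a full pipeline, a string like this would presumably be tokenized and passed to the T5 decoder together with the visual features. How the two are fused is not described in the abstract.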
Similar Papers
CLoE: Curriculum Learning on Endoscopic Images for Robust MES Classification
CV and Pattern Recognition
Helps doctors better judge gut disease from pictures.
A Multi-Modal Deep Learning Framework for Colorectal Pathology Diagnosis: Integrating Histological and Colonoscopy Data in a Pilot Study
CV and Pattern Recognition
Helps doctors find gut diseases faster.
Domain Adaptation for Ulcerative Colitis Severity Estimation Using Patient-Level Diagnoses
CV and Pattern Recognition
Helps doctors better assess how severe gut disease is.