EchoAgent: Guideline-Centric Reasoning Agent for Echocardiography Measurement and Interpretation
By: Matin Daghyani, Lyuyang Wang, Nima Hashemi, and more
Potential Business Impact:
Helps doctors understand heart videos better.
Purpose: Echocardiographic interpretation requires video-level reasoning and guideline-based measurement analysis, which current deep learning models for cardiac ultrasound do not support. We present EchoAgent, a framework that enables structured, interpretable automation for this domain.
Methods: EchoAgent orchestrates specialized vision tools under Large Language Model (LLM) control to perform temporal localization, spatial measurement, and clinical interpretation. A key contribution is a measurement-feasibility prediction model that determines whether anatomical structures are reliably measurable in each frame, enabling autonomous tool selection. We curated a benchmark of diverse, clinically validated video-query pairs for evaluation.
Results: EchoAgent achieves accurate, interpretable results despite the added complexity of spatiotemporal video analysis. Outputs are grounded in visual evidence and clinical guidelines, supporting transparency and traceability.
Conclusion: This work demonstrates the feasibility of agentic, guideline-aligned reasoning for echocardiographic video analysis, enabled by task-specific tools and full video-level automation. EchoAgent sets a new direction for trustworthy AI in cardiac ultrasound.
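The feasibility-gated tool selection described in the Methods could be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the `Frame`, `select_measurable_frames`, and `run_measurement` names, the score threshold, and the mean aggregation are all assumptions made for clarity.

```python
# Hypothetical sketch: gate a spatial-measurement tool on a per-frame
# measurement-feasibility score, as the abstract describes at a high level.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

@dataclass
class Frame:
    index: int
    # structure name -> predicted measurability score in [0, 1]
    feasibility: Dict[str, float] = field(default_factory=dict)

def select_measurable_frames(frames: List[Frame], structure: str,
                             threshold: float = 0.5) -> List[Frame]:
    """Keep only frames where the feasibility model deems the structure measurable."""
    return [f for f in frames if f.feasibility.get(structure, 0.0) >= threshold]

def run_measurement(frames: List[Frame], structure: str,
                    measure_tool: Callable[[Frame, str], float]) -> Optional[float]:
    """Invoke the measurement tool only on feasible frames, then aggregate."""
    usable = select_measurable_frames(frames, structure)
    if not usable:
        return None  # the agent would fall back or report the structure as unmeasurable
    values = [measure_tool(f, structure) for f in usable]
    return sum(values) / len(values)

# Toy usage with a stub measurement tool standing in for a real vision model.
frames = [Frame(0, {"LV": 0.9}), Frame(1, {"LV": 0.2}), Frame(2, {"LV": 0.8})]
stub_tool = lambda f, s: 4.5 + 0.1 * f.index
print(run_measurement(frames, "LV", stub_tool))  # frame 1 is skipped as infeasible
```

The key design point this sketch illustrates is that tool invocation is conditioned on a learned feasibility signal rather than applied uniformly to every frame, which is what allows the agent to choose tools autonomously.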
Similar Papers
EchoVLM: Measurement-Grounded Multimodal Learning for Echocardiography
CV and Pattern Recognition
Helps doctors understand heart scans faster.
Automated Interpretable 2D Video Extraction from 3D Echocardiography
CV and Pattern Recognition
Makes heart scans easier for doctors to read.
A multimodal AI agent for clinical decision support in ophthalmology
Human-Computer Interaction
Helps eye doctors diagnose problems better and faster.