CRG Score: A Distribution-Aware Clinical Metric for Radiology Report Generation

Published: May 22, 2025 | arXiv ID: 2505.17167v1

By: Ibrahim Ethem Hamamci, Sezgin Er, Suprosanna Shit and more

Potential Business Impact:

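Helps evaluate AI-generated radiology reports more fairly, so models that describe medical scans are rewarded for clinical accuracy rather than trivial predictions.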

Business Areas:
Image Recognition, Data and Analytics, Software

Evaluating long-context radiology report generation is challenging. NLG metrics fail to capture clinical correctness, while LLM-based metrics often lack generalizability. Clinical accuracy metrics are more relevant but are sensitive to class imbalance, frequently favoring trivial predictions. We propose the CRG Score, a distribution-aware and adaptable metric that evaluates only clinically relevant abnormalities explicitly described in reference reports. CRG supports both binary and structured labels (e.g., type, location) and can be paired with any LLM for feature extraction. By balancing penalties based on label distribution, it enables fairer, more robust evaluation and serves as a clinically aligned reward function.
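
The class-imbalance problem mentioned in the abstract can be illustrated with a small sketch. It does not reproduce the CRG Score formula, which this summary does not state. As a stand-in it uses balanced accuracy, a standard prevalence-aware metric, and the label vectors and function names are hypothetical. The point is only to show why a metric that ignores label distribution can favor a trivial "everything is normal" prediction.

```python
from typing import Sequence


def plain_accuracy(ref: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of labels that match, ignoring how rare positives are."""
    return sum(r == p for r, p in zip(ref, pred)) / len(ref)


def balanced_accuracy(ref: Sequence[int], pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity (a stand-in, NOT the CRG formula)."""
    pos = sum(ref)
    neg = len(ref) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must appear in the reference labels")
    tp = sum(1 for r, p in zip(ref, pred) if r == 1 and p == 1)
    tn = sum(1 for r, p in zip(ref, pred) if r == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


# Hypothetical labels for ten abnormality classes: only one is present.
reference = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
trivial = [0] * 10                            # predicts "no abnormality" everywhere
informed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]     # finds the abnormality, one false positive

for name, pred in [("trivial", trivial), ("informed", informed)]:
    print(name, plain_accuracy(reference, pred), round(balanced_accuracy(reference, pred), 3))
```

Under these assumed labels, plain accuracy gives both predictions the same value, while the prevalence-aware variant separates them. That is the kind of behavior a distribution-aware metric is meant to provide.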

Country of Origin
🇹🇷 🇩🇪 🇨🇭 🇬🇧 Turkey, Germany, Switzerland, United Kingdom

Page Count
4 pages

Category
Computer Science: Computation and Language