Score: 2

EvalBlocks: A Modular Pipeline for Rapidly Evaluating Foundation Models in Medical Imaging

Published: January 7, 2026 | arXiv ID: 2601.03811v1

By: Jan Tagscherer , Sarah de Boer , Lena Philipp and more

Potential Business Impact:

Helps doctors find diseases faster with better AI.

Business Areas:

Image Recognition Data and Analytics, Software

Developing foundation models in medical imaging requires continuous monitoring of downstream performance. Researchers are burdened with tracking numerous experiments, design choices, and their effects on performance, often relying on ad-hoc, manual workflows that are inherently slow and error-prone. We introduce EvalBlocks, a modular, plug-and-play framework for efficient evaluation of foundation models during development. Built on Snakemake, EvalBlocks supports seamless integration of new datasets, foundation models, aggregation methods, and evaluation strategies. All experiments and results are tracked centrally and are reproducible with a single command, while efficient caching and parallel execution enable scalable use on shared compute infrastructure. Demonstrated on five state-of-the-art foundation models and three medical imaging classification tasks, EvalBlocks streamlines model evaluation, enabling researchers to iterate faster and focus on model innovation rather than evaluation logistics. The framework is released as open source software at https://github.com/DIAGNijmegen/eval-blocks.