TL;DR
EvalBlocks is an open-source, modular framework built on Snakemake that streamlines the evaluation of foundation models in medical imaging, enabling faster iteration and reproducibility.
Contribution
EvalBlocks introduces a flexible, plug-and-play evaluation pipeline that simplifies tracking, reproducibility, and scalability of medical imaging model assessments.
Findings
Supports seamless integration of datasets and models
Enables scalable, reproducible evaluations with caching and parallel execution
Demonstrated on five state-of-the-art foundation models and three medical imaging classification tasks
Abstract
Developing foundation models in medical imaging requires continuous monitoring of downstream performance. Researchers are burdened with tracking numerous experiments, design choices, and their effects on performance, often relying on ad-hoc, manual workflows that are inherently slow and error-prone. We introduce EvalBlocks, a modular, plug-and-play framework for efficient evaluation of foundation models during development. Built on Snakemake, EvalBlocks supports seamless integration of new datasets, foundation models, aggregation methods, and evaluation strategies. All experiments and results are tracked centrally and are reproducible with a single command, while efficient caching and parallel execution enable scalable use on shared compute infrastructure. Demonstrated on five state-of-the-art foundation models and three medical imaging classification tasks, EvalBlocks streamlines model…
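The combination the abstract describes (pluggable models, datasets, and aggregation methods, with cached results and parallel execution) can be illustrated with a minimal Python sketch. This is not EvalBlocks code or its API, and it uses only the standard library rather than Snakemake. The registries, the toy "models", the `run_one` and `run_grid` helpers, and the JSON cache layout are all hypothetical names chosen for illustration.

```python
import hashlib
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from pathlib import Path

# Hypothetical registries: adding a component means adding one entry.
MODELS = {
    "model_a": lambda xs: [v * 2.0 for v in xs],   # stand-in feature extractor
    "model_b": lambda xs: [v + 1.0 for v in xs],
}
DATASETS = {
    "toy_small": [1.0, 2.0, 3.0],
    "toy_large": [4.0, 5.0, 6.0, 7.0],
}
AGGREGATORS = {
    "mean": lambda feats: sum(feats) / len(feats),
    "max": max,
}

CALLS = []  # records which combinations were actually computed (not cached)


def cache_path(cache_dir, config):
    """Derive a stable file name from the experiment configuration."""
    blob = json.dumps(config, sort_keys=True).encode()
    return Path(cache_dir) / (hashlib.sha256(blob).hexdigest()[:16] + ".json")


def run_one(cache_dir, model, dataset, aggregator):
    """Evaluate one combination, reusing a cached result when one exists."""
    config = {"model": model, "dataset": dataset, "aggregator": aggregator}
    path = cache_path(cache_dir, config)
    if path.exists():
        return json.loads(path.read_text())
    CALLS.append((model, dataset, aggregator))
    features = MODELS[model](DATASETS[dataset])
    result = dict(config, score=AGGREGATORS[aggregator](features))
    path.write_text(json.dumps(result))
    return result


def run_grid(cache_dir, workers=4):
    """Run every model x dataset x aggregator combination in parallel."""
    combos = list(product(sorted(MODELS), sorted(DATASETS), sorted(AGGREGATORS)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: run_one(cache_dir, *c), combos))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for row in run_grid(tmp):
            print(row)
```

Running `run_grid` a second time against the same cache directory recomputes nothing, which is the behaviour a workflow manager such as Snakemake provides at the level of files and rules. In a real pipeline the cache key would also need to cover code versions and input data, which this sketch omits.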
