EvalBlocks: A Modular Pipeline for Rapidly Evaluating Foundation Models in Medical Imaging

Jan Tagscherer; Sarah de Boer; Lena Philipp; Fennie van der Graaf; Dré Peeters; Joeran Bosma; Lars Leijten; Bogdan Obreja; Ewoud Smit; Alessa Hering

arXiv:2601.03811·cs.CV·April 2, 2026

TL;DR

EvalBlocks is an open-source, modular framework built on Snakemake that streamlines the evaluation of foundation models in medical imaging, enabling faster iteration and reproducibility.

Contribution

The paper introduces a flexible, plug-and-play evaluation pipeline that makes assessments of medical imaging models easier to track, reproduce, and scale.

Findings

1. Supports seamless integration of datasets and models
2. Enables scalable, reproducible evaluations with caching and parallel execution
3. Demonstrated on five models and three medical imaging tasks

Abstract

Developing foundation models in medical imaging requires continuous monitoring of downstream performance. Researchers are burdened with tracking numerous experiments, design choices, and their effects on performance, often relying on ad-hoc, manual workflows that are inherently slow and error-prone. We introduce EvalBlocks, a modular, plug-and-play framework for efficient evaluation of foundation models during development. Built on Snakemake, EvalBlocks supports seamless integration of new datasets, foundation models, aggregation methods, and evaluation strategies. All experiments and results are tracked centrally and are reproducible with a single command, while efficient caching and parallel execution enable scalable use on shared compute infrastructure. Demonstrated on five state-of-the-art foundation models and three medical imaging classification tasks, EvalBlocks streamlines model…
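The plug-and-play idea in the abstract can be illustrated with a small Python sketch. It is not EvalBlocks code and does not use Snakemake: the registries, the `toy_encoder` stand-in model, and `run_experiment` are all hypothetical names chosen for illustration, and the in-memory dictionary only mimics the effect of caching (Snakemake itself tracks outputs as files). The sketch assumes that components are registered by name, that an experiment is described by a small config, and that an identical config and input should not be recomputed.

```python
import hashlib
import json
from typing import Callable, Dict, List

# Hypothetical registries; these names are illustrative, not the EvalBlocks API.
MODELS: Dict[str, Callable[[List[float]], List[float]]] = {}
AGGREGATORS: Dict[str, Callable[[List[List[float]]], List[float]]] = {}


def register(registry: dict, name: str):
    """Decorator that adds a component to a registry under the given name."""
    def decorator(fn):
        registry[name] = fn
        return fn
    return decorator


@register(MODELS, "toy_encoder")
def toy_encoder(patch: List[float]) -> List[float]:
    # Stand-in for a foundation model: maps a patch to [mean, max].
    return [sum(patch) / len(patch), max(patch)]


@register(AGGREGATORS, "mean")
def mean_pool(features: List[List[float]]) -> List[float]:
    # Average each feature dimension across patches.
    return [sum(col) / len(features) for col in zip(*features)]


@register(AGGREGATORS, "max")
def max_pool(features: List[List[float]]) -> List[float]:
    # Take the maximum of each feature dimension across patches.
    return [max(col) for col in zip(*features)]


_CACHE: Dict[str, List[float]] = {}  # in-memory stand-in for on-disk caching
CALLS = {"count": 0}                 # counts actual (non-cached) computations


def cache_key(config: dict, patches: List[List[float]]) -> str:
    # Hash the config and input so identical experiments share one key.
    blob = json.dumps({"config": config, "patches": patches}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()


def run_experiment(config: dict, patches: List[List[float]]) -> List[float]:
    key = cache_key(config, patches)
    if key in _CACHE:
        return _CACHE[key]
    CALLS["count"] += 1
    model = MODELS[config["model"]]
    aggregate = AGGREGATORS[config["aggregator"]]
    result = aggregate([model(p) for p in patches])
    _CACHE[key] = result
    return result


patches = [[1.0, 3.0], [2.0, 6.0]]
mean_result = run_experiment({"model": "toy_encoder", "aggregator": "mean"}, patches)
again = run_experiment({"model": "toy_encoder", "aggregator": "mean"}, patches)
max_result = run_experiment({"model": "toy_encoder", "aggregator": "max"}, patches)
```

In this sketch, swapping the aggregation method only changes one string in the config, and the repeated "mean" run is served from the cache rather than recomputed. A workflow engine such as Snakemake achieves a comparable effect at the level of files and rules.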

Code & Models

Repositories

DIAGNijmegen/eval-blocks (GitHub)
