MedVAR: Towards Scalable and Efficient Medical Image Generation via Next-scale Autoregressive Prediction
Zhicheng He, Yunpeng Zhao, Junde Wu, Ziwei Niu, Zijun Li, Bohan Li, Lanfen Lin, and Yueming Jin

TL;DR
MedVAR is a novel autoregressive foundation model for medical image generation that enables scalable, efficient, and high-quality synthesis across multiple organs, supporting downstream clinical applications.
Contribution
Introduces MedVAR, which the authors present as the first autoregressive model to use next-scale prediction for scalable, efficient medical image synthesis with structured multi-scale representations.
Findings
Achieves state-of-the-art generative performance.
Supports multi-organ image synthesis with high fidelity and diversity.
Demonstrates scalability and efficiency in medical image generation.
Abstract
Medical image generation is pivotal in applications like data augmentation for low-resource clinical tasks and privacy-preserving data sharing. However, developing a scalable generative backbone for medical imaging requires architectural efficiency, sufficient multi-organ data, and principled evaluation, yet current approaches leave these aspects unresolved. Therefore, we introduce MedVAR, the first autoregressive-based foundation model that adopts the next-scale prediction paradigm to enable fast and scale-up-friendly medical image synthesis. MedVAR generates images in a coarse-to-fine manner and produces structured multi-scale representations suitable for downstream use. To support hierarchical generation, we curate a harmonized dataset of around 440,000 CT and MRI images spanning six anatomical regions. Comprehensive experiments across fidelity, diversity, and scalability show that…
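The coarse-to-fine idea in the abstract can be illustrated with a toy sketch. This is not MedVAR's implementation: the abstract does not describe the architecture, so the predictor below (`toy_predictor`), the scale schedule, and the residual-accumulation scheme are all assumptions chosen only to show the general shape of next-scale autoregressive generation, where each step emits a whole map at the next resolution conditioned on everything generated so far.

```python
import numpy as np

def resize_nearest(x, size):
    """Nearest-neighbour resize of a square 2D map to (size, size)."""
    idx = np.arange(size) * x.shape[0] // size
    return x[np.ix_(idx, idx)]

def toy_predictor(canvas, size, rng):
    """Hypothetical stand-in for a learned model (assumption, not MedVAR):
    returns a map at resolution `size`, conditioned on the running canvas."""
    cond = resize_nearest(canvas, size)
    return 0.5 * cond + rng.normal(scale=1.0 / size, size=(size, size))

def generate(scales=(1, 2, 4, 8, 16), seed=0):
    """Coarse-to-fine loop: one full map per scale, coarsest first."""
    rng = np.random.default_rng(seed)
    final = scales[-1]
    canvas = np.zeros((final, final))  # running sum at full resolution
    maps = []                          # per-scale maps (multi-scale representation)
    for s in scales:
        residual = toy_predictor(canvas, s, rng)   # predict the next scale
        maps.append(residual)
        canvas = canvas + resize_nearest(residual, final)
    return canvas, maps

image, maps = generate()
print(image.shape, [m.shape for m in maps])
```

The loop runs once per scale rather than once per pixel or token, which is the usual efficiency argument for next-scale prediction. The list `maps` plays the role of the structured multi-scale representation mentioned in the abstract.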
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Advanced Neural Network Applications
