CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis
Jiayi Wang, Hadrien Reynaud, Franciskus Xaverius Erick, and Bernhard Kainz

TL;DR
This paper introduces CTFlow, a large-scale latent flow matching transformer model conditioned on clinical reports, capable of generating coherent 3D CT volumes with improved diversity and alignment, leveraging a novel autoregressive approach.
Contribution
The paper presents CTFlow, a 0.5B-parameter model that advances 3D CT volume synthesis by combining latent flow matching with autoregressive generation conditioned on clinical reports.
Findings
Outperforms state-of-the-art models in FID, FVD, IS, and CLIP scores.
Demonstrates superior temporal coherence and image diversity.
Effectively aligns generated volumes with clinical reports.
Abstract
Generative modelling of entire CT volumes conditioned on clinical reports has the potential to accelerate research through data augmentation, privacy-preserving synthesis, and reduced regulatory constraints on patient data, all while preserving diagnostic signals. With the recent release of CT-RATE, a large-scale collection of 3D CT volumes paired with their respective clinical reports, training large text-conditioned CT volume generation models has become achievable. In this work, we introduce CTFlow, a 0.5B-parameter latent flow matching transformer model conditioned on clinical reports. We leverage the A-VAE from FLUX to define our latent space, and rely on the CT-Clip text encoder to encode the clinical reports. To generate consistent whole CT volumes while keeping the memory constraints tractable, we rely on a custom autoregressive approach, where the model predicts the first sequence of slices…
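For readers unfamiliar with flow matching, the following is a minimal, generic sketch of the idea and not the CTFlow implementation. It assumes a straight-line path between noise and data, which the abstract does not specify, and it replaces the trained transformer with a closed-form toy velocity field. The names `interpolate`, `target_velocity`, `euler_sample`, `toy_velocity`, and the latent shape are all hypothetical. Text conditioning and the autoregressive slice scheme are omitted.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Point on the straight-line path from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0, x1):
    """Regression target for a velocity network under the assumed linear path."""
    return x1 - x0


def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so toy_velocity never divides by zero
        x = x + dt * velocity_fn(x, t)
    return x


# Toy stand-in for a trained network (assumption): when the data distribution
# is a single point, the velocity that follows the linear path is known exactly.
toy_latent = np.full((4, 8, 8), 0.5)  # hypothetical latent shape


def toy_velocity(x, t):
    return (toy_latent - x) / (1.0 - t)


rng = np.random.default_rng(0)
noise = rng.standard_normal(toy_latent.shape)
sample = euler_sample(toy_velocity, noise, num_steps=10)
```

In a real model, a network would be trained to regress `target_velocity(x0, x1)` at the points returned by `interpolate`, and `euler_sample` would call that network instead of `toy_velocity`; the resulting latent would then be decoded by the VAE.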
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques
