Text2CT: Towards 3D CT Volume Generation from Free-text Descriptions Using Diffusion Model
Pengfei Guo, Can Zhao, Dong Yang, Yufan He, Vishwesh Nath, Ziyue Xu,, Pedro R. A. S. Bassi, Zongwei Zhou, Benjamin D. Simon, Stephanie Anne Harmon,, Baris Turkbey, Daguang Xu

TL;DR
Text2CT introduces a diffusion model-based method that generates high-quality 3D CT volumes from free-text descriptions, enabling more flexible and accurate medical image synthesis for diagnostics and research.
Contribution
The paper presents a novel diffusion model framework that encodes medical text into latent space and decodes it into 3D CT scans, handling diverse free-text inputs unlike previous fixed-format methods.
Findings
Achieves state-of-the-art accuracy in anatomical fidelity
Effectively captures intricate structures from text descriptions
Demonstrates potential for diagnostics and data augmentation
Abstract
Generating 3D CT volumes from descriptive free-text inputs presents a transformative opportunity in diagnostics and research. In this paper, we introduce Text2CT, a novel approach for synthesizing 3D CT volumes from textual descriptions using the diffusion model. Unlike previous methods that rely on fixed-format text input, Text2CT employs a novel prompt formulation that enables generation from diverse, free-text descriptions. The proposed framework encodes medical text into latent representations and decodes them into high-resolution 3D CT scans, effectively bridging the gap between semantic text inputs and detailed volumetric representations in a unified 3D framework. Our method demonstrates superior performance in preserving anatomical fidelity and capturing intricate structures as described in the input text. Extensive evaluations show that our approach achieves state-of-the-art…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenerative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · 3D Shape Modeling and Analysis
MethodsDiffusion
