Text2CT: Towards 3D CT Volume Generation from Free-text Descriptions Using Diffusion Model
By: Pengfei Guo, Can Zhao, Dong Yang, and more
Potential Business Impact:
Creates 3D body scans from simple text.
Generating 3D CT volumes from descriptive free-text inputs presents a transformative opportunity in diagnostics and research. In this paper, we introduce Text2CT, a novel approach for synthesizing 3D CT volumes from textual descriptions using a diffusion model. Unlike previous methods that rely on fixed-format text input, Text2CT employs a novel prompt formulation that enables generation from diverse, free-text descriptions. The proposed framework encodes medical text into latent representations and decodes them into high-resolution 3D CT scans, effectively bridging the gap between semantic text inputs and detailed volumetric representations in a unified 3D framework. Our method demonstrates superior performance in preserving anatomical fidelity and capturing the intricate structures described in the input text. Extensive evaluations show that our approach achieves state-of-the-art results, offering promising applications in diagnostics and data augmentation.
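To make the described pipeline concrete, below is a minimal PyTorch sketch of text-conditioned 3D latent diffusion sampling: encode free text into an embedding, iteratively denoise a 3D latent, then decode the latent into a volume. This is not the authors' implementation; every module here (a tiny denoiser standing in for a conditional 3D U-Net, a single convolution standing in for a 3D VAE decoder) and all shapes are illustrative assumptions.

import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Toy stand-in for a conditional 3D U-Net: predicts the noise in a
    noisy latent, conditioned on a timestep and a text embedding."""
    def __init__(self, latent_ch=4, cond_dim=64):
        super().__init__()
        # Text embedding + timestep -> per-channel additive shift.
        self.cond = nn.Linear(cond_dim + 1, latent_ch)
        self.net = nn.Sequential(
            nn.Conv3d(latent_ch, 32, 3, padding=1), nn.SiLU(),
            nn.Conv3d(32, latent_ch, 3, padding=1),
        )

    def forward(self, z, t, text_emb):
        shift = self.cond(torch.cat([text_emb, t[:, None]], dim=1))
        z = z + shift[:, :, None, None, None]  # additive (FiLM-like) conditioning
        return self.net(z)

@torch.no_grad()
def sample_volume(denoiser, decoder, text_emb, steps=50, shape=(1, 4, 16, 16, 16)):
    """Standard DDPM-style ancestral sampling over a 3D latent, then decode."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)

    z = torch.randn(shape)  # start from pure Gaussian noise
    for i in reversed(range(steps)):
        t = torch.full((shape[0],), i / steps)
        eps = denoiser(z, t, text_emb)  # predicted noise
        # Posterior mean of the reverse step (standard DDPM update).
        z = (z - betas[i] / torch.sqrt(1 - alpha_bar[i]) * eps) / torch.sqrt(alphas[i])
        if i > 0:
            z = z + torch.sqrt(betas[i]) * torch.randn_like(z)
    return decoder(z)  # map latent back to a CT volume

# Toy usage: a random vector stands in for the output of a medical text encoder.
denoiser = TinyDenoiser()
decoder = nn.Conv3d(4, 1, 3, padding=1)  # placeholder for a 3D VAE decoder
text_emb = torch.randn(1, 64)            # e.g., a pooled BERT-style embedding
ct = sample_volume(denoiser, decoder, text_emb)
print(ct.shape)                          # torch.Size([1, 1, 16, 16, 16])

In a full system such as the one the abstract describes, the denoiser would be a 3D U-Net conditioned on embeddings from a medical text encoder and the decoder a pretrained volumetric autoencoder; the loop above only shows the generic DDPM reverse process that latent diffusion models of this kind typically share.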
Similar Papers
Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining
CV and Pattern Recognition
Creates realistic CT scans from text descriptions.
TextDiffSeg: Text-guided Latent Diffusion Model for 3d Medical Images Segmentation
Image and Video Processing
Uses text prompts to outline organs and tumors in 3D scans.
DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models
CV and Pattern Recognition
Creates design images from text descriptions.