
Text2CT: Towards 3D CT Volume Generation from Free-text Descriptions Using Diffusion Model

Published: May 7, 2025 | arXiv ID: 2505.04522v1

By: Pengfei Guo, Can Zhao, Dong Yang and more

Potential Business Impact:

Creates 3D body scans from simple text.

Business Areas:
Text Analytics Data and Analytics, Software

Generating 3D CT volumes from descriptive free-text inputs presents a transformative opportunity in diagnostics and research. In this paper, we introduce Text2CT, an approach for synthesizing 3D CT volumes from textual descriptions using a diffusion model. Unlike previous methods that rely on fixed-format text input, Text2CT employs a novel prompt formulation that enables generation from diverse, free-text descriptions. The proposed framework encodes medical text into latent representations and decodes them into high-resolution 3D CT scans, bridging the gap between semantic text inputs and detailed volumetric representations in a unified 3D framework. Our method demonstrates superior performance in preserving anatomical fidelity and capturing intricate structures as described in the input text. Extensive evaluations show that our approach achieves state-of-the-art results, offering promising potential applications in diagnostics and data augmentation.
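To make the encode, denoise, decode flow in the abstract concrete, here is a minimal toy sketch in Python with NumPy. It is not the authors' implementation: every function name, the hash-based text embedding, the placeholder denoiser, the noise schedule, the latent shape, and the example prompt are assumptions chosen only for illustration. A real system would use trained networks for all three stages. The sampling loop follows the standard DDPM reverse update.

```python
import numpy as np

def embed_text(prompt, dim=16):
    # Hypothetical stand-in for a text encoder: bucket each token into a
    # fixed-size vector, then L2-normalize.
    vec = np.zeros(dim)
    for token in prompt.lower().split():
        vec[sum(ord(c) for c in token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def toy_denoiser(z, t, text_emb):
    # Placeholder for a trained noise-prediction network (NOT a real model).
    # It returns a noise estimate shaped like z that depends weakly on the text.
    return 0.1 * z + text_emb.mean()

def sample_latent(text_emb, shape=(4, 8, 8, 8), steps=50, seed=0):
    # Standard DDPM reverse process over an assumed (channels, D, H, W) latent.
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, steps)   # assumed linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    z = rng.standard_normal(shape)           # start from pure Gaussian noise
    for t in range(steps - 1, -1, -1):
        eps = toy_denoiser(z, t, text_emb)
        # Posterior mean of the reverse step given the predicted noise.
        z = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Add fresh noise on every step except the last one.
            z = z + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return z

def decode_latent(z, factor=4):
    # Stand-in for a learned decoder: average the latent channels, then
    # nearest-neighbour upsample each spatial axis by `factor`.
    vol = z.mean(axis=0)
    for axis in range(3):
        vol = np.repeat(vol, factor, axis=axis)
    return vol

# Made-up example prompt, not taken from the paper.
prompt = "chest CT with a small nodule in the right upper lobe"
volume = decode_latent(sample_latent(embed_text(prompt)))
print(volume.shape)  # (32, 32, 32)
```

Running the diffusion in a compressed latent space and decoding only at the end is what keeps 3D generation tractable in this kind of design: the iterative loop operates on an 8x8x8 grid here, while the output volume is 32x32x32.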

Page Count
15 pages

Category
Electrical Engineering and Systems Science:
Image and Video Processing