LatentFM: A Latent Flow Matching Approach for Generative Medical Image Segmentation
By: Huynh Trinh Ngoc, Hoang Anh Nguyen Kim, Toan Nguyen Hai and more
Potential Business Impact:
Segments medical images more accurately and reports a confidence map alongside each predicted mask.
Generative models have achieved remarkable progress with the emergence of flow matching (FM). FM has demonstrated strong generative capabilities and attracted significant attention as a simulation-free flow-based framework capable of learning exact data densities. Motivated by these advances, we propose LatentFM, a flow-based model operating in the latent space for medical image segmentation. To model the data distribution, we first design two variational autoencoders (VAEs) to encode both medical images and their corresponding masks into a lower-dimensional latent space. We then estimate a conditional velocity field that guides the flow based on the input image. By sampling multiple latent representations, our method synthesizes diverse segmentation outputs whose pixel-wise variance reliably captures the underlying data distribution, enabling both highly accurate and uncertainty-aware predictions. Furthermore, we generate confidence maps that quantify the model's certainty, providing clinicians with richer information for deeper analysis. We conduct experiments on two datasets, ISIC-2018 and CVC-Clinic, and compare our method with several prior baselines, including both deterministic and generative models. Through comprehensive evaluations, both qualitative and quantitative results show that our approach achieves superior segmentation accuracy while remaining highly efficient in the latent space.
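The sampling procedure described in the abstract (integrate a conditional velocity field from noise, decode several samples, then read uncertainty off the pixel-wise variance) can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration: `toy_velocity` and `toy_decode` are hypothetical stand-ins for the learned velocity network and the mask VAE decoder, the explicit Euler solver is one possible integrator, and the confidence formula (one minus normalized variance) is a simple choice that may differ from the paper's definition.

```python
import numpy as np

def toy_velocity(z, t, cond):
    """Hypothetical stand-in for a learned velocity field v(z, t | cond).
    It simply pulls the latent toward the conditioning latent."""
    return cond - z

def sample_latent(cond, rng, n_steps=20):
    """Integrate dz/dt = v(z, t | cond) from t=0 to t=1 with explicit Euler,
    starting from standard Gaussian noise."""
    z = rng.standard_normal(cond.shape)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        z = z + dt * toy_velocity(z, k * dt, cond)
    return z

def toy_decode(z):
    """Hypothetical stand-in for the mask decoder: a sigmoid that maps
    latents to per-pixel foreground probabilities."""
    return 1.0 / (1.0 + np.exp(-z))

def segment_with_uncertainty(cond, n_samples=8, seed=0):
    """Draw several samples and summarize them pixel-wise."""
    rng = np.random.default_rng(seed)
    probs = np.stack(
        [toy_decode(sample_latent(cond, rng)) for _ in range(n_samples)]
    )
    mean = probs.mean(axis=0)
    var = probs.var(axis=0)          # pixel-wise variance across samples
    mask = mean > 0.5                # final binary prediction
    # Assumed confidence score: values in [0, 1] have variance at most 0.25,
    # so this maps zero variance to 1 and maximal variance to 0.
    confidence = 1.0 - var / 0.25
    return mask, var, confidence
```

Calling `segment_with_uncertainty(cond)` with a 2D array `cond` as a placeholder image latent returns a boolean mask, a variance map, and a confidence map of the same shape. In a real pipeline the conditioning latent would come from the image VAE encoder and the velocity field from a trained network.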
Similar Papers
CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis
CV and Pattern Recognition
Creates fake CT scans from doctor's notes.
Diffusion Model in Latent Space for Medical Image Segmentation Task
CV and Pattern Recognition
Helps doctors see uncertain details in medical scans.
Efficient Flow Matching using Latent Variables
CV and Pattern Recognition
Makes AI create better, faster, and more realistic pictures.