DeFloMat: Detection with Flow Matching for Stable and Efficient Generative Object Localization
By: Hansang Lee , Chaelin Lee , Nieun Seo and more
Potential Business Impact:
Finds diseases in scans much faster.
We propose DeFloMat (Detection with Flow Matching), a novel generative object detection framework that addresses the critical latency bottleneck of diffusion-based detectors, such as DiffusionDet, by integrating Conditional Flow Matching (CFM). Diffusion models achieve high accuracy by formulating detection as a multi-step stochastic denoising process, but their reliance on numerous sampling steps ($T \gg 60$) makes them impractical for time-sensitive clinical applications like Crohn's Disease detection in Magnetic Resonance Enterography (MRE). DeFloMat replaces this slow stochastic path with a highly direct, deterministic flow field derived from Conditional Optimal Transport (OT) theory, specifically approximating the Rectified Flow. This shift enables fast inference via a simple Ordinary Differential Equation (ODE) solver. We demonstrate the superiority of DeFloMat on a challenging MRE clinical dataset. Crucially, DeFloMat achieves state-of-the-art accuracy ($43.32\% \text{ } AP_{10:50}$) in only $3$ inference steps, which represents a $1.4\times$ performance improvement over DiffusionDet's maximum converged performance ($31.03\% \text{ } AP_{10:50}$ at $4$ steps). Furthermore, our deterministic flow significantly enhances localization characteristics, yielding superior Recall and stability in the few-step regime. DeFloMat resolves the trade-off between generative accuracy and clinical efficiency, setting a new standard for stable and rapid object localization.
Similar Papers
FlowDet: Unifying Object Detection and Generative Transport Flows
CV and Pattern Recognition
Finds objects in pictures much faster.
Efficiency vs. Fidelity: A Comparative Analysis of Diffusion Probabilistic Models and Flow Matching on Low-Resource Hardware
Machine Learning (CS)
Makes AI create pictures much faster on phones.
SoFlow: Solution Flow Models for One-Step Generative Modeling
Machine Learning (CS)
Makes AI create pictures instantly, not slowly.