MedSapiens: Taking a Pose to Rethink Medical Imaging Landmark Detection
By: Marawan Elbatel, Anbang Wang, Keyuan Liu and more
Potential Business Impact:
Helps doctors find body parts in X-rays better.
This paper does not introduce a novel architecture; instead, it revisits a fundamental yet overlooked baseline: adapting human-centric foundation models for anatomical landmark detection in medical imaging. While landmark detection has traditionally relied on domain-specific models, the emergence of large-scale pre-trained vision models presents new opportunities. In this study, we investigate the adaptation of Sapiens, a human-centric foundation model designed for pose estimation, to medical imaging through multi-dataset pretraining, establishing a new state of the art across multiple datasets. Our proposed model, MedSapiens, demonstrates that human-centric foundation models, inherently optimized for spatial pose localization, provide strong priors for anatomical landmark detection, yet this potential has remained largely untapped. We benchmark MedSapiens against existing state-of-the-art models, achieving up to 5.26% improvement over generalist models and up to 21.81% improvement over specialist models in the average success detection rate (SDR). To further assess the adaptability of MedSapiens to novel downstream tasks with few annotations, we evaluate its performance in limited-data settings, achieving a 2.69% improvement in SDR over the few-shot state of the art. Code and model weights are available at https://github.com/xmed-lab/MedSapiens.
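For readers unfamiliar with the metric, SDR is commonly defined as the percentage of predicted landmarks whose radial (Euclidean) distance to the ground-truth landmark falls within a given threshold, and an average SDR is the mean over several thresholds. The sketch below illustrates that common definition only; it is not taken from the MedSapiens repository. The function names `sdr` and `average_sdr`, the thresholds of 2, 2.5, 3 and 4 mm, and the single isotropic `spacing` factor are all assumptions made for illustration.

```python
import math


def sdr(pred, gt, thresholds=(2.0, 2.5, 3.0, 4.0), spacing=1.0):
    """Percentage of landmarks within each distance threshold.

    pred, gt   : equal-length sequences of (x, y) landmark coordinates
    thresholds : distance cut-offs (assumed here to be in mm)
    spacing    : hypothetical isotropic pixel-to-mm conversion factor
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    # Radial error of every landmark, converted to physical units.
    errors = [math.dist(p, g) * spacing for p, g in zip(pred, gt)]
    return {
        t: 100.0 * sum(e <= t for e in errors) / len(errors)
        for t in thresholds
    }


def average_sdr(rates):
    """Mean of the per-threshold SDR values."""
    return sum(rates.values()) / len(rates)


# Toy example: four landmarks with radial errors of about 1.0, 2.2, 3.5 and 5.0.
pred = [(10, 10), (20, 20), (30, 30), (40, 40)]
gt = [(10, 11), (20, 22.2), (30, 33.5), (40, 45)]

rates = sdr(pred, gt)
print(rates)               # {2.0: 25.0, 2.5: 50.0, 3.0: 50.0, 4.0: 75.0}
print(average_sdr(rates))  # 50.0
```

Under this definition, a higher SDR at a fixed threshold means that more landmarks were localized within the tolerated error. The abstract does not state which thresholds or which averaging scheme its reported improvements use.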
Similar Papers
Geometric-Guided Few-Shot Dental Landmark Detection with Human-Centric Foundation Model
CV and Pattern Recognition
Finds important teeth spots faster for dentists.
SapiensID: Foundation for Human Recognition
CV and Pattern Recognition
Helps computers recognize people from any angle.
Landmarks Are Alike Yet Distinct: Harnessing Similarity and Individuality for One-Shot Medical Landmark Detection
CV and Pattern Recognition
Helps doctors find body parts in X-rays better.