Score: 2

Efficient Diffusion-Based 3D Human Pose Estimation with Hierarchical Temporal Pruning

Published: August 29, 2025 | arXiv ID: 2508.21363v1

By: Yuquan Bi, Hongsong Wang, Xinli Shi and more

Potential Business Impact:

Makes diffusion-based 3D human pose estimation much faster.

Business Areas:

Motion Capture, Media and Entertainment, Video

Diffusion models have demonstrated strong capabilities in generating high-fidelity 3D human poses, yet their iterative nature and multi-hypothesis requirements incur substantial computational cost. In this paper, we propose an Efficient Diffusion-Based 3D Human Pose Estimation framework with a Hierarchical Temporal Pruning (HTP) strategy, which dynamically prunes redundant pose tokens across both frame and semantic levels while preserving critical motion dynamics. HTP operates in a staged, top-down manner: (1) Temporal Correlation-Enhanced Pruning (TCEP) identifies essential frames by analyzing inter-frame motion correlations through adaptive temporal graph construction; (2) Sparse-Focused Temporal MHSA (SFT MHSA) leverages the resulting frame-level sparsity to reduce attention computation, focusing on motion-relevant tokens; and (3) Mask-Guided Pose Token Pruner (MGPTP) performs fine-grained semantic pruning via clustering, retaining only the most informative pose tokens. Experiments on Human3.6M and MPI-INF-3DHP show that HTP reduces training MACs by 38.5%, inference MACs by 56.8%, and improves inference speed by an average of 81.1% compared to prior diffusion-based methods, while achieving state-of-the-art performance.
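
The general idea behind stages (1) and (2), pruning redundant frames and then running attention only over the survivors, can be illustrated with a toy NumPy sketch. This is not the paper's TCEP or SFT MHSA: the redundancy score (mean cosine similarity to other frames), the function names `prune_frames` and `sparse_self_attention`, and the single-head attention without learned projections are all assumptions made for illustration.

```python
import numpy as np

def prune_frames(tokens: np.ndarray, keep: int) -> np.ndarray:
    """Return sorted indices of the `keep` least redundant frames.

    tokens: (T, D) array, one pose token per frame.
    Redundancy is the mean cosine similarity to all other frames.
    This scoring rule is an illustrative assumption, not the paper's TCEP.
    """
    unit = tokens / (np.linalg.norm(tokens, axis=1, keepdims=True) + 1e-8)
    sim = unit @ unit.T                    # (T, T) inter-frame similarity
    np.fill_diagonal(sim, 0.0)             # ignore self-similarity
    redundancy = sim.sum(axis=1) / (len(tokens) - 1)
    kept = np.argsort(redundancy)[:keep]   # lowest redundancy first
    return np.sort(kept)                   # restore temporal order

def sparse_self_attention(tokens: np.ndarray, kept: np.ndarray) -> np.ndarray:
    """Single-head self-attention restricted to the kept frames.

    Queries, keys, and values are the raw tokens (no learned projections),
    a simplification to keep the sketch short.
    """
    x = tokens[kept]
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))          # 16 frames, 8-dim tokens (toy sizes)
kept = prune_frames(tokens, keep=6)
out = sparse_self_attention(tokens, kept)
```

With 6 of 16 frames kept, the attention matrix in this toy example shrinks from 16 x 16 to 6 x 6, which is the kind of saving that frame-level sparsity makes possible.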

FastDDHPose: Towards Unified, Efficient, and Disentangled 3D Human Pose Estimation

CV and Pattern Recognition

Makes computers understand body poses better and faster.

16 Dec 2025 2

90%

HyperDiff: Hypergraph Guided Diffusion Model for 3D Human Pose Estimation

CV and Pattern Recognition

Lets computers see people's bodies in 3D.

20 Aug 2025 1

90%

DreamPose3D: Hallucinative Diffusion with Prompt Learning for 3D Human Pose Estimation

CV and Pattern Recognition

Lets computers guess people's movements in 3D.

12 Nov 2025 1

Country of Origin

🇨🇳 🇲🇴 China, Macao

Page Count

13 pages
