Score: 1

Robust Long-term Test-Time Adaptation for 3D Human Pose Estimation through Motion Discretization

Published: November 24, 2025 | arXiv ID: 2511.18851v1

By: Yilin Wen, Kechuan Dong, Yusuke Sugano

Potential Business Impact:

Keeps 3D body tracking from video accurate over long streams by preventing the estimation errors that build up as the model continuously adapts to new footage.

Business Areas:
Motion Capture, Media and Entertainment, Video

Online test-time adaptation addresses the train-test domain gap by adapting the model on unlabeled streaming test inputs before making the final prediction. However, online adaptation for 3D human pose estimation suffers from error accumulation when relying on self-supervision with imperfect predictions, leading to degraded performance over time. To mitigate this fundamental challenge, we propose a novel solution that highlights the use of motion discretization. Specifically, we employ unsupervised clustering in the latent motion representation space to derive a set of anchor motions, whose regularity aids in supervising the human pose estimator and enables efficient self-replay. Additionally, we introduce an effective and efficient soft-reset mechanism by reverting the pose estimator to its exponential moving average during continuous adaptation. We examine long-term online adaptation by continuously adapting to out-of-domain streaming test videos of the same individual, which allows for the capture of consistent personal shape and motion traits throughout the streaming observation. By mitigating error accumulation, our solution enables robust exploitation of these personal traits for enhanced accuracy. Experiments demonstrate that our solution outperforms previous online test-time adaptation methods and validate our design choices.
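
The abstract describes two mechanisms: anchor motions obtained by unsupervised clustering of latent motion representations, and a soft reset that reverts the pose estimator to an exponential moving average (EMA) of its weights during continuous adaptation. Below is a minimal sketch (not the authors' code) of the EMA soft-reset idea; the decay value, reset interval, and loss function are illustrative assumptions.

```python
# Sketch: maintain an EMA copy of the pose estimator during online adaptation,
# and periodically revert the adapted model to the EMA weights so that
# self-supervision on imperfect predictions does not accumulate errors.
import copy
import torch


@torch.no_grad()
def update_ema(model, ema_model, decay=0.999):
    """Blend the EMA model's parameters toward the current model's parameters."""
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)


@torch.no_grad()
def soft_reset(model, ema_model):
    """Revert the adapted estimator to its EMA weights (the soft reset)."""
    model.load_state_dict(ema_model.state_dict())


# Illustrative streaming-adaptation loop (hypothetical names):
#
# ema_model = copy.deepcopy(model)
# for step, frame in enumerate(stream):
#     loss = self_supervised_loss(model, frame)   # e.g. anchor-motion supervision
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
#     update_ema(model, ema_model)
#     if step % reset_interval == 0:
#         soft_reset(model, ema_model)
```

The anchor motions themselves could, under the same reading of the abstract, be obtained by clustering latent motion embeddings (e.g. with k-means) and replaying the cluster centers as regular, self-consistent supervision targets during adaptation.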

Country of Origin
🇯🇵 Japan

Page Count
16 pages

Category
Computer Science:
Computer Vision and Pattern Recognition