Bi-modal Prediction and Transformation Coding for Compressing Complex Human Dynamics
By: Huong Hoang, Keito Suzuki, Truong Nguyen, and more
Potential Business Impact:
Compresses realistic animated human characters so they take less data to store and stream.
For dynamic human motion sequences, the original KeyNode-Driven codec often struggles to retain compression efficiency when confronted with rapid movements or strong non-rigid deformations. This paper proposes a novel Bi-modal coding framework that enhances the flexibility of motion representation by integrating semantic segmentation and region-specific transformation modeling. The rigid transformation model (rotation and translation) is extended with a hybrid scheme that selectively applies affine transformations (rotation, translation, scaling, and shearing) only to deformation-rich regions (e.g., the torso, where loose clothing induces high variability), while retaining rigid models elsewhere. The affine model is decomposed into minimal parameter sets for efficient coding, and the components are combined through a selection strategy guided by Lagrangian Rate-Distortion optimization. The results show that the Bi-modal method achieves more accurate mesh deformation, especially in sequences involving complex non-rigid motion, without compromising compression efficiency in simpler regions, yielding an average bit-rate saving of 33.81% over the baseline.
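To make the per-region selection concrete, below is a minimal Python sketch of how a rigid fit and an affine fit might be compared under a Lagrangian Rate-Distortion cost J = D + lambda * R. The function names and the parameter-count rate proxy are illustrative assumptions, not the paper's implementation; in particular, the paper decomposes the affine model into minimal parameter sets for coding, which is not reproduced here.

import numpy as np

def fit_rigid(src, dst):
    # Least-squares rigid fit (rotation R, translation t) via the
    # Kabsch/Procrustes method, so that dst ~ src @ R.T + t.
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(U @ Vt))        # guard against reflections
    R = (U @ np.diag([1.0, 1.0, d]) @ Vt).T
    t = dst.mean(0) - src.mean(0) @ R.T
    return R, t

def fit_affine(src, dst):
    # Least-squares affine fit; A folds in rotation, scaling, and shearing.
    src_h = np.hstack([src, np.ones((len(src), 1))])   # homogeneous coords
    M, *_ = np.linalg.lstsq(src_h, dst, rcond=None)    # solve src_h @ M ~ dst
    return M[:3].T, M[3]                               # A (3x3), t (3,)

def select_transform(src, dst, lam, bits_rigid=6, bits_affine=12):
    # Per-region mode decision by Lagrangian cost J = D + lam * R.
    # The rate R is approximated here by parameter counts (an assumption);
    # a real encoder would use actual coded bit counts.
    R_rot, t_r = fit_rigid(src, dst)
    A, t_a = fit_affine(src, dst)
    d_rigid = np.mean(np.sum((src @ R_rot.T + t_r - dst) ** 2, axis=1))
    d_affine = np.mean(np.sum((src @ A.T + t_a - dst) ** 2, axis=1))
    j_rigid = d_rigid + lam * bits_rigid
    j_affine = d_affine + lam * bits_affine
    return ("rigid", R_rot, t_r) if j_rigid <= j_affine else ("affine", A, t_a)

In this sketch, a near-rigid region (e.g., a limb) yields a small rigid-fit distortion, so the cheaper rigid mode wins, while a deformation-rich region (e.g., a torso with loose clothing) justifies the extra affine parameters; sweeping lam would trace out the rate-distortion trade-off.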
Similar Papers
KeyNode-Driven Geometry Coding for Real-World Scanned Human Dynamic Mesh Compression
CV and Pattern Recognition
Makes 3D people in games look real with less data.
Rethinking Generative Human Video Coding with Implicit Motion Transformation
CV and Pattern Recognition
Makes videos of people move more smoothly.
Fine-Grained Motion Compression and Selective Temporal Fusion for Neural B-Frame Video Coding
Image and Video Processing
Makes videos load faster and look better.