Gaussian See, Gaussian Do: Semantic 3D Motion Transfer from Multiview Video
By: Yarin Bekor, Gal Michael Harari, Or Perel and more
Potential Business Impact:
Makes 3D objects dance like real people.
We present Gaussian See, Gaussian Do, a novel approach for semantic 3D motion transfer from multiview video. Our method enables rig-free, cross-category motion transfer between objects with semantically meaningful correspondence. Building on implicit motion transfer techniques, we extract motion embeddings from source videos via condition inversion, apply them to rendered frames of static target shapes, and use the resulting videos to supervise dynamic 3D Gaussian Splatting reconstruction. Our approach introduces an anchor-based, view-aware motion embedding mechanism that ensures cross-view consistency and accelerates convergence, along with a robust 4D reconstruction pipeline that consolidates noisy supervision videos. We establish the first benchmark for semantic 3D motion transfer and demonstrate superior motion fidelity and structural consistency compared to adapted baselines. Code and data for this paper are available at https://gsgd-motiontransfer.github.io/
Similar Papers
Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding
CV and Pattern Recognition
Makes videos show 3D worlds without flickering.
SSGaussian: Semantic-Aware and Structure-Preserving 3D Style Transfer
CV and Pattern Recognition
Makes 3D objects look like any picture.
SyncTrack4D: Cross-Video Motion Alignment and Video Synchronization for Multi-Video 4D Gaussian Splatting
CV and Pattern Recognition
Makes shaky videos look like smooth 3D movies.