LMVC: An End-to-End Learned Multiview Video Coding Framework
By: Xihua Sheng , Yingwen Zhang , Long Xu and more
Potential Business Impact:
Makes 3D videos smaller for easier sharing.
Multiview video is a key data source for volumetric video, enabling immersive 3D scene reconstruction but posing significant challenges in storage and transmission due to its massive data volume. Recently, deep learning-based end-to-end video coding has achieved great success, yet most focus on single-view or stereo videos, leaving general multiview scenarios underexplored. This paper proposes an end-to-end learned multiview video coding (LMVC) framework that ensures random access and backward compatibility while enhancing compression efficiency. Our key innovation lies in effectively leveraging independent-view motion and content information to enhance dependent-view compression. Specifically, to exploit the inter-view motion correlation, we propose a feature-based inter-view motion vector prediction method that conditions dependent-view motion encoding on decoded independent-view motion features, along with an inter-view motion entropy model that learns inter-view motion priors. To exploit the inter-view content correlation, we propose a disparity-free inter-view context prediction module that predicts inter-view contexts from decoded independent-view content features, combined with an inter-view contextual entropy model that captures inter-view context priors. Experimental results show that our proposed LMVC framework outperforms the reference software of the traditional MV-HEVC standard by a large margin, establishing a strong baseline for future research in this field.
Similar Papers
Multi-View Factorizing and Disentangling: A Novel Framework for Incomplete Multi-View Multi-Label Classification
CV and Pattern Recognition
Helps computers learn from messy, incomplete data.
LVC: A Lightweight Compression Framework for Enhancing VLMs in Long Video Understanding
CV and Pattern Recognition
Helps computers understand long videos better.
L-LBVC: Long-Term Motion Estimation and Prediction for Learned Bi-Directional Video Compression
CV and Pattern Recognition
Makes video streaming smoother with less data.