AirGS: Real-Time 4D Gaussian Streaming for Free-Viewpoint Video Experiences
By: Zhe Wang, Jinghang Li, Yifei Zhu
Free-viewpoint video (FVV) enables immersive viewing experiences by allowing users to view scenes from arbitrary perspectives. As a prominent reconstruction technique for FVV generation, 4D Gaussian Splatting (4DGS) models dynamic scenes with time-varying 3D Gaussian ellipsoids and achieves high-quality rendering via fast rasterization. However, existing 4DGS approaches suffer from quality degradation over long sequences and impose substantial bandwidth and storage overhead, limiting their applicability in real-time and wide-scale deployments. Therefore, we present AirGS, a streaming-optimized 4DGS framework that rearchitects the training and delivery pipeline to enable high-quality, low-latency FVV experiences. AirGS converts Gaussian video streams into multi-channel 2D formats and intelligently identifies keyframes to enhance frame reconstruction quality. It further combines temporal coherence with inflation loss to reduce training time and representation size. To support communication-efficient transmission, AirGS models 4DGS delivery as an integer linear programming problem and design a lightweight pruning level selection algorithm to adaptively prune the Gaussian updates to be transmitted, balancing reconstruction quality and bandwidth consumption. Extensive experiments demonstrate that AirGS reduces quality deviation in PSNR by more than 20% when scene changes, maintains frame-level PSNR consistently above 30, accelerates training by 6 times, reduces per-frame transmission size by nearly 50% compared to the SOTA 4DGS approaches.
Similar Papers
StreamSTGS: Streaming Spatial and Temporal Gaussian Grids for Real-Time Free-Viewpoint Video
CV and Pattern Recognition
Makes 3D videos stream smoothly with less data.
D-FCGS: Feedforward Compression of Dynamic Gaussian Splatting for Free-Viewpoint Videos
CV and Pattern Recognition
Makes 3D videos smaller for faster sharing.
Geometry-Consistent 4D Gaussian Splatting for Sparse-Input Dynamic View Synthesis
CV and Pattern Recognition
Creates realistic 3D scenes from few pictures.