NeRV360: Neural Representation for 360-Degree Videos with a Viewport Decoder
By: Daichi Arai, Kyohei Unno, Yasuko Sugito, and more
Implicit neural representations for videos (NeRV) have shown strong potential for video compression. However, applying NeRV to high-resolution 360-degree videos causes high memory usage and slow decoding, making real-time applications impractical. We propose NeRV360, an end-to-end framework that decodes only the user-selected viewport instead of reconstructing the entire panoramic frame. Unlike conventional pipelines, NeRV360 integrates viewport extraction into decoding and introduces a spatial-temporal affine transform module for conditional decoding based on viewpoint and time. Experiments on 6K-resolution videos show that NeRV360 achieves a 7-fold reduction in memory consumption and a 2.5-fold increase in decoding speed compared to HNeRV, a representative prior work, while delivering better image quality in terms of objective metrics.
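The core idea is conditional decoding: the decoder is modulated by the viewing direction and the time index so that it reconstructs only the requested viewport rather than the full panoramic frame. The abstract does not specify how the spatial-temporal affine transform module is implemented, but a FiLM-style affine modulation is one plausible reading. The sketch below illustrates that idea in PyTorch; the module name SpatialTemporalAffine, the cond_dim parameter, and the (yaw, pitch, t) conditioning vector are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SpatialTemporalAffine(nn.Module):
    """Hypothetical FiLM-style conditioning layer: scales and shifts decoder
    features based on an embedding of the viewport direction and time.
    Names and dimensions are illustrative, not taken from the paper."""

    def __init__(self, channels: int, cond_dim: int = 3) -> None:
        super().__init__()
        # Map the conditioning vector (e.g., normalized yaw, pitch, time)
        # to a per-channel scale (gamma) and shift (beta).
        self.to_affine = nn.Sequential(
            nn.Linear(cond_dim, 64),
            nn.GELU(),
            nn.Linear(64, 2 * channels),
        )

    def forward(self, feat: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) intermediate decoder features
        # cond: (B, cond_dim) viewpoint/time conditioning vector
        gamma, beta = self.to_affine(cond).chunk(2, dim=-1)
        gamma = gamma[:, :, None, None]  # broadcast to (B, C, 1, 1)
        beta = beta[:, :, None, None]
        # (1 + gamma) keeps the layer near identity at initialization
        return feat * (1.0 + gamma) + beta


# Usage: condition 64-channel features on a viewport at (yaw, pitch) and time t.
layer = SpatialTemporalAffine(channels=64)
feat = torch.randn(2, 64, 90, 160)        # e.g., a low-resolution feature map
cond = torch.tensor([[0.25, -0.10, 0.5],
                     [0.80, 0.30, 0.5]])  # (yaw, pitch, t), normalized
out = layer(feat, cond)
print(out.shape)                          # torch.Size([2, 64, 90, 160])
```

Using (1 + gamma) rather than gamma alone is a common design choice for conditioning layers, since the module then behaves as an identity mapping when the affine predictor outputs zeros.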
Similar Papers
Boosting Neural Video Representation via Online Structural Reparameterization
Image and Video Processing
Makes videos smaller so they can be sent faster.
MetaNeRV: Meta Neural Representations for Videos with Spatial-Temporal Guidance
Computer Vision and Pattern Recognition
Makes videos load much faster.
Ultra-lightweight Neural Video Representation Compression
Computer Vision and Pattern Recognition
Makes videos smaller and faster to send.