Benefits of Feature Extraction and Temporal Sequence Analysis for Video Frame Prediction: An Evaluation of Hybrid Deep Learning Models
By: Jose M. Sánchez Velázquez, Mingbo Cai, Andrew Coney and more
Potential Business Impact:
Predicts future video frames more accurately.
In recent years, advances in Artificial Intelligence have significantly impacted computer science, particularly computer vision, enabling solutions to complex problems such as video frame prediction. Video frame prediction has critical applications in weather forecasting and autonomous systems, and can also improve technical tasks such as video compression and streaming. Among Artificial Intelligence methods, Deep Learning has emerged as highly effective for vision-related tasks, although current frame prediction models still have room for improvement. This paper evaluates several hybrid deep learning approaches that combine the feature extraction capabilities of autoencoders with temporal sequence modelling using Recurrent Neural Networks (RNNs), 3D Convolutional Neural Networks (3D CNNs), and related architectures. The proposed solutions were rigorously evaluated on three datasets that differ in terms of synthetic versus real-world scenarios and grayscale versus color imagery. The results show that the approaches perform well, with Structural Similarity Index (SSIM) scores increasing from 0.69 to 0.82. Hybrid models using 3D CNNs and ConvLSTMs are the most effective, and grayscale videos with real data are the easiest to predict.
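For readers unfamiliar with the SSIM scores quoted in the abstract, the following is a minimal sketch of how a structural similarity score between a predicted frame and its ground-truth frame can be computed. It is a simplified, single-window variant: the standard SSIM averages the same expression over local sliding windows, and the abstract does not say which implementation the authors used. The function name `global_ssim`, the flat-list frame representation, and the sample pixel values are all illustrative assumptions.

```python
from statistics import fmean


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two frames given as flat lists of pixels.

    Assumption: statistics are taken over the whole frame at once, which is
    a simplification of the usual locally windowed SSIM.
    """
    if len(x) != len(y) or not x:
        raise ValueError("frames must be non-empty and the same size")

    # Stabilising constants from the standard SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean((a - mu_x) ** 2 for a in x)
    var_y = fmean((b - mu_y) ** 2 for b in y)
    cov = fmean((a - mu_x) * (b - mu_y) for a, b in zip(x, y))

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical grayscale pixel values in [0, 1], flattened to one list.
target = [0.0, 0.25, 0.5, 0.75, 1.0, 0.5]
close_prediction = [0.05, 0.2, 0.55, 0.7, 0.95, 0.5]
inverted_prediction = [1.0 - p for p in target]

score_close = global_ssim(target, close_prediction)
score_inverted = global_ssim(target, inverted_prediction)
```

A prediction that preserves the structure of the target frame scores close to 1, while a structurally inverted one scores far lower, which is why SSIM is a common choice for judging predicted frames.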
Similar Papers
Next-Frame Feature Prediction for Multimodal Deepfake Detection and Temporal Localization
CV and Pattern Recognition
Finds fake videos by predicting what happens next.
A Framework Combining 3D CNN and Transformer for Video-Based Behavior Recognition
CV and Pattern Recognition
Helps computers understand actions in videos better.
Fair and Interpretable Deepfake Detection in Videos
CV and Pattern Recognition
Finds fake videos fairly for everyone.