EvDiff: High Quality Video with an Event Camera
By: Weilun Li, Lei Sun, Ruixi Gao, and more
Potential Business Impact:
Turns event camera data into clear, full-color videos.
As neuromorphic sensors, event cameras asynchronously record brightness changes as streams of sparse events, offering high temporal resolution and high dynamic range. Reconstructing intensity images from events is a highly ill-posed task due to the inherent ambiguity of absolute brightness. Early methods generally follow an end-to-end regression paradigm, directly mapping events to intensity frames in a deterministic manner. While effective to some extent, these approaches often yield perceptually inferior results and struggle to scale up in model capacity and training data. In this work, we propose EvDiff, an event-based diffusion model trained under a surrogate framework to produce high-quality videos. To reduce the heavy computational cost of high-frame-rate video generation, the model performs only a single forward diffusion step and is equipped with a temporally consistent EvEncoder. Furthermore, our Surrogate Training Framework eliminates the dependence on paired event-image datasets, allowing the model to leverage large-scale image datasets for higher capacity. EvDiff generates high-quality, colorful videos solely from monochromatic event streams. Experiments on real-world datasets demonstrate that our method strikes a sweet spot between fidelity and realism, outperforming existing approaches on both pixel-level and perceptual metrics.
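To make the single-step idea concrete, here is a minimal PyTorch sketch of an event-conditioned, one-pass denoiser. It is not the authors' code: it assumes a voxel-grid event representation and uses small convolutional stand-ins for the EvEncoder and the diffusion backbone, so every module name, shape, and layer choice below is an illustrative assumption rather than a detail from the paper.

```python
import torch
import torch.nn as nn

class EvEncoder(nn.Module):
    # Hypothetical event encoder: maps a voxel-grid event tensor of shape
    # (B, bins, H, W) to conditioning features. The paper's temporally
    # consistent design is only gestured at here with a stateless stub.
    def __init__(self, bins: int = 5, feat: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(bins, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
        )

    def forward(self, events: torch.Tensor) -> torch.Tensor:
        return self.net(events)

class SingleStepDenoiser(nn.Module):
    # Hypothetical denoiser: predicts a clean RGB frame from a noised
    # input plus event features in one forward pass, with no iterative
    # sampling chain.
    def __init__(self, feat: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + feat, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 3, 3, padding=1),
        )

    def forward(self, noisy: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([noisy, cond], dim=1))

# Toy usage: one network evaluation per frame instead of a full
# multi-step sampling loop.
encoder, denoiser = EvEncoder(), SingleStepDenoiser()
events = torch.randn(1, 5, 64, 64)   # voxelized event stream for one frame
noisy = torch.randn(1, 3, 64, 64)    # noised input at one fixed timestep
frame = denoiser(noisy, encoder(events))
print(frame.shape)                   # torch.Size([1, 3, 64, 64])
```

Because each frame costs one network evaluation rather than a full iterative sampling chain, per-frame cost stays flat as the frame rate grows, which is the motivation the abstract gives for the single-step design.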
Similar Papers
Event Camera Guided Visual Media Restoration & 3D Reconstruction: A Survey
CV and Pattern Recognition
Restores blurry videos and sharpens 3D reconstructions.
Drone Detection with Event Cameras
CV and Pattern Recognition
Finds tiny drones in any light.
EventDiff: A Unified and Efficient Diffusion Model Framework for Event-based Video Frame Interpolation
CV and Pattern Recognition
Makes videos smoother using event camera data.