DynaMind: Reconstructing Dynamic Visual Scenes from EEG by Aligning Temporal Dynamics and Multimodal Semantics to Guided Diffusion
By: Junxiang Liu, Junming Lin, Jiangtong Li, and more
Potential Business Impact:
Shows what you're seeing from brain waves.
Reconstructing dynamic visual scenes from electroencephalography (EEG) signals remains a primary challenge in brain decoding, limited by the low spatial resolution of EEG, the temporal mismatch between neural recordings and video dynamics, and the underuse of semantic information within brain activity. As a result, existing methods often fail to capture both the dynamic coherence and the complex semantic context of the perceived visual stimuli. To overcome these limitations, we introduce DynaMind, a novel framework that reconstructs video by jointly modeling neural dynamics and semantic features via three core modules: a Regional-aware Semantic Mapper (RSM), a Temporal-aware Dynamic Aligner (TDA), and a Dual-Guidance Video Reconstructor (DGVR). The RSM first uses a regional-aware encoder to extract multimodal semantic features from EEG signals across distinct brain regions, aggregating them into a unified diffusion prior. Meanwhile, the TDA generates a dynamic latent sequence, or blueprint, that enforces temporal consistency between the feature representations and the original neural recordings. Finally, guided by the semantic diffusion prior, the DGVR translates the temporal-aware blueprint into a high-fidelity video reconstruction. On the SEED-DV dataset, DynaMind sets a new state of the art, boosting video- and frame-based reconstruction accuracies by 12.5 and 10.3 percentage points, respectively. It also delivers a marked gain in pixel-level quality, with a 9.4% SSIM improvement and a 19.7% FVMD reduction, reflecting strong visual fidelity and temporal coherence. This marks a critical advance toward bridging the gap between neural dynamics and high-fidelity visual semantics.
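To make the three-module flow concrete, here is a minimal PyTorch sketch of the pipeline the abstract describes. The module names (RSM, TDA, DGVR) come from the paper, but every internal detail is an illustrative assumption, not the authors' implementation: the tensor shapes, the per-region linear encoder, the GRU-based temporal aligner, and especially the linear decoder, which merely stands in for the guided video diffusion model the paper actually uses.

```python
import torch
import torch.nn as nn

# Hypothetical shapes: B EEG clips, R brain regions, C channels per region, T time samples.
B, R, C, T = 2, 5, 12, 256
LATENT_DIM, N_FRAMES = 128, 8


class RegionalSemanticMapper(nn.Module):
    """Sketch of the RSM: encode each brain region separately, then
    aggregate the per-region embeddings into one semantic diffusion prior."""

    def __init__(self, channels, time, dim):
        super().__init__()
        self.region_encoder = nn.Sequential(
            nn.Flatten(start_dim=2),            # (B, R, C, T) -> (B, R, C*T)
            nn.Linear(channels * time, dim),
            nn.GELU(),
        )
        self.aggregate = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, eeg):                     # eeg: (B, R, C, T)
        regions = self.region_encoder(eeg)      # (B, R, dim)
        prior, _ = self.aggregate(regions, regions, regions)
        return prior.mean(dim=1)                # (B, dim) unified prior


class TemporalDynamicAligner(nn.Module):
    """Sketch of the TDA: map the EEG time course to a latent frame
    sequence (the 'blueprint') whose dynamics track the recording."""

    def __init__(self, channels, dim, n_frames):
        super().__init__()
        self.temporal = nn.GRU(channels, dim, batch_first=True)
        self.n_frames = n_frames

    def forward(self, eeg):                             # eeg: (B, R, C, T)
        x = eeg.mean(dim=1).transpose(1, 2)             # (B, T, C), pooled over regions
        h, _ = self.temporal(x)                         # (B, T, dim)
        # Subsample hidden states to one latent per output video frame.
        idx = torch.linspace(0, h.size(1) - 1, self.n_frames).long()
        return h[:, idx]                                # (B, n_frames, dim)


class DualGuidanceVideoReconstructor(nn.Module):
    """Stand-in for the DGVR: a linear decoder that only illustrates the
    dual conditioning (semantic prior + temporal blueprint)."""

    def __init__(self, dim, frame_pixels=3 * 32 * 32):
        super().__init__()
        self.decode = nn.Linear(2 * dim, frame_pixels)

    def forward(self, prior, blueprint):                # (B, dim), (B, F, dim)
        cond = prior.unsqueeze(1).expand(-1, blueprint.size(1), -1)
        frames = self.decode(torch.cat([blueprint, cond], dim=-1))
        return frames.view(frames.size(0), -1, 3, 32, 32)  # (B, F, 3, 32, 32)


eeg = torch.randn(B, R, C, T)
prior = RegionalSemanticMapper(C, T, LATENT_DIM)(eeg)
blueprint = TemporalDynamicAligner(C, LATENT_DIM, N_FRAMES)(eeg)
video = DualGuidanceVideoReconstructor(LATENT_DIM)(prior, blueprint)
print(video.shape)  # torch.Size([2, 8, 3, 32, 32])
```

The structural idea the sketch preserves is the dual guidance: the reconstructor consumes one static semantic prior per clip alongside one temporally aligned latent per output frame, which is how the framework separates what is seen from how it unfolds over time.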
Similar Papers
Decoding Visual Neural Representations by Multimodal with Dynamic Balancing
CV and Pattern Recognition
Reads minds by matching brain waves to pictures.
EEG-Driven Image Reconstruction with Saliency-Guided Diffusion Models
CV and Pattern Recognition
Shows what you're thinking by drawing pictures.
Mind-to-Face: Neural-Driven Photorealistic Avatar Synthesis via EEG Decoding
CV and Pattern Recognition
Reads your thoughts to make a face move.