VideoAgent: Personalized Synthesis of Scientific Videos
By: Xiao Liang, Bangxin Li, Zixuan Chen, and more
Potential Business Impact:
Makes science videos automatically from papers.
Automating the generation of scientific videos is a crucial yet challenging task for effective knowledge dissemination. However, existing works on document automation primarily focus on static media such as posters and slides, lacking mechanisms for personalized dynamic orchestration and multimodal content synchronization. To address these challenges, we introduce VideoAgent, a novel multi-agent framework that synthesizes personalized scientific videos through a conversational interface. VideoAgent parses a source paper into a fine-grained asset library and, guided by user requirements, orchestrates a narrative flow that synthesizes both static slides and dynamic animations to explain complex concepts. To enable rigorous evaluation, we also propose SciVidEval, the first comprehensive suite for this task, which combines automated metrics for multimodal content quality and synchronization with a Video-Quiz-based human evaluation to measure knowledge transfer. Extensive experiments demonstrate that our method significantly outperforms existing commercial scientific video generation services and approaches human-level quality in scientific communication.
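The abstract describes a pipeline that parses a source paper into a fine-grained asset library and then orchestrates those assets into a narrative of static slides and dynamic animations. A minimal sketch of that flow is below; every class, function, and field name is an illustrative assumption, not the authors' actual API.

```python
from dataclasses import dataclass

# Toy sketch of the VideoAgent-style pipeline from the abstract.
# All names (Asset, Scene, parse_paper, orchestrate) are hypothetical.

@dataclass
class Asset:
    kind: str      # e.g. "figure", "equation", "paragraph"
    content: str

@dataclass
class Scene:
    narration: str
    visual: Asset
    dynamic: bool  # True -> animated explanation, False -> static slide

def parse_paper(paper_text: str) -> list[Asset]:
    """Split a source paper into a fine-grained asset library (toy version)."""
    return [Asset(kind="paragraph", content=p)
            for p in paper_text.split("\n\n") if p.strip()]

def orchestrate(assets: list[Asset], user_pref: str) -> list[Scene]:
    """Order assets into a narrative flow, honoring a user preference
    expressed through the conversational interface."""
    dynamic = "animation" in user_pref.lower()
    return [Scene(narration=f"Explaining: {a.content[:40]}",
                  visual=a, dynamic=dynamic)
            for a in assets]

paper = "VideoAgent overview.\n\nMethod details.\n\nExperiments."
scenes = orchestrate(parse_paper(paper), user_pref="prefer animations")
print(len(scenes), scenes[0].dynamic)  # -> 3 True
```

In the real system, orchestration would also synchronize narration audio with each visual, which this sketch omits.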
Similar Papers
SlideGen: Collaborative Multimodal Agents for Scientific Slide Generation
Artificial Intelligence
Creates presentation slides from research papers.
Paper2Video: Automatic Video Generation from Scientific Papers
CV and Pattern Recognition
Makes research videos automatically from papers.
Stealing Creator's Workflow: A Creator-Inspired Agentic Framework with Iterative Feedback Loop for Improved Scientific Short-form Generation
Computation and Language
Accurately generates short science videos from papers.