
ShotDirector: Directorially Controllable Multi-Shot Video Generation with Cinematographic Transitions

Published: December 11, 2025 | arXiv ID: 2512.10286v1

By: Xiaoxue Wu, Xinyuan Chen, Yaohui Wang and more

Potential Business Impact:

Makes generated multi-shot videos look more like films by controlling how one shot cuts to the next.

Business Areas:
Video Editing, Content and Publishing, Media and Entertainment, Video

Shot transitions play a pivotal role in multi-shot video generation, as they determine the overall narrative expression and the directorial design of visual storytelling. However, recent progress has primarily focused on low-level visual consistency across shots, neglecting how transitions are designed and how cinematographic language contributes to coherent narrative expression. This often leads to mere sequential shot changes without intentional film-editing patterns. To address this limitation, we propose ShotDirector, an efficient framework that integrates parameter-level camera control and hierarchical editing-pattern-aware prompting. Specifically, we adopt a camera control module that incorporates 6-DoF poses and intrinsic settings to enable precise camera information injection. In addition, a shot-aware mask mechanism is employed to introduce hierarchical prompts aware of professional editing patterns, allowing fine-grained control over shot content. Through this design, our framework effectively combines parameter-level conditions with high-level semantic guidance, achieving film-like controllable shot transitions. To facilitate training and evaluation, we construct ShotWeaver40K, a dataset that captures the priors of film-like editing patterns, and develop a set of evaluation metrics for controllable multi-shot video generation. Extensive experiments demonstrate the effectiveness of our framework.
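The abstract does not say how the shot-aware mask is constructed. As a rough illustration only, one plausible reading is a cross-attention mask in which every video token may attend to a shared global prompt, while shot-specific prompt tokens are visible only to the video tokens of their own shot. The function name `build_shot_aware_mask`, its parameters, and the token layout (global prompt first, then per-shot prompts in order) are all assumptions made for this sketch and are not taken from the paper.

```python
def build_shot_aware_mask(shot_lengths, global_len, shot_prompt_lens):
    """Hypothetical sketch of a shot-aware cross-attention mask.

    shot_lengths:      number of video tokens in each shot
    global_len:        number of tokens in the shared (global) prompt
    shot_prompt_lens:  number of tokens in each shot-level prompt

    Returns mask[video_token][text_token]; True means attention is allowed.
    Assumed text layout: [global prompt | shot 0 prompt | shot 1 prompt | ...].
    """
    if len(shot_lengths) != len(shot_prompt_lens):
        raise ValueError("need exactly one shot-level prompt per shot")

    total_text = global_len + sum(shot_prompt_lens)
    mask = []
    offset = global_len  # start of the current shot's prompt tokens

    for n_video, n_text in zip(shot_lengths, shot_prompt_lens):
        row = [False] * total_text
        # Every video token can see the global prompt.
        for j in range(global_len):
            row[j] = True
        # Only this shot's video tokens can see this shot's prompt.
        for j in range(offset, offset + n_text):
            row[j] = True
        # All video tokens within one shot share the same visibility row.
        mask.extend(list(row) for _ in range(n_video))
        offset += n_text

    return mask


if __name__ == "__main__":
    # Two shots: 2 and 1 video tokens; 1 global token; shot prompts of 2 and 1 tokens.
    for row in build_shot_aware_mask([2, 1], 1, [2, 1]):
        print(["x" if allowed else "." for allowed in row])
```

In a real diffusion transformer, a mask of this shape would typically be converted to a tensor and passed to the cross-attention layers. How ShotDirector actually combines it with the camera-control conditions is not described in the abstract.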

Page Count
16 pages

Category
Computer Science:
CV and Pattern Recognition