From Static to Dynamic: Evaluating the Perceptual Impact of Dynamic Elements in Urban Scenes Using Generative Inpainting
By: Zhiwei Wei, Mengzi Zhang, Boyan Lu, and more
Potential Business Impact:
Shows that city scenes look less lively once people and vehicles are removed.
Understanding urban perception from street view imagery has become a central topic in urban analytics and human-centered urban design. However, most existing studies treat urban scenes as static and largely ignore the role of dynamic elements such as pedestrians and vehicles, raising concerns about potential bias in perception-based urban analysis. To address this issue, we propose a controlled framework that isolates the perceptual effects of dynamic elements by constructing paired street view images with and without pedestrians and vehicles using semantic segmentation and MLLM-guided generative inpainting. Based on 720 paired images from Dongguan, China, a perception experiment was conducted in which participants evaluated original and edited scenes across six perceptual dimensions. The results indicate that removing dynamic elements leads to a consistent 30.97% decrease in perceived vibrancy, whereas changes in other dimensions are more moderate and heterogeneous. To further explore the underlying mechanisms, we trained 11 machine learning models using multimodal visual features and identified that lighting conditions, human presence, and depth variation were key factors driving perceptual change. At the individual level, 65% of participants exhibited significant vibrancy changes, compared with 35-50% for other dimensions; gender further showed a marginal moderating effect on safety perception. Beyond controlled experiments, the trained model was extended to a city-scale dataset to predict vibrancy changes after the removal of dynamic elements. The city-level results reveal that such perceptual changes are widespread and spatially structured, affecting 73.7% of locations and 32.1% of images, suggesting that urban perception assessments based solely on static imagery may substantially underestimate urban liveliness.
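To make the paired-image construction step concrete, below is a minimal Python sketch of the general idea: take a street view image and a per-pixel semantic label map, mask out the "dynamic" classes (pedestrians and vehicles), and fill the masked region. The paper itself uses MLLM-guided generative inpainting; here OpenCV's classical Telea inpainting stands in for that step, and the Cityscapes-style train IDs, the function name remove_dynamic_elements, and the dilation setting are illustrative assumptions rather than the authors' implementation.

```python
# Sketch: build paired "with/without dynamic elements" street view images.
# Assumes a segmentation model has already produced an integer label map.
import cv2
import numpy as np

# Assumed set of dynamic classes (Cityscapes-style train IDs:
# person, rider, car, truck, bus, train, motorcycle, bicycle).
DYNAMIC_CLASS_IDS = [11, 12, 13, 14, 15, 16, 17, 18]

def remove_dynamic_elements(image_bgr: np.ndarray,
                            label_map: np.ndarray,
                            dilate_px: int = 7) -> np.ndarray:
    """Return a copy of the image with pedestrian/vehicle pixels filled in.

    image_bgr : HxWx3 uint8 image (e.g. from cv2.imread).
    label_map : HxW integer per-pixel class IDs from any segmentation model.
    dilate_px : mask dilation to also cover object fringes and soft shadows.
    """
    # Binary mask (0/255) of pixels belonging to dynamic classes.
    mask = np.isin(label_map, DYNAMIC_CLASS_IDS).astype(np.uint8) * 255
    # Grow the mask slightly so the fill covers object boundaries.
    kernel = np.ones((dilate_px, dilate_px), np.uint8)
    mask = cv2.dilate(mask, kernel)
    # Classical inpainting as a stand-in for the generative inpainting model.
    return cv2.inpaint(image_bgr, mask, 5, cv2.INPAINT_TELEA)

if __name__ == "__main__":
    # Toy usage with synthetic data; a real pipeline would load a street view
    # image and its segmentation output for each of the paired locations.
    img = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
    labels = np.zeros((256, 256), dtype=np.int32)
    labels[100:180, 60:120] = 13  # pretend a car occupies this region
    edited = remove_dynamic_elements(img, labels)
    print(edited.shape)
```

In the study's framework, the original and edited images form the matched pairs that participants rate across the six perceptual dimensions, so any rating difference can be attributed to the removed dynamic elements rather than to scene layout.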
Similar Papers
Global urban visual perception varies across demographics and personalities
CV and Pattern Recognition
Shows how different people see places differently.
Human vs. AI Safety Perception? Decoding Human Safety Perception with Eye-Tracking Systems, Street View Images, and Explainable AI
Human-Computer Interaction
Shows what makes city places feel safe.
From Pixels to Predicates: Structuring Urban Perception with Scene Graphs
CV and Pattern Recognition
Makes computers understand how people feel about places.