Score: 1

Human vs. AI Safety Perception? Decoding Human Safety Perception with Eye-Tracking Systems, Street View Images, and Explainable AI

Published: September 29, 2025 | arXiv ID: 2509.25457v1

By: Yuhao Kang , Junda Chen , Liu Liu and more

BigTech Affiliations: Massachusetts Institute of Technology

Potential Business Impact:

Shows what makes city places feel safe.

Business Areas:
Visual Search Internet Services

The way residents perceive safety plays an important role in how they use public spaces. Studies have combined large-scale street view images and advanced computer vision techniques to measure the perception of safety of urban environments. Despite their success, such studies have often overlooked the specific environmental visual factors that draw human attention and trigger people's feelings of safety perceptions. In this study, we introduce a computational framework that enriches the existing body of literature on place perception by using eye-tracking systems with street view images and deep learning approaches. Eye-tracking systems quantify not only what users are looking at but also how long they engage with specific environmental elements. This allows us to explore the nuance of which visual environmental factors influence human safety perceptions. We conducted our research in Helsingborg, Sweden, where we recruited volunteers outfitted with eye-tracking systems. They were asked to indicate which of the two street view images appeared safer. By examining participants' focus on specific features using Mean Object Ratio in Highlighted Regions (MoRH) and Mean Object Hue (MoH), we identified key visual elements that attract human attention when perceiving safe environments. For instance, certain urban infrastructure and public space features draw more human attention while the sky is less relevant in influencing safety perceptions. These insights offer a more human-centered understanding of which urban features influence human safety perceptions. Furthermore, we compared the real human attention from eye-tracking systems with attention maps obtained from eXplainable Artificial Intelligence (XAI) results. Several XAI models were tested, and we observed that XGradCAM and EigenCAM most closely align with human safety perceptual patterns.

Country of Origin
🇺🇸 United States

Page Count
28 pages

Category
Computer Science:
Human-Computer Interaction