TrajFusionNet: Pedestrian Crossing Intention Prediction via Fusion of Sequential and Visual Trajectory Representations
By: François G. Landry, Moulay A. Akhloufi
Potential Business Impact:
Helps self-driving cars guess if people will cross.
With the introduction of vehicles with autonomous capabilities on public roads, predicting pedestrian crossing intention has emerged as an active area of research. The task is to determine whether pedestrians in the scene are likely to cross the road. In this work, we propose TrajFusionNet, a novel transformer-based model that uses predictions of future pedestrian trajectory and vehicle speed as priors for predicting crossing intention. TrajFusionNet comprises two branches: a Sequence Attention Module (SAM) and a Visual Attention Module (VAM). The SAM branch learns from a sequential representation of the observed and predicted pedestrian trajectory and vehicle speed. Complementing it, the VAM branch learns from a visual representation of the predicted pedestrian trajectory, obtained by overlaying predicted pedestrian bounding boxes onto scene images. By relying on a small number of lightweight modalities, TrajFusionNet achieves the lowest total inference time (including model runtime and data preprocessing) among current state-of-the-art approaches. In terms of performance, it achieves state-of-the-art results across the three most commonly used datasets for pedestrian crossing intention prediction.
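The two input representations described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_sequence` and `overlay_boxes`, the `(x1, y1, x2, y2)` pixel box format, and the array shapes are all assumptions made for illustration. The first function stacks observed and predicted time steps into one sequence, as a sequence branch might consume; the second draws predicted bounding boxes onto a scene image, as a visual branch might consume.

```python
import numpy as np


def build_sequence(obs_boxes, pred_boxes, obs_speed, pred_speed):
    """Stack observed and predicted steps into one (T_obs + T_pred, 5) array.

    Hypothetical layout: each row is (x1, y1, x2, y2, vehicle_speed).
    """
    boxes = np.concatenate([obs_boxes, pred_boxes], axis=0)            # (T, 4)
    speed = np.concatenate([obs_speed, pred_speed], axis=0)[:, None]   # (T, 1)
    return np.concatenate([boxes, speed], axis=1)                      # (T, 5)


def overlay_boxes(image, boxes, color=(255, 0, 0)):
    """Draw 1-pixel box outlines onto a copy of an H x W x 3 image.

    Boxes are assumed to be (x1, y1, x2, y2) in pixel coordinates.
    """
    out = image.copy()
    h, w = out.shape[:2]
    for x1, y1, x2, y2 in np.asarray(boxes).round().astype(int):
        # Clamp coordinates so boxes partly outside the frame still draw.
        x1, x2 = np.clip([x1, x2], 0, w - 1)
        y1, y2 = np.clip([y1, y2], 0, h - 1)
        out[y1, x1:x2 + 1] = color   # top edge
        out[y2, x1:x2 + 1] = color   # bottom edge
        out[y1:y2 + 1, x1] = color   # left edge
        out[y1:y2 + 1, x2] = color   # right edge
    return out


# Toy usage: 3 observed steps, 2 predicted steps, one 10 x 10 scene image.
seq = build_sequence(np.zeros((3, 4)), np.ones((2, 4)), np.zeros(3), np.ones(2))
scene = np.zeros((10, 10, 3), dtype=np.uint8)
scene_with_boxes = overlay_boxes(scene, [[2, 2, 5, 5]])
```

In this sketch the sequence keeps observed and predicted steps in a single time-ordered array, and the overlay leaves the original image untouched by drawing on a copy.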
Similar Papers
VIT-Ped: Visionary Intention Transformer for Pedestrian Behavior Analysis
CV and Pattern Recognition
Helps self-driving cars predict where people will walk.
Pedestrian Intention Prediction via Vision-Language Foundation Models
CV and Pattern Recognition
Helps self-driving cars guess when people will cross.
Intention Enhanced Diffusion Model for Multimodal Pedestrian Trajectory Prediction
CV and Pattern Recognition
Helps self-driving cars guess where people will walk.