STARE: Predicting Decision Making Based on Spatio-Temporal Eye Movements
By: Moshe Unger, Alexander Tuzhilin, Michel Wedel
Potential Business Impact:
Predicts what you'll buy by watching your eyes.
The present work proposes a Deep Learning architecture for predicting various consumer choice behaviors from time series of raw gaze or eye fixations on images of the decision environment, a domain for which no foundation models are currently available. The architecture, called STARE (Spatio-Temporal Attention Representation for Eye Tracking), uses a new tokenization strategy that maps the x- and y-pixel coordinates of eye-movement time series onto predefined, contiguous Regions of Interest. This tokenization makes the spatio-temporal eye-movement data available to Chronos, a time-series foundation model based on the T5 architecture, to which co-attention and/or cross-attention is added to capture directional and/or interocular influences of eye movements. We compare STARE with several state-of-the-art alternatives on multiple datasets with the purpose of predicting consumer choice behaviors from eye movements. We thus make a first step towards developing and testing DL architectures that represent visual attention dynamics rooted in the neurophysiology of eye movements.
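The abstract does not specify how the Regions of Interest are laid out, so as a minimal sketch, the tokenization idea can be illustrated with a uniform grid: each (x, y) gaze sample is assigned the integer id of the grid cell it falls in, turning a continuous eye-movement trace into a discrete token sequence suitable for a time-series model such as Chronos. The function name, grid size, and image dimensions below are illustrative assumptions, not the paper's actual ROI scheme.

```python
import numpy as np

def tokenize_gaze(xs, ys, img_w, img_h, n_cols, n_rows):
    """Illustrative ROI tokenization: map raw (x, y) gaze coordinates
    onto a uniform n_rows x n_cols grid of Regions of Interest and
    return one integer token id per sample (row-major cell index).
    The actual ROI definitions in STARE may differ."""
    # Bin each coordinate into a grid column/row, clipping samples
    # that fall on or beyond the image border into the last cell.
    cols = np.clip((np.asarray(xs) / img_w * n_cols).astype(int), 0, n_cols - 1)
    rows = np.clip((np.asarray(ys) / img_h * n_rows).astype(int), 0, n_rows - 1)
    return rows * n_cols + cols

# Two gaze samples on a 640x480 image with an 8x6 ROI grid:
tokens = tokenize_gaze([10, 500], [20, 300], 640, 480, 8, 6)
# tokens -> [0, 30]: top-left cell, then row 3 / column 6.
```

The resulting token sequence preserves the temporal order of fixations while discretizing their spatial location, which is what lets a text-style sequence model consume eye-tracking data.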
Similar Papers
Learning Spatio-Temporal Feature Representations for Video-Based Gaze Estimation
CV and Pattern Recognition
Tracks where people look in videos better.
CapStARE: Capsule-based Spatiotemporal Architecture for Robust and Efficient Gaze Estimation
CV and Pattern Recognition
Helps computers know where you're looking.
A deep learning approach to track eye movements based on events
CV and Pattern Recognition
Tracks eyes cheaply for better VR/AR games.