STaRFormer: Semi-Supervised Task-Informed Representation Learning via Dynamic Attention-Based Regional Masking for Sequential Data
By: Maximilian Forstenhäusler, Daniel Külzer, Christos Anagnostopoulos, and more
Potential Business Impact:
Helps smart cars guess where you'll go next.
Accurate predictions from sequential spatiotemporal data are crucial for many applications. Using real-world data, we aim to learn the intent of a smart device user within confined areas of a vehicle's surroundings. However, in real-world scenarios, environmental factors and sensor limitations produce non-stationary and irregularly sampled data, posing significant challenges. To address these issues, we developed a Transformer-based approach, STaRFormer, which serves as a universal framework for sequential modeling. STaRFormer employs a novel, dynamic attention-based regional masking scheme combined with semi-supervised contrastive learning to enhance task-specific latent representations. Comprehensive experiments on 15 datasets varying in type (including non-stationary and irregularly sampled data), domain, sequence length, number of training samples, and application demonstrate the efficacy and practicality of STaRFormer, with notable improvements over state-of-the-art approaches. Code and data will be made available.
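To make the abstract's two key ingredients concrete, below is a minimal PyTorch sketch of what "attention-based regional masking combined with contrastive learning" could look like. This is an illustration inferred from the abstract only, not the authors' implementation: the function names (`attention_based_region_mask`, `info_nce`), the fixed region width, the zero-fill masking, and the InfoNCE objective are all assumptions.

```python
import torch
import torch.nn.functional as F

def attention_based_region_mask(attn_weights, seq, region_frac=0.15):
    """Hypothetical reading of the paper's masking scheme: zero out the
    contiguous region of the sequence that receives the highest total
    attention, forcing the encoder to reconstruct salient context.

    attn_weights: (batch, seq_len)      per-timestep attention scores
    seq:          (batch, seq_len, dim) input sequence
    """
    B, T, _ = seq.shape
    w = max(1, int(T * region_frac))              # assumed fixed region width
    # Sliding-window sum of attention scores via a 1D convolution
    kernel = torch.ones(1, 1, w, device=seq.device)
    sums = F.conv1d(attn_weights.unsqueeze(1), kernel).squeeze(1)  # (B, T-w+1)
    start = sums.argmax(dim=1)                    # highest-attention window
    masked = seq.clone()
    for b in range(B):
        masked[b, start[b]:start[b] + w] = 0.0    # zero-fill the region
    return masked

def info_nce(z1, z2, tau=0.1):
    """Standard InfoNCE contrastive loss between two views of a batch;
    matching rows of z1 and z2 are treated as positive pairs."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau                    # (B, B) similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```

In a semi-supervised setup of this kind, one would encode both the original and the masked sequence, apply `info_nce` to pull the two latent views together on unlabeled samples, and add a supervised task loss on the labeled subset; the weighting between the two losses is left unspecified here.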
Similar Papers
STPFormer: A State-of-the-Art Pattern-Aware Spatio-Temporal Transformer for Traffic Forecasting
Artificial Intelligence
Predicts traffic jams before they happen.
Learning Streaming Video Representation via Multitask Training
CV and Pattern Recognition
Helps robots and cars understand moving pictures instantly.
STARCaster: Spatio-Temporal AutoRegressive Video Diffusion for Identity- and View-Aware Talking Portraits
CV and Pattern Recognition
Makes talking videos from a picture and voice.