Data Augmentation as Free Lunch: Exploring the Test-Time Augmentation for Sequential Recommendation
By: Yizhou Dang, Yuting Liu, Enneng Yang, and more
Potential Business Impact:
Makes movie suggestions better without retraining.
Data augmentation has become a promising way to mitigate data sparsity in sequential recommendation. Existing methods generate new yet effective data during model training to improve performance. However, deploying them requires retraining, architecture modification, or additional learnable parameters; these steps are time-consuming and costly for well-trained models, especially as the model scale grows. In this work, we explore test-time augmentation (TTA) for sequential recommendation, which augments the inputs during model inference and then aggregates the model's predictions on the augmented data to improve final accuracy. This avoids the significant time and cost overhead of loss calculation and backward propagation. We first experimentally assess the potential of existing augmentation operators for TTA and find that Mask and Substitute consistently achieve better performance. Further analysis reveals that these two operators are effective because they retain the original sequential pattern while adding appropriate perturbations. However, they still suffer from time-consuming item selection or interference from mask tokens. Based on this analysis and these limitations, we present TNoise and TMask. The former injects uniform noise into the original representation, avoiding the computational overhead of item selection. The latter blocks mask tokens from participating in model calculations, or directly removes the interactions that would have been replaced with mask tokens. Comprehensive experiments demonstrate the effectiveness, efficiency, and generalizability of our method. We provide an anonymous implementation at https://github.com/KingGugu/TTA4SR.
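The core TTA loop described above can be sketched in a few lines. The snippet below is a minimal, illustrative toy, not the authors' implementation: the scorer `score`, the noise scale `eps`, the embedding shapes, and the number of views are all assumptions made for demonstration. It shows the TNoise-style idea of perturbing the input representation with uniform noise at inference time and averaging the resulting predictions, with no loss computation or backpropagation involved.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(seq_emb, item_embs):
    """Hypothetical stand-in for a trained sequential recommender:
    mean-pool the sequence embedding and rank items by dot product."""
    user_vec = seq_emb.mean(axis=0)
    return item_embs @ user_vec

def tta_tnoise(seq_emb, item_embs, n_views=4, eps=0.01):
    """TNoise-style test-time augmentation (illustrative): perturb the
    input representation with small uniform noise, score each augmented
    view, and average the predictions over all views."""
    preds = [score(seq_emb, item_embs)]  # prediction for the original input
    for _ in range(n_views):
        noise = rng.uniform(-eps, eps, size=seq_emb.shape)
        preds.append(score(seq_emb + noise, item_embs))
    return np.mean(preds, axis=0)

# Toy data: a length-5 interaction sequence and a 100-item catalog,
# both with 16-dimensional embeddings (values are illustrative only).
seq_emb = rng.normal(size=(5, 16))
item_embs = rng.normal(size=(100, 16))
scores = tta_tnoise(seq_emb, item_embs)
top_k = np.argsort(scores)[::-1][:10]  # recommend the 10 highest-scoring items
```

Because each augmented view only requires a forward pass through the (frozen) model, this kind of aggregation adds modest inference cost and no training cost at all.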
Similar Papers
Learning from Random Subspace Exploration: Generalized Test-Time Augmentation with Self-supervised Distillation
CV and Pattern Recognition
Makes computer models smarter without retraining them.
Test-time augmentation improves efficiency in conformal prediction
Machine Learning (CS)
Makes computer guesses more accurate with smaller prediction sets.
Self-Bootstrapping for Versatile Test-Time Adaptation
CV and Pattern Recognition
Makes computer vision work better on new images.