Self-Guided Masked Autoencoder
By: Jeongwoo Shin, Inseo Lee, Junho Lee, and more
Potential Business Impact:
Teaches computers to see patterns faster.
Masked Autoencoder (MAE) is a self-supervised approach for representation learning, widely applicable to a variety of downstream tasks in computer vision. Despite its success, exactly what and how MAE learns remains not fully understood. In this paper, through an in-depth analysis, we discover that MAE intrinsically learns pattern-based patch-level clustering from surprisingly early stages of pretraining. Building on this understanding, we propose the self-guided masked autoencoder, which internally generates an informed mask by utilizing its own progress in patch clustering, substituting for the naive random masking of the vanilla MAE. Our approach significantly boosts the learning process without relying on any external models or supplementary information, keeping the self-supervised nature of MAE intact. Comprehensive experiments on various downstream tasks verify the effectiveness of the proposed method.
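The core idea of replacing random masking with an informed mask derived from patch clustering can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes patch features stand in for the encoder's intermediate patch embeddings, uses a plain k-means clustering step, and masks whole clusters (largest first) up to the masking budget so that entire visual patterns must be reconstructed from the remaining patches.

```python
import numpy as np

def kmeans(features, k, iters=10, seed=0):
    # Simple k-means over patch features (assumption: these stand in
    # for the encoder's patch embeddings during pretraining).
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            pts = features[labels == c]
            if len(pts):
                centers[c] = pts.mean(axis=0)
    return labels

def informed_mask(features, mask_ratio=0.75, k=4, seed=0):
    # Build an informed mask: instead of hiding patches uniformly at
    # random, hide whole pattern clusters (largest first) until the
    # masking budget is spent, forcing reconstruction of full patterns.
    n = len(features)
    labels = kmeans(features, k, seed=seed)
    budget = int(n * mask_ratio)
    mask = np.zeros(n, dtype=bool)
    sizes = sorted(((np.sum(labels == c), c) for c in range(k)), reverse=True)
    for _, c in sizes:
        idx = np.flatnonzero(labels == c)
        take = idx[: budget - int(mask.sum())]  # respect the budget
        mask[take] = True
        if mask.sum() >= budget:
            break
    return mask

# Example: 196 patches (a 14x14 grid) with 32-dim features.
feats = np.random.default_rng(0).normal(size=(196, 32))
mask = informed_mask(feats, mask_ratio=0.75)
```

In the paper itself, the clustering signal emerges from the model's own training progress rather than an external k-means pass; the sketch only conveys the cluster-then-mask principle.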
Similar Papers
Gaussian Masked Autoencoders
CV and Pattern Recognition
Teaches computers to understand pictures' depth and layers.
LV-MAE: Learning Long Video Representations through Masked-Embedding Autoencoders
CV and Pattern Recognition
Helps computers understand long videos better.
Self Pre-training with Adaptive Mask Autoencoders for Variable-Contrast 3D Medical Imaging
Image and Video Processing
Helps doctors find strokes on brain scans better.