Enhancing Automatic Modulation Recognition With a Reconstruction-Driven Vision Transformer Under Limited Labels

Published: August 27, 2025 | arXiv ID: 2508.20193v2

By: Hossein Ahmadi, Banafsheh Saffari, Sajjad Emdadi Mahdimahalleh, and more

Potential Business Impact:

Enables radios to recognize signal modulation types accurately with far less labeled training data.

Business Areas:
Image Recognition, Data and Analytics, Software

Automatic modulation recognition (AMR) is critical for cognitive radio, spectrum monitoring, and secure wireless communication. However, existing solutions often rely on large labeled datasets or multi-stage training pipelines, which limit scalability and generalization in practice. We propose a unified Vision Transformer (ViT) framework that integrates supervised, self-supervised, and reconstruction objectives. The model combines a ViT encoder, a lightweight convolutional decoder, and a linear classifier; the reconstruction branch maps augmented signals back to their originals, anchoring the encoder to fine-grained I/Q structure. This strategy promotes robust, discriminative feature learning during pretraining, while partial label supervision in fine-tuning enables effective classification with limited labels. On the RML2018.01A dataset, our approach outperforms supervised CNN and ViT baselines in low-label regimes, approaches ResNet-level accuracy with only 15-20% labeled data, and maintains strong performance across varying SNR levels. Overall, the framework provides a simple, generalizable, and label-efficient solution for AMR.
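The architecture described above (ViT encoder, lightweight convolutional decoder, linear classifier, with a reconstruction branch mapping augmented I/Q signals back to their originals) can be sketched as follows. This is a minimal illustrative PyTorch implementation, not the authors' code; all layer sizes, the patching scheme, and the class name `AMRViT` are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

class AMRViT(nn.Module):
    """Hypothetical sketch of the paper's design: a ViT-style encoder over
    I/Q patches, a lightweight Conv1d decoder that reconstructs the clean
    signal from an augmented input, and a linear classification head."""

    def __init__(self, seq_len=1024, patch=16, dim=64, n_classes=24):
        super().__init__()
        self.patch = patch
        n_patches = seq_len // patch
        # Each patch covers `patch` complex samples (2 channels: I and Q).
        self.embed = nn.Linear(2 * patch, dim)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Lightweight convolutional decoder: tokens -> reconstructed I/Q signal.
        self.decoder = nn.Sequential(
            nn.Conv1d(dim, dim, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv1d(dim, 2 * patch, kernel_size=1),
        )
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, x):
        # x: (B, 2, seq_len) augmented I/Q signal.
        B, C, L = x.shape
        patches = x.reshape(B, C, L // self.patch, self.patch)
        patches = patches.permute(0, 2, 1, 3).reshape(B, -1, C * self.patch)
        z = self.encoder(self.embed(patches) + self.pos)   # (B, n_patches, dim)
        recon = self.decoder(z.transpose(1, 2))            # (B, 2*patch, n_patches)
        recon = recon.transpose(1, 2).reshape(B, -1, C, self.patch)
        recon = recon.permute(0, 2, 1, 3).reshape(B, C, L)
        logits = self.classifier(z.mean(dim=1))            # mean-pooled tokens
        return logits, recon
```

During pretraining, the reconstruction branch would be trained with an MSE loss between `recon` and the original (unaugmented) signal; fine-tuning would add a cross-entropy loss on `logits` for the labeled subset, e.g. `loss = ce(logits, y) + lam * mse(recon, clean)`, with `lam` a weighting assumption.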

Country of Origin
🇺🇸 United States

Page Count
24 pages

Category
Computer Science:
Computer Vision and Pattern Recognition