AdaptViG: Adaptive Vision GNN with Exponential Decay Gating
By: Mustafa Munir, Md Mostafijur Rahman, Radu Marculescu
Potential Business Impact:
Makes computer vision faster and smarter.
Vision Graph Neural Networks (ViGs) offer a new direction for advancements in vision architectures. While powerful, ViGs often face substantial computational costs in their graph construction phase, which hinders their efficiency. To address this issue, we propose AdaptViG, an efficient and powerful hybrid Vision GNN that introduces a novel graph construction mechanism called Adaptive Graph Convolution. This mechanism combines a highly efficient static axial scaffold with a dynamic, content-aware gating strategy called Exponential Decay Gating, which selectively weighs long-range connections based on feature similarity. Furthermore, AdaptViG employs a hybrid strategy, using our efficient gating mechanism in the early stages and a full Global Attention block in the final stage for maximum feature aggregation. Our method achieves a new state-of-the-art trade-off between accuracy and efficiency among Vision GNNs. For instance, our AdaptViG-M achieves 82.6% top-1 accuracy, outperforming ViG-B by 0.3% while using 80% fewer parameters and 84% fewer GMACs. On downstream tasks, AdaptViG-M obtains 45.8 mIoU, 44.8 APbox, and 41.1 APmask, surpassing the much larger EfficientFormer-L7 by 0.7 mIoU, 2.2 APbox, and 2.1 APmask, respectively, with 78% fewer parameters.
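The abstract names the two ingredients of Adaptive Graph Convolution but does not give their formulas, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the gate is exp(-d / T), where d is the L1 distance between two node features and T is a temperature, and that the static axial scaffold connects each node to nodes in its own row and column at a fixed stride. The function names, the stride, the normalized weighted sum, and the residual connection are all hypothetical choices made for illustration.

```python
import math


def exp_decay_gate(x_i, x_j, temperature=1.0):
    """Assumed gate: 1.0 for identical features, decaying with L1 distance."""
    dist = sum(abs(a - b) for a, b in zip(x_i, x_j))
    return math.exp(-dist / temperature)


def axial_neighbors(r, c, height, width, stride=2):
    """Assumed static scaffold: same-row and same-column nodes at a fixed stride."""
    nbrs = []
    for cc in range(c % stride, width, stride):
        if cc != c:
            nbrs.append((r, cc))
    for rr in range(r % stride, height, stride):
        if rr != r:
            nbrs.append((rr, c))
    return nbrs


def gated_axial_aggregate(grid, temperature=1.0, stride=2):
    """Gate-weighted mean of axial neighbors, added back to each node (residual).

    `grid` is a height x width nested list of feature vectors (lists of floats).
    """
    height, width = len(grid), len(grid[0])
    dim = len(grid[0][0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            x_i = grid[r][c]
            agg = [0.0] * dim
            total = 0.0
            for rr, cc in axial_neighbors(r, c, height, width, stride):
                x_j = grid[rr][cc]
                g = exp_decay_gate(x_i, x_j, temperature)  # content-aware weight
                total += g
                for d in range(dim):
                    agg[d] += g * x_j[d]
            if total > 0.0:
                agg = [a / total for a in agg]
            row.append([x + a for x, a in zip(x_i, agg)])
        out.append(row)
    return out
```

In this sketch the neighbor set never depends on the features, so no nearest-neighbor search is needed; only the weights do. Dissimilar long-range neighbors receive weights close to zero, which is one way to read "selectively weighs long-range connections based on feature similarity".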
Similar Papers
Accelerating Dynamic Image Graph Construction on FPGA for Vision GNNs
Distributed, Parallel, and Cluster Computing
Makes computers see pictures much faster.
DVHGNN: Multi-Scale Dilated Vision HGNN for Efficient Vision Recognition
CV and Pattern Recognition
Helps computers see and understand images better.
Multi-Scale High-Resolution Logarithmic Grapher Module for Efficient Vision GNNs
CV and Pattern Recognition
Makes computer vision faster and more accurate.