Adaptive Token Merging for Efficient Transformer Semantic Communication at the Edge
By: Omar Erak, Omar Alhussein, Hatem Abou-Zeid, et al.
Potential Business Impact:
Makes smart computer programs run faster and cheaper.
Large-scale transformers are central to modern semantic communication, yet their high computational and communication costs hinder deployment on resource-constrained edge devices. This paper introduces a training-free framework for adaptive token merging, a novel mechanism that compresses transformer representations at runtime by selectively merging semantically redundant tokens under per-layer similarity thresholds. Unlike prior fixed-ratio reduction methods, our approach couples merging directly to input redundancy, enabling data-dependent adaptation that balances efficiency and task relevance without retraining. We cast the discovery of merging strategies as a multi-objective optimization problem and leverage Bayesian optimization to obtain Pareto-optimal trade-offs between accuracy, inference cost, and communication cost. On ImageNet classification, we match the accuracy of the unmodified transformer with 30% fewer floating-point operations (FLOPs) and under 20% of the original communication cost, while for visual question answering our method achieves performance competitive with the full LLaVA model at less than one-third of the compute and one-tenth of the bandwidth. Finally, we show that our adaptive merging is robust across varying channel conditions and provides inherent privacy benefits, substantially degrading the efficacy of model inversion attacks. Our framework provides a practical and versatile solution for deploying powerful transformer models in resource-limited edge intelligence scenarios.
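The abstract does not spell out the merging rule, so the following is a minimal sketch of what threshold-gated token merging could look like, assuming cosine similarity between token embeddings and a greedy one-pass pairing. The function name `merge_tokens`, the greedy strategy, and the size-weighted averaging are illustrative assumptions, not the authors' exact algorithm; in the paper, the per-layer thresholds would come from the Bayesian optimization stage rather than being hand-picked.

```python
import torch


def merge_tokens(tokens: torch.Tensor, threshold: float) -> torch.Tensor:
    """Greedily merge token pairs whose cosine similarity exceeds `threshold`.

    tokens: (N, D) token embeddings at one transformer layer.
    Returns an (M, D) tensor with M <= N; merged tokens are replaced
    by the size-weighted average of the tokens they absorbed.
    """
    n = tokens.size(0)
    normed = torch.nn.functional.normalize(tokens, dim=-1)
    sim = normed @ normed.T          # pairwise cosine similarity, (N, N)
    sim.fill_diagonal_(-1.0)         # exclude self-similarity

    merged_into = torch.full((n,), -1, dtype=torch.long)  # -1 = still a keeper
    counts = torch.ones(n)           # how many original tokens each keeper holds
    out = tokens.clone()

    # Single greedy pass: each keeper absorbs its most similar token,
    # provided that similarity clears the layer's threshold and the
    # candidate has not already been merged away.
    for i in range(n):
        if merged_into[i] >= 0:
            continue
        j = int(sim[i].argmax())
        if j != i and merged_into[j] < 0 and sim[i, j] > threshold:
            total = counts[i] + counts[j]
            out[i] = (out[i] * counts[i] + out[j] * counts[j]) / total
            counts[i] = total
            merged_into[j] = i

    return out[merged_into < 0]
```

A quick usage example: for ViT-style inputs of 197 tokens (CLS plus 196 patches), highly redundant images should yield a shorter sequence, which is what couples the compression rate to input redundancy rather than a fixed ratio.

```python
x = torch.randn(197, 768)                     # stand-in for one layer's tokens
compressed = merge_tokens(x, threshold=0.85)  # threshold is per-layer in the paper
print(compressed.shape)                       # (M, 768) with M <= 197
```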
Similar Papers
Adaptive Pareto-Optimal Token Merging for Edge Transformer Models in Semantic Communication
Machine Learning (CS)
Lets AI understand pictures faster on phones.
AdaTok: Adaptive Token Compression with Object-Aware Representations for Efficient Multimodal LLMs
CV and Pattern Recognition
Makes AI understand pictures using fewer computer steps.
Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration
CV and Pattern Recognition
Makes AI see faster with less work.