Wireless Agentic AI with Retrieval-Augmented Multimodal Semantic Perception
By: Guangyuan Liu, Yinqiu Liu, Ruichen Zhang, and more
Potential Business Impact:
Lets self-driving cars share information faster.
The rapid development of multimodal AI and Large Language Models (LLMs) has greatly enhanced real-time interaction, decision-making, and collaborative tasks. However, in wireless multi-agent scenarios, limited bandwidth poses significant challenges to exchanging semantically rich multimodal information efficiently. Traditional semantic communication methods, though effective, struggle with redundancy and loss of crucial details. To overcome these challenges, we propose a Retrieval-Augmented Multimodal Semantic Communication (RAMSemCom) framework. RAMSemCom incorporates iterative, retrieval-driven semantic refinement tailored for distributed multi-agent environments, enabling efficient exchange of critical multimodal elements through local caching and selective transmission. Our approach dynamically optimizes retrieval using deep reinforcement learning (DRL) to balance semantic fidelity with bandwidth constraints. A comprehensive case study on multi-agent autonomous driving demonstrates that our DRL-based retrieval strategy significantly improves task completion efficiency and reduces communication overhead compared to baseline methods.
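To make the retrieval-scheduling idea concrete, below is a minimal sketch of a DRL-style policy that decides, per semantic element, whether to transmit it or rely on the receiver's local cache, trading fidelity gain against bandwidth cost. This is an illustration only: the `SemanticElement` fields, the `bw_weight` trade-off parameter, and the single-step (bandit-style) Q-learning update are assumptions for the sketch, not the paper's actual formulation.

```python
import random
from dataclasses import dataclass

@dataclass
class SemanticElement:
    """One multimodal semantic item an agent may need (illustrative)."""
    size: float      # bandwidth cost to transmit, in arbitrary units
    fidelity: float  # task utility gained if the receiver obtains it
    cached: bool     # already present in the receiver's local cache

def reward(element: SemanticElement, transmit: bool, bw_weight: float = 0.5) -> float:
    """Fidelity gain minus weighted bandwidth cost; cached items are free."""
    if element.cached:
        return element.fidelity            # served locally, no transmission cost
    if transmit:
        return element.fidelity - bw_weight * element.size
    return 0.0                             # skipped: no gain, no cost

def epsilon_greedy(q, state, epsilon=0.1):
    """Explore with probability epsilon; otherwise pick the higher-Q action."""
    if random.random() < epsilon:
        return random.choice([0, 1])
    return max((0, 1), key=lambda a: q.get((state, a), 0.0))

def train(episodes=2000, alpha=0.1, epsilon=0.1):
    """Tabular Q-learning over a coarse (cached, size-bucket) state space."""
    q = {}
    for _ in range(episodes):
        elem = SemanticElement(size=random.uniform(0.1, 2.0),
                               fidelity=random.uniform(0.0, 1.0),
                               cached=random.random() < 0.3)
        state = (elem.cached, round(elem.size))   # coarse state discretization
        action = epsilon_greedy(q, state, epsilon)
        r = reward(elem, transmit=bool(action))
        key = (state, action)
        # Single-step (bandit-style) update, a simplification of full DRL
        q[key] = q.get(key, 0.0) + alpha * (r - q.get(key, 0.0))
    return q

if __name__ == "__main__":
    q_table = train()
    for (state, action), value in sorted(q_table.items()):
        print(f"state={state} action={'transmit' if action else 'skip'} Q={value:.3f}")
```

After training, the learned policy skips transmission for large, low-fidelity elements and for anything already cached, which is the intuition behind balancing semantic fidelity against bandwidth constraints; the paper's full method presumably operates over richer multimodal states and a sequential decision process.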
Similar Papers
Context-Aware Semantic Communication for the Wireless Networks
Networking and Internet Architecture
Lets phones send data faster and more efficiently.
Retrieval Augmented Generation with Multi-Modal LLM Framework for Wireless Environments
Networking and Internet Architecture
Makes wireless internet faster and more reliable.
Multimodal LLM Integrated Semantic Communications for 6G Immersive Experiences
Machine Learning (CS)
Makes future internet understand what you want.