M3LLM: Model Context Protocol-aided Mixture of Vision Experts For Multimodal LLMs in Networks
By: Yongjie Zeng, Hongyang Du
Potential Business Impact:
Boosts AI vision using shared device experts
Current Multimodal Large Language Models (MLLMs) rely on centralized architectures and often suffer from poor alignment between the input task and their fixed visual encoding modules, which limits performance on diverse and dynamic visual tasks. With the increasing deployment of resource-efficient models on edge devices in wireless networks, a new opportunity emerges to dynamically use distributed vision experts for improved MLLM inference quality. To enable this, we propose M3LLM, where the Model Context Protocol (MCP) coordinates a mixture of vision experts to achieve distributed MLLMs. Specifically, MCP is an open protocol that structures the input task context into interpretable representations, enabling wireless network-aware coordination between the central model backbone and edge-hosted vision experts. Based on the MCP representation, M3LLM formulates vision expert routing as a joint optimization problem that balances task-expert semantic compatibility and channel performance. To solve the resulting gradient conflicts, we develop a dual-stream Soft Actor-Critic (SAC) algorithm with decoupled reward signals and introduce an Adaptive Stability Enhancement Module (ASEM) based on hierarchical Bayesian modeling to ensure effective routing. Experiments show that M3LLM improves task accuracy, reduces communication cost, and enhances expert routing adaptability under dynamic wireless network conditions.
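The routing objective described in the abstract, which trades off task-expert semantic compatibility against channel performance, can be illustrated with a minimal sketch. This is not the paper's method: M3LLM solves the problem with a dual-stream SAC algorithm and ASEM, whereas the sketch below substitutes a greedy weighted-sum scorer purely to show the two competing terms. The function names, the `weight` parameter, the expert dictionary fields, cosine similarity as the compatibility measure, and the Shannon-rate channel model are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def shannon_rate(bandwidth_hz, snr_linear):
    """Achievable rate in bit/s under the Shannon capacity formula (assumed channel model)."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def route(task_embedding, experts, weight=0.5):
    """Pick the expert with the best weighted sum of compatibility and channel quality.

    `weight` (hypothetical) is the share given to semantic compatibility;
    the remainder goes to the channel rate, normalized by the best rate available.
    """
    rates = {e["name"]: shannon_rate(e["bandwidth_hz"], e["snr"]) for e in experts}
    max_rate = max(rates.values())
    scores = {}
    for e in experts:
        compatibility = cosine(task_embedding, e["embedding"])
        channel = rates[e["name"]] / max_rate if max_rate > 0 else 0.0
        scores[e["name"]] = weight * compatibility + (1.0 - weight) * channel
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical edge-hosted experts: one matches the task well but has a weak link,
# the other matches poorly but has a strong link.
EXPERTS = [
    {"name": "ocr", "embedding": [1.0, 0.0], "bandwidth_hz": 1e6, "snr": 1.0},
    {"name": "depth", "embedding": [0.0, 1.0], "bandwidth_hz": 1e6, "snr": 15.0},
]

if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0):
        chosen, _ = route([1.0, 0.0], EXPERTS, weight=w)
        print(f"weight={w}: route to {chosen}")
```

With a single scalar weight, the two terms pull the decision in opposite directions as the weight changes. That tension is the kind the abstract addresses by decoupling the reward signals rather than collapsing them into one score.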