AVERY: Adaptive VLM Split Computing through Embodied Self-Awareness for Efficient Disaster Response Systems
By: Rajat Bhattacharjya , Sing-Yao Wu , Hyunwoo Oh and more
Potential Business Impact:
Lets drones understand disaster scenes better.
Unmanned Aerial Vehicles (UAVs) in disaster response require complex, queryable intelligence that on-board CNNs cannot provide. While Vision-Language Models (VLMs) offer this semantic reasoning, their high resource demands make on-device deployment infeasible, and naive cloud offloading fails under the low-bandwidth networks common in disaster zones. We present AVERY, a framework that enables VLM deployment through adaptive split computing. We advance the split computing paradigm beyond traditional depth-wise partitioning by introducing a functional, cognitive-inspired dual-stream split that separates the VLM into a high-frequency, low-resolution "context stream" for real-time awareness and a low-frequency, high-fidelity "insight stream" for deep analysis. A lightweight, self-aware on-board controller manages this architecture, monitoring network conditions and operator intent to dynamically select from pre-trained compression models, navigating the fundamental accuracy-throughput trade-off. Evaluated using the VLM LISA-7B across an edge-cloud scenario under fluctuating network conditions, AVERY consistently outperforms static configurations, achieving 11.2% higher accuracy than raw image compression and 93.98% lower energy consumption compared to full-edge execution, thereby enhancing mission efficiency and enabling real-time, queryable intelligence on resource-constrained platforms in dynamic environments.
Similar Papers
Lite VLA: Efficient Vision-Language-Action Control on CPU-Bound Edge Robots
Robotics
Robots see and think without internet.
SpotVLM: Cloud-edge Collaborative Real-time VLM based on Context Transfer
CV and Pattern Recognition
Helps self-driving cars see and react faster.
Efficient Onboard Vision-Language Inference in UAV-Enabled Low-Altitude Economy Networks via LLM-Enhanced Optimization
Machine Learning (CS)
Drones understand what they see and send info fast.