AVERY: Intent-Driven Adaptive VLM Split Computing via Embodied Self-Awareness for Efficient Disaster Response Systems
Rajat Bhattacharjya, Sing-Yao Wu, Hyunwoo Oh, Chaewon Nam, Suyeon Koo, Mohsen Imani, Elaheh Bozorgzadeh, Nikil Dutt

TL;DR
AVERY is an intent-driven adaptive split computing framework that enables efficient, real-time vision-language model deployment on resource-constrained UAVs during disaster response, balancing accuracy and resource use.
Contribution
It introduces a dual-stream, hierarchical split computing approach with onboard self-awareness to adapt to network conditions and operator intent in disaster scenarios.
Findings
Achieves 11.2% higher accuracy than raw image compression.
Reduces energy consumption by 93.98% compared to full-edge execution.
Maintains accuracy within 0.75% of a static high-accuracy baseline during dynamic adaptation.
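To put the energy finding in more intuitive terms, a percentage reduction can be converted into a "times less" factor. The sketch below is plain arithmetic on the 93.98% figure reported above. The helper name `reduction_to_factor` is hypothetical and is not taken from the paper.

```python
def reduction_to_factor(reduction_percent: float) -> float:
    """Convert a percentage reduction into a 'times less' factor.

    Hypothetical helper for illustration; not part of AVERY.
    """
    remaining = 1.0 - reduction_percent / 100.0  # fraction of baseline still used
    if remaining <= 0.0:
        raise ValueError("reduction must be below 100%")
    return 1.0 / remaining


# Energy reduction reported relative to full-edge execution.
factor = reduction_to_factor(93.98)
print(f"remaining share: {100.0 - 93.98:.2f}% of full-edge energy")
print(f"roughly {factor:.1f}x less energy than full-edge execution")
```

Under this reading, the split configuration uses about 6% of the energy of full-edge execution, or roughly 16.6 times less.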
Abstract
Unmanned Aerial Vehicles (UAVs) in disaster response require complex, queryable intelligence that onboard CNNs cannot provide. While Vision-Language Models (VLMs) offer this semantic reasoning, their high resource demands make on-device deployment infeasible, and naive cloud offloading fails under the low-bandwidth, unstable networks endemic to disaster zones. We present AVERY, an intent-driven adaptive split computing framework for efficient VLM deployment on resource-constrained platforms. AVERY is motivated by the observation that operator intent must be treated as a first-class system objective, since missions such as broad situational monitoring and precise, spatially grounded investigation require different semantic products, latency targets, and resource allocations. To reflect this, AVERY advances split computing beyond traditional depth-wise partitioning through a functional,…
