SWAN: World-Aware Adaptive Multimodal Networks for Runtime Variations
Jason Wu, Shir-Kang Scott Jin, Yuyang Yuan, Maggie Wigness, Lance M. Kaplan, Hang Qiu, Mani Srivastava

TL;DR
SWAN is an adaptive multimodal neural network for autonomous driving that allocates compute at runtime according to input quality and complexity, staying within a user-specified budget while preserving detection performance.
Contribution
SWAN introduces an adaptive framework that jointly accounts for resource allocation, input complexity, and semantic relevance in multimodal networks at runtime.
Findings
Reduced FLOPs by up to 49% in autonomous driving detection tasks.
Achieved minimal performance degradation while optimizing computational efficiency.
Demonstrated effectiveness in real-world, complex multi-object detection scenarios.
Abstract
Multimodal deep neural networks deployed in realistic environments must contend with runtime variations: changes in modality quality, overall input complexity, and available platform resources. Current networks struggle with such fluctuations -- adaptive networks cannot adhere to a strict compute budget, controller-based networks neglect to consider input complexity, and statically provisioned networks fail at all the above. Consequently, they do not extract maximum utility from the expended computational resources. We present SWAN (Sample and World-Aware Multimodal Network), the first adaptive multimodal network that accomplishes all three goals. SWAN employs a quality-aware controller to assign resources among modalities according to a variable user-specified maximum budget. Within this budget, an adaptive gating module further optimizes efficiency by scaling layer utilization…
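The two mechanisms named in the abstract, a quality-aware controller that divides a budget among modalities and a gating step that scales layer use within that budget, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the greedy proportional rule, the per-modality cap, the gate threshold, and the example quality scores are all assumptions made here for illustration, and layers stand in for whatever unit of compute the real controller budgets.

```python
from typing import Dict, List


def allocate_layers(quality: Dict[str, float], budget: int,
                    max_layers: int) -> Dict[str, int]:
    """Hypothetical controller: hand out `budget` layers one at a time.

    Each layer goes to the modality with the highest quality score per
    layer already held (a greedy proportional rule), so higher-quality
    modalities receive more layers. No modality exceeds `max_layers`.
    """
    alloc = {m: 0 for m in quality}
    for _ in range(budget):
        open_mods = [m for m in quality if alloc[m] < max_layers]
        if not open_mods:  # every modality is at its cap
            break
        best = max(open_mods, key=lambda m: quality[m] / (alloc[m] + 1))
        alloc[best] += 1
    return alloc


def gated_layers(gate_scores: List[float], allotted: int,
                 threshold: float = 0.5) -> List[int]:
    """Hypothetical gate: pick which layers actually run for one sample.

    A layer runs only if its gate score reaches `threshold`; if more
    gates are open than the controller allotted, the highest-scoring
    ones are kept. Easy inputs can therefore use fewer layers than the
    budget permits, but never more.
    """
    open_idx = [i for i, g in enumerate(gate_scores) if g >= threshold]
    open_idx.sort(key=lambda i: gate_scores[i], reverse=True)
    return sorted(open_idx[:allotted])


if __name__ == "__main__":
    # Assumed quality scores: a clear camera image and a degraded LiDAR scan.
    alloc = allocate_layers({"camera": 0.9, "lidar": 0.25},
                            budget=8, max_layers=12)
    print(alloc)
    print(gated_layers([0.9, 0.2, 0.7, 0.6], allotted=2))
```

In this sketch the controller enforces the hard ceiling and the gate only ever reduces work below it, which mirrors the abstract's description of gating operating "within this budget".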
