Scaling Graph Convolutions for Mobile Vision
William Avery, Mustafa Munir, Radu Marculescu

TL;DR
This paper introduces MobileViGv2, a scalable vision graph neural network architecture that improves accuracy and efficiency on mobile devices across image classification, detection, and segmentation tasks.
Contribution
The paper proposes Mobile Graph Convolution (MGC) and MobileViGv2, a new architecture that enhances scaling and performance of vision GNNs on mobile hardware.
Findings
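MobileViGv2-Ti achieves 77.7% top-1 accuracy on ImageNet-1K, 2% higher than MobileViG-Ti, with 0.9 ms inference latency on the iPhone 13 Mini NPU.
MobileViGv2-B achieves 83.4% top-1 accuracy on ImageNet-1K with 2.7 ms latency.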
MobileViGv2 outperforms prior models on the downstream detection and segmentation benchmarks MS COCO and ADE20K.
Abstract
To compete with existing mobile architectures, MobileViG introduces Sparse Vision Graph Attention (SVGA), a fast token-mixing operator based on the principles of GNNs. However, MobileViG scales poorly with model size, falling at most 1% behind models with similar latency. This paper introduces Mobile Graph Convolution (MGC), a new vision graph neural network (ViG) module that solves this scaling problem. Our proposed mobile vision architecture, MobileViGv2, uses MGC to demonstrate the effectiveness of our approach. MGC improves on SVGA by increasing graph sparsity and introducing conditional positional encodings to the graph operation. Our smallest model, MobileViGv2-Ti, achieves a 77.7% top-1 accuracy on ImageNet-1K, 2% higher than MobileViG-Ti, with 0.9 ms inference latency on the iPhone 13 Mini NPU. Our largest model, MobileViGv2-B, achieves an 83.4% top-1 accuracy, 0.8% higher than…
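The abstract names two ingredients of MGC: a sparser graph and conditional positional encodings applied to the graph operation. The sketch below illustrates that general idea in NumPy and is not the authors' implementation. Every detail is an assumption made for illustration: the function names, the depthwise 3x3 convolution used as the positional encoding, the strided row/column neighbor pattern with wrap-around, and the max-relative aggregation. A larger `stride` connects each pixel to fewer neighbors, which is one simple way to make a graph sparser.

```python
import numpy as np


def depthwise_conv3x3(x, w):
    """Depthwise 3x3 convolution with zero padding and stride 1.

    x: feature map of shape (C, H, W); w: per-channel kernels of shape (C, 3, 3).
    """
    _, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += w[:, i, j][:, None, None] * xp[:, i:i + H, j:j + W]
    return out


def strided_max_relative(x, stride):
    """Max over (neighbor - self), floored at zero.

    Assumed neighbor pattern: pixels at multiples of `stride` along the same
    row and the same column, with wrap-around (np.roll).
    """
    _, H, W = x.shape
    agg = np.zeros_like(x)
    for s in range(stride, H, stride):
        agg = np.maximum(agg, np.roll(x, s, axis=1) - x)
    for s in range(stride, W, stride):
        agg = np.maximum(agg, np.roll(x, s, axis=2) - x)
    return agg


def graph_mixer_sketch(x, cpe_weight, stride=2):
    """Hypothetical token mixer: positional encoding, then sparse aggregation."""
    # Assumption: the conditional positional encoding is a residual depthwise conv.
    x = x + depthwise_conv3x3(x, cpe_weight)
    rel = strided_max_relative(x, stride)
    # Concatenate node features with aggregated neighbor features: (2C, H, W).
    return np.concatenate([x, rel], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((4, 8, 8))
    kernels = 0.1 * rng.standard_normal((4, 3, 3))
    print(graph_mixer_sketch(feats, kernels, stride=4).shape)
```

In a full model the concatenated output would typically pass through a learned projection back to `C` channels; that step is left out here to keep the sketch short.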
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Video Surveillance and Tracking Methods
Methods: Graph Neural Network · Convolution
