Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks
Jierun Chen, Shiu-hong Kao, Hao He, Weipeng Zhuo, Song Wen, Chul-Ho Lee, S.-H. Gary Chan

TL;DR
This paper introduces FasterNet, a family of neural networks that sustains higher FLOPS (floating-point operations per second) and therefore runs faster. It is built on a novel partial convolution (PConv) that cuts redundant computation and memory access, giving faster inference without compromising accuracy.
Contribution
We propose PConv, a new partial convolution operator, and FasterNet, a neural network architecture that significantly improves inference speed while maintaining accuracy across vision tasks.
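To make the claimed savings concrete, here is a rough cost model rather than the authors' implementation. It assumes that PConv applies a regular k x k convolution to only a fraction of the input channels and passes the rest through untouched. It also assumes a partial ratio of 1/4 and a memory-access estimate that counts only feature-map reads and writes. Those assumptions come from the full paper, not from this summary. The function names and the 56 x 56 x 64 feature map are hypothetical.

```python
def conv_flops(h, w, k, c_in, c_out):
    """Multiply-accumulate count of a k x k convolution (stride 1, same padding)."""
    return h * w * k * k * c_in * c_out


def regular_cost(h, w, k, c):
    """Cost of a regular convolution over all c channels."""
    flops = conv_flops(h, w, k, c, c)
    mem = h * w * 2 * c  # read c channels, write c channels (weights ignored)
    return flops, mem


def pconv_cost(h, w, k, c, ratio=0.25):
    """Cost of a partial convolution that touches only c_p = ratio * c channels.

    Assumption: the remaining channels are left as they are, so they add
    no computation and no extra feature-map traffic.
    """
    c_p = int(c * ratio)
    flops = conv_flops(h, w, k, c_p, c_p)
    mem = h * w * 2 * c_p
    return flops, mem


# Hypothetical feature map: 56 x 56 spatial size, 64 channels, 3 x 3 kernel.
h, w, k, c = 56, 56, 3, 64
r_flops, r_mem = regular_cost(h, w, k, c)
p_flops, p_mem = pconv_cost(h, w, k, c)

flops_ratio = p_flops / r_flops  # (1/4)^2 of the regular convolution
mem_ratio = p_mem / r_mem        # 1/4 of the regular convolution
print(f"FLOPs ratio: {flops_ratio}, memory-access ratio: {mem_ratio}")
```

Under these assumptions the operation count falls with the square of the channel ratio, while feature-map memory access falls linearly. This is the sense in which the operator reduces both at once.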
Findings
FasterNet-T0 is 2.8x, 3.3x, and 2.4x faster than MobileViT-XXS on GPU, CPU, and ARM processors, respectively.
FasterNet-L achieves 83.5% top-1 accuracy, on par with Swin-B, with 36% higher inference throughput on GPU.
FasterNet-L also saves 37% compute time on CPU compared with Swin-B.
Abstract
To design fast neural networks, many works have been focusing on reducing the number of floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does not necessarily lead to a similar level of reduction in latency. This mainly stems from inefficiently low floating-point operations per second (FLOPS). To achieve faster networks, we revisit popular operators and demonstrate that such low FLOPS is mainly due to frequent memory access of the operators, especially the depthwise convolution. We hence propose a novel partial convolution (PConv) that extracts spatial features more efficiently, by cutting down redundant computation and memory access simultaneously. Building upon our PConv, we further propose FasterNet, a new family of neural networks, which attains substantially higher running speed than others on a wide range of devices, without compromising on…
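The distinction between FLOPs (an operation count) and FLOPS (a rate) can be illustrated with simple arithmetic, since latency is roughly the first divided by the second. The sketch below uses made-up numbers, not measurements from the paper, and the function name is hypothetical.

```python
def latency_seconds(flops, flops_per_second):
    """Approximate latency as total operations divided by achieved throughput."""
    return flops / flops_per_second


# Hypothetical baseline: 4 GFLOPs of work at an achieved 2 TFLOPS.
baseline = latency_seconds(4e9, 2e12)

# Halving FLOPs with a memory-bound operator that also halves achieved FLOPS:
# the latency does not improve at all.
fewer_flops_low_rate = latency_seconds(2e9, 1e12)

# Halving FLOPs while keeping the achieved FLOPS: the latency is halved.
fewer_flops_same_rate = latency_seconds(2e9, 2e12)

print(baseline, fewer_flops_low_rate, fewer_flops_same_rate)
```

In this toy setting, cutting FLOPs pays off only when the achieved FLOPS does not drop with it, which is the motivation the abstract gives for revisiting memory-heavy operators such as depthwise convolution.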
Taxonomy
Topics: Neural Networks and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution
