Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks

Jierun Chen; Shiu-hong Kao; Hao He; Weipeng Zhuo; Song Wen; Chul-Ho Lee; S.-H. Gary Chan

arXiv:2303.03667·cs.CV·May 23, 2023·101 cites

TL;DR

This paper introduces FasterNet, a neural network family with higher FLOPS efficiency and speed, achieved through a novel partial convolution that reduces redundant computation and memory access, leading to faster inference without accuracy loss.

Contribution

We propose PConv, a new partial convolution operator, and FasterNet, a neural network architecture that significantly improves inference speed while maintaining accuracy across vision tasks.

Findings

1. FasterNet-T0 runs 2.4-3.3x faster than MobileViT-XXS across GPU, CPU, and ARM processors.

2. FasterNet-L achieves 83.5% top-1 accuracy, on par with Swin-B, with 36% higher inference throughput on GPU.

3. FasterNet-L also saves 37% compute time on CPU relative to Swin-B.

Abstract

To design fast neural networks, many works have been focusing on reducing the number of floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does not necessarily lead to a similar level of reduction in latency. This mainly stems from inefficiently low floating-point operations per second (FLOPS). To achieve faster networks, we revisit popular operators and demonstrate that such low FLOPS is mainly due to frequent memory access of the operators, especially the depthwise convolution. We hence propose a novel partial convolution (PConv) that extracts spatial features more efficiently, by cutting down redundant computation and memory access simultaneously. Building upon our PConv, we further propose FasterNet, a new family of neural networks, which attains substantially higher running speed than others on a wide range of devices, without compromising on…
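The gap between FLOPs and latency described in the abstract follows from the relation latency = FLOPs / FLOPS. The sketch below is an illustration rather than the authors' code: it counts multiply-accumulate operations for a regular, a depthwise, and a partial convolution on an h x w feature map with c channels and a k x k kernel. It assumes that PConv applies a regular convolution to only a fraction of the channels and leaves the rest untouched, with a partial ratio of 1/4. The function names and the throughput figures in the usage lines are hypothetical and chosen only for illustration.

```python
def regular_conv_flops(h, w, k, c):
    """Operations for a k x k convolution with c input and c output channels."""
    return h * w * k * k * c * c


def depthwise_conv_flops(h, w, k, c):
    """Operations for a k x k depthwise convolution (one filter per channel)."""
    return h * w * k * k * c


def partial_conv_flops(h, w, k, c, ratio=0.25):
    """Operations when a regular convolution covers only c_p = ratio * c channels.

    The remaining channels are passed through unchanged (assumed behavior).
    """
    c_p = int(c * ratio)
    return h * w * k * k * c_p * c_p


def latency_seconds(flops, flops_per_second):
    """Latency = FLOPs / FLOPS."""
    return flops / flops_per_second


if __name__ == "__main__":
    h, w, k, c = 56, 56, 3, 64
    print("regular  :", regular_conv_flops(h, w, k, c))
    print("depthwise:", depthwise_conv_flops(h, w, k, c))
    print("partial  :", partial_conv_flops(h, w, k, c))

    # Hypothetical throughputs: an operator with half the FLOPs can still be
    # slower if it only sustains a quarter of the FLOPS.
    slow = latency_seconds(flops=0.5e9, flops_per_second=0.25e12)
    fast = latency_seconds(flops=1.0e9, flops_per_second=1.00e12)
    print("fewer FLOPs, low FLOPS :", slow, "s")
    print("more FLOPs, high FLOPS :", fast, "s")
```

With a ratio of 1/4, the partial convolution needs 1/16 of the operations of a regular convolution over all channels, while still being a dense convolution on the channels it touches. The last two lines show why a lower FLOP count alone does not guarantee lower latency.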

Taxonomy

Topics: Neural Networks and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution