A Simple Approach to Image Tilt Correction with Self-Attention MobileNet for Smartphones
Siddhant Garg, Debi Prasanna Mohanty, Siva Prasad Thota, Sukumar Moharana

TL;DR
This paper introduces SA-MobileNet, a self-attention integrated model for smartphones that improves image tilt detection accuracy and speed by modeling long-range dependencies and employing a novel multi-label training pipeline.
Contribution
The paper presents a novel self-attention MobileNet architecture and a new training method for image tilt detection on low-resource devices, achieving state-of-the-art results.
Findings
SA-MobileNet outperforms MobileNetV3 in accuracy on multiple datasets.
SA-MobileNet is at least 4 milliseconds faster than MobileNetV3 on Snapdragon 750.
The proposed training pipeline effectively predicts multiple tilt angles with minimal overhead.
Abstract
The main contributions of our work are two-fold. First, we present a Self-Attention MobileNet, called SA-MobileNet, that can model long-range dependencies between image features instead of processing only the local region, as standard convolutional kernels do. SA-MobileNet integrates self-attention modules into the inverted bottleneck blocks of the MobileNetV3 model, which models both channel-wise and spatial attention over the image features and at the same time introduces a novel self-attention architecture for low-resource devices. Second, we propose a novel training pipeline for the task of image tilt detection. We treat this problem as a multi-label task in which we predict multiple angles for a tilted input image within a narrow interval of 1-2 degrees, depending on the dataset used. This process induces an implicit correlation between…
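The multi-label formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the one-class-per-integer-degree layout, the ±10 degree range, the `half_width` parameter, and the averaging decoder are all assumptions made for illustration, and the function names are hypothetical.

```python
from typing import List, Sequence


def multi_hot_tilt_target(
    angle_deg: float,
    min_angle: int = -10,   # assumed lower bound of the tilt range (hypothetical)
    max_angle: int = 10,    # assumed upper bound of the tilt range (hypothetical)
    half_width: float = 1.0,  # assumed interval half-width, in degrees
) -> List[int]:
    """Build a multi-hot target with one class per integer degree.

    Every class whose angle lies within `half_width` degrees of the true
    tilt is marked positive, so neighbouring angles share the label.
    """
    if not min_angle <= angle_deg <= max_angle:
        raise ValueError("angle_deg is outside the assumed tilt range")
    return [
        1 if abs(c - angle_deg) <= half_width else 0
        for c in range(min_angle, max_angle + 1)
    ]


def decode_tilt(
    scores: Sequence[float],
    min_angle: int = -10,
    threshold: float = 0.5,
) -> float:
    """Turn per-class scores into one angle (an assumed decoding rule).

    Averages the angles of all classes scoring at least `threshold`;
    falls back to the single best class if none pass.
    """
    active = [min_angle + i for i, s in enumerate(scores) if s >= threshold]
    if not active:
        best = max(range(len(scores)), key=lambda i: scores[i])
        return float(min_angle + best)
    return sum(active) / len(active)
```

With these assumed defaults, a true tilt of 3.4 degrees marks the 3 degree and 4 degree classes as positive. A target of this shape would typically be trained with a per-class sigmoid and binary cross-entropy, though the truncated abstract does not say which loss the authors use.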
Taxonomy
Topics: Remote-Sensing Image Classification · Remote Sensing and Land Use · Advanced Image and Video Retrieval Techniques
Methods: Pointwise Convolution · Depthwise Convolution · Batch Normalization · Dense Connections · Depthwise Separable Convolution · Inverted Residual Block · Sigmoid Activation · Dropout · ReLU6
