Sector Patch Embedding: An Embedding Module Conforming to The Distortion Pattern of Fisheye Image
Dianyi Yang, Jiadong Tang, Yu Gao, Yi Yang, Mengyin Fu

TL;DR
This paper introduces Sector Patch Embedding (SPE), a novel embedding method tailored to fisheye image distortion patterns, improving Transformer model performance on fisheye image classification tasks.
Contribution
The paper proposes SPE, a new patch embedding technique that aligns with fisheye distortion patterns, enhancing feature extraction and model accuracy.
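To make the idea of a sector-shaped patch concrete, here is a minimal sketch, not the authors' implementation. It assumes patches are cells of a polar grid centered on the image, and it mean-pools pixels as a stand-in for a learned embedding layer. The function names and the `num_rings` and `num_sectors` parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def sector_patch_ids(height, width, num_rings, num_sectors):
    """Assign each pixel to a (ring, sector) cell of a polar grid.

    Returns an int array of shape (height, width). Pixels outside the
    inscribed image circle get the id -1. This partition is an assumed
    illustration, not the paper's exact sampling scheme.
    """
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    ys, xs = np.mgrid[0:height, 0:width]
    dy, dx = ys - cy, xs - cx
    radius = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx)  # in [-pi, pi]
    max_radius = min(height, width) / 2.0

    ring = np.floor(radius / max_radius * num_rings).astype(int)
    sector = np.floor((angle + np.pi) / (2 * np.pi) * num_sectors).astype(int)
    sector = np.clip(sector, 0, num_sectors - 1)  # angle == pi maps to the last sector

    ids = ring * num_sectors + sector
    ids[ring >= num_rings] = -1  # outside the image circle
    return ids


def mean_pool_sector_patches(image, ids, num_patches):
    """Average the pixels of each sector patch: (H, W, C) -> (num_patches, C)."""
    channels = image.shape[-1]
    flat_ids = ids.ravel()
    flat_pixels = image.reshape(-1, channels)
    valid = flat_ids >= 0

    sums = np.zeros((num_patches, channels))
    np.add.at(sums, flat_ids[valid], flat_pixels[valid])  # unbuffered scatter-add
    counts = np.bincount(flat_ids[valid], minlength=num_patches)
    return sums / np.maximum(counts, 1)[:, None]


if __name__ == "__main__":
    ids = sector_patch_ids(64, 64, num_rings=4, num_sectors=8)
    tokens = mean_pool_sector_patches(np.random.rand(64, 64, 3), ids, 4 * 8)
    print(tokens.shape)
```

Each row of `tokens` plays the role that a flattened square patch plays in a standard ViT. In a real model the mean-pooling step would presumably be replaced by a learned projection.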
Findings
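SPE improves ViT top-1 classification accuracy by 0.75% on the synthetic fisheye dataset built from ImageNet-1K.
SPE improves PVT top-1 classification accuracy by 2.8% on the same dataset.
The experiments indicate that SPE perceives fisheye distortion better than the baseline patch embedding.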
Abstract
Fisheye cameras offer a large field of view (LFOV) but suffer from image distortion, which leads to poor performance on some fisheye vision tasks. One solution is to optimize current vision algorithms for fisheye images. However, most CNN-based and Transformer-based methods lack the capability to leverage distortion information efficiently. In this work, we propose a novel patch embedding method called Sector Patch Embedding (SPE), which conforms to the distortion pattern of the fisheye image. Furthermore, we put forward a synthetic fisheye dataset based on ImageNet-1K and explore the performance of several Transformer models on the dataset. With SPE, the classification top-1 accuracy of ViT and PVT is improved by 0.75% and 2.8%, respectively. The experiments show that the proposed sector patch embedding method can better perceive distortion and…
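The abstract does not say how the synthetic fisheye dataset is generated. As a rough illustration of how a perspective image can be warped into a fisheye-like one, the sketch below assumes the common equidistant projection model (r_d = f · θ with θ = arctan(r_u / f)) and nearest-neighbor sampling. The function name `synthesize_fisheye` and the `focal_length` parameter are hypothetical, and the paper's actual distortion model may differ.

```python
import numpy as np


def synthesize_fisheye(image, focal_length):
    """Warp a perspective image (H, W, C) into a fisheye-like image.

    Assumption: equidistant model, r_d = f * theta, theta = arctan(r_u / f).
    Uses inverse mapping: for every output pixel at distorted radius r_d,
    look up the source pixel at undistorted radius r_u = f * tan(r_d / f).
    """
    height, width = image.shape[:2]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    ys, xs = np.mgrid[0:height, 0:width]
    dy, dx = ys - cy, xs - cx

    r_d = np.hypot(dx, dy)
    theta = r_d / focal_length
    valid = theta < np.pi / 2 - 1e-6  # tan() diverges at pi/2
    r_u = focal_length * np.tan(np.where(valid, theta, 0.0))

    # Radial scale factor from output (distorted) to source (undistorted) coordinates.
    scale = np.where(r_d > 0, r_u / np.maximum(r_d, 1e-12), 1.0)
    src_x = np.rint(cx + dx * scale).astype(int)
    src_y = np.rint(cy + dy * scale).astype(int)
    valid &= (src_x >= 0) & (src_x < width) & (src_y >= 0) & (src_y < height)

    out = np.zeros_like(image)  # pixels with no valid source stay black
    out[valid] = image[src_y[valid], src_x[valid]]
    return out


if __name__ == "__main__":
    img = np.random.rand(64, 64, 3)
    warped = synthesize_fisheye(img, focal_length=20.0)
    print(warped.shape)
```

A smaller `focal_length` produces stronger barrel distortion and a larger black border. A very large value leaves the image essentially unchanged.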
Taxonomy
Topics: Optical measurement and interference techniques · Digital Imaging for Blood Diseases · Image Processing Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Residual Connection · Label Smoothing · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Dropout · Dense Connections
