SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning

Hao Feng; Wendi Wang; Jiajun Deng; Wengang Zhou; Li Li; Houqiang Li

arXiv:2308.09040·cs.CV·August 21, 2023

Open Access

TL;DR

SimFIR is a self-supervised framework that uses a Vision Transformer (ViT) to learn distortion representations from fisheye image patches. The learned representations improve downstream rectification accuracy and generalize to real-world fisheye images.

Contribution

The paper proposes a novel self-supervised framework with a distortion-aware pretext task for fisheye image rectification, leveraging ViT to learn fine-grained distortion features.

Findings

01. Outperforms state-of-the-art fisheye rectification methods.

02. Demonstrates strong generalization on real-world fisheye images.

03. Boosts downstream rectification performance through learned representations.

Abstract

In fisheye images, rich and distinct distortion patterns are regularly distributed in the image plane. These distortion patterns are independent of the visual content and provide informative cues for rectification. To make the best of such rectification cues, we introduce SimFIR, a simple framework for fisheye image rectification based on self-supervised representation learning. Technically, we first split a fisheye image into multiple patches and extract their representations with a Vision Transformer (ViT). To learn fine-grained distortion representations, we then associate different image patches with their specific distortion patterns based on the fisheye model, and design a unified distortion-aware pretext task to learn them. The transfer performance on the downstream rectification task is remarkably boosted, which verifies the effectiveness of the…
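The patch-splitting and patch-labelling steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: it assumes a radially symmetric distortion, so that patches at the same distance from the image center share one distortion label. The function names `split_into_patches` and `radial_patch_labels` are hypothetical, and the exact labelling rule used by SimFIR may differ.

```python
import numpy as np


def split_into_patches(image, patch_size):
    """Split an (H, W) or (H, W, C) image into non-overlapping square patches.

    Returns an array of shape (num_patches, patch_size, patch_size, C),
    ordered row by row over the patch grid.
    """
    h, w = image.shape[:2]
    if h % patch_size or w % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    grid_h, grid_w = h // patch_size, w // patch_size
    # (grid_h, p, grid_w, p, C) -> (grid_h, grid_w, p, p, C)
    blocks = image.reshape(grid_h, patch_size, grid_w, patch_size, -1)
    blocks = blocks.transpose(0, 2, 1, 3, 4)
    return blocks.reshape(grid_h * grid_w, patch_size, patch_size, -1)


def radial_patch_labels(grid_h, grid_w):
    """Assign one integer label per patch from its distance to the grid center.

    ASSUMPTION (not taken from the paper): distortion is radially symmetric,
    so patches equidistant from the center get the same label. Label 0 is
    the ring closest to the center.
    """
    ys, xs = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij")
    # Doubled offsets keep the squared distance an exact integer.
    d2 = (2 * ys - (grid_h - 1)) ** 2 + (2 * xs - (grid_w - 1)) ** 2
    _, inverse = np.unique(d2.ravel(), return_inverse=True)
    return inverse.reshape(grid_h, grid_w)


if __name__ == "__main__":
    image = np.random.rand(224, 224, 3)
    patches = split_into_patches(image, patch_size=16)
    labels = radial_patch_labels(14, 14)
    print(patches.shape, labels.shape, labels.max() + 1)
```

In a pretext task of this kind, a patch encoder such as a ViT would be trained to predict each patch's label from its pixels alone, which pushes the representation toward distortion cues rather than visual content.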

Taxonomy

Topics: Image Processing Techniques and Applications · Advanced Image Processing Techniques · Advanced Vision and Imaging

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Layer Normalization · Absolute Position Encodings · Residual Connection