Revisiting Optical Flow Estimation in 360 Videos
Keshav Bhandari, Ziliang Zong, Yan Yan

TL;DR
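This paper introduces LiteFlowNet360, a domain-adapted version of LiteFlowNet for optical flow estimation in 360 videos. It handles the distortion caused by sphere-to-plane projection by transforming only the convolution layers of the feature pyramid network, which limits the growth in network size and computation cost, and it refines the network by training on augmented data.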
Contribution
The paper presents a domain adaptation framework from the perspective video domain to the 360 video domain. It uses kernel transformation techniques inspired by the Kernel Transformer Network (KTN) and transforms only selected layers to improve optical flow estimation.
Findings
Effective adaptation of LiteFlowNet for 360 videos
Limited growth in network size and computation cost, because the inference and regularization layers are left untransformed
Promising experimental results on 360 video datasets
Abstract
360 video analysis has become a significant research topic since the appearance of high-quality, low-cost 360 wearable devices. In this paper, we propose a novel LiteFlowNet360 architecture for optical flow estimation in 360 videos. We design LiteFlowNet360 as a domain adaptation framework from the perspective video domain to the 360 video domain. We adapt it using simple kernel transformation techniques inspired by the Kernel Transformer Network (KTN) to cope with the inherent distortion in 360 videos caused by the sphere-to-plane projection. First, we apply an incremental transformation of the convolution layers in the feature pyramid network and show that further transformation of the inference and regularization layers is not important, which limits the network's growth in size and computation cost. Second, we refine the network by training with augmented data in a supervised…
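The abstract attributes the difficulty to distortion from the sphere-to-plane projection. As a rough illustration only, and assuming the common equirectangular projection (this listing does not name the projection), the sketch below computes how much each image row is stretched horizontally, 1/cos(latitude), and widens a 1D kernel row to match. It is a hand-written stand-in, not the learned kernel transformation of KTN and not the authors' LiteFlowNet360 code; the function names and the `max_stretch` cap are hypothetical.

```python
import numpy as np


def row_latitudes(height):
    """Latitude in radians at the centre of each row (assumes equirectangular)."""
    rows = np.arange(height) + 0.5
    return np.pi / 2 - rows * np.pi / height


def horizontal_stretch(height, max_stretch=8.0):
    """Per-row horizontal stretch 1/cos(latitude), capped near the poles.

    max_stretch is a hypothetical cap chosen for illustration.
    """
    cos_lat = np.maximum(np.cos(row_latitudes(height)), 1e-6)
    return np.minimum(1.0 / cos_lat, max_stretch)


def stretch_kernel_row(kernel_row, stretch):
    """Widen a 1D kernel row by linear interpolation, keeping an odd width."""
    kernel_row = np.asarray(kernel_row, dtype=float)
    k = len(kernel_row)
    new_k = max(k, int(round(k * float(stretch))))
    if new_k % 2 == 0:
        new_k += 1  # odd width keeps the kernel centred on a pixel
    src = np.linspace(0, k - 1, new_k)
    out = np.interp(src, np.arange(k), kernel_row)
    total = out.sum()
    if abs(total) > 1e-12:
        out *= kernel_row.sum() / total  # preserve the kernel's overall gain
    return out


if __name__ == "__main__":
    stretch = horizontal_stretch(180)
    for row in (0, 45, 90):
        widened = stretch_kernel_row([1.0, 2.0, 1.0], stretch[row])
        print(row, round(float(stretch[row]), 3), len(widened))
```

Rows near the equator keep the original kernel width, while rows near the poles receive a much wider kernel. A learned transformation such as the one KTN uses would replace the fixed interpolation here.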
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Enhancement Techniques
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Softmax · Layer Normalization · Convolution · Dense Connections · Multi-Head Attention
