Rethinking the Zigzag Flattening for Image Reading
Qingsong Zhao, Yi Wang, Zhipeng Zhou, Duoqian Miao, Limin Wang, Yu Qiao, Cairong Zhao

TL;DR
This paper examines how images are flattened into sequences for vision models, comparing Hilbert fractal flattening (HF) with the default zigzag flattening (ZF), and reports that HF better preserves spatial locality and improves model performance.
Contribution
The paper proposes Hilbert fractal flattening as an alternative to zigzag flattening for ordering image patches into a sequence, with the aim of better preserving spatial locality and improving model performance in vision tasks.
Findings
Hilbert fractal flattening (HF) outperforms zigzag flattening (ZF) in maintaining spatial locality.
HF yields significant performance improvements across various neural network architectures.
The method is easily integrated into existing deep learning models.
Abstract
The sequence ordering of word vectors matters a great deal for text reading, as has been shown in natural language processing (NLP). However, the role of different sequence orderings in computer vision (CV) has not been well explored, e.g., why the "zigzag" flattening (ZF) is commonly used as the default option for ordering image patches in vision networks. Notably, when decomposing multi-scale images, ZF cannot maintain the invariance of feature point positions. To this end, we investigate the Hilbert fractal flattening (HF) as another method for sequence ordering in CV and contrast it against ZF. The HF has proven to be superior to other curves in maintaining spatial locality when performing multi-scale transformations of dimensional space, and it can be easily plugged into most deep neural networks (DNNs). Extensive experiments demonstrate that it can yield consistent and…
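To make the locality claim concrete, here is a minimal sketch in plain Python, not the authors' implementation. It assumes that "zigzag" flattening means the usual row-major raster scan, that the grid is square with a power-of-two side, and it uses the well-known iterative index-to-coordinate conversion for the Hilbert curve. The function names (`hilbert_d2xy`, `hilbert_order`, `raster_order`, `hilbert_flatten`, `max_step`) are hypothetical and chosen only for illustration.

```python
def hilbert_d2xy(n, d):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid.

    Assumes n is a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the current sub-square so the curve stays connected.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_order(n):
    """Grid coordinates visited in Hilbert-curve order."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


def raster_order(n):
    """Grid coordinates in row-major order (assumed meaning of zigzag flattening)."""
    return [(x, y) for y in range(n) for x in range(n)]


def hilbert_flatten(grid):
    """Flatten a square grid (list of rows, indexed grid[y][x]) along the Hilbert curve."""
    n = len(grid)
    return [grid[y][x] for x, y in hilbert_order(n)]


def max_step(order):
    """Largest Manhattan distance between consecutive sequence elements."""
    return max(
        abs(x1 - x0) + abs(y1 - y0)
        for (x0, y0), (x1, y1) in zip(order, order[1:])
    )


if __name__ == "__main__":
    n = 8
    print("Hilbert max step:", max_step(hilbert_order(n)))
    print("Raster max step: ", max_step(raster_order(n)))
```

Under these assumptions, consecutive elements of the Hilbert sequence are always adjacent grid cells, whereas the raster scan jumps across the full width of the grid at the end of every row. In a patch-based model, a sequence such as the one returned by `hilbert_flatten` would stand in for the row-major patch order; how the paper wires this into specific architectures is not described in the text above.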
Taxonomy
Topics: Handwritten Text Recognition Techniques · Medical Image Segmentation Techniques · Image Retrieval and Classification Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Absolute Position Encodings · Linear Layer · Adam · Layer Normalization · Softmax · Byte Pair Encoding · Residual Connection · Label Smoothing
