Parameter-Inverted Image Pyramid Networks

Xizhou Zhu; Xue Yang; Zhaokai Wang; Hao Li; Wenhan Dou; Junqi Ge; Lewei Lu; Yu Qiao; Jifeng Dai

arXiv:2406.04330·cs.CV·October 29, 2024

TL;DR

The paper introduces Parameter-Inverted Image Pyramid Networks (PIIP), an architecture that processes higher-resolution images with smaller models and lower-resolution images with larger ones, improving the trade-off between computational cost and performance in vision tasks.

Contribution

PIIP assigns a model of a different parameter size to each resolution level of the image pyramid and introduces a feature interaction mechanism that lets features of different resolutions complement each other, reducing computation while integrating multi-scale features.
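The parameter-inverted pairing can be illustrated with a rough cost estimate. The sketch below is not taken from the paper: the parameter counts (in millions), the resolutions, the patch size of 16, and the "cost is proportional to parameters times tokens" proxy are all assumptions chosen for illustration, and the function names are hypothetical.

```python
# Illustrative sketch only: all numbers and the cost proxy are assumptions,
# not values or formulas reported by the PIIP authors.

def tokens(resolution, patch=16):
    """Number of patch tokens for a square image (assumed patch size)."""
    return (resolution // patch) ** 2


def relative_cost(params_m, resolution, patch=16):
    """Crude proxy: per-branch cost grows with parameters x tokens.

    This ignores the quadratic attention term, so it is only a rough guide.
    """
    return params_m * tokens(resolution, patch)


def pyramid_cost(pairs):
    """Total proxy cost for a list of (params_in_millions, resolution) pairs."""
    return sum(relative_cost(p, r) for p, r in pairs)


def parameter_inverted_pairs(params_m, resolutions):
    """Pair the smallest model with the highest resolution, and so on."""
    return list(zip(sorted(params_m), sorted(resolutions, reverse=True)))


# Hypothetical branch sizes (millions of parameters) and input resolutions.
PARAMS_M = [22, 86, 304]
RESOLUTIONS = [448, 896, 1344]

# Conventional image pyramid: the same large model sees every resolution.
standard = [(max(PARAMS_M), r) for r in RESOLUTIONS]
# Parameter-inverted pyramid: model size shrinks as resolution grows.
inverted = parameter_inverted_pairs(PARAMS_M, RESOLUTIONS)

if __name__ == "__main__":
    ratio = pyramid_cost(inverted) / pyramid_cost(standard)
    print(f"inverted pairing: {inverted}")
    print(f"proxy cost relative to standard pyramid: {ratio:.2f}")
```

Under this proxy the inverted pairing is cheaper because the branch with the most tokens is also the one with the fewest parameters. Real savings depend on the actual models and resolutions used.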

Findings

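01. Achieves superior performance in object detection, segmentation, and image classification compared with traditional image pyramids and single-branch networks.

02. When applied to the large-scale vision foundation model InternViT-6B, requires only 40%-60% of the original computation.

03. Improves InternViT-6B performance by 1%-2% on detection and segmentation.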

Abstract

Image pyramids are commonly used in modern computer vision tasks to obtain multi-scale features for precise understanding of images. However, image pyramids process multiple resolutions of images using the same large-scale model, which requires significant computational cost. To overcome this issue, we propose a novel network architecture known as the Parameter-Inverted Image Pyramid Networks (PIIP). Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid, thereby balancing computational efficiency and performance. Specifically, the input to PIIP is a set of multi-scale images, where higher resolution images are processed by smaller networks. We further propose a feature interaction mechanism to allow features of different resolutions to complement each other and effectively integrate information from different spatial…
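To make the idea of cross-resolution feature interaction concrete, here is a minimal sketch. It is a stand-in, not the mechanism from the paper, which this page does not describe: it assumes single-channel feature grids stored as nested lists, and it exchanges information by simple resize-and-add (nearest-neighbor upsampling in one direction, mean pooling in the other). All function names are hypothetical.

```python
# Illustrative stand-in for cross-resolution feature interaction.
# Assumption: resize-and-add on single-channel grids; the paper's actual
# interaction mechanism is not specified on this page.

def upsample_nearest(grid, factor):
    """Repeat every value `factor` times along both axes."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def downsample_mean(grid, factor):
    """Average each non-overlapping factor x factor block."""
    h, w = len(grid) // factor, len(grid[0]) // factor
    return [
        [
            sum(
                grid[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ) / factor ** 2
            for j in range(w)
        ]
        for i in range(h)
    ]


def interact(high, low, factor):
    """Let each branch receive the other's features at its own resolution.

    Both updates read the original inputs, so the exchange is symmetric.
    """
    up = upsample_nearest(low, factor)
    down = downsample_mean(high, factor)
    new_high = [[a + b for a, b in zip(hr, ur)] for hr, ur in zip(high, up)]
    new_low = [[a + b for a, b in zip(lr, dr)] for lr, dr in zip(low, down)]
    return new_high, new_low


if __name__ == "__main__":
    high = [[1.0] * 4 for _ in range(4)]   # 4x4 high-resolution features
    low = [[10.0, 20.0], [30.0, 40.0]]     # 2x2 low-resolution features
    new_high, new_low = interact(high, low, factor=2)
    print(new_high)
    print(new_low)
```

The point of the sketch is only the data flow: each branch keeps its own resolution while receiving a resized view of the other branch's features.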

Code & Models

Repositories

opengvlab/piip · PyTorch · Official

Videos

Parameter-Inverted Image Pyramid Networks · SlidesLive

Taxonomy

Topics: Image and Signal Denoising Methods · Advanced Vision and Imaging · Advanced Image Processing Techniques

Methods: Sparse Evolutionary Training