Deep High-Resolution Representation Learning for Visual Recognition

Jingdong Wang; Ke Sun; Tianheng Cheng; Borui Jiang; Chaorui Deng; Yang Zhao; Dong Liu; Yadong Mu; Mingkui Tan; Xinggang Wang; Wenyu Liu; Bin Xiao

arXiv:1908.07919 · cs.CV · March 16, 2020 · 354 cites

Open Access 5 Repos 10 Models

TL;DR

The paper introduces HRNet, a network architecture that maintains high-resolution representations through the whole network instead of recovering them from a low-resolution encoding, which improves performance on position-sensitive vision tasks such as human pose estimation and semantic segmentation.

Contribution

HRNet connects high-to-low resolution convolution streams in parallel and repeatedly exchanges information across resolutions, so the resulting representation is both spatially precise and semantically rich.

Findings

01. HRNet outperforms existing methods in human pose estimation.

02. HRNet achieves superior results in semantic segmentation.

03. HRNet serves as a stronger backbone for diverse vision tasks.

Abstract

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) the high-to-low resolution convolution streams are connected in parallel; (ii) information is repeatedly exchanged across resolutions. The benefit is that the resulting representation is semantically richer and spatially more…
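The two characteristics named in the abstract, parallel streams and repeated exchange across resolutions, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names (`downsample2x`, `upsample2x`, `resample`, `exchange`) are hypothetical, feature maps are single-channel nested lists with even side lengths, and fixed 2x2 averaging and nearest-neighbor repetition stand in for the learned convolutions a real network would use.

```python
# Illustrative sketch only: plain-Python stand-ins for exchanging
# information between parallel streams of different resolution.
# The resampling operators are assumptions, not the paper's learned layers.

def downsample2x(fmap):
    """Halve height and width by averaging each 2x2 block (even sizes only)."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            (fmap[r][c] + fmap[r][c + 1] + fmap[r + 1][c] + fmap[r + 1][c + 1]) / 4.0
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def upsample2x(fmap):
    """Double height and width by nearest-neighbor repetition."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def resample(fmap, src_level, dst_level):
    """Move a map from stream src_level to dst_level (level k is 1/2**k size)."""
    out = fmap
    for _ in range(dst_level - src_level):  # going to a coarser stream
        out = downsample2x(out)
    for _ in range(src_level - dst_level):  # going to a finer stream
        out = upsample2x(out)
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def exchange(streams):
    """One exchange step: each output stream sums all inputs, resampled to its size."""
    fused = []
    for dst in range(len(streams)):
        total = None
        for src, fmap in enumerate(streams):
            aligned = resample(fmap, src, dst)
            total = aligned if total is None else add_maps(total, aligned)
        fused.append(total)
    return fused


# Two parallel streams: a 4x4 high-resolution map and a 2x2 low-resolution map.
high = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
low = [[1.0, 1.0], [1.0, 1.0]]
fused_high, fused_low = exchange([high, low])
```

After the exchange, `fused_high` is still 4x4 and `fused_low` is still 2x2, so the high-resolution stream is kept at full size while receiving information from the coarser stream, rather than being rebuilt from it.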

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning

Methods: Deep Layer Aggregation · Average Pooling · Center Pooling · Cascade Corner Pooling · Cascade R-CNN · CenterNet · Region Proposal Network · RoIPool · Faster R-CNN · Softmax