Deep High-Resolution Representation Learning for Visual Recognition
Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang, Wenyu Liu, and Bin Xiao

TL;DR
The paper introduces HRNet, a network architecture that maintains high-resolution representations through the whole network instead of recovering them from a low-resolution encoding, which improves performance on position-sensitive vision tasks such as human pose estimation and semantic segmentation.
Contribution
HRNet connects high-to-low resolution convolution streams in parallel and repeatedly exchanges information across resolutions, which makes the resulting representation both semantically richer and spatially more precise.
Findings
HRNet outperforms existing methods in human pose estimation.
HRNet achieves superior results in semantic segmentation.
HRNet serves as a stronger backbone for diverse vision tasks.
Abstract
High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork that is formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named as High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) Connect the high-to-low resolution convolution streams in parallel; (ii) Repeatedly exchange the information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more…
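The two characteristics in the abstract (parallel streams, repeated exchange across resolutions) can be illustrated with a toy sketch. This is not the authors' implementation: it uses plain Python lists as single-channel feature maps, and it assumes fixed 2x2 average pooling and nearest-neighbour upsampling as stand-ins for the learned layers a real network would use. The helper names (downsample2x, upsample2x, exchange) are hypothetical and chosen only for this illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]  # one single-channel feature map, indexed [row][col]


def downsample2x(x: Grid) -> Grid:
    """Halve resolution with 2x2 average pooling (assumed stand-in for a learned layer)."""
    h, w = len(x) // 2, len(x[0]) // 2
    return [
        [
            (x[2 * i][2 * j] + x[2 * i][2 * j + 1]
             + x[2 * i + 1][2 * j] + x[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def upsample2x(x: Grid) -> Grid:
    """Double resolution by nearest-neighbour copying (assumed stand-in for a learned layer)."""
    return [
        [x[i // 2][j // 2] for j in range(2 * len(x[0]))]
        for i in range(2 * len(x))
    ]


def add(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two equally sized maps."""
    return [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def exchange(high: Grid, low: Grid) -> Tuple[Grid, Grid]:
    """One exchange step: each stream adds the other stream resampled to its own size."""
    new_high = add(high, upsample2x(low))   # high-res stream receives low-res context
    new_low = add(low, downsample2x(high))  # low-res stream receives high-res detail
    return new_high, new_low


# Two parallel streams: a 4x4 high-resolution map and a 2x2 low-resolution map.
high = [[float(4 * r + c) for c in range(4)] for r in range(4)]
low = [[1.0, 2.0], [3.0, 4.0]]

new_high, new_low = exchange(high, low)
```

Both streams keep their own resolution after the step, so the high-resolution map is never discarded and then reconstructed. Calling exchange repeatedly on its own outputs mimics the repeated exchange the abstract describes, minus any learned weights.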
Code & Models
- kadirnar/timm_model_list (model) · ♡ 1
- timm/hrnet_w18.ms_aug_in1k (model) · 28k dl · ♡ 3
- timm/hrnet_w18.ms_in1k (model) · 69 dl
- timm/hrnet_w18_small.ms_in1k (model) · 69 dl
- timm/hrnet_w18_small_v2.ms_in1k (model) · 147 dl · ♡ 2
- timm/hrnet_w18_ssld.paddle_in1k (model) · 69 dl
- timm/hrnet_w30.ms_in1k (model) · 121 dl
- timm/hrnet_w32.ms_in1k (model) · 170k dl
- timm/hrnet_w40.ms_in1k (model) · 96 dl
- timm/hrnet_w44.ms_in1k (model) · 152 dl
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning
Methods: Deep Layer Aggregation · Average Pooling · Center Pooling · Cascade Corner Pooling · Cascade R-CNN · CenterNet · Region Proposal Network · RoIPool · Faster R-CNN · Softmax
