# Multi-Scale Dual-Branch Fully Convolutional Network for Hand Parsing

**Authors:** Yang Lu, Xiaohui Liang, Frederick W. B. Li

arXiv: 1905.10100 · 2019-05-27

## TL;DR

This paper introduces MSDB-FCN, a multi-scale dual-branch FCN for hand parsing that targets the small size, complex structure, heavy self-occlusion, and ambiguous texture of hands. It also proposes Multi-Class Balanced Focal Loss to handle data imbalance, and reports state-of-the-art results on the RHD-PARSING dataset.

## Contribution

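The paper presents the MSDB-FCN framework, which combines a Dual-Branch architecture for extracting hand-area features, a Deconvolution and Bilinear Interpolation Block (DB-Block) for upsampling and merging multi-scale features, and Multi-Class Balanced Focal Loss, a generalization of Focal Loss to multi-class classification.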

## Key findings

- Achieved state-of-the-art accuracy on the RHD-PARSING dataset.
- Effectively handles data imbalance with Multi-Class Balanced Focal Loss.
- Demonstrates superior performance in complex hand parsing scenarios.
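
To make the data-imbalance point concrete, here is a minimal sketch of a class-weighted multi-class focal loss for a single pixel. It is one common way to extend the binary Focal Loss form, `-alpha_t * (1 - p_t)^gamma * log(p_t)`, to several classes, and it is not taken from the paper: the function names, the per-class `alphas` weights, and the default `gamma` are all assumptions for illustration, and the paper's exact Multi-Class Balanced Focal Loss may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def balanced_focal_loss(logits, target, alphas, gamma=2.0):
    """Class-weighted focal loss for one pixel (illustrative, hypothetical API).

    logits: raw scores, one per class
    target: index of the ground-truth class
    alphas: per-class weights (assumed here; e.g. larger for rare classes)
    gamma:  focusing parameter; gamma = 0 disables the focusing term
    """
    p_t = softmax(logits)[target]
    # (1 - p_t)^gamma shrinks the loss of confidently correct pixels,
    # so frequent, easy classes contribute less to the total.
    return -alphas[target] * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy pixel (confidently correct) versus a hard pixel (uncertain).
easy = balanced_focal_loss([5.0, 0.0, 0.0], target=0, alphas=[1.0, 1.0, 1.0])
hard = balanced_focal_loss([0.0, 0.0, 0.0], target=0, alphas=[1.0, 1.0, 1.0])
```

With `gamma = 0` and all weights equal to 1, the function reduces to ordinary cross-entropy, which is a quick sanity check for any implementation of this family of losses.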

## Abstract

Recently, fully convolutional neural networks (FCNs) have shown strong performance in image parsing, including scene parsing and object parsing. Compared with generic object parsing tasks, hand parsing is more challenging because of small size, complex structure, heavy self-occlusion, and ambiguous texture. In this paper, we propose a novel parsing framework, Multi-Scale Dual-Branch Fully Convolutional Network (MSDB-FCN), for hand parsing tasks. Our network employs a Dual-Branch architecture to extract features of the hand area, focusing on the hand itself. These features are used to generate multi-scale features with a pyramid pooling strategy. To better encode multi-scale features, we design a Deconvolution and Bilinear Interpolation Block (DB-Block) for upsampling and merging the features of different scales. To address data imbalance, a common problem in many computer vision tasks including hand parsing, we propose a generalization of Focal Loss, namely Multi-Class Balanced Focal Loss, for multi-class classification. Extensive experiments on the RHD-PARSING dataset demonstrate that our MSDB-FCN achieves state-of-the-art performance for hand parsing.
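
The multi-scale step described in the abstract can be illustrated with a small, dependency-free sketch: average-pool a single-channel feature map into grids of several sizes, then bring each grid back to the input resolution with bilinear interpolation so the scales can be merged. This only shows the general pyramid pooling idea. The bin sizes, the function names, and the corner-aligned interpolation are assumptions, and the sketch omits the learned deconvolution path of the paper's DB-Block.

```python
def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D list into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(bins):
        r0, r1 = (i * h) // bins, -(-((i + 1) * h) // bins)  # floor, ceil
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, -(-((j + 1) * w) // bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def bilinear_upsample(grid, out_h, out_w):
    """Resize a 2D list with corner-aligned bilinear interpolation."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def pyramid_features(fmap, bin_sizes=(1, 2, 4)):
    """Return one full-resolution map per pyramid level (bin sizes assumed)."""
    h, w = len(fmap), len(fmap[0])
    return [bilinear_upsample(adaptive_avg_pool(fmap, b), h, w) for b in bin_sizes]


# A 4x4 toy feature map holding the values 0..15.
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
levels = pyramid_features(feature_map)
```

In a real network the resulting levels would typically be concatenated along the channel axis with the original features before the final classifier. Here they are simply returned as a list so each scale can be inspected.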

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.10100/full.md

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/1905.10100/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1905.10100/full.md

---
Source: https://tomesphere.com/paper/1905.10100