# Scale Invariant Fully Convolutional Network: Detecting Hands Efficiently

**Authors:** Dan Liu, Dawei Du, Libo Zhang, Tiejian Luo, Yanjun Wu, Feiyue Huang, Siwei Lyu

arXiv: 1906.04634 · 2020-01-20

## TL;DR

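This paper introduces an end-to-end Scale Invariant Fully Convolutional Network (SIFCN) for efficient hand detection. It combines iterative multi-scale feature fusion, a rotation map for rotated hands, and a multi-scale loss that accelerates training, and it reports accuracy comparable to state-of-the-art methods while running 4.23 times faster on the VIVA dataset.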

## Contribution

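The paper presents the Scale Invariant Fully Convolutional Network (SIFCN), which uses Complementary Weighted Fusion (CWF) blocks to achieve scale invariance and a rotation map to detect rotated hands without separate rotation and derotation layers, and which runs faster than existing multi-stage methods.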

## Key findings

- 4.23 times faster than state-of-the-art methods on the VIVA dataset, with comparable accuracy
- Better average precision than state-of-the-art methods on the Oxford hand detection dataset, at a speed of 62.5 fps

## Abstract

Existing hand detection methods usually follow the pipeline of multiple stages with high computation cost, i.e., feature extraction, region proposal, bounding box regression, and additional layers for rotated region detection. In this paper, we propose a new Scale Invariant Fully Convolutional Network (SIFCN) trained in an end-to-end fashion to detect hands efficiently. Specifically, we merge the feature maps from high to low layers in an iterative way, which handles different scales of hands better and with less time overhead compared with simply concatenating them. Moreover, we develop the Complementary Weighted Fusion (CWF) block to make full use of the distinctive features among multiple layers to achieve scale invariance. To deal with rotated hand detection, we present the rotation map to get rid of complex rotation and derotation layers. Besides, we design the multi-scale loss scheme to accelerate the training process significantly by adding supervision to the intermediate layers of the network. Compared with the state-of-the-art methods, our algorithm shows comparable accuracy and runs 4.23 times faster on the VIVA dataset, and achieves better average precision on the Oxford hand detection dataset at a speed of 62.5 fps.
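The iterative high-to-low merging described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `upsample2x` and `merge_iteratively`, the nearest-neighbour upsampling, and the scalar weights `w` and `1 - w` are all assumptions made for illustration, and the weighted sum is a simple stand-in rather than the CWF block itself, whose definition is not given in this summary.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def merge_iteratively(features, weights):
    """Merge feature maps from the coarsest level down to the finest.

    features: list of (C, H, W) arrays ordered fine -> coarse, where each
              level has half the spatial size of the previous one.
    weights:  one scalar in [0, 1] per merge step (hypothetical stand-in
              for a learned fusion weight; not the paper's CWF block).
    Returns the merged maps ordered coarse -> fine; the last is the output.
    """
    merged = features[-1]          # start from the deepest, coarsest map
    outputs = [merged]
    for fine, w in zip(reversed(features[:-1]), weights):
        up = upsample2x(merged)    # bring the running result to the finer size
        merged = w * fine + (1.0 - w) * up
        outputs.append(merged)
    return outputs


# Three levels with 8 channels each: 32x32, 16x16, 8x8.
rng = np.random.default_rng(0)
features = [rng.standard_normal((8, 32 // 2**i, 32 // 2**i)) for i in range(3)]
outputs = merge_iteratively(features, weights=[0.5, 0.5])
print([o.shape for o in outputs])
```

In this sketch the merged map keeps 8 channels at every step, whereas upsampling all three levels to 32x32 and concatenating them would produce a 24-channel map for later layers to process. The intermediate entries of `outputs` are also the kind of place where supervision on intermediate layers, as in the multi-scale loss, could be attached.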

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.04634/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1906.04634/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1906.04634/full.md

---
Source: https://tomesphere.com/paper/1906.04634