To Perceive or Not to Perceive: Lightweight Stacked Hourglass Network
Jameel Hassan Abdul Samadh, Salwa K. Al Khatib

TL;DR
This paper introduces a lightweight stacked hourglass network for human pose estimation that significantly reduces parameters and computational cost while maintaining comparable accuracy.
Contribution
A novel lightweight 2-stacked hourglass network using depthwise separable convolutions and residual connections, with minimal performance loss.
Findings
79% reduction in parameters
A similar reduction in MAdds (multiply-add operations)
Only a marginal drop in accuracy despite the smaller resource budget
Abstract
Human pose estimation (HPE) is a classical task in computer vision that focuses on representing the orientation of a person by identifying the positions of their joints. We design a lighter version of the stacked hourglass network with minimal loss in model performance. The lightweight 2-stacked hourglass uses a reduced number of channels with depthwise separable convolutions, residual connections with concatenation, and residual connections between the necks of the hourglasses. The final model shows a marginal drop in performance with a 79% reduction in the number of parameters and a similar reduction in MAdds.
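To give a feel for why depthwise separable convolutions shrink a model, the sketch below compares the weight count of a standard convolution with that of its depthwise separable counterpart. It is only an illustration: the function names, the 256-channel width, and the 3x3 kernel are assumptions chosen for the example, not figures taken from the paper, and the per-layer saving it computes is not the paper's overall 79% figure.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1x1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


def per_layer_reduction(c_in: int, c_out: int, k: int) -> float:
    """Fraction of weights saved by replacing one standard conv with a separable one."""
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    return 1.0 - sep / std


if __name__ == "__main__":
    # Hypothetical layer: 256 -> 256 channels, 3x3 kernel (assumed values).
    c_in, c_out, k = 256, 256, 3
    print("standard:", standard_conv_params(c_in, c_out, k))
    print("separable:", depthwise_separable_params(c_in, c_out, k))
    print("saved: {:.1%}".format(per_layer_reduction(c_in, c_out, k)))
```

For a fixed output resolution, the multiply-add count of each layer scales with the same weight counts, so the per-layer MAdds saving follows the same ratio, roughly 1/c_out + 1/k^2 of the original cost.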
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Hand Gesture Recognition Systems
Methods: Convolution · 1x1 Convolution · Residual Connection · Max Pooling · Hourglass Module · Stacked Hourglass Network
