Egocentric Height Estimation
Jessica Finocchiaro, Aisha Urooj Khan, Ali Borji

TL;DR
This paper introduces a deep learning framework for estimating a person's height from egocentric videos without calibration, reporting a mean absolute error of 14.04 cm for continuous height prediction and 93.75% accuracy for height-category classification.
Contribution
The work presents a new two-stream neural network approach combining spatial and motion cues for egocentric height estimation without prior calibration or reference points.
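As a rough illustration of what combining the two streams can look like, the sketch below concatenates a spatial feature vector and a motion feature vector and passes the result through a linear regression head. The function names, feature sizes, weights, and the choice of concatenation are all assumptions made for illustration; this summary does not specify how the paper fuses its streams.

```python
def fuse_features(spatial, motion):
    """Concatenate per-stream feature vectors (hypothetical fusion choice)."""
    return list(spatial) + list(motion)


def linear_head(features, weights, bias):
    """Map a fused feature vector to a single height value in cm."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights)) + bias


# Toy values, not taken from the paper.
spatial_features = [0.5, 1.0]   # e.g. output of an appearance CNN
motion_features = [2.0]         # e.g. output of an optical-flow CNN
fused = fuse_features(spatial_features, motion_features)
height_cm = linear_head(fused, weights=[10.0, 20.0, 5.0], bias=150.0)
print(height_cm)
```

In a real two-stream model the feature vectors would come from trained networks and the head's weights would be learned; the fixed numbers here only show the data flow.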
Findings
Mean Absolute Error of 14.04 cm in height estimation
Classification accuracy of 93.75% for height categories
Effective fusion of spatial and temporal features
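For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how mean absolute error (in cm) and classification accuracy (in percent) are conventionally computed. The function names, sample heights, and category labels are hypothetical and do not come from the paper's data.

```python
def mean_absolute_error(predicted_cm, actual_cm):
    """Average absolute difference between predicted and true heights."""
    if len(predicted_cm) != len(actual_cm) or not actual_cm:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) for p, a in zip(predicted_cm, actual_cm))
    return total / len(actual_cm)


def classification_accuracy(predicted_labels, actual_labels):
    """Percentage of samples whose predicted category matches the truth."""
    if len(predicted_labels) != len(actual_labels) or not actual_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted_labels, actual_labels))
    return 100.0 * correct / len(actual_labels)


# Made-up examples for illustration only.
mae = mean_absolute_error([170.0, 180.0], [160.0, 185.0])
acc = classification_accuracy(
    ["tall", "short", "medium", "tall"],
    ["tall", "short", "tall", "tall"],
)
print(mae, acc)
```

A lower MAE means predicted heights are closer to the true heights on average, while accuracy only counts whether the predicted category matches exactly.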
Abstract
Egocentric, or first-person, vision, which has become popular in recent years with the emergence of wearable technology, differs from exocentric (third-person) vision in several distinguishable ways, one of which is that the camera wearer is generally not visible in the video frames. Recent work has addressed action and object recognition in egocentric videos, as well as biometric extraction from first-person videos. Height estimation can be a useful feature for both soft biometrics and object tracking. Here, we propose a method of estimating the height of an egocentric camera without any calibration or reference points. We used both traditional computer vision approaches and deep learning to determine the visual cues that result in the best height estimation. We introduce a framework inspired by two-stream networks, comprising two Convolutional Neural Networks, one…
