Understanding Human-Centric Images: From Geometry to Fashion
Edgar Simo-Serra

TL;DR
This thesis develops a hierarchy of tools for understanding humans in monocular RGB images, ranging from low-level feature point descriptors to high-level fashion analysis, and reports improvements across multiple tasks.
Contribution
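It introduces two low-level keypoint descriptors (one based on heat diffusion on images, one learned with a convolutional neural network), generative models of human pose (a discrete model based on a directed acyclic graph and a continuous model), and high-level fashion understanding methods, combining them into a hierarchy of tools for human-centric image analysis.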
Findings
Significant improvements in human pose estimation accuracy.
Effective clothing segmentation and fashionability prediction.
Robust low- and mid-level cues enhance high-level understanding.
Abstract
Understanding humans from photographs has always been a fundamental goal of computer vision. In this thesis we have developed a hierarchy of tools that cover a wide range of topics with the objective of understanding humans from monocular RGB images: from low-level feature point descriptors to high-level fashion-aware conditional random field models. In order to build these high-level models it is paramount to have a battery of robust and reliable low- and mid-level cues. Along these lines, we have proposed two low-level keypoint descriptors: one based on the theory of heat diffusion on images, and the other that uses a convolutional neural network to learn discriminative image patch representations. We also introduce distinct low-level generative models for representing human pose: in particular we present a discrete model based on a directed acyclic graph and a continuous model…
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition
