Unsupervised Learning of Object Landmarks through Conditional Image Generation

Tomas Jakab; Ankush Gupta; Hakan Bilen; Andrea Vedaldi

arXiv:1806.07823 · cs.CV · December 17, 2018 · 105 citations

TL;DR

This paper introduces an unsupervised method for learning object landmarks. A generator is trained to produce an image that combines the appearance of one example with the geometry of another, and the landmarks emerge from the geometry pathway without manual labels, across diverse datasets.

Contribution

The authors present an unsupervised approach that learns object landmarks through conditional image generation. It outperforms existing unsupervised landmark detectors and applies to various object types.

Findings

01. Successfully learned landmarks from synthetic deformations and videos

02. Outperformed state-of-the-art unsupervised landmark detectors

03. Applicable to diverse datasets including faces, objects, and digits

Abstract

We propose a method for learning landmark detectors for visual objects (such as the eyes and the nose in a face) without any manual supervision. We cast this as the problem of generating images that combine the appearance of the object as seen in a first example image with the geometry of the object as seen in a second example image, where the two examples differ by a viewpoint change and/or an object deformation. In order to factorize appearance and geometry, we introduce a tight bottleneck in the geometry-extraction process that selects and distils geometry-related features. Compared to standard image generation problems, which often use generative adversarial networks, our generation task is conditioned on both appearance and geometry and thus is significantly less ambiguous, to the point that adopting a simple perceptual loss formulation is sufficient. We demonstrate that our…
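
The abstract does not say what form the "tight bottleneck" takes. As a minimal sketch, assume the geometry branch outputs K score maps, each of which is reduced to a single (x, y) coordinate by a soft-argmax and then re-rendered as a Gaussian blob, so that only landmark positions reach the generator. The function names `spatial_softmax` and `keypoint_bottleneck`, the normalised [-1, 1] coordinate range, and the `sigma` value are illustrative assumptions, not details taken from this page.

```python
import numpy as np


def spatial_softmax(scores):
    """Turn raw score maps of shape (K, H, W) into K spatial probability maps."""
    k, h, w = scores.shape
    flat = scores.reshape(k, -1)
    flat = flat - flat.max(axis=1, keepdims=True)  # subtract max for numerical stability
    probs = np.exp(flat)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.reshape(k, h, w)


def keypoint_bottleneck(scores, sigma=0.1):
    """Hypothetical bottleneck: (K, H, W) scores -> K coordinates -> K Gaussian maps.

    Returns coords of shape (K, 2) as (x, y) in [-1, 1], and maps of shape (K, H, W).
    """
    k, h, w = scores.shape
    probs = spatial_softmax(scores)

    # Normalised pixel grids along each axis.
    ys = np.linspace(-1.0, 1.0, h)
    xs = np.linspace(-1.0, 1.0, w)

    # Soft-argmax: the expected coordinate under each probability map.
    mu_y = (probs.sum(axis=2) * ys).sum(axis=1)  # marginal over columns, shape (K,)
    mu_x = (probs.sum(axis=1) * xs).sum(axis=1)  # marginal over rows, shape (K,)
    coords = np.stack([mu_x, mu_y], axis=1)

    # Re-render each coordinate as an isotropic Gaussian; all other
    # information in the score maps (texture, colour) is discarded here.
    dy2 = (ys[None, :, None] - mu_y[:, None, None]) ** 2
    dx2 = (xs[None, None, :] - mu_x[:, None, None]) ** 2
    maps = np.exp(-(dx2 + dy2) / (2.0 * sigma ** 2))
    return coords, maps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords, maps = keypoint_bottleneck(rng.normal(size=(10, 16, 16)))
    print(coords.shape, maps.shape)
```

Under these assumptions, a generator would receive `maps` (geometry of the second image) alongside features of the first image (appearance). Because each map carries only two numbers' worth of information, appearance cannot leak through the geometry pathway.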

Taxonomy

Topics: Face recognition and analysis · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning