Categorical Mixture Models on VGGNet activations

Sean Billings

arXiv:1803.02446·cs.CV·March 8, 2018

Open Access

TL;DR

This paper clusters Yelp restaurant photos using VGGNet activations as features, applying latent Dirichlet allocation (LDA) to find photo topics that match human intuition and the actual Yelp photo labels.

Contribution

It combines object predictions from an ImageNet-trained VGGNet with LDA to group images into interpretable topic-archetypes.

Findings

01

VGGNet layer activations serve as features for LDA clustering

02

Object-based features yield meaningful photo archetypes

03

Clusters align well and distinctly with the actual Yelp photo labels

Abstract

In this project, I use unsupervised learning techniques to cluster a set of Yelp restaurant photos under meaningful topics. To do this, I extract layer activations from a pre-trained implementation of the popular VGGNet convolutional neural network. First, I explore using LDA with the activations of convolutional layers as features. Second, I explore using the object-recognition capabilities of VGGNet trained on ImageNet to extract meaningful objects from the photos, and then perform LDA to group the photos under topic-archetypes. I find that this second approach yields meaningful archetypes, which match the human intuition for photo topics such as restaurant, food, and drinks. Furthermore, these clusters align well and distinctly with the actual Yelp photo labels.
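The second approach in the abstract (objects recognized per photo, then LDA over those objects) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the object labels are a hypothetical toy set standing in for top-k VGGNet predictions, the collapsed Gibbs sampler is one common way to fit LDA and may differ from the inference method used in the paper, and the hyperparameters `alpha`, `beta`, and `n_iters` are assumed values.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling over lists of tokens.

    Returns (theta, phi, vocab): per-document topic proportions,
    per-topic token distributions, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)
    ids = [[index[w] for w in doc] for doc in docs]

    doc_topic = [[0] * n_topics for _ in ids]               # tokens of doc d in topic k
    topic_word = [[0] * n_vocab for _ in range(n_topics)]   # count of token w in topic k
    topic_total = [0] * n_topics                            # total tokens in topic k
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(ids):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(ids):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = [
        [(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
        for row in doc_topic
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_vocab * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return theta, phi, vocab


# Hypothetical toy input: each photo is the list of object labels a classifier
# might predict for it. These are illustrative, not data from the paper.
photos = [
    ["pizza", "plate", "pizza", "cheeseburger"],
    ["cheeseburger", "plate", "hotdog", "plate"],
    ["wine_bottle", "goblet", "beer_glass", "goblet"],
    ["beer_glass", "wine_bottle", "cocktail_shaker"],
    ["restaurant", "dining_table", "restaurant"],
    ["dining_table", "restaurant", "patio"],
]

theta, phi, vocab = lda_gibbs(photos, n_topics=3)

# Assign each photo to its most probable topic.
photo_topics = [max(range(3), key=lambda k: row[k]) for row in theta]
```

Each row of `theta` gives a photo's mixture over topics and each row of `phi` gives a topic's distribution over object labels, so the highest-probability labels in a `phi` row are what one would read as that topic's archetype. Comparing `photo_topics` against known photo labels is one way to check alignment of the kind the abstract reports.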

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Neural Networks and Applications

Methods: Latent Dirichlet Allocation