DHOG: Deep Hierarchical Object Grouping
Luke Nicholas Darlow, Amos Storkey

TL;DR
DHOG introduces a hierarchical approach to unsupervised image representation learning, improving mutual information maximization and clustering performance over existing methods without relying on prefiltering or Sobel-edge detection.
Contribution
The paper proposes DHOG, a novel hierarchical method that enhances mutual information optimization and object grouping in unsupervised learning, outperforming prior approaches on key benchmarks.
Findings
Achieved state-of-the-art clustering accuracy on CIFAR-10, CIFAR-100-20, and SVHN.
Produced representations that better optimise the mutual information objective and align better with the downstream task of object grouping.
Outperformed previous methods without using prefiltering or Sobel-edge detection.
Abstract
Recently, a number of competitive methods have tackled unsupervised representation learning by maximising the mutual information between the representations produced from augmentations. The resulting representations are then invariant to stochastic augmentation strategies, and can be used for downstream tasks such as clustering or classification. Yet data augmentations preserve many properties of an image and so there is potential for a suboptimal choice of representation that relies on matching easy-to-find features in the data. We demonstrate that greedy or local methods of maximising mutual information (such as stochastic gradient optimisation) discover local optima of the mutual information criterion; the resulting representations are also less-ideally suited to complex downstream tasks. Earlier work has not specifically identified or addressed this issue. We introduce deep…
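The objective the abstract describes, mutual information between representations of two augmentations of the same image, can be illustrated with a minimal sketch. It assumes the representations are soft cluster assignments (probability vectors over k groups), as is common in clustering-based methods; the function names `joint_distribution` and `mutual_information` are hypothetical and this is not the authors' implementation.

```python
import math


def joint_distribution(p_a, p_b):
    """Estimate the joint P(c_a, c_b) from paired soft cluster assignments.

    p_a, p_b: lists of per-image probability vectors (each sums to 1),
    one list for each of two augmentations of the same images.
    """
    n, k = len(p_a), len(p_a[0])
    joint = [[0.0] * k for _ in range(k)]
    for a, b in zip(p_a, p_b):
        for i in range(k):
            for j in range(k):
                joint[i][j] += a[i] * b[j] / n
    # Symmetrise, since the ordering of the two augmentations is arbitrary.
    return [[0.5 * (joint[i][j] + joint[j][i]) for j in range(k)]
            for i in range(k)]


def mutual_information(joint, eps=1e-12):
    """Mutual information (in nats) of a k x k joint distribution."""
    k = len(joint)
    row = [sum(joint[i]) for i in range(k)]
    col = [sum(joint[i][j] for i in range(k)) for j in range(k)]
    mi = 0.0
    for i in range(k):
        for j in range(k):
            p = joint[i][j]
            if p > eps:  # skip empty cells, where p * log(p) tends to 0
                mi += p * math.log(p / (row[i] * col[j]))
    return mi
```

Under these assumptions, two views that agree confidently and use both clusters equally reach the maximum of log k, while uninformative uniform assignments give zero. A training procedure would ascend this quantity by gradient steps, which is where the local optima discussed in the abstract can arise.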
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
