Structured (De)composable Representations Trained with Neural Networks
Graham Spinks, Marie-Francine Moens

TL;DR
This paper introduces a deep learning method for creating structured, decomposable representations of concept classes that facilitate classification and retrieval across multiple modalities.
Contribution
It presents a novel end-to-end approach for learning structured representations that can be decomposed into class and environment factors, enhancing interpretability and flexibility.
Findings
Effective in classification tasks across visual and language data
Representation structure allows for decomposition into class and environment factors
Demonstrates improved retrieval performance
Abstract
The paper proposes a novel technique for representing templates and instances of concept classes. A template representation is the generic representation that captures the characteristics of an entire class. The proposed technique uses end-to-end deep learning to learn structured and composable representations from input images and discrete labels. The obtained representations are based on distance estimates between the distributions given by the class label and those given by contextual information, which are modeled as environments. We prove that the representations have a clear structure that allows them to be decomposed into factors that represent classes and environments. We evaluate our novel technique on classification and retrieval tasks involving different modalities (visual and language data).
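To make the idea of a distance-based representation concrete, here is a minimal toy sketch. It is not the authors' method: the abstract does not say which distance or estimator is used, and the paper learns its estimates end-to-end with neural networks. The sketch substitutes the closed-form empirical 1-Wasserstein distance on one-dimensional samples, and the names `wasserstein_1d`, `distance_representation`, and the example classes and environments are all hypothetical.

```python
from typing import Dict, List, Sequence


def wasserstein_1d(a: Sequence[float], b: Sequence[float]) -> float:
    """Empirical 1-Wasserstein distance between two equal-sized 1-D samples.

    For equal-sized samples this is the mean absolute difference between
    the sorted values. This is a stand-in for a learned distance estimate.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def distance_representation(
    class_samples: Dict[str, Sequence[float]],
    env_samples: Dict[str, Sequence[float]],
) -> List[List[float]]:
    """Build a matrix whose entry [i][j] is the distance between the
    distribution of class i and the distribution of environment j.

    Rows follow the insertion order of class_samples, columns follow
    the insertion order of env_samples (an assumption of this sketch).
    """
    return [
        [wasserstein_1d(c, e) for e in env_samples.values()]
        for c in class_samples.values()
    ]


# Toy scalar features; real inputs would be images or text.
classes = {"cat": [0.0, 1.0, 2.0], "dog": [4.0, 5.0, 6.0]}
environments = {"indoor": [0.0, 1.0, 2.0], "outdoor": [2.0, 3.0, 4.0]}

template = distance_representation(classes, environments)
print(template)  # [[0.0, 2.0], [4.0, 2.0]]
```

Each row can be read as a class factor and each column as an environment factor, which is the kind of structure the abstract describes, here in a deliberately simplified form.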
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Domain Adaptation and Few-Shot Learning
