Learning Abstract Representations through Lossy Compression of Multi-Modal Signals
Charles Wilmot, Gianluca Baldassarre, Jochen Triesch

TL;DR
This paper explores how lossy compression of multi-modal signals can naturally lead to the formation of abstract, modality-invariant representations that are useful for complex, open-ended learning tasks.
Contribution
It introduces a novel approach treating multi-modal representation learning as a lossy compression problem and proposes an architecture to extract shared, abstract features across modalities.
Findings
Lossy compression extracts shared, modality-invariant features.
The proposed architecture effectively discards modality-specific details.
Shared representations facilitate generalization in multi-modal learning.
Abstract
A key competence for open-ended learning is the formation of increasingly abstract representations useful for driving complex behavior. Abstract representations ignore specific details and facilitate generalization. Here we consider the learning of abstract representations in a multi-modal setting with two or more input modalities. We treat the problem as a lossy compression problem and show that generic lossy compression of multi-modal sensory input naturally extracts abstract representations that tend to strip away modality-specific details and preferentially retain information that is shared across the different modalities. Furthermore, we propose an architecture to learn abstract representations by identifying and retaining only the information that is shared across multiple modalities while discarding any modality-specific information.
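The central claim, that a generic lossy code tends to keep what is shared across modalities, can be illustrated with a small linear toy problem. The sketch below is not the authors' architecture. It assumes two synthetic modalities, each a mix of one shared latent variable and one modality-specific latent variable, and it uses a one-dimensional PCA bottleneck as a stand-in for lossy compression. All names and parameters are hypothetical. Because the shared variable contributes variance to both modalities, it dominates the concatenated signal, so the single retained component should track it far more closely than either private variable.

```python
import numpy as np


def shared_vs_private_correlations(n_samples=5000, dim=5, seed=0):
    """Compress two toy modalities to a 1-D code and report what the code tracks.

    Assumption: each modality is a linear mix of one shared latent and one
    private latent. This is an illustrative setup, not the paper's experiment.
    """
    rng = np.random.default_rng(seed)

    # Independent latent sources: one shared, one private per modality.
    shared = rng.standard_normal(n_samples)
    private_1 = rng.standard_normal(n_samples)
    private_2 = rng.standard_normal(n_samples)

    def orthonormal_pair():
        # Two orthonormal mixing directions inside one modality's space.
        q, _ = np.linalg.qr(rng.standard_normal((dim, 2)))
        return q[:, 0], q[:, 1]

    w1, u1 = orthonormal_pair()
    w2, u2 = orthonormal_pair()

    # Each modality mixes the shared latent with its own private latent.
    x1 = np.outer(shared, w1) + np.outer(private_1, u1)
    x2 = np.outer(shared, w2) + np.outer(private_2, u2)

    # Concatenate the modalities and centre the data.
    x = np.concatenate([x1, x2], axis=1)
    x = x - x.mean(axis=0)

    # 1-D bottleneck: the top principal component is the best linear
    # one-dimensional code under mean squared reconstruction error.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    code = x @ vt[0]

    def abs_corr(a, b):
        return abs(np.corrcoef(a, b)[0, 1])

    return {
        "shared": abs_corr(code, shared),
        "private_1": abs_corr(code, private_1),
        "private_2": abs_corr(code, private_2),
    }


result = shared_vs_private_correlations()
print(result)
```

In this setup the shared latent carries roughly twice the variance of either private latent in the concatenated input, which is why a tight bottleneck favors it. Widening the bottleneck to three components would let the private variables back in, so the effect depends on how aggressive the compression is.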
