Learning Visual-Semantic Subspace Representations
Gabriel Moreira, Manuel Marques, João Paulo Costeira, Alexander Hauptmann

TL;DR
This paper introduces a nuclear norm-based loss function for learning image representations that effectively encode semantic relationships and partial orders, supported by theoretical analysis of its geometric properties.
Contribution
It presents a novel loss function grounded in information theory that captures the spectral geometry of visual-semantic data and enforces symbolic structure in learned representations.
Findings
The loss promotes class orthogonality.
It encodes the spectral geometry of data.
It associates logical propositions with subspaces, so the learned representations follow a predefined symbolic structure.
Abstract
Learning image representations that capture rich semantic relationships remains a significant challenge. Existing approaches are either contrastive, lacking robust theoretical guarantees, or struggle to effectively represent the partial orders inherent to structured visual-semantic data. In this paper, we introduce a nuclear norm-based loss function, grounded in the same information theoretic principles that have proved effective in self-supervised learning. We present a theoretical characterization of this loss, demonstrating that, in addition to promoting class orthogonality, it encodes the spectral geometry of the data within a subspace lattice. This geometric representation allows us to associate logical propositions with subspaces, ensuring that our learned representations adhere to a predefined symbolic structure.
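As a rough illustration of why a nuclear norm objective can promote class orthogonality, the sketch below compares the nuclear norm (the sum of singular values) of a toy embedding matrix as the angle between two classes grows. This is not the loss defined in the paper, whose exact form is not given in this summary; the helper names `nuclear_norm` and `toy_embeddings` and the two-class, two-dimensional setup are assumptions made only for this example.

```python
import numpy as np


def nuclear_norm(Z):
    """Return the sum of the singular values of a 2-D array."""
    return float(np.linalg.svd(Z, compute_uv=False).sum())


def toy_embeddings(angle_deg, n_per_class=4):
    """Hypothetical toy data: two classes of identical unit vectors in R^2.

    Class A lies along the x-axis; class B is rotated by angle_deg.
    Rows are samples, columns are embedding dimensions.
    """
    theta = np.deg2rad(angle_deg)
    a = np.array([1.0, 0.0])
    b = np.array([np.cos(theta), np.sin(theta)])
    return np.vstack([np.tile(a, (n_per_class, 1)),
                      np.tile(b, (n_per_class, 1))])


# Nuclear norm for collapsed, oblique, and orthogonal class directions.
collinear = nuclear_norm(toy_embeddings(0))
oblique = nuclear_norm(toy_embeddings(45))
orthogonal = nuclear_norm(toy_embeddings(90))

print(f"collinear:  {collinear:.4f}")
print(f"oblique:    {oblique:.4f}")
print(f"orthogonal: {orthogonal:.4f}")
```

With unit-norm embeddings the Frobenius norm is fixed, so the nuclear norm grows only when the energy spreads across more singular directions. In this toy setting it is smallest when both classes collapse onto one direction and largest when they are orthogonal, which is the intuition behind maximizing it over all embeddings.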
Taxonomy
Topics: Natural Language Processing Techniques · Software Engineering Research · Semantic Web and Ontologies
