TL;DR
This paper introduces a new embedding paradigm that represents concepts as linear subspaces, enabling hierarchical, logical, and set-theoretic operations. It achieves state-of-the-art results on WordNet reconstruction and link prediction and outperforms bi-encoder baselines on natural language inference.
Contribution
It proposes a novel subspace embedding framework that models hierarchy and logic naturally, with a differentiable learning method for orientation and dimensionality.
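To make the idea of a differentiable projection concrete, here is a minimal sketch of one possible smooth relaxation. It is an assumption for illustration, not the paper's formulation: a ridge-regularized projector $P_\lambda = X (X^\top X + \lambda I)^{-1} X^\top$, whose eigenvalues lie in $[0, 1)$ and whose trace acts as a continuous stand-in for dimension. The names `soft_projector`, `soft_dim`, and the parameter `lam` are hypothetical.

```python
import numpy as np

def soft_projector(x, lam=1e-2):
    """Ridge-style relaxation (an assumed form, not taken from the paper):
    P = X (X^T X + lam * I)^{-1} X^T, where the columns of x span the concept."""
    k = x.shape[1]
    gram = x.T @ x + lam * np.eye(k)
    return x @ np.linalg.solve(gram, x.T)

def soft_dim(p):
    """Continuous surrogate for subspace dimension: the trace of the soft projector."""
    return float(np.trace(p))

# Two spanning directions in R^3: one with a large norm, one with a tiny norm.
x = np.array([[10.0, 0.0],
              [0.0, 0.01],
              [0.0, 0.0]])

p_soft = soft_projector(x, lam=1e-2)   # the tiny direction is mostly suppressed
p_hard = soft_projector(x, lam=1e-12)  # approaches the exact rank-2 projector
```

In this sketch, shrinking the norm of a basis vector smoothly lowers its contribution to the trace, so both orientation (the directions of the columns) and an effective dimension could in principle be adjusted by gradient descent.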
Findings
State-of-the-art in WordNet reconstruction and link prediction
Outperforms bi-encoder baselines on natural language inference
Provides an interpretable, geometrically grounded logical entailment model
Abstract
Traditional neural embeddings represent concepts as points, excelling at similarity but struggling with higher-level reasoning and asymmetric relationships. We introduce a novel paradigm: embedding concepts as linear subspaces. This framework inherently models generality via subspace dimensionality and hierarchy through subspace inclusion. It naturally supports set-theoretic operations like intersection (conjunction), linear sum (disjunction) and orthogonal complements (negations), aligning with classical formal semantics. To enable differentiable learning, we propose a smooth relaxation of orthogonal projection operators, allowing for the learning of both subspace orientation and dimension. Our method achieves state-of-the-art results in reconstruction and link prediction on WordNet. Furthermore, on natural language inference benchmarks, our subspace embeddings surpass bi-encoder…
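The set-theoretic operations named in the abstract can be sketched with exact orthogonal projectors. This is a minimal NumPy illustration using standard linear algebra, not the authors' implementation; the function names and the toy concepts (`animal`, `dog`, `plant`) are hypothetical.

```python
import numpy as np

def orth_basis(vectors, tol=1e-10):
    """Orthonormal basis (as columns) for the span of the columns of `vectors`."""
    u, s, _ = np.linalg.svd(vectors, full_matrices=False)
    return u[:, s > tol]

def projector(vectors):
    """Orthogonal projection matrix onto the span of the columns of `vectors`."""
    q = orth_basis(vectors)
    return q @ q.T

def complement(p):
    """Negation: projector onto the orthogonal complement."""
    return np.eye(p.shape[0]) - p

def linear_sum(p_a, p_b, tol=1e-10):
    """Disjunction: projector onto A + B, the range of the PSD matrix P_a + P_b."""
    w, v = np.linalg.eigh(p_a + p_b)
    q = v[:, w > tol]
    return q @ q.T

def intersection(p_a, p_b):
    """Conjunction: A ∩ B equals the complement of (A^⊥ + B^⊥)."""
    return complement(linear_sum(complement(p_a), complement(p_b)))

def is_included(p_a, p_b, tol=1e-8):
    """Hierarchy: A ⊆ B exactly when P_b P_a = P_a."""
    return bool(np.allclose(p_b @ p_a, p_a, atol=tol))

def dim(p):
    """Generality: subspace dimension is the trace of its projector."""
    return int(round(float(np.trace(p))))

# Toy concepts in R^3, built from the standard basis vectors.
e = np.eye(3)
animal = projector(e[:, [0, 1]])  # 2-dimensional, more general
dog = projector(e[:, [0]])        # 1-dimensional, more specific
plant = projector(e[:, [1, 2]])

living_overlap = intersection(animal, plant)  # span(e2)
not_animal = complement(animal)               # span(e3)
dog_or_plant = linear_sum(dog, plant)         # all of R^3
```

Here `is_included(dog, animal)` holds while `is_included(animal, dog)` does not, which is the asymmetry that point embeddings struggle to express.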
Peer Reviews
Decision·Submitted to ICLR 2026
Thank you so much for submitting this work! I enjoyed reading this paper and learned a great deal from it. Below are what I believe are this paper’s main strengths: 1. **[Originality, Critical]** The proposed paradigm (i.e., representing concepts as subspaces rather than fixed vectors), the used learning algorithm (i.e., a mechanism for learning different rank subspaces from the data), and the properties that derive from using this paradigm and learning process (i.e., naturally composable and h
In contrast, I believe the following are some of this work’s limitations: 1. **[Significance, Major]** Against standard good scientific practices, the empirical quantitative experiments in this work do not include any error bars. This makes it difficult for one to judge the significance of any observed differences and sets a negative precedent for not following good experimental conventions. 2. **[Quality and Clarity, Minor]** It is unclear how certain hyperparameters like $ \lambda $ are selec
The proposed embedding method increases the interpretability of logical operations and concept structures in neural networks. The traditional semantics of logic and sets is explicitly represented in geometric entities.
This version of the proposed work still lacks both technical and theoretical clarity. The authors seem to mix entities and propositions, which introduces problems. For example, the authors embed “man on a boat” as a concept represented by a single direction, then map it to x1 and x2, where x1 might represent “man on a boat that is fishing” while x2 might represent “man on a boat that is not fishing”. In this way, the concept “man on a boat” is represented by the subspace span(x
+ Novel approach for representation learning: the idea of subspace learning lends itself to more interpretable logical operations than existing approaches. Given the generality of the formulation, this type of learning could turn out to be highly significant in several applications. + Comprehensive empirical validation that shows the generality of the approach across various tasks. The proposed method outperforms state-of-the-art models in WordNet reconstruction, link prediction, and NLI. T
Weakness - Not a weakness as such, but the paper does not spell out whether the new representation has limitations and, if so, what they are. Scalability is one possible limitation (Section 6 discusses this briefly) compared to standard embeddings. In general, it would be nice to know what trade-offs are made to achieve learning that is more semantically rich.
