On Improving Neurosymbolic Learning by Exploiting the Representation Space
Aaditya Naik, Efthymia Tsamoura, Shibo Jin, Mayur Naik, Dan Roth

TL;DR
This paper introduces CLIPPER, a pruning technique that uses representation similarity to shrink the label-combination search space in neurosymbolic learning, improving performance across multiple benchmarks.
Contribution
The paper presents CLIPPER, a new integer linear programming-based method to prune label spaces in neurosymbolic learning, enhancing existing algorithms without requiring major modifications.
Findings
CLIPPER improves neurosymbolic learning accuracy by up to 53%.
The approach scales well across 16 complex benchmarks.
State-of-the-art results are achieved with existing neurosymbolic engines.
Abstract
We study the problem of learning neural classifiers in a neurosymbolic setting where the hidden gold labels of input instances must satisfy a logical formula. Learning in this setting proceeds by first computing (a subset of) the possible combinations of labels that satisfy the formula and then computing a loss using those combinations and the classifiers' scores. One challenge is that the space of label combinations can grow exponentially, making learning difficult. We propose a technique that prunes this space by exploiting the intuition that instances with similar latent representations are likely to share the same label. While this intuition has been widely used in weakly supervised learning, its application in our setting is challenging due to label dependencies imposed by logical constraints. We formulate the pruning process as an integer linear program that discards inconsistent…
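As a rough illustration of the learning setup described in the abstract, the Python sketch below enumerates the label combinations that satisfy a formula and computes a loss from those combinations and the classifier scores. It assumes a toy constraint (two digit labels must sum to a known value) that is not taken from the paper. The function names `pre_image`, `nesy_loss`, and `prune_same_label` are hypothetical, and the pruning step is a deliberately naive same-label filter standing in for the similarity intuition, not the paper's integer linear program.

```python
import itertools
import math


def pre_image(num_inputs, num_classes, target_sum):
    """Enumerate every label tuple whose entries add up to target_sum.

    Assumption: the logical formula is a simple sum constraint.
    """
    candidates = itertools.product(range(num_classes), repeat=num_inputs)
    return [combo for combo in candidates if sum(combo) == target_sum]


def nesy_loss(probs, combos):
    """Negative log of the probability mass on the satisfying combinations.

    probs[i][k] is the classifier's score for input i having label k.
    """
    total = sum(
        math.prod(probs[i][label] for i, label in enumerate(combo))
        for combo in combos
    )
    return -math.log(total)


def prune_same_label(combos, similar_pairs):
    """Keep only combinations that give the same label to each similar pair.

    Assumption: similar_pairs lists index pairs (i, j) whose latent
    representations were judged close by some external procedure.
    """
    return [
        combo
        for combo in combos
        if all(combo[i] == combo[j] for i, j in similar_pairs)
    ]


# Toy usage: two inputs, ten classes, labels must sum to 4.
combos = pre_image(num_inputs=2, num_classes=10, target_sum=4)
uniform = [[0.1] * 10, [0.1] * 10]  # hypothetical classifier scores
loss_before = nesy_loss(uniform, combos)

# Suppose inputs 0 and 1 are judged similar, so they should share a label.
pruned = prune_same_label(combos, similar_pairs=[(0, 1)])
loss_after = nesy_loss(uniform, pruned)
```

Before the constraint is applied there are `num_classes ** num_inputs` candidate tuples, which is the exponential growth the abstract refers to. Pruning changes which combinations contribute to the loss, so a filter that wrongly discards the gold combination would mislead training.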
Peer Reviews
Decision · Submitted to ICLR 2026
- The proposed method addresses the exponential explosion of label combinations in NESY with an elegant pruning approach.
- The authors provide clear definitions, soundness guarantees, and an ILP optimality proof.
- The method leverages standard ideas (representation similarity + graph consistency + ILP) without major theoretical innovation.
- ILP optimization over large NESY datasets may be computationally expensive; no runtime or complexity analysis is reported.
- The effectiveness heavily depends on the quality of latent representations; the paper does not explore failure cases or encoder ablation.
- While numerical gains are large, the work lacks intuitive analysis or visualization of what pre-images
- Overall I find the paper to be quite well-written, barring the critique mentioned below.
- The problem is well defined, and the solution is derived from first principles.
- The use of a running example makes it easy to follow the exposition.
- The authors should use \citep and \citet as appropriate throughout the paper. Currently, it appears as though they only make use of \citet, which makes the paper quite a bit harder to read.
- I find lines 117-121 to be very restrictive. For instance, [1], [2], and [3] do not require that we create facts of the form digit(d, x1). Rather, the characterization given is specific to logic programming, which does not encompass the entirety of NeSy.
- "Unlike supervised learning, in NeSy... the gold
1. The paper is well-written, and the background and proposed approach are very well explained.
2. The proposed idea is well-motivated and intuitive. The challenges associated with the practical implementation of pre-image space pruning are well-discussed, and the proposed approach effectively addresses these challenges in a simple yet effective manner.
3. The evaluation sections show very promising results!
4. CLIPPER is complementary to existing NESY engines and can operate in a training-free
1. The overhead of solving the ILP is not discussed.
2. Dependence on the quality of the pretrained encoder, or robustness to encoder noise, is not discussed.
Taxonomy
Topics: Evolutionary Algorithms and Applications · Advanced Memory and Neural Computing · Neural Networks and Reservoir Computing
