Neural Concept Binder
Wolfgang Stammer, Antonia Wüst, David Steinmann, Kristian Kersting

TL;DR
The Neural Concept Binder (NCB) introduces a novel framework for deriving discrete and continuous concept representations from unlabeled images, enabling interpretable visual reasoning and integration with external knowledge.
Contribution
NCB combines soft and hard binding mechanisms to produce expressive, interpretable concept representations that can be integrated with neural and symbolic reasoning modules.
Findings
Effective in deriving discrete concept representations from unlabeled images
Preserves model performance while enabling external knowledge integration
Validated on the CLEVR-Sudoku dataset
Abstract
The challenge in object-based visual reasoning lies in generating concept representations that are both descriptive and distinct. Achieving this in an unsupervised manner requires human users to understand the model's learned concepts and, if necessary, revise incorrect ones. To address this challenge, we introduce the Neural Concept Binder (NCB), a novel framework for deriving both discrete and continuous concept representations, which we refer to as "concept-slot encodings". NCB employs two types of binding: "soft binding", which leverages the recent SysBinder mechanism to obtain object-factor encodings, and subsequent "hard binding", achieved through hierarchical clustering and retrieval-based inference. This enables obtaining expressive, discrete representations from unlabeled images. Moreover, the structured nature of NCB's concept representations allows for intuitive inspection…
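The "hard binding" step described above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the authors' implementation: the continuous block encodings that SysBinder would produce are replaced by hand-written 2-D vectors, a naive average-linkage agglomerative clustering stands in for whichever hierarchical clustering method NCB uses, and the retrieval corpus is reduced to one mean prototype per cluster. All function names (agglomerative_clusters, build_retrieval_corpus, hard_bind) and the distance threshold are hypothetical.

```python
import math


def agglomerative_clusters(points, distance_threshold):
    """Naive average-linkage agglomerative clustering (illustrative stand-in).

    Starts with one cluster per point and repeatedly merges the closest pair
    of clusters until the closest pair is farther apart than the threshold.
    Returns a list of clusters, each a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def average_linkage(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        dist, ia, ib = min(
            (average_linkage(a, b), ia, ib)
            for ia, a in enumerate(clusters)
            for ib, b in enumerate(clusters)
            if ia < ib
        )
        if dist > distance_threshold:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return clusters


def build_retrieval_corpus(points, clusters):
    """Pair each cluster with a discrete concept id and a mean prototype."""
    dim = len(points[0])
    corpus = []
    for concept_id, members in enumerate(clusters):
        prototype = tuple(
            sum(points[i][d] for i in members) / len(members) for d in range(dim)
        )
        corpus.append((concept_id, prototype))
    return corpus


def hard_bind(encoding, corpus):
    """Retrieval-based inference: return the id of the nearest prototype."""
    return min(corpus, key=lambda entry: math.dist(encoding, entry[1]))[0]


# Toy continuous encodings for a single block (assumed data, three loose groups).
points = [
    (0.0, 0.1), (0.2, 0.0),    # group A
    (5.0, 5.1), (5.2, 4.9),    # group B
    (10.0, 0.1), (9.8, 0.0),   # group C
]

clusters = agglomerative_clusters(points, distance_threshold=1.0)
corpus = build_retrieval_corpus(points, clusters)

# A new continuous encoding is mapped to a discrete concept symbol.
concept_id = hard_bind((5.1, 5.0), corpus)
```

In this sketch the discrete symbol is simply the index of the nearest cluster prototype. Repeating the lookup for every block of every object slot would yield one tuple of discrete ids per object, which is the general shape the abstract calls a "concept-slot encoding".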
Taxonomy
Topics: Machine Learning in Bioinformatics
Methods: Attention Is All You Need · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer
