Differentiable Disentanglement Filter: an Application Agnostic Core Concept Discovery Probe
Guntis Barzdins, Eduards Sidorovics

TL;DR
This paper introduces a novel neural network nonlinearity, the Differentiable Disentanglement Filter (DDF), that can be integrated into existing networks to automatically disentangle core concepts, aiding interpretability and understanding of neural representations.
Contribution
The paper proposes the DDF, a new nonlinearity inspired by hyper-dimensional computing theory that facilitates disentangling core concepts in neural networks and is applicable across various models.
Findings
DDF can be inserted into neural networks to disentangle core concepts.
DDF successfully disentangles concepts in 3D scene representations.
The approach enhances interpretability of neural network layers.
Abstract
It has long been speculated that deep neural networks function by discovering a hierarchical set of domain-specific core concepts or patterns, which are then combined to recognize more elaborate concepts for classification or other machine learning tasks. However, disentangling the actual core concepts ingrained in word embeddings (such as word2vec or BERT) or in deep convolutional image recognition neural networks (such as PG-GAN) is difficult, and some success has been achieved only recently. In this paper we propose a novel neural network nonlinearity named Differentiable Disentanglement Filter (DDF), which can be transparently inserted into any existing neural network layer to automatically disentangle the core concepts used by that layer. The DDF probe is inspired by the obscure properties of hyper-dimensional computing theory. The DDF proof-of-concept…
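The abstract does not say which properties of hyper-dimensional computing the DDF relies on, so the following is only a general illustration of that theory and not the authors' method. It is a minimal sketch of two well-known facts: random high-dimensional vectors are nearly orthogonal, and a sum (a "bundle") of a few such vectors can be decomposed again by comparing it against a codebook. The function names, the bipolar encoding, the dimension of 10,000, and the threshold of 0.2 are all assumptions chosen for the example.

```python
import numpy as np


def random_hypervectors(n, dim, seed=0):
    """Draw n random bipolar (+1/-1) vectors of length dim (assumed encoding)."""
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=(n, dim))


def cosine_to_codebook(x, codebook):
    """Cosine similarity between vector x and every row of the codebook."""
    norms = np.linalg.norm(codebook, axis=1) * np.linalg.norm(x)
    return (codebook @ x) / norms


def recover_components(x, codebook, threshold=0.2):
    """Return the codebook indices whose similarity to x exceeds the threshold.

    The threshold is a hypothetical choice: unrelated hypervectors score
    close to 0, while members of a three-vector bundle score around 0.58.
    """
    sims = cosine_to_codebook(x, codebook)
    return sorted(np.flatnonzero(sims > threshold).tolist())


# A codebook of 50 hypothetical "concepts" in 10,000 dimensions.
codebook = random_hypervectors(n=50, dim=10_000)

# Entangle three concepts by summing their hypervectors into one vector.
bundle = codebook[[3, 17, 42]].sum(axis=0)

# Disentangle: the three summed concepts stand out against the other 47.
print(recover_components(bundle, codebook))
```

With these settings the script should print the three bundled indices, 3, 17 and 42. The sketch assumes the codebook is known in advance; the paper's stated aim is to discover the concepts automatically, which this example does not attempt.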
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Multimodal Machine Learning Applications · Topic Modeling
