Where We Have Arrived in Proving the Emergence of Sparse Symbolic Concepts in AI Models
Qihan Ren, Jiayang Gao, Wen Shen, Quanshi Zhang

TL;DR
This paper proves that, under three common conditions, a well-trained deep neural network (DNN) encodes only a relatively small number of sparse interactions between input variables, and that these interactions can be treated as symbolic primitive inference patterns.
Contribution
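It states three formal conditions under which a DNN encodes only sparse symbolic primitive inference patterns: zero high-order derivatives of the output with respect to the inputs, higher confidence on less-occluded inputs, and no significant loss of confidence under occlusion.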
Findings
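The proof assumes that the high-order derivatives of the network output with respect to the input variables are all zero.
It also assumes that the DNN yields higher confidence on less-occluded inputs and that its confidence does not degrade significantly on occluded samples.
Under these conditions, the DNN encodes only a relatively small number of sparse interactions between input variables, and these interactions suffice to mimic the DNN's inference scores across many samples.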
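The claim that a few interactions can mimic inference scores can be illustrated with a minimal sketch. It assumes the interaction is measured as the Harsanyi dividend, I(S) = sum over T ⊆ S of (-1)^(|S|-|T|) v(x_T), where v(x_T) is the model output when only the variables in T are left unmasked; this page does not spell out the definition. The model `toy_score` and all other names below are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a tuple, from the empty set to the full set."""
    items = tuple(items)
    for r in range(len(items) + 1):
        yield from combinations(items, r)


def toy_score(kept):
    """Hypothetical model output v(x_T) when only the variables in `kept` are unmasked."""
    kept = set(kept)
    score = 0.0
    if 0 in kept:
        score += 1.0   # variable 0 contributes on its own
    if {1, 2} <= kept:
        score += 2.0   # AND pattern between variables 1 and 2
    if {0, 2, 3} <= kept:
        score -= 0.5   # AND pattern among variables 0, 2 and 3
    return score


def harsanyi_interactions(score_fn, n):
    """Assumed definition: I(S) = sum_{T subset of S} (-1)^(|S|-|T|) * v(x_T)."""
    return {
        S: sum((-1) ** (len(S) - len(T)) * score_fn(T) for T in subsets(S))
        for S in subsets(range(n))
    }


N = 4
interactions = harsanyi_interactions(toy_score, N)

# Keep only the interactions with a non-negligible effect.
salient = {S: val for S, val in interactions.items() if abs(val) > 1e-9}


def reconstruct(kept):
    """Approximate v(x_T) by summing the salient interactions contained in `kept`."""
    kept = set(kept)
    return sum(val for S, val in salient.items() if set(S) <= kept)


# Compare the reconstruction with the true score on all 2^N masked samples.
max_error = max(abs(reconstruct(T) - toy_score(T)) for T in subsets(range(N)))

print(f"{len(salient)} salient interactions out of {len(interactions)} subsets")
print(f"max reconstruction error over all masks: {max_error}")
```

In this toy setting, 3 of the 16 possible interactions are non-zero, and they reproduce the score on every masked sample. A real DNN would need a masking baseline and a tolerance threshold, neither of which is specified here.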
Abstract
This study aims to prove the emergence of symbolic concepts (or more precisely, sparse primitive inference patterns) in well-trained deep neural networks (DNNs). Specifically, we prove the following three conditions for the emergence. (i) The high-order derivatives of the network output with respect to the input variables are all zero. (ii) The DNN can be used on occluded samples and when the input sample is less occluded, the DNN will yield higher confidence. (iii) The confidence of the DNN does not significantly degrade on occluded samples. These conditions are quite common, and we prove that under these conditions, the DNN will only encode a relatively small number of sparse interactions between input variables. Moreover, we can consider such interactions as symbolic primitive inference patterns encoded by a DNN, because we show that inference scores of the DNN on an exponentially…
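Condition (ii) can be checked empirically in a simple way. The sketch below uses one possible reading of the condition, namely that the confidence averaged over all masks with m unmasked variables grows with m; the averaging scheme, the function `toy_confidence`, and its coefficients are assumptions made for illustration, not the paper's formal statement.

```python
from itertools import combinations
from statistics import mean


def toy_confidence(kept):
    """Hypothetical confidence of a model that sees only the variables in `kept`."""
    kept = set(kept)
    conf = 0.5 + 0.1 * len(kept)   # each visible variable adds a little confidence
    if {0, 1} <= kept:
        conf += 0.1                # bonus when variables 0 and 1 are both visible
    return conf


def average_confidence_by_order(conf_fn, n):
    """Average conf_fn over all masks that keep exactly m of n variables, for m = 0..n."""
    return [
        mean(conf_fn(S) for S in combinations(range(n), m))
        for m in range(n + 1)
    ]


N = 4
averages = average_confidence_by_order(toy_confidence, N)

# Assumed reading of condition (ii): less occlusion gives higher average confidence.
is_monotone = all(a < b for a, b in zip(averages, averages[1:]))

# Descriptive figure related to condition (iii): average confidence relative to
# the unoccluded sample. No threshold is implied here.
relative = [a / averages[-1] for a in averages]

print("average confidence by number of unmasked variables:", averages)
print("monotonically increasing:", is_monotone)
print("relative to the unoccluded sample:", relative)
```

For a real network, `toy_confidence` would be replaced by a forward pass on the masked input, and the number of masks per order would usually be sampled rather than enumerated.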
Taxonomy
Topics: Neural Networks and Applications · Evolutionary Algorithms and Applications
