Where We Have Arrived in Proving the Emergence of Sparse Symbolic Concepts in AI Models

Qihan Ren; Jiayang Gao; Wen Shen; Quanshi Zhang

arXiv:2305.01939·cs.LG·September 16, 2024·2 cites

Open Access · 1 Repo

TL;DR

This paper proves that, under three common conditions, a well-trained deep neural network (DNN) encodes only a relatively small number of sparse interactions between input variables, and argues that these interactions can be treated as symbolic primitive inference patterns.

Contribution

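The paper states three formal conditions under which a DNN encodes only sparse symbolic primitive inference patterns: zero high-order derivatives of the output with respect to the input variables, confidence that rises as occlusion decreases, and confidence that does not degrade significantly on occluded samples.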

Findings

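01  Condition: the high-order derivatives of the network output with respect to the input variables are all zero.

02  Condition: the DNN yields higher confidence when the input sample is less occluded, and its confidence does not degrade significantly on occluded samples.

03  Result: a relatively small number of sparse interactions suffices to mimic the DNN's inference scores across many samples.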

Abstract

This study aims to prove the emergence of symbolic concepts (or more precisely, sparse primitive inference patterns) in well-trained deep neural networks (DNNs). Specifically, we prove the following three conditions for the emergence. (i) The high-order derivatives of the network output with respect to the input variables are all zero. (ii) The DNN can be used on occluded samples and when the input sample is less occluded, the DNN will yield higher confidence. (iii) The confidence of the DNN does not significantly degrade on occluded samples. These conditions are quite common, and we prove that under these conditions, the DNN will only encode a relatively small number of sparse interactions between input variables. Moreover, we can consider such interactions as symbolic primitive inference patterns encoded by a DNN, because we show that inference scores of the DNN on an exponentially…
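
The notion of an "interaction between input variables" can be made concrete with a small example. The following is a minimal sketch, assuming the interactions are Harsanyi-style interactions computed from the model's outputs on masked inputs, I(S) = Σ_{T⊆S} (−1)^{|S|−|T|} v(x_T); that definition is not stated on this page and is an assumption here. The names `toy_model`, `masked_output`, and `harsanyi_interactions` are hypothetical, and the toy function is a stand-in for a DNN, not the authors' code.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = tuple(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def toy_model(x):
    # Hypothetical stand-in for a DNN output: only three groups of
    # input variables actually interact.
    return 2.0 * x[0] * x[1] + 0.5 * x[2] - 1.0 * x[0] * x[2] * x[3]


def masked_output(model, x, baseline, kept):
    """Model output when variables outside `kept` are set to the baseline."""
    masked = [x[i] if i in kept else baseline[i] for i in range(len(x))]
    return model(masked)


def harsanyi_interactions(model, x, baseline):
    """Compute v(x_T) for all masks T and I(S) for all subsets S.

    Assumed definition: I(S) = sum over T subset of S of
    (-1)^(|S|-|T|) * v(x_T). The cost is exponential in len(x).
    """
    n = len(x)
    v = {T: masked_output(model, x, baseline, T) for T in subsets(range(n))}
    interactions = {
        S: sum((-1) ** (len(S) - len(T)) * v[T] for T in subsets(S))
        for S in v
    }
    return v, interactions


x = [1.0, 1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0, 0.0]
v, I = harsanyi_interactions(toy_model, x, baseline)

# Keep only interactions with a non-negligible effect.
salient = {S: val for S, val in I.items() if abs(val) > 1e-9}
for S, val in sorted(salient.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(S), val)

# The output on any masked sample is the sum of the interactions it contains.
for T in v:
    assert abs(sum(I[S] for S in subsets(T)) - v[T]) < 1e-9
```

In this toy case only 3 of the 16 possible interactions are nonzero, one per term of `toy_model`, yet they reproduce the output on all 16 masked samples. This is the sense in which a few sparse interactions can mimic inference scores on many samples; whether a real DNN behaves this way is what the paper's conditions address.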

Code & Models

Repositories

sjtu-xai-lab/interaction-sparsity
PyTorch · Official

Taxonomy

Topics: Neural Networks and Applications · Evolutionary Algorithms and Applications