Deep Networks as Logical Circuits: Generalization and Interpretation
Christopher Snyder, Sriram Vishwanath

TL;DR
This paper decomposes the classification map of a deep neural network into a hierarchical logical (AND/OR) circuit whose inputs are linear classifiers, so that the network can be interpreted, and its generalization studied, by analyzing and modifying those internal logical components.
Contribution
It presents a novel logical circuit framework for interpreting DNNs, linking interpretability with generalization bounds, and demonstrates its utility on MNIST and controlled settings.
Findings
Logical circuit decomposition aligns with semantic categories
Improved generalization bounds without margin information
Enhanced interpretability and network diagnosis
Abstract
Not only are Deep Neural Networks (DNNs) black box models, but also we frequently conceptualize them as such. We lack good interpretations of the mechanisms linking inputs to outputs. Therefore, we find it difficult to analyze in human-meaningful terms (1) what the network learned and (2) whether the network learned. We present a hierarchical decomposition of the DNN discrete classification map into logical (AND/OR) combinations of intermediate (True/False) classifiers of the input. Those classifiers that cannot be further decomposed, called atoms, are (interpretable) linear classifiers. Taken together, we obtain a logical circuit with linear classifier inputs that computes the same label as the DNN. This circuit does not structurally resemble the network architecture, and it may require many fewer parameters, depending on the configuration of weights. In these cases, we obtain…
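To make the idea of "a logical circuit with linear classifier inputs that computes the same label as the DNN" concrete, here is a minimal toy sketch. It is not the paper's decomposition algorithm. It relies only on the elementary fact that a sum of ReLU outputs is positive exactly when at least one pre-activation is positive, so this particular two-unit network's label equals an OR of two linear classifiers. The weights `W1` and `B1` and all function names are hypothetical, chosen purely for illustration.

```python
import random

def relu(z):
    """Standard ReLU activation."""
    return max(z, 0.0)

def linear(w, b, x):
    """Affine function w . x + b for plain Python sequences."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Hypothetical toy weights for a network with two hidden ReLU units.
W1 = [(1.0, -2.0), (-0.5, 1.5)]
B1 = [0.3, -0.7]

def network_label(x):
    """Label of the toy network: is the sum of hidden ReLU outputs positive?"""
    hidden = [relu(linear(w, b, x)) for w, b in zip(W1, B1)]
    return sum(hidden) > 0.0

def circuit_label(x):
    """Same label written as an OR of two linear classifiers (the 'atoms')."""
    atoms = [linear(w, b, x) > 0.0 for w, b in zip(W1, B1)]
    return atoms[0] or atoms[1]

def agree_on_random_inputs(n=1000, seed=0):
    """Check that network and circuit give the same label on n random points."""
    rng = random.Random(seed)
    points = ([rng.uniform(-3, 3), rng.uniform(-3, 3)] for _ in range(n))
    return all(network_label(x) == circuit_label(x) for x in points)
```

Calling `agree_on_random_inputs()` compares the two labelings on random inputs. In this toy case the circuit needs only the two linear atoms, whereas deeper networks would require the hierarchical AND/OR structure that the abstract describes.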
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Neural Networks and Applications · Advanced Memory and Neural Computing
