Concept Learning in the Wild: Towards Algorithmic Understanding of Neural Networks
Elad Shoham, Hadar Cohen, Khalil Wattad, Havana Rika, Dan Vilenchik

TL;DR
This paper investigates how neural networks, specifically GNNs trained on SAT problems, learn human-like concepts such as 'support', and demonstrates how these insights can improve algorithms and interpret black-box models.
Contribution
The paper shows that GNNs trained on SAT learn concepts aligned with human-designed heuristics, that these concepts can be identified from the principal components of the embeddings, and that they can be used to improve algorithms and interpret models.
Findings
GNNs learn concepts matching human SAT heuristics.
Principal components encode key concepts in GNN embeddings.
Discovered concepts can improve SAT solving algorithms.
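To make the 'support' concept named in the TL;DR concrete, here is a minimal sketch. It assumes the definition common in the SAT heuristics literature: a variable supports a clause when its literal is the only true literal in that clause under the current assignment. This summary does not spell out the paper's exact definition, so treat the definition, the function name `support_counts`, and the DIMACS-style clause encoding as illustrative assumptions.

```python
from typing import Dict, List

# Assumed encoding: DIMACS-style literals, 3 means x3 and -3 means NOT x3.
Clause = List[int]


def literal_is_true(lit: int, assignment: Dict[int, bool]) -> bool:
    """Evaluate one literal under a full assignment."""
    value = assignment[abs(lit)]
    return value if lit > 0 else not value


def support_counts(clauses: List[Clause], assignment: Dict[int, bool]) -> Dict[int, int]:
    """Count, per variable, the clauses in which it is the only true literal.

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts = {var: 0 for var in assignment}
    for clause in clauses:
        true_lits = [lit for lit in clause if literal_is_true(lit, assignment)]
        if len(true_lits) == 1:  # exactly one literal satisfies the clause
            counts[abs(true_lits[0])] += 1
    return counts


clauses = [[1, 2, -3], [-1, 2, 3], [1, -2, 3]]
assignment = {1: True, 2: True, 3: False}
print(support_counts(clauses, assignment))
```

In this toy formula the first clause has three true literals, so nobody supports it, while the second is satisfied only by x2 and the third only by x1.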
Abstract
Explainable AI (XAI) methods typically focus on identifying essential input features or more abstract concepts for tasks like image or text classification. However, for algorithmic tasks like combinatorial optimization, these concepts may depend not only on the input but also on the current state of the network, as in the case of graph neural networks (GNNs). This work studies concept learning for an existing GNN model trained to solve Boolean satisfiability (SAT). Our analysis reveals that the model learns key concepts matching those guiding human-designed SAT heuristics, particularly the notion of 'support.' We demonstrate that these concepts are encoded in the top principal components (PCs) of the embedding's covariance matrix, allowing for unsupervised discovery. Using sparse PCA, we establish the minimality of these concepts and show their teachability through a…
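The idea that a concept can be read off the top principal component of the embedding covariance matrix can be sketched on synthetic data. The example below is not the authors' pipeline: the embeddings are randomly generated stand-ins, the concept values are made up, and the dimensions (`n_vars`, `dim`) and noise level are arbitrary assumptions. It only illustrates the mechanics of eigendecomposing a covariance matrix and checking whether the leading PC tracks a known per-variable quantity.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vars, dim = 500, 16  # assumed sizes, chosen only for illustration

# Synthetic stand-in for a per-variable concept value (e.g. a support count).
concept = rng.integers(0, 6, size=n_vars).astype(float)

# Plant the concept along one random unit direction, then add small noise.
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)
embeddings = np.outer(concept, direction) + 0.1 * rng.normal(size=(n_vars, dim))

# PCA via eigendecomposition of the embedding covariance matrix.
centered = embeddings - embeddings.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
top_pc = eigvecs[:, -1]                 # eigenvector of the largest eigenvalue
scores = centered @ top_pc              # projection of each embedding on PC1

# The sign of a principal component is arbitrary, so compare absolute correlation.
corr = np.corrcoef(scores, concept)[0, 1]
explained = eigvals[-1] / eigvals.sum()
print(f"|corr(PC1, concept)| = {abs(corr):.3f}, explained variance = {explained:.3f}")
```

Because the concept is planted by construction, a high correlation here is expected; on real GNN embeddings the same check would be an empirical question.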
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus · Principal Components Analysis
