Unit Testing for Concepts in Neural Networks

Charles Lovering; Ellie Pavlick

arXiv:2208.10244 · cs.CL · November 29, 2022

TL;DR

This paper introduces unit tests that evaluate whether a neural network's behavior is consistent with a symbolic theory of concepts, covering groundedness, modularity, and reusability, and applies them to a simple visual concept learning task.

Contribution

It proposes a novel framework of unit tests to assess neural networks' adherence to symbolic concept criteria, bridging neural and symbolic representations.

Findings

01. Models pass tests of groundedness, modularity, and reusability.

02. Important questions about causality in neural models remain open.

03. New analysis methods are needed for internal state understanding.

Abstract

Many complex problems are naturally understood in terms of symbolic concepts. For example, our concept of "cat" is related to our concepts of "ears" and "whiskers" in a non-arbitrary way. Fodor (1998) proposes one theory of concepts, which emphasizes symbolic representations related via constituency structures. Whether neural networks are consistent with such a theory is open for debate. We propose unit tests for evaluating whether a system's behavior is consistent with several key aspects of Fodor's criteria. Using a simple visual concept learning task, we evaluate several modern neural architectures against this specification. We find that models succeed on tests of groundedness, modularity, and reusability of concepts, but that important questions about causality remain open. Resolving these will require new methods for analyzing models' internal states.
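
To make the idea of a behavioral unit test for a concept concrete, the following is a minimal sketch and not the paper's actual test suite, whose definitions are not given in this summary. It assumes one simple reading of modularity: a model's prediction for one concept (shape) should not change when only an unrelated attribute (color) changes. The names `toy_model`, `color_biased_model`, and `passes_modularity_test`, and the dictionary scene format, are all hypothetical stand-ins for a trained network and its inputs.

```python
# Illustrative sketch only: a hypothetical invariance check in the spirit
# of a "modularity" unit test. Not the authors' implementation.

SHAPES = ("circle", "square")
COLORS = ("red", "blue")


def toy_model(scene):
    """Hypothetical stand-in for a trained network: predicts the shape concept."""
    return scene["shape"]


def color_biased_model(scene):
    """Hypothetical failing case: the shape output leaks color information."""
    return "circle" if scene["color"] == "red" else scene["shape"]


def passes_modularity_test(model, shapes=SHAPES, colors=COLORS):
    """Return True if the shape prediction is unchanged when only color varies."""
    for shape in shapes:
        # Collect the predictions for the same shape under every color.
        predictions = {model({"shape": shape, "color": color}) for color in colors}
        if len(predictions) != 1:
            return False
    return True


print(passes_modularity_test(toy_model))           # True
print(passes_modularity_test(color_biased_model))  # False
```

The check is purely behavioral: it inspects only inputs and outputs, which is consistent with the abstract's remark that questions about causality would need methods that look at a model's internal states.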

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications