Invariance, encodings, and generalization: learning identity effects with neural networks
S. Brugiapaglia, M. Liu, P. Tupper

TL;DR
This paper investigates whether neural networks can learn identity effects, a type of constraint in language, from data alone, revealing how input encoding influences their ability to generalize and the limitations of current algorithms.
Contribution
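The paper provides a theoretical framework proving that algorithms satisfying simple criteria cannot correctly infer identity effects from data alone. It shows that deep feedforward neural networks trained with gradient-based methods (such as stochastic gradient descent or Adam) satisfy these criteria, depending on the input encoding, and demonstrates this with experiments on neural networks and input encodings.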
Findings
Whether a neural network learns an identity effect, and generalizes it to novel inputs, depends on the input encoding.
Algorithms satisfying the paper's criteria cannot infer identity effects from data without explicit guidance.
Abstract
Often in language and other areas of cognition, whether two components of an object are identical or not determines if it is well formed. We call such constraints identity effects. When developing a system to learn well-formedness from examples, it is easy enough to build in an identity effect. But can identity effects be learned from the data without explicit guidance? We provide a framework in which we can rigorously prove that algorithms satisfying simple criteria cannot make the correct inference. We then show that a broad class of learning algorithms including deep feedforward neural networks trained via gradient-based algorithms (such as stochastic gradient descent or the Adam method) satisfy our criteria, dependent on the encoding of inputs. In some broader circumstances we are able to provide adversarial examples that the network necessarily classifies incorrectly. Finally, we…
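To make the role of the encoding concrete, here is a minimal sketch, not the paper's experimental setup. It assumes a toy task (two-letter words that are well formed only when both letters match), a one-hot encoding, and a logistic-regression learner trained by plain gradient descent from zero weights. The 24-letter training alphabet, the function names, and the hyperparameters are all illustrative assumptions. Because the letters Y and Z never appear in training, their weights receive no gradient, so the learner gives YY and YZ exactly the same rating.

```python
import numpy as np

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
N = len(ALPHABET)

def one_hot_word(word):
    """Encode a two-letter word as two concatenated one-hot vectors."""
    x = np.zeros(2 * N)
    x[ALPHABET.index(word[0])] = 1.0
    x[N + ALPHABET.index(word[1])] = 1.0
    return x

def is_well_formed(word):
    """Identity effect: a word is well formed iff its two letters match."""
    return float(word[0] == word[1])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, steps=500, lr=0.5):
    """Full-batch gradient descent on the logistic loss, zero initialization."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        residual = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ residual) / len(y)
        b -= lr * residual.mean()
    return w, b

def rate(word, w, b):
    """Rating in (0, 1) that the learner assigns to a word."""
    return sigmoid(one_hot_word(word) @ w + b)

# Assumption: train on all words over the first 24 letters (A..X) only.
train_letters = ALPHABET[:24]
train_words = [a + c for a in train_letters for c in train_letters]
X = np.stack([one_hot_word(wd) for wd in train_words])
y = np.array([is_well_formed(wd) for wd in train_words])

w, b = train_logistic(X, y)

# Y and Z were never seen, so their weights are still exactly zero.
rating_yy = rate("YY", w, b)
rating_yz = rate("YZ", w, b)
print("rating(YY) == rating(YZ):", rating_yy == rating_yz)
```

The linear model is chosen only to keep the example deterministic. It cannot represent the identity effect even on seen letters, so the sketch illustrates the invariance argument rather than a successful learner. A different encoding, in which unseen letters share features with seen ones, would break this symmetry.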
Taxonomy
Topics: Neural Networks and Applications · Domain Adaptation and Few-Shot Learning · Model Reduction and Neural Networks
Methods: Adam
