TL;DR
This paper analyzes theoretically how flexible task abstractions emerge in a linear gated network whose weights and gates are trained jointly by gradient descent. The resulting dynamics support task switching and generalization resembling cognitive flexibility in animals.
Contribution
The paper studies a linear gated network in which the weights self-organize into task-specific modules and the gating layer forms representations used to switch between them, giving a mechanistic account of cognitive flexibility.
Findings
Weights self-organize into task modules
Gating layer representations enable task switching
Task abstractions support generalization and transfer
Abstract
Animals survive in dynamic environments changing at arbitrary timescales, but such data distribution shifts are a challenge to neural networks. To adapt to change, neural systems may change a large number of parameters, which is a slow process involving forgetting past information. In contrast, animals leverage distribution changes to segment their stream of experience into tasks and associate them with internal task abstractions. Animals can then respond flexibly by selecting the appropriate task abstraction. However, how such flexible task abstractions may arise in neural systems remains unknown. Here, we analyze a linear gated network where the weights and gates are jointly optimized via gradient descent, but with neuron-like constraints on the gates including a faster timescale, nonnegativity, and bounded activity. We observe that the weights self-organize into modules specialized for…
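The setup described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function name `train_gated_linear`, the one-pathway-per-task layout, the learning rates, the block length, and the use of clipping to [0, 1] as a stand-in for the nonnegativity and bounded-activity constraints are all assumptions made for illustration. The faster gate timescale is approximated by giving the gates a larger learning rate than the weights.

```python
import numpy as np


def train_gated_linear(teachers, n_blocks=6, block_len=200,
                       lr_w=0.02, lr_g=0.1, seed=0):
    """Online SGD on y_hat = sum_p g[p] * (W[p] @ x), tasks shown in blocks.

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n_paths = len(teachers)                  # assumption: one pathway per task
    d_out, d_in = teachers[0].shape
    W = 0.01 * rng.standard_normal((n_paths, d_out, d_in))  # small random weights
    g = np.full(n_paths, 1.0 / n_paths)      # gates start equal
    losses = []
    for block in range(n_blocks):
        T = teachers[block % n_paths]        # the active task changes every block
        for _ in range(block_len):
            x = rng.standard_normal(d_in)
            y = T @ x                        # target from the current linear task
            path_out = W @ x                 # per-pathway outputs, (n_paths, d_out)
            err = g @ path_out - y           # gated sum minus target, (d_out,)
            losses.append(0.5 * float(err @ err))
            # Gradients of 0.5 * ||err||^2 with respect to weights and gates.
            grad_W = g[:, None, None] * np.outer(err, x)
            grad_g = path_out @ err
            W -= lr_w * grad_W
            # Faster gate update (lr_g > lr_w); clipping keeps gates in [0, 1].
            g = np.clip(g - lr_g * grad_g, 0.0, 1.0)
    return W, g, np.array(losses)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two hypothetical random linear tasks, scaled so targets have unit variance.
    teachers = [rng.standard_normal((3, 5)) / np.sqrt(5) for _ in range(2)]
    W, g, losses = train_gated_linear(teachers)
    print("final gates:", g)
    print("mean loss, first 20 steps:", losses[:20].mean())
    print("mean loss, last 20 steps:", losses[-20:].mean())
```

Whether the pathways in this toy version specialize per task, as the abstract reports for the analyzed model, depends on the assumed hyperparameters, so the sketch only shows the joint weight and gate updates and the gate constraints.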
