Flexible task abstractions emerge in linear networks with fast and bounded units

Kai Sandbrink; Jan P. Bauer; Alexandra M. Proca; Andrew M. Saxe; Christopher Summerfield; Ali Hummos

arXiv:2411.03840·cs.LG·January 17, 2025

TL;DR

This paper presents a theoretical analysis of how flexible task abstractions emerge in linear neural networks with gating mechanisms, revealing dynamics that support task switching and generalization akin to cognitive flexibility in animals.

Contribution

It introduces a model where weights self-organize into task-specific modules and gating layers form representations for switching, providing a mechanistic understanding of cognitive flexibility.

Findings

01 Weights self-organize into task modules

02 Gating layer representations enable task switching

03 Task abstractions support generalization and transfer

Abstract

Animals survive in dynamic environments changing at arbitrary timescales, but such data distribution shifts are a challenge to neural networks. To adapt to change, neural systems may change a large number of parameters, which is a slow process involving forgetting past information. In contrast, animals leverage distribution changes to segment their stream of experience into tasks and associate them with internal task abstractions. Animals can then respond flexibly by selecting the appropriate task abstraction. However, how such flexible task abstractions may arise in neural systems remains unknown. Here, we analyze a linear gated network where the weights and gates are jointly optimized via gradient descent, but with neuron-like constraints on the gates including a faster timescale, nonnegativity, and bounded activity. We observe that the weights self-organize into modules specialized for…
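
The setup described in the abstract (a linear gated network whose weights and gates are trained jointly by gradient descent, with gates that are faster, nonnegative, and bounded) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name train_gated_linear, the two random linear teachers, the block-wise task switching, the learning rates, and the use of clipping to [0, 1] as a stand-in for the nonnegativity and boundedness constraints are all assumptions made for illustration.

```python
import numpy as np

def train_gated_linear(n_blocks=20, steps_per_block=200, d_in=8, d_out=4,
                       n_paths=2, lr_w=0.01, lr_c=0.1, seed=0):
    """Online gradient descent on y = sum_p c[p] * W[p] @ x (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    # Assumption: two random linear teachers stand in for two tasks.
    teachers = rng.normal(size=(2, d_out, d_in)) / np.sqrt(d_in)
    W = rng.normal(size=(n_paths, d_out, d_in)) * 0.01  # slow weights, one matrix per path
    c = np.full(n_paths, 0.5)                           # fast gates, one scalar per path
    losses = []
    for block in range(n_blocks):
        W_star = teachers[block % 2]  # the task switches at every block boundary
        for _ in range(steps_per_block):
            x = rng.normal(size=d_in)
            path_out = W @ x              # shape (n_paths, d_out)
            y = c @ path_out              # gated sum over paths, shape (d_out,)
            err = y - W_star @ x
            losses.append(0.5 * float(err @ err))
            # Gradients of the squared error with respect to gates and weights.
            grad_c = path_out @ err
            grad_W = c[:, None, None] * np.outer(err, x)[None]
            W -= lr_w * grad_W
            # Faster timescale: lr_c > lr_w. Assumption: clipping to [0, 1]
            # approximates the nonnegativity and bounded-activity constraints.
            c = np.clip(c - lr_c * grad_c, 0.0, 1.0)
    return W, c, np.array(losses)

W, c, losses = train_gated_linear()
print("final gates:", c)
print("mean loss over last block:", losses[-200:].mean())
```

Whether the paths in this toy version specialize to individual tasks depends on the learning rates, block length, and initialization chosen here, so it should be read as a starting point for experimentation rather than a reproduction of the paper's results.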

Code & Models

Repositories

aproca/neural_task_abstraction
JAX · Official

Videos

Flexible task abstractions emerge in linear networks with fast and bounded units · SlidesLive