Unexpected Benefits of Self-Modeling in Neural Systems

Vickram N. Premakumar; Michael Vaiana; Florin Pop; Judd Rosenblatt; Diogo Schwerz de Lucena; Kirsten Ziman; and Michael S. A. Graziano

arXiv:2407.10188·cs.LG·July 25, 2024

TL;DR

This paper demonstrates that self-modeling in neural networks acts as a form of self-regularization, reducing complexity and improving parameter efficiency, which may explain some benefits observed in machine learning and biological systems.

Contribution

The paper provides empirical evidence that self-modeling leads to simpler, more regularized neural networks across a range of architectures and tasks, revealing a new regularization mechanism.

Findings

1. Self-modeling narrows the distribution of network weights.

2. Self-modeling lowers the real log canonical threshold (RLCT), a measure of network complexity.

3. Weighting the self-modeling task more heavily increases the reduction in complexity.
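The first finding concerns the width of the weight distribution. As a minimal sketch, one simple proxy for that width is the standard deviation of a layer's weights. The choice of proxy, the function name `weight_width`, and the two toy weight lists are assumptions made for illustration; none of the values come from the paper.

```python
import statistics

def weight_width(weights):
    """Population standard deviation of a flat list of weights.

    Used here as an assumed proxy for the width of the weight distribution.
    """
    return statistics.pstdev(weights)

# Hypothetical toy weights, invented for illustration only.
baseline_weights = [-0.9, -0.4, 0.0, 0.4, 0.9]
self_modeling_weights = [-0.3, -0.1, 0.0, 0.1, 0.3]

# A narrower distribution gives a smaller width.
narrower = weight_width(self_modeling_weights) < weight_width(baseline_weights)
```

In practice the same comparison would be run on the trained weights of a network with and without the self-modeling task, rather than on hand-written lists.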

Abstract

Self-models have been a topic of great interest for decades in studies of human cognition and more recently in machine learning. Yet what benefits do self-models confer? Here we show that when artificial networks learn to predict their internal states as an auxiliary task, they change in a fundamental way. To better perform the self-model task, the network learns to make itself simpler, more regularized, more parameter-efficient, and therefore more amenable to being predictively modeled. To test the hypothesis of self-regularizing through self-modeling, we used a range of network architectures performing three classification tasks across two modalities. In all cases, adding self-modeling caused a significant reduction in network complexity. The reduction was observed in two ways. First, the distribution of weights was narrower when self-modeling was present. Second, a measure of network…
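The auxiliary self-modeling task described in the abstract can be sketched as a combined loss: a classification term plus a weighted term for how well the network predicts its own internal activations. This is a minimal sketch under stated assumptions, not the authors' implementation. The use of mean squared error for the self-model term, the parameter name `aux_weight`, and all function names are assumptions introduced here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Classification loss for a single example with an integer label."""
    return -math.log(softmax(logits)[label])

def mse(predicted, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def self_modeling_loss(class_logits, label, predicted_acts, actual_acts, aux_weight=1.0):
    """Primary task loss plus a weighted self-modeling term.

    predicted_acts: the network's prediction of its own hidden activations.
    actual_acts: the hidden activations it actually produced.
    aux_weight: assumed knob for the emphasis placed on self-modeling.
    """
    task_term = cross_entropy(class_logits, label)
    self_model_term = mse(predicted_acts, actual_acts)
    return task_term + aux_weight * self_model_term
```

Raising `aux_weight` places more emphasis on the self-modeling term, which corresponds to the "greater emphasis on self-modeling" condition listed in the findings.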

Taxonomy

Topics: Neural Networks and Applications