Channel-Wise MLPs Improve the Generalization of Recurrent Convolutional Networks

Nathan Breslow

arXiv:2508.08298·cs.LG·August 13, 2025

TL;DR

Adding a gated channel-wise MLP to a recurrent convolutional network (DAMP) improves both in-distribution and out-of-distribution generalization on the Re-ARC benchmark relative to the plain recurrent convolutional baseline (DARC).

Contribution

The paper introduces DAMP (Depth Aware Multi-layer Perceptron), an architecture that extends DARC (Depth Aware Recurrent Convolution) with a gated MLP for channel mixing, and evaluates its effect on generalization in recurrent convolutional networks.
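To make "gated MLP for channel mixing" concrete, here is a minimal sketch of one common gated-MLP form (a GLU-style block with a SiLU gate) applied independently to the channel vector at every grid position. This is an illustration under stated assumptions, not the paper's implementation: the summary above does not give DAMP's exact layer layout, so the activation, the residual connection, the hidden width, and all names (`gated_channel_mlp`, `mix_channels`, `W_gate`, `W_up`, `W_down`) are hypothetical.

```python
import math
import random


def silu(v):
    """SiLU activation, v * sigmoid(v). Assumed here; the source does not name one."""
    return v / (1.0 + math.exp(-v))


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gated_channel_mlp(x, W_gate, W_up, W_down):
    """GLU-style gated MLP on a single position's channel vector x.

    hidden = silu(W_gate @ x) * (W_up @ x); output = W_down @ hidden.
    """
    gate = [silu(g) for g in matvec(W_gate, x)]
    up = matvec(W_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(W_down, hidden)


def mix_channels(grid, W_gate, W_up, W_down):
    """Apply the same MLP at every (row, col) position, with a residual add.

    grid has shape [height][width][channels]; no spatial mixing happens here,
    only mixing across channels at each position.
    """
    mixed = []
    for row in grid:
        new_row = []
        for px in row:
            delta = gated_channel_mlp(px, W_gate, W_up, W_down)
            new_row.append([a + b for a, b in zip(px, delta)])
        mixed.append(new_row)
    return mixed


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights, for demonstration only."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
channels, hidden_width = 4, 8  # illustrative sizes, not from the paper
W_gate = random_matrix(hidden_width, channels, rng)
W_up = random_matrix(hidden_width, channels, rng)
W_down = random_matrix(channels, hidden_width, rng)

# A 2x3 grid with 4 channels per cell.
grid = [[[rng.uniform(-1, 1) for _ in range(channels)] for _ in range(3)] for _ in range(2)]
out = mix_channels(grid, W_gate, W_up, W_down)
```

In a recurrent convolutional network, a block like this would typically sit alongside the convolution in each recurrent step, so that the convolution mixes information across space and the MLP mixes it across channels. How DAMP actually interleaves the two is not stated in this summary.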

Findings

01. DAMP outperforms DARC in generalization tasks

02. Explicit channel mixing improves robustness

03. Results have implications for neural program synthesis

Abstract

We investigate the impact of channel-wise mixing via multi-layer perceptrons (MLPs) on the generalization capabilities of recurrent convolutional networks. Specifically, we compare two architectures: DARC (Depth Aware Recurrent Convolution), which employs a simple recurrent convolutional structure, and DAMP (Depth Aware Multi-layer Perceptron), which extends DARC with a gated MLP for channel mixing. Using the Re-ARC benchmark, we find that DAMP significantly outperforms DARC in both in-distribution and out-of-distribution generalization under exact-match grading criteria. These results suggest that explicit channel mixing through MLPs enables recurrent convolutional networks to learn more robust and generalizable computational patterns. Our findings have implications for neural program synthesis and highlight the potential of DAMP as a target architecture for hypernetwork approaches.
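The exact-match grading mentioned in the abstract can be sketched as follows: a predicted grid counts as correct only if it has the same shape as the target and every cell agrees. This is a minimal illustration assuming grids are represented as nested lists of integers; the function names are hypothetical and the paper's evaluation code is not shown in this summary.

```python
def exact_match(pred, target):
    """True only if the two grids agree in shape and in every cell."""
    return pred == target


def exact_match_accuracy(preds, targets):
    """Fraction of predictions that match their target exactly."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    correct = sum(exact_match(p, t) for p, t in zip(preds, targets))
    return correct / len(preds)


targets = [[[1, 0], [0, 1]], [[2, 2]]]
preds = [[[1, 0], [0, 1]], [[2, 3]]]  # second grid is off by one cell
accuracy = exact_match_accuracy(preds, targets)  # 0.5
```

Because a single wrong cell makes the whole grid incorrect, this criterion gives no partial credit, which makes it a strict measure of whether a model has learned the underlying transformation.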

Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Neural Networks and Reservoir Computing · Advanced Memory and Neural Computing