Improving Generalization by Permutation Routing Across Model Copies

Shuhei Kashiwamura; Timothee Leleu

arXiv:2605.09256 · cs.LG · May 12, 2026

TL;DR

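This paper applies the M-cover (M-layer) transform to machine learning: a model is replicated M times, and local learning messages are routed across the copies by sampled permutations instead of coupling the copies through parameter averaging, with the aim of improving generalization.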

Contribution

It introduces a structured message-routing framework in which permutations sampled from a mixing kernel \(Q\) determine how learning messages travel across model copies. The construction is formulated for perceptrons, committee machines, and multilayer perceptrons.

Findings

1. The method improves generalization in perceptrons and multilayer networks.
2. Structured message sharing outperforms traditional replica coupling.
3. The framework applies to both discrete models and differentiable neural networks.

Abstract

We introduce a use of the \(M\)-cover (or \(M\)-layer) transform for machine learning. The method replicates a model \(M\) times, but instead of coupling the copies through parameter averaging or an explicit attractive force, as in replicated SGD or Elastic SGD, it rewires the contexts in which local learning messages are computed. Each local loss is evaluated on a routed model whose parameters are drawn from different copies according to permutations sampled from a structured mixing kernel \(Q\). Training then uses the original local update rule, while the resulting learning messages are redistributed across the copies through these routed computational paths. Thus \(Q\) defines a topology for message transport and controls the long-loop structure of the lifted factor graph. We formulate this construction for perceptrons, committee machines, and multilayer perceptrons, showing that the…
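The routing idea in the abstract can be illustrated with a minimal NumPy sketch for a perceptron. This is not the authors' implementation, and everything specific in it is an assumption made for illustration: the function names `sample_routing` and `routed_perceptron_step`, the choice to route per weight coordinate, and the toy kernel standing in for \(Q\) (identity with probability `1 - q`, otherwise a random cyclic shift of the copy index). It shows only the two ingredients the abstract describes: each local loss is evaluated on a routed model assembled from different copies, and the resulting update is sent back to the copies that supplied the parameters.

```python
import numpy as np

def sample_routing(M, N, q, rng):
    """Toy stand-in for the mixing kernel Q (an assumption, not from the paper).

    For each of the N coordinates, keep the identity routing with probability
    1 - q, otherwise apply a cyclic shift of the copy index by a random
    nonzero offset. Returns perm with shape (M, N), where perm[a, i] is the
    copy that supplies coordinate i to routed model a.
    """
    if M > 1:
        offsets = rng.integers(1, M, size=N)
        shifts = np.where(rng.random(N) < q, offsets, 0)
    else:
        shifts = np.zeros(N, dtype=int)
    return (np.arange(M)[:, None] + shifts[None, :]) % M

def routed_perceptron_step(W, x, y, perm, lr=1.0):
    """One perceptron update computed on routed models.

    W has shape (M, N): M copies of an N-dimensional weight vector.
    The local update rule is the ordinary perceptron rule; only the context
    in which it is evaluated (the routed model) changes.
    """
    M, N = W.shape
    cols = np.broadcast_to(np.arange(N), (M, N))
    routed = W[perm, cols]                      # routed[a, i] = W[perm[a, i], i]
    mistakes = (routed @ x) * y <= 0            # which routed models misclassify
    delta = lr * y * mistakes[:, None] * x[None, :]
    W_new = W.copy()
    # Send each message back to the copy that supplied the parameter.
    np.add.at(W_new, (perm, cols), delta)
    return W_new

rng = np.random.default_rng(0)
M, N = 4, 6
W = np.zeros((M, N))
x, y = np.ones(N), 1.0
perm = sample_routing(M, N, q=0.5, rng=rng)
W = routed_perceptron_step(W, x, y, perm)
```

With `q = 0` every routing is the identity and the copies train independently, so `q` is the knob that, in this toy version, controls how strongly the copies exchange messages. Because each column of `perm` is a permutation of the copy indices, every parameter of every copy receives exactly one message per step.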
