Cumulative-Goodness Free-Riding in Forward-Forward Networks: Real, Repairable, but Not Accuracy-Dominant

Amirhossein Yousefiramandi

arXiv:2605.06240·cs.LG·May 8, 2026

TL;DR

This paper investigates layer free-riding in cumulative-goodness Forward-Forward (FF) networks and shows that it is a real, repairable effect: it degrades layer separation but is not the main factor limiting accuracy.

Contribution

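The paper formalizes layer free-riding under the softplus FF criterion and studies three local remedies (per-block, hardness-gated, and depth-scaled) that substantially improve layer separation while changing accuracy by less than one percentage point for non-degenerate training procedures.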

Findings

01

On CIFAR-10 and CIFAR-100, the remedies improve layer-separation statistics by 4× to 45× in deeper layers.

02

Layer free-riding is real, but it is not the dominant factor limiting accuracy.

03

Architecture and augmentation choices affect final accuracy more than the training modifications do.

Abstract

Forward-Forward (FF) training allows each layer to learn from a local goodness criterion. In cumulative-goodness variants, however, later layers can inherit a task that earlier layers have already partially separated. We formalize this phenomenon as layer free-riding: under the softplus FF criterion, the class-discrimination gradient reaching block $d$ decays exponentially with the positive margin accumulated by preceding blocks. We then study three local remedies (per-block, hardness-gated, and depth-scaled) that recover current-layer separation measures without relying on backpropagated gradients. On CIFAR-10 and CIFAR-100, these remedies dramatically improve layer-separation statistics, with $4\times$ to $45\times$ gains in deeper layers, while changing accuracy by less than one percentage point for non-degenerate training procedures. Tiny ImageNet provides a tougher cross-dataset…
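
The exponential decay described in the abstract can be illustrated with a small numerical sketch. It assumes a simplified form of the criterion that is not taken from the paper: the loss on a sample is `softplus(-(prev_margin + own_margin))`, where `prev_margin` stands for the positive margin accumulated by preceding blocks and `own_margin` for the current block's contribution. Both names and the function `grad_magnitude` are hypothetical. Under that assumption the gradient magnitude with respect to `own_margin` is `sigmoid(-(prev_margin + own_margin))`, which is bounded above by `exp(-prev_margin)` when `own_margin` is zero.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def grad_magnitude(prev_margin: float, own_margin: float = 0.0) -> float:
    # Assumed toy loss: softplus(-(prev_margin + own_margin)).
    # Its derivative w.r.t. own_margin is -sigmoid(-(prev_margin + own_margin));
    # this returns the magnitude, computed in a numerically stable way.
    total = prev_margin + own_margin
    if total >= 0:
        e = math.exp(-total)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(total))


# Hypothetical margins already accumulated by earlier blocks.
margins = [0.0, 2.0, 4.0, 8.0]
grads = [grad_magnitude(m) for m in margins]

for m, g in zip(margins, grads):
    print(f"prev_margin={m:4.1f}  |grad|={g:.6f}  exp(-m)={math.exp(-m):.6f}")
```

With no inherited margin the gradient magnitude is 0.5, and each additional unit of inherited margin shrinks it by roughly a factor of `e`. In this toy setting, that is the sense in which a later block "free-rides": the more the earlier blocks have already separated a sample, the less signal the current block receives from it.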
