Cumulative-Goodness Free-Riding in Forward-Forward Networks: Real, Repairable, but Not Accuracy-Dominant
Amirhossein Yousefiramandi

TL;DR
This paper investigates layer free-riding in cumulative-goodness Forward-Forward networks and shows that it is real and fixable: it degrades layer separation but is not the main factor limiting accuracy.
Contribution
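The paper formalizes layer free-riding in FF networks and proposes three local remedies (per-block, hardness-gated, and depth-scaled) that substantially improve layer separation while changing accuracy by less than one percentage point for non-degenerate training procedures.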
Findings
Remedies increase layer separation by up to 45 times in deeper layers.
Layer free-riding is a real but not dominant factor in accuracy limitations.
Architecture and augmentation choices impact final accuracy more than training modifications.
Abstract
Forward-Forward (FF) training allows each layer to learn from a local goodness criterion. In cumulative-goodness variants, however, later layers can inherit a task that earlier layers have already partially separated. We formalize this phenomenon as layer free-riding: under the softplus FF criterion, the class-discrimination gradient reaching a given block decays exponentially with the positive margin accumulated by preceding blocks. We then study three local remedies (per-block, hardness-gated, and depth-scaled) that recover current-layer separation measures without relying on backpropagated gradients. On CIFAR-10 and CIFAR-100, these remedies dramatically improve layer-separation statistics, with gains of up to 45 times in deeper layers, while changing accuracy by less than one percentage point for non-degenerate training procedures. Tiny ImageNet provides a tougher cross-dataset…
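The exponential decay described in the abstract can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the cumulative criterion has the form softplus(-(M_prev + m)), where M_prev is the margin already accumulated by preceding blocks and m is the current block's own contribution. The paper's exact loss may differ, and the names `prior_margin`, `own_margin`, and `block_gradient_scale` are hypothetical. Under that assumption, the gradient magnitude with respect to m is sigmoid(-(M_prev + m)), which is bounded above by exp(-(M_prev + m)) and approaches it as the margin grows.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def block_gradient_scale(prior_margin, own_margin=0.0):
    """Magnitude of d/d(own_margin) of softplus(-(prior_margin + own_margin)).

    Assumed loss form (illustrative, not taken from the paper):
    the current block sees the summed margin of all blocks up to itself.
    """
    return sigmoid(-(prior_margin + own_margin))


if __name__ == "__main__":
    # Compare the gradient scale with exp(-margin) as earlier blocks
    # accumulate more positive margin.
    for m in [0.0, 2.0, 4.0, 8.0, 16.0]:
        print(f"prior margin {m:5.1f}: "
              f"gradient scale {block_gradient_scale(m):.3e}, "
              f"exp(-margin) {math.exp(-m):.3e}")
```

Under this assumed form, a block that contributes nothing to an already well-separated example receives almost no class-discrimination signal. This is the situation that remedies restoring a per-block signal are meant to address.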
