Advancing Supervised Local Learning Beyond Classification with Long-term Feature Bank
Feiyu Zhu, Yuming Zhang, Xiuyuan Guo, Hengyu Shi, Junfeng Luo, Junhao Su, Jialin Gao

TL;DR
This paper introduces FBA, a local learning method with a feature bank that extends beyond classification, conserving memory and matching end-to-end performance in diverse visual tasks.
Contribution
It presents the first successful extension of local learning to multiple visual tasks using a feature bank for better cross-task adaptability.
Findings
FBA reduces GPU memory consumption significantly.
FBA achieves comparable performance to end-to-end methods.
FBA effectively applies beyond classification tasks.
Abstract
Local learning offers an alternative to traditional end-to-end back-propagation in deep neural networks, significantly reducing GPU memory consumption. Although it has shown promise in image classification tasks, its extension to other visual tasks has been limited. This limitation arises primarily from two factors: 1) architectures designed specifically for classification are not readily adaptable to other tasks, which prevents the effective reuse of task-specific knowledge from architectures tailored to different problems; 2) these classification-focused architectures typically lack cross-scale feature communication, leading to degraded performance in tasks like object detection and super-resolution. To address these challenges, we propose the Feature Bank Augmented auxiliary network (FBA), which introduces a simplified design principle and incorporates a feature bank to enhance…
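The memory argument in the abstract can be illustrated with a rough back-of-the-envelope sketch. This is not the FBA method: it is generic accounting for gradient-isolated local learning, and the per-block activation sizes, the auxiliary-network sizes, and both function names are hypothetical. It also ignores parameters, optimizer state, and whatever the feature bank itself stores.

```python
from typing import List


def peak_activation_memory_e2e(block_activations: List[int]) -> int:
    """End-to-end BP: all blocks' activations stay live until the backward pass."""
    return sum(block_activations)


def peak_activation_memory_local(
    block_activations: List[int], aux_activations: List[int]
) -> int:
    """Local learning: each gradient-isolated block is updated on its own,
    so only that block's activations (plus its auxiliary network's) are
    live at any one time. The peak is the largest such pair."""
    if len(block_activations) != len(aux_activations):
        raise ValueError("need one auxiliary size per block")
    return max(b + a for b, a in zip(block_activations, aux_activations))


# Hypothetical activation sizes in MB for a four-block backbone.
blocks = [400, 300, 200, 100]
# Hypothetical auxiliary-network overhead per block; the last block is
# assumed to use the original task head, already counted in `blocks`.
aux = [50, 50, 50, 0]

e2e = peak_activation_memory_e2e(blocks)
local = peak_activation_memory_local(blocks, aux)
print(f"end-to-end peak: {e2e} MB, local peak: {local} MB")
```

Under these assumed numbers the local peak is set by the largest single block plus its auxiliary network, which is why a heavy auxiliary network can erode the saving.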
Peer Reviews
Decision·ICLR 2026 Conference Withdrawn Submission
1: The proposed local learning framework can achieve performance comparable to end-to-end backpropagation on complex, non-classification tasks like object detection and super-resolution, while substantially reducing GPU memory consumption. 2: The Simple Local Module (SLM) aims to address the challenge of cross-task adaptability by reusing aligned backbone blocks and the original task head, removing the need for complex, manual, task-specific auxiliary networks. 3: Extensive experiments on vari
1: **The presentation needs significant improvements for publishing.** It's very hard to catch the main ideas from Figures 2, 3, and 4. The overall framework seems to just be reusing the feature of the backbone or any internal blocks. 2: **The paper has limited contributions.** The paper's primary motivation for replacing end-to-end BP is not sufficiently argued; while it mentions BP's limitations like memory usage, the necessity of a local-learning alternative versus the performance trade-o
1. The paper proposes FBA to improve task adaptability and cross-scale feature communication in auxiliary networks. 2. Experiments show that FBA maintains BP-level performance across multiple tasks. 3. The figures are clear with pleasing color design.
1. In Figure 3, it appears to illustrate how the proposed method constructs the auxiliary network when the same model is applied to different tasks. However, the color and structure of the SLM modules are not very clear. The meaning of the color variations in the feature layers and their correspondence to elements within the SLM is ambiguous. For example, in the upper subfigure, the color order of the first SLM is reversed, and in the lower subfigure, the last three SLMs share the same color. No
1. This paper aims to address the limitation that local learning has been restricted to image classification tasks, which is a valuable motivation. 2. According to the experimental results, the proposed method helps reduce GPU memory consumption during training.
1. The document is not well organized. The paper reads like a draft, and the reader must make considerable effort to understand the main pipeline of the proposed framework. Many inline citations are placed inappropriately and break up sentences illogically. Many subscripts in the formulas are set on the baseline, in line with the main text, instead of below it. 2. The proposed method offers limited novelty. AugLocal [1] also utilizes the features from CNN layers and FC layers o
The proposed FBA framework offers a unified and efficient solution that extends local learning to diverse visual tasks. By incorporating cross-scale feature access, it enhances representation quality and adaptability while maintaining a simplified architecture. Experiments demonstrate that FBA achieves performance comparable to end-to-end backpropagation with substantially lower GPU memory usage, highlighting its effectiveness and efficiency across various applications.
1) There are some typos in the formulas and citations in the main text, for example Eq. (1) and the citations on page 4. There is also irregular spacing; please check it. 2) I don't understand the FPN mentioned in the main text. It would be better to introduce the relevant information. 3) Since the feature bank is one of the main contributions of your work, there should be more content explaining how the feature bank works in the three tasks, or at least at a general level. After reading the main text, I just
Taxonomy
Topics: Open Education and E-Learning · E-Learning and Knowledge Management · Innovative Teaching and Learning Methods
