Online Continual Learning Without the Storage Constraint

Ameya Prabhu; Zhipeng Cai; Puneet Dokania; Philip Torr; Vladlen Koltun; Ozan Sener

arXiv:2305.09253 · cs.CV · November 3, 2023 · 5 cites

TL;DR

This paper introduces a simple, efficient online continual learning algorithm that uses a fixed feature extractor and a kNN classifier, excelling in scenarios with limited computational resources and outperforming existing methods on large-scale datasets.

Contribution

The paper proposes a novel online continual learning approach combining a fixed pretrained feature extractor with a kNN classifier, reducing computational and storage costs while maintaining high accuracy.
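
The predict-then-update loop implied by this description can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `OnlineKNN`, `run_stream`, and `extract` are hypothetical, the search is exact brute force rather than an approximate index, and an identity function stands in for the fixed pretrained feature extractor.

```python
import numpy as np


class OnlineKNN:
    """Brute-force kNN over stored features (hypothetical helper, not the paper's code)."""

    def __init__(self, dim, k=1):
        self.k = k
        self.feats = np.empty((0, dim), dtype=np.float32)
        self.labels = np.empty((0,), dtype=np.int64)

    def predict(self, feat):
        # Nothing has been stored yet, so no prediction is possible.
        if len(self.labels) == 0:
            return None
        feat = np.asarray(feat, dtype=np.float32)
        dists = np.linalg.norm(self.feats - feat, axis=1)
        k = min(self.k, len(self.labels))
        nearest = np.argpartition(dists, k - 1)[:k]
        # Majority vote among the k nearest stored labels.
        return int(np.bincount(self.labels[nearest]).argmax())

    def update(self, feat, label):
        # Learning is just appending the feature and its label; no gradient step.
        feat = np.asarray(feat, dtype=np.float32)
        self.feats = np.vstack([self.feats, feat[None, :]])
        self.labels = np.append(self.labels, np.int64(label))


def run_stream(stream, model, extract):
    """Predict each sample before its label is revealed, then store it."""
    correct = 0
    total = 0
    for x, y in stream:
        feat = extract(x)  # the extractor is never updated
        correct += int(model.predict(feat) == y)
        model.update(feat, y)
        total += 1
    return correct / max(total, 1)


# Toy stream; the identity extractor is an assumption made for illustration only.
stream = [([0.0, 0.0], 0), ([10.0, 10.0], 1), ([0.5, 0.5], 0), ([9.0, 9.0], 1)]
accuracy = run_stream(stream, OnlineKNN(dim=2, k=1), extract=lambda x: x)
```

Because only features and labels are appended, the per-sample cost is one forward pass through the frozen extractor plus a nearest-neighbor lookup. In a large-scale setting the brute-force search would presumably be replaced by an approximate index.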

Findings

01

Outperforms existing methods by over 20% in accuracy on large datasets

02

Operates with minimal computational and storage requirements

03

Never forgets previously seen data due to its consistency property
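
One plausible reading of the third finding is sketched below, assuming exact 1-nearest-neighbor search and no duplicate features with conflicting labels: a stored sample is at distance zero from itself, so querying it again returns its own label. The helper `nearest_label` and the random data are hypothetical and only illustrate that reading.

```python
import numpy as np


def nearest_label(stored_feats, stored_labels, query):
    """Return the label of the single closest stored feature (exact search)."""
    dists = np.linalg.norm(stored_feats - query, axis=1)
    return int(stored_labels[np.argmin(dists)])


rng = np.random.default_rng(0)
feats = rng.standard_normal((200, 16))   # stand-in for extracted features
labels = rng.integers(0, 5, size=200)    # arbitrary labels, deliberately unstructured

# Query every stored feature again and compare against its stored label.
recalled = [nearest_label(feats, labels, f) for f in feats]
all_recalled = recalled == labels.tolist()
```

The labels here are random on purpose: recall of previously seen samples follows from storage alone, independent of how well the features separate the classes.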

Abstract

Traditional online continual learning (OCL) research has primarily focused on mitigating catastrophic forgetting with fixed and limited storage allocation throughout an agent's lifetime. However, a broad range of real-world applications are primarily constrained by computational costs rather than storage limitations. In this paper, we target such applications, investigating the online continual learning problem under relaxed storage constraints and limited computational budgets. We contribute a simple algorithm, which updates a kNN classifier continually along with a fixed, pretrained feature extractor. We selected this algorithm due to its exceptional suitability for online continual learning. It can adapt to rapidly changing streams, has zero stability gap, operates within tiny computational budgets, has low storage requirements by only storing features, and has a consistency…

Peer Reviews

Decision · ICLR 2024 Conference Withdrawn Submission

Reviewer 01 · Rating 3 (reject, not good enough) · Confidence 4

Strengths

- The submission poses an interesting question of whether the storage constraint is realistic or not.
- The proposed method is simple and straightforward.
- The paper reads well.

Weaknesses

- The novelty is limited. The methodology itself is an approximate kNN, with little modification. Additionally, the method merely uses pretrained feature extractors, which does not add to technical novelty.
- The consideration of the storage constraint seems a bit uni-dimensional to me. There are other factors than just storage costs that are not taken into account. For instance, the data itself may be volatile, i.e., some data points may be required by law to be deleted upon a set dura

Reviewer 02 · Rating 5 (marginally below the acceptance threshold) · Confidence 4

Strengths

- Superior empirical gain over other methods
- Simplicity of the method
- Good empirical setup using CGLM and CLOC datasets

Weaknesses

- The presented setup with infinite memory is arguably realistic online continual learning setup. The infinite memory would eventually prevent forgetting by perfect reminding (by using a properly efficient version of kNN retrieval), and the proposed method is not surprising given that. Thus, it is questionable whether the proposed setup and method actually help us solve online continual learning for real-world deployment.
- Method is not well motivated. It is not clear why the kNN

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 5

Strengths

This paper studies online continual learning in the presence of drift, which is an interesting and practical topic as data streaming applications keep increasing. The paper is well written and easy to read, and the algorithm is clearly presented and also demonstrated in Figure 1. The paper covers the state of the art well. The improvement obtained in the experiments is considerable.

Weaknesses

In my opinion, the assumed setup seems simplified and unrealistic, given that it presumes the use of a fixed pre-trained feature extraction method for all forthcoming data in the data stream, as also mentioned by the authors. In a data stream, the data distribution and feature structure can evolve, and new classes can emerge, so a fixed pre-trained feature extraction method might not be enough for learning on data streams. The authors discussed mobile devices, but can one really cl

Code & Models

Repositories

drimpossible/acm
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications