Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform
Mateusz Wójcik, Witold Kościukiewicz, Mateusz Baran, Tomasz Kajdanowicz, Adam Gonczarek

TL;DR
This paper introduces a fully differentiable, domain-agnostic neural architecture based on Mixture of Experts for class incremental continual learning in document processing, achieving state-of-the-art results without memory buffers.
Contribution
The authors propose a novel Mixture of Experts-based architecture that enables efficient online learning across multiple domains without memory buffers, outperforming existing methods.
Findings
Achieves state-of-the-art performance in class incremental learning tasks.
Operates effectively without memory buffers in various domains.
Outperforms reference methods in online, production-like environments.
Abstract
Production deployments in complex systems require ML architectures to be highly efficient and usable across multiple tasks. Particularly demanding are classification problems in which data arrives in a streaming fashion and each class is presented separately. Recent methods based on stochastic gradient learning have been shown to struggle in such setups, or have limitations, such as reliance on memory buffers or restriction to specific domains, that prevent their use in real-world scenarios. For this reason, we present a fully differentiable architecture based on the Mixture of Experts model that enables the training of high-performance classifiers when examples from each class are presented separately. We conducted exhaustive experiments that demonstrated its applicability in various domains and its ability to learn online in production environments. The proposed technique achieves SOTA results without a…
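The class incremental setting mentioned in the abstract, where examples arrive as a stream and each class is presented separately, can be illustrated with a small sketch. This is not the authors' code: the function name `class_incremental_stream`, the `Example` type, and the document labels `"invoice"` and `"receipt"` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Iterator, List, Tuple

# Hypothetical example type: (feature vector, class label).
Example = Tuple[List[float], Hashable]


def class_incremental_stream(examples: Iterable[Example]) -> Iterator[Example]:
    """Yield examples one at a time, finishing one class before the next starts."""
    by_class: Dict[Hashable, List[Example]] = defaultdict(list)
    for features, label in examples:
        by_class[label].append((features, label))
    # Classes are visited in order of first appearance (dict insertion order).
    for label in by_class:
        for example in by_class[label]:
            yield example


# Toy data with made-up labels; a learner would see every "invoice"
# example before it sees any "receipt" example.
data = [([0.1], "invoice"), ([0.9], "receipt"), ([0.2], "invoice"), ([0.8], "receipt")]
stream_labels = [label for _, label in class_incremental_stream(data)]
```

A learner consuming this stream never sees examples of two classes interleaved. That is the condition under which the abstract says recent stochastic gradient methods struggle.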
Taxonomy
Topics: Data Stream Mining Techniques · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
