FlyPrompt: Brain-Inspired Random-Expanded Routing with Temporal-Ensemble Experts for General Continual Learning
Hongwei Yan, Guanglong Sun, Kanglei Zhou, Qian Li, Liyuan Wang, Yi Zhong

TL;DR
FlyPrompt is a brain-inspired framework for general continual learning that uses random expert routing and temporal ensembles to adaptively learn from non-stationary data streams, outperforming existing methods.
Contribution
It introduces a novel brain-inspired approach with a random analytic router and temporal ensemble heads for improved continual parameter-efficient tuning.
Findings
Achieves up to 12.43% accuracy gains over baselines.
Effectively handles non-stationary data streams.
Demonstrates superior performance on multiple datasets (the reviews cite CIFAR-100, ImageNet-R, and CUB-200 under the Si-Blurry protocol).
Abstract
General continual learning (GCL) challenges intelligent systems to learn from single-pass, non-stationary data streams without clear task boundaries. While recent advances in continual parameter-efficient tuning (PET) of pretrained models show promise, they typically rely on multiple training epochs and explicit task cues, limiting their effectiveness in GCL scenarios. Moreover, existing methods often lack targeted design and fail to address two fundamental challenges in continual PET: how to allocate expert parameters to evolving data distributions, and how to improve their representational capacity under limited supervision. Inspired by the fruit fly's hierarchical memory system characterized by sparse expansion and modular ensembles, we propose FlyPrompt, a brain-inspired framework that decomposes GCL into two subproblems: expert routing and expert competence improvement. FlyPrompt…
Peer Reviews
Decision: ICLR 2026 Poster
1. The paper identifies two core challenges in GCL, (a) expert routing and (b) expert stability, and addresses each with a distinct, principled component: REAR and TEE. 2. REAR does not use backpropagation; instead, it maintains online sufficient statistics and solves a closed-form ridge regression, which is much faster and simpler. 3. The authors show consistent state-of-the-art results on CIFAR-100, ImageNet-R, and CUB-200 under the Si-Blurry protocol. 4. I like the math analysis linking random-featu
1) Theorem 1 relies on a pairwise concentration lemma but omits a full matrix-concentration argument and a margin assumption needed to link regression risk to routing accuracy. This may be fixable in the rebuttal period. 2) Theorem 2: the derivation connecting EMA bias to temporal drift is approximate, and the claim of “near-optimal adaptation” is not formally proven; this needs to be clarified. 3) Despite the task-free framing that the paper emphasizes, in my opinion REAR initialization and label accumulati
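The routing step described in the review above (a fixed random projection followed by a closed-form ridge regression over online sufficient statistics) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `RandomRidgeRouter`, the ReLU expansion, the one-hot expert targets, and all hyperparameters are assumptions chosen for illustration only.

```python
import numpy as np


class RandomRidgeRouter:
    """Hypothetical sketch: fixed random expansion + closed-form ridge regression.

    All names and design choices here are illustrative assumptions,
    not the method as specified in the paper.
    """

    def __init__(self, in_dim, hidden_dim, num_experts, ridge=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random projection; it is never trained by backpropagation.
        self.W = rng.standard_normal((in_dim, hidden_dim)) / np.sqrt(in_dim)
        self.ridge = ridge
        self.num_experts = num_experts
        # Online sufficient statistics for ridge regression.
        self.G = np.zeros((hidden_dim, hidden_dim))   # running sum of phi^T phi
        self.C = np.zeros((hidden_dim, num_experts))  # running sum of phi^T Y

    def _expand(self, x):
        # Random expansion with a ReLU nonlinearity (an assumed choice).
        return np.maximum(x @ self.W, 0.0)

    def update(self, x, expert_ids):
        # Accumulate statistics from one batch; no gradients are involved.
        phi = self._expand(x)
        y = np.eye(self.num_experts)[expert_ids]  # one-hot routing targets
        self.G += phi.T @ phi
        self.C += phi.T @ y

    def solve(self):
        # Closed-form ridge solution: (G + ridge * I)^{-1} C.
        d = self.G.shape[0]
        return np.linalg.solve(self.G + self.ridge * np.eye(d), self.C)

    def route(self, x):
        # Pick the expert with the highest regression score.
        return np.argmax(self._expand(x) @ self.solve(), axis=1)
```

Because only the two running sums are stored, a single pass over the stream yields the same solution as fitting on all the data at once, up to floating-point error, which is what makes this kind of router a natural fit for single-pass settings.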
**Biologically Inspired Foundation**: The framework is grounded in the neurobiological principles of the fruit fly's brain, offering a novel approach to solving complex GCL challenges. **Addresses Core GCL Problems**: It effectively tackles two fundamental challenges in GCL: "expert routing" (selecting the right parameters) and "expert competence improvement" (adapting to new data) under difficult, realistic constraints (single-pass data, no task boundaries). **Novel and Efficient Components**
**Notational Clarity**: There appears to be a notational inconsistency in Equations 2 and 3, where the symbols $\Phi$ and $\varphi$ seem to be transposed or confused.
**Comparative Analysis**: The paper would be significantly strengthened by a direct comparison of FlyPrompt against other prominent methods (such as LoRA, Adapters, and MoE). This comparison should explicitly analyze key metrics:
- Parameter efficiency (total and new parameters)
- Computational overhead (training and inference t
1. The separation of GCL into routing and competence subproblems offers a structured approach to tackling its challenges. 2. The use of principles from fruit fly olfactory memory introduces a novel interdisciplinary perspective to CL. 3. The paper provides both informal and formal theoretical bounds on routing error and EMA parameter error, enhancing methodological credibility.
1. The method resembles an ad hoc combination of existing techniques. The proposed FlyPrompt framework appears to be largely a composition of well-established components rather than a fundamentally novel algorithm. Specifically, REAR combines fixed random projection with ridge regression (a paradigm already explored in prior CL works and analytic class-incremental learning). Similarly, TE2 employs EMA with multiple decay rates, a standard technique in online learning and model stabilization (e.g.
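The "EMA with multiple decay rates" idea mentioned in the reviews can be sketched generically as follows. This is a standard multi-decay exponential moving average over a linear head, not the paper's actual ensemble: the class name `MultiDecayEMA`, the decay values, the inclusion of the online head as an ensemble member, and the averaging of softmax outputs are all assumptions made for illustration.

```python
import numpy as np


def _softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class MultiDecayEMA:
    """Hypothetical sketch: several EMA copies of one head, one per decay rate."""

    def __init__(self, init_weights, decays=(0.9, 0.99, 0.999)):
        self.decays = tuple(decays)
        # One independent shadow copy of the head weights per decay rate.
        self.shadows = [np.array(init_weights, dtype=float) for _ in self.decays]

    def update(self, weights):
        # Standard EMA step: shadow <- d * shadow + (1 - d) * current weights.
        w = np.asarray(weights, dtype=float)
        for i, d in enumerate(self.decays):
            self.shadows[i] = d * self.shadows[i] + (1.0 - d) * w

    def predict(self, x, online_weights):
        # Assumed ensembling rule: average the softmax outputs of the
        # online head and every EMA head.
        members = [np.asarray(online_weights, dtype=float)] + self.shadows
        probs = np.stack([_softmax(x @ m) for m in members])
        return probs.mean(axis=0)
```

Small decay rates track the most recent weights closely, while decay rates near 1 change slowly and retain older behavior, so combining several of them trades off adaptation against stability without committing to a single time scale.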
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
