Continual-MEGA: A Large-scale Benchmark for Generalizable Continual Anomaly Detection
Geonu Lee, Yujeong Oh, Geonhui Jang, Soyoung Lee, Jeonghyo Song, Sungmin Cha, YoungJoon Yoo

TL;DR
This paper introduces Continual-MEGA, a comprehensive benchmark for continual anomaly detection that includes a large dataset and a new zero-shot generalization scenario, aiming to improve robustness and real-world applicability.
Contribution
The paper presents a new large-scale benchmark, Continual-MEGA, with a novel zero-shot generalization scenario and a unified baseline algorithm for continual anomaly detection.
Findings
Existing methods need improvement, especially in pixel-level defect localization.
The proposed method outperforms prior approaches.
The ContinualAD dataset improves anomaly detection performance.
Abstract
In this paper, we introduce a new benchmark for continual learning in anomaly detection, aimed at better reflecting real-world deployment scenarios. Our benchmark, Continual-MEGA, includes a large and diverse dataset that significantly expands existing evaluation settings by combining carefully curated existing datasets with our newly proposed dataset, ContinualAD. In addition to standard continual learning at larger scale, we propose a novel scenario that measures zero-shot generalization to unseen classes, i.e., those not observed during continual adaptation. This scenario raises a new question: whether continual adaptation also enhances zero-shot performance. We also present a unified baseline algorithm that improves robustness in few-shot detection and maintains strong generalization. Through extensive evaluations, we report three key findings: (1) existing methods show…
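As a rough illustration of the zero-shot scenario described in the abstract, the sketch below splits a set of classes into a stream of continual tasks plus a held-out group that is never seen during adaptation. The function name `build_task_stream`, its parameters, and the placeholder class names are assumptions made for illustration; the benchmark's actual split procedure is not given in this summary.

```python
import random


def build_task_stream(classes, num_heldout, classes_per_task, seed=0):
    """Hypothetical helper: split classes into continual tasks and held-out classes.

    The held-out classes are reserved for zero-shot evaluation and never
    appear in any task of the continual stream.
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(classes)
    rng.shuffle(shuffled)

    heldout = shuffled[:num_heldout]
    seen = shuffled[num_heldout:]

    # Chunk the seen classes into consecutive tasks.
    tasks = [
        seen[i:i + classes_per_task]
        for i in range(0, len(seen), classes_per_task)
    ]
    return tasks, heldout


# Placeholder class names, not the benchmark's real categories.
classes = [f"class_{i}" for i in range(10)]
tasks, heldout = build_task_stream(classes, num_heldout=2, classes_per_task=3)
```

Fixing the seed makes the task order reproducible, and varying it is one simple way to probe sensitivity to task order.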
Peer Reviews
Decision: Submitted to ICLR 2026
- A new large-scale benchmark that unifies multiple AD datasets and defines reproducible task streams.
- Dataset and benchmark release (if completed) could be a useful resource for the community.
- The paper claims that existing methods fail in continual AD, but Table 3 shows that MVFA (CVPR 2024 Spotlight), a non-continual zero-shot VLM-based method, performs competitively with the proposed ADCT. This contradicts the central claim that new continual-learning methods are required. If a zero-shot method performs as well as the proposed continual method, the necessity of the benchmark and ADCT is not established.
- Evaluation in Scenario 2/3 artificially disadvantages MVFA, leading to inva
1. Large, diverse benchmark with realistic continual and CZSL settings that better reflect deployment.
2. Broad, carefully reported comparisons across method families with appropriate metrics (image AUROC, pixel AP, forgetting).
3. Clear empirical takeaways on generalization vs. forgetting, highlighting where current methods break.
4. A strong, reproducible CLIP-based baseline that others can extend; code/benchmark availability increases impact.
1. Training-budget mismatch likely benefits the proposed baseline; needs a strictly matched compute comparison.
2. No ablations to disentangle the effects of adapters, mixing strategy, and synthetic feature generation.
3. Limited documentation of the new dataset in the main text (how anomalies are obtained, per-class stats, representative examples).
4. Task split construction and order sensitivity are under-specified, making reproducibility and robustness hard to assess.
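For readers unfamiliar with the metrics named in this review, here is a minimal sketch of image-level AUROC and an average forgetting score. It assumes the commonly used definition of forgetting (the best earlier score on a task minus its score after the final task); the paper's exact formulation is not stated in this summary and may differ. The function names and all numbers are illustrative, not results from the paper.

```python
def auroc(labels, scores):
    """AUROC as the probability that an anomalous sample outscores a normal one.

    labels: 1 for anomalous, 0 for normal. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_forgetting(acc):
    """acc[t][j] is the metric on task j after training through task t (j <= t).

    For each task except the last, take the best score reached before the
    final stage minus the score after the final stage, then average.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0
    drops = []
    for j in range(num_tasks - 1):
        best_earlier = max(acc[t][j] for t in range(j, num_tasks - 1))
        drops.append(best_earlier - acc[num_tasks - 1][j])
    return sum(drops) / len(drops)


# Made-up scores for three tasks, purely for illustration.
example_acc = [
    [0.90],
    [0.80, 0.95],
    [0.70, 0.90, 0.92],
]
example_forgetting = average_forgetting(example_acc)
example_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A lower forgetting score means earlier tasks degrade less as new tasks are learned. Pixel AP would be computed analogously over per-pixel scores and is omitted here.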
The paper proposes a new problem within the domain of Anomaly Detection, which is relevant to practical applications of AD in industry. The proposed method produces strong performance.
The paper is quite challenging to read and could be better structured to present information in a more logical and more easily comprehensible way. Table 1 seems redundant given that the same information is also present in Table 2. The tables and figures use font sizes that are not clearly legible. The caption for Fig. 4 fails to fully describe what is shown, and the 7 colored backgrounds in this figure do not seem to correspond to the 4 subsets of tasks described in the corresponding main te
Taxonomy
Topics: Anomaly Detection Techniques and Applications
