MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects
Lei Fan, Dongdong Fan, Zhiguang Hu, Yiwen Ding, Donglin Di, Kai Yi, Maurice Pagnucco, Yang Song

TL;DR
MANTA is a comprehensive large-scale dataset combining visual and textual data for tiny object anomaly detection, enabling advanced research and benchmarking in multi-view and multimodal anomaly detection tasks.
Contribution
The paper introduces MANTA, a novel large-scale dataset with multi-view images and rich textual annotations for tiny object anomaly detection, along with baseline methods and extensive benchmarking.
Findings
The dataset poses significant challenges for baseline models.
Multi-view data improves anomaly detection accuracy.
Textual information enhances detection performance.
Abstract
We present MANTA, a visual-text anomaly detection dataset for tiny objects. The visual component comprises over 137.3K images across 38 object categories spanning five typical domains, of which 8.6K images are labeled as anomalous with pixel-level annotations. Each image is captured from five distinct viewpoints to ensure comprehensive object coverage. The text component consists of two subsets: Declarative Knowledge, including 875 words that describe common anomalies across various domains and specific categories, with detailed explanations for <what, why, how>, including causes and visual characteristics; and Constructivist Learning, providing 2K multiple-choice questions with varying levels of difficulty, each paired with images and corresponding answer explanations. We also propose a baseline for visual-text tasks and conduct extensive benchmarking experiments to evaluate advanced…
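To make the dataset layout described in the abstract more concrete, here is a minimal Python sketch of how one multi-view sample and one multiple-choice question might be represented. This is not the dataset's actual file format or API: the class names, field names, and the assumption that each viewpoint can carry its own pixel-level mask are all hypothetical. Only the counts (five viewpoints, roughly 137.3K images, roughly 8.6K anomalous) come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NUM_VIEWS = 5  # from the abstract: five distinct viewpoints per object


@dataclass
class MultiViewSample:
    """Hypothetical record for one object; all field names are illustrative."""
    category: str            # one of the 38 object categories
    domain: str              # one of the five domains
    view_paths: List[str]    # one image path per viewpoint
    # Assumption: one optional pixel-level mask per view (None = no anomaly).
    mask_paths: List[Optional[str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if len(self.view_paths) != NUM_VIEWS:
            raise ValueError(f"expected {NUM_VIEWS} views, got {len(self.view_paths)}")
        if not self.mask_paths:
            self.mask_paths = [None] * NUM_VIEWS
        if len(self.mask_paths) != NUM_VIEWS:
            raise ValueError(f"expected {NUM_VIEWS} masks, got {len(self.mask_paths)}")

    @property
    def is_anomalous(self) -> bool:
        # A sample counts as anomalous if any view has a mask.
        return any(m is not None for m in self.mask_paths)


@dataclass
class MultipleChoiceQuestion:
    """Hypothetical record for one Constructivist Learning item."""
    question: str
    options: List[str]
    answer_index: int        # index into `options`
    explanation: str         # the paired answer explanation
    difficulty: str          # difficulty level; the label set is an assumption
    image_paths: List[str]   # images paired with the question


def anomalous_fraction(total_images: int, anomalous_images: int) -> float:
    """Share of images labeled anomalous."""
    return anomalous_images / total_images


# Approximate counts from the abstract: 137.3K images, 8.6K anomalous.
print(f"{anomalous_fraction(137_300, 8_600):.1%}")
```

With the rounded counts from the abstract, anomalous images make up roughly 6% of the visual component, so a benchmark built on this data is heavily imbalanced toward normal samples.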
Taxonomy
Topics: Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications · Digital Media Forensic Detection
