Learning with Mixture of Prototypes for Out-of-Distribution Detection

Haodong Lu; Dong Gong; Shuo Wang; Jason Xue; Lina Yao; Kristen Moore

arXiv:2402.02653 · cs.LG · February 6, 2024 · 1 citation

TL;DR

This paper introduces PALM, a novel method that models each class with multiple prototypes to better capture data diversity and improve out-of-distribution detection performance in machine learning models.

Contribution

PALM automatically identifies and updates multiple prototypes per class, using reciprocal neighbor soft assignment and combined loss functions to enhance OOD detection.
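As a rough illustration of what soft assignment to several prototypes and an incremental prototype update can look like, here is a minimal NumPy sketch. It is a simplified stand-in, not the paper's reciprocal neighbor assignment or its loss functions: the softmax weighting, the `temperature` and `momentum` values, and all function names are assumptions made for this example.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Project vectors onto the unit hypersphere along the last axis."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def soft_assign(z, prototypes, temperature=0.1):
    """Softmax weights of one embedding z (D,) over one class's prototypes (K, D).

    Hypothetical simplification: a plain softmax over cosine similarities.
    """
    logits = prototypes @ z / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def ema_update(prototypes, z, weights, momentum=0.9):
    """Move each prototype toward z in proportion to its assignment weight,
    then re-normalize so the prototypes stay on the unit sphere."""
    updated = momentum * prototypes + (1.0 - momentum) * weights[:, None] * z[None, :]
    return l2_normalize(updated)

# Toy example: one class with K=3 prototypes in an 8-dimensional embedding space.
rng = np.random.default_rng(0)
prototypes = l2_normalize(rng.normal(size=(3, 8)))
z = l2_normalize(rng.normal(size=8))

weights = soft_assign(z, prototypes)
new_prototypes = ema_update(prototypes, z, weights)
```

The prototype most similar to the sample receives the largest weight and therefore moves the most, which is the basic mechanism that lets different prototypes specialize on different modes of a class.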

Findings

01

Achieves state-of-the-art AUROC of 93.82 on CIFAR-100

02

Models class diversity with multiple prototypes

03

Outperforms existing OOD detection methods

Abstract

Out-of-distribution (OOD) detection aims to detect testing samples far away from the in-distribution (ID) training data, which is crucial for the safe deployment of machine learning models in the real world. Distance-based OOD detection methods have emerged with enhanced deep representation learning. They identify unseen OOD samples by measuring their distances from ID class centroids or prototypes. However, existing approaches learn the representation relying on oversimplified data assumptions, e.g., modeling ID data of each class with one centroid class prototype or using loss functions not designed for OOD detection, which overlook the natural diversities within the data. Naively enforcing data samples of each class to be compact around only one prototype leads to inadequate modeling of realistic data and limited performance. To tackle these issues, we propose PrototypicAl Learning…
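To make the distance-to-prototype idea concrete, the sketch below scores a sample by its highest cosine similarity to any ID prototype and compares one prototype per class against two. It is a generic illustration under assumed toy data, not the scoring function used in the paper; `ood_score` and the 2-D prototypes are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Project vectors onto the unit hypersphere along the last axis."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def ood_score(z, prototypes):
    """Return the highest cosine similarity between z (D,) and any prototype.

    prototypes has shape (C, K, D): C classes, K prototypes per class.
    A higher score means the sample looks more in-distribution.
    """
    z = l2_normalize(z)
    flat = prototypes.reshape(-1, prototypes.shape[-1])
    return float((flat @ z).max())

# Toy class with two distinct modes in a 2-D embedding space.
protos_multi = l2_normalize(np.array([[[1.0, 0.0], [0.0, 1.0]]]))       # shape (1, 2, 2)
# A single prototype placed at the normalized mean of the two modes.
protos_single = l2_normalize(protos_multi.mean(axis=1, keepdims=True))  # shape (1, 1, 2)

id_sample = np.array([1.0, 0.05])    # lies close to the first mode
ood_sample = np.array([-1.0, -1.0])  # lies far from both modes

score_id_multi = ood_score(id_sample, protos_multi)
score_id_single = ood_score(id_sample, protos_single)
score_ood_multi = ood_score(ood_sample, protos_multi)
```

In this toy setup the ID sample sits much closer to one of the two prototypes than to the single averaged prototype, which illustrates why a single centroid can under-represent a class with several modes.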

Peer Reviews

Decision·ICLR 2024 poster

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

Overall, the novelty of this work is clearly presented, understandable and consistent with the intuition. Experiment comparison is comprehensive.

Weaknesses

Dedicated OOD detection benchmarks already exist, such as OpenOOD: Benchmarking Generalized Out-of-Distribution Detection (Advances in Neural Information Processing Systems 2022, 35, 32598-32611). Most of the datasets used in the experiments are standard benchmark datasets. How are datasets such as CIFAR used for OOD detection in this work? One of the reasons that the proposed model outperforms the compared baselines is that it better estimates the class or sample distribution due to the m

Reviewer 02 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

- The paper provides extensive experimental settings, particularly in the ablation studies.
- The proposed method introduces an automatic prototype learning framework that incorporates a mixture of prototypes to represent hyperspherical embeddings, effectively capturing the natural diversities within each class.
- The proposed method achieves a significantly improved performance.

Weaknesses

- I have some concerns about scalability. The introduction of multiple prototypes and their dynamic updating could lead to scalability issues, especially when handling very large datasets or a vast number of classes.
- The effectiveness of PALM is highly dependent on the quality of the prototypes. If the prototypes do not accurately represent the underlying data distribution, the model may face challenges in OOD detection.
- In terms of computational cost, PALM might demand additional computational

Reviewer 03 · Rating 6 (marginally above the acceptance threshold) · Confidence 5

Strengths

This paper is well-written and mathematically sound. The proposed method, PALM, nicely extends an already strong OOD detection method called CIDER. PALM is extensively analyzed via a thorough ablation study, and extensively evaluated against multiple OOD benchmark datasets and methods. PALM boasts strong ID-OOD discrimination in almost all of the experiments, outperforming previous supervised and unsupervised methods by a large margin.

Weaknesses

1. PALM, like its predecessor CIDER, heavily relies on the hyperspherical representation of the learned embeddings to formulate and shape the embedding space. In CIDER, this representation was crucial in achieving strong ID-OOD separability and ID classification. However, since PALM is a mixture of Gaussians, the assumption that the embeddings need to be normalized to unit norm or need to lie in a hyperspherical space may not necessarily be needed. I wonder if this assumption could be lifted so

Code & Models

Repositories

jeff024/palm · PyTorch · Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Water Systems and Optimization · Machine Learning and Algorithms

Methods: PALM (PrototypicAl Learning with a Mixture of prototypes)