Towards Trustworthy Dataset Distillation

Shijie Ma; Fei Zhu; Zhen Cheng; Xu-Yao Zhang

arXiv:2307.09165·cs.LG·August 13, 2024·2 cites

TL;DR

This paper introduces TrustDD, a new dataset distillation approach that creates compact datasets capable of training models for both in-distribution classification and out-of-distribution detection, enhancing efficiency and trustworthiness.

Contribution

The paper proposes TrustDD, a dataset distillation paradigm that jointly distills in-distribution (InD) data and outliers. It also introduces Pseudo-Outlier Exposure (POE), which generates pseudo-outliers so that no real outlier data is required.
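To make the idea of training on both InD samples and outliers concrete, the sketch below shows an Outlier Exposure style objective as it is commonly formulated: cross-entropy on labeled InD samples plus a term that pushes predictions on outliers toward the uniform distribution. This is an illustration under stated assumptions, not the paper's implementation. The function names, the weight `lam`, and its default value are hypothetical, and the listing does not say whether TrustDD uses exactly this loss.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of class logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def cross_entropy(logits, label):
    """Standard cross-entropy for one labeled InD sample."""
    return -log_softmax(logits)[label]


def uniform_cross_entropy(logits):
    """Cross-entropy between a uniform target and the softmax prediction.

    It is smallest when the prediction is uniform, i.e. when the model
    is maximally uncertain about an outlier.
    """
    log_probs = log_softmax(logits)
    return -sum(log_probs) / len(log_probs)


def oe_style_loss(ind_batch, outlier_batch, lam=0.5):
    """Hypothetical joint objective: InD term + lam * outlier term.

    ind_batch: list of (logits, label) pairs for InD samples.
    outlier_batch: list of logits for (pseudo-)outlier samples.
    lam: assumed trade-off weight, chosen here only for illustration.
    """
    ind_term = sum(cross_entropy(l, y) for l, y in ind_batch) / len(ind_batch)
    out_term = sum(uniform_cross_entropy(l) for l in outlier_batch) / len(outlier_batch)
    return ind_term + lam * out_term
```

In this form the outlier term rewards uncertainty on outliers, while the InD term keeps classification accuracy; `lam` controls the balance between the two.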

Findings

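01. TrustDD improves both in-distribution (InD) classification and out-of-distribution (OOD) detection performance.

02. Pseudo-Outlier Exposure (POE) surpasses the state-of-the-art Outlier Exposure method.

03. Models trained on TrustDD-distilled data are more trustworthy and better suited to open-world scenarios than models trained on data distilled for InD classification alone.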

Abstract

Efficiency and trustworthiness are two eternal pursuits when applying deep learning in real-world applications. With regard to efficiency, dataset distillation (DD) endeavors to reduce training costs by distilling the large dataset into a tiny synthetic dataset. However, existing methods merely concentrate on in-distribution (InD) classification in a closed-world setting, disregarding out-of-distribution (OOD) samples. On the other hand, OOD detection aims to enhance models' trustworthiness, which is always inefficiently achieved in full-data settings. For the first time, we simultaneously consider both issues and propose a novel paradigm called Trustworthy Dataset Distillation (TrustDD). By distilling both InD samples and outliers, the condensed datasets are capable of training models competent in both InD classification and OOD detection. To alleviate the requirement of real outlier…
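As a rough illustration of what "competent in OOD detection" means at inference time, the sketch below scores a sample by its maximum softmax probability and flags low-confidence samples as OOD. This is a common baseline detector, used here as an assumption; the listing does not state which scoring rule the paper evaluates. The function names and the threshold value are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Maximum softmax probability: higher means more InD-like."""
    return max(softmax(logits))


def is_ood(logits, threshold=0.5):
    """Flag a sample as OOD when its confidence falls below the threshold.

    The threshold is an assumed value; in practice it would be tuned
    on held-out data.
    """
    return msp_score(logits) < threshold
```

A model trained with an outlier term tends to produce flatter predictions on unfamiliar inputs, which lowers this score and makes such inputs easier to separate from InD samples.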

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)