GSCLIP: A Framework for Explaining Distribution Shifts in Natural Language

Zhiying Zhu; Weixin Liang; James Zou

arXiv:2206.15007 · cs.CL · July 1, 2022 · 1 citation

TL;DR

GSCLIP is a training-free framework that automatically describes, in natural language, the dataset-level distribution shifts between two image datasets, with the aim of making those shifts easier for end users to understand during AI deployment.

Contribution

The paper introduces the dataset explanation task, proposes a selector that quantitatively evaluates candidate explanations, and uses that selector to demonstrate the effectiveness of a language-model-based generator for explaining dataset shifts.

Findings

1. GSCLIP effectively identifies dataset shifts with high accuracy.

2. The framework is scalable and easy to use for dataset explanation.

3. Systematic evaluation confirms GSCLIP's superiority over existing methods.

Abstract

Helping end users comprehend abstract distribution shifts can greatly facilitate AI deployment. Motivated by this, we propose a novel task, dataset explanation. Given two image datasets, dataset explanation aims to automatically point out their dataset-level distribution shifts in natural language. Current techniques for monitoring distribution shifts provide inadequate information for understanding datasets with the goal of improving data quality. Therefore, we introduce GSCLIP, a training-free framework to solve the dataset explanation task. In GSCLIP, we propose the selector as the first quantitative evaluation method to identify explanations that properly summarize dataset shifts. Furthermore, we leverage this selector to demonstrate the superiority of a generator based on language model generation. Systematic evaluation on natural data shift verifies that GSCLIP, a combined…
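
The selector idea described in the abstract can be illustrated with a toy sketch. The code below is not the paper's implementation: it assumes that image embeddings and candidate-explanation embeddings already live in a shared space (for example, produced by a CLIP-style encoder), and it scores each candidate by how much more similar it is, on average, to one dataset than to the other. The function names `score_explanations` and `select_explanation`, the scoring rule itself, and the example captions are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row vector to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def score_explanations(emb_a, emb_b, text_embs):
    """Hypothetical selector score (an assumption, not the paper's formula).

    emb_a:     (n_a, d) image embeddings of dataset A
    emb_b:     (n_b, d) image embeddings of dataset B
    text_embs: (k, d)   embeddings of k candidate explanations

    Returns a length-k array: mean cosine similarity of each candidate
    to dataset A minus its mean cosine similarity to dataset B.
    """
    a = l2_normalize(emb_a)
    b = l2_normalize(emb_b)
    t = l2_normalize(text_embs)
    sim_a = (a @ t.T).mean(axis=0)  # average similarity to A, per candidate
    sim_b = (b @ t.T).mean(axis=0)  # average similarity to B, per candidate
    return sim_a - sim_b


def select_explanation(emb_a, emb_b, candidates, text_embs):
    """Pick the candidate whose score separates the two datasets the most."""
    scores = score_explanations(emb_a, emb_b, text_embs)
    best = int(np.argmax(np.abs(scores)))
    return candidates[best], scores


# Toy 2-D embeddings standing in for real encoder outputs.
emb_a = [[1.0, 0.0], [0.9, 0.1]]   # dataset A leans toward the first axis
emb_b = [[0.0, 1.0], [0.1, 0.9]]   # dataset B leans toward the second axis
candidates = ["snow in the background", "contains an animal"]  # hypothetical captions
text_embs = [[1.0, 0.0], [1.0, 1.0]]

best, scores = select_explanation(emb_a, emb_b, candidates, text_embs)
print(best, scores)
```

In this toy setup the second caption is equally similar to both datasets, so its score is close to zero and it says nothing about the shift, while the first caption aligns with dataset A only and is selected. A real pipeline would replace the hand-written vectors with encoder outputs and would likely need a more careful scoring rule.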

Code & Models

Repositories

moein-shariatnia/OpenAI-CLIP (PyTorch, Official)

Taxonomy

Topics: Scientific Computing and Data Management