ConceptX: A Framework for Latent Concept Analysis

Firoj Alam, Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Abdul Rafae Khan, Jia Xu

arXiv:2211.06642·cs.CL·November 15, 2022

TL;DR

ConceptX is a human-in-the-loop framework that uncovers, visualizes, and annotates latent concepts in pre-trained language models, aiding interpretability and bias detection in NLP models.

Contribution

The paper introduces an unsupervised method for discovering latent concepts and a graphical interface for human annotation, with auto-annotations based on traditional linguistic ontologies.

Findings

1. Discovered diverse linguistic and task-specific concepts in language models.

2. Enabled annotation of bias-related concepts such as gender and religious connotations.

3. Provided a resource for understanding and mitigating biases in NLP models.

Abstract

The opacity of deep neural networks remains a challenge in deploying solutions where explanation is as important as precision. We present ConceptX, a human-in-the-loop framework for interpreting and annotating latent representational space in pre-trained Language Models (pLMs). We use an unsupervised method to discover concepts learned in these models and enable a graphical interface for humans to generate explanations for the concepts. To facilitate the process, we provide auto-annotations of the concepts (based on traditional linguistic ontologies). Such annotations enable development of a linguistic resource that directly represents latent concepts learned within deep NLP models. These include not just traditional linguistic concepts, but also task-specific or sensitive concepts (words grouped based on gender or religious connotation) that help the annotators to mark bias in the…
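
To make the idea of unsupervised concept discovery concrete, here is a minimal sketch. It is not the authors' implementation: the abstract above does not name the algorithm, so the sketch assumes that "discovering concepts" can be approximated by clustering word vectors, and it uses a simple average-linkage agglomerative clustering as a stand-in. The function name `agglomerative_clusters` and the two-dimensional `toy_vectors` are hypothetical; in practice the vectors would come from a pre-trained language model.

```python
from math import dist  # Euclidean distance, Python 3.8+


def agglomerative_clusters(vectors, num_clusters):
    """Group words by repeatedly merging the two closest clusters.

    vectors: dict mapping a word to its vector (tuple of floats).
    num_clusters: number of groups ("concepts") to stop at.
    Assumption: average linkage with Euclidean distance, chosen only
    for illustration.
    """
    # Start with every word in its own cluster.
    clusters = [[word] for word in vectors]

    def avg_distance(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest average distance.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = avg_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up 2-D vectors standing in for model representations.
toy_vectors = {
    "Monday": (0.9, 0.1),
    "Tuesday": (0.85, 0.15),
    "Friday": (0.8, 0.05),
    "red": (0.1, 0.9),
    "green": (0.15, 0.85),
    "blue": (0.05, 0.8),
}

concepts = agglomerative_clusters(toy_vectors, num_clusters=2)
for members in concepts:
    print(sorted(members))
```

On this toy input the two resulting groups separate the weekday words from the color words. In a human-in-the-loop workflow like the one the abstract describes, an annotator would inspect each such group and assign it a label.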

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification