Scaling up Discovery of Latent Concepts in Deep NLP Models
Majd Hawasly, Fahim Dalvi, Nadir Durrani

TL;DR
This paper improves the scalability of discovering latent concepts in deep NLP models by evaluating clustering algorithms, especially K-Means, enabling analysis of larger datasets and models like LLMs.
Contribution
It evaluates clustering algorithms for scaling latent concept discovery and proposes metrics for assessing the quality of the discovered concepts, showing that K-Means-based discovery significantly improves efficiency without sacrificing quality.
Findings
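K-Means-based concept discovery is significantly more efficient than agglomerative hierarchical clustering while maintaining concept quality
The proposed metrics assess the quality of discovered latent concepts and are used to compare the clustering algorithms
Scalable concept discovery is applied to large language models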
Abstract
Despite the revolution caused by deep NLP models, they remain black boxes, necessitating research to understand their decision-making processes. A recent work by Dalvi et al. (2022) carried out representation analysis through the lens of clustering latent spaces within pre-trained language models (PLMs), but that approach is limited to small scale due to the high cost of running agglomerative hierarchical clustering. This paper studies clustering algorithms in order to scale the discovery of encoded concepts in PLM representations to larger datasets and models. We propose metrics for assessing the quality of discovered latent concepts and use them to compare the studied clustering algorithms. We found that K-Means-based concept discovery significantly enhances efficiency while maintaining the quality of the obtained concepts. Furthermore, we demonstrate the practicality of this newfound…
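To make the idea of K-Means-based concept discovery concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the `kmeans` helper, the toy tokens, and the 2-dimensional vectors are hypothetical stand-ins for contextual representations extracted from a PLM, and the clustering is ordinary Lloyd's algorithm with squared Euclidean distance.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Hypothetical toy data: in practice these would be high-dimensional
# token representations taken from a layer of a pre-trained model.
tokens = ["Monday", "Friday", "Sunday", "red", "green", "blue"]
vectors = [(0.9, 0.1), (1.0, 0.3), (0.8, 0.0), (0.1, 0.9), (0.0, 1.0), (0.3, 0.7)]

labels = kmeans(vectors, k=2)

# Group tokens by cluster; each group is a candidate "latent concept".
concepts = {}
for token, label in zip(tokens, labels):
    concepts.setdefault(label, []).append(token)

for label, members in sorted(concepts.items()):
    print(label, members)
```

Each K-Means iteration costs time roughly linear in the number of points, whereas standard agglomerative hierarchical clustering works over pairwise distances between all points, which is the cost the abstract identifies as the bottleneck at scale.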
Taxonomy
Topics: Topic Modeling · Semantic Web and Ontologies · Natural Language Processing Techniques
