Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries
Haekyu Park, Seongmin Lee, Benjamin Hoover, Austin P. Wright, Omar Shaikh, Rahul Duggal, Nilaksh Das, Kevin Li, Judy Hoffman, Duen Horng Chau

TL;DR
ConceptEvo is a unified framework that interprets and tracks the evolution of learned concepts in deep neural networks during training, providing insights into model development and decision-making.
Contribution
It introduces novel algorithms for building a shared semantic space and for discovering concept evolutions, extending DNN interpretation beyond post-training analysis to cover training dynamics.
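To make the idea of a shared semantic space concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes, purely for illustration, that every image has a fixed embedding and that each neuron is placed at the mean embedding of its top-activating images, so neurons from different checkpoints land in the same coordinates. The function name `neuron_embeddings`, the parameter `k`, and all of the data are hypothetical.

```python
import numpy as np

def neuron_embeddings(activations, image_embeddings, k=3):
    """Embed each neuron as the mean embedding of its top-k activating images.

    activations:      (n_images, n_neurons) activation strengths
    image_embeddings: (n_images, d) fixed anchors shared by all checkpoints
    """
    top_k = np.argsort(-activations, axis=0)[:k]   # (k, n_neurons) image indices
    return image_embeddings[top_k].mean(axis=0)    # (n_neurons, d)

rng = np.random.default_rng(0)
image_embeddings = rng.normal(size=(20, 2))  # hypothetical fixed 2-D image anchors

# Two hypothetical checkpoints of the same model (synthetic activations).
acts_early = rng.random(size=(20, 4))
acts_late = acts_early.copy()
acts_late[:, 0] = -acts_early[:, 0]  # neuron 0 now responds to different images

early = neuron_embeddings(acts_early, image_embeddings)
late = neuron_embeddings(acts_late, image_embeddings)

# Per-neuron movement in the shared space between the two checkpoints.
drift = np.linalg.norm(late - early, axis=1)
print(drift)
```

Because both checkpoints use the same image anchors, a neuron whose top-activating images do not change stays where it is, while a neuron that starts responding to different images moves. In this sketch, that movement stands in for a concept evolution.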
Findings
Successfully identifies concept evolutions across models
Concept evolutions are human-understandable and influence predictions
Applicable to various modern and classic DNN architectures
Abstract
We present ConceptEvo, a unified interpretation framework for deep neural networks (DNNs) that reveals the inception and evolution of learned concepts during training. Our work addresses a critical gap in DNN interpretation research, as existing methods primarily focus on post-training interpretation. ConceptEvo introduces two novel technical contributions: (1) an algorithm that generates a unified semantic space, enabling side-by-side comparison of different models during training, and (2) an algorithm that discovers and quantifies important concept evolutions for class predictions. Through a large-scale human evaluation and quantitative experiments, we demonstrate that ConceptEvo successfully identifies concept evolutions across different models, which are not only comprehensible to humans but also crucial for class predictions. ConceptEvo is applicable to both modern DNN…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning · Metabolomics and Mass Spectrometry Studies
