TL;DR
This paper proposes an interactive reconstruction task that measures how interpretable a generative model's representation is to humans. Task performance separates entangled from disentangled models and offers a human-centered alternative to existing disentanglement metrics.
Contribution
It introduces a novel human-interaction-based task for assessing interpretability in generative models, addressing limitations of existing disentanglement metrics.
Findings
On synthetic data, the task reliably differentiates entangled from disentangled models.
On real data, the task distinguishes between representation learning methods.
Qualitative and quantitative results are consistent across small- and large-scale studies.
Abstract
For machine learning models to be most useful in numerous sociotechnical systems, many have argued that they must be human-interpretable. However, despite increasing interest in interpretability, there remains no firm consensus on how to measure it. This is especially true in representation learning, where interpretability research has focused on "disentanglement" measures only applicable to synthetic datasets and not grounded in human factors. We introduce a task to quantify the human-interpretability of generative model representations, where users interactively modify representations to reconstruct target instances. On synthetic datasets, we find performance on this task much more reliably differentiates entangled and disentangled models than baseline approaches. On a real dataset, we find it differentiates between representation learning methods widely believed but never shown to…
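To make the task described in the abstract concrete, here is a minimal sketch of its bookkeeping: a "user" adjusts one latent dimension at a time until the decoded output matches a target. Everything in it is an assumption for illustration, not the paper's interface or metric: the toy linear decoder `decode`, the simulated greedy user in `simulated_session`, the Euclidean tolerance `tol`, and the slider step `step` are all hypothetical stand-ins. In the paper the adjustments are made by human participants.

```python
import math


def decode(z, mixing):
    """Toy linear decoder x = mixing @ z (hypothetical stand-in for a generative model)."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in mixing]


def distance(a, b):
    """Euclidean distance between two decoded instances."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def simulated_session(mixing, target_z, step=0.1, tol=0.05, max_moves=500):
    """Simulate one reconstruction session with a greedy stand-in for the user.

    At each move the simulated user nudges the single latent "slider" that
    most reduces the distance to the target, and stops once the decoded
    output is within `tol`, no nudge helps, or `max_moves` is reached.
    Returns (number_of_moves, success_flag).
    """
    target_x = decode(target_z, mixing)
    z = [0.0] * len(target_z)
    moves = 0
    current = distance(decode(z, mixing), target_x)
    while current > tol and moves < max_moves:
        best_d, best_z = current, None
        for i in range(len(z)):
            for delta in (step, -step):
                cand = list(z)
                cand[i] += delta
                d = distance(decode(cand, mixing), target_x)
                if d < best_d:
                    best_d, best_z = d, cand
        if best_z is None:  # no single-slider move improves the match
            break
        z, current = best_z, best_d
        moves += 1
    return moves, current <= tol


# Usage: a decoder where each latent dimension controls exactly one output feature.
identity = [[1.0, 0.0], [0.0, 1.0]]
moves, success = simulated_session(identity, target_z=[0.5, -0.3])
print(moves, success)
```

The number of moves and the success flag are only illustrative proxies for the kind of performance a reconstruction task could record; the measures actually used in the study are not specified in the text above.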
