Concepts' Information Bottleneck Models
Karim Galliamov, Syed M Ahsan Kazmi, Adil Khan, Adín Ramírez Rivera

TL;DR
This paper introduces an information bottleneck regularizer for concept bottleneck models, improving their accuracy, interpretability, and faithfulness by encouraging minimal-sufficient concept representations without changing the architecture.
Contribution
It proposes a novel IB-based regularizer for CBMs, applicable across architectures, that enhances performance and interpretability by controlling information flow in the concept layer.
Findings
Consistent performance improvements across six CBM families.
Enhanced interpretability and faithfulness of concept representations.
Gains over the vanilla counterparts hold across three benchmarks.
Abstract
Concept Bottleneck Models (CBMs) aim to deliver interpretable predictions by routing decisions through a human-understandable concept layer, yet they often suffer reduced accuracy and concept leakage that undermines faithfulness. We introduce an explicit Information Bottleneck regularizer on the concept layer that penalizes I(X;C) while preserving task-relevant information in I(C;Y), encouraging minimal-sufficient concept representations. We derive two practical variants (a variational objective and an entropy-based surrogate) and integrate them into standard CBM training without architectural changes or additional supervision. Evaluated across six CBM families and three benchmarks, the IB-regularized models consistently outperform their vanilla counterparts. Information-plane analyses further corroborate the intended behavior. These results indicate that enforcing a…
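To make the idea of a variational bottleneck penalty on the concept layer concrete, here is a minimal sketch. It is not the authors' objective, which this summary does not spell out. It assumes each concept is a Bernoulli variable whose probability comes from the concept head, and it uses the standard variational upper bound on the mutual information between input and concepts: the batch-averaged KL divergence from the predicted concept distribution to a fixed prior. The function names, the prior `r = 0.5`, and the weight `beta` are hypothetical choices made for illustration.

```python
import math


def bernoulli_kl(p, r=0.5, eps=1e-7):
    """KL( Bernoulli(p) || Bernoulli(r) ) in nats; p is clamped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return p * math.log(p / r) + (1.0 - p) * math.log((1.0 - p) / (1.0 - r))


def ib_penalty(concept_probs, r=0.5):
    """Batch-averaged KL from predicted concept distributions to the prior.

    concept_probs: list of rows, one row of concept probabilities per sample.
    Averaged over inputs, this KL upper-bounds the input-concept mutual
    information for any fixed prior r (assumption: independent Bernoulli concepts).
    """
    per_sample = [sum(bernoulli_kl(p, r) for p in row) for row in concept_probs]
    return sum(per_sample) / len(per_sample)


def regularized_loss(task_loss, concept_loss, concept_probs, beta=0.1):
    """Usual CBM losses plus the bottleneck penalty, weighted by beta (hypothetical)."""
    return task_loss + concept_loss + beta * ib_penalty(concept_probs)


if __name__ == "__main__":
    batch = [[0.9, 0.1, 0.5], [0.6, 0.4, 0.5]]
    print(regularized_loss(task_loss=1.0, concept_loss=0.5, concept_probs=batch))
```

Because the penalty only adds a term to the loss, a sketch like this leaves the network architecture untouched, which matches the claim in the abstract. Larger values of `beta` compress the concept layer more aggressively, at the risk of discarding task-relevant information.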
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Data Stream Mining Techniques · Domain Adaptation and Few-Shot Learning
