Direct Preference Optimization for Adaptive Concept-based Explanations

Jacopo Teneggi; Zhenzhen Wang; Paul H. Yi; Tianmin Shu; Jeremias Sulam

arXiv:2505.15626 · cs.LG · October 2, 2025

TL;DR

This paper introduces a listener-adaptive explanation method for machine learning models that uses preference optimization to generate more effective, context-aware explanations, improving human understanding and classification accuracy.

Contribution

The paper presents an iterative training approach that uses pairwise feedback to align explanations with listener preferences, improving interpretability in real-world scenarios.

Findings

01. Aligns explanations with simulated listener preferences
02. Improves human classification accuracy in user studies
03. Effective across multiple image classification datasets

Abstract

Concept-based explanation methods aim at making machine learning models more transparent by finding the most important semantic features of an input (e.g., colors, patterns, shapes) for a given prediction task. However, these methods generally ignore the communicative context of explanations, such as the preferences of a listener. For example, medical doctors understand explanations in terms of clinical markers, but patients may not, needing a different vocabulary to rationalize the same diagnosis. We address this gap with listener-adaptive explanations grounded in principles of pragmatic reasoning and the rational speech act. We introduce an iterative training procedure based on direct preference optimization where a speaker learns to compose explanations that maximize communicative utility for a listener. Our approach only needs access to pairwise preferences, which can be collected…
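To make the "pairwise preferences" idea in the abstract concrete, the following is a minimal sketch of the generic direct preference optimization (DPO) loss for a single preference pair. It is not taken from the paper or its repository: the function names, the scalar log-probability inputs, and the default `beta` are assumptions chosen for illustration. Here "chosen" stands for the explanation a listener preferred, "rejected" for the other explanation in the pair, and the reference log-probabilities come from a frozen copy of the speaker.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # hypothetical default; controls deviation from the reference
) -> float:
    """Generic DPO loss for one (chosen, rejected) pair of explanations.

    All inputs are log-probabilities of whole explanations: the first two
    under the speaker being trained, the last two under a frozen reference.
    """
    # How much more the trained speaker favors each explanation than the reference does.
    chosen_gain = logp_chosen - ref_logp_chosen
    rejected_gain = logp_rejected - ref_logp_rejected
    # The loss shrinks as the speaker favors the preferred explanation by a wider margin.
    return -log_sigmoid(beta * (chosen_gain - rejected_gain))


if __name__ == "__main__":
    # The speaker equals the reference, so there is no preference signal yet.
    print(dpo_loss(-2.0, -2.0, -2.0, -2.0))
    # The speaker has shifted probability mass toward the preferred explanation.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

In an iterative procedure of the kind the abstract describes, a loss like this would presumably be averaged over batches of listener-labeled pairs and minimized with respect to the speaker's parameters; how pairs are sampled and how the reference is updated between rounds is left to the paper.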

Code & Models

Repositories

Sulam-Group/pragmatixs (PyTorch, official)

Taxonomy

Topics: Semantic Web and Ontologies