A Neuro-Symbolic Benchmark Suite for Concept Quality and Reasoning Shortcuts
Samuele Bortolotti, Emanuele Marconato, Tommaso Carraro, Paolo Morettin, Emile van Krieken, Antonio Vergari, Stefano Teso, Andrea Passerini

TL;DR
This paper introduces rsbench, a benchmark suite for evaluating how well models learn and reason with concepts, addressing the problem of reasoning shortcuts that undermine interpretability and trustworthiness.
Contribution
The paper presents rsbench, a customizable benchmark with metrics and verification procedures to systematically assess concept quality and reasoning shortcuts in neural and neuro-symbolic models.
Findings
High-quality concept learning remains challenging for current models.
Reasoning shortcuts (RSs) significantly affect model reasoning and interpretability.
rsbench provides a systematic way to evaluate and mitigate RSs.
Abstract
The advent of powerful neural classifiers has increased interest in problems that require both learning and reasoning. These problems are critical for understanding important properties of models, such as trustworthiness, generalization, interpretability, and compliance to safety and structural constraints. However, recent research observed that tasks requiring both learning and reasoning on background knowledge often suffer from reasoning shortcuts (RSs): predictors can solve the downstream reasoning task without associating the correct concepts to the high-dimensional data. To address this issue, we introduce rsbench, a comprehensive benchmark suite designed to systematically evaluate the impact of RSs on models by providing easy access to highly customizable tasks affected by RSs. Furthermore, rsbench implements common metrics for evaluating concept quality and introduces novel…
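To make the notion of a reasoning shortcut concrete, the following is a minimal toy sketch and is not taken from rsbench or its API. It assumes a hypothetical task with two binary concepts whose label is their XOR, and enumerates every way a model could map true concepts to predicted concepts while still producing the correct label. All names (`xor_label`, `label_preserving_maps`, `concept_accuracy`) are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy task: two binary concepts, label = XOR of the concepts.
def xor_label(c1, c2):
    return c1 ^ c2

# All ground-truth concept vectors: (0,0), (0,1), (1,0), (1,1).
CONCEPTS = list(product([0, 1], repeat=2))

def label_preserving_maps():
    """Enumerate every map from true concepts to predicted concepts
    that still yields the correct label on every input."""
    found = []
    for images in product(CONCEPTS, repeat=len(CONCEPTS)):
        alpha = dict(zip(CONCEPTS, images))
        if all(xor_label(*alpha[c]) == xor_label(*c) for c in CONCEPTS):
            found.append(alpha)
    return found

def concept_accuracy(alpha):
    """Fraction of inputs whose predicted concepts match the true ones."""
    return sum(alpha[c] == c for c in CONCEPTS) / len(CONCEPTS)

maps = label_preserving_maps()
identity = {c: c for c in CONCEPTS}

# Every label-perfect map other than the identity is a reasoning shortcut.
shortcuts = [m for m in maps if m != identity]

# One such shortcut: flip both concepts. Labels stay correct, concepts do not.
flipped = {c: (1 - c[0], 1 - c[1]) for c in CONCEPTS}

print(len(maps), len(shortcuts), concept_accuracy(flipped))
```

In this toy setting, label accuracy alone cannot distinguish the intended concept mapping from the alternatives, which is why concept-level metrics are needed alongside task accuracy.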
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software Reliability and Analysis Research
