TL;DR
This paper uncovers class-dependent effects in perturbation-based evaluation of time series feature attributions, showing that the metrics' effectiveness varies across classes, and proposes a class-aware penalty framework to make the assessment more reliable.
Contribution
It reveals class-dependent biases in perturbation-based attribution evaluation and introduces a class-aware penalty framework to address these effects.
Findings
Perturbation effectiveness varies significantly across classes.
Class-dependent effects are linked to learned classifier biases.
The proposed framework improves attribution evaluation on class-imbalanced data.
Abstract
As machine learning models become increasingly prevalent in time series applications, Explainable Artificial Intelligence (XAI) methods are essential for understanding their predictions. Within XAI, feature attribution methods aim to identify which input features contribute the most to a model's prediction, with their evaluation typically relying on perturbation-based metrics. Through systematic empirical analysis across multiple datasets, model architectures, and perturbation strategies, we reveal previously overlooked class-dependent effects in these metrics: they show varying effectiveness across classes, achieving strong results for some while remaining less sensitive to others. In particular, we find that the most effective perturbation strategies often demonstrate the most pronounced class differences. Our analysis suggests that these effects arise from the learned biases of…
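To make the idea of a class-dependent perturbation metric concrete, here is a minimal sketch in Python. It is not the paper's implementation. The function name `per_class_perturbation_drop`, the zero-fill perturbation, the toy classifier, and the toy data are all assumptions chosen for illustration. The sketch replaces the k highest-attributed time steps of each series with a constant and reports the mean drop in true-class probability separately for each class.

```python
import numpy as np


def per_class_perturbation_drop(predict_proba, X, y, attributions, k, fill_value=0.0):
    """Mean drop in true-class probability, per class, after perturbing
    the k highest-attributed time steps of each univariate series.

    Hypothetical helper for illustration; not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    attributions = np.asarray(attributions, dtype=float)

    # Indices of the k most important time steps for every sample.
    top_k = np.argsort(-attributions, axis=1)[:, :k]

    # Replace those time steps with the chosen perturbation value.
    X_pert = X.copy()
    np.put_along_axis(X_pert, top_k, fill_value, axis=1)

    rows = np.arange(len(X))
    drop = predict_proba(X)[rows, y] - predict_proba(X_pert)[rows, y]

    # Aggregate per class instead of over the whole dataset.
    return {int(c): float(drop[y == c].mean()) for c in np.unique(y)}


def toy_predict_proba(X):
    """Assumed toy binary classifier: class 1 probability grows with the series sum."""
    p1 = 1.0 / (1.0 + np.exp(-(X.sum(axis=1) - 2.0)))
    return np.column_stack([1.0 - p1, p1])


# Toy data: class 0 series are flat zeros, class 1 series contain a bump.
X = np.zeros((4, 10))
X[2:, 3:7] = 1.0
y = np.array([0, 0, 1, 1])

# Assumed attributions: the bump gets high scores; a small ramp breaks ties.
attributions = X + 1e-3 * np.arange(10)

drops = per_class_perturbation_drop(toy_predict_proba, X, y, attributions, k=4)
print(drops)
```

In this toy setup the zero-fill perturbation leaves class 0 series unchanged, so the metric registers no drop for class 0 and a large drop for class 1. A dataset-level average would hide that gap, which is why the sketch reports the score per class.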
