StyLEx: Explaining Style Using Human Lexical Annotations
Shirley Anugrah Hayati, Kyumin Park, Dheeraj Rajagopal, Lyle Ungar, Dongyeop Kang

TL;DR
StyLEx is a model that learns to explain stylistic features in text using human annotations, improving explanation quality while maintaining high style classification performance across diverse datasets.
Contribution
It introduces a joint learning approach that aligns model explanations with human-annotated stylistic features, enhancing interpretability without sacrificing accuracy.
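A minimal sketch of what such a joint objective could look like, assuming a sentence-level cross-entropy term plus a token-level binary cross-entropy term against human annotations. The function names, the weight `alpha`, and the binary framing of token labels are illustrative assumptions, not details taken from the paper.

```python
import math

def cross_entropy(probs, gold_index):
    """Negative log-likelihood of the gold style class."""
    return -math.log(probs[gold_index])

def token_bce(token_probs, human_labels):
    """Mean binary cross-entropy between predicted token-importance
    probabilities and human 0/1 annotations of stylistic tokens."""
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for p, y in zip(token_probs, human_labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(human_labels)

def joint_loss(sent_probs, gold_style, token_probs, human_labels, alpha=0.5):
    """Hypothetical joint objective: style classification loss plus
    alpha times the explanation (token-level) loss."""
    task = cross_entropy(sent_probs, gold_style)
    explanation = token_bce(token_probs, human_labels)
    return task + alpha * explanation

# Example: a 2-class style prediction over a 3-token sentence in which
# annotators marked only the first token as stylistic.
loss = joint_loss([0.8, 0.2], 0, [0.9, 0.1, 0.2], [1, 0, 0])
```

Setting `alpha` to 0 reduces this to plain style classification; larger values push the token scores toward the human annotations.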
Findings
StyLEx's explanations show significant improvements on the sufficiency and plausibility explanation metrics.
It maintains high style classification accuracy on both in-domain and out-of-domain datasets.
Explanations from StyLEx are more understandable to humans than those from saliency-based methods.
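For readers unfamiliar with the two explanation metrics, here is a rough sketch of one common way to compute them. It assumes plausibility is measured as token-level F1 against human annotations and sufficiency follows the ERASER-style definition (the drop in predicted-class probability when only the explanation tokens are kept). The paper's exact definitions are not given in this summary, so both functions are assumptions.

```python
def token_f1(predicted, human):
    """Plausibility proxy: F1 between the set of token indices the model
    marks as important and the set marked by human annotators."""
    predicted, human = set(predicted), set(human)
    overlap = len(predicted & human)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(human)
    return 2 * precision * recall / (precision + recall)

def sufficiency(prob_full, prob_explanation_only):
    """ERASER-style sufficiency: probability of the predicted class on the
    full input minus the probability when only explanation tokens are kept.
    Values near zero mean the explanation alone supports the prediction."""
    return prob_full - prob_explanation_only

# Example: the model highlights tokens 0, 2, 3; annotators marked 2, 3, 5.
f1 = token_f1([0, 2, 3], [2, 3, 5])
gap = sufficiency(0.90, 0.85)
```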
Abstract
Large pre-trained language models have achieved impressive results on various style classification tasks, but they often learn spurious domain-specific words to make predictions (Hayati et al., 2021). While human explanation highlights stylistic tokens as important features for this task, we observe that model explanations often do not align with them. To tackle this issue, we introduce StyLEx, a model that learns from human-annotated explanations of stylistic features and jointly learns to perform the task and predict these features as model explanations. Our experiments show that StyLEx can provide human-like stylistic lexical explanations without sacrificing the performance of sentence-level style prediction on both in-domain and out-of-domain datasets. Explanations from StyLEx show significant improvements in explanation metrics (sufficiency, plausibility) and when evaluated with…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: ALIGN
