TL;DR
This paper introduces a two-tier attention framework for news classification that improves model explainability while keeping performance competitive with state-of-the-art alternatives, addressing the explainability challenge posed by complex attention-based deep learning models in NLP.
Contribution
It proposes a novel two-tier attention architecture that decouples explanation from decision-making in deep learning models for news classification.
Findings
Achieves performance competitive with many state-of-the-art models
Improves explainability of attention mechanisms
Evaluated on two large-scale news corpora
Abstract
Many recent deep learning-based solutions have widely adopted attention mechanisms in various NLP tasks. However, the inherent characteristics of deep learning models and the flexibility of the attention mechanism increase the models' complexity, thus leading to challenges in model explainability. In this paper, to address this challenge, we propose a novel practical framework that uses a two-tier attention architecture to decouple the complexity of explanation from the decision-making process. We apply it in the context of a news article classification task. Experiments on two large-scale news corpora demonstrate that the proposed model achieves performance competitive with many state-of-the-art alternatives and illustrate its appropriateness from an explainability perspective.
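The abstract does not say what the two tiers operate on, so the following is only a rough sketch of generic two-level attention pooling, not the authors' architecture. It assumes the first tier attends over word vectors within each sentence and the second tier attends over the resulting sentence vectors. The function names, the dot-product scoring, and the toy vectors are all hypothetical choices made for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, query):
    # Dot-product attention: score each vector against the query,
    # normalise the scores, and return the weights and weighted sum.
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return weights, pooled

def two_tier_attention(document, word_query, sentence_query):
    # Tier 1 (assumed): attend over the words of each sentence.
    word_weights, sentence_vectors = [], []
    for sentence in document:
        weights, sentence_vector = attention_pool(sentence, word_query)
        word_weights.append(weights)
        sentence_vectors.append(sentence_vector)
    # Tier 2 (assumed): attend over the sentence vectors.
    sentence_weights, doc_vector = attention_pool(sentence_vectors, sentence_query)
    return doc_vector, sentence_weights, word_weights

# Toy document: two sentences of 2-dimensional word vectors (made-up values).
document = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [1.0, 1.0], [0.0, 0.0]],
]
doc_vector, sentence_weights, word_weights = two_tier_attention(
    document, word_query=[1.0, 0.0], sentence_query=[0.0, 1.0]
)
```

In a sketch like this, `doc_vector` would feed a classifier, while `sentence_weights` and `word_weights` can be inspected separately as a form of explanation. How the paper actually decouples explanation from decision-making may differ.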
