Better Reasoning Behind Classification Predictions with BERT for Fake News Detection
Daesoo Lee

TL;DR
This paper enhances fake news detection by analyzing BERT's representation space and introducing a modified CAM method to interpret word-level contributions, achieving robust classification performance.
Contribution
It proposes a new interpretability method using modified CAM with BERT for fake news detection, emphasizing the importance of representation quality.
Findings
Representation space quality affects classification performance
Modified CAM provides word-level interpretability
BERT with a linear layer achieves robust results
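The idea of word-level CAM scores can be illustrated with a minimal sketch. It assumes that token embeddings are mean-pooled and then passed to a single linear layer, so each token's share of a class logit is simply its embedding dotted with that class's weight vector, divided by the number of tokens. The pooling choice and the names `token_cam_scores` and `class_logit` are assumptions made for illustration; the paper's actual modification of CAM may differ.

```python
from typing import List, Sequence


def token_cam_scores(
    token_embeddings: Sequence[Sequence[float]],
    class_weights: Sequence[float],
) -> List[float]:
    """Per-token contribution to one class logit.

    Assumes mean pooling over tokens followed by a linear layer
    (an illustrative assumption, not necessarily the paper's setup).
    """
    n_tokens = len(token_embeddings)
    return [
        sum(w * h for w, h in zip(class_weights, emb)) / n_tokens
        for emb in token_embeddings
    ]


def class_logit(
    token_embeddings: Sequence[Sequence[float]],
    class_weights: Sequence[float],
    class_bias: float = 0.0,
) -> float:
    """Class logit computed the usual way: mean-pool, then linear layer."""
    n_tokens = len(token_embeddings)
    dim = len(class_weights)
    pooled = [sum(emb[d] for emb in token_embeddings) / n_tokens for d in range(dim)]
    return sum(w * p for w, p in zip(class_weights, pooled)) + class_bias


# Toy example: three tokens with two-dimensional embeddings (made-up numbers).
embeddings = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
weights = [0.5, -1.0]
bias = 0.1

scores = token_cam_scores(embeddings, weights)
logit = class_logit(embeddings, weights, bias)
```

Under these assumptions the per-token scores sum to the class logit minus the bias, which is what makes them readable as word-level contributions to the prediction.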
Abstract
Fake news detection has become an important task as the amount of fake news on the internet has grown in recent years. Although many classification models based on statistical learning methods have been proposed and show good results, the reasoning behind their classification performance may not be sufficiently explained. Studies in self-supervised learning have highlighted that the quality of the representation (embedding) space matters and directly affects downstream task performance. In this study, the quality of the representation space is analyzed visually and analytically in terms of the linear separability of the classes on a real and fake news dataset. To add further interpretability to the classification model, a modification of Class Activation Mapping (CAM) is proposed. The modified CAM provides a CAM score for each word token, where the CAM score on a word token denotes…
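One simple way to probe linear separability of two classes in an embedding space is to train a perceptron and check whether it reaches zero training errors. The sketch below is only an illustration of that idea on toy 2-D points. The abstract does not say which separability analysis the paper uses, and the function name `perceptron_separable` and the epoch budget are assumptions.

```python
from typing import Sequence


def perceptron_separable(
    points: Sequence[Sequence[float]],
    labels: Sequence[int],
    max_epochs: int = 100,
) -> bool:
    """Return True if a perceptron finds a separating hyperplane.

    Labels must be -1 or +1. A False result only means no separator was
    found within max_epochs, not a proof that none exists.
    """
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin <= 0:  # misclassified or on the boundary: update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return True
    return False


# Toy stand-ins for class embeddings (made-up points, not real data).
separable = perceptron_separable([(0, 0), (0, 1), (2, 2), (3, 2)], [-1, -1, 1, 1])
xor_like = perceptron_separable([(0, 0), (1, 1), (0, 1), (1, 0)], [-1, -1, 1, 1])
```

In practice the points would be sentence-level embeddings from the encoder, and a linear classifier's held-out accuracy would be a more informative measure than this yes/no check.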
Taxonomy
Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Warmup With Linear Decay · Weight Decay · Layer Normalization · Dropout · Residual Connection · WordPiece · Softmax
