MeLT: Message-Level Transformer with Masked Document Representations as Pre-Training for Stance Detection

Matthew Matero, Nikita Soni, Niranjan Balasubramanian, and H. Andrew Schwartz

arXiv:2109.08113·cs.CL·November 3, 2021

TL;DR

MeLT is a hierarchical message-level transformer pre-trained on Twitter data. It models sequences of messages and is trained to reconstruct masked message vectors, reaching a 67% F1 score on stance detection.

Contribution

Introduces MeLT, a novel message-level transformer pre-trained with masked message vector reconstruction for stance detection in social media.

Findings

1. Achieves a 67% F1 score on stance detection.

2. Modeling sequences of messages improves prediction of message attributes.

3. Pre-training with masked message vectors improves downstream task performance.

Abstract

Much of natural language processing is focused on leveraging large-capacity language models, typically trained over single messages with a task of predicting one or more tokens. However, modeling human language at higher levels of context (i.e., sequences of messages) is under-explored. In stance detection and other social media tasks where the goal is to predict an attribute of a message, we have contextual data that is loosely semantically connected by authorship. Here, we introduce Message-Level Transformer (MeLT), a hierarchical message encoder pre-trained over Twitter and applied to the task of stance prediction. We focus on stance prediction as a task benefiting from knowing the context of the message (i.e., the sequence of previous messages). The model is trained using a variant of masked language modeling, where instead of predicting tokens, it seeks to generate an entire…
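The masked message-vector objective summarized above can be illustrated with a small sketch. It hides some message vectors in an author's sequence, predicts them from the remaining context, and scores the prediction only at the hidden positions. Everything here is an assumption for illustration and is not taken from the paper: the function names, the all-zero mask vector, the 15% mask rate, the mean-squared-error loss, and above all the predictor, which is a simple mean of the visible vectors standing in for the message-level transformer.

```python
import random


def mask_messages(messages, mask_rate, rng):
    """Hide a fraction of message vectors; return (masked sequence, hidden indices)."""
    n = len(messages)
    dim = len(messages[0])
    k = max(1, round(mask_rate * n))  # always hide at least one message
    masked_idx = sorted(rng.sample(range(n), k))
    hidden = set(masked_idx)
    # Assumption: an all-zero vector stands in for a learned mask embedding.
    masked = [[0.0] * dim if i in hidden else list(vec)
              for i, vec in enumerate(messages)]
    return masked, masked_idx


def predict_from_context(masked, masked_idx):
    """Placeholder predictor (NOT a transformer): mean of the visible vectors."""
    hidden = set(masked_idx)
    dim = len(masked[0])
    visible = [vec for i, vec in enumerate(masked) if i not in hidden]
    if visible:
        mean = [sum(vec[d] for vec in visible) / len(visible) for d in range(dim)]
    else:
        mean = [0.0] * dim  # nothing visible to average over
    return {i: list(mean) for i in masked_idx}


def reconstruction_loss(messages, predictions):
    """Mean squared error between predicted and true vectors at hidden positions."""
    total, count = 0.0, 0
    for i, pred in predictions.items():
        for p, t in zip(pred, messages[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy sequence: six messages from one author, each a 4-dimensional vector.
rng = random.Random(0)
messages = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]
masked, masked_idx = mask_messages(messages, mask_rate=0.15, rng=rng)
predictions = predict_from_context(masked, masked_idx)
print("hidden positions:", masked_idx)
print("reconstruction loss:", reconstruction_loss(messages, predictions))
```

In an actual pre-training setup, the placeholder predictor would be replaced by a trainable sequence model, and the loss would be minimized over many message sequences.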

Code & Models

Repositories

matthewmatero/melt
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Authorship Attribution and Profiling · Misinformation and Its Impacts

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing · Dropout · Softmax · Byte Pair Encoding · Layer Normalization