How does a Pre-Trained Transformer Integrate Contextual Keywords? Application to Humanitarian Computing

Barriere Valentin; Jacquet Guillaume

arXiv:2111.04052·cs.CL·November 9, 2021

TL;DR

This paper investigates how a pre-trained transformer incorporates contextual keywords supplied as textual metadata, and improves humanitarian crisis tweet classification by adding the crisis event type to each tweet.

Contribution

It demonstrates how to effectively integrate textual metadata into pre-trained transformers and analyzes their behavior and potential overfitting in a humanitarian context.

Findings

01. Metadata improves classification accuracy

02. Model leverages semantic information from metadata

03. Partial overfitting observed on Crisis Benchmark

Abstract

In a classification task, handling text snippets together with their metadata usually calls for a multimodal approach. When the metadata are themselves textual, it is tempting to feed them directly to a pre-trained transformer in order to leverage the semantic information encoded inside the model. This paper describes how to improve a humanitarian classification task by adding the crisis event type to each tweet to be classified. Through additional experiments on the model's weights and behavior, it identifies how the proposed neural network approach partially overfits the particularities of the Crisis Benchmark, and shows that the model is nonetheless learning to use and take advantage of the textual semantics of the metadata.
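
To make the idea of "adding the crisis event type to each tweet" concrete, here is a minimal sketch of one way textual metadata could be paired with a tweet before it reaches a BERT-style encoder. The function names (`build_pair`, `render_bert_style`), the lowercasing of the event type, and the choice to put the metadata first are all illustrative assumptions; the listing above does not specify the exact input format the authors used.

```python
from typing import Tuple


def build_pair(event_type: str, tweet: str) -> Tuple[str, str]:
    """Return (metadata, text) as a sentence pair.

    Assumption: the event type is passed as plain text in the first
    segment and the tweet in the second. The ordering is hypothetical.
    """
    metadata = event_type.strip().lower()
    # Collapse runs of whitespace and newlines inside the tweet.
    text = " ".join(tweet.split())
    return metadata, text


def render_bert_style(pair: Tuple[str, str],
                      cls: str = "[CLS]",
                      sep: str = "[SEP]") -> str:
    """Show how a BERT-style tokenizer would lay out the two segments.

    This only builds a readable string for inspection; a real pipeline
    would hand both segments to the model's own tokenizer instead.
    """
    metadata, text = pair
    return f"{cls} {metadata} {sep} {text} {sep}"


if __name__ == "__main__":
    pair = build_pair(" Earthquake ", "Bridge  collapsed\nnear the town centre")
    print(render_bert_style(pair))
```

In practice the two segments would be passed to the pre-trained model's tokenizer as a text pair (for example, Hugging Face tokenizers accept a second `text_pair` argument), so that the special tokens and segment ids are produced by the tokenizer rather than by string formatting.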

Taxonomy

Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Natural Language Processing Techniques