Examining the rhetorical capacities of neural language models

Zining Zhu; Chuer Pan; Mohamed Abdalla; Frank Rudzicz

arXiv:2010.00153 · cs.CL · October 6, 2020

Open Access

TL;DR

This paper introduces a method to evaluate the ability of neural language models to understand and encode rhetorical structures in discourse, revealing differences among models like BERT, GPT-2, and XLNet.

Contribution

It presents a novel quantitative approach to assess the rhetorical understanding of neural language models based on Rhetorical Structure Theory.

Findings

01. BERT-based models encode richer discourse knowledge than the other Transformer LMs examined.

02. GPT-2 and XLNet appear to encode less rhetorical knowledge.

03. The method provides a new way to measure rhetorical capacities.

Abstract

Recently, neural language models (LMs) have demonstrated impressive abilities in generating high-quality discourse. While many recent papers have analyzed the syntactic aspects encoded in LMs, there has been no analysis to date of the inter-sentential, rhetorical knowledge. In this paper, we propose a method that quantitatively evaluates the rhetorical capacities of neural LMs. We examine how well neural LMs understand the rhetoric of discourse by evaluating their ability to encode a set of linguistic features derived from Rhetorical Structure Theory (RST). Our experiments show that BERT-based LMs outperform other Transformer LMs, revealing the richer discourse knowledge in their intermediate layer representations. In addition, GPT-2 and XLNet apparently encode less rhetorical knowledge, and we suggest an explanation drawing from linguistic philosophy. Our method shows an…
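
The evaluation idea described in the abstract, checking whether a model's representations encode RST-derived features, can be illustrated with a generic probing sketch. This is not the authors' implementation: the data is synthetic, the "RST feature" is a made-up per-document count, the two "models" are hypothetical stand-ins, and a ridge-regularized linear probe is assumed purely for illustration.

```python
import numpy as np


def probe_r2(reps, feature, train_frac=0.8, ridge=1e-2):
    """Fit a ridge-regularized linear probe; return R^2 on held-out rows."""
    split = int(reps.shape[0] * train_frac)
    # Append a bias column so the probe can learn an intercept.
    x = np.hstack([reps, np.ones((reps.shape[0], 1))])
    x_tr, x_te = x[:split], x[split:]
    y_tr, y_te = feature[:split], feature[split:]
    # Closed-form ridge solution: (X^T X + lambda I)^-1 X^T y
    gram = x_tr.T @ x_tr + ridge * np.eye(x_tr.shape[1])
    weights = np.linalg.solve(gram, x_tr.T @ y_tr)
    pred = x_te @ weights
    ss_res = np.sum((y_te - pred) ** 2)
    ss_tot = np.sum((y_te - y_te.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
n_docs, dim = 500, 32

# Hypothetical RST-derived feature (assumption): a count per document,
# e.g. how often some relation type occurs in its discourse tree.
rst_feature = rng.poisson(3.0, size=n_docs).astype(float)

# Stand-in "model A": the feature is linearly present in the vectors.
direction = rng.normal(size=dim)
reps_a = np.outer(rst_feature, direction) + rng.normal(size=(n_docs, dim))

# Stand-in "model B": vectors carry no information about the feature.
reps_b = rng.normal(size=(n_docs, dim))

r2_a = probe_r2(reps_a, rst_feature)
r2_b = probe_r2(reps_b, rst_feature)
print(f"probe R^2, model A: {r2_a:.3f}")
print(f"probe R^2, model B: {r2_b:.3f}")
```

In a sketch like this, a higher held-out score for one set of representations is read as evidence that the feature is more recoverable from them. With real models, the synthetic vectors would be replaced by intermediate-layer representations and the count by features extracted from RST parses.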

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Cosine Annealing · Discriminative Fine-Tuning · Weight Decay · Linear Warmup With Cosine Annealing · Attention Dropout · GPT-2 · Dense Connections