Using BERT Encoding and Sentence-Level Language Model for Sentence Ordering

Melika Golestani, Seyedeh Zahra Razavi, Zeinab Borhanifard, Farnaz Tahmasebian, and Hesham Faili

arXiv:2108.10986·cs.CL·August 26, 2021

TL;DR

This paper introduces a sentence ordering method that combines BERT-based sentence embeddings with a Universal Transformer language model, whose attention mechanism captures dependencies between sentences, and reports more accurate ordering of short stories than earlier methods.

Contribution

It presents a new approach combining BERT embeddings and Universal Transformers for sentence ordering, outperforming previous models on the ROCStories dataset.

Findings

1. Achieved higher Perfect Match Ratio (PMR) scores than previous state-of-the-art methods.

2. Demonstrated the effectiveness of attention-based models in capturing sentence dependencies.

3. Validated the approach on a large corpus of nearly 100K short stories.
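
The Perfect Match Ratio named in the first finding is commonly defined as the fraction of stories whose predicted sentence order matches the gold order exactly. The following is a minimal sketch of that metric, assuming each order is given as a list of sentence indices; the function name `perfect_match_ratio` is illustrative and does not come from the paper.

```python
from typing import Sequence


def perfect_match_ratio(
    predicted: Sequence[Sequence[int]],
    gold: Sequence[Sequence[int]],
) -> float:
    """Fraction of stories whose predicted order equals the gold order exactly.

    Assumption: each order is a sequence of sentence indices, one per story.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must contain the same number of stories")
    if not gold:
        return 0.0
    # A story counts only if every position matches; partial credit is not given.
    exact = sum(1 for p, g in zip(predicted, gold) if list(p) == list(g))
    return exact / len(gold)


# Two stories: the first is ordered perfectly, the second has two sentences swapped.
example_score = perfect_match_ratio(
    [[0, 1, 2, 3, 4], [1, 0, 2, 3, 4]],
    [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4]],
)
```

Because a single misplaced sentence makes the whole story count as a miss, PMR is a strict measure compared with position-wise accuracy.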

Abstract

Discovering the logical sequence of events is one of the cornerstones in Natural Language Understanding. One approach to learning the sequence of events is to study the order of sentences in a coherent text. Sentence ordering can be applied in various tasks such as retrieval-based Question Answering, document summarization, storytelling, text generation, and dialogue systems. Furthermore, we can learn to model text coherence by learning how to order a set of shuffled sentences. Previous research has relied on RNN, LSTM, and BiLSTM architectures for learning text language models. However, these networks have performed poorly due to the lack of attention mechanisms. We propose an algorithm for sentence ordering in a corpus of short stories. Our proposed method uses a language model based on Universal Transformers (UT) that captures sentences' dependencies by employing an attention mechanism.…
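
To make the task in the abstract concrete, sentence ordering can be framed as a search over permutations of the shuffled sentences, scored by how well each sentence follows the previous one. The sketch below is an illustration under stated assumptions, not the authors' algorithm: the exhaustive search, the function name `order_sentences`, and the `pair_score` callback are all hypothetical, and the toy scorer merely stands in for a learned model such as the attention-based language model the abstract describes.

```python
from itertools import permutations
from typing import Callable, List, Sequence


def order_sentences(
    sentences: Sequence[str],
    pair_score: Callable[[str, str], float],
) -> List[int]:
    """Return the index order that maximizes the summed adjacent-pair scores.

    Assumption: pair_score(a, b) is higher when b plausibly follows a.
    Exhaustive search is only practical for short texts (n! candidate orders).
    """
    n = len(sentences)
    best_order = tuple(range(n))
    best_total = float("-inf")
    for order in permutations(range(n)):
        total = sum(
            pair_score(sentences[a], sentences[b])
            for a, b in zip(order, order[1:])
        )
        if total > best_total:
            best_order, best_total = order, total
    return list(best_order)


def toy_score(a: str, b: str) -> float:
    """Toy stand-in for a learned scorer: rewards consecutive letters."""
    return 1.0 if ord(b) - ord(a) == 1 else 0.0


# Shuffled input "c", "a", "b" should be restored to "a", "b", "c".
shuffled = ["c", "a", "b"]
restored = [shuffled[i] for i in order_sentences(shuffled, toy_score)]
```

In practice the scorer would be replaced by a trained model, and for longer texts the factorial search would need to give way to a cheaper strategy such as greedy or beam search.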

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Tanh Activation · Byte Pair Encoding · Sigmoid Activation · Attention Dropout