INSET: Sentence Infilling with INter-SEntential Transformer

Yichen Huang; Yizhe Zhang; Oussama Elachqar; Yu Cheng

arXiv:1911.03892 · cs.CL · August 4, 2020

TL;DR

This paper introduces INSET, a transformer-based framework that improves sentence infilling by decoupling understanding, planning, and generation, leveraging pre-trained models like BERT and GPT-2.

Contribution

It proposes a novel framework that separates the challenges of understanding, planning, and generation in sentence infilling, utilizing existing large-scale pre-trained models.

Findings

1. Effective learning of sentence representations for generation
2. Successful generation of contextually fitting missing sentences
3. Demonstrated improvements over baseline models

Abstract

Missing sentence generation (or sentence infilling) fosters a wide range of applications in natural language generation, such as document auto-completion and meeting note expansion. This task asks the model to generate intermediate missing sentences that can syntactically and semantically bridge the surrounding context. Solving the sentence infilling task requires techniques in natural language processing ranging from understanding to discourse-level planning to generation. In this paper, we propose a framework to decouple the challenge and address these three aspects respectively, leveraging the power of existing large-scale pre-trained models such as BERT and GPT-2. We empirically demonstrate the effectiveness of our model in learning a sentence representation for generation and further generating a missing sentence that fits the context.
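
The decoupling described in the abstract can be pictured as three separate stages chained together. The following is only a toy sketch under loud assumptions: the function names (`encode`, `plan`, `generate`, `infill`) are hypothetical, the bag-of-words vectors stand in for a learned sentence representation, averaging stands in for discourse-level planning, and picking from a candidate pool stands in for a generator. None of it is the paper's model, and no BERT or GPT-2 component is used here.

```python
import math
from collections import Counter


def encode(sentence):
    """Understanding (toy stand-in): bag-of-words vector for one sentence."""
    return Counter(sentence.lower().split())


def plan(context_vectors):
    """Planning (toy stand-in): guess the missing sentence's vector as the
    average of the surrounding sentence vectors."""
    if not context_vectors:
        raise ValueError("at least one context sentence is required")
    total = Counter()
    for vec in context_vectors:
        total.update(vec)  # adds word counts together
    n = len(context_vectors)
    return {word: count / n for word, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def generate(planned_vector, candidates):
    """Generation (toy stand-in): return the candidate sentence whose
    encoding is closest to the planned vector."""
    return max(candidates, key=lambda s: cosine(planned_vector, encode(s)))


def infill(before, after, candidates):
    """Chain the three stages: encode the context, plan, then generate."""
    context = [encode(s) for s in before + after]
    return generate(plan(context), candidates)


if __name__ == "__main__":
    print(infill(
        before=["the team met on monday"],
        after=["the team approved the budget"],
        candidates=["the team discussed the budget", "cats sleep all day"],
    ))
```

Keeping the stages as separate functions mirrors the point of the abstract: each one can be swapped for a stronger component (for example a pre-trained encoder or decoder) without touching the other two.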

Code & Models

Repositories

dreasysnail/INSET (PyTorch, official)

Taxonomy

Methods

Linear Layer · Cosine Annealing · Discriminative Fine-Tuning · Linear Warmup With Cosine Annealing · Byte Pair Encoding · GPT-2 · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay