Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling

Prathyusha Jwalapuram; Shafiq Joty; Xiang Lin

arXiv:2110.07198 · cs.CL · March 22, 2022

TL;DR

This paper proposes a new approach to neural coherence modeling: it increases the ratio of negative samples within a contrastive learning setup and adds a global negative queue, with the aim of better evaluating the coherence of machine-generated text.

Contribution

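The paper introduces a contrastive learning setup with a higher ratio of negative samples, automatic hard negative mining, and a large global negative queue encoded by a momentum encoder, and it reports significant improvements over the state of the art using a basic model architecture.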
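
To make the contrastive setup concrete, the sketch below scores one coherent document against several incoherent ones and computes a softmax-style contrastive loss. This is one common form of such a loss, not necessarily the authors' exact objective; the function name `contrastive_loss` and the use of scalar coherence scores are assumptions made for illustration.

```python
import math


def contrastive_loss(pos_score, neg_scores):
    """Negative log-probability of the positive document among all candidates.

    pos_score: scalar coherence score of the coherent (positive) document.
    neg_scores: scalar scores of the incoherent (negative) documents.
    Illustrative sketch only; the scoring model itself is not shown.
    """
    scores = [pos_score] + list(neg_scores)
    top = max(scores)  # subtract the max for numerical stability
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_denominator - pos_score


# With equal scores, adding negatives makes the objective harder:
loss_one_negative = contrastive_loss(0.0, [0.0])
loss_four_negatives = contrastive_loss(0.0, [0.0] * 4)
```

With all scores equal, the loss is the log of the number of candidates, so more negatives per positive mean the model must separate the positive more sharply to reach the same loss.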

Findings

01

Increasing negative sample density improves model performance.

02

Global negative queue stabilizes training and enhances coherence evaluation.

03

Significant improvements on real-world coherence assessment tasks.
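
The first finding concerns how many negative samples accompany each positive document. Under the permuted document task mentioned in the abstract, a negative is a shuffled copy of a coherent document, so the density can be raised by drawing more distinct permutations. The sketch below is a minimal illustration; the helper name `sample_permuted_negatives` and its parameters are hypothetical and do not come from the paper.

```python
import random


def sample_permuted_negatives(sentences, num_negatives, seed=0, max_tries=1000):
    """Return up to num_negatives distinct sentence orderings that differ
    from the original. Hypothetical helper for illustration only."""
    rng = random.Random(seed)
    original = tuple(sentences)
    seen = set()
    negatives = []
    tries = 0
    # max_tries bounds the loop when fewer distinct permutations exist
    # than were requested (e.g. a two-sentence document has only one).
    while len(negatives) < num_negatives and tries < max_tries:
        tries += 1
        candidate = list(original)
        rng.shuffle(candidate)
        key = tuple(candidate)
        if key != original and key not in seen:
            seen.add(key)
            negatives.append(candidate)
    return negatives


document = ["s1", "s2", "s3", "s4"]
negatives = sample_permuted_negatives(document, num_negatives=5)
```

Each returned ordering contains the same sentences as the original document, so only the discourse order distinguishes the positive from its negatives.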

Abstract

Given the claims of improved text generation quality across various pre-trained neural models, we consider the coherence evaluation of machine-generated text to be one of the principal applications of coherence models that needs to be investigated. Prior work in neural coherence modeling has primarily focused on devising new architectures for solving the permuted document task. We instead use a basic model architecture and show significant improvements over the state of the art within the same training regime. We then design a harder self-supervision objective by increasing the ratio of negative samples within a contrastive learning setup, and enhance the model further through automatic hard negative mining coupled with a large global negative queue encoded by a momentum encoder. We show empirically that increasing the density of negative samples improves the basic model, and using a global…
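
The abstract mentions a global negative queue encoded by a momentum encoder. A minimal sketch of the two ingredients follows, assuming the usual momentum-contrast recipe: the momentum encoder's parameters are an exponential moving average of the main encoder's parameters, and encoded negatives are kept in a fixed-size first-in, first-out queue. The names `momentum_update` and `NegativeQueue`, the momentum value, and the flat-list parameter representation are all assumptions for illustration, not the authors' implementation.

```python
from collections import deque


def momentum_update(key_params, query_params, momentum=0.999):
    """Move each momentum-encoder parameter slowly toward the main encoder's.

    Parameters are plain lists of floats here; a real model would apply the
    same rule tensor by tensor. The default momentum is an assumed value.
    """
    return [momentum * k + (1.0 - momentum) * q
            for k, q in zip(key_params, query_params)]


class NegativeQueue:
    """Fixed-size FIFO store of encoded negatives (illustrative only)."""

    def __init__(self, max_size):
        # deque with maxlen drops the oldest entries once the queue is full
        self._items = deque(maxlen=max_size)

    def enqueue(self, representations):
        self._items.extend(representations)

    def negatives(self):
        return list(self._items)


queue = NegativeQueue(max_size=3)
queue.enqueue(["neg_a", "neg_b"])
queue.enqueue(["neg_c", "neg_d"])  # "neg_a" is evicted as the oldest entry
updated = momentum_update([1.0, 2.0], [0.0, 0.0], momentum=0.9)
```

Because the queue persists across batches, the number of negatives available per positive is no longer tied to the batch size, and the slow-moving encoder keeps older queue entries roughly consistent with newer ones.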

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science

Methods: Test · Contrastive Learning