Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling
Prathyusha Jwalapuram, Shafiq Joty, Xiang Lin

TL;DR
This paper rethinks the self-supervision objective for neural coherence modeling: it increases the ratio of negative samples in a contrastive learning setup and adds hard negative mining with a global negative queue, which improves coherence evaluation of machine-generated text.
Contribution
It introduces a contrastive learning setup with dense negative sampling, automatic hard negative mining, and a global negative queue encoded by a momentum encoder, and reports significant improvements over the state of the art with a basic model architecture.
Findings
Increasing negative sample density improves model performance.
A global negative queue stabilizes training and enhances coherence evaluation.
The approach yields significant improvements on real-world coherence assessment tasks.
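The first finding can be illustrated with a minimal sketch of a contrastive objective over one coherent document and several incoherent ones. This is not the authors' implementation: the function names `permuted_negatives` and `contrastive_loss` are hypothetical, the negatives are built by shuffling sentence order (the permuted document task mentioned in the abstract), and the loss is a plain softmax contrastive loss chosen for illustration. The coherence scores are assumed to come from some scoring model that is not shown.

```python
import math
import random


def permuted_negatives(sentences, num_negatives, seed=0):
    """Sample distinct sentence orderings that differ from the original.

    Assumption: negatives are sentence-order permutations of the document.
    """
    rng = random.Random(seed)
    original = tuple(sentences)
    negatives = set()
    attempts = 0
    # Bounded loop so short or repetitive documents cannot hang the sampler.
    while len(negatives) < num_negatives and attempts < 100 * num_negatives:
        attempts += 1
        candidate = list(original)
        rng.shuffle(candidate)
        if tuple(candidate) != original:
            negatives.add(tuple(candidate))
    # Sort for a deterministic return order.
    return [list(n) for n in sorted(negatives)]


def contrastive_loss(pos_score, neg_scores):
    """Negative log-softmax of the positive score against all negatives."""
    scores = [pos_score] + list(neg_scores)
    top = max(scores)  # subtract the max for numerical stability
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_denominator - pos_score


document = ["s1", "s2", "s3", "s4"]
negatives = permuted_negatives(document, num_negatives=5)
# Placeholder scores: an untrained scorer that rates every document equally.
loss = contrastive_loss(0.0, [0.0] * len(negatives))
```

When the scorer cannot tell the documents apart, this loss equals log(1 + N) for N negatives, so raising the number of negatives per positive makes the objective harder to satisfy, which is the sense in which the denser setup is a harder self-supervision task.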
Abstract
Given the claims of improved text generation quality across various pre-trained neural models, we consider the coherence evaluation of machine-generated text to be one of the principal applications of coherence models that needs to be investigated. Prior work in neural coherence modeling has primarily focused on devising new architectures for solving the permuted document task. We instead use a basic model architecture and show significant improvements over state of the art within the same training regime. We then design a harder self-supervision objective by increasing the ratio of negative samples within a contrastive learning setup, and enhance the model further through automatic hard negative mining coupled with a large global negative queue encoded by a momentum encoder. We show empirically that increasing the density of negative samples improves the basic model, and using a global…
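The three mechanisms named in the abstract (hard negative mining, a global negative queue, and a momentum encoder) can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's code: the names `momentum_update`, `NegativeQueue`, and `mine_hard_negatives` are hypothetical, the momentum value and queue size are placeholders, parameters are flat lists of floats, and "hard" is assumed to mean the negatives the current model scores as most coherent.

```python
from collections import deque


def momentum_update(query_params, key_params, momentum=0.999):
    """Move the momentum encoder slowly toward the trained encoder.

    Assumption: an exponential moving average over parameters.
    """
    return [
        momentum * k + (1.0 - momentum) * q
        for q, k in zip(query_params, key_params)
    ]


class NegativeQueue:
    """Fixed-size FIFO store of negative representations shared across batches."""

    def __init__(self, max_size):
        # deque with maxlen drops the oldest entries automatically.
        self._items = deque(maxlen=max_size)

    def enqueue(self, representations):
        self._items.extend(representations)

    def items(self):
        return list(self._items)

    def __len__(self):
        return len(self._items)


def mine_hard_negatives(candidates, score_fn, k):
    """Keep the k negatives that the current model scores highest."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


queue = NegativeQueue(max_size=3)
queue.enqueue([1.0, 2.0])
queue.enqueue([3.0, 4.0])  # the oldest entry (1.0) is evicted
hard = mine_hard_negatives(queue.items(), score_fn=lambda x: x, k=2)
key_params = momentum_update(query_params=[1.0], key_params=[0.0], momentum=0.9)
```

The slow update keeps representations in the queue roughly consistent with one another even though they were encoded at different training steps, which is one plausible reason a queue of this kind can be large without holding stale entries.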
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science
Methods: Test · Contrastive Learning
