Transformer over Pre-trained Transformer for Neural Text Segmentation with Enhanced Topic Coherence

Kelvin Lo; Yuan Jin; Weicong Tan; Ming Liu; Lan Du; Wray Buntine

arXiv:2110.07160 · cs.CL · October 15, 2021

TL;DR

This paper introduces Transformer$^{2}$, a hierarchical transformer framework that uses pre-trained sentence encoders to improve neural text segmentation and topic coherence, outperforming state-of-the-art segmentation models on a semantic coherence measure.

Contribution

It presents a novel transformer-over-transformer architecture that combines pre-trained sentence encoders with an upper-level transformer trained under a multi-task loss to predict both segment boundaries and per-sentence topic labels, improving segmentation accuracy.

Findings

01 Outperforms state-of-the-art segmentation models on semantic coherence.

02 Pre-trained knowledge enhances segmentation performance.

03 Language-specific pre-trained encoders yield better results than domain-specific ones.

Abstract

This paper proposes a transformer-over-transformer framework, called Transformer$^{2}$, to perform neural text segmentation. It consists of two components: bottom-level sentence encoders using pre-trained transformers, and an upper-level transformer-based segmentation model based on the sentence embeddings. The bottom-level component transfers the pre-trained knowledge learned from large external corpora under both single and pair-wise supervised NLP tasks to model the sentence embeddings for the documents. Given the sentence embeddings, the upper-level transformer is trained to recover the segmentation boundaries as well as the topic labels of each sentence. Equipped with a multi-task loss and the pre-trained knowledge, Transformer$^{2}$ can better capture the semantic coherence within the same segments. Our experiments show that (1) Transformer$^{2}$ manages to surpass state-of-the-art text…
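The multi-task objective described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes the upper-level model emits, for each sentence, a pair of boundary scores and a vector of topic scores, and it combines the two cross-entropy terms with a hypothetical weight `topic_weight`. The sentence encoders and the transformer itself are omitted, and all function and parameter names are illustrative.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def multitask_loss(boundary_logits, boundary_labels,
                   topic_logits, topic_labels, topic_weight=1.0):
    """Mean boundary cross-entropy plus weighted mean topic cross-entropy.

    boundary_logits: per sentence, scores for [no boundary, boundary]
    boundary_labels: per sentence, 0 or 1
    topic_logits:    per sentence, one score per topic
    topic_labels:    per sentence, index of the true topic
    topic_weight:    hypothetical balancing factor (an assumption,
                     not a value taken from the paper)
    """
    n = len(boundary_labels)
    boundary_ce = -sum(
        log_softmax(scores)[label]
        for scores, label in zip(boundary_logits, boundary_labels)
    ) / n
    topic_ce = -sum(
        log_softmax(scores)[label]
        for scores, label in zip(topic_logits, topic_labels)
    ) / n
    return boundary_ce + topic_weight * topic_ce


# Toy document: three sentences, three topics; the last sentence ends a segment.
boundary_labels = [0, 0, 1]
topic_labels = [2, 2, 2]

# An uninformed model (all-zero scores) versus a confident, correct one.
uninformed = multitask_loss([[0.0, 0.0]] * 3, boundary_labels,
                            [[0.0, 0.0, 0.0]] * 3, topic_labels)
confident = multitask_loss([[4.0, -4.0], [4.0, -4.0], [-4.0, 4.0]], boundary_labels,
                           [[-4.0, -4.0, 4.0]] * 3, topic_labels)
```

With all-zero scores each term reduces to the log of the number of classes, so `uninformed` equals ln 2 + ln 3. Sharing one upper-level model across both terms is what lets the topic labels act as an extra training signal for boundary prediction.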

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis