Large-Context Conversational Representation Learning: Self-Supervised Learning for Conversational Documents
Ryo Masumura, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Shota Orihashi

TL;DR
This paper introduces LC-CRL, a self-supervised learning approach that leverages large context to improve understanding of conversational documents, especially for utterance-level labeling, without requiring extensive labeled data.
Contribution
The paper proposes a large-context self-supervised learning method tailored to conversational documents, which improves utterance-level labeling while reducing the reliance on costly manual annotations.
Findings
Improved scene segmentation accuracy on contact center datasets.
Effective utilization of unlabeled conversational data.
Enhanced utterance-level labeling performance.
Abstract
This paper presents a novel self-supervised learning method for handling conversational documents consisting of transcribed text of human-to-human conversations. One of the key technologies for understanding conversational documents is utterance-level sequential labeling, where labels are estimated from the documents in an utterance-by-utterance manner. The main issue with utterance-level sequential labeling is the difficulty of collecting labeled conversational documents, as manual annotations are very costly. To deal with this issue, we propose large-context conversational representation learning (LC-CRL), a self-supervised learning method specialized for conversational documents. A self-supervised learning task in LC-CRL involves the estimation of an utterance using all the surrounding utterances based on large-context language modeling. In this way, LC-CRL enables us to effectively…
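The self-supervised task described in the abstract (estimating one utterance from all the surrounding utterances) can be illustrated with a minimal data-preparation sketch. This is not the authors' implementation: the function name `make_utterance_prediction_examples`, the `MASK` placeholder string, and the toy conversation are all assumptions made for illustration, and the large-context language model that would consume these examples is not shown.

```python
from typing import List, Tuple

# Hypothetical placeholder marking the position of the utterance to estimate.
MASK = "[MASKED_UTTERANCE]"


def make_utterance_prediction_examples(
    conversation: List[str],
) -> List[Tuple[List[str], str]]:
    """Build (context, target) pairs from one unlabeled conversation.

    For each utterance, the context is the whole conversation with that
    utterance replaced by MASK, and the target is the hidden utterance.
    No manual labels are needed: the text itself supplies the supervision.
    """
    examples = []
    for i, target in enumerate(conversation):
        context = conversation[:i] + [MASK] + conversation[i + 1:]
        examples.append((context, target))
    return examples


# Toy contact-center style conversation (invented for this sketch).
conversation = [
    "Thank you for calling, how can I help you?",
    "My internet connection keeps dropping.",
    "I see, let me check your line status.",
]

examples = make_utterance_prediction_examples(conversation)
for context, target in examples:
    print(context, "->", target)
```

Each conversation of N utterances yields N training pairs, which is how unlabeled transcripts can be turned into a training signal under this assumed setup.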
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
