Prompt-based Pseudo-labeling Strategy for Sample-Efficient Semi-Supervised Extractive Summarization

Gaurav Sahu; Olga Vechtomova; Issam H. Laradji

arXiv:2311.09559 · cs.CL · July 3, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a prompt-based pseudo-labeling strategy using large language models for semi-supervised extractive summarization, significantly improving label quality and performance over existing methods.

Contribution

The paper proposes a novel prompt-based pseudo-labeling approach with relabeling mechanisms that improve pseudo-label accuracy for semi-supervised extractive summarization.

Findings

01. Outperforms existing SSL methods on ROUGE scores across datasets.

02. Achieves competitive L-Eval scores in data-scarce settings.

03. Surpasses fully supervised methods in data-abundant scenarios.

Abstract

Semi-supervised learning (SSL) is a widely used technique in scenarios where labeled data is scarce and unlabeled data is abundant. While SSL is popular for image and text classification, it is relatively underexplored for the task of extractive text summarization. Standard SSL methods follow a teacher-student paradigm to first train a classification model and then use the classifier's confidence values to select pseudo-labels for the subsequent training cycle; however, such classifiers are not suitable to measure the accuracy of pseudo-labels as they lack specific tuning for evaluation, which leads to confidence values that fail to capture the semantics and correctness of the generated summary. To address this problem, we propose a prompt-based pseudo-labeling strategy with LLMs that picks unlabeled examples with more accurate pseudo-labels than using just the classifier's probability…
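The confidence-based selection step that the abstract describes as standard practice can be sketched as follows. This is only an illustration of that baseline, not the paper's prompt-based strategy. The function names, the top-k labeling rule, the mean-probability confidence score, and the 0.8 threshold are all assumptions made for the example.

```python
from typing import List, Tuple


def pseudo_label(sentence_probs: List[float], k: int = 3) -> List[int]:
    """Indices of the k sentences the teacher scores highest, in document order."""
    ranked = sorted(range(len(sentence_probs)),
                    key=lambda i: sentence_probs[i], reverse=True)
    return sorted(ranked[:k])


def confidence(sentence_probs: List[float], k: int = 3) -> float:
    """Assumed confidence score: mean teacher probability of the chosen sentences."""
    chosen = pseudo_label(sentence_probs, k)
    return sum(sentence_probs[i] for i in chosen) / len(chosen)


def select_confident(unlabeled: List[List[float]],
                     threshold: float = 0.8,
                     k: int = 3) -> List[Tuple[int, List[int]]]:
    """Keep (doc_id, pseudo_label) pairs whose confidence meets the threshold."""
    selected = []
    for doc_id, probs in enumerate(unlabeled):
        if probs and confidence(probs, k) >= threshold:
            selected.append((doc_id, pseudo_label(probs, k)))
    return selected


# Hypothetical teacher outputs: one list of per-sentence probabilities per document.
teacher_probs = [
    [0.90, 0.10, 0.95, 0.85],  # teacher is confident about its top sentences
    [0.50, 0.40, 0.60, 0.30],  # teacher is unsure
]
print(select_confident(teacher_probs, threshold=0.8, k=2))  # [(0, [0, 2])]
```

Only the first document passes the threshold, so only its pseudo-label would be added to the next training cycle. The score here depends solely on the classifier's own probabilities, which is the limitation the abstract raises.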

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Multi-Head Attention · Attention Is All You Need · Adam · Softmax · Dense Connections · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings · Residual Connection