GPT Self-Supervision for a Better Data Annotator

Xiaohuan Pei; Yanxi Li; Chang Xu

arXiv:2306.04349·cs.CL·June 9, 2023·5 cites

Open Access

TL;DR

This paper introduces a GPT-based self-supervised annotation method that improves data summarization by leveraging a generating-recovering paradigm, enhancing annotation quality without requiring extensive labeled data.

Contribution

It proposes a novel self-supervised approach using GPT with a generating-recovering paradigm and alignment scores, addressing limitations of existing annotation methods.

Findings

01 Achieves competitive annotation scores across datasets

02 Demonstrates robustness in complex structured data

03 Utilizes alignment scores for self-supervision refinement

Abstract

The task of annotating data into concise summaries poses a significant challenge across various domains, frequently requiring the allocation of significant time and specialized knowledge by human experts. Despite existing efforts to use large language models for annotation tasks, significant problems such as limited applicability to unlabeled data, the absence of self-supervised methods, and the lack of focus on complex structured data still persist. In this work, we propose a GPT self-supervision annotation method, which embodies a generating-recovering paradigm that leverages the one-shot learning capabilities of the Generative Pretrained Transformer (GPT). The proposed approach comprises a one-shot tuning phase followed by a generation phase. In the one-shot tuning phase, we sample a data point from the support set as part of the prompt for GPT to generate a textual summary, which is then…
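
To make the generating-recovering idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract above is truncated, so the recovery step, the `alignment_score` metric (a plain string-similarity ratio here), the template-selection loop in `pick_template`, and the toy `toy_generate` and `toy_recover` functions standing in for GPT calls are all assumptions made for illustration.

```python
import json
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def alignment_score(original: Record, recovered: Record) -> float:
    """Similarity in [0, 1] between the original and recovered records.

    Assumption: a character-level ratio over canonical JSON; the paper's
    actual alignment score may be defined differently.
    """
    a = json.dumps(original, sort_keys=True)
    b = json.dumps(recovered, sort_keys=True)
    return SequenceMatcher(None, a, b).ratio()


def pick_template(
    sample: Record,
    templates: List[str],
    generate: Callable[[str, Record], str],
    recover: Callable[[str, str], Record],
) -> Tuple[str, float]:
    """Hypothetical one-shot tuning: keep the prompt template whose summary
    lets the original sample be recovered most faithfully."""
    best_template, best_score = templates[0], -1.0
    for template in templates:
        summary = generate(template, sample)      # generating step
        recovered = recover(template, summary)    # recovering step
        score = alignment_score(sample, recovered)
        if score > best_score:
            best_template, best_score = template, score
    return best_template, best_score


# Toy stand-ins for the model calls (hypothetical, no GPT involved).
def toy_generate(template: str, record: Record) -> str:
    items = sorted(record.items())
    if template != "full":
        items = items[:1]  # a lossy "summary" that drops fields
    return "; ".join(f"{k}={v}" for k, v in items)


def toy_recover(template: str, summary: str) -> Record:
    return dict(p.split("=", 1) for p in summary.split("; ") if "=" in p)


sample = {"name": "aspirin", "dose": "100mg"}
best, score = pick_template(sample, ["short", "full"], toy_generate, toy_recover)
print(best, score)
```

In this toy run the lossless "full" template wins because the sample is recovered exactly. In a real setting the two callables would wrap model calls, and the chosen template would then be reused in the generation phase.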

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Layer · Label Smoothing · Adam · Absolute Position Encodings · Attention Dropout · Position-Wise Feed-Forward Layer