Few-Shot Upsampling for Protest Size Detection

Andrew Halterman; Benjamin J. Radford

arXiv:2105.11260·cs.CL·May 25, 2021


TL;DR

This paper introduces a task and dataset for upsampling coarse protest size labels to fine-grained text spans using question answering, and shows that fine-tuning on a small set of 25 examples improves model performance.

Contribution

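The paper presents a novel upsampling task with a benchmark dataset and compares baseline models: a rule-based model, a zero-shot question-answering transformer, few-shot fine-tuned transformers, and weakly supervised models trained on coarsely labeled documents.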

Findings

01 The rule-based model initially outperforms the zero-shot pre-trained transformer.

02 Few-shot fine-tuning on 25 examples improves performance.

03 Fine-tuning transformers on coarse labels is effective.

Abstract

We propose a new task and dataset for a common problem in social science research: "upsampling" coarse document labels to fine-grained labels or spans. We pose the problem in a question answering format, with the answers providing the fine-grained labels. We provide a benchmark dataset and baselines on a socially impactful task: identifying the exact crowd size at protests and demonstrations in the United States given only order-of-magnitude information about protest attendance, a very small sample of fine-grained examples, and English-language news text. We evaluate several baseline models, including zero-shot results from rule-based and question-answering models, few-shot models fine-tuned on a small set of documents, and weakly supervised models using a larger set of coarsely-labeled documents. We find that our rule-based model initially outperforms a zero-shot pre-trained…
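The upsampling idea in the abstract, going from an order-of-magnitude label to an exact crowd-size span, can be illustrated with a minimal sketch. This is not the authors' rule-based baseline. The function names (`candidate_spans`, `upsample`), the regular expression, and the encoding of the coarse label as a power of ten are all assumptions made here for illustration, and the dataset's actual size categories may differ.

```python
import re

# Matches numerals such as "1,500" or "25". Assumption: sizes appear as digits,
# not spelled-out words like "hundreds".
NUMBER_RE = re.compile(r"\b\d{1,3}(?:,\d{3})+\b|\b\d+\b")


def candidate_spans(text):
    """Return (span_text, integer_value) for every numeral found in the text."""
    spans = []
    for match in NUMBER_RE.finditer(text):
        value = int(match.group().replace(",", ""))
        spans.append((match.group(), value))
    return spans


def upsample(text, magnitude):
    """Pick the first numeric span consistent with the coarse label.

    `magnitude` is a hypothetical encoding of the coarse label: the span's
    value must fall in [10**magnitude, 10**(magnitude + 1)).
    Returns None when no numeral in the text fits the label.
    """
    low, high = 10 ** magnitude, 10 ** (magnitude + 1)
    for span, value in candidate_spans(text):
        if low <= value < high:
            return span
    return None


if __name__ == "__main__":
    story = "About 1,500 people marched on May 25, and 3 were arrested."
    print(upsample(story, 3))
```

The coarse label acts as a filter here: dates and arrest counts in the same story are rejected because they fall outside the labeled order of magnitude. A question-answering model would instead be asked a question about crowd size and would return a span from the text directly.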


Code & Models

Repositories

benradford/few-shot-upsampling-for-protest-size-detection (official)
