DS@GT at CheckThat! 2025: Detecting Subjectivity via Transfer-Learning and Corrective Data Augmentation

Maximilian Heil; Dionne Bang

arXiv:2507.06189·cs.CL·July 9, 2025

TL;DR

This paper explores transfer learning and GPT-4o-based stylistic data augmentation for subjectivity detection in English news text, and reports that transfer learning from specialized encoders and carefully curated augmentation both improve classification.

Contribution

It introduces a controlled augmentation pipeline in which GPT-4o generates paraphrases in predefined subjectivity styles and then corrects them, and it compares transfer learning from specialized encoders against fine-tuning general-purpose models for subjectivity detection.

Findings

01

Transfer learning from specialized encoders (already fine-tuned on related tasks) outperforms fine-tuning general-purpose encoders.

02

Carefully curated augmentation improves model robustness, especially in detecting subjective content.

03

Official ranking was 16th out of 24 participants.

Abstract

This paper presents our submission to Task 1, Subjectivity Detection, of the CheckThat! Lab at CLEF 2025. We investigate the effectiveness of transfer learning and stylistic data augmentation for improving the classification of subjective and objective sentences in English news text. Our approach contrasts fine-tuning pre-trained encoders with transfer learning from transformers already fine-tuned on related tasks. We also introduce a controlled augmentation pipeline that uses GPT-4o to generate paraphrases in predefined subjectivity styles. To ensure label and style consistency, we employ the same model to correct and refine the generated samples. Results show that transfer learning from specialized encoders outperforms fine-tuning general-purpose ones, and that carefully curated augmentation significantly enhances model robustness, especially in detecting subjective content. Our official submission placed us…
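
The generate-then-correct augmentation loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the style names, the `SUBJ`/`OBJ` label strings, the prompt wording, and the `llm` callable (a stand-in for a GPT-4o call) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical style-to-label map; the paper's actual style set is not given here.
STYLE_LABELS: Dict[str, str] = {
    "neutral_report": "OBJ",
    "opinionated": "SUBJ",
}


@dataclass
class Sample:
    text: str
    label: str
    style: str


def augment(sentences: List[str], llm: Callable[[str], str]) -> List[Sample]:
    """Paraphrase each sentence into every style, then ask the same model to correct it."""
    samples: List[Sample] = []
    for sentence in sentences:
        for style, label in STYLE_LABELS.items():
            # Step 1: generate a paraphrase in the target style.
            draft = llm(f"Paraphrase the sentence in the '{style}' style:\n{sentence}")
            # Step 2: the same model revises the draft for label and style consistency.
            revised = llm(
                f"Revise the paraphrase so it clearly fits '{style}' ({label}); "
                f"return it unchanged if it already does:\n{draft}"
            ).strip()
            if revised:  # drop empty generations
                samples.append(Sample(revised, label, style))
    return samples


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a real model: upper-cases the last line of the prompt."""
    return prompt.splitlines()[-1].upper()


if __name__ == "__main__":
    for s in augment(["The mayor spoke."], fake_llm):
        print(s.style, s.label, s.text)
```

Passing the model as a plain callable keeps the sketch runnable offline; a real client would replace `fake_llm`, and the correction step could additionally reject samples instead of only rewriting them.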

Code & Models

Repositories

dsgt-arc/checkthat-2025-subject
PyTorch · Official

Taxonomy

Topics: Text Readability and Simplification · Topic Modeling · Authorship Attribution and Profiling