Adapting Pretrained Language Models for Citation Classification via Self-Supervised Contrastive Learning

Tong Li; Jiachuan Wang; Yongqi Zhang; Shuangyin Li; Lei Chen

arXiv:2505.14471 · cs.CL · May 29, 2025

TL;DR

This paper introduces Citss, a self-supervised contrastive learning framework that adapts pretrained language models to citation classification. It targets labeled data scarcity, contextual noise, and spurious keyphrase correlations, and improves performance across multiple benchmarks.

Contribution

The paper proposes a novel contrastive learning approach, Citss, compatible with both encoder and decoder PLMs, to improve citation classification with limited labeled data.

Findings

01  Outperforms previous state-of-the-art methods on benchmark datasets.

02  Effective in reducing reliance on keyphrases and handling contextual noise.

03  Compatible with both encoder-based PLMs and decoder-based LLMs.

Abstract

Citation classification, which identifies the intention behind academic citations, is pivotal for scholarly analysis. Previous works suggest fine-tuning pretrained language models (PLMs) on citation classification datasets, reaping the reward of the linguistic knowledge they gained during pretraining. However, directly fine-tuning for citation classification is challenging due to labeled data scarcity, contextual noise, and spurious keyphrase correlations. In this paper, we present a novel framework, Citss, that adapts the PLMs to overcome these challenges. Citss introduces self-supervised contrastive learning to alleviate data scarcity, and is equipped with two specialized strategies to obtain the contrastive pairs: sentence-level cropping, which enhances focus on target citations within long contexts, and keyphrase perturbation, which mitigates reliance on specific keyphrases.…
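
The two pair-construction strategies named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `[CITE]` marker, the `[MASK]` token, the function names, the keyphrase list, and the InfoNCE-style loss are all assumptions chosen for illustration, and the toy vectors stand in for PLM embeddings.

```python
import math
import random
import re

CITE_TAG = "[CITE]"  # hypothetical marker for the target citation


def sentence_crop(context, rng, tag=CITE_TAG):
    """Keep the sentence containing the target citation plus a random
    span of neighbouring sentences (one reading of sentence-level cropping)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", context.strip()) if s]
    idx = next((i for i, s in enumerate(sentences) if tag in s), None)
    if idx is None:
        return context  # no target citation found: leave the context as-is
    left = rng.randint(0, idx)                     # inclusive bounds
    right = rng.randint(idx, len(sentences) - 1)
    return " ".join(sentences[left:right + 1])


def perturb_keyphrases(text, keyphrases, rng, mask="[MASK]", p=0.5):
    """Replace each listed keyphrase with a mask token with probability p
    (an assumed form of keyphrase perturbation)."""
    for phrase in keyphrases:
        if rng.random() < p:
            text = re.sub(re.escape(phrase), mask, text, flags=re.IGNORECASE)
    return text


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss: anchors[i] should be closest to positives[i];
    the other rows of `positives` act as in-batch negatives."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv + 1e-12)  # small guard against zero vectors

    total = 0.0
    for i, a in enumerate(anchors):
        logits = [cos(a, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


if __name__ == "__main__":
    rng = random.Random(0)
    ctx = "Prior work used rules. We follow [CITE] for training. Results improved."
    view_a = sentence_crop(ctx, rng)
    view_b = perturb_keyphrases(ctx, ["follow"], rng)
    print(view_a)
    print(view_b)
```

In a real pipeline the two views of each citation context would be encoded by the PLM and the resulting embeddings passed to the contrastive loss. Here `info_nce` only shows the objective's shape: the loss is low when each anchor is most similar to its own positive and high when it is more similar to another example's view.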

Code & Models

Repositories

litong99/citss (PyTorch, official)

Taxonomy

Methods: Focus · Contrastive Learning