G-Loss: Graph-Guided Fine-Tuning of Language Models

Aditya Sharma; Vinti Agarwal; Rajesh Kumar

arXiv:2604.25853 · cs.CL · May 5, 2026

TL;DR

G-Loss introduces a graph-guided loss function for fine-tuning language models, leveraging global semantic relationships to improve embedding quality and classification accuracy.

Contribution

The paper proposes a semi-supervised, graph-based loss that captures global semantic structure to improve the fine-tuning of pre-trained language models.

Findings

01

In the majority of experimental setups, G-Loss converges faster than traditional loss functions.

02

It yields higher classification accuracy across multiple datasets.

03

G-Loss produces more semantically coherent embeddings.

Abstract

Traditional loss functions, including cross-entropy, contrastive, triplet, and supervised contrastive losses, used for fine-tuning pre-trained language models such as BERT, operate only within local neighborhoods and fail to account for the global semantic structure. We present G-Loss, a graph-guided loss function that incorporates semi-supervised label propagation to use structural relationships within the embedding manifold. G-Loss builds a document-similarity graph that captures global semantic relationships, thereby guiding the model to learn more discriminative and robust embeddings. We evaluate G-Loss on five benchmark datasets covering key downstream classification tasks: MR (sentiment analysis), R8 and R52 (topic categorization), Ohsumed (medical document classification), and 20NG (news categorization). In the majority of experimental setups, G-Loss converges faster and…
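
The general idea in the abstract (build a similarity graph over document embeddings, propagate the known labels across it, and penalize wrong predictions on held-out nodes) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the graph construction or the exact loss. The Gaussian kernel, the label-spreading update, the parameters `sigma`, `alpha`, and `iters`, and all function names are assumptions chosen for illustration. The sketch uses plain Python so it runs without dependencies; a real fine-tuning loop would need a differentiable framework so that gradients reach the encoder.

```python
import math

def gaussian_graph(embeddings, sigma=1.0):
    """Dense similarity graph (assumed form): W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2))."""
    n = len(embeddings)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W

def propagate_labels(W, labels, num_classes, alpha=0.9, iters=50):
    """Label spreading: F <- alpha * S @ F + (1 - alpha) * Y, with S = D^-1/2 W D^-1/2.
    labels[i] is a class id, or None when the label is hidden."""
    n = len(W)
    deg = [sum(row) or 1.0 for row in W]  # guard against isolated nodes
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(num_classes)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n)) + (1 - alpha) * Y[i][c]
              for c in range(num_classes)] for i in range(n)]
    return F

def graph_guided_loss(embeddings, true_labels, hidden, num_classes, sigma=1.0):
    """Mean cross-entropy between propagated label scores and true labels on hidden nodes."""
    W = gaussian_graph(embeddings, sigma)
    visible = [None if i in hidden else y for i, y in enumerate(true_labels)]
    F = propagate_labels(W, visible, num_classes)
    loss = 0.0
    for i in hidden:
        total = sum(F[i])
        p = F[i][true_labels[i]] / total if total > 0 else 1.0 / num_classes
        loss -= math.log(max(p, 1e-12))  # clamp to avoid log(0)
    return loss / len(hidden)

# Toy usage: four 2-D "embeddings"; the labels of nodes 1 and 3 are hidden.
labels = [0, 0, 1, 1]
separated = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0], [3.1, 3.0]]  # classes form clusters
mixed = [[0.0, 0.0], [3.0, 3.0], [0.1, 0.0], [3.1, 3.0]]      # classes are interleaved
print(graph_guided_loss(separated, labels, {1, 3}, 2))
print(graph_guided_loss(mixed, labels, {1, 3}, 2))
```

In this toy setup the loss is lower when same-class embeddings cluster together, because each hidden node then receives most of its propagated label mass from a correctly labeled neighbor. That is the sense in which a graph-based loss of this kind can reward globally consistent embeddings rather than only pairwise or local structure.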
