Discovering the Hidden Role of Gini Index In Prompt-based Classification

Ruixi Lin

arXiv:2603.15654·cs.LG·March 18, 2026

Open Access

TL;DR

This paper explores the Gini Index as a tool for detecting and reducing class accuracy disparities in prompt-based classification, and reports results for LLMs and vision models on text and image classification.

Contribution

It introduces the Gini Index as a novel metric for understanding and debiasing class accuracy imbalances, along with a post-hoc bias mitigation method.

Findings

01. Gini scores reveal significant accuracy imbalances in prompt-based classification.

02. The proposed method effectively reduces class accuracy disparities.

03. Experimental results show improved fairness and balanced class performance.

Abstract

In classification tasks, the long-tailed minority classes usually offer the predictions that are most important. Yet these classes consistently exhibit low accuracies, whereas a few high-performing classes dominate. We pursue a foundational understanding of the hidden role of the Gini Index as a tool for detecting and optimizing (debiasing) disparities in class accuracy, focusing on the case of prompt-based classification. We introduce the intuitions, benchmark Gini scores in real-world LLMs and vision models, and thoroughly discuss the insights of Gini not only as a measure of relative accuracy dominance but also as a direct optimization metric. Through rigorous case analyses, we first show that weak to strong relative accuracy imbalance exists in prompt-based classification results for both text and images, regardless of whether the classification is high-dimensional or…
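To make the idea of Gini as a measure of relative accuracy dominance concrete, here is a minimal sketch that applies the standard Gini coefficient (normalized mean absolute difference) to a list of per-class accuracies. The exact formulation used in the paper is not given in this excerpt, so treat this as an assumption; the function name `gini_index` and the example accuracy values are hypothetical.

```python
def gini_index(accuracies):
    """Standard Gini coefficient of non-negative per-class accuracies.

    Assumption: this is the textbook definition, not necessarily the
    paper's. 0.0 means all classes have equal accuracy; values near
    (n - 1) / n mean a single class holds nearly all the accuracy.
    """
    n = len(accuracies)
    total = sum(accuracies)
    if n == 0 or total == 0:
        return 0.0
    # Sum of absolute differences over all ordered pairs (i, j).
    pairwise = sum(abs(a - b) for a in accuracies for b in accuracies)
    return pairwise / (2 * n * total)


# Hypothetical per-class accuracies, for illustration only.
balanced = gini_index([0.8, 0.8, 0.8, 0.8])
skewed = gini_index([0.95, 0.90, 0.20, 0.05])
print(f"balanced: {balanced:.3f}, skewed: {skewed:.3f}")
```

A lower score indicates more evenly distributed class accuracies, so under this definition a debiasing method would aim to push the score toward zero without lowering overall accuracy.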

Taxonomy

Topics: Imbalanced Data Classification Techniques · Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Healthcare and Education