Auditing LLMs for Algorithmic Fairness in Casenote-Augmented Tabular Prediction
Xiao Qi Lee, Ezinne Nwankwo, Angela Zhou

TL;DR
This paper evaluates the fairness of LLM-based tabular classification in social service settings, showing that fine-tuning and casenote summaries can improve accuracy and reduce disparities.
Contribution
It shows that augmenting LLM-based tabular classification with casenote summaries, combined with fine-tuning, can improve accuracy while reducing algorithmic fairness disparities on a real housing placement prediction task.
Findings
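A fine-tuned model augmented with casenote summaries can improve accuracy while reducing algorithmic fairness disparities.
Variable importance improvements to zero-shot tabular classification yield mixed results on algorithmic fairness.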
Leveraging casenotes adds valuable information with low implementation burden.
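To make "augmenting tabular classification with casenote summaries" concrete, here is a minimal sketch of how a tabular record and a casenote summary might be serialized into a single classification prompt. The paper's actual prompt format is not given in this summary, so the function names, feature names, label set, and wording below are all hypothetical.

```python
def serialize_row(features, casenote_summary=None):
    """Turn a dict of tabular features into one line of text.

    The 'name: value' template is an assumption for illustration,
    not the serialization used in the paper.
    """
    parts = [f"{name}: {value}" for name, value in features.items()]
    text = "Client record. " + "; ".join(parts) + "."
    if casenote_summary:
        text += " Casenote summary: " + casenote_summary.strip()
    return text


def build_prompt(features, labels, casenote_summary=None):
    """Build a zero-shot style prompt asking for exactly one label."""
    record = serialize_row(features, casenote_summary)
    options = ", ".join(labels)
    return (
        f"{record}\n"
        f"Which outcome is most likely? Answer with one of: {options}."
    )


# Made-up example values; none of these come from the paper's data.
example_features = {"age_group": "25-34", "months_in_outreach": 6}
example_labels = ["outcome_a", "outcome_b", "outcome_c"]
prompt = build_prompt(
    example_features,
    example_labels,
    casenote_summary="Client met with outreach worker twice this month.",
)
```

The same serialized text could serve as input for either zero-shot prompting or fine-tuning; dropping the casenote_summary argument gives a tabular-only baseline to compare against.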
Abstract
LLMs are increasingly being considered for prediction tasks in high-stakes social service settings, but their algorithmic fairness properties in this context are poorly understood. In this short technical report, we audit the algorithmic fairness of LLM-based tabular classification on a real housing placement prediction task, augmented with street outreach casenotes from a nonprofit partner. We audit multi-class classification error disparities. We find that a fine-tuned model augmented with casenote summaries can improve accuracy while reducing algorithmic fairness disparities. We experiment with variable importance improvements to zero-shot tabular classification and find mixed results on resulting algorithmic fairness. Overall, given historical inequities in housing placement, it is crucial to audit LLM use. We find that leveraging LLMs to augment tabular classification with casenote…
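As an illustration of what auditing multi-class classification error disparities can involve, the sketch below computes a per-group misclassification rate and the largest gap between groups. The abstract does not specify which disparity metric the authors use, so the choice of overall error rate, the function names, and the toy labels and groups are assumptions.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: misclassification rate} for multi-class labels."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        errors[group] += int(truth != pred)
    return {g: errors[g] / totals[g] for g in totals}


def max_disparity(rates):
    """Largest gap in error rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Made-up toy data: three hypothetical outcome classes, two hypothetical groups.
y_true = ["housed", "shelter", "none", "housed", "none", "shelter"]
y_pred = ["housed", "none", "none", "housed", "none", "shelter"]
groups = ["A", "A", "A", "B", "B", "B"]

rates = group_error_rates(y_true, y_pred, groups)
gap = max_disparity(rates)
```

In this toy data, group A has one error in three predictions and group B has none, so the gap is one third. A fuller audit would likely also break errors down by true class, since the overall error rate can hide class-specific disparities.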
