FairHome: A Fair Housing and Fair Lending Dataset
Anusha Bagalkotkar (1), Aveek Karmakar (1), Gabriel Arnson (1), Ondrej Linda (1) ((1) Zillow Group)

TL;DR
FairHome is, to the authors' knowledge, the first publicly available dataset with binary compliance-risk labels for fair housing and fair lending. A classifier trained on it detects potential violations in real-estate text, including text produced by large language models.
Contribution
Introduces a novel, labeled dataset for fair housing and lending compliance, and demonstrates its effectiveness with a high-performing classifier and benchmarking against state-of-the-art LLMs.
Findings
Classifier achieved an F1-score of 0.91.
The classifier trained on FairHome outperformed the benchmarked LLMs (GPT-3.5, GPT-4, LLaMA-3, and Mistral Large) at compliance detection.
The LLMs were evaluated in both zero-shot and few-shot settings.
Abstract
We present a Fair Housing and Fair Lending dataset (FairHome): a dataset with around 75,000 examples across 9 protected categories. To the best of our knowledge, FairHome is the first publicly available dataset labeled with binary labels for compliance risk in the housing domain. We demonstrate the usefulness and effectiveness of such a dataset by training a classifier and using it to detect potential violations when using a large language model (LLM) in the context of real-estate transactions. We benchmark the trained classifier against state-of-the-art LLMs including GPT-3.5, GPT-4, LLaMA-3, and Mistral Large in both zero-shot and few-shot contexts. Our classifier outperformed these models with an F1-score of 0.91, underscoring the effectiveness of our dataset.
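For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1-score is computed from binary compliance labels. It is not the authors' evaluation code. The label convention (1 = potential violation, 0 = compliant), the function name `f1_score`, and the toy labels are assumptions made for illustration only.

```python
def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (assumed here: 1 = potential violation)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # violations correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # compliant text wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # violations missed
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration; not drawn from FairHome.
gold = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
score = f1_score(gold, predicted)
```

F1 is the harmonic mean of precision and recall, so it penalizes a detector that misses violations as well as one that over-flags compliant text.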
Taxonomy
Topics: Housing, Finance, and Neoliberalism · Urban and Rural Development Challenges · Income, Poverty, and Inequality
Methods: Attention Is All You Need · Absolute Position Encodings · Label Smoothing · Position-Wise Feed-Forward Layer · Residual Connection · Attention Dropout · Linear Layer · Multi-Head Attention
