Protected Attributes Tell Us Who, Behavior Tells Us How: A Comparison of Demographic and Behavioral Oversampling for Fair Student Success Modeling

Jade Maï Cock; Muhammad Bilal; Richard Davis; Mirko Marras; Tanja Käser

arXiv:2212.10166·cs.CY·December 21, 2022

TL;DR

This paper compares demographic and behavioral oversampling methods to mitigate bias in educational models, demonstrating that behavior-based oversampling is effective even without demographic data.

Contribution

The paper introduces two novel pre-processing techniques for bias mitigation: intersectional demographic oversampling, which oversamples on combinations of demographic attributes, and behavior-based oversampling, which remains applicable when demographic data is unavailable.
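To make the first idea concrete, here is a minimal sketch of oversampling on intersectional groups. It is not the authors' implementation: it assumes plain random oversampling with replacement, brings every group up to the size of the largest one, and uses hypothetical field names (`gender`, `region`, `passed`) and toy records invented for illustration.

```python
import random
from collections import defaultdict


def oversample_by_group(rows, group_key, seed=0):
    """Duplicate rows at random so every group matches the largest group.

    Assumption: simple random oversampling with replacement; the paper's
    own sampling strategy may differ.
    """
    if not rows:
        return []
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[group_key(row)].append(row)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Draw the missing rows with replacement from the same group.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical toy records; the attribute names are illustrative only.
students = [
    {"gender": "f", "region": "urban", "passed": 1},
    {"gender": "f", "region": "urban", "passed": 0},
    {"gender": "f", "region": "urban", "passed": 1},
    {"gender": "f", "region": "rural", "passed": 0},
    {"gender": "m", "region": "urban", "passed": 1},
    {"gender": "m", "region": "urban", "passed": 1},
    {"gender": "m", "region": "rural", "passed": 0},
]

# The group is the combination of attributes, not each attribute on its own.
balanced = oversample_by_group(students, lambda r: (r["gender"], r["region"]))
```

Grouping on the tuple of attributes is what makes the sketch intersectional: a small subgroup such as ("f", "rural") is balanced in its own right rather than being absorbed into the larger "f" or "rural" totals.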

Findings

1. Both methods reduce model bias effectively.

2. Behavior-based oversampling works without demographic data.

3. The approaches are validated on real educational datasets.

Abstract

Algorithms deployed in education can shape the learning experience and success of a student. It is therefore important to understand whether and how such algorithms might create inequalities or amplify existing biases. In this paper, we analyze the fairness of models which use behavioral data to identify at-risk students and suggest two novel pre-processing approaches for bias mitigation. Based on the concept of intersectionality, the first approach involves intelligent oversampling on combinations of demographic attributes. The second approach does not require any knowledge of demographic attributes and is based on the assumption that such attributes are a (noisy) proxy for student behavior. We hence propose to directly oversample different types of behaviors identified in a cluster analysis. We evaluate our approaches on data from (i) an open-ended learning environment and (ii) a…
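The second approach, oversampling behavior types found by a cluster analysis, could be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: a tiny k-means with deterministic farthest-point initialisation stands in for whatever clustering the authors use, the two behavioral features are hypothetical, and balancing is again plain random oversampling with replacement.

```python
import random
from collections import defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_labels(points, k, iters=20):
    """Tiny k-means (a stand-in clustering) with farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels


def oversample_indices(labels, seed=0):
    """Return row indices with every cluster padded to the largest cluster size."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_cluster[lab].append(idx)
    target = max(len(ix) for ix in by_cluster.values())
    out = []
    for ix in by_cluster.values():
        out.extend(ix)
        out.extend(rng.choices(ix, k=target - len(ix)))
    return out


# Hypothetical behavioral features per student, e.g. (activity rate, pause ratio).
behavior = [
    (0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (0.1, 0.1), (0.3, 0.2), (0.2, 0.3),
    (5.0, 5.1), (5.2, 4.9),
]
labels = kmeans_labels(behavior, k=2)
balanced_idx = oversample_indices(labels)
```

No demographic column appears anywhere in the sketch: the cluster label plays the role that the demographic group plays in attribute-based oversampling, which is the premise behind treating demographics as a noisy proxy for behavior.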

Code & Models

Repositories

epfl-ml4ed/behavioral-oversampling (tf, official)
