TL;DR
This paper compares demographic and behavioral oversampling methods to mitigate bias in educational models, demonstrating that behavior-based oversampling is effective even without demographic data.
Contribution
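It introduces two novel pre-processing techniques for bias mitigation: intersectional demographic oversampling, which oversamples combinations of demographic attributes, and behavior-based oversampling, which oversamples behavior clusters and therefore remains applicable when demographic data is unavailable.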
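To make the first technique concrete, here is a minimal sketch of oversampling on combinations of demographic attributes. It is an illustration only: it uses plain random duplication as a stand-in for the paper's "intelligent" oversampling, whose details are not given here. The function name `oversample_intersectional`, the attribute names, and the toy records are all hypothetical.

```python
import random
from collections import defaultdict


def oversample_intersectional(rows, attrs, seed=0):
    """Randomly duplicate rows until every observed combination of `attrs`
    has as many rows as the largest combination.

    Assumption: naive random duplication, not the paper's exact procedure.
    """
    rng = random.Random(seed)

    # Group rows by their intersectional key, e.g. ("f", "rural").
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        groups[key].append(row)

    # Every group is grown to the size of the largest group.
    target = max(len(members) for members in groups.values())

    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Draw the missing rows with replacement from the same group.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical toy data: three intersectional groups of sizes 6, 3 and 1.
students = (
    [{"gender": "f", "region": "urban", "at_risk": 0}] * 6
    + [{"gender": "m", "region": "urban", "at_risk": 1}] * 3
    + [{"gender": "f", "region": "rural", "at_risk": 1}] * 1
)
balanced = oversample_intersectional(students, ["gender", "region"])
```

Grouping on the tuple of attributes, rather than on each attribute separately, is what makes the balancing intersectional: a subgroup such as ("f", "rural") is raised to the target size even when "f" and "rural" are each well represented on their own. The sketch only balances combinations that actually occur in the data.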
Findings
Both methods reduce model bias effectively.
Behavior-based oversampling works without demographic data.
The approaches are validated on real educational datasets.
Abstract
Algorithms deployed in education can shape the learning experience and success of a student. It is therefore important to understand whether and how such algorithms might create inequalities or amplify existing biases. In this paper, we analyze the fairness of models which use behavioral data to identify at-risk students and suggest two novel pre-processing approaches for bias mitigation. Based on the concept of intersectionality, the first approach involves intelligent oversampling on combinations of demographic attributes. The second approach does not require any knowledge of demographic attributes and is based on the assumption that such attributes are a (noisy) proxy for student behavior. We hence propose to directly oversample different types of behaviors identified in a cluster analysis. We evaluate our approaches on data from (i) an open-ended learning environment and (ii) a…
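The second approach described in the abstract can be sketched in the same spirit: cluster students by behavior, then oversample the smaller clusters, without touching any demographic attribute. The abstract says only "cluster analysis", so the use of k-means here is an assumption, as are the helper names, the farthest-point initialization, and the two hypothetical behavioral features in the toy data.

```python
import random
from collections import defaultdict


def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=20):
    """Tiny k-means with deterministic farthest-point initialization.

    Assumption: k-means is one possible choice of cluster analysis.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all current centroids.
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels


def oversample_clusters(labels, seed=0):
    """Return row indices such that every cluster is equally represented."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for i, lab in enumerate(labels):
        clusters[lab].append(i)

    target = max(len(ix) for ix in clusters.values())
    chosen = []
    for ix in clusters.values():
        chosen.extend(ix)
        # Draw the missing indices with replacement from the same cluster.
        chosen.extend(rng.choices(ix, k=target - len(ix)))
    return chosen


# Hypothetical behavioral features per student, e.g. (sessions, hints used).
behavior = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.3),
    (0.8, 0.9), (1.3, 1.0), (1.0, 0.7), (0.7, 1.2),
    (9.0, 9.2), (8.8, 9.1),  # a rare behavior pattern
]
labels = kmeans_labels(behavior, k=2)
chosen = oversample_clusters(labels)
```

The returned indices can be used to select both features and at-risk labels from the training set before fitting a model. Because group membership comes from the clustering alone, no demographic column is needed at any step.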
