Discovering an Aid Policy to Minimize Student Evasion Using Offline Reinforcement Learning
Leandro M. de Lima, Renato A. Krohling

TL;DR
This paper uses offline reinforcement learning to discover aid policies aimed at reducing student dropout, supporting decision-makers in choosing an appropriate, timely intervention for each student from logged student data.
Contribution
It proposes a decision support method that uses offline reinforcement learning to select aid actions for students, and it evaluates a discretization of the student state space built with two different clustering methods.
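To make "offline" concrete, here is a minimal sketch of learning a policy from a fixed log of transitions, with no interaction with an environment. The summary does not say which offline RL algorithm the paper uses; plain tabular Q-learning replayed over the log is swapped in purely for illustration, and the states, actions, rewards, and hyperparameters are all hypothetical.

```python
from collections import defaultdict


def batch_q_learning(transitions, n_actions, gamma=0.9, alpha=0.1, sweeps=200):
    """Tabular Q-learning that only replays a fixed log of transitions."""
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(sweeps):
        for s, a, r, s_next, done in transitions:
            # Bootstrapped target; terminal transitions use the reward alone.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q


# Hypothetical log (not data from the paper):
# state 0 = "at risk", state 1 = "on track";
# action 0 = "no aid", action 1 = "offer aid"; reward 1.0 = student stayed.
log = [
    (0, 0, 0.0, 0, True),   # at risk, no aid, dropped out
    (0, 1, 0.0, 1, False),  # at risk, aid offered, moved to "on track"
    (1, 0, 1.0, 1, True),   # on track, no aid, completed
]

q = batch_q_learning(log, n_actions=2)
# Greedy policy: pick the highest-valued action in each discrete state.
policy = {s: max(range(2), key=lambda a: q[s][a]) for s in (0, 1)}
```

In this toy log the greedy policy offers aid in the "at risk" state and none in the "on track" state. Actions that never appear in the log keep their initial value of zero here; practical offline RL methods usually add an explicit safeguard against overvaluing such unseen actions.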
Findings
Off-policy evaluation on logged data of real students estimates that the method should achieve roughly 1.0 to 1.5 times the cumulative reward of the logged policy.
Offline RL can effectively support aid policy discovery for student dropout prevention.
Discretization of student states improves decision-making accuracy.
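State discretization by clustering can be sketched as follows: continuous student features are grouped into clusters, and the cluster index serves as the discrete state. The summary does not name the two clustering methods evaluated in the paper, so k-means is an assumption here, and the two features (a normalized grade average and an attendance rate) and their values are hypothetical.

```python
import random


def assign(point, centroids):
    """Index of the nearest centroid under squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns the centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign(p, centroids)].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the previous centroid if a cluster ends up empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


# Hypothetical features per student: [normalized grade average, attendance rate].
features = [
    [0.90, 0.95], [0.85, 0.90], [0.80, 0.92],  # students doing well
    [0.30, 0.40], [0.25, 0.35], [0.20, 0.45],  # students struggling
]

centroids = kmeans(features, k=2)
# The discrete state of each student is the index of its nearest centroid.
states = [assign(f, centroids) for f in features]
```

A tabular method can then index its value table by these cluster ids instead of the raw feature vectors. The number of clusters trades off state resolution against how much logged data falls into each state.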
Abstract
High dropout rates in tertiary education expose a lack of efficiency that causes frustration of expectations and financial waste. Predicting which students are at risk is not enough to prevent dropout. Usually, an appropriate aid action must be discovered and applied at the proper time for each student. To tackle this sequential decision-making problem, we propose a decision support method for selecting aid actions for students, using offline reinforcement learning to help decision-makers effectively prevent student dropout. Additionally, a discretization of the student state space using two different clustering methods is evaluated. Our experiments using logged data of real students show, through off-policy evaluation, that the method should achieve roughly 1.0 to 1.5 times as much cumulative reward as the logged policy. So, it is feasible to help decision-makers apply appropriate…
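Off-policy evaluation estimates how a new policy would have performed using only data collected under the logged policy. The abstract does not state which estimator the authors use; the sketch below assumes ordinary importance sampling, and the trajectories, action probabilities, and rewards are toy values that do not reproduce the paper's results.

```python
def is_estimate(trajectories, target_prob, behavior_prob, gamma=1.0):
    """Ordinary importance-sampling estimate of the target policy's return."""
    total = 0.0
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            # Reweight by how much more likely the target policy is to act this way.
            weight *= target_prob(s, a) / behavior_prob(s, a)
            ret += (gamma ** t) * r
        total += weight * ret
    return total / len(trajectories)


def logged_return(trajectories, gamma=1.0):
    """Average discounted return actually observed under the logged policy."""
    returns = [
        sum((gamma ** t) * r for t, (_, _, r) in enumerate(traj))
        for traj in trajectories
    ]
    return sum(returns) / len(returns)


# Toy log: one state (0), two actions, one step per trajectory.
# Each step is (state, action, reward); action 1 = "offer aid" (hypothetical).
trajectories = [
    [(0, 1, 1.0)],
    [(0, 0, 0.0)],
    [(0, 1, 1.0)],
    [(0, 0, 0.0)],
]


def behavior(s, a):
    return 0.5  # assumed: the logged policy chose uniformly between the two actions


def target(s, a):
    return 1.0 if a == 1 else 0.0  # candidate policy always offers aid


ratio = is_estimate(trajectories, target, behavior) / logged_return(trajectories)
```

The ratio compares the estimated cumulative reward of the candidate policy with that of the logged policy, which is the kind of quantity the abstract reports. Importance sampling requires the logged policy to assign nonzero probability to every action the candidate policy can take, and its variance grows quickly with trajectory length.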
Taxonomy
Methods: Dropout
