An Offline Time-aware Apprenticeship Learning Framework for Evolving Reward Functions
Xi Yang, Ge Gao, Min Chi

TL;DR
This paper introduces THEMES, an offline, time-aware apprenticeship learning framework designed to handle evolving reward functions in human-centric decision tasks such as healthcare, and evaluates it on sepsis treatment.
Contribution
The paper presents a novel offline, time-aware hierarchical apprenticeship learning framework specifically addressing evolving reward functions in complex tasks.
Findings
THEMES significantly outperforms competitive state-of-the-art baselines on the sepsis treatment task.
It is effective at modeling evolving reward functions in healthcare.
It demonstrates success in sepsis treatment decision-making.
Abstract
Apprenticeship learning (AL) is a process of inducing effective decision-making policies via observing and imitating experts' demonstrations. Most existing AL approaches, however, are not designed to cope with the evolving reward functions commonly found in human-centric tasks such as healthcare, where offline learning is required. In this paper, we propose an offline Time-aware Hierarchical EM Energy-based Sub-trajectory (THEMES) AL framework to tackle the evolving reward functions in such tasks. The effectiveness of THEMES is evaluated via a challenging task -- sepsis treatment. The experimental results demonstrate that THEMES can significantly outperform competitive state-of-the-art baselines.
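To make the idea of an evolving reward function concrete, here is a minimal illustrative sketch. It is not the THEMES algorithm, and every name in it (`segment_index`, `evolving_reward`, `trajectory_return`, the `boundaries` and `weights` values) is hypothetical. It assumes the reward is linear in the state features and that the weight vector switches at fixed time boundaries; both the boundaries and the weights are set by hand here purely for illustration.

```python
from typing import List, Sequence, Tuple


def segment_index(t: float, boundaries: Sequence[float]) -> int:
    """Return the index of the time segment that contains time t.

    boundaries are the (sorted) times at which the reward function changes;
    a time equal to a boundary belongs to the later segment.
    """
    idx = 0
    for b in boundaries:
        if t >= b:
            idx += 1
    return idx


def evolving_reward(
    state: Sequence[float],
    t: float,
    boundaries: Sequence[float],
    weights: Sequence[Sequence[float]],
) -> float:
    """Linear reward whose weight vector depends on the time segment."""
    w = weights[segment_index(t, boundaries)]
    return sum(wi * si for wi, si in zip(w, state))


def trajectory_return(
    trajectory: List[Tuple[float, Sequence[float]]],
    boundaries: Sequence[float],
    weights: Sequence[Sequence[float]],
) -> float:
    """Sum the time-dependent reward over (timestamp, state) pairs."""
    return sum(evolving_reward(s, t, boundaries, weights) for t, s in trajectory)


if __name__ == "__main__":
    # Hypothetical values: the reward cares about feature 0 before t = 4.0
    # and about feature 1 afterwards. Timestamps are irregularly spaced.
    boundaries = [4.0]
    weights = [[1.0, 0.0], [0.0, 1.0]]
    trajectory = [(0.0, [1.0, 2.0]), (2.0, [1.0, 2.0]), (6.0, [1.0, 2.0])]
    print(trajectory_return(trajectory, boundaries, weights))
```

In this toy setting the same state is scored differently depending on when it is visited, which is the property that a single fixed reward function cannot capture.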
Taxonomy
Topics: Cognitive Abilities and Testing · Human Resource Development and Performance Evaluation · Intelligent Tutoring Systems and Adaptive Learning
