Using Machine Teaching to Investigate Human Assumptions when Teaching Reinforcement Learners
Yun-Shiuan Chuang, Xuezhou Zhang, Yuzhe Ma, Mark K. Ho, Joseph L. Austerweil, Xiaojin Zhu

TL;DR
This paper explores human assumptions about teaching reinforcement learners, specifically Q-learning agents, through behavioral experiments and machine teaching optimization, revealing insights into effective teaching strategies and human biases.
Contribution
It introduces a normative machine teaching framework for Q-learning and investigates human assumptions, highlighting suboptimal teaching behaviors and the impact of real-time feedback.
Findings
People teach Q-learners relatively efficiently when the learner uses a low discount rate and a high learning rate.
Human teaching remains suboptimal relative to the normative machine teaching solution.
Real-time updates of learner states slightly improve teaching effectiveness.
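The learning and discount rates named in these findings are the two parameters of the standard tabular Q-learning update. The sketch below is a minimal illustration of that update, assuming the teacher's reward or punishment is used directly as the reward signal. The function name `q_update`, the state and action labels, and all numeric values are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def q_update(q, state, action, feedback, next_state, actions, alpha, gamma):
    """Apply one tabular Q-learning step.

    `feedback` is the teacher's reward (positive) or punishment (negative).
    `alpha` is the learning rate and `gamma` is the discount rate.
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Target combines immediate feedback with discounted future value.
    td_target = feedback + gamma * best_next
    # Move the current estimate toward the target by a fraction alpha.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


actions = ["left", "right"]  # illustrative action set

# Same feedback, different learning rates (illustrative values only).
q_fast = defaultdict(float)
q_slow = defaultdict(float)
fast = q_update(q_fast, "s0", "right", 1.0, "s1", actions, alpha=0.9, gamma=0.1)
slow = q_update(q_slow, "s0", "right", 1.0, "s1", actions, alpha=0.1, gamma=0.1)

# No immediate feedback, but the next state already has value:
# the discount rate controls how much of that value flows back.
q_far = defaultdict(float)
q_far[("s1", "left")] = 2.0
propagated = q_update(q_far, "s0", "right", 0.0, "s1", actions, alpha=1.0, gamma=0.5)
```

With a high learning rate, a single piece of feedback moves the learner's estimate most of the way to the target. With a low discount rate, the estimate depends mainly on the immediate feedback rather than on values the learner has already stored for later states.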
Abstract
Successful teaching requires an assumption of how the learner learns - how the learner uses experiences from the world to update their internal states. We investigate what expectations people have about a learner when they teach them in an online manner using rewards and punishment. We focus on a common reinforcement learning method, Q-learning, and examine what assumptions people have using a behavioral experiment. To do so, we first establish a normative standard, by formulating the problem as a machine teaching optimization problem. To solve the machine teaching optimization problem, we use a deep learning approximation method which simulates learners in the environment and learns to predict how feedback affects the learner's internal states. What do people assume about a learner's learning and discount rates when they teach them an idealized exploration-exploitation task? In a…
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Auction Theory and Applications
