Synthetically Generating Human-like Data for Sequential Decision Making Tasks via Reward-Shaped Imitation Learning

Bryan Brandt; Prithviraj Dasgupta

arXiv:2304.07280·cs.LG·April 17, 2023·1 cites

Open Access

TL;DR

This paper introduces a reward-shaped imitation learning algorithm that generates human-like decision-making data from a very small set of human-collected data, so that the synthetic data can stand in for human interaction data in sequential tasks.

Contribution

The paper presents a novel method that combines reward shaping with imitation learning to synthesize human-like decision-making data from a small initial set of human data, and validates the synthetic data as a surrogate for human interaction data in sequential decision-making tasks.

Findings

01 Synthetic data closely mimics human decisions

02 Generated data enables AI to perform tasks indistinguishably from humans

03 Method is effective across three sequential decision-making tasks of increasing complexity

Abstract

We consider the problem of synthetically generating data that closely resembles human decisions made in an interactive human-AI system such as a computer game. We propose a novel algorithm that generates synthetic, human-like decision-making data starting from a very small set of decision-making data collected from humans. The algorithm integrates reward shaping with an imitation learning algorithm to generate the synthetic data. We validated the technique by using the synthetically generated data as a surrogate for human interaction data to solve three sequential decision-making tasks of increasing complexity within a small computer game-like setup. Different empirical and statistical analyses of our results show that the synthetically generated data can substitute the human data and perform the…
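
The abstract does not say which form of reward shaping the authors use. As a rough illustration of the general idea only, the sketch below uses potential-based shaping, a standard technique swapped in here as an assumption, with a potential built from how often states appear in a handful of human demonstrations. The names `demo_potential` and `shaped_reward`, the grid-cell states, and the discount `gamma=0.99` are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def demo_potential(demos):
    """Build a potential Phi(s) in [0, 1] from state visit counts in demonstrations.

    Assumption: states are hashable (here, grid cells); unseen states get 0.
    """
    counts = Counter(state for trajectory in demos for state in trajectory)
    top = max(counts.values())
    return lambda state: counts.get(state, 0) / top


def shaped_reward(reward, state, next_state, phi, gamma=0.99):
    """Potential-based shaping: reward + gamma * Phi(s') - Phi(s)."""
    return reward + gamma * phi(next_state) - phi(state)


# Hypothetical human demonstrations: each is a list of visited grid cells.
demos = [
    [(0, 0), (0, 1), (1, 1)],
    [(0, 0), (1, 0), (1, 1)],
]
phi = demo_potential(demos)

# Moving toward a frequently demonstrated state earns a positive bonus;
# moving to a state no human visited is penalized.
bonus_toward = shaped_reward(0.0, (0, 1), (1, 1), phi)
bonus_away = shaped_reward(0.0, (0, 1), (5, 5), phi)
```

An agent trained on the shaped reward is nudged toward states that the human demonstrations favored, which is one plausible way a small human dataset could steer the generation of further human-like trajectories.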

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Data Stream Mining Techniques