Learning Behavioral Soft Constraints from Demonstrations
Arie Glazier, Andrea Loreggia, Nicholas Mattei, Taher Rahgooy, Francesca Rossi, Brent Venable

TL;DR
This paper introduces MESC-IRL, a novel inverse reinforcement learning approach that learns implicit soft and hard constraints from demonstrations, enabling AI agents to mirror human decision-making in complex environments.
Contribution
The paper presents a new IRL method that learns implicit constraints from demonstrations, generalizing prior deterministic approaches and improving performance in stochastic environments.
Findings
Achieves state-of-the-art performance in learning constraints.
Successfully transfers learned constraints across different environments.
Handles both deterministic and non-deterministic MDPs.
Abstract
Many real-life scenarios require humans to make difficult trade-offs: do we always follow all the traffic rules or do we violate the speed limit in an emergency? These scenarios force us to evaluate the trade-off between collective rules and norms and our own personal objectives and desires. To create effective AI-human teams, we must equip AI agents with a model of how humans make these trade-offs in complex environments when there are implicit and explicit rules and constraints. Agents equipped with these models will be able to mirror human behavior and/or draw human attention to situations where decision making could be improved. To this end, we propose a novel inverse reinforcement learning (IRL) method, Max Entropy Inverse Soft Constraint IRL (MESC-IRL), for learning implicit hard and soft constraints over states, actions, and state features from demonstrations in deterministic…
Taxonomy
Topics: Human-Automation Interaction and Safety
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
