DIAL: Distribution-Informed Adaptive Learning of Multi-Task Constraints for Safety-Critical Systems
Se-Wook Yoo, Seung-Woo Seo

TL;DR
This paper introduces a method for learning shared safety constraints across multiple tasks in reinforcement learning, enabling safer and more efficient adaptation without task-specific constraint definitions.
Contribution
The authors propose a novel approach to identify and adapt shared constraint distributions across tasks using imitation learning, improving safety and sample efficiency.
Findings
Outperforms baseline methods in safety and success rates
Effective in multi-task and meta-task scenarios
Does not require task-specific constraint definitions
Abstract
Safe reinforcement learning has traditionally relied on predefined constraint functions to ensure safety in complex real-world tasks, such as autonomous driving. However, defining these functions accurately for varied tasks is a persistent challenge. Recent research highlights the potential of leveraging pre-acquired task-agnostic knowledge to enhance both safety and sample efficiency in related tasks. Building on this insight, we propose a novel method to learn shared constraint distributions across multiple tasks. Our approach identifies the shared constraints through imitation learning and then adapts to new tasks by adjusting risk levels within these learned distributions. This adaptability addresses variations in risk sensitivity stemming from expert-specific biases, ensuring consistent adherence to general safety principles even with imperfect demonstrations. Our method can be…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSoftware Reliability and Analysis Research · Machine Learning and Algorithms · Intelligent Tutoring Systems and Adaptive Learning
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
