DIAL: Distribution-Informed Adaptive Learning of Multi-Task Constraints for Safety-Critical Systems

Se-Wook Yoo; Seung-Woo Seo

arXiv:2501.18086 · cs.LG · January 31, 2025

TL;DR

This paper introduces a method for learning shared safety constraints across multiple tasks in reinforcement learning, enabling safer and more efficient adaptation without task-specific constraint definitions.

Contribution

The authors learn a shared constraint distribution across multiple tasks through imitation learning, then adapt to new tasks by adjusting the risk level within that learned distribution, improving both safety and sample efficiency.

Findings

01 Outperforms baseline methods in safety and success rates

02 Effective in multi-task and meta-task scenarios

03 Does not require task-specific constraint definitions

Abstract

Safe reinforcement learning has traditionally relied on predefined constraint functions to ensure safety in complex real-world tasks, such as autonomous driving. However, defining these functions accurately for varied tasks is a persistent challenge. Recent research highlights the potential of leveraging pre-acquired task-agnostic knowledge to enhance both safety and sample efficiency in related tasks. Building on this insight, we propose a novel method to learn shared constraint distributions across multiple tasks. Our approach identifies the shared constraints through imitation learning and then adapts to new tasks by adjusting risk levels within these learned distributions. This adaptability addresses variations in risk sensitivity stemming from expert-specific biases, ensuring consistent adherence to general safety principles even with imperfect demonstrations. Our method can be…
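To make "adjusting risk levels within these learned distributions" concrete, here is a minimal sketch in Python. It assumes the learned constraint is available as a set of sampled cost values for a state-action pair, and it uses conditional value at risk (CVaR) as the risk measure. Both are illustrative assumptions: the abstract does not say which risk measure or representation the authors use, and the names `cvar_upper`, `is_safe`, `alpha`, and `budget` are hypothetical.

```python
import numpy as np


def cvar_upper(cost_samples, alpha):
    """Mean of the worst (highest) alpha-fraction of sampled costs.

    alpha = 1.0 gives the plain mean (risk-neutral); smaller alpha
    focuses on the high-cost tail (more risk-averse).
    CVaR is an assumed risk measure, not taken from the abstract.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    samples = np.sort(np.asarray(cost_samples, dtype=float))
    # Number of tail samples to average; always keep at least one.
    k = max(1, int(np.ceil(alpha * samples.size)))
    return float(samples[-k:].mean())


def is_safe(cost_samples, alpha, budget):
    """Treat an action as safe if its tail risk stays within the budget."""
    return cvar_upper(cost_samples, alpha) <= budget


# Hypothetical cost samples drawn from a learned constraint distribution.
samples = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(cvar_upper(samples, 1.0))   # risk-neutral estimate
print(cvar_upper(samples, 0.5))   # risk-averse estimate
print(is_safe(samples, 1.0, budget=5.0), is_safe(samples, 0.5, budget=5.0))
```

Under this sketch, the same learned distribution yields a stricter or looser constraint depending only on `alpha`, which is one way a new task could be handled without redefining the constraint function itself.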

Taxonomy

Topics: Software Reliability and Analysis Research · Machine Learning and Algorithms · Intelligent Tutoring Systems and Adaptive Learning

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings