AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model

Zibin Dong; Yifu Yuan; Jianye Hao; Fei Ni; Yao Mu; Yan Zheng; Yujing Hu; Tangjie Lv; Changjie Fan; Zhipeng Hu

arXiv:2310.02054 · cs.AI · February 6, 2024 · 1 citation

TL;DR

AlignDiff introduces a diffusion-based framework that effectively aligns agent behaviors with diverse human preferences, enabling accurate customization and flexible switching in reinforcement learning tasks.

Contribution

This work combines RLHF with diffusion models to quantify, match, and switch between human preferences in agent behavior, addressing the abstractness and mutability of those preferences.

Findings

01 Superior preference matching performance

02 Effective behavior switching capabilities

03 Successful adaptation to unseen tasks

Abstract

Aligning agent behaviors with diverse human preferences remains a challenging problem in reinforcement learning (RL), owing to the inherent abstractness and mutability of human preferences. To address these issues, we propose AlignDiff, a novel framework that leverages RL from Human Feedback (RLHF) to quantify human preferences, covering abstractness, and utilizes them to guide diffusion planning for zero-shot behavior customizing, covering mutability. AlignDiff can accurately match user-customized behaviors and efficiently switch from one to another. To build the framework, we first establish the multi-perspective human feedback datasets, which contain comparisons for the attributes of diverse behaviors, and then train an attribute strength model to predict quantified relative strengths. After relabeling behavioral datasets with relative strengths, we proceed to train an…
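
The step of turning pairwise attribute comparisons into quantified relative strengths can be illustrated with a small sketch. The abstract does not say which preference model is used, so the Bradley-Terry model below is an assumption, and the function name `fit_strengths`, the toy comparison data, and the min-max rescaling are all hypothetical choices made for illustration rather than the authors' implementation.

```python
import math


def fit_strengths(num_items, comparisons, lr=0.1, epochs=500):
    """Fit one scalar strength per item from (winner, loser) pairs.

    Assumption: a Bradley-Terry model trained by plain gradient ascent
    on the log-likelihood. This is a stand-in, not the paper's model.
    """
    s = [0.0] * num_items
    for _ in range(epochs):
        grad = [0.0] * num_items
        for winner, loser in comparisons:
            # Probability that `winner` shows the attribute more strongly.
            p = 1.0 / (1.0 + math.exp(-(s[winner] - s[loser])))
            grad[winner] += 1.0 - p
            grad[loser] -= 1.0 - p
        s = [si + lr * gi for si, gi in zip(s, grad)]
    # Rescale to [0, 1] so the values read as relative strengths.
    lo, hi = min(s), max(s)
    return [(si - lo) / (hi - lo) if hi > lo else 0.5 for si in s]


# Hypothetical feedback on a single attribute (for example "speed"):
# each pair (a, b) means segment a was judged stronger than segment b.
comparisons = [(2, 1), (1, 0), (2, 0), (2, 1)]
strengths = fit_strengths(3, comparisons)
print(strengths)
```

In this toy setting each behavior segment receives a value in [0, 1] per attribute; labels of that kind are what a conditional planner could later be conditioned on.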

Taxonomy

Topics: Mental Health Research Topics

Methods: Diffusion