Dual-Agent Co-Training for Health Coaching via Implicit Adversarial Preference Optimization

Da Long; Lingyi Fu; Diya Michelle Rao; Jasmine Ruales Carrera; Yang Bai; Shandian Zhe

arXiv:2605.07011·cs.LG·May 11, 2026

TL;DR

This paper introduces a dual-agent co-training framework for AI health coaches, enhancing interaction exploration and performance by jointly training a coach and a client simulator through implicit adversarial preference optimization.

Contribution

The paper proposes a dual-agent co-training method that jointly optimizes the health coach and the client simulator using implicit adversarial training and Pareto-dominant response pairs.
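To make the notion of a Pareto-dominant response pair concrete, here is a minimal sketch of one way such pairs could be selected from multi-dimensional quality scores. This is an illustration under stated assumptions, not the authors' procedure: the helper names `dominates` and `pareto_pairs`, the three score dimensions, and the example scores are all hypothetical, and this summary does not say how the paper scores or pairs responses.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every dimension
    and strictly better on at least one (standard Pareto dominance)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_pairs(scores: Dict[str, Sequence[float]]) -> List[Tuple[str, str]]:
    """Return (chosen, rejected) pairs in which the chosen response
    Pareto-dominates the rejected one."""
    return [
        (chosen, rejected)
        for chosen, rejected in permutations(scores, 2)
        if dominates(scores[chosen], scores[rejected])
    ]


# Hypothetical scores on three assumed dimensions (empathy, relevance, safety).
candidates = {
    "resp_a": (0.9, 0.8, 1.0),
    "resp_b": (0.7, 0.8, 1.0),
    "resp_c": (0.95, 0.5, 1.0),
}

pairs = pareto_pairs(candidates)
print(pairs)
```

In this toy example only `resp_a` dominates `resp_b`. `resp_c` scores higher on the first dimension but lower on the second, so it is incomparable with the other two and yields no preference pair. Restricting preference data to dominant pairs avoids labeling a response as preferred when it is worse on any scored dimension.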

Findings

01. Improved coaching quality across multiple dimensions.

02. Effective exploration of interaction space through dual-agent co-training.

03. The method admits a natural stochastic-game interpretation.

Abstract

Motivational-interviewing-based health coaching is an effective approach for improving mental health and promoting healthy behavior change. However, the scarcity of trained human coaches and the high cost of coaching services make such support inaccessible to many people who could benefit from it. This motivates the development of AI health coaches that can provide scalable and affordable support. Existing methods typically optimize only one side of the interaction: they either train a dialogue agent against a fixed client environment or train a client simulator against a fixed assistant. This one-sided setup can limit exploration of the interaction space and may be inefficient at developing the capabilities required by the target agent and pushing its performance boundaries. In this paper, we propose a dual-agent framework that interactively co-trains both the health coach agent and…
