Exploring LLM-based Data Annotation Strategies for Medical Dialogue Preference Alignment

Chengfeng Dou; Ying Zhang; Zhi Jin; Wenpin Jiao; Haiyan Zhao; Yongqiang Zhao; Zhengwei Tao

arXiv:2410.04112·cs.CL·October 8, 2024

TL;DR

This paper proposes a novel agent-based annotation method using LLMs and Constitutional AI to improve medical dialogue data labeling, reducing expert reliance and enhancing model performance in healthcare applications.

Contribution

It introduces an innovative agent-based annotation approach leveraging Constitutional AI and flowcharts, addressing evaluation challenges and outperforming existing methods.

Findings

01 Agent-based approach outperforms existing RLAIF methods

02 Framework effectively assesses LLMs in medical dialogue tasks

03 Flowcharts are particularly effective for expressing physician preferences

Abstract

This research examines the use of Reinforcement Learning from AI Feedback (RLAIF) techniques to improve healthcare dialogue models, with the aim of tackling the challenges of preference-aligned data annotation while reducing the reliance on medical experts. We argue that the primary challenges in current RLAIF research for healthcare are the limitations of automated evaluation methods and the difficulties in accurately representing physician preferences. To address these challenges, we present a new evaluation framework based on standardized patient examinations. This framework is designed to objectively assess the effectiveness of large language models (LLMs) in guiding users and following instructions, enabling a comprehensive comparison across different models. Furthermore, our investigation of effective ways to express physician preferences using Constitutional AI algorithms…

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Semantic Web and Ontologies

Methods: Reinforcement Learning from AI Feedback