Conversation Forests: The Key to Fine Tuning Large Language Models for Multi-Turn Medical Conversations is Branching
Thomas Savage

TL;DR
This paper introduces Savage Conversation Forests (SCF), a reinforcement learning framework with a branched architecture that improves fine-tuning of large language models for multi-turn medical dialogues, enhancing diagnostic accuracy.
Contribution
The paper proposes a novel branched conversation architecture, SCF, for fine-tuning LLMs in multi-turn dialogues, addressing limitations of previous methods in complex conversational tasks.
Findings
SCF outperforms linear architectures in diagnostic accuracy.
The branched architecture provides richer training signals.
SCF improves understanding of conversation dynamics in medical dialogues.
Abstract
Fine-tuning methods such as Direct Preference Optimization (DPO) and Group Relative Policy Optimization (GRPO) have demonstrated success in training large language models (LLMs) for single-turn tasks. However, these methods fall short in multi-turn applications, such as diagnostic patient interviewing, where understanding how early conversational turns influence downstream completions and outcomes is essential. In medicine, a multi-turn perspective is critical for learning diagnostic schemas and better understanding conversation dynamics. To address this gap, I introduce Savage Conversation Forests (SCF), a reinforcement learning framework that leverages a branched conversation architecture to fine-tune LLMs for multi-turn dialogue. SCF generates multiple possible conversation continuations at each turn, enabling the model to learn how different early responses affect downstream…
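The branching idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the `Node` class, the `grow`, `leaves`, `branch_value`, and `sibling_advantages` helpers, the toy generator, and the toy reward are all hypothetical names introduced here. The sketch assumes a fixed branching factor at every turn and assumes that a turn-level signal is obtained by comparing sibling branches against their shared mean outcome, which is only one plausible way to use a branched tree.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List


@dataclass
class Node:
    """One point in a conversation: the turns so far plus its continuations."""
    history: List[str]
    children: List["Node"] = field(default_factory=list)


def grow(node: Node, generate: Callable[[List[str], int], str],
         branching: int, depth: int) -> None:
    """Expand `branching` continuations at each of `depth` turns (assumed fixed)."""
    if depth == 0:
        return
    for k in range(branching):
        child = Node(history=node.history + [generate(node.history, k)])
        node.children.append(child)
        grow(child, generate, branching, depth - 1)


def leaves(node: Node) -> List[Node]:
    """Collect every completed conversation under this node."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def branch_value(node: Node, reward: Callable[[List[str]], float]) -> float:
    """Average final reward over all conversations that pass through this node."""
    return mean(reward(leaf.history) for leaf in leaves(node))


def sibling_advantages(node: Node,
                       reward: Callable[[List[str]], float]) -> List[float]:
    """Score each continuation against the mean of its siblings (an assumption)."""
    values = [branch_value(child, reward) for child in node.children]
    baseline = mean(values)
    return [value - baseline for value in values]


# Toy stand-ins for a language model and a diagnostic-accuracy reward.
def toy_generate(history: List[str], k: int) -> str:
    return f"t{len(history)}-opt{k}"


def toy_reward(history: List[str]) -> float:
    # Hypothetical: only conversations that opened with option 0 succeed.
    return 1.0 if history[0].endswith("opt0") else 0.0


root = Node(history=[])
grow(root, toy_generate, branching=2, depth=3)
advantages = sibling_advantages(root, toy_reward)
```

In this toy setup the first-turn choice alone decides the outcome, so the two first-turn branches receive opposite scores even though the reward is only observed at the end of each conversation. A single linear rollout would yield one conversation and no sibling to compare against, which is the contrast the abstract draws.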
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling
