TL;DR
This paper investigates generating synthetic written multi-party conversations with instruction-tuned LLMs. It compares whole-dialogue generation against turn-by-turn generation and evaluates both for content quality and adherence to constraints.
Contribution
It introduces two strategies for LLM-based generation of Written Multi-Party Conversations (WMPCs), along with an analytical framework for evaluating constraint compliance and content quality.
Findings
Turn-by-turn generation yields better constraint conformance than whole-dialogue generation.
Some LLMs can generate high-quality WMPCs.
Both strategies can produce high-quality conversations.
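One way to make "constraint conformance" concrete is the fraction of requested turns whose speaker and reply-to target are reproduced at the same position in the generated conversation. The sketch below assumes that definition purely for illustration. This summary does not describe the paper's evaluation framework, and the `Turn` type and `conformance` function are hypothetical names introduced here.

```python
from typing import List, NamedTuple, Optional


class Turn(NamedTuple):
    """Structural view of one message (hypothetical, for illustration)."""
    speaker: str
    reply_to: Optional[int]  # index of the turn answered, None for the opener


def conformance(expected: List[Turn], generated: List[Turn]) -> float:
    """Share of expected turns matched at the same position.

    Turns missing from the generated conversation count as misses.
    This is an assumed metric, not the one used in the paper.
    """
    if not expected:
        return 1.0
    hits = sum(1 for e, g in zip(expected, generated) if e == g)
    return hits / len(expected)
```

For example, if four turns are requested and the generated conversation contains only three, one of which replies to the wrong turn, this score is 2 of 4.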
Abstract
Written Multi-Party Conversations (WMPCs) are widely studied across disciplines, with social media as a primary data source due to their accessibility. However, these datasets raise privacy concerns and often reflect platform-specific properties. For example, interactions between speakers may be limited due to rigid platform structures (e.g., threads, tree-like discussions), which yield overly simplistic interaction patterns (e.g., one-to-one "reply-to" links). This work explores the feasibility of generating synthetic WMPCs with instruction-tuned Large Language Models (LLMs) by providing deterministic constraints such as dialogue structure and participants' stance. We investigate two complementary strategies of leveraging LLMs in this context: (i.) LLMs as WMPC generators, where we task the LLM to generate a whole WMPC at once and (ii.) LLMs as WMPC parties, where the LLM generates one…
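The difference between the two strategies can be sketched in a few lines. This is a minimal illustration, not the paper's implementation. The `TurnSpec` type, the prompt wording, and the `llm` callable (any function that maps a prompt string to a completion string) are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

LLM = Callable[[str], str]  # hypothetical interface: prompt in, completion out


@dataclass
class TurnSpec:
    """Deterministic constraints for one turn (illustrative fields only)."""
    speaker: str
    stance: str               # e.g. "in favor", "against"
    reply_to: Optional[int]   # index of the turn answered, None for the opener


def describe(i: int, spec: TurnSpec) -> str:
    """Render one turn's constraints as a line of prompt text."""
    if spec.reply_to is None:
        target = "opens the conversation"
    else:
        target = f"replies to turn {spec.reply_to}"
    return f"Turn {i}: {spec.speaker} ({spec.stance}) {target}."


def generate_whole(llm: LLM, topic: str, plan: List[TurnSpec]) -> str:
    """Strategy (i): a single call in which the LLM writes the whole conversation."""
    constraints = "\n".join(describe(i, s) for i, s in enumerate(plan))
    return llm(f"Write a conversation about {topic}.\n{constraints}")


def generate_turn_by_turn(llm: LLM, topic: str, plan: List[TurnSpec]) -> List[str]:
    """Strategy (ii): one call per turn, with the LLM acting as a single party."""
    turns: List[str] = []
    for i, spec in enumerate(plan):
        history = "\n".join(turns) or "(no messages yet)"
        prompt = (
            f"Topic: {topic}\nConversation so far:\n{history}\n"
            f"{describe(i, spec)}\nWrite only this message."
        )
        # The speaker label and turn order are fixed by the caller, not the model.
        turns.append(f"{spec.speaker}: {llm(prompt)}")
    return turns
```

In this sketch, strategy (ii) sets the speaker label and turn order in the calling code, whereas in strategy (i) both depend on the model following the prompt.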