CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis
Saranya Venkatraman, Nafis Irtiza Tripto, Dongwon Lee

TL;DR
This paper introduces CollabStory, a dataset of over 32,000 stories co-authored by multiple LLMs, and explores the challenges and potential of multi-LLM collaboration in story generation and authorship analysis.
Contribution
It presents the first dataset of multi-LLM generated stories and extends authorship analysis tasks to multi-LLM scenarios, providing baselines and highlighting challenges.
Findings
Current baselines struggle with multi-LLM collaboration tasks.
The CollabStory dataset enables research on multi-LLM authorship detection.
Multi-LLM collaboration raises issues in plagiarism and copyright detection.
Abstract
The rise of unifying frameworks that enable seamless interoperability of Large Language Models (LLMs) has made LLM-LLM collaboration for open-ended tasks a possibility. Despite this, there have not been efforts to explore such collaborative writing. We take the next step beyond human-LLM collaboration to explore this multi-LLM scenario by generating the first exclusively LLM-generated collaborative stories dataset called CollabStory. We focus on single-author to multi-author (up to 5 LLMs) scenarios, where multiple LLMs co-author stories. We generate over 32k stories using open-source instruction-tuned LLMs. Further, we take inspiration from the PAN tasks that have set the standard for human-human multi-author writing tasks and analysis. We extend their authorship-related tasks for multi-LLM settings and present baselines for LLM-LLM collaboration. We find that current baselines are not…
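The multi-author setup described in the abstract can be pictured as turn-taking: each LLM continues the story written so far, and the points where the author changes become labels for PAN-style authorship tasks. The sketch below is only an illustration under stated assumptions, not the authors' pipeline. The names `co_write`, `boundary_labels`, and `make_stub` are hypothetical, and the stub generators stand in for real LLM calls.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for an LLM call: takes the story so far,
# returns the next segment. A real setup would query a model here.
Author = Callable[[str], str]


def co_write(prompt: str, authors: List[Tuple[str, Author]]) -> List[Tuple[str, str]]:
    """Let each author in turn continue the story.

    Returns (author_name, segment) pairs in writing order.
    """
    story_so_far = prompt
    segments = []
    for name, generate in authors:
        segment = generate(story_so_far)
        segments.append((name, segment))
        story_so_far = story_so_far + " " + segment
    return segments


def boundary_labels(segments: List[Tuple[str, str]]) -> List[int]:
    """Return 1 where the author changes between consecutive segments, else 0."""
    names = [name for name, _ in segments]
    return [int(a != b) for a, b in zip(names, names[1:])]


def make_stub(tag: str) -> Author:
    """Build a deterministic fake author (assumption: replaces a real model)."""
    return lambda context: f"[{tag} continues after {len(context.split())} words]"


# Three turns written by two hypothetical models.
authors = [
    ("model_a", make_stub("A")),
    ("model_b", make_stub("B")),
    ("model_b", make_stub("B")),
]
segments = co_write("Once upon a time", authors)
labels = boundary_labels(segments)
```

Here `labels` marks an author change between the first and second segments and none between the second and third, which is the kind of per-boundary ground truth an authorship-change detector would be evaluated against.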
Taxonomy
Topics: Authorship Attribution and Profiling · Topic Modeling
Methods: Sparse Evolutionary Training · Focus
