CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis

Saranya Venkatraman; Nafis Irtiza Tripto; Dongwon Lee

arXiv:2406.12665 · cs.CL · February 12, 2025 · 2 cites

TL;DR

This paper introduces CollabStory, a dataset of over 32,000 stories co-authored by multiple LLMs, and explores the challenges and potential of multi-LLM collaboration in story generation and authorship analysis.

Contribution

It presents the first dataset of multi-LLM generated stories and extends authorship analysis tasks to multi-LLM scenarios, providing baselines and highlighting challenges.

Findings

1. Current baselines struggle with authorship tasks on multi-LLM collaborative stories.

2. The CollabStory dataset enables research on multi-LLM authorship detection.

3. Multi-LLM collaboration raises issues for plagiarism and copyright detection.

Abstract

The rise of unifying frameworks that enable seamless interoperability of Large Language Models (LLMs) has made LLM-LLM collaboration for open-ended tasks a possibility. Despite this, there have not been efforts to explore such collaborative writing. We take the next step beyond human-LLM collaboration to explore this multi-LLM scenario by generating the first exclusively LLM-generated collaborative stories dataset called CollabStory. We focus on single-author to multi-author (up to 5 LLMs) scenarios, where multiple LLMs co-author stories. We generate over 32k stories using open-source instruction-tuned LLMs. Further, we take inspiration from the PAN tasks that have set the standard for human-human multi-author writing tasks and analysis. We extend their authorship-related tasks for multi-LLM settings and present baselines for LLM-LLM collaboration. We find that current baselines are not…
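To make the multi-author setting concrete, here is a minimal sketch of how a co-authored story could be labeled for two PAN-style questions: how many authors contributed, and where authorship changes between consecutive segments. The `Segment` class, the model labels, and the helper names are hypothetical illustrations and do not reflect the actual CollabStory schema or the paper's task definitions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One contiguous part of a story written by a single LLM (hypothetical format)."""
    author: str  # placeholder model label, not a real CollabStory field
    text: str


def author_change_labels(segments: List[Segment]) -> List[int]:
    """Return 1 for each boundary where the author changes, else 0."""
    return [int(a.author != b.author) for a, b in zip(segments, segments[1:])]


def num_authors(segments: List[Segment]) -> int:
    """Count the distinct authors that contributed to the story."""
    return len({s.author for s in segments})


# A toy four-segment story written by three placeholder models.
story = [
    Segment("model_a", "The lighthouse had been dark for years."),
    Segment("model_a", "Mara climbed the stairs anyway."),
    Segment("model_b", "At the top, a lamp flickered on by itself."),
    Segment("model_c", "She was not alone."),
]

labels = author_change_labels(story)
count = num_authors(story)
```

A detector for this setting would predict `labels` and `count` from the segment texts alone, with the author fields hidden.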

Code & Models

Repositories

saranya-venkatraman/multi_llm_story_writing (PyTorch, official)

Datasets

saranya132/CollabStory (dataset, 262 downloads)

Videos

CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis (Underline)

Taxonomy

Topics: Authorship Attribution and Profiling · Topic Modeling

Methods: Sparse Evolutionary Training · Focus