ShareChat: A Dataset of Chatbot Conversations in the Wild

Yueru Yan; Tuc Nguyen; Bo Su; Melissa Lieffers; Thai Le

arXiv:2512.17843·cs.CL·May 19, 2026

TL;DR

ShareChat introduces a large-scale, multi-platform dataset of chatbot conversations that captures native platform features, enabling more realistic evaluation of LLMs in diverse real-world settings.

Contribution

ShareChat is presented as the first large-scale corpus of multi-platform chatbot conversations that preserves native platform features (citations, thinking traces, and code artifacts), supporting new research on user behavior and system performance.

Findings

01 Cross-platform differences in conversation completeness and intent satisfaction.

02 Distinct citation strategies in search-augmented systems.

03 Divergent response latency patterns across platforms.

Abstract

By evaluating Large Language Models (LLMs) through uniform, text-only interfaces, current academic benchmarks obscure how the unique designs and affordances of distinct commercial platforms shape real-world user behavior and system performance. To bridge this gap, we present ShareChat, the first large-scale corpus of 142,808 conversations (660,293 turns) collected from publicly shared URLs on ChatGPT, Perplexity, Grok, Gemini, and Claude. ShareChat preserves native platform affordances, including citations, thinking traces, and code artifacts, across 95 languages and the period from April 2023 to October 2025, complementing existing corpora that homogenize these interactions. To demonstrate the dataset's evaluative utility, we present three case studies: a conversation completeness analysis assessing cross-platform differences in intent satisfaction, a source grounding analysis…
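The two corpus sizes quoted in the abstract imply an average conversation length. The following is a minimal sketch of that arithmetic, not something taken from the paper: the constant and function names are hypothetical, and only the two counts come from the abstract.

```python
# Counts quoted in the abstract; all names here are illustrative assumptions.
NUM_CONVERSATIONS = 142_808
NUM_TURNS = 660_293


def mean_turns_per_conversation(num_turns: int, num_conversations: int) -> float:
    """Return the average number of turns per conversation."""
    if num_conversations <= 0:
        raise ValueError("num_conversations must be positive")
    return num_turns / num_conversations


mean_turns = mean_turns_per_conversation(NUM_TURNS, NUM_CONVERSATIONS)
print(f"Mean turns per conversation: {mean_turns:.2f}")  # prints 4.62
```

This is an average over the whole corpus only. It says nothing about how conversation length is distributed or how it varies by platform.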

Code & Models

Datasets

tucnguyen/ShareChat
dataset · 702 dl

Taxonomy

Topics: AI in Service Interactions · Artificial Intelligence in Healthcare and Education · Topic Modeling