The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment
William Brach, Federico Torrielli, Stine Lyngsø Beltoft, Annemette Brok Pirchert, Peter Schneider-Kamp, Lukas Galke Poech

TL;DR
This paper introduces the Moltbook dataset from a Reddit-like platform with AI agents, analyzes its properties, and examines its impact on language model training, revealing safety concerns and the importance of control baselines.
Contribution
The paper releases the Moltbook Files dataset, analyzes its community and lexical properties, and studies the effects of fine-tuning language models on this data.
Findings
Moltbook data contains sensitive information such as API keys, passwords, and BIP39 seed phrases.
Fine-tuning on Moltbook reduces model truthfulness from 0.366 to 0.187.
Moltbook appears more harmless than harmful, but tail risks remain.
Abstract
Moltbook is a Reddit-like platform where OpenClaw agents post, comment, and vote at scale, a so-far unprecedented incident that comes with serious safety concerns. With the aim of studying emergent behavior in populations, we release the Moltbook Files, a dataset of 232k posts and 2.2M comments covering the platform's first 12 days, processed through a pipeline to identify and remove Personally Identifiable Information (PII). We analyze community structure, authorship, lexical properties, sentiment, topics, semantic geometry, and comment interaction. To understand how Moltbook data could affect the next generation of language models, we fine-tune Qwen2.5-14B-Instruct on Moltbook Files with three adaptation levels. Our PII pipeline reveals that agents post API keys, passwords, and BIP39 seed phrases on Moltbook, a publicly indexed platform. The overall sentiment is mostly neutral and mildly…
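The abstract does not describe how the PII pipeline works, so the following is only a minimal sketch of what flagging and redacting such secrets might look like. Both regular expressions and the `redact` function are hypothetical illustrations, not the authors' rules. The seed-phrase pattern relies only on the fact that BIP39 English words are 3 to 8 lowercase letters and that phrases contain 12 to 24 words; a real check would also validate against the BIP39 word list and checksum.

```python
import re

# Hypothetical pattern for illustration: a short prefix, a separator,
# and a long alphanumeric token. Not taken from the paper.
API_KEY_RE = re.compile(r"\b(?:sk|pk|api)[-_][A-Za-z0-9]{20,}\b")

# Rough heuristic: 12 to 24 space-separated lowercase words of 3-8 letters.
# This can produce false positives on ordinary text.
SEED_RE = re.compile(r"\b(?:[a-z]{3,8} ){11,23}[a-z]{3,8}\b")


def redact(text: str) -> str:
    """Replace candidate API keys and seed phrases with placeholders."""
    text = API_KEY_RE.sub("[API_KEY]", text)
    text = SEED_RE.sub("[SEED_PHRASE]", text)
    return text


if __name__ == "__main__":
    words = ("abandon ability able about above absent "
             "absorb abstract absurd abuse access accident")
    print(redact("my key is sk-" + "A1" * 12))
    print(redact("seed: " + words))
```

Replacing matches with typed placeholders rather than deleting them keeps the surrounding post readable and lets later analysis count how often each kind of secret appears.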
