Clio: Privacy-Preserving Insights into Real-World AI Use
Alex Tamkin, Miles McCain, Kunal Handa, Esin Durmus, Liane Lovitt, Ankur Rathi, Saffron Huang, Alfred Mountfield, Jerry Hong, Stuart Ritchie, Michael Stern, Brian Clarke, Landon Goldberg, Theodore R. Sumers, Jared Mueller, William McEachen, Wes Mitchell, Shan Carter

TL;DR
Clio is a privacy-preserving platform that analyzes real-world AI assistant usage patterns across millions of conversations without exposing raw data, enabling insights into user behavior and system safety improvements.
Contribution
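This paper introduces Clio, a privacy-preserving system that uses AI assistants themselves to analyze and surface aggregated usage patterns, so that human reviewers do not need to read raw conversations. It targets both the privacy concerns and the practical challenges that have made this data difficult to analyze.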
Findings
Identified common use cases like coding, writing, and research.
Discovered language-specific usage patterns, e.g., elder care discussions in Japanese.
Demonstrated Clio's usefulness for detecting abuse and monitoring system safety.
Abstract
How are AI assistants being used in the real world? While model providers in theory have a window into this impact via their users' data, both privacy concerns and practical challenges have made analyzing this data difficult. To address these issues, we present Clio (Claude insights and observations), a privacy-preserving platform that uses AI assistants themselves to analyze and surface aggregated usage patterns across millions of conversations, without the need for human reviewers to read raw conversations. We validate this can be done with a high degree of accuracy and privacy by conducting extensive evaluations. We demonstrate Clio's usefulness in two broad ways. First, we share insights about how models are being used in the real world from one million Claude.ai Free and Pro conversations, ranging from providing advice on hairstyles to providing guidance on Git operations and…
Taxonomy
Topics: Privacy-Preserving Technologies in Data
