CrowdLLM: Building LLM-Based Digital Populations Augmented with Generative Models
Ryan Feng Lin, Keyu Tian, Hanming Zheng, Congjing Zhang, Li Zeng, Shuai Huang

TL;DR
CrowdLLM combines pretrained large language models and generative models to create diverse, accurate, and cost-effective digital populations that mimic real human data for applications such as social simulation, crowdsourcing, marketing, and recommendation systems.
Contribution
The paper introduces CrowdLLM, a novel framework integrating LLMs and generative models to improve the diversity and fidelity of digital populations.
Findings
CrowdLLM achieves high accuracy in replicating human data.
It demonstrates scalability and cost-effectiveness.
The approach outperforms existing methods in diversity and realism.
Abstract
The emergence of large language models (LLMs) has sparked much interest in creating LLM-based digital populations for applications such as social simulation, crowdsourcing, marketing, and recommendation systems. A digital population can reduce the cost of recruiting human participants and alleviate many concerns related to human-subject studies. However, research has found that most existing works rely solely on LLMs and cannot sufficiently capture the accuracy and diversity of a real human population. To address this limitation, we propose CrowdLLM, which integrates pretrained LLMs and generative models to enhance the diversity and fidelity of the digital population. We conduct a theoretical analysis of CrowdLLM regarding its great potential in creating cost-effective, sufficiently representative, scalable digital populations that can match the quality of a…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Recommender Systems and Techniques · Topic Modeling
