CrowdLLM: Building LLM-Based Digital Populations Augmented with Generative Models

Ryan Feng Lin; Keyu Tian; Hanming Zheng; Congjing Zhang; Li Zeng; Shuai Huang

arXiv:2512.07890 · cs.MA · January 15, 2026

Open Access

TL;DR

CrowdLLM combines pretrained large language models and generative models to create diverse, accurate, and cost-effective digital populations that mimic real human data for various applications.

Contribution

The paper introduces CrowdLLM, a novel framework integrating LLMs and generative models to improve the diversity and fidelity of digital populations.

Findings

01 CrowdLLM achieves high accuracy in replicating human data.

02 CrowdLLM demonstrates scalability and cost-effectiveness.

03 CrowdLLM outperforms existing methods in diversity and realism.

Abstract

The emergence of large language models (LLMs) has sparked considerable interest in LLM-based digital populations for applications such as social simulation, crowdsourcing, marketing, and recommendation systems. A digital population can reduce the cost of recruiting human participants and alleviate many concerns associated with human-subject studies. However, research has found that most existing work relies solely on LLMs and cannot sufficiently capture the accuracy and diversity of a real human population. To address this limitation, we propose CrowdLLM, which integrates pretrained LLMs and generative models to enhance the diversity and fidelity of the digital population. We conduct a theoretical analysis of CrowdLLM regarding its potential for creating cost-effective, sufficiently representative, scalable digital populations that can match the quality of a…

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Recommender Systems and Techniques · Topic Modeling