An Empirical Study of Perceptions of General LLMs and Multimodal LLMs on Hugging Face
Yujian Liu, Xiao Yu, Jacky Keung, Xing Hu, Xin Xia, Xiaoxue Ma

TL;DR
This study analyzes user discussions on Hugging Face to understand how users perceive general-purpose and multimodal large language models, identifying their main concerns and areas for improvement.
Contribution
It provides the first empirical, taxonomy-based analysis of real-world user perceptions of general-purpose LLMs (GLLMs) and multimodal LLMs (MLLMs), drawn from diverse community discussions.
Findings
Access barriers and generation quality are top user concerns.
Deployment complexity and documentation issues are prominent.
Insights inform recommendations for LLM ecosystem improvements.
Abstract
Large language models (LLMs) have rapidly evolved from general-purpose systems to multimodal models capable of processing text, images, and audio. As both general-purpose LLMs (GLLMs) and multimodal LLMs (MLLMs) gain widespread adoption, understanding user perceptions in real-world settings becomes increasingly important. However, existing studies often rely on surveys or platform-specific data (e.g., Reddit or GitHub issues), which either constrain user feedback through predefined questions or overemphasize failure-driven, debugging-oriented discussions, thus failing to capture diverse, experience-driven, and cross-model user perspectives in practice. To address this issue, we conduct an empirical study of user discussions on Hugging Face, a major model hub with diverse models and active communities. We collect and manually annotate 662 discussion threads from 38 representative models…
