From Performance to Purpose: A Sociotechnical Taxonomy for Evaluating Large Language Model Utility
Gavin Levinson, Keith Feldman

TL;DR
This paper introduces LUX, a comprehensive sociotechnical framework for evaluating large language model utility across multiple domains, addressing the limitations of performance-only metrics in real-world, high-stakes applications.
Contribution
The paper presents the Language Model Utility Taxonomy (LUX), a hierarchical, multi-domain framework that organizes diverse metrics for assessing LLM utility beyond traditional performance measures.
Findings
LUX provides a structured taxonomy for LLM utility evaluation.
A web tool connects taxonomy components to relevant metrics.
The framework supports consistent comparison across use cases.
Abstract
As large language models (LLMs) continue to improve at completing discrete tasks, they are being integrated into increasingly complex and diverse real-world systems. However, task-level success alone does not establish a model's fit for use in practice. In applied, high-stakes settings, LLM effectiveness is driven by a wider array of sociotechnical determinants that extend beyond conventional performance measures. Although a growing set of metrics capture many of these considerations, they are rarely organized in a way that supports consistent evaluation, leaving no unified taxonomy for assessing and comparing LLM utility across use cases. To address this gap, we introduce the Language Model Utility Taxonomy (LUX), a comprehensive framework that structures utility evaluation across four domains: performance, interaction, operations, and governance. Within each domain, LUX is organized…
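One way to picture the four-domain structure named in the abstract is as a small record that groups metric scores by domain and reports which domains an evaluation has not yet covered. The sketch below is only an illustration under stated assumptions: the domain names come from the abstract, while the class name `UtilityProfile`, its methods, and the metric labels are hypothetical placeholders and are not taken from LUX or its web tool.

```python
from dataclasses import dataclass, field

# The four domain names come from the abstract; everything else here
# (class name, method names, metric labels) is a hypothetical placeholder.
DOMAINS = ("performance", "interaction", "operations", "governance")


@dataclass
class UtilityProfile:
    """Metric scores for one model, grouped by domain."""

    model: str
    scores: dict[str, dict[str, float]] = field(default_factory=dict)

    def add(self, domain: str, metric: str, value: float) -> None:
        # Reject anything outside the four domains listed in the abstract.
        if domain not in DOMAINS:
            raise ValueError(f"unknown domain: {domain!r}")
        self.scores.setdefault(domain, {})[metric] = value

    def missing_domains(self) -> list[str]:
        """Domains with no recorded metric, i.e. gaps in the evaluation."""
        return [d for d in DOMAINS if not self.scores.get(d)]


profile = UtilityProfile("example-model")
profile.add("performance", "placeholder_accuracy", 0.91)
profile.add("operations", "placeholder_latency_s", 1.4)
print(profile.missing_domains())  # ['interaction', 'governance']
```

Grouping scores by domain rather than keeping a flat list makes gaps visible: a model evaluated only on performance metrics shows up with three empty domains, which is the kind of performance-only evaluation the abstract argues is insufficient.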
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education
