HumBEL: A Human-in-the-Loop Approach for Evaluating Demographic Factors of Language Models in Human-Machine Conversations
Anthony Sicilia, Jennifer C. Gates, and Malihe Alikhani

TL;DR
This paper introduces HumBEL, a human-in-the-loop framework for evaluating how well large language models adapt to demographic factors like age and gender in human-machine conversations, using clinical techniques and automated methods.
Contribution
It presents a novel evaluation approach combining clinical speech pathology techniques with automated tools to assess demographic compatibility of language models.
Findings
GPT-3.5 mimics 6-15 year olds in inference tasks
Outperforms a 21-year-old in memorization tasks
Less than 50% of pragmatic skills are exhibited by GPT-3.5
Abstract
While demographic factors like age and gender change the way people talk, and in particular, the way people talk to machines, there is little investigation into how large pre-trained language models (LMs) can adapt to these changes. To remedy this gap, we consider how demographic factors in LM language skills can be measured to determine compatibility with a target demographic. We suggest clinical techniques from Speech Language Pathology, which has norms for acquisition of language skills in humans. We conduct evaluation with a domain expert (i.e., a clinically licensed speech language pathologist), and also propose automated techniques to complement clinical evaluation at scale. Empirically, we focus on age, finding LM capability varies widely depending on task: GPT-3.5 mimics the ability of humans ranging from age 6-15 at tasks requiring inference, and simultaneously, outperforms a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsArtificial Intelligence in Healthcare and Education
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · {Dispute@FaQ-s}How to file a dispute with Expedia? · 15 Ways to Contact How can i speak to someone at Delta Airlines · Attention Is All You Need · Cosine Annealing · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer
