BehaviorBox: Automated Discovery of Fine-Grained Performance Differences Between Language Models
Lindia Tjuatja, Graham Neubig

TL;DR
BehaviorBox is an automated method for finding fine-grained, context-specific performance differences between two language models. It extracts coherent text features where one model outperforms the other, exposing differences that corpus-level metrics such as perplexity do not show.
Contribution
This work introduces BehaviorBox, a novel automated approach for discovering detailed performance differences between language models using performance-aware contextual embeddings.
Findings
Identifies specific contextual features where models differ in performance
Reveals insights not captured by corpus-level perplexity measures
Applies to various models, sizes, and training methods
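
To make the second finding concrete, here is a minimal sketch of how two models can have the same corpus-level perplexity while differing sharply on specific groups of tokens. Everything in it is an assumption for illustration: the log-probabilities are made up, the names `perplexity` and `mean_gap_by_group` are hypothetical, and the hand-written (context, token) grouping only stands in for learned features. It is not the BehaviorBox method itself.

```python
import math

# Hypothetical (context, token) pairs with made-up per-token log-probabilities
# (natural log) from two models, A and B. Illustration only.
tokens = [
    ("if you", "were"),
    ("if you", "were"),
    ("so happy", "!"),
    ("so happy", "!"),
    ("the", "cat"),
    ("the", "dog"),
]
logp_a = [-0.5, -0.7, -2.0, -2.2, -1.0, -1.0]
logp_b = [-1.5, -1.7, -1.0, -1.2, -1.0, -1.0]


def perplexity(logps):
    """Corpus-level perplexity: exp of the mean negative log-probability."""
    return math.exp(-sum(logps) / len(logps))


def mean_gap_by_group(tokens, logp_a, logp_b):
    """Mean (logp_a - logp_b) per (context, token) group; positive favors A."""
    sums, counts = {}, {}
    for key, a, b in zip(tokens, logp_a, logp_b):
        sums[key] = sums.get(key, 0.0) + (a - b)
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


if __name__ == "__main__":
    # Both models assign the same total log-probability to this toy corpus,
    # so their corpus-level perplexities match.
    print("perplexity A:", perplexity(logp_a))
    print("perplexity B:", perplexity(logp_b))
    # The per-group gaps show where each model is actually better.
    for key, gap in mean_gap_by_group(tokens, logp_a, logp_b).items():
        print(key, round(gap, 3))
```

In this toy data the per-group gaps cancel out in the aggregate: model A is better on "were" after "if you", model B is better on "!" after "so happy", and the corpus-level number hides both.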
Abstract
Language model evaluation is a daunting task: prompts are brittle, corpus-level perplexities are vague, and the choice of benchmarks is endless. Finding examples that show meaningful, generalizable differences between two LMs is crucial to understanding where one model succeeds and another fails. Can this process be done automatically? In this work, we propose a methodology for automated comparison of language models that uses performance-aware contextual embeddings to find fine-grained features of text where one LM outperforms another. Our method, which we name BehaviorBox, extracts coherent features that demonstrate differences with respect to the ease of generation between two LMs. Specifically, BehaviorBox finds features that describe groups of words in fine-grained contexts, such as "conditional 'were' in the phrase 'if you were'" and "exclamation marks after emotional statements",…
Taxonomy
Topics: Topic Modeling · Mental Health via Writing · Machine Learning in Healthcare
