VERBA: Verbalizing Model Differences Using Large Language Models
Shravan Doda, Shashidhar Reddy Javaji, Zining Zhu

TL;DR
VERBA uses large language models to generate verbal descriptions of differences between machine learning models, aiding in model comparison and understanding.
Contribution
This paper introduces VERBA, a novel LLM-based method for verbalizing pairwise model differences, including a protocol for evaluating verbalization informativeness and a diverse benchmark suite.
Findings
VERBA achieves up to 80% accuracy in verbalizing model differences.
Including structural information increases verbalization accuracy to 90%.
VERBA facilitates fine-grained, post-hoc model comparison and transparency.
Abstract
In the current machine learning landscape, we face a "model lake" phenomenon: given a task, there is a proliferation of trained models with similar performance despite different behaviors. For model users attempting to navigate and select from these models, documentation comparing model pairs is helpful. However, for every N models there could be O(N^2) pairwise comparisons, a number prohibitive for model developers to perform manually and document. To facilitate fine-grained pairwise comparisons among models, we introduced VERBA. Our approach leverages a large language model (LLM) to generate verbalizations of model differences by sampling from the two models. We established a protocol that evaluates the informativeness of the verbalizations via simulation. We also assembled a suite with a diverse set of commonly used machine learning…
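To make the scale argument in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It counts the unordered pairs among N models and computes a simple disagreement rate between two models' predictions on shared inputs. The names `num_pairs`, `disagreement_rate`, and the toy `predictions` dictionary are hypothetical, and the disagreement rate is only an assumed stand-in for "behavioral difference", which the truncated abstract does not define.

```python
from itertools import combinations


def num_pairs(n_models: int) -> int:
    """Number of unordered model pairs among n_models: n * (n - 1) / 2."""
    return n_models * (n_models - 1) // 2


def disagreement_rate(preds_a, preds_b) -> float:
    """Fraction of shared inputs on which two models predict differently.

    Assumed proxy for behavioral difference; not taken from the paper.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        return 0.0
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


# Hypothetical predictions from three models on the same five inputs.
predictions = {
    "model_a": [0, 1, 1, 0, 1],
    "model_b": [0, 1, 0, 0, 1],
    "model_c": [1, 1, 1, 0, 0],
}

# One entry per unordered pair, so the table grows quadratically with N.
pairwise = {
    (m1, m2): disagreement_rate(predictions[m1], predictions[m2])
    for m1, m2 in combinations(sorted(predictions), 2)
}

for pair, rate in pairwise.items():
    print(pair, f"{rate:.2f}")
```

With 3 models the table has 3 entries; with 100 models it would have 4,950, which illustrates why documenting every pair by hand does not scale.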
