VERBA: Verbalizing Model Differences Using Large Language Models

Shravan Doda; Shashidhar Reddy Javaji; Zining Zhu

arXiv:2507.02241·cs.LG·July 4, 2025


TL;DR

VERBA uses large language models to generate verbal descriptions of differences between machine learning models, aiding in model comparison and understanding.

Contribution

This paper introduces VERBA, a novel LLM-based method for verbalizing pairwise model differences, including a protocol for evaluating verbalization informativeness and a diverse benchmark suite.

Findings

01

VERBA achieves up to 80% accuracy in verbalizing model differences.

02

Including structural information increases verbalization accuracy to 90%.

03

VERBA facilitates fine-grained, post-hoc model comparison and transparency.

Abstract

In the current machine learning landscape, we face a "model lake" phenomenon: given a task, there is a proliferation of trained models with similar performance despite different behavior. For model users attempting to navigate and select from these models, documentation comparing model pairs is helpful. However, for every $N$ models there could be $O(N^2)$ pairwise comparisons, a number prohibitive for model developers to perform and document manually. To facilitate fine-grained pairwise comparisons among models, we introduced VERBA. Our approach leverages a large language model (LLM) to generate verbalizations of model differences by sampling from the two models. We established a protocol that evaluates the informativeness of the verbalizations via simulation. We also assembled a suite with a diverse set of commonly used machine learning…
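As a rough illustration of the setup the abstract describes, the Python sketch below counts the pairwise comparisons among $N$ models, samples outputs from two models on shared inputs, and formats those samples as text that an LLM could be asked to verbalize. Everything in it is an assumption made for illustration and is not taken from the paper: the toy decision rules `model_a` and `model_b`, the helper names `num_pairs`, `sample_outputs`, `disagreement_rate`, and `build_prompt`, and the prompt wording. No LLM is called.

```python
import random
from itertools import combinations


def num_pairs(n):
    """Number of unordered model pairs among n models: n*(n-1)/2, which is O(n^2)."""
    return n * (n - 1) // 2


# Hypothetical toy classifiers standing in for two trained models.
def model_a(x):
    return int(x[0] > 0.5)


def model_b(x):
    return int(x[0] > 0.5 or x[1] > 0.9)


def sample_outputs(model_1, model_2, n_samples, seed=0):
    """Draw random 2-d inputs and record both models' predictions on each."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        x = (rng.random(), rng.random())
        rows.append((x, model_1(x), model_2(x)))
    return rows


def disagreement_rate(rows):
    """Fraction of sampled inputs on which the two models predict differently."""
    return sum(1 for _, y1, y2 in rows if y1 != y2) / len(rows)


def build_prompt(rows, max_rows=5):
    """Format a few samples as plain text for an LLM (illustrative wording only)."""
    lines = ["Describe how Model A and Model B differ, given these samples:"]
    for x, y1, y2 in rows[:max_rows]:
        lines.append(f"input=({x[0]:.2f}, {x[1]:.2f}) A={y1} B={y2}")
    return "\n".join(lines)


if __name__ == "__main__":
    print("pairs among 10 models:", num_pairs(10))
    rows = sample_outputs(model_a, model_b, n_samples=200)
    print("disagreement rate:", disagreement_rate(rows))
    print(build_prompt(rows))
```

The disagreement rate here is only a simple stand-in for a behavioral difference between two models. The simulation-based informativeness protocol mentioned in the abstract is not reproduced.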
