Lexara: A User-Centered Toolkit for Evaluating Large Language Models for Conversational Visual Analytics
Srishti Palani, Vidya Setlur

TL;DR
Lexara is a user-centered toolkit for evaluating large language models in conversational visual analytics (CVA). It addresses evaluation challenges with test cases drawn from real-world use cases, interpretable metrics, and an interactive interface.
Contribution
The paper introduces Lexara, a novel evaluation toolkit that operationalizes user insights into test cases, interpretable metrics, and an accessible interface for CVA model assessment.
Findings
Lexara effectively guides model and prompt selection for CVA tasks.
The toolkit covers diverse real-world scenarios and multi-format outputs.
User feedback confirms its usefulness and ease of use.
Abstract
Large Language Models (LLMs) are transforming Conversational Visual Analytics (CVA) by enabling data analysis through natural language. However, evaluating LLMs for CVA remains challenging: current approaches require programming expertise, overlook real-world complexity, and lack interpretable metrics for multi-format (visualization and text) outputs. Through interviews with 22 CVA developers and 16 end-users, we identified use cases, evaluation criteria, and workflows. We present Lexara, a user-centered evaluation toolkit for CVA that operationalizes these insights into: (i) test cases spanning real-world scenarios; (ii) interpretable metrics covering visualization quality (data fidelity, semantic alignment, functional correctness, design clarity) and language quality (factual grounding, analytical reasoning, conversational coherence) using rule-based and LLM-as-a-Judge methods; and (iii) an…
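The abstract names rule-based metrics such as data fidelity but does not say how they are computed. As a minimal sketch only, and not Lexara's implementation, a rule-based data-fidelity check could compare the values a chart plots against an aggregate recomputed from the source data. The function names, the sum aggregation, and the dict-based `plotted` format below are all hypothetical assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_sum(rows, key, value):
    """Sum `value` per `key` over the source rows (assumed aggregation)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def data_fidelity(rows, key, value, plotted, tol=1e-9):
    """Hypothetical score in [0, 1]: share of categories plotted correctly.

    `plotted` maps each category shown in the chart to its plotted value.
    Categories missing from the chart, wrong values, and categories that
    do not exist in the source data all lower the score.
    """
    expected = aggregate_sum(rows, key, value)
    if not expected:
        return 1.0
    correct = sum(
        1
        for category, total in expected.items()
        if category in plotted and abs(plotted[category] - total) <= tol
    )
    # Categories the chart shows that are absent from the source data.
    extra = len(set(plotted) - set(expected))
    return correct / (len(expected) + extra)


# Illustrative data, not taken from the paper.
rows = [
    {"region": "East", "sales": 10.0},
    {"region": "East", "sales": 5.0},
    {"region": "West", "sales": 7.0},
]
plotted = {"East": 15.0, "West": 8.0}  # "West" is plotted incorrectly
score = data_fidelity(rows, "region", "sales", plotted)
print(score)
```

A rule like this is deterministic and easy to explain, which is one plausible reason to pair rule-based checks with LLM-as-a-Judge scoring for the more subjective criteria the abstract lists, such as design clarity.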
Taxonomy
Topics: Data Visualization and Analytics · Computational and Text Analysis Methods · Multimodal Machine Learning Applications
