Less Is More: Measuring How LLM Involvement Affects Chatbot Accuracy in Static Analysis

Krishna Narasimhan

arXiv:2604.21746 · cs.SE · April 24, 2026

TL;DR

This study compares three LLM-based architectures for translating natural language into code analysis queries, finding that structured intermediate representations significantly improve accuracy, especially for large models.

Contribution

It introduces and evaluates a spectrum of LLM involvement architectures, highlighting the effectiveness of structured intermediate representations over direct or agentic approaches.
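
To make the idea of a structured intermediate representation concrete, here is a minimal sketch in which the LLM emits a small JSON object and deterministic code turns it into a query. The schema (`node_type`, `name`), the allowed node types, and the function names are assumptions made for illustration; they are not the schema or pipeline used in the paper. The emitted strings follow the common Joern pattern `cpg.<nodeType>.name("...").l`.

```python
import json

# Hypothetical IR schema for illustration only (not the paper's schema).
ALLOWED_NODE_TYPES = {"method", "call"}
REQUIRED_FIELDS = {"node_type", "name"}


def validate_ir(ir: dict) -> None:
    """Reject any IR that does not match the (assumed) schema exactly."""
    missing = REQUIRED_FIELDS - ir.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    extra = ir.keys() - REQUIRED_FIELDS
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    if ir["node_type"] not in ALLOWED_NODE_TYPES:
        raise ValueError(f"unsupported node_type: {ir['node_type']!r}")
    name = ir["name"]
    # Disallow characters that could break out of the quoted string literal.
    if not isinstance(name, str) or '"' in name or "\\" in name:
        raise ValueError("name must be a string without quotes or backslashes")


def compile_to_cpgql(ir: dict) -> str:
    """Deterministically turn a validated IR into a query string."""
    validate_ir(ir)
    return f'cpg.{ir["node_type"]}.name("{ir["name"]}").l'


def translate(llm_output: str) -> str:
    """Parse the model's JSON output and compile it to a query."""
    return compile_to_cpgql(json.loads(llm_output))


if __name__ == "__main__":
    print(translate('{"node_type": "call", "name": "strcpy"}'))
```

In this sketch the model never writes query syntax itself. It only fills in typed fields, so a malformed answer fails validation instead of producing a query that does not parse.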

Findings

01 Structured intermediate representation outperforms direct generation by 15-25 percentage points.

02 Large models benefit most from constrained, well-typed intermediates.

03 Schema compliance limits small models' performance, despite the structured approach.

Abstract

Large language models are increasingly used to make static analysis tools accessible through natural language, yet existing systems differ in how much they delegate to the LLM without treating the degree of delegation as an independent variable. We compare three architectures along a spectrum of LLM involvement for translating natural language to Joern's query language CPGQL: direct query generation (Approach 1), generation of a schema-constrained JSON intermediate representation (Approach 2), and tool-augmented agentic generation (Approach 3). These are evaluated on a benchmark of 20 code analysis tasks across three complexity tiers, using four open-weight models in a 2×2 design (two model families × two scales), each with three repetitions. The structured intermediate representation (Approach 2) achieves the highest result match rates, outperforming direct…
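
The size of the experimental grid described in the abstract can be worked out with a short sketch. It assumes the design is fully crossed, meaning every task is run under every architecture, model, and repetition; the abstract does not state this explicitly. The family and scale labels are placeholders, since the excerpt does not name the models.

```python
from itertools import product

# Labels below are placeholders; the abstract excerpt does not name the models.
ARCHITECTURES = ("direct", "json_ir", "agentic")  # the three architectures
FAMILIES = ("family_a", "family_b")               # two model families
SCALES = ("small", "large")                       # two scales per family
NUM_TASKS = 20
REPETITIONS = 3


def enumerate_runs() -> list[dict]:
    """List every run, assuming a fully crossed design (an assumption)."""
    models = list(product(FAMILIES, SCALES))  # the 2x2 model grid
    return [
        {"task": task, "architecture": arch, "family": family,
         "scale": scale, "repetition": rep}
        for task, arch, (family, scale), rep in product(
            range(1, NUM_TASKS + 1), ARCHITECTURES, models,
            range(1, REPETITIONS + 1))
    ]


if __name__ == "__main__":
    runs = enumerate_runs()
    print(len(runs))  # 20 tasks * 3 architectures * 4 models * 3 repetitions = 720
```

Under that assumption, each architecture is compared on 240 runs (20 tasks × 4 models × 3 repetitions).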
