Query2Diagram: Answering Developer Queries with UML Diagrams
Oleg Baryshnikov (1), Anton M. Alekseev (2, 3), Sergey I. Nikolenko (2, 3) ((1) HSE University, (2) St. Petersburg Department of Steklov Mathematical Institute, RAS, (3) St. Petersburg State University)

TL;DR
Query2Diagram leverages fine-tuned LLMs to generate UML diagrams that directly answer developer questions, improving relevance and correctness over traditional reverse engineering tools.
Contribution
The paper introduces a query-driven method for UML diagram generation that uses fine-tuned LLMs and targets two properties: semantic relevance to the developer's query and structural correctness of the resulting diagram.
Findings
Fine-tuning on curated data significantly improves diagram quality.
The approach achieves higher F1 scores than existing LLMs.
Generated diagrams are both structurally sound and semantically faithful.
Abstract
Software documentation frequently becomes outdated or fails to exist entirely, yet developers need focused views of their codebase to understand complex systems. While automated reverse engineering tools can generate UML diagrams from code, they produce overwhelming detail without considering developer intent. We introduce query-driven UML diagram generation, where LLMs create diagrams that directly answer natural language questions about code. Unlike existing methods, our approach produces semantically focused diagrams containing only relevant elements with contextual descriptions. We fine-tune Qwen2.5-Coder-14B on a curated dataset of code files, developer queries, and corresponding diagram representations in a structured JSON format, evaluating with both automatic detection of structural defects and human assessment of semantic relevance. Results demonstrate that fine-tuning on a…
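To make the idea of "automatic detection of structural defects" in a JSON diagram more concrete, here is a minimal sketch. The schema is an assumption made for illustration only: the field names (`nodes`, `edges`, `id`, `source`, `target`, `type`), the relation vocabulary, and the function `find_structural_defects` are hypothetical and are not taken from the paper, whose actual format and checks are not shown in this excerpt.

```python
import json

# Hypothetical relation vocabulary; the paper's actual set is not given here.
ALLOWED_RELATIONS = {"inheritance", "association", "dependency", "composition"}


def find_structural_defects(diagram: dict) -> list[str]:
    """Return human-readable structural defects found in a diagram dict.

    Assumes a hypothetical schema: {"nodes": [{"id": ...}], "edges":
    [{"source": ..., "target": ..., "type": ...}]}.
    """
    defects = []
    seen_ids = set()

    # Every node needs a unique identifier.
    for node in diagram.get("nodes", []):
        node_id = node.get("id")
        if node_id is None:
            defects.append("node without id")
        elif node_id in seen_ids:
            defects.append(f"duplicate node id: {node_id}")
        else:
            seen_ids.add(node_id)

    # Every edge must connect declared nodes with a known relation type.
    for edge in diagram.get("edges", []):
        src, dst, rel = edge.get("source"), edge.get("target"), edge.get("type")
        for endpoint in (src, dst):
            if endpoint not in seen_ids:
                defects.append(f"edge references unknown node: {endpoint}")
        if rel not in ALLOWED_RELATIONS:
            defects.append(f"unknown relation type: {rel}")
        if rel == "inheritance" and src == dst:
            defects.append(f"class inherits from itself: {src}")

    return defects


# Illustrative model output: the second edge points at an undeclared class.
raw_output = """
{
  "nodes": [
    {"id": "Parser", "description": "Turns tokens into an AST"},
    {"id": "Lexer", "description": "Splits source text into tokens"}
  ],
  "edges": [
    {"source": "Parser", "target": "Lexer", "type": "dependency"},
    {"source": "Parser", "target": "Token", "type": "association"}
  ]
}
"""

example = json.loads(raw_output)
for defect in find_structural_defects(example):
    print(defect)
```

A check of this kind covers only structure. Whether the diagram actually answers the query is a semantic question, which the abstract says is assessed by humans.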
