Towards Automatically Extracting UML Class Diagrams from Natural Language Specifications

Song Yang; Houari Sahraoui

arXiv:2210.14441·cs.SE·October 28, 2022

TL;DR

This paper presents an automated pipeline for extracting UML class diagrams from natural language specifications. Precision and recall are currently low, so the pipeline is offered as a baseline for future work in model-driven engineering.

Contribution

It introduces a novel dataset and a structured pipeline approach for UML diagram extraction from natural language, establishing a benchmark for future research.

Findings

01

The approach provides a baseline for UML diagram extraction.

02

It achieves low precision and recall, highlighting room for improvement.

03

A new dataset and evaluation framework are introduced.
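
To make the reported metrics concrete, here is a minimal sketch of precision and recall computed over sets of diagram elements. It assumes exact matching on element names, which may differ from how the paper's testing framework matches elements; the function name `precision_recall` and the sample class names are hypothetical.

```python
def precision_recall(predicted, reference):
    """Compare extracted diagram elements against a reference diagram.

    Assumption: elements match only when they are exactly equal.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical class names, for illustration only.
reference = {"Library", "Member", "Book", "Loan"}
predicted = {"Library", "Member", "Author"}
precision, recall = precision_recall(predicted, reference)
```

Here two of the three predicted classes appear in the reference (precision 2/3) and two of the four reference classes are recovered (recall 1/2).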

Abstract

In model-driven engineering (MDE), UML class diagrams serve as a way to plan and communicate between developers. However, creating them is complex and resource-consuming. We propose an automated approach for the extraction of UML class diagrams from natural language software specifications. To develop our approach, we create a dataset of UML class diagrams and their English specifications with the help of volunteers. Our approach is a pipeline of steps consisting of the segmentation of the input into sentences, the classification of the sentences, the generation of UML class diagram fragments from sentences, and the composition of these fragments into one UML class diagram. We develop a quantitative testing framework specific to UML class diagram extraction. Our approach yields low precision and recall but serves as a benchmark for future research.
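
The four pipeline steps named in the abstract (segmentation, classification, fragment generation, composition) can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the regular expression, the labels "class" and "relationship", the `Fragment` structure, and every function name are assumptions made for the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """A partial class diagram (hypothetical structure for this sketch)."""
    classes: set = field(default_factory=set)
    attributes: set = field(default_factory=set)    # (class, attribute) pairs
    associations: set = field(default_factory=set)  # (source, label, target)


# Toy pattern (an assumption): "<article> <noun> <verb> <article> <noun>."
PATTERN = re.compile(
    r"^(?:an?|the|each|every)\s+(\w+)\s+(\w+)\s+(?:an?|the)\s+(\w+)\.?$",
    re.IGNORECASE,
)


def segment(text):
    """Step 1: split the specification into sentences (naive splitter)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify(sentence):
    """Step 2: label a sentence; a rule stands in for a real classifier."""
    match = PATTERN.match(sentence)
    if match is None:
        return "other"
    return "class" if match.group(2).lower() == "has" else "relationship"


def generate_fragment(sentence, label):
    """Step 3: turn one labeled sentence into a diagram fragment."""
    fragment = Fragment()
    match = PATTERN.match(sentence)
    if match is None or label == "other":
        return fragment
    subject = match.group(1).capitalize()
    verb = match.group(2).lower()
    obj = match.group(3)
    fragment.classes.add(subject)
    if label == "class":
        fragment.attributes.add((subject, obj.lower()))
    else:
        target = obj.capitalize()
        fragment.classes.add(target)
        fragment.associations.add((subject, verb, target))
    return fragment


def compose(fragments):
    """Step 4: merge fragments into one diagram by set union."""
    diagram = Fragment()
    for fragment in fragments:
        diagram.classes |= fragment.classes
        diagram.attributes |= fragment.attributes
        diagram.associations |= fragment.associations
    return diagram


def extract(text):
    """Run the four steps in order."""
    return compose(generate_fragment(s, classify(s)) for s in segment(text))


spec = "A library has a name. A member borrows a book. The book has a title."
diagram = extract(spec)
```

On this three-sentence input, the sketch yields the classes Library, Member, and Book, the attributes `name` and `title`, and one `borrows` association. Merging by set union means a class mentioned in several sentences appears once in the composed diagram; a real system would need a learned classifier and a far more robust fragment generator.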

Code & Models

Repositories

XsongyangX/uml-translation-3step
tfOfficial
