AutoFL: A Tool for Automatic Multi-granular Labelling of Software Repositories

Cezar Sas; Andrea Capiluppi

arXiv:2408.02557·cs.SE·April 28, 2025

TL;DR

AutoFL is a tool that automatically labels software repositories at multiple levels using source code, helping developers understand large codebases more efficiently by providing meaningful domain annotations.

Contribution

The paper introduces AutoFL, a novel tool for automatic multi-granular labeling of software repositories directly from source code, improving upon prior project-level classification methods.

Findings

1. AutoFL successfully labels files, packages, and projects with relevant application domains.

2. The tool demonstrates potential to reduce manual effort in software comprehension.

3. Limitations include accuracy challenges and scope for future improvements.

Abstract

Software comprehension, especially of new code bases, is time-consuming for developers, particularly in large projects with multiple functionalities spanning various domains. One strategy to reduce this effort is to annotate files with meaningful labels that describe the functionalities they contain. However, prior research has focused on classifying the whole project using README files as a proxy, which gives developers little information. Our objective is to streamline the labelling of files with the correct application domains using source code as input. To this end, in prior work we evaluated the ability to annotate files automatically using a weak labelling approach. This paper presents AutoFL, a tool for automatically labelling software repositories from source code. AutoFL allows multi-granular annotations including: file,…
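To make the idea of multi-granular annotation concrete, the following is a minimal sketch of how file-level label scores could be rolled up to package and project level. It is not AutoFL's implementation: the averaging rule, the use of the parent directory as the "package", the function names (`mean_scores`, `aggregate`, `top_label`), and the example paths and scores are all assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def mean_scores(score_dicts):
    """Average a non-empty list of {label: score} dicts; missing labels count as 0."""
    totals = defaultdict(float)
    for scores in score_dicts:
        for label, value in scores.items():
            totals[label] += value
    n = len(score_dicts)
    return {label: total / n for label, total in totals.items()}


def aggregate(file_scores):
    """Roll file-level scores up to package and project level.

    Assumption: a "package" is simply the file's parent directory, and the
    aggregation rule is a plain average. Both choices are hypothetical.
    """
    by_package = defaultdict(list)
    for path, scores in file_scores.items():
        by_package[str(PurePosixPath(path).parent)].append(scores)
    packages = {pkg: mean_scores(items) for pkg, items in by_package.items()}
    project = mean_scores(list(file_scores.values()))
    return packages, project


def top_label(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


# Hypothetical file-level annotations (label -> score), invented for this example.
file_scores = {
    "src/net/http.py": {"networking": 0.75, "security": 0.25},
    "src/net/tls.py": {"networking": 0.25, "security": 0.75},
    "src/ui/window.py": {"gui": 1.0},
}
packages, project = aggregate(file_scores)
```

Under these assumptions, each granularity is just a different grouping of the same file-level scores, so a coarser label can be recomputed cheaply once the file-level annotations exist.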

Code & Models

Repositories

SasCezar/AutoFL (official)

Taxonomy

Topics: Software Engineering Research