AutoFL: A Tool for Automatic Multi-granular Labelling of Software Repositories
Cezar Sas, Andrea Capiluppi

TL;DR
AutoFL is a tool that automatically labels software repositories at multiple granularities (file, package, and project) using source code, helping developers understand large codebases more efficiently through meaningful application-domain annotations.
Contribution
The paper introduces AutoFL, a tool for automatic multi-granular labelling of software repositories directly from source code, improving on prior methods that classify only the whole project.
Findings
AutoFL successfully labels files, packages, and projects with relevant application domains.
The tool demonstrates potential to reduce manual effort in software comprehension.
Labelling accuracy remains a limitation, leaving room for future improvement.
Abstract
Software comprehension, especially of new code bases, is time-consuming for developers, particularly in large projects with multiple functionalities spanning various domains. One strategy to reduce this effort involves annotating files with meaningful labels that describe the functionalities contained. However, prior research has so far focused on classifying the whole project using README files as a proxy, resulting in little information gained for the developers. Our objective is to streamline the labelling of files with the correct application domains using source code as input. To achieve this, in prior work, we evaluated the ability to annotate files automatically using a weak labelling approach. This paper presents AutoFL, a tool for automatically labelling software repositories from source code. AutoFL allows multi-granular annotations including: file,…
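To make the idea of multi-granular annotation concrete, here is a minimal sketch of how file-level label scores could be rolled up to package and project level. It is an illustration only: the summary above does not describe how AutoFL aggregates labels, so the averaging rule, the function names (aggregate, top_label), the file paths, and the label names are all assumptions.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def aggregate(file_scores):
    """Average per-file label scores up to package (directory) and project level.

    Hypothetical helper: averaging is an assumed aggregation rule,
    not necessarily the one AutoFL uses.
    """
    package_sums = defaultdict(lambda: defaultdict(float))
    package_counts = defaultdict(int)
    project_sums = defaultdict(float)

    for path, scores in file_scores.items():
        # Treat the containing directory as the file's package.
        package = str(PurePosixPath(path).parent)
        package_counts[package] += 1
        for label, score in scores.items():
            package_sums[package][label] += score
            project_sums[label] += score

    packages = {
        pkg: {label: total / package_counts[pkg] for label, total in sums.items()}
        for pkg, sums in package_sums.items()
    }
    n_files = len(file_scores)
    project = {label: total / n_files for label, total in project_sums.items()}
    return packages, project


def top_label(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


# Made-up file-level scores, standing in for the output of a file labeller.
file_scores = {
    "src/db/Connection.java": {"Database": 0.75, "Networking": 0.25},
    "src/db/Query.java": {"Database": 0.5, "Networking": 0.5},
    "src/ui/Window.java": {"User Interface": 1.0},
}

packages, project = aggregate(file_scores)
print(top_label(packages["src/db"]))  # dominant label for the src/db package
print(top_label(project))             # dominant label for the whole project
```

Averaging keeps the package and project scores on the same scale as the file scores, so the same threshold or top-k rule could be applied at every level. A labelled file that lacks a given label simply contributes zero for it.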
Taxonomy
Topics: Software Engineering Research
