Leveraging multi-task learning to improve the detection of SATD and vulnerability

Barbara Russo; Jorge Melegati; Moritz Mock

arXiv:2501.15934·cs.SE·July 3, 2025

TL;DR

This study tests whether multi-task learning improves the automatic detection of self-admitted technical debt (SATD) and vulnerabilities in code. The results show no significant improvement over single-task learning, which points to the need for further research into how the two are related.

Contribution

The paper implements VulSATD, a deep learning model based on CodeBERT, to jointly detect SATD and vulnerabilities, and evaluates its effectiveness using a fused dataset.
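
The joint-detection idea can be illustrated with a minimal sketch, assuming a shared feature vector (standing in for the encoder output of a model such as CodeBERT) that feeds two binary classification heads whose losses are combined into one training objective. The names `LinearHead`, `multitask_loss`, and `satd_weight` are hypothetical and chosen for illustration only; this summary does not describe VulSATD's actual architecture or loss weighting, so none of this should be read as the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(prob: float, label: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one example, clamped to avoid log(0)."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


@dataclass
class LinearHead:
    """Hypothetical task-specific head: one linear layer plus a sigmoid."""
    weights: List[float]
    bias: float = 0.0

    def prob(self, features: Sequence[float]) -> float:
        logit = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return sigmoid(logit)


def multitask_loss(
    features: Sequence[float],
    satd_label: int,
    vuln_label: int,
    satd_head: LinearHead,
    vuln_head: LinearHead,
    satd_weight: float = 0.5,  # assumed weighting, not taken from the paper
) -> float:
    """Weighted sum of the two task losses computed on shared features."""
    satd_loss = bce(satd_head.prob(features), satd_label)
    vuln_loss = bce(vuln_head.prob(features), vuln_label)
    return satd_weight * satd_loss + (1.0 - satd_weight) * vuln_loss


# Example: the same shared features are scored by both heads.
shared_features = [0.2, -1.3, 0.7]
satd_head = LinearHead(weights=[0.5, -0.1, 0.3])
vuln_head = LinearHead(weights=[-0.4, 0.2, 0.1], bias=0.1)
loss = multitask_loss(shared_features, 1, 0, satd_head, vuln_head)
```

Setting `satd_weight` to 1.0 or 0.0 reduces the objective to a single-task loss for SATD or vulnerability respectively, which is one simple way to obtain single-task baselines for comparison under these assumptions.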

Findings

01. No significant performance difference between single and multi-task approaches

02. Multi-task learning did not improve detection accuracy in this study

03. Further investigation needed into the relationship between technical debt and vulnerabilities

Abstract

Multi-task learning is a paradigm that leverages information from related tasks to improve the performance of machine learning. Self-Admitted Technical Debt (SATD) are comments in the code that indicate not-quite-right code introduced for short-term needs, i.e., technical debt (TD). Previous research has provided evidence of a possible relationship between SATD and the existence of vulnerabilities in the code. In this work, we investigate if multi-task learning could leverage the information shared between SATD and vulnerabilities to improve the automatic detection of these issues. To this aim, we implemented VulSATD, a deep learner that detects vulnerable and SATD code based on CodeBERT, a pre-trained transformers model. We evaluated VulSATD on MADE-WIC, a fused dataset of functions annotated for TD (through SATD) and vulnerability. We compared the results using single and multi-task…

Code & Models

Repositories

moritzmock/multitask-vulberability-detection
tf · Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications