A Preliminary Study of Large Language Models for Multilingual Vulnerability Detection
Junji Yu, Honglin Shu, Michael Fu, Dong Wang, Chakkrit Tantithamthavorn, Yasutaka Kamei, and Junjie Chen

TL;DR
This study evaluates pre-trained language models (PLMs) and large language models (LLMs) for detecting software vulnerabilities across seven programming languages. The PLM CodeT5P performs best, and the study discusses the future potential of LLMs for multilingual vulnerability detection.
Contribution
The paper provides an initial assessment of PLMs and LLMs for multilingual vulnerability detection and identifies CodeT5P as the most effective model among those tested.
Findings
CodeT5P achieves the best performance in multilingual vulnerability detection.
LLMs show promise in identifying critical vulnerabilities across languages.
This work offers insights for future research and practical deployment in multilingual security.
Abstract
Deep learning-based approaches, particularly those leveraging pre-trained language models (PLMs), have shown promise in automated software vulnerability detection. However, existing methods are predominantly limited to specific programming languages, restricting their applicability in multilingual settings. Recent advancements in large language models (LLMs) offer language-agnostic capabilities and enhanced semantic understanding, presenting a potential solution to this limitation. While existing studies have explored LLMs for vulnerability detection, their detection performance remains unknown for multilingual vulnerabilities. To address this gap, we conducted a preliminary study to evaluate the effectiveness of PLMs and state-of-the-art LLMs across seven popular programming languages. Our findings reveal that the PLM CodeT5P achieves the best performance in multilingual vulnerability…
Taxonomy
Topics: Software Engineering Research · Information and Cyber Security · Web Application Security Vulnerabilities
