TL;DR
CroCS is a meta-learning-based transfer approach for domain-specific code search. It adapts models pre-trained on common programming languages to languages with limited data, and it outperforms direct fine-tuning.
Contribution
The paper presents CroCS, a transfer learning framework for domain-specific code search that uses the meta-learning algorithm MAML to adapt pre-trained models to languages with scarce data.
Findings
CroCS significantly outperforms traditional fine-tuning on domain-specific languages.
It is especially effective when training data is scarce.
Experiments on SQL and Solidity show CroCS outperforming direct fine-tuning.
Abstract
Recently, pre-trained programming language models such as CodeBERT have demonstrated substantial gains in code search. Despite showing great performance, they rely on the availability of large amounts of parallel data to fine-tune the semantic mappings between queries and code. This restricts their practicality in domain-specific languages with relatively scarce and expensive data. In this paper, we propose CroCS, a novel approach for domain-specific code search. CroCS employs a transfer learning framework where an initial program representation model is pre-trained on a large corpus of common programming languages (such as Java and Python) and is further adapted to domain-specific languages such as SQL and Solidity. Unlike cross-language CodeBERT, which is directly fine-tuned in the target language, CroCS adapts a few-shot meta-learning algorithm called MAML to learn the good…
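The abstract names MAML but is cut off before describing how CroCS applies it. As a rough illustration of the generic MAML update only (not the CroCS implementation), the sketch below uses a toy one-parameter regression model in place of a code search encoder. The function names (`inner_adapt`, `maml_meta_gradient`, `meta_train`), the toy tasks, and the hyperparameters are all assumptions made for this example.

```python
import random

def mse(w, xs, ys):
    """Mean squared error of the toy model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w, xs, ys):
    """Derivative of mse with respect to w."""
    return 2.0 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

def inner_adapt(w, xs, ys, alpha):
    """Inner loop: one gradient step on a task's support set."""
    return w - alpha * grad(w, xs, ys)

def maml_meta_gradient(w, support, query, alpha):
    """Derivative of the query loss after adaptation, taken with respect
    to the initial w. The chain rule through the inner step gives the
    second-order factor (1 - alpha * hessian)."""
    xs_s, ys_s = support
    xs_q, ys_q = query
    w_adapted = inner_adapt(w, xs_s, ys_s, alpha)
    hessian = 2.0 * sum(x * x for x in xs_s) / len(xs_s)
    return grad(w_adapted, xs_q, ys_q) * (1.0 - alpha * hessian)

def sample_task(rng, slope, n=5):
    """Hypothetical task: noise-free points on y = slope * x, split into
    a support set and a query set of n points each."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(2 * n)]
    ys = [slope * x for x in xs]
    return (xs[:n], ys[:n]), (xs[n:], ys[n:])

def meta_train(slopes, steps=500, alpha=0.1, beta=0.05, seed=0):
    """Outer loop: average the meta-gradient over tasks and update the
    shared initialization w."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for slope in slopes:
            support, query = sample_task(rng, slope)
            meta_grad += maml_meta_gradient(w, support, query, alpha)
        w -= beta * meta_grad / len(slopes)
    return w

if __name__ == "__main__":
    w0 = meta_train([2.0, 3.0, 4.0])
    print("meta-learned initialization:", w0)
```

In this toy setting the meta-learned `w` settles near the middle of the training slopes, so one inner step on a few support examples moves it toward any single task. In a code search setting the scalar `w` would correspond to the encoder's parameters and each task to a batch of query-code pairs, but that mapping is an analogy rather than a description of CroCS.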
