Pointing to Subwords for Generating Function Names in Source Code
Shogo Fujita, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

TL;DR
This paper introduces two strategies for copying low-frequency or out-of-vocabulary subwords from the input to improve automatic function name generation from source code. The best-performing model improves on the conventional method on the Java-small and Java-large datasets.
Contribution
It proposes two strategies for copying low-frequency or out-of-vocabulary subwords from the input, addressing a weakness of existing function name generators.
Findings
The best-performing model improves modified F1 and accuracy over the conventional method on the Java-small and Java-large datasets.
Copying from the input targets low-frequency and out-of-vocabulary subwords, which existing generators have difficulty producing.
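This summary does not define the paper's modified F1. For orientation only, the sketch below computes a conventional order-insensitive subtoken-level precision, recall, and F1 between a predicted and a reference function name. The helper names `split_subtokens` and `subtoken_f1` and the splitting rule are assumptions made for illustration, not the authors' evaluation code.

```python
import re
from collections import Counter
from typing import List, Tuple


def split_subtokens(name: str) -> List[str]:
    """Split a camelCase or snake_case identifier into lowercase subtokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def subtoken_f1(predicted: str, reference: str) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over subtoken multisets, ignoring order."""
    pred = Counter(split_subtokens(predicted))
    ref = Counter(split_subtokens(reference))
    # Multiset intersection counts each shared subtoken at most as often
    # as it appears in both names.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: the prediction has one extra subtoken ("user").
precision, recall, f1 = subtoken_f1("getUserName", "getName")
```

Scoring at the subtoken level gives partial credit to a name that is close to the reference, whereas exact-match accuracy counts it as wrong.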
Abstract
We tackle the task of automatically generating a function name from source code. Existing generators face difficulties in generating low-frequency or out-of-vocabulary subwords. In this paper, we propose two strategies for copying low-frequency or out-of-vocabulary subwords in inputs. Our best performing model showed an improvement over the conventional method in terms of our modified F1 and accuracy on the Java-small and Java-large datasets.
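The abstract does not spell out the two copying strategies. As a rough illustration of the general idea of copying subwords from the input, the sketch below follows the generic pointer-generator pattern: a distribution over a fixed vocabulary is blended with an attention distribution over the input subwords, so that a subword missing from the vocabulary can still be produced. The function name, the gate `p_gen`, and all numbers are hypothetical, and this is not presented as the authors' model.

```python
from typing import Dict, List


def mix_copy_distribution(
    gen_probs: Dict[str, float],
    input_subwords: List[str],
    attention: List[float],
    p_gen: float,
) -> Dict[str, float]:
    """Blend a vocabulary distribution with a copy distribution over the input.

    gen_probs:      probabilities over the fixed output vocabulary
    input_subwords: subwords appearing in the source code input
    attention:      attention weights over input_subwords (sums to 1)
    p_gen:          probability of generating from the vocabulary (assumed gate)
    """
    if len(input_subwords) != len(attention):
        raise ValueError("attention must have one weight per input subword")
    # Scale the vocabulary distribution by the generation gate.
    final = {tok: p_gen * p for tok, p in gen_probs.items()}
    # Add copy mass for each input position; out-of-vocabulary subwords
    # get a new entry instead of being folded into <unk>.
    for tok, weight in zip(input_subwords, attention):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * weight
    return final


# Hypothetical example: "kafka" is absent from the output vocabulary
# but appears in the input and receives most of the attention.
gen_probs = {"get": 0.6, "set": 0.3, "<unk>": 0.1}
input_subwords = ["kafka", "consumer", "get"]
attention = [0.8, 0.1, 0.1]
mixed = mix_copy_distribution(gen_probs, input_subwords, attention, p_gen=0.4)
best = max(mixed, key=mixed.get)
```

In this toy setup the out-of-vocabulary subword receives the highest probability because the copy path carries most of the mass, which is the behavior a copy mechanism is meant to enable for rare identifiers.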
Taxonomy
Topics: Software Engineering Research · Natural Language Processing Techniques · Topic Modeling
