Improved Retrieval of Programming Solutions With Code Examples Using a Multi-featured Score
Rodrigo F. Silva, M. Masudur Rahman, Carlos Eduardo Dantas, Chanchal K. Roy, Foutse Khomh, Marcelo A. Maia

TL;DR
This paper introduces CRAR, a multi-featured approach combining semantic, social, and information retrieval techniques to improve the retrieval of relevant programming solutions with code examples and explanations from Stack Overflow.
Contribution
CRAR integrates multiple features including semantic embeddings and social metrics to enhance retrieval accuracy over existing systems like CROKAGE.
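To make the idea of a multi-featured score concrete, the following is a minimal sketch, not CRAR's actual scoring function. It assumes each candidate answer already has per-feature scores, min-max normalizes each feature, and combines them with a weighted sum. The feature names (`semantic`, `social`), the weights, and the helper names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant feature contributes 0 everywhere."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_scores(
    features: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted sum of normalized features, one total per candidate answer."""
    n = len(next(iter(features.values())))
    totals = [0.0] * n
    for name, values in features.items():
        weight = weights.get(name, 0.0)  # unknown features are ignored
        for i, v in enumerate(min_max_normalize(values)):
            totals[i] += weight * v
    return totals


def rank_answers(
    answer_ids: Sequence[str],
    features: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[str]:
    """Return answer ids ordered from highest to lowest combined score."""
    scores = combined_scores(features, weights)
    ordered = sorted(zip(answer_ids, scores), key=lambda p: p[1], reverse=True)
    return [answer_id for answer_id, _ in ordered]


# Hypothetical data: semantic similarity to the query and raw vote counts.
ranking = rank_answers(
    ["a1", "a2", "a3"],
    {"semantic": [0.2, 0.9, 0.5], "social": [10, 3, 50]},
    {"semantic": 0.7, "social": 0.3},
)
```

Normalizing before weighting keeps a feature with a large raw range, such as vote counts, from dominating a bounded one, such as a similarity score.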
Findings
CRAR outperforms CROKAGE in Mean Reciprocal Rank.
Combining diverse features improves retrieval effectiveness.
Semantic and social features significantly enhance answer relevance.
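For readers unfamiliar with the metric named in the findings, Mean Reciprocal Rank averages, over all queries, the reciprocal of the rank at which the first relevant result appears. The sketch below uses the standard definition; the function names and the toy data are illustrative and are not taken from the paper's evaluation.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over (ranked list, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


# Toy example: first relevant hit at rank 2, at rank 1, and never.
example_mrr = mean_reciprocal_rank([
    (["a", "b", "c"], {"b"}),
    (["x", "y"], {"x"}),
    (["p"], {"q"}),
])
```

A higher value means relevant answers tend to appear nearer the top of the result list, which is the sense in which one retrieval approach outperforms another on this metric.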
Abstract
Developers often depend on code search engines to obtain solutions for their programming tasks. However, finding an expected solution containing code examples along with their explanations is challenging due to several issues. There is a vocabulary mismatch between the search keywords (the query) and the appropriate solutions. The semantic gap may widen for similar bags of words due to antonyms and negation. Moreover, documents retrieved by search engines might not contain both code examples and their explanations. So, we propose CRAR (Crowd Answer Recommender) to circumvent those issues, aiming to improve the retrieval of relevant answers from Stack Overflow that contain not only the expected code examples for the given task but also their explanations. Given a programming task, we investigate the effectiveness of combining information retrieval techniques along with a…
