Semantic Code Browsing
Isabel Garcia-Contreras, Jose F. Morales, Manuel V. Hermenegildo

TL;DR
This paper introduces a semantic code browsing method that uses static analysis and an assertion-based query language to find code based on semantic properties, improving accuracy and flexibility over traditional syntactic search.
Contribution
It presents a fully automatic, semantics-based code search approach using static analysis and a novel query language, without relying on documentation or annotations.
Findings
A prototype implementation within the Ciao system demonstrates the effectiveness of the approach.
Semantic search is more resilient to syntactic differences than syntactic search.
The approach is more powerful and flexible than signature matching.
Abstract
Programmers currently enjoy access to a very high number of code repositories and libraries of ever increasing size. The ensuing potential for reuse is however hampered by the fact that searching within all this code becomes an increasingly difficult task. Most code search engines are based on syntactic techniques such as signature matching or keyword extraction. However, these techniques are inaccurate (because they basically rely on documentation) and at the same time do not offer very expressive code query languages. We propose a novel approach that focuses on querying for semantic characteristics of code obtained automatically from the code itself. Program units are pre-processed using static analysis techniques, based on abstract interpretation, obtaining safe semantic approximations. A novel, assertion-based code query language is used to express desired semantic characteristics…
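The idea in the abstract, matching a query against semantic properties inferred from the code rather than against names or documentation, can be illustrated with a minimal sketch. This is not the paper's implementation and not the Ciao assertion syntax. The toy property lattice (`PARENT`), the `Summary` and `Query` classes, and the hand-written analysis results in `SUMMARIES` are all hypothetical stand-ins for what a real abstract-interpretation-based analyzer would infer automatically.

```python
from dataclasses import dataclass

# Hypothetical toy abstract domain: each property points to a more general one.
PARENT = {"int": "num", "flt": "num", "num": "term", "list": "term", "term": None}


def leq(a, b):
    """Return True if property a is at least as precise as b (a implies b)."""
    while a is not None:
        if a == b:
            return True
        a = PARENT[a]
    return False


@dataclass(frozen=True)
class Summary:
    """Stand-in for an analysis result: per-argument call and success properties."""
    name: str
    calls: tuple
    success: tuple


@dataclass(frozen=True)
class Query:
    """Assertion-like query: 'when called like this, succeeds like that'."""
    calls: tuple
    success: tuple


def matches(summary, query):
    """A summary answers a query if it covers the queried calls and its
    inferred success properties imply the queried ones."""
    if len(summary.calls) != len(query.calls):
        return False
    calls_covered = all(leq(q, s) for q, s in zip(query.calls, summary.calls))
    success_implied = all(leq(s, q) for s, q in zip(summary.success, query.success))
    return calls_covered and success_implied


def search(summaries, query):
    """Return the names of all program units whose summary matches the query."""
    return [s.name for s in summaries if matches(s, query)]


# Hand-written summaries (assumed, not produced by a real analyzer).
SUMMARIES = [
    Summary("length", ("list", "term"), ("list", "int")),
    Summary("sum_elems", ("list", "term"), ("list", "num")),
    Summary("reverse", ("list", "term"), ("list", "list")),
    Summary("succ", ("int", "term"), ("int", "int")),
]

# "Arity-2 units that, called with a list, succeed with a number in argument 2."
query = Query(calls=("list", "term"), success=("list", "num"))
print(search(SUMMARIES, query))
```

In this sketch the query never mentions a predicate name, so `length` and `sum_elems` are both found even though they share no identifiers, which is the kind of resilience to syntactic differences the summary describes.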
Taxonomy
Topics: Software Engineering Research · Web Data Mining and Analysis · Software System Performance and Reliability
