Doc2OracLL: Investigating the Impact of Documentation on LLM-based Test Oracle Generation
Soneya Binta Hossain, Raygan Taylor, Matthew Dwyer

TL;DR
This paper explores how Javadoc comments in Java code influence the effectiveness of automated test oracle generation, analyzing their role in improving oracle accuracy and bug detection in software testing.
Contribution
It provides a comprehensive analysis of the impact of Javadoc comments on test oracle generation, identifying key components that enhance oracle correctness and strength.
Findings
Javadoc comments significantly improve test oracle accuracy.
Certain Javadoc components are more influential in oracle generation.
Using Javadoc comments helps detect real bugs more effectively.
Abstract
Code documentation is a critical aspect of software development, serving as a bridge between human understanding and machine-readable code. Beyond assisting developers in understanding and maintaining code, documentation also plays a critical role in automating various software engineering tasks, such as test oracle generation (TOG). In Java, Javadoc comments provide structured, natural-language documentation embedded directly in the source code, typically detailing functionality, usage, parameters, return values, and exceptions. While prior research has utilized Javadoc comments in TOG, there has not been a thorough investigation into their impact when combined with other contextual information, nor into identifying the most relevant components for generating correct and strong test oracles, or understanding their role in detecting real bugs. In this study, we…
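To make the abstract's description of Javadoc components concrete, the following is a minimal illustrative sketch, not taken from the paper. The class `JavadocOracleSketch`, the method `divide`, and both oracle methods are hypothetical names introduced here. The sketch shows a method whose Javadoc carries a description plus `@param`, `@return`, and `@throws` tags, and two hand-written checks of the kind a test oracle generator might derive from those tags.

```java
public class JavadocOracleSketch {

    /**
     * Returns the integer quotient of {@code dividend} divided by {@code divisor}.
     *
     * @param dividend the number to be divided
     * @param divisor the number to divide by; must not be zero
     * @return the quotient, truncated toward zero
     * @throws IllegalArgumentException if {@code divisor} is zero
     */
    public static int divide(int dividend, int divisor) {
        if (divisor == 0) {
            throw new IllegalArgumentException("divisor must not be zero");
        }
        return dividend / divisor;
    }

    /** Hypothetical check derived from the @return tag: the result truncates toward zero. */
    public static boolean returnOracleHolds() {
        return divide(7, 2) == 3 && divide(-7, 2) == -3;
    }

    /** Hypothetical check derived from the @throws tag: a zero divisor must be rejected. */
    public static boolean exceptionOracleHolds() {
        try {
            divide(1, 0);
            return false; // no exception was thrown, so the documented behavior is violated
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("return oracle holds: " + returnOracleHolds());
        System.out.println("exception oracle holds: " + exceptionOracleHolds());
    }
}
```

In this sketch, each check is tied to one Javadoc tag. If the implementation drifted from its documentation (for example, by returning 0 for a zero divisor), the corresponding check would fail, which is the sense in which documentation-derived oracles can expose bugs.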
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Software System Performance and Reliability
