Are We Ready to Embrace Generative AI for Software Q&A?
Bowen Xu, Thanh-Dat Nguyen, Thanh Le-Cong, Thong Hoang, Jiakun Liu, Kisub Kim, Chen Gong, Changan Niu, Chenyu Wang, Bach Le, David Lo

TL;DR
This paper compares the quality of ChatGPT-generated and human-written answers on Stack Overflow. The two are semantically similar, but human-written answers score consistently higher across multiple aspects, by about 10% on the overall score, a gap that bears on trust in answers on the platform.
Contribution
It provides a comparative analysis of ChatGPT and human answers in a real-world software Q&A context, highlighting current limitations of generative AI in this domain.
Findings
ChatGPT answers are semantically similar to human answers.
Human answers consistently outperform ChatGPT answers across multiple aspects, by 10% on the overall score.
Generative AI currently has limitations in quality for software Q&A.
Abstract
Stack Overflow, the world's largest software Q&A (SQA) website, is facing a significant traffic drop due to the emergence of generative AI techniques. Stack Overflow banned ChatGPT only 6 days after its release. The main reason given in Stack Overflow's official statement is that the answers generated by ChatGPT are of low quality. To verify this, we conduct a comparative evaluation of human-written and ChatGPT-generated answers. Our methodology employs both automatic comparison and a manual study. Our results suggest that human-written and ChatGPT-generated answers are semantically similar; however, human-written answers consistently outperform ChatGPT-generated ones across multiple aspects, specifically by 10% on the overall score. We release the data, analysis scripts, and detailed results at https://anonymous.4open.science/r/GAI4SQA-FD5C.
Taxonomy
Topics: Topic Modeling · Software Engineering Research · Artificial Intelligence in Healthcare and Education
