SOGPTSpotter: Detecting ChatGPT-Generated Answers on Stack Overflow
Suyu Ma, Chunyang Chen, Hourieh Khalajzadeh, John Grundy

TL;DR
SOGPTSpotter is a novel Siamese Neural Network-based method that effectively detects ChatGPT-generated answers on Stack Overflow, outperforming existing baselines and aiding moderation efforts.
Contribution
Introduces SOGPTSpotter, a Siamese Neural Network approach utilizing BigBird and Triplet loss for accurate detection of AI-generated answers on Stack Overflow.
Findings
Outperforms baseline detectors such as GPTZero, DetectGPT, GLTR, BERT, RoBERTa, and GPT-2.
Effective in real-world moderation by identifying ChatGPT-suspected answers.
Robust against adversarial attacks and generalizes across domains.
Abstract
Stack Overflow is a popular Q&A platform where users ask technical questions and receive answers from a community of experts. Recently, there has been a significant increase in the number of answers generated by ChatGPT, which can lead to incorrect and unreliable information being posted on the site. While Stack Overflow has banned such AI-generated content, detecting whether a post is ChatGPT-generated remains a challenging task. We introduce a novel approach, SOGPTSpotter, that employs Siamese Neural Networks, leveraging the BigBird model and the Triplet loss, to detect ChatGPT-generated answers on Stack Overflow. We use triplets of human answers, reference answers, and ChatGPT answers. Our empirical evaluation reveals that our approach outperforms well-established baselines like GPTZero, DetectGPT, GLTR, BERT, RoBERTa, and GPT-2 in identifying ChatGPT-synthesized Stack Overflow…
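The Triplet loss named in the abstract can be illustrated with a minimal sketch. It uses the standard margin-based formulation, max(0, d(anchor, positive) - d(anchor, negative) + margin), with Euclidean distance. The role assignment (reference answer as anchor, ChatGPT answer as positive, human answer as negative), the margin value, and the toy 2-D vectors are assumptions made for illustration; the abstract does not specify them. In the described approach the embeddings would come from a shared BigBird encoder rather than hand-written lists.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss.

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`. The margin of 1.0 is an assumed default,
    not a value taken from the paper.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (hypothetical): in practice these would be encoder outputs.
reference_answer = [1.0, 0.0]  # assumed anchor
chatgpt_answer = [1.0, 0.0]    # assumed positive, close to the anchor
human_answer = [4.0, 4.0]      # assumed negative, far from the anchor

well_separated = triplet_loss(reference_answer, chatgpt_answer, human_answer)
badly_separated = triplet_loss(reference_answer, human_answer, chatgpt_answer)
print(well_separated, badly_separated)
```

A triplet that is already well separated contributes no loss, while a triplet whose positive and negative are effectively swapped is penalized by the distance gap plus the margin. Minimizing this quantity pushes the encoder to place ChatGPT-style answers near the reference answer and human answers farther away.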
Taxonomy
Topics: Expert finding and Q&A systems · Topic Modeling · Academic integrity and plagiarism
