Differentiate ChatGPT-generated and Human-written Medical Texts
Wenxiong Liao, Zhengliang Liu, Haixing Dai, Shaochen Xu, Zihao Wu, Yiyang Zhang, Xiaoke Huang, Dajiang Zhu, Hongmin Cai, Tianming Liu, Xiang Li

TL;DR
This study analyzes linguistic differences between human-written and ChatGPT-generated medical texts and develops machine learning methods that identify ChatGPT-generated medical content, with the aim of keeping medical information safe and reliable.
Contribution
The paper introduces a comprehensive dataset and a BERT-based detection model that achieves an F1 score above 95% in distinguishing ChatGPT-generated from human-written medical texts.
Findings
Human-written texts are more concrete and diverse.
ChatGPT-generated texts favor fluency and general terms.
The BERT-based model achieves high detection accuracy.
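For readers unfamiliar with the metric behind the detection result, the following is a minimal sketch of how an F1 score is computed for a binary detector. The labels and predictions are hypothetical toy values, not data from the paper, and the function name `f1_score` is chosen here for illustration only.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: 1 = ChatGPT-generated, 0 = human-written.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 combines precision and recall, a detector cannot score well by flagging everything as ChatGPT-generated or by flagging almost nothing.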
Abstract
Background: Large language models such as ChatGPT are capable of generating grammatically perfect and human-like text content, and a large number of ChatGPT-generated texts have appeared on the Internet. However, medical texts such as clinical notes and diagnoses require rigorous validation, and erroneous medical content generated by ChatGPT could potentially lead to disinformation that poses significant harm to healthcare and the general public.
Objective: This research is among the first studies on responsible and ethical AIGC (Artificial Intelligence Generated Content) in medicine. We focus on analyzing the differences between medical texts written by human experts and generated by ChatGPT, and designing machine learning workflows to effectively detect and differentiate medical texts generated by ChatGPT.
Methods: We first construct a suite of datasets containing medical texts…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling
