A Survey on the Real Power of ChatGPT
Ming Liu, Ran Liu, Ye Zhu, Hua Wang, Youyang Qu, Rongsheng Li, Yongpan Sheng, Wray Buntine

TL;DR
This survey reviews recent research on ChatGPT's true performance across NLP tasks, discusses social and safety issues, and highlights challenges in evaluating this closed-source AI model.
Contribution
It provides a comprehensive overview of ChatGPT's actual capabilities, social implications, and key challenges in its evaluation process.
Findings
Surveyed studies that uncover ChatGPT's performance levels in seven categories of NLP tasks
Discussed social and safety implications of ChatGPT
Highlighted challenges in evaluating closed-source models
Abstract
ChatGPT has changed the AI community, and an active line of research is the evaluation of its performance. A key challenge for this evaluation is that ChatGPT remains closed-source and traditional benchmark datasets may have been used as its training data. In this paper, we (i) survey recent studies that uncover the real performance levels of ChatGPT in seven categories of NLP tasks, (ii) review the social implications and safety issues of ChatGPT, and (iii) emphasize key challenges and opportunities for its evaluation. We hope our survey sheds some light on its black-box nature, so that researchers are not misled by its surface generation.
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI · Artificial Intelligence in Healthcare
