Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao

TL;DR
This study evaluates ChatGPT's understanding capabilities using the GLUE benchmark, comparing it with fine-tuned BERT models, and explores how prompting strategies can enhance its performance.
Contribution
It provides a quantitative analysis of ChatGPT's understanding ability across multiple NLP tasks and compares it with BERT models, highlighting strengths and weaknesses.
Findings
ChatGPT struggles with paraphrase and similarity tasks.
ChatGPT outperforms BERT on inference tasks.
Prompting strategies can improve ChatGPT's understanding.
Abstract
Recently, ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries. Several prior studies have shown that ChatGPT attains remarkable generation ability compared with existing models. However, the quantitative analysis of ChatGPT's understanding ability has been given little attention. In this report, we explore the understanding ability of ChatGPT by evaluating it on the most popular GLUE benchmark, and comparing it with 4 representative fine-tuned BERT-style models. We find that: 1) ChatGPT falls short in handling paraphrase and similarity tasks; 2) ChatGPT outperforms all BERT models on inference tasks by a large margin; 3) ChatGPT achieves comparable performance compared with BERT on sentiment analysis and question-answering tasks. Additionally, by combining some advanced prompting strategies, we show that the understanding…
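The comparison described in the abstract amounts to scoring two systems on the same labeled tasks and looking at the per-task gap. The following is a minimal sketch of that idea, not the authors' evaluation code. The function names, task names, and toy data are all hypothetical, and plain accuracy is an assumption made for simplicity: some GLUE tasks are officially scored with other metrics, such as Matthews correlation for CoLA and Pearson/Spearman correlation for STS-B.

```python
from typing import Dict, List


def accuracy(predictions: List[str], labels: List[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot score an empty task")
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)


def compare_systems(
    gold: Dict[str, List[str]],
    system_a: Dict[str, List[str]],
    system_b: Dict[str, List[str]],
) -> Dict[str, float]:
    """Per-task accuracy difference (system_a minus system_b)."""
    return {
        task: accuracy(system_a[task], labels) - accuracy(system_b[task], labels)
        for task, labels in gold.items()
    }


# Toy, made-up data purely for illustration; these are NOT results from the paper.
gold = {
    "inference": ["entailment", "not_entailment", "entailment", "entailment"],
    "paraphrase": ["yes", "no", "yes", "no"],
}
system_a = {
    "inference": ["entailment", "not_entailment", "entailment", "entailment"],
    "paraphrase": ["yes", "yes", "yes", "yes"],
}
system_b = {
    "inference": ["entailment", "entailment", "entailment", "not_entailment"],
    "paraphrase": ["yes", "no", "yes", "yes"],
}

deltas = compare_systems(gold, system_a, system_b)
for task, delta in deltas.items():
    print(f"{task}: {delta:+.2f}")
```

A positive value means the first system scores higher on that task and a negative value means the second does, which is the shape of claim the findings make (stronger on inference, weaker on paraphrase).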
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Natural Language Processing Techniques
Methods: Weight Decay · Dropout · Linear Warmup With Linear Decay · Attention Dropout · Attention Is All You Need · Residual Connection · Layer Normalization · WordPiece · Softmax
