Using Large Language Model to Solve and Explain Physics Word Problems Approaching Human Level
Jingzhe Ding, Yan Cen, Xinyuan Wei

TL;DR
This paper demonstrates that large language models like GPT-3.5 can solve, explain, and generate physics word problems with performance nearing human levels, using a new dataset and prompting techniques.
Contribution
It introduces PhysQA, the first annotated physics word problem dataset, and shows that GPT-3.5 can solve and explain physics problems with high accuracy using zero-shot and few-shot learning.
Findings
GPT-3.5 solves 49.3% of problems zero-shot
GPT-3.5 solves 73.2% of problems few-shot
LLMs can generate explanations and new physics problems
Abstract
Our work demonstrates that a large language model (LLM) pre-trained on text can solve not only pure math word problems but also physics word problems, whose solutions require calculation and inference based on prior physical knowledge. We collect and annotate the first physics word problem dataset, PhysQA, which contains over 1000 junior high school physics word problems (covering Kinematics, Mass & Density, Mechanics, Heat, and Electricity). We then use OpenAI's GPT-3.5 to generate answers to these problems and find that GPT-3.5 can automatically solve 49.3% of the problems through zero-shot learning and 73.2% through few-shot learning. This result demonstrates that, by using similar problems and their answers as the prompt, an LLM can solve elementary physics word problems at a level approaching human performance. In addition to solving problems, GPT-3.5 can also summarize the knowledge or topics…
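The abstract describes few-shot prompting as using similar problems and their answers as the prompt. The sketch below illustrates that idea only and is not the authors' pipeline: the token-overlap (Jaccard) similarity, the `Question:`/`Answer:` prompt layout, the function names, and the three solved problems are all assumptions made for illustration, and none of them come from PhysQA. The resulting string would be sent to a language model, a step that is omitted here.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Token-overlap similarity between two strings (assumed measure)."""
    ta, tb = tokenize(a), tokenize(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def build_few_shot_prompt(question, solved, k=2):
    """Prepend the k most similar solved problems to the new question."""
    ranked = sorted(
        solved,
        key=lambda ex: jaccard(question, ex["question"]),
        reverse=True,
    )
    parts = [
        f"Question: {ex['question']}\nAnswer: {ex['answer']}"
        for ex in ranked[:k]
    ]
    # The final block leaves the answer open for the model to complete.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)

# Hypothetical solved problems, invented for this example.
SOLVED = [
    {
        "question": "A car travels 120 m in 10 s at constant speed. What is its speed?",
        "answer": "v = s / t = 120 m / 10 s = 12 m/s",
    },
    {
        "question": "A block has a mass of 200 g and a volume of 100 cm3. What is its density?",
        "answer": "rho = m / V = 200 g / 100 cm3 = 2 g/cm3",
    },
    {
        "question": "A resistor of 10 ohm carries a current of 2 A. What is the voltage across it?",
        "answer": "U = I * R = 2 A * 10 ohm = 20 V",
    },
]

prompt = build_few_shot_prompt(
    "A cyclist travels 300 m in 60 s at constant speed. What is the speed?",
    SOLVED,
    k=1,
)
print(prompt)
```

With `k=1`, the kinematics problem is selected because it shares the most tokens with the new question. A zero-shot prompt corresponds to `k=0`, where only the open question is sent.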
Taxonomy
Topics: Topic Modeling · Educational Assessment and Pedagogy
