Benchmarking Large Language Models on Homework Assessment in Circuit Analysis

Liangliang Chen; Zhihao Qin; Yiming Guo; Jacqueline Rohde; Ying Zhang

arXiv:2506.06390·cs.CY·June 10, 2025

TL;DR

This paper benchmarks GPT-3.5 Turbo, GPT-4o, and Llama 3 70B on assessing undergraduate circuit analysis homework, and reports their strengths, their limitations, and their potential for educational use.

Contribution

The paper introduces a dataset of official reference solutions and real student solutions, together with a prompt-based evaluation framework for assessing LLMs in engineering education, and provides benchmark results to guide future development.

Findings

1. GPT-4o and Llama 3 70B outperform GPT-3.5 Turbo across all metrics.

2. Different models show distinct strengths in different aspects of solution evaluation.

3. Current LLMs have limitations in reliably assessing circuit analysis homework.

Abstract

Large language models (LLMs) have the potential to revolutionize various fields, including code development, robotics, finance, and education, due to their extensive prior knowledge and rapid advancements. This paper investigates how LLMs can be leveraged in engineering education. Specifically, we benchmark the capabilities of different LLMs, including GPT-3.5 Turbo, GPT-4o, and Llama 3 70B, in assessing homework for an undergraduate-level circuit analysis course. We have developed a novel dataset consisting of official reference solutions and real student solutions to problems from various topics in circuit analysis. To overcome the limitations of image recognition in current state-of-the-art LLMs, the solutions in the dataset are converted to LaTeX format. Using this dataset, a prompt template is designed to test five metrics of student solutions: completeness, method, final answer,…
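
The prompt-template idea described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' template: the wording of the prompt, the function name `build_prompt`, and the constant `NAMED_METRICS` are all hypothetical. The abstract is truncated after the third metric, so only the three metrics it names are listed and the remaining two are left for the caller to supply.

```python
from string import Template

# Only the three metrics named in the (truncated) abstract.
# The remaining two are elided in the source and are NOT guessed here.
NAMED_METRICS = ("completeness", "method", "final answer")

# Hypothetical prompt wording; the paper's actual template is not shown in the source.
PROMPT = Template(
    "You are grading homework for an undergraduate circuit analysis course.\n"
    "Reference solution (LaTeX):\n$reference\n\n"
    "Student solution (LaTeX):\n$student\n\n"
    "Assess the student solution on each of the following metrics:\n$metrics\n"
)


def build_prompt(reference_tex, student_tex, metrics=NAMED_METRICS):
    """Fill the template with two LaTeX solutions and a bulleted metric list."""
    metric_lines = "\n".join(f"- {m}" for m in metrics)
    return PROMPT.substitute(
        reference=reference_tex, student=student_tex, metrics=metric_lines
    )


if __name__ == "__main__":
    # Toy Ohm's-law example, made up for illustration only.
    print(build_prompt(r"$V = IR = 2 \cdot 5 = 10\,\mathrm{V}$", r"$V = 10$ V"))
```

Passing the solutions as LaTeX strings mirrors the abstract's point that the dataset is converted to LaTeX to sidestep image-recognition limits. The resulting string would then be sent to whichever model is being benchmarked.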

Taxonomy

Topics: Machine Learning in Materials Science · Innovative Teaching and Learning Methods · Career Development and Diversity

Methods: Cosine Annealing · Layer Normalization · Linear Warmup With Cosine Annealing · Attention Dropout · Byte Pair Encoding · Softmax · Dropout · Dense Connections