EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark
Ming Li, Jike Zhong, Tianle Chen, Yuxiang Lai, Konstantinos Psounis

TL;DR
EEE-Bench is a comprehensive multimodal benchmark designed to evaluate large language and multimodal models' abilities to solve complex, real-world electrical and electronics engineering problems involving intricate visual and textual information.
Contribution
The paper introduces EEE-Bench, a new multimodal benchmark with 2860 engineering problems, and provides extensive evaluation revealing current models' limitations in practical engineering tasks.
Findings
Current models perform poorly on EEE-Bench, with accuracy ranging from 19.48% to 46.78%.
Models tend to rely on text and overlook visual information, revealing a 'laziness' shortcoming.
EEE-Bench exposes critical gaps in models' ability to handle complex, multimodal engineering problems.
Abstract
Recent studies on large language models (LLMs) and large multimodal models (LMMs) have demonstrated promising skills in various domains including science and mathematics. However, their capability in more challenging, real-world scenarios like engineering has not been systematically studied. To bridge this gap, we propose EEE-Bench, a multimodal benchmark aimed at assessing LMMs' capabilities in solving practical engineering tasks, using electrical and electronics engineering (EEE) as the testbed. Our benchmark consists of 2860 carefully curated problems spanning 10 essential subdomains such as analog circuits and control systems. Compared to benchmarks in other domains, engineering problems are intrinsically 1) more visually complex and versatile and 2) less deterministic in solutions. Successful solutions to these problems often demand more-than-usual rigorous integration…
Taxonomy
Topics: Power Systems and Technologies
