Evaluating the Performance of Large Language Models for Spanish Language in Undergraduate Admissions Exams
Sabino Miranda, Obdulia Pichardo-Lagunas, Bella Martínez-Seis, Pierre Baldi

TL;DR
This study assesses GPT-3.5 and BARD on Mexican undergraduate admissions exams, showing that both models perform well, with GPT-3.5 marginally outperforming BARD overall.
Contribution
First comprehensive evaluation of large language models' performance on real-world university entrance exams in Spanish.
Findings
GPT-3.5 scored marginally higher overall than BARD (60.94% vs. 60.42%).
Both models exceeded minimum acceptance scores.
GPT-3.5 outperformed BARD in Mathematics and Physics, while BARD did better in History and factual questions.
Abstract
This study evaluates the performance of large language models, specifically GPT-3.5 and BARD (supported by the Gemini Pro model), on the undergraduate admissions exams of the National Polytechnic Institute in Mexico. The exams cover Engineering/Mathematical and Physical Sciences, Biological and Medical Sciences, and Social and Administrative Sciences. Both models demonstrated proficiency, exceeding the minimum acceptance scores for respective academic programs to up to 75% for some academic programs. GPT-3.5 outperformed BARD in Mathematics and Physics, while BARD performed better in History and questions related to factual information. Overall, GPT-3.5 marginally surpassed BARD, with scores of 60.94% and 60.42%, respectively.
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Scientific Research and Technology · Oil and Gas Production Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Attention Dropout · Linear Layer · Adam · Dense Connections · Linear Warmup With Cosine Annealing · Weight Decay
