Evaluating the Symbol Binding Ability of Large Language Models for Multiple-Choice Questions in Vietnamese General Education
Duc-Vu Nguyen, Quoc-Nam Nguyen

TL;DR
This study assesses large language models' ability to correctly bind symbols in Vietnamese multiple-choice questions across various settings, introducing a new LaTeX-typed dataset for evaluation.
Contribution
It introduces a novel LaTeX-formatted dataset for evaluating the symbol binding ability of LLMs on Vietnamese multiple-choice questions and benchmarks six models on this task.
Findings
GPT-4 outperforms other models in symbol binding accuracy.
The new dataset provides a high-quality resource for Vietnamese NLP evaluation.
LLMs show promising results but still have room for improvement.
Abstract
In this paper, we evaluate the ability of large language models (LLMs) to perform multiple-choice symbol binding (MCSB) for multiple-choice question answering (MCQA) tasks in zero-shot, one-shot, and few-shot settings. We focus on Vietnamese, which has fewer challenging MCQA datasets than English. The two existing datasets, ViMMRC 1.0 and ViMMRC 2.0, focus on literature. Recent research in Vietnamese natural language processing (NLP) has used the Vietnamese National High School Graduation Examination (VNHSGE) from 2019 to 2023 to evaluate ChatGPT. However, these studies have mainly examined how ChatGPT solves the VNHSGE step by step. We aim to create a novel and high-quality dataset by providing structured guidelines for typing LaTeX formulas for mathematics, physics, chemistry, and biology. This dataset can be used to evaluate the MCSB ability of LLMs and smaller language models…
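As a rough illustration of what the zero-shot, one-shot, and few-shot MCSB settings involve, the sketch below builds a k-shot prompt and then binds the model's output to one answer symbol. It is not the authors' evaluation code: the prompt template, the names `format_item`, `build_prompt`, and `pick_symbol`, and the assumption that a per-symbol score (for example a log-probability) is available from the model are all hypothetical.

```python
from typing import Dict, Sequence, Tuple

# Assumed answer symbols; the dataset's actual option labels may differ.
SYMBOLS = ("A", "B", "C", "D")

Shot = Tuple[str, Sequence[str], str]  # (question, options, gold symbol)


def format_item(question: str, options: Sequence[str], answer: str = "") -> str:
    """Render one question; leave the answer slot empty for the target item."""
    lines = [f"Question: {question}"]
    lines += [f"{sym}. {opt}" for sym, opt in zip(SYMBOLS, options)]
    lines.append(f"Answer: {answer}".rstrip())
    return "\n".join(lines)


def build_prompt(question: str, options: Sequence[str],
                 shots: Sequence[Shot] = ()) -> str:
    """Zero-shot when `shots` is empty, one-shot or few-shot otherwise."""
    blocks = [format_item(q, opts, ans) for q, opts, ans in shots]
    blocks.append(format_item(question, options))
    return "\n\n".join(blocks)


def pick_symbol(symbol_scores: Dict[str, float]) -> str:
    """Bind the prediction to the highest-scoring answer symbol."""
    return max(SYMBOLS, key=lambda s: symbol_scores.get(s, float("-inf")))


# One-shot example with LaTeX-typed questions (made-up items, not dataset rows).
prompt = build_prompt(
    "$1 + 1 = ?$", ["1", "2", "3", "4"],
    shots=[("$2 + 2 = ?$", ["4", "5", "6", "7"], "A")],
)
# Hypothetical per-symbol scores returned by some model for the prompt above.
prediction = pick_symbol({"A": -2.0, "B": -0.1, "C": -3.0, "D": -5.0})
```

Scoring only the answer symbols, rather than parsing free-form generated text, is one common way to measure symbol binding; whether the paper uses exactly this procedure is not stated in the excerpt above.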
Taxonomy
Topics: Topic Modeling · Educational Assessment and Pedagogy · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Dense Connections · Linear Layer · Residual Connection
