Trusting Language Models in Education

Jogi Suda Neto; Li Deng; Thejaswi Raya; Reza Shahbazi; Nick Liu; Adhitya Venkatesh; Miral Shah; Neeru Khosla; Rodrigo Capobianco Guido

arXiv:2308.03866·cs.CL·August 9, 2023

TL;DR

This paper proposes a method to calibrate confidence scores of language models in education by using an XGBoost model on attention-based features to improve the reliability of model predictions.

Contribution

It introduces a novel approach combining XGBoost with attention features from BERT to better calibrate language model confidence in educational settings.

Findings

01. Improved calibration of language model confidence scores.

02. Attention-based features correlate with response quality.

03. Enhanced reliability of language models in educational applications.
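
Calibration quality is often summarized with the expected calibration error (ECE), which compares average confidence with observed accuracy inside confidence bins. This page does not state which metric the authors report, so the sketch below is only an illustration of how "improved calibration" could be checked. The function name, the equal-width binning, and the default of 10 bins are assumptions, not details from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between confidence and accuracy over equal-width bins.

    confidences: predicted probabilities in [0, 1], one per answer.
    correct: truthy/falsy flags saying whether each answer was right.
    The binning scheme is an illustrative assumption.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")

    # Group predictions by confidence bin; a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, bool(ok)))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes in proportion to the share of predictions it holds.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A lower value means the reported confidence tracks the actual rate of correct answers more closely. Comparing the value before and after calibration on held-out questions would be one way to quantify the first finding.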

Abstract

Language models are widely used in education. Although modern deep learning models perform very well on question-answering tasks, they sometimes make errors. To avoid misleading students with wrong answers, it is important to calibrate the confidence (that is, the prediction probability) of these models. In our work, we propose using an XGBoost model on top of BERT to output corrected probabilities, using features based on the attention mechanism. Our hypothesis is that the level of uncertainty contained in the flow of attention is related to the quality of the model's response itself.
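
The pipeline in the abstract can be sketched roughly as follows: summarize the uncertainty of each attention head, then train a gradient-boosted classifier on those summaries to predict whether the answer was correct. The abstract does not specify the exact features, so using the Shannon entropy of each attention row is an assumption here, as are all function names, the nested-list input layout `attentions[layer][head][query][key]`, and the hyperparameters. Treat this as a minimal sketch, not the authors' implementation.

```python
import math


def row_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution over keys."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def attention_entropy_features(attentions):
    """Return one feature per (layer, head): the mean entropy of its attention rows.

    attentions is a nested list indexed as [layer][head][query][key], where each
    innermost row sums to 1. Entropy as the uncertainty measure is an assumption.
    """
    features = []
    for layer in attentions:
        for head in layer:
            entropies = [row_entropy(row) for row in head]
            features.append(sum(entropies) / len(entropies))
    return features


def fit_calibrator(feature_rows, is_correct):
    """Fit an XGBoost classifier that predicts answer correctness from the features.

    Requires the third-party xgboost and numpy packages; the hyperparameters
    below are placeholders, not values from the paper.
    """
    import numpy as np
    import xgboost as xgb

    model = xgb.XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
    model.fit(np.asarray(feature_rows), np.asarray(is_correct))
    return model


def calibrated_confidence(model, features):
    """Probability, according to the fitted classifier, that the answer is correct."""
    import numpy as np

    return float(model.predict_proba(np.asarray([features]))[0, 1])
```

In practice the attention weights would come from a BERT question-answering model (for example, by requesting attention outputs from the model and converting them to nested lists), and `is_correct` would be 0/1 labels from a held-out set of graded answers. The classifier's output probability then stands in for the raw prediction probability when deciding whether to show an answer to a student.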

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Dense Connections · Residual Connection · Dropout · WordPiece · Weight Decay