Trusting Language Models in Education
Jogi Suda Neto, Li Deng, Thejaswi Raya, Reza Shahbazi, Nick Liu, Adhitya Venkatesh, Miral Shah, Neeru Khosla, Rodrigo Capobianco Guido

TL;DR
This paper proposes a method to calibrate confidence scores of language models in education by using an XGBoost model on attention-based features to improve the reliability of model predictions.
Contribution
It introduces a novel approach combining XGBoost with attention features from BERT to better calibrate language model confidence in educational settings.
Findings
Improved calibration of language model confidence scores.
Attention-based features correlate with response quality.
Enhanced reliability of language models in educational applications.
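The summary does not say which metric was used to measure calibration. One common choice is expected calibration error (ECE), which bins predictions by confidence and compares the average confidence in each bin with the observed accuracy. The sketch below assumes equal-width bins; the function name and the bin count are illustrative and not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins over [0, 1].

    confidences: predicted probabilities that each answer is correct.
    correct:     1 if the answer was actually correct, else 0.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Assign each prediction to a bin by its confidence.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(confidences, correct):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    # Weighted average of |mean confidence - accuracy| over non-empty bins.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(p for p, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece
```

A well-calibrated model has an ECE close to zero: among answers given with 90% confidence, roughly 90% should be correct.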
Abstract
Language models are widely used in education. Although modern deep learning models perform very well on question-answering tasks, they still make errors. To avoid misleading students with wrong answers, it is important to calibrate the confidence (that is, the prediction probability) of these models. In our work, we propose using an XGBoost model on top of BERT to output corrected probabilities, using features based on the attention mechanism. Our hypothesis is that the level of uncertainty contained in the flow of attention is related to the quality of the model's response itself.
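The abstract does not specify which attention-based features are used. One plausible way to quantify "uncertainty in the flow of attention" is the Shannon entropy of each attention distribution, aggregated per head. The sketch below is an assumption, not the authors' feature set: the function names, the nested-list input layout, and the choice of mean and max as aggregates are all hypothetical. Such a feature vector could then be passed, together with a correct/incorrect label, to a gradient-boosted classifier such as XGBoost to predict a corrected probability.

```python
import math


def row_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must have a positive sum")
    entropy = 0.0
    for w in weights:
        p = w / total  # renormalize in case the row does not sum exactly to 1
        if p > 0:
            entropy -= p * math.log(p)
    return entropy


def attention_entropy_features(attention):
    """Build a flat feature vector from attention weights.

    attention: list over heads; each head is a list of rows (one per query
               token); each row is a list of weights over key tokens.
    Returns [mean, max] of the normalized row entropy for each head.
    """
    features = []
    for head in attention:
        entropies = []
        for row in head:
            # Divide by log(len) so 0 means fully focused, 1 means uniform.
            max_entropy = math.log(len(row)) if len(row) > 1 else 1.0
            entropies.append(row_entropy(row) / max_entropy)
        features.append(sum(entropies) / len(entropies))
        features.append(max(entropies))
    return features
```

Under the paper's hypothesis, diffuse (high-entropy) attention would tend to accompany lower-quality responses, so these values may carry signal about whether the raw confidence should be lowered.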
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Dense Connections · Residual Connection · Dropout · WordPiece · Weight Decay
