A Formalism and Approach for Improving Robustness of Large Language Models Using Risk-Adjusted Confidence Scores

Ke Shen; Mayank Kejriwal

arXiv:2310.03283·cs.CL·October 6, 2023

Open Access

TL;DR

This paper introduces a formal framework and novel metrics for assessing and reducing risks in large language models, demonstrating improved decision-making and risk management in natural language inference tasks.

Contribution

The paper formalizes decision risk and composite risk in LLMs, proposes a risk-centric evaluation framework with four novel metrics, and introduces DwD, a risk-adjusted calibration method for minimizing these risks in LLM-based NLP applications.

Findings

01

DwD reduces decision risk by 20.1% in low-risk tasks.

02

DwD skips 19.8% of high-risk tasks to prevent errors.

03

The proposed evaluation framework effectively measures risks in LLMs.

Abstract

Large Language Models (LLMs), such as ChatGPT, have achieved impressive milestones in natural language processing (NLP). Despite this performance, the models are known to pose important risks. As these models are deployed in real-world applications, a systematic understanding of the different risks they pose on tasks such as natural language inference (NLI) is much needed. In this paper, we define and formalize two distinct types of risk: decision risk and composite risk. We also propose a risk-centric evaluation framework and four novel metrics for assessing LLMs on these risks in both in-domain and out-of-domain settings. Finally, we propose a risk-adjusted calibration method called DwD for helping LLMs minimize these risks in an overall NLI architecture. Detailed experiments, using four NLI benchmarks, three baselines and two LLMs, including ChatGPT, show both…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)