Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method

Yukun Zhao; Lingyong Yan; Weiwei Sun; Guoliang Xing; Chong Meng; Shuaiqiang Wang; Zhicong Cheng; Zhaochun Ren; Dawei Yin

arXiv:2310.17918·cs.CL·March 22, 2024·2 cites

TL;DR

This paper introduces a simple self-detection method that identifies questions for which an LLM is likely to produce nonfactual answers. It rephrases each question in several ways and analyzes the divergence among the resulting answers, improving reliability without external resources.

Contribution

The paper presents a novel self-detection approach that uses prompt diversification and divergence analysis to identify potential falsehoods in LLM outputs without external tools.

Findings

01 Effective detection of nonfactual responses in LLMs

02 Applicable to models like Vicuna, ChatGPT, GPT-4

03 Improves LLM reliability in factual tasks

Abstract

Large Language Models (LLMs) have shown great potential in Natural Language Processing (NLP) tasks. However, recent literature reveals that LLMs intermittently generate nonfactual responses, which undermines their reliability for further use. In this paper, we propose a novel self-detection method that identifies the questions an LLM does not know and for which it is therefore prone to generate nonfactual results. Specifically, we first diversify the textual expressions of a given question and collect the corresponding answers. Then we examine the divergences among the generated answers to identify the questions for which the model may generate falsehoods. All of the above steps can be accomplished by prompting the LLMs themselves, without referring to any external resources. We conduct comprehensive experiments and demonstrate the effectiveness of our method on recently released LLMs, e.g.,…
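
The two steps described in the abstract (ask several rephrasings of one question, then measure how much the answers disagree) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: `ask` is a hypothetical stand-in for a call to an LLM, the rephrasings are supplied by the caller rather than generated by the model, answers are compared by exact match after a crude normalization, and both the entropy score and the `0.5` threshold are arbitrary choices made for the example.

```python
from collections import Counter
from math import log
from typing import Callable, Sequence


def normalize(answer: str) -> str:
    """Crude canonical form: lowercase, collapse whitespace, drop trailing periods."""
    return " ".join(answer.lower().split()).rstrip(".")


def divergence(answers: Sequence[str]) -> float:
    """Shannon entropy (in nats) of the distribution of normalized answers.

    Assumption: exact-match grouping stands in for whatever answer
    comparison the paper actually uses.
    """
    counts = Counter(normalize(a) for a in answers)
    total = sum(counts.values())
    return -sum((c / total) * log(c / total) for c in counts.values())


def is_unreliable(
    question_variants: Sequence[str],
    ask: Callable[[str], str],  # hypothetical LLM call, supplied by the caller
    threshold: float = 0.5,     # arbitrary illustrative cut-off
) -> bool:
    """Flag a question when answers to its rephrasings disagree too much."""
    answers = [ask(q) for q in question_variants]
    return divergence(answers) > threshold


if __name__ == "__main__":
    # Stub "model" for demonstration: consistent on one question, not on the other.
    canned = {
        "What is the capital of France?": "Paris",
        "Which city is France's capital?": "paris.",
        "France's capital city is?": "Paris",
        "Who wrote the poem X?": "Alice",
        "The poem X was written by whom?": "Bob",
        "Name the author of the poem X.": "Carol",
    }
    known = list(canned)[:3]
    unknown = list(canned)[3:]
    print(is_unreliable(known, canned.__getitem__))
    print(is_unreliable(unknown, canned.__getitem__))
```

In this sketch, agreement across rephrasings yields zero entropy, while three mutually different answers yield a score of ln 3, which exceeds the illustrative threshold. A practical version would need a more tolerant notion of answer equivalence than exact string match.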

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research

Methods: Multi-Head Attention · Attention Is All You Need · Label Smoothing · Linear Layer · Residual Connection · Byte Pair Encoding · Softmax · Dense Connections · Dropout · Adam