Towards Building a Robust Knowledge Intensive Question Answering Model with Large Language Models

Xingyun Hong; Yan Shao; Zhilin Wang; Manni Duan; Jin Xiongnan

arXiv:2409.05385 · cs.CL · September 19, 2024

TL;DR

This paper constructs a dataset to evaluate LLM robustness against noisy external information and proposes data augmentation and contrastive learning methods to improve model resilience and discrimination capabilities.

Contribution

It introduces a novel dataset simulating interference scenarios and proposes a fine-tuning approach with contrastive learning to enhance LLM robustness against noise.
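As a rough illustration of what a contrastive objective looks like, the sketch below computes a generic InfoNCE-style loss over plain Python lists. This summary does not state which contrastive formulation the paper uses or how it pairs examples, so the function name `info_nce`, the cosine similarity, the `temperature` value, and the anchor/positive/negative roles are all assumptions made for illustration.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (hypothetical helper, not the paper's code).

    The loss is low when the anchor is more similar to the positive
    than to any of the negatives.
    """

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    # The positive pair's logit comes first, followed by the negatives.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp over all logits.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))

    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - logits[0]
```

One possible use, again an assumption, would be to treat a representation of useful context as the positive and representations of noisy or conflicting context as negatives, so that the fine-tuned model keeps the two apart.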

Findings

01. Improved model robustness against noisy information

02. Enhanced discrimination capability of LLMs

03. Validated effectiveness using GPT-4 evaluations

Abstract

The development of LLMs has greatly enhanced the intelligence and fluency of question answering, while the emergence of retrieval enhancement has enabled models to better utilize external information. However, the presence of noise and errors in retrieved information poses challenges to the robustness of LLMs. In this work, to evaluate the model's performance under multiple kinds of interference, we first construct a dataset, derived from machine reading comprehension datasets, that simulates various scenarios, including critical information absence, noise, and conflicts. To address the decline in model accuracy caused by noisy external information, we propose a data augmentation-based fine-tuning method to enhance LLMs' robustness against noise. Additionally, a contrastive learning approach is used to preserve the model's ability to discriminate among external information. We have conducted…
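The three interference scenarios named in the abstract (critical information absence, noise, and conflicts) can be sketched as simple perturbations of a reading comprehension example. This is a minimal sketch under stated assumptions: the `Example` class, the helper names, the regex sentence splitter, and the perturbation rules are hypothetical and are not taken from the paper, whose actual construction procedure is not described in this summary.

```python
import re
from dataclasses import dataclass


@dataclass
class Example:
    """Hypothetical container for one reading comprehension item."""
    question: str
    context: str
    answer: str


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def drop_answer_sentences(ex):
    """Critical information absence: remove sentences containing the answer."""
    kept = [s for s in split_sentences(ex.context)
            if ex.answer.lower() not in s.lower()]
    return Example(ex.question, " ".join(kept), ex.answer)


def add_noise(ex, distractors):
    """Noise: append unrelated passages after the original context."""
    return Example(ex.question, " ".join([ex.context, *distractors]), ex.answer)


def inject_conflict(ex, wrong_answer):
    """Conflict: append copies of answer sentences with the answer swapped."""
    conflicting = [s.replace(ex.answer, wrong_answer)
                   for s in split_sentences(ex.context) if ex.answer in s]
    return Example(ex.question, " ".join([ex.context, *conflicting]), ex.answer)
```

In a data augmentation setup, perturbed copies like these could be mixed into the fine-tuning set alongside the clean examples, with the gold answer kept unchanged for the noise and conflict variants; how the paper labels the information-absence case is not specified here.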

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems

Methods: Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Layer Normalization · Dropout · Attention Is All You Need · Position-Wise Feed-Forward Layer · Residual Connection · Linear Layer