ATM: Adversarial Tuning Multi-agent System Makes a Robust Retrieval-Augmented Generator
Junda Zhu, Lingyong Yan, Haibo Shi, Dawei Yin, Lei Sha

TL;DR
This paper introduces ATM, an adversarial multi-agent system that makes retrieval-augmented generators more robust to noisy and fabricated content, improving accuracy on knowledge-intensive question answering.
Contribution
It proposes a novel adversarial multi-agent tuning framework that significantly improves the robustness and accuracy of retrieval-augmented language models.
Findings
ATM outperforms state-of-the-art baselines in experiments.
The Generator learns to better discriminate useful documents.
Adversarial tuning enhances model robustness against noisy data.
Abstract
Large language models (LLMs) have been shown to benefit substantially from retrieval-augmented generation (RAG), which alleviates hallucinations when they are confronted with knowledge-intensive questions. RAG adopts information retrieval techniques to inject external knowledge from semantically relevant documents as input contexts. However, since today's Internet is flooded with noisy and fabricated content, RAG systems are inevitably vulnerable to this noise and prone to respond incorrectly. To this end, we propose to optimize the retrieval-augmented Generator with an Adversarial Tuning Multi-agent system (ATM). With the help of an auxiliary Attacker agent, ATM steers the Generator toward a robust perspective on which documents are useful for question answering by adversarially tuning the agents for several iterations. After rounds of multi-agent iterative tuning, the Generator can eventually…
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Algorithms
Methods: Attention Is All You Need · WordPiece · Linear Warmup With Linear Decay · Weight Decay · Attention Dropout · Linear Layer · Byte Pair Encoding · Adam · Residual Connection
