VAULT: Vigilant Adversarial Updates via LLM-Driven Retrieval-Augmented Generation for NLI

Roie Kazoom; Ofir Cohen; Rami Puzis; Asaf Shabtai; Ofer Hadar

arXiv:2508.00965 · cs.LG · August 5, 2025

Open Access · 3 Reviews

TL;DR

VAULT is an automated pipeline that enhances NLI model robustness by systematically generating and incorporating adversarial examples using retrieval-augmented generation with LLMs, leading to significant accuracy improvements.

Contribution

The paper introduces VAULT, a fully automated adversarial data generation method leveraging LLMs and retrieval techniques to improve NLI model robustness without human intervention.

Findings

01 · RoBERTa-base accuracy improves on SNLI (+4.12%), ANLI (+5.91%), and MultiNLI (+17.32%).

02 · Outperforms prior in-context adversarial methods by up to 2.0%.

03 · Automates adversarial data curation at scale, with LLM-ensemble validation of labels and no human annotation.

Abstract

We introduce VAULT, a fully automated adversarial RAG pipeline that systematically uncovers and remedies weaknesses in NLI models through three stages: retrieval, adversarial generation, and iterative retraining. First, we perform balanced few-shot retrieval by embedding premises with both semantic (BGE) and lexical (BM25) similarity. Next, we assemble these contexts into LLM prompts to generate adversarial hypotheses, which are then validated by an LLM ensemble for label fidelity. Finally, the validated adversarial examples are injected back into the training set at increasing mixing ratios, progressively fortifying a zero-shot RoBERTa-base model. On standard benchmarks, VAULT elevates RoBERTa-base accuracy from 88.48% to 92.60% on SNLI (+4.12%), from 75.04% to 80.95% on ANLI (+5.91%), and from 54.67% to 71.99% on MultiNLI (+17.32%). It also consistently outperforms prior in-context…
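The retrieval, validation, and mixing steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions that are not in the source: a bag-of-words cosine stands in for BGE embeddings, the two signals are min-max normalised and fused with a hypothetical weight `alpha`, validation is a simple majority vote over labels that the LLM validators have already produced, and the mixing ratio is taken to mean the number of adversarial examples relative to the original training-set size. All function and parameter names are illustrative.

```python
import math
import random
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every doc against the query (lexical signal)."""
    toks = [tokenize(d) for d in docs]
    n = len(toks)
    avgdl = sum(len(t) for t in toks) / n
    df = Counter(term for t in toks for term in set(t))
    scores = []
    for t in toks:
        tf = Counter(t)
        score = 0.0
        for term in set(tokenize(query)):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(t) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine_scores(query, docs):
    """ASSUMPTION: bag-of-words cosine as a stand-in for BGE embeddings."""
    def cos(a, b):
        dot = sum(a[w] * b[w] for w in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0
    q = Counter(tokenize(query))
    return [cos(q, Counter(tokenize(d))) for d in docs]


def minmax(xs):
    lo, hi = min(xs), max(xs)
    return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in xs]


def balanced_retrieve(query, pool, k_per_label=1, alpha=0.5):
    """pool holds (premise, hypothesis, label) triples.

    Returns the top `k_per_label` examples for each label, ranked by a
    fused score; `alpha` (hypothetical) weights the dense signal.
    """
    premises = [premise for premise, _, _ in pool]
    dense = minmax(cosine_scores(query, premises))
    lexical = minmax(bm25_scores(query, premises))
    fused = [alpha * d + (1 - alpha) * l for d, l in zip(dense, lexical)]
    by_label = defaultdict(list)
    for score, example in zip(fused, pool):
        by_label[example[2]].append((score, example))
    shots = []
    for label in sorted(by_label):
        ranked = sorted(by_label[label], key=lambda pair: pair[0], reverse=True)
        shots.extend(example for _, example in ranked[:k_per_label])
    return shots


def majority_label(votes, min_agree=2):
    """Label that at least `min_agree` validators agree on, else None."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agree else None


def mix_in(train, adversarial, ratio, seed=0):
    """ASSUMPTION: ratio = adversarial examples added per original example."""
    n = min(len(adversarial), int(ratio * len(train)))
    return train + random.Random(seed).sample(adversarial, n)
```

Selecting the same number of shots per label is one straightforward reading of "balanced" retrieval; it keeps the few-shot prompt from being dominated by whichever class happens to sit closest to the query premise. Calling `mix_in` with a growing `ratio` between retraining rounds mirrors the "increasing mixing ratios" mentioned in the abstract.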

Peer Reviews

Decision · Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 3

Strengths

1. The main contribution of this paper is the design of an automated “retrieve-generate-validate-retrain” closed-loop system, which provides a valuable engineering framework for improving the robustness of NLI models in a scalable manner without human annotation.
2. The paper conducts extensive ablation studies, demonstrating substantial experimental effort. The main experiments show that with a small number of adversarial samples, the method can achieve significant performance improvements acro

Weaknesses

1. The primary concern about this paper lies in its novelty. Each component of VAULT (including RAG, adversarial sample selection and refinement through iterative training, and the use of an LLM as a verifier) has been explored in prior work. VAULT appears to be more of an integration and adaptation of these existing techniques rather than a fundamentally new approach.
2. The paper lacks ablation studies across different models.
   a. Using different backbone models. It would be important to s

Reviewer 02 · Rating 2 · Confidence 4

Strengths

1) The authors employ a data augmentation method for the NLI task that generates adversarial data for training a model.
2) During the retrieval phase, the authors combine BGE and BM25 for sample similarity assessment, effectively capturing both semantic information and token-level features. Experimental results further validate the effectiveness of this methodology.

Weaknesses

1) This method has only been evaluated on NLI tasks, which limits its practical value.
2) The experimental comparison is insufficient. Since this paper focuses on model training with labelled data, the baseline methods should include few-shot learning methods and LLM-based contextual learning methods. However, the paper currently compares only with LLMs and lacks comparisons with the aforementioned types of methods. In addition, the authors use only RoBERTa-base as the baseline model for NLI tasks,

Reviewer 03 · Rating 4 · Confidence 3

Strengths

- This paper proposes an end-to-end automated adversarial RAG pipeline, which fully automates retrieval, adversarial generation, multi-LLM validation, and iterative retraining.
- This paper provides a detailed procedure for the RAG pipeline, and experimental results on three NLI datasets show that the proposed method achieves better performance by fine-tuning RoBERTa-base.

Weaknesses

- The proposed method is not innovative, since adversarial generation has been proposed in prior work.
- As many LLMs can use RAG to perform natural language inference, I want to see a direct comparison with these LLMs in the experiments.
- The structure of the paper could be improved: the figures and the hyperparameter settings should be placed in more appropriate positions.

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning