Tug-of-War within A Decade: Conflict Resolution in Vulnerability Analysis via Teacher-Guided Retrieval-Augmented Generations

Ziyin Zhou; Jianyi Zhang; Xu ji; Yilong Li; Jiameng Han; Zhangchi Zhao

arXiv:2604.14172·cs.CL·April 17, 2026

TL;DR

This paper introduces CRVA-TGRAG, a two-stage framework combining improved retrieval techniques and teacher-guided fine-tuning to enhance LLMs' ability to accurately analyze and update cybersecurity vulnerabilities.

Contribution

The paper presents a novel framework that improves vulnerability knowledge retrieval and consistency in LLMs through document segmentation, ensemble retrieval, and preference-based fine-tuning.
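To make the retrieval side of that description concrete, here is a minimal sketch of document segmentation followed by ensemble retrieval. It is not the paper's implementation: the summary does not say how documents are segmented or how retrievers are combined. The fixed-size word windows, the two toy retrievers (`keyword_rank`, `trigram_rank`), and reciprocal rank fusion as the combining rule are all assumptions chosen for illustration. The records and their `CVE-0000-...` identifiers are made up.

```python
from collections import defaultdict


def segment(doc_id, text, size=4):
    """Split a document into fixed-size word windows (assumed segmentation rule)."""
    words = text.split()
    return [(f"{doc_id}#{i // size}", " ".join(words[i:i + size]))
            for i in range(0, len(words), size)]


def keyword_rank(query, chunks):
    """Toy lexical retriever: rank chunks by the number of shared query words."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(text.lower().split())), cid) for cid, text in chunks]
    return [cid for score, cid in sorted(scored, key=lambda s: (-s[0], s[1])) if score > 0]


def trigrams(s):
    """Set of character trigrams of a lowercased string."""
    s = s.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def trigram_rank(query, chunks):
    """Toy fuzzy retriever: rank chunks by Jaccard similarity of character trigrams."""
    q = trigrams(query)

    def jaccard(text):
        t = trigrams(text)
        return len(q & t) / len(q | t) if q | t else 0.0

    return [cid for cid, text in sorted(chunks, key=lambda c: (-jaccard(c[1]), c[0]))]


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each list contributes 1 / (k + rank) per chunk, rank from 1."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, cid in enumerate(ranking, start=1):
            scores[cid] += 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Made-up records; the identifiers are placeholders, not real CVE entries.
docs = {
    "CVE-0000-0001": "buffer overflow in image parser allows remote code execution",
    "CVE-0000-0002": "sql injection in login form allows authentication bypass",
}
chunks = [c for doc_id, text in docs.items() for c in segment(doc_id, text)]

query = "sql injection login"
fused = reciprocal_rank_fusion([keyword_rank(query, chunks), trigram_rank(query, chunks)])
print(fused[:3])
```

Rank-based fusion is used here because it needs no score calibration between retrievers. A real system would more likely pair a sparse retriever with a dense embedding retriever, and the preference-based fine-tuning stage is outside the scope of this sketch.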

Findings

01 Achieves higher accuracy in retrieving the latest CVEs.

02 Reduces knowledge conflicts and hallucinations in LLM outputs.

03 Enhances content quality and question-answering precision.

Abstract

Large Language Models (LLMs) are essential for analyzing and addressing vulnerabilities in cybersecurity. However, of the more than 200,000 vulnerabilities discovered in the past decade, more than 30,000 have been changed or updated. This necessitates frequent updates to the training datasets and internal knowledge bases of LLMs to maintain knowledge consistency. In this paper, we focus on the problem of knowledge discrepancy and conflict within CVE (Common Vulnerabilities and Exposures) detection and analysis. This problem hinders LLMs' ability to retrieve the latest knowledge from original training datasets, leading to knowledge conflicts, fabrication of factually incorrect results, and hallucinations during generation. To address this problem, we propose an innovative two-stage framework called CRVA-TGRAG (Conflict Resolution in Vulnerability Analysis via Teacher-Guided Retrieval-Augmented…
