Never Come Up Empty: Adaptive HyDE Retrieval for Improving LLM Developer Support

Fangjian Lei; Mariam El Mezouar; Shayan Noei; Ying Zou

arXiv:2507.16754 · cs.SE · July 23, 2025

TL;DR

This paper develops and evaluates an adaptive retrieval-augmented generation pipeline using a large Stack Overflow corpus to improve the accuracy and reliability of LLMs in developer support tasks, outperforming zero-shot approaches.

Contribution

It introduces a novel RAG pipeline combining HyDE with full-answer context and demonstrates its effectiveness across multiple LLMs and question types.

Findings

01 Best RAG pipeline uses HyDE with full-answer context.

02 RAG pipeline outperforms zero-shot baselines in helpfulness and correctness.

03 Adaptive retrieval improves coverage for unseen questions.
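
The findings above name two ideas: HyDE (embedding an LLM-drafted hypothetical answer instead of the raw question) and adaptive retrieval that still returns context for unseen questions. The following is a minimal sketch of how the two could fit together, not the authors' implementation. It assumes that "adaptive" means relaxing a similarity threshold step by step until something is retrieved; this summary does not state the paper's exact mechanism. The bag-of-words `embed`, the `hypothetical_answer` stub (a stand-in for an LLM call), and every parameter value are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def hypothetical_answer(question):
    """Hypothetical stub for the LLM call that drafts an answer (the HyDE step)."""
    return question + " you can use a list comprehension or the built-in function"


def adaptive_hyde_retrieve(question, corpus, start=0.9, floor=0.1, step=0.1, k=2):
    """Retrieve up to k answers, lowering the threshold until a match appears.

    The threshold schedule is an assumption made for this sketch.
    """
    # HyDE: search with the embedding of the drafted answer, not the question.
    query_vec = embed(hypothetical_answer(question))
    scored = sorted(
        ((cosine(query_vec, embed(doc)), doc) for doc in corpus), reverse=True
    )
    threshold = start
    while threshold >= floor - 1e-9:
        hits = [(score, doc) for score, doc in scored if score >= threshold]
        if hits:
            return hits[:k], round(threshold, 2)
        threshold -= step  # relax the threshold and try again
    return [], round(floor, 2)


# Usage with a tiny, made-up corpus of accepted answers.
corpus = [
    "Use a list comprehension to filter items in a Python list",
    "In Java, use StringBuilder to concatenate strings in a loop",
    "Use the built-in function sorted to sort a list in Python",
]
hits, used_threshold = adaptive_hyde_retrieve("How do I filter a list in Python?", corpus)
for score, doc in hits:
    print(f"{score:.2f}  {doc}")
print("threshold used:", used_threshold)
```

In a real pipeline the retrieved full answers would be placed in the LLM prompt as context. Returning the threshold that was finally used lets downstream code tell a close historical match from a loosely related fallback.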

Abstract

Large Language Models (LLMs) have shown promise in assisting developers with code-related questions; however, LLMs carry the risk of generating unreliable answers. To address this, Retrieval-Augmented Generation (RAG) has been proposed to reduce the unreliability (i.e., hallucinations) of LLMs. However, designing effective pipelines remains challenging due to numerous design choices. In this paper, we construct a retrieval corpus of over 3 million Java- and Python-related Stack Overflow posts with accepted answers, and explore various RAG pipeline designs to answer developer questions, evaluating their effectiveness in generating accurate and reliable responses. More specifically, we (1) design and evaluate 7 different RAG pipelines and 63 pipeline variants to answer questions that have historically similar matches, and (2) address new questions without any close prior matches by…

Taxonomy

Topics: Software Engineering Research · Topic Modeling · Expert finding and Q&A systems