Retrieval-Augmented Generation for Service Discovery: Chunking Strategies and Benchmarking

Robin D. Pesl; Jerin G. Mathew; Massimo Mecella; Marco Aiello

arXiv:2505.19310·cs.SE·May 27, 2025

TL;DR

This paper explores how Retrieval-Augmented Generation can improve service endpoint discovery by optimizing API description chunking and introducing a Discovery Agent, evaluated on new benchmarks to enhance accuracy and efficiency.

Contribution

It introduces novel chunking strategies and a Discovery Agent for API description preprocessing, along with new benchmarks for evaluating retrieval accuracy in service discovery.

Findings

01 Chunking strategies impact retrieval accuracy.

02 Discovery Agent improves precision but may reduce recall.

03 RAG effectively reduces token length for endpoint discovery.

Abstract

Integrating multiple (sub-)systems is essential to create advanced Information Systems. Difficulties mainly arise when integrating dynamic environments, e.g., the integration at design time of not yet existing services. This has traditionally been addressed using a registry that provides the API documentation of the endpoints. Large Language Models have been shown to be capable of automatically creating system integrations (e.g., as service composition) based on this documentation but require concise input due to input token limitations, especially regarding comprehensive API descriptions. Currently, it is unknown how best to preprocess these API descriptions. In the present work, we (i) analyze the usage of Retrieval-Augmented Generation for endpoint discovery and the chunking, i.e., preprocessing, of state-of-practice OpenAPIs to reduce the input token length while preserving the most…
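
To make the idea of chunking an OpenAPI description for retrieval concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: it assumes one chunk per endpoint (path plus HTTP method) and substitutes a simple bag-of-words cosine similarity for an embedding model. The names `chunk_by_endpoint`, `retrieve`, and `TOY_SPEC` are hypothetical, and the toy spec is invented for the example.

```python
import math
import re
from collections import Counter

# HTTP methods that may appear as operation keys in an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def chunk_by_endpoint(spec):
    """Return one text chunk per (path, HTTP method) pair in an OpenAPI dict.

    Assumption: endpoint-level splitting; other chunking strategies are possible.
    """
    chunks = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            parts = (method.upper(), path, op.get("summary", ""), op.get("description", ""))
            text = " ".join(p for p in parts if p)
            chunks.append({"endpoint": f"{method.upper()} {path}", "text": text})
    return chunks


def _bag(text):
    """Lowercased bag-of-words counts (stand-in for an embedding vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two bags of words."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=1):
    """Return the k endpoint names whose chunk text is most similar to the query."""
    q = _bag(query)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _bag(c["text"])), reverse=True)
    return [c["endpoint"] for c in ranked[:k]]


# Hypothetical toy specification, invented for illustration only.
TOY_SPEC = {
    "paths": {
        "/pets": {
            "get": {"summary": "List all pets"},
            "post": {"summary": "Create a new pet"},
        },
        "/orders/{id}": {
            "parameters": [],
            "get": {"summary": "Fetch an order by id"},
        },
    }
}

if __name__ == "__main__":
    chunks = chunk_by_endpoint(TOY_SPEC)
    print(retrieve("create a pet", chunks))
```

Only the top-ranked chunks would then be passed to the language model, which is how retrieval shortens the prompt compared with supplying the full API description.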

Taxonomy

Topics: Data Management and Algorithms · Web Data Mining and Analysis · Semantic Web and Ontologies

Methods: Attention Is All You Need · Linear Layer · Attention Dropout · Softmax · WordPiece · Weight Decay · Multi-Head Attention · Layer Normalization