Retrieval-Augmented Generation for Service Discovery: Chunking Strategies and Benchmarking
Robin D. Pesl, Jerin G. Mathew, Massimo Mecella, Marco Aiello

TL;DR
This paper explores how Retrieval-Augmented Generation can improve service endpoint discovery by optimizing API description chunking and introducing a Discovery Agent, evaluated on new benchmarks to enhance accuracy and efficiency.
Contribution
It introduces novel chunking strategies and a Discovery Agent for API description preprocessing, along with new benchmarks for evaluating retrieval accuracy in service discovery.
Findings
Chunking strategies impact retrieval accuracy.
Discovery Agent improves precision but may reduce recall.
RAG effectively reduces token length for endpoint discovery.
Abstract
Integrating multiple (sub-)systems is essential to create advanced Information Systems. Difficulties mainly arise when integrating dynamic environments, e.g., the integration at design time of not yet existing services. This has been traditionally addressed using a registry that provides the API documentation of the endpoints. Large Language Models have been shown to be capable of automatically creating system integrations (e.g., as service composition) based on this documentation but require concise input due to input token limitations, especially regarding comprehensive API descriptions. Currently, it is unknown how best to preprocess these API descriptions. In the present work, we (i) analyze the usage of Retrieval-Augmented Generation for endpoint discovery and the chunking, i.e., preprocessing, of state-of-practice OpenAPIs to reduce the input token length while preserving the most…
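To make the idea of chunking an OpenAPI description for endpoint retrieval concrete, here is a minimal sketch in Python. It is not the paper's implementation: the function names (`chunk_by_endpoint`, `retrieve`), the sample specification, and the token-overlap scoring (a simple stand-in for the embedding similarity a real RAG pipeline would likely use) are all assumptions made for illustration.

```python
import re
from collections import Counter

# HTTP methods that OpenAPI allows as operation keys under a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def chunk_by_endpoint(spec):
    """Return one text chunk per (path, method) operation in an OpenAPI dict.

    Hypothetical helper: one possible chunking strategy, not the paper's.
    """
    chunks = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            parts = [method.upper(), path, op.get("summary", ""), op.get("description", "")]
            chunks.append({
                "endpoint": f"{method.upper()} {path}",
                "text": " ".join(p for p in parts if p),
            })
    return chunks


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, chunks, k=1):
    """Rank chunks by token overlap with the query and return the top k.

    Token overlap is an assumed placeholder for embedding similarity.
    """
    query_counts = Counter(tokenize(query))

    def score(chunk):
        chunk_counts = Counter(tokenize(chunk["text"]))
        return sum(min(query_counts[t], chunk_counts[t]) for t in query_counts)

    return sorted(chunks, key=score, reverse=True)[:k]


# Made-up miniature specification, used only to exercise the sketch.
spec = {
    "paths": {
        "/pets": {
            "get": {"summary": "List all pets"},
            "post": {"summary": "Create a new pet"},
        },
        "/orders/{orderId}": {
            "parameters": [],
            "get": {"summary": "Find purchase order by ID"},
        },
    }
}

chunks = chunk_by_endpoint(spec)
top = retrieve("find an order by its id", chunks, k=1)
print(top[0]["endpoint"])
```

Only the retrieved chunks, rather than the whole specification, would then be passed to the language model, which is how this kind of preprocessing can shorten the input.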
Taxonomy
Topics: Data Management and Algorithms · Web Data Mining and Analysis · Semantic Web and Ontologies
Methods: Attention Is All You Need · Linear Layer · Attention Dropout · Softmax · WordPiece · Weight Decay · Multi-Head Attention · Layer Normalization
