Retrieval-Augmented Multi-LLM Ensemble for Industrial Part Specification Extraction

Muzakkiruddin Ahmed Mohammed, John R. Talburt, Leon Claassens, and Adriaan Marais

arXiv:2601.05266 · cs.IR · January 12, 2026

TL;DR

This paper presents RAGsemble, a retrieval-augmented multi-LLM ensemble system that significantly improves industrial part specification extraction from unstructured text by combining multiple models and grounding outputs with factual data.

Contribution

Introduction of a novel multi-LLM ensemble framework with retrieval augmentation, enhancing accuracy and reliability in industrial text extraction tasks.

Findings

01 Significant accuracy improvements over single-LLM baselines.

02 Effective grounding of outputs using FAISS-based retrieval.

03 Robust system architecture suitable for industrial deployment.
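
To make the idea of retrieval grounding concrete, here is a minimal sketch of top-k semantic retrieval over a small reference catalog. It is not the paper's implementation: the paper uses FAISS, while this sketch swaps in a brute-force cosine-similarity search over toy bag-of-words vectors so that it runs with the standard library alone. The names `embed`, `cosine`, `retrieve`, and `CATALOG`, and the catalog entries themselves, are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector (assumption); a real system would use a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query (brute force, not FAISS)."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

# Hypothetical reference snippets that extracted values could be checked against.
CATALOG = [
    "hex bolt m8 x 40 stainless steel a2",
    "ball bearing 6204 sealed 20 mm bore",
    "hydraulic hose 1/2 inch 3000 psi",
]

top = retrieve("m8 stainless bolt", CATALOG, k=1)
print(top)
```

The retrieved snippet would then be passed to a model as supporting context, so that extracted values can be compared against known reference data rather than generated unsupported.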

Abstract

Industrial part specification extraction from unstructured text remains a persistent challenge in manufacturing, procurement, and maintenance, where manual processing is both time-consuming and error-prone. This paper introduces a retrieval-augmented multi-LLM ensemble framework that orchestrates nine state-of-the-art Large Language Models (LLMs) within a structured three-phase pipeline. RAGsemble addresses key limitations of single-model systems by combining the complementary strengths of model families including Gemini (2.0, 2.5, 1.5), OpenAI (GPT-4o, o4-mini), Mistral Large, and Gemma (1B, 4B, 3n-e4b), while grounding outputs in factual data using FAISS-based semantic retrieval. The system architecture consists of three stages: (1) parallel extraction by diverse LLMs, (2) targeted research augmentation leveraging high-performing models, and (3) intelligent synthesis with conflict…
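
The three-stage flow described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the three extractor functions are hard-coded stand-ins for LLM calls, stage 2 is reduced to flagging the fields on which extractors disagree (the fields that would be candidates for targeted follow-up), and stage 3 uses a simple per-field majority vote, which is not necessarily the synthesis rule the paper uses.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for LLM calls: each maps raw text to a dict of fields.
def model_a(text):
    return {"thread": "M8", "length_mm": "40", "material": "stainless steel"}

def model_b(text):
    return {"thread": "M8", "length_mm": "40", "material": "steel"}

def model_c(text):
    return {"thread": "M8", "length_mm": "45", "material": "stainless steel"}

def parallel_extract(text, models):
    """Stage 1: run every extractor on the same input concurrently."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        return list(pool.map(lambda m: m(text), models))

def find_conflicts(candidates):
    """Stage 2 (simplified assumption): list the fields on which extractors disagree."""
    fields = {f for c in candidates for f in c}
    return sorted(f for f in fields if len({c.get(f) for c in candidates}) > 1)

def synthesize(candidates):
    """Stage 3 (simplified assumption): resolve each field by majority vote."""
    merged = {}
    for field in sorted({f for c in candidates for f in c}):
        votes = Counter(c[field] for c in candidates if field in c)
        merged[field] = votes.most_common(1)[0][0]
    return merged

candidates = parallel_extract("M8 x 40 stainless hex bolt", [model_a, model_b, model_c])
disputed = find_conflicts(candidates)
final = synthesize(candidates)
print(disputed, final)
```

Keeping the disputed-field list separate from the final merge makes it possible to send only the contested fields to a stronger model or a retrieval step before voting, rather than re-running the whole extraction.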

Taxonomy

Topics: Topic Modeling · Handwritten Text Recognition Techniques · Natural Language Processing Techniques