A Proposed Large Language Model-Based Smart Search for Archive System
Ha Dung Nguyen, Thi-Hoang Anh Nguyen, Thanh Binh Nguyen

TL;DR
This paper introduces a novel LLM-based smart search framework for digital archives, utilizing RAG techniques to improve retrieval accuracy, handle multilingual queries, and enhance archival search efficiency.
Contribution
It proposes an integrated architecture combining advanced metadata, hybrid retrieval, and response synthesis, demonstrating significant performance improvements over traditional methods.
Findings
Enhanced search precision and relevance.
Effective multilingual query handling.
Improved efficiency in archival information retrieval.
Abstract
This study presents a novel framework for smart search in digital archival systems, leveraging the capabilities of Large Language Models (LLMs) to enhance information retrieval. By employing a Retrieval-Augmented Generation (RAG) approach, the framework enables the processing of natural language queries and the transformation of non-textual data into meaningful textual representations. The system integrates advanced metadata generation techniques, a hybrid retrieval mechanism, a router query engine, and robust response synthesis; the results show improved search precision and relevance. We present the architecture and implementation of the system and evaluate its performance in four experiments concerning LLM efficiency, hybrid retrieval optimizations, multilingual query handling, and the impact of individual components. The results show significant improvements over conventional approaches and…
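The abstract names a hybrid retrieval mechanism but does not say how the keyword and semantic results are combined. As an illustration only, the sketch below merges two ranked lists with reciprocal rank fusion, a common fusion technique that may differ from what the authors use. The function name, the constant `k=60`, and the document IDs are all assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in.
    k=60 is a conventional default, not a value taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword retriever and a vector retriever.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents returned by both retrievers (`doc_a`, `doc_b`) rank above those returned by only one, which is the behavior a hybrid retriever typically aims for.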
