A Cloud-Native Architecture for Human-in-Control LLM-Assisted OpenSearch in Investigative Settings
Benjamin Puhani, Kai Brehmer, Malte Prieß

TL;DR
This paper proposes a cloud-native microservice architecture that uses Large Language Models to translate natural-language queries into OpenSearch Domain-Specific Language (DSL) expressions, supporting investigative search in secure, scalable private-cloud environments.
Contribution
It introduces a hybrid retrieval system that combines BM25-based lexical search with semantic vector search inside a secure, cloud-native framework for investigative applications.
Findings
Functional prototype demonstrates feasibility.
Hybrid retrieval improves search relevance.
Architectural baseline established for future evaluation.
Abstract
Complex criminal investigations are often hindered by large volumes of unstructured evidence and by the semantic gap between natural language investigative intent and technical search logic. To address this challenge, we present a design and feasibility study of a cloud-native microservice architecture tailored to private-cloud deployments, contributing to research in secure cloud computing and leveraging modern cloud paradigms under high security and scalability requirements. The proposed system integrates Large Language Models into a "Human-in-Control" workflow that translates natural-language queries into syntactically valid OpenSearch Domain-Specific Language expressions. We describe the implementation of a hybrid retrieval strategy within OpenSearch that combines BM25-based lexical search with nested semantic vector embeddings. The paper focuses on system design and preliminary…
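As a rough illustration of the two ideas named in the abstract, the sketch below builds an OpenSearch request body that pairs a BM25 `match` clause with a `nested` k-NN clause, and gates an LLM-produced query behind an explicit analyst approval step. It is not the authors' implementation: the field names `content`, `chunks`, and `chunks.embedding`, the `bool`/`should` combination, and the helpers `build_hybrid_query` and `review_llm_query` are all assumptions made for illustration.

```python
import json
from typing import Callable, List, Optional


def build_hybrid_query(text: str, vector: List[float], k: int = 5) -> dict:
    """Build a request body with one lexical and one semantic clause.

    Assumed index layout (hypothetical): a text field `content` and a
    nested field `chunks` holding a knn_vector field `chunks.embedding`.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Lexical clause: scored with BM25 by default.
                    {"match": {"content": {"query": text}}},
                    # Semantic clause: k-NN search over nested chunk vectors.
                    {
                        "nested": {
                            "path": "chunks",
                            "score_mode": "max",
                            "query": {
                                "knn": {
                                    "chunks.embedding": {"vector": vector, "k": k}
                                }
                            },
                        }
                    },
                ]
            }
        }
    }


def review_llm_query(raw: str, approve: Callable[[dict], bool]) -> Optional[dict]:
    """Return the parsed query only if it is well-formed and a human approves.

    `raw` stands in for text produced by an LLM; `approve` stands in for
    the analyst's decision. Nothing is sent to OpenSearch here.
    """
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not valid JSON, never shown as executable
    if not isinstance(parsed, dict) or "query" not in parsed:
        return None  # valid JSON but not shaped like a search body
    return parsed if approve(parsed) else None


if __name__ == "__main__":
    body = build_hybrid_query("wire transfer", [0.1, 0.2, 0.3], k=3)
    candidate = json.dumps(body)  # pretend this string came from the LLM
    print(review_llm_query(candidate, approve=lambda q: True) is not None)
```

The check here only covers JSON well-formedness and the presence of a top-level `query` key; a real deployment would presumably validate against the index mapping as well, which this sketch does not attempt.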
