OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit

Arun S. Maiya

arXiv:2505.07672 · cs.CL · September 30, 2025

TL;DR

OnPrem.LLM is a versatile, privacy-focused toolkit enabling secure, local deployment of large language models for document processing tasks, with flexible backend support and user-friendly interfaces.

Contribution

It introduces a comprehensive, privacy-preserving LLM toolkit with multi-backend support, hybrid deployment options, and an accessible no-code web interface.

Findings

01

Supports multiple LLM backends including llama.cpp, Ollama, vLLM, and Hugging Face.

02

Enables privacy-preserving document processing in restricted environments.

03

Provides seamless backend switching and hybrid cloud integration.
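The backend switching and hybrid deployment described in the findings above can be pictured with a small sketch. This is not the toolkit's own API: the names `BACKENDS`, `register`, `Client`, and `allow_cloud` are hypothetical, and the two backends are stand-in functions rather than real model engines. It only illustrates the idea of one calling interface over interchangeable backends, with cloud use gated by an explicit permission.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical registry mapping a backend name to a prompt-answering function.
BACKENDS: Dict[str, Callable[[str], str]] = {}


def register(name: str):
    """Decorator that adds a backend function to the registry under `name`."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        BACKENDS[name] = fn
        return fn
    return wrap


@register("local")
def local_backend(prompt: str) -> str:
    # Stand-in for a locally executed model; no data leaves the machine.
    return f"[local] {prompt}"


@register("cloud")
def cloud_backend(prompt: str) -> str:
    # Stand-in for a cloud provider; a real one would send data off-site.
    return f"[cloud] {prompt}"


@dataclass
class Client:
    """Hypothetical client: same prompt() call regardless of backend."""
    backend: str = "local"
    allow_cloud: bool = False  # assumption: cloud use must be opted into

    def prompt(self, text: str) -> str:
        if self.backend not in BACKENDS:
            raise ValueError(f"unknown backend: {self.backend}")
        if self.backend == "cloud" and not self.allow_cloud:
            raise PermissionError("cloud backends are not permitted here")
        return BACKENDS[self.backend](text)
```

In this sketch, switching backends changes only the `backend` argument, and the calling code stays the same. Defaulting `allow_cloud` to `False` reflects the local-first stance the paper describes, though the actual toolkit may enforce this differently.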

Abstract

We present OnPrem.LLM, a Python-based toolkit for applying large language models (LLMs) to sensitive, non-public data in offline or restricted environments. The system is designed for privacy-preserving use cases and provides prebuilt pipelines for document processing and storage, retrieval-augmented generation (RAG), information extraction, summarization, classification, and prompt/output processing with minimal configuration. OnPrem.LLM supports multiple LLM backends -- including llama.cpp, Ollama, vLLM, and Hugging Face Transformers -- with quantized model support, GPU acceleration, and seamless backend switching. Although designed for fully local execution, OnPrem.LLM also supports integration with a wide range of cloud LLM providers when permitted, enabling hybrid deployments that balance performance with data control. A no-code web interface extends accessibility to…
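To make the retrieval-augmented generation (RAG) step mentioned in the abstract concrete, here is a minimal sketch that runs entirely locally. It is an illustration under simplifying assumptions, not the toolkit's pipeline: retrieval uses plain word overlap instead of a vector store, and the helper names `tokenize`, `retrieve`, and `build_prompt` are hypothetical. The resulting prompt would then be passed to whichever LLM backend is in use.

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return up to k documents ranked by word overlap with the query.

    Assumption: word overlap stands in for the embedding similarity
    that a real RAG system would typically use.
    """
    query_counts = Counter(tokenize(query))
    scored = []
    for index, doc in enumerate(documents):
        doc_counts = Counter(tokenize(doc))
        score = sum(min(query_counts[t], doc_counts[t]) for t in query_counts)
        scored.append((score, index))
    # Highest score first; ties keep the original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [documents[i] for score, i in scored[:k] if score > 0]


def build_prompt(query: str, passages: List[str]) -> str:
    """Assemble a prompt that grounds the question in retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

A call such as `build_prompt(q, retrieve(q, docs))` yields a single string in which the question is preceded by the most relevant passages, which is the basic shape of a RAG prompt regardless of the retrieval method.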

Code & Models

Repositories

amaiya/onprem
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Big Data and Digital Economy · Computational Physics and Python Applications