Optimizing RAG Techniques for Automotive Industry PDF Chatbots: A Case Study with Locally Deployed Ollama Models
Fei Liu, Zejun Kang, Xing Han

TL;DR
This paper presents a multi-dimensional optimization approach for local Retrieval-Augmented Generation (RAG) techniques tailored to automotive PDF chatbot applications, improving document processing accuracy and relevance in low-performance environments.
Contribution
It introduces tailored PDF processing, retrieval, and context compression methods, along with custom classes and a self-RAG agent, specifically designed for automotive industry documents using Ollama models.
Findings
Significant improvements in context precision and recall.
Enhanced answer relevancy and faithfulness.
Notable performance gains on automotive datasets.
Abstract
With the growing demand for offline PDF chatbots in automotive industrial production environments, optimizing the deployment of large language models (LLMs) in local, low-performance settings has become increasingly important. This study focuses on enhancing Retrieval-Augmented Generation (RAG) techniques for processing complex automotive industry documents using locally deployed Ollama models. Based on the Langchain framework, we propose a multi-dimensional optimization approach for Ollama's local RAG implementation. Our method addresses key challenges in automotive document processing, including multi-column layouts and technical specifications. We introduce improvements in PDF processing, retrieval mechanisms, and context compression, tailored to the unique characteristics of automotive industry documents. Additionally, we design custom classes supporting embedding pipelines and an…
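The retrieval and context compression steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses neither Langchain nor Ollama, and it swaps in a bag-of-words cosine similarity where a real pipeline would use an embedding model. All names (`tokenize`, `retrieve`, `compress`, `build_context`, `STOPWORDS`) are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "is", "are", "of", "for", "to", "what", "and", "in", "on"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query.

    Stand-in for embedding-based retrieval: a bag-of-words vector
    replaces the embedding model.
    """
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def compress(query, chunk):
    """Keep only the sentences of a chunk that share a term with the query."""
    q_terms = set(tokenize(query))
    sentences = re.split(r"(?<=[.!?])\s+", chunk.strip())
    return " ".join(s for s in sentences if q_terms & set(tokenize(s)))


def build_context(query, chunks, k=2):
    """Retrieve the top-k chunks, compress each, and join the non-empty results."""
    parts = [compress(query, c) for c in retrieve(query, chunks, k)]
    return "\n".join(p for p in parts if p)
```

In a setup like the one the abstract describes, the resulting context string would be placed in the prompt sent to the locally deployed model. Compressing before prompting keeps the prompt short, which matters most in the low-performance environments the paper targets.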
Taxonomy
Topics: AI in Service Interactions
Methods: Byte Pair Encoding · Softmax · Dense Connections · Dropout · Linear Layer · Attention Dropout · Residual Connection · Linear Warmup With Linear Decay · BART
