Optimizing RAG Techniques for Automotive Industry PDF Chatbots: A Case Study with Locally Deployed Ollama Models

Fei Liu; Zejun Kang; Xing Han

arXiv:2408.05933 · cs.IR · August 13, 2024 · 5 cites

Open Access

TL;DR

This paper presents a multi-dimensional optimization approach for local Retrieval-Augmented Generation (RAG) techniques tailored to automotive PDF chatbot applications, improving document processing accuracy and relevance in low-performance environments.

Contribution

It introduces tailored PDF processing, retrieval, and context compression methods, along with custom classes and a self-RAG agent, specifically designed for automotive industry documents using Ollama models.
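To make the retrieval and context compression steps concrete, here is a minimal sketch of a retrieve-then-compress pipeline. It is not the paper's implementation: the bag-of-words cosine similarity is a toy stand-in for an embedding model served through Ollama, and the function names (`retrieve`, `compress`), the threshold value, and the sample chunks are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a toy stand-in for a real embedding model."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (hypothetical helper)."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def compress(query, chunk, threshold=0.2):
    """Keep only the sentences whose similarity to the query exceeds the
    threshold, so less irrelevant text reaches the language model."""
    q = Counter(tokenize(query))
    sentences = re.split(r"(?<=[.!?])\s+", chunk.strip())
    kept = [s for s in sentences if cosine(q, Counter(tokenize(s))) > threshold]
    return " ".join(kept)


# Made-up document chunks, for illustration only.
chunks = [
    "The torque for the wheel bolts is 120 Nm. Always use a calibrated wrench.",
    "The paint shop operates in three shifts. Coffee is available in the lobby.",
    "Wheel bolts must be checked after 50 km. The warranty covers defects.",
]
query = "What torque do the wheel bolts need?"
context = [compress(query, c) for c in retrieve(query, chunks)]
print(context)
```

Compressing after retrieval shortens the prompt, which matters most in the low-performance local settings the paper targets, where context length and latency are constrained.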

Findings

01. Significant improvements in context precision and recall.

02. Enhanced answer relevancy and faithfulness.

03. Notable performance gains on automotive datasets.

Abstract

With the growing demand for offline PDF chatbots in automotive industrial production environments, optimizing the deployment of large language models (LLMs) in local, low-performance settings has become increasingly important. This study focuses on enhancing Retrieval-Augmented Generation (RAG) techniques for processing complex automotive industry documents using locally deployed Ollama models. Based on the Langchain framework, we propose a multi-dimensional optimization approach for Ollama's local RAG implementation. Our method addresses key challenges in automotive document processing, including multi-column layouts and technical specifications. We introduce improvements in PDF processing, retrieval mechanisms, and context compression, tailored to the unique characteristics of automotive industry documents. Additionally, we design custom classes supporting embedding pipelines and an…
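The multi-column layout challenge named in the abstract can be illustrated with a small sketch. A naive top-to-bottom sort of text blocks interleaves the columns, so one common remedy is to assign each block to a column first and then sort within it. The `Block` type, the `reading_order` function, the equal-width column assumption, and the coordinates below are all hypothetical; the paper's actual PDF processing may differ.

```python
from typing import NamedTuple


class Block(NamedTuple):
    x0: float   # left edge of the text block
    y0: float   # top edge (origin at top-left, y grows downward)
    text: str


def reading_order(blocks, page_width, n_columns=2):
    """Order text blocks column by column, top to bottom within each column.

    Assumes equal-width columns; real pages may need gap detection instead.
    """
    col_width = page_width / n_columns

    def key(block):
        # Clamp so a block starting at the right page edge stays in the last column.
        column = min(int(block.x0 // col_width), n_columns - 1)
        return (column, block.y0)

    return [b.text for b in sorted(blocks, key=key)]


# Made-up blocks in the interleaved order a naive extractor might produce.
blocks = [
    Block(310, 80, "Right column, first paragraph."),
    Block(40, 80, "Left column, first paragraph."),
    Block(40, 300, "Left column, second paragraph."),
    Block(310, 300, "Right column, second paragraph."),
]
ordered = reading_order(blocks, page_width=600)
print(ordered)
```

Restoring reading order before chunking keeps each chunk semantically contiguous, which in turn helps the retriever return coherent passages.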

Taxonomy

Topics: AI in Service Interactions

Methods: Byte Pair Encoding · Softmax · Dense Connections · Dropout · Linear Layer · Attention Dropout · Residual Connection · Linear Warmup With Linear Decay · BART