EnterpriseEM: Fine-tuned Embeddings for Enterprise Semantic Search
Kamalkumar Rathinasamy, Jayarama Nettar, Amit Kumar, Vishal Manchanda, Arun Vijayakumar, Ayush Kataria, Venkateshprasanna Manjunath, Chidambaram GS, Jaskirat Singh Sodhi, Shoeb Shaikh, Wasim Akhtar Khan, Prashant Singh, Tanishq Dattatray Ige, Vipin Tiwari, Rajab Ali Mondal

TL;DR
This paper introduces a methodology for adapting pre-trained embedding models to enterprise data, improving the accuracy and relevance of semantic search results in organizational information retrieval tasks.
Contribution
It presents a comprehensive process for fine-tuning embeddings specifically for enterprise environments, enhancing retrieval performance over generic models.
Findings
Fine-tuned embeddings improve search precision in enterprise data.
The methodology enhances relevance of retrieved information.
Fine-tuning adapts models to enterprise-specific data characteristics.
Abstract
Enterprises grapple with the significant challenge of managing proprietary unstructured data, hindering efficient information retrieval. This has led to the emergence of AI-driven information retrieval solutions, designed to adeptly extract relevant insights to address employee inquiries. These solutions often leverage pre-trained embedding models and generative models as foundational components. While pre-trained embeddings may exhibit proximity or disparity based on their original training objectives, they might not fully align with the unique characteristics of enterprise-specific data, leading to suboptimal alignment with the retrieval goals of enterprise environments. In this paper, we propose a comprehensive methodology for contextualizing pre-trained embedding models to enterprise environments, covering the entire process from data preparation to model fine-tuning and evaluation.…
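As an illustration of why embedding "proximity" matters for retrieval, here is a minimal sketch of how a semantic search system might rank documents and score the result. The abstract does not name a similarity function or an evaluation metric, so cosine similarity and recall@k are assumptions here. The function names, document ids, and two-dimensional vectors are hypothetical toy values, not data or code from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return [doc_id for doc_id, _ in sorted(scored, key=lambda p: p[1], reverse=True)]

def recall_at_k(queries, doc_vecs, relevant, k):
    """Fraction of queries whose single relevant document appears in the top k."""
    hits = sum(
        1 for qid, qvec in queries.items()
        if relevant[qid] in rank_documents(qvec, doc_vecs)[:k]
    )
    return hits / len(queries)

# Hypothetical toy embeddings; a real system would obtain these from an embedding model.
docs = {"hr_policy": [1.0, 0.0], "vpn_guide": [0.0, 1.0]}
queries = {"q1": [0.9, 0.1], "q2": [0.2, 0.8]}
relevant = {"q1": "hr_policy", "q2": "vpn_guide"}

score = recall_at_k(queries, docs, relevant, k=1)
```

Under these assumptions, comparing this score for a generic model and for a fine-tuned one on the same held-out enterprise queries is one way to check whether fine-tuning moved relevant documents closer to their queries.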
Taxonomy
Topics: Software Engineering Research · Scientific Computing and Data Management · Open Source Software Innovations
Methods: ALIGN
