ODYS: A Massively-Parallel Search Engine Using a DB-IR Tightly-Integrated Parallel DBMS
Kyu-Young Whang, Tae-Seob Yun, Yeon-Mi Yeo, Il-Yeol Song, Hyuk-Yoon Kwon, In-Joong Kim

TL;DR
ODYS is a massively-parallel search engine built on a DB-IR tightly-integrated parallel DBMS. The authors report commercial-level scalability and performance: 1 billion queries per day over 30 billion pages with 43,472 nodes, at an average response time of 211 ms.
Contribution
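The paper proposes building a massively-parallel search engine on a DB-IR tightly-integrated parallel DBMS rather than on a distributed file system. The DBMS offers a higher-level (SQL-level) interface, which makes application development easier and less error-prone while still providing scalability. The paper also presents a hybrid (analytic and experimental) performance model for the engine.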
Findings
Handles 1 billion queries per day with 43,472 nodes
Achieves an average response time of 211 ms for 30 billion pages
Doubling the node count to 86,944 reduces the average response time to 162 ms
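To put the figures above in more familiar units, the following is a minimal back-of-the-envelope sketch, not taken from the paper. It assumes the daily query load is spread evenly over 24 hours and evenly across nodes, which real traffic would not be; the function names are hypothetical. Under those assumptions, 1 billion queries per day works out to roughly 11,574 queries per second in aggregate, and doubling the node count cuts the average response time by about 23%.

```python
# Back-of-the-envelope arithmetic on the reported figures.
# Assumption (not from the paper): load is uniform over the day and across nodes.

SECONDS_PER_DAY = 24 * 60 * 60

# Figures as reported in the Findings list.
QUERIES_PER_DAY = 1_000_000_000
NODES = 43_472
LATENCY_MS = 211          # average response time with 43,472 nodes
LATENCY_DOUBLED_MS = 162  # average response time with twice as many nodes


def aggregate_qps(queries_per_day: float) -> float:
    """Average queries per second across the whole system."""
    return queries_per_day / SECONDS_PER_DAY


def per_node_qps(queries_per_day: float, nodes: int) -> float:
    """Average queries per second attributable to one node."""
    return aggregate_qps(queries_per_day) / nodes


def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease going from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    print(f"aggregate: {aggregate_qps(QUERIES_PER_DAY):,.0f} queries/s")
    print(f"per node:  {per_node_qps(QUERIES_PER_DAY, NODES):.3f} queries/s")
    print(f"nodes after doubling: {2 * NODES:,}")
    cut = relative_reduction(LATENCY_MS, LATENCY_DOUBLED_MS)
    print(f"latency reduction from doubling: {cut:.1%}")
```

The latency gain from doubling is well short of a halving, so the extra nodes buy responsiveness at a decreasing rate.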
Abstract
Recently, parallel search engines have been implemented based on scalable distributed file systems such as Google File System. However, we claim that building a massively-parallel search engine using a parallel DBMS can be an attractive alternative since it supports a higher-level (i.e., SQL-level) interface than that of a distributed file system for easy and less error-prone application development while providing scalability. In this paper, we propose a new approach of building a massively-parallel search engine using a DB-IR tightly-integrated parallel DBMS and demonstrate its commercial-level scalability and performance. In addition, we present a hybrid (i.e., analytic and experimental) performance model for the parallel search engine. We have built a five-node parallel search engine according to the proposed architecture using a DB-IR tightly-integrated DBMS. Through extensive…
Taxonomy
Topics: Web Data Mining and Analysis · Advanced Database Systems and Queries · Data Management and Algorithms
