Odysseus/DFS: Integration of DBMS and Distributed File System for Transaction Processing of Big Data
Jun-Sung Kim, Kyu-Young Whang, Hyuk-Yoon Kwon, and Il-Yeol Song

TL;DR
This paper introduces Odysseus/DFS, a novel architecture integrating a relational DBMS with distributed file systems to support scalable, reliable, and high-level transaction processing suitable for big data analytics.
Contribution
It proposes a new architecture combining RDBMS and DFS, including the concept of meta DFS files and efficient transaction management, enabling high-level DBMS functionalities on distributed storage.
Findings
Odysseus/DFS outperforms HBase in transaction processing.
Its performance is comparable to that of an RDBMS using local storage, with minimal overhead.
It supports large-scale, reliable, and scalable data management for big data analytics.
Abstract
The relational DBMS (RDBMS) has been widely used because it supports various high-level functionalities such as SQL, schemas, indexes, and transactions that do not exist in the O/S file system. However, the recent advent of big data technology has facilitated the development of new systems that sacrifice DBMS functionality in order to manage large-scale data efficiently. These so-called NoSQL systems use a distributed file system (DFS), which supports scalability and reliability. They support scalability by storing data across a large number of low-cost commodity machines, and they support reliability by storing the data in replicas. However, they have the drawback that they do not adequately support high-level DBMS functionality. In this paper, we propose an architecture of a DBMS that uses the DFS as storage. With this novel architecture, the DBMS is capable of supporting scalability and reliability…
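The abstract's point that a DFS gains reliability by storing data in replicas across commodity machines can be illustrated with a minimal sketch. It is not the mechanism of Odysseus/DFS or of any particular DFS: the round-robin placement policy, the node names, and the functions `place_replicas` and `readable_blocks` are all assumptions made for illustration.

```python
def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Hypothetical placement policy, chosen only for simplicity.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds node count")
    placement = {}
    for block in range(num_blocks):
        # Consecutive nodes (mod cluster size) are distinct as long as
        # replication <= len(nodes).
        placement[block] = [
            nodes[(block + i) % len(nodes)] for i in range(replication)
        ]
    return placement


def readable_blocks(placement, failed):
    """Return the blocks that still have at least one replica on a live node."""
    return [
        block
        for block, holders in placement.items()
        if any(node not in failed for node in holders)
    ]


# Illustrative cluster of four commodity nodes holding eight blocks.
nodes = ["node0", "node1", "node2", "node3"]
placement = place_replicas(8, nodes, replication=3)

# With three replicas per block, any two node failures leave every block readable.
survivors = readable_blocks(placement, failed={"node1", "node2"})

# Losing three nodes can make some blocks unavailable.
degraded = readable_blocks(placement, failed={"node0", "node1", "node2"})
```

In this sketch, scalability comes from adding entries to `nodes`, and reliability comes from the replication factor: a block is lost only when every node holding one of its replicas fails.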
Taxonomy
Topics: Cloud Computing and Resource Management · Advanced Data Storage Technologies · Advanced Database Systems and Queries
