A Survey of Semantics-Aware Performance Optimization for Data-Intensive Computing
Bingbing Rao, Liqiang Wang

TL;DR
This survey reviews semantics-aware techniques that enhance performance in data-intensive computing, highlighting recent advances, challenges, and future research directions in optimizing large-scale data processing systems.
Contribution
It provides a comprehensive overview of semantics-aware performance optimization approaches, categorizes existing techniques, and discusses future research challenges in data-intensive computing.
Findings
Identified four types of performance defects in data-intensive systems.
Surveyed state-of-the-art semantics-aware optimization techniques.
Discussed key research challenges and future opportunities.
Abstract
We are living in the era of Big Data and witnessing an explosion of data. Given the limitations of CPU and I/O in a single computer, the mainstream approach to scalability is to distribute computations among a large number of processing nodes in a cluster or cloud. This paradigm gives rise to the term data-intensive computing, which denotes a data-parallel approach to processing massive volumes of data. Through the efforts of different disciplines, several promising programming models and platforms have been proposed for data-intensive computing, such as MapReduce, Hadoop, Apache Spark, and Dryad. Even though a large body of research has been proposed to improve the overall performance of these platforms, there is still a gap between the actual performance demand and the capability of current commodity systems. This paper aims to provide a comprehensive understanding…
Taxonomy
Topics: Cloud Computing and Resource Management · Graph Theory and Algorithms · IoT and Edge/Fog Computing
