Measuring and Managing Answer Quality for Online Data-Intensive Services
Jaimie Kelley, Christopher Stewart, Nathaniel Morris, Devesh Tiwari, Yuxiong He, and Sameh Elnikety

TL;DR
Ubora is a novel system that measures the impact of slow components on answer quality in online data services, enabling better online admission control and improving query throughput.
Contribution
The paper introduces Ubora, a system that measures answer quality efficiently by replaying network messages exchanged between components. It works across a range of platforms and supports better online query management.
Findings
Ubora accurately measures answer quality faster than existing methods.
Using answer quality for admission control increases query throughput by 37%.
Ubora is compatible with platforms like Hadoop, Lucene, and question answering systems.
Abstract
Online data-intensive services parallelize query execution across distributed software components. Interactive response time is a priority, so online query executions return answers without waiting for slow running components to finish. However, data from these slow components could lead to better answers. We propose Ubora, an approach to measure the effect of slow running components on the quality of answers. Ubora randomly samples online queries and executes them twice. The first execution elides data from slow components and provides fast online answers; the second execution waits for all components to complete. Ubora uses memoization to speed up mature executions by replaying network messages exchanged between components. Our systems-level implementation works for a wide range of platforms, including Hadoop/Yarn, Apache Lucene, the EasyRec Recommendation Engine, and the OpenEphyra…
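The sample-and-execute-twice idea in the abstract can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not Ubora's implementation: the names `MemoizingRunner`, `answer_quality`, and `measure` are hypothetical, components are modeled as plain functions that return a simulated latency and a result list, and quality is assumed to be the fraction of the mature answer that also appears in the online answer.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical model: a component maps a query to
# (simulated latency in seconds, list of results).
Component = Callable[[str], Tuple[float, List[str]]]


class MemoizingRunner:
    """Runs a query online (with a timeout) and again as a mature execution."""

    def __init__(self, components: Dict[str, Component], timeout: float):
        self.components = components
        self.timeout = timeout
        # Memo of replies already seen, keyed by (component name, query).
        self.memo: Dict[Tuple[str, str], List[str]] = {}
        self.calls = 0  # number of component invocations, for illustration

    def _call(self, name: str, query: str) -> Tuple[float, List[str]]:
        self.calls += 1
        return self.components[name](query)

    def online(self, query: str) -> List[str]:
        """Fast execution: replies slower than the timeout are elided."""
        answer: List[str] = []
        for name in self.components:
            latency, results = self._call(name, query)
            if latency <= self.timeout:
                self.memo[(name, query)] = results  # record for later replay
                answer.extend(results)
        return answer

    def mature(self, query: str) -> List[str]:
        """Complete execution: replay memoized replies, wait for the rest."""
        answer: List[str] = []
        for name in self.components:
            key = (name, query)
            if key not in self.memo:
                _, results = self._call(name, query)  # no timeout applied
                self.memo[key] = results
            answer.extend(self.memo[key])
        return answer


def answer_quality(online: List[str], mature: List[str]) -> float:
    """Assumed metric: share of mature results present in the online answer."""
    if not mature:
        return 1.0
    seen = set(online)
    return sum(1 for item in mature if item in seen) / len(mature)


def measure(runner: MemoizingRunner, queries: List[str],
            sample_rate: float, rng: random.Random) -> List[float]:
    """Answer every query online; run a random sample a second time."""
    scores: List[float] = []
    for query in queries:
        online_answer = runner.online(query)
        if rng.random() < sample_rate:
            scores.append(answer_quality(online_answer, runner.mature(query)))
    return scores
```

In this sketch the mature execution only re-invokes components whose replies missed the online timeout, which is the intuition behind using memoization to make the second execution cheaper.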
Taxonomy
Topics: Cloud Computing and Resource Management · Scientific Computing and Data Management · Software System Performance and Reliability
