Pliant: Leveraging Approximation to Improve Datacenter Resource Efficiency
Neeraj Kulkarni, Feng Qi, Christina Delimitrou

TL;DR
Pliant is a lightweight cloud runtime that uses approximate computing to increase datacenter resource utilization while maintaining QoS for latency-critical services, with minimal quality loss.
Contribution
Pliant introduces an interference-aware approximation approach that improves resource efficiency in multi-tenant datacenter environments.
Findings
Preserves QoS for all workloads during high contention
Incurs an average output quality loss of 2.1%
Enhances server utilization through approximation techniques
Abstract
Cloud multi-tenancy is typically constrained to a single interactive service colocated with one or more batch, low-priority services, whose performance can be sacrificed when deemed necessary. Approximate computing applications offer the opportunity to enable tighter colocation among multiple applications whose performance is important. We present Pliant, a lightweight cloud runtime that leverages the ability of approximate computing applications to tolerate some loss in their output quality to boost the utilization of shared servers. During periods of high resource contention, Pliant employs incremental and interference-aware approximation to reduce contention in shared resources, and prevent QoS violations for co-scheduled interactive, latency-critical services. We evaluate Pliant across different interactive and approximate computing applications, and show that it preserves QoS for…
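The incremental part of the approach described in the abstract can be illustrated with a small feedback loop. This is a minimal sketch under stated assumptions, not Pliant's actual algorithm: the class name `ApproxController`, the parameters `qos_target_ms`, `max_level`, and `slack`, and the one-step-at-a-time policy are all hypothetical. It does not model interference awareness (choosing which shared resource to relieve); it only shows how an approximation level could be raised step by step while a latency-critical service misses its QoS target, and lowered again once contention subsides.

```python
from dataclasses import dataclass


@dataclass
class ApproxController:
    """Hypothetical controller; names and policy are illustrative assumptions."""
    qos_target_ms: float   # assumed tail-latency target of the interactive service
    max_level: int         # assumed number of approximation levels available
    slack: float = 0.8     # assumed: relax only when latency < slack * target
    level: int = 0         # 0 means precise (non-approximate) execution

    def step(self, tail_latency_ms: float) -> int:
        """Adjust the approximation level by at most one step per measurement."""
        if tail_latency_ms > self.qos_target_ms:
            # QoS violated: approximate more aggressively, up to the cap.
            self.level = min(self.level + 1, self.max_level)
        elif tail_latency_ms < self.slack * self.qos_target_ms:
            # Comfortable slack: recover output quality one level at a time.
            self.level = max(self.level - 1, 0)
        # Otherwise latency is near the target: hold the current level.
        return self.level


def run(controller: ApproxController, latencies: list[float]) -> list[int]:
    """Feed a sequence of latency measurements and record the chosen levels."""
    return [controller.step(x) for x in latencies]
```

Changing the level by a single step per measurement keeps the quality loss no larger than what the observed contention requires, at the cost of reacting more slowly to sudden load spikes.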
