Container Profiler: Profiling Resource Utilization of Containerized Big Data Pipelines
Varik Hoang, Ling-Hong Hung, David Perez, Huazeng Deng, Raymond Schooley, Niharika Arumilli, Ka Yee Yeung, Wes Lloyd

TL;DR
The paper introduces the Container Profiler, a tool for detailed resource usage profiling of containerized tasks, enabling continuous monitoring and bottleneck identification in complex data pipelines.
Contribution
It presents a novel, comprehensive profiling tool that measures multiple resource metrics at various system levels with minimal overhead.
Findings
Effective profiling of bioinformatics pipeline stages
Negligible overhead of the profiling tool
Insights into resource utilization patterns
Abstract
This paper presents the Container Profiler, a software tool that measures and records the resource usage of any containerized task. Our tool profiles the CPU, memory, disk, and network utilization of containerized tasks, collecting over fifty Linux operating system metrics at the virtual machine, container, and process levels. The Container Profiler supports time series profiling at a configurable sampling interval to enable continuous monitoring of the resources consumed by containerized tasks and pipelines. To investigate the utility of the Container Profiler, we profile the resource utilization requirements of a multi-stage bioinformatics analytical pipeline (RNA sequencing using unique molecular identifiers). We examine profiling metrics to assess patterns of CPU, disk, and network resource utilization across the different stages of the pipeline. We also quantify the…
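The time series profiling described in the abstract can be illustrated with a minimal Python sketch. This is not the Container Profiler's implementation: the function names (`parse_proc_stat`, `read_proc_stat`, `profile`) and the record layout are hypothetical, and the sketch covers only machine-level CPU counters read from the Linux `/proc/stat` file, not the container-level or process-level metrics the tool also collects.

```python
import time
from typing import Callable, Dict, List

# First eight counters on the aggregate "cpu" line of /proc/stat (units: clock ticks).
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_proc_stat(text: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line of /proc/stat-formatted text."""
    for line in text.splitlines():
        parts = line.split()
        # The aggregate line is labelled exactly "cpu"; per-core lines are "cpu0", "cpu1", ...
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(CPU_FIELDS)]]
            return dict(zip(CPU_FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def read_proc_stat() -> str:
    """Read the raw contents of /proc/stat (Linux only)."""
    with open("/proc/stat") as f:
        return f.read()


def profile(
    read_fn: Callable[[], str] = read_proc_stat,
    interval_s: float = 1.0,
    samples: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, float]]:
    """Collect `samples` timestamped CPU records, pausing `interval_s` between them."""
    records: List[Dict[str, float]] = []
    for i in range(samples):
        record: Dict[str, float] = {"timestamp": time.time()}
        record.update(parse_proc_stat(read_fn()))
        records.append(record)
        if i < samples - 1:  # no need to wait after the final sample
            sleep(interval_s)
    return records


if __name__ == "__main__":
    # Hypothetical usage: three samples, half a second apart, on a Linux host.
    for rec in profile(interval_s=0.5, samples=3):
        print(rec)
```

Because the counters are cumulative, the utilization over an interval would be obtained by subtracting consecutive records; passing `read_fn` as a parameter keeps the parsing logic testable on machines without `/proc`.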
Taxonomy
Topics: Cloud Computing and Resource Management · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
