Managing Large-Scale Transient Data in IoT Systems
Nanjangud C. Narendra, Sambit Nayak, Anshu Shukla

TL;DR
This paper introduces new dataflow checkpoint and migration techniques for IoT streaming platforms that enable fast, lossless migration and quick stabilization during scale-in and scale-out operations on cloud resources.
Contribution
It presents novel migration strategies for Apache Storm that significantly reduce migration time and improve stability without message loss.
Findings
Migration of large dataflows within 50 seconds
Migration time reduced by over 50% compared to default Storm
Applications stabilize faster with no message re-processing
Abstract
The pervasive availability of streaming data is driving interest in distributed Fast Data platforms for streaming applications. Such latency-sensitive applications need to respond to dynamism in input rates and task behavior by scaling in and out on elastic Cloud resources. Platforms like Apache Storm do not provide robust capabilities for responding to such dynamism or for rapid task migration across VMs. We propose several dataflow checkpoint and migration approaches that allow a running streaming dataflow to migrate without any loss of in-flight messages or internal task state, while reducing the time to recover and stabilize. We implement and evaluate these migration strategies on Apache Storm using micro and application dataflows, scaling in and out across 2-21 Azure VMs. Our results show that we can migrate dataflows of large sizes within 50 sec, in comparison…
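The idea of migrating a stateful task without losing in-flight messages or task state can be illustrated with a toy sketch. This is not the paper's algorithm and does not use Apache Storm's API; the `CountTask` class and `migrate` function are hypothetical names introduced only for illustration, and the "new VM" is simply a second in-memory object. The sketch assumes the simplest possible strategy: drain pending messages, snapshot the state, then restore it on a fresh task.

```python
from collections import deque


class CountTask:
    """Toy stateful task (hypothetical): keeps a running count per key."""

    def __init__(self, state=None):
        self.state = dict(state or {})
        self.inbox = deque()  # in-flight messages not yet processed

    def emit(self, key):
        self.inbox.append(key)

    def process_one(self):
        key = self.inbox.popleft()
        self.state[key] = self.state.get(key, 0) + 1

    def drain(self):
        # Process every in-flight message so none is lost or replayed.
        while self.inbox:
            self.process_one()

    def checkpoint(self):
        # Return a copy so the snapshot is independent of the old task.
        return dict(self.state)


def migrate(old_task):
    """Drain, checkpoint, then restore on a fresh task (stand-in for a new VM)."""
    old_task.drain()
    snapshot = old_task.checkpoint()
    return CountTask(state=snapshot)


old = CountTask()
for key in ["a", "b", "a"]:
    old.emit(key)
old.process_one()      # one message processed, two still in flight

new = migrate(old)     # in-flight messages are drained before the snapshot
new.emit("b")
new.drain()
print(new.state)       # {'a': 2, 'b': 2}
```

Each message is counted exactly once across the migration, which mirrors the stated goal of no message loss and no re-processing. A real streaming platform would also have to pause upstream sources and persist the snapshot to durable storage, which this sketch omits.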
