Metabolomics in the Cloud: Scaling Computational Tools to Big Data
Jianliang Gao, Noureddin Sadawi, Ibrahim Karaman, Jake T M Pearce, Pablo Moreno, Anders Larsson, Marco Capuccini, Paul Elliott, Jeremy K Nicholson, Timothy M D Ebbels, Robert Glen

TL;DR
This paper demonstrates that cloud-based metabolomics analysis platforms like PhenoMeNal can significantly reduce processing times for large datasets, offering scalable and cost-effective solutions for metabolomics research.
Contribution
It evaluates the computational scalability and efficiency of the PhenoMeNal platform across various cloud configurations, highlighting its advantages over traditional desktop processing.
Findings
Processing time fell from up to 4 days on a standard desktop computer to 10 minutes on the largest cluster.
Efficiency falls below 80% once more than about one-third of the maximum number of vCPUs is used.
Cloud platforms are cost-effective compared to desktops.
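The runtime and efficiency findings above can be related with the conventional definitions of parallel speedup and efficiency. The sketch below is illustrative only: it assumes the standard formulas (speedup = baseline time / parallel time, efficiency = speedup / number of vCPUs), which this summary does not confirm as the paper's exact definitions. It treats the desktop run as the baseline, and `hypothetical_vcpus = 1000` is an assumed cluster size taken from the upper end of the 1-1000 vCPU range, not a reported figure.

```python
def speedup(t_baseline_min: float, t_parallel_min: float) -> float:
    """Conventional speedup: how many times faster the parallel run is."""
    return t_baseline_min / t_parallel_min


def efficiency(t_baseline_min: float, t_parallel_min: float, n_vcpus: int) -> float:
    """Conventional parallel efficiency: speedup divided by the vCPU count."""
    return speedup(t_baseline_min, t_parallel_min) / n_vcpus


# Figures quoted in the summary: up to 4 days on a desktop, 10 minutes on the largest cluster.
desktop_min = 4 * 24 * 60   # 4 days expressed in minutes
cluster_min = 10

# Assumption for illustration only: the largest cluster size is not stated here.
hypothetical_vcpus = 1000

s = speedup(desktop_min, cluster_min)
e = efficiency(desktop_min, cluster_min, hypothetical_vcpus)

print(f"speedup: {s:.0f}x")
print(f"efficiency at {hypothetical_vcpus} vCPUs (assumed): {e:.1%}")
```

Because the baseline here is a desktop machine rather than a single vCPU of the same cluster, the efficiency value is only a rough indication. A like-for-like comparison would use the one-vCPU runtime on the same platform as the baseline.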
Abstract
Background: Metabolomics datasets are becoming increasingly large and complex, with multiple types of algorithms and workflows needed to process and analyse the data. A cloud infrastructure with portable software tools can provide much-needed resources, enabling faster processing of much larger datasets than would be possible at any individual lab. The PhenoMeNal project has developed such an infrastructure, allowing users to run analyses on local or commercial cloud platforms. We have examined the computational scaling behaviour of the PhenoMeNal platform using four different implementations across 1-1000 virtual CPUs using two common metabolomics tools. Results: Our results show that data which takes up to 4 days to process on a standard desktop computer can be processed in just 10 min on the largest cluster. Improved runtimes come at the cost of decreased efficiency, with all…
Taxonomy
Topics: Metabolomics and Mass Spectrometry Studies · Bioinformatics and Genomic Networks
