# FlowCron - Increasing access to HPC by wrapping Globus into a function-as-a-service

**Authors:** Dimitrios Bellos, James Allsopp, Elaine M. L. Ho, Tibor Auer, Gavin Yearwood, Andrew J. Morris, Mark Basham, Christopher Woods, Rick Wagner, Weijian Zheng

PMC · DOI: 10.12688/wellcomeopenres.23491.1 · Wellcome Open Research · 2025-01-13

## TL;DR

FlowCron links Globus and cron so that researchers can run analyses on high-performance computing (HPC) clusters with minimal training: data are transferred to the cluster, analysed, and returned automatically, along with a record of the analysis.

## Contribution

FlowCron introduces a function-as-a-service approach by integrating Globus and cron to simplify HPC access for researchers.

## Key findings

- FlowCron automates data transfer, analysis, and return while ensuring reproducibility and reducing administrative burden.
- The system relies only on software dependencies that are common to most HPC clusters, which keeps installation and maintenance simple.
- By lowering barriers to HPC usage, FlowCron can increase scientific throughput and return on investment in computational resources.

## Abstract

Despite significant investment in High-Performance Computing (HPC) clusters by funding councils, there are still many researchers whose workflows do not benefit from the computation speed that these clusters provide. Reducing barriers to entry for these researchers would accelerate their scientific throughput, since they would be able to respond to results in a timely fashion, either improving their protocols or correcting any problems that have arisen. This improves the quality of science, and therefore the return on investment, in computationally intensive areas such as Cryogenic Electron Microscopy (cryo-EM). This paper outlines a technique, FlowCron, that lets users analyse their data on an HPC facility with minimal training, increasing accessibility. FlowCron transfers responsibility for installing and maintaining data processing pipelines from users to HPC systems administrators, simplifies the setup of HPC pipelines, and makes pipelines as reliable as possible once set up. The work described here has software dependencies that are common to the majority of HPC clusters.

We achieve this by linking Globus and cron to produce an open-source system that requires little administrative support but provides a very easy way of running an analysis on an HPC system. The user starts the analysis through the Globus website; the data are then encrypted, uploaded to the HPC system, analysed, and returned to the originating machine, along with a record of the analysis.
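The cron side of such a design can be illustrated with a minimal sketch. This is not the authors' implementation: the directory layout, the marker file names (`TRANSFER_COMPLETE`, `ANALYSIS_DONE`), the record file name, and the function names are all hypothetical. It assumes that a separate transfer step (for example, a Globus transfer) drops each job into its own folder and writes a marker file when the upload has finished. A script run periodically by cron then picks up ready jobs, runs the analysis command, and writes a record of what was done.

```python
from __future__ import annotations

import json
import subprocess
import time
from pathlib import Path

# Hypothetical marker and record names; the paper summary does not specify any.
READY_MARKER = "TRANSFER_COMPLETE"   # written once the upload has finished
DONE_MARKER = "ANALYSIS_DONE"        # written once the analysis has run
RECORD_NAME = "analysis_record.json"


def find_ready_jobs(incoming: Path) -> list[Path]:
    """Return job folders that are fully uploaded but not yet analysed."""
    return sorted(
        d for d in incoming.iterdir()
        if d.is_dir()
        and (d / READY_MARKER).exists()
        and not (d / DONE_MARKER).exists()
    )


def process_job(job_dir: Path, command: list[str]) -> dict:
    """Run the analysis command inside the job folder and log the outcome."""
    started = time.time()
    result = subprocess.run(command, cwd=job_dir, capture_output=True, text=True)
    record = {
        "job": job_dir.name,
        "command": command,
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
        "started": started,
        "finished": time.time(),
    }
    # The record travels back with the results, supporting reproducibility.
    (job_dir / RECORD_NAME).write_text(json.dumps(record, indent=2))
    # Mark the job so that the next cron invocation skips it.
    (job_dir / DONE_MARKER).touch()
    return record


def poll_once(incoming: Path, command: list[str]) -> list[dict]:
    """One polling pass, intended to be invoked periodically by cron."""
    return [process_job(job, command) for job in find_ready_jobs(incoming)]
```

Under these assumptions, a crontab entry would call a small wrapper around `poll_once` every few minutes, and the return transfer would be triggered once `ANALYSIS_DONE` appears. Using marker files keeps the polling pass idempotent: running it twice does not re-analyse a finished job.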

Despite significant investment by public funding councils in multi-million-pound high-speed computing clusters, many researchers still do not use them in their work. The hope is that reducing barriers to entry will lead more researchers to use these clusters and become more productive. This will accelerate their scientific throughput, allowing them to adapt to experimental results in a timely fashion, improving their protocols or correcting any problems that arise before significant costs are incurred. As a result, there will be a better return on the investment in those computer clusters and other equipment. This paper outlines a technique called FlowCron, which allows users to analyse their data with minimal training. It automatically transfers data to these clusters, performs the computation needed to analyse and process the data, returns the processed data to the user, and then optionally deletes any data or files left on the cluster. Furthermore, all actions performed are documented and logged, encouraging reproducibility of the results and reducing the effort associated with troubleshooting, maintenance, and support. Finally, FlowCron has ubiquitous software dependencies, making it simple to install and use.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11885907/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11885907/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/PMC11885907/full.md

---
Source: https://tomesphere.com/paper/PMC11885907