# Prospects for Wideband VLBI Correlation in the Cloud

**Authors:** Ajay Gill, Lindy Blackburn, Arash Roshanineshat, Chi-Kwan Chan, Sheperd S. Doeleman, Michael D. Johnson, Alexander W. Raymond, Jonathan Weintroub

arXiv: 1908.03991 · 2021-03-16

## TL;DR

This paper proposes a cloud-based architecture for wideband VLBI correlation. It uses DiFX benchmarks on the Google Cloud Platform and an approximate cost model to assess scalability, flexibility, and cost relative to traditional dedicated clusters.

## Contribution

The paper proposes a cloud correlation architecture for VLBI, benchmarks it using DiFX-2.5.2 on the Google Cloud Platform, and compiles approximate cost estimates for comparison with fixed-hardware correlation clusters.

## Key findings

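- Cloud correlation can process entire experiments in parallel using flexibly allocated compute resources, rather than a fixed number of installed processor nodes and playback units.
- Benchmarks varied the number of virtual CPUs per Virtual Machine to find the optimum configuration, and the number of stations to measure how correlation time scales with array size.
- An approximate economic model suggests the cloud is an alternative path for high data rate, low duty cycle wideband VLBI correlation, though the cost figures depend heavily on cluster and media utilization.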
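As background for the array-size scaling mentioned above, the following is a minimal sketch of how the number of baselines grows with the number of stations. It is not taken from the paper: the function name `baseline_count` and the station counts in the loop are illustrative assumptions. It only encodes the standard combinatorial fact that an array of N stations has N(N-1)/2 distinct station pairs, which is why per-baseline work in a correlator grows roughly quadratically while per-station work grows linearly.

```python
def baseline_count(n_stations: int) -> int:
    """Return the number of distinct station pairs (baselines) in an array.

    Hypothetical helper for illustration; not part of DiFX or the paper.
    """
    if n_stations < 2:
        # A single station (or none) forms no cross-correlation baseline.
        return 0
    return n_stations * (n_stations - 1) // 2


# Illustrative station counts (assumed values, not the paper's benchmark grid).
for n in (2, 4, 8, 16):
    print(f"{n:2d} stations -> {baseline_count(n):3d} baselines")
```

Doubling the array from 8 to 16 stations raises the baseline count from 28 to 120, so measured correlation time as a function of station count is a more reliable guide than linear extrapolation.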

## Abstract

This paper proposes a cloud architecture for the correlation of wide bandwidth VLBI data. Cloud correlation facilitates processing of entire experiments in parallel using flexibly allocated and practically unlimited compute resources. This approach offers a potential improvement over dedicated correlation clusters, which are constrained by a fixed number of installed processor nodes and playback units. Additionally, cloud storage offers an alternative to maintaining a fleet of hard-disk drives that might be utilized intermittently. We describe benchmarks of VLBI correlation using the DiFX-2.5.2 software on the Google Cloud Platform to assess cloud-based correlation performance. The number of virtual CPUs per Virtual Machine was varied to determine the optimum configuration of cloud resources. The number of stations was varied to determine the scaling of correlation time with VLBI arrays of different sizes. Data transfer rates from Google Cloud Storage to the Virtual Machines performing the correlation were also measured. We also present an example cloud correlation configuration. Current cloud service and equipment pricing data is used to compile cost estimates allowing an approximate economic comparison between cloud and cluster processing. The economic comparisons are based on cost figures which are a moving target, and are highly dependent on factors such as the utilization of cluster and media, which are a challenge to estimate. Our model suggests that shifting to the cloud is an alternative path for high data rate, low duty cycle wideband VLBI correlation that should continue to be explored. In the production phase of VLBI correlation, the cloud has the potential to significantly reduce data processing times and allow the processing of more science experiments in a given year for the petabyte-scale data sets increasingly common in both astronomy and geodesy VLBI applications.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.03991/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1908.03991/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1908.03991/full.md

---
Source: https://tomesphere.com/paper/1908.03991