# OpenCluster: A Flexible Distributed Computing Framework for Astronomical Data Processing

**Authors:** Shoulin Wei, Feng Wang, Hui Deng, Cuiyin Liu, Wei Dai, Bo Liang, Ying Mei, Congming Shi, Yingbo Liu, Jingping Wu

arXiv: 1701.04907 · 2017-01-25

## TL;DR

OpenCluster is an open-source distributed computing framework for processing large volumes of astronomical data. It offers high fault tolerance, simple APIs, and a scalable architecture to support the rapid development of processing pipelines.

## Contribution

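The paper introduces a flexible, scalable, and easy-to-use distributed framework tailored to astronomical data processing. It targets the limitations and programming difficulty that make existing high-performance architectures and frameworks a poor fit for astronomers.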

## Key findings

- Demonstrated OpenCluster on complex data processing problems in a pipeline for the Mingantu Ultrawide Spectral Radioheliograph.
- Reported high fault tolerance and flexible scaling of the number of interacting entities in the performance evaluation.
- Lowers software development expenses for astronomical data pipelines.

## Abstract

The volume of data generated by modern astronomical telescopes is extremely large and rapidly growing. However, current high-performance data processing architectures and frameworks are not well suited to astronomers because of their limitations and programming difficulties. In this paper, we therefore present OpenCluster, an open-source distributed computing framework that supports the rapid development of high-performance processing pipelines for astronomical big data. We first detail the design principles and implementation of OpenCluster and present the APIs provided by the framework. We then present a case in which OpenCluster is used to resolve complex data processing problems in developing a pipeline for the Mingantu Ultrawide Spectral Radioheliograph. Finally, we present our performance evaluation of OpenCluster. Overall, OpenCluster provides not only high fault tolerance and simple programming interfaces, but also a flexible means of scaling up the number of interacting entities. OpenCluster thereby provides an easily integrated distributed computing framework for quickly developing high-performance data processing systems for astronomical telescopes and for significantly reducing software development expenses.

---
Source: https://tomesphere.com/paper/1701.04907