# Architecture of processing and analysis system for big astronomical data

**Authors:** Ivan Kolosov, Sergey Gerasimov, Alexander Meshcheryakov

arXiv: 1703.10979 · 2017-04-03

## TL;DR

This paper applies Hadoop and Spark, deployed in the cloud, to big data processing in astronomy, comparing their overhead, execution time, and API flexibility on an image co-adding task.

## Contribution

It demonstrates that both Hadoop and Spark are feasible for astronomical data processing and that their performance is generally on par, while highlighting the flexibility of the Spark API for pipeline construction.

## Key findings

- Hadoop and Spark show generally comparable overhead and execution time in image co-adding.
- The Spark API is more flexible, which makes it easier to build astronomical data processing pipelines.
- Both frameworks are suitable for cloud-based astronomical data analysis.

## Abstract

This work explores the use of big data technologies deployed in the cloud for processing astronomical data. We have applied Hadoop and Spark to the task of co-adding astronomical images. We compared the overhead and execution time of these frameworks. We conclude that the performance of the two frameworks is generally on par. The Spark API is more flexible, which allows one to easily construct astronomical data processing pipelines.
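
To make the co-adding task concrete, here is a minimal sketch in plain Python of how it can be phrased as a map step followed by a reduce step, the shape that both Hadoop and Spark expect. It is an illustration, not the authors' implementation: it assumes the images are already aligned on a common pixel grid, uses a simple per-pixel mean as the combination rule, and marks missing pixels with `None`. The names `to_partial`, `merge`, and `coadd` are hypothetical.

```python
from functools import reduce


def to_partial(image):
    """Map step: turn one image into (per-pixel sums, per-pixel counts).

    Missing pixels (None) contribute 0 to the sum and 0 to the count.
    """
    sums = [[0.0 if px is None else float(px) for px in row] for row in image]
    counts = [[0 if px is None else 1 for px in row] for row in image]
    return sums, counts


def merge(a, b):
    """Reduce step: add two partial results element by element."""
    sums = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    counts = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a[1], b[1])]
    return sums, counts


def coadd(images):
    """Co-add aligned images (assumed same shape) into a per-pixel mean."""
    sums, counts = reduce(merge, map(to_partial, images))
    # Pixels that no image covered stay None.
    return [
        [s / c if c else None for s, c in zip(row_s, row_c)]
        for row_s, row_c in zip(sums, counts)
    ]


# Two tiny 2x2 "exposures" of the same sky patch, with some missing pixels.
images = [
    [[1.0, 2.0], [None, 4.0]],
    [[3.0, None], [None, 8.0]],
]
stacked = coadd(images)
```

Because `merge` is associative and commutative, partial results can be combined in any order and on any node, which is the property a distributed reduce relies on. In a real pipeline the map step would also have to read each image file and reproject it onto the common grid, which this sketch leaves out.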

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.10979/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1703.10979/full.md

## References

6 references — full list in the complete paper: https://tomesphere.com/paper/1703.10979/full.md

---
Source: https://tomesphere.com/paper/1703.10979