# A processing and analytics system for microscopy data workflows: the Pycroscopy ecosystem of packages

**Authors:** Rama Vasudevan, Mani Valleti, Maxim Ziatdinov, Gerd Duscher, Suhas Somnath

arXiv: 2302.14629 · 2023-03-01

## TL;DR

The paper introduces the pycroscopy ecosystem, an open-source Python framework with a common data model designed to streamline microscopy data processing, analysis, and reproducibility across diverse microscopic techniques.

## Contribution

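The paper presents a unified, open-source ecosystem built on a common data model, the N-dimensional spectral imaging data format, implemented in the sidpy package on top of dask arrays. Keeping analysis and processing steps tied to the underlying datasets is intended to make microscopy analysis and visualization faster and more reproducible.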

## Key findings

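- Shows several example workflows for data ingestion and analysis built with the pycroscopy ecosystem
- Targets data from multiple microscopy techniques, including optical, electron, and scanning probe microscopy
- Argues that standardized routines for processing, computation, and metadata storage will be critical for the next generation of autonomous instruments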

## Abstract

Major advancements in fields as diverse as biology and quantum computing have relied on a multitude of microscopic techniques. Optical, electron, and scanning probe microscopy have all advanced with new detector technologies and the integration of spectroscopy, imaging, and diffraction. Despite the considerable proliferation of these instruments, significant bottlenecks remain in terms of processing, analysis, storage, and retrieval of acquired datasets. Aside from the lack of file standards, individual domain-specific analysis packages are often disjoint from the underlying datasets. Thus, keeping track of analysis and processing steps remains tedious for the end-user, hampering reproducibility. Here, we introduce the pycroscopy ecosystem of packages, an open-source Python-based ecosystem underpinned by a common data model. Our data model, termed the N-dimensional spectral imaging data format, is realized in pycroscopy's sidpy package. This package is built on top of dask arrays, thus leveraging dask array attributes but expanding them to accelerate microscopy-relevant analysis and visualization. Several examples of the use of the pycroscopy ecosystem to create workflows for data ingestion and analysis are shown. Adoption of such standardized routines will be critical to usher in the next generation of autonomous instruments where processing, computation, and metadata storage will be critical to overall experimental operations.
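
To make the idea of a common data model more concrete, here is a minimal pure-Python sketch of a dataset that carries per-axis metadata and a processing log alongside its values. It is an illustration only and does not reproduce the sidpy API: the class names `Dimension` and `AnnotatedDataset`, the method `mean_over`, and the sample values are all hypothetical, and the real package builds on dask arrays rather than nested lists.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dimension:
    """Hypothetical axis descriptor: values plus the metadata needed to read them."""
    name: str
    quantity: str
    units: str
    dimension_type: str  # e.g. "spatial" or "spectral"
    values: List[float]


@dataclass
class AnnotatedDataset:
    """Hypothetical container: data, one Dimension per axis, and a processing log."""
    data: list
    dims: List[Dimension]
    title: str = "untitled"
    history: List[str] = field(default_factory=list)

    def mean_over(self, dim_name: str) -> "AnnotatedDataset":
        """Average a 2D dataset along the named axis and record the step."""
        if len(self.dims) != 2:
            raise ValueError("this sketch only handles 2D data")
        names = [d.name for d in self.dims]
        axis = names.index(dim_name)  # raises ValueError for an unknown axis
        # Averaging over axis 1 reduces each row; over axis 0, each column.
        rows = self.data if axis == 1 else [list(col) for col in zip(*self.data)]
        reduced = [sum(r) / len(r) for r in rows]
        kept = [d for i, d in enumerate(self.dims) if i != axis]
        return AnnotatedDataset(
            data=reduced,
            dims=kept,
            title=self.title,
            history=self.history + [f"mean over '{dim_name}'"],
        )


# Made-up example: two positions, three bias values per position.
x = Dimension("x", "Length", "nm", "spatial", [0.0, 10.0])
bias = Dimension("bias", "Voltage", "V", "spectral", [-1.0, 0.0, 1.0])
raw = AnnotatedDataset(
    data=[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    dims=[x, bias],
    title="example spectra",
)

averaged = raw.mean_over("bias")
print(averaged.data, [d.name for d in averaged.dims], averaged.history)
```

The point of the sketch is that the result of a processing step still knows which axis remains, in what units, and which operation produced it, which is the kind of bookkeeping the abstract says is tedious when analysis packages are disjoint from the datasets.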

---
Source: https://tomesphere.com/paper/2302.14629