# Turning data into insights in Jub, an extensible generic big data platform for life science and healthcare applications

**Authors:** Ignacio Castillo-Barrios, Melesio Crespo-Sanchez, Hugo G. Reyes-Anastacio, Jose L. Gonzalez-Compean, Ivan Lopez-Arevalo, J. Armando Barron-Lugo, J. Carlos Morin-Garcia, Yelda A. Leal, Jaqueline Calderon-Hernandez, Heriberto Aguirre-Meneses, Marco Antonio Núñez-Gaona

PMC · DOI: 10.1038/s41598-025-32196-3 · Scientific Reports · 2025-12-19

## TL;DR

Jub is a big data platform that helps life science and healthcare organizations turn large datasets into useful information for decision-making.

## Contribution

Jub is a Life Science and Healthcare Data Platform (LSHDP) that integrates AI tools and cloud storage into generic sandboxes, from which it builds customizable data observatories for healthcare and life science applications.

## Key findings

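- Jub built 16 strategic data observatories from 85,171,404 information products, derived from mortality, climate, and pollutant datasets (2000-2023) reported by the Mexican Government.
- An exploratory study found that the growth in breast cancer mortality rates is possibly associated with air pollutants.
- Jub sandbox-based observatories were implemented for the Population-based Cancer Registry Network, deployed in 12 Mexican states by public healthcare institutions, and for deep-learning-based bone cancer diagnosis at a national hospital.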

## Abstract

This paper presents Jub, a Life Science and Healthcare Data Platform (LSHDP) based on generic sandboxes that integrate AI tools and cloud storage into big data science services. Jub automatically and transparently creates data science services that transform datasets into massive information products by using a profiling methodology. These products are presented through generic, secure, cloud-based FAIR observatories that add Programmable, Configurable/Customizing, Adaptable, and Resiliency properties (PCA-FAIR-R). This enables organizations to conduct and customize complex analytics processes to support decision-making. We conducted a case study to convert mortality, climate, and pollutant datasets (2000-2023) reported by the Mexican Government into a solid core hub of information products: 16 strategic data observatories based on 85,171,404 information products created from 114,155,622 spatio-temporal profiles of the International Classification of Diseases (ICD-10) mortality classes/strata and carcinogenic substances. An exploratory study highlighted the significance of breast cancer mortality rate growth, showing possible associations with air pollutants. This paper also describes the lessons learned from the practice and experience of implementing Jub sandbox-based observatories for the Population-based Cancer Registry Network, deployed in 12 Mexican states by public healthcare institutions, as well as for deep-learning-based bone cancer diagnosis at a national hospital.
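
The abstract mentions spatio-temporal profiles of ICD-10 mortality classes but does not describe how they are computed. As a rough illustration only, and not Jub's actual profiling methodology, one simple reading is a count of deaths grouped by place, year, and ICD-10 category. The sketch below assumes that reading; the `DeathRecord` type, its field names, the helper functions, and the sample records are all hypothetical and invented for the example.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class DeathRecord(NamedTuple):
    """Hypothetical, simplified mortality record (not Jub's schema)."""
    state: str        # spatial key
    year: int         # temporal key
    icd10_code: str   # full ICD-10 code, e.g. "C50.9"


def icd10_category(code: str) -> str:
    """Return the three-character ICD-10 category, e.g. 'C50.9' -> 'C50'."""
    return code.strip().upper()[:3]


def build_profiles(records: Iterable[DeathRecord]) -> Counter:
    """Count deaths per (state, year, ICD-10 category).

    Assumption: a 'spatio-temporal profile' is treated here as one
    aggregated count per place, period, and disease category.
    """
    return Counter(
        (r.state, r.year, icd10_category(r.icd10_code)) for r in records
    )


# Made-up toy records, for illustration only (not real data).
records = [
    DeathRecord("Tamaulipas", 2020, "C50.9"),  # C50: breast cancer
    DeathRecord("Tamaulipas", 2020, "C50.1"),
    DeathRecord("Tamaulipas", 2021, "C50.9"),
    DeathRecord("Yucatan", 2020, "C40.2"),     # C40: bone cancer (limbs)
]

profiles = build_profiles(records)
for key, deaths in sorted(profiles.items()):
    print(key, deaths)
```

Grouping on the three-character category rather than the full code keeps sub-sites of the same disease in a single profile; a real pipeline would likely also normalize counts by population to obtain rates.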

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989), bone cancer (MONDO:0002129)

## Full-text entities

- **Diseases:** breast cancer (MESH:D001943), bone cancer (MESH:D001859), cancer (MESH:D009369)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12819404/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12819404/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/PMC12819404/full.md

---
Source: https://tomesphere.com/paper/PMC12819404