# Exploring the effects and potential of unlocked I/O-powered single board computer clusters

**Authors:** Yeongmo Lee, Dongchul Park

PMC · DOI: 10.1038/s41598-025-34623-x · Scientific Reports · 2026-01-07

## TL;DR

This paper explores using Raspberry Pi 5B clusters with high-speed storage for efficient, large-scale data processing as a cost-effective and energy-efficient alternative to traditional data centers.

## Contribution

The study demonstrates the feasibility of using Raspberry Pi 5B with PCIe SSDs for terabyte-scale big data processing in single board computer clusters.

## Key findings

- An RPi 5B Hadoop cluster with PCIe SSDs ran six widely known benchmarks on data sizes of up to 2 TB.
- Unlocked I/O performance and hardware optimizations significantly improve SBC cluster efficiency.
- The study identifies I/O throughput, CPU overclocking, power supply, and the TRIM command as factors that significantly affect SBC Hadoop performance.
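
The findings above single out storage I/O throughput as a key factor. As a rough illustration only, and not the authors' benchmark method, the Python sketch below times a sequential write to a directory on the device under test. The function name `measure_write_throughput` and its default sizes are assumptions made for this example; a dedicated tool would give more reliable numbers.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_bytes=64 * 1024 * 1024,
                             block_size=1024 * 1024):
    """Write total_bytes to a temporary file in `directory`; return MiB/s.

    Hypothetical helper for illustration. The data is flushed and fsync'd
    before the timer stops, so the page cache does not hide the device speed.
    """
    block = os.urandom(block_size)  # incompressible data
    written = 0
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            while written < total_bytes:
                chunk = block[: min(block_size, total_bytes - written)]
                f.write(chunk)
                written += len(chunk)
            f.flush()
            os.fsync(f.fileno())  # force the data onto the storage device
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the temporary file
    return written / (1024 * 1024) / elapsed
```

Calling `measure_write_throughput("/mnt/ssd")` and then the same function on a microSD mount point (both paths are placeholders) would give a first-order comparison of the two storage media.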

## Abstract

Across all fields, experts strive to collect and analyze large volumes of data to extract meaningful insights. In response to this trend, Hadoop and Spark have emerged, and many organizations have adopted these platforms for big data storage and processing. In addition, data centers with powerful servers are constantly expanding to accommodate the growing volume of data, incurring significant costs and causing environmental problems due to their tremendous energy consumption. Single board computer (SBC) clusters have emerged as a promising alternative for efficient computing. Most SBCs rely on a microSD slot for data storage, which limits their ability to process massive data effectively. However, the latest-generation Raspberry Pi (RPi), model 5B, provides a peripheral component interconnect express (PCIe) interface, enabling high-performance storage media such as solid state drives (SSDs). This paper extensively investigates the practicability and potential of SBCs for terabyte-scale big data processing. We build an SBC Hadoop cluster using the latest and most powerful RPi 5B (8 GB of RAM) with a fast SSD attached via the PCIe interface, and run six widely known benchmarks with large data sizes (up to 2 TB). Furthermore, this paper discusses challenges and suggestions, including the effects of input/output (I/O) throughput, central processing unit (CPU) overclocking, power supply, and the TRIM command, which significantly affect SBC Hadoop performance. This comprehensive study concludes that combining the enhanced computing power of the RPi 5B with unlocked I/O performance paves the way for a practical solution to real-world big data processing on SBC clusters.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12864792/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12864792/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/PMC12864792/full.md

---
Source: https://tomesphere.com/paper/PMC12864792