# DOD-ETL: Distributed On-Demand ETL for Near Real-Time Business Intelligence

**Authors:** Gustavo V. Machado, Ítalo Cunha, Adriano C. M. Pereira, Leonardo B. Oliveira

arXiv: 1907.06723 · 2019-07-17

## TL;DR

DOD-ETL is a distributed, on-demand ETL tool that delivers near real-time business intelligence. In the authors' comparison it executed workloads up to 10 times faster than other stream processing frameworks, and it replaced the previous ETL solution at a large steelworks.

## Contribution

The paper introduces DOD-ETL, a distributed, parallel, and technology-independent architecture for near real-time ETL that combines an on-demand data stream pipeline with in-memory caching and efficient data partitioning.

## Key findings

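- DOD-ETL executes workloads up to 10 times faster than the other stream processing frameworks the authors tested for near real-time ETL.
- Deployed in a large steelworks as a replacement for the previous ETL solution, DOD-ETL enabled near real-time reports that were previously unavailable.
- The approach targets the ETL process, which the authors identify as the main bottleneck in Business Intelligence solutions.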

## Abstract

The competitive dynamics of the globalized market demand information on the internal and external reality of corporations. Information is a precious asset and is responsible for establishing key advantages to enable companies to maintain their leadership. However, reliable, rich information is no longer the only goal. The time frame to extract information from data determines its usefulness. This work proposes DOD-ETL, a tool that addresses, in an innovative manner, the main bottleneck in Business Intelligence solutions, the Extract, Transform, Load (ETL) process, providing it in near real-time. DOD-ETL achieves this by combining an on-demand data stream pipeline with a distributed, parallel and technology-independent architecture with in-memory caching and efficient data partitioning. We compared DOD-ETL with other Stream Processing frameworks used to perform near real-time ETL and found DOD-ETL executes workloads up to 10 times faster. We have deployed it in a large steelworks as a replacement for its previous ETL solution, enabling near real-time reports previously unavailable.
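
To make the combination of data partitioning and in-memory caching named in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation: the names `partition_for`, `OnDemandCache`, `transform`, and `run_etl`, the `equipment_id` and `equipment_name` fields, and the choice of CRC32 hashing are all assumptions made for illustration, and a sequential loop stands in for the parallel workers a distributed deployment would use.

```python
import zlib
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Loader = Callable[[str], Record]  # hypothetical: fetches one master-data row by key


def partition_for(key: str, num_partitions: int) -> int:
    """Map a business key to a partition; the same key always lands in the same one."""
    # crc32 is deterministic across processes, unlike Python's salted hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class OnDemandCache:
    """In-memory cache that loads a master-data row only the first time it is needed."""

    def __init__(self, loader: Loader) -> None:
        self._loader = loader
        self._rows: Dict[str, Record] = {}

    def get(self, key: str) -> Record:
        if key not in self._rows:
            self._rows[key] = self._loader(key)  # cache miss: go to the source once
        return self._rows[key]


def transform(events: Iterable[Record], cache: OnDemandCache) -> List[Record]:
    """Enrich each event with master data looked up through the cache."""
    enriched = []
    for event in events:
        master = cache.get(str(event["equipment_id"]))
        enriched.append({**event, "equipment_name": master["name"]})
    return enriched


def run_etl(events: Iterable[Record], num_partitions: int, loader: Loader) -> List[Record]:
    """Partition events by key, then transform each partition with its own cache."""
    buckets: Dict[int, List[Record]] = defaultdict(list)
    for event in events:
        pid = partition_for(str(event["equipment_id"]), num_partitions)
        buckets[pid].append(event)

    loaded: List[Record] = []
    for pid in sorted(buckets):  # sequential stand-in for one worker per partition
        loaded.extend(transform(buckets[pid], OnDemandCache(loader)))
    return loaded
```

Because every event for a given key is routed to the same partition, each per-partition cache in this sketch only ever holds the master data its own events need, and each row is fetched from the source at most once.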

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.06723/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1907.06723/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1907.06723/full.md

---
Source: https://tomesphere.com/paper/1907.06723