Deep Q-Learning-Based Intelligent Scheduling for ETL Optimization in Heterogeneous Data Environments

Kangning Gao; Yi Hu; Cong Nie; Wei Li

arXiv:2512.13060·cs.LG·December 16, 2025

Open Access

TL;DR

This paper introduces a deep Q-learning framework that schedules ETL tasks in heterogeneous data environments, targeting higher scheduling efficiency, more balanced resource allocation, and better adaptability in complex data processing systems.

Contribution

It presents a novel reinforcement learning-based scheduling model that dynamically adapts to complex, high-dimensional data environments for ETL processes.

Findings

01. Reduces scheduling delay significantly
02. Improves system throughput and stability
03. Demonstrates robustness under various conditions

Abstract

This paper addresses the challenges of low scheduling efficiency, unbalanced resource allocation, and poor adaptability in ETL (Extract-Transform-Load) processes under heterogeneous data environments by proposing an intelligent scheduling optimization framework based on deep Q-learning. The framework formalizes the ETL scheduling process as a Markov Decision Process and enables adaptive decision-making by a reinforcement learning agent in high-dimensional state spaces to dynamically optimize task allocation and resource scheduling. The model consists of a state representation module, a feature embedding network, a Q-value estimator, and a reward evaluation mechanism, which collectively consider task dependencies, node load states, and data flow characteristics to derive the optimal scheduling strategy in complex environments. A multi-objective reward function is designed to balance key…
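The abstract describes the approach only at a high level, so the following is a minimal sketch of the general Q-learning pattern it refers to, not the authors' implementation. It assumes a state given as a list of node load values, actions that pick a node for the next task, and a two-term weighted reward. The names `reward`, `td_target`, `select_node`, and `LinearQ`, the objectives `delay` and `imbalance`, and all weights and hyperparameters are hypothetical. A linear estimator stands in for the deep feature embedding network and Q-value estimator to keep the example dependency-free.

```python
import random


def reward(delay, imbalance, w_delay=0.5, w_balance=0.5):
    """Hypothetical multi-objective reward: penalize delay and load imbalance.

    The objectives and weights are placeholders; the abstract is truncated
    before it names the terms the paper actually balances.
    """
    return -(w_delay * delay + w_balance * imbalance)


def td_target(r, q_next, gamma=0.9, done=False):
    """Standard Q-learning target: r + gamma * max over next-state Q-values."""
    return r if done else r + gamma * max(q_next)


def select_node(q_values, epsilon, rng=random):
    """Epsilon-greedy choice of the node (action) that receives the next task."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: random node
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit


class LinearQ:
    """Linear stand-in for a Q-value estimator: Q(s, a) = w[a] . s."""

    def __init__(self, state_dim, n_actions):
        self.w = [[0.0] * state_dim for _ in range(n_actions)]

    def predict(self, state):
        """Return one Q-value per action for the given state vector."""
        return [sum(wi * si for wi, si in zip(row, state)) for row in self.w]

    def update(self, state, action, target, lr=0.1):
        """Move Q(state, action) toward the TD target; return the TD error."""
        td_error = target - self.predict(state)[action]
        self.w[action] = [
            wi + lr * td_error * si for wi, si in zip(self.w[action], state)
        ]
        return td_error


if __name__ == "__main__":
    q = LinearQ(state_dim=2, n_actions=2)
    state = [0.2, 0.8]       # assumed per-node load levels
    next_state = [0.5, 0.8]  # assumed loads after assigning a task to node 0
    action = select_node(q.predict(state), epsilon=0.1)
    r = reward(delay=2.0, imbalance=0.3)
    target = td_target(r, q.predict(next_state))
    print("action:", action, "td_error:", q.update(state, action, target))
```

In a deep Q-learning setting, the linear weights would be replaced by a neural network trained on the same target, typically with experience replay and a separate target network for stability.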

Taxonomy

Topics: Cloud Computing and Resource Management · Distributed and Parallel Computing Systems · Big Data and Digital Economy