Improving Q-Learning for Real-World Control: A Case Study in Series Hybrid Agricultural Tractors

Hend Abououf; Sidra Ghayour Bhatti; Qadeer Ahmed

arXiv:2508.03647 · eess.SY · August 6, 2025

TL;DR

This paper improves Q-learning-based energy management for a series hybrid agricultural tractor by adding domain-specific reward shaping and expert demonstrations, which speed up training and steer the learned policy toward fuel-efficient operation.

Contribution

The paper compares three Q-value-based algorithms (Double Q-Learning, DQN, and Double DQN), proposes a piecewise, domain-specific reward-shaping method, and analyzes how expert demonstrations affect training for hybrid tractor energy management.
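To make the difference between two of the compared algorithms concrete, here is a minimal sketch of how the bootstrap target differs between DQN and Double DQN. It is a textbook illustration, not the authors' implementation; the function names and the toy Q-values are invented for this example.

```python
from typing import Sequence


def dqn_target(reward: float, next_q_target: Sequence[float],
               gamma: float, done: bool) -> float:
    """DQN target: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def ddqn_target(reward: float, next_q_online: Sequence[float],
                next_q_target: Sequence[float], gamma: float, done: bool) -> float:
    """Double DQN target: the online network selects the next action,
    and the target network evaluates that action."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Toy Q-values for one next state with three actions (illustrative only).
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [3.0, 1.5, 0.2]

y_dqn = dqn_target(1.0, next_q_target, gamma=0.9, done=False)
y_ddqn = ddqn_target(1.0, next_q_online, next_q_target, gamma=0.9, done=False)
```

In this toy case the DQN target bootstraps from the largest target-network value, while the Double DQN target bootstraps from the target-network value of the action the online network prefers. Decoupling selection from evaluation is the standard motivation for Double DQN, since it reduces overestimation of Q-values.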

Findings

1. DDQN converges 70% faster than DQN.

2. Reward shaping biases policies toward fuel efficiency.

3. Expert demonstrations improve convergence speed by 33%.
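One common way to use expert demonstrations with Q-learning is to keep them in the experience replay buffer alongside the agent's own transitions. The sketch below shows that general idea only. The class name `DemoSeededReplayBuffer`, the `demo_fraction` parameter, and the rule that demonstrations are never evicted are assumptions made for illustration; this page does not say how the paper's buffer is designed.

```python
import random
from collections import deque


class DemoSeededReplayBuffer:
    """Hypothetical replay buffer holding expert and agent transitions separately."""

    def __init__(self, capacity: int, demo_fraction: float = 0.25, seed=None):
        self.demos = []                      # expert transitions, never evicted (assumption)
        self.agent = deque(maxlen=capacity)  # agent transitions, oldest dropped first
        self.demo_fraction = demo_fraction   # share of each batch drawn from demos (assumption)
        self.rng = random.Random(seed)

    def add_demo(self, transition) -> None:
        self.demos.append(transition)

    def add(self, transition) -> None:
        self.agent.append(transition)

    def sample(self, batch_size: int) -> list:
        # Draw a fixed share from the demonstrations and the rest from agent data.
        # The batch is shorter than batch_size if either pool is too small.
        n_demo = min(len(self.demos), round(batch_size * self.demo_fraction))
        n_agent = min(len(self.agent), batch_size - n_demo)
        batch = self.rng.sample(self.demos, n_demo)
        batch += self.rng.sample(list(self.agent), n_agent)
        self.rng.shuffle(batch)
        return batch


# Usage with placeholder transitions tagged by their source.
buf = DemoSeededReplayBuffer(capacity=50, demo_fraction=0.25, seed=0)
for i in range(10):
    buf.add_demo(("demo", i))
for i in range(100):
    buf.add(("agent", i))
batch = buf.sample(8)
```

With a batch size of 8 and `demo_fraction=0.25`, each batch holds two expert transitions and six agent transitions. Other designs, such as pre-filling a single buffer or prioritizing demonstrations, are equally plausible.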

Abstract

The variable and unpredictable load demands in hybrid agricultural tractors make it difficult to design optimal rule-based energy management strategies, motivating the use of adaptive, learning-based control. However, existing approaches often rely on basic fuel-based rewards and do not leverage expert demonstrations to accelerate training. In this paper, first, the performance of Q-value-based reinforcement learning algorithms is evaluated for powertrain control in a hybrid agricultural tractor. Three algorithms, Double Q-Learning (DQL), Deep Q-Networks (DQN), and Double DQN (DDQN), are compared in terms of convergence speed and policy optimality. Second, a piecewise domain-specific reward-shaping strategy is introduced to improve learning efficiency and steer agent behavior toward engine fuel-efficient operating regions. Third, the design of the experience replay buffer is examined,…
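The abstract describes the reward shaping as piecewise and aimed at fuel-efficient engine operating regions, but gives no formula here. The sketch below shows one possible shape such a reward could take. Every name, threshold, and weight in it (`shaped_reward`, the speed and torque bands, `bonus`, `penalty`) is a hypothetical placeholder, not a value from the paper.

```python
def shaped_reward(fuel_rate_g_s: float,
                  engine_speed_rpm: float,
                  engine_torque_nm: float,
                  efficient_speed=(1400.0, 1800.0),   # hypothetical efficient speed band
                  efficient_torque=(250.0, 400.0),    # hypothetical efficient torque band
                  bonus: float = 1.0,                 # hypothetical shaping weights
                  penalty: float = 0.5) -> float:
    """Piecewise reward: a fuel-based term plus a region-dependent shaping term."""
    base = -fuel_rate_g_s  # basic fuel-based reward: less fuel is better

    # Piece 0: engine not producing torque, so no shaping is applied.
    if engine_torque_nm <= 0.0:
        return base

    in_speed = efficient_speed[0] <= engine_speed_rpm <= efficient_speed[1]
    in_torque = efficient_torque[0] <= engine_torque_nm <= efficient_torque[1]

    # Piece 1: inside the assumed efficient region, add a bonus.
    if in_speed and in_torque:
        return base + bonus
    # Piece 2: near the region (one coordinate inside), fuel term only.
    if in_speed or in_torque:
        return base
    # Piece 3: far from the region, subtract a penalty.
    return base - penalty


r_inside = shaped_reward(2.0, 1600.0, 300.0)
r_near = shaped_reward(2.0, 1600.0, 100.0)
r_far = shaped_reward(2.0, 2200.0, 100.0)
```

For the same fuel rate, the three calls return progressively lower rewards as the operating point moves away from the assumed efficient region, which is the kind of bias toward fuel-efficient operation the abstract describes.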
