Improving Q-Learning for Real-World Control: A Case Study in Series Hybrid Agricultural Tractors
Hend Abououf, Sidra Ghayour Bhatti, Qadeer Ahmed

TL;DR
This paper enhances Q-learning algorithms for hybrid agricultural tractor control by introducing reward shaping and expert demonstration strategies, significantly improving learning speed and policy efficiency in real-world energy management tasks.
Contribution
It evaluates and compares advanced Q-learning algorithms, proposes a reward-shaping method, and analyzes expert demonstration effects to optimize hybrid tractor energy management.
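To make the algorithm comparison concrete, here is a minimal sketch of how the bootstrap target differs between DQN and Double DQN. The function names, the discount factor, and the toy Q-values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dqn_target(reward, q_target_next, gamma=0.99, done=False):
    """Standard DQN target: the target network both selects and evaluates
    the next action, which tends to overestimate Q-values."""
    bootstrap = 0.0 if done else gamma * float(np.max(q_target_next))
    return reward + bootstrap

def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double DQN target: the online network selects the next action and
    the target network evaluates it."""
    best_action = int(np.argmax(q_online_next))
    bootstrap = 0.0 if done else gamma * float(q_target_next[best_action])
    return reward + bootstrap

# Toy next-state Q-values for three hypothetical power-split actions.
q_online_next = np.array([1.0, 2.0, 0.5])
q_target_next = np.array([3.0, 1.5, 0.2])

y_dqn = dqn_target(1.0, q_target_next, gamma=0.9)
y_ddqn = ddqn_target(1.0, q_online_next, q_target_next, gamma=0.9)
```

In this toy case the DQN target bootstraps from the largest target-network value, while the Double DQN target bootstraps from the value of the action the online network prefers, which is smaller here.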
Findings
DDQN converges 70% faster than DQN.
Reward shaping biases policies toward fuel efficiency.
Expert demonstrations improve convergence speed by 33%.
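One common way to use expert demonstrations in Q-learning is to seed the replay buffer with them. The sketch below is an assumption about how that could look, not the paper's buffer design: the class name, the `demo_fraction` parameter, and the rule that demonstrations are never evicted are all hypothetical.

```python
import random
from collections import deque

class DemoSeededReplayBuffer:
    """Replay buffer that keeps expert transitions permanently and mixes
    them into every sampled batch (hypothetical design)."""

    def __init__(self, capacity, demo_transitions, demo_fraction=0.25, seed=0):
        self.demos = list(demo_transitions)  # expert data, never evicted
        self.agent = deque(maxlen=capacity)  # agent data, oldest evicted first
        self.demo_fraction = demo_fraction   # assumed share of each batch
        self.rng = random.Random(seed)

    def add(self, transition):
        """Store one transition collected by the learning agent."""
        self.agent.append(transition)

    def sample(self, batch_size):
        """Draw a batch containing roughly demo_fraction expert transitions."""
        n_demo = min(len(self.demos), int(round(batch_size * self.demo_fraction)))
        n_agent = min(len(self.agent), batch_size - n_demo)
        batch = self.rng.sample(self.demos, n_demo)
        batch += self.rng.sample(list(self.agent), n_agent)
        return batch

# Toy usage: transitions are tagged tuples rather than real (s, a, r, s') data.
buffer = DemoSeededReplayBuffer(10, [("demo", i) for i in range(4)])
for i in range(20):
    buffer.add(("agent", i))
batch = buffer.sample(8)
```

Keeping demonstrations in a separate, non-evicting store means early batches are dominated by expert behavior only to the extent set by `demo_fraction`, which is the knob one would tune in such a design.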
Abstract
The variable and unpredictable load demands in hybrid agricultural tractors make it difficult to design optimal rule-based energy management strategies, motivating the use of adaptive, learning-based control. However, existing approaches often rely on basic fuel-based rewards and do not leverage expert demonstrations to accelerate training. In this paper, first, the performance of Q-value-based reinforcement learning algorithms is evaluated for powertrain control in a hybrid agricultural tractor. Three algorithms, Double Q-Learning (DQL), Deep Q-Networks (DQN), and Double DQN (DDQN), are compared in terms of convergence speed and policy optimality. Second, a piecewise domain-specific reward-shaping strategy is introduced to improve learning efficiency and steer agent behavior toward engine fuel-efficient operating regions. Third, the design of the experience replay buffer is examined,…
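A piecewise reward of the kind the abstract describes might be sketched as follows. The abstract does not give the actual reward, so everything here is an assumption: the use of brake-specific fuel consumption (BSFC) as the efficiency measure, the thresholds, and the bonus and penalty magnitudes are placeholders.

```python
def shaped_reward(fuel_rate_gps, bsfc_g_per_kwh,
                  bsfc_good=210.0, bsfc_poor=260.0):
    """Fuel-based reward plus a piecewise shaping term (hypothetical values).

    fuel_rate_gps:   instantaneous fuel rate in g/s (base penalty)
    bsfc_g_per_kwh:  engine BSFC at the current operating point
    """
    reward = -fuel_rate_gps  # basic fuel-based reward
    if bsfc_g_per_kwh <= bsfc_good:
        # Inside the assumed fuel-efficient region: full bonus.
        reward += 1.0
    elif bsfc_g_per_kwh <= bsfc_poor:
        # Transition band: bonus decays linearly from 1 to 0.
        reward += (bsfc_poor - bsfc_g_per_kwh) / (bsfc_poor - bsfc_good)
    else:
        # Clearly inefficient region: extra penalty.
        reward -= 1.0
    return reward

r_efficient = shaped_reward(2.0, 200.0)
r_transition = shaped_reward(2.0, 235.0)
r_inefficient = shaped_reward(2.0, 300.0)
```

For the same fuel rate, the shaping term ranks the three operating points from efficient to inefficient, which is what biases the learned policy toward the efficient region.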
