Policy Gradient Reinforcement Learning for Policy Represented by Fuzzy Rules: Application to Simulations of Speed Control of an Automobile
Seiji Ishihara, Harukazu Igarashi

TL;DR
This paper introduces a fuzzy policy gradient reinforcement learning method with a smoothness constraint for automobile speed control, reducing undesirable fluctuations and improving policy quality.
Contribution
It proposes a fusion approach combining fuzzy inference with a smoothness constraint in policy gradient reinforcement learning for better control policies.
Findings
The method suppresses undesirable fluctuations in output speed.
Reward function choice affects learning stability.
The approach improves policy smoothness in speed control.
Abstract
A method fusing fuzzy inference with policy gradient reinforcement learning has been proposed that directly learns the parameters of a policy function represented by weighted fuzzy rules so as to maximize the expected reward per episode. A study applied this method to an automobile speed-control task and obtained correct policies; some of them control the speed appropriately, but many others produce undesirable oscillation in speed. In general, a policy that causes sudden changes or oscillation in the output value over time is undesirable, and in many cases a policy that yields smooth changes in the output value is preferable. In this paper, we propose a fusion method using an objective function that introduces defuzzification with the center of gravity model weighted stochastically and a constraint term for smoothness of…
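To make the two ingredients named in the abstract more concrete, the following is a minimal sketch and not the authors' formulation. It assumes triangular membership functions, a deterministic weighted center-of-gravity output (the paper's version is weighted stochastically), and a sum-of-squared-differences smoothness penalty. All function names, the penalty form, and the coefficient `lam` are hypothetical choices made for illustration.

```python
def triangular(x, center, width):
    """Membership grade of x in a triangular fuzzy set (assumed shape)."""
    return max(0.0, 1.0 - abs(x - center) / width)


def cog_output(x, centers, width, weights, actions):
    """Weighted center-of-gravity defuzzification over the fuzzy rules.

    Each rule i fires with strength grade_i * weight_i and proposes
    actions[i]; the output is the strength-weighted mean of the actions.
    """
    strengths = [triangular(x, c, width) * w for c, w in zip(centers, weights)]
    total = sum(strengths)
    if total == 0.0:
        return 0.0  # no rule fires; fall back to a neutral output (assumption)
    return sum(s * a for s, a in zip(strengths, actions)) / total


def smoothness_penalty(outputs):
    """Sum of squared differences between consecutive outputs (assumed form)."""
    return sum((b - a) ** 2 for a, b in zip(outputs, outputs[1:]))


def objective(episode_reward, outputs, lam):
    """Episode reward minus a smoothness penalty scaled by lam (hypothetical)."""
    return episode_reward - lam * smoothness_penalty(outputs)
```

With rule centers `[0, 10, 20]`, width `10`, unit weights, and actions `[-1, 0, 1]`, an input of `15` fires the last two rules equally, so `cog_output` returns their midpoint. An oscillating output sequence such as `[0, 1, 0]` is penalized by `smoothness_penalty`, while a constant sequence is not, which is the behavior a smoothness term is meant to encourage.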
