Gaussian Process Self-triggered Policy Search in Weakly Observable Environments
Hikaru Sasaki, Terushi Hirabayashi, Kaoru Kawabata, Takamitsu Matsubara

TL;DR
This paper introduces a novel Gaussian process self-triggered policy search method designed for weakly observable environments, enabling effective control policy learning for waste cranes with limited sensor information.
Contribution
The paper proposes a new non-parametric, self-triggered policy search algorithm using Gaussian processes, specifically tailored for environments with sparse observations.
Findings
Successfully learned control policies for waste crane tasks
Demonstrated effectiveness in simulation and real robotic systems
Achieved robust action and duration decisions based on limited data
Abstract
The environments of such large industrial machines as waste cranes in waste incineration plants are often weakly observable, where little information about the environmental state is contained in the observations due to technical difficulty or maintenance cost (e.g., no sensors for observing the state of the garbage to be handled). Based on the findings that skilled operators in such environments choose predetermined control strategies (e.g., grasping and scattering) and their durations based on sensor values, we propose a novel non-parametric policy search algorithm: Gaussian process self-triggered policy search (GPSTPS). GPSTPS has two types of control policies: action and duration. A gating mechanism either maintains the action selected by the action policy for the duration specified by the duration policy or updates the action and…
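The gating idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' GPSTPS implementation: the Gaussian process policies are replaced by plain hand-written functions, and the names `run_self_triggered`, `action_policy`, and `duration_policy`, the threshold of 0.5, and the observation values are all hypothetical. The sketch also assumes that when a duration expires, both the action and the duration are re-selected from the current observation; the truncated abstract does not confirm this detail.

```python
from typing import Callable, List, Tuple


def run_self_triggered(
    action_policy: Callable[[float], str],
    duration_policy: Callable[[float], int],
    observations: List[float],
) -> List[Tuple[int, str, bool]]:
    """Hold each selected action for its chosen duration (illustrative only).

    Returns one (step, action, triggered) tuple per observation, where
    `triggered` is True on steps at which the policies were queried.
    """
    log: List[Tuple[int, str, bool]] = []
    current_action = ""
    remaining = 0  # steps left before the policies are queried again
    for step, obs in enumerate(observations):
        triggered = remaining <= 0
        if triggered:
            # Gate opens: pick a new action and how long to keep it.
            # Assumption: both are re-selected together on expiry.
            current_action = action_policy(obs)
            remaining = max(1, duration_policy(obs))
        # Gate closed: the previous action is maintained, obs is ignored.
        log.append((step, current_action, triggered))
        remaining -= 1
    return log


# Hypothetical stand-ins for the learned action and duration policies.
def action_policy(obs: float) -> str:
    return "grasp" if obs > 0.5 else "scatter"


def duration_policy(obs: float) -> int:
    return 3 if obs > 0.5 else 2


observations = [0.9, 0.1, 0.2, 0.3, 0.8, 0.7]
log = run_self_triggered(action_policy, duration_policy, observations)
for entry in log:
    print(entry)
```

In this sketch the policies are consulted only at steps 0, 3, and 5; at every other step the previously selected action is held, so the observations at those steps do not influence the action.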
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification
