Enhancement of Energy-Based Swing-Up Controller via Entropy Search

Chang Sik Lee; Dong Eui Chang

arXiv:1904.01214·cs.LG·April 4, 2019

TL;DR

This paper enhances an energy-based swing-up controller for a rotary inverted pendulum by applying Bayesian optimization with Entropy Search, resulting in improved performance across different initial conditions.

Contribution

It introduces a novel application of Entropy Search Bayesian optimization to tune parameters of an energy-based swing-up controller for the Furuta pendulum.

Findings

1. Optimal controller outperforms nominal controller in simulations.

2. Performance improvements observed across various initial conditions.

3. Bayesian optimization effectively finds suitable controller parameters.

Abstract

An energy-based approach for stabilizing a mechanical system offers a simple yet powerful control scheme. However, since it does not impose strong constraints on the parameter space of the controller, finding appropriate parameter values for an optimal controller is known to be hard. This paper aims to generate an optimal energy-based controller for swinging up a rotary inverted pendulum, also known as the Furuta pendulum, by applying the Bayesian optimization method called Entropy Search. Simulations and experiments show that the optimal controller performs better than a nominal controller for various initial conditions.

Equations

K^{*} = \arg\min_{K \in \mathcal{D}} J(K),

H_{n} = \{J(K_{1}), J(K_{2}), \dots, J(K_{n})\},

h_{n} = \{K_{1}, K_{2}, \dots, K_{n}\},

\mu_{n}(K_{\rm new}) = m(K_{\rm new}) + k_{n}(K_{\rm new}) K_{n}^{-1} y_{n},

\sigma_{n}^{2}(K_{\rm new}) = k(K_{\rm new}, K_{\rm new}) - k_{n}(K_{\rm new}) K_{n}^{-1} k_{n}^{T}(K_{\rm new}),

[K_{n}]_{ij} = k(K_{i}, K_{j}), \quad i, j \in \{1, 2, 3, \dots, n\},

k_{n}(K_{\rm new}) = [k(K_{\rm new}, K_{1}), k(K_{\rm new}, K_{2}), \dots, k(K_{\rm new}, K_{n})],

y_{n} = [J(K_{1}) - m(K_{1}), J(K_{2}) - m(K_{2}), \dots, J(K_{n}) - m(K_{n})]^{T}.

P_{\rm min}(K)

H(K)

\mathcal{L}(q, \dot{q}) = \frac{1}{2} \dot{q}^{T} M(q) \dot{q} - PE(q)

M(q) = \begin{pmatrix} I_{10} + I_{11}\sin^{2} q_{2} & -I_{12}\cos q_{2} \\ -I_{12}\cos q_{2} & I_{2} \end{pmatrix},

PE(q) = V_{0} \cos q_{2}

M(q) \begin{pmatrix} \ddot{q}_{1} \\ \ddot{q}_{2} \end{pmatrix} + C(q, \dot{q}) \begin{pmatrix} \dot{q}_{1} \\ \dot{q}_{2} \end{pmatrix} + G(q, \dot{q}) = \begin{pmatrix} u \\ 0 \end{pmatrix}

C(q, \dot{q})

G(q, \dot{q})

I_{10}

I_{2}

m_{1}

l_{1}

l_{2}

E(q, \dot{q}) = \frac{1}{2} \left( (I_{10} + I_{11} \sin^{2} q_{2}) \dot{q}_{1}^{2} + I_{2} \dot{q}_{2}^{2} \right) - I_{12} \dot{q}_{1} \dot{q}_{2} \cos q_{2} + V_{0} \cos q_{2}.

V(q, \dot{q}) = k_{E} \frac{1}{2} (E - E_{0})^{2} + k_{v} \frac{1}{2} \dot{q}_{1}^{2} + k_{x} (1 - \cos q_{1}),

u = \frac{-k_{p} \dot{q}_{1} - H(q, \dot{q})}{k_{E} (E - E_{0}) + k_{v} R(q)}

H(q, \dot{q})

- k_{v} \begin{pmatrix} 1 & 0 \end{pmatrix} M(q)^{-1} G(q, \dot{q}) + k_{x} (1 - \cos q_{1}),

R(q)

K = (k_{p}, k_{E}, k_{v}, k_{x}) \in \mathbb{R}^{4}.

k_{v} > 6.8366 \times 10^{-6} k_{E}.

400 \leq k_{p} \leq 900, \quad 10^{6} \leq k_{E} \leq 10^{7}, \quad 5 \leq k_{v} \leq 100, \quad 100 \leq k_{x} \leq 1000,

J(K) = \int_{t_{0}}^{t_{f}} \left[ \frac{20 (1 - \cos x_{1}(t))}{5 - \cos x_{1}(t_{0})} + \frac{100 (1 - \cos x_{2}(t))}{30 - \cos x_{2}(t_{0})} + \frac{1}{2} \left( \frac{\dot{x}_{1}(t)}{80 + |\dot{x}_{1}(t_{0})|} \right)^{2} + \frac{1}{2} \left( \frac{\dot{x}_{2}(t)}{100 + |\dot{x}_{2}(t_{0})|} \right)^{2} \right] dt,

x = (x_{1}, x_{2}, x_{3}, x_{4}) = (q_{1}, q_{2}, \dot{q}_{1}, \dot{q}_{2}).

x_{0} = (0, \frac{7\pi}{9}, 0, 0).

K_{\rm nom} = (770.152, 6255313.438, 35.190, 465.098),

Taxonomy

Topics: Control and Stability of Dynamical Systems · Advanced Control Systems Optimization · Adaptive Control of Nonlinear Systems

Full text

Chang Sik Lee [1] and Dong Eui Chang [2, 3]

[1] School of Electrical Engineering, KAIST, Daejeon, Korea. [email protected]

[2] Corresponding author, School of Electrical Engineering, KAIST, Daejeon, Korea. [email protected]

[3] This research has been in part supported by KAIST under grant N11180231 and N11190038, and by the ICT R&D program of MSIP/IITP [2016-0-00563, Research on Adaptive Machine Learning Technology Development for Intelligent Autonomous Digital Companion].

I INTRODUCTION

The task of stabilizing an underactuated mechanical system has been investigated for decades, and several methods have been proposed to address it [1, 2, 3, 4, 5, 6]. The idea of using a particular storage function built on the Euler-Lagrange equations of a mechanical system provides a framework for an effective energy-based swing-up controller [7]. A drawback of this result is that, when it is applied to a real system, the controller requires ad hoc tuning over a multidimensional parameter space.

Meanwhile, the construction of optimally tuned controllers has been studied from a wide variety of viewpoints [8, 9]. In recent years, as machine learning has spread to a variety of fields, it has also begun to influence the optimal control of mechanical systems [10, 11, 12, 13, 14, 15, 16]. Da et al. [12] deploy supervised learning methods to obtain more robust controllers for a 3D bipedal robot. In [13] and [16], reinforcement learning algorithms are used to compensate for unmodeled dynamics of systems. Furthermore, as a sample-efficient methodology for solving non-convex optimization problems, Bayesian optimization is widely adopted to optimize controllers [11, 14, 15].

However, the approaches in [11, 12, 13, 14, 15, 16] share a common problem: they look for local minima. In contrast, Marco et al. [10] tackle the task of finding proper parameter values for a controller that optimally stabilizes a linear model by using Entropy Search [17], a machine learning method that seeks a global minimum of a given cost function.

This paper takes advantage of this machine learning optimization technique to resolve the drawback of the energy-based controller [7] for stabilizing a nonlinear model. Specifically, we use an energy-based controller for a rotary inverted pendulum system and fit a Gaussian process estimation model through repeated evaluations of a cost function whose distribution is unknown, following the procedure of Entropy Search [17]. Consequently, we can globally estimate the optimal parameter value for the best performance of the controller.

II PROBLEM STATEMENT

Kolesnichenko and Shiriaev [7] have proposed an energy-based swing-up controller for an underactuated mechanical system and provided sufficient conditions on the controller’s gain parameters $K\in\mathbb{R}^{\ell}$ for successful swing-up. However, not all parameter values satisfying these conditions guarantee swing-up of the real system. Moreover, even though most parameter values yield controllers that eventually drive the system to the desired swing-up equilibrium point, their performance may not be satisfactory. Therefore, there remains the laborious task of finding a set of parameter values that reaches the desired equilibrium point swiftly and with little oscillation.

The task to find such values of control parameters is formulated as an optimization problem with a cost function $J(K)$ that properly reflects the desired performance,

$$K^{*}=\arg\min_{K\in\mathcal{D}}J(K),\tag{1}$$

where $\mathcal{D}$ is a parameter domain. To solve this optimization problem, we employ the Bayesian optimization technique called Entropy Search; refer to [17] for more details on Entropy Search. Entropy Search has the merit that, when not all the values of $J(K)$ are known, it globally estimates the given cost function $J(K)$ and finds a reliable global minimum, whereas most other algorithms seek local minima.

III Preliminaries

Before describing the main result, we give some background on Entropy Search.

III-A Entropy Search

The problem (1) can be stated as finding $K^{*}\in\mathcal{D}$ that minimizes a function $J(K)$ while the functional relationship between $K$ and $J(K)$ is not known a priori. Namely, the values of the cost function $J(K)$ may not be available or observable for all $K\in\mathcal{D}$. In such a situation, Bayesian optimization methods are useful since they repeatedly estimate an arbitrary black-box function $J(K)$ based on a probabilistic model and select an appropriate measurement point $K_{\rm next}$ for more accurate modeling. Among the available Bayesian techniques, we choose Entropy Search, which efficiently finds a global minimum [17].

Two tools are required for Bayesian optimization. One is a probabilistic model for estimating the black box function $J(K)$ based on measurements

$$H_{n}=\{J(K_{1}),J(K_{2}),\dots,J(K_{n})\},\tag{2}$$

and the other is a decision rule for specifying a new point $K_{n+1}$ where $J(K_{n+1})$ will be evaluated so that the estimation model approaches closer to the actual values of $J(K)$ .

First, as its estimation model, Entropy Search utilizes a Gaussian process. A Gaussian process is a non-parametric model generally used to estimate an unknown function $J(K)$. Let $m(K)$ be a prior mean and $k(K_{j},K_{l})$ a covariance function (kernel) between $J(K_{j})$ and $J(K_{l})$, where $K_{j},K_{l}\in\mathcal{D}$. The former encodes the prior belief on $J(K)$ and is usually a constant; the latter describes the relationship between the two random variables $J(K_{j})$ and $J(K_{l})$. Given the set of evaluations (2) at the set of points

$$h_{n}=\{K_{1},K_{2},\dots,K_{n}\},\tag{3}$$

the function value $J(K_{\rm new})$ at a new point $K_{\rm new}$ is a random variable with a Gaussian distribution with the posterior mean and variance given respectively by

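$$\mu_{n}(K_{\rm new})=m(K_{\rm new})+k_{n}(K_{\rm new})K_{n}^{-1}y_{n},$$

$$\sigma_{n}^{2}(K_{\rm new})=k(K_{\rm new},K_{\rm new})-k_{n}(K_{\rm new})K_{n}^{-1}k_{n}^{T}(K_{\rm new}),$$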

where

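$$[K_{n}]_{ij}=k(K_{i},K_{j}),\quad i,j\in\{1,2,3,\dots,n\},$$

$$k_{n}(K_{\rm new})=[k(K_{\rm new},K_{1}),k(K_{\rm new},K_{2}),\dots,k(K_{\rm new},K_{n})],$$

$$y_{n}=[J(K_{1})-m(K_{1}),J(K_{2})-m(K_{2}),\dots,J(K_{n})-m(K_{n})]^{T}.$$

These equations allow us to estimate the functional relationship between $J(K)$ and $K$. For more details, refer to [17, 18].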
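As an illustration of the posterior mean and variance formulas above, the following is a minimal sketch in plain Python. The function names (`rbf_kernel`, `solve`, `gp_posterior`), the squared-exponential kernel, and the `jitter` term added to the diagonal for numerical stability are assumptions made for this example; they are not taken from the paper, which uses a rational quadratic kernel.

```python
import math


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel (a stand-in chosen for this sketch)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return signal_var * math.exp(-0.5 * sq_dist / length_scale ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_posterior(points, costs, k_new, kernel, prior_mean, jitter=1e-9):
    """Return (mu_n(k_new), sigma_n^2(k_new)) given evaluations at `points`."""
    n = len(points)
    gram = [[kernel(points[i], points[j]) + (jitter if i == j else 0.0)
             for j in range(n)] for i in range(n)]            # [K_n]_ij
    k_vec = [kernel(k_new, p) for p in points]                # k_n(K_new)
    y = [c - prior_mean for c in costs]                       # y_n
    alpha = solve(gram, y)                                    # K_n^{-1} y_n
    beta = solve(gram, k_vec)                                 # K_n^{-1} k_n^T
    mean = prior_mean + sum(kv * a for kv, a in zip(k_vec, alpha))
    var = kernel(k_new, k_new) - sum(kv * b for kv, b in zip(k_vec, beta))
    return mean, max(var, 0.0)
```

At an already evaluated point the posterior mean reproduces the measured cost and the variance collapses toward zero; far from all evaluated points the posterior falls back to the prior mean and the prior variance.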

Secondly, in order to determine the next measurement point, Entropy Search computes the expected change $E[\bigtriangleup\mathbf{H}]$ in entropy $\mathbf{H}$ of $P_{\rm min}$ , where $P_{\rm min}$ and $\mathbf{H}$ are defined as

[TABLE]

with $\hat{J}(K)$ being the Gaussian process estimate of $J(K)$, i.e., $\hat{J}(K)\sim\mathcal{N}(\mu_{n}(K),\sigma^{2}_{n}(K))\enspace\forall K\in\mathcal{D}$, and $U(K)$ being the uniform distribution over $\mathcal{D}$. The next measurement point $K_{n+1}$ is then selected as the point with the largest expected change in entropy ( $E[\bigtriangleup\mathbf{H}]$ ). This decision rule rests on the assumption that the point $K_{n+1}$ obtained in this way is the most informative one.
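To make these two quantities concrete, the sketch below estimates $P_{\rm min}$ on a finite set of candidate points by Monte Carlo sampling and then evaluates an entropy relative to the uniform distribution. It rests on several assumptions that are not confirmed by the text here: $P_{\rm min}(K)$ is read as the probability that $K$ is the minimizer of $\hat{J}$, $\mathbf{H}$ is read as the relative entropy of $P_{\rm min}$ with respect to $U$ (the sign convention in [17] may differ), and the candidates are sampled independently from their marginal posteriors, which ignores the correlations of the Gaussian process. The function names are hypothetical.

```python
import math
import random


def estimate_p_min(means, stds, n_samples=5000, seed=0):
    """Monte Carlo estimate of the probability that each candidate is the minimizer.

    means[i], stds[i] are the posterior mean and standard deviation at candidate i.
    Candidates are sampled independently (a simplification for this sketch).
    """
    rng = random.Random(seed)
    counts = [0] * len(means)
    for _ in range(n_samples):
        draws = [rng.gauss(m, s) for m, s in zip(means, stds)]
        counts[draws.index(min(draws))] += 1
    return [c / n_samples for c in counts]


def relative_entropy_to_uniform(p_min):
    """Sum of p * log(p / u) with u uniform over the candidates."""
    u = 1.0 / len(p_min)
    return sum(p * math.log(p / u) for p in p_min if p > 0.0)
```

Under these assumptions the relative entropy is zero when every candidate is equally likely to be the minimizer and grows to $\log N$ for $N$ candidates as $P_{\rm min}$ concentrates on a single point, which is the "peaked" situation the search aims for.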

The measurement of $J(K_{n+1})$ is made at the new point $K_{n+1}$, and then $J(K_{n+1})$ and $K_{n+1}$ are added to the sets $H_{n}$ and $h_{n}$, respectively, after which the two sets are renamed $H_{n+1}$ and $h_{n+1}$. Entropy Search then returns a best guess point $K_{\rm bg}$ at which the cost function $J(K)$ is most likely to be minimal, that is, where $P_{\rm min}$ is largest. This marks the end of a single iteration.

The process is repeated until the model has sufficiently converged to the objective function $J(K)$ and $P_{\rm min}$ is peaked around the optimum [18]. Specifically, the process terminates when the posterior mean at the best guess, $\mu_{n}(K_{\rm bg})$, changes by no more than a threshold $\epsilon$ for $\gamma$ consecutive iterations. For more details, including the derivation of $E[\bigtriangleup\mathbf{H}]$, refer to [17].
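The termination rule can be sketched as a small predicate over the history of best-guess posterior means. The function name `should_stop`, the list-based history, and the optional iteration cap `max_iter` are assumptions for this example, and "not changing over a threshold for $\gamma$ consecutive iterations" is interpreted here as $\gamma$ successive iteration-to-iteration changes each being at most $\epsilon$.

```python
def should_stop(best_guess_means, eps, gamma, max_iter=None):
    """Decide whether to stop, given mu_n(K_bg) recorded after each iteration."""
    # Optional hard cap on the number of iterations.
    if max_iter is not None and len(best_guess_means) >= max_iter:
        return True
    # gamma consecutive changes require gamma + 1 recorded values.
    if len(best_guess_means) < gamma + 1:
        return False
    recent = best_guess_means[-(gamma + 1):]
    return all(abs(b - a) <= eps for a, b in zip(recent, recent[1:]))
```

For instance, `should_stop([5.0, 4.0, 3.995, 3.99, 3.992], eps=0.01, gamma=3)` evaluates to `True`, because the last three changes are all within the threshold.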

To sum up, given an initial condition, a termination threshold $\epsilon$, a duration $\gamma$, and a set of evaluations (2) at arbitrary points (3), Entropy Search is summarized in Algorithm 1.

IV Swing Up of the Furuta Pendulum

IV-A Swing-Up Controller

As an underactuated mechanical system, we choose the Quanser QUBE Servo 2 [19], which is a kind of Furuta pendulum. We assume an ideal model of the Furuta pendulum system with no noise and no friction. The configuration space $Q$ of the system is $Q=\mathbb{R}\times\mathbb{R}$ with coordinates $q=(q_{1},q_{2})$, where $q_{1}$ is the angle of the rotary arm and $q_{2}$ is the angle of the inverted pendulum, as shown in Figure 1. The Lagrangian $\mathcal{L}$ of the system is given by

$$\mathcal{L}(q,\dot{q})=\frac{1}{2}\dot{q}^{T}M(q)\dot{q}-PE(q)$$

where

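$$M(q)=\begin{pmatrix}I_{10}+I_{11}\sin^{2}q_{2}&-I_{12}\cos q_{2}\\ -I_{12}\cos q_{2}&I_{2}\end{pmatrix},\qquad PE(q)=V_{0}\cos q_{2}$$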

with $PE(q)$ being the potential energy. The Euler-Lagrange equations of the system are computed as

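$$M(q)\begin{pmatrix}\ddot{q}_{1}\\ \ddot{q}_{2}\end{pmatrix}+C(q,\dot{q})\begin{pmatrix}\dot{q}_{1}\\ \dot{q}_{2}\end{pmatrix}+G(q,\dot{q})=\begin{pmatrix}u\\ 0\end{pmatrix}$$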

where

[TABLE]

and

[TABLE]

where $m_{1}$ and $m_{2}$ are masses, $J_{1}$ and $J_{2}$ are moments of inertia, $l_{1}$ and $l_{2}$ are lengths of rotary arm and pendulum respectively. The symbol $g$ denotes the gravitational acceleration and $V_{0}$ is the potential energy at the equilibrium point $(q_{1},q_{2},\dot{q}_{1},\dot{q}_{2})=(0,0,0,0)$ . The values of the parameters are

[TABLE]

which are from the table on p.8 of [19]. The total energy $E$ is given by

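$$E(q,\dot{q})=\frac{1}{2}\left((I_{10}+I_{11}\sin^{2}q_{2})\dot{q}_{1}^{2}+I_{2}\dot{q}_{2}^{2}\right)-I_{12}\dot{q}_{1}\dot{q}_{2}\cos q_{2}+V_{0}\cos q_{2}.$$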
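As a quick check of how the total energy follows from the mass matrix $M(q)$ and the potential $PE(q)=V_{0}\cos q_{2}$, the sketch below evaluates $E=\frac{1}{2}\dot{q}^{T}M(q)\dot{q}+PE(q)$ numerically. The function names are hypothetical, and the inertia and potential constants are passed in as arguments because their numerical values are not reproduced in this text.

```python
import math


def mass_matrix(q2, I10, I11, I12, I2):
    """2x2 mass matrix M(q); it depends on the pendulum angle q2 only."""
    off_diag = -I12 * math.cos(q2)
    return [[I10 + I11 * math.sin(q2) ** 2, off_diag],
            [off_diag, I2]]


def total_energy(q2, dq1, dq2, I10, I11, I12, I2, V0):
    """Kinetic energy 0.5 * dq^T M(q) dq plus potential energy V0 * cos(q2)."""
    M = mass_matrix(q2, I10, I11, I12, I2)
    kinetic = 0.5 * (M[0][0] * dq1 ** 2
                     + 2.0 * M[0][1] * dq1 * dq2
                     + M[1][1] * dq2 ** 2)
    return kinetic + V0 * math.cos(q2)
```

With zero velocities the energy equals $V_{0}$ at $q_{2}=0$ and $-V_{0}$ at $q_{2}=\pi$, matching the potential energy term alone.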

Kolesnichenko and Shiriaev [7] introduce the following storage function $V(q,\dot{q})$ :

$$V(q,\dot{q})=k_{E}\frac{1}{2}(E-E_{0})^{2}+k_{v}\frac{1}{2}\dot{q}_{1}^{2}+k_{x}(1-\cos q_{1}),$$

where the original term $q_{1}^{2}/2$ in Kolesnichenko and Shiriaev [7] has been replaced by $(1-\cos q_{1})$ in order to take the periodicity of angle into account. From the storage function, one can easily derive the following energy-based controller

$$u=\frac{-k_{p}\dot{q}_{1}-H(q,\dot{q})}{k_{E}(E-E_{0})+k_{v}R(q)}\tag{5}$$

where

[TABLE]

See [7] for a detailed derivation.

The swing-up control law (5) contains 4 parameters: $k_{p}$ , $k_{E}$ , $k_{v}$ and $k_{x}$ , which are put in vector form as follows:

$$K=(k_{p},k_{E},k_{v},k_{x})\in\mathbb{R}^{4}.$$

According to Theorem 2 of [7], a sufficient condition on $K$ for successful swing-up is given by

$$k_{v}>6.8366\times 10^{-6}k_{E}.\tag{6}$$

In the range $|q_{2}|\leq$ [...], the swing-up controller (5) is switched to the LQR for the linearization of the system at the equilibrium point, with the weight matrices $Q=\operatorname{diag}([1,10,1,10])$ and $R=10000$.

To sum up, we swing up the rotary inverted pendulum relying on the energy-based controller (5). When the pendulum is in the region where the linearized model is effective, the LQR is turned on to hold the pendulum at the desired equilibrium point.

IV-B Optimization of Swing-Up Controller via Entropy Search

This section explains practical details about the optimization task to obtain an optimal swing-up controller. We first provide a common setup for simulations and experiments such as the range of parameters, a cost function, and the initial condition. The range of parameter vector $K$ is set as

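$$400\leq k_{p}\leq 900,\quad 10^{6}\leq k_{E}\leq 10^{7},\quad 5\leq k_{v}\leq 100,\quad 100\leq k_{x}\leq 1000,\tag{7}$$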

which defines the bounded domain $\mathcal{D}$. The above range for $K$ is determined on the basis of the following observations. In the controller formula (5), the energy term $(E-E_{0})$ is relatively small due to the small values of the system’s physical parameters, so the gain $k_{E}$ on the energy term is chosen from the range $10^{6}\leq k_{E}\leq 10^{7}$, which is large relative to the other gains. Moreover, the controller tends to work well when $k_{v}$ is close to its lower bound $6.8366\times 10^{-6}k_{E}$ given in (6), from which the range $5\leq k_{v}\leq 100$ is derived. The ranges of the other parameters $k_{p}$ and $k_{x}$ are chosen so that the controller works well in several simulations, provided that $k_{E}$ and $k_{v}$ are already set in the above ranges.
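The box constraints (7) and the sufficient condition (6) are two different tests on a gain vector, and a small sketch makes the distinction explicit. The names `BOUNDS`, `in_domain`, and `satisfies_sufficient_condition` are hypothetical; the numerical bounds and the constant $6.8366\times 10^{-6}$ are the ones stated in the text.

```python
# Bounds of the search domain D, in the order (k_p, k_E, k_v, k_x).
BOUNDS = [(400.0, 900.0), (1e6, 1e7), (5.0, 100.0), (100.0, 1000.0)]
RATIO = 6.8366e-6  # constant in the sufficient condition k_v > RATIO * k_E


def in_domain(K):
    """True if every gain lies inside its interval of the box (7)."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(K, BOUNDS))


def satisfies_sufficient_condition(K):
    """True if K satisfies the sufficient swing-up condition (6)."""
    _k_p, k_E, k_v, _k_x = K
    return k_v > RATIO * k_E
```

The two checks are kept separate because the box is not contained in the region defined by (6): for example, $k_{E}=10^{7}$ and $k_{v}=5$ lie inside the box, but (6) would require $k_{v}>68.366$. The nominal vector $(770.152, 6255313.438, 35.190, 465.098)$ reported in the paper is likewise inside the box while $35.190<6.8366\times 10^{-6}\times 6255313.438\approx 42.77$, which is consistent with (6) being sufficient rather than necessary.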

We set a cost function as follows:

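$$J(K)=\int_{t_{0}}^{t_{f}}\left[\frac{20(1-\cos x_{1}(t))}{5-\cos x_{1}(t_{0})}+\frac{100(1-\cos x_{2}(t))}{30-\cos x_{2}(t_{0})}+\frac{1}{2}\left(\frac{\dot{x}_{1}(t)}{80+|\dot{x}_{1}(t_{0})|}\right)^{2}+\frac{1}{2}\left(\frac{\dot{x}_{2}(t)}{100+|\dot{x}_{2}(t_{0})|}\right)^{2}\right]dt,$$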

where $t_{0}$ is the initial time, $t_{f}$ is the terminal time, and we use the following state vector

$$x=(x_{1},x_{2},x_{3},x_{4})=(q_{1},q_{2},\dot{q}_{1},\dot{q}_{2}).$$

By introducing initial conditions in the denominators, the cost value defined by the cost function above is less sensitive to changes in the initial condition, which makes cost values comparable across initial conditions. For this reason, this cost function is used to measure the performance of the controller in this paper.
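For a trajectory that is available only at sampled time instants, the cost integral can be approximated numerically. The sketch below uses the trapezoidal rule; the function names, the representation of the trajectory as a list of state tuples $(x_{1},x_{2},x_{3},x_{4})$, and the choice of quadrature rule are assumptions for this example, with $\dot{x}_{1}=x_{3}$ and $\dot{x}_{2}=x_{4}$ taken from the definition of the state vector.

```python
import math


def cost_integrand(x, x0):
    """Integrand of J(K) at state x, normalized by the initial state x0."""
    x1, x2, x3, x4 = x
    x1_0, x2_0, x3_0, x4_0 = x0
    arm_angle = 20.0 * (1.0 - math.cos(x1)) / (5.0 - math.cos(x1_0))
    pend_angle = 100.0 * (1.0 - math.cos(x2)) / (30.0 - math.cos(x2_0))
    arm_rate = 0.5 * (x3 / (80.0 + abs(x3_0))) ** 2
    pend_rate = 0.5 * (x4 / (100.0 + abs(x4_0))) ** 2
    return arm_angle + pend_angle + arm_rate + pend_rate


def cost(times, states):
    """Trapezoidal approximation of J(K) over a sampled trajectory."""
    x0 = states[0]
    values = [cost_integrand(x, x0) for x in states]
    return sum(0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
               for i in range(len(times) - 1))
```

A trajectory that sits at the target state $(0,0,0,0)$ accumulates zero cost, whereas a pendulum that never leaves its initial angle accumulates cost at a constant rate, so a faster swing-up with less oscillation yields a smaller value.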

The default initial condition for simulations and experiments is set as

$$x_{0}=\left(0,\frac{7\pi}{9},0,0\right).\tag{10}$$

With the setting given above, we find a nominal controller $u(K_{\rm nom})$ by running 10,000 simulations in Matlab Simulink, where the time span of each simulation is 30 seconds. Each simulation starts by choosing a gain parameter vector $K=(k_{p},k_{E},k_{v},k_{x})$ uniformly at random from the range (7) and ends by computing a cost value $J(K)$. After all the simulations are finished, the parameter vectors that result in the lowest costs in the simulations are tested in experiments to obtain their experimental costs. Through this procedure, the parameter vector that yields the lowest experimental cost has been found to be

$$K_{\rm nom}=(770.152,\,6255313.438,\,35.190,\,465.098),$$

which is used as the nominal parameter vector.
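The simulation stage of this random search can be sketched as follows. The helper names, the `cost_fn` callable standing in for one closed-loop simulation followed by the evaluation of $J(K)$, and the number `keep` of candidates retained for experimental testing are all assumptions for this example; the text does not state how many candidates were retained.

```python
import random

# Bounds of the range (7), in the order (k_p, k_E, k_v, k_x).
BOUNDS = [(400.0, 900.0), (1e6, 1e7), (5.0, 100.0), (100.0, 1000.0)]


def sample_gain(rng):
    """Draw one gain vector uniformly at random from the box (7)."""
    return tuple(rng.uniform(lo, hi) for lo, hi in BOUNDS)


def random_search(cost_fn, n_trials, keep=10, seed=0):
    """Evaluate cost_fn on n_trials random gain vectors; return the `keep` cheapest.

    cost_fn is a placeholder for "simulate the closed loop with gain K and
    compute J(K)"; it is not provided here.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        K = sample_gain(rng)
        trials.append((cost_fn(K), K))
    trials.sort(key=lambda pair: pair[0])
    return trials[:keep]
```

The returned list of `(cost, K)` pairs is sorted by simulated cost, so its entries would be the ones passed on to the experimental comparison described above.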

We now find an optimal controller $u(K_{\rm bg})$ using Entropy Search. For Gaussian process, we choose constant prior mean $m(K)=20$ and the rational quadratic kernel function

[TABLE]

with $s^{2}=9.894$ , $\alpha=0.131$ and

[TABLE]

The hyperparameters $m(K)$, $s$, $\alpha$ and $S$ of the Gaussian process have been determined from several rounds of simulation and hyperparameter fitting [18].
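For orientation, the sketch below implements a rational quadratic kernel in a commonly used form, $k(K,K')=s^{2}\left(1+\frac{r^{2}}{2\alpha}\right)^{-\alpha}$ with $r^{2}=\sum_{i}(K_{i}-K'_{i})^{2}/\ell_{i}^{2}$. This form, the assumption that $S$ acts as a diagonal scaling with per-dimension length scales $\ell_{i}$, and the placeholder values in `LENGTH_SCALES` are not taken from the paper; only $s^{2}=9.894$ and $\alpha=0.131$ are.

```python
SIGNAL_VAR = 9.894   # s^2, from the text
ALPHA = 0.131        # alpha, from the text
# Placeholder length scales for (k_p, k_E, k_v, k_x); hypothetical values.
LENGTH_SCALES = (100.0, 1e6, 10.0, 100.0)


def rq_kernel(K_a, K_b, signal_var=SIGNAL_VAR, alpha=ALPHA,
              length_scales=LENGTH_SCALES):
    """Rational quadratic kernel with per-dimension length scales."""
    r2 = sum(((a - b) / ell) ** 2
             for a, b, ell in zip(K_a, K_b, length_scales))
    return signal_var * (1.0 + r2 / (2.0 * alpha)) ** (-alpha)
```

In this form the kernel equals $s^{2}$ when the two gain vectors coincide and decays polynomially rather than exponentially with distance; with a small $\alpha$ such as 0.131 the decay is slow, so distant evaluations still inform the estimate.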

Before initializing Algorithm 1 to perform Entropy Search, we run 5 simulations with the default initial condition (10) to form a set of initial observations $H_{5}$ (2) at a set of points $h_{5}$ (3). Once the sets $H_{5}$ and $h_{5}$ are formed, Entropy Search starts by running Algorithm 1. In line 11 of Algorithm 1, we use simulations to compute trajectories of the system driven by the controller $u(K_{n+i})$, where a single simulation is run for 30 seconds with the default initial condition. The process is terminated when the posterior mean $\mu_{n}(K_{\rm bg})$ at the best guess $K_{\rm bg}$ has not changed by more than $\epsilon=0.01$ for $\gamma=3$ iterations or when $N=60$ iterations have been completed. After Algorithm 1 is completed, the resulting controller $u(K_{\rm bg})$ is verified in a 30-second simulation and a 30-second experiment.

After 60 iterations, Entropy Search obtains the optimal parameter vector

[TABLE]

Figure 2 shows how Entropy Search has converged to $K_{\rm ES}$ by iteratively evaluating a cost value $J(K_{n+i})$ and estimating the posterior mean $\mu_{n}(K_{\rm bg})$ at the best guess $K_{\rm bg}$. Specifically, the upper panel of Figure 2 plots, for each iteration, the cost value $J(K_{n+i})$ obtained at the suggested point $K_{n+i}$ following lines 9–12 of Algorithm 1. The lower panel plots, for each iteration, the posterior mean $\mu_{n}(K_{\rm bg})$ at the best guess $K_{\rm bg}$ given in line 14 of Algorithm 1. As the iterations proceed, the posterior mean $\mu_{n}(K_{\rm bg})$ at the best guess point $K_{\rm bg}$ approaches a certain value, which indicates that the estimation model has been fit to the real distribution of $J(K)$ over the iterations.

IV-C Performance Comparison

We have run two simulations for the default initial condition (10): one with the nominal controller $u(K_{\rm nom})$ and the other with the optimal controller $u(K_{\rm ES})$ , and have obtained the following cost values:

[TABLE]

from which it follows that the optimal controller yields a cost value $27.12\%$ lower than that of the nominal controller. Although the optimal gain $K_{\rm ES}$ has been obtained for the default initial condition, our exhaustive simulations show that it performs well for various initial conditions in the range $-\pi\leq x_{1}\leq\pi$, $-\pi\leq x_{2}\leq\pi$ with zero initial velocity. Figure 3 shows cost values of the optimal controller $u(K_{\rm ES})$ and the nominal controller $u(K_{\rm nom})$, sampled from the full set of costs computed in simulations and plotted over initial conditions.

For the purpose of verification, we test the two controllers $u(K_{\rm nom})$ and $u(K_{\rm ES})$ on the system of Quanser QUBE Servo 2 for the following initial conditions:

[TABLE]

with the other states at zero.

For each initial condition, the cost value $J$ is computed by averaging the cost values of 5 repeated experiments. The results are plotted in Figure 4. It can be seen that the optimal controller produces a lower cost value than the nominal controller for each initial condition. The time responses of the two controllers for the initial conditions $x_{2}(0)\in\left\{\frac{\pi}{3},\enspace\frac{\pi}{2},\enspace\frac{5\pi}{6}\right\}$ are measured in experiments and plotted in Figures 5, 6, and 7, respectively. It can be seen that the response with the optimal controller $u(K_{\rm ES})$ has a shorter settling time than that with the nominal controller for each initial condition. It follows that Entropy Search has succeeded in finding an energy-based controller that performs better than the nominal controller and leads to quick and firm stabilization of the rotary inverted pendulum. The video of the experiments is available at https://youtu.be/JcmpLU5rJCg.

V CONCLUSIONS

The energy-based controller proposed in [7] is not only easily derived by considering the energy of the system but also effective in stabilizing an underactuated nonlinear system. However, it still requires considerable effort, such as searching through a multidimensional parameter space, to find optimal parameter values. This paper proposes applying Entropy Search to the problem of finding the optimal gain parameter values of an energy-based swing-up controller for the Furuta pendulum system. Based on the results in Section IV-C, we conclude that Entropy Search successfully optimizes the given controller, so that the optimal controller attains better performance than the nominal controller. In the future, we will combine Entropy Search with a deep neural network [20] to enhance the performance of the controller.

Bibliography

[1] A. M. Bloch, D. E. Chang, N. E. Leonard, and J. E. Marsden, “Controlled Lagrangians and the stabilization of mechanical systems. II. Potential shaping,” IEEE Transactions on Automatic Control, vol. 46, no. 10, pp. 1556–1571, Oct 2001.
[2] D. E. Chang, “Stabilizability of controlled Lagrangian systems of two degrees of freedom and one degree of under-actuation by the energy-shaping method,” IEEE Transactions on Automatic Control, vol. 55, no. 8, pp. 1888–1893, Aug 2010.
[3] D. E. Chang, “The method of controlled Lagrangians: Energy plus force shaping,” SIAM Journal on Control and Optimization, vol. 48, no. 8, pp. 4821–4845, 2010.
[4] W. Ng, D. E. Chang, and G. Labahn, “Energy shaping for systems with two degrees of underactuation and more than three degrees of freedom,” SIAM Journal on Control and Optimization, vol. 51, no. 2, pp. 881–905, 2013.
[5] K. Åström and K. Furuta, “Swinging up a pendulum by energy control,” Automatica, vol. 36, no. 2, pp. 287–295, 2000.
[6] A. L. Fradkov, “Swinging control of nonlinear oscillations,” International Journal of Control, vol. 64, no. 6, pp. 1189–1202, 1996.
[7] O. Kolesnichenko and A. S. Shiriaev, “Partial stabilization of underactuated Euler–Lagrange systems via a class of feedback transformations,” Systems and Control Letters, vol. 45, no. 2, pp. 121–132, 2002.
[8] I. Kamwa, G. Trudel, and L. Gerin-Lajoie, “Robust design and coordination of multiple damping controllers using nonlinear constrained optimization,” in Proceedings of the 21st International Conference on Power Industry Computer Applications. Connecting Utilities. PICA 99. To the Millennium and Beyond (Cat. No.99CH 36351), May 1999, pp. 87–94.