Gait modeling and optimization for the perturbed Stokes regime

Matthew D. Kvalheim; Brian Bittner; Shai Revzen

arXiv:1906.04384·cs.RO·April 28, 2020

TL;DR

This paper develops a nonlinear reduced-order model for viscous-dominated locomotion in the perturbed Stokes regime, extending geometric mechanics to include inertial effects, and introduces an algorithm for data-driven gait analysis.

Contribution

It introduces a nonlinear shape-to-velocity model for the perturbed Stokes regime and an algorithm to estimate dynamics from observational data, improving over previous linear models.

Findings

1. Model performs well in Stokesian regime
2. Significantly improves prediction accuracy in perturbed Stokes regime
3. Enables better data-driven gait analysis and optimization

Equations

$$\accentset{\scriptstyle\circ}{g} = -A_{\textnormal{visc}}(r)\cdot\dot{r},$$

$$\accentset{\scriptstyle\circ}{g} = -A_{\textnormal{visc}}(r)\cdot\dot{r} + \epsilon B(r)\cdot\ddot{r} + \epsilon G(r)\cdot(\dot{r},\dot{r}) + \mathcal{O}(\epsilon^{2}).$$

$$\accentset{\scriptstyle\circ}{g}^{k}_{n} \approx -\underbrace{A^{k}_{m,i}\dot{\gamma}^{i}_{m}}_{C^{k}_{0,m}} - \underbrace{A^{k}_{m,i}}_{C^{k}_{1,m}}\dot{\delta}^{i}_{n} - \underbrace{\frac{\partial A^{k}_{m,i}}{\partial r^{j}}\dot{\gamma}^{i}_{m}}_{C^{k}_{2,m}}\delta^{j}_{n} - \underbrace{\frac{\partial A^{k}_{m,i}}{\partial r^{j}}}_{C^{k}_{3,m}}\delta^{j}_{n}\dot{\delta}^{i}_{n}.$$

$$\begin{bmatrix}\accentset{\scriptstyle\circ}{g}_{1}\\ \vdots\\ \accentset{\scriptstyle\circ}{g}_{N}\end{bmatrix} = \begin{bmatrix}1 & \delta_{1} & \dot{\delta}_{1} & \delta_{1}\otimes\dot{\delta}_{1}\\ \vdots & \vdots & \vdots & \vdots\\ 1 & \delta_{N} & \dot{\delta}_{N} & \delta_{N}\otimes\dot{\delta}_{N}\end{bmatrix}\cdot\begin{bmatrix}\widehat{C}_{0}\\ \widehat{C}_{1}\\ \widehat{C}_{2}\\ \widehat{C}_{3}\end{bmatrix}$$

$$\begin{aligned}
\accentset{\scriptstyle\circ}{g}^{k}_{n} \approx{} & -A^{k}_{m,i}\dot{\gamma}^{i}_{m} - A^{k}_{m,i}\dot{\delta}^{i}_{n} - \frac{\partial A^{k}_{m,i}}{\partial r^{j}}\delta^{j}_{n}\dot{\gamma}^{i}_{m} - \frac{\partial A^{k}_{m,i}}{\partial r^{j}}\delta^{j}_{n}\dot{\delta}^{i}_{n} \\
& + \epsilon\Big(B^{k}_{m,i}\ddot{\gamma}^{i}_{m} + B^{k}_{m,i}\ddot{\delta}^{i}_{n} + \frac{\partial B^{k}_{m,i}}{\partial r^{j}}\delta^{j}_{n}\ddot{\gamma}^{i}_{m} + \frac{\partial B^{k}_{m,i}}{\partial r^{j}}\delta^{j}_{n}\ddot{\delta}^{i}_{n} \\
& \quad + G^{k}_{m,i,j}\dot{\gamma}^{i}_{m}\dot{\gamma}^{j}_{m} + G^{k}_{m,i,j}\dot{\gamma}^{i}_{m}\dot{\delta}^{j}_{n} + G^{k}_{m,i,j}\dot{\delta}^{i}_{n}\dot{\gamma}^{j}_{m} + G^{k}_{m,i,j}\dot{\delta}^{i}_{n}\dot{\delta}^{j}_{n} \\
& \quad + \frac{\partial G^{k}_{m,i,j}}{\partial r^{\ell}}\delta^{\ell}_{n}\dot{\gamma}^{i}_{m}\dot{\gamma}^{j}_{m} + \frac{\partial G^{k}_{m,i,j}}{\partial r^{\ell}}\delta^{\ell}_{n}\dot{\gamma}^{i}_{m}\dot{\delta}^{j}_{n} + \frac{\partial G^{k}_{m,i,j}}{\partial r^{\ell}}\delta^{\ell}_{n}\dot{\delta}^{i}_{n}\dot{\gamma}^{j}_{m} + \frac{\partial G^{k}_{m,i,j}}{\partial r^{\ell}}\delta^{\ell}_{n}\dot{\delta}^{i}_{n}\dot{\delta}^{j}_{n}\Big).
\end{aligned}$$

$$\begin{aligned}
\accentset{\scriptstyle\circ}{g}^{k}_{n} \approx{} & \Big(-A^{k}_{m,i}\dot{\gamma}^{i}_{m} + \epsilon B^{k}_{m,i}\ddot{\gamma}^{i}_{m} + \epsilon G^{k}_{m,i,j}\dot{\gamma}^{i}_{m}\dot{\gamma}^{j}_{m}\Big) \\
& + \Big(-\frac{\partial A^{k}_{m,j}}{\partial r^{i}}\dot{\gamma}^{j}_{m} + \epsilon\frac{\partial B^{k}_{m,j}}{\partial r^{i}}\ddot{\gamma}^{j}_{m} + \epsilon\frac{\partial G^{k}_{m,j,\ell}}{\partial r^{i}}\dot{\gamma}^{j}_{m}\dot{\gamma}^{\ell}_{m}\Big)\delta^{i}_{n} \\
& + \Big(-A^{k}_{m,i} + \epsilon G^{k}_{m,j,i}\dot{\gamma}^{j}_{m} + \epsilon G^{k}_{m,i,j}\dot{\gamma}^{j}_{m}\Big)\dot{\delta}^{i}_{n} \\
& + \Big(-\frac{\partial A^{k}_{m,j}}{\partial r^{i}} + \epsilon\frac{\partial G^{k}_{m,\ell,j}}{\partial r^{i}}\dot{\gamma}^{\ell}_{m} + \epsilon\frac{\partial G^{k}_{m,j,\ell}}{\partial r^{i}}\dot{\gamma}^{\ell}_{m}\Big)\delta^{i}_{n}\dot{\delta}^{j}_{n} \\
& + \epsilon\Big(B^{k}_{m,i}\ddot{\delta}^{i}_{n} + \frac{\partial B^{k}_{m,j}}{\partial r^{i}}\delta^{i}_{n}\ddot{\delta}^{j}_{n} + G^{k}_{m,i,j}\dot{\delta}^{i}_{n}\dot{\delta}^{j}_{n} + \frac{\partial G^{k}_{m,j,\ell}}{\partial r^{i}}\delta^{i}_{n}\dot{\delta}^{j}_{n}\dot{\delta}^{\ell}_{n}\Big),
\end{aligned}$$

$$\begin{bmatrix}\accentset{\scriptstyle\circ}{g}_{1}\\ \vdots\\ \accentset{\scriptstyle\circ}{g}_{N}\end{bmatrix} = \begin{bmatrix}1 & \delta_{1} & \dot{\delta}_{1} & \ddot{\delta}_{1} & \delta_{1}\otimes\dot{\delta}_{1} & \delta_{1}\otimes\ddot{\delta}_{1} & \dot{\delta}_{1}\otimes\dot{\delta}_{1} & \delta_{1}\otimes\dot{\delta}_{1}\otimes\dot{\delta}_{1}\\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots\\ 1 & \delta_{N} & \dot{\delta}_{N} & \ddot{\delta}_{N} & \delta_{N}\otimes\dot{\delta}_{N} & \delta_{N}\otimes\ddot{\delta}_{N} & \dot{\delta}_{N}\otimes\dot{\delta}_{N} & \delta_{N}\otimes\dot{\delta}_{N}\otimes\dot{\delta}_{N}\end{bmatrix}\cdot\begin{bmatrix}\widehat{C}_{0}\\ \widehat{C}_{1}\\ \widehat{C}_{2}\\ \widehat{C}_{3}\\ \widehat{C}_{4}\\ \widehat{C}_{5}\\ \widehat{C}_{6}\\ \widehat{C}_{7}\end{bmatrix}$$

$$\accentset{\scriptstyle\circ}{g} = \begin{bmatrix}\cos(\theta) & -\sin(\theta) & 0\\ \sin(\theta) & \cos(\theta) & 0\\ 0 & 0 & 1\end{bmatrix}\dot{g}.$$

$$C_{\frac{d}{n}} = c\begin{bmatrix}C_{x}\frac{d}{n} & 0 & 0\\ 0 & C_{y}\frac{d}{n} & 0\\ 0 & 0 & \frac{1}{12}\left(\frac{d}{n}\right)^{3}C_{y}\end{bmatrix},\qquad C_{L} = c\begin{bmatrix}C_{x}L & 0 & 0\\ 0 & C_{y}L & 0\\ 0 & 0 & \frac{1}{12}L^{3}C_{y}\end{bmatrix},$$

$$F_{\textnormal{body}} = c\bar{F}_{\textnormal{body}} = -C_{L}\,\accentset{\scriptstyle\circ}{g}.$$

$$F_{i} = c\bar{F}_{i} = -W_{i}\,C_{\frac{d}{n}}\,V_{i}\begin{bmatrix}\accentset{\scriptstyle\circ}{g}\\ \dot{\alpha}\end{bmatrix},$$

$$\begin{aligned}
V_{i}\cdot\begin{bmatrix}\accentset{\scriptstyle\circ}{g}\\ \dot{\alpha}\end{bmatrix} &= \begin{bmatrix}R^{-1}_{\alpha_{*}+\dots+\alpha_{i}}\,\accentset{\scriptstyle\circ}{g}_{x,y} + \Big(\frac{d}{2n}\big(\dot{\theta}+\sum_{k=*}^{i}\dot{\alpha}_{k}\big) + \frac{d}{n}\sum_{k=*}^{i-1}\big(\dot{\theta}+\sum_{j=*}^{k}\dot{\alpha}_{j}\big)R^{-1}_{\alpha_{k+1}+\dots+\alpha_{i}}\Big)e_{2}\\ \dot{\theta}+\sum_{k=*}^{i}\dot{\alpha}_{k}\end{bmatrix} \\
W_{i}\cdot\begin{bmatrix}f\\ \tau\end{bmatrix} &= \begin{bmatrix}R_{\alpha_{*}+\dots+\alpha_{i}}\,f\\ \tau + e_{2}^{T}\Big(\frac{d}{2n}I_{2\times 2} + \frac{d}{n}\sum_{k=*+1}^{i}R_{\alpha_{k}+\alpha_{k+1}+\dots+\alpha_{i}}\Big)\cdot f\end{bmatrix},
\end{aligned}$$

$$\ddot{g}=\begin{bmatrix}\ddot{x}\\ \ddot{y}\\ \ddot{\theta}\end{bmatrix}=\frac{1}{\epsilon}\begin{bmatrix}1&0&0\\ 0&1&0\\ 0&0&\frac{1}{\bar{I}}\end{bmatrix}\begin{bmatrix}\cos(\theta)&-\sin(\theta)&0\\ \sin(\theta)&\cos(\theta)&0\\ 0&0&1\end{bmatrix}\Bigg(\bar{F}_{\textnormal{body}}+\sum_{i=1}^{n}\bar{F}_{i}\Bigg),$$

$$\langle J(v_{q}),\xi\rangle = \langle\mathbb{F}L(v_{q}),\xi_{Q}(q)\rangle = m\,k_{q}(v_{q},\xi_{Q}(q)),$$

$$\langle I(q)\xi,\eta\rangle \coloneqq \langle\mathbb{F}L(\xi_{Q}(q)),\eta_{Q}(q)\rangle = m\,k_{q}(\xi_{Q}(q),\eta_{Q}(q)),$$

$$\langle K(v_{q}),\xi\rangle = \langle F_{R}(v_{q}),\xi_{Q}(q)\rangle = -c\,\nu_{q}(v_{q},\xi_{Q}(q)),$$

$$\langle V(q)\xi,\eta\rangle \coloneqq \langle F_{R}(\xi_{Q}(q)),\eta_{Q}(q)\rangle = -c\,\nu_{q}(\xi_{Q}(q),\eta_{Q}(q)),$$

$$\forall g\in G:\quad \Gamma_{\textnormal{mech}}\circ\mathsf{D}\theta_{g} = \mathrm{Ad}_{g}\circ\Gamma_{\textnormal{mech}},\qquad \Gamma_{\textnormal{visc}}\circ\mathsf{D}\theta_{g} = \mathrm{Ad}_{g}\circ\Gamma_{\textnormal{visc}}$$

$$\dot{J} = K,$$

$$\begin{aligned}
\langle\dot{J}(q,\dot{q}),\xi\rangle &= \frac{d}{dt}\left(\frac{\partial L(q(t),\dot{q}(t))}{\partial\dot{q}}\,\xi_{Q}(q(t))\right) \\
&= \left(\frac{d}{dt}\frac{\partial L}{\partial\dot{q}}\right)\xi_{Q}(q) + \frac{\partial L}{\partial\dot{q}}\,\mathsf{D}\xi_{Q}(q)\,\dot{q} \\
&= \left(\frac{\partial L}{\partial q} + F_{R} + F_{E}\right)\xi_{Q}(q) + \frac{\partial L}{\partial\dot{q}}\,\mathsf{D}\xi_{Q}(q)\,\dot{q},
\end{aligned}$$

$$\begin{aligned}
\langle\dot{J}(q,\dot{q}),\xi\rangle &= \frac{\partial}{\partial s}L\big(\Phi^{s}_{\xi}(q(t)),\mathsf{D}\Phi^{s}_{\xi}(q(t))\,\dot{q}(t)\big) + \langle F_{R}(q,\dot{q}),\xi_{Q}(q)\rangle \\
&= \frac{\partial}{\partial s}L\big(\Phi^{s}_{\xi}(q(t)),\mathsf{D}\Phi^{s}_{\xi}(q(t))\,\dot{q}(t)\big) + \langle K(q,\dot{q}),\xi\rangle.
\end{aligned}$$

$$\dot{J} = 0.$$

$$\begin{aligned}
\Gamma_{\textnormal{mech}}(r,g)\cdot(\dot{r},\dot{g}) &= \mathrm{Ad}_{g}\big(\accentset{\scriptstyle\circ}{g} + A_{\textnormal{mech}}(r)\cdot\dot{r}\big) \\
\Gamma_{\textnormal{visc}}(r,g)\cdot(\dot{r},\dot{g}) &= \mathrm{Ad}_{g}\big(\accentset{\scriptstyle\circ}{g} + A_{\textnormal{visc}}(r)\cdot\dot{r}\big),
\end{aligned}$$

$$p \coloneqq \mathrm{Ad}_{g}^{*}J \in \mathfrak{g}^{*}.$$

$$\begin{aligned}
I_{\textnormal{loc}} &\coloneqq \mathrm{Ad}_{g}^{*}\,I\,\mathrm{Ad}_{g}\colon \mathfrak{g}\to\mathfrak{g}^{*} \\
V_{\textnormal{loc}} &\coloneqq \mathrm{Ad}_{g}^{*}\,V\,\mathrm{Ad}_{g}\colon \mathfrak{g}\to\mathfrak{g}^{*}
\end{aligned}$$

$$\begin{aligned}
\accentset{\scriptstyle\circ}{g} &= -A_{\textnormal{mech}}\cdot\dot{r} + I_{\textnormal{loc}}^{-1}p \\
\dot{p} &= V_{\textnormal{loc}}(A_{\textnormal{visc}} - A_{\textnormal{mech}})\cdot\dot{r} + V_{\textnormal{loc}}I_{\textnormal{loc}}^{-1}p + \mathrm{ad}^{*}_{I_{\textnormal{loc}}^{-1}p}\,p - \mathrm{ad}^{*}_{A_{\textnormal{mech}}\cdot\dot{r}}\,p,
\end{aligned}$$

$$\ddot{r} = f(t,r,\dot{r},I_{\textnormal{loc}}^{-1}p)$$

$$I_{\textnormal{loc}}(r) \eqqcolon m\,\bar{I}_{\textnormal{loc}}(r),\qquad V_{\textnormal{loc}}(r) \eqqcolon c\,\bar{V}_{\textnormal{loc}}(r).$$

$$\begin{aligned}
\accentset{\scriptstyle\circ}{g} &= -A_{\textnormal{mech}}\cdot\dot{r} + \frac{1}{m}\bar{I}_{\textnormal{loc}}^{-1}p \\
\epsilon\,\bar{I}_{\textnormal{loc}}\bar{V}_{\textnormal{loc}}^{-1}\dot{p} &= m\,\bar{I}_{\textnormal{loc}}(A_{\textnormal{visc}} - A_{\textnormal{mech}})\cdot\dot{r} + p + \epsilon\,\bar{I}_{\textnormal{loc}}\bar{V}_{\textnormal{loc}}^{-1}\,\mathrm{ad}^{*}_{I_{\textnormal{loc}}^{-1}p}\,p - \epsilon\,\bar{I}_{\textnormal{loc}}\bar{V}_{\textnormal{loc}}^{-1}\,\mathrm{ad}^{*}_{A_{\textnormal{mech}}\cdot\dot{r}}\,p.
\end{aligned}$$

$$\accentset{\scriptstyle\circ}{g} = -A_{\textnormal{visc}}\cdot\dot{r}.$$

Full text

Gait modeling and optimization for the perturbed Stokes regime

Matthew D. Kvalheim (corresponding author; EECS Department, University of Michigan, Ann Arbor, MI, USA; [email protected]), Brian Bittner (Robotics Institute, University of Michigan, Ann Arbor, MI, USA; [email protected]), Shai Revzen (Department of EECS, Department of EEB, Robotics Institute, University of Michigan, Ann Arbor, MI, USA; [email protected])

Abstract

Many forms of locomotion, both natural and artificial, are dominated by viscous friction in the sense that without power expenditure they quickly come to a standstill. From geometric mechanics, it is known that for swimming at the “Stokesian” (viscous; zero Reynolds number) limit, the motion is governed by a reduced order “connection” model that describes how body shape change produces motion for the body frame with respect to the world. In the “perturbed Stokes regime” where inertial forces are still dominated by viscosity, but are not negligible (low Reynolds number), we show that motion is still governed by a functional relationship between shape velocity and body velocity, but this function is no longer linear in shape change rate. We derive this model using results from singular perturbation theory, and the theory of noncompact normally hyperbolic invariant manifolds (NHIMs).

Using the theoretical properties of this reduced-order model, we develop an algorithm that estimates an approximation to the dynamics near a cyclic body shape change (a “gait”) directly from observational data of shape and body motion. This extends our previous work which assumed kinematic “connection” models. To compare the old and new algorithms, we analyze simulated swimmers over a range of inertia to damping ratios. Our new class of models performs well on the Stokesian regime, and over several orders of magnitude outside it into the perturbed Stokes regime, where it gives significantly improved prediction accuracy compared to previous work.

In addition to algorithmic improvements, we thereby present a new class of models that is of independent interest. Their application to data-driven modeling improves our ability to study the optimality of animal gaits, and our ability to use hardware-in-the-loop optimization to produce gaits for robots.

1 Introduction
1.1 Acknowledgements
2 Background
3 Estimating Data-Driven Models in the Perturbed Stokes Regime
3.1 Determination of regressors for estimation of the dynamics
3.2 Local models enable optimality testing and optimization
4 Performance Comparison of the Two Data-Driven Models
4.1 Modeling a swimmer
4.2 Comparison of the estimated models
4.2.1 Algorithm comparison using manually selected gaits
4.2.2 Algorithm comparison using extremal gaits
4.2.3 Performance gains grow with shape space dimension
4.3 Discussion
5 Conclusion
A Appendix A — Derivation of the Equations of Motion
A.1 The mechanical and viscous connections
A.2 Local form of the equations of motion
A.3 Reduction in the Stokesian limit
B Appendix B — Reduction in the Perturbed Stokes Regime

1 Introduction

In this paper, we study how animals and robots move through space by deforming the “shape” of their body — typically in a cyclic fashion — to propel that body. We call such motion-producing cyclic shape deformations gaits. We study a class of locomotion which includes swimming and crawling in viscous media, in which the viscous damping forces are large compared to the inertia of the body. A classic exposition of such locomotors “living life at low Reynolds number” is given in Purcell (1977). An important aspect of our work is that we consider the perturbed Stokes regime (Eldering and Jacobs, 2016) in which the inertia-damping ratio (or Reynolds number) is small but nonzero, as opposed to previous geometric mechanics literature addressing only the viscous or Stokesian limit which formally assumes the inertia-damping ratio is zero (Kelly and Murray, 1996, 1995; Hatton and Choset, 2011, 2013; Bittner et al., 2018). We note that our methods are related to the realization of nonholonomic constraints as a limit of friction forces (Brendelev, 1981; Karapetian, 1981; Eldering, 2016).

For both scientific and engineering purposes, it is often of interest to ask whether a particular gait is optimal with respect to a goal function. For animal locomotion, explicit equations of motion are nigh impossible to come by, and therefore directly testing animal gait optimality via analytical tools like the calculus of variations is not an option. However, if a model can be obtained from experimental data for the local dynamics on a tubular neighborhood of the gait cycle — i.e. a model valid for small variations in the gait cycle — then local optimality tests can be formulated and evaluated on these models. Such an approach was taken in Bittner et al. (2018), which introduced an algorithm informed by both geometric mechanics and data-driven techniques for studying oscillators (Revzen and Guckenheimer, 2008; Revzen, 2009; Revzen and Kvalheim, 2015).

One limitation of Bittner et al. (2018) was the assumption that motion was entirely kinematic, effectively assuming that the inertia-damping ratio is zero by assuming a viscous connection-based model as introduced by Kelly and Murray (1995) and to be discussed more below. The real-world systems we are interested in have small — but always nonzero — inertia-damping ratio, and therefore we are interested in the extent to which the algorithm of Bittner et al. (2018) can be improved.

By applying normally hyperbolic invariant manifold (NHIM) theory (Fenichel, 1971, 1974, 1977; Hirsch et al., 1977; Fenichel, 1979; Eldering, 2013) in a singular perturbation context, we show that an exponentially stable invariant slow manifold exists for small inertia-damping ratio (this was also shown in Eldering and Jacobs (2016)). Furthermore, this slow manifold is close to the viscous connection (viewed geometrically as a subbundle — hence as a submanifold — of state space), and therefore the dynamics restricted to the slow manifold are close to those assumed in the purely viscous case (Kelly and Murray, 1996, 1995; Hatton and Choset, 2011; Bittner et al., 2018), and reduce to those in the zero inertia-damping ratio limit. Aside from its theoretical appeal, this result also has practical implications: it is possible to explicitly compute “correction terms” which, when added to the purely-viscous connection model, yield the dynamics restricted to the slow manifold. The slow-manifold dynamics are provably more accurate than those of the idealized viscous connection model. Additionally, they still enjoy the same useful properties of reduced dimension and symmetry under the group. The computation of such correction terms is a fundamental technique in geometric singular perturbation theory (Fenichel, 1979; Jones, 1995), and has been used, e.g., to compute reduced-order models of robots with flexible joints (Spong et al., 1987).

Given an algorithm that produces a data-driven local model of dynamics near a gait, we could conduct variational tests for local optimality of that gait with respect to any cost functional that the model allows us to evaluate. Thus we have in mind two classes of application for the approach we present below: a biological application — verification of whether a postulated goal function is optimized for an observed animal gait, and an engineering application — optimization of robot gaits with “hardware-in-the-loop” by iteratively modeling and improving the gait with respect to a goal functional without the need for precise models of the robot or its interactions with the environment.

It is clear why our approach would be a boon to biology. In most cases we cannot cajole animals to vary their gaits and observe whether that improves them. Additionally, we rarely have detailed enough models of animal-environment interaction to allow gait optimality to be assessed from a model.

The value to gait optimization of robots comes from the fact that a gait, being a periodic continuous function of shape, is an infinite-dimensional object. Thus, gait parameterizations are unavoidably of high dimension. Any gradient calculation for optimization of a gait thus requires many tests to identify the influence of these many parameters. Combined with the high practical cost of hardware experiments in terms of time and robot wear-and-tear, this renders hardware-in-the-loop optimization nigh infeasible. We propose that by producing a tractably computable local model, we can resolve this problem. The high-dimensional gradients can be computed by simulating the (local) model instead of directly using the hardware, decoupling the dimension of the gait parameterization from the number of experiments conducted on hardware.

It is our hope that, through a combination of geometric mechanics and NHIM theory, we can develop an algorithm which can serve the purposes of both biologists and engineers.

1.1 Acknowledgements

The authors were supported by NSF CMMI 1825918 and ARO grants W911NF-14-1-0573 and W911NF-17-1-0306 to Revzen. Kvalheim would like to thank Jaap Eldering for introducing him to the relevance of NHIM theory to locomotion, for helpful comments and suggestions regarding the global asymptotic stability of the slow manifold of Thm. 2, and for other useful suggestions.

2 Background

In studying locomotion, we will consider dissipative Lagrangian mechanical systems on a product configuration space $Q=S\times G$ with coordinates $(r,g)$, and with a Lagrangian of the form kinetic minus potential energy. Here $S$ is the shape space of the locomoting body, and $G$ is a Lie group (typically a subgroup of the Euclidean group $\mathsf{SE}(3)$ of rigid motions) representing the body’s position and orientation in the world.[4] We assume throughout this paper that $S$ is compact. We will also assume that this system is subjected to external viscous drag forces which are linear in velocity.[5]

[4] In a formal sense, one may start with generalized coordinates $Q$ and the action of $G$, and define $S$ as a quotient manifold $Q/G$. The details of this construction are not germane to our argument. Instead, for simplicity we postulate the separation of configuration into “shape” and “body-frame” here, with the more general case treated in the appendices.

[5] We make this assumption for simplicity. In principle, it should be possible to relax this assumption to derive modified but similar results for a force depending nonlinearly on velocities, as long as the linear approximation (with respect to velocities) of this force satisfies the same assumptions that we impose on our assumed linear force.

If the physics of locomotion are independent of the body’s position and orientation, then the Lagrangian $L(r,g,\dot{r},\dot{g})$ is independent of $g,\dot{g}$ and the viscous drag force $F_{R}(r,g,\dot{r},\dot{g})$ is equivariant in $g$ (on the $g,\dot{g}$ components). Under this symmetry assumption, Kelly and Murray (1996) derived general equations of motion satisfied by $g$ and by the *body momentum* $p\in\mathfrak{g}^{*}$;[6] these equations are essentially special cases of those derived in Bloch et al. (1996). For a detailed statement and derivations of these equations, see §A.

[6] Here $\mathfrak{g}^{*}$ is the vector space dual of the Lie algebra $\mathfrak{g}$ of $G$.

Let us suppose that the kinetic energy metric of the body is scaled by a dimensionless inertial parameter $m>0$ and that the viscous drag force $F_{R}$ is scaled by a dimensionless damping parameter $c>0$, and define $\epsilon\coloneqq\frac{m}{c}$, the dimensionless ratio of the two, which is (up to scale) the Reynolds number in the case of fluid dynamics. Kelly and Murray (1996) showed that in the limit $\epsilon\to 0$, the equation of motion for $g$ becomes independent of $p$. Defining the body velocity[7] $\accentset{\scriptstyle\circ}{g}\coloneqq\mathsf{D}\mathrm{L}_{g^{-1}}\dot{g}$, they obtained

$$\accentset{\scriptstyle\circ}{g} = -A_{\textnormal{visc}}(r)\cdot\dot{r}, \tag{1}$$

where $A_{\textnormal{visc}}$ is called the local viscous connection.

[7] The body velocity is often written $g^{-1}\dot{g}$ by an abuse of notation which is only defined on matrix Lie groups, where the product of a tangent vector and a group element is naturally defined. For a general definition, note that $\dot{g}\in\mathsf{T}_{g}G$, and the derivative of the left action $\mathsf{D}\mathrm{L}_{g^{-1}}$ restricts to a map $\mathsf{T}_{g}G\to\mathsf{T}_{e}G\cong\mathfrak{g}$. Hence the definition above.
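As a minimal sketch of how Eqn. (1) turns a shape trajectory into body motion, the following Python snippet integrates a one-dimensional body velocity over a circular gait in a two-dimensional shape space. The connection `A_visc` and the gait are hypothetical toys chosen for illustration (they do not correspond to any swimmer studied in this paper), and the group is taken to be translation along a line so that body and world velocities coincide.

```python
import numpy as np

def A_visc(r):
    """Hypothetical local connection: 2D shape space, 1D translation group."""
    r1, r2 = r
    return np.array([-0.5 * r2, 0.5 * r1])  # row vector acting on shape velocity

def gait(phase):
    """Circular gait in shape space and its derivative with respect to phase."""
    r = np.array([np.cos(phase), np.sin(phase)])
    dr_dphase = np.array([-np.sin(phase), np.cos(phase)])
    return r, dr_dphase

def displacement_per_cycle(period, steps=2000):
    """Integrate body velocity -A_visc(r) . r_dot over one cycle (midpoint rule)."""
    dt = period / steps
    omega = 2.0 * np.pi / period
    x = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt
        r, dr_dphase = gait(omega * t)
        r_dot = omega * dr_dphase          # chain rule: shape velocity in time
        x += -(A_visc(r) @ r_dot) * dt
    return x

slow = displacement_per_cycle(period=10.0)  # gait executed slowly
fast = displacement_per_cycle(period=1.0)   # same gait, ten times faster
```

For this toy connection the integrand is constant along the gait, so both `slow` and `fast` equal $-\pi$: because Eqn. (1) is linear in $\dot{r}$, the displacement per cycle depends only on the path traced in shape space, not on how quickly it is traversed.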

Away from the Stokes limit, Eldering and Jacobs (2016) studied the perturbed Stokes regime in which $\epsilon$ is assumed to be small but nonzero. For $\epsilon$ sufficiently small they showed there is an exponentially stable invariant slow manifold $M_{\epsilon}$, to which the dynamics converge. We derive similar results tailored for our applications in §B. Using an asymptotic series expansion for the slow manifold, in §B we also prove that the equations of motion for trajectories within $M_{\epsilon}$ take the form given by Thm. 1 below. Hence trajectories of the full dynamics converge to solutions of Eqn. (2) below, after a transient duration that goes to zero with $\epsilon$.
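To illustrate how an asymptotic expansion of a slow manifold produces acceleration and quadratic-velocity corrections, consider the following scalar toy calculation. It is only a caricature under assumed dynamics, not the derivation of §B: a velocity $v$ relaxes at rate $1/\epsilon$ toward a "viscous" value $u(t)=-A(r)\dot{r}$, with $r$ a scalar shape variable and $A$ a hypothetical smooth function.

$$\begin{aligned}
&\epsilon\dot{v} = -(v-u), \qquad v = h_{0} + \epsilon h_{1} + \mathcal{O}(\epsilon^{2}), \\
&\mathcal{O}(1):\ h_{0} = u, \qquad \mathcal{O}(\epsilon):\ h_{1} = -\dot{h}_{0} = -\dot{u}, \\
&u = -A(r)\dot{r} \ \Longrightarrow\ \dot{u} = -A'(r)\dot{r}^{2} - A(r)\ddot{r}, \\
&v = -A(r)\dot{r} + \epsilon A(r)\ddot{r} + \epsilon A'(r)\dot{r}^{2} + \mathcal{O}(\epsilon^{2}).
\end{aligned}$$

In this toy case the correction terms are a linear function of $\ddot{r}$ and a quadratic function of $\dot{r}$, which is the same structure as the model stated in Thm. 1; the identifications $B=A$ and $G=A'$ hold only for this caricature.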

Theorem 1.

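Assume that the shape space $S$ is compact. For sufficiently small $\epsilon>0$, there exist smooth fields of linear maps $B(r)$ and bilinear maps $G(r)$ such that the dynamics restricted to the slow manifold $M_{\epsilon}$ satisfy

$$\accentset{\scriptstyle\circ}{g} = -A_{\textnormal{visc}}(r)\cdot\dot{r} + \epsilon B(r)\cdot\ddot{r} + \epsilon G(r)\cdot(\dot{r},\dot{r}) + \mathcal{O}(\epsilon^{2}). \tag{2}$$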

*Remark 1.*

The bilinear maps or $(1,2)$ tensors $G(r)$ are not, in general, symmetric; in this respect they are unlike, e.g., Hessians.

Bittner et al. (2018) developed a data-driven algorithm for approximating the equations of motion of a locomotion system assuming the model of Eqn. (1). Here we define and study an extension of their approach to models of the form of Eqn. (2). We examine the efficacy of this extension in modeling motion in the perturbed Stokes regime, in which $\epsilon$ is allowed to be small but nonzero.
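The practical difference between the two model classes can be sketched numerically. The snippet below evaluates the right-hand side of Eqn. (2), truncated at first order in $\epsilon$, at a single shape. The arrays `A`, `B`, and `G` are random placeholders standing in for $A_{\textnormal{visc}}(r)$, $B(r)$, and $G(r)$; they are assumptions for illustration, not values derived from any physical system.

```python
import numpy as np

def body_velocity(A, B, G, r_dot, r_ddot, eps):
    """Evaluate -A.r_dot + eps*B.r_ddot + eps*G(r_dot, r_dot) at one shape r.

    A, B: arrays of shape (k, d); G: array of shape (k, d, d), not assumed symmetric.
    """
    linear = -A @ r_dot
    accel = eps * (B @ r_ddot)
    quad = eps * np.einsum("kij,i,j->k", G, r_dot, r_dot)
    return linear + accel + quad

rng = np.random.default_rng(0)
k, d = 3, 2                      # group dimension (e.g. planar motion) and shape dimension
A = rng.normal(size=(k, d))      # placeholder values, for illustration only
B = rng.normal(size=(k, d))
G = rng.normal(size=(k, d, d))
r_dot = np.array([0.3, -0.2])
r_ddot = np.zeros(d)

# Perturbed Stokes model: doubling the shape velocity does not double body velocity.
v1 = body_velocity(A, B, G, r_dot, r_ddot, eps=0.1)
v2 = body_velocity(A, B, G, 2.0 * r_dot, r_ddot, eps=0.1)

# Stokesian limit (eps = 0): the model is linear in shape velocity.
stokes1 = body_velocity(A, B, G, r_dot, r_ddot, eps=0.0)
stokes2 = body_velocity(A, B, G, 2.0 * r_dot, r_ddot, eps=0.0)
```

Here `stokes2` equals `2 * stokes1`, whereas `v2 - 2 * v1` equals $2\epsilon\,G\cdot(\dot{r},\dot{r})$, the contribution of the quadratic term.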

3 Estimating Data-Driven Models in the Perturbed Stokes Regime

In this section, we develop a data-driven algorithm for estimating the dynamics Eqn. (2) in a neighborhood of an exponentially stable periodic orbit. We assume that the image of this periodic orbit is contained in the slow manifold $M_{\epsilon}$ of Thm. 1, and for simplicity we assume that, on the slow manifold, $\ddot{r}=f(r,\dot{r})$ can be written autonomously as a function of $r$ and $\dot{r}$. Letting $\gamma(t)$ denote the shape (or $r$) component of this periodic orbit, we refer to $\gamma$ as a gait.

3.1 Determination of regressors for estimation of the dynamics

In this section we closely follow the approach of Bittner et al. (2018) to produce a data-driven model of the dynamics from an ensemble of noisy trajectories near $\Gamma\coloneqq\operatorname{Im}\gamma$. We use the Einstein summation convention extensively in the regression equations below.

Let $T$ be the period of $\gamma$. Since we assume that the exponentially stable periodic orbit is contained in the slow manifold on which $\ddot{r}$ is of the form $\ddot{r}=f(r,\dot{r})$, it follows that there is an asymptotic phase map $\phi\colon\mathsf{T}S\to[0,T)$ whose derivative along trajectories is equal to one (Guckenheimer, 1975). Given trajectory data $(r(t),\dot{r}(t)),\ t\in[t_{0},t_{1}]$, we assign asymptotic phase values $\phi_{t}\coloneqq\phi(r(t),\dot{r}(t))$ to each data point using an algorithm such as that of Revzen and Guckenheimer (2008).[8] After grouping data points according to their phase values, we construct Fourier series models of $\gamma,\dot{\gamma},\ddot{\gamma}$ as functions of phase.[9]

[8] In principle, any circle-valued “phase” function of state whose derivative along trajectories is positive could be used instead of asymptotic phase. We chose to use asymptotic phase because it is dynamically meaningful and there exist algorithms to compute it.

[9] In practice the Fourier series models of $\gamma,\dot{\gamma},\ddot{\gamma}$ might be computed from their own noisy data sets, and in this case the resulting Fourier models need not be derivatives of one another. We find that the use of matched filters is helpful in mitigating this issue; see Bittner et al. (2018); Revzen (2009) for more details.
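The Fourier-modeling step can be sketched as an ordinary least-squares fit. The snippet below assumes that phase estimates are already available and have been rescaled to radians in $[0,2\pi)$; the phase-estimation algorithm itself is not reproduced here. The helper names (`fourier_design`, `fit_fourier`, `eval_fourier`) and the synthetic data are hypothetical, introduced only for this illustration.

```python
import numpy as np

def fourier_design(phase, order):
    """Columns 1, cos(j*phase), sin(j*phase) for j = 1..order (phase in radians)."""
    cols = [np.ones_like(phase)]
    for j in range(1, order + 1):
        cols += [np.cos(j * phase), np.sin(j * phase)]
    return np.column_stack(cols)

def fit_fourier(phase, samples, order):
    """Least-squares Fourier coefficients for samples of shape (N, d)."""
    design = fourier_design(phase, order)
    coeffs, *_ = np.linalg.lstsq(design, samples, rcond=None)
    return coeffs

def eval_fourier(coeffs, phase):
    """Evaluate the fitted series at the given phases."""
    order = (coeffs.shape[0] - 1) // 2
    return fourier_design(np.atleast_1d(phase), order) @ coeffs

rng = np.random.default_rng(1)
phase = rng.uniform(0.0, 2.0 * np.pi, size=500)        # stand-in for estimated phases
clean = np.column_stack([np.cos(phase), np.sin(2.0 * phase)])
noisy = clean + 0.01 * rng.normal(size=clean.shape)    # synthetic noisy shape data
coeffs = fit_fourier(phase, noisy, order=3)
gamma_hat = eval_fourier(coeffs, np.array([0.0, np.pi / 2.0]))
```

The same fit would be applied separately to shape, shape-velocity, and shape-acceleration data, which is why, as noted in footnote 9, the resulting models need not be exact derivatives of one another.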

Next, we select $M$ evenly spaced values of phase, $\phi_{1},\ldots,\phi_{M}$ , to obtain values $\gamma_{m}:=\gamma(\phi_{m}),\dot{\gamma}_{m}:=\dot{\gamma}(\phi_{m}),\ddot{\gamma}_{m}:=\ddot{\gamma}(\phi_{m})$ : the shapes, shape velocities, and shape accelerations of a system that is following the gait cycle precisely. For each $m$ we collect from our trajectory data all triples $(r_{n},\dot{r}_{n},\ddot{r}_{n})\coloneqq(r(t_{n}),\dot{r}(t_{n}),\ddot{r}(t_{n}))$ that are sufficiently close to $(\gamma_{m},\dot{\gamma}_{m},\ddot{\gamma}_{m})$ , i.e., such that $\|r_{n}-\gamma_{m}\|,\|\dot{r}_{n}-\dot{\gamma}_{m}\|,\|\ddot{r}_{n}-\ddot{\gamma}_{m}\|<\kappa$ for all $n$ , and we also collect the corresponding $\accentset{\scriptstyle\circ}{g}_{n}$ values. [Footnote 10: The astute experimentalist realizes that since the derivative terms contain $dt$ and $dt^{2}$ in their units, a certain degree of numerical conditioning can be obtained by judicious choice of units for time.] We define the offsets $\delta_{n}:=r_{n}-\gamma_{m}$ , $\dot{\delta}_{n}\coloneqq\dot{r}_{n}-\dot{\gamma}_{m}$ , $\ddot{\delta}_{n}\coloneqq\ddot{r}_{n}-\ddot{\gamma}_{m}$ . Note that the range of $n$ depends on $m$ , but for notational simplicity we do not display this.
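The selection of nearby samples and the computation of offsets for a single phase value $\phi_{m}$ can be sketched as follows. This is an illustrative sketch only: the function name `collect_offsets`, the array layout, and the use of the Euclidean norm are assumptions, since the text does not specify the norm.

```python
import numpy as np

def collect_offsets(r, dr, ddr, gam, dgam, ddgam, kappa):
    """Select samples within kappa of one gait point and return their offsets.

    r, dr, ddr       : arrays of shape (n_samples, d) with shape, velocity, acceleration
    gam, dgam, ddgam : arrays of shape (d,), the gait values at one phase phi_m
    kappa            : neighborhood radius (Euclidean norm assumed)
    """
    delta = r - gam
    delta_dot = dr - dgam
    delta_ddot = ddr - ddgam
    close = (
        (np.linalg.norm(delta, axis=1) < kappa)
        & (np.linalg.norm(delta_dot, axis=1) < kappa)
        & (np.linalg.norm(delta_ddot, axis=1) < kappa)
    )
    idx = np.flatnonzero(close)  # indices n of the retained samples
    return idx, delta[idx], delta_dot[idx], delta_ddot[idx]
```

The returned indices can then be used to pick out the corresponding body velocity samples $\accentset{\scriptstyle\circ}{g}_{n}$ .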

Introducing coordinates and Taylor expanding, Bittner et al. (2018) obtained from Eqn. (1) the following expression (no sum over $m$ or $n$ ):

[TABLE]

Omitted here are higher-order terms, the subscript of $A_{\textnormal{visc}}$ , and the nonlinear $\gamma$ dependence of the local expression $A^{k}_{i}$ . They then operationalized Eqn. (3) as a least-squares problem, written in matrix form as follows (for each $k$ and $m$ ; indices $k$ and $m$ elided below for clarity):

[TABLE]

where the hat $\widehat{\,\cdot\,}$ indicates “estimated” and $\otimes$ is the outer product. For a $d$ -dimensional shape space, the row of unknowns on the right consists of $1+d+d+d^{2}$ elements. Once they have computed a least squares model for every $m$ , they construct Fourier series so that the $\widehat{C}_{i}$ may be smoothly interpolated at any phase value. The result is a local model of Eqn. (1).

In the perturbed Stokes regime which we seek to model, we follow a similar approach by expanding Eqn. (2) instead of Eqn. (1). We obtain (no sum over $m$ or $n$ ):

[TABLE]

Partitioning these terms according to their dependence on the observations $\delta$ , $\dot{\delta}$ , and $\ddot{\delta}$ , we obtain

[TABLE]

giving a similar least squares problem written in matrix form as follows (for each $k$ and $m$ ; indices $k$ and $m$ elided below for clarity):

[TABLE]

For a $d$ -dimensional shape space, the row of unknowns on the right consists of $1+d+d+d+d^{2}+d^{2}+d^{2}+d^{3}$ elements. Once we have computed a least squares model for every $m$ , we similarly construct Fourier series so that the $\widehat{C}_{i}$ may be smoothly interpolated at any phase value. The result is a local model of Eqn. (2).

Because it is the only term of order $\kappa^{3}$ , we find that in practice the 3-index regressor $\delta\otimes\dot{\delta}\otimes\dot{\delta}$ can often be omitted if $\kappa>0$ is sufficiently small. In the remainder of this paper, we refer to the regressors of Eqn. (7) (with the 3-index term excluded) as the “perturbed Stokes regressors”, and refer to those used in the Bittner et al. (2018) algorithm as the “Stokes regressors.”
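The two regressor sets can be sketched as design matrices for one phase value $m$ and one body velocity component $k$ . This is a hedged illustration, not the authors' code. The choice of 2-index products ( $\delta\otimes\dot{\delta}$ , $\delta\otimes\ddot{\delta}$ , $\dot{\delta}\otimes\dot{\delta}$ ) is an assumption inferred from the element count $1+d+d+d+d^{2}+d^{2}+d^{2}+d^{3}$ and from a first-order Taylor expansion of Eqn. (2) in $\delta$ ; the column ordering and the function names are hypothetical.

```python
import numpy as np

def _outer(a, b):
    """Row-wise outer product, flattened to shape (n_samples, d*d)."""
    n, d = a.shape
    return np.einsum("ni,nj->nij", a, b).reshape(n, d * d)

def stokes_regressors(delta, delta_dot):
    """Columns 1, delta, delta_dot, delta (x) delta_dot: 1 + 2d + d^2 in total."""
    n = delta.shape[0]
    return np.hstack([np.ones((n, 1)), delta, delta_dot, _outer(delta, delta_dot)])

def perturbed_stokes_regressors(delta, delta_dot, delta_ddot):
    """Stokes columns plus acceleration and quadratic-velocity terms.

    The 3-index term is omitted, giving 1 + 3d + 3d^2 columns (assumed layout).
    """
    n = delta.shape[0]
    return np.hstack([
        np.ones((n, 1)), delta, delta_dot, delta_ddot,
        _outer(delta, delta_dot),
        _outer(delta, delta_ddot),
        _outer(delta_dot, delta_dot),  # contains duplicate (i,j)/(j,i) columns
    ])

def fit_local_model(X, g_body):
    """Least-squares coefficients for one body velocity component."""
    coeffs, *_ = np.linalg.lstsq(X, g_body, rcond=None)
    return coeffs
```

Because $\dot{\delta}\otimes\dot{\delta}$ is symmetric, the design matrix is rank deficient; `numpy.linalg.lstsq` then returns the minimum-norm solution, so predictions are well defined even though individual coefficients are not unique.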

Remark 2.

The tensors appearing in Eqn. (3) and Eqn. (5) are not necessarily symmetric, and therefore the order of terms matters.

Remark 3.

Examining Eqn. (3), we see that there are some constraints that the regression does not enforce. Namely, $C_{0}=\left[C_{1}\right]_{i}\dot{\gamma}^{i}$ and $C_{2}=\left[C_{3}\right]_{i}\dot{\gamma}^{i}$ . When we performed regressions ignoring these implicit constraints, we found that the constraints are not respected in the results. However, an important consequence of Eqn. (5) is that, for systems operating in the perturbed Stokes regime, such a mismatch is actually to be expected — this is because some independent new terms appear in $C_{1},\ldots,C_{3}$ which break the constraints.

3.2 Local models enable optimality testing and optimization

The data-driven models computed by the process described above have predictive power locally, in a neighborhood of a gait cycle. For any shape trajectory inside this neighborhood, we can use the local model to predict the trajectory of the body in the world. We assume that we are interested in some $\mathbb{R}$ -valued goal functional $\tilde{\phi}(\gamma,g_{\gamma})$ defined on an appropriate space of trajectories. Here the group trajectory $g_{\gamma}(t)$ is determined by the gait $\gamma(t)$ via Eqn. (2), and therefore we may consider the goal functional $\phi(\gamma):=\tilde{\phi}(\gamma,g_{\gamma})$ to be a function of $\gamma$ alone.

Testing for Optimality

We can test the gait of an organism for optimality by checking that $0=\frac{\partial}{\partial s}\phi(\gamma_{s})|_{s=0}$ for all smooth variations $\gamma_{s}$ of a gait $\gamma$ (where $\gamma_{0}=\gamma$ ). This condition is necessary for local optimality, but depending on the choice of $\phi$ it is often possible to argue on physical grounds that its satisfaction is also sufficient for optimality. While this variational condition can be used to derive a PDE via the Euler-Lagrange approach, a more computationally straightforward approach is to consider a finite- (but often high-) dimensional family $\gamma_{p}$ with $p\in\mathbb{R}^{N}$ , and to numerically compute the gradient $\nabla_{p}\phi(\gamma_{p})$ . When this gradient is sufficiently small at some parameter $p_{*}$ , it might be possible to argue that the gait is nearly extremal (or possibly optimal) with respect to $\phi$ . [Footnote 11: In some cases this procedure is provably correct. Furthermore, suitable finite-dimensional families that provide these guarantees always exist (Milnor, 1969, Sec. 16). We do not discuss these technicalities any further here.] Since we can compute $\phi$ using a data-driven model around $\gamma_{p}$ , we can compute $\nabla_{p}\phi(\gamma_{p})$ . We can do so directly from observation and without need for any general model of body-environment interactions, so long as use of Thm. 1 can be justified.

Optimizing Gaits

We can use the gradient $\nabla_{p}\phi(\gamma_{p})$ to iteratively improve the gait of a robot whose dynamics satisfy Thm. 1 without requiring any further details of the physics. Given a parameter vector $p$ , we compute the next iterate $p^{\prime}:=p+\alpha\nabla_{p}\phi(\gamma_{p})$ , with the step-size scaling $\alpha>0$ chosen to ensure that $p^{\prime}$ is within the domain for which our local model of $\phi$ is valid, using the approach of Bittner et al. (2018, Sec. 7.2). For each gait $\gamma_{p}$ , we only require enough experimental data for building a good local model of $\phi$ near $\gamma_{p}$ , that is, a dataset whose size does not depend on the dimension of the representation $p$ . We plan to use this decoupling to perform hardware-in-the-loop optimization to produce rapid adaptation of robot motions in the face of foreign environments, mechanical failures, and more.
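One iteration of the update $p^{\prime}=p+\alpha\nabla_{p}\phi(\gamma_{p})$ can be sketched with a finite-difference gradient. This is a minimal sketch under stated assumptions: `phi` stands for any callable that evaluates the goal functional from the local model, the central-difference gradient is a generic stand-in, and the norm clipping via the hypothetical parameter `max_step` is only a placeholder for the step-size rule of Bittner et al. (2018, Sec. 7.2), which is not reproduced here.

```python
import numpy as np

def numerical_gradient(phi, p, h=1e-6):
    """Central-difference estimate of the gradient of phi at p."""
    p = np.asarray(p, dtype=float)
    grad = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = h
        grad[i] = (phi(p + step) - phi(p - step)) / (2.0 * h)
    return grad

def ascent_step(phi, p, alpha, max_step):
    """One update p' = p + alpha * grad, shortened so that ||p' - p|| <= max_step."""
    p = np.asarray(p, dtype=float)
    update = alpha * numerical_gradient(phi, p)
    norm = np.linalg.norm(update)
    if norm > max_step:
        # Keep the new gait inside the assumed validity region of the local model.
        update *= max_step / norm
    return p + update
```

After each step, new data would be collected near the updated gait and the local model refitted before the next gradient is computed.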

4 Performance Comparison of the Two Data-Driven Models

One of the primary contributions of this paper is the introduction of new regressors based on Thm. 1, which we use to augment the regressors used in the algorithm of Bittner et al. (2018) for estimating the dynamics near a gait. These allow us to extend the domain of validity of their algorithm from the Stokesian limit to include the perturbed Stokes regime. To demonstrate this, we constructed a swimming model which we simulated at various Reynolds numbers, and tested the ability of the two types of local models to predict the results of the fully nonlinear simulation. [Footnote 12: None of these simulations accounted for fluid-fluid interactions; as such we make no claim that they are physically meaningful at the higher Reynolds numbers in the ranges shown.]

4.1 Modeling a swimmer

We tested the prediction quality of both models on a swimming model. The system shown in Fig. 1 had uniformly distributed mass along a central body, with two paddles comprising chains of massless links extending from the center of the body. Each paddle could be broken up into an arbitrary number $\frac{n}{2}$ ( $n$ even) of equally spaced links, which sum to a constant total length independent of $n$ . This allowed us to vary the behavior of the system from one reminiscent of a boat with oars (for $n=2$ ) to one more like a bacterial cell with flagella (for $n$ large).

The system moves in a homogeneous and isotropic plane. Its configuration space is $S\times G=\mathbb{T}^{n}\times\mathsf{SE}(2)$ : the $n$ -torus and the special Euclidean group of planar rigid motions $\mathsf{SE}(2)$ . We assume the dynamics are equivariant under $\mathsf{SE}(2)$ . The group element $g\in\mathsf{SE}(2)$ provides the position and orientation of the central body in world coordinates with respect to a fixed inertial reference frame. Hereon we represent $g$ as a column vector $g=[x,y,\theta]^{T}$ , and similarly represent $\dot{g}$ as a column vector. We define the body velocity

[TABLE]
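For readers who want a concrete form, the body velocity on $\mathsf{SE}(2)$ can be sketched as a rotation of the world-frame rates into the body frame. The sketch assumes the usual left-trivialized definition $\accentset{\scriptstyle\circ}{g}=\mathsf{D}\mathrm{L}_{g^{-1}}\dot{g}$ used in §A.2; the function name `body_velocity` is hypothetical.

```python
import numpy as np

def body_velocity(theta, g_dot):
    """Map world-frame rates [xdot, ydot, thetadot] to the body frame.

    The translational part is rotated by -theta; the angular rate is unchanged.
    """
    c, s = np.cos(theta), np.sin(theta)
    rot_inv = np.array([[c, s, 0.0],
                        [-s, c, 0.0],
                        [0.0, 0.0, 1.0]])
    return rot_inv @ np.asarray(g_dot, dtype=float)
```

For example, a body with heading $\theta=\pi/2$ that moves along the world $y$ axis is moving along its own $x$ axis.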

We treat the link at the main body (length $L$ ) and the links comprising the paddles (length $d$ ) as slender members, and model their drag forces according to Cox theory (Cox, 1970) using the drag matrices

[TABLE]

where the factor $c>0$ is explicitly written for later scaling purposes. The drag coefficient ratio $C_{y}/C_{x}$ has a maximum value of $2$ corresponding to the limit of infinitesimally thin segments, and we will assume this limiting ratio here (c.f. Hatton and Choset (2013, Sec. 2.B)). Given these drag matrices, the wrench on the central link can be written as

[TABLE]

The wrench that the segments (denoted $i$ ) apply on the body can be written as

[TABLE]

where the linear map $W_{i}(g,\alpha)\colon\mathfrak{se}(2)^{*}\to\mathfrak{se}(2)^{*}$ maps a wrench on link $i$ to a wrench on the body and the linear map $V_{i}(g,\alpha)\colon\mathfrak{se}(2)\to\mathfrak{se}(2)$ maps a velocity in the body frame to a velocity in the link frame. Let $R_{\beta}$ denote the counterclockwise rotation of the plane by angle $\beta$ , define $e_{2}\coloneqq[0,1]^{T}$ , and write $\accentset{\scriptstyle\circ}{g}=[\accentset{\scriptstyle\circ}{g}_{x,y}^{T},\dot{\theta}]^{T}$ . Then, for the $n$ -segment model (recall that $n$ must be even), for $i\in\{1,\ldots,n\}$ the linear maps $V_{i}$ and $W_{i}$ are given by

[TABLE]

where $*\coloneqq 1+\llfloor i/\frac{n}{2}\rrfloor\cdot\frac{n}{2}\in\{1,\frac{n}{2}+1\}$ , $f=[f_{1},f_{2}]^{T}$ , and where a summation is understood to be zero if the lower bound of its index set exceeds its upper bound.

These wrenches act on the body (which has uniformly distributed mass $m$ and moment of inertia $I=m\bar{I}$ about its midpoint) yielding the following equations of motion in world coordinates:

[TABLE]

where $\epsilon\coloneqq\frac{m}{c}$ is the dimensionless inertia-damping ratio. In keeping with our earlier conventions that $m$ , $c$ , and $\epsilon$ are all dimensionless we think of the “ $1$ ” terms on the diagonal in Eqn. (13) as having units of inverse time.

Upon inspection of Eqn. (13), we see that by modifying $\epsilon$ we can directly adjust the ratio of inertial to viscous forces in the swimming model. The Stokesian limit corresponds to $\epsilon\to 0$ ; on the other hand, the $\epsilon\to\infty$ limit corresponds to a fully “momentum-dominated” regime, wherein viscous effects are negligible and motion is governed by conservation of momentum via Noether’s theorem (see Corollary 1 in §A.1). In the following §4.2 we simulate the swimming model at a variety of $\epsilon$ values, and compare the performance of the two algorithms for estimating the dynamics near a gait cycle.
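The role of $\epsilon$ can be illustrated on a scalar toy problem, which is not the swimmer model: $\epsilon\dot{x}=-x+u(t)$ with $u(t)=\sin t$ . Its periodic solution is known in closed form, the quasi-static ( $\epsilon\to 0$ ) approximation is $x\approx u$ , and the slow-manifold expansion truncated after first order is $x\approx u-\epsilon\dot{u}$ . The sketch below compares the two approximations; all function names are hypothetical.

```python
import numpy as np

def exact_response(t, eps):
    """Periodic solution of eps * x' = -x + sin(t)."""
    return (np.sin(t) - eps * np.cos(t)) / (1.0 + eps ** 2)

def zeroth_order(t):
    """Quasi-static approximation x = u(t) = sin(t)."""
    return np.sin(t)

def first_order(t, eps):
    """Expansion truncated after O(eps): x = u - eps * u'."""
    return np.sin(t) - eps * np.cos(t)

def max_errors(eps, n=1000):
    """Maximum error of each approximation over one period."""
    t = np.linspace(0.0, 2.0 * np.pi, n)
    exact = exact_response(t, eps)
    return (np.max(np.abs(zeroth_order(t) - exact)),
            np.max(np.abs(first_order(t, eps) - exact)))
```

In this toy problem the error of the quasi-static approximation scales like $\epsilon$ while that of the first-order approximation scales like $\epsilon^{2}$ , which mirrors the motivation for adding regressors that depend on $\ddot{r}$ and on products of $\dot{r}$ .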

4.2 Comparison of the estimated models

In all simulations in this section, we used the parameter values $L=1$ , $d=0.5$ , $C_{x}=1$ , $C_{y}=2$ , and $\bar{I}=1$ . The only remaining free variable is $\epsilon$ , which governs both the ratio of inertial to viscous forces and the rate of attraction to the slow manifold. The procedure we used for generating simulations for experiments in this section is identical to that described in Bittner et al. (2018). Briefly, an experiment consists of 30 cycles of a numerically integrated stochastic differential equation (SDE) representing shape space dynamics consisting of a deterministic oscillator perturbed by system noise (see Bittner et al. (2018, Sec. 6.2) for precise details on the SDE, parameter values used, etc.).

We used these noisy shape dynamics to drive the body momentum and group dynamics via the full equations of motion Eqn. (26) derived in §A.3. For each simulation we recorded a “ground truth” body velocity trajectory $\accentset{\scriptstyle\circ}{g}_{G}$ . We used this record to evaluate the accuracy of the data-driven approximations. We denoted the body velocity computed with the perturbed Stokes regressors by $\accentset{\scriptstyle\circ}{g}_{p}$ , and those computed with the Stokes regressors by $\accentset{\scriptstyle\circ}{g}_{s}$ .

As a “zeroth-order” phase model of the dynamics, we constructed a Fourier series model of $\accentset{\scriptstyle\circ}{g}_{G}$ with respect to the estimated phase (see §3.1), which we denote by $\accentset{\scriptstyle\circ}{g}_{a}$ . For any data point, the zeroth-order model prediction is $\accentset{\scriptstyle\circ}{g}_{a}(\varphi)$ for the phase $\varphi$ of that data point.

We computed the RMS errors $e^{k}_{*}$ for each component $k$ of the body velocity and each model $*=p,s,a$ by $e^{k}_{*}:=\langle|\accentset{\scriptstyle\circ}{g}^{k}_{*}-\accentset{\scriptstyle\circ}{g}^{k}_{G}|^{2}\rangle^{1/2}$ . Since the numerical value of these errors means little, we defined the metric $\Gamma^{k}_{*}:=1-e^{k}_{*}/e^{k}_{a}$ for $*=p,s$ to indicate how much better the regression models were performing compared to the zeroth-order phase model $\accentset{\scriptstyle\circ}{g}_{a}$ . A $\Gamma^{k}_{*}$ of $0$ indicates doing no better than the zeroth-order model, whereas a $1$ indicates a perfect model. To further highlight the difference in prediction quality, we also plot $\Delta^{k}:=\Gamma^{k}_{p}-\Gamma^{k}_{s}$ .
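The error metrics defined above can be computed directly from arrays of predictions. This is a minimal sketch; the function names and the array layout (samples along the first axis, body velocity components along the second) are assumptions.

```python
import numpy as np

def rms_error(pred, truth):
    """Per-component RMS error; arrays have shape (n_samples, n_components)."""
    diff = np.asarray(pred, dtype=float) - np.asarray(truth, dtype=float)
    return np.sqrt(np.mean(diff ** 2, axis=0))

def gamma_metric(pred, baseline, truth):
    """Gamma = 1 - e_model / e_baseline, computed per component."""
    return 1.0 - rms_error(pred, truth) / rms_error(baseline, truth)

def delta_metric(pred_p, pred_s, baseline, truth):
    """Delta = Gamma_p - Gamma_s, the gain of one model over the other."""
    return gamma_metric(pred_p, baseline, truth) - gamma_metric(pred_s, baseline, truth)
```

Here `baseline` plays the role of the zeroth-order phase model $\accentset{\scriptstyle\circ}{g}_{a}$ and `truth` that of $\accentset{\scriptstyle\circ}{g}_{G}$ .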

4.2.1 Algorithm comparison using manually selected gaits

We chose to first test the modeling approaches on a collection of simple manually selected behaviors. These include behaviors we term “twist in place” and “symmetric flapping” gaits, both of which initialize with paddles aligned at a quarter turn away from the body (as depicted in the two-segment model in Figure 1), and respectively involve anti-symmetric and symmetric sinusoidal movement of the paddles with amplitude $1$ . The “symmetric flapping gait” primarily moves in the direction of the $x$ body axis, while the “twist in place gait” primarily changes the $\theta$ body coordinate. Finally, we considered a “circle” gait which also initializes the paddles at a quarter turn away from the body and moves them sinusoidally with amplitude $1$ , but has a quarter cycle phase offset between them. This gait tends to move the system in a way that changes all three body coordinates throughout its execution.

We selected these three gaits because they are simple to describe and span a range of resultant body motions. For single link paddles, the body shape space is 2D, and these gaits are represented by loci that are diagonal lines with slopes $1$ , $-1$ , and a circle (see Fig. 2). We simulated the gaits and plotted mean and variance of $\Gamma_{s}$ , $\Gamma_{p}$ and $\Delta$ for each value of $\epsilon$ (Fig. 2). The plot shows that for all three gaits tested and for all three body coordinates, over a range spanning an order of magnitude or more around $\epsilon=1$ , the perturbed Stokes models are better, with $\Delta>0.05$ .

4.2.2 Algorithm comparison using extremal gaits

Arbitrarily selected gaits such as those examined in the previous section are not expected to exhibit any special properties with respect to our modeling approach. In particular, with respect to a goal function $\phi(\cdot)$ , they are expected to be regular points of $\phi(\cdot)$ . However, $\phi$ -optimal gaits have $\nabla_{p}\phi=0$ and thus have additional structure that might interact with the modeling approach.

We chose goal functionals $\int\accentset{\scriptstyle\circ}{g}^{x}(t)\,\mathrm{d}t$ and $\int\accentset{\scriptstyle\circ}{g}^{y}(t)\,\mathrm{d}t$ (where superscripts denote components) corresponding to displacement in the $x$ and $y$ coordinates as measured in the body frame of the paddleboat. This is not the same as actual $x$ or $y$ displacement in the world, since boat orientation changes over time. Using the methods of Hatton and Choset (2013), we determined the extremal gaits for these goal functionals in the Stokes regime with high accuracy. Plotted in the shape-space (and superimposed on the “connection vector fields” (Hatton and Choset, 2011, 2013) of the appropriate goal functional) they are diamond shaped (Fig. 3). We also plotted $\Gamma$ and $\Delta$ , revealing that again, perturbed Stokes regressors improve performance ( $\Delta>0.15$ ) over a range of two orders of magnitude in $\epsilon$ . Unlike the arbitrary gaits of the previous section, the extremal gaits have $\Gamma>0.1$ for all $\epsilon>1$ for both model types. This suggests that even outside the perturbed Stokes regime the addition of regressors improves upon the zeroth order phase model. It is also notable that in the extremal $x$ gait, $\Delta^{x}$ is significantly better than $\Delta^{y}$ , whereas in the extremal $y$ gait the converse is true.

4.2.3 Performance gains grow with shape space dimension

Thus far we have only presented results for systems having 2D shape spaces. Because data-driven methods are often handicapped by their inability to scale with model dimensionality, we chose also to test our approach on systems of higher dimension by extending each paddle into a multi-segmented model. We selected a gait similar to that of the symmetric flapping gait, but with the additional feature that the bending angle of a paddle was uniformly distributed through the joints it contains. In particular, the relative angles between adjacent segments were equal and of amplitude $\pi/N$ , where $N$ is the number of joints.

We plotted $\Gamma^{x}_{p}$ , $\Gamma^{x}_{s}$ and $\Delta^{x}$ for paddles with $1$ , $2$ and $3$ segments (Fig. 4). The $\Delta^{x}$ shows a marked improvement in the $4$ D and $6$ D models, suggesting that as shape-space complexity increased, the advantage of perturbed Stokes regressors became comparatively more significant.

4.3 Discussion

The results of §4.2 show that for all versions of the swimming model and all gaits that we tested there exists a sizable window of $\epsilon$ values wherein the perturbed Stokes regressors provide models of superior quality when compared to the Stokes regressors. In particular, the improvement is consistently present in the region $\log_{10}\epsilon\in[0,1]$ , suggesting that this range of $\epsilon$ might be the range for which the predicted slow manifold is both present and sufficiently simple to be captured by the new regressors.

As noted in §4.2.2, the perturbed Stokes regressors seem to improve prediction performance more in the direction in which the gait was extremal. We hypothesize that this is because extremal gaits have already exhausted any first-order improvements available, i.e. gradients are zero. With the first-order terms close to zero, the presence of more high-order terms among the perturbed Stokes regressors may have a greater effect on the relative prediction error.

It is interesting to note the large magnitude of improvement in $\Delta$ as the shape space dimension increased in Fig. 4. Whether this is an artifact of the particular model and/or gait, or a more general feature, remains to be determined.

At the lower end of the $\epsilon$ magnitudes studied here, the systems are near the Stokesian limit, and therefore we expect relatively little improvement from adding regressors designed for the perturbed Stokes regime. This is consistent with our experimental results in all figures, which show for small $\epsilon$ both small values of $\Delta$ and large values of $\Gamma$ for both sets of regressors.

For very large values of $\epsilon$ , the predictive quality of both algorithms is hindered by at least three factors, although only the first two can be observed here.

1. The $\mathcal{O}(\epsilon^{2})$ term in Thm. 1 becomes more significant as $\epsilon$ increases. This issue is insurmountable if we restrict ourselves to Stokes regressors. If we do not, it is possible to compute correction terms which are higher order in $\epsilon$ and which can inform the selection of additional regressors for addition to our algorithm. This is one possible direction for future work.

2. For $\epsilon$ sufficiently large, we expect a bifurcation in which the slow manifold (whose existence is guaranteed by Thm. 2 in §B) ceases to exist. For such values of $\epsilon$ , the hypotheses of Thm. 1 are not satisfied, and a reduced-order model may not exist. This is a mathematical expression of the physical reality of inertial effects playing a dominant role as $\epsilon$ increases, and eventually requiring momentum states to be added to the models.

3. For sufficiently large values of $\epsilon$ the full complications of fluid-fluid interactions come into play, and the linear viscous friction model we used becomes less and less accurate. We conjecture that for many systems this effect will not have significant influence until after $\epsilon$ is already sufficiently large for the slow manifold to have disappeared. It would be interesting to explore this issue further.

5 Conclusion

We have shown that the accuracy of data-driven models motivated from geometric mechanics can be improved by using a collection of regressors derived from an asymptotic series approximation of an attracting invariant manifold in the small parameter $\epsilon$ representing the ratio of inertial to viscous forces (a Reynolds-number-like parameter). The existence of such an invariant manifold was previously known in similar situations, as were the approximation techniques we employed, but the combination of these together for producing data-driven models of locomotion is a novel contribution. [Footnote 13: But see the discussion preceding Thm. 2 in §B, which details how our result differs from that of Eldering and Jacobs (2016).] In simulations where we tested geometrically similar motions over $6$ orders of magnitude of $\epsilon$ , we obtained improvements of $5$ – $65\%$ (depending on the specific system and gait) compared to previous work, suggesting that these better-informed models can indeed capture the perturbed Stokes regime more accurately. Furthermore, the results of one of our experiments showed further improvements as the shape-space dimension of the locomoting system increased; this suggests that higher-dimensional systems might be modeled effectively using our approach.

Future work will include application of our algorithm to questions of locomotion optimality in animals, and to hardware-in-the-loop optimization of robot motions. An additional direction for future work is the selection of regressors and regression techniques for hybrid dynamical systems, and for non-viscous dissipation models.

Appendix A Derivation of the Equations of Motion

In this and the following section we consider systems more general than those considered earlier, and in so doing assume that the reader is familiar with some basic concepts in geometric mechanics and differential geometry: Lie groups, group actions, and principal bundles. We refer the reader to Kobayashi and Nomizu (1963); Marsden and Ratiu (1994); Lee (2013); Bloch (2015) for the relevant standard definitions related to Lie groups and group actions, and we refer the reader to Kobayashi and Nomizu (1963); Marsden et al. (1991); Marsden (2009); Bloch (2015) for material on bundles.

We consider a mechanical system on a configuration space $Q$ whose Lagrangian is of the form kinetic minus potential energy. We will also consider this system to be subjected to external viscous forcing arising from a Rayleigh dissipation function, and also subjected to an external force exerted by the locomoting body. We are interested in the situation that we have a smooth action $\theta\colon G\times Q\to Q$ of a Lie group $G$ on $Q$ , such that the Lagrangian, viscous forces, and external force are all symmetric under the action. In this case, we say that $G$ is a symmetry group.

In §A.1, we will define some geometric quantities on $Q$ which encode information about the symmetry and the dynamics. Working in coordinates induced by a local trivialization, in §A.2 we derive the equations of motion in terms of these quantities. In §A.3, we recall how the equations become governed by the so-called viscous connection in the Stokesian limit (Kelly and Murray, 1996; Eldering and Jacobs, 2016), which will set the stage for our derivation in §B of a corrected reduced-order model for the perturbed Stokes regime.

A.1 The mechanical and viscous connections

In this section, we define the mechanical and viscous (or Stokes) connections, roughly following Kelly and Murray (1996). We consider a Lagrangian $L\colon\mathsf{T}Q\to\mathbb{R}$ which is invariant under the lifted action $\mathsf{D}\theta_{g}$ of $G$ on $\mathsf{T}Q$ (here $\mathsf{D}$ denotes the derivative or pushforward). We assume the Lagrangian to be of the form kinetic minus potential energy, where kinetic energy is given by $\frac{m}{2}k$ , where $m>0$ is a dimensionless mass parameter, $k$ is a smooth symmetric bilinear form, and $mk$ is the kinetic energy metric. In what follows, we assume that $k$ is positive definite when restricted to tangent spaces to $G$ orbits, but not necessarily that $k$ is positive definite on all tangent vectors. [Footnote 14: This does not affect any of the following derivations and results; the generality is merely a convenience ensuring that our results apply to certain idealized examples, e.g., linkages with some links having zero mass (cf. §4). Of course such examples are not physical and, e.g., must be supplemented with assumptions to ensure that the massless links have well-defined dynamics.] Denoting by $\mathfrak{g}$ the Lie algebra of $G$ and $\mathfrak{g}^{*}$ its dual, we define the (Lagrangian) momentum map $J\colon\mathsf{T}Q\to\mathfrak{g}^{*}$ via

[TABLE]

where $v\in\mathsf{T}_{q}Q$ and $\xi\in\mathfrak{g}$ . Here $\mathbb{F}L\colon\mathsf{T}Q\to\mathsf{T}^{*}Q$ is the fiber derivative of $L$ given by $\mathbb{F}L(v_{q})(w_{q})\coloneqq\frac{\partial}{\partial s}|_{s=0}L(v_{q}+sw_{q})$ , and the smooth vector field $\xi_{Q}$ on $Q$ is the infinitesimal generator defined by $\xi_{Q}(q)\coloneqq\frac{\partial}{\partial s}|_{s=0}\theta_{\exp(s\xi)}(q)$ . We define the mechanical connection $\Gamma_{\textnormal{mech}}\colon\mathsf{T}Q\to\mathfrak{g}$ via $\Gamma_{\textnormal{mech}}(v_{q})\coloneqq\mathbb{I}^{-1}(q)J(v_{q})$ , where $\mathbb{I}(q)\colon\mathfrak{g}\to\mathfrak{g}^{*}$ is the locked inertia tensor defined via

[TABLE]

where $\xi,\eta\in\mathfrak{g}$ .

We now follow an analogous procedure to define the viscous connection $\Gamma_{\textnormal{visc}}\colon\mathsf{T}Q\to\mathfrak{g}$ . We consider a Rayleigh dissipation function $R\colon\mathsf{T}Q\to\mathbb{R}$ defined in terms of a $G$ -invariant smooth symmetric bilinear form $\nu$ on $Q$ : $R(v_{q})\coloneqq\frac{c}{2}\nu_{q}(v_{q},v_{q})$ , where $c>0$ is a dimensionless parameter representing the amount of damping or dissipation in the system due to viscous forces. As with $k$ , we assume that $\nu$ is positive definite when restricted to tangent spaces to $G$ orbits, but not necessarily that $\nu$ is positive definite on all tangent vectors. [Footnote 15: This generality simply allows for, e.g., the situation of a linkage in which not all links are subject to viscous forces.] The corresponding force field $F_{R}\colon\mathsf{T}Q\to\mathsf{T}^{*}Q$ is given by minus the fiber derivative of $R$ , $F_{R}\coloneqq\mathbb{F}(-R)$ . We define a map $K\colon\mathsf{T}Q\to\mathfrak{g}^{*}$ , analogous to the momentum map $J$ , via

[TABLE]

where $v\in\mathsf{T}_{q}Q$ and $\xi\in\mathfrak{g}$ . We define the viscous connection or Stokes connection $\Gamma_{\textnormal{visc}}\colon\mathsf{T}Q\to\mathfrak{g}$ via $\Gamma_{\textnormal{visc}}(v_{q})\coloneqq\mathbb{V}^{-1}(q)K(v_{q})$ , where $\mathbb{V}(q)\colon\mathfrak{g}\to\mathfrak{g}^{*}$ is defined via

[TABLE]

where $\xi,\eta\in\mathfrak{g}$ .

Using the $G$ -invariance of $L$ and $\nu$ , a calculation shows that $\Gamma_{\textnormal{mech}}$ and $\Gamma_{\textnormal{visc}}$ are equivariant with respect to the adjoint action of $G$ on $\mathfrak{g}$ :

[TABLE]

Hence if the natural projection $\pi_{Q}\colon Q\to Q/G$ from $Q$ to the space of orbits $Q/G$ of points in $Q$ is a principal $G$ -bundle, then the mechanical and viscous connections $\Gamma_{\textnormal{mech}}$ and $\Gamma_{\textnormal{visc}}$ are indeed principal connections; this justifies their titles.

Now in order for our system to move itself through space, we also allow there to be a $G$ -equivariant external force $F_{E}\colon\mathbb{R}\times\mathsf{T}Q\to\mathsf{T}^{*}Q$ exerted by the locomoting body, subject to the requirement that $F_{E}$ takes values in the annihilator of $\ker\mathsf{D}\pi_{Q}$ , the distribution tangent to group orbits. This requirement reflects the physically reasonable assumption that the locomoting body can exert only “internal forces” which directly affect only its shape $r\in Q/G$ (c.f. Eldering and Jacobs (2016, Sec. 3.3) and Bloch et al. (1996, Sec. 4.2)). For future use, we now prove the following

Proposition 1.

The derivative of $J$ along trajectories of the $G$ -symmetric mechanical system is given by

[TABLE]

making the canonical identifications $\mathsf{T}_{J}\mathfrak{g}^{*}\cong\mathfrak{g}^{*}$ .

Proof.

We compute in a local trivialization on $\mathsf{T}Q$ induced by a chart for $Q$ , so that we may write a trajectory as $(q,\dot{q})$ . Note that in such local coordinates, $\mathbb{F}L(q,\dot{q})(v_{q})=\frac{\partial L(q,\dot{q})}{\partial\dot{q}}v_{q}$ . Hence

[TABLE]

where we obtained the last line using $\frac{d}{dt}\frac{\partial L}{\partial\dot{q}}-\frac{\partial L}{\partial q}=F_{R}+F_{E},$ which follows from the Lagrange-d’Alembert principle (Bloch, 2015, p. 8). Since $F_{E}$ annihilates tangent vectors to group orbits, $\langle F_{E},\xi_{Q}(q)\rangle=0$ . Hence rearranging and letting $\Phi_{\xi}^{s}$ denote the flow of $\xi_{Q}$ , we find

[TABLE]

The derivative term is zero due to the invariance of $L$ under the action of $G$ , so from the arbitrariness of $\xi\in\mathfrak{g}$ we obtain the desired result. ∎

As a corollary, we obtain a slight generalization of the classical Noether’s theorem.

Corollary 1 (Noether’s theorem).

Consider a mechanical system given by a $G$ -invariant Lagrangian of the form kinetic minus potential energy. Assume that the only external forces take values in the annihilator of the distribution tangent to the $G$ orbits. Then the derivative of the momentum map $J$ along trajectories satisfies

[TABLE]

Proof.

Set $K=0$ in Proposition 1. ∎

A.2 Local form of the equations of motion

Assuming that the action of $G$ on $Q$ is free and proper (Lee, 2013, Ch. 21) so that $\pi_{Q}\colon Q\to Q/G$ is a principal $G$ -bundle, we now derive the equations in a local trivialization, following Kelly and Murray (1996). In a local trivialization $U\times G$ , $\pi_{Q}$ simply becomes projection onto the first factor and the $G$ action is given by left multiplication on the second factor. We define $S\coloneqq Q/G$ to be the shape space representing all possible shapes of a locomoting body, and we write a point in the local trivialization as $(r,g)\in U\times G$ where $U\subset S$ . We assume that $U$ is the domain of a chart for $S$ , so that we have induced coordinates $(r,\dot{r})$ for $\mathsf{T}U$ .

We define the body velocity $\accentset{\scriptstyle\circ}{g}\coloneqq\mathsf{D}\mathrm{L}_{g^{-1}}\dot{g}$ (see footnote 16).

Footnote 16: As mentioned in the main text, the body velocity is often written $g^{-1}\dot{g}$ by an abuse of notation which is only defined on matrix Lie groups, where the product of a tangent vector and a group element is naturally defined. We use the alternative notation $\accentset{\scriptstyle\circ}{g}$ as a matter of personal preference.

The equivariance property (18) of the connection forms $\Gamma_{\textnormal{mech}},\Gamma_{\textnormal{visc}}$ implies that they may be written in the trivialization as

[TABLE]

where $A_{\textnormal{mech}}\colon\mathsf{T}U\to\mathfrak{g}$ and $A_{\textnormal{visc}}\colon\mathsf{T}U\to\mathfrak{g}$ are respectively the local mechanical connection and local viscous connection. We define a diffeomorphism $(r,\dot{r},g,\dot{g})\mapsto(r,\dot{r},g,p)$ , with $p$ the body momentum defined by

[TABLE]

Here $\textnormal{Ad}_{g}^{*}$ is the dual of the adjoint action $\textnormal{Ad}_{g}$ of $G$ on $\mathfrak{g}$ . We additionally define

[TABLE]

to be the local forms of $\mathbb{I}$ and $\mathbb{V}$ . We note that the invariance of the Lagrangian $L$ and Rayleigh dissipation function $R$ under $G$ , together with the general identity $\mathsf{D}\theta_{g}\xi_{Q}(q)=(\textnormal{Ad}_{g}\xi)_{Q}(\theta_{g}(q))$ , imply that $\mathbb{I}_{\textnormal{loc}}(r),\mathbb{V}_{\textnormal{loc}}(r)$ depend on the shape variable $r$ only.

Rearranging (21), using the expressions (22), (23), and using Proposition 1, we obtain the equations of motion

[TABLE]

where we have suppressed the $r$ -dependence of $A_{\textnormal{mech}},A_{\textnormal{visc}},\mathbb{I}_{\textnormal{loc}},\mathbb{V}_{\textnormal{loc}}$ for readability. Notice that the $\dot{p}$ equation is completely decoupled from $g$ .

In this paper, we are interested in the effect of shape changes on body motion, and not on the generation of shape changes themselves. Hence we have suppressed the equations for $\dot{r},\ddot{r}$ from (24), simply viewing $r,\dot{r}$ as inputs in those equations, but see Bloch et al. (1996) for more details on the specific form of the equations. We merely note that, if the kinetic energy metric is positive-definite, then the Lagrangian is hyperregular and our assumption of $G$ -equivariance of the exerted force $F_{E}$ implies that

[TABLE]

for some function $f$ which depends on the local trivialization. If the kinetic energy metric is not positive-definite (for use in toy examples like those in §4; see the precise assumptions in §A.1, and the footnote there), then we assume that $\ddot{r}$ is given by (25).

A.3 Reduction in the Stokesian limit

From the definitions (15), (17) of $\mathbb{I}_{\textnormal{loc}},\mathbb{V}_{\textnormal{loc}}$ , we see that we may define $\bar{\mathbb{I}}_{\textnormal{loc}},\bar{\mathbb{V}}_{\textnormal{loc}}$ by

[TABLE]

Defining the dimensionless parameter $\epsilon\coloneqq\frac{m}{c}$ and multiplying both sides of (24) by $\mathbb{I}_{\textnormal{loc}}\mathbb{V}_{\textnormal{loc}}^{-1}$ , we obtain the rewritten equations of motion

[TABLE]

In considering the limit in which viscous forces dominate the inertia of the locomoting body, Kelly and Murray (1996) formally set $\epsilon=0$ in (26) to obtain $p=m\bar{\mathbb{I}}_{\textnormal{loc}}(A_{\textnormal{mech}}-A_{\textnormal{visc}})\cdot\dot{r}$ from the second equation. Substituting this into the first equation of (26), they derive the following form of the equations of motion:

$$\accentset{\scriptstyle\circ}{g}=-A_{\textnormal{visc}}(r)\cdot\dot{r}. \tag{27}$$

In the language of differential geometry, (27) states that in the Stokesian limit trajectories are horizontal with respect to the viscous connection. We will see in the next section that this reduction can be extended away from the $\epsilon\to 0$ limit.
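As a minimal sketch of what the viscous connection model (27) predicts, consider the simplest hypothetical case $G=(\mathbb{R},+)$, in which the body velocity is just $\dot{g}$ and the model reads $\dot{g}=-A_{\textnormal{visc}}(r)\cdot\dot{r}$. The local connection `A_visc` and the circular gait below are illustrative assumptions, not taken from the paper. Under these assumptions the net displacement over one gait cycle is the line integral of $-A_{\textnormal{visc}}$ around the gait.

```python
import math

def A_visc(r):
    """Hypothetical local viscous connection on a 2D shape space (an assumption)."""
    r1, r2 = r
    return (-0.5 * r2, 0.5 * r1)

def gait(t):
    """Illustrative circular gait r(t) together with its shape velocity."""
    r = (math.cos(t), math.sin(t))
    rdot = (-math.sin(t), math.cos(t))
    return r, rdot

def net_displacement(n_steps=2000):
    """Integrate dg/dt = -A_visc(r) . dr/dt over one period (midpoint rule)."""
    dt = 2.0 * math.pi / n_steps
    g = 0.0
    for i in range(n_steps):
        r, rdot = gait((i + 0.5) * dt)
        a = A_visc(r)
        g -= (a[0] * rdot[0] + a[1] * rdot[1]) * dt
    return g

displacement = net_displacement()
```

For this particular choice the curl of `A_visc` is identically 1, so by Stokes' theorem the displacement equals minus the area enclosed by the gait, which is $-\pi$ for the unit circle. For a nonabelian group such as $\mathsf{SE}(2)$ the area rule holds only approximately, and the body velocity would have to be integrated on the group.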

Appendix B: Reduction in the Perturbed Stokes Regime

In Eldering and Jacobs (2016), the argument of Kelly and Murray (1996) was explained in more detail using the theory of normally hyperbolic invariant manifolds (NHIMs) in the context of geometric singular perturbation theory (Fenichel, 1979; Jones, 1995; Kaper, 1999). The idea is to show that for $\epsilon>0$ sufficiently small, the dynamics (26) possess an exponentially attractive invariant slow manifold $M_{\epsilon}$ , such that the dynamics restricted to $M_{\epsilon}$ approach (27) as $\epsilon\to 0$ . We give an alternative argument which yields a result differing from that of Eldering and Jacobs (2016) in two ways.

1. Eldering and Jacobs (2016) give an argument for general mechanical systems without symmetry under the assumption that the configuration space $Q$ is compact, although they do indicate that compactness can be replaced with uniformity conditions using noncompact NHIM theory (Eldering, 2013). Our argument assumes symmetry but allows $G$ to be noncompact, though we do require that $S\coloneqq Q/G$ be compact. This enables application of our result to locomotion systems with noncompact symmetry groups, such as the Euclidean group of planar rigid motions $\mathsf{SE}(2)$ as in the systems of §4.

2. Eldering and Jacobs (2016) consider the limit $m\to 0$ while holding $c$ and the force exerted by the locomoting body fixed. This makes sense, because if the exerted force were held fixed while taking $c\to\infty$, then trivial dynamics would result in the singular limit: the system would not move at all. Rather than holding the exerted force fixed, we will consider the differential equation prescribing the dynamics of the shape variable to be fixed (see footnote 17). Under this assumption, we show that the dynamics depend only on the ratio $\epsilon=\frac{m}{c}$, and in particular the dynamics obtained in the two singular limits $m\to 0$ and $c\to\infty$ are the same.

Footnote 17: This implicitly assumes that the locomoting body is capable of exerting $\mathcal{O}(c)$ forces.

Before stating Theorem 2, we need the following definition.

Definition 1 ( $C^{k}_{b}$ time-dependent vector fields).

Let $M$ be a compact manifold with boundary, and let $f\colon\mathbb{R}\times M\to\mathsf{T}M$ be a $C^{k\geq 0}$ time-dependent vector field. Let $(U_{i})_{i=1}^{n}$ be a finite open cover of $M$ and $(V_{i},\psi_{i})_{i=1}^{n}$ be a finite atlas for $M$ such that $\bar{U}_{i}\subset V_{i}$ for all $i$, and for each $i$ define $f_{i}\coloneqq(\mathsf{D}\psi_{i}\circ f\circ(\textnormal{id}_{\mathbb{R}}\times\psi_{i}^{-1}))$. We define an associated $C^{k}$ norm $\lVert f\rVert_{k}$ of $f$ via

[TABLE]

where $\lVert\mathsf{D}^{j}f_{i}(x)\rVert$ denotes the norm of a $j$-linear map; here $\mathsf{D}^{j}f$ includes partial derivatives with respect to time as well as the spatial variables. If $\lVert f\rVert_{k}<\infty$, we say that $f$ is $C^{k}$-bounded and write $f\in C^{k}_{b}$. The norm $\lVert\cdot\rVert_{k}$ makes the $C^{k}_{b}$ time-dependent vector fields into a Banach space. The norms induced by any two such finite covers of $M$ are equivalent, and thereby induce a canonical $C^{k}_{b}$ topology on the space of $C^{k}_{b}$ time-dependent vector fields.

Remark 4.

Definition 1 defines the $C^{k}_{b}$ topology on the space of $C^{k}_{b}$ time-dependent vector fields on a compact manifold. As discussed in Eldering (2013, Sec. 1.7), this $C^{k}_{b}$ topology is finer than the $C^{k}$ weak Whitney topology and coarser than the $C^{k}$ strong Whitney topology (Hirsch, 1994, Ch. 2), but all of these topologies induce the same topology on the subspace of time-independent vector fields due to compactness. Definition 1 is a special case of the definition in Eldering (2013, Ch. 2) for the $C^{k}_{b}$ topology on $C^{k}_{b}$ vector fields on Riemannian manifolds of bounded geometry, and on $C^{k}_{b}$ maps between such manifolds.

The following theorem concerns a $G$ -symmetric dynamical system on $\mathsf{T}Q$ whose equations of motion are consistent with our assumptions so far: i.e., they are given in local trivializations by (26) and an equation of the form (25).

Theorem 2.

Assume that $S=Q/G$ is compact. Let $2\leq k<\infty$ , and let $X^{\epsilon}$ be a $C^{k}$ family of $G$ -symmetric time-dependent vector fields on $\mathsf{T}Q$ with the following properties:

1. For every compact neighborhood with $C^{k}$ boundary $K_{0}\subset\mathsf{T}Q$ and $\epsilon>0$, $X^{\epsilon}|_{\mathbb{R}\times K_{0}}\in C^{k}_{b}$ (Definition 1).

2. There exists a compact connected neighborhood $K\subset\mathsf{T}S$ of the zero section of $\mathsf{T}S$ with $C^{k}$ boundary, such that $N\coloneqq\mathsf{D}\pi_{Q}^{-1}(K)\subset\mathsf{T}Q$ is positively invariant for $X^{\epsilon}$, for all sufficiently small $\epsilon>0$.

3. $X^{\epsilon}$ is given in each local trivialization $\mathsf{T}(U\times G)$, where $U$ is a chart for $S$, by (25) and (26):

[TABLE]

for some function $f$ which depends on the local trivialization but is independent of $\epsilon$ .

Then for all sufficiently small $\epsilon>0$ , there exists a $C^{k}$ noncompact normally hyperbolic invariant manifold with boundary $M_{\epsilon}\subset\mathbb{R}\times N\subset\mathbb{R}\times\mathsf{T}Q$ for the extended dynamics given by the extended vector field $(1,X_{\epsilon})$ on $\mathbb{R}\times\mathsf{T}Q$ . Additionally, $M_{\epsilon}$ is uniformly (in time and space) globally asymptotically stable and uniformly locally exponentially stable (with respect to the distance induced by any complete $G$ -invariant Riemannian metric on $\mathsf{T}Q$ ) for the extended dynamics restricted to $\mathbb{R}\times N$ . Finally, there exists $\epsilon_{0}>0$ such that, for each local trivialization $U\times G$ , there exists a $C^{k}$ map $h_{\epsilon}\colon\mathbb{R}\times(\mathsf{T}U\cap K)\times(0,\epsilon_{0})\to\mathfrak{g}^{*}$ such that $M_{\epsilon}\cap\mathsf{D}\pi_{Q}^{-1}(\mathsf{T}U\cap K)$ corresponds to

[TABLE]

(with $p$ defined by (22)), and $h_{\epsilon}$ together with its partial derivatives of order $k$ or less are bounded uniformly in time. If $f(t,r,\dot{r},\mathbb{I}_{\textnormal{loc}}^{-1}p)$ is independent of $t$ , then $h_{\epsilon}$ and $M_{\epsilon}$ are independent of $t$ , and $M_{\epsilon}$ can be interpreted as a compact NHIM for the (non-extended) dynamics restricted to $N$ .

Remark 5.

Note that even if we assume $f\in C^{\infty}$ , we can generally only obtain $C^{k}$ NHIMs $M_{\epsilon}$ for $k$ finite. This is because we obtain $M_{\epsilon}$ as a perturbation of a NHIM $M_{0}$ , and perturbations of $C^{\infty}$ NHIMs are generally only finitely smooth because the maximum perturbation size $\epsilon$ required to obtain degree of smoothness $k$ for $M_{\epsilon}$ generally depends on $k$ in such a way that $\epsilon\to 0$ as $k\to\infty$ . See Eldering (2013, Rem. 1.12) and van Strien (1979) for more discussion.

Remark 6.

By replacing compactness of $Q/G$ with uniformity conditions, it should be possible to generalize Theorem 2 to the situation of $Q$ noncompact where either $Q/G$ is noncompact, or where there is no symmetry at all. This was pointed out in Eldering and Jacobs (2016, App. 1). This observation seems important for the consideration of dissipative mechanical systems which are only approximately symmetric under a group $G$ , which seems to be a more realistic assumption.

Remark 7.

By taking $\epsilon\to 0$ in Theorem 2, we find that $p=\mathbb{I}_{\textnormal{loc}}(A_{\textnormal{mech}}-A_{\textnormal{visc}})\cdot\dot{r}$ in the limit. Substituting this into the first equation of (29), we obtain the viscous connection model (27) as in Kelly and Murray (1996).

Proof.

Preparation of the equations of motion. Throughout the proof, we consider the dynamics in local trivializations of the form $U\times G$ for $Q$ , where $U$ is the domain of a chart for $S$ , so that we have induced coordinates $(r,\dot{r})$ for $\mathsf{T}U$ . In such a local trivialization we would like to use (29) to analyze the dynamics, but there are two (related) problems with this. First, the definition of $p$ depends on $m$ , and this will cause difficulties in verifying Definition 1 to check that certain vector fields are close in the $C^{k}_{b}$ topology. Second, we would like to analyze (29) in a singular perturbation framework, but this is difficult to do directly because $m$ explicitly appears, and the size of $m$ may or may not be commensurate with the size of $\epsilon$ . To remedy this situation, we change variables via the diffeomorphism $(r,\dot{r},p,g)\mapsto(r,\dot{r},\Omega,g)$ of $\mathsf{T}U\times\mathfrak{g}^{*}\times G\to\mathsf{T}U\times\mathfrak{g}\times G$ where $\Omega\in\mathfrak{g}$ is defined by

$$\Omega\coloneqq\mathbb{I}_{\textnormal{loc}}^{-1}(r)\,p.$$

Sometimes $\Omega$ is referred to as the (body) locked angular velocity (Bloch et al., 1996, p. 61). Differentiating $\mathbb{I}_{\textnormal{loc}}\Omega=p$ , using (29), and rearranging yields

[TABLE]

where we have introduced the variable $v\coloneqq\dot{r}$ . We have written $\textnormal{ad}^{*}_{\accentset{\scriptstyle\circ}{g}}$ for space reasons, but note that the $\dot{\Omega}$ equation is independent of $g$ since

$$\accentset{\scriptstyle\circ}{g}=\Omega-A_{\textnormal{mech}}\cdot\dot{r}, \tag{33}$$

and this implies that $\textnormal{ad}^{*}_{\accentset{\scriptstyle\circ}{g}}=\textnormal{ad}^{*}_{\Omega}-\textnormal{ad}^{*}_{A_{\textnormal{mech}}\cdot\dot{r}}$ . We see that (32) is split into slow $(t,r,v)$ and fast $(\Omega)$ variables, which is the appropriate setup for a singular perturbation analysis. The remainder of the proof consists of two parts: (i) proving that the NHIM $M_{\epsilon}$ exists, and (ii) establishing the stability properties of $M_{\epsilon}$ .

Proof that $M_{\epsilon}$ exists. Introducing the “fast time” $\tau\coloneqq\frac{1}{\epsilon}t$ and denoting a derivative with respect to $\tau$ by a prime, after the time-rescaling we obtain the regularized equations

[TABLE]

This rescaling of time is equivalent to replacing the vector field $(1,X_{\epsilon})$ on $\mathbb{R}\times\mathsf{T}Q$ by $(\epsilon,\epsilon X_{\epsilon})$ . We see from (33) and (34) that there is a well-defined $C^{k}$ time-dependent vector field $\tilde{X}_{0}$ given by the pointwise limit $\tilde{X}_{0}\coloneqq\lim_{\epsilon\to 0}\epsilon X_{\epsilon}$ . Given any $G$ -symmetric time-dependent vector field $Y$ on $\mathsf{T}Q$ , we let $Y/G$ denote the corresponding reduced vector field on $(\mathsf{T}Q)/G$ . Hence (34) shows that the extended vector field $(1,\tilde{X}_{0}/G)$ has a smooth embedded submanifold $(M_{0}/G)$ of critical points whose intersection with a locally trivializable neighborhood is given by

[TABLE]

and it is readily seen that $M_{0}/G$ is described globally as the quotient of the Ehresmann connection $M_{0}\coloneqq\ker\Gamma_{\textnormal{visc}}$ by the lifted action of $G$ on $\mathsf{T}Q$ .

Furthermore, $M_{0}/G$ is a globally exponentially stable NHIM for the $\epsilon=0$ system. To see this, first note that in any local trivialization $t,r,v$ are constants when $\epsilon=0$ , and hence $\Omega^{\prime}$ is of the form $\Omega^{\prime}=\bar{\mathbb{I}}_{\textnormal{loc}}^{-1}\bar{\mathbb{V}}_{\textnormal{loc}}\Omega+b$ for a constant $b$ , and therefore has a globally exponentially stable equilibrium provided that all eigenvalues of $\bar{\mathbb{I}}_{\textnormal{loc}}^{-1}\bar{\mathbb{V}}_{\textnormal{loc}}$ have negative real part. To see that this is the case, fix a basis of $\mathfrak{g}$ and corresponding dual basis for $\mathfrak{g}^{*}$ , and first consider the product $\mathbb{I}^{-1}\mathbb{V}$ . With respect to our chosen basis, $\mathbb{I},\mathbb{V}$ and their inverses $\mathbb{I}^{-1},\mathbb{V}^{-1}$ are respectively represented by $r$ -dependent matrices $I_{ij},V_{ij}$ and their inverses $I^{ij},V^{ij}$ . It is immediate from the definitions (15) and (17) that $I_{ij}$ and $V_{ij}$ are respectively positive definite and negative definite symmetric matrices (this is why we required the bilinear forms $k,\nu$ to be positive definite when restricted to vectors tangent to $G$ orbits). Since $I_{ij}$ is symmetric positive definite, we may let $(\sqrt{I})_{ij}$ be a matrix square root of $I_{ij}$ and let $(\sqrt{I})^{ij}$ be its inverse. But then the product $I^{ik}V_{kj}$ is similar to the symmetric negative definite matrix $(\sqrt{I})^{ik}V_{k\ell}(\sqrt{I})^{\ell j}$ (Einstein summation implied). Hence $\mathbb{I}^{-1}\mathbb{V}$ has only eigenvalues with negative real part, and the same is true of $\mathbb{I}_{\textnormal{loc}}^{-1}\mathbb{V}_{\textnormal{loc}}$ because of the similarity $\mathbb{I}_{\textnormal{loc}}^{-1}\mathbb{V}_{\textnormal{loc}}=\textnormal{Ad}_{g}^{-1}\mathbb{I}^{-1}\mathbb{V}\textnormal{Ad}_{g}$ .
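The similarity argument above can be checked numerically in the $2\times 2$ case. The following is a minimal sketch in which the matrices `I_mat` and `V_mat` are arbitrary assumed values (symmetric positive definite and symmetric negative definite, respectively), not quantities from the paper. It shows that the product $I^{-1}V$ need not be symmetric, yet its eigenvalues are real and negative.

```python
import math

def inv_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul_2x2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues_2x2(m):
    """Real eigenvalues from the characteristic polynomial; raises if complex."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = tr * tr - 4.0 * det
    if disc < 0:
        raise ValueError("complex eigenvalues")
    root = math.sqrt(disc)
    return ((tr - root) / 2.0, (tr + root) / 2.0)

I_mat = [[2.0, 1.0], [1.0, 3.0]]     # symmetric positive definite (assumed values)
V_mat = [[-4.0, 1.0], [1.0, -2.0]]   # symmetric negative definite (assumed values)

M = matmul_2x2(inv_2x2(I_mat), V_mat)   # plays the role of I^{-1} V
eigs = eigenvalues_2x2(M)
```

Here `M` has unequal off-diagonal entries, so it is not symmetric, but both entries of `eigs` are negative, as the similarity to $(\sqrt{I})^{-1}V(\sqrt{I})^{-1}$ guarantees.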

Let $\tilde{\pi}\colon(\mathsf{T}Q)/G\to\mathsf{T}S$ denote the projection induced by $\mathsf{D}\pi_{Q}$. Equation (35) implies that $M_{0}/G$ is the image of a section $\sigma_{0}\colon\mathsf{T}S\to(\mathsf{T}Q)/G$ of $\tilde{\pi}$. Hence $(M_{0}/G)\cap\tilde{\pi}^{-1}(K)=\sigma_{0}(K)$ is compact, and $M_{0}/G$ intersects $\tilde{\pi}^{-1}(\partial K)$ transversely. Furthermore, the assumption that $X^{\epsilon}|_{\mathbb{R}\times K_{0}}\in C^{k}_{b}$ for any compact neighborhood with $C^{k}$ boundary $K_{0}\subset\mathsf{T}Q$ implies that all partial derivatives of $f$ are bounded on compact sets uniformly in time. This makes it clear that for any compact $K_{1}\subset(\mathsf{T}Q)/G$, $(\epsilon X_{\epsilon}/G)|_{\mathbb{R}\times K_{1}}$ can be made arbitrarily close to $(\tilde{X}_{0}/G)|_{\mathbb{R}\times K_{1}}$ in the $C^{k}_{b}$ topology (Definition 1) by taking $\epsilon>0$ sufficiently small. Hence by the noncompact NHIM results of Eldering (2013, Sec. 4.1-4.2), it follows that $(M_{0}/G)\cap\tilde{\pi}^{-1}(K)$ persists in extended state space $\mathbb{R}\times N$ to a nearby attracting NHIM $M_{\epsilon}/G$ with boundary for $(\epsilon,\epsilon X_{\epsilon}/G)$ (see footnote 18). Furthermore, $M_{\epsilon}/G$ is the image of a section $\sigma_{\epsilon}\colon\mathbb{R}\times K\to(\mathsf{T}Q)/G$ of $\tilde{\pi}$, and is given in each local trivialization of $(\mathsf{T}Q)/G$ by the graph of a function $\Omega=\tilde{h}_{\epsilon}(t,r,\dot{r},\epsilon)$ which is $C^{k}$ bounded uniformly in time. By symmetry, the preimage $M_{\epsilon}=\pi_{\mathsf{T}Q}^{-1}(M_{\epsilon}/G)$ of $M_{\epsilon}/G$ via the quotient $\pi_{\mathsf{T}Q}\colon\mathsf{T}Q\to(\mathsf{T}Q)/G$ yields a NHIM $M_{\epsilon}$ for $(\epsilon,\epsilon X_{\epsilon})$ (and hence also for $(1,X_{\epsilon})$) on the subset $\mathbb{R}\times N$ of $\mathbb{R}\times\mathsf{T}Q$, and $M_{\epsilon}$ is given in each local trivialization by the graph of the same function $\Omega=\tilde{h}_{\epsilon}$ as $M_{\epsilon}/G$ but augmented with trivial dependence on $g$. The function $h_{\epsilon}$ from the theorem statement is given by $h_{\epsilon}=\mathbb{I}_{\textnormal{loc}}\tilde{h}_{\epsilon}$.

Footnote 18: $M_{\epsilon}/G$ is unique up to the choice of a cutoff function used to modify the dynamics near the boundary of a slightly enlarged neighborhood of $\tilde{\pi}^{-1}(K)$, used in order to render a slightly enlarged version of $(M_{0}/G)\cap\tilde{\pi}^{-1}(K)$ overflowing invariant (Eldering, 2013, Sec. 4.3). See Eldering et al. (2018, Sec. 5) and Josić (2000, Sec. 2) for more details on such boundary modifications.

Proof of the stability properties of $M_{\epsilon}$. Fix any complete $G$-invariant Riemannian metric on $\mathsf{T}Q$ (see footnote 19), so that it descends to a metric on $(\mathsf{T}Q)/G$ making $\pi_{\mathsf{T}Q}\colon\mathsf{T}Q\to(\mathsf{T}Q)/G$ into a Riemannian submersion (do Carmo, 1992, p. 185). We have distance functions $\tilde{d}$ and $d$ on $\mathsf{T}Q$ and $(\mathsf{T}Q)/G$ induced by these metrics. For $t\in\mathbb{R}$, we let $M_{\epsilon}(t)\coloneqq M_{\epsilon}\cap(\{t\}\times N)$ and $M_{\epsilon}(t)/G\coloneqq\pi_{\mathsf{T}Q}(M_{\epsilon}(t))$. Given $w\in\mathsf{T}Q$ and its orbit $\pi_{\mathsf{T}Q}(w)\in(\mathsf{T}Q)/G$, it follows that for all $t\in\mathbb{R}$, $\tilde{d}(w,M_{\epsilon}(t))=d(\pi_{\mathsf{T}Q}(w),M_{\epsilon}(t)/G)$ (see footnote 20). Hence it suffices to prove that $M_{\epsilon}/G$ is uniformly globally asymptotically stable and locally exponentially stable for the vector field $(1,X_{\epsilon}/G)$ on $\mathbb{R}\times\tilde{\pi}^{-1}(K)=\mathbb{R}\times\pi_{\mathsf{T}Q}(N)$, and to do this it suffices to prove the same for $(\epsilon,\epsilon X_{\epsilon}/G)$.

Footnote 19: For example, take the Sasaki metric on $\mathsf{T}Q$ induced by any complete $G$-invariant metric on $Q$.

Footnote 20: To prove this, first note that $d(\pi_{\mathsf{T}Q}(w),M_{\epsilon}(t)/G)\leq\tilde{d}(w,M_{\epsilon}(t))$ because the length $\ell(\tilde{\gamma})$ of any curve $\tilde{\gamma}\colon[0,1]\to\mathsf{T}Q$ satisfies $\ell(\pi_{\mathsf{T}Q}\circ\tilde{\gamma})\leq\ell(\tilde{\gamma})$. But if $\gamma\colon[0,1]\to(\mathsf{T}Q)/G$ is any curve joining $\pi_{\mathsf{T}Q}(w)$ to $M_{\epsilon}/G$, then its horizontal lift $\tilde{\gamma}$ is a curve joining $w$ to $M_{\epsilon}$ such that $\ell(\tilde{\gamma})=\ell(\gamma)$. Taking the infimum over all such $\gamma$ shows that $\tilde{d}(w,M_{\epsilon}(t))=d(\pi_{\mathsf{T}Q}(w),M_{\epsilon}(t)/G)$.

Fixing an inner product $\langle\,\cdot\,,\,\cdot\,\rangle$ and associated norm $\lVert\,\cdot\,\rVert$ on $\mathfrak{g}$, we accomplish this in two steps. First, we show that there exists a compact neighborhood $K_{0}\subset\pi_{\mathsf{T}Q}(N)$ of $M_{\epsilon}/G$ such that $K_{0}$ is positively invariant for the time-dependent flow of $X_{\epsilon}$, and such that any other compact neighborhood $K_{1}\subset\pi_{\mathsf{T}Q}(N)$ of $M_{\epsilon}/G$ flows into $K_{0}$ after some finite time depending on $K_{1}$ but independent of the initial time. Second, we show that all trajectories in $K_{0}$ converge to $M_{\epsilon}/G$ at a uniform exponential rate. To achieve this second step, we show that in the intersection of each local trivialization with $K_{0}$, $\lVert\Omega-\tilde{h}_{\epsilon}(t,r,v)\rVert$ decreases at an exponential rate. Since $(\mathsf{T}Q)/G$ is covered by finitely many local trivializations (by compactness of $S$), and since all Riemannian metrics are uniformly equivalent on compact sets (see footnote 21), this will establish uniform exponential convergence of points in $K_{0}$ with respect to the distance induced by any Riemannian metric, and in particular the distance $d$.

Footnote 21: Let $\lVert\,\cdot\,\rVert,\lVert\,\cdot\,\rVert^{\prime}$ denote the Finslers (norms) induced by two Riemannian metrics, and $K_{0}$ our compact set. Since all norms are equivalent on finite-dimensional vector spaces, we have that the restrictions of these norms to the tangent space of a single point $x$ satisfy $\frac{1}{c(x)}\lVert\,\cdot\,\rVert\leq\lVert\,\cdot\,\rVert^{\prime}\leq c(x)\lVert\,\cdot\,\rVert$. Defining $\bar{c}\coloneqq\sup_{x\in K_{0}}c(x)$, we obtain the uniform equivalence $\frac{1}{\bar{c}}\lVert\,\cdot\,\rVert\leq\lVert\,\cdot\,\rVert^{\prime}\leq\bar{c}\lVert\,\cdot\,\rVert$ on all of $K_{0}$. If $K_{0}$ is a connected submanifold and we give it the restricted metrics, then by considering the lengths of curves in $K_{0}$ this implies the uniform bound $\frac{1}{\bar{c}}d\leq d^{\prime}\leq\bar{c}d$ on the Riemannian distances between points in $K_{0}$ with respect to the restricted metrics.

Consider a local trivialization $U\times G$ of $Q$ and the associated form (34) of the dynamics restricted to $\tilde{\pi}^{-1}(K\cap\mathsf{T}U)$ . Differentiating $\lVert\Omega\rVert^{2}$ using the last equation of (34), it is easy to check that $\frac{d}{d\tau}\lVert\Omega\rVert^{2}\to-\infty$ as $\lVert\Omega\rVert^{2}\to\infty$ , uniformly in $(t,r,v,\epsilon)$ for $\epsilon$ sufficiently small. (This follows from the negative definiteness of $\mathbb{I}_{\textnormal{loc}}^{-1}\mathbb{V}_{\textnormal{loc}}$ and the compactness of $K$ .) Hence we see that there exists $k_{0}>0$ such that for all $\epsilon$ sufficiently small, $\frac{d}{d\tau}\lVert\Omega\rVert^{2}\leq-1$ when $\lVert\Omega\rVert^{2}\geq k_{0}^{2}$ . Now $k_{0}$ might depend on the local trivialization, but we can replace $k_{0}$ with the largest such constant selected from finitely many fixed local trivializations covering $Q$ . Hence there exists a compact subset $K_{0}\subset\pi_{\mathsf{T}Q}(N)$ given by $\{\lVert\Omega\rVert\leq k_{0}\}$ in each of these fixed local trivializations, such that $K_{0}$ is positively invariant for the time-dependent flow of $X_{\epsilon}$ and such that any other compact neighborhood $K_{1}\subset\pi_{\mathsf{T}Q}(N)$ of $M_{\epsilon}/G$ flows into $K_{0}$ after some finite time independent of the initial time.

It remains only to establish the uniform exponential rate of convergence of trajectories in $K_{0}$ to $M_{\epsilon}$ . For each local trivialization $U\times G$ of $Q$ , we define the translated variable $\tilde{\Omega}\coloneqq\Omega-\tilde{h}_{\epsilon}(t,r,v,\epsilon)$ . Since $M_{\epsilon}/G$ is invariant, we must have $\tilde{\Omega}^{\prime}=0$ whenever $\tilde{\Omega}=0$ . Differentiating $\tilde{\Omega}$ using (34), we therefore find that

[TABLE]

since all of the terms which do not vanish when $\tilde{\Omega}=0$ must cancel. Here $\zeta$ is defined via Hadamard’s lemma (Nestruev, 2003, Lemma 2.8):

[TABLE]

so that $\zeta(t,r,v,\tilde{\Omega})\tilde{\Omega}=\tilde{h}_{\epsilon}(t,r,v)f(t,r,v,\tilde{h}_{\epsilon}+\tilde{\Omega})$ . As previously mentioned, the $C^{k}$ boundedness of $X_{\epsilon}$ on compact subsets of $\mathsf{T}Q$ implies that $\tilde{h}_{\epsilon}$ , $f$ , and their first $k$ partial derivatives are uniformly bounded on sets of the form $\mathbb{R}\times K_{2}$ with $K_{2}$ compact. Hence whenever $\lVert\Omega\rVert\leq k_{0}$ and $(r,v)\in U\cap K$ , $\lVert A(t,r,v,\tilde{\Omega})\rVert\leq L$ for some constant $L$ depending on the local trivialization; we replace $L$ with the largest such constant chosen from finitely many local trivializations covering $Q$ . Integrating both sides of (36), taking norms using the triangle inequality, and applying Grönwall’s Lemma therefore yields

[TABLE]

where $-\lambda<0$ is defined via $-\lambda\coloneqq\sup_{r\in S}\max\,\textnormal{spec}(\bar{\mathbb{I}}_{\textnormal{loc}}^{-1}\bar{\mathbb{V}}_{\textnormal{loc}}(r))$ , and is strictly negative since $S$ is compact. By the previous discussion, requiring $\epsilon>0$ to be sufficiently small so that $-\lambda+\epsilon L<0$ completes the proof. ∎
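The exponential attraction established in this proof can be illustrated on a scalar toy problem. The sketch below is not the system (32): it assumes a single fast variable obeying $\epsilon\dot{\Omega}=-\lambda(\Omega-a\cos t)$, with the parameter values `EPS`, `LAM`, `AMP` chosen arbitrarily. For this toy equation the attracting solution is known in closed form, and the distance to it decays exactly like $e^{-\lambda t/\epsilon}$ in the slow time $t$.

```python
import math

EPS, LAM, AMP = 0.05, 1.0, 1.0   # assumed toy parameters

def omega_dot(t, omega):
    """Toy fast equation: EPS * d(omega)/dt = -LAM * (omega - AMP*cos(t))."""
    return -LAM * (omega - AMP * math.cos(t)) / EPS

def slow_manifold(t):
    """Exact attracting periodic solution of the toy equation."""
    denom = LAM ** 2 + EPS ** 2
    return AMP * (LAM ** 2 * math.cos(t) + EPS * LAM * math.sin(t)) / denom

def rk4(f, t0, x0, t1, n):
    """Classical fourth-order Runge-Kutta integration from t0 to t1 in n steps."""
    h = (t1 - t0) / n
    t, x = t0, x0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

t_end = 0.2
omega_end = rk4(omega_dot, 0.0, 0.0, t_end, 2000)   # start off the manifold
d0 = abs(0.0 - slow_manifold(0.0))
d_end = abs(omega_end - slow_manifold(t_end))
predicted = d0 * math.exp(-LAM * t_end / EPS)        # expected decay of the distance
```

In this linear toy case the decay rate is exactly $\lambda/\epsilon$. In the theorem the rate is only bounded by a quantity of the form $-\lambda+\epsilon L$ in the fast time, because of the $\epsilon$-dependent coupling terms.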

Theorem 2 and Remark 7 show that, to zeroth order in $\epsilon$ , the dynamics restricted to the slow manifold $M_{\epsilon}$ are given by the viscous connection model (27). The following theorem shows that the dynamics restricted to $M_{\epsilon}$ can be explicitly computed to higher order in $\epsilon$ . We compute the restricted dynamics to first order in $\epsilon$ . Higher order terms in $\epsilon$ can also be computed recursively, but we choose not to pursue this here.

Theorem 3.

Assume the same hypotheses as in Theorem 2. Then the dynamics restricted to the slow manifold $M_{\epsilon}$ are given in a local trivialization by

[TABLE]

where

[TABLE]

where we are using the definition $\bar{\mathbb{I}}_{\textnormal{loc}}\coloneqq\frac{1}{m}\mathbb{I}_{\textnormal{loc}}$ . Alternatively, we may write

[TABLE]

for a different $\mathcal{O}(\epsilon^{2})$ term.

Remark 8.

Notice the presence, in the second term of (39), of $\bar{h}_{0}$ rather than $h_{0}$ of (30). This is important because the expression for $h_{0}$ contains an $\mathbb{I}_{\textnormal{loc}}=m\bar{\mathbb{I}}_{\textnormal{loc}}$ factor. Because of the possibility that the size of $m$ is commensurate with $\epsilon$ , this means that $h_{0}$ could be $\mathcal{O}(\epsilon)$ . However, $\bar{h}_{0}$ is $\mathcal{O}(1)$ , ensuring that the second term is $\mathcal{O}(\epsilon)$ but not $\mathcal{O}(\epsilon^{2})$ .

Remark 9.

Equations (39) and (40) can be viewed as adding $\mathcal{O}(\epsilon)$ correction terms to the viscous connection model (27), valid in the limit $\epsilon\to 0$ , to account for the more realistic situation that the inertia-damping ratio $\frac{m}{c}=\epsilon$ is small but nonzero.

Proof of Theorem 3.

Consider the function

[TABLE]

from the proof of Theorem 2, and define $\bar{h}_{\epsilon}\coloneqq\bar{\mathbb{I}}_{\textnormal{loc}}\tilde{h}_{\epsilon}=\frac{1}{m}h_{\epsilon}$ . Since $\bar{h}_{\epsilon},\tilde{h}_{\epsilon}\in C^{k}$ , we may expand them as asymptotic series

[TABLE]

where for all $i$ , $\bar{h}_{i}=\bar{\mathbb{I}}_{\textnormal{loc}}\tilde{h}_{i}$ . We also already know from Theorem 2 that $\tilde{h}_{0}=(A_{\textnormal{mech}}-A_{\textnormal{visc}})\cdot\dot{r}$ , and therefore $\tilde{h}_{0}(t,r,\dot{r})\equiv\tilde{h}_{0}(r,\dot{r})$ has no explicit $t$ -dependence. We now compute $\tilde{h}_{1}$ via a standard technique (Jones, 1995). Differentiating both sides of the equation $\Omega=\tilde{h}_{\epsilon}(t,r,\dot{r},\epsilon)$ with respect to time (using (32) to differentiate the left hand side), substituting the second equation of (41) for $\Omega$ in the resulting expression, and retaining terms only up to $\mathcal{O}(\epsilon)$ we obtain

[TABLE]

Equating the coefficients of $\epsilon$ yields

[TABLE]

Since $h_{1}=\mathbb{I}_{\textnormal{loc}}\tilde{h}_{1}$ and $\bar{h}_{0}=\bar{\mathbb{I}}_{\textnormal{loc}}\tilde{h}_{0}$ , we find

[TABLE]

and therefore (substituting $\ddot{r}=f(t,r,\dot{r},\mathbb{I}_{\textnormal{loc}}^{-1}p)=f(t,r,\dot{r},\tilde{h}_{0})+\mathcal{O}(\epsilon)$ and differentiating $\bar{h}_{0}(r,\dot{r})$ via the chain rule),

[TABLE]

Notice that, since $\tilde{h}_{0}$ is a function of $r,\dot{r}$ only, the $\mathcal{O}(\epsilon)$ portion of the right hand side of (43) is a function of $t,r,\dot{r}$ alone and not $p$ . This is required since $h_{\epsilon}$ is required to be a function of $t,r,\dot{r},\epsilon$ alone, and is the reason that we needed to replace $\ddot{r}$ by $f(t,r,\dot{r},\tilde{h}_{0})$ in the $\mathcal{O}(\epsilon)$ term. Substituting (43) into the first equation of (26) yields Equation (40). Finally, making the substitution $f(t,r,\dot{r},\tilde{h}_{0})=\ddot{r}+\mathcal{O}(\epsilon)$ in Equation (40) yields Equation (39). ∎
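The coefficient-matching technique used in this proof can be seen on a scalar toy problem. The sketch below is an illustration under stated assumptions, not the paper's equations: it assumes a fast variable obeying $\epsilon\dot{\Omega}=-\lambda(\Omega-u(t))$ with a prescribed slow input $u$. Substituting $\Omega=h_{0}+\epsilon h_{1}+\mathcal{O}(\epsilon^{2})$ and equating powers of $\epsilon$ gives $h_{0}=u$ and $h_{1}=-\dot{u}/\lambda$. The code compares both truncations against a numerical solution after the transient has decayed.

```python
import math

EPS, LAM = 0.05, 1.0   # assumed toy parameters

def u(t):
    """Prescribed slow input; plays the role of the zeroth-order term h_0."""
    return math.cos(t)

def u_dot(t):
    """Time derivative of the slow input."""
    return -math.sin(t)

def omega_dot(t, omega):
    """Toy fast equation: EPS * d(omega)/dt = -LAM * (omega - u(t))."""
    return -LAM * (omega - u(t)) / EPS

def rk4(f, t0, x0, t1, n):
    """Classical fourth-order Runge-Kutta integration from t0 to t1 in n steps."""
    h = (t1 - t0) / n
    t, x = t0, x0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

T = 10.0
omega_T = rk4(omega_dot, 0.0, 0.0, T, 10000)
err_order0 = abs(omega_T - u(T))                             # h_0 only
err_order1 = abs(omega_T - (u(T) - EPS * u_dot(T) / LAM))    # h_0 + EPS * h_1
```

With these values the zeroth-order error is of size $\epsilon$ and the first-order error is of size $\epsilon^{2}$, which mirrors the role of the $\mathcal{O}(\epsilon)$ correction in (39) and (40).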

The following theorem makes clearer the functional form of the dynamics (39), and it removes the $\accentset{\scriptstyle\circ}{g}$ dependence of the right hand side of (39).

Theorem 1′.

Assume the hypotheses of Theorem 2. Then for all sufficiently small $\epsilon>0$ and each local trivialization, there exist smooth fields of linear maps $B(r)$ and $(1,2)$ tensors $G(r)$ such that the dynamics restricted to the slow manifold $M_{\epsilon}$ in the local trivialization satisfy

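$$\accentset{\scriptstyle\circ}{g}=-A_{\textnormal{visc}}(r)\cdot\dot{r}+\epsilon B(r)\cdot\ddot{r}+\epsilon G(r)\cdot(\dot{r},\dot{r})+\mathcal{O}(\epsilon^{2}).$$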

Remark 10.

The (1,2) tensors $G(r)$ are not generally symmetric, which is clear from Equation (46) below.

Proof.

Using the properties of $\textnormal{ad}^{*}$ , we may write $\textnormal{ad}^{*}_{\accentset{\scriptstyle\circ}{g}}(\bar{h}_{0})=(C\cdot\bar{h}_{0})\cdot(\accentset{\scriptstyle\circ}{g})$ for an appropriate ( $r$ -independent) linear map $C\colon\mathfrak{g}^{*}\to\text{End}(\mathfrak{g})$ , and hence we may rewrite (39) as

[TABLE]

For sufficiently small $\epsilon$ , we may use the identity

[TABLE]

to obtain

[TABLE]

Since $\bar{h}_{0}(r,\dot{r})=\bar{\mathbb{I}}_{\textnormal{loc}}(r)(A_{\textnormal{mech}}(r)-A_{\textnormal{visc}}(r))\cdot\dot{r}$ is linear in $\dot{r}$ , it follows that the second and third terms are bilinear in $\dot{r}$ , and the fourth term is linear in $\ddot{r}$ . Hence we may take $B(r)\coloneqq\bar{\mathbb{V}}_{\textnormal{loc}}^{-1}\left(\frac{\partial}{\partial\dot{r}}\bar{h}_{0}\right)$ and

[TABLE]

∎
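To make the functional form in Theorem 1′ concrete, the following sketch evaluates the truncated model $-A_{\textnormal{visc}}\cdot\dot{r}+\epsilon B\cdot\ddot{r}+\epsilon G\cdot(\dot{r},\dot{r})$ at a single shape $r$, dropping the $\mathcal{O}(\epsilon^{2})$ remainder. The function name `body_velocity` and all coefficient values are hypothetical; in practice the arrays would come from the local trivialization or from a data-driven fit.

```python
def body_velocity(A, B, G, rdot, rddot, eps):
    """Evaluate -A.rdot + eps*B.rddot + eps*G(rdot, rdot), one Lie algebra component per row.

    A, B: k-by-n nested lists; G: k-by-n-by-n nested list (all hypothetical values
    of the local coefficient fields at one fixed shape r).
    """
    n = len(rdot)
    out = []
    for a_row, b_row, g_slice in zip(A, B, G):
        linear = -sum(a_row[i] * rdot[i] for i in range(n))
        accel = sum(b_row[i] * rddot[i] for i in range(n))
        quad = sum(g_slice[i][j] * rdot[i] * rdot[j]
                   for i in range(n) for j in range(n))
        out.append(linear + eps * (accel + quad))
    return out

# One body-velocity component (k = 1) and a two-dimensional shape space (n = 2).
A = [[1.0, 2.0]]
B = [[0.5, 0.0]]
G = [[[0.0, 1.0], [0.0, 0.0]]]   # deliberately non-symmetric in its two slots
velocity = body_velocity(A, B, G, rdot=[1.0, 1.0], rddot=[2.0, 0.0], eps=0.1)
```

The model is linear in the unknown entries of $A_{\textnormal{visc}}$, $B$ and $G$, so for fixed $\epsilon$ these entries can in principle be estimated by linear regression on observed $(\dot{r},\ddot{r},\accentset{\scriptstyle\circ}{g})$ samples. Only the symmetric part of `G` contributes when both of its arguments equal $\dot{r}$.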

Bibliography

1. Bittner et al. [2018]. B Bittner, R L Hatton, and S Revzen. Geometrically optimal gaits: a data-driven approach. Nonlinear Dynamics, pages 1–16, 2018.
2. Bloch [2015]. A M Bloch. Nonholonomic mechanics and control, volume 24. Springer-Verlag, 2nd edition, 2015. ISBN 978-1-4939-3016-6. doi: 10.1007/978-1-4939-3017-3.
3. Bloch et al. [1996]. A M Bloch, P S Krishnaprasad, J E Marsden, and R M Murray. Nonholonomic mechanical systems with symmetry. Archive for Rational Mechanics and Analysis, 136(1):21–99, 1996.
4. Brendelev [1981]. V N Brendelev. On the realization of constraints in nonholonomic mechanics. Journal of Applied Mathematics and Mechanics, 45(3):351–355, 1981.
5. Cox [1970]. R G Cox. The motion of long slender bodies in a viscous fluid. Part 1. General theory. Journal of Fluid Mechanics, 44(4):791–810, 1970.
6. do Carmo [1992]. M P do Carmo. Riemannian geometry. Birkhäuser, 2nd edition, 1992. ISBN 978-0-8176-3490-2.
7. Eldering [2013]. J Eldering. Normally hyperbolic invariant manifolds: the noncompact case. Atlantis Press, 2013. ISBN 978-94-6239-002-7. doi: 10.2991/978-94-6239-003-4.
8. Eldering [2016]. J Eldering. Realizing nonholonomic dynamics as limit of friction forces. Regular and Chaotic Dynamics, 21(4):390–409, 2016.
