Learned multi-stability in mechanical networks

Menachem Stern; Matthew B. Pinson; Arvind Murugan

arXiv:1902.08317 · cond-mat.soft · September 2, 2020

TL;DR

This paper explores how elastic networks can be designed or physically learned to have multiple stable states, revealing that non-linear elasticity is essential for sequential learning and stability of desired configurations.

Contribution

It introduces a framework contrasting material design and physical learning, showing non-linear elasticity enables sequential learning of multiple stable states in mechanical networks.

Findings

01. Linear networks stabilize designed states.

02. Non-linear elasticity stabilizes states with mixed strains.

03. Material properties enable continuous learning of new functions.

Abstract

We contrast the distinct frameworks of materials design and physical learning in creating elastic networks with desired stable states. In design, the desired states are specified in advance and material parameters can be optimized on a computer with this knowledge. In learning, the material physically experiences the desired stable states in sequence, changing the material so as to stabilize each additional state. We show that while designed states are stable in networks of linear Hookean springs, sequential learning requires specific non-linear elasticity. We find that such non-linearity stabilizes states in which strain is zero in some springs and large in others, thus playing the role of Bayesian priors used in sparse statistical regression. Our model shows how specific material properties allow continuous learning of new functions through deployment of the material itself.

Equations

$$\frac{dk_{ij}^{\mathrm{eff}}}{dt}=k_{0}f(r_{ij}). \tag{1}$$

$$E=||A\mathbf{s}-b||^{2}+||\mathbf{s}||^{\xi} \tag{2}$$

$$E(s)\sim k_{0}\frac{s^{2}}{(\sigma^{2}+s^{2})^{1-0.5\xi}}, \tag{3}$$

$$E=-\mathbf{F}^{ext}\cdot\mathbf{x}+k\sum_{\mathrm{red}}s_{a}^{\xi}=-\mathbf{F}^{ext}\cdot\mathbf{x}+k||\mathbf{s}||^{\xi}, \tag{4}$$

Full text

Physics Department and the James Franck Institute, University of Chicago, Chicago, IL 60637

Materials design is generally predicated on knowing the desired material behavior at the time of design. If an adaptable material with multiple behaviors is desired, all potential desired behaviors are usually specified in advance. As a result, we can optimize design parameters compatible with all of the specified desired behaviors. Among mechanical metamaterials, such design has been fruitfully used to create materials that switch from being soft to stiff, transparent to opaque or energy absorbing to elastic, by simply switching between different stable geometric states of the material silverberg2015origami ; Waitukaitis2015-rw ; overvelde2016three ; shan2015multistable ; bertoldi2017flexible ; steinbach2016bistable ; Wu2018-ad ; Yang2018-kv ; Che2017-bu ; Yang2018-ep ; Fu2018-ya .

Here, we explore an alternative approach, where a material learns desired behaviors on the fly by physically experiencing such behaviors in sequence, e.g., by being held in each desired state for a period of time. Such a learning framework for materials offers many complementary strengths to the conventional design framework. For example, the precise behaviors needed can be inferred from the actual conditions of use in real time, instead of an a priori specification. New functionalities can be gained during, and due to, use. Such benefits have made learning a powerful framework in neuroscience and artificial neural networks, but this framework is relatively unexplored in the context of materials rocks2017designing ; rocks2018limits ; hexner2018role .

However, learning in the context of materials presents challenges in addition to such opportunities. In the learning framework, the desired behaviors are not all known ahead of time but presented sequentially. Thus material parameters to encode each desired behavior must be chosen independently without knowledge of future desired behaviors. Most critically, each stored behavior or state needs to survive the parameter changes due to the subsequent learned behaviors and not be overwritten by them. It is not clear what kinds of material properties and interactions would allow such sequential learning of multiple behaviors.

In this work we contrast the requirements for design and learning of multiple stable states in a simple elastic network. In the design model, we search over all spring constants on a computer to stabilize a set of states that are specified beforehand. In the learning model, the desired states are learned in sequence by example, placing the material in each of these states for a period of time. During this time, stabilizing elastic rods or springs with a rest length grow between particles within some distance in space, mimicking the seeded growth of microtubules Hess2017-gi or self-assembling DNA nanotubes mohammed2013directing . Thus, in contrast to design, the learning model is constrained by locality in space and time – material parameters are modified only by the local geometry of the current configuration being experienced rocks2018limits ; hexner2018role .

As a direct consequence, we find that successful learning requires non-linear elasticity of a specific type. Parameterizing the elastic energy of springs in the network as $E\sim x^{\xi}$ for large extensions $x$ , we find that our design procedure is optimal for $\xi\approx 2$ (Hooke’s law) but learning requires $0<\xi\leq 1$ . Such nonlinear springs have been demonstrated using metamaterial designs IsobeNonlinearSpring ; ChenNonlinearSpring . We relate this distinction to the way springs are unequally strained in a learned state – springs learned for that state are nearly unstrained while all other springs are highly strained. Such ‘sparse’ strain profiles are stabilized by $\xi\leq 1$ springs but not for $\xi>1$ .

We establish these results by relating spring non-linearity to Bayesian priors used in statistical regression; such priors can pick out sparse solutions to equations in which some variables are exactly zero. Much in the way Bayesian priors dictate sparsity in statistical regression, the non-linearity of springs dictates that information about each learned state is localized in the material. We hope our analysis of a simple mechanical model will stimulate further work on the conditions under which materials can learn new functionalities on the fly.

Results

We seek to create an elastic network of springs connecting $N$ particles in $2$ dimensions, such that the network has $M$ desired stable states (Fig. 1a). Each desired stable state $m=1,\ldots,M$ is specified by the positions $\textbf{x}^{(m)}$ of the $N$ particles (up to rigid body translations and rotations).

In our design model, we connect the $N$ particles by Hookean (linear) springs, and solve an optimization problem for spring constants $k_{ij}$ and rest lengths $l_{ij}$ that minimizes residual forces at each of the desired configurations $\textbf{x}^{(m)}$ (Fig. 1b); see Supplementary Note 1 for details.
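The quantity being minimized in the design model can be illustrated with a minimal sketch. The function names `residual_forces` and `design_loss`, the array layout, and the plain sum-of-squares objective are assumptions made for illustration; the actual objective and constraints are described in Supplementary Note 1 and are not reproduced here.

```python
import numpy as np

def residual_forces(x, pairs, k, l):
    """Net Hookean force on each particle for configuration x.

    x     : (N, 2) particle positions
    pairs : (P, 2) integer indices (i, j) of connected particles
    k, l  : (P,) spring constants and rest lengths
    """
    x = np.asarray(x, dtype=float)
    i, j = pairs[:, 0], pairs[:, 1]
    d = x[j] - x[i]                      # vector from particle i to particle j
    r = np.linalg.norm(d, axis=1)        # current spring lengths
    tension = k * (r - l)                # positive when the spring is stretched
    f_on_i = (tension / r)[:, None] * d  # a stretched spring pulls i toward j
    forces = np.zeros_like(x)
    np.add.at(forces, i, f_on_i)
    np.add.at(forces, j, -f_on_i)
    return forces

def design_loss(states, pairs, k, l):
    """Sum of squared residual forces over all desired configurations (assumed objective)."""
    return sum(np.sum(residual_forces(x, pairs, k, l) ** 2) for x in states)
```

Note that this loss alone is trivially minimized by $k_{ij}=0$, so any practical optimization would need additional constraints (for example a lower bound on stiffness); which constraints the authors use is not stated in this section.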

In the learning model, desired stable states are acquired by sequentially placing the material in the desired configurations (Fig. 1c). When left in a configuration $\textbf{x}^{(1)}$ for a length of time, unstretched elastic rods grow between every pair of particles $i,j$ at a rate $f(r_{ij})$ set by their separation $r_{ij}$ ; we assume that $f$ vanishes rapidly outside of a characteristic length scale $R$ , so only nodes within a distance less than $R$ are stabilized by such rods. Such elastic elements that grow between specific sites are found both in living systems (e.g., microtubules growing between centrosomes and centromeres Dogterom2013-vk ; Hess2017-gi ) and in synthetic systems (e.g., self-assembling DNA nanotubes mohammed2013directing growing between seeds).

Since the number of rods grows with time, the effective spring constant for the set of rods connecting two particles $i,j$ grows with time and is given by,

$$\frac{dk_{ij}^{\mathrm{eff}}}{dt}=k_{0}f(r_{ij}). \tag{1}$$

Here $k_{0}$ is the spring constant of each rod, whose rest length $l_{ij}$ is assumed equal to the particle separation $r_{ij}$ , i.e., rods are unstretched in the desired state. In simulations, we pick $f$ to be a step function of range $R$ , $f(r<R)=1,f(r>R)=0$ .
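As a rough illustration of this learning rule, the sketch below integrates the growth law $dk_{ij}^{\mathrm{eff}}/dt=k_{0}f(r_{ij})$ over a hold time with the step-function $f$ used in the simulations. The function names, the `hold_time` parameter, and the returned array layout are hypothetical choices, not taken from the paper.

```python
import numpy as np

def step_growth_rate(r, R):
    """Step-function growth rate: f(r < R) = 1, f(r > R) = 0."""
    return (r < R).astype(float)

def hold_in_state(x, R, k0, hold_time):
    """Grow unstretched rods while the network is held in configuration x.

    Returns (i, j, rest_length, k_eff) for every pair that acquired rods.
    With a constant configuration, integrating dk_eff/dt = k0 * f(r)
    over the hold time gives k_eff = k0 * f(r) * hold_time.
    """
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)        # every unordered pair once
    r = np.linalg.norm(x[i] - x[j], axis=1)    # pair separations r_ij
    k_eff = k0 * step_growth_rate(r, R) * hold_time
    grown = k_eff > 0
    # Rest length equals the separation in the held state (rods are unstretched).
    return i[grown], j[grown], r[grown], k_eff[grown]
```

Holding the network in a second configuration would simply call `hold_in_state` again and append the new rods to the existing ones, which is how a pair of particles can end up joined by springs with different rest lengths.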

Equation 1 describes the learning rule for this material; the effective spring constant and rest length between two particles $i,j$ are determined by the geometric configurations experienced by the material and the amount of time spent in each configuration. When the material is deformed and held in a second distinct configuration $\textbf{x}^{(2)}$, additional rods start growing between the particles according to their positions in the new configuration. In some cases, two particles can be joined by multiple springs with different rest lengths.

Linear and non-linear elasticity

We ran the design and learning algorithms using rods with linear Hookean elasticity, i.e., with elastic energy $E_{ij}\sim k_{0}s_{ij}^{2}$ when strained by $s_{ij}$. The design algorithm, when run on a pair of randomly generated desired states $\textbf{x}^{(1)}$ and $\textbf{x}^{(2)}$ of 10 particles, resulted in an elastic network with two stable minima that resemble $\textbf{x}^{(1)}$ and $\textbf{x}^{(2)}$, as seen in Fig. 2a. These states can be retrieved by any initial condition within wide attractor regions around $\textbf{x}^{(1)},\textbf{x}^{(2)}$.

In contrast, learning the same two states $\textbf{x}^{(1)},\textbf{x}^{(2)}$ with linear springs fails (Fig. 2b); the two desired states are not stable minima of the learned network. The rods grown to encode state $\textbf{x}^{(1)}$ destabilize, or overwrite, state $\textbf{x}^{(2)}$ and vice-versa. Initial conditions near either $\textbf{x}^{(1)}$ or $\textbf{x}^{(2)}$ relax to new minima very different from $\textbf{x}^{(1)}$ , $\textbf{x}^{(2)}$ .

Why do linear springs allow stabilization of multiple states with design but not with sequential learning? In design, the desired configurations are known ahead of time and so each spring’s parameters can be chosen cognizant of all desired configurations. In fact, one can check that changing one of the desired states, e.g., $\textbf{x}^{(1)}\to\textbf{x}^{(1)}+\delta\textbf{x}^{(1)}$ changes stiffness $k_{ij}$ and rest length $l_{ij}$ for all springs. In this sense, information about each desired state is stored in every spring.

However, in a learning model capable of acquiring arbitrary stable states in sequence, the parameter changes made to store a state $\textbf{x}^{(m)}$ cannot depend on the details of future desired states Hebbian1 , and indeed, in this model, do not depend on past encoded states either. That is, changing a desired configuration, e.g., $\textbf{x}^{(m)}\to\textbf{x}^{(m)}+\delta\textbf{x}^{(m)}$, changes the spring parameters $k_{ij}$, $l_{ij}$ only for springs grown while learning state $m$. Thus, information about each stored state is confined to a subset of springs.

Consequently, to stabilize a state $\textbf{x}^{(m)}$ , the elastic dynamics should only attempt to minimize strain to zero in a subset of all springs while leaving all other springs stretched arbitrarily as needed. However, the mechanics cannot possibly know which subset of springs was learned to stabilize a particular state $\textbf{x}^{(m)}$ and thus which subset to satisfy.

A clue to solving this problem comes from sparse regression Lasso ; zou2005regularization . As an example, consider an under-determined problem $A\mathbf{s}=b$ for a vector $\mathbf{s}$ . If we know a priori that an $\mathbf{s}$ exists which has some components that are strictly zero and others non-zero, we can find such ‘sparse’ solutions $\mathbf{s}$ by adding a ‘Bayesian prior’ $||\mathbf{s}||^{\xi}=\sum_{i}s_{i}^{\xi}$ to the least squares loss function,

$$E=||A\mathbf{s}-b||^{2}+||\mathbf{s}||^{\xi}, \tag{2}$$

and then minimizing $E$ Lasso ; SparseRegression . If $\xi\leq 1$ , such a Bayesian prior $||\mathbf{s}||^{\xi}$ biases the search towards solutions $\mathbf{s}$ in which some elements of $\mathbf{s}$ are strictly zero while others are non-zero (i.e., ‘sparse’ solutions). We emphasize that the Bayesian prior $s^{\xi}$ contains no information about which components of $s$ are to be set to zero; rather, it biases regression towards such solutions and away from generic solutions in which all entries of $\mathbf{s}$ are non-zero.
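The sparsity bias can be checked numerically on a toy problem. The sketch below uses an assumed two-variable instance, $A=(1,1)$ and $b=1$, and a brute-force grid search rather than any solver from the cited literature; the names `toy_loss` and `grid_minimizer` are hypothetical.

```python
import numpy as np

def toy_loss(s1, s2, xi):
    """Least-squares term for the single equation s1 + s2 = 1, plus the xi-prior."""
    return (s1 + s2 - 1.0) ** 2 + np.abs(s1) ** xi + np.abs(s2) ** xi

def grid_minimizer(xi, n=201):
    """Brute-force minimizer of toy_loss on a square grid that contains s = 0."""
    grid = np.linspace(-0.5, 1.5, n)
    s1, s2 = np.meshgrid(grid, grid, indexing="ij")
    loss = toy_loss(s1, s2, xi)
    a, b = np.unravel_index(np.argmin(loss), loss.shape)
    return grid[a], grid[b]

sparse_solution = grid_minimizer(0.5)  # one component sits at zero
dense_solution = grid_minimizer(2.0)   # both components share the load equally
```

For $\xi=0.5$ the minimizer puts all of the weight on one variable and leaves the other at zero, while for $\xi=2$ both variables take the same non-zero value. Nothing in the loss says which variable should vanish; the prior only makes such solutions preferable.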

We employ a similar strategy here by identifying $\mathbf{s}$ above with the vector of strains in different springs. Let us assume that the network spring energies take a non-linear form,

$$E(s)\sim k_{0}\frac{s^{2}}{(\sigma^{2}+s^{2})^{1-0.5\xi}}, \tag{3}$$

where $k_{0}$ is the spring constant and $s_{ij}\equiv(r_{ij}-l_{ij})$ is the strain relative to rest length $l_{ij}$ . $\xi$ parameterizes the non-linearity (Fig. 2c); $\xi=2$ is a linear Hookean spring while $\xi<2$ springs have softer restoring forces at large distances, $E\sim s^{\xi}$ . Finally, $\sigma$ is a small length scale within which the interaction is linear for any $\xi$ and is introduced to keep the model realistic, reflecting practical realizations of non-linear $\xi<2$ springs IsobeNonlinearSpring ; ChenNonlinearSpring ; our results below hold for $\sigma\to 0$ as well. See Supplementary Note 2 for details.
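To make the two regimes of this energy concrete, here is a small sketch that evaluates the spring energy and its derivative with respect to strain. The proportionality in the energy is taken as an equality, and the default values `k0=1.0` and `sigma=0.05` are arbitrary assumptions; the derivative formula follows from differentiating that expression.

```python
import numpy as np

def spring_energy(s, xi, k0=1.0, sigma=0.05):
    """E(s) = k0 * s^2 / (sigma^2 + s^2)^(1 - xi/2)."""
    s = np.asarray(s, dtype=float)
    return k0 * s**2 / (sigma**2 + s**2) ** (1.0 - 0.5 * xi)

def spring_tension(s, xi, k0=1.0, sigma=0.05):
    """dE/ds = k0 * s * (2 sigma^2 + xi s^2) / (sigma^2 + s^2)^(2 - xi/2).

    For |s| << sigma this is linear in s (Hookean core);
    for |s| >> sigma it scales as xi * k0 * |s|^(xi - 1).
    """
    s = np.asarray(s, dtype=float)
    return k0 * s * (2.0 * sigma**2 + xi * s**2) / (sigma**2 + s**2) ** (2.0 - 0.5 * xi)
```

With $\xi=2$ the expression reduces exactly to the Hookean tension $2k_{0}s$. With $\xi<1$ the tension decreases as the strain grows beyond $\sigma$, so a highly strained spring pulls less than a moderately strained one.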

We repeated the same learning procedure on the same states as earlier, but with non-linear springs $\xi<2$. While the results for $1<\xi<2$ are qualitatively similar to linear springs $\xi=2$, $\xi<1$ shows qualitatively different results: learning succeeds in stabilizing multiple states (Fig. 2b,d).

How do we understand this result? It is clear that forces due to $\xi<1$ springs diminish with strain and thus weaken the effect of strained springs that code for other states. However, the analogy with Bayesian priors goes further by explaining the sharp change in behavior at $\xi=1$ due to the non-analytic nature of $s^{\xi}$ . Following work in sparse regression Lasso , in Fig. 3b,c, we plot the energy contours for the red springs shown, where the two red springs have incompatible rest lengths. The constant energy contours are cusped for $\xi<1$ but not $\xi>1$ . At the cusps, one of the two red springs is completely unstrained while the other contains all the strain. When minimized in conjunction with other springs (dashed black contours), minima are exceedingly likely to be at cusps for $\xi<1$ , where strain is localized to one spring.

Thus, non-linear $\xi<1$ springs stabilize states with bimodal strain distributions: some springs are highly strained while others are unstrained. To complete the analogy with sparse regression, note that the energy of the system in Fig. 3 resembles Eq. 2. Let $\mathbf{F}^{ext}$ represent forces on the particle in Fig. 2a due to the black springs (assumed constant for simplicity). In the limit of small core sizes $\sigma\rightarrow 0$, the red spring energies are given by $E(r)=k(r-l)^{\xi}\equiv ks^{\xi}$, so that the total energy of the subsystem shown is,

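$$E=-\mathbf{F}^{ext}\cdot\mathbf{x}+k\sum_{\mathrm{red}}s_{a}^{\xi}=-\mathbf{F}^{ext}\cdot\mathbf{x}+k||\mathbf{s}||^{\xi}, \tag{4}$$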

where $||\mathbf{s}||^{\xi}$ is the $\xi$ -norm of the strain vector $\mathbf{s}$ for the red springs. The non-linear elastic energy has the analytic form of sparse regression, Eq. 2, and thus one of the red springs is unstrained in each stable minimum. Note that the springs now play a dual role, both providing the equation that is to be solved (the equivalent of $A\mathbf{s}=b$ in sparse regression), and providing the bias towards a bimodal strain distribution.

To test this analogy in larger elastic networks, we let an $N=100$ particle network learn two distinct states, and measured the strain in each spring after relaxing to one of the states (Fig. 3d). For non-linear springs $\xi<1$, we find a bimodal strain distribution: half of the springs are considerably strained, while the other half are at (approximately) zero strain. This result is in stark contrast to the designed minima with linear springs $\xi=2$, for which all springs are strained.
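The origin of the bimodal distribution can be seen without running any relaxation. The sketch below learns two random states with the step-function rule and evaluates every spring's strain exactly at the first learned configuration. The random states, the range `R = 0.2`, the seed, and all function names are illustrative assumptions; the paper measures strains after relaxing with $\xi<1$ springs, which this sketch does not do.

```python
import numpy as np

def learn_rods(x, R):
    """Pairs within range R get a rod whose rest length is the current separation."""
    i, j = np.triu_indices(len(x), k=1)
    r = np.linalg.norm(x[i] - x[j], axis=1)
    near = r < R
    return i[near], j[near], r[near]

def strains(x, i, j, rest):
    """Strain s = r - l of every rod when the particles sit at positions x."""
    return np.linalg.norm(x[i] - x[j], axis=1) - rest

rng = np.random.default_rng(0)
N, R = 100, 0.2
x1 = rng.random((N, 2))  # first learned state (random, for illustration)
x2 = rng.random((N, 2))  # second learned state

i1, j1, l1 = learn_rods(x1, R)
i2, j2, l2 = learn_rods(x2, R)
i = np.concatenate([i1, i2])
j = np.concatenate([j1, j2])
rest = np.concatenate([l1, l2])
from_state_1 = np.arange(len(rest)) < len(l1)

s_at_x1 = strains(x1, i, j, rest)
unstrained = np.abs(s_at_x1[from_state_1])   # rods grown in state 1
strained = np.abs(s_at_x1[~from_state_1])    # rods grown in state 2
```

At `x1` the rods grown in state 1 have zero strain by construction, while the rods grown in state 2 are generically far from their rest lengths. Whether a configuration near `x1` is actually a stable minimum of the total energy then depends on $\xi$, as discussed above.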

Optimal non-linearity

The quality of both learning and design can be quantified by the attractor size and barrier heights around stored states. Large attractors and high energy barriers allow robust retrieval of states from a larger range of initial conditions. These measures have long been used to quantify quality of learning in neural networks hertz1991introduction ; amit1985storing ; amit1985spin .

We find that quality of designed and learned states, as measured by barrier heights, is highest at distinct $\xi^{*}$ ; see Fig. 4a,b. The quality of designed states, for our simple design algorithm, is optimal for linear springs $\xi^{*}\approx 2$ and is relatively insensitive to the number of designed states. However, the optimal $\xi^{*}$ for learned states is $0<\xi^{*}<1$ and varies with the number of learned states. We find similar results by measuring attractor radius instead of barrier heights (Fig. 4c). See Supplementary Note 3.

Much as in sparse regression friedman2012fast ; majumdar2010non , the optimal $\xi^{*}$ for learning can be understood as a balance of two factors – sparsity (smaller $\xi$) and convexity (larger $\xi$). Smaller $\xi$ leads to more sharply cusped energy contours in Fig. 3c and thus a stronger bias towards bimodal strain distributions with zero strain in some springs (i.e., sparsity). However, smaller $\xi\to 0$ also leads to vanishing restoring forces outside the immediate vicinity of the unstrained configuration, creating a ‘golf course’ landscape with vanishing attractors.

Thus while smaller $\xi$ locally stabilizes each desired minimum using bimodal strain distributions, larger $\xi$ enlarges the attractor basin, making these minima easier to find. Similar considerations in canonical sparse regression problems select $\xi^{*}=1$ as an optimal choice Lasso .

The radius of spring connection $R$ plays an important role in setting the optimal $\xi^{*}$ value. We observe that the additional stabilizing contributions of the springs afforded at larger $R$ facilitate the optimal stabilization of the system at higher $\xi^{*}$, and thus with attractors of larger size, as seen in Fig. 4d (for more information see Supplementary Note 4). $L$ is the length scale of the system.

Pattern Recognition

Finally, we ask whether our learned network with large robust attractors around the learned states can perform pattern recognition. To do this, we turn to the MNIST handwritten digits database lecun2010mnist , and try to teach an elastic network to recognize the digits ‘0’ and ‘1’ from examples of these digits.

We trained the elastic network with $5000$ samples of each of the digits 0 and 1 from the MNIST database in the following way: each $400$ pixel image was interpreted as a $1$-d configuration of $400$ particles by interpreting each pixel’s gray-scale value as a particle’s position in the interval $[0,1]$. The particles in such a state are connected by elastic rods according to the learning rule in Eq. 1. For $\xi<1$, we find that the training generally creates two distinct large attractors corresponding to an idealized 0 and 1 respectively (Fig. 5d).
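The encoding of an image as a particle configuration might look like the following sketch. It assumes 8-bit gray-scale input (values 0 to 255) already cropped or resized to 400 pixels, which is not specified in this section; `image_to_configuration` and `pair_separations_1d` are hypothetical names.

```python
import numpy as np

def image_to_configuration(image):
    """Map an image to a 1-d configuration: one particle per pixel, position in [0, 1].

    Assumes 8-bit gray-scale pixel values in the range 0..255.
    """
    pixels = np.asarray(image, dtype=float).ravel()
    return pixels / 255.0

def pair_separations_1d(positions):
    """Separations r_ij = |x_i - x_j| for every unordered pair of particles."""
    i, j = np.triu_indices(len(positions), k=1)
    return i, j, np.abs(positions[i] - positions[j])
```

The separations returned by `pair_separations_1d` are the $r_{ij}$ that enter the growth rate $f(r_{ij})$ of the learning rule, so each training image adds rods between particles whose gray-scale values are close.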

We then test the network by using novel unseen examples of 0s and 1s from MNIST as initial conditions for the particles. While these test images are not identical to any particular 0 or 1 used in training, the elastic network still retrieves the correct stored 0 or 1 state. Thus the non-linear $\xi\leq 1$ elastic network learns states 0 and 1 with sufficiently large attractors to accommodate the typical handwriting variations seen in the MNIST database.

Discussion

In this work we contrasted a design and a learning framework for creating multi-stable elastic networks. We found that continually learning novel states without overwriting existing states requires a specific non-linear elasticity $\xi\leq 1$ . The learning model here relies on spontaneous growth of stabilizing rods between nearby nodes, a behavior displayed by microtubules Hess2017-gi , DNA nanotubes mohammed2013directing and other such seeded self-assembling tubes Li2005-lt ; Hartgerink1996-uc ; Blau2004-vy .

The non-linearity $\xi$ plays a unique role as a material design parameter. Most material parameters (e.g., $l_{ij}$ , $k_{ij}$ of springs here) encode information about desired states. But $\xi$ encodes an assumption about how information about desired states is distributed among parameters $l_{ij}$ , $k_{ij}$ of different springs. Learning localizes information about each state to a subset of all springs. Hence stabilizing learned states requires $\xi<1$ , establishing states in which some springs are fully relaxed even if others are highly strained, i.e., the strain profile is sparse. In this way, the non-linearity $\xi$ is mathematically analogous to Bayesian priors in statistical regression that encode assumptions about the sparse nature of solutions. However, the elastic network here goes beyond the classic sparsity problem (Eq. 2); the network has $2$ -d spatial geometry absent in Eq. 2 and is more closely related to (unsolved) sparse reconstruction of $2$ -d maps from pairwise distances between cities Montanari . Consequently, we can explore how physical parameters with no analog in Eq. 2, such as the maximum range of learned interactions $R$ (Fig. 4d) and spatial correlations between stored states, affect the optimal non-linearity $\xi$ (Supplementary Note 4).

Learning and design have complementary strengths, as seen before in neural networks and spin glasses. For example, Hopfield Hopfield introduced neural networks that can learn arbitrary novel memories in sequence using a biologically plausible ‘Hebbian’ learning rule. Gardner Gardner showed that the same model has a higher memory capacity if we assume an optimally designed network in lieu of learning. However, Gardner’s network can be designed only when all desired memories are known — and must be redesigned from scratch to include new memories.

Similarly, in materials, design might be sufficient if all desired states are known beforehand and unlimited computational power is available, since design allows optimization over all design parameters. In contrast, learning is a physically constrained exploration of the same design parameters. However, such constrained exploration can be superior when the desired behaviors are not known a priori and revealed only during use of the material itself. We hope the simple mechanical model studied here will stimulate further work on realistic learning rules that allow materials to acquire new functionalities on the fly.

Acknowledgments

We thank Miranda Cerfon-Holmes, Sidney Nagel, and Lenka Zdeborova for insightful discussions and Vincenzo Vitelli for a careful reading of the manuscript. We acknowledge NSF-MRSEC 1420709 for funding and the University of Chicago Research Computing Center for computing resources.

Bibliography

[1] Jesse L Silverberg, Jun-Hee Na, Arthur A Evans, Bin Liu, Thomas C Hull, Christian D Santangelo, Robert J Lang, Ryan C Hayward, and Itai Cohen. Origami structures with a critical transition to bistability arising from hidden degrees of freedom. Nature materials, 14(4):389, 2015.
[2] Scott Waitukaitis, Rémi Menaut, Bryan Gin-Ge Chen, and Martin van Hecke. Origami multistability: from single vertices to metasheets. Phys. Rev. Lett., 114(5):055503, February 2015.
[3] Johannes TB Overvelde, Twan A De Jong, Yanina Shevchenko, Sergio A Becerra, George M Whitesides, James C Weaver, Chuck Hoberman, and Katia Bertoldi. A three-dimensional actuated origami-inspired transformable metamaterial with multiple degrees of freedom. Nature communications, 7:10929, 2016.
[4] Sicong Shan, Sung H Kang, Jordan R Raney, Pai Wang, Lichen Fang, Francisco Candido, Jennifer A Lewis, and Katia Bertoldi. Multistable architected materials for trapping elastic strain energy. Advanced Materials, 27(29):4296–4301, 2015.
[5] Katia Bertoldi, Vincenzo Vitelli, Johan Christensen, and Martin van Hecke. Flexible mechanical metamaterials. Nature Reviews Materials, 2(11):17066, 2017.
[6] Gabi Steinbach, Dennis Nissen, Manfred Albrecht, Ekaterina V Novak, Pedro A Sánchez, Sofia S Kantorovich, Sibylle Gemming, and Artur Erbe. Bistable self-assembly in homogeneous colloidal systems for flexible modular architectures. Soft Matter, 12(10):2737–2743, 2016.
[7] Lingling Wu, Xiaoqing Xi, Bo Li, and Ji Zhou. Multi-Stable mechanical structural materials. Adv. Eng. Mater., 20(2):1700599, February 2018.
[8] Yi Yang, Marcelo A Dias, and Douglas P Holmes. Multistable kirigami for tunable architected materials. Phys. Rev. Materials, 2(11):110601, November 2018.