Linear genetic programming control for strongly nonlinear dynamics with frequency crosstalk
Ruiying Li, Bernd R. Noack, Laurent Cordier, Jacques Borée, Eurika Kaiser, Fabien Harambat

TL;DR
This paper introduces Linear Genetic Programming Control (LGPC), a model-free method that effectively exploits frequency crosstalk in nonlinear dynamics for control tasks, demonstrated on oscillator stabilization and turbulence drag reduction.
Contribution
The paper presents LGPC, a novel linear genetic programming approach for control of strongly nonlinear systems with multiple actuators and sensors, generalizing previous machine learning control methods.
Findings
LGPC successfully stabilizes a nonlinear oscillator model.
LGPC achieves 22% drag reduction in turbulence control.
LGPC exploits frequency crosstalk for optimal control.
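| Controller input | $h_1$ | $h_2$ | $h_3$ | $h_4$ | $h_5$ | $h_6$ | $h_7$ | $h_8$ | $h_9$ |
|---|---|---|---|---|---|---|---|---|---|
| Frequency $f$ (Hz) | 10 | 20 | 50 | 100 | 200 | 250 | 333 | 400 | 500 |
| $St_H = fH/U_{\infty}$ | 0.2 | 0.4 | 1 | 2 | 4 | 5 | 6.6 | 8 | 10 |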
Linear genetic programming control for strongly nonlinear dynamics
with frequency crosstalk
Ruiying Li
Institut Pprime, CNRS – Université de Poitiers – ISAE-ENSMA, Futuroscope Chasseneuil, France
Bernd R. Noack
Institut Pprime, CNRS – Université de Poitiers – ISAE-ENSMA, Futuroscope Chasseneuil, France
LIMSI-CNRS, UPR 3251, 91405 Orsay cedex, France
Technische Universität Braunschweig, Braunschweig, Germany
Technische Universität Berlin, Berlin, Germany
Laurent Cordier
Institut Pprime, CNRS – Université de Poitiers – ISAE-ENSMA, Futuroscope Chasseneuil, France
Jacques Borée
Institut Pprime, CNRS – Université de Poitiers – ISAE-ENSMA, Futuroscope Chasseneuil, France
Eurika Kaiser
University of Washington, Mechanical Engineering Department,
Seattle, WA 98195, USA
Fabien Harambat
PSA Peugeot-Citroën, Centre Technique de Vélizy
Vélizy-Villacoublay, 78943, France
Abstract
We advance Machine Learning Control (MLC), a recently proposed model-free control framework which explores and exploits strongly nonlinear dynamics in an unsupervised manner. The assumed plant has multiple actuators and sensors, and its performance is measured by a cost functional. The control problem is to find a control logic which optimizes the given cost function. The corresponding regression problem for the control law is solved by employing linear genetic programming as a simple regression solver in a high-dimensional control search space. This search space comprises open-loop actuation, sensor-based feedback and combinations thereof, thus generalizing former MLC studies Gautier et al. (2015); Parezanovic et al. (2016). This methodology is denoted as linear genetic programming control (LGPC). The focus of this study is the frequency crosstalk between the unforced unstable oscillation and the actuation at different frequencies. LGPC is first applied to the stabilization of a forced nonlinearly coupled three-oscillator model comprising open- and closed-loop frequency crosstalk mechanisms. LGPC performance is then demonstrated in a turbulence control experiment, achieving 22% drag reduction for a simplified car model. In both cases, LGPC identifies the best nonlinear control law, achieving optimal performance by exploiting frequency crosstalk. Our control strategy is suited to complex control problems with multiple actuators and sensors featuring nonlinear actuation dynamics.
I Introduction
Turbulent flow is characterized by broadband dynamics ranging from dominant frequencies corresponding to large-scale coherent structures to high frequencies corresponding to the Kolmogorov microscales. In a direct energy cascade, the energy-containing coherent structures transfer their energy to small-scale eddies, which are dissipated by viscosity. Inversely, a clustering of coherent structures may yield larger scales at lower frequencies (inverse energy cascade). Both mechanisms rely on the nonlinearity of the Navier-Stokes equations. This frequency interaction, also called frequency crosstalk, poses an important challenge and offers an opportunity for flow control: the actuation frequency may change the whole spectrum of frequencies and thus ultimately affect the mean flow.
The key role of frequency crosstalk in flow control has been reported in numerous studies. High-frequency forcing using pulsed or synthetic jets or fluidic oscillators Glezer and Amitay (2002); Cattafesta and Sheplak (2011) has been shown to stabilize the turbulent wakes of a circular cylinder Glezer et al. (2005), a car model Barros et al. (2016a), a rectangular bluff body Schmidt et al. (2015) and an axisymmetric body Oxlade et al. (2015). It has also been applied to the flow over a backward-facing step Vukasinovic et al. (2010), a mixing layer Parezanovic et al. (2016) and a jet Samimy et al. (2007). Low-frequency forcing, on the other hand, can either enhance the flow instability, as manifested by the amplified oscillation of vortex shedding Glezer et al. (2005); Barros et al. (2016b), or attenuate the instability by disrupting the formation of shedding Pastoor et al. (2008). This frequency-crosstalk effect of actuation relies on the nonlinear interactions between the high-frequency, the low-frequency and the dominant modes of the flow.
Most of the studies mentioned above use periodic forcing as the control strategy. Feedback control may increase actuation energy efficiency by adapting periodic forcing to slowly changing flow conditions Becker et al. (2007). Feedback may also react to the faster coherent structure dynamics Brunton and Noack (2015). In this case, a physics-based, model-based control logic is desirable, distilling the physical mechanism and its relation to control. In many cases, this implies that frequency crosstalk must be incorporated in the model, which constitutes a major challenge. Simple examples of such control-oriented models describe an actuation at higher or lower frequency for stabilizing the dominant vortex shedding oscillation Luchtenburg et al. (2009); Sipp (2012). In general, incorporating multiple frequency crosstalk mechanisms in a model-based control strategy constitutes a significant challenge, from both a robust modelling and a control design perspective, owing to the difficulties in the mathematical modelling of the nonlinearities and the limited knowledge of the flow. Nevertheless, model-based feedback control has enjoyed many success stories for laminar and transitional flows for which linear control theory can be applied Rowley and Williams (2006); Bagheri et al. (2009); Sipp et al. (2010); Theofilis (2011). Weakly nonlinear dynamics due to base-flow deformations are also easily incorporated in this strategy Gerhard et al. (2003); Fabbiane et al. (2017); Thirunavukkarasu et al. (2012).
In this study, we target a generic model-free control strategy for dynamics with strong nonlinearities, circumventing the challenge of constructing corresponding reduced-order models and deriving nonlinear control laws. Instead, control laws are optimized in the plant with an evolutionary algorithm. Optimal parameters of open-loop control laws may be determined with a genetic algorithm Benard et al. (2016). The considered search space includes all nonlinear feedback laws which are approximated by a finite number of mathematical operations. The point of departure is Genetic Programming Control (GPC) Duriez et al. (2016). The determination of feedback control laws is formulated as a regression problem in which the controller is optimized with respect to a given cost function. Genetic programming Koza (1992) is used as a powerful regression technique to explore and evolve effective control laws by learning from the training data of experiments or simulations. Successful applications of GPC include separation control Gautier et al. (2015); Debien et al. (2016) and mixing layer control Parezanovic et al. (2016). The innovations in this work include: (1) the use of linear genetic programming as a simpler algorithm and (2) a very general ansatz for control laws incorporating open-loop and sensor-based feedback control.
The paper is organized as follows. In Section II, we present the proposed method and its implementation. Then, in Section III, we apply LGPC (linear genetic programming control) to the stabilization of a forced nonlinearly coupled three-oscillator model comprising open- and closed-loop frequency crosstalk mechanisms. In Section IV, LGPC is applied to a turbulence control experiment, achieving 22% drag reduction for a simplified car model. A landscape of the discovered control laws is visualized in Section V to examine the topology of the search space. Section VI concludes with a summary and outlook.
II Linear genetic programming control
We consider a multiple-input multiple-output (MIMO) system with the state $\mathbf{a}\in\mathbb{R}^{N_a}$, an input vector $\mathbf{b}\in\mathbb{R}^{N_b}$ commanding actuation and an output vector $\mathbf{s}\in\mathbb{R}^{N_s}$ sensing the state. Here, $N_a$, $N_b$ and $N_s$ denote the dimension of the state, the number of actuators and the number of sensors, respectively. The general form of the system reads
$$\frac{d\mathbf{a}}{dt}=\mathbf{F}(\mathbf{a},\mathbf{b}),\qquad \mathbf{s}=\mathbf{G}(\mathbf{a}).\qquad (1)$$
The control $\mathbf{b}$ directly affects the state $\mathbf{a}$ through a general nonlinear propagator $\mathbf{F}$. $\mathbf{G}$ is a measurement function giving the sensor signals as a function of the state $\mathbf{a}$. The control objective is to construct a MIMO controller $\mathbf{b}=\mathbf{K}(\mathbf{s})$ so that the system has a desirable behaviour. Most control objectives can be formulated in terms of a cost function $J$. The definition of $J$ depends on the control goal. For instance, in a drag reduction problem, we define $J$ as the drag power penalized by the actuation power.
Following Duriez et al. (2016), the control design is formulated as a regression problem: find the control law $\mathbf{b}=\mathbf{K}(\mathbf{s})$ which optimizes a given cost function $J$. The cost depends only on the control law, or, symbolically, $J(\mathbf{K})$, for a well-defined initial value problem or a statistically stationary actuation response. In summary, the control task is transformed into an optimization problem via cost minimization and is equivalent to finding $\mathbf{K}^{\mathrm{opt}}$ such that
$$\mathbf{K}^{\mathrm{opt}}=\arg\min_{\mathbf{K}} J(\mathbf{K}).\qquad (2)$$
The sensor-feedback law maps sensor signals onto actuation commands. Such feedback can be expected to be approximated by a finite number of elementary operations acting on the sensor signals and a finite number of fixed constants. Thus, the search space of permissible control laws is finite, yet of astronomical cardinality. Hence, exhaustive testing in an experiment or numerical calculation is not an option. Instead, we employ genetic programming (GP) as a powerful evolutionary search algorithm. GP yields optimal or near-optimal control laws in the search space with high probability for suitable parameters, yet with no mathematically assured convergence. The original tree-based genetic programming (TGP) formulates the mapping by a binary tree structure Koza (1992). Here, we propose to apply a more recent alternative to TGP, called linear genetic programming (LGP) Brameier and Banzhaf (2007). TGP and LGP are equivalent in the sense that any LGP law can be expressed in TGP and vice versa. The difference is the linear versus recursive coding of LGP and TGP, respectively. LGP is much easier to code and implement in systems with multiple actuators and multiple sensors. As stated above, we refer to this method as linear genetic programming control (LGPC). For details of LGPC, see Li et al. (2016).
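To make the "linear" coding concrete, the following minimal sketch evaluates a control law stored as a flat list of instructions acting on a register bank. The instruction layout `(dst, op, src1, src2)`, the register ordering, the operator table `OPS` and the name `run_linear_program` are illustrative assumptions, not the encoding used by the authors.

```python
import math

# Hypothetical operator table; the operation set of a real study may differ.
OPS = {
    "+": lambda x, y: x + y,
    "-": lambda x, y: x - y,
    "*": lambda x, y: x * y,
    "sin": lambda x, _y: math.sin(x),    # unary: second operand is ignored
    "tanh": lambda x, _y: math.tanh(x),  # unary: second operand is ignored
}

def run_linear_program(instructions, sensors, constants, n_outputs=1):
    """Evaluate a linear program and return the actuation commands.

    Assumed register layout: [outputs | sensors | constants].
    Each instruction (dst, op, src1, src2) overwrites register dst
    with op(register[src1], register[src2]), executed in sequence.
    """
    registers = [0.0] * n_outputs + list(sensors) + list(constants)
    for dst, op, src1, src2 in instructions:
        registers[dst] = OPS[op](registers[src1], registers[src2])
    return registers[:n_outputs]

# Example law b1 = 3*s1 - s2 written as two sequential instructions.
# Register indices: 0 = b1, 1 = s1, 2 = s2, 3 = constant 3.
program = [(0, "*", 3, 1), (0, "-", 0, 2)]
print(run_linear_program(program, sensors=[2.0, 1.0], constants=[3.0]))
```

Adding an actuator in this layout only adds an output register, which is one way to read the remark that the linear coding is convenient for multiple actuators and sensors.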
The implementation of LGPC for feedback control is sketched in Fig. 1. The fast real-time control occurs in the inner loop with a control law proposed by LGPC. The control law is evaluated in the dynamical system over a prescribed evaluation time. Then, a cost is measured, quantifying the performance of the control law. The cost value for each control law is sent to the slow outer learning loop, where LGPC evolves these laws.
The learning process of LGPC is detailed in the lower part of Fig. 1. An initial population of control law candidates, called individuals, is generated randomly as in a Monte Carlo method (see Sec. 3.3 in Li et al. (2016)). Each individual is evaluated in the inner loop and a cost is attributed to it. After the whole generation is evaluated, its individuals are sorted in ascending order of cost. The next generation of individuals is then evolved from the previously evaluated one by genetic operators (elitism, replication, mutation and crossover). Elitism is a deterministic process which copies a given number of top-ranking individuals directly to the next generation. This ensures that the next generation will not perform worse than the previous one. The remaining genetic operations are stochastic in nature and have specified selection probabilities. The individual(s) used in these genetic operators is (are) selected by a tournament process: randomly chosen individuals compete in a tournament and the winner(s), i.e. those with the lowest cost, is (are) selected. Replication copies a statistically selected number of individuals to the next generation. Thus, better performing individuals are memorized. Crossover involves two statistically selected individuals and generates a new pair of individuals by randomly exchanging their instructions. This operation contributes to breeding better individuals by searching the space around well-performing individuals. In the mutation operation, random elements in the instructions of a statistically selected individual are modified. Mutation serves to explore potentially new and better minima of the cost function. After the new generation is filled, it is evaluated in the plant. This learning process continues until a stopping criterion is met. Different criteria are used. Ideally, the process is stopped when a known global minimum is obtained (which is unlikely in an experiment). Alternatively, the evolution terminates when the improvement from one generation to the next becomes too slow or when a predefined maximum number of generations is reached. By definition, the targeted optimal control law is the best individual of the last generation.
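The generational step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the callback signatures of `mutate` and `crossover`, and the default tournament size of 7 are hypothetical. The operator probabilities (10% replication, 60% crossover, 30% mutation) and the single elite individual are the values quoted later in Section III.3.

```python
import random

def tournament(population, costs, size, rng):
    """Return the lowest-cost individual among `size` randomly drawn contestants."""
    contestants = rng.sample(range(len(population)), size)
    winner = min(contestants, key=lambda i: costs[i])
    return population[winner]

def next_generation(population, costs, mutate, crossover, rng,
                    n_elite=1, p_replication=0.1, p_crossover=0.6,
                    tournament_size=7):
    """Evolve one generation by elitism, replication, crossover and mutation.

    `mutate(ind, rng)` returns one new individual and
    `crossover(a, b, rng)` returns a pair (hypothetical interfaces).
    Mutation takes the remaining probability 1 - p_replication - p_crossover.
    """
    n = len(population)
    ranked = sorted(range(n), key=lambda i: costs[i])          # ascending cost
    new_population = [population[i] for i in ranked[:n_elite]]  # elitism
    while len(new_population) < n:
        draw = rng.random()
        if draw < p_replication:
            new_population.append(
                tournament(population, costs, tournament_size, rng))
        elif draw < p_replication + p_crossover:
            parent_a = tournament(population, costs, tournament_size, rng)
            parent_b = tournament(population, costs, tournament_size, rng)
            new_population.extend(crossover(parent_a, parent_b, rng))
        else:
            parent = tournament(population, costs, tournament_size, rng)
            new_population.append(mutate(parent, rng))
    return new_population[:n]  # crossover may overshoot by one individual
```

Because the elite individual is copied unchanged, the best cost of the new generation cannot exceed that of the previous one, provided the cost evaluation is repeatable.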
LGPC can also be used to explore open-loop control by including time-periodic functions $\mathbf{h}$ in the inputs of the control law, i.e. $\mathbf{b}=\mathbf{K}(\mathbf{h})$. This permits a search for much more general multi-frequency control, which is hardly accessible to a parametric study of a single frequency. Furthermore, the range of LGPC can be extended by including both the sensors $\mathbf{s}$ and the time-periodic functions $\mathbf{h}$ in the inputs of $\mathbf{K}$. This results in a non-autonomous control law $\mathbf{b}=\mathbf{K}(\mathbf{s},\mathbf{h})$. This generalization permits a selection between open-loop actuation $\mathbf{b}=\mathbf{K}(\mathbf{h})$, sensor-based feedback $\mathbf{b}=\mathbf{K}(\mathbf{s})$ or combinations thereof, depending on which performs better. In the following, we term the approach optimizing open-loop frequency combinations LGPC-1. The approach optimizing autonomous controllers is referred to as LGPC-2. The generalized non-autonomous control design is denoted as LGPC-3.
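The three variants differ only in what is fed to the control law. A minimal sketch of this input assembly is given below; the function name `controller_inputs` and the choice of sine functions $\sin(2\pi f t)$ as the time-periodic inputs are assumptions made for illustration.

```python
import math

def controller_inputs(variant, t, sensors=(), frequencies=()):
    """Assemble the input vector of the control law for one LGPC variant.

    The harmonic inputs are assumed to be sin(2*pi*f*t) for each
    frequency f; the sensor signals are passed through unchanged.
    """
    harmonics = [math.sin(2.0 * math.pi * f * t) for f in frequencies]
    if variant == "LGPC-1":   # open-loop multi-frequency forcing
        return harmonics
    if variant == "LGPC-2":   # autonomous sensor-based feedback
        return list(sensors)
    if variant == "LGPC-3":   # non-autonomous: sensors and harmonics
        return list(sensors) + harmonics
    raise ValueError(f"unknown variant: {variant}")

# LGPC-3 sees both the sensor signals and the harmonic functions.
print(controller_inputs("LGPC-3", t=0.0, sensors=[1.0, 2.0], frequencies=[10.0]))
```

Since the LGPC-3 input vector contains the inputs of both other variants, its search space includes every LGPC-1 and LGPC-2 law.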
III Model of three coupled oscillators
In this section, we use LGPC to stabilize a forced dynamical system with three nonlinearly coupled oscillators at three incommensurable frequencies, extending the generalized mean-field model Luchtenburg et al. (2009) (see Chapter 5 of Duriez et al. (2016)). The goal is to stabilize the first unstable, amplitude-limited oscillator, while the forcing is applied to the second and third oscillators (see Fig. 2). The second oscillator also has unstable, amplitude-limited dynamics and destabilizes the first oscillator. The third oscillator has linearly stable dynamics and has a stabilizing effect on the first. The stabilization of the first oscillator can be achieved by closed-loop suppression of the second oscillator or open-loop excitation of the third one. In the following, we formulate the control problem mathematically (Section III.1), parametrically explore the effect of periodic forcing as in many turbulence control experiments (Section III.2), and apply LGPC (Section III.3).
III.1 Problem formulation
The system has three oscillators at frequency , and , the coordinates of which being , and , respectively. The evolution equation of the state reads:
[TABLE]
where , and denote the fluctuation level of the three oscillators, respectively. The growth rate for each oscillator is denoted by . Without forcing , the first and second system are linearly unstable and damped by a Landau-type cubic term to asymptotic amplitudes . Here, and in the following, the superscript ‘’ refers to asymptotic values for unforced dynamics. The third system is linear and stable, i.e. converges to the vanishing amplitude . The forcing is only applied on the second and third oscillators. A linearization of Eqs. (3) around the fixed point yields a system in which the first oscillator is uncontrollable.
The effect of the forcing on the first oscillator can be inferred from the growth rate formula for (see first column in Eqs. (3)). The fluctuation level of the second system destabilizes the first oscillator, while the third system stabilizes it with increasing fluctuation level . Hence, stabilization of the first oscillator may be achieved by exploiting one of two frequency crosstalk mechanisms: stabilizing the second system or exciting the third one. Evidently stabilization of the second system requires feedback while excitation of the stable oscillator can be performed with periodic forcing at the resonance frequency and sufficiently large amplitude .
The cost function to be minimized is the averaged energy of the unstable oscillator penalized by the actuation cost . Here, the temporal averaging is indicated by the overbar. Without forcing, and . We normalize the total cost by the unforced value of the first oscillator to characterize the relative benefit of actuation:
[TABLE]
with as the penalization coefficient. By definition, for the unforced system.
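As an illustration of how such a normalized cost could be evaluated from simulation output, the sketch below averages sampled time series. The specific forms used here are assumptions: the oscillator energy is taken as $a_1^2+a_2^2$, the actuation cost as $b^2$, the penalization coefficient is called `gamma`, and the samples are assumed equally spaced so that the time average reduces to an arithmetic mean.

```python
def normalized_cost(a1, a2, b, gamma, unforced_energy):
    """Time-averaged energy of the first oscillator plus penalized actuation,
    normalized by the unforced energy of the first oscillator.

    a1, a2, b are equally spaced samples over the evaluation window
    (assumed forms: energy = a1^2 + a2^2, actuation cost = b^2).
    """
    n = len(a1)
    energy = sum(x * x + y * y for x, y in zip(a1, a2)) / n
    actuation = sum(u * u for u in b) / n
    return (energy + gamma * actuation) / unforced_energy

# Without forcing, the cost equals one when the energy matches the unforced value.
print(normalized_cost([1.0, 0.0], [0.0, 1.0], [0.0, 0.0],
                      gamma=1.0, unforced_energy=1.0))
```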
The numerical evaluation of is based on the integration of the dynamical system (3) with the initial condition at . In the first 10 periods of the target oscillator, i.e. for with , no forcing is applied and the system converges to unforced quasi-periodic dynamics , , . The cost functional is evaluated in the next 500 periods, . This time interval contains an actuated transient but is dominated by the post-transient dynamics, i.e. sufficient for statistical averaging.
III.2 Open-loop periodic forcing
First, open-loop periodic forcing is studied, following a practice of many turbulence control experiments. The goal is to minimize the cost function Eq. (4) with periodic forcing employing a parametric variation of the amplitude and frequency in the range of and . respectively. The performance (Eq. (4)) at amplitude and frequency is scanned with increments and , respectively. The corresponding colormap of is shown in Fig. 3. This figure displays a local minimum of . The corresponding parameters are denoted by the superscript ‘’ in the following. The low value indicates a stabilization by over one order of magnitude in the fluctuation level, accounting for the actuation expense. The minimum is reached at the eigenfrequency of the third oscillator , as for , numerically observing that the second oscillator is hardly affected by the forcing at a non-resonant frequency, . The optimal amplitude is numerically determined as the best trade-off between the achieved stabilization and actuation cost. This amplitude leads to and . For a larger time evaluation horizon, the current results suggest a better performance at lower actuation leading to which just neutrally stabilizes the first oscillator , exploiting that the second oscillator is unaffected by forcing. The corresponding analytical approximations are described in Chapter 5 of Duriez et al. (2016).
On the other hand, the maximal value is associated with the forcing at the eigenfrequency of the second oscillator , as the excitation of leads to , resulting in an increase of . These results show that the enabler of open-loop control is the third oscillator rather than the second.
The unforced transient and actuated dynamics of the system are illustrated in Fig. 4 under the optimal periodic forcing . The unforced state during the time window is depicted by a blue dashed line and the forced one at by a red curve. For clarity, only the first 110 periods are shown in Fig. 4 (a-d). Fig. 4 (e,f) covers the whole time interval . When unforced, the unstable oscillators self-amplify towards the limit cycle , whilst the stable oscillator vanishes to . Convergence is implied by and . Once starts at , is rapidly excited to an energy level of , while keeps its original fluctuation level . The resulting system yields which leads consequently to the stabilization of , i.e. . The phase portraits in Fig. 4(e) and (f) illustrate the interactions between different oscillators. The circle indicates the initial point and the arrows the time direction. The forced trajectories represent low-pass filtered data, i.e. do not resolve cycle-to-cycle variation. In particular, Fig. 4(f) shows clearly that decreases with the increase of , corroborating that a high-frequency forcing stabilizes a low-frequency unstable oscillator via frequency crosstalk.
III.3 Results of LGPC
LGPC is applied to solve the control problem of Section III.1. For all LGPC tests, up to generations with individuals in each are evaluated. Hereafter, we denote the cost value of the th individual in the th generation by . After the individuals are generated, each is pre-evaluated based on the state of the unforced system. The resulting actuation command is an indicator of its feedback control performance. If the pre-evaluation yields no actuation, the individual cannot change the unforced state. Consequently, it is not tested and is assigned a high cost value. This pre-evaluation step saves numerical testing time.
The parameters of linear genetic programming are similar to those of most GPC studies (see, e.g. the textbook Duriez et al. (2016)). Elitism is set to , i.e. the best individual of a generation is copied to the next one. The probabilities for replication, crossover and mutation are 10%, 60% and 30%, respectively. The individuals on which these genetic operations are performed are determined from a tournament selection of size . The instruction number in the initial generation is selected between to with a uniform probability distribution. In the following generations, the maximum instruction number for each individual is capped by . Elementary operations comprise and . The operation and are protected, i.e. the absolute value of the denominator of is set to when . Similarly, is modified to where is set to when . In addition, we choose six random constants in the range with uniform probability distribution.
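Protected operations of the kind described above can be sketched as follows. The threshold `EPS = 1e-2` is an assumed placeholder value, and the use of the natural logarithm is likewise an assumption; the function names are hypothetical.

```python
import math

EPS = 1e-2  # assumed protection threshold, not taken from the study

def protected_div(x, y, eps=EPS):
    """Divide x by y; when |y| < eps, |y| is replaced by eps (sign of y kept)."""
    if abs(y) < eps:
        y = math.copysign(eps, y)
    return x / y

def protected_log(x, eps=EPS):
    """Return log(|x|), with |x| replaced by eps when |x| < eps."""
    return math.log(max(abs(x), eps))

# Both operations stay finite at the singular point x = 0.
print(protected_div(1.0, 0.0), protected_log(0.0))
```

Protection of this kind keeps every randomly generated control law evaluable, so no individual has to be discarded because of a division by zero or the logarithm of a non-positive number.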
In the following, we introduce successively the results of open-loop multi-frequency forcing LGPC-1 (Section III.3.1), full-state feedback control LGPC-2 (Section III.3.2) and non-autonomous control LGPC-3 (Section III.3.3).
III.3.1 LGPC-1
First, we generalize the open-loop control by including the best periodic forcing at all eigenfrequencies, i.e. where . This approach, called LGPC-1, contains the best periodic forcing frequency , so it should be at least as good as the optimal periodic forcing . Figure 5 displays the ‘spectrogram’ of the cost values for the whole collection of control laws.
Each generation spans a large range of cost values. The decrease of these values towards the bottom right with increasing generation number evidences the learning of increasingly better control laws. The best cost value of each generation is highlighted by a red line. The best individual () in the last generation () reads
[TABLE]
Here, and in the following, the superscript ‘’ refers to LGPC-1. When applying a first order approximation on , we get . This expression resembles that of the optimal periodic forcing , and leads to a slightly better cost as a better amplitude with a higher precision is explored by LGPC-1. The dynamics of the system with are similar to Fig. 4 and are not shown here for brevity.
If we increase the precision of to 0.001 in the parameter scan of the periodic forcing in Section III.2, we should find the same result. However, the number of evaluations rises to ( and being the number of amplitudes and frequencies to be tested, respectively), which is 16 times that of LGPC-1, which equals . In summary, LGPC-1 automatically identifies the optimal frequency and the optimal amplitude in less time than an exhaustive parameter sweep of periodic forcing.
III.3.2 LGPC-2
Next, an autonomous full-state feedback law (LGPC-2) is optimized,
[TABLE]
The ‘spectrogram’ of the cost values is shown in Fig. 6. The successive jumps of the best cost value for each generation (red line) reflect the evolution process to better individuals.
The targeted LGPC-2 feedback law, i.e. the best individual in the last generation, reads as follows:
[TABLE]
Here, and in the following, the superscript ‘’ refers to LGPC-2. The corresponding cost is more than seven times lower than the value achieved with optimal open-loop control . Closed-loop control leads to both a smaller fluctuation level and a lower actuation energy . The corresponding dynamics are depicted in Fig. 7.
Instead of the regular excitation of periodic forcing, Fig. 7 (a) shows that gives a strong initial ‘kick’ on the system by exciting the third oscillator to a high energy level of (see Fig. 7 (d), (f) and (g)), while simultaneously stabilizing the second oscillator, (see Fig. 7 (c) and (f)). The first oscillator exhibits consequently a fast decay as has decreased to due to the change in and (see Fig. 7 (b), (e) and (g)). This fast transient takes about one period , see the close view of forcing in Fig. 7 (a). It should be emphasized that LGPC-2 discovers and exploits both frequency crosstalk mechanisms, the excitation of the third oscillator for a quick transient and the suppression of the second oscillator to sustain the low fluctuation level of the target dynamics.
Following this fast transient, the first and second oscillators enter a quasi-stable state at nearly vanishing fluctuation levels. Subsequently, the control command vanishes, as full-state feedback has no need to actuate once the oscillation energy has been suppressed. With vanishing , the third oscillator decays exponentially fast. This transient process converges to the fixed point as depicted in Fig. 7 (f) and (g). Now, the first oscillator has a stabilizing growth rate . LGPC-2 thus provides an example in which feedback control outperforms open-loop control. With only a tiny investment of actuation energy at the very beginning of the control, the whole system remains stabilized without actuation even after thousands of periods.
It should be noted that closed-loop control is not necessarily better than open-loop actuation. Suppose the growth-rate of the first oscillator reads
[TABLE]
In this case, exciting the third oscillator is the only effective stabilizing mechanism and this excitation can already be done with open-loop forcing.
III.3.3 LGPC-3
Finally, we explore a more general class of control laws which combines full-state feedback and the best periodic forcing at all eigenfrequencies , as discussed in Section II. The generalized LGPC-3 control law thus includes the pure full-state feedback and the best periodic forcing frequency . Hence, it should be at least as good as LGPC-2. The learning process is similar to Fig. 6, so the convergence of the cost values is not shown here for brevity. The optimal control law from LGPC-3 reads
[TABLE]
Here, and in the following, the superscript ‘$\bullet$’ refers to LGPC-3 results. This control law achieves a better cost value than LGPC-1 with similar dynamics. Hence, the results are not detailed here to avoid redundancy. It is worth noting that Eq. (8) can also be expressed as $b^{\bullet}=K_{1}\big(3a_{2}h_{1}h_{3}-a_{4}\big)$ where represents the operator ‘’. To shed light on the contribution of each term to , Fig. 8 displays the temporal evolution of the actuation command and the relevant input from the states and from the harmonic functions. It shows that the harmonic component destabilizes the stable oscillator by a quasi-periodic forcing while the states and act as an amplitude regulator.
To summarize, optimal periodic forcing (PF), open-loop multi-frequency forcing (LGPC-1), full-state feedback (LGPC-2), and generalized feedback (LGPC-3) are compared. The contributions to the cost function are depicted in Fig. 9, showing that the generalized feedback outperforms optimal periodic forcing and full-state feedback. The stabilizing mechanisms are schematically depicted in Fig. 10.
IV Drag reduction using LGPC
In this section, we apply LGPC to a turbulence control experiment targeting the drag reduction of a simplified car model. Given that the drag of a ground vehicle is dominated by pressure drag, we aim to increase the base pressure and thus reduce the drag. For that, active control is applied on the wake flow using fluidic actuators. In the following, the experimental setup is presented in Section IV.1. The implementation and results of LGPC are discussed in Section IV.2. Section IV.3 illustrates the effect of the optimal forcing on the near wake dynamics.
IV.1 Experimental setup
A sketch of the experimental setup is shown in Fig. 11. The experiment is performed in a closed-circuit wind tunnel, the test section of which is 2.4 m × 2.6 m × 6 m. The model is similar to the square-back Ahmed body Ahmed et al. (1984) and has the following dimensions: height $H=0.297$ m, width $W=0.350$ m and length 0.893 m. The ground clearance is set to 0.05 m as in Ahmed et al. (1984).
The experiment is conducted with a constant free-stream velocity $U_{\infty}=15$ m s$^{-1}$, corresponding to a Reynolds number $Re_{H}=U_{\infty}H/\nu=3\times 10^{5}$. The wake is manipulated by pulsed jets emerging parallel to the free stream through the slits immediately beneath the trailing edges (see Fig. 11(a) and (b)). The slit thickness is $h_{\text{slit}}=1$ mm $\approx 0.003H$. In addition, a rounded surface of radius $9h_{\text{slit}}$ is installed immediately beneath each slit as an additional passive device. The pulsed jets are driven by solenoid valves working in the frequency range $f\in[0,500]$ Hz, and are fed by a plenum connected to the lab pressurized air supply. The actuation command is binary: the valves are either closed or open. The flow is monitored by 16 pressure sensors distributed over the base surface, 12 of which are used as feedback sensors, see Fig. 11(c). Particle Image Velocimetry (PIV) is performed to capture the flow dynamics in the near wake and to identify the control effects. The measured plane is the vertical (normal to the ground) symmetry plane downstream of the base. The first- and second-order statistics of the streamwise and cross-stream velocity components are computed from 1000 images with a spatial resolution of 0.8% of the model’s height. For more details on the experimental setup, see Barros et al. (2016a).
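The nondimensional numbers of this setup can be checked with a few lines of arithmetic. In the sketch below, the kinematic viscosity of air, `nu = 1.5e-5` m²/s, is an assumed typical value that is not quoted in the text, and the Strouhal number based on the model height, $St_H = fH/U_\infty$, is an assumed nondimensionalization of the actuation frequency.

```python
def reynolds_number(u_inf, height, nu):
    """Reynolds number based on model height: Re_H = U_inf * H / nu."""
    return u_inf * height / nu

def strouhal_number(frequency, height, u_inf):
    """Strouhal number based on model height: St_H = f * H / U_inf."""
    return frequency * height / u_inf

U_INF = 15.0    # free-stream velocity in m/s (from the text)
HEIGHT = 0.297  # model height H in m (from the text)
NU = 1.5e-5     # kinematic viscosity of air in m^2/s (assumed value)

print(reynolds_number(U_INF, HEIGHT, NU))        # close to 3e5
print(0.001 / HEIGHT)                            # slit thickness over H, about 0.003
print(strouhal_number(500.0, HEIGHT, U_INF))     # upper end of the valve range
```

With these values, the valve frequency range of 0 to 500 Hz corresponds to Strouhal numbers up to about 10.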
IV.2 Results of LGPC
In the following, we apply LGPC on the plant for the purpose of increasing the base pressure. We define the cost functional as
[TABLE]
where and represent the time- and area-averaged base pressure coefficients in the actuated and unforced flow, respectively. For estimating these quantities, all the pressure sensors in the base surface are used. By definition, for the unforced flow. () represents the increase (decrease) of the base pressure.
The optimal periodic forcing is found at with duty cycle , resulting in and increasing the base pressure by 33%. This result is taken as the benchmark. The included sensors are , where is the fluctuating component of the th pressure sensor signal. As the control command is binary, we apply a Heaviside function H to transform the continuous output of a control law to a binary output, i.e. where , otherwise. The control law is evaluated for a time period of $10\,\mathrm{s}$. This value is approximately 500 convective time units defined by . This period has been found to be sufficient for good statistical accuracy Barros (2015).
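The binarization step and the length of the evaluation window can be illustrated with a minimal sketch. Two assumptions are made because the corresponding symbols are missing here: the Heaviside step is placed at zero, and the convective time unit is taken as $H/U_{\infty}$, which reproduces the quoted value of roughly 500. The control law `example_law` is purely hypothetical.

```python
def heaviside(x):
    """Return 1 for positive x and 0 otherwise (step location at zero is an assumption)."""
    return 1 if x > 0 else 0


def binarize(control_law, inputs):
    """Map the continuous output of a control law to a binary valve command."""
    return heaviside(control_law(*inputs))


def example_law(s1, s2):
    """Hypothetical continuous control law acting on two sensor fluctuations."""
    return s1 - 0.5 * s2


U_INF = 15.0    # free-stream velocity in m/s (from the text)
H = 0.297       # model height in m (from the text)
T_EVAL = 10.0   # evaluation time of one control law in s (from the text)

# Evaluation time expressed in ASSUMED convective units H / U_inf.
convective_units = T_EVAL * U_INF / H

print(binarize(example_law, (1.0, 0.5)))   # continuous output 0.75 gives command 1
print(binarize(example_law, (0.1, 1.0)))   # continuous output -0.4 gives command 0
print(round(convective_units))
```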
First, we explore the open-loop multi-frequency control (LGPC-1) optimizing the frequency combination. Let comprise 9 harmonic functions listed in Table 1. In this case, the control law reads . Up to generations with individuals in each are evaluated. We stop at the fourth generation because half of the individuals have similar values near the optimal one.
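As an illustration of the open-loop input library, the sketch below builds nine harmonic functions and checks the two rows of Table 1 against each other. It assumes that the first row lists frequencies in Hz, that the second row is the Strouhal number based on $H$ and $U_{\infty}$, and that the harmonics are sines. The law `example_law` is hypothetical and is not the optimum found by LGPC-1.

```python
import math

# Table 1 values as they appear in this copy; interpreting them as
# frequency in Hz and Strouhal number f * H / U_inf is an assumption.
FREQS_HZ = [10, 20, 50, 100, 200, 250, 333, 400, 500]
TABLE_ST = [0.2, 0.4, 1, 2, 4, 5, 6.6, 8, 10]

U_INF = 15.0   # free-stream velocity in m/s (from the text)
H = 0.297      # model height in m (from the text)


def strouhal(freq_hz):
    """Strouhal number based on model height and free-stream velocity."""
    return freq_hz * H / U_INF


def harmonic_inputs(t):
    """Evaluate the nine harmonic inputs at time t (sine form is an assumption)."""
    return [math.sin(2.0 * math.pi * f * t) for f in FREQS_HZ]


def example_law(t):
    """Hypothetical binary multi-frequency law combining two library harmonics."""
    h = harmonic_inputs(t)
    return 1 if h[1] + h[4] > 0 else 0


for f, st in zip(FREQS_HZ, TABLE_ST):
    print(f"f = {f:3d} Hz   St_H = {strouhal(f):5.2f}   (table: {st})")
```

Under these assumptions the computed Strouhal numbers agree with the tabulated ones to within about 1%.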
The optimal control law reads:
[TABLE]
The resulting cost beats the optimized periodic forcing, leading to 35% base pressure recovery associated with 22% drag reduction. The actuation energy defined by the time-averaged momentum of the pulsed jets is about 7% for both control laws. The optimal control law contains two frequencies, indicating that LGPC-1 has found a multi-frequency forcing which outperforms the reference periodic forcing.
The results for LGPC-2, , have been discussed in an earlier study Li et al. (2016) and are not shown here. Intriguingly, LGPC-2 provides a sensor optimization by reproducibly selecting only one sensor near the centre of the bottom edge in the optimal control law. The corresponding control emulates the optimal high-frequency periodic forcing but performs slightly worse (). A similar observation has been made for stabilization of the mixing layer Parezanović et al. (2015), where optimized high-frequency periodic forcing has outperformed GPC-optimized sensor-based feedback in stabilizing the flow. At high frequencies, time delays and noise in sensor-based feedback give rise to low-frequency actuation components which are detrimental to the cost function. We could even change the dynamical system (3) to have an unbeatable periodic forcing, as discussed at the end of Section III.3.2.
Finally, a test of the generalized non-autonomous control LGPC-3 is performed by combining the sensors and the optimal harmonic forcing , i.e. . LGPC-3 converges quickly to the optimal periodic forcing . The finding is in agreement with the LGPC-2 result where the optimal control emulates the optimal periodic forcing but is slightly worse. LGPC-3 prefers the optimal periodic forcing to the sensor feedback. Given these results, we do not pursue LGPC-3 with multiple frequencies in this experiment. We assume the result would be the same as with LGPC-1.
In summary, LGPC identifies an open-loop multi-frequency forcing as the best control for drag reduction. The underlying dynamics will be presented in the following section. Note that this control has been identified by testing only 200 individuals in less than one hour. The required optimization time is less than that for finding the best frequency and duty cycle for the periodic reference with a thorough parameter scan.
IV.3 Near wake dynamics of LGPC-1
In this section, we investigate the impact of the best control from LGPC-1 on the near wake dynamics. To illustrate the actuation characteristics of , Fig. 12 displays (a) its phase-averaged jet velocity over one period and (b) its power spectral density . The results of are also presented for comparison. Intriguingly, exhibits a multi-frequency dynamic, showing two frequencies at and , respectively.
It has been reported that forcing at frequencies several times that of the natural vortex shedding can stabilize the wake dynamics by inducing large dissipation and inhibiting the entrainment of fluid into the recirculation region Barros et al. (2016a); Oxlade et al. (2015). Here, LGPC-1 exploits similar actuations in an unsupervised manner. The actuation frequencies in are one order of magnitude larger than the natural vortex shedding frequency . The impact of the actuation on the wake dynamics can be further inferred from the base pressure fluctuation. We use the area-averaged base pressure coefficient as a global indicator of the dynamics. Figure 12(c) compares the spectral energy of for the unforced and optimal forced flow, where represents its power spectral density. The high-frequency forcing has two major effects: (1) it significantly excites the frequencies over , and (2) it suppresses a range of frequencies below . The high level of energy around in the unforced flow is associated with the bubble pumping frequency, which is induced by an axial oscillation of the recirculation bubble Berger et al. (1990). It seems that the damping of this pumping mode contributes to reducing the drag. The benefit in drag reduction from the suppression of this mode has also been observed in Khalighi et al. (2001). This result is a good illustration of the frequency crosstalk between low and high frequencies, and corroborates the mechanisms proposed in Oxlade et al. (2015).
Now, we focus on the effects of the best LGPC-1 control on the wake dynamics identified from the PIV measurements. Figure 13 shows the color map of the time-averaged velocity norm overlapped with 2D streamlines (a, b) and 2D estimation of the turbulent kinetic energy (c-f) for the baseline (a,c,e) and controlled flow (b,d,f). and represent the time-averaged streamwise and cross-stream velocity, respectively. and are their corresponding velocity fluctuations. The values of these quantities are normalized by .
The mean wake of the baseline flow consists of two counter-rotating structures with very low velocity inside, leading to a recirculating bubble extending up to , where denotes the bubble length. The upper recirculating structure dominates the wake and results in an asymmetry in the cross-stream direction. The distribution of is concentrated in the shear layers, indicating its important role in the wake dynamics. In addition, higher values of are noticeable at the lower shear layer near the ground, which corroborates the asymmetry observed above. Such asymmetry is ascribed to the perturbation introduced by the presence of the ground.
The forcing induces significant changes in the wake. First, the shear layers are highly deviated toward the model base, resulting in a thinner and shorter recirculation bubble, the length of which is , reduced by 25% compared with the baseline flow. The vectorization of the shear layer is highlighted in Fig. 14(a) by the velocity angle of the streamline emerging from the point located near the upper separating edge. The angle variation immediately downstream of the trailing edge () indicates that there is a reversal in the sign of streamline curvature. This modification of curvature results in a local rise in base pressure. Second, the vectorization of shear layers is accompanied by an overall reduction of turbulent kinetic energy inside the recirculation bubble, which can be qualitatively observed in Fig. 13(d) and (f). Following the analyses in Barros et al. (2016a), we quantify the modification of the wake dynamics by evaluating the streamwise evolution of the integral of the turbulent kinetic energy and averaged kinetic energy inside the domain defined as follows:
[TABLE]
[TABLE]
The results are shown in Fig. 14 (b) and (c). We observe an overall reduction of in the forced flow from , indicating an attenuation of the fluctuating dynamics in the wake. In particular, the significant reduction of near the end of the mean recirculating bubble is believed to be linked with the very strong damping of the low frequency dynamics observed in Fig. 12(c). A decrease of is discernible very close to the base () and further downstream . Between these two bounds, there is a slight increase of . To gain insights into this evolution, we present separately the contribution of streamwise velocity and cross-stream velocity to . The decrease of in the range is directly related to the reduction of near the base, indicating that the upward flow adjacent to the base is less energetic in the forced flow. Further downstream, increases compared with the baseline flow. In fact, the prominent deviation of the bubble boundary pushes the flow toward the inner wake and thus increases the absolute value of cross-stream velocity. Correspondingly, we observe an increase of in the range . Beyond , the decrease of is attributable to the reduction of . The overall attenuation of indicates that the streamwise motion of the reversed flow is reduced by the forcing.
These observations show that a base pressure recovery is associated with: (1) the modification of streamline curvature which narrows and shortens the bubble and (2) the stabilization of the wake induced by the enhanced interaction of the small- and large-scale structures due to the high-frequency forcing. These mechanisms are consistent with the results in Barros et al. (2016a) except that they did not observe a shorter bubble. This difference is related to the actuation parameters. We actuate at a lower frequency and higher amplitude, yielding a higher angle deviation which is responsible for reducing the bubble length.
V Visualization of control laws
In this section, we illustrate the control laws and cost function values by an easily interpretable ’topological landscape’, generalizing earlier work Kaiser et al. (2017). First (Section V.1), the visualisation technique is described, employing a control-law distance metric and multidimensional scaling for feature extraction. Then, (Section V.2), the LGPC laws for the dynamical system and the turbulence control experiment are depicted.
V.1 Multidimensional scaling
LGPC systematically explores the control law space by generating and evaluating a large number of control laws from one generation to the next. An assessment of the similarity of control laws gives additional insights into their diversity and convergence to optimal control laws, i.e. into the explorative and exploitative nature of LGPC. For that purpose, we rely on Multidimensional Scaling (MDS) Mardia et al. (1979), a method classically used to visualize abstract data in a low-dimensional space. The main purpose of MDS is to visualize the (dis)similarity of objects or observations. MDS comprises a collection of algorithms to detect a meaningful low-dimensional embedding given a dissimilarity matrix. Here, we employ Classical Multidimensional Scaling (CMDS) which originated from the works of Schoenberg (1935) and Young and Householder (1938).
Let us define as the number of objects to visualize, and as a given distance matrix of the original high-dimensional data. The aim of CMDS is to find a centred representation of points with , where is typically chosen to be 2 or 3 for visualization purposes, such that the pairwise distances of the points approximate the true distances, i.e. . The details of the implementation are given in Appendix A.
We choose to visualize all control laws in a two-dimensional space . Thus, the number of objects is , where is the number of individuals in a generation, and is the total number of generations. The distance between two control laws and , shall measure their ‘effective difference’. Let us consider the non-autonomous feedback . Here, denotes the sensor reading and the harmonic control input on the corresponding -forced attractor. The squared difference between and is defined as
[TABLE]
The time average is taken over all sensor readings and corresponding harmonic input in the evaluation time interval from both forced attractors under control laws and . Thus, represents the difference between the th and th control law averaged over the sensor readings of both actuated dynamics. The permutation of control laws and with its arguments guarantees that the distance matrix is symmetric. More importantly, this ensures that the control laws are compared in the relevant sensor space with an equal probability of both forced attractors.
The second term in (13) penalizes the difference of their achieved costs with coefficient . The penalization coefficient is chosen as the ratio between the maximum difference of two control laws (first term of ) and the maximum difference of the cost function (second term of ). Thus, the dissimilarities between control laws and between the cost functions have comparable weights in the distance matrix . This penalization evidently smooths the control landscape .
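The construction of the distance (13) can be sketched as follows. Since the equation itself is not reproduced here, the exact form is an assumption: the first term is taken as the mean squared difference of the two control laws, averaged with equal weight over the inputs recorded on both forced attractors, and the second term as the absolute cost difference weighted by a coefficient `alpha`. All function and variable names are illustrative.

```python
def mean_squared_difference(law_i, law_j, samples):
    """Average of (K_i - K_j)^2 over a list of recorded input tuples."""
    return sum((law_i(*x) - law_j(*x)) ** 2 for x in samples) / len(samples)


def squared_distance(law_i, law_j, samples_i, samples_j, cost_i, cost_j, alpha):
    """Sketch of a squared control-law distance with cost penalization.

    samples_i and samples_j hold the inputs (sensor readings, harmonic input)
    recorded under law_i and law_j. Both attractors get equal weight.
    The absolute-value form of the cost term is an assumption.
    """
    law_term = 0.5 * (
        mean_squared_difference(law_i, law_j, samples_i)
        + mean_squared_difference(law_i, law_j, samples_j)
    )
    return law_term + alpha * abs(cost_i - cost_j)


def penalization_coefficient(law_terms, cost_differences):
    """Ratio of the largest control-law difference to the largest cost difference."""
    return max(law_terms) / max(cost_differences)


# Toy usage with two hypothetical single-sensor feedback laws.
law_a = lambda s: s
law_b = lambda s: 2.0 * s
samples_a = [(1.0,), (2.0,)]   # inputs recorded under law_a
samples_b = [(0.0,), (2.0,)]   # inputs recorded under law_b

d_ab = squared_distance(law_a, law_b, samples_a, samples_b, 1.0, 0.6, alpha=0.5)
d_ba = squared_distance(law_b, law_a, samples_b, samples_a, 0.6, 1.0, alpha=0.5)
print(d_ab, d_ba)
```

Swapping the two laws together with their samples leaves the result unchanged, which mirrors the symmetry argument made for the distance matrix.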
A problem may arise for the comparison of two pure open-loop forcings and . We expect, for instance, that and give rise to the same actuation response modulo a time shift and would consider these control laws as equivalent. Even for sensor-based feedback enriched by harmonic input, we expect the actuation response to be ’in phase’ or synchronized with the harmonic input. This expectation is taken into account by minimizing the difference between two control commands modulo a minimizing time shift:
[TABLE]
Evidently, (13) and (14) coincide at .
Summarizing, the square of the distance matrix is defined as follows:
- (1) If both control laws have non-trivial harmonic input (are non-autonomous), (14) defines the distance.
- (2) Otherwise, (13) is employed.
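The time-shift minimization used for non-autonomous control laws can be illustrated on sampled actuation commands. The sketch below is a simplification under stated assumptions: both commands are given as equally long records covering an integer number of periods, so that a circular shift stands in for the continuous time shift, and the cost penalization term is omitted. Function names are illustrative.

```python
import math


def shifted_mean_squared_difference(signal_i, signal_j, shift):
    """Mean squared difference after circularly shifting signal_j by `shift` samples."""
    n = len(signal_i)
    return sum(
        (signal_i[k] - signal_j[(k + shift) % n]) ** 2 for k in range(n)
    ) / n


def min_shift_squared_difference(signal_i, signal_j):
    """Smallest mean squared difference over all circular shifts."""
    n = len(signal_i)
    return min(
        shifted_mean_squared_difference(signal_i, signal_j, s) for s in range(n)
    )


# Two hypothetical open-loop commands that differ only by a phase shift.
N = 100
cosine = [math.cos(2.0 * math.pi * k / N) for k in range(N)]
sine = [math.sin(2.0 * math.pi * k / N) for k in range(N)]

unshifted = shifted_mean_squared_difference(cosine, sine, 0)
minimized = min_shift_squared_difference(cosine, sine)
print(unshifted, minimized)
```

Without a shift the two commands appear clearly different, whereas the minimized difference is essentially zero, so they would be treated as equivalent. At zero shift the function reduces to the plain mean squared difference.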
Applying CMDS to the distance matrix , each control law is associated with a point such that the distance between different emulates the distance between control laws defined by (13) and (14). More generally, are feature vectors whose coefficients represent those features that contribute most on average to the discrimination of different control laws.
V.2 Control landscapes for the LGPC runs
Figure 15 visualizes the control laws determined by LGPC-3 for the three-oscillator model (a), and LGPC-1 for the simplified car model (b). Due to the huge number of control laws in the three-oscillator model (), we present every 10th individual in every 10th generation for clarity. The full ensemble of individuals is shown for the simplified car model as its number is moderate (). Each symbol represents a control law which is color-coded with respect to its performance ranking, for instance the dark color represents the best 10% of the presented control laws. The control laws in the first generation cover a significant portion of the control space, as in a Monte Carlo search. When the value of increases, we observe a global movement of control laws towards the minimum where better performance is obtained (darker color). Moreover, the distances between control laws of different generations also decrease, resulting in a dense distribution. This is illustrated in Fig. 15 (a) where the inserted figure gives a close view of the control laws near the origin point, where the best control law(s) are found at . These observations show that LGPC has effectively explored the control space, evidenced by the extended distribution of control laws. In summary, the visualization not only provides a simple and revealing picture of the exploration and exploitation characteristics of the control approach, but also inspires further improvement of the methodology.
VI Conclusion
We have demonstrated that linear genetic programming control (LGPC) is a simple yet effective model-free control strategy for strongly nonlinear dynamics with frequency crosstalk, which poses a major challenge for reduced-order modeling and model-based control design owing to the difficulties in the mathematical modelling of the nonlinearities and the limited knowledge of the flow. LGPC is shown to discover and exploit the most effective nonlinear open- and closed-loop control mechanisms in dynamical systems and turbulence control experiments in an automated unsupervised manner without any model or knowledge of the plant.
Three categories of LGPC are proposed in this work: an open-loop multi-frequency control , named LGPC-1, an autonomous sensor-based feedback control , termed LGPC-2, and a generalized non-autonomous control comprising the sensors and time-periodic functions , called LGPC-3. All of them are successfully applied to the stabilization of a forced nonlinearly coupled three-oscillator model (Section III). The obtained control laws stabilize the first unstable oscillator by exploiting two frequency crosstalk mechanisms: (1) the excitation of the third oscillator by a hard ’kick’ for a quick transient and (2) the suppression of the second oscillator to sustain the low fluctuation level of the target dynamics. Following the quick transient, the first and second oscillators enter into a quasi-stable state at nearly vanishing fluctuation levels, so the full-state feedback hardly needs to actuate and the control command starts to vanish. The whole system is stabilized with only a small investment of actuation energy at the very beginning of the control. Thus, the control laws found by LGPC outperform the optimal open-loop control, as both a lower fluctuation level and a lower actuation energy are obtained. The example and explored controller demonstrate the vital importance of frequency crosstalk for control design.
LGPC is applied in a turbulence control experiment targeting drag reduction of a car model (Section IV). It finds a multi-frequency forcing that achieves 22% drag reduction, compared with 19% for the optimized periodic forcing, the past benchmark for this square-back Ahmed body configuration. This performance increase of 3% pays for almost half of the invested actuation energy. Perhaps surprisingly, the maximum actuation frequency is about 33 times that of the von Kármán vortex shedding. This high-frequency forcing leads to a broadband suppression of the very low frequencies in the base pressure signals and a global attenuation of averaged and turbulent kinetic energy in the near wake, resulting in a more stabilized wake. On the other hand, the mean wake geometry is modified such that the shear layers are deviated towards the center, resulting in a shorter, narrower and more streamlined bubble. The drag reduction is ultimately achieved by the combined effect of the wake stabilization and the shear layer deviation and can legitimately be called fluidic boat tailing.
One of the many benefits of LGPC is that it automatically explores the control space with little or no knowledge of the system being controlled. Moreover, the LGPC-3 ansatz for the control law lets the evolutionary algorithm choose between sensor-based feedback, multi-frequency forcing and combinations thereof. In addition, the number of control law evaluations for the Ahmed body drag reduction was comparable to that of a single-frequency optimization, yet yielded a much more general multi-frequency actuation which is hardly accessible to a parametric study. In an even more general ansatz, noise signals could also be included in the control law arguments, leading to . Thus stochastic forcing and its generalizations are included. Another generalization is the use of temporal filters as considered operations. In Duriez et al. (2016), a filter-enriched GPC has successfully discovered the optimal linear quadratic Gaussian control for the stabilization of a noise-driven oscillator. In summary, LGPC can work on a search space which includes in principle any conceivable control logic with a finite number of operations.
Visualization of the ensemble of the control laws in a two-dimensional plane sheds light on the explorative and exploitative nature of LGPC, and thus addresses the need to monitor the search space and guide the improvement of the algorithm. The example given in Fig. 15 indicates clearly the search space topology and distills the local extrema in this feature space. Evidently, in a future development of LGPC, this feature space may help to estimate the cost function of an untested control law or be used to avoid the redundant testing of control laws in unpromising terrain. Thus, experimental testing time can be reduced. The visualization is becoming an important component of LGPC for on-line decisions during a control experiment.
The authors are currently improving the LGPC methodology and pursuing car model experiments for reducing the drag and yaw moment during cross-wind gusts. LGPC opens refreshingly new paths in fluid mechanics, as estimation, prediction and control tasks are all regression problems minimizing a cost function. LGPC exploits that control is a mapping from the plant sensors (output) to actuations (input) optimizing aerodynamic or other goals. Prediction is the mapping from the state to its time derivative or future state. And estimation maps sensor signals to flow fields. Evidently all these tasks can be solved with LGP. Moreover, a single LGPC run already yields rich actuation response data for the computation of a control-oriented nonlinear black-box model. Another more challenging direction is the exploitation of Navier-Stokes based insights in the problem formulation of LGPC. LGPC and machine learning in general can reasonably be expected to be a game changer in future flow control and in fluid mechanics in general.
Acknowledgements.
We warmly thank the great support during the experiment by J.-M. Breux, J. Laumonier, P. Braud and R. Bellanger. The thesis of RL is supported by PSA Peugeot Citroën in the context of OpenLab Fluidics (fluidics@poitiers). We also acknowledge the funding of the former Chair of Excellence ’Closed-loop control of turbulent shear layer flows using reduced-order models’ (TUCOROM, ANR-10-CHEX-0015) supported by the French Agence Nationale de la Recherche (ANR). LC acknowledges the funding of the ONERA/Carnot project INTACOO (INnovaTive ACtuators and mOdels for flow cOntrol). EK gratefully acknowledges funding by the Moore/Sloan foundations, the Washington Research Foundation and the eScience Institute. We appreciate valuable stimulating discussions with: Diogo Barros, Steven Brunton, Thomas Duriez and Andreas Spohn.
Appendix A Classical multidimensional scaling (CMDS)
Classical multidimensional scaling (CMDS) is employed to visualize the similarity of control laws (see Sec. V). CMDS aims to find a low-dimensional representation of points , , such that the average error between the distances between points and the elements of a given distance matrix , here emulating the distances between the time series of different control laws, is minimal.
In order to find a unique solution to CMDS, we assume that with is centered, i.e. is a mean-corrected matrix with . Rather than directly finding , we search for the Gram matrix that is real, symmetric and positive semi-definite. Since is assumed to be centred, the Gram matrix is the Euclidean inner product, and we have . In the first step of the classical scaling algorithm, the matrix of elements is constructed. Then, we form the ‘doubly centred’ matrix , where with the identity matrix of size and an matrix of ones. The term ‘doubly centred’ refers to the subtraction of the row as well as the column mean. Let the eigendecomposition of be where is a diagonal matrix with ordered eigenvalues and contains the eigenvectors as columns. Then can be recovered from
[TABLE]
Having only the distance matrix, the resulting representation is only defined up to a translation, a rotation, and reflections of the axes. If the distance matrix is computed using the Euclidean distance and all eigenvalues are non-negative, can be recovered. If , there exist zero eigenvalues, in which case a low-dimensional subspace can be found where the representation of would be exact. For other distance metrics, the distances of the representation found by CMDS are an approximation to the true distances. Some eigenvalues may be negative and only the positive eigenvalues and their associated eigenvectors are considered to determine an approximate representation of . Note that for the Euclidean distance metric, CMDS is closely related to a principal component analysis (PCA) commonly used to find a low-dimensional subspace. While CMDS, and multi-dimensional scaling generally, uses a distance matrix as input, PCA is based on a data matrix. A distance matrix can be directly computed for the centred matrix . If the Euclidean distance is employed for computing the distances, the result from applying CMDS to corresponds to the result from applying PCA to . A proof can be found in Mardia et al. (1979). The quality of the representation is typically measured by , and more generally if is not positive semi-definite using .
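The classical scaling steps described in this appendix (squared distances, double centring, eigendecomposition, truncation to the leading positive eigenvalues) can be sketched with NumPy as follows. This is a generic textbook-style sketch rather than the authors' implementation. The function name `classical_mds` and the convention of returning the points as rows are assumptions.

```python
import numpy as np


def classical_mds(distance_matrix, n_components=2):
    """Embed N objects in n_components dimensions from an N x N distance matrix.

    Returns the coordinates (one row per object, which is an assumed convention)
    and all eigenvalues of the doubly centred matrix in descending order.
    """
    d = np.asarray(distance_matrix, dtype=float)
    n = d.shape[0]

    # Double centring: subtract row and column means of -0.5 * squared distances.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering

    # Eigendecomposition of the symmetric matrix, sorted in descending order.
    eigenvalues, eigenvectors = np.linalg.eigh(b)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Keep the leading eigenpairs; negative eigenvalues are clipped to zero.
    kept = np.clip(eigenvalues[:n_components], 0.0, None)
    coordinates = eigenvectors[:, :n_components] * np.sqrt(kept)
    return coordinates, eigenvalues


# Toy usage: four points on a 3 x 4 rectangle are recovered up to rotation/reflection.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
distances = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords, eigvals = classical_mds(distances, n_components=2)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.allclose(distances, recovered))
```

For Euclidean input distances, as in this toy case, the pairwise distances of the embedding match the input distances and the embedded points are centred at the origin.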
- [1] Gautier et al. (2015): N. Gautier, J.-L. Aider, T. Duriez, B. R. Noack, M. Segond, and M. W. Abel, “Closed-loop separation control using machine learning,” J. Fluid Mech. 770, 424–441 (2015).
- [2] Parezanovic et al. (2016): V. Parezanovic, L. Cordier, A. Spohn, T. Duriez, B. R. Noack, J.-P. Bonnet, M. Segond, M. Abel, and S. Brunton, “Frequency selection by feedback control in a turbulent shear flow,” J. Fluid Mech. 797, 247–283 (2016).
- [3] Glezer and Amitay (2002): A. Glezer and M. Amitay, “Synthetic jets,” Ann. Rev. Fluid Mech. 34, 503–529 (2002).
- [4] Cattafesta and Sheplak (2011): L. Cattafesta and M. Sheplak, “Actuators for active flow control,” Ann. Rev. Fluid Mech. 43, 247–272 (2011).
- [5] Glezer et al. (2005): A. Glezer, M. Amitay, and A. M. Honohan, “Aspects of low- and high-frequency actuation for aerodynamic flow control,” AIAA Journal 43, 1501–1511 (2005).
- [6] Barros et al. (2016a): D. Barros, J. Borée, B. R. Noack, A. Spohn, and T. Ruiz, “Bluff body drag manipulation using pulsed jets and Coanda effect,” J. Fluid Mech. 805, 422–459 (2016).
- [7] Schmidt et al. (2015): H. J. Schmidt, R. Woszidlo, C. N. Nayeri, and C. O. Paschereit, “Drag reduction on a rectangular bluff body with base flaps and fluidic oscillators,” Exp. Fluids 56, 1–16 (2015).
- [8] Oxlade et al. (2015): A. R. Oxlade, J. F. Morrison, A. Qubain, and G. Rigas, “High-frequency forcing of a turbulent axisymmetric wake,” J. Fluid Mech. 770, 305–318 (2015).
