Optimization of dynamic systems often requires system simulation. Several important classes of dynamic system models have computationally expensive time derivative functions, resulting in simulations that are significantly slower than real time. This makes design optimization based on these models impractical. An efficient two-loop method, based on surrogate modeling, is presented here for solving dynamic system design problems with computationally expensive derivative functions. A surrogate model is constructed for only the derivative function instead of the simulation response. Simulation is performed based on the computationally inexpensive surrogate derivative function; this strategy preserves the nature of the dynamic system, and improves computational efficiency and accuracy compared to conventional surrogate modeling. The inner-loop optimization problem is solved for a given derivative function surrogate model (DFSM), and the outer loop updates the surrogate model based on optimization results. One unique challenge of this strategy is to ensure surrogate model accuracy in two regions: near the optimal point in the design space, and near the state trajectory in the state space corresponding to the optimal design. The initial evidence of method effectiveness is demonstrated first using two simple design examples, followed by a more detailed wind turbine codesign problem that accounts for aeroelastic effects and simultaneously optimizes physical and control system design. In the last example, a linear state-dependent model is used that requires computationally expensive matrix updates when either state or design variables change. Results indicate an order-of-magnitude reduction in function evaluations when compared to conventional surrogate modeling. 
The DFSM method is expected to be beneficial only for problems where derivative function evaluation expense, and not large problem dimension, is the primary contributor to solution expense (a restricted but important problem class). The initial studies presented here revealed opportunities for potential further method improvement and deeper investigation.

## Introduction

Optimal design of dynamic systems is a challenging task, due in part to the computationally expensive system simulations involved. Moreover, dynamic system design problems often exhibit tight coupling between physical system (plant) and control system design, requiring the use of integrated plant and control design methods to arrive at system-optimal solutions [1]. Integrated dynamic system design methods are often referred to as “codesign” methods. In codesign formulations, plant and control design are considered simultaneously to solve an integrated optimization problem [2,3]. Applying codesign to a nonlinear dynamic system often incurs significant computational expense [4].

Many solutions have been proposed for reducing the computational expense of dynamic system design, including the use of surrogate modeling (SM). The most widely used surrogate modeling approach for simulation-based dynamic system design treats the simulation as a “black box,” where a surrogate model is constructed based on inputs to and outputs from the simulation. That is, if $x_c \in U \subset \mathbb{R}^m$ are the control design inputs to a simulation, $x_p \in P \subset \mathbb{R}^p$ are physical design inputs, and $\varphi(x_p, x_c) \in \mathbb{R}^y$ are the corresponding outputs, a set of $\langle (x_p, x_c), \varphi(x_p, x_c) \rangle$ pairs is generated via simulation. These data are then used to construct a nonlinear surrogate model with established data-fitting techniques such as neural networks, radial basis networks, echo state networks, or wavelet networks [5,6]. The resulting surrogate model is computationally inexpensive to evaluate, supporting the use of optimization in dynamic system design. Researchers have applied this strategy to a diverse range of applications, including multibody dynamic system optimization [4,7,8], nonlinear dynamic biological systems [9], aircraft wing design with uncertain parameters [10], and vehicle ride comfort [11].

Successful surrogate modeling methods support the rapid identification of an accurate optimum design point while helping to minimize the required number of high-fidelity function evaluations [12]. Accuracy can be preserved by using a trust region approach [13,14], and the number of high-fidelity model evaluations can be reduced by using an adaptive resampling method that focuses on improving accuracy only in regions of strategic interest (e.g., near the optimum) [15,16]. A significant number of developments have been made in the area of black-box surrogate modeling, including surrogate model ensembles where the best (or weighted average) surrogate model is used [17], and extension of surrogate modeling to multi-objective optimization problems where high accuracy is maintained in regions near the Pareto front [18]. While surrogate modeling has often been applied to single engineering disciplines [18] (e.g., structural design [19], multibody dynamic systems [20], design based on aerodynamics and aero-acoustics [21], etc.), it can be extended to multidisciplinary problems [22].

Surrogate modeling strategies must balance challenges such as the computational expense of high-fidelity model samples and the accuracy of the surrogate model at points between sample points. SM for dynamic systems involves unique additional issues. For example, most SM strategies employ generalized approximation functions to estimate the input–output mapping $(x_p, x_c) \rightarrow \varphi(x_p, x_c)$. This black-box approach does not exploit the special properties of a dynamic system, such as the continuous evolution of state trajectories, to improve model accuracy between sample points. A more desirable strategy would leverage the intrinsic properties of dynamic systems to enhance model accuracy for a given number of samples, and provide insight into the underlying dynamic system.

Another established technique important for dynamic system modeling, simulation, and design is reduced-order modeling (ROM). A reduced-order model of a dynamic system approximates a higher-fidelity dynamic system model using fewer states, both to reduce simulation expense and as an aid for control system design. ROM methods project a high-order dynamic system model onto subspaces consisting of basic elements that contain characteristics of the expected solution [23–26]. This approach helps models retain dynamic properties that correlate with the physical characteristics of the systems they approximate [24], whereas the conventional surrogate modeling approach described above uses generalized approximation functions not based on system physics. ROM is often applied to high-order, expensive-to-simulate models of systems governed by partial differential equations [24,27–29]. Some important classes of ROM methods include those based on proper orthogonal decomposition [30], Krylov subspace methods [31], Volterra series methods [25], and the block Arnoldi algorithm [32]. These model reduction methods, however, are largely restricted to linear systems [21,27,32–36]. Notable applications of reduced-order modeling to linear dynamic systems include microelectromechanical systems [32] and fluid systems [21,27,35]. The ability to simulate nonlinear systems is vital for progress in dynamic system design. While some model reduction methods have been extended to nonlinear systems [28,37–39] (e.g., proper orthogonal decomposition for nonlinear structural dynamics [36]), the use of model reduction as a general tool for design of nonlinear systems is limited. While model reduction approaches retain simplified dynamic system properties, the underlying dynamics may be oversimplified in some cases [31], and higher-fidelity models are still required at later design stages [26].
In addition, ROMs often do not accommodate the direct modification of independent physical system design variables. As a result, reduced-order modeling in its current form is not suitable for the codesign of dynamic systems, which is the focal point of this article.

In this paper, we introduce a fundamentally new method for design optimization of dynamic systems—derivative function surrogate modeling (DFSM)—that addresses the above issues with SM and ROM by approximating only the state derivatives of a dynamic system using surrogate models, rather than treating the whole system simulation as a black box. This provides direct access to approximated system dynamics, and preserves the dynamic nature of the system model to improve approximation accuracy and efficiency. This method applies to nonlinear dynamic systems. In addition, derivative function surrogates are constructed that depend on both state and design variables, supporting solution of codesign problems. The coupling between physical and control system design disciplines in codesign problems introduces additional complexity to the surrogate modeling problem, as accuracy must be provided not only in the design space in the neighborhood of the optimum design point, but also in the state space in the neighborhood of the state trajectory that corresponds to the optimum design point. The latter requirement is more difficult because we are concerned about accuracy in a region near an entire path as opposed to a single point. This article introduces one possible approach for tackling this challenge.

The rest of the paper is organized as follows: First the DFSM methodology is discussed in-depth in Sec. 2 followed by the associated discussion on dynamic system design process in Sec. 3. Different codesign problem formulations are discussed in Sec. 4 and the design process is then demonstrated using three numerical examples in Sec. 5. Concluding remarks and a discussion of future work are presented in Sec. 6.

## Derivative Function Surrogate Modeling Methodology

where $\xi(t) \in X \subset \mathbb{R}^n$ are the system state trajectories, $u(t) \in U \subset \mathbb{R}^m$ are the control inputs, and $x_p \in P \subset \mathbb{R}^p$ are the physical design variables. In some important classes of mechatronic system design problems, the most computationally expensive element of system simulation is the time derivative function, i.e., **f**(⋅) in Eq. (1), which must be evaluated multiple times at every simulation time step. DFSM reduces computational expense by constructing a surrogate model for the derivative function only, whereas conventional SM constructs an approximation of the entire simulation response.

To clarify the relationship between DFSM and SM for dynamic simulations, consider the input–output relationships illustrated in Fig. 1. The performance metric $\varphi(x_p, x_c)$ is the design objective to be minimized (or objectives and constraints if a vector), and depends on plant ($x_p$) and control design ($x_c$) variables. Note that $x_c$ may represent either control trajectories **u**(*t*) directly or control system parameters that influence **u**(*t*), such as feedback control gains. The performance metric *ϕ*(⋅) depends on state trajectories that must be obtained via simulation. When using conventional SM (left side of Fig. 1), state trajectories are resolved internally. Consequently, the performance model appears to depend only on $x_p$ and $x_c$; here, the input–output mapping is treated as a black box. A conventional SM $\hat{\varphi}(x_p, x_c)$ approximates the dependence of *ϕ*(⋅) on $x_p$ and $x_c$ using general-purpose functions (e.g., radial basis functions (RBFs) [40]).

Instead of approximating simulation responses, DFSM approximates the time derivative function. This is illustrated in the right-hand side of Fig. 1, where we zoom in on the functional structure of the derivative function used in simulation. The elements within the dotted lines are replaced by the approximate derivative function $\hat{f}(\cdot)$.

Derivative function surrogate modeling retains the use of simulation; $\varphi(\cdot)$ is evaluated based on state trajectories, plant design, and control design, but state trajectories are computed using a simulation based on the computationally inexpensive approximate derivative function $\hat{f}(\cdot)$. DFSM provides two advantages:

- (1)
Utilizing simulation (as opposed to SM, where simulation is treated as black box) retains the nature of dynamic systems. This allows us to capitalize on system structure to improve approximation accuracy between sample points.

- (2)
Restricting the scope of the surrogate model to just the derivative function supports more efficient model construction.

The second advantage follows from the first, and from the fact that each input–output pair is inexpensive to obtain for DFSM compared to SM (SM requires a full simulation for each sample point). In addition, simulation responses are often very complex, nonconvex surfaces that are difficult to approximate accurately using conventional SM. Derivative function responses are often much simpler. While derivative functions for realistic problems are more complex than in simple illustrative cases, a DFSM is still typically easier to create than an SM of the simulation response.

To illustrate this concept, consider a linear model of a basic passive dynamic vibration absorber (DVA) illustrated in Fig. 2 [41–43]. The primary mass *m*_{1} is excited by a cyclic force *F*_{1}(*t*), and one potential design problem is to minimize the maximum *m*_{1} amplitude by adding a vibration absorber consisting of a mass (*m*_{2}), spring (*k*_{2}), and damper (*c*_{2}). A linear DVA model is given by the standard two-mass equations of motion

$$m_1\ddot{x}_1 = F_1(t) - k_1 x_1 - k_2(x_1 - x_2) - c_2(\dot{x}_1 - \dot{x}_2)$$

$$m_2\ddot{x}_2 = k_2(x_1 - x_2) + c_2(\dot{x}_1 - \dot{x}_2)$$

Suppose we want to minimize $\max|x_1(t)|$ over a range of frequencies using an undamped (*c*_{2} = 0) DVA by tuning the secondary mass and stiffness values. Figure 2 illustrates $\max|x_1(t)|$ as a function of *k*_{2} and *m*_{2} based on simulation results (for the parameters $m_1 = 5\,\mathrm{kg}$, $c_2 = 0\,\mathrm{N \cdot s/m}$, $k_1 = 200\,\mathrm{N/m}$, and $F_1(t) = 100\sin(30t)\,\mathrm{N}$). This highly complex response surface would be exceedingly challenging to fit with a surrogate model at a reasonable level of accuracy. In contrast, the derivative function can be modeled exactly using a simple linear model. While this is just one example, in our experience, derivative functions are typically easier to approximate accurately than simulation responses, providing an important motivation for DFSM.
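For reference, the simulation underlying this response surface can be sketched in a few lines (a minimal illustration, not the authors' implementation, assuming the standard two-mass DVA equations of motion and the parameter values above; `dva_rhs` and `max_amplitude` are illustrative names):

```python
import numpy as np
from scipy.integrate import solve_ivp

def dva_rhs(t, y, m1, m2, k1, k2, c2):
    """Derivative function for the two-mass DVA (states: x1, v1, x2, v2)."""
    x1, v1, x2, v2 = y
    F1 = 100.0 * np.sin(30.0 * t)  # cyclic excitation force from the text
    a1 = (F1 - k1 * x1 - k2 * (x1 - x2) - c2 * (v1 - v2)) / m1
    a2 = (k2 * (x1 - x2) + c2 * (v1 - v2)) / m2
    return [v1, a1, v2, a2]

def max_amplitude(m2, k2, m1=5.0, k1=200.0, c2=0.0, tf=20.0):
    """Simulate from rest and return max |x1(t)|, the design objective."""
    sol = solve_ivp(dva_rhs, (0.0, tf), [0.0, 0.0, 0.0, 0.0],
                    args=(m1, m2, k1, k2, c2), max_step=0.01)
    return float(np.max(np.abs(sol.y[0])))

print(max_amplitude(m2=1.0, k2=50.0))
```

Evaluating `max_amplitude` over a grid of (*m*_{2}, *k*_{2}) values produces a response surface of the kind shown in Fig. 2; each point requires a full simulation, which is what conventional SM must sample.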

The DFSM method presented here is based on adaptive SM [12–14,40,44,45] as a means to ensure accuracy between the surrogate and original derivative function. After solving an optimal codesign problem using an approximate derivative function, additional samples of the high-fidelity derivative function are taken in areas of strategic interest (e.g., near the optimum) [15,16]. Sampling **f**(⋅) to improve accuracy incurs computational cost; an adaptive strategy provides accuracy only where needed, and is far less expensive than generating a globally accurate SM.

### DFSM for Codesign.

*ϕ* is the overall system design objective function, where the integrand $L(\cdot): \mathbb{R}^n \times \mathbb{R}^p \times \mathbb{R}^m \rightarrow \mathbb{R}$ is the Lagrange term and $M(\cdot): \mathbb{R}^n \times \mathbb{R}^p \rightarrow \mathbb{R}$ is the Mayer term (defined at the boundary). $f(\cdot): \mathbb{R}^n \times \mathbb{R}^p \rightarrow \mathbb{R}^n$ is the system derivative function. Here, we consider only the case where the control **u**(*t*) is affine with the system derivative, i.e., $\dot{\xi}(t) = f(\xi(t), x_p) + Bu(t)$, where $B \in \mathbb{R}^{n \times m}$. The plant and control design variables are $x_p$ and **u**(*t*), respectively, and $C(\cdot): \mathbb{R}^n \times \mathbb{R}^p \times \mathbb{R}^m \rightarrow \mathbb{R}^c$ are the inequality design and path constraints (including any bounds on $\xi(t)$, $u(t)$, and $x_p$). Note that design constraints depend indirectly on control design since **u**(*t*) influences the state trajectories $\xi(t)$, and design constraints often depend on states (e.g., constraints on fatigue or displacement). Boundary constraints (including initial and final conditions) are given by $\phi(\cdot)$. This problem structure allows for bidirectional plant–control design coupling [46]. This generic formulation permits nonlinear system dynamics, i.e., the state derivatives $\dot{\xi}(t)$ are nonlinear functions of states and physical design. Replacing the derivative function **f**(⋅) in Eq. (3c) with the approximate surrogate model $\hat{f}(\cdot)$ forms a computationally inexpensive optimal codesign problem.

This problem is solved multiple times. After each solution, $\hat{f}(\cdot)$ is updated using additional sample points of **f**(⋅). Figure 3 illustrates this process. First, a modeling domain (i.e., $X$ and $P$) is defined that spans the state and design spaces from which samples are taken. After constructing $\hat{f}(\cdot)$, it is validated by comparing it to **f**(⋅) at several points. If validation fails, $\hat{f}(\cdot)$ is improved through reconstruction with additional samples. After validation, the approximate codesign problem is solved. This process is iterated until termination conditions, such as a step-size norm tolerance, are met.
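The iterate-and-refine structure just described can be sketched on a deliberately tiny toy problem (an illustration only, not one of the article's examples): a one-state system $\dot{x} = -x_p\,x$ stands in for an expensive derivative function, the surrogate is a least-squares fit over state–design samples, and the scalar design $x_p$ is chosen to drive $x(t_f)$ to a target. All names (`f_true`, `fit_dfsm`, `phi`) are assumed for illustration:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize_scalar

def f_true(x, xp):
    """Stands in for an expensive derivative function (one state, one design)."""
    return -xp * x

def fit_dfsm(samples):
    """Least-squares surrogate of f over the features [x, xp, x*xp, 1]."""
    Q = np.array([[x, xp, x * xp, 1.0] for x, xp in samples])
    y = np.array([f_true(x, xp) for x, xp in samples])
    w, *_ = np.linalg.lstsq(Q, y, rcond=None)
    return lambda x, xp: w[0] * x + w[1] * xp + w[2] * x * xp + w[3]

def phi(xp, f_hat, x0=1.0, tf=1.0, target=0.5):
    """Inner-loop objective: drive x(tf) toward the target using the surrogate."""
    sol = solve_ivp(lambda t, x: [f_hat(x[0], xp)], (0.0, tf), [x0],
                    rtol=1e-8, atol=1e-10)
    return (sol.y[0, -1] - target) ** 2

rng = np.random.default_rng(0)
state_dom, design_dom = (0.0, 2.0), (0.1, 3.0)
for _ in range(3):  # outer loop: resample, refit, re-solve, shrink domain
    samples = [(rng.uniform(*state_dom), rng.uniform(*design_dom))
               for _ in range(20)]
    f_hat = fit_dfsm(samples)
    xp_opt = minimize_scalar(phi, bounds=(0.1, 3.0), args=(f_hat,),
                             method="bounded").x
    # shrink the design portion of the sampling domain around the solution
    design_dom = (max(0.1, xp_opt - 0.5), min(3.0, xp_opt + 0.5))
print(xp_opt)  # analytic optimum for this toy is ln 2 ~= 0.693
```

Here the inner loop (design optimization with a fixed surrogate) and the outer loop (resampling and refitting around the predicted optimum) mirror the two loops of Fig. 3, with validation omitted for brevity.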

### DFSM Variants.

Two variants of DFSM are defined here:

- (1)
Direct derivative function approximation (DDFA). Here, the complete time derivative function is approximated directly using a surrogate model as described above. This approach is conceptually straightforward, but in many cases, may be less efficient than an indirect approximation.

- (2)
Indirect derivative function approximation (IDFA). The derivative function is decomposed into computationally expensive and inexpensive elements, and an approximation is made only for the former. The decomposition of the derivative function into expensive and inexpensive parts can be done using empirical tests. For example, when using scientific programming software such as Matlab®, one can use the *profiler* tool to identify specifically the computationally expensive elements of the code. In other cases, the computationally expensive elements are obvious, such as when iterative numerical methods are required to evaluate parts of the derivative function.

Direct derivative function approximation and IDFA apply to a wide variety of system model types. Five important cases are described below with properties that are expected to benefit from DFSM solution. These cases should be viewed as opportunities to realize potential computational benefit through DFSM, including the ability to incorporate higher-fidelity models into early-stage optimization studies than otherwise would be possible. All five cases described below involve an IDFA strategy, and two common attributes across most of these cases are system models with (1) a low to moderate number of states, and (2) computationally expensive elements of time derivative functions. It should be emphasized that this list is not comprehensive. Other problem types may be amenable to DFSM solution. In addition, this article presents only initial evidence of DFSM effectiveness based on two simplified nonlinear codesign problems (DDFA) and one realistic wind turbine codesign problem (Case 2 IDFA). Systematic investigation of additional IDFA cases and deeper study of DFSM numerical properties are topics for future work.

*Case* 1. Nonlinear Parametric Models: Some derivative functions involve intermediate variables that are computed using an expensive numerical procedure, i.e., $f(\xi(t), x_p, p)$ depends on parameters **p** that are functions of state and design variables: $p = a(\xi(t), x_p)$. IDFA may then be applied to construct a surrogate model $\hat{a}(\cdot)$ that maps the design and state spaces to these intermediate parameters. Gain (or parameter) scheduling [47,48] is a nonlinear control design strategy that is analogous to this parametric IDFA approach, where gain parameters are adjusted to accommodate dynamic behavior that changes while moving through the state space. IDFA adjusts parameter values to accommodate nonlinearities in **f**(⋅) while moving through both the state and design spaces.

*Case* 2. Linear State-Dependent Models: In this important class of system models, the system matrix **A**(⋅) depends not only on design but also on state: $\dot{\xi}(t) = A(x_p, \xi(t))\xi(t) + Bu(t)$. In some cases, updating **A**(⋅) is numerically expensive (e.g., linearization). IDFA may be used to approximate the dependence of **A**(⋅) on state and design variables. The wind turbine codesign problem presented later in this article is a Case 2 IDFA implementation.
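One possible sketch of this IDFA case (an illustration, not the article's wind turbine implementation): fit one RBF surrogate per entry of **A**(⋅) over sampled (state, design) points, then reassemble the matrix cheaply during simulation. SciPy's `RBFInterpolator` is used here as an assumed implementation choice, and `build_A_surrogate` is an assumed name:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def build_A_surrogate(Q, A_samples):
    """Fit a thin-plate-spline RBF surrogate for each entry of A(xp, xi).

    Q: (l, n + p) sample points q_i = [xi, xp].
    A_samples: (l, n, n) expensive evaluations of A at those points.
    """
    l, n, _ = A_samples.shape
    flat = A_samples.reshape(l, n * n)  # one surrogate output per matrix entry
    interp = RBFInterpolator(Q, flat, kernel="thin_plate_spline")

    def A_hat(q):
        """Reassemble the approximate system matrix at a new (state, design) point."""
        return interp(np.atleast_2d(q))[0].reshape(n, n)

    return A_hat
```

During simulation, the derivative would then be evaluated as `A_hat(q) @ xi + B @ u`, avoiding the expensive matrix update at every time step.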

*Case 3.* DFSM with ROMs: ROMs and DFSM may be used together in a complementary manner. ROM techniques offer a mature, powerful approach for reducing simulation expense and producing analytical insights, but are primarily a tool for control system design [49,50]. Most ROM techniques assume that the physical system does not change, or if it does change, only in limited ways (e.g., sensor location [35]). In addition, ROMs for dynamic systems have not yet been used for passive system design (i.e., physical system design without any feedback control). DFSM provides a promising means for supporting the use of ROMs in codesign by allowing movement in the design space as well as the state space. For example, ROM parameter values could be updated continuously by a surrogate model that depends on both state and design variables. The core advantage of using DFSM with ROM techniques is the ability to simultaneously vary physical and control system design variables without reconstructing the ROM, providing a smooth optimization space. Thus, DFSM is not a ROM replacement, but rather extends ROM applicability to problems involving physical design changes.

*Case* 4. DFSM with Physics-Based ROMs: This special class of ROMs uses simplified physics-based models to capitalize on the physical nature of systems to reduce model dimension further. One strategy is the use of lumped parameter models to approximate distributed parameter models. For example, pseudo-rigid-body dynamic models (PRBDMs) [52,53] use systems of rigid bodies, springs, and dampers to approximate the dynamic behavior of a structure or mechanism with flexible elements, and to provide accuracy over large nonlinear displacements. PRBDMs have been applied in several domains, including microelectromechanical systems [54], space applications [55], and biomechanics [56]. A related area is deformable object modeling, where either PRBDMs or multiscale finite element models are used to estimate the dynamics of soft tissues to reduce simulation expense for real-time applications [57–59]. Often, model parameters are determined once (either through quantitative [5] or subjective [58] techniques) and left fixed, limiting the domain of accuracy. As with *Case* 3, DFSM here can provide an efficient means to update model parameter values continuously to maintain accuracy across the state and design spaces. In addition, these continuous parameter maps may further reduce the model dimension required to achieve desired accuracy levels. For example, reducing the number of PRBDM bodies reduces ROM dimension, but large deflections in the real system would render stiffness and inertia parameter values inaccurate as bodies change shape. Adjusting the stiffness parameters and inertia tensors to reflect these changes in shape (as a function of state and design) would help maintain accuracy for an exceptionally low-dimension model. Updating the inertia tensor, however, requires numerical integration, and updating stiffness parameters would require evaluation of a nonlinear finite element model. 
In other words, elements of the low-dimension derivative function are computationally expensive to evaluate. Using DFSM to approximate inertia and stiffness parameters as a function of state and design has the potential to produce a low-dimension model that is both computationally efficient and accurate.

*Case 5.* Multirate Systems: Multiphysics [60,61] or multiscale systems [62] often involve timescales that differ by orders of magnitude. For example, robotic systems have physical (mechanical and electronic) elements with different time scales, as well as control and planning systems with yet different time scales [63]. Nanofluidic device behavior depends on both continuum-level phenomena and molecular dynamics that operate at very different rates [64]. Dynamic system models with vastly different time scales are numerically stiff and may require special simulation techniques [65]. This is traditionally managed by using an implicit solver for ordinary differential equations, such as the trapezoidal method. When implicit solvers are unsuccessful, multiple distinct integrations are carried out—one for each time scale—that are coordinated to maintain accuracy. One basic coordination strategy involves a fixed-step simulation with two time step sizes, *h*_{1} > *h*_{2}, where the former is for the slower dynamic component, the rates satisfy *h*_{1} = *rh*_{2}, $r \in \mathbb{Z}^+$, and the fast and slow integrations are coordinated every *h*_{1} seconds. Singular perturbation methods are another important strategy for multirate systems [66,67].
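The fixed-step coordination strategy just described can be sketched as follows (an illustrative sketch using forward Euler at both rates, not a method prescribed by the article; `multirate_euler` is an assumed name):

```python
import numpy as np

def multirate_euler(f_slow, f_fast, xs0, xf0, h1, r, tf):
    """Two-rate fixed-step forward Euler: the fast state takes r substeps of
    size h2 = h1 / r per slow step (slow state held constant during substeps),
    and the two integrations are coordinated every h1 seconds."""
    h2 = h1 / r
    xs, xf, t = xs0, xf0, 0.0
    while t < tf - 1e-12:
        for _ in range(r):             # fast dynamics integrated at step h2
            xf += h2 * f_fast(xs, xf)
        xs += h1 * f_slow(xs, xf)      # slow dynamics integrated at step h1
        t += h1
    return xs, xf
```

For example, with a slow state decaying at rate 0.1 and a fast state tracking it at rate 10, a coarse slow step *h*_{1} = 0.1 with *r* = 20 keeps the fast substep stable without forcing the slow dynamics onto the fine grid.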

It is worth noting at this point that DFSM (like any other surrogate modeling technique) has the potential to suffer from the curse of dimensionality. As previously defined, with *n* states, *p* physical design variables, and *n*_{t} simulation time steps, the DFSM input vector dimension is *n* + *p*, and is independent of *n*_{t}. DFSM does not eliminate the curse of dimensionality associated with surrogate modeling, and it is not claimed that DFSM should replace well-established methods, such as ROMs, for high-dimensional systems. The proposed DFSM method, however, solves an important niche class of problems where the system derivative function is computationally expensive to evaluate despite having a small to moderate number of states. This is the case with the third example in this article, and a significant computational advantage is demonstrated compared to conventional SM.

## Derivative Function Surrogate Modeling Construction and Validation

The design optimization process illustrated in Fig. 3 consists of an inner loop that solves Problem (4) for a given surrogate model, and an outer loop that iteratively enhances the surrogate model. This section details the DFSM construction and validation steps that are associated with this process.

### Sample Plan Construction.

First, a modeling domain is defined within the state and design spaces over which the surrogate model will be constructed. The complete modeling domain is $X \times P$. A sampling domain $S \subset X \times P$ is defined within the modeling domain from which samples are taken. We assume here that $S = X \times P$ for the first outer-loop iteration. Basic implementations hold the sampling domain fixed throughout the process; adaptive strategies can adjust sampling domain size, shape, and location to help focus sampling efforts more strategically. Here, the modeling domain is defined using simple bounds on the state and design spaces that are estimates of the maximum and minimum values that the plant design and state variables will attain. The sampling domain is adaptively shrunk in the vicinity of the predicted optimum solution as explained in Sec. 3.2. In this work, the sample points were chosen from within the sampling domain using Latin hypercube sampling [68].
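This sampling step can be sketched using SciPy's quasi-Monte Carlo module (an assumed implementation choice; the article does not specify a library, and `sample_domain` is an illustrative name):

```python
import numpy as np
from scipy.stats import qmc

def sample_domain(lower, upper, num_samples, seed=0):
    """Latin hypercube samples of concatenated [state, design] points drawn
    from the box-bounded sampling domain S."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    sampler = qmc.LatinHypercube(d=lower.size, seed=seed)
    unit = sampler.random(num_samples)        # stratified samples in [0, 1)^d
    return qmc.scale(unit, lower, upper)      # map onto the sampling domain
```

Shrinking the sampling domain between outer-loop iterations then amounts to passing tighter `lower`/`upper` bounds around the predicted optimum.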

### Surrogate Model Construction.

The sample points obtained via Latin hypercube sampling in the previous step are used as training points to construct the surrogate model. For every training point defined, a corresponding output point must be obtained by evaluating the analysis function (the original model to be approximated) using the training point as input. The observed output points $f(q_i) \in \mathbb{R}^n$ are functions of each training point $q_i$. Each training point consists of a concatenation of both state and design variables, i.e., $q_i^T = [\xi^T, x_p^T] \in S$. Derivative functions are vector-valued, so each training point produces *n* observed outputs, i.e., $f(q_i) = [f_1(q_i),\, f_2(q_i),\, f_3(q_i),\, \ldots,\, f_n(q_i)]^T$ is the output vector of observed derivatives for the training point $q_i$, where *n* is the number of states and each entry $f_j(\cdot)$ corresponds to the *j*th state derivative.

Once the observed outputs $f(q_i)$ are obtained, the $\langle q_i, f(q_i) \rangle$ pairs are used to “train” the surrogate model for each of the state derivatives separately. The surrogate model used here employs RBFs [69,70]. We use RBFs for surrogate modeling because they are (1) generalizable, i.e., they can accurately model arbitrary functions, and (2) simpler to implement than more involved approaches such as kriging models or artificial neural networks. These properties make RBFs very effective for multidimensional approximation (unlike splines, which cannot be easily scaled to multiple dimensions) [18,40,69,70].

Let *l* be the number of training points; then, for each of the *n* state derivatives, we can write the interpolation condition using the *l* training points

$$f_j(q_i) = \sum_{k=1}^{l} w_{kj}\, \psi(\|q_i - c_k\|), \quad i = 1, 2, \ldots, l$$

where *n* is the number of states and $c_k$ is the *k*th basis function center. The specific RBF used here is the thin plate spline function [40]: $\psi(r) = r^2 \log r$, where *r* is the Euclidean distance between the training point and basis function center: $r = \|q_i - c_k\|$. The objective in constructing the surrogate model is to find the coefficients $w_{kj}$. This can be done by solving for $w^j = [w_{1j},\, w_{2j},\, \ldots,\, w_{lj}]^T$ for each state derivative $j = 1, 2, \ldots, n$:

$$\Psi^j w^j = f^j$$

where $\Psi^j \in \mathbb{R}^{l \times l}$ is the “Gram matrix” [40] associated with the *j*th state derivative: $\Psi^j_{ik} = \psi(\|q_i - c_k\|)$ for $i, k = 1, 2, \ldots, l$, and $f^j = [f_j(q_1),\, f_j(q_2),\, \ldots,\, f_j(q_l)]^T$ is the vector of observed outputs for the *j*th state derivative using the *l* training points.
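The fit and prediction steps can be sketched in a few lines of NumPy (an illustrative implementation rather than the authors' code, assuming the RBF centers coincide with the training points so the Gram matrix is square; `fit_rbf` and `predict` are assumed names):

```python
import numpy as np

def tps(r):
    """Thin plate spline basis psi(r) = r^2 log r, with psi(0) = 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(r > 0.0, r**2 * np.log(r), 0.0)

def fit_rbf(Q, F):
    """Solve the interpolation conditions for the RBF weight columns.

    Q: (l, n + p) training points; F: (l, n) observed derivative outputs.
    With centers at the training points, the Gram matrix is shared across
    all n state derivatives, so the weights solve in one call.
    """
    r = np.linalg.norm(Q[:, None, :] - Q[None, :, :], axis=-1)  # pairwise distances
    Psi = tps(r)                        # Gram matrix
    return np.linalg.solve(Psi, F)      # column j holds the weights w^j

def predict(Qnew, Q, W):
    """Evaluate the surrogate derivatives f_hat at new points."""
    r = np.linalg.norm(Qnew[:, None, :] - Q[None, :, :], axis=-1)
    return tps(r) @ W
```

By construction, `predict(Q, Q, W)` reproduces the observed outputs at the training points (up to the conditioning of the Gram matrix).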

Unique values for the coefficients may be found since the Gram matrix is square and has full rank. Problem complexity is reduced further here by assuming that the RBF centers coincide with the training points, i.e., $c_i = q_i$. This simplification provides reasonably accurate results for the case studies presented in this article.

### Model Validation.

Surrogate models used in optimization must be accurate in regions of interest. Adaptive surrogate modeling methods gradually enhance accuracy in the region of the approximate optimum, and at convergence, the highest desired level of accuracy need only be achieved right at the optimum. Ensuring model accuracy in regions far from the optimum incurs unnecessary computational expense. Using surrogate models of derivative functions adds some complexity to the task of validating model accuracy. In addition to checking accuracy in regions near the optimal design point, accuracy must also be assured in the neighborhood of the state trajectory that corresponds to this point. A validation domain must be defined in both the state and design spaces where the surrogate model is required to be accurate within a specified tolerance.

In the iterative solution process outlined in Fig. 3, the boundaries of the modeling domain from which training samples are obtained are updated with each outer-loop iteration. The inner-loop (codesign) solution is used to determine new bounds on the modeling domain for the next iteration. For simplicity here, the validation domain is assumed to be equivalent to the modeling domain, although more sophisticated approaches may be taken where the validation domain is much smaller than the modeling domain. Techniques, such as support vector domain description [71,72], could be used to construct nonconvex boundaries around the state trajectories to define a tighter validation domain.

where *n*_{s} is the number of test points and $\|\cdot\|$ is the *l*_{2} norm. As illustrated in Fig. 3, model error is checked before proceeding with the codesign solution. An alternative strategy is to perform validation checks after the codesign solution.

## Codesign Optimization

The primary motivation for the surrogate modeling approach presented here is the efficient solution of codesign problems that involve dynamic system models with computationally expensive derivative functions. The formulations of two important codesign techniques are reviewed here, along with direct transcription (DT) codesign formulations that support more comprehensive treatment of physical design elements in codesign problems. This review provides necessary background for analyzing the use of DFSM in codesign.

### Nested Codesign.

Fathy et al. [75] first proposed the nested codesign formulation, which was later identified in Ref. [46] as a special case of the multidisciplinary design feasible formulation [76]. This formulation incorporates two loops: an outer loop optimizes the plant design, and an inner loop solves the optimal control problem for each plant design tested by the outer loop as shown in Fig. 4.

Evaluating *ϕ*_{*}(⋅) requires solution of an inner-loop optimal control problem for a given plant design $x_p$. The plant design $x_p$ is specified by the outer loop and is held fixed during the inner-loop solution. The inner loop finds the optimal control design **u**_{*}(*t*) (and the resulting optimal states $\xi_*(t)$) and returns the corresponding objective function value. Plant design constraints (or path constraints) **C**(⋅) are imposed in both loops to ensure system-level design feasibility [46,77]. The presence of inequality path constraints in the inner-loop problem motivates the use of "discretize-then-optimize" optimal control methods that can manage these constraints, such as direct transcription, which is discussed later in this section.
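A minimal runnable sketch of this nested structure, using a hypothetical scalar system and SciPy in place of the paper's Matlab implementation (the dynamics, costs, and bounds here are illustrative assumptions, not the paper's model):

```python
import numpy as np
from scipy.optimize import minimize

def inner_loop(xp, nt=12, tf=1.0):
    """Inner loop: optimal control for fixed plant design xp, solved by
    direct transcription with trapezoidal defect constraints.
    Toy dynamics: xi_dot = -xp[0]*xi + u, cost = integral of (xi^2 + u^2) dt.
    """
    h = tf / (nt - 1)

    def objective(z):
        xi, u = z[:nt], z[nt:]
        g = xi ** 2 + u ** 2                       # running cost
        return h * (0.5 * g[0] + g[1:-1].sum() + 0.5 * g[-1])

    def defects(z):                                # trapezoidal defects
        xi, u = z[:nt], z[nt:]
        f = -xp[0] * xi + u                        # derivative function
        return xi[1:] - xi[:-1] - 0.5 * h * (f[1:] + f[:-1])

    cons = [{"type": "eq", "fun": defects},
            {"type": "eq", "fun": lambda z: z[0] - 1.0}]   # xi(0) = 1
    res = minimize(objective, np.zeros(2 * nt), constraints=cons,
                   method="SLSQP")
    return res.fun

def outer_loop(candidates=np.linspace(0.2, 3.0, 7)):
    """Outer loop: optimize the plant design over a candidate set; each
    candidate triggers a full inner-loop optimal control solution."""
    totals = [inner_loop(np.array([a])) + 0.1 * a ** 2 for a in candidates]
    return candidates[int(np.argmin(totals))]
```

The outer loop here is a simple candidate sweep for clarity; in practice a gradient-based plant optimizer would call `inner_loop` at each iterate, which is exactly why expensive derivative functions make nested codesign costly.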

### Simultaneous Codesign.

An alternative codesign formulation solves for the plant and control design variables simultaneously [46,78] as shown in Fig. 5. This formulation accounts for all dynamic system interactions and plant-control design coupling, resulting in a system-optimal design that is often significantly better than conventional sequential (plant design followed by control design) methods. This simultaneous codesign approach is used in the investigation presented here.

Please note that in all of the formulations above, an open-loop optimal control strategy is used. This produces optimal control input trajectories and aids investigation of ultimate system performance limits and tradeoffs. Open-loop studies are appropriate for early-stage design, but in most cases do not yield implementable feedback control system designs. Here, we assume that implementable feedback control systems can achieve approximately the same performance as the optimal open-loop results. Ongoing complementary work is addressing the connection between early-stage open-loop codesign studies and the development of implementable digital control systems [79], including stability and robustness considerations. In cases where a large gap exists between open- and closed-loop behavior, additional measures must be taken during codesign investigations to account for the limitations of digital feedback control systems.

### Direct Transcription.

Conventional optimal control methods based on Pontryagin's maximum principle [80] take an "optimize-then-discretize" approach, where optimality conditions are applied to generate a boundary value problem that in special cases leads to a closed-form solution, but in general requires discretization and numerical solution to obtain optimal control trajectories. DT takes the inverse approach: the optimal control problem is discretized first, and the resulting nonlinear program (NLP) is solved using a standard NLP algorithm [81–83]. DT is thus a "discretize-then-optimize" approach that transcribes an infinite-dimensional optimal control problem into a large sparse finite-dimensional NLP. State and control trajectories are discretized over a finite number of time intervals, and the discretized representations of these trajectories become part of the set of optimization variables. This DT approach is favored in early-stage design since it can accommodate detailed nonlinear plant and path constraints involving physical design variables, states, and control, as opposed to the conventional Pontryagin maximum principle-based approach, where inclusion of generalized plant and path constraints is difficult to support (impossible in many cases, such as linear quadratic regulator control design). Moreover, in most previous codesign studies, plant design has been highly simplified, often overlooking physical constraints and the importance of design variable independence. This approach may lead to unrealizable plant designs or to system designs that do not fully exploit plant-control design coupling. DT-based codesign supports a balanced approach in which plant and control are both given thorough treatment, enabling engineers to construct the formulation that is best for improving overall system utility [46]. With this motivation, we now describe codesign approaches based on DT in more detail.

Let *n*_{t} be the number of discrete time steps: $t_0, t_1, \ldots, t_{n_t}$, and let $\xi[t_i] \in X$ be defined as the state vector and $u[t_i] \in U$ as the control input vector at time *t*_{i}. Now, consider the differential constraint in Eq. (4): $\dot{\xi}(t) = \hat{f}(\xi(t), x_p) + Bu(t)$. For the interval [*t*_{i}, *t*_{i+1}], it can be written as

where *t*_{i} ≤ *t*_{k} ≤ *t*_{i+1} (*k* = 1, 2, …, *K*) define the quadrature points, $\beta_k \in \mathbb{R}$ define the weights at the quadrature points, and *h*_{i} is the *i*th interval length. Using this quadrature, the defect constraints can now be formulated as

The state ($\Xi$) and control (**U**) matrices are then defined as

where $\zeta(\cdot)$ in Eq. (11*d*) is the defect constraint vector constructed using Eq. (10). When the defect constraints $\zeta(\cdot)$ are satisfied by $\Xi$, the discretized system state equations are satisfied. The surrogate derivative function $\hat{f}(\cdot)$ is used in the calculation of the defect constraints. The function $\gamma(t, \Xi, x_p, U)$ is the discrete approximation of the Lagrange term in Eq. (4). This discrete approximation can also be obtained using any appropriate quadrature method. Equation (12) describes this approximation

where *w*_{i} are the weights specific to the quadrature method used (refer to Ref. [78] for further details). Several numerical examples with the structure of codesign problem (4) were solved using DT and are presented in Sec. 5. A trapezoidal quadrature method [81] was used for all the examples in this paper. The NLP defined in Prob. (11*a*)–(11*d*) was solved using the fmincon function in Matlab^{®}. An important advantage of DT to emphasize here is its parallel nature; all defect constraints are independent, enabling massively parallel implementations. In addition, increased defect constraint dimension has a mild impact on solution expense compared to general NLPs due to problem structure; through careful formulation and application of sparse finite differences, the number of derivative function evaluations required to calculate NLP sensitivities does not increase with *n*_{t} [81].

The efficiency of surrogate modeling methods is impacted significantly by the dimension of the model input variables. For this DFSM codesign formulation, observe that the number of time points *n*_{t} does not impact the SM input dimension. The difficulty of constructing a derivative function surrogate model is independent of *n*_{t} (see Fig. 1), but is influenced by the number of state and design variables, as well as the complexity of the derivative function response. Increasing *n*_{t} does increase defect constraint dimension, but this is true whether or not a surrogate derivative function is used (and recall that this only mildly impacts solution expense due to parallelism and problem structure). In summary, DFSM codesign implementations are expected to be computationally inefficient when the number of state (*n*) or design variables (*p*) is large, but remain efficient for problems with fine temporal discretization.
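This independence from *n*_{t} can be made concrete with a small sketch (hypothetical one-state dynamics and a polynomial surrogate, assuming trapezoidal quadrature): the defect constraints call only the surrogate, so expensive original-function evaluations are incurred only while building training data, regardless of how fine the time grid is.

```python
import numpy as np

calls = {"f": 0}   # counter for expensive original-function evaluations

def f_original(xi, a):
    """Stand-in for an expensive derivative function (hypothetical
    dynamics; every call is counted)."""
    calls["f"] += 1
    return -a * xi + 0.1 * xi ** 2

def build_surrogate(a, n_train=30):
    """Fit a cheap cubic-polynomial surrogate in xi for a fixed design a.
    Only this step touches the original derivative function."""
    xi_train = np.linspace(-2.0, 2.0, n_train)
    F = np.array([f_original(x, a) for x in xi_train])   # n_train calls
    coeffs = np.polyfit(xi_train, F, deg=3)
    return lambda xi: np.polyval(coeffs, xi)

def defects(xi_grid, f_hat, h):
    """Trapezoidal defect constraints built entirely from the surrogate:
    zeta_i = xi_{i+1} - xi_i - (h/2) * (f_hat_{i+1} + f_hat_i)."""
    F = f_hat(xi_grid)
    return xi_grid[1:] - xi_grid[:-1] - 0.5 * h * (F[1:] + F[:-1])
```

Evaluating the defects on a grid with *n*_{t} = 1000 points adds no original-function calls; the evaluation count is fixed by the training sample size alone.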

## Numerical Results

Derivative function surrogate modeling is demonstrated here using three example problems that involve nonlinear dynamics. The first two examples are simple analytical problems with known solutions, supporting method validation and detailed exposition. The third example is a wind turbine codesign problem, where the state derivatives are calculated by a computationally expensive analysis function FAST [84]. In all three cases, a surrogate model is developed for the derivative function with dependence on both state and design variables, enabling efficient solution of the codesign problem.

### Example 1: Two-Dimensional Nonlinear System.

This example is based on a problem presented in Ref. [85], and was extended to a simple codesign problem formulation. Here, $x_p = [a,\, b]^T$ are treated as plant design variables, and $\xi(t) = [\xi_1(t),\, \xi_2(t)]^T$ are the state variables. Please note that in realistic cases, model parameters in state space equations are not in general independent plant design variables, but are usually intermediate variables that depend on independent design variables [77]. This example formulation is a significant simplification.

Figure 6 illustrates solution results based on the actual system dynamics (i.e., the original derivative function) and compares them to the DFSM solution results. The two solutions agree within the specified surrogate model tolerance limit (0.001) and are qualitatively indistinguishable. The corresponding optimal plant design is presented in Table 1 for both the nonsurrogate (actual) and DFSM solutions. The number of discretization time points (*n*_{t}) for this example was chosen to be 50. This example required 4000 original derivative function evaluations for the DFSM solution. This includes two iterations of surrogate model construction that were required to provide adequate accuracy. Each iteration used 2000 combined sample points for both training and validation. Please note that the total number of derivative function evaluations for DFSM equals the total number of sample points needed for SM training and validation. Compare these results to the solution based on the original (actual) derivative function: 53,550 derivative function evaluations (= 1071 defect constraint evaluations × 50 (*n*_{t}, DT discretization points)) were required, far more than with DFSM. This highlights the value of DFSM in terms of computational expense in cases where derivative function evaluations dominate solution expense. If derivative function evaluation is inexpensive (as it is with this simple example), the benefit of reduced derivative function evaluations is minor, and total computational expense is increased due to surrogate modeling overhead. DFSM is advantageous only when the number of state and design variables is not large, and when the derivative function is computationally expensive.

### Example 2: Two-Dimensional Nonlinear System.

As in the first example, $x_p = [a,\, b]^T$ are the plant design variables and $\xi(t) = [\xi_1(t),\, \xi_2(t)]^T$ are the system state variables. Figure 7 compares the solutions of the above codesign problem using the actual system dynamics and the DFSM solution strategies. The number of discretization time points for this example was chosen to be *n*_{t} = 50. As with the first example, the DFSM solution is identical to the solution of the original problem within the specified model tolerance limit. The optimal plant design results are presented in Table 2. This example required 4000 original derivative function evaluations for the DFSM approach, compared to 38,250 derivative function evaluations for the conventional solution strategy (i.e., DT codesign with direct derivative function evaluation).

### Example 3: Wind Turbine Codesign.

DFSM was also used to solve a simultaneous structural and control design problem for a horizontal-axis wind turbine (Fig. 8). The objective here is to maximize power extraction from the lightest possible wind turbine for given input wind conditions, while satisfying structural deflection constraints. Other possible design objectives for wind turbines include minimization of cost per unit of energy (CoE) [86] and maximization of annualized energy production [87]. The example here is derived from the codesign problem presented in Ref. [87], which uses annualized energy production as an objective and treats turbine blade distributed geometric design in a detailed way. Here, CoE is not used as the design objective due to the unavailability of appropriate nonproprietary cost models. CoE would be a more suitable design objective because it captures important cost-versus-benefit trade-offs for wind turbine design. CoE functions, however, are not always smooth (and are even discontinuous at times), thereby introducing additional complexities into the design optimization problem. Other metrics may be used as proxies for cost, such as system mass, which is known to approximate manufacturing cost reasonably well. We believe the wind turbine codesign formulation used here is sufficiently complex and computationally expensive to demonstrate the merits of the DFSM method, whereas deeper study of wind turbine codesign using accurate cost modeling is an important topic for future work. The codesign problem formulation used here is

where $m_s(x_p)$ is the turbine mass for a given plant design vector. Structural deflection constraints on the states, $\|\xi_1\|_\infty - \xi_{1\max} \le 0$ and $\|\xi_2\|_\infty - \xi_{2\max} \le 0$, depend on *ξ*_{1} and *ξ*_{2} (tower fore-aft bending and blade out-of-plane bending, respectively). The constraint $P_m(v_{\mathrm{rated}}) - P_{\mathrm{rated}} \ge 0$ ensures that the wind turbine generates the full rated power when the wind is blowing at the rated speed *v*_{rated}. The physical design variables ($x_p$) considered in this study are the tower height *H*_{t}, blade length *R*_{r}, and hub radius *R*_{h}. Power capture maximization can be achieved by minimizing the deviation of the rotor control torque *u*(*t*) = *τ*(*t*) from the optimal torque *τ*_{opt}(*t*) required to track the locus of maximum power coefficient [88]. The derivative function model of this system is highly nonlinear, and is based on the state derivative calculations available through FAST v7 [84]. It is important to note that these derivative function evaluations require seconds to evaluate, meaning that simulation based on direct derivative function evaluation is much slower than real time, making direct codesign solution (without the use of DFSM) computationally prohibitive, particularly when conducting more extensive trade-off studies or simulations using larger amounts of wind data.

Here, *w*_{1} > 0 and *w*_{2} > 0 are the weights on the structural mass and energy production ($\int_0^{t_f} L(\cdot)\,dt$) objective function terms, respectively. The weights could be varied to produce optimal tradeoff data, but a single-objective formulation is used for this DFSM study. This design problem was solved using wind profiles obtained at SITE-05730, Indiana, U.S. [89], and the number of discretization time points (*n*_{t}) used for DT was 40. The optimal plant design vector for this problem is listed in Table 3. It should be noted from Table 3 that the DFSM solution is slightly worse than the actual solution; this can be attributed to the approximation error introduced by surrogate modeling. The solution error is within the specified SM tolerance, and could be reduced if desired. The statistics provided in Table 4 show that the number of original derivative function evaluations and the overall solution time are both reduced significantly when using the DFSM. The overall solution time for DFSM is also observed to be lower than that of conventional SM; this is due in part to the limitation that conventional SM can only be used with single shooting methods, which have provably slower convergence and are less numerically robust than more sophisticated methods such as DT [81,82]. This provides important initial evidence, based on a moderately high-fidelity system model, that DFSM is a promising approach for the design of nonlinear dynamic systems. The accuracy of the DFSM solution can be improved by increasing the number of training points in the vicinity of the optimal solution, or by reducing the accuracy tolerance (sum of normed errors) to increase the number of DFSM update iterations. In either case, improved accuracy comes at the cost of higher computational expense. Thus, it is important to consider this trade-off between computational expense and solution accuracy when using DFSM.

Variable | Actual | DFSM |
---|---|---|
R_{r} | 56.93 m | 57.91 m |
H_{t} | 70.00 m | 70.00 m |
R_{h} | 0.96 m | 0.95 m |
J | 1.95 × 10^{10} | 1.98 × 10^{10} |


Parameter | DT using f(⋅) (No DFSM) | DT using $\hat{f}(\cdot)$ (DFSM) | SM using $\hat{\varphi}(\cdot)$ (Conventional SM) |
---|---|---|---|
f(⋅) evals | 25,160 | 2800 | NA |
Solution time | 419 min | 124 min | 618 min |


## Conclusion

The work reported in this article is an advancement of methods for multidisciplinary dynamic system design optimization [46], which often involves computationally expensive dynamic system simulations. Here, we demonstrated a new way of using surrogate modeling methods that capitalizes on the unique properties of dynamic systems to enable efficient codesign solution. This supports the use of higher-fidelity models for early-stage codesign studies. In many important cases, derivative function calculations dominate computational expense, and the DFSM method introduced here can significantly reduce the number of expensive original derivative calculations when compared to direct use of the derivative function. This comparison accounts for all sample points required for iterative surrogate model construction and validation. DFSM is expected to be advantageous only when system models have a small to moderate number of states and each derivative function evaluation is numerically expensive. This computational benefit depends on the tradeoff between the DFSM input vector dimension (*n* + *p*, independent of *n*_{t}) and derivative function evaluation expense. For problems with significant derivative function expense but a modest number of DFSM inputs, DFSM may be beneficial. Derivative function surrogate model construction expense is independent of the number of time steps in a simulation, but increases with the number of state and design variables. This article also illustrated the use of direct transcription in solving codesign problems, an emerging area of multidisciplinary dynamic system design optimization. Three example problems were used to demonstrate how to utilize surrogate models of derivative functions efficiently in solving codesign problems. Two were simple analytical problems, whereas the third was a moderately high-fidelity wind turbine codesign problem that is computationally expensive to solve using conventional techniques.

This preliminary work opens the door to a wide range of further research topics based on this new method. Ongoing and future work involves investigation of improved SM validation methods (including validation domain description), efficient resampling techniques, and extension to fully nonlinear systems, i.e., $\dot{\xi}(t) = f(\xi(t), x_p, u(t))$. Another important aspect of DFSM-based codesign is formal analysis of dynamic system stability, specifically to guard against artificial bifurcations and other nonlinear phenomena that are artifacts of approximation; this knowledge will help improve DFSM robustness. The studies presented here addressed only DDFA and one specific case of IDFA. Future efforts should investigate other IDFA cases, including developing a detailed understanding of when DFSM should be selected over other solution techniques. Integrating open-loop codesign studies with implementable feedback control system development is an important topic of ongoing investigation. A final topic identified here as future work is a thorough convergence analysis to determine under what conditions DFSM converges to the solution of the original codesign problem, and at what rate convergence occurs.

## Acknowledgment

This work was supported in part by the Clean Energy Education and Research Fellowship awarded to the first author by the Graduate College at the University of Illinois at Urbana–Champaign, and by the National Science Foundation under Grant No. CMMI-1463203. This support is gratefully acknowledged.

## Funding Data

Division of Civil, Mechanical and Manufacturing Innovation (Grant No. 1463203).

Graduate College, University of Illinois at Urbana-Champaign (Clean Energy Fellow).

The notation $\mathbb{R}^a \times \mathbb{R}^b$ defines the Cartesian product between two spaces: $\mathbb{R}^a$ and $\mathbb{R}^b$.

More generally, derivative functions and these parameters may also be functions of control inputs.

In more general cases where **f**(⋅) depends nonlinearly on the control, **q**_{i} includes control inputs as well. The resulting modeling domain in this case will be $X \times P \times U$.

This discussion assumes DDFA. In IDFA, the output observations may be model parameters or other values as opposed to derivative outputs.