Deep learning in computational mechanics: a review
https://doi.org/10.1007/s00466-023-02434-4
REVIEW ARTICLE
Received: 19 September 2023 / Accepted: 8 December 2023 / Published online: 13 January 2024
© The Author(s) 2024
Abstract
The rapid growth of deep learning research, including within the field of computational mechanics, has resulted in an
extensive and diverse body of literature. To help researchers identify key concepts and promising methodologies within this
field, we provide an overview of deep learning in deterministic computational mechanics. Five main categories are identified
and explored: simulation substitution, simulation enhancement, discretizations as neural networks, generative approaches,
and deep reinforcement learning. This review focuses on deep learning methods rather than applications for computational
mechanics, thereby enabling researchers to explore this field more effectively. As such, the review is not necessarily aimed at
researchers with extensive knowledge of deep learning—instead, the primary audience is researchers on the verge of entering
this field or those attempting to gain an overview of deep learning in computational mechanics. The discussed concepts are,
therefore, explained as simply as possible.
Keywords Deep learning · Computational mechanics · Neural networks · Surrogate model · Physics-informed · Generative
Fig. 1 Number of publications concerning artificial intelligence and some of its subtopics since 1999 showing the exponential growth of literature
within the field. Illustration inspired by [40]
the data, but to generate statistically similar data. This is useful in diversifying the design space or enhancing a data set to train surrogate models.

Finally, in deep reinforcement learning, an agent learns how to interact with an environment in order to maximize rewards provided by the environment. In the case of deep reinforcement learning, the agent is modeled with NNs. In the context of computational mechanics, the environment is modeled by the governing physical equations. Reinforcement learning provides an alternative to gradient-based optimization, which is useful when gradient information is not available.

The unique proposed taxonomy arises from a methodological viewpoint, instead of an application [22–35] or problem [42] oriented perspective. However, parallels can be drawn to the challenges and proposed areas of investigation in machine learning identified in [42]. Similarly, the distinction between machine learning enhanced⁴ and substitution by machine learning models is made. Additionally, challenges such as robustness, explainability, and handling of complex and high-dimensional data are highlighted. Also, the separation between physics-informed learning and data-driven modeling is made by [42], as well as by [43]. Interestingly, older reviews [3, 4] arrived at similar categories, additionally including NNs as means of more efficient implementations, i.e., discretizations as NNs. Only the last two proposed categories, generative approaches and deep reinforcement learning, have not been spotlighted as methodologies within reviews of computational mechanics. But these are well-established within the machine learning community [44–47] and sufficiently distinct to be treated separately.

1.3 Deep learning

Before continuing with the topics specific to computational mechanics, NNs⁵ and the notation used throughout this work are briefly introduced. In essence, NNs are function approximators that are capable of approximating any continuous function [50]. The NN, parametrized by the learnable parameters θ (typically consisting of weights w and biases b), learns a function ŷ = f_NN(x; θ), which approximates the relation y = f(x). The NN is constructed with nested linear transformations in combination with non-linear activation functions σ. The most basic NNs, fully connected NNs, achieve this with layers of fully connected neurons (see Fig. 2), where the activation a_i^k of each neuron (the ith neuron of layer k) is obtained through a linear combination of the previous layer and the non-linear activation function σ:

$$a_i^k = \sigma\left(\sum_{j=1}^{n} w_{ij}^k a_j^{k-1} + b_i^k\right). \qquad (1)$$

If more than one layer (excluding input x and output layer ŷ) is employed, the NN is considered a deep NN, and its training process is thereby deep learning. The evaluation of the NN, i.e., the prediction, is referred to as forward propagation. The quality of prediction is determined by a cost function C(ŷ), which is to be minimized. Its gradients ∇_θ C = {∇_w C, ∇_b C} with respect to the parameters θ are obtained with automatic differentiation [51], specifically referred to as backward propagation in the context of NNs. The gradients are used within a gradient-based optimization [44, 52, 53] to update the parameters θ and thereby improve the prediction ŷ.

⁴ A further interesting distinction is made between inner (within a forward simulation) and outer loop enhancements (using multiple forward simulations, e.g., within an optimization).
⁵ See [44] for an in-depth treatment and PyTorch [48] or TensorFlow [49] for deep learning libraries.
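To make the notation concrete, the following minimal PyTorch sketch implements a fully connected NN, forward propagation, and backward propagation; the layer sizes, the activation function, and the placeholder cost are arbitrary illustrative choices.

```python
# Minimal fully connected NN (Eq. 1): nested linear transformations
# combined with non-linear activation functions sigma (here tanh).
import torch
import torch.nn as nn

f_nn = nn.Sequential(
    nn.Linear(2, 16), nn.Tanh(),
    nn.Linear(16, 16), nn.Tanh(),
    nn.Linear(16, 1),
)

x = torch.rand(8, 2)         # batch of inputs
y_hat = f_nn(x)              # forward propagation
cost = (y_hat ** 2).mean()   # placeholder cost function C(y_hat)
cost.backward()              # backward propagation: gradients of C with
                             # respect to all weights and biases theta
```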
Supervised learning relies on labeled data x^M, y^M to establish a cost function C, while unsupervised learning does not rely on labeled data. The parameters defining the user-defined training algorithm and NN architecture are referred to as hyperparameters. The concept is summarized by Fig. 2, showing a fully connected multi-layer, i.e., deep, NN. More advanced NN architectures discussed throughout this work are described in Appendix A.

Notational Remark 1 Data sets are denoted by a superscript M, i.e., {x^M, y^M}_{i=1}^{N_M}, where N_M is the data set size.

Notational Remark 2 Although x and y may denote vector-valued quantities, we do not use bold-faced notation for them. Instead, this is reserved for all N degrees of freedom within a problem, i.e., x = {x_i}_{i=1}^N, y = {y_i}_{i=1}^N. This can, for instance, be in the form of a domain sampled with N grid points or systems composed of N degrees of freedom. Note, however, that matrices will still be denoted with capital letters in bold face.

Notational Remark 3 A multitude of NN architectures will be discussed throughout this work, for which we introduce abbreviations and subscripts. Most prominent are fully connected NNs F_FNN (FC-NNs) [44, 54], convolutional NNs f_CNN (CNNs) [55–57], recurrent NNs f_RNN (RNNs) [58–60], and graph NNs f_GNN (GNNs) [61–63]⁶. If the network architecture is independent of the method, the network is denoted as f_NN.

2 Simulation substitution

In the field of computational mechanics, numerical procedures are developed to solve or find partial differential equations (PDEs). A generic PDE can be written as

$$\mathcal{N}[u(x, t); \lambda(x, t)] = 0, \quad \text{on } \Omega \times \mathcal{T}, \qquad (2)$$

where a non-linear operator N acts on a solution u(x, t) of a PDE as well as the coefficients λ(x, t) of the PDE⁷ in the spatio-temporal domain Ω × T. In the forward problem, the solution u(x, t) is to be computed, while the inverse problem considers either the non-linear operator N or coefficients λ(x, t) as unknowns.

A further distinction is made between methods treating the temporal dimension t as a continuum, as in space-time approaches [67] (Sects. 2.1.1 and 2.2.1)⁸, or in discrete sequential time steps, as in time-stepping procedures (Sects. 2.1.2 and 2.2.2). For simplicity, but without loss of generality, time-stepping procedures will be presented on PDEs with a first order derivative with respect to time:

$$\frac{\partial u}{\partial t} = \mathcal{N}_T[u; \lambda], \quad \text{on } \Omega \times \mathcal{T}, \qquad (3)$$

with the non-linear operator N_T. Another task in computational mechanics is the forward modeling and identification of systems of ordinary differential equations (ODEs). For this, we will consider systems of the following form:

$$\frac{d\boldsymbol{x}(t)}{dt} = \boldsymbol{f}(\boldsymbol{x}(t)). \qquad (4)$$

Here, x(t) are the time-dependent degrees of freedom and f is the right-hand side defining the system of equations.⁹ Both the forward problem of computing x(t) and the inverse problem of identifying f will be discussed in the following.

2.1 Data-driven modeling

Data-driven modeling relies entirely on labeled data x^M, y^M. The NN learns the mapping between x^M and y^M with ŷ_i = f_NN(x_i; θ). Thereby an interpolation to yet unseen data points is established. A data-driven loss L_D, such as the mean squared error, can be used as cost function C:

$$C = \mathcal{L}_D = \frac{1}{2N_M}\sum_{i=1}^{N_M} \|\hat{y}_i - y_i^M\|_2^2. \qquad (5)$$

To declutter the notation, but without loss of generality, the temporal dimension t is dropped in this section, as it is possible to treat it like any other spatial dimension x in the scope of these methods. The goal of the upcoming methods is to either learn a forward operator û = F[λ; x], an inverse operator for the coefficients λ̂ = I[u; x], or an inverse operator for the non-linear operator N̂ = O[u; λ; x].¹⁰ The methods will be explained using the forward operator, but they apply analogously to the inverse operators. Only the inputs and outputs differ.

⁶ Another architecture worth mentioning, as it has recently been applied for regression [64, 65], are spiking NNs [66], specialized to run on neuromorphic hardware and thereby reduce memory and energy consumption. These are, however, not treated in this work.
⁷ In case of the bar equation (Eq. 25), the PDE coefficients could be the cross-sectional stiffness EA(x) or/and the distributed load p(x).
⁸ Static problems without time-dependence can only be treated by the space-time approaches.
⁹ Note that a spatial discretization of the PDE equation (3) can also be written as a system of ODEs.
¹⁰ Note that u might only be partially known on the domain Ω for inverse problems.

The solution prediction û_i at coordinate x_i, or û_i on the entire domain, is made based on a set of inverse coefficients λ_i. The cost function C is formulated analogously to Eq. (5):
$$C = \mathcal{L}_D = \frac{1}{2N_M}\sum_{i=1}^{N_M} \|\hat{u}_i - u_i^M\|_2^2 \quad \text{or} \quad C = \mathcal{L}_D = \frac{1}{2N_M}\sum_{i=1}^{N_M} \|\hat{\boldsymbol{u}}_i - \boldsymbol{u}_i^M\|_2^2. \qquad (6)$$
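In code, minimizing the data-driven cost of Eqs. (5) and (6) amounts to a standard supervised training loop, sketched below in PyTorch with a placeholder network and random stand-ins for the labeled data set {x^M, y^M}.

```python
# Data-driven training: gradient-based minimization of C = L_D (Eq. 5).
import torch
import torch.nn as nn

f_nn = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
x_data = torch.rand(256, 2)   # stands for the labeled inputs x^M
y_data = torch.rand(256, 1)   # stands for the labeled outputs y^M

optimizer = torch.optim.Adam(f_nn.parameters(), lr=1e-3)
for epoch in range(1000):
    optimizer.zero_grad()
    cost = 0.5 * ((f_nn(x_data) - y_data) ** 2).mean()  # Eq. (5)
    cost.backward()
    optimizer.step()
```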
predictions induced by tunneling [195], which was extended to damage prediction in affected structures [196, 197]. RNNs are often combined with reduced order model encodings [198], where the dynamics are predicted on the reduced latent space, as demonstrated in [199–205]. Further variations employ classical time-stepping schemes on the reduced latent space obtained by autoencoders [206, 207].

2.1.2.2. Dynamic mode decomposition
Another approach that was formulated for system dynamics, i.e., Eq. (4), is dynamic mode decomposition (DMD) [208, 209]. The aim of DMD is to identify a linear operator A that relates two successive snapshot matrices with n time steps X = [x(t_1), x(t_2), …, x(t_n)]^T, X′ = [x(t_2), x(t_3), …, x(t_{n+1})]^T:

$$\boldsymbol{X}' \approx \boldsymbol{A}\boldsymbol{X}. \qquad (16)$$

To solve this, the problem is reframed as a regression task. The operator A is approximated by minimizing the Frobenius norm of the difference between X′ and AX. This minimization can be performed using the Moore-Penrose pseudoinverse X^† (see, e.g., [38]):

$$\boldsymbol{A} = \underset{\boldsymbol{A}}{\arg\min}\, \|\boldsymbol{X}' - \boldsymbol{A}\boldsymbol{X}\|_F = \boldsymbol{X}'\boldsymbol{X}^\dagger. \qquad (17)$$

Once the operator is identified, it can be used to propagate the dynamics forward in time, approximating the next state x(t_{i+1}) using the current state x(t_i):

$$\boldsymbol{x}(t_{i+1}) \approx \boldsymbol{A}\boldsymbol{x}(t_i). \qquad (18)$$
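The regression of Eq. (17) and the forward propagation of Eq. (18) reduce to a few lines of NumPy, as sketched below; the snapshot matrix is a random placeholder, and snapshots are stored as columns.

```python
# DMD sketch: fit the linear operator A (Eq. 17) and roll out the
# dynamics (Eq. 18); random data stands in for measured snapshots.
import numpy as np

rng = np.random.default_rng(0)
snapshots = rng.standard_normal((3, 101))  # x(t_1), ..., x(t_{n+1})
X = snapshots[:, :-1]                      # x(t_1), ..., x(t_n)
X_prime = snapshots[:, 1:]                 # x(t_2), ..., x(t_{n+1})

A = X_prime @ np.linalg.pinv(X)            # Eq. (17): A = X' X^dagger

x_i = X[:, -1]
for _ in range(10):                        # Eq. (18): x(t_{i+1}) = A x(t_i)
    x_i = A @ x_i
```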
This framework is, however, only valid for linear dynamics. DMD can be extended to handle non-linear systems through the application of Koopman operator theory [210]. According to Koopman operator theory, it is possible to represent a non-linear system as a linear one by using an infinite-dimensional Koopman operator K that acts on a transformed state g(x(t_i)):

$$g(\boldsymbol{x}(t_{i+1})) = \mathcal{K}[g(\boldsymbol{x}(t_i))]. \qquad (19)$$

In theory, the Koopman operator K is an infinite-dimensional linear transformation. In practice, however, finite-dimensional approximations are employed. This approach is, for example, utilized in the extended DMD [211], where the regression from Eq. (17) is performed on a higher-dimensional state h(t_i) = g(x(t_i)) relying on a dictionary of orthonormal basis functions h(t_i) = ψ(x(t_i)). Alternatively, the dictionary can be learned using NNs, i.e., ψ̂(x) = ψ_NN(x; θ), as demonstrated in [212, 213]. The NN is trained by minimizing the mismatch between the predicted state ψ̂(x(t_{i+1})) = Aψ̂(x(t_i)) (Eq. 18) and the true state in the dictionary space. Orthogonality is not required and therefore not enforced.

$$C = \frac{1}{2N}\sum_{i=1}^{N} \|\hat\psi(\boldsymbol{x}(t_{i+1})) - \boldsymbol{A}\hat\psi(\boldsymbol{x}(t_i))\|_2^2 \qquad (20)$$

When the dictionary is learned, the state predictions can be reconstructed using the Koopman mode decomposition, as explained in detail in [212].

Alternatively, the mapping to the augmented state can be performed with autoencoders, which at the same time allows for a direct map back to the original space [214–217]. Thus, an encoder learns a reduced latent space ĥ(x) = e_NN(x; θ^e) and a decoder learns the inverse mapping x̂(h) = d_NN(h; θ^d). The networks are trained using three losses: the autoencoder reconstruction loss L_A, the linear dynamics loss L_R, and the future state prediction loss L_F.

$$\mathcal{L}_A = \frac{1}{2(n+1)}\sum_{i=1}^{n+1} \|\boldsymbol{x}(t_i) - d_{NN}(e_{NN}(\boldsymbol{x}(t_i); \theta^e); \theta^d)\|_2^2 \qquad (21)$$

$$\mathcal{L}_R = \frac{1}{2n}\sum_{i=1}^{n} \|e_{NN}(\boldsymbol{x}(t_{i+1}); \theta^e) - \boldsymbol{A}e_{NN}(\boldsymbol{x}(t_i); \theta^e)\|_2^2 \qquad (22)$$

$$\mathcal{L}_F = \frac{1}{2n}\sum_{i=1}^{n} \|\boldsymbol{x}(t_{i+1}) - d_{NN}(\boldsymbol{A}e_{NN}(\boldsymbol{x}(t_i); \theta^e); \theta^d)\|_2^2 \qquad (23)$$

$$C = \kappa_A \mathcal{L}_A + \kappa_R \mathcal{L}_R + \kappa_F \mathcal{L}_F \qquad (24)$$

The cost function C is composed of a weighted sum of the loss terms L_A, L_R, L_F and weighting terms κ_A, κ_R, κ_F. Furthermore, [216] allows A to vary depending on the state. This is achieved by predicting the eigenvalues of A with an auxiliary network and constructing the matrix from these.
2.1.3 Active learning and transfer learning

Finally, an important machine learning technique independent of the NN architecture and applicable to both space-time and time-stepping approaches is active learning [218]. Instead of precomputing a labeled data set, data is only provided when the prediction quality of the NN is insufficient. Furthermore, the data is not chosen arbitrarily, but only in the vicinity of the failed prediction. In computational mechanics, the prediction of the NN can be assessed with an error indicator. For an insufficient result, the results of a classical simulation are used to retrain the NN. Over time, the NN estimates improve in the respective domain of application. Due to the error indicator and the classical simulations, the
predictions are reliable. Examples for active learning in computational mechanics can be found in [219–221].

Another technique, transfer learning [222, 223], aims at accelerating the NN training. Here, the NN is first trained on a similar task. Subsequently, it is applied to the task of interest—where it converges faster than an untrained NN. Applications in computational mechanics can be found in [98, 224].
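In its simplest form, transfer learning only requires initializing a network with parameters trained on the similar task before fine-tuning, as in the following sketch (the checkpoint file name is hypothetical).

```python
# Transfer learning sketch: reuse pre-trained parameters as starting point.
import torch
import torch.nn as nn

f_nn = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
f_nn.load_state_dict(torch.load("pretrained_similar_task.pt"))

# fine-tune on the task of interest, typically with a reduced learning rate
optimizer = torch.optim.Adam(f_nn.parameters(), lr=1e-4)
```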
2.2 Physics-informed learning

In supervised learning, as discussed in Sect. 2.1, the quality of prediction strongly depends on the amount of training data. Acquiring data in computational mechanics may be expensive. To reduce the amount of required data, constraints enforcing the physics have been proposed. Two main approaches exist [43, 225]. The physics can be enforced by modifying the cost function through a penalty term punishing unphysical predictions, thus acting as a regularizer. Possible modifications are discussed in the upcoming section. Alternatively, the physics can be enforced by construction, i.e., by reducing the learnable space to a physically meaningful space. This approach is highly specific to its application and will therefore mainly be explored in Sect. 3. A brief coverage is provided in Sect. 2.2.3.

Both approaches can be found in overview publications, where [43] defines four overarching methodologies: (i) augmentation of training data using prior knowledge, (ii) modification of the model, i.e., enforcement by construction, (iii) enhancement of the learning algorithm with regularization terms, i.e., enforcing constraints through the cost function, and (iv) checking the final estimate and thereby discarding physical violations (using, e.g., error indicators). The two most prominent methodologies, i.e., modifying the cost function and enforcement by construction, are similarly mentioned in [225], which correspondingly refers to them as physics-informed and physics-augmented. Further variations in terminology can be found in [182, 226], who refer to physics-informed NNs for multiple solutions as physics-constrained deep learning, or [227] using the term physics-enhanced NNs for NNs enforcing the physics by construction. Due to the many names within the relatively new and interconnected field, we cover the variations under the overarching term of physics-informed learning.

2.2.1 Space-time approaches

Once again and without loss of generality, the temporal dimension t is dropped to declutter the notation. However, in contrast to Sect. 2.1.1, the following methods are not equally applicable to forward and inverse problems. Thus, the prediction of the solution û, the PDE coefficients λ̂, and the non-linear operator N̂ are treated separately.

2.2.1.1. Differential equation solving with neural networks
The concept of solving PDEs¹⁵ with NNs was first proposed in the 1990s [8–10], but was recently popularized by the so-called physics-informed neural networks (PINNs) [228] (see [229–231] for recent review articles and SciANN [232], SimNet [233], DeepXDE [234] for libraries).

To illustrate the idea and variations of PINNs, we will consider the differential equation of a static elastic bar

$$\frac{d}{dx}\left(EA\frac{du}{dx}\right) + p = 0, \quad x \in \Omega. \qquad (25)$$

Here, the operator N is given by the left-hand side of the equation, the solution u(x) is the axial displacement, and the spatially varying coefficients λ(x) are given by the cross-sectional properties EA(x) and the distributed load p(x). Additionally, boundary conditions are specified, which can be in terms of Dirichlet (on Γ_D) or Neumann boundary conditions (on Γ_N):

$$u(x) = g(x), \quad x \in \Gamma_D, \qquad (26)$$
$$EA(x)\frac{du(x)}{dx} = f(x), \quad x \in \Gamma_N. \qquad (27)$$

Physics-informed neural networks
PINNs [228] approximate either the solution u(x), the coefficients λ(x), or both with FC-NNs:

$$\hat{u}(x) = F_{FNN}(x; \theta^u), \qquad (28)$$
$$\hat{\lambda}(x) = I_{FNN}(x; \theta^\lambda). \qquad (29)$$

Instead of training the network with labeled data as in Eq. (6), the residual of the PDE is considered. The residual is evaluated at a set of N_N points, called collocation points. Taking the mean squared error over the residual evaluations yields the PDE loss

$$\mathcal{L}_\mathcal{N} = \frac{1}{2N_N}\sum_{i=1}^{N_N} \|\mathcal{N}[u(x_i); \lambda(x_i)]\|_2^2 = \frac{1}{2N_N}\sum_{i=1}^{N_N} \left(\frac{d}{dx}\left(EA(x_i)\frac{du(x_i)}{dx}\right) + p(x_i)\right)^2. \qquad (30)$$

The gradients of the possible predictions, i.e., u, EA, and p with respect to x, are obtained with automatic differentiation [51] through the NN approximation.

¹⁵ Typically, a single solution to a PDE is obtained. If the PDE is parametrized, multiple solutions can be obtained.
Similarly, the boundary conditions are enforced at the N_BD + N_BN boundary points, yielding the boundary loss

$$\mathcal{L}_B = \frac{1}{2N_{BD}}\sum_{i=1}^{N_{BD}} \|u(x_i) - g(x_i)\|_2^2 + \frac{1}{2N_{BN}}\sum_{i=1}^{N_{BN}} \left\|EA(x_i)\frac{du(x_i)}{dx} - f(x_i)\right\|_2^2. \qquad (31)$$

Together with the data-driven loss L_D, the total cost function is

$$C = \mathcal{L}_\mathcal{N} + \mathcal{L}_B + \mathcal{L}_D. \qquad (32)$$
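For illustration, a minimal PINN for the bar of Eq. (25) is sketched below, assuming EA = 1, p(x) = sin(πx) on Ω = (0, 1), homogeneous Dirichlet boundary conditions, and no data term L_D; these are hypothetical choices, not taken from the cited works.

```python
# Minimal PINN for the static bar: PDE loss (Eq. 30) at collocation
# points plus a boundary penalty (Eq. 31), minimized with Adam.
import torch
import torch.nn as nn

torch.manual_seed(0)
u_net = nn.Sequential(
    nn.Linear(1, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)

x = torch.rand(100, 1, requires_grad=True)  # collocation points
x_b = torch.tensor([[0.0], [1.0]])          # boundary points

optimizer = torch.optim.Adam(u_net.parameters(), lr=1e-3)
for step in range(5000):
    optimizer.zero_grad()
    u = u_net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    ddu = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    residual = ddu + torch.sin(torch.pi * x)  # N[u; lambda] with EA = 1
    L_N = 0.5 * (residual ** 2).mean()        # Eq. (30)
    L_B = 0.5 * (u_net(x_b) ** 2).mean()      # Eq. (31) for u = 0
    C = L_N + L_B                             # Eq. (32) without L_D
    C.backward()
    optimizer.step()
```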
Both the deep least-squares method [235] and the deep Galerkin method [236] are closely related. Instead of focusing on the residuals at individual collocation points as in PINNs, these methods consider the L²-norm of the residuals integrated over the domain Ω.

Variational physics-informed neural networks
Computing high-order derivatives for the non-linear operator N is expensive. Therefore, variational PINNs [237, 238] consider the weak form of the PDE, which lowers the order of differentiation. In the case of the bar equation, the weak PDE loss is given by

$$\mathcal{L}_{V_i} = \int_\Omega \frac{dw_i(x)}{dx}EA(x)\frac{du(x)}{dx}\,d\Omega - \int_{\Gamma_N} w_i(x)EA(x)\frac{du(x)}{dx}\,d\Gamma_N - \int_\Omega w_i(x)p(x)\,d\Omega = 0, \quad \forall w_i(x), \qquad (33)$$

$$\mathcal{L}_V = \frac{1}{N_V}\sum_i^{N_V} \mathcal{L}_{V_i}. \qquad (34)$$

In [237], N_V trigonometric and polynomial test functions w_i(x) are used. The cost function is obtained by replacing the PDE loss L_N with the weak PDE loss L_V in Eq. (32). Note that the Neumann boundary conditions are now not included in the boundary loss L_B, as they are already incorporated in the weak form in Eq. (33). The integrals are evaluated through numerical integration methods, such as Gaussian quadrature, Monte Carlo integration methods [239, 240], or sparse grid quadratures [241]. Severe inaccuracies can be introduced through the numerical integration of the NN output—for which remedies have been proposed in [242].

Weak adversarial networks
Instead of specifying the test functions w(x), weak adversarial networks [243] employ a second NN as test function

$$\hat{w}(x) = W_{FNN}(x; \theta^w). \qquad (35)$$

The test function is learned through a minimax optimization.

Deep energy method
Alternatively, the cost function can be formulated as the potential energy of the system, which is minimized. For the bar equation, the energy loss reads

$$\mathcal{L}_E = \Pi_i + \Pi_e = \frac{1}{2}\int_\Omega EA(x)\left(\frac{du(x)}{dx}\right)^2 d\Omega - \int_{\Gamma_N} u(x)EA(x)\frac{du(x)}{dx}\,d\Gamma - \int_\Omega u(x)p(x)\,d\Omega. \qquad (37)$$

Note that the inverse problem generally cannot be solved using the minimization of the potential energy. Consider, for instance, the potential energy of the bar equation in Eq. (37), which is not well-posed in the inverse setting. Here, EA(x) going towards −∞ in the domain Ω and going towards ∞ at Γ_N minimizes the potential energy L_E.
Extensions
A multitude of extensions to the PINN methodology exist. For in-depth reviews, see [229–231].

Learning multiple solutions
Currently, PINNs are mainly employed to learn a single solution. As the training effort exceeds the solving effort of classical solvers, the viability of PINNs is questionable [246]. However, PINNs can also be employed to learn multiple solutions. This is achieved by providing the parametrization of the PDE, i.e., λ, as an additional input to the network, as discussed in Sect. 2.1. This enables a cheap prediction stage without retraining for new solutions¹⁶. One possible example for this is [247], where different geometries are captured in terms of point clouds and processed with point cloud-based NNs [117].

Boundary conditions
The enforcement of the boundary conditions through a penalty term L_B in Eq. (31) leads to an unbalanced optimization, due to the competing loss terms L_N, L_B, L_D in Eq. (32)¹⁷. One remedy is to modify the NN output

¹⁶ Importantly, the training would be without training data and would only require a definition of the parametrized PDE. Currently, this is only possible for simple PDEs with small parameter spaces.
¹⁷ Consider, for instance, a training procedure in which the PDE loss L_N is first minimal, such that the PDE is fulfilled. Without fulfilment of the boundary conditions, the solution is not unique. However, the NN struggles to modify the current boundary values without violating the PDE loss and thereby increasing the total cost function C. The NN is thus stuck in a bad local minimum. Similar scenarios can be formulated for a too rapid minimization of the other loss terms.
F_FNN by multiplication of a function, such that the Dirichlet boundary conditions are satisfied a priori, i.e., L_B = 0, as demonstrated in [37, 248].

$$\hat{u}(x) = G(x) + D(x)F_{FNN}(x; \theta^u) \qquad (38)$$

Here, G(x) is a smooth interpolation of the boundary conditions, and D(x) is a signed distance function that is zero at the boundary. For Neumann boundary conditions, [249] propose to predict u and its derivatives ∂u/∂x with separate networks, such that the Neumann boundary conditions can be enforced strongly by modifying the derivative network. This requires an additional constraint, ensuring that the derivative predictions match the derivative of u. For complex domains, G(x) and D(x) cannot be found analytically. Therefore, [248] use NNs to learn G(x) and D(x) in a supervised manner by prescribing either the boundary values or zero at the boundary and restricting the values within the domain to be non-zero. Similarly, [250] proposed using radial basis function networks for G(x), where D(x) = 1 is assumed. The radial basis function networks are determined by solving a linear system of equations constructed with the boundary conditions. On uniform grids, strong enforcement can be achieved through specialized CNN kernels [204] with constant padding terms for Dirichlet boundary conditions and ghost cells for Neumann boundary conditions. Constrained backward propagation [251] has also been proposed to guarantee the enforcement of boundary conditions [252, 253].
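For the bar on Ω = (0, L) with u(0) = u_0 and u(L) = u_L, the output modification of Eq. (38) can be sketched as follows; the boundary values and the analytic choices of G(x) and D(x) are hypothetical.

```python
# Strong enforcement of Dirichlet boundary conditions, Eq. (38):
# G(x) interpolates the boundary data, D(x) vanishes on the boundary,
# so L_B = 0 for any network parameters.
import torch
import torch.nn as nn

u_0, u_L, L = 0.0, 1.0, 1.0
f_nn = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

def u_hat(x):
    G = u_0 + (u_L - u_0) * x / L  # smooth interpolation of the BCs
    D = x * (L - x)                # zero at x = 0 and x = L
    return G + D * f_nn(x)         # Eq. (38)
```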
Another possibility is to introduce weighting terms κ_N, κ_B, κ_D for each loss term. These are either hyperparameters, or they are learned during the optimization with attention mechanisms [254–256]. This is achieved by performing a minimax optimization with respect to all weighting terms κ = {κ_N, κ_B, κ_D}:

$$\min_\theta \max_\kappa C. \qquad (39)$$

Expanding on this idea, each collocation point used for the loss terms can be considered an individual equality constraint [257, 258]. Therefore, a weighting term κ_{N,i} is allocated for each collocation point x_i, as illustrated for the PDE loss L_N from Eq. (30):

$$\mathcal{L}_\mathcal{N} = \frac{1}{2N_N}\sum_{i=1}^{N_N} \kappa_{N,i}\,\|\mathcal{N}[u(x_i); \lambda(x_i)]\|_2^2. \qquad (40)$$

This has the added advantage that greater emphasis is assigned to more important collocation points, i.e., points which lead to larger residuals. This approach is strongly related to the approaches relying on the augmented Lagrangian method [259] and competitive PINNs [260], where an additional NN models the penalty weights κ(x) = K_FNN(x; θ^κ). This is similar to weak adversarial networks, but instead formulated using the strong form.

Ansatz
Another prominent topic is the question of which ansatz to choose. The type of ansatz is, for example, determined by different NN architectures (see [261] for a comparison) or combinations with classical ansatz formulations. Instead of using FC-NNs, some authors [182, 226] employ CNNs to exploit the spatial structure of the data. Irregular geometries can be handled by embedding the structure in a rectangular domain using binary encodings [262] or signed distance functions [86, 263]. Another option are coordinate transformations into rectangular grids [264]. The CNN requires a full-grid discretization, meaning that the coordinates x are analytically independent of the prediction û = F_CNN. Thus, the gradients of u are not obtained with automatic differentiation, but with numerical differentiation, i.e., finite differences. Alternatively, the output of the CNN can represent coefficients of an interpolation, as proposed under the name spline-PINNs [265] using Hermite splines. This again allows for automatic differentiation. This is similarly applied for irregular geometries in [266], where GNNs are used in combination with a piecewise polynomial basis. Using a classical basis has the added advantage that Dirichlet boundary conditions can be satisfied exactly. A further variation is the approximation of the coefficients of classical bases with FC-NNs. This is shown with B-splines in [267] in the sense of isogeometric analysis [268]. This was similarly done for piecewise polynomials in [269]. However, instead of simply minimizing the PDE residual from Eq. (30) directly, the finite element discretization [270, 271] is exploited. The loss L_F can thus be formulated in terms of the non-linear stiffness matrix K, the force vector F, and the degrees of freedom u_h:

$$\mathcal{L}_F = \|\boldsymbol{K}(\boldsymbol{u}_h)\boldsymbol{u}_h - \boldsymbol{F}\|_2^2 \qquad (41)$$

In the forward problem, u_h is approximated by a FC-NN, whereas for the inverse problem a FC-NN predicts K. Similarly, [272, 273] map a NN onto a finite element space by using the NN evaluations at nodal coordinates as the corresponding basis function coefficients. This also allows a straightforward strong enforcement of Dirichlet boundary conditions, as demonstrated in [79] with CNNs. The nodes are represented as pixels (see Fig. 3).

Prior information on the solution can be incorporated through a feature layer [274]. If, for example, it is known that the solution is composed of trigonometric functions, a feature layer with trigonometric functions can be applied after the
input layer. Thus, known features are given to the NN directly to aid the learning. Without known features, the task can also be modified to improve learning. Inspired by adaptivity from finite elements, refinements are progressively learned by additional layers of the NN [275] (see Fig. 6). Thus, a coarse solution u_1 is learned to begin with, then refined to u_2 by an additional layer, which again is refined to u_3 until the deepest refinement level is reached.

Domain decomposition
To improve the scalability of PINNs to more complex problems, several domain decomposition methods have been proposed. One approach are hp-variational PINNs [238], where the domain is decomposed into patches. Piecewise polynomial test functions are defined on each patch separately, while the solution is approximated by a globally acting NN. This enables a separate numerical integration of each patch, improving its accuracy.

In an alternative formulation, one NN can be used per subdomain. This was proposed as conservative PINNs [276], where conservation laws are enforced at the interface to ensure continuity. Here, the discrepancies between both solution and flux were penalized at the interface in a least squares manner. The advantages of this approach are twofold: Firstly, parallelization is possible [277] and, secondly, adaptivity can be introduced. Shallower networks can be employed for smooth solutions and deeper networks for more complex solutions. The approach was generalized for any PDE in the context of extended PINNs [278]. Here, the interface condition is formulated in terms of the difference in both the residual and the solution.

Acceleration methods
Analogously to supervised learning, as discussed in Sect. 2.1, transfer learning can be applied to PINNs [279] as, e.g., demonstrated in phase-field fracture [280] or topology optimization [281]. These are very suitable problems since crack and displacement fields evolve with mostly local changes in phase-field fracture. For topology optimization, only minor updates are expected between each optimization iteration [281].

The poor performance of PINNs in their original form can also be improved with better sampling strategies. In importance sampling [282, 283], the collocation point density is proportional to the value of the cost function. Alternatively, residual-based adaptive refinement [234] adds collocation points in the vicinity of areas with a higher cost function.
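A residual-based refinement step might look as follows: candidate points are sampled, and those with the largest PDE residual are added to the collocation set (the residual function and the counts are placeholders).

```python
# Residual-based adaptive refinement sketch, cf. [234].
import torch

def pde_residual(x):
    # stands for N[u(x); lambda(x)], evaluated via automatic differentiation
    return torch.sin(10 * x)  # placeholder

x_coll = torch.rand(100, 1)                # current collocation points
x_cand = torch.rand(1000, 1)               # random candidate points
res = pde_residual(x_cand).abs().flatten()
idx = torch.topk(res, k=20).indices        # candidates with largest residual
x_coll = torch.cat([x_coll, x_cand[idx]])  # refined collocation set
```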
Another essential topic for NNs is normalization of the inputs, outputs, and loss terms [284, 285]. For time-dependent problems, it is possible to use time-dependent normalization [286] to ensure that the solution is always in the same range regardless of the time step.

Furthermore, the cost function can be enhanced by including the derivative of the residual [287] as well. The derivative should also be minimized, as both the residual and its derivative should be zero at the correct solution. However, a general problem in the cost function formulation persists. The cost function should correspond to the norm of the error, which is not necessarily the case. This means that a reduction in the cost does not necessarily yield an improvement in quality of solution. The error norm can be expressed in terms of the H⁻¹-norm, which, according to [288], can efficiently be computed on rectangular domains with Fourier transforms. Thus, the H⁻¹-norm can directly be used as cost function and minimized.

Another aspect is numerical differentiation, which is advantageous for the residual of the PDE [289], as automatic differentiation may be erroneous due to spurious oscillations between collocation points. Thus, numerical differentiation enforces regularity, which was exploited in [289] by coupling automatic differentiation and numerical differentiation to retain the advantages of automatic differentiation.

Further specialized modifications to NN architectures have been proposed. Adaptive activation functions [290] have shown acceleration in convergence. Extreme learning machines [291, 292] remove the need for iterations altogether. All layers are randomly initialized in extreme learning machines, and only the last layer is learnable. Without a non-linear activation function, the parameters are found with a least-squares regression. This was demonstrated for PINNs in [293]. Instead of only learning the last layer, the problem can be split into a non-linear and a linear regression problem, which are solved separately [294], such that the full expressivity of NNs is retained.

Applications to forward problems
PINNs have been applied to various PDEs (see [229–231] for an overview). Forward problems can, for example, be found in solid mechanics [284, 295, 296], fluid mechanics [297–304], and thermomechanics [305, 306]. Currently, PINNs do not outperform classical solvers such as the finite element method [246, 307] in terms of speed for a given accuracy of engineering relevance. In the authors' experience and judgement, this is especially the case for forward problems even if the extensions mentioned above are employed. Often, the mentioned gains compared to classical forward solvers disregard the training effort and only report evaluation times. Incorporating large parts of the solution in the form of measurements with the data-driven loss L_D improves the performance of PINNs, which thereby can become a viable method in some cases. Yet, [308] states that data-driven methods outperform PINNs. Thus PINNs should not be regarded as a replacement for data-driven methods, but rather as a regularization technique for data-driven methods to reduce the generalization error.

Applications to inverse problems
However, PINNs are in particular useful for inverse problems with full domain knowledge, i.e., the solution is available throughout the entire domain. This has, for example, been
shown for the identification of material properties [285, 309–312]. By contrast, for inverse problems with only partial knowledge, the applicability of PINNs is limited [313], as both forward and inverse solution have to be learned simultaneously. Most applications therefore limit themselves to simpler inversions such as size and shape optimization. Examples are published, e.g., in [295, 314–319]. Exceptions that deal with the identification of entire fields can be found in full waveform inversion [320], topology optimization [321], elasticity, and the heat equation [322].

2.2.1.2. Inverse problems
PINNs are capable of discovering governing equations by either learning the operator N or the coefficients λ. The resulting operator is, however, not always interpretable, and in the case of identification of the coefficients, the underlying PDE is assumed. To discover interpretable operators, one can apply sparse regression approaches [323]. Here, potential differential operators are assumed as an input to the non-linear operator

$$\hat{\mathcal{N}}\left[x, u, \frac{\partial u}{\partial x}, \frac{\partial^2 u}{\partial x^2}, \ldots\right] = 0. \qquad (42)$$

Subsequently, a NN learns the corresponding coefficients using observed solutions inserted into Eq. (42). The evaluation of the differential operators is achieved through automatic differentiation by first interpolating the solution with a NN. Along similar lines, the symbolic regression framework AI-Feynman identifies interpretable governing equations; it has been successfully applied to 100 equations from the Feynman lectures [325].

2.2.2 Time-stepping procedures

Again Eqs. (3) and (4) will be considered for the time-stepping procedures.

2.2.2.1. Physics-informed neural networks
In the spirit of domain decomposition, parareal PINNs [326] split up the temporal domain in subdomains [t_i < t_{i+1}]. A rough estimate of the solution u is provided by a conjugate gradient solver on a simplified form of the PDE starting from t_0. PINNs are then independently applied in each subdomain to correct the estimate. Subsequently, the conjugate gradient solver is applied again, starting from t_1. This process is repeated until all time steps have been traversed. A closely related approach can be found in [327], where a PINN is retrained on successive time segments. It is however ensured that previous time steps are kept fulfilled through a data-driven loss term for time segments that were already learned.

Another approach are the discrete-time PINNs [228], which consider the temporal dimension in a discrete manner. The differential equation from Eq. (3) is discretized with the Runge-Kutta method with q stages [328]:

$$u^{n+c_i} = u^n + \Delta t\sum_{j=1}^{q} a_{ij}\,\mathcal{N}_T[u^{n+c_j}], \quad i = 1, \ldots, q, \qquad (43)$$

$$u^{n+1} = u^n + \Delta t\sum_{j=1}^{q} b_j\,\mathcal{N}_T[u^{n+c_j}], \qquad (44)$$

$$u^{n+c_j}(x) = u(x, t_n + c_j\Delta t), \quad j = 1, \ldots, q. \qquad (45)$$
A NN F_NN predicts all stages i = 1, …, q from an input x:

$$\hat{\boldsymbol{u}} = [\hat{u}^{n+c_1}(x), \ldots, \hat{u}^{n+c_q}(x), \hat{u}^{n+1}(x)] = F_{NN}(x; \theta). \qquad (46)$$

The cost is then constructed by rearranging Eqs. (43) and (44):

$$\hat{u}^n = \hat{u}_i^n = \hat{u}^{n+c_i} - \Delta t\sum_{j=1}^{q} a_{ij}\,\mathcal{N}_T[\hat{u}^{n+c_j}], \quad i = 1, \ldots, q, \qquad (47)$$

$$\hat{u}^n = \hat{u}_{q+1}^n = \hat{u}^{n+1} - \Delta t\sum_{j=1}^{q} b_j\,\mathcal{N}_T[\hat{u}^{n+c_j}]. \qquad (48)$$

The q + 1 predictions û_i^n, û_{q+1}^n of û^n have to match the initial conditions u^n_M, where the mean squared error is used as a loss function to learn all stages û. The approach has been applied to fluid mechanics [329, 330].

2.2.2.2. Inverse problems
As for inverse problems in the space-time approaches (Paragraph 2.2.1.2), the non-linear operator N can be learned. For temporal problems, this corresponds to the right-hand side of Eq. (3) for PDEs and to Eq. (4) for systems of ODEs. The predicted right-hand side can then be used to predict time series using a classical time-stepping scheme, as proposed in [331]. More sophisticated methods leaning on similar principles are presented in the following. Specifically, we will discuss PDE-Net for discovering PDEs, SINDy for discovering systems of ODEs in an interpretable sense, and an approach relying on multistep methods for systems of ODEs. The multistep approach leads to a non-interpretable, but more expressive approximation of the right-hand side.

PDE-Net
PDE-Net [332, 333] is designed to learn both the system dynamics u(x, t) and the underlying differential equation it follows. Given a problem of the form of Eq. (3), the right-hand side can be approximated as a function of coordinates and gradients of the solution:

$$\hat{\mathcal{N}}_T\left[x, u, \frac{\partial u}{\partial x}, \frac{\partial^2 u}{\partial x^2}, \ldots\right]. \qquad (49)$$

The operator N̂_T is approximated by NNs. The first step involves estimating spatial derivatives using learnable convolutional filters. The filters are designed to adjust their order of approximation based on the fit to the underlying measurements u^M, while the type of gradient is predefined¹⁸. Thus, the NN learns how to best approximate spatial derivatives specific to the underlying data. Subsequently, the inputs of N̂_T are combined with point-wise CNNs [334] in [332] or a symbolic network in [333]. Both yield an interpretable operator from which the analytical expression can be extracted. In order to construct a loss function, Eqs. (3) and (49) are discretized using the forward Euler method:

$$u(x, t_{n+1}) = u(x, t_n) + \Delta t\,\hat{\mathcal{N}}_T\left[x, u, \frac{\partial u}{\partial x}, \frac{\partial^2 u}{\partial x^2}, \ldots\right]. \qquad (50)$$

This temporal discretization is applied iteratively, and the discrepancy between the derived function and the measured data u^M(x, t_n) serves as the loss function.

SINDy
Sparse identification of non-linear dynamic systems (SINDy) [335] deals with the discovery of dynamic systems of the form of Eq. (4). The task is posed as a sparse regression problem. Snapshot matrices of the state X = [x(t_1), x(t_2), …, x(t_n)] and its time derivative Ẋ = [ẋ(t_1), ẋ(t_2), …, ẋ(t_n)] are related to one another via candidate functions Θ(X) evaluated at X using unknown coefficients Ξ:

$$\dot{\boldsymbol{X}} = \boldsymbol{\Theta}(\boldsymbol{X})\boldsymbol{\Xi}. \qquad (51)$$

The coefficients Ξ are determined through sparse regression, such as sequential thresholded least squares or LASSO regression. By including partial derivatives, SINDy has been extended to the discovery of PDEs [336, 337].
[331]. More sophisticated methods leaning on similar prin- extended to the discovery of PDEs [336, 337].
ciples are presented in the following. Specifically, we will The expressivity of SINDy can further be increased by a
discuss PDE-Net for discovering PDEs, SINDy for discov- coordinate transformation into a representation allowing for
ering systems of ODEs in an interpretable sense, and an a simpler representation of the system dynamics. This can
approach relying on multistep methods for systems of ODEs. be achieved with an autoencoder (consisting of an encoder
The multistep approach leads to a non-interpretable, but more e N N (x; θ e ) and a decoder d N N (h; θ d ), as proposed in [338],
expressive approximation of the right-hand side. where the dynamics are learned on the reduced latent space h
PDE-Net using SINDy. A simultaneous optimization of the NN param-
PDE-Net [332, 333] is designed to learn both the system eters θ e , θ d and SINDy parameters is conducted with
dynamics u(x, t) and the underlying differential equation it gradient descent. The cost is defined in terms of the autoen-
follows. Given a problem of the form of Eq. (3), the right- coder reconstruction loss LA and the residual of Eq. (51)
hand side can be approximated as a function of coordinates at both the reduced latent space LR and the original space
and gradients of the solution. LF 19 . A L 1 -regularization for promotes sparsity.
1
n
T ∂u ∂ 2 u LA = ||x(ti ) − d N N e N N (x(ti ); θ e ); θ d ||22 (52)
N̂ x, u, , ,... (49) 2n
∂x ∂x2 i=1
1 n
The operator N̂ T is approximated by NNs. The first step LR = || ∇x e N N x(ti ); θ e · ẋ(ti )
2n
involves estimating spatial derivatives using learnable con- i=1
ḣ
volutional filters. The filters are designed to adjust their order
of approximation based on the fit to the underlying measure- − e N N x(ti ); θ e
||22 (53)
ments u M , while the type of gradient is predefined18 . Thus,
1
n
the NN learns how to best approximate spatial derivatives LF = || ẋ(ti ) − ∇h d N N e N N (x(ti ); θ e ); θ d
2n
specific to the underlying data. Subsequently, the inputs of i=1 h
18 This is enforced through constraints using moment matrices of the 19 The encoder and decoder are derived with respect to their inputs to
convolutional filters. estimate the derivatives ẋ, ḣ using the chain rule.
$$C = \kappa_A \mathcal{L}_A + \kappa_R \mathcal{L}_R + \kappa_F \mathcal{L}_F \qquad (55)$$

As in Eq. (24), a weighted cost function with weights κ_A, κ_R, κ_F is employed. The reduced latent space can be exploited for forward simulations of the identified system. By solving the system with classical time-stepping schemes in the reduced latent space, the solution is obtained in the full space through the decoder, as outlined in [339]. Thus, a reduced order model of a previously unknown system is identified. The downside is that the model is no longer interpretable in the full space.

Multistep methods
Another approach [340] to learning the system dynamics from Eq. (4) is to approximate the right-hand side directly with a NN f̂(x_i) = O_NN(x_i; θ), x_i = x(t_i). A residual can be formulated by considering linear multistep methods [328]. In general, these methods take the form:

$$\sum_{m=0}^{M} [\alpha_m \boldsymbol{x}_{n-m} + \Delta t\,\beta_m \boldsymbol{f}(\boldsymbol{x}_{n-m})] = \boldsymbol{0}, \qquad (56)$$

where M, α_0, α_1, β_0, β_1 are parameters specific to a multistep scheme. The scheme can be reformulated as a cost function, given as:

$$C = \frac{1}{N - M + 1}\sum_{n=M}^{N} \|\hat{\boldsymbol{y}}_n\|_2^2, \qquad (57)$$

$$\hat{\boldsymbol{y}}_n = \sum_{m=0}^{M} [\alpha_m \boldsymbol{x}_{n-m} + \Delta t\,\beta_m \hat{\boldsymbol{f}}(\boldsymbol{x}_{n-m})]. \qquad (58)$$

The idea of the method is strongly linked to the discrete-time PINN presented in Paragraph 2.2.2.1, where a reformulation of the Runge-Kutta method yields the cost function needed to learn the forward solution.
ŷn = [αm x n−m + tβm fˆ(x n−m )] (58) • post-processing
m=0
Both data-driven and physics-informed approaches will be
The idea of the method is strongly linked to the discrete-time discussed in the following.
PINN presented in Paragraph 2.2.2.1, where a reformulation
of the Runge-Kutta method yields the cost function needed 3.1 Pre-processing
to learn the forward solution.
The discussed pre-processing methods are trained in a super-
2.2.3 Enforcement of physics by construction vised manner relying on the techniques presented in Sect. 2.1
and on labeled data.
Up to this point, this review only considered the case where
physics are enforced indirectly through penalty terms of the 3.1.1 Data preparation
PDE residual. The only exception, and the first example of
enforcing physics by construction, was the strong enforce- Data preparation includes tasks, such as geometry extrac-
ment of boundary conditions [37, 204, 248] by modifying the tion. For instance the detection of cracks from images by
outputs of the NN—which led to a fulfillment of the bound- means of segmentation [349–351] can subsequently be used
ary conditions independent of the NN parameters. For PDEs, in simulations to assess the impact of the identified cracks.
this can be achieved by manipulating the output, such that Also, CNNs have been used to prepare voxel data obtained
the solution automatically obeys fundamental physical laws. from computed tomography scans, see [352], where scan-
Examples for this are, e.g., given in [341], where stream func- ning artifacts are removed. Similarly NNs can be employed
to enhance measurement data. This was, for example, demonstrated in [353], where the NN acts as a denoiser for magnetic signals in the scope of non-destructive testing. Similarly, low-frequency extrapolation for full waveform inversion has been performed using NNs [354–356].

3.1.2 Initialization

Instead of preparing the data, the simulation can be accelerated by an initialization. This can, for example, be achieved through initial guesses by NNs, providing a better starting point for classical iterative solvers [357]²⁰. A tighter integration is achieved by using a pre-trained [279] NN ansatz whose parameters are subsequently tweaked by the classical solver, as demonstrated for full waveform inversion in [224].

3.1.3 Meshing

Finally, many simulation techniques rely on meshes. Meshing can be performed indirectly with NNs, by prediction of mesh density functions [358–362] incorporating either expert knowledge of where small elements are needed, or relying on error estimations. Subsequently, a classical mesh generator is employed. However, NNs (specifically let-it-grow NNs [363]) have also been proposed directly as mesh generators [364, 365].

3.2 Physical modeling

Physical models that capture physical phenomena accurately are a core component of mechanics. Deep learning offers three main approaches for physical models. Firstly, a NN is used as the physical model directly (model substitution). Secondly, an underlying model may be assumed where a NN determines its coefficients (identification of model parameters). Lastly, the entire model can be identified by a NN (model identification). In the first approach, the NN is integrated within the simulation pipeline, while the latter two rely on incorporation of the identified models in a classical sense.

For illustration purposes, the approaches are mostly explained on the example of constitutive models. Here, the task is to relate the strain ε to a stress σ, i.e., find a function σ = f(ε). This can, for example, be used within a finite element framework to determine the element stiffness, as elaborated in [366].

3.2.1 Model substitution

In model substitution, a NN f_NN replaces the model, yielding the prediction σ̂ = f_NN(ε; θ). The quality of the model is assessed with a data-driven cost function (Eq. 5) using labeled data σ^M, ε^M. The approach is applied to a variety of problems, where the key difference lies in the definition of input and output quantities. The same deep learning techniques from data-driven simulation substitution (Sect. 2.1) can be employed.

Applications include predictions of stress from strain [366, 367], flow stresses from temperatures, strain rates and strains [368, 369], yield functions [370], crack opening responses from stresses [371], contact stiffness from penetration and contact pressure [372], points of contact from positions of neighboring nodes of finite elements [373], or control points of NURBS surfaces [374]. Source terms of simplified equations or coarser discretizations have also been learned for turbulence [74, 375, 376] and the wave equation [377]. Here, the reference—a high-fidelity model—is to be captured in the best possible way by the source term.

Variations also predict the quantity of interest indirectly. For example, strain energy densities ψ are predicted by NNs from deformation tensors F and subsequently differentiated using automatic differentiation to obtain stresses [378, 379].

²⁰ Here, the initial guess is incorporated through a regularization term.
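The indirect prediction can be sketched as follows: a NN predicts ψ from the (flattened) deformation tensor, and the stress follows by automatic differentiation; the architecture and input encoding are placeholder choices.

```python
# Strain energy density NN with stress P = d(psi)/dF via autograd.
import torch
import torch.nn as nn

psi_nn = nn.Sequential(nn.Linear(9, 32), nn.Tanh(), nn.Linear(32, 1))

F = torch.eye(3).reshape(1, 9) + 0.01 * torch.rand(1, 9)
F.requires_grad_(True)
psi = psi_nn(F)
P = torch.autograd.grad(psi, F, torch.ones_like(psi), create_graph=True)[0]
P = P.reshape(3, 3)  # first Piola-Kirchhoff stress for this sketch
```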
The approach can also be extended to incorporate uncertainty quantification [380]. By extending the input space with microstructural information, an in-built homogenization is added to the constitutive model [381–383]. Thus, the macroscale simulation considers the microstructure at the integration points in the sense of FE² [384, 385], but without an additional finite element computation. Incorporation of microstructures requires a large amount of realistic training data, which can be obtained through generative approaches as discussed in Sect. 5. Active learning can reduce the required number of simulations on these geometries [221].

A specialized NN architecture is employed by [386], where a NN first estimates invariants I of the deformation tensor F and thereupon predicts the strain energy density, thus mimicking the classical constitutive modeling approach. Another network extension is the use of RNNs to learn history-dependent models. This was shown in [381, 382, 387, 388] for the prediction of the stress increment from the stress-strain history, the strain energy from the strain energy history [389], and crack patterns based on prior cracks and crystalline orientations [390, 391].

The learned models do not, however, necessarily obey fundamental physical laws. Attempts to incorporate physics as constraints using penalty terms have been made in [392–394]. Still, physical consistency is not guaranteed. Instead, NN architectures can be chosen such that they satisfy physical requirements by construction. In constitutive modeling, objectivity can be enforced by using only deformation invariants as input [395], and polyconvexity can be enforced through the architecture, such as input-convex NNs [396–399] or neural ordinary differential equations [395, 400]. It was demonstrated that ensuring fundamental physical
aspects such as invariants combined with polyconvexity delivers a much better behavior for unseen data, especially if the model is used in extrapolation.

Input-convex NNs [401] enforce the convexity with specialized activation functions, such as log-sum-exponential or softplus functions, in combination with constraints on the NN weights to ensure that they are positive, while neural ordinary differential equations [402] (discussed in Sect. 4) approximate the strain energy density derivatives and ensure non-negative values. Alternatively, a mapping from the NN to a convex function can be defined [403], ensuring a convex function for any NN output. Related are also thermodynamics-based NNs [404, 405], e.g., applied to complex microstructures in [406], which by construction obey fundamental thermodynamic laws. Training of these methods can be performed in a supervised manner, relying on stress-strain data, or unsupervised. In the unsupervised setting, the constitutive model is incorporated in a finite element solver, yielding a displacement field for a specific boundary value problem. The computed field, together with measurement data, yields a residual that is referred to as the modified constitutive relation error (mCRE) [407–409], which is minimized to improve the constitutive relation [410, 411]. Instead of formulating the mismatch in terms of displacements, [412, 413] formulate it in terms of boundary forces. For an in-depth overview of constitutive model substitution in deep learning, see [32].

3.2.2 Identification of model parameters

Identification of model parameters is achieved by assuming an underlying model and training a NN to predict its parameters for a given input. In the constitutive model example, one might assume a linear elastic model expressed in terms of a constitutive tensor c, such that σ = cε. The constitutive tensor can be predicted from the material distribution defined in terms of a heterogeneous elasticity modulus E defined throughout the domain:

$$\hat{c} = f_{NN}(E; \theta). \qquad (59)$$

Typical applications are homogenization, where effective properties are predicted from the geometry and material distribution. Examples are CNN-based homogenizations on computed tomography scans [414, 415], predictions of in-vivo constitutive parameters of aortic walls from their geometry [416], predictions of elastoplastic properties [417] from instrumented indentation results relying on a multi-fidelity approach [418], predictions of stress intensity factors from the geometry in microfabricated microcantilevers [419], and the incorporation of meso-scale information by training a NN on representative volume elements [420].

3.2.3 Model identification

NN models as a replacement of classical approaches are not interpretable, while only identifying model parameters of known models restricts the model's capacity. This gap can be bridged by the identification of models in terms of parsimonious mathematical expressions.

The typical procedure is to pose the problem in terms of candidate functions and to identify the most relevant terms. The methodology was inspired by SINDy [335] and introduced in the framework for efficient unsupervised constitutive law identification and discovery (EUCLID) [421]. The approach is unsupervised, as the stress-strain data is only indirectly available through the displacement field and corresponding reaction forces. The N_I invariants I_i of the deformation tensor F are inserted into a candidate library Q({I_i}_{i=1}^{N_I}) containing the candidate functions. Together with the corresponding weights θ, the strain density ψ is determined:

$$\psi(\{I_i\}_{i=1}^{N_I}) = \boldsymbol{Q}^T(\{I_i\}_{i=1}^{N_I})\,\theta. \qquad (60)$$

Through derivation of the strain density ψ using automatic differentiation, the stresses σ are determined. The problem is then cast into the weak form, with which the linear momentum balance is enforced. The weak form is then minimized with respect to θ using a fixed-point iteration scheme (inspired by [422]), where an L^p-regularization is used to promote sparsity in θ. Despite its young age, the approach has already been applied to plasticity [423], viscoelasticity [424], combinations [425], and has been extended to incorporate uncertainties through a Bayesian model [426]. Furthermore, the approach has been extended with an ensemble of input-convex NNs [413], yielding a more accurate, but less interpretable model.

A similar effort was recently carried out by [427, 428], where NNs are designed to retain interpretability. This is achieved through sparse connections in combination with specialized activation functions representing candidate functions, such that they are able to capture classical forms of constitutive terms. Through the sparse connections in the network and the specialized activation functions, the NN's weights become physical parameters, yielding an interpretable model. This is best understood by consulting Fig. 7, where the strain energy density is expressed as

$$\hat{\psi} = \theta_0^1 e^{\theta_0^0 I_1} + \theta_1^1 \ln(\theta_1^0 I_1) + \theta_2^1 e^{\theta_2^0 I_1^2} + \theta_2^1 \ln(\theta_2^0 I_1^2) + \theta_3^1 e^{\theta_3^0 I_2} + \theta_4^1 \ln(\theta_4^0 I_2) + \cdots$$
$$\frac{\partial^n u}{\partial x^n} \approx \sum_i \alpha_i^{(n)} u_i. \qquad (62)$$
representations [475–480] have been presented in the context of topology optimization in [481]. Furthermore, [313, 468, 472] showed how to conduct the gradient computation without automatic differentiation through the solver F. The gradient computation is split up via the chain rule:

∇_θ C = ∇_λ C · ∇_θ λ. (63)

The first gradient ∇_λ C is computed with the adjoint state method, such that the solver can be treated as a black box. The second gradient ∇_θ λ is obtained through automatic differentiation. An additional advantage of the NN ansatz is that, if applied to multiple solutions with a problem-specific input, the NN is trained. Thus, after sufficient inversions, the NN can be used as a predictor, as presented in [482]. The training can also be performed in combination with labeled data, yielding a semi-supervised approach, as demonstrated in [224, 483].
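The split of Eq. (63) can be mimicked with standard autodiff machinery: the solver supplies ∇_λ C as an external gradient (e.g., from an adjoint solve), which is then backpropagated through the NN parametrization to obtain ∇_θ C. A schematic PyTorch sketch with a placeholder adjoint gradient:

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(3, 8), torch.nn.Tanh(), torch.nn.Linear(8, 2))
x = torch.randn(3)                    # problem-specific input
lam = net(x)                          # NN ansatz for the design variables lambda

def adjoint_gradient(lam_value):
    # Placeholder for the black-box solver: returns dC/dlambda from an adjoint solve.
    return 2.0 * lam_value

dC_dlam = adjoint_gradient(lam.detach())
lam.backward(gradient=dC_dlam)        # chain rule: grad_theta C = dC/dlambda . dlambda/dtheta
print(net[0].weight.grad.norm())
```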
3.4 Post-processing

Post-processing concerns the modification and interpretation of the computed solution. One motivation is to reduce the numerical error of the computed solution. This can, for example, be achieved with super-resolution techniques relying on specialized CNN architectures from computer vision [484, 485]. Coarse-to-fine mappings can be obtained in a supervised manner using matching coarse and fine simulations as labeled data, as presented for turbulent flows [462, 486] and topology optimization [487–489]. The mapping is typically performed from coarse to fine solution fields, but mappings from a posteriori errors have been proposed as well [490]. Further specialized extensions to the cost function have been suggested in the context of de-homogenization [491].

The methods can analogously be applied to temporal data, where the solution is refined at each time step, as, e.g., presented with RNNs as correctors of reduced order models [492]. However, coarse discretizations in dynamical models lead to an error accumulation that increases with the number of time steps.

Further variations perform the coarse-to-fine mapping in a patch-based manner, where the interfaces require a special treatment [493]. Another approach uses a NN to map the coarse solution to the closest fine solution stored in a database [494]. The mapping is performed on patches of the domain.

Other post-processing tasks include feature extraction. After a topology optimization, NNs have been used to extract basic shapes to be used in a subsequent shape optimization [495, 496]. Another aspect that can be ensured through post-processing is manufacturability.

Lastly, adaptive mesh refinement falls under the category of post-processing as well. Closely related to the meshing approaches discussed in Sect. 3.1.3, NNs have been proposed as error indicators [361, 497] that are trained in a supervised manner. The error indicators can subsequently be employed to adapt the mesh based on the error.

4 Discretizations as neural networks

NNs are composed of linear transformations and non-linear functions, which are basic building blocks of most PDE discretizations. Thus, the motivation to construct NNs utilizing discretizations of PDEs is twofold. Firstly, deep learning techniques can hereby be exploited within classical discretization frameworks. Secondly, novel NN architectures arise, which are more tailored towards many physical problems in computational mechanics but potentially also find their use cases outside of that field.

4.1 Finite element method

One method is finite element NNs [14, 498] (see [499–504] for applications), for which we consider the system of equations from a finite element discretization with the stiffness matrix K_ij, degrees of freedom u_j, and the body load b_i:

Σ_{j=1}^{N} K_ij u_j = b_i. (64)
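To illustrate how such a discretization can act as a differentiable building block, the following sketch assembles the stiffness matrix of a 1D bar with learnable element stiffnesses and treats the linear solve of Eq. (64) as a layer through which gradients flow. This is a schematic toy example in the spirit of finite element NNs, not the formulation of [14, 498]:

```python
import torch

n_el = 4                                           # 1D bar, unit length, linear elements
h = 1.0 / n_el
k = torch.ones(n_el, requires_grad=True)           # learnable element stiffness parameters

K = torch.zeros(n_el + 1, n_el + 1)                # assemble the global stiffness matrix
for e in range(n_el):
    Ke = torch.zeros(n_el + 1, n_el + 1)
    Ke[e:e + 2, e:e + 2] = torch.tensor([[1.0, -1.0], [-1.0, 1.0]])
    K = K + k[e] / h * Ke

b = h * torch.ones(n_el + 1)                       # lumped body load b_i
u = torch.linalg.solve(K[1:, 1:], b[1:])           # K u = b with the left end fixed
u.sum().backward()                                 # gradients w.r.t. the stiffnesses k
print(k.grad)
```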
[Figure panel: (b) HiDeNN]
and are described in detail in Appendix B. Currently, there are two prominent areas of application in computational mechanics. One area of focus is microstructure generation (Sect. 5.1.1), which aims to produce a sufficient quantity of realistic training data for surrogate models, as described in Sect. 2.1. The second key application area is generative design (Sect. 5.1.2), which relies on algorithms to efficiently explore the design space within the constraints established by the designer.

5.1 Applications

5.1.1 Data generation

The most straightforward application of variational autoencoders and GANs in computational mechanics is the generation of new data, based on existing examples. This has been demonstrated in [531–535] for microstructures, in [93] for velocity models used in full waveform inversion, and in [536] for optimized structures using GANs. Variational autoencoders have also been used to model the crossover operation in evolutionary algorithms to create new designs from parent designs [537]. Applications of diffusion models for microstructure generation can be found in [538–540].

Microstructures pose a unique challenge due to their inherent three-dimensional nature, while often only two-dimensional reference images are available. This has led to the development of specialized architectures that are capable of creating three-dimensional structures from representative two-dimensional slices [541–543]. The approach typically involves treating three-dimensional voxel data as a sequence of two-dimensional slices of pixels. Sequences of images are predicted from individual slices, ultimately forming a three-dimensional microstructure. In [544], a RNN is applied to a two-dimensional reference image, yielding an additional dimension and consequently creating a three-dimensional structure. The RNN is applied at the latent vector inside an encoder-decoder architecture, such that the inputs and outputs of the RNN have a relatively small size. Similarly, [545, 546] apply a transformer [172] to the latent vector. An alternative formulation uses variational autoencoder GANs.

5.1.2 Generative design

Within generative design, the generator can also be considered as a reparametrization of the design space that reduces the number of design variables. With autoencoders, the latent vector serves as the design parameter [553, 554], which is then optimized.²⁵ Similarly, [556] find that point cloud autoencoders [117, 557, 558] are advantageous as geometric dimensionality reduction tools (potentially combined with performance features) for efficiently exploring the design space. In the context of GANs, the optimization task is aimed at the random input ξ provided to the generator. This approach is demonstrated in various studies, such as ship hull design parameterized by NURBS surfaces [559], airfoil shapes expressed with Bézier curves [560, 561], structural optimization [562], and full waveform inversion [563]. For optimization, variational autoencoder GANs are particularly important, as the GAN ensures high-quality designs, while the autoencoder ensures well-behaving gradients. This was shown for microstructure optimization in [564].

An important requirement for generative design is design diversity. Achieving this involves ensuring that the entire design space is spanned by the generated data. For this, the cost function can be extended, as presented in [565], using determinantal point processes [566] or in [559] with a space-filling term [567].

Other strategies are specifically focused on promoting design diversity. This involves identifying novel designs via a novelty score [568]. The novelty within these designs is segmented and used to modify the GAN using methods outlined in [569]. An alternative approach proposed by [570] quantifies creativity and maximizes it. This is achieved by performing a classification in pre-determined categories by the discriminator. If the classification is unsuccessful, the design must lie outside the categories and is therefore deemed creative. The generator thus seeks to minimize the classification accuracy.

²⁵ It is worth noting that to ensure designs that are physically meaningful, a style transfer technique can be implemented [555]. Here, the training data is perceived as a style, and the difference of the Gram matrices, characterizing the distribution of visual patterns or textures in the generated designs, is minimized.
However, some applications necessitate a resemblance to prior designs due to factors such as aesthetics [571] or manufacturability [572]. In [571], a pixel-wise L_1-distance to previous designs is included in the loss.²⁶ A complete workflow with generative design enforcing resemblance of previous designs and surrogate model training for the quantification of mechanical properties is described in [573]. Another option is the use of style transfer techniques [555], which in [574] are incorporated into a conventional topology optimization scheme [575] as a constraint in the loss. These are tools with the purpose of incorporating vague constraints based on previous designs for topology optimization.

GANs can also be applied to inverse problems, as presented in [576] for full waveform inversion. The generator predicts the material distribution, which is used in a differentiable simulation providing the forward solution in the form of a seismogram. The discriminator attempts to distinguish between the seismogram indirectly coming from the generator and the measured seismograms. The underlying material distribution is determined through gradient descent.

5.1.3 Conditional generation

As stated earlier, GANs can take specific inputs to dictate the output's nature. The key difference to data-driven surrogate models from Sect. 2.1 is that GANs provide a tool to generate multiple outputs given the same conditional input. They are thus applicable to problems with multiple solutions, such as design optimization or data generation.

Examples of conditional generation are rendered cars from car sketches [577], hierarchical shape generation [578], where the child shape considers its parent shape, and topology optimization with predictions of optimal structures from initial fields, e.g., strain energy, of the unoptimized structure [579, 580]. Physical properties can also be used as input. The properties are computed by a differentiable solver after generation and are incorporated in the loss. This was, e.g., presented in [581] for airplane shapes, and in [582] for inverse homogenization. For full waveform inversion, [583] trains a conditional GAN with seismograms as input to predict the corresponding velocity distributions. A similar effort is made by [584] with CycleGANs [585] to circumvent the need for paired data. Here, one generator generates a seismogram ŷ = G_y(x) and another a corresponding velocity distribution x̂ = G_x(y). The predictions are judged by two separate discriminators. Additionally, a cycle-consistency loss ensures that a prediction from a prediction, i.e., G_y(x̂) or G_x(ŷ), matches the initial input x or y. This cycle-consistency loss ensures that the learned transformations preserve the essential features and structures of the original seismograms or velocity distributions when they are transformed from seismogram to velocity distribution and back again.

Lastly, coarse-to-fine mappings, as previously discussed in Sect. 3.4, can also be learned by GANs. This was, for example, demonstrated in topology optimization, where a conditional GAN refines coarse designs obtained from classical optimizations [579, 586] or CNN predictions [102]. For temporal problems, such as fluid flows, the temporal coherence between time steps poses an additional challenge. Temporal coherence can be ensured by a second discriminator, which receives three consecutive frames of either the generator or the real data and decides if they are real or generated. The method is referred to as tempoGAN [587].

5.1.4 Anomaly detection

Finally, a last application of generative models is anomaly detection; see [588] for a review. This is particularly valuable for non-destructive testing, where flawed specimens can be identified in terms of anomalies. The approach relies on generative models and attempts to reconstruct the geometry. At first, the generative model is trained on structures without flaws. During evaluation, the structures to be tested are then fed through the NN. In the case of an autoencoder, as in [589], the structure is fed through the encoder and decoder. For a GAN, as discussed, e.g., in [590–592], the input of the generator is optimized to fit the output as well as possible. The mismatch in reconstruction then provides a spatially dependent measure of where an anomaly, i.e., a defect, is located.

Another approach is to use the discriminator directly, as presented in [593]. If a flawed specimen is given to the discriminator, it will be categorized as fake, as it was not part of the undamaged structures during training. The discriminator can also be used to check if the domain of application of a surrogate model is valid. Trained on the same training data as the surrogate model, the discriminator estimates the dissimilarity between the data to be tested and the training data. For large discrepancies, the discriminator detects that the surrogate model becomes invalid.²⁷
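The reconstruction-based detection described above reduces to a few lines; here with an untrained toy autoencoder standing in for a model fitted to flaw-free data:

```python
import torch

# Hypothetical autoencoder; in practice its weights are trained on flaw-free specimens.
autoencoder = torch.nn.Sequential(
    torch.nn.Linear(64, 8), torch.nn.ReLU(),   # encoder
    torch.nn.Linear(8, 64),                    # decoder
)

specimen = torch.rand(64)                      # measurement of a specimen to be tested
with torch.no_grad():
    reconstruction = autoencoder(specimen)

# Spatially resolved reconstruction mismatch: large values hint at an anomaly/defect.
anomaly_map = (specimen - reconstruction).abs()
print(anomaly_map.argmax())                    # most suspicious location
```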
6 Deep reinforcement learning

In reinforcement learning, an agent interacts with an environment through a sequence of actions a_t, which is illustrated in Fig. 14. Upon executing an action a_t, the agent receives an updated state s_{t+1} and reward r_{t+1} from the environment. The agent's objective is to maximize the cumulative reward R. The environment can be treated as a black box. This presents an advantage in computational mechanics when differentiable solvers are unavailable.
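The interaction pattern can be summarized as a plain loop over states, actions, and rewards; the environment below is a toy stand-in for a physics simulation, and the random policy is a placeholder for a NN agent:

```python
import random

# Toy environment: the state is a position on a line; the goal is the origin.
def step(state, action):                  # action in {-1, +1}
    next_state = state + action
    reward = -abs(next_state)             # reward r_{t+1}: closer to 0 is better
    return next_state, reward

state, cumulative_reward = 5, 0.0
for t in range(20):
    action = random.choice([-1, 1])       # placeholder policy; a NN would act here
    state, reward = step(state, action)
    cumulative_reward += reward           # the agent maximizes this return
print(cumulative_reward)
```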
6.1 Applications

Deep reinforcement learning is mainly used for inverse problems (see [25] for a review within fluid mechanics), where the PDE solver is treated as a black box and assumed to not be differentiable.

The most prominent applications are control problems. One example is discovering swimming strategies for fish, with the goal of efficiently minimizing the distance to a leader fish [601, 602]. The environment is given by the Navier-Stokes equations. Another example is balancing rigid bodies with fluid jets while using as little force as possible [603]. Similarly, [604] control jets in order to reduce the drag around a cylinder. Reducing the drag around a cylinder is also achieved by controlling small rotating cylinders in the wake of the flow [605]. A more complex example is controlling unmanned aerial vehicles [606]. The control schemes are learned by interacting with simulations and, subsequently, applied in experiments.

Further applications in connection with inverse problems are learning filters to perturb flows in order to match target flows [607]. Also, constitutive laws can be identified. The individual arithmetic manipulations within a constitutive law can be represented as graphs, and an agent constructs the graph.

6.2 Extensions

Each interaction with the environment requires solving the differential equation, which, due to the many interactions, makes reinforcement learning expensive. The learning can be accelerated through some basic modifications. The learning can be perfectly parallelized by using multiple environments simultaneously [618], or by using multiple agents within the same environment [619]. Another idea is to construct a surrogate model of the environment and thereby exploit model-based approaches [620–623]. The general procedure consists of three steps:

• model learning: learn a surrogate of the environment,
• behavior learning: learn a policy or value function,
• environment interaction: apply the learned policy and collect data.

Most approaches construct the surrogate with data-driven modeling (Sect. 2.1), but physics-informed approaches have been proposed as well [620, 622] (Sect. 3.2).
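A compact sketch of the three steps with a toy environment: transitions are collected, a cheap surrogate of the dynamics is fitted, and a simple policy plans against the surrogate. All modeling choices here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def env(s, a):                       # expensive environment (stand-in for a PDE solve)
    return s + a, -abs(s + a)

# 1) model learning: fit a linear surrogate s' ~ w0*s + w1*a from collected data
S = rng.uniform(-5, 5, (100, 2))                      # sampled (state, action) pairs
S_next = np.array([env(s, a)[0] for s, a in S])
w, *_ = np.linalg.lstsq(S, S_next, rcond=None)

# 2) behavior learning: greedy one-step policy using the cheap surrogate
def policy(s, actions=(-1.0, 1.0)):
    return min(actions, key=lambda a: abs(w[0] * s + w[1] * a))

# 3) environment interaction: apply the policy and collect new data for refinement
s = 4.0
for _ in range(5):
    s, r = env(s, policy(s))
print(s)
```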
7 Conclusion and outlook

In order to structure the state-of-the-art, an overview of the most prominent deep learning methods employed in computational mechanics was presented. Five main categories were identified: simulation substitution, simulation enhancement, discretizations as NNs, generative approaches, and deep reinforcement learning.

Despite the variety and abundance of the literature, few approaches are competitive in comparison to classical methods. This manifests itself in the lack of comparisons of NN-based methods to classical methods in the literature. We have found little evidence that NN-based methods truly outperform classical methods in computational mechanics. However, with only few exceptions, current research is still in its early stages, with a focus on showcasing possibilities without devoting too much attention to accuracy and efficiency. Future research must, nevertheless, shift its focus to incorporate more in-depth investigations into the performance of the developed methods, including thorough and meaningful comparisons to performant classical methods dedicated to the task under investigation. This is in agreement with the recent review article on deep learning in topology optimization [22], where critical and fair assessments are requested. This includes the determination of generalization capabilities, greater transparency by including, e.g., worst-case performances to illustrate reliability, and computation times without disregarding the training time.

In line with this, and to the best of our knowledge, we provide a final overview outlining the potentials and limitations of the discussed methods.

• Simulation substitution has potential for surrogate modeling of parameterized models that need to be evaluated many times. However, currently this is only realizable for small parameter spaces, due to the amount of data required, and it is unlikely to replace established methods, as also stated in [42]. Complex problems can still be tackled by NN surrogates if they are first reduced to a low-dimensional space through model order reduction techniques. Physics-informed learning further reduces the amount of required data and improves the generalization capabilities. However, enforcing physics through penalty terms increases the computational effort, and the solutions still do not necessarily satisfy the corresponding physical laws. Instead, enforcing physical laws by construction guarantees that they are obeyed, which is preferable to adding constraints through penalty terms.
• Simulation enhancement is currently one of the most promising areas of investigation. It is in particular beneficial for tasks where classical methods show difficulties. An excellent example for this is the formulation of constitutive laws, which are inherently phenomenological and thereby well-suited to be identified from data using tools such as deep learning. In addition, simulation enhancement makes it possible to draw on insights gained from classical methods developed since the inception of computational mechanics. Furthermore, it is currently more realistic to learn smaller components of the simulation chain with NNs rather than the entire model. These components should ideally be expensive and have limited requirements regarding accuracy and reliability. Lastly, it is also easier to assess whether a method enhanced by deep learning outperforms the classical method, as direct and fair comparisons are readily possible.
• An interesting research direction is to employ discretizations as NNs, as this offers the potential to discover NNs tailored to computational mechanics tasks, analogous to CNNs for computer vision or RNNs and transformers for natural language processing. In computational mechanics, their main benefit seems to stem from being able to exploit the computational benefits of tools and hardware that were created for the wider community of deep learning, such as NN libraries programmed for GPUs, which enable an efficient, yet effortless massive parallelization. In our assessment, none of the methods encountered in this review were shown to be able to consistently outperform classical approaches using a comparable amount of computational resources.
• Generative approaches have been shown to be highly versatile in applications of computational mechanics, since the accuracy of a specific instance under investigation is less of a concern here. They have been used to generate statistically equivalent data to train other machine learning models, to incorporate vague constraints based on data within optimization frameworks, and to detect anomalies.
• Deep reinforcement learning has already shown encouraging results, for example in controlling unmanned vehicles in complex physics environments. It is mainly applicable for problems where efficient differentiable physics solvers are unavailable, which is why it is popular in control problems for turbulence. In the presence of differentiable solvers, gradient-based methods are, however, still the state-of-the-art [443] and, thus, preferred.

Acknowledgements The authors gratefully acknowledge the funding through the joint research project Geothermal-Alliance Bavaria (GAB) by the Bavarian State Ministry of Science and the Arts (StMWK) as well as the Georg Nemetschek Institut (GNI) under the project DeepMonitor.
Funding Open Access funding enabled and organized by Projekt DEAL.

Declarations

Conflict of interest No potential conflict of interest was reported by the authors.

In a directed graph,²⁸ each edge e_i has a sender node v_i^s and a receiver node v_i^r. This enables the formulation of an algorithm operating first on the edges, and subsequently on the nodes, as summarized in Algorithm 1 for a single graph block. These graph blocks can be stacked similarly to layers in other NN architectures.
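A single graph block of this kind can be sketched as follows: an edge update using sender and receiver node features, followed by an aggregation of incoming edges and a node update. The MLP sizes and the sum aggregation are assumptions for illustration, not the exact Algorithm 1:

```python
import torch

class GraphBlock(torch.nn.Module):
    """One graph block: update edges from their sender/receiver nodes,
    then update nodes from aggregated incoming edge features."""
    def __init__(self, dim):
        super().__init__()
        self.edge_mlp = torch.nn.Linear(3 * dim, dim)   # (edge, sender, receiver) -> edge
        self.node_mlp = torch.nn.Linear(2 * dim, dim)   # (node, aggregated edges) -> node

    def forward(self, v, e, senders, receivers):
        # 1) edge update, using sender and receiver node features
        e = torch.relu(self.edge_mlp(torch.cat([e, v[senders], v[receivers]], dim=-1)))
        # 2) aggregate edges at their receiver nodes (sum), then node update
        agg = torch.zeros_like(v).index_add_(0, receivers, e)
        v = torch.relu(self.node_mlp(torch.cat([v, agg], dim=-1)))
        return v, e

v = torch.randn(4, 16)                          # 4 nodes with 16 features each
senders = torch.tensor([0, 1, 2])               # 3 directed edges
receivers = torch.tensor([1, 2, 3])
e = torch.randn(3, 16)
block = GraphBlock(16)
v, e = block(v, e, senders, receivers)          # blocks can be stacked like layers
```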
Fig. 15 The fundamental operations within CNNs: in the top the convolution operation and in the bottom the pooling operation. Adapted from [37]
cost function:

C = 1/N_D Σ_{i=1}^{N_D} log D_NN(y_i; θ_D) + 1/N_G Σ_{i=1}^{N_G} log(1 − D_NN(G_NN(ξ_i; θ_G); θ_D)). (81)

Here, N_D real samples and N_G generated samples are used for training. The goal for the generator is to minimize the cost function, implying that the discriminator fails to distinguish between real and generated samples. However, the discriminator strives to maximize the cost. Therefore, this is formulated as a minimax optimization problem

min_{θ_G} max_{θ_D} C. (82)

Convergence is ideally reached at the Nash equilibrium [634], where the discriminator always outputs a probability of 1/2, signifying its inability to distinguish between real and generated samples. However, GANs can be challenging to train. Problems like mode collapse [635] can arise, where the generator learns only a few modes from the training data. In the extreme case, only a single sample from the training data is learned, yielding a low discriminator score, yet an undesirable outcome. To combat mode collapse, design diversity can be promoted either in the learning algorithm or in the cost [635, 636]. Another challenge lies in balancing the training of the two NNs. If the discriminator learns too quickly and manages to distinguish all generated samples, the gradient of the cost function (Eq. 81) with respect to the weights becomes zero, halting further progress. A possible remedy is to use the Wasserstein distance in the cost function [637].
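Equations (81) and (82) translate directly into an alternating optimization; the two small networks and the data below are placeholders:

```python
import torch

D = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.ReLU(),
                        torch.nn.Linear(16, 1), torch.nn.Sigmoid())
G = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.ReLU(), torch.nn.Linear(16, 2))

y_real = torch.randn(64, 2) + 3.0            # stand-in for the N_D real samples
xi = torch.randn(64, 4)                      # N_G random generator inputs

def cost():                                  # Eq. (81)
    return (torch.log(D(y_real) + 1e-8).mean()
            + torch.log(1.0 - D(G(xi)) + 1e-8).mean())

opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
for _ in range(200):                         # Eq. (82): alternating min/max steps
    opt_D.zero_grad(); (-cost()).backward(); opt_D.step()   # discriminator ascends C
    opt_G.zero_grad(); cost().backward(); opt_G.step()      # generator descends C
```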
Additionally, GANs can be modified to include inputs that control the generated data. This can be achieved in a supervised manner with conditional GANs [638]. The conditional GAN does not just receive random noise, but also an additional input. This supplementary input is considered by the discriminator, which assesses whether the input-output pairs are real or generated. An unsupervised alternative is InfoGANs [639], which disentangle the input information, i.e., the random input ξ, defining the generated data. This is achieved by introducing an additional parameter c, a latent code, to the generator G_NN(ξ, c; θ_G). To ensure that the parameter is used by the NN, the cost (Eq. 81) is extended by a mutual information term [640] I(c, G_NN(ξ, c; θ_G)), ensuring that the generated data varies meaningfully based on the input latent code c.

In comparison to variational autoencoders, GANs typically generate higher-quality data. However, the advantage of autoencoders lies in their ability to construct a well-structured latent space, where proper sampling leads to smooth interpolations in the generated space. In other words, small changes in the latent space correspond to small changes in the generated space, a characteristic not inherent to GANs. To achieve smooth interpolations, autoencoders can be combined with GANs [641], where the autoencoder acts as generator in the GAN framework, employing both an autoencoder loss and a GAN loss.
B.3 Diffusion models

Diffusion models enhanced by NNs [642–644] convert random noise x into a sample resembling the training data through a series of transformations. Given a data set {y_i^0}_{i=1}^{N}, a forward process q(x_t | x_{t−1}) is introduced, which adds Gaussian noise to x_{t−1} at each time step t − 1. The process is applied iteratively:

q(x_0, x_1, …, x_T) = q(x_0) Π_{t=1}^{T} q(x_t | x_{t−1}). (83)

After a sufficient number of iterations T, the resulting distribution approximates a Gaussian distribution. Consequently, a random sample from a Gaussian distribution x_T can be denoised with the reverse denoising process q(x_{t−1} | x_t), resulting in a sample x_0 that matches the original distribution q(x_0). The reverse denoising process is, however, unknown and is therefore modeled as a Gaussian distribution, where the mean and covariance are learned by a NN. With the learned denoising process, data can be generated by denoising samples drawn from a Gaussian distribution. Note the similarity to autoencoders: instead of learning a mapping to a hidden random state h_i, the encoding is prescribed as the iterative application of Gaussian noise [530].

A related approach is normalizing flows [645] (see [646] for an introduction and extensive review). Here, a basic probability distribution is transformed through a series of invertible transformations, i.e., flows. The goal is to model distributions of interest. The individual transformations can be modeled by NNs. A normalization is required, such that each intermediate probability distribution integrates to one.
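The forward process of Eq. (83) is easily simulated; with an assumed constant noise level β per step, the iterates drift towards a standard Gaussian (the learned part of a diffusion model is the reverse process, omitted here):

```python
import torch

# Forward diffusion: iteratively add Gaussian noise to a sample, assuming a fixed
# noise level beta per step; after many steps x_T is close to pure noise.
beta = 0.05
x = torch.randn(128) * 0.1 + 2.0          # x_0: a sample from the data distribution
for t in range(200):                      # q(x_t | x_{t-1}) = N(sqrt(1-beta) x_{t-1}, beta I)
    x = (1.0 - beta) ** 0.5 * x + beta ** 0.5 * torch.randn_like(x)
print(x.mean(), x.std())                  # approximately 0 and 1: nearly Gaussian
```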
C Deep reinforcement learning

In reinforcement learning, the environment is commonly modeled as a Markov Decision Process (MDP). This mathematical model is defined by a set of all possible states S, actions A, and associated rewards R. Furthermore, the probability of getting to the next state s_{t+1} from the previous state s_t with action a_t is given by P(s_{t+1} | s_t, a_t).
Thus, the environment is not necessarily deterministic. One key aspect of a Markov Decision Process is the Markov property, stating that future states depend solely on the current state and action, and not on the history of states and actions.

The goal of a reinforcement learning algorithm is to determine a policy π(s, a) which dictates the next action a_t in order to maximize the cumulative reward R. The cumulative reward R is discounted by a discount factor γ^t in order to give more importance to immediate rewards:

R = Σ_{t=0}^{∞} γ^t r_t. (84)

The quality of a policy π(s, a) can be assessed by a state-value function V_π(s), defined as the expected future reward given the current state s and following the policy π. Similarly, an action-value function Q_π(s, a) determines the expected future reward given the current state s and action a, while subsequently following the policy π. The expected value along a policy π is denoted as E_π:

V_π(s) = E_π[R(t) | s], (85)

Q_π(s, a) = E_π[R(t) | s, a]. (86)

The optimal value and quality functions correspondingly follow the optimal policy:

V(s) = max_π V_π(s), (87)

Q(s, a) = max_π Q_π(s, a). (88)

Deep reinforcement learning approaches commonly rely on an actor that represents the policy and a critic that judges its quality. Both can be modeled by NNs.

C.1 Deep policy networks

In deep policy networks, the policy, i.e., the mapping of states to actions, is modeled by a NN â = π(s; θ). The quality of the NN is assessed by the expected cumulative reward R, formulated in terms of the action-value function Q(s, a):

C = R = E[Q(s, a)]. (89)

Its gradient (see [38, 658, 660] for a derivation), given as

∇_θ R = E[Q(s, a) ∇_θ log π(s, a; θ)], (90)

can be applied within a gradient ascent scheme to learn the optimal policy.
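A minimal policy-gradient loop in the sense of Eqs. (89)-(90), using the episode return as a Monte Carlo estimate of Q(s, a) on a toy line-world (all modeling choices are illustrative):

```python
import torch

policy = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.Tanh(),
                             torch.nn.Linear(16, 2))   # logits for actions {-1, +1}
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(200):
    s, log_probs, rewards = torch.tensor([4.0]), [], []
    for t in range(10):                                # roll out one episode
        dist = torch.distributions.Categorical(logits=policy(s))
        a = dist.sample()
        log_probs.append(dist.log_prob(a))
        s = s + (2.0 * a.item() - 1.0)                 # action 0 -> -1, action 1 -> +1
        rewards.append(-abs(s.item()))                 # reward: stay close to the origin
    R = sum(rewards)                                   # return, crude estimate of Q
    loss = -R * torch.stack(log_probs).sum()           # ascend E[Q grad log pi], Eq. (90)
    opt.zero_grad(); loss.backward(); opt.step()
```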
C.2 Deep Q-learning

Deep Q-learning identifies the optimal action-value function Q(s, a), from which the optimal policy is extracted. Q-learning relies on the Bellman optimality criterion [666, 667]. By separating the reward r_0 at the first step, the recursion formula of the optimal state-value function, i.e., the Bellman optimality criterion, can be established.
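In practice, the criterion leads to the temporal-difference (TD) target r_t + γ max_a Q(s_{t+1}, a; θ), against which the current estimate Q(s_t, a_t; θ) is compared. A toy sketch of the resulting update, corresponding to the loss in Eq. (96) below, with an assumed line-world environment:

```python
import torch

Q = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.ReLU(),
                        torch.nn.Linear(16, 2))        # Q(s, .) for actions {-1, +1}
opt = torch.optim.Adam(Q.parameters(), lr=1e-3)
gamma = 0.9

s = torch.tensor([3.0])
for step in range(500):
    a = int(torch.randint(2, (1,)))                    # exploratory action index
    s_next = s + (2.0 * a - 1.0)
    r = -abs(s_next.item())
    with torch.no_grad():                              # TD(0) target looks one step ahead
        target = r + gamma * Q(s_next).max()
    td_error = target - Q(s)[a]
    loss = 0.5 * td_error ** 2                         # squared TD error, Eq. (96)
    opt.zero_grad(); loss.backward(); opt.step()
    s = s_next if abs(s_next.item()) < 10 else torch.tensor([3.0])
```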
Here, the TD target estimate only looks one step ahead and is therefore referred to as TD(0). The generalization is called TD(N); in the limit N → ∞, the method is equivalent to Monte Carlo learning, where all steps are performed and a true target is obtained.

Deep Q-learning introduces a NN for the action-value function Q(s, a; θ). Its quality is assessed with a loss composed of the mean squared error of the TD error:

C = E[ 1/2 (r_t + γ max_a Q(s_{t+1}, a; θ) − Q(s_t, a_t; θ))² ]. (96)

Lastly, the optimal policy π(s) maximizing the action-value function Q(s, a; θ) is extracted:

π(s) = arg max_a Q(s, a; θ). (97)

References

1. Abu-Mostafa YS, Magdon-Ismail M, Lin H-T (2012) Learning from data. AMLBook
2. Adie J, Juntao Y, Zhang X, See S (2018) Deep learning for computational science and engineering. In: GPU technology conference. https://on-demand.gputechconf.com/gtc/2018/presentation/S8242-Yang-Juntao-paper.pdf
3. Yagawa G, Okuda H (1996) Neural networks in computational mechanics. Arch Comput Methods Eng 3(4):435–512. https://doi.org/10.1007/BF02818935
4. Waszczyszyn Z, Ziemiański L (2001) Neural networks in mechanics of structures and materials—new results and prospects of applications. Comput Struct 79(22):2261–2276. https://doi.org/10.1016/S0045-7949(01)00083-9
5. Yagawa G, Oishi A (2021) Computational mechanics with neural networks. Lecture notes on numerical methods in engineering and sciences. Springer, Cham
6. Song SJ, Schmerr LW (1992) Ultrasonic flaw classification in weldments using probabilistic neural networks. J Nondestr Eval 11(2):69–77. https://doi.org/10.1007/BF00568290
7. Yagawa G, Yoshimura S, Mochizuki Y, Oishi T (1993) Identification of crack shape hidden in solid by means of neural network and computational mechanics. In: Masataka T, Huy Duong B (eds) Inverse problems in engineering mechanics, international union of theoretical and applied mechanics. Springer, Berlin, pp 213–222. https://doi.org/10.1007/978-3-642-52439-4_21
8. Psichogios DC, Ungar LH (1992) A hybrid neural network-first principles approach to process modeling. AIChE J 38(10):1499–1511. https://doi.org/10.1002/aic.690381003
9. Dissanayake MWMG, Phan-Thien N (1994) Neural-network-based approximations for solving partial differential equations. Commun Numer Methods Eng 10(3):195–201. https://doi.org/10.1002/cnm.1640100303
10. Lagaris IE, Likas A, Fotiadis DI (1998) Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans Neural Netw 9(5):987–1000. https://doi.org/10.1109/72.712178
11. Theocaris Pericles S, Panagiotopoulos PD (1995) Generalised hardening plasticity approximated via anisotropic elasticity: a neural network approach. Comput Methods Appl Mech Eng 125(1):123–139. https://doi.org/10.1016/0045-7825(94)00769-J
12. Tomonari F, Genki Y (1998) Implicit constitutive modelling for viscoplasticity using neural networks. Int J Numer Methods Eng 43(2):195–219
13. Okuda H, Yoshimura S, Yagawa G, Matsuda A (1998) Neural network-based parameter estimation for non-linear finite element analyses. Eng Comput 15(1):103–138. https://doi.org/10.1108/02644409810200721
14. Jun T, Yukio K (1994) Neural network representation of finite element method. Neural Netw 7(2):389–395. https://doi.org/10.1016/0893-6080(94)90031-0
15. Yagawa G, Okuda H (1996) Finite element solutions with feedback network mechanism through direct minimization of energy functionals. Int J Numer Methods Eng 39(5):867–883
16. Topping BHV, Khan AI, Bahreininejad A (1997) Parallel training of neural networks for finite element mesh decomposition. Comput Struct 63(4):693–707. https://doi.org/10.1016/S0045-7949(96)00082-X
17. Rezende DJ, Mohamed S, Wierstra D (2014) Stochastic backpropagation and approximate inference in deep generative models. arXiv:1401.4082 [cs, stat]
18. Kingma Diederik P, Welling M (2022) Auto-encoding variational bayes. arXiv:1312.6114 [cs, stat]
19. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Advances in neural information processing systems, vol 27. Curran Associates, Inc. https://papers.nips.cc/paper_files/paper/2014/hash/5ca3e9b122f61f8f06494c97b1afccf3-Abstract.html
20. Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G, Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran D, Wierstra D, Legg S, Hassabis D (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533. https://doi.org/10.1038/nature14236
21. Zhang D, Maslej N, Brynjolfsson E, Etchemendy J, Lyons T, Manyika J, Ngo H, Niebles JC, Sellitto M, Sakhaee E, Shoham Y, Clark J, Perrault R (2022) The AI index 2022 annual report. arXiv:2205.03468 [cs]
22. Woldseth RV, Aage N, Andreas Bærentzen J, Sigmund O (2022) On the use of artificial neural networks in topology optimisation. Struct Multidiscip Optim 65(10):294. https://doi.org/10.1007/s00158-022-03347-1
23. Seungyeon S, Dongju S, Namwoo K (2023) Topology optimization via machine learning and deep learning: a review. J Comput Des Eng 10(4):1736–1766. https://doi.org/10.1093/jcde/qwad072
24. Adler A, Araya-Polo M, Poggio T (2021) Deep learning for seismic inverse problems: toward the acceleration of geophysical analysis workflows. IEEE Signal Process Mag 38(2):89–119. https://doi.org/10.1109/MSP.2020.3037429
25. Garnier P, Viquerat J, Rabault J, Larcher A, Kuhnle A, Hachem E (2019) A review on deep reinforcement learning for fluid mechanics. arXiv:1908.04127 [physics]
26. Karthik D, Gianluca I, Heng X (2019) Turbulence modeling in the age of data. Ann Rev Fluid Mech 51(1):357–377. https://doi.org/10.1146/annurev-fluid-010518-040547
27. Brunton S, Noack B, Koumoutsakos P (2020) Machine learning for fluid mechanics. Annu Rev Fluid Mech 52(1):477–508. https://doi.org/10.1146/annurev-fluid-010719-060214. arXiv:1905.11075
28. Cai S, Mao Z, Wang Z, Yin M, Karniadakis GE (2021) Physics-informed neural networks (PINNs) for fluid mechanics: a review. Acta Mech Sin 37(12):1727–1738. https://doi.org/10.1007/s10409-021-01148-1
29. Giovanni C, Wei L (2021) Deep learning to replace, improve, or aid CFD analysis in built environment applications: a review. Build Environ 206:108315. https://doi.org/10.1016/j.buildenv.2021.108315
30. Bock FE, Aydin RC, Cyron CJ, Huber N, Kalidindi SR, Klusemann B (2019) A review of the application of machine learning and data mining approaches in continuum materials mechanics. Front Mater 6:110. https://doi.org/10.3389/fmats.2019.00110
31. Bishara D, Xie Y, Liu WK, Li S (2023) A state-of-the-art review on machine learning-based multiscale modeling, simulation, homogenization and design of materials. Arch Comput Methods Eng 30(1):191–222. https://doi.org/10.1007/s11831-022-09795-8
32. Max R, Kalina Karl A, Jörg B, Markus K (2023) A comparative study on different neural network architectures to model inelasticity. Int J Numer Methods Eng. https://doi.org/10.1002/nme.7319
33. Lyle R, Heyrani NA, Faez A (2022) Deep generative models in engineering design: a review. J Mech Des 144(7):071704. https://doi.org/10.1115/1.4053859
34. Moosavi SM, Jablonka KM, Smit B (2020) The role of machine learning in the understanding and design of materials. J Am Chem Soc 142(48):20273–20287. https://doi.org/10.1021/jacs.0c09105
35. Faller William E, Schreck Scott J (1996) Neural networks: applications and opportunities in aeronautics. Progress Aerosp Sci 32(5):433–456. https://doi.org/10.1016/0376-0421(95)00011-9
36. Thuerey N, Holl P, Mueller M, Schnell P, Trost F, Um K (2022) Physics-based deep learning. arXiv:2109.05237 [physics]
37. Kollmannsberger S, D'Angella D, Jokeit M, Herrmann L (2021) Deep learning in computational mechanics: an introductory course, vol 977. Studies in computational intelligence. Springer, Cham
38. Brunton SL, Kutz JN (2022) Data-driven science and engineering: machine learning, dynamical systems, and control. Cambridge University Press, Cambridge
39. Anuj K, Ramakrishnan K, Vipin K (2022) Knowledge guided machine learning: accelerating discovery using scientific knowledge and data. Chapman and Hall/CRC, New York. https://doi.org/10.1201/9781003143376
40. Yagawa G, Oishi A (2023) Computational mechanics with deep learning: an introduction. Springer, Cham
41. Rabczuk T, Bathe K-J (2023) Machine learning in modeling and simulation: methods and applications. Springer
42. Baker N, Alexander F, Bremer T, Hagberg A, Kevrekidis Y, Najm H, Parashar M, Patra A, Sethian J, Wild S, Willcox K, Lee S (2019) Workshop report on basic research needs for scientific machine learning: core technologies for artificial intelligence. Technical Report 1478744. http://www.osti.gov/servlets/purl/1478744/
43. von Rueden L, Mayer S, Beckh K, Georgiev B, Giesselbach S, Heese R, Kirsch B, Pfrommer J, Pick A, Ramamurthy R, Walczak M, Garcke J, Bauckhage C, Schuecker J (2023) Informed machine learning—a taxonomy and survey of integrating prior knowledge into learning systems. IEEE Trans Knowl Data Eng 35(1):614–633. https://doi.org/10.1109/TKDE.2021.3079836
44. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press
45. Sutton RS, Barto AG (2018) Reinforcement learning: an introduction, 2nd edn. Adaptive computation and machine learning series. The MIT Press, Cambridge
46. Alpaydin E (2020) Introduction to machine learning, 4th edn. Adaptive computation and machine learning series. The MIT Press, Cambridge
47. Russell SJ, Norvig P (2022) Artificial intelligence: a modern approach, 4th edn. Pearson series in artificial intelligence. Pearson, Harlow
48. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Köpf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) PyTorch: an imperative style, high-performance deep learning library. arXiv:1912.01703 [cs, stat]
49. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M, Ghemawat S, Goodfellow I, Harp A, Irving G, Isard M, Jia Y, Jozefowicz R, Kaiser L, Kudlur M, Levenberg J, Mane D, Monga R, Moore S, Murray D, Olah C, Schuster M, Shlens J, Steiner B, Sutskever I, Talwar K, Tucker P, Vanhoucke V, Vasudevan V, Viegas F, Vinyals O, Warden P, Wattenberg M, Wicke M, Yu Y, Zheng X (2016) TensorFlow: large-scale machine learning on heterogeneous distributed systems. arXiv:1603.04467 [cs]
50. Kurt H, Maxwell S, Halbert W (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366. https://doi.org/10.1016/0893-6080(89)90020-8
51. Baydin AG, Pearlmutter BA, Radul AA, Siskind JM (2018) Automatic differentiation in machine learning: a survey, p 43
52. Kingma DP, Ba J (2017) Adam: a method for stochastic optimization. arXiv:1412.6980 [cs]
53. Nocedal J, Wright SJ (2006) Numerical optimization, 2nd edn. Springer series in operations research. Springer, New York
54. Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65(6):386–408. https://doi.org/10.1037/h0042519
55. LeCun Y, Boser B, Denker J, Henderson D, Howard R, Hubbard W, Jackel L (1989) Handwritten digit recognition with a back-propagation network. In: Advances in neural information processing systems, vol 2. Morgan-Kaufmann. https://proceedings.neurips.cc/paper/1989/hash/53c3bce66e43be4f209556518c2fcb54-Abstract.html
56. LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551. https://doi.org/10.1162/neco.1989.1.4.541
57. Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324. https://doi.org/10.1109/5.726791
58. Rumelhart David E, Hinton Geoffrey E, Williams Ronald J (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536. https://doi.org/10.1038/323533a0
59. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
60. Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 1724–1734. https://doi.org/10.3115/v1/D14-1179
61. Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. arXiv:1609.02907 [cs, stat]
62. Monti F, Shchur O, Bojchevski A, Litany O, Günnemann S, Bronstein MM (2018) Dual-primal graph convolutional networks. arXiv:1806.00770 [cs, stat]
63. Battaglia PW, Hamrick JB, Bapst V, Sanchez-Gonzalez A, Zambaldi V, Malinowski M, Tacchetti A, Raposo D, Santoro A, Faulkner R, Gulcehre C, Song F, Ballard A, Gilmer J, Dahl G, Vaswani A, Allen K, Nash C, Langston V, Dyer C, Heess N, Wierstra D, Kohli P, Botvinick M, Vinyals O, Li Y, Pascanu R (2018) Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261 [cs, stat]
64. Henkes A, Eshraghian JK, Wessels H (2022) Spiking neural networks for nonlinear regression. arXiv:2210.03515 [cs]
65. Tandale SB, Stoffel M (2023) Spiking recurrent neural networks for neuromorphic computing in nonlinear structural mechanics. Comput Methods Appl Mech Eng 412:116095. https://doi.org/10.1016/j.cma.2023.116095
66. Gerstner W, Kistler WM (2002) Spiking neuron models: single neurons, populations, plasticity, 1st edn. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511815706
67. Hughes Thomas JR, Hulbert GM (1988) Space-time finite element methods for elastodynamics: formulations and error estimates. Comput Methods Appl Mech Eng 66(3):339–363. https://doi.org/10.1016/0045-7825(88)90006-0
68. Alsalman M, Colvert B, Kanso E (2018) Training bioinspired sensors to classify flows. Bioinspir Biomimet 14(1):016009. https://doi.org/10.1088/1748-3190/aaef1d
69. Colvert B, Alsalman M, Kanso E (2018) Classifying vortex wakes using neural networks. Bioinspir Biomimet 13(2):025003. https://doi.org/10.1088/1748-3190/aaa787
70. Pierret S, Van Den Braembussche RA (1999) Turbomachinery blade design using a Navier–Stokes solver and artificial neural network. J Turbomach 121(2):326–332. https://doi.org/10.1115/1.2841318
71. Vurtur Badarinath P, Chierichetti M, Davoudi Kakhki F (2021) A machine learning approach as a surrogate for a finite element analysis: status of research and application to one dimensional systems. Sensors 21(5):1654. https://doi.org/10.3390/s21051654
72. Lee C, Kim J, Babcock D, Goodman R (1997) Application of neural networks to turbulence control for drag reduction. Phys Fluids 9(6):1740–1747. https://doi.org/10.1063/1.869290
73. Jambunathan K, Hartle SL, Ashforth-Frost S, Fontama VN (1996) Evaluating convective heat transfer coefficients using neural networks. Int J Heat Mass Transfer 39(11):2329–2332. https://doi.org/10.1016/0017-9310(95)00332-0
74. Tracey BD, Duraisamy K, Alonso JJ (2015) A machine learning strategy to assist turbulence model development. In: 53rd AIAA aerospace sciences meeting. American Institute of Aeronautics and Astronautics, Kissimmee. https://doi.org/10.2514/6.2015-1287
75. Ramuhalli P, Udpa L, Udpa SS (2002) Electromagnetic NDE signal inversion by function-approximation neural networks. IEEE Trans Magn 38(6):3633–3642. https://doi.org/10.1109/TMAG.2002.804817
76. Araya-Polo M, Jennings J, Adler A, Dahlke T (2018) Deep-learning tomography. Lead Edge 37(1):58–66. https://doi.org/10.1190/tle37010058.1
77. Kim Y, Nakata N (2018) Geophysical inversion versus machine learning in inverse problems. Lead Edge 37(12):894–901. https://doi.org/10.1190/tle37120894.1
78. Hoang V-N, Nguyen N-L, Tran DQ, Vu Q-V, Nguyen-Xuan H (2022) Data-driven geometry-based topology optimization. Struct Multidiscip Optim 65(2):69. https://doi.org/10.1007/s00158-022-03170-8
79. Zhang X, Garikipati K (2023) Label-free learning of elliptic partial differential equation solvers with generalizability across boundary value problems. Comput Methods Appl Mech Eng. https://doi.org/10.1016/j.cma.2023.116214
80. Thuerey N, Weißenow K, Prantl L, Xiangyu H (2020) Deep learning methods for Reynolds-averaged Navier–Stokes simulations of airfoil flows. AIAA J 58(1):25–36. https://doi.org/10.2514/1.J058291
81. Li-Wei C, Cakal Berkay A, Xiangyu H, Nils T (2021) Numerical investigation of minimum drag profiles in laminar flow using deep learning surrogates. J Fluid Mech 919:A34. https://doi.org/10.1017/jfm.2021.398
82. Chen X, Zhao X, Gong Z, Zhang J, Zhou W, Chen X, Yao W (2021) A deep neural network surrogate modeling benchmark for temperature field prediction of heat source layout. Sci China Phys Mech Astron 64(11):1. https://doi.org/10.1007/s11433-021-1755-6
83. Chen LW, Thuerey N (2023) Towards high-accuracy deep learning inference of compressible flows over aerofoils. Comput Fluids 250:105707. https://doi.org/10.1016/j.compfluid.2022.105707
84. Khadilkar A, Wang J, Rai R (2019) Deep learning-based stress prediction for bottom-up SLA 3D printing process. Int J Adv Manuf Technol 102(5):2555–2569. https://doi.org/10.1007/s00170-019-03363-4
85. Zhenguo N, Haoliang J, Burak KL (2020) Stress field prediction in cantilevered structures using convolutional neural networks. J Comput Inform Sci Eng 20(1):011002. https://doi.org/10.1115/1.4044097
86. Guo X, Li W, Iorio F (2016) Convolutional neural networks for steady flow approximation. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 481–490. https://doi.org/10.1145/2939672.2939738
87. Zhang Z, Jaiswal P, Rai R (2018) FeatureNet: machining feature recognition based on 3D convolution neural network. Comput Aided Des 101:12–22. https://doi.org/10.1016/j.cad.2018.03.006
88. Williams G, Meisel NA, Simpson TW, McComb C (2019) Design repository effectiveness for 3d convolutional neural networks: application to additive manufacturing. J Mech Des 141(11):111701. https://doi.org/10.1115/1.4044199
89. Wu Y, Lin Y, Zhou Z (2018) Inversionet: accurate and efficient seismic-waveform inversion with convolutional neural networks. In: SEG technical program expanded abstracts 2018. Society of Exploration Geophysicists, Anaheim, pp 2096–2100. https://doi.org/10.1190/segam2018-2998603.1
90. Wang W, Yang F, Ma J (2018) Velocity model building with a modified fully convolutional network. In: SEG technical program expanded abstracts 2018. Society of Exploration Geophysicists, Anaheim, pp 2086–2090. https://doi.org/10.1190/segam2018-2997566.1
91. Yang F, Ma J (2019) Deep-learning inversion: a next-generation seismic velocity model building method. Geophysics 84(4):R583–R599. https://doi.org/10.1190/geo2018-0249.1
92. Zheng Y, Zhang Q, Yusifov A, Shi Y (2019) Applications of supervised deep learning for seismic interpretation and inversion. Lead Edge 38(7):526–533. https://doi.org/10.1190/tle38070526.1
93. Araya-Polo M, Farris S, Florez M (2019) Deep learning-driven velocity model building workflow. Lead Edge 38(11):872–872. https://doi.org/10.1190/tle38110872a1.1
94. Das V, Pollack A, Wollner U, Mukerji T (2019) Convolutional neural network for seismic impedance inversion. Geophysics 84(6):R869–R880. https://doi.org/10.1190/geo2018-0838.1
95. Wang W, Ma J (2020) Velocity model building in a cross-well acquisition geometry with image-trained artificial neural networks. Geophysics 85(2):U31–U46. https://doi.org/10.1190/geo2018-0591.1
96. Li S, Liu B, Ren Y, Chen Y, Yang S, Wang Y, Jiang P (2020) Deep-learning inversion of seismic data. IEEE Trans Geosci Remote Sens 58(3):2135–2149. https://doi.org/10.1109/TGRS.2019.2953473
97. Bangyu W, Meng D, Wang L, Liu N, Wang Y (2020) Seismic impedance inversion using fully convolutional residual network and transfer learning. IEEE Geosci Remote Sens Lett 17(12):2140–2144. https://doi.org/10.1109/LGRS.2019.2963106
98. Park MJ, Sacchi MD (2020) Automatic velocity analysis using convolutional neural network and transfer learning. Geophysics 85(1):V33–V43. https://doi.org/10.1190/geo2018-0870.1
99. Ye J, Toyama N (2022) Automatic defect detection for ultrasonic wave propagation imaging method using spatio-temporal convolution neural networks. Struct Health Monit 21(6):2750–2767. https://doi.org/10.1177/14759217211073503
100. Jing R, Fangshu Y, Huadong M, Stefan K, Ernst R (2023) Quantitative reconstruction of defects in multi-layered bonded composites using fully convolutional network-based ultrasonic inversion. J Sound Vib 542:117418. https://doi.org/10.1016/j.jsv.2022.117418
101. Qiyin L, Jun H, Zheng L, Baotong L, Jihong W (2018) Investigation into the topology optimization for conductive heat transfer based on deep learning approach. Int Commun Heat Mass Transfer 97:103–109. https://doi.org/10.1016/j.icheatmasstransfer.2018.07.001
102. Yonggyun Yu, Hur T, Jung J, Jang IG (2019) Deep learning for determining a near-optimal topological design without any iteration. Struct Multidiscip Optim 59(3):787–799. https://doi.org/10.1007/s00158-018-2101-5
103. Abueidda Diab W, Seid K, Sobh Nahil A (2020) Topology optimization of 2D structures with nonlinearities using deep learning. Comput Struct 237:106283. https://doi.org/10.1016/j.compstruc.2020.106283
104. Nakamura K, Suzuki Y (2020) Deep learning-based topological optimization for representing a user-specified design area. arXiv:2004.05461
105. Zhang Y, Peng B, Zhou X, Xiang C, Wang D (2020) A deep convolutional neural network for topology optimization with strong generalization ability. arXiv:1901.07761 [cs, stat]
106. Zheng S, He Z, Liu H (2021) Generating three-dimensional structural topologies via a U-Net convolutional neural network. Thin-Walled Struct 159:107263. https://doi.org/10.1016/j.tws.2020.107263
107. Shuai Z, Haojie F, Ziyu Z, Zhiqiang T, Kang J (2021) Accurate and real-time structural topology prediction driven by deep learning under moving morphable component-based framework. Appl Math Modell 97:522–535. https://doi.org/10.1016/j.apm.2021.04.009
108. Wang D, Xiang C, Pan Y, Chen A, Zhou X, Zhang Y (2022) A deep convolutional neural network for topology optimization with perceptible generalization ability. Eng Optim 54(6):973–988. https://doi.org/10.1080/0305215X.2021.1902998
109. Jun Y, Zhang Qi X, Qi FZ, Haijiang L, Wei S, Guangyuan W (2022) Deep learning driven real time topology optimisation based on initial stress learning. Adv Eng Inform 51:101472. https://doi.org/10.1016/j.aei.2021.101472
110. Seo J, Kapania RK (2023) Topology optimization with advanced CNN using mapped physics-based data. Struct Multidiscip Optim 66(1):21. https://doi.org/10.1007/s00158-022-03461-0
111. Ivan S, Ivan O (2019) Neural networks for topology optimization. Russian J Numer Anal Math Modell 34(4):215–223. https://doi.org/10.1515/rnam-2019-0018
112. Joo Y, Yonggyun Yu, Jang IG (2021) Unit module-based convergence acceleration for topology optimization using the spatiotemporal deep neural network. IEEE Access 9:149766–149779. https://doi.org/10.1109/ACCESS.2021.3125014
113. Kallioras NA, Kazakis G, Lagaros ND (2020) Accelerated topology optimization by means of deep learning. Struct Multidiscip Optim 62(3):1185–1212. https://doi.org/10.1007/s00158-020-02545-z
114. Sanchez-Gonzalez A, Godwin J, Pfaff T, Ying R, Leskovec J, Battaglia PW (2020) Learning to simulate complex physics with graph networks. arXiv:2002.09405
115. Pfaff T, Fortunato M, Sanchez-Gonzalez A, Battaglia PW (2021) Learning mesh-based simulation with graph networks. arXiv:2010.03409
116. Roberto P, Davide G, Vinamra A (2022) Graph neural networks for simulating crack coalescence and propagation in brittle materials. Comput Methods Appl Mech Eng 395:115021. https://doi.org/10.1016/j.cma.2022.115021
117. Qi CR, Su H, Mo K, Guibas LJ (2017) PointNet: deep learning on point sets for 3D classification and segmentation. arXiv:1612.00593
118. Groueix T, Fisher M, Kim VG, Russell BC, Aubry M (2018) AtlasNet: a Papier-Mâché approach to learning 3D surface generation. arXiv:1802.05384 [cs]
119. Cunningham JD, Simpson TW, Tucker CS (2019) An investigation of surrogate models for efficient performance-based decoding of 3D point clouds. J Mech Des 141(12):121401. https://doi.org/10.1115/1.4044597
120. Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer series in statistics. Springer, New York
121. Tobias H, Hans-Peter M (2009) Statistical shape models for 3D medical image segmentation: a review. Med Image Anal 13(4):543–563. https://doi.org/10.1016/j.media.2009.05.004
122. Bhattacharya K, Hosseini B, Kovachki NB, Stuart AM (2021) Model reduction and neural networks for parametric PDEs. SMAI J Comput Math 7:121–157. https://doi.org/10.5802/smai-jcm.74
123. Berkooz G, Holmes P, Lumley JL (1993) The proper orthogonal decomposition in the analysis of turbulent flows. Annu Rev Fluid Mech 25(1):539–575. https://doi.org/10.1146/annurev.fl.25.010193.002543
124. Muñoz D, Allix O, Chinesta F, Ródenas JJ, Nadal E (2023) Manifold learning for coherent design interpolation based on geometrical and topological descriptors. Comput Methods Appl Mech Eng 405:115859. https://doi.org/10.1016/j.cma.2022.115859
125. Liang L, Liu M, Martin C, Sun W (2018) A deep learning approach to estimate stress distribution: a fast and accurate surrogate of finite-element analysis. J R Soc Interface 15(138):20170844. https://doi.org/10.1098/rsif.2017.0844
126. Ali M, Ahmed B, Jiwon K, Yara M, Mofrad Mohammad RK (2019) Bridging finite element and machine learning modeling: stress prediction of arterial walls in atherosclerosis. J Biomech Eng 141(8):084502. https://doi.org/10.1115/1.4043290
127. Muravleva E, Oseledets I, Koroteev D (2018) Application of machine learning to viscoplastic flow modeling. Phys Fluids 30(10):103102. https://doi.org/10.1063/1.5058127
128. Liang L, Liu M, Martin C, Sun W (2018) A machine learning approach as a surrogate of finite element analysis-based inverse method to estimate the zero-pressure geometry of human thoracic aorta. Int J Numer Methods Biomed Eng 34(8):e3103. https://doi.org/10.1002/cnm.3103
129. Derouiche K, Garois S, Champaney V, Daoud M, Traidi K, Chinesta F (2021) Data-driven modeling for multiphysics parametrized problems-application to induction hardening process. Metals 11(5):738. https://doi.org/10.3390/met11050738
130. Quercus H, Alberto B, Francisco C, Elías C (2023) Thermodynamics-informed neural networks for physically realistic mixed reality. Comput Methods Appl Mech Eng 407:115912. https://doi.org/10.1016/j.cma.2023.115912
131. Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507. https://doi.org/10.1126/science.1127647
132. Michele M, Petros K (2002) Neural network modeling for near wall turbulent flow. J Comput Phys 182(1):1–26. https://doi.org/10.1006/jcph.2002.7146
133. Siddharth N, Walsh Timothy F, Greg P, Fabio S (2023) GRIDS-Net: inverse shape design and identification of scatterers via geometric regularization and physics-embedded deep learning. Comput Methods Appl Mech Eng 414:116167. https://doi.org/10.1016/j.cma.2023.116167
134. Ana F-N, Diego Z-S, Omella Ángel J, David P, David G-S, Filipe M (2022) Supervised deep learning with finite element simulations for damage identification in bridges. Eng Struct 257:114016. https://doi.org/10.1016/j.engstruct.2022.114016
123
316 Computational Mechanics (2024) 74:281–331
135. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolu- 150. Clark DLP, Lu L, Charles M, Em KG, Zaki Tamer A (2023) Neu-
tional networks for biomedical image segmentation. In Nassir N, ral operator prediction of linear instability waves in high-speed
Joachim H, Wells WM, Frangi AF (eds) Medical image comput- boundary layers. J Comput Phys 474:111793. https://doi.org/10.
ing and computer-assisted intervention—MICCAI 2015. Lecture 1016/j.jcp.2022.111793
notes in computer science. Springer, Cham, pp 234–241. https:// 151. Seid K, Abueidda Diab W (2023) Data-driven and physics-
doi.org/10.1007/978-3-319-24574-4_28 informed deep learning operators for solution of heat
136. Zhou Z, Siddiquee MMR, Tajbakhsh N, Liang J (2018) UNet++: conduction equation with parametric heat source. Int J
a nested U-Net architecture for medical image segmentation. Heat Mass Transfer 203:123809. https://doi.org/10.1016/j.
arXiv:1807.10165 [cs, eess, stat] ijheatmasstransfer.2022.123809
137. Lu L, Xuhui M, Shengze C, Zhiping M, Somdatta G, Zhongqiang 152. Liu C, He Q, Zhao A, Tao W, Song Z, Liu B, Feng C (2023) Oper-
Z, Em KG (2022) A comprehensive and fair comparison of two ator learning for predicting mechanical response of hierarchical
neural operators (with practical extensions) based on FAIR data. composites with applications of inverse design. Int J Appl Mech
Comput Methods Appl Mech Eng 393:114778. https://doi.org/ 15(04):2350028. https://doi.org/10.1142/S175882512350028X
10.1016/j.cma.2022.114778 153. Ahmed Shady E, Panos S (2023) A multifidelity deep opera-
138. Chen T, Chen H (1995) Universal approximation to nonlinear tor network approach to closure for multiscale systems. Comput
operators by neural networks with arbitrary activation functions Methods Appl Mech Eng 414:116161. https://doi.org/10.1016/j.
and its application to dynamical systems. IEEE Trans Neural Netw cma.2023.116161
6(4):911–917. https://doi.org/10.1109/72.392253 154. Wang S, Wang H, Perdikaris P (2021) Learning the solution
139. Lu L, Pengzhan J, Guofei P, Zhongqiang Z, Em KG (2021) Learn- operator of parametric partial differential equations with physics-
ing nonlinear operators via DeepONet based on the universal informed DeepONets. Sci Adv 7(40):8605. https://doi.org/10.
approximation theorem of operators. Nat Mach Intell 3(3):218– 1126/sciadv.abi8605
229. https://doi.org/10.1038/s42256-021-00302-5 155. Somdatta G, Yin Minglang Yu, Yue KG (2022) A physics-
140. Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K, informed variational DeepONet for predicting crack path in quasi-
Stuart A, Anandkumar A (2021) Fourier neural operator for para- brittle materials. Comput Methods Appl Mech Eng 391:114587.
metric partial differential equations. arXiv:2010.08895 https://doi.org/10.1016/j.cma.2022.114587
141. Chensen L, Martin M, Zhen L, Em KG (2021) A seamless mul- 156. Goswami S, Bora A, Yu Y, Karniadakis GE (2022) Physics-
tiscale operator neural network for inferring bubble dynamics. J informed deep neural operator networks. arXiv:2207.05748 [cs,
Fluid Mech 929:A18. https://doi.org/10.1017/jfm.2021.866 math]
142. Mao Zhiping LL, Olaf M, Zaki Tamer A, Em KG (2021) 157. Kovachki N, Lanthaler S, Mishra S (2021) On universal approx-
DeepM&Mnet for hypersonics: predicting the coupled flow and imation and error bounds for Fourier neural operators. J Mach
finite-rate chemistry behind a normal shock using neural-network Learn Res 22(1):290:13237-290:13312
approximation of operators. J Comput Phys 447:110698. https:// 158. Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K,
doi.org/10.1016/j.jcp.2021.110698 Stuart A, Anandkumar A (2020) Neural operator: graph kernel
143. Clark DLP, Lu L, Meneveau C, Karniadakis G, Zaki TA (2021) network for partial differential equations. arXiv:2003.03485 [cs,
DeepONet prediction of linear instability waves in high-speed math, stat]
boundary layers. arXiv:2105.08697 [physics] 159. Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K,
144. Shengze C, Wang Zhicheng LL, Zaki Tamer A, Em KG Stuart A, Anandkumar A (2020) Multipole graph neural operator
(2021) DeepM&Mnet: inferring the electroconvection multi- for parametric partial differential equations. In: Proceedings of the
physics fields based on operator approximation by neural net- 34th international conference on neural information processing
works. J Comput Phys 436:110296. https://doi.org/10.1016/j.jcp. systems, NIPS’20. Curran Associates Inc., Red Hook, pp 6755–
2021.110296 6766
145. Chensen L, Li Zhen LL, Shengze C, Martin M, Em KG (2021) 160. Cao Q, Goswami S, Karniadakis GE (2023) LNO: laplace neural
Operator learning for predicting multiscale bubble growth dynam- operator for solving differential equations. arXiv:2303.10528 [cs]
ics. J Chem Phys 154(10):104118. https://doi.org/10.1063/5. 161. Zhu C, Ye H, Zhan B (2021) Fast solver of 2D Maxwell’s
0041203 equations based on Fourier neural operator. In: 2021 Photon-
146. Minglang Y, Ehsan B, Rego Bruno V, Enrui Z, Cristina C, ics and electromagnetics research symposium (PIERS). IEEE,
Humphrey Jay D, Em KG (2022) Simulating progressive intra- Hangzhou, pp 1635–1643. https://doi.org/10.1109/PIERS53385.
mural damage leading to aortic dissection using DeepONet: 2021.9695119
an operator-regression neural network. J Roy Soc Interface 162. Chao S, Yanghua W (2022) High-frequency wavefield extrapola-
19(187):20210670. https://doi.org/10.1098/rsif.2021.0670 tion using the Fourier neural operator. J Geophys Eng 19(2):269–
147. Osorio Julian D, Zhicheng W, George K, Shengze C, Chrys 282. https://doi.org/10.1093/jge/gxac016
C, Mayank P, Mayank H (2022) Forecasting solar-thermal sys- 163. Wei W, Li-Yun F (2022) Small-data-driven fast seismic sim-
tems performance under transient operation using a data-driven ulations for complex media using physics-informed Fourier
machine learning approach based on the deep operator network neural operators. Geophysics 87(6):T435–T446. https://doi.org/
architecture. Energy Convers Manag 252:115063. https://doi.org/ 10.1190/geo2021-0573.1
10.1016/j.enconman.2021.115063 164. Mehran RM, Tanu P, Souvik C, Anoop Krishnan NM (2022)
148. Goswami S, Li DS, Rego BV, Latorre M, Humphrey JD, Kar- Learning the stress-strain fields in digital composites using
niadakis GE (2022) Neural operator learning of heterogeneous Fourier neural operator. iScience 25(11):105452. https://doi.org/
mechanobiological insults contributing to aortic aneurysms. J 10.1016/j.isci.2022.105452
R Soc Interface 19(193):20220410. https://doi.org/10.1098/rsif. 165. Kai Z, Yuande Z, Hanjun Z, Ma Xiaopeng G, Jianwei WJ, Yongfei
2022.0410 Y, Chuanjin Y, Jun Y (2022) Fourier neural operator for solving
149. Seid K, Asha V, Abueidda Diab W, Sobh Nahil A, Kamran K subsurface oil/water two-phase flow partial differential equation.
(2023) Deep learning operator network for plastic deformation SPE J 27(03):1815–1830. https://doi.org/10.2118/209223-PA
with variable loads and material properties. Eng Comput. https:// 166. Bicheng Y, Bailian C, Dylan RH, Wei J, Pawar Rajesh J (2022) A
doi.org/10.1007/s00366-023-01822-x robust deep learning workflow to predict multiphase flow behavior
during geological CO2 sequestration injection and Post-Injection
123
Computational Mechanics (2024) 74:281–331 317
123
318 Computational Mechanics (2024) 74:281–331
199. Gonzalez FJ, Balajewicz M (2018) Deep convolutional recurrent In: Proceedings of the 32nd international conference on neural
autoencoders for learning low-dimensional feature dynamics of information processing systems, NIPS’18. Curran Associates Inc,
fluid systems. arXiv:1808.01346 [physics] Red Hook, pp 9278–9288
200. Holden D, Duong BC, Datta S, Nowrouzezahrai D (2019) Sub- 216. Lusch B, Nathan Kutz J, Brunton SL (2018) Deep learning for
space neural physics: fast data-driven interactive simulation. In: universal linear embeddings of nonlinear dynamics. Nat Commun
Proceedings of the 18th annual ACM SIGGRAPH/Eurographics 9(1):4950. https://doi.org/10.1038/s41467-018-07210-0
symposium on computer animation, SCA ’19. Association for 217. Otto SE, Rowley CW (2019) Linearly recurrent autoencoder net-
Computing Machinery, New York, pp 1–12. https://doi.org/10. works for learning dynamics. SIAM J Appl Dyn Syst 18(1):558–
1145/3309486.3340245 593. https://doi.org/10.1137/18M1177846
201. Stefania F, Andrea M, Luca D, Alfio Q (2020) Deep learning- 218. Cohn D, Ghahramani Z, Jordan M (1994) Active learning with
based reduced order models in cardiac electrophysiology. statistical models. In: Advances in neural information processing
PLoS ONE 15(10):e0239416. https://doi.org/10.1371/journal. systems, vol 7. MIT Press, Cambridge
pone.0239416 219. Liu X, Athanasiou CE, Padture NP, Sheldon BW, Gao H
202. Fresca S, Dede’ L, Manzoni A (2021) A comprehensive deep (2021) Knowledge extraction and transfer in data-driven fracture
learning-based approach to reduced order modeling of nonlin- mechanics. Proc Natl Acad Sci 118(23):e2104765118. https://doi.
ear time-dependent parametrized PDEs. J Sci Comput 87(2):61. org/10.1073/pnas.2104765118
https://doi.org/10.1007/s10915-021-01462-7 220. Haasdonk B, Kleikamp H, Ohlberger M, Schindler F, Wenzel T
203. Stefania F, Andrea M (2022) POD-DL-ROM: enhancing deep (2023) A new certified hierarchical and adaptive RB-ML-ROM
learning-based reduced order models for nonlinear parametrized surrogate model for parametrized PDEs. SIAM J Sci Comput
PDEs by proper orthogonal decomposition. Comput Methods 45(3):A1039–A1065. https://doi.org/10.1137/22M1493318
Appl Mech Eng 388:114181. https://doi.org/10.1016/j.cma.2021. 221. Kalina KA, Linden L, Brummund J, Kästner M (2023) FE
114181 ANN: an efficient data-driven multiscale approach based on
204. Ren P, Chengping R, Yang L, Jian-Xun W, Hao S (2022) Phy- physics-constrained neural networks and automated data mining.
CRNet: physics-informed convolutional-recurrent network for Comput Mech 71(5):827–851. https://doi.org/10.1007/s00466-
solving spatiotemporal PDEs. Comput Methods Appl Mech Eng 022-02260-0
389:114399. https://doi.org/10.1016/j.cma.2021.114399 222. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE
205. Hu C, Martin S, Dingreville R (2022) Accelerating phase-field Trans Knowl Data Eng 22(10):1345–1359. https://doi.org/10.
predictions via recurrent neural networks learning the microstruc- 1109/TKDE.2009.191
ture evolution in latent space. Comput Methods Appl Mech Eng 223. Yosinski J, Clune J, Bengio Y, Lipson H (2014) How transfer-
397:115128. https://doi.org/10.1016/j.cma.2022.115128 able are features in deep neural networks? In: Proceedings of the
206. Kookjin L, Carlberg Kevin T (2020) Model reduction of dynam- 27th international conference on neural information processing
ical systems on nonlinear manifolds using deep convolutional systems, vol 2, NIPS’14. MIT Press, Cambridge, pp 3320–3328
autoencoders. J Comput Phys 404:108973. https://doi.org/10. 224. Kollmannsberger S, Singh D, Herrmann L (2023) Transfer
1016/j.jcp.2019.108973 learning enhanced full waveform inversion. arXiv:2302.11259
207. Shen S, Yin Y, Shao T, Wang H, Jiang C, Lan L, Zhou K (2021) [physics]
High-order differentiable autoencoder for nonlinear model reduc- 225. Liu Z, Chen Y, Du Y, Tegmark M (2021) Physics-augmented
tion. arXiv:2102.11026 [cs] learning: a new paradigm beyond physics-informed learning.
208. Schmid Peter J (2010) Dynamic mode decomposition of numer- arXiv:2109.13901 [physics]
ical and experimental data. J Fluid Mech 656:5–28. https://doi. 226. Zhu Y, Zabaras N, Koutsourelakis PS, Perdikaris P (2019)
org/10.1017/S0022112010001217 Physics-constrained deep learning for high-dimensional surrogate
209. Tu JH, Rowley CW, Luchtenburg DM, Brunton SL, Nathan KJ modeling and uncertainty quantification without labeled data. J
(2013) On dynamic mode decomposition: theory and applications. Comput Phys 394:56–81. https://doi.org/10.1016/j.jcp.2019.05.
arXiv:1312.0041 [physics] 024. arXiv: 1901.06314
210. Koopman BO (1931) Hamiltonian systems and transformation in 227. Eichelsdörfer J, Kaltenbach S, Koutsourelakis PS (2021)
Hilbert space. Proc Natl Acad Sci 17(5):315–318. https://doi.org/ Physics-enhanced neural networks in the small data regime.
10.1073/pnas.17.5.315 arXiv:2111.10329 [physics, stat] version: 1
211. Williams MO, Kevrekidis IG, Rowley CW (2015) A data-driven 228. Raissi M (2018) Deep hidden physics models: deep learning
approximation of the Koopman operator: extending dynamic of nonlinear partial differential equations. arXiv:1801.06637 [cs,
mode decomposition. J Nonlinear Sci 25(6):1307–1346. https:// math, stat]
doi.org/10.1007/s00332-015-9258-5 229. Em KG, Kevrekidis Ioannis G, Lu L, Paris P, Sifan W, Liu Y (2021)
212. Li Q, Dietrich F, Bollt EM, Kevrekidis IG (2017) Extended Physics-informed machine learning. Nat Rev Phys 3(6):422–440.
dynamic mode decomposition with dictionary learning: a data- https://doi.org/10.1038/s42254-021-00314-5
driven adaptive spectral decomposition of the Koopman operator. 230. Cuomo S, Cola VSD, Giampaolo F, Rozza G, Raissi M, Piccialli
Chaos Interdiscip J Nonlinear Sci 27(10):103111. https://doi.org/ F (2022) Scientific machine learning through physics-informed
10.1063/1.4993854 neural networks: where we are and what’s next. J Sci Comput
213. Yeung E, Kundu S, Hodas N (2019) Learning deep neural network 92(3):88. https://doi.org/10.1007/s10915-022-01939-z
representations for koopman operators of nonlinear dynamical 231. Hao Z, Liu S, Zhang Y, Ying C, Feng Y, Su H, Zhu J (2022)
systems. In: 2019 American Control Conference (ACC), pp 4832– Physics-informed machine learning: a survey on problems, meth-
4839. https://doi.org/10.23919/ACC.2019.8815339 ods and applications. arXiv:2211.08064
214. Takeishi N, Kawahara Y, Yairi T (2017) Learning Koopman invari- 232. Ehsan H, Ruben J (2021) SciANN: a Keras/tensorflow wrapper
ant subspaces for dynamic mode decomposition. In: Proceedings for scientific computations and physics-informed deep learning
of the 31st international conference on neural information pro- using artificial neural networks. Comput Methods Appl Mech Eng
cessing systems, NIPS’17. Curran Associates Inc, Red Hook, pp 373:113552. https://doi.org/10.1016/j.cma.2020.113552
1130–1140 233. Hennigh O, Narasimhan S, Nabian MA, Subramaniam A, Tangsali
215. Morton J, Witherden FD, Jameson A, Kochenderfer MJ (2018) K, Fang Z, Rietmann M, Byeon W, Choudhry S (2021) NVIDIA
Deep dynamical modeling and control of unsteady fluid flows. SimNet: an AI-accelerated multi-physics simulation framework.
123
Computational Mechanics (2024) 74:281–331 319
In: Paszynski M, Kranzlmüller D, Krzhizhanovskaya VV, Don- 251. Ferrari S, Jensenius M (2008) A constrained optimization
garra JJ, Sloot PMA (eds) Computational science—ICCS 2021. approach to preserving prior knowledge during incremental train-
Lecture notes in computer science. Springer, Cham, pp 447–461. ing. IEEE Trans Neural Netw 19(6):996–1009. https://doi.org/10.
https://doi.org/10.1007/978-3-030-77977-1_36 1109/TNN.2007.915108
234. Lu L, Meng X, Mao Z, Karniadakis GE (2021) DeepXDE: a 252. Rudd K, Di Muro G, Ferrari S (2014) A constrained backprop-
deep learning library for solving differential equations. SIAM Rev agation approach for the adaptive solution of partial differential
63(1):208–228. https://doi.org/10.1137/19M1274067 equations. IEEE Trans Neural Netw Learn Syst 25(3):571–584.
235. Zhiqiang C, Jingshuang C, Min L, Xinyu L (2020) Deep https://doi.org/10.1109/TNNLS.2013.2277601
least-squares methods: an unsupervised learning-based numeri- 253. Keith R, Silvia F (2015) A constrained integration (CINT)
cal method for solving elliptic PDEs. J Comput Phys 420:109707. approach to solving partial differential equations using artificial
https://doi.org/10.1016/j.jcp.2020.109707 neural networks. Neurocomputing 155:277–285. https://doi.org/
236. Justin S, Konstantinos S (2018) DGM: a deep learning algo- 10.1016/j.neucom.2014.11.058
rithm for solving partial differential equations. J Comput Phys 254. Wang F, Jiang M, Qian C, Yang S, Li C, Zhang H, Wang X, Tang
375:1339–1364. https://doi.org/10.1016/j.jcp.2018.08.029 X (2017) Residual attention network for image classification. In:
237. Kharazmi E, Zhang Z, Karniadakis GE (2019) Variational 2017 IEEE conference on computer vision and pattern recog-
physics-informed neural networks for solving partial differential nition (CVPR), pp 6450–6458. https://doi.org/10.1109/CVPR.
equations. arXiv:1912.00873 [physics, stat] 2017.683
238. Ehsan K, Zhongqiang Z, Karniadakis George EM (2021) hp- 255. Zhang S, Yang J, Schiele B (2018) Occluded pedestrian detection
VPINNs: variational physics-informed neural networks with through guided attention in CNNs. In: 2018 IEEE/CVF confer-
domain decomposition. Comput Methods Appl Mech Eng ence on computer vision and pattern recognition, pp 6995–7003.
374:113547. https://doi.org/10.1016/j.cma.2020.113547 https://doi.org/10.1109/CVPR.2018.00731
239. Morokoff William J, Caflisch Russel E (1995) Quasi-Monte Carlo 256. Jim M, Deep R, Hesthaven Jan S, Christian R (2020) Constraint-
integration. J Comput Phys 122(2):218–230. https://doi.org/10. aware neural networks for Riemann problems. J Comput Phys
1006/jcph.1995.1209 409:109345. https://doi.org/10.1016/j.jcp.2020.109345
240. 14-Monte Carlo integration I: basic concepts (2004). In: Pharr M, 257. Nandwani Y, Pathak AM, Singla P (2019) A primal dual for-
Humphreys G (eds) Physically based rendering. Morgan Kauf- mulation for deep learning with constraints. In: Advances in
mann, Burlington, pp 631–660. https://doi.org/10.1016/B978- neural information processing systems, vol 32. Curran Asso-
012553180-1/50016-8 ciates, Inc. https://papers.nips.cc/paper_files/paper/2019/hash/
241. Novak E, Ritter K (1996) High dimensional integration of smooth cf708fc1decf0337aded484f8f4519ae-Abstract.html
functions over cubes. Numer Math 75(1):79–97. https://doi.org/ 258. McClenny L, Braga-Neto U (2022) Self-adaptive physics-
10.1007/s002110050231 informed neural networks using a soft attention mechanism.
242. Rivera Jon A, Taylor Jamie M, Omella Angel J, David P (2022) arXiv:2009.04544 [cs, stat]
On quadrature rules for solving partial differential equations using 259. Lu L, Pestourie R, Yao W, Wang Z, Verdugo F, Johnson SG
neural networks. Comput Methods Appl Mech Eng 393:114710. (2021) Physics-informed neural networks with hard constraints
https://doi.org/10.1016/j.cma.2022.114710 for inverse design. SIAM J Sci Comput 43(6):B1105–B1132.
243. Yaohua Z, Gang B, Xiaojing Y, Haomin Z (2020) Weak adversar- https://doi.org/10.1137/21M1397908
ial networks for high-dimensional partial differential equations. 260. Zeng Q, Kothari Y, Bryngelson SH, Schäfer F (2022) Competitive
J Comput Phys 411:109409. https://doi.org/10.1016/j.jcp.2020. physics informed networks. arXiv:2204.11144 [cs, math]
109409 261. Philipp M, Wolfgang F, Stefan T, Isabell G, Michael G (2023)
244. Minh N-TV, Xiaoying Z, Timon R (2019) A deep energy method Modeling of 3D blood flows with physics-informed neural net-
for finite deformation hyperelasticity. Eur J Mech A Solids. works: comparison of network architectures. Fluids 8(2):46.
https://doi.org/10.1016/j.euromechsol.2019.103874 https://doi.org/10.3390/fluids8020046
245. Weinan E, Bing Yu (2018) The Deep Ritz method: a deep learning- 262. Han J, Tao J, Wang C (2020) FlowNet: a deep learning framework
based numerical algorithm for solving variational problems. for clustering and selection of streamlines and stream surfaces.
Commun Math Stat 6(1):1–12. https://doi.org/10.1007/s40304- IEEE Trans Visual Comput Graph 26(4):1732–1744. https://doi.
018-0127-z org/10.1109/TVCG.2018.2880207
246. Grossmann TG, Komorowska UJ, Latz J, Schönlieb CB (2023) 263. Bhatnagar S, Afshar Y, Pan S, Duraisamy K, Kaushik S (2019)
Can physics-informed neural networks beat the finite element Prediction of aerodynamic flow fields using convolutional neural
method? arXiv:2302.04107 networks. Comput Mech 64(2):525–545. https://doi.org/10.1007/
247. Ali K, Tapan M (2022) Physics-informed PointNet: a deep learn- s00466-019-01740-0
ing solver for steady-state incompressible flows and thermal 264. Han G, Luning S, Jian-Xun W (2021) PhyGeoNet: physics-
fields on multiple sets of irregular geometries. J Comput Phys informed geometry-adaptive convolutional neural networks for
468:111510. https://doi.org/10.1016/j.jcp.2022.111510 solving parameterized steady-state PDEs on irregular domain.
248. Jens B, Kaj N (2018) A unified deep artificial neural network J Comput Phys 428:110079. https://doi.org/10.1016/j.jcp.2020.
approach to partial differential equations in complex geometries. 110079
Neurocomputing 317:28–41. https://doi.org/10.1016/j.neucom. 265. Wandel N, Weinmann M, Neidlin M, Klein R (2022) Spline-
2018.06.056 PINN: approaching PDEs without data using fast, physics-
249. Alexander H, Henning W, Rolf M (2022) Physics informed neu- informed hermite-spline CNNs. arXiv:2109.07143 [physics]
ral networks for continuum micromechanics. Comput Methods 266. Han G, Zahr Matthew J, Jian-Xun W (2022) Physics-informed
Appl Mech Eng 393:114790. https://doi.org/10.1016/j.cma.2022. graph neural Galerkin networks: a unified framework for solving
114790 PDE-governed forward and inverse problems. Comput Methods
250. Lagaris IE, Likas AC, Papageorgiou DG (2000) Neural-network Appl Mech Eng 390:114502. https://doi.org/10.1016/j.cma.2021.
methods for boundary value problems with irregular boundaries. 114502
IEEE Trans Neural Netw 11(5):1041–1049. https://doi.org/10. 267. Möller M, Toshniwal D, Van Ruiten F (2021) Physics-
1109/72.870037 informed machine learning embedded into isogeometric analysis.
Mathematics: key enabling technology for scientific machine
123
320 Computational Mechanics (2024) 74:281–331
123
Computational Mechanics (2024) 74:281–331 321
for the incompressible Navier–Stokes equations. J Comput Phys lems in unsaturated groundwater flow. Georisk Assess Manag
426:109951. https://doi.org/10.1016/j.jcp.2020.109951 Risk Eng Syst Geohazards 16(1):21–36. https://doi.org/10.1080/
301. Shengze C, Zhicheng W, Frederik F, Jin JY, Callum G, Em KG 17499518.2021.1971251
(2021) Flow over an espresso cup: inferring 3-D velocity and 318. Chen X, Trung CB, Yong Y, Günther M (2023) Transfer learning
pressure fields from tomographic background oriented Schlieren based physics-informed neural networks for solving inverse prob-
via physics-informed neural networks. J Fluid Mech 915:A102. lems in engineering structures under different loading scenarios.
https://doi.org/10.1017/jfm.2021.135 Comput Methods Appl Mech Eng 405:115852. https://doi.org/
302. Fraces Cedric G, Hamdi T (2021) Physics informed deep learning 10.1016/j.cma.2022.115852
for flow and transport in porous media. OnePetro. https://doi.org/ 319. Yubiao S, Ushnish S, Matthew J (2023) Physics-informed
10.2118/203934-MS deep learning for simultaneous surrogate modeling and PDE-
303. Wenbo Z, Li David S, Tan B-T, Sacks Michael S (2022) Simulation constrained optimization of an airfoil geometry. Comput Methods
of the 3D hyperelastic behavior of ventricular myocardium using a Appl Mech Eng 411:116042. https://doi.org/10.1016/j.cma.2023.
finite-element based neural-network approach. Comput Methods 116042
Appl Mech Eng 394:114871. https://doi.org/10.1016/j.cma.2022. 320. Rasht-Behesht M, Huber C, Shukla K, Karniadakis GE (2022)
114871 Physics-informed neural networks (PINNs) for wave propagation
304. Wang Jeremy CH, Jean-Pierre H (2023) FluxNet: a physics- and full waveform inversions. J Geophys Res Solid Earth. https://
informed learning-based Riemann solver for transcritical flows doi.org/10.1029/2021JB023120
with non-ideal thermodynamics. Comput Methods Appl Mech 321. Zehnder J, Li Y, Coros S, Thomaszewski B (2021) NTopo: mesh-
Eng 411:116070. https://doi.org/10.1016/j.cma.2023.116070 free topology optimization using implicit neural representations.
305. Sina AN, Ehsan H, Trevor C, Anoush P, Reza V (2021) Physics- arXiv:2102.10782
informed neural network for modelling the thermochemical 322. Di Lorenzo D, Champaney V, Marzin JY, Farhat C, Chinesta
curing process of composite-tool systems during manufacture. F (2023) Physics informed and data-based augmented learning
Comput Methods Appl Mech Eng 384:113959. https://doi.org/ in structural health diagnosis. Comput Methods Appl Mech Eng
10.1016/j.cma.2021.113959 414:116186. https://doi.org/10.1016/j.cma.2023.116186
306. Zhu Q, Liu Z, Yan J (2021) Machine learning for metal addi- 323. Jens B, Kaj N (2019) Data-driven discovery of PDEs in complex
tive manufacturing: predicting temperature and melt pool fluid datasets. J Comput Phys 384:239–252. https://doi.org/10.1016/j.
dynamics using physics-informed neural networks. Comput Mech jcp.2019.01.036
67(2):619–635. https://doi.org/10.1007/s00466-020-01952-9 324. Udrescu S-M, Tegmark M (2020) AI Feynman: a physics-inspired
307. Markidis S (2021) The old and the new: can physics-informed method for symbolic regression. Sci Adv 6(16):2631. https://doi.
deep-learning replace traditional linear solvers? Front Big Data. org/10.1126/sciadv.aay2631
https://doi.org/10.3389/fdata.2021.669097 325. Feynman Richard P, Leighton Robert B, Sands Matthew L (2011)
308. Liangliang L, Li Yunzhu D, Qiuwan LT, Yonghui X (2022) ReF- The Feynman lectures on physics. Basic Books, New York
nets: physics-informed neural network for Reynolds equation 326. Xuhui M, Zhen L, Dongkun Z, Em KG (2020) PPINN: parareal
of gas bearing. Comput Methods Appl Mech Eng 391:114524. physics-informed neural network for time-dependent PDEs. Com-
https://doi.org/10.1016/j.cma.2021.114524 put Methods Appl Mech Eng 370:113250. https://doi.org/10.
309. Chen Yuyao LL, Em KG, Dal NL (2020) Physics-informed neural 1016/j.cma.2020.113250
networks for inverse problems in nano-optics and metamaterials. 327. Revanth M, Susanta G (2022) A novel sequential method to
Optics Express 28(8):11618. https://doi.org/10.1364/OE.384875 train physics informed neural networks for Allen Cahn and Cahn
310. Ruiyang Z, Yang L, Hao S (2020) Physics-informed multi-LSTM Hilliard equations. Comput Methods Appl Mech Eng 390:114474.
networks for metamodeling of nonlinear structures. Comput https://doi.org/10.1016/j.cma.2021.114474
Methods Appl Mech Eng 369:113226. https://doi.org/10.1016/ 328. Iserles A (2008) A first course in the numerical analysis of differ-
j.cma.2020.113226 ential equations. Cambridge University Press
311. Shukla K, Di Leoni PC, Blackshire J, Sparkman D, Karniadakis 329. Henning W, Christian W, Peter W (2020) The neural parti-
GE (2020) Physics-informed neural network for ultrasound non- cle method—an updated Lagrangian physics informed neural
destructive quantification of surface breaking cracks. J Nondestr network for computational fluid dynamics. Comput Methods
Eval 39(3):61. https://doi.org/10.1007/s10921-020-00705-1 Appl Mech Eng 368:113127. https://doi.org/10.1016/j.cma.2020.
312. Anton D, Wessels H (2022) Physics-informed neural networks 113127
for material model calibration from full-field displacement data. 330. Jinshuai B, Ying Z, Yuwei M, Hyogu J, Haifei Z, Charith R, Sauret
arXiv:2212.07723 Emilie G (2022) A general neural particle method for hydrody-
313. Herrmann L, Bürchner T, Dietrich F, Kollmannsberger S (2023) namics modeling. Comput Methods Appl Mech Eng 393:114740.
On the use of neural networks for full waveform inversion. Com- https://doi.org/10.1016/j.cma.2022.114740
put Methods Appl Mech Eng 415:116278. https://doi.org/10. 331. González-García R, Rico-Martínez R, Kevrekidis IG (1998) Iden-
1016/j.cma.2023.116278 tification of distributed parameter systems: a neural net based
314. Rojas Carlos JG, Bitterncourt ML, Boldrini JL (2021) Parameter approach. Comput Chem Eng 22:S965–S968. https://doi.org/10.
identification for a damage model using a physics informed neural 1016/S0098-1354(98)00191-4
network. arXiv:2107.08781 332. Long Z, Lu Y, Ma X, Dong B (2018) PDE-Net: learning PDEs
315. Li W, Lee K-M (2021) Physics informed neural network for from data. In: Proceedings of the 35th international conference
parameter identification and boundary force estimation of compli- on machine learning. PMLR, pp 3208–3216. https://proceedings.
ant and biomechanical systems. Int J Intell Robot Appl 5(3):313– mlr.press/v80/long18a.html
325. https://doi.org/10.1007/s41315-021-00196-x 333. Long Zichao L, Yiping DB (2019) PDE-Net 2.0: learning PDEs
316. Zhang E, Dao M, Karniadakis GE, Suresh S (2022) Analyses from data with a numeric-symbolic hybrid deep network. J Com-
of internal structures and defects in materials using physics- put Phys 399:108925. https://doi.org/10.1016/j.jcp.2019.108925
informed neural networks. Sci Adv 8(7):0644. https://doi.org/10. 334. Hua BS, Tran MK, Yeung SK (2018) Pointwise convolutional
1126/sciadv.abk0644 neural networks. arXiv:1712.05245 [cs]
317. Depina I, Jain S, Mar Valsson S, Gotovac H (2022) Appli- 335. Brunton SL, Proctor JL, Nathan Kutz J (2016) Discovering gov-
cation of physics-informed neural networks to inverse prob- erning equations from data by sparse identification of nonlinear
123
322 Computational Mechanics (2024) 74:281–331
dynamical systems. Proc Natl Acad Sci 113(15):3932–3937. uation of steel plates by neural networks. IEEE Trans Appl
https://doi.org/10.1073/pnas.1517384113 Supercond 9(2):3475–3478. https://doi.org/10.1109/77.783778
336. Rudy SH, Brunton SL, Proctor JL, Nathan Kutz J (2017) 354. Ovcharenko O, Kazei V, Kalita M, Peter D, Alkhalifah T (2019)
Data-driven discovery of partial differential equations. Sci Adv Deep learning for low-frequency extrapolation from multioffset
3(4):e1602614. https://doi.org/10.1126/sciadv.1602614 seismic data. Geophysics 84(6):R989–R1001. https://doi.org/10.
337. Schaeffer H (2017) Learning partial differential equations via data 1190/geo2018-0884.1
discovery and sparse optimization. Proc Roy Soc A Math Phys 355. Sun H, Demanet L (2020) Extrapolated full waveform inversion
Eng Sci 473(2197):20160446. https://doi.org/10.1098/rspa.2016. with deep learning. Geophysics, 85(3):R275–R288. https://doi.
0446 org/10.1190/geo2019-0195.1. arXiv:1909.11536
338. Champion K, Lusch B, Nathan Kutz J, Brunton SL (2019) Data- 356. Sun H, Demanet L (2022) Deep learning for low-frequency
driven discovery of coordinates and governing equations. Proc extrapolation of multicomponent data in elastic FWI. IEEE Trans
Natl Acad Sci 116(45):22445–22451. https://doi.org/10.1073/ Geosci Remote Sens 60:1–11. https://doi.org/10.1109/TGRS.
pnas.1906995116 2021.3135790
339. Paolo C, Giorgio G, Stefania F, Andrea M, Attilio F (2023) 357. Lewis W, Vigh W (2017) Deep learning prior models from
Reduced order modeling of parametrized systems through autoen- seismic images for full-waveform inversion. In: SEG techni-
coders and SINDy approach: continuation of periodic solutions. cal program expanded abstracts 2017. Society of Exploration
Comput Methods Appl Mech Eng 411:116072. https://doi.org/ Geophysicists, Houston, pp 1512–1517. https://doi.org/10.1190/
10.1016/j.cma.2023.116072 segam2017-17627643.1
340. Raissi M, Perdikaris P, Karniadakis GE (2018) Multistep neural 358. Dyck DN, Lowther DA, McFee S (1992) Determining an approxi-
networks for data-driven discovery of nonlinear dynamical sys- mate finite element mesh density using neural network techniques.
tems. arXiv:1801.01236 [nlin, physics:physics, stat] IEEE Trans Magn 28(2):1767–1770. https://doi.org/10.1109/20.
341. Kim B, Azevedo VC, Thuerey N, Kim T, Gross M, Solenthaler B 124047
(2019) Deep fluids: a generative network for parameterized fluid 359. Chedid R, Najjar N (1996) Automatic finite-element mesh gen-
simulations. Comput Graph Forum 38(2):59–70. https://doi.org/ eration using artificial neural networks-part I: prediction of mesh
10.1111/cgf.13619 density. IEEE Trans Magn 32(5):5173–5178. https://doi.org/10.
342. Julia L, Reese J, Jeremy T (2016) Machine learning strategies for 1109/20.538619
systems with invariance properties. J Comput Phys 318:22–35. 360. Triantafyllidis DG, Labridis DP (2000) An automatic mesh
https://doi.org/10.1016/j.jcp.2016.05.003 generator for handling small features in open boundary power
343. Julia L, Andrew K, Jeremy T (2016) Reynolds averaged tur- transmission line problems using artificial neural networks. Com-
bulence modelling using deep neural networks with embedded mun Numer Methods Eng 16(3):177–190
invariance. J Fluid Mech 807:155–166. https://doi.org/10.1017/ 361. Zhang Z, Wang Y, Jimack PK, Wang H (2020) MeshingNet:
jfm.2016.615 a new mesh generation method based on deep learning. In:
344. Smith GF (1965) On isotropic integrity bases. Arch Ration Mech Krzhizhanovskaya VV, Závodszky G, Lees MH, Dongarra JJ,
Anal 18(4):282–292. https://doi.org/10.1007/BF00251667 Sloot PMA, Brissos S, Teixeira J (eds) Computational science—
345. Lutter M, Listmann K, Peters J (2019) Deep Lagrangian networks ICCS 2020, vol 12139. Lecture notes in computer science.
for end-to-end learning of energy-based control for under-actuated Springer, Cham, pp 186–198. https://doi.org/10.1007/978-3-030-
systems. In: 2019 IEEE/RSJ international conference on intelli- 50420-5_14
gent robots and systems (IROS), pp 7718–7725. https://doi.org/ 362. Lock C, Hassan O, Sevilla R, Jones J (2023) Meshing using neural
10.1109/IROS40897.2019.8968268 networks for improving the efficiency of computer modelling. Eng
346. Lutter M, Ritter C, Peters J (2019) Deep Lagrangian networks: Comput. https://doi.org/10.1007/s00366-023-01812-z
using physics as model prior for deep learning. arXiv:1907.04490 363. Bernd F (1994) Growing cell structures—a self-organizing net-
[cs, eess, stat] work for unsupervised and supervised learning. Neural Netw
347. Cranmer M, Greydanus S, Hoyer S, Battaglia P, Spergel D, Ho S 7(9):1441–1460. https://doi.org/10.1016/0893-6080(94)90091-
(2020) Lagrangian neural networks. arXiv:2003.04630 [physics, 4
stat] 364. Alfonzetti S, Coco S, Cavalieri S, Malgeri M (1996) Automatic
348. Greydanus S, Dzamba M, Yosinski J (2019) Hamiltonian neural mesh generation by the let-it-grow neural network. IEEE Trans
networks. arXiv:1906.01563 [cs] Magn 32(3):1349–1352. https://doi.org/10.1109/20.497496
349. Zhang L, Yang F, Daniel Zhang Y, Zhu YJ (2016) Road crack 365. Triantafyllidis DG, Labridis DP (2002) A finite-element mesh
detection using deep convolutional neural network. In: 2016 IEEE generator based on growing neural networks. IEEE Trans Neu-
international conference on image processing (ICIP), pp 3708– ral Netw 13(6):1482–1496. https://doi.org/10.1109/TNN.2002.
3712. https://doi.org/10.1109/ICIP.2016.7533052 804223
350. Chen F-C, Jahanshahi MR (2018) NB-CNN: deep learning-based 366. Lefik M, Schrefler BA (2003) Artificial neural network as an incre-
crack detection using convolutional neural network and Naïve mental non-linear constitutive model for a finite element code.
Bayes data fusion. IEEE Trans Ind Electron 65(5):4392–4400. Comput Methods Appl Mech Eng 192(28):3265–3283. https://
https://doi.org/10.1109/TIE.2017.2764844 doi.org/10.1016/S0045-7825(03)00350-5
351. Jaeger BE, Schmid S, Grosse CU, Gögelein A, Elischberger F 367. Phill JD, Piemaan F, Whan YJ (2021) Machine learning-based
(2022) Infrared thermal imaging-based turbine blade crack clas- constitutive model for J2- plasticity. Int J Plast 138:102919.
sification using deep learning. J Nondestr Eval 41(4):74. https:// https://doi.org/10.1016/j.ijplas.2020.102919
doi.org/10.1007/s10921-022-00907-9 368. Lin YC, Jun Z, Jue Z (2008) Application of neural networks to
352. Korshunova N, Jomo J, Lékó G, Reznik D, Balázs P, Kollmanns- predict the elevated temperature flow behavior of a low alloy
berger S (2020) Image-based material characterization of complex steel. Comput Mater Sci 43(4):752–758. https://doi.org/10.1016/
microarchitectured additively manufactured structures. Comput j.commatsci.2008.01.039
Math Appl 80(11):2462–2480. https://doi.org/10.1016/j.camwa. 369. Li Hong-Ying H, Ji-Dong WD-D, Xiao-Feng W, Yang-Hua L
2020.07.018 (2012) Artificial neural network and constitutive equations to
353. Hall Barbosa C, Bruno AC, Vellasco M, Pacheco M, Wikswo predict the hot deformation behavior of modified 2.25Cr-1Mo
JP, Ewing AP (1999) Automation of SQUlD nondestructive eval-
123
Computational Mechanics (2024) 74:281–331 323
steel. Mater Des 42:192–197. https://doi.org/10.1016/j.matdes. modeling by deep learning. J Comput Phys 429:110010. https://
2012.05.056 doi.org/10.1016/j.jcp.2020.110010
370. Daoping L, Hang Y, Elkhodary KI, Shan T, Kam LW, Guo 387. Mozaffar M, Bostanabad R, Chen W, Ehmann K, Cao J, Bessa
X (2022) Mechanistically informed data-driven modeling of MA (2019) Deep learning predicts path-dependent plasticity. Proc
cyclic plasticity via artificial neural networks. Comput Methods Natl Acad Sci 116(52):26414–26420. https://doi.org/10.1073/
Appl Mech Eng 393:114766. https://doi.org/10.1016/j.cma.2022. pnas.1911815116
114766 388. Ling W, Ludovic N (2022) Recurrent neural networks (RNNs)
371. Unger Jörg F, Carsten K (2009) Neural networks as material mod- with dimensionality reduction and break down in computational
els within a multiscale approach. Comput Struct 87(19):1177– mechanics; application to multi-scale localization step. Comput
1186. https://doi.org/10.1016/j.compstruc.2008.12.003 Methods Appl Mech Eng 390:114476. https://doi.org/10.1016/j.
372. Gabriel H, Luiz SA (2015) Contact stiffness estimation in ANSYS cma.2021.114476
using simplified models and artificial neural networks. Finite Elem 389. Abueidda Diab W, Seid K, Sobh Nahil A, Huseyin S (2021)
Anal Des 97:43–53. https://doi.org/10.1016/j.finel.2015.01.003 Deep learning for plasticity and thermo-viscoplasticity. Int J Plast
373. Atsuya O, Shinobu Y (1970) A new local contact search method 136:102852. https://doi.org/10.1016/j.ijplas.2020.102852
using a multi-layer neural network. Comput Model Eng Sci 390. Hsu Yu-Chuan Yu, Chi-Hua BM (2020) Using deep learning to
21(2):93–104. https://doi.org/10.3970/cmes.2007.021.093 predict fracture patterns in crystalline solids. Matter 3(1):197–
374. Oishi A, Yagawa G (2020) A surface-to-surface contact search 211. https://doi.org/10.1016/j.matt.2020.04.019
method enhanced by deep learning. Comput Mech 65(4):1125– 391. Lew AJ, Yu CH, Hsu YC, Buehler MJ (2021) Deep learning model
1147. https://doi.org/10.1007/s00466-019-01811-2 to predict fracture mechanisms of graphene. Npj 2D Mater Appl
375. Singh AP, Medida S, Duraisamy K (2017) Machine-learning- 5(1):1–8. https://doi.org/10.1038/s41699-021-00228-x
augmented predictive modeling of turbulent separated flows 392. Minliang L, Liang L, Wei S (2020) A generic physics-informed
over airfoils. AIAA J 55(7):2215–2227. https://doi.org/10.2514/ neural network-based constitutive model for soft biological tis-
1.J055595 sues. Comput Methods Appl Mech Eng 372:113402. https://doi.
376. Maulik R, San O, Rasheed A, Vedula P (2019) Subgrid modelling org/10.1016/j.cma.2020.113402
for two-dimensional turbulence using neural networks. J Fluid 393. Weber P, Geiger J, Wagner W (2021) Constrained neural net-
Mech 858:122–144. https://doi.org/10.1017/jfm.2018.770 work training and its application to hyperelastic material mod-
377. Arnau F, Joan B, Ramon C (2022) Finite element approximation eling. Comput Mech 68(5):1179–1204. https://doi.org/10.1007/
of wave problems with correcting terms based on training artificial s00466-021-02064-8
neural networks with fine solutions. Comput Methods Appl Mech 394. Leng Y, Tac V, Calve S, Tepole AB (2021) Predicting the
Eng 399:115280. https://doi.org/10.1016/j.cma.2022.115280 mechanical properties of biopolymer gels using neural net-
378. Le BA, Yvonnet J, He Q-C (2015) Computational homogeniza- works trained on discrete fiber network data. Comput Methods
tion of nonlinear elastic materials using neural networks. Int J Appl Mech Eng 387:114160. https://doi.org/10.1016/j.cma.2021.
Numer Method Eng 104(12):1061–1084. https://doi.org/10.1002/ 114160. arXiv:2101.11712 [cs, q-bio]
nme.4953 395. Vahidullah T, Francisco SC, Tepole Adrian B (2022) Data-driven
379. Xiaoxin L, Giovanis DG, Yvonnet J, Papadopoulos V, Detrez tissue mechanics with polyconvex neural ordinary differential
F, Bai J (2019) A data-driven computational homogenization equations. Comput Methods Appl Mech Eng 398:115248. https://
method based on neural networks for the nonlinear anisotropic doi.org/10.1016/j.cma.2022.115248
electrical response of graphene/polymer nanocomposites. Com- 396. Linden L, Klein DK, Kalina KA, Brummund J, Weeger O, Käst-
put Mech 64(2):307–321. https://doi.org/10.1007/s00466-018- ner M (2023) Neural networks meet hyperelasticity: a guide to
1643-0 enforcing physics. arXiv:2302.02403 [cs]
380. Huang Daniel Z, Kailai X, Charbel F, Eric D (2020) Learning 397. Klein Dominik K, Rogelio O, Jesús M-F, Oliver W (2022) Finite
constitutive relations from indirect observations using deep neural electro-elasticity with physics-augmented neural networks. Com-
networks. J Comput Phys 416:109491. https://doi.org/10.1016/j. put Methods Appl Mech Eng 400:115501. https://doi.org/10.
jcp.2020.109491 1016/j.cma.2022.115501
381. Kun W, WaiChing S (2018) A multiscale multi-permeability poro- 398. Klein Dominik K, Mauricio F, Martin Robert J, Patrizio N, Oliver
plasticity model linked by recursive homogenizations and deep W (2022) Polyconvex anisotropic hyperelasticity with neural net-
learning. Comput Methods Appl Mech Eng 334:337–380. https:// works. J Mech Phys Solids 159:104703. https://doi.org/10.1016/
doi.org/10.1016/j.cma.2018.01.036 j.jmps.2021.104703
382. Li B, Zhuang X (2020) Multiscale computation on feedforward 399. As’ad F, Farhat C (2023) A mechanics-informed neural network
neural network and recurrent neural network. Front Struct Civ Eng framework for data-driven nonlinear viscoelasticity. In: AIAA
14(6):1285–1298. https://doi.org/10.1007/s11709-020-0691-7 SCITECH 2023 forum. American Institute of Aeronautics and
383. Vlassis Nikolaos N, Ran M, WaiChing S (2020) Geometric deep Astronautics, National Harbor. https://doi.org/10.2514/6.2023-
learning for computational mechanics part I: anisotropic hypere- 0949
lasticity. Comput Methods Appl Mech Eng 371:113299. https:// 400. Vahidullah T, Rausch Manuel K, Francisco SC, Buganza TA
doi.org/10.1016/j.cma.2020.113299 (2023) Data-driven anisotropic finite viscoelasticity using neu-
384. Frankenreiter I, Rosato D, Miehe C (2011) Hybrid micro- ral ordinary differential equations. Comput Methods Appl Mech
macro-modeling of evolving anisotropies and length scales in Eng 411:116046. https://doi.org/10.1016/j.cma.2023.116046
finite plasticity of polycrystals: hybrid micro-macro-modeling of 401. Amos B, Xu L, Zico KJ (2017) Input convex neural networks.
evolving anisotropies and length scales in finite plasticity of poly- In: Proceedings of the 34th international conference on machine
crystals. PAMM 11(1):515–518. https://doi.org/10.1002/pamm. learning. PMLR, pp 146–155. https://proceedings.mlr.press/v70/
201110249 amos17b.html
385. Fish J (2013) Practical multiscaling. Wiley, Chichester 402. Chen Ricky TQ, Rubanova Y, Bettencourt J, Duvenaud D (2019)
386. Kevin L, Markus H, Abdolazizi Kian P, Aydin Roland C, Mikhail Neural ordinary differential equations. arXiv:1806.07366
I, Cyron Christian J (2021) Constitutive artificial neural networks: 403. Peiyi C, Johann G (2022) Polyconvex neural networks for hyper-
a fast and general approach to predictive data-driven constitutive elastic constitutive models: a rectification approach. Mech Res
123
324 Computational Mechanics (2024) 74:281–331
637. Arjovsky M, Chintala S, Bottou L (2017) Wasserstein GAN. arXiv:1701.07875 [cs, stat]
638. Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv:1411.1784 [cs, stat]
639. Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P (2016) InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. In: Proceedings of the 30th international conference on neural information processing systems, NIPS’16. Curran Associates Inc, Red Hook, pp 2180–2188
640. Bridle JS, Heading AJR, MacKay DJC (1991) Unsupervised classifiers, mutual information and ‘phantom targets’. In: Proceedings of the 4th international conference on neural information processing systems, NIPS’91. Morgan Kaufmann Publishers Inc., San Francisco, pp 1096–1101
641. Larsen ABL, Sønderby SK, Larochelle H, Winther O (2016) Autoencoding beyond pixels using a learned similarity metric. In: Proceedings of the 33rd international conference on international conference on machine learning, vol 48, ICML’16, pp 1558–1566
642. Sohl-Dickstein J, Weiss E, Maheswaranathan N, Ganguli S (2015) Deep unsupervised learning using nonequilibrium thermodynamics. In: Proceedings of the 32nd international conference on machine learning. PMLR, pp 2256–2265. https://proceedings.mlr.press/v37/sohl-dickstein15.html
643. Ho J, Jain A, Abbeel P (2020) Denoising diffusion probabilistic models. In: Advances in neural information processing systems, vol 33. Curran Associates, Inc., pp 6840–6851. https://proceedings.neurips.cc/paper/2020/hash/4c5bcfec8584af0d967f1ab10179ca4b-Abstract.html
644. Nichol A, Dhariwal P (2021) Improved denoising diffusion probabilistic models. arXiv:2102.09672 [cs, stat]
645. Rezende D, Mohamed S (2015) Variational inference with normalizing flows. In: Proceedings of the 32nd international conference on machine learning. PMLR, pp 1530–1538. https://proceedings.mlr.press/v37/rezende15.html
646. Kobyzev I, Prince SJD, Brubaker MA (2021) Normalizing flows: an introduction and review of current methods. IEEE Trans Pattern Anal Mach Intell 43(11):3964–3979. https://doi.org/10.1109/TPAMI.2020.2992934
647. Sutton RS (1991) Dyna, an integrated architecture for learning, planning, and reacting. ACM SIGART Bull 2(4):160–163. https://doi.org/10.1145/122344.122377
648. Janner M, Fu J, Zhang M, Levine S (2019) When to trust your model: model-based policy optimization. In: Proceedings of the 33rd international conference on neural information processing systems, vol 1122. Curran Associates Inc., Red Hook, pp 12519–12530
649. Kaiser L, Babaeizadeh M, Milos P, Osinski B, Campbell RH, Czechowski K, Erhan D, Finn C, Kozakowski P, Levine S, Mohiuddin A, Sepassi R, Tucker G, Michalewski H (2020) Model-based reinforcement learning for Atari. arXiv:1903.00374 [cs, stat]
650. Luo Y, Xu H, Li Y, Tian Y, Darrell T, Ma T (2021) Algorithmic framework for model-based deep reinforcement learning with theoretical guarantees. arXiv:1807.03858 [cs, stat]
651. Deisenroth MP, Rasmussen CE (2011) PILCO: a model-based and data-efficient approach to policy search. In: Proceedings of the 28th international conference on international conference on machine learning, ICML’11. Omnipress, Madison, pp 465–472
652. Levine S, Abbeel P (2014) Learning neural network policies with guided policy search under unknown dynamics. In: Advances in neural information processing systems, vol 27. Curran Associates, Inc. https://papers.nips.cc/paper_files/paper/2014/hash/6766aa2750c19aad2fa1b32f36ed4aee-Abstract.html
653. Heess N, Wayne G, Silver D, Lillicrap T, Erez T, Tassa Y (2015) Learning continuous control policies by stochastic value gradients. In: Advances in neural information processing systems, vol 28. Curran Associates, Inc. https://papers.nips.cc/paper_files/paper/2015/hash/148510031349642de5ca0c544f31b2ef-Abstract.html
654. Clavera I, Fu V, Abbeel P (2020) Model-augmented actor-critic: backpropagating through paths. arXiv:2005.08068 [cs, stat]
655. Hafner D, Lillicrap T, Ba J, Norouzi M (2020) Dream to control: learning behaviors by latent imagination. arXiv:1912.01603 [cs]
656. Hafner D, Lillicrap T, Norouzi M, Ba J (2022) Mastering Atari with discrete world models. arXiv:2010.02193 [cs, stat]
657. Williams RJ (1992) Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach Learn 8(3):229–256. https://doi.org/10.1007/BF00992696
658. Sutton RS, McAllester D, Singh S, Mansour Y (1999) Policy gradient methods for reinforcement learning with function approximation. In: Advances in neural information processing systems, vol 12. MIT Press. https://papers.nips.cc/paper_files/paper/1999/hash/464d828b85b0bed98e80ade0a5c43b0f-Abstract.html
659. Kakade S (2001) A natural policy gradient. In: Advances in neural information processing systems, vol 14. MIT Press. https://papers.nips.cc/paper_files/paper/2001/hash/4b86abe48d358ecf194c56c69108433e-Abstract.html
660. Silver D, Lever G, Heess N, Degris T, Wierstra D, Riedmiller M (2014) Deterministic policy gradient algorithms. In: Proceedings of the 31st international conference on machine learning. PMLR, pp 387–395. https://proceedings.mlr.press/v32/silver14.html
661. Schulman J, Levine S, Abbeel P, Jordan M, Moritz P (2015) Trust region policy optimization. In: Proceedings of the 32nd international conference on machine learning. PMLR, pp 1889–1897. https://proceedings.mlr.press/v37/schulman15.html
662. Watkins CJ, Dayan P (1992) Q-learning. Mach Learn 8:279–292. https://doi.org/10.1007/BF00992698
663. van Hasselt H, Guez A, Silver D (2016) Deep reinforcement learning with double Q-learning. In: Proceedings of the thirtieth AAAI conference on artificial intelligence, AAAI’16. AAAI Press, Phoenix, pp 2094–2100
664. Wang Z, Schaul T, Hessel M, Van Hasselt H, Lanctot M, De Freitas N (2016) Dueling network architectures for deep reinforcement learning. In: Proceedings of the 33rd international conference on international conference on machine learning, vol 48, ICML’16, New York, pp 1995–2003
665. Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv:1707.06347 [cs]
666. Bellman R (1957) A Markovian decision process. J Math Mech 6(5):679–684
667. Capuzzo Dolcetta I, Ishii H (1984) Approximate solutions of the Bellman equation of deterministic control theory. Appl Math Optim 11(1):161–181. https://doi.org/10.1007/BF01442176
668. Sutton RS (1988) Learning to predict by the methods of temporal differences. Mach Learn 3(1):9–44. https://doi.org/10.1007/BF00115009
669. Bradtke SJ, Barto AG (1996) Linear least-squares algorithms for temporal difference learning. Mach Learn 22(1):33–57. https://doi.org/10.1007/BF00114723

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.