0% found this document useful (0 votes)

7 views

Deep learning in computational mechanics a review

This review article provides an overview of deep learning methodologies in deterministic computational mechanics, identifying five main categories: simulation substitution, simulation enhancement, discretizations as neural networks, generative approaches, and deep reinforcement learning. It aims to assist researchers, particularly those new to the field, in understanding key concepts and methodologies rather than specific applications. The article emphasizes the importance of bridging gaps between scientific communities by highlighting similarities in methods across various applications.

Uploaded by

Huy Lê

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

7 views

Deep learning in computational mechanics a review

Uploaded by

Huy Lê

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 51

Computational Mechanics (2024) 74:281–331

https://doi.org/10.1007/s00466-023-02434-4

REVIEW ARTICLE

Deep learning in computational mechanics: a review

Leon Herrmann1 · Stefan Kollmannsberger1

Received: 19 September 2023 / Accepted: 8 December 2023 / Published online: 13 January 2024
© The Author(s) 2024

Abstract
The rapid growth of deep learning research, including within the field of computational mechanics, has resulted in an
extensive and diverse body of literature. To help researchers identify key concepts and promising methodologies within this
field, we provide an overview of deep learning in deterministic computational mechanics. Five main categories are identified
and explored: simulation substitution, simulation enhancement, discretizations as neural networks, generative approaches,
and deep reinforcement learning. This review focuses on deep learning methods rather than applications for computational
mechanics, thereby enabling researchers to explore this field more effectively. As such, the review is not necessarily aimed at
researchers with extensive knowledge of deep learning—instead, the primary audience is researchers on the verge of entering
this field or those attempting to gain an overview of deep learning in computational mechanics. The discussed concepts are,
therefore, explained as simple as possible.

Keywords Deep learning · Computational mechanics · Neural networks · Surrogate model · Physics-informed · Generative

Contents 3.3.1 Algorithm enhancement . . . . . . . . . . . . . . 298

3.3.2 Multiscale methods . . . . . . . . . . . . . . . . . 298
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 282 3.3.3 Optimization . . . . . . . . . . . . . . . . . . . . 299
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 282 3.4 Post-processing . . . . . . . . . . . . . . . . . . . . . . 300
1.2 Taxonomy of deep learning techniques in computational 4 Discretizations as neural networks . . . . . . . . . . . . . . . 300
mechanics . . . . . . . . . . . . . . . . . . . . . . . . . 282 4.1 Finite element method . . . . . . . . . . . . . . . . . . . 300
1.3 Deep learning . . . . . . . . . . . . . . . . . . . . . . . 283 4.2 Finite difference method . . . . . . . . . . . . . . . . . 302
2 Simulation substitution . . . . . . . . . . . . . . . . . . . . . 284 4.3 Material discretizations . . . . . . . . . . . . . . . . . . 303
2.1 Data-driven modeling . . . . . . . . . . . . . . . . . . . 284 4.4 Neural differential equations . . . . . . . . . . . . . . . 303
2.1.1 Space-time approaches . . . . . . . . . . . . . . . 284
5 Generative approaches . . . . . . . . . . . . . . . . . . . . . 304
2.1.2 Time-stepping procedures . . . . . . . . . . . . . 287
5.1 Applications . . . . . . . . . . . . . . . . . . . . . . . . 304
2.1.3 Active learning and transfer learning . . . . . . . . 288
2.2 Physics-informed learning . . . . . . . . . . . . . . . . 289 5.1.1 Data generation . . . . . . . . . . . . . . . . . . . 304
2.2.1 Space-time approaches . . . . . . . . . . . . . . . 289 5.1.2 Generative design and design optimization . . . . 304
2.2.2 Time-stepping procedures . . . . . . . . . . . . . 293 5.1.3 Conditional generation . . . . . . . . . . . . . . . 305
2.2.3 Enforcement of physics by construction . . . . . . 295 5.1.4 Anomaly detection . . . . . . . . . . . . . . . . . 305
3 Simulation enhancement . . . . . . . . . . . . . . . . . . . . 295 6 Deep reinforcement learning . . . . . . . . . . . . . . . . . . 305
3.1 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . 295 6.1 Applications . . . . . . . . . . . . . . . . . . . . . . . . 306
3.1.1 Data preparation . . . . . . . . . . . . . . . . . . 295 6.1.1 Extensions . . . . . . . . . . . . . . . . . . . . . 306
3.1.2 Initialization . . . . . . . . . . . . . . . . . . . . 296 7 Conclusion and outlook . . . . . . . . . . . . . . . . . . . . 307
3.1.3 Meshing . . . . . . . . . . . . . . . . . . . . . . . 296 A Advanced neural network architectures . . . . . . . . . . . . 308
3.2 Physical modeling . . . . . . . . . . . . . . . . . . . . . 296 A.1 Convolutional neural networks . . . . . . . . . . . . . . 308
3.2.1 Model substitution . . . . . . . . . . . . . . . . . 296
A.2 Graph neural networks . . . . . . . . . . . . . . . . . . 308
3.2.2 Identification of model parameters . . . . . . . . . 297
A.3 Recurrent neural networks . . . . . . . . . . . . . . . . 308
3.2.3 Model identification . . . . . . . . . . . . . . . . 297
3.3 Numerical methods . . . . . . . . . . . . . . . . . . . . 298 B Generative approaches . . . . . . . . . . . . . . . . . . . . . 309
B.1 Autoencoders . . . . . . . . . . . . . . . . . . . . . . . 309
B Leon Herrmann B.2 Generative adversarial networks . . . . . . . . . . . . . 309
[email protected] B.3 Diffusion models . . . . . . . . . . . . . . . . . . . . . 310
C Deep reinforcement learning . . . . . . . . . . . . . . . . . . 310
1 Chair of Computational Modeling and Simulation, School of C.1 Deep policy networks . . . . . . . . . . . . . . . . . . . 311
Engineering and Design, Technical University of Munich, C.2 Deep Q-learning . . . . . . . . . . . . . . . . . . . . . . 311
Arcisstraße 21, 80 333 Munich, Germany References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312

123
282 Computational Mechanics (2024) 74:281–331

1 Introduction for material mechanics, [32] for constitutive modeling, [33]

for generative design, [34] for material design, and [35] for
1.1 Motivation aeronautics)3 . The aim of this work is, however, to focus on
the general methods rather than applications, where similar
In recent years, access to enormous quantities of data com- methods are often applied to different problems. This has
bined with rapid advances in machine learning has yielded the potential to bridge gaps between scientific communities
outstanding results in computer vision, recommendation by highlighting similarities between methods and thereby
systems, medical diagnosis, and financial forecasting [1]. establishing clarity on the state-of-the-art.
Nonetheless, the impact of learning algorithms reaches far
beyond and has already found its way into many scientific 1.2 Taxonomy of deep learning techniques in
disciplines [2]. computational mechanics
The rapid interest in machine learning in general and
within computational mechanics is well documented in the In order to discuss the deep learning methods in a structured
scientific literature. By considering the number of publica- manner, we introduce the following taxonomy:
tions treating “Artificial Intelligence”, “Machine Learning”,
“Deep Learning”, and “Neural Networks”, the interest can
• simulation substitution (Sect. 2)
be quantified. Figure 1a shows the trend in all journals of
Elsevier and Springer since 1999, while Fig. 1b depicts the – data-driven modeling (Sect. 2.1)
trend within the computational mechanics community by – physics-informed learning (Sect. 2.2)
considering representative journals1 at Elsevier and Springer.
• simulation enhancement (Sect. 3)
The trends before 2017 differ slightly, with a steady growth
• discretizations as neural networks (Sect. 4)
in general but only limited interest within computational
• generative approaches (Sect. 5)
mechanics2 . However, around 2017, both curves show a shift
• deep reinforcement learning (Sect. 6)
in trend, namely a vast increase in publications highlighting
the interest and potential prospects of artificial intelligence
and its subtopics for a variety of applications. Simulation substitution replaces the entire simulation with
Due to the rapid growth [21] of research in the field of deep a surrogate model, which in the context of deep learning
learning (see Fig. 1a), we provide an overview of the various are deep neural networks (NNs). The model can be trained
deep learning methodologies in deterministic computational with supervised learning, which purely relies on labeled
mechanics. To limit the scope of this work, we focus on data and therefore is referred to as data-driven modeling.
deterministic approaches and problems within computational The generalization errors of these models can be reduced
mechanics. Numerous review articles on deep learning for by physics-informed learning. Here, physics constraints are
specific applications have already emerged (see [22, 23] for imposed on the learnable space such that only physically
topology optimization, [24] for full waveform inversion, [25– admissible solutions are learned.
29] for fluid mechanics, [30] for continuum mechanics, [31] Simulation enhancement instead only replaces compo-
nents of the simulation chain, while the remaining parts
are still handled by classical methods. Approaches within
1 The considered journals are Computer Methods in Applied Mechan- this category are strongly linked to their respective applica-
ics and Engineering, Computers & Mathematics with Applications, tions and will, therefore, be presented in the context of their
Computers & Structures, Computational Mechanics, Engineering with
Computers, Journal of Computational Physics. specific use cases. Both data-driven and physics-informed
2 Pioneering works exploring neural networks for computational
approaches will be discussed.
mechanics prior to the current rise of deep learning are compiled in Treating discretizations as neural networks is achieved by
reviews such as [3, 4], see [5] for a more recent treatment. Contribu- constructing a discretization from the basic building blocks
tions across almost all of the discussed categories have already been of NNs, i.e., linear transformations and non-linear activa-
made before the year 2000. Aligning with the proposed taxonomy
from Sect. 1.2, these include data-driven modeling, such as inverse
tion functions. Thereby, techniques within deep learning
surrogate models [6, 7], physics-informed approaches for solving dif- frameworks—such as automatic differentiation, gradient-
ferential equations [8–10], efforts in simulation substitution, such as based optimization, and efficient GPU-based parallelization
constitutive modeling [11, 12] or estimating numerical parameters [13]. —can be leveraged to improve classical simulation tech-
Also, efforts to exploit the parallel computation capabilities of neural
networks have been made, where more efficient implementations are
niques.
obtained by constructing networks from discretizations [14–16]. The Generative approaches deal with creating new content
notable exclusions are generative approaches, arising with variational based on a data set. The goal is not, however, to recreate
autoencoders [17, 18], and generative adversarial networks [19] in 2014,
as well as deep reinforcement learning, which was popularized in the
early 2010s [20]. 3 For introductory textbooks, see [36–41].

123
Computational Mechanics (2024) 74:281–331 283

(a) Publications in all fields (b) Publications within computational mechanics

Fig. 1 Number of publications concerning artificial intelligence and some of its subtopics since 1999 showing the exponential growth of literature
within the field. Illustration inspired by [40]

the data, but to generate statistically similar data. This is use- 1.3 Deep learning
ful in diversifying the design space or enhancing a data set
to train surrogate models. Before continuing with the topics specific to computational
Finally, in deep reinforcement learning, an agent learns mechanics, NNs5 and the notation used throughout this work
how to interact with an environment in order to maximize are briefly introduced. In essence, NNs are function approx-
rewards provided by the environment. In the case of deep imators that are capable of approximating any continuous
reinforcement learning, the agent is modeled with NNs. In function [50]. The NN parametrized by the learnable param-
the context of computational mechanics, the environment is eters θ (typically consisting of weights w and biases b) learns
modeled by the governing physical equations. Reinforcement a function ŷ = f N N (x; θ ), which approximates the relation
learning provides an alternative to gradient-based optimiza- y = f (x). The NN is constructed with nested linear transfor-
tion, which is useful when gradient information is not mations in combination with non-linear activation functions
available. σ . The most basic NNs: Fully connected NNs achieve this
The unique proposed taxonomy arises from a methodolog- with layers of fully connected neurons (see Fig. 2), where
ical viewpoint, instead of an application [22–35], or problem the activation aki of each neuron (the ith neuron of layer k) is
[42] oriented perspective. However, parallels can be drawn obtained through linear combinations of the previous layer
to the in [42] identified challenges and proposed areas of and the non-linear activation function σ :
investigation in machine learning. Similarly, the distinction ⎛ ⎞
between machine learning enhanced4 and substitution by
n
machine learning models is made. Additionally, challenges aki = σ ⎝ wki j a i−1
j + bki ⎠ . (1)
such as robustness, explainability, and handling of complex j=1
and high-dimensional data are highlighted. Also, the sep-
aration between physics-informed learning and data-driven If more than one layer (excluding input x and output layer ŷ)
modeling is made by [42], as well as by [43]. Interestingly, is employed, the NN is considered a deep NN, and its training
older reviews [3, 4] arrived at similar categories, additionally process is thereby deep learning. The evaluation of the NN,
including NNs as means of more efficient implementations, i.e., the prediction is referred to as forward propagation. The
i.e., discretizations as NNs. Only the last two proposed quality of prediction is determined by a cost function C( ŷ),
categories, generative approaches, and deep reinforcement which is to be minimized. Its gradients ∇θ C = {∇w C, ∇b C}
learning, have not been spotlighted as methodologies within with respect to the parameters θ are obtained with automatic
reviews of computational mechanics. But these are well- differentiation [51], specifically referred to as backward
established within the machine learning community [44–47] propagation in the context of NNs. The gradients are used
and sufficiently distinct to be treated separately. within a gradient-based optimization [44, 52, 53] to update
the parameters θ and thereby improve the prediction ŷ. Super-
vised learning relies on labeled data x M , y M to establish a
cost function C, while unsupervised learning does not rely
4 A further interesting distinction is made between inner (within a for-
ward simulation) and outer loop enhancements (using multiple forward 5 See [44] for an in-depth treatment and PyTorch [48] or TensorFlow
simulations, e.g., within an optimization). [49] for deep learning libraries.

123
284 Computational Mechanics (2024) 74:281–331

on labeled data. The parameters defining the user-defined (Sects. 2.1.2 and 2.2.2). For simplicity, but without loss of
training algorithm and NN architecture are referred to as generality, time-stepping procedures will be presented on
hyperparameters. The concept is summarized by Fig. 2, PDEs with a first order derivative with respect to time:
showing a fully connected multi-layer, i.e., deep, NN. More
advanced NN architectures discussed throughout this work ∂u
= N T [u; λ], on × T . (3)
are described in Appendix A. ∂t
Notational Remark 1 Data sets are denoted by a superscript
M, i.e, {x M , y M }i=1
NM
, where NM is the data set size. with the non-linear operator N T . Another task in computa-
tional mechanics is the forward modeling and identification
Notational Remark 2 Although x and y may denote vector- of systems of ordinary differential equations (ODEs). For
valued quantities, we do not use bold-faced notation for them. this, we will consider systems of the following form:
Instead, this is reserved for all N degrees of freedom within
a problem, i.e., x = {xi }i=1N , y = {y } N . This can, for
i i=1 d x(t)
instance, be in the form of a domain sampled with N grid = f (x(t)). (4)
dt
points or systems composed of N degrees of freedom. Note
however, that matrices will still be denoted with capital letters Here, x(t) are the time-dependent degrees of freedom and
in bold face. f is the right-hand side defining the system of equations.9
Notational Remark 3 A multitude of NN architectures will Both the forward problem of computing x(t) and the inverse
be discussed throughout this work, for which we introduce problem of identifying f will be discussed in the following.
abbreviations and subscripts. Most prominent are fully con-
nected NNs FF N N (FC-NNs) [44, 54], convolutional NNs 2.1 Data-driven modeling
f C N N (CNNs) [55–57], recurrent NNs f R N N (RNNs) [58–
60], and graph NNs f G N N (GNNs) [61–63]6 . If the network Data-driven modeling relies entirely on labeled data x M , y M .
architecture is independent of the method, the network is The NN learns the mapping between x M and y M with
denoted as f N N . ŷi = f N N (xi ; θ ). Thereby an interpolation to yet unseen
data points is established. A data-driven loss LD , such as the
mean squared error, for example, can be used as cost function
2 Simulation substitution C.

1
NM
In the field of computational mechanics, numerical pro-
cedures are developed to solve or find partial differential C = LD = || ŷi − yiM ||22 (5)
2NM
i=1
equations (PDEs). A generic PDE can be written as

N [u; λ] = 0, on × T , (2) 2.1.1 Space-time approaches

where a non-linear operator N acts on a solution u(x, t) of To declutter the notation, but without loss of generality, the
a PDE as well as the coefficients λ(x, t) of the PDE7 in the temporal dimension t is dropped in this section, as it is possi-
spatio-temporal domain × T . In the forward problem, the ble to treat it like any other spatial dimension x in the scope of
solution u(x, t) is to be computed, while the inverse problem these methods. The goal of the upcoming methods is to either
considers either the non-linear operator N or coefficients learn a forward operator û = F[λ; x], an inverse operator for
λ(x, t) as unknowns. the coefficients λ̂ = I [u; x], or an inverse operator for the
A further distinction is made between methods treating non-linear operator N̂ = O[u; λ; x].10 The methods will be
the temporal dimension t as a continuum, as in space- explained using the forward operator, but they apply analo-
time approaches [67] (Sects. 2.1.1 and 2.2.1)8 , or in dis- gously to the inverse operators. Only the inputs and outputs
crete sequential time steps, as in time-stepping procedures differ.
The solution prediction û i at coordinate xi or ûi on the
6 Another architecture worth mentioning, as it has recently been applied entire domain is made based on a set of inverse coefficients
for regression [64, 65] are spiking NNs [66] specialized to run on λi . The cost function C is formulated analogously to Eq. (5):
neuromorphic hardware and thereby reduce memory and energy con-
sumption. These are, however, not treated in this work.
7 In case of the bar equation (Eq. 25), the PDE coefficients could be 9Note that a spatial discretization of the PDE equation (3) can also be
the cross-sectional stiffness E A(x) or/and the distributed load p(x). written as a system of ODEs.
8 Static problems without time-dependence can only be treated by the 10 Note that u might only be partially known on the domain for
space-time approaches. inverse problems.

123
Computational Mechanics (2024) 74:281–331 285

Fig. 2 Conceptual illustration

on how NNs, parametrized with
weights and biases θ = (w, b),
are trained, relying on the
backward propagation algorithm
computing the gradients of the
cost function C, and how
predictions ŷ are performed via
the forward propagation.
Specifically, a fully connected
deep NN is depicted

1
NM
C = LD = ||û i − u iM ||22 or
2NM
i=1

1
NM
C = LD = ||ûi − uiM ||22 . (6)
2NM
i=1

2.1.1.1. Fully connected neural networks

Fig. 3 Representation of nodes of a Cartesian grid as pixels in an image.
The simplest procedure is to approximate the operator F with Adapted from [79]
a FC-NN FF N N .

û(x) = FF N N (λ; x; θ ) (7)

Applications include pressure and velocity predictions
around airfoils [80–83], stress predictions from geometries
Example applications are flow classification [68, 69], fluid
and boundary conditions [84, 85], steady flow predictions
flow in turbomachinery [70], dynamic beam displacements
[86], detection of manufacturing features [87, 88], full
from previous measurements [71], wall velocity predictions
waveform inversion [89–100], and topology optimization
in turbulence [72], heat transfer [73], prediction of source
[101–110]. An important choice in the design of the learning
terms in turbulence models [74], full waveform inversion
algorithm is the encoding of the input data. In the case of
[75–77], and topology optimization based on moving mor-
geometries and boundary conditions, binary representations
phable bars [78]. The approach is however limited to simple
are the most straightforward approach. These are however
problems, as an abundance of data is required. Therefore,
challenging for CNNs, as discussed in [86]. Signed distance
several improvements have been proposed.
functions [86] or simulations on coarse grids provide supe-
2.1.1.2. Image-to-image mapping rior alternatives. For inverse problems, an initial forward
One downside of the application of FC-NNs to problems simulation of an initial guess of the inverse field can be
in computational mechanics is that they often need to learn used to encode the desired boundary conditions [105, 108–
spatial relationships with respect to x from scratch. CNNs 110]. Another possibility for CNNs is a decomposition of the
inherently account for these spatial relationships due to their domain. The mapping can be performed on the full domain
kernel-based structure. Therefore, image-to-image mappings [111], smaller subdomains [112], or even individual pixels
using CNNs have been proposed, where an image, i.e., a [113]. In the latter two cases, interfaces require special treat-
uniform grid (see Fig. 3) of the coefficients λ, is used as ment.
input. The disadvantage of CNN mappings is being constrained
to uniform grids on rectangular domains. This can be circum-
û = FC N N (λ; θ ) (8) vented by using GNNs acting on graph data, e.g., meshes,
such as in [114–116], or point cloud-based NNs [117,
This results in a prediction of the solution û throughout the 118] acting on point cloud data, such as in [119]. Just as
entire image, i.e., the domain. CNNs, GNNs operate on the invariant structural elements

123
286 Computational Mechanics (2024) 74:281–331

of the data, which for GNNs are edges connecting vertices

(see Appendix A.2) instead of pixels aligned on a structured
grid for CNNs. In fact, GNNs can be regarded as a general-
ization of CNNs since they can handle a broader class of data
structures, i.e., graphs (including images)11 . This comes at
the cost of less efficient implementations when compared to
Fig. 4 DeepONet, operator learning via prediction of the basis func-
pure CNNs. tions t̂ and the corresponding coefficients b̂ [139]
2.1.1.3. Model order reduction encoding
Independent of the NN architecture, learning can be aided
map between function spaces instead of functions. Neural
by applying the NN to a lower-dimensional space that is able
operators rely on the extension of the universal approxima-
to capture the data. For complex problems, mappings e to
tion theorem [50] to non-linear operators [138]. The two
low-dimensional spaces (also referred to as latent space or
most prominent neural operators are DeepONets13 [139] and
latent vector) h can be identified with model order reduction
Fourier neural operators [140].
techniques. Thus, in the case of simulation substitution, a
DeepONet
low-dimensional encoding hλ = e(λ) of λ (sampled on all
In DeepONets [139], illustrated in Fig. 4, the task of predict-
sample points x) is identified. This is provided as input to a
ing the operator û(λ; x) is split up into two sub-tasks:
NN to predict the solution field hu in a reduced latent space.
The full solution field u (on all sample points x) is obtained
• the prediction of N P basis functions t̂(x) (TrunkNet),
in a decoding d = e−1 step. The prediction is given as
• the prediction of the corresponding N P problem-specific
u coefficients b̂(λ) (BranchNet).
û = d( ĥ ) = d FN N (hλ ; θ ) = d FN N e(λ); θ . (9)
The basis is predicted by the TrunkNet with parameters
The dimensional reduction can, e.g., be performed with θ T via an evaluation at coordinates x. The coefficients
principal components analysis [120, 121], as proposed in are estimated from the coefficients λ using the BranchNet
[122], proper orthogonal decomposition [123], or reduced parametrized by θ B and, thus, specific to the problem being
manifold learning [124]. These techniques have been applied solved. Taking the dot product over the evaluated basis and
to learning aortic wall stresses [125], arterial wall stresses the coefficients yields the solution prediction û(λ; x).
[126], flow velocities in viscoplastic flow [127], and the
inverse problem of identifying unpressurized geometries t̂(x) = FFT N N (x; θ T ) (10)
from pressurized geometries [128]. Currently, the most
b̂(λ) = FFBN N (λ; θ B ) (11)
impressive results in data-driven surrogate modeling are
achieved with model order reduction encodings combined û(x) = b̂(λ) · t̂(x) (12)
with NNs [129, 130], which can be combined with most
other methodologies presented in this work. Applications can be found in [141–153]. DeepONets have
Another dimensionality reduction technique are autoen- also been extended with physics-informed loss functions
coders [131], where e and d are modeled by NNs12 . These [154–156].
are treated in detail in Appendix B.1 and enable non-linear Fourier neural operators
encodings. An early investigation is presented in [132], Fourier neural operators [140] predict the solution û on a uni-
where proper orthogonal decomposition is related to NNs. form grid x from the spatially varying coefficients λ = λ(x).
Application areas are the prediction of designs of acoustic As the aim is to learn a mapping between functions, sampled
scatterers from the reduced latent space [133], or mappings on the entire domain, non-local mappings can be performed
from dynamic responses of bridges to damage [134]. Fur- at each layer [157]. For example, mappings such as inte-
thermore, it has to be stated that many of the image-to-image gral kernels [158, 159], Laplace transformations [160], and
mapping techniques rely on NN architectures inspired by Fourier transforms [140] can be employed. These transfor-
autoencoders, such as U-nets [135, 136]. mations enhance the non-local expressivity of the NN [157],
2.1.1.4. Neural operators The most recent trend in surro- where Fourier transforms are particularly favorable due to
gate modeling with NNs are neural operators [137], which the computational efficiency achievable through fast Fourier
transforms.
11 The Fourier neural operator, as illustrated in Fig. 5,
For an in-depth treatment of the inner workings of GNNs, see [63].
12 consists of Fourier layers, where linear transformations K
Note that the autoencoder is modified, as it does not perform an
identity mapping. Nonetheless, the idea of mapping to a reduced latent
state is exploited. 13 Originally proposed in [138] with shallow NNs.

123
Computational Mechanics (2024) 74:281–331 287

• training data: generalization error,

• training algorithm: optimization error.

A lack of sufficient training data leads to poor generaliza-

tion. This might be alleviated through faster data generation
using, e.g., faster and specialized classical methods [178],
or improved sampling strategies, i.e., finding the minimum
number of required data points distributed in a specific
manner to train the surrogate. Additionally, current train-
ing algorithms only converge to local optima. Research into
improved optimization algorithms, such as current trends in
Fig. 5 Fourier neural operator, operator learning in the Fourier space computing better initial weights [179] and thereby better
[140]
local optima, attempts to reduce the optimization error. At
the same time, training times are reduced drastically increas-
are performed after Fourier transforms F along the spatial ing the competitiveness.
dimensions x. Subsequently, an inverse Fourier transform
F −1 is applied, which is added to the output of a linear 2.1.2 Time-stepping procedures
transformation W performed outside the Fourier space. Thus,
the Fourier transform can be skipped by the NN. The final For the time-stepping procedures, we will consider Eqs. (3)
step is an activation function σ . The manipulations within and (4) in the following.
a Fourier layer to predict the next activation on the uniform 2.1.2.1. Recurrent neural networks
grid a( j+1) (x) can be written as The simplest approach to modeling time series data is by
using FC-NNs to predict the next time step ti+1 from the
current time step ti :
a( j+1) (x) = σ W a( j) (x) + b

−1 ( j)
û(x, ti+1 ) = FF N N x, ti ; u(x, ti ); θ . (14)
+(F KF a (x) , (13)
However, this approach lacks the ability to capture the tempo-
where b is the bias. Both the linear transformations K , W and ral dependencies between different time steps, as each input is
the bias b are learnable and thereby part of the parameters θ . treated independently and without considering more than just
Multiple Fourier layers can be employed, typically used in the previous time step. Incorporating the sequential nature of
combination with an encoding network PN N and a decoding the data can be achieved directly with RNNs. RNNs maintain
network Q N N . a hidden state which captures information from the previous
Applications can be found in [161–171]. An extension time steps, to be used for the next time step prediction. By
relying on the attention mechanisms of transformers [172] unrolling the RNN, the entire time-history can be predicted.
is presented in [173]. Analogously to DeepONets, Fourier
neural operators have been combined with physics-informed {û(x, t2 ), û(x, t3 ), . . . , û(x, t N )}
loss functions [174]. = FR N N (x; u(x, t1 ); θ ) (15)
2.1.1.5. Neural network approximation power
Despite the advancements in NN architectures14 , NN surro- Shortcomings of RNNs, such as their tendency to struggle
gates struggle to learn solutions of general PDEs. Typically, with learning long-term dependencies due to the problem
successes have only been achieved for parametrized PDEs of vanishing or exploding gradients, have been addressed
with relatively small parameter spaces—or in cases where by more sophisticated architectures such as long short-time
accuracy, reliability, or generalization were disregarded. It memory networks (LSTM) [59], gated recurrent unit net-
has, however, been shown—both for simple architectures works (GRU) [180], and transformers [172] (see [181] for a
such as FC-NNs [175, 176] as well as for advanced archi- recent contribution on transformers for thermal analysis in
tectures such as DeepONets [177]—that NNs possess an additive manufacturing). The concept of recurrent units has
excellent theoretical approximation power which can capture also been combined with other architectures, as demonstrated
solutions of various PDEs. Currently, there are two obsta- for CNNs [182] and GNNs [114, 115, 183–187].
cles that impede the identification of sufficiently good optima Further applications of RNNs are full waveform inversion
with these desirable NN parameter spaces [175]: [188–190], high-dimensional chaotic systems [191], fluid
flow [40, 192], fracture propagation [116], sensor signals in
14 including architectures specifically designed to solve PDEs non-linear dynamic systems [193, 194], and settlement field

123
288 Computational Mechanics (2024) 74:281–331

predictions induced by tunneling [195], which was extended (Eq. 18) and the true state in the dictionary space. Orthogo-
to damage prediction in affected structures [196, 197]. RNNs nality is not required and therefore not enforced.
are often combined with reduced order model encodings
[198], where the dynamics are predicted on the reduced 1
N
latent space, as demonstrated in [199–205]. Further varia- C= ||ψ̂(x(ti+1 )) − Aψ̂(x(ti ))||22 (20)
2N
tions employ classical time-stepping schemes on the reduced i=1
latent space obtained by autoencoders [206, 207].
2.1.2.2. Dynamic mode decomposition When the dictionary is learned, the state predictions can be
Another approach that was formulated for system dynam- reconstructed using the Koopman mode decomposition, as
ics, i.e., Eq. (4) is dynamic mode decomposition (DMD) explained in detail in [212].
[208, 209]. The aim of DMD is to identify a linear Alternatively, the mapping to the augmented state can
operator A that relates two successive snapshot matrices be performed with autoencoders, which at the same time
with n time steps X = [x(t1 ), x(t2 ), . . . , x(tn )]T , X = allows for a direct map back to the original space [214–
[x(t2 ), x(t3 ), . . . , x(tn+1 )]T : 217]. Thus, an encoder learns a reduced latent space ĥ(x) =
e N N (x; θ e ) and a decoder learns the inverse mapping x̂(h) =
d N N (h; θ d ). The networks are trained using three losses: the
X ≈ AX. (16)
autoencoder reconstruction loss LA , the linear dynamics loss
LR , and the future state prediction loss LF .
To solve this, the problem is reframed as a regression task.
The operator A is approximated by minimizing the Frobe-
n+1
nius norm of the difference between X and AX. This LA =
1
||x(ti ) − d N N (e N N (x(ti ); θ e ); θ d )||22
minimization can be performed using the Moore-Penrose 2(n + 1)
i=1
pseudoinverse X † (see, e.g., [38]): (21)
1
n
A = arg min||X − AX|| F = X X † . (17) LR = ||e N N (x(ti+1 ); θ e ) − Ae N N (x(ti ); θ e )||22
A 2n
i=1
(22)
Once the operator is identified, it can be used to propagate
1
n
the dynamics forward in time, approximating the next state LF = ||x(ti+1 ) − d N N ( Ae N N (x(ti ); θ e ); θ d )||22
x(ti+1 ) using the current state x(ti ): 2n
i=1
(23)
x(ti+1 ) ≈ Ax(ti ). (18) C = κA LA + κR LR + κF LF (24)

This framework, is however, only valid for linear dynamics. The cost function C is composed of a weighted sum of the
DMD can be extended to handle non-linear systems through loss terms LA , LR , LF and weighting terms κA , κR , κF .
the application of Koopman operator theory [210]. Accord- Furthermore, [216] allows A to vary depending on the state.
ing to Koopman operator theory, it is possible to represent This is achieved by predicting the eigenvalues of A with an
a non-linear system as a linear one by using an infinite- auxiliary network and constructing the matrix from these.
dimensional Koopman operator K that acts on a transformed
state e(x(ti )): 2.1.3 Active learning and transfer learning

g(x(ti+1 )) = K[e(x(ti ))]. (19) Finally, an important machine learning technique indepen-
dent of the NN architecture and applicable to both space-
In theory, the Koopman operator K is an infinite-dimensional time and time-stepping approaches is active learning [218].
linear transformation. In practice, however, finite-dimensional Instead of precomputing a labeled data set, data is only pro-
approximations are employed. This approach is, for example vided when the prediction quality of the NN is insufficient.
utilized in the extended DMD [211], where the regression Furthermore, the data is not chosen arbitrarily, but only in the
from Eq. (17) is performed on a higher-dimensional state vicinity of the failed prediction. In computational mechan-
h(ti ) = e(x(ti )) relying on a dictionary of orthonormal basis ics, the prediction of the NN can be assessed with an error
functions h(ti ) = ψ(x(ti )). Alternatively, the dictionary can indicator. For an insufficient result, the results of a classical
be learned using NNs, i.e., ψ̂(x) = ψ N N (x; θ ), as demon- simulation are used to retrain the NN. Over time, the NN
strated in [212, 213]. The NN is trained by minimizing the estimates improve in the respective domain of application.
mismatch between predicted state ψ( x̂(ti+1 )) = Aψ̂(x(ti )) Due to the error indicator and the classical simulations, the

123
Computational Mechanics (2024) 74:281–331 289

predictions are reliable. Examples for active learning in com- 2.2.1.1. Differential equation solving with neural net-
putational mechanics can be found in [219–221]. works
Another technique, transfer learning [222, 223], aims at The concept of solving PDEs15 was first proposed in the
accelerating the NN training. Here, the NN is first trained 1990s [8–10], but was recently popularized by the so-called
on a similar task. Subsequently, it is applied to the task of physics-informed neural networks (PINNs) [228] (see [229–
interest—where it converges faster than an untrained NN. 231] for recent review articles and SciANN [232], SimNet
Applications in computational mechanics can be found in [233], DeepXDE [234] for libraries).
[98, 224]. To illustrate the idea and variations of PINNs, we will
consider the differential equation of a static elastic bar
2.2 Physics-informed learning

d du
In supervised learning, as discussed in Sect. 2.1, the qual- EA + p = 0, x ∈ . (25)
dx dx
ity of prediction strongly depends on the amount of training
data. Acquiring data in computational mechanics may be Here, the operator N is given by the left-hand side of the
expensive. To reduce the amount of required data, con- equation, the solution u(x) is the axial displacement, and the
straints enforcing the physics have been proposed. Two main spatially varying coefficients λ(x) are given by the cross-
approaches exist [43, 225]. The physics can be enforced by sectional properties E A(x) and the distributed load p(x).
modifying the cost function through a penalty term punishing Additionally, boundary conditions are specified, which can
unphysical predictions, thus acting as a regularizer. Possible be in terms of Dirichlet (on D ) or Neumann boundary con-
modifications are discussed in the upcoming section. Alter- ditions (on N ):
natively, the physics can be enforced by construction, i.e.,
by reducing the learnable space to a physically meaningful
space. This approach is highly specific to its application and u(x) = g(x), x ∈ D , (26)
will therefore mainly be explored in Sect. 3. A brief coverage du(x)
E A(x) = f (x), x ∈ N . (27)
is provided in Sect. 2.2.3. dx
Both approaches can be found in overview publica-
tions, where [43] defines four overarching methodologies: Physics-informed neural networks
(i) augmentation of training data using prior knowledge, (ii) PINNs [228] approximate either the solution u(x), the coef-
modification of the model, i.e., enforcement by construction, ficients λ(x), or both with FC-NNs.
(iii) enhancement of the learning algorithm with regular-
ization terms, i.e., enforcing constraints through the cost û(x) = FF N N (x; θ u ) (28)
function, and (iv) checking the final estimate and thereby λ
λ̂(x) = I F N N (x; θ ) (29)
discarding physical violations (using, e.g., error indicators).
The two most prominent methodologies, i.e., modifying the
cost function and enforcement by construction are simi- Instead of training the network with labeled data as in Eq. (6),
larly mentioned in [225], which correspondingly refers to the residual of the PDE is considered. The residual is evalu-
them as physics-informed and physics-augmented. Further ated at a set of NN points, called collocation points. Taking
variations in terminology can be found in [182, 226], who the mean squared error over the residual evaluations yields
refer to physics-informed NNs for multiple solutions as the PDE loss
physics-constrained deep learning, or [227] using the term
1
physics-enhanced NNs for NNs enforcing the physics by NN

construction. Due to the many names within the relatively LN = ||N [u(xi ); λ(xi )]||22
2NN
new and interconnected field, we cover the variations under i=1
2
1
NN
the overarching term of physics-informed learning. d du(xi )
= E A(xi ) + p(xi ) . (30)
2NN dx dx
i=1
2.2.1 Space-time approaches

Once again and without loss of generality, the temporal The gradients of the possible predictions, i.e., u, E A, and p
dimension t is dropped to declutter the notation. However, in with respect to x, are obtained with automatic differentiation
contrast to Sect. 2.1.1, the following methods are not equally [51] through the NN approximation. Similarly, the boundary
applicable to forward and inverse problems. Thus, the pre-
diction of the solution û, the PDE coefficients λ̂, and the 15 Typically, a single solution to a PDE is obtained. If the PDE is
non-linear operator N are treated separately. parametrized, multiple solutions can be obtained.

123
290 Computational Mechanics (2024) 74:281–331

conditions are enforced at the NBD + NBN boundary points. The test function is learned through a minimax optimization

B N min max C, (36)

1 D θ
u w θ
LB = (u(xi ) − g)2
2NN D
i=1 where the test function w(x) continually challenges the solu-
NB N
2 tion u(x).
1 du(xi )
+ E A(xi ) − f (31) Deep energy method and deep Ritz method
2NB N dx
i=1 By minimizing the potential energy = i + e instead,
the need for test functions is overcome by the deep energy
The cost function is composed of the PDE loss LN , boundary method [244] and the deep Ritz method [245]. This results
loss LB , and possibly a data-driven loss LD in the following loss term

C = LN + LB + LD . (32) 2
1 du(x)
LE = i + e =E A(x) d
2 dx
Both the deep least-squares method [235] and the deep
du(x)
Galerkin method [236] are closely related. Instead of focus- − u(x)E A(x) d
dx
ing on the residuals at individual collocation points as in
PINNs, these methods consider the L 2 -norm of the residuals − u(x) p(x)d. (37)

integrated over the domain .
Variational physics-informed neural networks Note that the inverse problem generally cannot be solved
Computing high-order derivatives for the non-linear opera- using the minimization of the potential energy. Consider, for
tor N is expensive. Therefore, variational PINNs [237, 238] instance, the potential energy of the bar equation in Eq. (37),
consider the weak form of the PDE, which lowers the order which is not well-posed in the inverse setting. Here, E A(x)
of differentiation. In the case of the bar equation, the weak going towards −∞ in the domain and going towards ∞
PDE loss is given by at N minimizes the potential energy LE .
Extensions
dwi (x) du(x)
LVi = E A(x) d A multitude of extensions to the PINN methodology exist.
dx dx For in-depth reviews, see [229–231].

du(x) Learning multiple solutions
− wi (x)E A(x) d N
dx Currently, PINNs are mainly employed to learn a single
N
solution. As the training effort exceeds the solving effort of
− wi (x) p(x)d = 0, ∀wi (x), (33)
classical solvers, the viability of PINNs is questionable [246].
However, PINNs can also be employed to learn multiple solu-
1
NV
LV = LVi . (34) tions. This is achieved by providing the parametrization of
NV the PDE, i.e., λ as an additional input to the network, as
i
discussed in Sect. 2.1. This enables a cheap prediction stage
In [237], NV trigonometric and polynomial test functions without retraining for new solutions16 . One possible example
wi (x) are used. The cost function is obtained by replacing the for this is [247], where different geometries are captured in
PDE loss LN with the weak PDE loss LV in Eq. (32). Note terms of point clouds and processed with point cloud-based
that the Neumann boundary conditions are now not included NNs [117].
in the boundary loss LB , as they are already incorporated Boundary conditions
in the weak form in Eq. (33). The integrals are evaluated The enforcement of the boundary conditions through a
through numerical integration methods, such as Gaussian penalty term LB in Eq. (31) leads to an unbalanced opti-
quadrature, Monte Carlo integration methods [239, 240], mization, due to the competing loss terms LN , LB , LD
or sparse grid quadratures [241]. Severe inaccuracies can in Eq. (32)17 . One remedy is to modify the NN output
be introduced through the numerical integration of the NN
output—for which remedies have been proposed in [242]. 16 Importantly, the training would be without training data and would
Weak adversarial networks only require a definition of the parametrized PDE. Currently, this is only
Instead of specifying the test functions w(x), weak adversar- possible for simple PDEs with small parameter spaces.
17 Consider, for instance, a training procedure in which the PDE loss
ial networks [243] employ a second NN as test function
LN is first minimal, such that the PDE is fulfilled. Without fulfilment
of the boundary conditions, the solution is not unique. However, the NN
ŵ(x) = W F N N (x; θ w ). (35) struggles to modify the current boundary values without violating the

123
Computational Mechanics (2024) 74:281–331 291

FF N N by multiplication of a function, such that the Dirichlet which lead to larger residuals. This approach is strongly
boundary conditions are satisfied a priori, i.e., LB = 0, as related to the approaches relying on the augmented Lagrangian
demonstrated in [37, 248]. method [259] and competitive PINNs [260], where an addi-
tional NN models the penalty weights κ(x) = K F N N (x; θ κ ).
û(x) = G(x) + D(x)FF N N (x; θ u ) (38) This is similar to weak adversarial networks, but instead for-
mulated using the strong form.
Here, G(x) is a smooth interpolation of the boundary con- Ansatz
ditions, and D(x) is a signed distance function that is zero Another prominent topic is the question of which ansatz to
at the boundary. For Neumann boundary conditions, [249] choose. The type of ansatz is, for example, determined by
propose to predict u and its derivatives ∂u/∂ x with sepa- different NN architectures (see [261] for a comparison) or
rate networks, such that the Neumann boundary conditions combinations with classical ansatz formulations. Instead of
can be enforced strongly by modifying the derivative net- using FC-NNs, some authors [182, 226] employ CNNs to
work. This requires an additional constraint, ensuring that exploit the spatial structure of the data. Irregular geometries
the derivative predictions match the derivative of u. For can be handled by embedding the structure in a rectangu-
complex domains, G(x) and D(x) cannot be found analyti- lar domain using binary encodings [262] or signed distance
cally. Therefore, [248] use NNs to learn G(x) and D(x) in functions [86, 263]. Another option are coordinate trans-
a supervised manner by prescribing either the boundary val- formations into rectangular grids [264]. The CNN requires
ues or zero at the boundary and restricting the values within a full-grid discretization, meaning that the coordinates x
the domain to be non-zero. Similarly [250] proposed using are analytically independent of the prediction û = FC N N .
radial basis function networks for G(x), where D(x) = 1 is Thus, the gradients of u are not obtained with automatic
assumed. The radial basis function networks are determined differentiation, but with numerical differentiation, i.e., finite
by solving a linear system of equations constructed with the differences. Alternatively, the output of the CNN can rep-
boundary conditions. On uniform grids, strong enforcement resent coefficients of an interpolation, as proposed under
can be achieved through specialized CNN kernels [204] with the name spline-PINNs [265] using Hermite splines. This
constant padding terms for Dirichlet boundary conditions and again allows for an automatic differentiation. This is simi-
ghost cells for Neumann boundary conditions. Constrained larly applied for irregular geometries in [266], where GNNs
backward propagation [251] has also been proposed to guar- are used in combination with a piecewise polynomial basis.
antee the enforcement of boundary conditions [252, 253]. Using a classical basis has the added advantage that Dirich-
Another possibility is to introduce weighting terms let boundary conditions can be satisfied exactly. A further
κN , κB , κD for each loss term. These are either hyperpa- variation is the approximation of the coefficients of classical
rameters, or they are learned during the optimization with bases with FC-NNs. This is shown with B-splines in [267] in
attention mechanisms [254–256]. This is achieved by per- the sense of isogeometric analysis [268]. This was similarly
forming a minimax optimization with respect to all weighting done for piecewise polynomials in [269]. However, instead of
terms κ = {κN , κB , κD } simply minimizing the PDE residual from Eq. (30) directly,
the finite element discretization [270, 271] is exploited. The
min max C. (39) loss LF can thus be formulated in terms of the non-linear
θ κ
stiffness matrix K , the force vector F, and the degrees of
Expanding on this idea, each collocation point used for the freedom uh .
loss terms can be considered an individual equality constraint
[257, 258]. Therefore, a weighting term κNi is allocated for LF = ||K (uh )uh − F||22 (41)
each collocation point xi , as illustrated for the PDE loss LN
from Eq. (30) In the forward problem, uh is approximated by a FC-NN,
whereas for the inverse problem a FC-NN predicts K . Sim-
1 ilarly, [272, 273] map a NN onto a finite element space by
NN
LN = κN ,i ||N [u(xi ); λ(xi )]||22 . (40) using the NN evaluations at nodal coordinates as the cor-
2NN
i=1 responding basis function coefficents. This also allows a
straightforward strong enforcement of Dirichlet boundary
This has the added advantage that greater emphasis is conditions, as demonstrated in [79] with CNNs. The nodes
assigned on more important collocation points, i.e., points are represented as pixels (see Fig. 3).
Prior information on the solution can be incorporated
Footnote 17 continued
through a feature layer [274]. If, for example, it is known that
PDE loss and thereby increasing the total cost function C. The NN is
thus stuck in a bad local minimum. Similar scenarios can be formulated the solution is composed of trigonometric functions, a feature
for a too rapid minimization of the other loss terms. layer with trigonometric functions can be applied after the

123
292 Computational Mechanics (2024) 74:281–331

input layer. Thus, known features are given to the NN directly tive should be zero at the correct solution. However, a general
to aid the learning. Without known features, the task can problem in the cost function formulation persists. The cost
also be modified to improve learning. Inspired by adaptivity function should correspond to the norm of the error, which
from finite elements, refinements are progressively learned is not necessarily the case. This means that a reduction in
by additional layers of the NN [275] (see Fig. 6). Thus, a the cost does not necessarily yield an improvement in qual-
coarse solution u1 is learned to begin with, then refined to ity of solution. The error norm can be expressed in terms of
u2 by an additional layer, which again is refined to u3 until the H −1 -norm, which, according to [288], can efficiently be
the deepest refinement level is reached. computed on rectangular domains with Fourier transforms.
Domain decomposition Thus, the H −1 -norm can directly be used as cost function
To improve the scalability of PINNs to more complex prob- and minimized.
lems, several domain decomposition methods have been Another aspect is numerical differentiation, which is
proposed. One approach are hp-variational PINNs [238], advantageous for the residual of the PDE [289], as automatic
where the domain is decomposed into patches. Piecewise differentiation may be erroneous due to spurious oscillations
polynomial test functions are defined on each patch sepa- between collocation points. Thus, numerical differentiation
rately, while the solution is approximated by a globally acting enforces regularity, which was exploited in [289] by cou-
NN. This enables a separate numerical integration of each pling automatic differentiation and numerical differentiation
patch, improving its accuracy. to retain the advantages of automatic differentiation.
In an alternative formulation, one NN can be used per sub- Further specialized modifications to NN architectures
domain. This was proposed as conservative PINNs [276], have been proposed. Adaptive activation functions [290]
where conservation laws are enforced at the interface to have shown acceleration in convergence. Extreme learning
ensure continuity. Here, the discrepancies between both solu- machines [291, 292] remove the need for iterations alto-
tion and flux were penalized at the interface in a least squares gether. All layers are randomly initialized in extreme learning
manner. The advantages of this approach are twofold: Firstly, machines, and only the last layer is learnable. Without a non-
parallelization is possible [277] and, secondly, adaptivitiy linear activation function, the parameters are found with a
can be introduced. Shallower networks can be employed for least-squares regression. This was demonstrated for PINNs
smooth solutions and deeper networks for more complex in [293]. Instead of only learning the last layer, the problem
solutions. The approach was generalized for any PDE in the can be split into a non-linear and a linear regression prob-
context of extended PINNs [278]. Here, the interface con- lem, which are solved separately [294], such that the full
dition is formulated in terms of the difference in both the expressivity of NNs is retained.
residual and the solution. Applications to forward problems
Acceleration methods PINNs have been applied to various PDEs (see [229–231] for
Analogously to supervised learning, as discussed in Sect. 2.1, an overview). Forward problems can, for example, be found
transfer learning can be applied to PINNs [279] as, e.g., in solid mechanics [284, 295, 296], fluid mechanics [297–
demonstrated in phase-field fracture [280] or topology opti- 304], and thermomechanics [305, 306]. Currently, PINNs do
mization [281]. These are very suitable problems since crack not outperform classical solvers such as the finite element
and displacement fields evolve with mostly local changes in method [246, 307] in terms of speed for a given accuracy of
phase-field fracture. For topology optimization, only minor engineering relevance. In the author’s experience and judge-
updates are expected between each optimization iteration ment, this is especially the case for forward problems even
[281]. if the extensions mentioned above are employed. Often, the
The poor performance of PINNs in their original form can mentioned gains compared to classical forward solvers dis-
also be improved with better sampling strategies. In impor- regard the training effort and only report evaluation times.
tance sampling [282, 283], the collocation point density is Incorporating large parts of the solution in the form of
proportional to the value of the cost function. Alternatively, measurements with the data-driven loss LD improves the
residual-based adaptive refinement [234] adds collocation performance of PINNs, which thereby can become a viable
points in the vicinity of areas with a higher cost function. method in some cases. Yet, [308] states that data-driven meth-
Another essential topic for NNs is normalization of ods outperform PINNs. Thus PINNs should not be regarded
the inputs, outputs, and loss terms [284, 285]. For time- as a replacement for data-driven methods, but rather as a reg-
dependent problems, it is possible to use time-dependent ularization technique for data-driven methods to reduce the
normalization [286] to ensure that the solution is always in generalization error.
the same range regardless of the time step. Applications to inverse problems
Furthermore, the cost function can be enhanced by includ- However, PINNs are in particular useful for inverse problems
ing the derivative of the residual [287] as well. The derivative with full domain knowledge, i.e., the solution is available
should also be minimized, as both the residual and its deriva- throughout the entire domain. This has, for example, been

123
Computational Mechanics (2024) 74:281–331 293

Fig. 6 Refinement expressed

with NNs in terms of NN depth.
Thick black lines indicate
non-learnable connections and
gray lines indicate learnable
connections. Each added layer is
composed of a projection from
the coarser level and a
correction obtained through the
learnable connection

shown for the identification of material properties [285, identified. AI-Feynman has been successfully applied to 100
309–312]. By contrast, for inverse problems with only par- equations from the Feynman lectures [325].
tial knowledge, the applicability of PINNs is limited [313],
as both forward and inverse solution have to be learned
simultaneously. Most applications therefore limit themselves 2.2.2 Time-stepping procedures
to simpler inversions such as size and shape optimization.
Examples are published, e.g., in [295, 314–319]. Exceptions Again Eqs. (3) and (4) will be considered for the time-
that deal with the identification of entire fields can be found in stepping procedures.
full waveform inversion [320], topology optimization [321], 2.2.2.1. Physics-informed neural networks
elasticity, and the heat equation [322]. In the spirit of domain decomposition, parareal PINNs [326]
2.2.1.2. Inverse problems split up the temporal domain in subdomains [ti < ti+1 ]. A
PINNs are capable of discovering governing equations by rough estimate of the solution u is provided by a conjugate
either learning the operator N or the coefficients λ. The gradient solver on a simplified form of the PDE starting from
resulting operator is, however, not always interpretable, and t0 . PINNs are then independently applied in each subdomain
in the case of identification of the coefficients, the underlying to correct the estimate. Subsequently, the conjugate gradi-
PDE is assumed. To discover interpretable operators, one can ent solver is applied again, starting from t1 . This process is
apply sparse regression approaches [323]. Here, potential dif- repeated until all time steps have been traversed. A closely
ferential operators are assumed as an input to the non-linear related approach can be found in [327], where a PINN is
operator retrained on successive time segments. It is however ensured
that previous time steps are kept fulfilled through a data-
driven loss term for time segments that were already learned.
∂u ∂ 2 u
N̂ x, u, , , . . . = 0. (42) Another approach are the discrete-time PINNs [228],
∂x ∂x2 which consider the temporal dimension in a discrete man-
ner. The differential equation from Eq. (3) is discretized with
Subsequently, a NN learns the corresponding coefficients the Runge-Kutta method with q stages [328]:
using observed solutions inserted into Eq. (42). The eval-
uation of the differential operators is achieved through
automatic differentiation by first interpolating the solution
q

with a NN. Sparsity is ensured with a L 1 -regularization. u n+ci

=u +
n
t ai j N T [u n+c j ], i = 1, . . . , q,
j=1
A more sophisticated and complete framework is AI-
(43)
Feynman [324]. Sequentially, dimensional analysis, polyno-
mial regression, and brute force search algorithms are applied
q

to identify fundamental laws in the data. If unsuccessful, a u n+1 = u n + t b j N T [u n+c j ], (44)

j=1
NN interpolates the data, which can thereby be queried for
symmetry and separability. The identification of symmetries
leads to a reduction in variables, i.e., a reduction of the input where
space. In the case of separability, the problem is decomposed
into two subproblems. The reduced problems or subproblems
are iteratively fed through the framework until an equation is u n+c j (x) = u(t n + c j t, x), j = 1, . . . , q. (45)

123
294 Computational Mechanics (2024) 74:281–331

A NN FN N predicts all stages i = 1, . . . , q from an input x: N̂ T are combined with point-wise CNNs [334] in [332] or a
symbolic network in [333]. Both yield an interpretable oper-
û = [û n+c1 (x), . . . , û n+cq (x), û n+1 (x)] = FN N (x; θ ). (46) ator from which the analytical expression can be extracted.
In order to construct a loss function, Eqs. (3) and (49) are
The cost is then constructed by rearranging Eqs. (43) and discretized using the forward Euler method:
(44).
∂u ∂ 2 u
u(x, tn+1 ) = u(x, tn ) + t N̂ T x, u, , 2 , . . . . (50)

q ∂x ∂x
û n = û in = û n+ci − t ai j N T [û n+c j ], i = 1, . . . , q,
j=1 This temporal discretization is applied iteratively, and the
(47) discrepancy between the derived function and the measured

q data u M (x, tn ) serves as the loss function.
û =
n n
û q+1 = û n+1
− t b j N T [û n+c j ]. (48) SINDy
j=1 Sparse identification of non-linear dynamic systems (SINDy)
[335] deals with the discovery of dynamic systems of the
The q +1 predictions û in , û q+1
n of û n have to match the initial form of Eq. (4). The task is posed as a sparse regression prob-
conditions u M , where the mean squared error is used as a
n
lem. Snapshot matrices of the state X = [x(t1 ), x(t2 ), . . . ,
loss function to learn all stages û. The approach has been x(tn )] and its time derivative Ẋ = [ ẋ(t1 ), ẋ(t2 ), . . . , ẋ(tn )]
applied to fluid mechanics [329, 330]. are related to one another via candidate functions (X) eval-
2.2.2.2. Inverse problems uated at X using unknown coefficients :
As for inverse problems in the space-time approaches (Para-
graph 2.2.1.2), the non-linear operator N can be learned. For Ẋ = (X). (51)
temporal problems, this corresponds to the right-hand side
of Eq. (3) for PDEs and to Eq. (4) for systems of ODEs. The The coefficients are determined through sparse regres-
predicted right-hand side can then be used to predict time sion, such as sequential thresholded least squares or LASSO
series using a classical time-stepping scheme, as proposed in regression. By including partial derivatives, SINDy has been
[331]. More sophisticated methods leaning on similar prin- extended to the discovery of PDEs [336, 337].
ciples are presented in the following. Specifically, we will The expressivity of SINDy can further be increased by a
discuss PDE-Net for discovering PDEs, SINDy for discov- coordinate transformation into a representation allowing for
ering systems of ODEs in an interpretable sense, and an a simpler representation of the system dynamics. This can
approach relying on multistep methods for systems of ODEs. be achieved with an autoencoder (consisting of an encoder
The multistep approach leads to a non-interpretable, but more e N N (x; θ e ) and a decoder d N N (h; θ d ), as proposed in [338],
expressive approximation of the right-hand side. where the dynamics are learned on the reduced latent space h
PDE-Net using SINDy. A simultaneous optimization of the NN param-
PDE-Net [332, 333] is designed to learn both the system eters θ e , θ d and SINDy parameters is conducted with
dynamics u(x, t) and the underlying differential equation it gradient descent. The cost is defined in terms of the autoen-
follows. Given a problem of the form of Eq. (3), the right- coder reconstruction loss LA and the residual of Eq. (51)
hand side can be approximated as a function of coordinates at both the reduced latent space LR and the original space
and gradients of the solution. LF 19 . A L 1 -regularization for promotes sparsity.

1
n

T ∂u ∂ 2 u LA = ||x(ti ) − d N N e N N (x(ti ); θ e ); θ d ||22 (52)
N̂ x, u, , ,... (49) 2n
∂x ∂x2 i=1

1 n

The operator N̂ T is approximated by NNs. The first step LR = || ∇x e N N x(ti ); θ e · ẋ(ti )
2n
involves estimating spatial derivatives using learnable con- i=1
ḣ
volutional filters. The filters are designed to adjust their order
of approximation based on the fit to the underlying measure- − e N N x(ti ); θ e
||22 (53)
ments u M , while the type of gradient is predefined18 . Thus,
1
n

the NN learns how to best approximate spatial derivatives LF = || ẋ(ti ) − ∇h d N N e N N (x(ti ); θ e ); θ d
2n
specific to the underlying data. Subsequently, the inputs of i=1 h

18 This is enforced through constraints using moment matrices of the 19 The encoder and decoder are derived with respect to their inputs to
convolutional filters. estimate the derivatives ẋ, ḣ using the chain rule.

123
Computational Mechanics (2024) 74:281–331 295

· e N N (x(ti ); θ e ) ||22 (54) tions are predicted and subsequently differentiated to ensure
conservation of mass, the incorporation of symmetries [342],
ḣ or invariances [343] by using integrity bases [344]. Dynami-
C = κA LA + κR LR + κF LF (55) cal systems have been treated by learning the Lagrangian or
Hamiltonian with correspondingly Lagrangian NNs [345–
As in Eq. (24), a weighted cost function with weights 347] and Hamiltonian NNs [348]. The quantities of interest
κA , κR , κF is employed. The reduced latent space can be are obtained through the differentiable NN and compared
exploited for forward simulations of the identified system. to labeled data. Indirectly learning the quantities of interest
By solving the system with classical time-stepping schemes through the Lagrangian or Hamiltonian guarantees the con-
in the reduced latent space, the solution is obtained in the servation of energy. Enforcing the physics by construction is
full space through the decoder, as outlined in [339]. Thus, also referred to as physics-constrained learning, as the learn-
a reduced order model of a previously unknown system is able space is constrained. Note, however, that constraining
identified. The downside is, that the model is no longer inter- the learnable space also challenges the learning algorithm,
pretable in the full space. thus potentially making convergence more difficult. There-
Multistep methods fore, [225] relaxes the requirement of fulfilling the physical
Another approach [340] to learning the system dynamics laws by introducing a secondary unconstrained network—
from Eq. (4) is to approximate the right-hand side directly acting additively on the solution—whose influence is scaled
with a NN fˆ(x i ) = O N N (x i ; θ ), x i = x(ti ). A residual can by a hyperparameter. More examples of physics enforcement
be formulated by considering linear multistep methods [328], by construction are provided in the context of simulation
a residual can be formulated. In general, these methods take enhancement in Sect. 3.2.
the form:

M
3 Simulation enhancement
[αm x n−m + tβm f (x n−m )] = 0, (56)
m=0
The category of simulation enhancement deals with any
deep learning technique that interacts directly with and, thus,
where M, α0 , α1 , β0 , β1 are parameters specific to a multi-
improves a component of a classical simulation. This is the
step scheme. The scheme can be reformulated with a cost
most diverse category and will therefore be subdivided into
function, given as:
the individual steps of a classical simulation pipeline:

1
N
C= || ŷn ||22 (57) • pre-processing
N − M +1 • physical modeling
n=M
• numerical methods

M
ŷn = [αm x n−m + tβm fˆ(x n−m )] (58) • post-processing
m=0
Both data-driven and physics-informed approaches will be
The idea of the method is strongly linked to the discrete-time discussed in the following.
PINN presented in Paragraph 2.2.2.1, where a reformulation
of the Runge-Kutta method yields the cost function needed 3.1 Pre-processing
to learn the forward solution.
The discussed pre-processing methods are trained in a super-
2.2.3 Enforcement of physics by construction vised manner relying on the techniques presented in Sect. 2.1
and on labeled data.
Up to this point, this review only considered the case where
physics are enforced indirectly through penalty terms of the 3.1.1 Data preparation
PDE residual. The only exception, and the first example of
enforcing physics by construction, was the strong enforce- Data preparation includes tasks, such as geometry extrac-
ment of boundary conditions [37, 204, 248] by modifying the tion. For instance the detection of cracks from images by
outputs of the NN—which led to a fulfillment of the bound- means of segmentation [349–351] can subsequently be used
ary conditions independent of the NN parameters. For PDEs, in simulations to assess the impact of the identified cracks.
this can be achieved by manipulating the output, such that Also, CNNs have been used to prepare voxel data obtained
the solution automatically obeys fundamental physical laws. from computed tomography scans, see [352], where scan-
Examples for this are, e.g., given in [341], where stream func- ning artifacts are removed. Similarly NNs can be employed

123
296 Computational Mechanics (2024) 74:281–331

to enhance measurement data. This was, for example, demon- is assessed with a data-driven cost function (Eq. 5) using
strated in [353], where the NN acts as a denoiser for magnetic labeled data σ M , εM . The approach is applied to a variety
signals in the scope of non-destructive testing. Similarly, low- of problems, where the key difference lies in the definition
frequency extrapolation for full waveform inversion has been of input and output quantities. The same deep learning tech-
performed using NNs [354–356]. niques from data-driven simulation substitution (Sect. 2.1)
can be employed.
3.1.2 Initialization Applications include predictions of stress from strain
[366, 367], flow stresses from temperatures, strain rates
Instead of preparing the data, the simulation can be acceler- and strains [368, 369], yield functions [370], crack opening
ated by an initialization. This can, for example, be achieved responses from stresses [371], contact stiffness from pen-
through initial guesses by NNs, providing a better starting etration and contact pressure [372], point of contact from
point for classical iterative solvers [357]20 . A tighter inte- position of neighboring nodes of finite elements [373], or
gration is achieved by using a pre-trained [279] NN ansatz control points of NURBS surfaces [374]. Source terms of
whose parameters are subsequently tweaked by the classical simplified equations or coarser discretizations have also been
solver, as demonstrated for full waveform inversion in [224]. learned for turbulence [74, 375, 376] and the wave equation
[377]. Here, the reference—a high-fidelity model—is to be
3.1.3 Meshing captured in the best possible way by the source term.
Variations also predict the quantity of interest indirectly.
Finally, many simulation techniques rely on meshes. This For example, strain energy densities ψ are predicted by
can be achieved indirectly with NNs, by prediction of NNs from deformation tensors F, and subsequently derived
mesh density functions [358–362] incorporating either expert using automatic differentiation to obtain stresses [378, 379].
knowledge of where small elements are needed, or relying The approach can also be extended to incorporate uncer-
on error estimations. Subsequently, a classical mesh genera- tainty quantification [380]. By extending the input space
tor is employed. However, NNs (specifically let-it-grow NNs with microstructural information, an in-built homogeniza-
[363]) have also been proposed directly as mesh generators tion is added to the constitutive model [381–383]. Thus, the
[364, 365]. macroscale simulation considers the microstructure at the
integration points in the sense of FE2 [384, 385], but without
3.2 Physical modeling an additional finite element computation. Incorporation of
microstructures requires a large amount of realistic training
Physical models that capture physical phenomena accurately data, which can be obtained through generative approaches as
are a core component of mechanics. Deep learning offers discussed in Sect. 5. Active learning can reduce the required
three main approaches for physical models. Firstly, a NN number of simulations on these geometries [221].
is used as the physical model directly (model substitution). A specialized NN architecture is employed by [386],
Secondly, an underlying model may be assumed where a NN where a NN first estimates invariants I of the deformation
determines its coefficients (identification of model param- tensor F and thereupon predicts the strain energy density,
eters). Lastly, the entire model can be identified by a NN thus mimicking the classical constitutive modeling approach.
(model identification). In the first approach, the NN is inte- Another network extension is the use of RNNs to learn
grated within the simulation pipeline, while the latter two history-dependent models. This was shown in [381, 382, 387,
rely on incorporation of the identified models in a classical 388] for the prediction of the stress increment from the stress-
sense. strain history, the strain energy from the strain energy history
For illustration purposes, the approaches are mostly [389], and crack patterns based on prior cracks and crystalline
explained on the example of constitutive models. Here, the orientations [390, 391].
task is to relate the strain ε to a stress σ , i.e., find a function The learned models do not, however, necessarily obey fun-
σ = f (ε). This can, for example, be used within a finite damental physical laws. Attempts to incorporate physics as
element framework to determine the element stiffness, as constraints using penalty terms have been made in [392–
elaborated in [366]. 394]. Still, physical consistency is not guaranteed. Instead,
NN architectures can be chosen such that they satisfy phys-
3.2.1 Model substitution ical requirements by construction. In constitutive modeling,
objectivity can be enforced by using only deformation invari-
In model substitution, a NN f N N replaces the model, yield- ants as input [395], and polyconvexity can be enforced
ing the prediction σ̂ = f N N (ε; θ ). The quality of the model through the architecture, such as input-convex NNs [396–
399] or neural ordinary differential equations [395, 400].
20 Here, the initial guess is incorporated through a regularization term. It was demonstrated that ensuring fundamental physical

123
Computational Mechanics (2024) 74:281–331 297

aspects such as invariants combined with polyconvexivity incorporating meso-scale information by training a NN on
delivers a much better behavior for unseen data, especially if representative volume elements [420].
the model is used in extrapolation.
Input-convex NNs [401] enforce the convexity with spe- 3.2.3 Model identification
cialized activation functions such as log-sum-exponential,
or softplus functions in combination with constraints on the NN models as a replacement of classical approaches are not
NN weights to ensure that they are positive, while neural interpretable, while only identifying model parameters of
ordinary differential equations [402] (discussed in Sect. 4) known models restricts the models capacity. This gap can
approximate the strain energy density derivatives and ensure be bridged by the identification of models in terms of parsi-
non-negative values. Alternatively, a mapping from the monious mathematical expressions.
NN to a convex function can be defined [403] ensuring The typical procedure is to pose the problem in terms
a convex function for any NN output. Related are also of candidate functions and to identify the most relevant
thermodynamics-based NNs [404, 405], e.g., applied to com- terms. The methodology was inspired by SINDy [335] and
plex microstructures in [406], which by construction obey introduced in the framework for efficient unsupervised con-
fundamental thermodynamic laws. Training of these meth- stitutive law identification and discovery (EUCLID) [421].
ods can be performed in a supervised manner, relying on The approach is unsupervised, as the stress-strain data is
stress-strain data, or unsupervised. In the unsupervised set- only indirectly available through the displacement field and
ting, the constitutive model is incorporated in a finite element corresponding reaction forces. The N I invariants Ii of the
solver, yielding a displacement field for a specific boundary deformation tensor F are inserted into a candidate library
NI
value problem. The computed field, together with measure- Q({Ii }i=1 ) containing the candidate functions. Together with
ment data, yields a residual that is referred to as the modified the corresponding weights θ , the strain density ψ is deter-
constitutive relation error (mCRE) [407–409], which is mini- mined:
mized to improve the constitutive relation [410, 411]. Instead
NI NI
of formulating the mismatch in terms of displacements, [412, ψ({Ii }i=1 ) = Q T ({Ii }i=1 )θ . (60)
413] formulate it in terms of boundary forces. For an in-depth
overview of constitutive model substitution in deep learning, Through derivation of the strain density ψ using automatic
see [32]. differentiation, the stresses σ are determined. The prob-
lem is then cast into the weak form with which the linear
3.2.2 Identification of model parameters momentum balance is enforced. The weak form is then min-
imized with respect to θ using a fixed-point iteration scheme
Identification of model parameters is achieved by assuming (inspired by [422]), where a L p -regularization is used to
an underlying model and training a NN to predict its parame- promote sparsity in θ. Despite its young age, the approach
ters for a given input. In the constitutive model example, one has already been applied to plasticity [423], viscoelastic-
might assume a linear elastic model expressed in terms of a ity [424], combinations [425], and has been extended to
constitutive tensor c, such that σ = cε. The constitutive ten- incorporate uncertainties through a Bayesian model [426].
sor can be predicted from the material distribution defined Furthermore, the approach has been extended with an ensem-
in terms of a heterogeneous elasticity modulus E defined ble of input-convex NNs [413], yielding a more accurate, but
throughout the domain less interpretable model.
A similar effort was recently carried out by [427, 428],
where NNs are designed to retain interpretability. This is
ĉ = f N N (E; θ ). (59) achieved through sparse connections in combination with
specialized activation functions representing candidate func-
Typical applications are homogenization, where effec- tions, such that they are able to capture classical forms
tive properties are predicted from the geometry and material of constitutive terms. Through the sparse connections in
distribution. Examples are CNN-based homogenizations on the network and the specialized activation functions, the
computed tomography scans [414, 415], predictions of in- NN’s weights become physical parameters, yielding an inter-
vivo constitutive parameters of aortic walls from its geometry pretable model. This is best understood by consulting Fig. 7,
[416], predictions of elastoplastic properties [417] from where the strain energy density is expressed as
instrumented indentation results relying on a multi-fidelity
ψ̂ = θ01 eθ0 I1 + θ11 ln(θ10 I1 ) + θ21 eθ2 I1
0 0 2
approach [418], prediction of stress intensity factors from
the geometry in microfabricated microcantilevers [419],
+θ21 ln(θ20 I12 ) + θ31 eθ3 I2 + θ41 ln(θ40 I2 )
0

estimation of effective bone properties from the boundary

+θ51 eθ5 I2 + θ61 ln(θ60 I22 ).
0 2
conditions and applied stresses within a finite element, and (61)

123
298 Computational Mechanics (2024) 74:281–331

∂nu (n)
≈ αi u i . (62)
∂x n
i

The coefficients αi are predicted by NNs from the current

coarse solution. Special constraints are imposed on αi to
guarantee accurate derivatives. Another application are spe-
cialized strain mappings for damage mechanics embedded
within individual finite elements learned by PINNs [432].
It has even been suggested to partially replace solvers. For
example, [433] replace either the fluid or structural solver by
a surrogate model for fluid-structure interaction problems.
Learning tunable parameters was demonstrated for the
estimation of the largest possible time step using a RNN
acting at the latent vector of an autoencoder [434]. Also,
optimal test functions for finite elements were learned to
improve stability [435]. Another approach to learning numer-
ical parameters for simulation is presented in [436], where
hyperparameters connected to a similarity-based topology
optimization are learned—specifically, an energy scaling
factor is predicted from a dissimilarity metric based on a
previous topology optimization. These approaches have in
Fig. 7 Automated model discovery through a sparsely connected NN common that they spare the user from performing multiple
with specialized activation functions acting as candidate functions. The simulations to tune the numerical parameters.
thick black connections are not learnable, while the gray ones repre-
sent linearly weighted connections. Figure adapted and simplified from
[427] 3.3.2 Multiscale methods

Multiscale methods have been proposed to efficiently inte-

Differentiating the predicted strain energy density ψ̂ with grate and resolve systems acting on multiple scales. One
respect to the invariants Ii yields the constitutive model, relat- approach are the learned constitutive models from Sect. 3.2
ing stress and strain. that incorporate the microstructure. This is essentially achieved
through a homogenization at the mesoscale used within a
macroscale simulation.
3.3 Numerical methods
A related approach is element substructuring [437, 438],
where superelements mimic the behavior of a conglomerate
This subsection describes efforts in which NNs are used
of classic basic finite elements. In [439], the superelements
to replace or enhance classical numerical schemes to solve
are enhanced by NNs, which draw on the boundary dis-
PDEs.
placements to predict the displacements and stresses within
the element as well as the reaction forces at the boundary.
3.3.1 Algorithm enhancement Through assembly of the reaction forces in the global finite
element system, an equilibrium is reached with a Newton-
Classical algorithms can be enhanced by NNs, by learning Raphson solver. Similarly, the approach in [440] learns the
corrections to commonly arising numerical errors, or by esti- internal forces from the coarse degrees of freedom of the
mating tunable parameters within the algorithm. Corrections superelements. These approaches are particularly valuable,
have, for example, been used for numerical quadrature [429] as they can seamlessly incorporate history-dependent behav-
in the context of finite elements. Therein, NNs are used to ior using RNNs.
predict adjustments to quadrature weights and positions from Finally, multiscale analysis can also be performed by first
the nodal positions to improve the accuracy for distorted ele- solving a coarse global model with a subsequent local anal-
ments. Similarly, NNs have been applied as correction for ysis. This is referred to as zooming methods. In [441], a NN
strain-displacement matrices for distorted elements [430]. learns the global model and thereby predicts the boundary
NNs have also been employed to provide improved gradient conditions for the local model. In a similar sense, DeepONets
estimates. Specifically, [431] modify the gradient computa- have been applied for the local analysis [442], whereas the
tion to match a fine scale simulation on a coarse grid: global analysis is performed with a finite element solver. Both

123
Computational Mechanics (2024) 74:281–331 299

[457, 459]. In [457, 459], an evolutionary algorithm was

employed for the general case that the sensitivites are not
readily available. Training can adaptively be reintroduced
during the optimization phase, if the cost C does not decrease
[456], improving the NN for the specific problem it is han-
dling. Taking this idea to the extreme, the NN is trained on
Fig. 8 Gradient-based optimization the initial gradient updates of a specific optimization. Later,
solely the NN delivers the sensitivities [460] with supervised
are conducted in an alternating fashion until convergence is updates every n updates to improve accuracy, where n is a
reached. hyperparameter. The ideas of learning a forward operator
and a sensitivity operator are combined in [455], where it is
3.3.3 Optimization pointed out that the sensitivity from automatic differentia-
tion through the learned forward operator can be inaccurate,
Optimization is a fundamental task within computational despite an accurate forward operator22 . Therefore, an addi-
mechanics and therefore addressed separately. It is not only tional loss term is added to the cost function, enforcing
used to find optimal structures, but also to solve inverse prob- the correctness of the sensitivity through labels obtained
lems. Generally, the task can be formulated as minimizing with the adjoint state method. Alternatively, the sensitivity
a cost function C with respect to parameters λ. In computa- computation can be enhanced by correcting the sensitiv-
tional mechanics, λ is typically fed to a forward simulation ity computation performed on a coarse grid, as proposed
u = F(λ), yielding a solution u inserted into the cost function in [461] and related to the multiscale techniques discussed
C. If the gradients ∇λ C are available, gradient-based opti- in Sect. 3.3.2. Here, the adjoint field used for the sensitivity
mization is the state-of-the-art [443], where the gradients are computation is reduced by both a proper orthogonal decom-
used to update λ. In order to access the gradients, the forward position, and a coarser discretization. Subsequently, a NN
simulation F has to be differentiable. This requirement is, for corrects the coarse estimate through a super-resolution NN
example, utilized within the branch of deep learning called [462]. Similarly, [456, 463] maps the forward solution on a
differentiable physics [36]. Incorporating gradient informa- coarse grid to the design variable sensitivity on a fine grid. A
tion from the numerical solver into the NN improves learning, similar application is a correction term within a fixed-point
feedback, and generalization. An overview and introduction iterator, as outlined in [464].
to differentiable physics is provided in [36], with applications Related to the sensitivity predictions are approaches that
in [215, 402, 431, 444–446]21 . directly predict an updated state. The goal is to decrease the
The iterative gradient-based optimization procedure is total number of iterations. In practice, a combination of pre-
illustrated in Fig. 8. For an in-depth treatment of NNs in dictions and classical gradient-based updates is performed
optimization, see the recent review [22]. [111–113, 465]. The main variations between the methods
Inserting a learned forward operator F, as those dis- in the literature are the inputs and how far the forecasting is
cussed in Sect. 2.1, into an optimization problem provides performed. In [111], the update is obtained from the current
two advantages [447–451]. Firstly, a faster forward operator state and gradient, while [113] predicts the final state from
results in faster optimization iterations. Secondly, the gra- the history of initial updates. The history is also considered in
dient computation is simplified, as automatic differentiation [112], but the prediction is performed on subpatches which
through the forward operator F is straightforward in contrast are then stitched together.
to the adjoint state method [452, 453]. Note however, that Another option of introducing NNs to the optimization
for time-stepping procedures, the computational cost might loop is to use NNs as an ansatz of λ, see, e.g., [313, 444, 466–
be greater for automatic differentiation, as shown in [313]. 474]. In the context of inverse problems [313, 444, 466–470],
Applications include full waveform inversion [313], topol- the NN acts as regularizer on a spatially varying inverse quan-
ogy optimization [454–456], and control problems [70, 72, tity λ(x) = I N N (x; θ ), providing both smoother and sharper
444]. solutions. For topology optimization with a NN parametriza-
Similarly, an operator replacing the sensitivity computation of the density function [471–474], no regularizing effect
tion can be learned [456–459]. This can be achieved in a was observed. It was however possible to obtain a greater
supervised manner with precomputed sensitivities to reduce design diversity through different initializations of the NN.
the cost C [456, 458], or by intending to maximize the Extensions using specialized NN architectures for implicit
improvement of the cost function after the gradient update
22 Although automatic differentiation in principle has a high accuracy,
21 Applications of differentiable physics vary widely and are addressed oscillations between the sampled points may lead to spurious gradients
throughout this work. with regard to the sampled points [242].

123
300 Computational Mechanics (2024) 74:281–331

representations [475–480] have been presented in the con- Further variations perform the coarse-to-fine mapping in
text of topology optimization in [481]. Furthermore, [313, a patch-based manner, where the interfaces require a special
468, 472] showed how to conduct the gradient computation treatment [493]. Another approach uses a NN to map the
without automatic differentiation through the solver F. The coarse solution to the closest fine solution stored in a database
gradient computation is split up via the chain rule: [494]. The mapping is performed on patches of the domain.
Other post-processing tasks include feature extraction.
∇θ C = ∇λ C · ∇θ λ. (63) After a topology optimization, NNs have been used to extract
basic shapes to be used in a subsequent shape optimization
[495, 496]. Another aspect that can be ensured through post-
The first gradient ∇λ C is computed with the adjoint state processing is manufacturability.
method, such that the solver can be treated as a black box. Lastly, adaptive mesh refinement falls under the category
The second gradient ∇θ λ is obtained through automatic dif- of post-processing as well. Closely related to the meshing
ferentiation. An additional advantage of the NN ansatz is approaches discussed in Sect. 3.1.3, NNs have been proposed
that, if applied to multiple solutions with a problem specific as error indicators [361, 497] that are trained in a supervised
input, the NN is trained. Thus, after sufficient inversions, manner. The error indicators can subsequently be employed
the NN can be used as predictor, as presented in [482]. The to adapt the mesh based on the error.
training can also be performed in combination with labeled
data, yielding a semi-supervised approach, as demonstrated
in [224, 483].
4 Discretizations as neural networks
3.4 Post-processing
NNs are composed of linear transformations and non-linear
Post-processing concerns the modification and interpretation functions, which are basic building blocks of most PDE dis-
of the computed solution. One motivation is to reduce the cretizations. Thus, the motivation to construct NNs utilizing
numerical error of the computed solution. This can for exam- discretizations of PDEs are twofold. Firstly, deep learning
ple be achieved with super-resolution techniques relying on techniques can hereby be exploited within classical dis-
specialized CNN architectures from computer vision [484, cretization frameworks. Secondly, novel NN architectures
485]. Coarse to fine mappings can be obtained in a super- arise, which are more tailored towards many physical prob-
vised manner using matching coarse and fine simulations as lems in computational mechanics but potentially also find
labeled data, as presented for turbulent flows [462, 486] and their use cases outside of that field.
topology optimization [487–489]. The mapping is typically
performed from coarse to fine solution fields, but mappings 4.1 Finite element method
from a posteriori errors have been proposed as well [490].
Further specialized extensions to the cost function have been One method are finite element NNs [14, 498] (see [499–504]
suggested in the context of de-homogenization [491]. for applications), for which we consider the system of equa-
The methods can analogously be applied to temporal data tions from a finite element discretization with the stiffness
where the solution is refined at each time step,—as, e.g., matrix K i j , degrees of freedom u j , and the body load bi :
presented with RNNs as corrector of reduced order models
[492]. However, coarse discretizations in dynamical models
lead to an error accumulation, that increases with the number
N

of time steps. Thus, a simple coarse-to-fine post-processing K i j u j − bi = 0, i = 1, 2, . . . , N . (64)

at each time step is not sufficient. To this end, [445, 446] j=1

apply a correction at each time step before the coarse solver

predicts the next time step. As the correction is propagated Assuming constant material properties along an element and
through the solver, the sensitivities of the solver must be uniform elements, a pre-integration of the local stiffness
computed to perform the backward propagation. Therefore, matrix kiej = α e wiej can be performed, as, e.g., shown in
a differentiable solver (i.e., differentiable physics) has to be [505]. The goal is to pull out the material coefficients of the
employed. This significantly outperforms the purely super- integration, leading to the following assembly of the global
vised approach, where the entire coarse trajectory is applied stiffness matrix:
without corrections in between. The number of steps per-
formed is a hyperparameter, which increases the accuracy

M
wiej if i, j ∈ e
but comes with a higher computational effort. This concept Ki j = α e
Wiej with Wiej = . (65)
is referred to as solver-in-the-loop. e=1
0 else

123
Computational Mechanics (2024) 74:281–331 301

displacement over the entire domain u is obtained by super-

position of all elemental displacement fields u e , which are
first multiplied by a step function defined as 1 inside the
Fig. 9 Finite element NNs, prediction of forces bi from material coef-
ficients α e via assembly of global stiffness matrix K i j , and evaluations corresponding element domain e and 0 outside.
of equations with the displacements u j [498] A forward problem is solved with a minimization of the
variational loss function, as presented in Sect. 3.2 with the
nodal values u ie as learnable weights. According to [506], this
Inserting the assembly into the system of equations from Eq. (64) is equivalent to iterative solution procedures employed for
yields large systems of equations in finite elements. The additional
M advantage is a seamless integration of r -refinement [513–
N 515] (also referred to as adaptive mesh refinement), i.e., the
α e Wiej u j − bi = 0, i = 1, 2, . . . , N . (66) shift of nodal positions to optimal positions by making the
j=1 e=1
nodal positions xie learnable. Special care has to be taken to
avoid element inversion, which is handled by an additional
The nested summation has a similar structure of a FC-
(l) (l) (l) (l−1) (l) loss term. Inverse problems can similarly be solved by using
NN, ai = σ (z i ) = σ ( Nj=1 a j + bi ), (where
(l)
learnable input parameters, as presented for topology opti-
(l) N (l−1) (l)
zi = j=1 a j + bi ) without activation σ and bias mization [512].
b (see Fig. 9): The method has been combined with reduced order model-
⎛ (1) ⎞ ing techniques [508]. Furthermore, the shape functions have
N (2)
N (2)
N been extended with convolutions [510, 511]. Specifically, a
(2) (1) (1) (1) (0) (0)
ai = Wi j a j = Wi j ⎝ W jk ak ⎠ . (67) weighting field W (x), i.e., kernel (e.g., radial basis functions)
j=1 j=1 k=1 with learnable dilation parameter23 , is introduced to enhance
the finite element space u c (x) through convolutions, thereby
Thus, the stiffness matrix K i j is the hidden layer. In a increasing the space’s expressivity and continuity:
forward problem, Wiej are non-learnable weights, while u j
contains a mixture of learnable weights and non-learnable
u c (x) = u e (x) ∗ W (x). (71)
weights coming from the imposed Dirichlet boundary con-
ditions. A loss can be formulated in terms of body load
mismatch, as 21 i=1 N
(b̂i − bi )2 . In the inverse setting, α e This introduces a smoothing effect over the elements and can
becomes learnable—instead of u j , which is then fixed. For efficiently be implemented using NNs and, thereby, obtain a
partial domain knowledge in the inverse case, u j becomes more favorable data-structure to exploit the full paralleliza-
partially learnable. tion capabilities of GPUs [511]. The enhanced space has
A different approach are the hierarchical deep-learning been incorporated in the HiDeNN framework. While an inde-
NNs (HiDeNNs) [506] with extensions in [507–512]. Here, pendent confirmation is still missing, the authors promise a
shape functions are treated as NNs constructed from basic speedup of several orders of magnitude compared to tradi-
building blocks. Consider, for example, the one-dimensional tional finite element solvers [512]24 .
linear shape functions
23 The dilation parameter depends on x and thus introduces additional
x − x2e degrees of freedom throughout the domain.
N1 (x) = (68) 24 After close examination and exchanges with the authors of [506,
x1e − x2e
508, 510–512], we have concluded that the current speed-up is mainly
x − x1e attributable to the simultaneously employed reduced order models.
N2 (x) = , (69)
x2e − x1e Minor improvements in accuracy are possible through the employed
r -adaptivity and convolutions, however, accompanied by an increase
in computational effort. Furthermore, we have been informed that the
which can be represented as a NN, as shown in Fig. 10, hiDeNN methodology is implemented most efficiently in the standard
where the weights depend on the nodal positions x1e , x2e . The finite element way, i.e., without NNs, unlike the descriptions in [506,
interpolated displacement field u e (x), which is valid in the 508, 510–512]. Thus the shape functions Nie (x) are implemented in a
element domain e , is obtained by multiplication with the straightforward manner within automatic differentiation frameworks
in order to obtain the derivatives of the solution u with respect to
nodal displacements u e1 , u e2 , treated as shared NN weights. x, and the loss function with respect to the nodal positions xie and
degrees of freedom u e . If convolutions are not employed, the sensitivi-
u e (x) = N1e (x)u e1 + N2e (x)u e2 (70) ties can be precomputed analytically, eliminating the need for automatic
differentiation. Hence, the difference to a conventional finite element
implementation is that the finite element discretization is solved with
They are shared, as the nodal displacements u e1 , u e2 are also gradient descent optimization instead of solving the system of equations
used for the neighboring elements u e−1 , u e+1 . Finally the directly. This is more expensive but allows for the flexibility of

123
302 Computational Mechanics (2024) 74:281–331

Fig. 10 HiDeNN with

one-dimensional linear elements
[506]
(a) One-dimensional linear elements with two nodes each

(b) HiDeNN

Lastly, another approach related to finite elements was

presented as FEA-net [516, 517]. Here, the matrix-vector
multiplication of the global stiffness matrix K and solution
vector u including the assembly of the global stiffness matrix Fig. 11 Segment of one-dimensional finite element mesh with degrees
is replaced by a convolution. In other words, the computation of freedom (left). Local element definition with stiffness contributions
of the force vector f is used to compute the residual r. (right)

r = f −K ·u (72) FEA-Net is able to outperform classical finite elements for

non-linear problems on uniform grids.
Assuming a uniform mesh with homogeneous material
properties, the mesh is defined by the segment illustrated 4.2 Finite difference method
in Fig. 11. The degree of freedom u j only interacts with the
stiffness contributions K i1 , K i2 , K i+1
1 , K 2 of its neighbor-
i+1 Similar ideas have been proposed for finite differences [518],
ing elements i and i + 1. Therefore, the force component f j as for example employed in [313], where convolutional ker-
acting on node j can be expressed by a convolution: nels are used as an implementation of stencils exploiting
the efficient NN libraries with GPU capabilities. Here, the
f j = [K i1 , K i2 + K i+1
1
, K i+1
2
] ∗ [U j−1 , U j , U j+1 ] (73) learnable parameters can be the finite difference stencil for
inverse problems or the output for forward problems. This
This can analogously be applied to all degrees of freedoms, has, for example, been presented in the context of full wave-
with the same convolution filter W = [K 1 , K 1 + K 2 , K 2 ], form inversion, which is modeled as a RNN [519, 520]. The
assuming the same stiffness contributions for each element. stencils are written as convolutional filters and repeatedly
applied to the current state and the corresponding inputs.
K ·u= W ∗U (74) These are the wave field, the material distribution, and the
source. The problem can then be regarded as a RNN. How-
The convolution can then be exploited in iterative schemes ever, it is computationally expensive to perform automatic
which minimize the residual r from Eq. 72). This saves the differentiation throughout the time steps for full waveform
effort of constructing and storing the global stiffness matrix. inversion, thereby obtaining the sensitivities with respect to
By constructing the filter W as a function of the material prop- γ —both regarding memory and wall clock computational
erties of the adjacent elements, heterogeneities can be taken time. A remedy is to combine automatic differentiation with
into account [517]. If the same iterative solver is employed, the adjoint state method as in [313, 468, 472] and discussed
in Sect. 3.3.3.
Footnote 24 continued
Taking this idea one step further, the discretized wave
seamlessly introducing r -adaptivity or convolutions on top of the ansatz equation can be regarded as an analog RNN [521] where the
space. weights are the material distribution. Here, a binary material

123
Computational Mechanics (2024) 74:281–331 303

Fig. 12 Analog RNN

Here, f is the evaluation of one recurrent unit in the RNN.

In the limit of the time step size lim t → 0, the dynamics
of the hidden units yt can be parametrized by an ordinary
differential equation

Fig. 13 A single building block of the deep material network [522]

dy(t)
= f (y(t), t; θ ) (76)
dt
is learned in a trainable region between source and probing
location. The input x(t) is encoded as a signal and emitted The input to the network is the initial condition y(0), and
as source, which is measured at the probing locations yi (t) the output is the solution y(T ) at time T . The output of the
as output. By integrating the outputs, a classification of the NN, y(T ), is obtained by solving Eq. 76 with a differential
input can be performed. equation solver. The sensitivity computation for the weight
update is obtained using the adjoint state method [453, 527],
4.3 Material discretizations as backpropagating through each time step of the solver leads
to a high memory cost. This also makes it possible to treat the
Deep material networks [522, 523] construct a NN from a solver as a black box. Similar extensions to PDEs [525] have
material distribution. An output is constructed from basic been proposed by considering recurrent CNNs with residual
building blocks, inspired by analytical homogenization tech- connections, where the CNNs act as spatial gradients.
niques. Given two materials defined in terms of their compli- Similarly, [528] establish a connection between deep
ance tensors c1 , c2 , and volume fractions f 1 , f 2 , an analytical residual RNNs and iterative solvers. Residual connections
effective compliance tensor c̄ is computed. The effective ten- in NNs allow information to bypass NN layers. Consider the
sor is subsequently rotated with a rotation tensor R, defined in estimation of the next state of a PDE with a classical solver
terms of the three rotation angles α, β, γ , yielding a rotated u t+1 = u(t + t) = F[u(t)]. The residual rt+1 = r (t + t)
effective tensor c̄r . Thus, the building block takes as input is determined in terms of the ground truth u Mt+1 :
two compliance tensors c1 , c2 and outputs a rotated effective
compliance tensor c̄r , where f 1 , f 2 , α, β, γ are the learn- rt+1 = u M
t+1 − u t+1 . (77)
able parameters (see Fig. 13). By connecting these building
blocks, a large network can be created. The network is applied
An iterative correction scheme is formulated with a NN. The
to homogenization tasks of RVEs [522, 523], where the mate-
iterations are indicated with the superindex (k).
rial of the phases is varied during evaluation.
(k+1) (k) (k+1)
4.4 Neural differential equations u t+1 = u t+1 + f N N (rt+1 ; θ ) (78)
(k+1) (k)
rt+1 = uM
t+1 − u t+1 (79)
In a more general setting, neural ordinary differential equa-
tions [402] consider the forward Euler discretization of ordi- (k)
nary differential equations. Specifically, RNNs are viewed Note that the residual connection, i.e., u t+1 as directly used
(k+1)
as Euler discretizations of continuous transformations [524– in the prediction of u t+1 , allows information to pass past
526]. Consider the iterative update rule of the hidden states the recurrent unit f N N . A related approach can be found in
yt+1 = y(t + t) of a RNN. [529], where an autoencoder iteratively acts on a solution
until convergence. In the first iteration, a random initial solu-
yt+1 = yt + f (yt ; θ ) (75) tion is used as input.

123
304 Computational Mechanics (2024) 74:281–331

5 Generative approaches is presented in [547] to reconstruct three-dimensional voxel

models of porous media from two-dimensional images.
Generative approaches (see [33] for an in-depth review in The generated data sets can subsequently be leveraged to
the field of design and [530] for a hands-on textbook) aim train surrogate models, as demonstrated in [536, 548–550]
to model the underlying probability distribution of a data set where CNNs were used to verify the physical properties of
to generate new data that resembles the training data. Three designs, and in the study by [551] on the homogenization
main methodologies exist: of microstructures with CNNs. Similarly, [93, 552] generate
realistic material distributions, such as velocity distributions,
• autoencoders, to train an inverse operator for full waveform inversion.
• generative adversarial networks (GANs),
• diffusion models, 5.1.2 Generative design and design optimization

and are described in detail in Appendix B. Currently, there Within generative design, the generator can also be consid-
are two prominent areas of application in computational ered as a reparametrization of the design space that reduces
mechanics. One area of focus is microstructure generation the number of design variables. With autoencoders, the latent
(Sect. 5.1.1), which aims to produce a sufficient quantity vector serves as the design parameter [553, 554], which
of realistic training data for surrogate models, as described is then optimized25 . Similarly, [556] find that point cloud
in Sect. 2.1. The second key application area is generative autoencoders [117, 557, 558] are advantageous as geometric
design (Sect. 5.1.2), which relies on algorithms to efficiently dimensionality reduction tools (potentially combined with
explore the design space within the constraints established performance features) for efficiently exploring the design
by the designer. space. In the context of GANs, the optimization task is
aimed at the random input ξ provided to the generator. This
5.1 Applications approach is demonstrated in various studies, such as ship
hull design parameterized by NURBS surfaces [559], airfoil
5.1.1 Data generation shapes expressed with Bézier curves [560, 561], structural
optimization [562], and full waveform inversion [563]. For
The most straightforward application of variational autoen- optimization, variational autoencoder GANs are particularly
coders and GANs in computational mechanics is the gen- important, as the GAN ensures high quality designs, while
eration of new data, based on existing examples. This has the autoencoder ensures well-behaving gradients. This was
been demonstrated in [531–535] for microstructures in [93] shown for microstructure optimization in [564].
for velocity models used in full waveform inversion, and An important requirement for generative design is design
in [536] for optimized structures using GANs. Variational diversity. Achieving this involves ensuring that the entire
autoencoders have also been used to model the crossover design space is spanned by the generated data. For this, the
operation in evolutionary algorithms to create new designs cost function can be extended, as presented in [565], using
from parent designs [537]. Applications of diffusion models determinantal point processes [566] or in [559] with a space-
for microstructure generation can be found in [538–540]. filling term [567].
Microstructures pose a unique challenge due to their Other strategies are specifically focused on promoting
inherent three-dimensional nature, while often only two- design diversity. This involves identifying novel designs via
dimensional reference images are available. This has led to a novelty score [568]. The novelty within these designs is
the development of specialized architectures that are capable segmented and used to modify the GAN using methods out-
of creating three-dimensional structures from representative lined in [569]. An alternative approach proposed by [570]
two-dimensional slices [541–543]. The approach typically quantifies creativity and maximizes it. This is achieved by
involves treating three-dimensional voxel data as a sequence performing a classification in pre-determined categories by
of two-dimensional slices of pixels. Sequences of images are the discriminator. If the classification is unsuccessful, the
predicted from individual slices, ultimately forming a three- design must lie outside the categories and is therefore deemed
dimensional microstructure. In [544], a RNN is applied to creative. Thus the generator then seeks to minimize the clas-
a two-dimensional reference image, yielding an additional sification accuracy.
dimension, and consequently creating a three-dimensional
structure. The RNN is applied at the latent vector inside
25 It is worth noting, that to ensure designs that are physically mean-
an encoder decoder architecture, such that the inputs and
ingful, a style transfer technique can be implemented [555]. Here, the
outputs of the RNN have a relatively small size. Similarly, training data is perceived as a style, and the Gram matrices’ differ-
[545, 546] apply a transformer [172] to the latent vector. An ence, characterizing the distribution of visual patterns or textures in the
alternative formulation using variational autoencoder GANs generated designs, is minimized.

123
Computational Mechanics (2024) 74:281–331 305

However, some applications necessitate a resemblance velocity distributions when they are transformed from seis-
to prior designs due to factors such as aesthetics [571] or mogram to velocity distribution and back again.
manufacturability [572]. In [571], a pixel-wise L 1 -distance Lastly, coarse-to-fine mappings as previously discussed
to previous designs is included in the loss26 . A complete in Sect. 3.4, can also be learned by GANs. This was, for
workflow with generative design enforcing resemblance of example, demonstrated in topology optimization, where a
previous designs and surrogate model training for the quan- conditional GAN refines coarse designs obtained from clas-
tification of mechanical properties is described in [573]. sical optimizations [579, 586] or CNN predictions [102].
Another option is the use of style transfer techniques [555], For temporal problems, such as fluid flows, the temporal
which in [574] is incorporated into a conventional topology coherence between time steps poses an additional challenge.
optimization scheme [575] as a constraint in the loss. These Temporal coherence can be ensured by a second discrimi-
are tools with the purpose of incorporating vague constraints nator, which receives three consecutive frames of either the
based on previous designs for topology optimization. generator or the real data and decides if they are real or gen-
GANs can also be applied to inverse problems, as pre- erated. The method is referred to as tempoGAN [587].
sented in [576] for full waveform inversion. The generator
predicts the material distribution, which is used in a differen- 5.1.4 Anomaly detection
tiable simulation providing the forward solution in the form
of a seismogram. The discriminator attempts to distinguish Finally, a last application of generative models is anomaly
between the seismogram indirectly coming from the genera- detection, see [588] for a review. This is particularly valu-
tor and the measured seismograms. The underlying material able for non-destructive testing, where flawed specimens can
distribution is determined through gradient descent. be identified in terms of anomalies. The approach relies on
generative models and attempts to reconstruct the geometry.
5.1.3 Conditional generation At first, the generative model is trained on structures with-
out flaws. During evaluation, the structures to be tested are
As stated earlier, GANs can take specific inputs to dictate the then fed through the NN. In case of an autoencoder, as in
output’s nature. The key difference to data-driven surrogate [589], it is fed through the encoder and decoder. For a GAN,
models from Sect. 2.1 is that GANs provide a tool to generate as discussed, e.g., in [590–592], the input of the generator
multiple outputs given the same conditional input. They are is optimized to fit the output as well as possible. The mis-
thus applicable to problems with multiple solutions, such as match in reconstruction then provides a spatially dependent
design optimization or data generation. measure of where an anomaly, i.e., defect is located.
Examples of conditional generation are rendered cars Another approach is to use the discriminator directly, as
from car sketches [577], hierarchical shape generation [578], presented in [593]. If a flawed specimen is given to the dis-
where the child shape considers its parent shape and topol- criminator, it will be categorized as fake, as it was not part
ogy optimization with predictions of optimal structures from of the undamaged structures during training. The discrimi-
initial fields, e.g., strain energy, of the unoptimized structure nator can also be used to check if the domain of application
[579, 580]. Physical properties can also be used as input. of a surrogate model is valid. Trained on the same training
The properties are computed by a differentiable solver after data as the surrogate model, the discriminator estimates the
generation and are incorporated in the loss. This was, e.g., dissimilarity between the data to be tested and the training
presented in [581] for airplane shapes, and in [582] for inverse data. For large discrepancies, the discriminator detects that
homogenization. For full waveform inversion, [583] trains the surrogate model becomes invalid.27
a conditional GAN with seismograms as input to predict
the corresponding velocity distributions. A similar effort is
made by [584] with CycleGANs [585] to circumvent the need 6 Deep reinforcement learning
for paired data. Here, one generator generates a seismogram
ŷ = G y (x) and another a corresponding velocity distribution In reinforcement learning, an agent interacts with an environ-
x̂ = G x (y). The predictions are judged by two separate dis- ment through a sequence of actions at , which is illustrated
criminators. Additionally, a cycle-consistency loss ensures in Fig. 14. Upon executing an action at , the agent receives
that a prediction from a prediction, i.e., G y (x̂) or G x ( ŷ), an updated state st+1 and reward rt+1 from the environment.
matches the initial input x or y. This cycle-consistency loss The agent’s objective is to maximize the cumulative reward
ensures, that the learned transformations preserve the essen- R . The environment can be treated as a black box. This
tial features and structures of the original seismograms or presents an advantage in computational mechanics when dif-

27 Note however, that the discriminator does not guarantee an accurate

26 Similarly, this loss can be used to filter out designs that are too similar. assessment of the validity of the surrogate model.

123
306 Computational Mechanics (2024) 74:281–331

in order to best match simulation and measurement [608],

which yields an interpretable law.
Topology optimization has also been tackled by reinforce-
ment learning. Specifically, the ability to predict only binary
states (material or no material) is desirable—instead of inter-
mediate states, as in solid isotropic material with penalization
Fig. 14 Reinforcement learning in which an agent interacts with an [609, 610]. This has been shown with binary truss structures,
environment with actions at , states st , and rewards rt . Figure adapted
from [45] modeled with graphs in order to minimize the total structural
volume under stress constraints. In [611], an agent removes
trusses from existing structures, and trusses are added in
ferentiable physics are not feasible (as for example in crash [612]. Similarly, [613] removes finite elements in solid struc-
simulations [594]). Reinforcement learning has achieved tures to modify the topology. Instead, [614] pursues design
impressive results such as human-level performance in games diversity. Here a NN surrogate model predicts near optimal
like Atari [20], Go [595], and StarCraft II [596]. Further, structures from reference designs. The agent then learns to
reinforcement learning has successfully been demonstrated generate reference designs as input, such that the correspond-
in robotics [597]. An example hereof is learning complex ing optimal structures are as diverse as possible.
maneuvers for autonomous helicopter flight [598–600]. Also, high-dimensional PDEs have been solved with rein-
A comprehensive review of reinforcement learning exceeds forcement learning [615, 616]. This is achieved by recasting
the scope of this work, since it represents a major branch of the PDE into stochastic control problems, thereby solving
machine learning. An introduction is, e.g., given in [25, 38], these with reinforcement learning.
and an in-depth textbook is [45]. However, at the intersec- Finally, adaptive mesh refinement algorithms have been
tion of these domains lies deep reinforcement learning, which learned by reinforcement learning [617]. An agent decides
employs NNs to model the agent’s actions. In Appendix C, we whether an element is to be refined based on the current
present the main concepts of deep reinforcement learning and state, i.e., the mesh and solution. The reward is subsequently
delve into two prominent methodologies: deep policy net- defined in terms of the error reduction, which is computed
works (Appendix C.1) and deep Q-learning (Appendix C.2) with a ground truth solution. The trained agent can thus be
in view of applications in computational mechanics. applied to adaptive mesh refinement to previously unseen
simulations.

6.1 Applications
6.1.1 Extensions
Deep reinforcement learning is mainly used for inverse prob-
lems (see [25] for a review within fluid mechanics), where Each interaction with the environment requires solving the
the PDE solver is treated as a black box, and assumed to not differential equation, which, due to the many interactions,
be differentiable. makes reinforcement learning expensive. The learning can be
The most prominent application are control problems. One accelerated through some basic modifications. The learning
example is discovering swimming strategies for fish—with can be perfectly parallelized by using multiple environments
the goal of efficiently minimizing the distance to a leader fish simultaneously [618], or by using multiple agents within
[601, 602]. The environment is given by the Navier Stokes the same environment [619]. Another idea is to construct
equation. Another example is balancing rigid bodies with a surrogate model of the environment and thereby exploit
fluid jets while using as little force as possible [603]. Simi- model-based approaches [620–623]. The general procedure
larly, [604] control jets in order to reduce the drag around a consists of three steps:
cylinder. Reducing the drag around a cylinder is also achieved
by controlling small rotating cylinders in the wake of the flow
[605]. A more complex example is controlling unmanned • model learning: learn surrogate of environment,
aerial vehicles [606]. The control schemes are learned by • behavior learning: learn policy or value function,
interacting with simulations and, subsequently, applied in • environment interaction: apply learned policy and collect
experiments. data.
Further applications in connection with inverse problems
are learning filters to perturb flows in order to match target
flows [607]. Also, constitutive laws can be identified. The Most approaches construct the surrogate with data-driven
individual arithmetic manipulations within a constitutive law modeling (Sect. 2.1), but physics-informed approaches have
can be represented as graphs. An agent constructs the graph been proposed as well [620, 622] (Sect. 3.2).

123
Computational Mechanics (2024) 74:281–331 307

7 Conclusion and outlook tutive laws, which are inherently phenomenological and
thereby well-suited to be identified from data using tools
In order to structure the state-of-the-art, an overview of the such as deep learning. In addition, simulation enhance-
most prominent deep learning methods employed in compu- ment, makes it possible to draw on insights gained from
tational mechanics was presented. Five main categories were classical methods developed since the inception of com-
identified: simulation substitution, simulation enhancement, putational mechanics. Furthermore, it is currently more
discretizations as NNs, generative approaches, and deep rein- realistic to learn smaller components of the simulation
forcement learning. chain with NNs rather than the entire model. These com-
Despite the variety and abundance of the literature, few ponents should ideally be expensive and have limited
approaches are competitive in comparison to classical meth- requirements regarding accuracy and reliability. Lastly,
ods. This manifests itself in the lack of comparisons in it is also easier to assess whether a method enhanced by
the literature of NN-based methods to classical methods. deep learning outperforms the classical method, as direct
We have found little evidence that NN-based methods truly and fair comparisons are readily possible.
outperform classical methods in computational mechanics. • An interesting research direction is to employ discretiza-
However, with only few exceptions, current research is still tions as NNs, as this offers the potential to discover NNs
in its early stages, with a focus on showcasing possibili- tailored to computational mechanics tasks, such as CNNs
ties without focusing too much attention on accuracy and for computer vision or RNNs and transformers for natural
efficiency. Future research must, nevertheless, shift its focus language processing. In computational mechanics, their
to incorporate more in-depth investigations into the perfor- main benefit seems to stem from being able to exploit
mance of the developed methods—including thorough and the computational benefits of tools and hardware that
meaningful comparisons to performant classical methods were created for the wider community of deep learning—
dedicated to the task under investigation. This is in agreement such as NN libraries programmed for GPUs which enable
with the recent review article on deep learning in topology an efficient, yet effortless massive parallelization. In our
optimization [22], where critical and fair assessments are assessment, none of the methods encountered in this
requested. This includes the determination of generalization review were shown to be able to consistently outperform
capabilities, greater transparency by including, e.g., worst classical approaches using a comparable amount of com-
case performances to illustrate reliability, and computation putational resources.
times without disregarding the training time. • Generative approaches have been shown to be highly ver-
In line with this, and to the best of our knowledge, we pro- satile in applications of computational mechanics since
vide a final overview outlining the potentials and limitations the accuracy of a specific instance under investigation
of the discussed methods. is less of a concern here. They have been used to gen-
erate statistically equivalent data to train other machine
• Simulation substitution has potential for surrogate mod- learning models, to incorporate vague constraints based
eling of parameterized models that need to be evaluated on data within optimization frameworks, and to detect
many times. However, currently this is only realizable anomalies.
for small parameter spaces, due to the amount of data • Deep reinforcement learning has already shown encour-
required and unlikely to replace established methods, aging results—for example in controlling unmanned
as also stated in [42]. Complex problems can still be vehicles in complex physics environments. It is mainly
tackled by NN surrogates if they are first reduced to a applicable for problems where efficient differentiable
low-dimensional space through model order reduction physics solvers are unavailable, which is why it is popu-
techniques. Physics-informed learning further reduces lar in control problems for turbulence. In the presence of
the amount of required data and improves the general- differentiable solvers, gradient-based methods are, how-
ization capabilities. However, enforcing physics through ever, still the state-of-the-art [443] and, thus, preferred.
penalty terms increases the computational effort, where
the solutions still do not necessarily satisfy the corre-
sponding physical laws. Instead, enforcing physical laws
by construction guarantees that they are obeyed, which
is more favorable to adding constraints through penalty
terms.
• Simulation enhancement is currently one of the most
Acknowledgements The authors gratefully acknowledge the funding
promising areas of investigation. It is in particular bene- through the joint research project Geothermal-Alliance Bavaria (GAB)
ficial for tasks where classical methods show difficulties. by the Bavarian State Ministry of Science and the Arts (StMWK) as well
An excellent example for this is the formulation of consti- as the Georg Nemetschek Institut (GNI) under the project DeepMonitor.

123
308 Computational Mechanics (2024) 74:281–331

Funding Open Access funding enabled and organized by Projekt In a directed graph28 , each edge ei has a sender node vis
DEAL. and a receiver node vir . This enables the formulation of an
algorithm operating first on the edges, and subsequently on
Declarations the nodes, as summarized in Algorithm 1 for a single graph
block. These graph blocks can be stacked similarly to layers
Conflict of interest No potential conflict of interest was reported by the
authors. in other NN architectures.

Open Access This article is licensed under a Creative Commons

Attribution 4.0 International License, which permits use, sharing, adap-
tation, distribution and reproduction in any medium or format, as
Algorithm 1 Graph block in a message passing NN. Nodes
long as you give appropriate credit to the original author(s) and the v, edges e, and global attributes u are updated with conven-
source, provide a link to the Creative Commons licence, and indi- tional FC-NNs f v , f e , f u . To incorporate all the relevant
cate if changes were made. The images or other third party material information with a variable size as input to these FC-NNs,
in this article are included in the article’s Creative Commons licence,
unless indicated otherwise in a credit line to the material. If material
aggregation operations are performed ρ e→v , ρ e→u , ρ e→v .
is not included in the article’s Creative Commons licence and your These can for example be a summation or taking the max-
intended use is not permitted by statutory regulation or exceeds the imum. These are thus related to the pooling operation of
permitted use, you will need to obtain permission directly from the copy- CNNs.
right holder. To view a copy of this licence, visit http://creativecomm Require: Graph consisting of nodes v, edges e and a global attribute
ons.org/licenses/by/4.0/. u.
for all edges ei do
Update edges: ei = f e (ei , vis , vir , u; θ e )
end for
A Advanced neural network architectures for all nodes vi do
Find all edges connecting to node vi : ei
Aggregate adjacent edges: ēi = ρ e→v (ei )
A.1 Convolutional neural networks Update nodes: vi = f v (ēi , vi , u; θ v )
end for
CNNs [55–57] leverage the translation-invariant properties Aggregate all edges e: ē = ρ e→u (e)
of physical objects. Unlike FC-NNs, CNNs process struc- Aggregate all nodes v: v̄ = ρ v→u (v)
Update global attribute: u = f u (ē , v̄ , u; θ u )
tured data arranged in grids, such as two-dimensional images
or three-dimensional voxel data. On this data, specialized
convolutional layers (confer Fig. 15a) are applied. These lay-
ers utilize a set of trainable kernels to extract features such
as edges, textures, and shapes. In addition, pooling layers
(as illustrated in Fig. 15b) are employed for downsampling,
A.3 Recurrent neural networks
thereby reducing spatial dimensions, which not only cap-
tures essential information but also reduces computational
RNNs [58–60] harness sequential dependencies and tempo-
complexity. CNNs excel in learning hierarchical feature rep-
ral information within data. In contrast to FC-NNs, RNNs
resentations, rendering them highly effective in tasks like
are designed to handle sequential or time-series data, where
object classification, and image segmentation [624].
each input xt is not treated in isolation but rather as part of
a sequence (see Fig. 16). RNNs maintain an internal hidden
A.2 Graph neural networks state h t that evolves as new inputs are processed, allowing
them to capture context and relationships across time steps.
GNNs [61–63] are tailored to data structured as graphs, where This makes RNNs particularly well-suited for tasks like natu-
entities and their connections are represented as nodes v and ral language processing [626, 627], speech recognition [628],
edges e, respectively. Unlike other NNs, GNNs operate on and other time-series forecasting tasks. The recurrent nature
non-Euclidean data with irregular structures, making them of RNNs enables them to model dynamic patterns and depen-
suitable for a wide range of applications, including citation dencies, making them a valuable tool in various applications
networks [61], molecule analysis [625], and even mesh-based that involve sequential data analysis. Modern variations and
simulation [114, 115]. GNNs propagate information through alternatives hereof are LSTMs [59], GRUs [180], and trans-
the graph by iteratively aggregating and updating features formers [172].
from neighboring nodes. An example of this is the message
passing NN [63], where the following invariant is exploited:

28 An undirected graph can also be considered by treating it as a bi-

ei
vis −−−−−→ vir (80) directional graph.

123
Computational Mechanics (2024) 74:281–331 309

Fig. 15 The fundamental operations within CNNs: in the top the convolution operation and in the bottom the pooling operation. Adapted from [37]

Fig. 16 An unrolled RNN. The

recurrent unit is iteratively
applied to a varying input xt and
a continuously evolving hidden
state h t , yielding the prediction
ŷt and next hidden state h t+1 .
Adapted from [37, 629]

B Generative approaches B.2 Generative adversarial networks

B.1 Autoencoders GANs [19] emulate data distributions by setting up a two-

player adversarial game between two NNs:
Autoencoders [630–633] facilitate data generation by map-
ping high-dimensional training data {x i }i=1N to a lower-
dimensional latent space {hi }i=1 which can be sampled
N
• the generator G N N ,
efficiently. Specifically, an encoder ĥ = E N N (x; θ e ) trans-
• the discriminator D N N .
forms an input sample x to a reduced latent vector ĥ. A
corresponding decoder x̂ = D N N ( ĥ; θ d ) reconstructs the
original sample x from this latent vector ĥ. As mentioned
in Paragraph 2.1.1.3, the encoder can serve as a tool for The generator creates predictions ŷ = G N N (ξ ; θ G ) from
dimensionality reduction, whereas the decoder, within the random noise ξ , while the discriminator attempts to dis-
scope of generative approaches, operates as a generator. By tinguish between these generated predictions ŷ from real
emulating the probability distribution of the latent space data yM . The discriminator assigns a probability score
{ ĥi }i=1
N , variational autoencoders [17, 18] are able to gen-
p̂ = D N N ( y; θ D ) which evaluates the likelihood of a data
erate new data that resembles the training data. point y being real or generated. The quality of both the gener-
ator and the discriminator can be expressed via the following

123
310 Computational Mechanics (2024) 74:281–331

cost function: latent space, where proper sampling leads to smooth interpo-
lations in the generated space. In other words, small changes
1
ND in the latent space correspond to small changes in the gen-
C= log D N N ( yi ; θ D ) erated space—a characteristic not inherent to GANs. To
ND
i=1
achieve smooth interpolations, autoencoders can be com-
1
NG
bined with GANs [641], where the autoencoder acts as
+ log 1 − D N N G N N (ξ i ; θ G ); θ D . (81) generator in the GAN framework, employing both an autoen-
NG
i=1
coder loss and a GAN loss.
Here, N D real samples and N G generated samples are used
for training. The goal for the generator is to minimize the B.3 Diffusion models
cost function, implying that the discriminator fails to dis-
tinguish between real and generated samples. However, the Diffusion models enhanced by NNs [642–644] convert ran-
discriminator strives to maximize the cost. Therefore, this is dom noise x into a sample resembling the training data
formulated as a minimax optimization problem through a series of transformations. Given a data set { yi0 }i=1
N

that corresponds to the distribution q(x ), a forward noising

min max C. (82) process q(x t |x t−1 ) is introduced. This process adds Gaus-
θG θD sian noise to x t−1 at each time step t − 1. The process is
applied iteratively
Convergence is ideally reached at the Nash equilibrium
[634], where the discriminator always outputs a probabil-
T
ity of 1/2, signifying its inability to distinguish between real q(x 0 , x 1 , . . . , x T ) = q(x 0 ) q(x t |x t−1 ). (83)
and generated samples. However, GANs can be challenging t=1
to train. Problems like mode collapse [635] can arise. Here,
the generator learns only a few modes from the training data. After a sufficient number of iterations T , the resulting distri-
In the extreme case, only a single sample from the train- bution approximates a Gaussian distribution. Consequently,
ing data is learned, yielding a low discriminator score, yet a random sample from a Gaussian distribution x T can be
an undesirable outcome. To combat mode collapse, design denoised with the reverse denoising process q(x t−1 |x t ),
diversity can be either promoted in the learning algorithm resulting in a sample x 0 that matches the original distribution
or the cost [635, 636]. Another challenge lies in balancing q(x 0 ). The reverse denoising process is, however, unknown
the training of the two NNs. If the discriminator learns too and therefore modeled as a Gaussian distribution, where the
quickly and manages to distinguish all generated samples, mean and covariance are learned by a NN. With the learned
the gradient of the cost function (Eq. 81) with respect to the denoising process, data can be generated by denoising sam-
weights becomes zero, halting further progress. A possible ples drawn from a Gaussian distribution. Note the similarity
remedy is to use the Wasserstein distance in the cost function to autoencoders. Instead of learning a mapping to a hidden
[637]. random state hi , the encoding is prescribed as the iterative
Additionally, GANs can be modified to include inputs application of Gaussian noise [530].
that control the generated data. This can be achieved in a A related approach are normalizing flows [645] (see [646]
supervised manner with conditional GANs [638]. The con- for an introduction and extensive review). Here, a basic
ditional GAN does not just receive random noise, but also an probability distribution is transformed through a series of
additional input. This supplementary input is considered by invertible transformations, i.e., flows. The goal is to model
the discriminator, which assesses whether the input-output distributions of interest. The individual transformations can
pairs are real or generated. An unsupervised alternative are be modeled by NNs. A normalization is required, such that
InfoGANs [639], which disentangle the input information, each intermediate probability distribution integrates to one.
i.e., the random input ξ , defining the generated data. This is
achieved by introducing an additional parameter c, a latent
code to the generator G N N (ξ, c; θ G ). To ensure that the C Deep reinforcement learning
parameter is used by the NN, the cost (Eq. 81)) is extended by
a mutual information term [640] I (c, G N N (x, c; θ G )) ensur- In reinforcement learning, the environment is commonly
ing that the generated data varies meaningfully based on the modeled as a Markov Decision Process (MDP). This math-
input latent code c. ematical model is defined by a set of all possible states S,
In comparison to variational autoencoders, GANs typi- actions A, and associated rewards R. Furthermore, the prob-
cally generate higher quality data. However, the advantage of ability of getting to the next state st+1 from the previous
autoencoders lies in their ability to construct a well-structured st with action at is given by P(st+1 |st , at ). Thus, the envi-

123
Computational Mechanics (2024) 74:281–331 311

ronment is not necessarily deterministic. One key aspect of and a critic that judges its quality. Both can be modeled by
a Markov Decision Process is the Markov property, stating NNs.
that future states depend solely on the current state and action,
and not the history of states and actions. C.1 Deep policy networks
The goal of a reinforcement learning algorithm is to deter-
mine a policy π(s, a) which dictates the next action at in In deep policy networks, the policy, i.e., the mapping of states
order to maximize the cumulative reward R . The cumula- to actions, is modeled by a NN â = π(s; θ ). The quality of
tive reward R is discounted by a discount factor γ t in order the NN is assessed by the expected cumulative reward R ,
to give more importance to immediate rewards. formulated in terms of the action-value function Q(s, a).
∞

C = R = E Q(s, a) (89)
R = γ rt
t
(84)
t=0 Its gradient (see [38, 658, 660] for a derivation), given as
The quality of a policy π(s, a) can be assessed by a state-
∇θ R = E Q(s, a)∇θ log π(s, a; θ ) , (90)
value function Vπ (s), defined as the expected future reward
given the current state s and following the policy π . Similarly, can be applied within a gradient ascent scheme to learn the
an action-value function Q π (s) determines the expected optimal policy.
future reward given the current state s and action a, while sub-
sequently following the policy π . The expected value along C.2 Deep Q-learning
a policy π is denoted as Eπ .
Deep Q-learning identifies the optimal action-value func-
Vπ (s) = Eπ R (t)|s (85)
tion Q(s, a) from which the optimal policy is extracted.
Q π (s, a) = Eπ R (t)|s, a (86) Q-Learning relies on the Bellman optimality criterion [666,
667]. By separating the reward r0 at the first step, the recur-
The optimal value and quality function correspondingly fol- sion formula of the optimal state-value function, i.e., the
low the optimal policy: Bellman optimality criterion, can be established:

V (s) = max Vπ (s), (87) ∞

π
V (s) = max Eπ γ t rt |s0 = s (91)
Q(s, a) = max Q π (s). (88) π
π t=0
∞

The approaches can be subdivided into model-based and = max Eπ r0 + γ t rt |s1 = s (92)
π
model-free. Model-based methods incorporate a model of t=1

the environment. In the most general sense, a probabilis- = max Eπ r0 + γ V (s ) . (93)
π
tic environment, this entitles the probability distribution
of the next state P(st+1 |st , at ) and of the next reward Here, s represents the next state after s. This can be done
R(rt+1 |st+1 , st , at ). The model of the environment can be analogously for the action-value function.
cheaply sampled to improve the policy π with model-free

reinforcement learning techniques [647–650] discussed in Q(s, a) = max Eπ r0 + γ Q(s , a ) (94)
π
the sequel (Appendices C.1 and C.2). However, if the model
is differentiable, the gradient of the reward can directly be The recursion enables an update formula, referred to as tem-
used to update the policy [651–656]. This is identical to the poral difference (TD) learning [668, 669]. Specifically, the
optimization through differentiable physics solvers discussed current estimate Q (m) at state st is compared to the more
in Sect. 3.3.3. Model-free reinforcement learning techniques accurate estimate at the next state st+1 using the obtained
can be used to enhance the optimization. reward rt , referred to as the TD target estimate. The differ-
A further distinction is made between policy-based [657– ence is the TD error, which in combination with a learning
661] and value-based [662–664] approaches. Policy-based rate α is used to update the function Q (m) :
methods, such as deep policy networks [38] (Appendix C.1),
directly optimize the policy. By contrast, value-based meth- Q (m+1) (st , at ) = Q (m) (st , at )
ods, such as deep Q-learning [664] (Appendix C.2) learn the TD error

value function from which the optimal policy is selected.
+ α rt + γ max Q(st+1 , a) − Q (m) (st , at ) . (95)
Actor-critic methods, such as proximal policy optimization a
model prediction
[665] combine the ideas with an actor that performs a policy TD target estimate

123
312 Computational Mechanics (2024) 74:281–331

Here, the TD target estimate only looks one step ahead—and 12. Tomonari F, Genki Y (1998) Implicit constitutive modelling for
is therefore referred to as TD(0). The generalization is called viscoplasticity using neural networks. Int J Numer Methods Eng
43(2):195–219
TD(N). In the limit N → ∞, the method is equivalent to 13. Okuda H, Yoshimura S, Yagawa G, Matsuda A (1998) Neural
Monte Carlo learning, where all steps are performed and a network-based parameter estimation for non-linear finite element
true target is obtained. analyses. Eng Comput 15(1):103–138. https://doi.org/10.1108/
Deep Q-learning introduces a NN for the action-value 02644409810200721
14. Jun T, Yukio K (1994) Neural network representation of finite
function Q(s, a; θ ). Its quality is assessed with a loss com- element method. Neural Netw 7(2):389–395. https://doi.org/10.
posed of the mean squared error of the TD error. 1016/0893-6080(94)90031-0
15. Yagawa G, Okuda H (1996) Finite element solutions with feed-
1 2 back network mechanism through direct minimization of energy
C =E rt + γ max Q(st+1 , a; θ ) − Q(st , at ; θ ) (96) functionals. Int J Numer Methods Eng 39(5):867–883
2 a 16. Topping BHV, Khan AI, Bahreininejad A (1997) Parallel train-
ing of neural networks for finite element mesh decomposition.
Comput Struct 63(4):693–707. https://doi.org/10.1016/S0045-
Lastly, the optimal policy π(s) maximizing the action- 7949(96)00082-X
value function Q(s, a; θ ) is extracted: 17. Rezende DJ, Mohamed S, Wierstra D (2014) Stochastic backprop-
agation and approximate inference in deep generative models.
arXiv:1401.4082 [cs, stat]
π(s) = arg max Q(s, a; θ ) (97) 18. Kingma Diederik P, Welling M (2022) Auto-encoding variational
a
bayes. arXiv:1312.6114 [cs, stat]
19. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley
D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial
nets. In: Advances in neural information processing systems, vol
References 27. Curran Associates, Inc. https://papers.nips.cc/paper_files/
paper/2014/hash/5ca3e9b122f61f8f06494c97b1afccf3-Abstract.
1. Abu-Mostafa YS, Magdon-Ismail M, Lin H-T (2012) Learning html
from data. AML Book 20. Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Belle-
2. Adie J, Juntao Y, Zhang X, See S (2018) Deep learning mare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G,
for computational science and engineering. In: GPU technol- Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran
ogy conference. https://on-demand.gputechconf.com/gtc/2018/ D, Wierstra D, Legg S, Hassabis D (2015) Human-level control
presentation/S8242-Yang-Juntao-paper.pdf through deep reinforcement learning. Nature 518(7540):529–533.
3. Yagawa G, Okuda H (1996) Neural networks in computational https://doi.org/10.1038/nature14236
mechanics. Arch Comput Methods Eng 3(4):435–512. https://doi. 21. Zhang D, Maslej N, Brynjolfsson E, Etchemendy J, Lyons T,
org/10.1007/BF02818935 Manyika J, Ngo H, Niebles JC, Sellitto M, Sakhaee E, Shoham
4. Waszczyszyn Z, Ziemiański L (2001) Neural networks in mechan- Y, Clark J, Perrault R (2022) The AI index 2022 annual report.
ics of structures and materials—new results and prospects of arXiv:2205.03468 [cs]
applications. Comput Struct 79(22):2261–2276. https://doi.org/ 22. Woldseth RV, Aage N, Andreas Bærentzen J, Sigmund O (2022)
10.1016/S0045-7949(01)00083-9 On the use of artificial neural networks in topology optimi-
5. Yagawa G, Oishi A (2021) Computational mechanics with neural sation. Struct Multidiscip Optim 65(10):294. https://doi.org/10.
networks. Lecture notes on numerical methods in engineering and 1007/s00158-022-03347-1
sciences. Springer, Cham 23. Seungyeon S, Dongju S, Namwoo K (2023) Topology opti-
6. Song SJ, Schmerr LW (1992) Ultrasonic flaw classification in mization via machine learning and deep learning: a review. J
weldments using probabilistic neural networks. J Nondestr Eval Comput Des Eng 10(4):1736–1766. https://doi.org/10.1093/jcde/
11(2):69–77. https://doi.org/10.1007/BF00568290 qwad072
7. Yagawa G, Yoshimura S, Mochizuki Y, Oishi T (1993) Identifi- 24. Adler A, Araya-Polo M, Poggio T (2021) Deep learning for
cation of crack shape hidden in solid by means of neural network seismic inverse problems: toward the acceleration of geophysi-
and computational mechanics. In: Masataka T, Huy Duong B (eds) cal analysis workflows. IEEE Signal Process Mag 38(2):89–119.
Inverse problems in engineering mechanics, international union of https://doi.org/10.1109/MSP.2020.3037429
theoretical and applied mechanics. Springer, Berlin, pp 213–222. 25. Garnier P, Viquerat J, Rabault J, Larcher A, Kuhnle A, Hachem E
https://doi.org/10.1007/978-3-642-52439-4_21 (2019) A review on deep reinforcement learning for fluid mechan-
8. Psichogios DC, Ungar LH (1992) A hybrid neural network-first ics. arXiv:1908.04127 [physics]
principles approach to process modeling. AIChE J 38(10):1499– 26. Karthik D, Gianluca I, Heng X (2019) Turbulence modeling in
1511. https://doi.org/10.1002/aic.690381003 the age of data. Ann Rev Fluid Mech 51(1):357–377. https://doi.
9. Dissanayake MWMG, Phan-Thien N (1994) Neural-network- org/10.1146/annurev-fluid-010518-040547
based approximations for solving partial differential equations. 27. Brunton S, Noack B, Koumoutsakos P (2020) Machine learning
Commun Numer Methods Eng 10(3):195–201. https://doi.org/ for fluid mechanics. Annu Rev Fluid Mech 52(1):477–508.
10.1002/cnm.1640100303 https://doi.org/10.1146/annurev-fluid-010719-060214. arXiv:
10. Lagaris IE, Likas A, Fotiadis DI (1998) Artificial neural networks 1905.11075
for solving ordinary and partial differential equations. IEEE Trans 28. Cai S, Mao Z, Wang Z, Yin M, Karniadakis GE (2021)
Neural Netw 9(5):987–1000. https://doi.org/10.1109/72.712178 Physics-informed neural networks (PINNs) for fluid mechanics:
11. Theocaris Pericles S, Panagiotopoulos PD (1995) Generalised a review. Acta Mech Sin 37(12):1727–1738. https://doi.org/10.
hardening plasticity approximated via anisotropic elasticity: a 1007/s10409-021-01148-1
neural network approach. Comput Methods Appl Mech Eng 29. Giovanni C, Wei L (2021) Deep learning to replace, improve,
125(1):123–139. https://doi.org/10.1016/0045-7825(94)00769-J or aid CFD analysis in built environment applications: a review.

123
Computational Mechanics (2024) 74:281–331 313

Build Environ 206:108315. https://doi.org/10.1016/j.buildenv. high-performance deep learning library. arXiv:1912.01703 [cs,
2021.108315 stat]
30. Bock FE, Aydin RC, Cyron CJ, Huber N, Kalidindi SR, Kluse- 49. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Cor-
mann B (2019) A review of the application of machine learning rado GS, Davis A, Dean J, Devin M, Ghemawat S, Goodfellow I,
and data mining approaches in continuum materials mechanics. Harp A, Irving G, Isard M, Jia Y, Jozefowicz R, Kaiser L, Kudlur
Front Mater 6:110. https://doi.org/10.3389/fmats.2019.00110 M, Levenberg J, Mane D, Monga R, Moore S, Murray D, Olah
31. Bishara D, Xie Y, Liu WK, Li S (2023) A state-of-the-art review on C, Schuster M, Shlens J, Steiner B, Sutskever I, Talwar K, Tucker
machine learning-based multiscale modeling, simulation, homog- P, Vanhoucke V, Vasudevan V, Viegas F, Vinyals O, Warden P,
enization and design of materials. Arch Comput Methods Eng Wattenberg M, Wicke M, Yu Y, Zheng X (2016) TensorFlow:
30(1):191–222. https://doi.org/10.1007/s11831-022-09795-8 large-scale machine learning on heterogeneous distributed sys-
32. Max R, Kalina Karl A, Jörg B, Markus K (2023) A comparative tems. arXiv:1603.04467 [cs]
study on different neural network architectures to model inelastic- 50. Kurt H, Maxwell S, Halbert W (1989) Multilayer feedforward
ity. Int J Numer Methods Eng. https://doi.org/10.1002/nme.7319 networks are universal approximators. Neural Netw 2(5):359–
33. Lyle R, Heyrani NA, Faez A (2022) Deep generative models in 366. https://doi.org/10.1016/0893-6080(89)90020-8
engineering design: a review. J Mech Des 144(7):071704. https:// 51. Baydin AG, Pearlmutter BA, Radul AA, Siskind JM (2018) Auto-
doi.org/10.1115/1.4053859 matic differentiation in machine learning: a survey, p 43
34. Moosavi SM, Jablonka KM, Smit B (2020) The role of machine 52. Kingma DP, Ba J (2017) Adam: a method for stochastic optimiza-
learning in the understanding and design of materials. J Am Chem tion. arXiv:1412.6980 [cs]
Soc 142(48):20273–20287. https://doi.org/10.1021/jacs.0c09105 53. Nocedal J, Wright SJ (2006) Numerical optimization, 2nd edn.
35. Faller William E, Schreck Scott J (1996) Neural networks: appli- Springer series in operations research. Springer, New York
cations and opportunities in aeronautics. Progress Aerosp Sci 54. Rosenblatt F (1958) The perceptron: a probabilistic model for
32(5):433–456. https://doi.org/10.1016/0376-0421(95)00011-9 information storage and organization in the brain. Psychol Rev
36. Thuerey N, Holl P, Mueller M, Schnell P, Trost F, Um K (2022) 65(6):386–408. https://doi.org/10.1037/h0042519
Physics-based deep learning. arXiv:2109.05237 [physics] 55. LeCun Y, Boser B, Denker J, Henderson D, Howard R,
37. Kollmannsberger S, D’Angella D, Jokeit M, Herrmann L (2021) Hubbard W, Jackel L (1989) Handwritten digit recogni-
Deep learning in computational mechanics: an introductory tion with a back-propagation network. In: Advances in
course, vol 977. Studies in computational intelligence. Springer, neural information processing systems, vol 2. Morgan-
Cham Kaufmann. https://proceedings.neurips.cc/paper/1989/hash/
38. Brunton SL, Kutz JN (2022) Data-driven science and engineering: 53c3bce66e43be4f209556518c2fcb54-Abstract.html
machine learning, dynamical systems, and control. Cambridge 56. LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hub-
University Press, Cambridge bard W, Jackel LD (1989) Backpropagation applied to handwritten
39. Anuj K, Ramakrishnan K, Vipin K (2022) Knowledge guided zip code recognition. Neural Comput 1(4):541–551. https://doi.
machine learning: accelerating discovery using scientific knowl- org/10.1162/neco.1989.1.4.541
edge and data. Chapman and Hall/CRC, New York. https://doi. 57. Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-
org/10.1201/9781003143376 based learning applied to document recognition. Proc IEEE
40. Yagawa G, Oishi A (2023) Computational mechanics with deep 86(11):2278–2324. https://doi.org/10.1109/5.726791
learning: an introduction. Springer, Cham 58. Rumelhart David E, Hinton Geoffrey E, Williams Ronald J (1986)
41. Rabczuk T, Bathe K-J (2023) Machine learning in modeling and Learning representations by back-propagating errors. Nature
simulation: methods and applications. Springer 323(6088):533–536. https://doi.org/10.1038/323533a0
42. Baker N, Alexander F, Bremer T, Hagberg A, Kevrekidis Y, Najm 59. Hochreiter S, Schmidhuber J (1997) Long short-term memory.
H, Parashar M, Patra A, Sethian J, Wild S, Willcox K, Lee S (2019) Neural Comput 9(8):1735–1780. https://doi.org/10.1162/neco.
Workshop report on basic research needs for scientific machine 1997.9.8.1735
learning: core technologies for artificial intelligence. Technical 60. Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares
Report 1478744. http://www.osti.gov/servlets/purl/1478744/ F, Schwenk H, Bengio Y (2014) Learning phrase representations
43. von Rueden L, Mayer S, Beckh K, Georgiev B, Giesselbach S, using RNN encoder–decoder for statistical machine translation.
Heese R, Kirsch B, Pfrommer J, Pick A, Ramamurthy R, Walczak In: Proceedings of the 2014 conference on empirical methods in
M, Garcke J, Bauckhage C, Schuecker J (2023) Informed machine natural language processing (EMNLP). Association for Computa-
learning—a taxonomy and survey of integrating prior knowledge tional Linguistics, Doha, pp 1724–1734. https://doi.org/10.3115/
into learning systems. IEEE Trans Knowl Data Eng 35(1):614– v1/D14-1179
633. https://doi.org/10.1109/TKDE.2021.3079836 61. Kipf TN, Welling M (2017) Semi-supervised classification with
44. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT graph convolutional networks. arXiv:1609.02907 [cs, stat]
Press 62. Monti F, Shchur O, Bojchevski A, Litany O, Günnemann S,
45. Sutton RS, Barto AG (2018) Reinforcement learning: an introduc- Bronstein MM (2018) Dual-primal graph convolutional networks.
tion, 2nd edn. Adaptive computation and machine learning series. arXiv:1806.00770 [cs, stat]
The MIT Press, Cambridge 63. Battaglia PW, Hamrick JB, Bapst V, Sanchez-Gonzalez A, Zam-
46. Alpaydin E (2020) Introduction to machine learning, 4th edn. baldi V, Malinowski M, Tacchetti A, Raposo D, Santoro A,
Adaptive computation and machine learning series. The MIT Faulkner R, Gulcehre C, Song F, Ballard A, Gilmer J, Dahl G,
Press, Cambridge Vaswani A, Allen K, Nash C, Langston V, Dyer C, Heess N, Wier-
47. Russell SJ, Norvig P (2022) Artificial intelligence: a modern stra D, Kohli P, Botvinick M, Vinyals O, Li Y, Pascanu R (2018)
approach, 4th edn. Pearson series in artificial intelligence. Pear- Relational inductive biases, deep learning, and graph networks.
son, Harlow arXiv:1806.01261 [cs, stat]
48. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, 64. Henkes A, Eshraghian JK, Wessels H (2022) Spiking neural net-
Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Köpf A, works for nonlinear regression. arXiv:2210.03515 [cs]
Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner 65. Tandale SB, Stoffel M (2023) Spiking recurrent neural networks
B, Fang L, Bai J, Chintala S (2019) PyTorch: an imperative style, for neuromorphic computing in nonlinear structural mechanics.

123
314 Computational Mechanics (2024) 74:281–331

Comput Methods Appl Mech Eng 412:116095. https://doi.org/ 83. Chen LW, Thuerey N (2023) Towards high-accuracy deep learning
10.1016/j.cma.2023.116095 inference of compressible flows over aerofoils. Comput Fluids
66. Gerstner W, Kistler WM (2002) Spiking neuron models: single 250:105707. https://doi.org/10.1016/j.compfluid.2022.105707
neurons, populations, plasticity, 1st edn. Cambridge University 84. Khadilkar A, Wang J, Rai R (2019) Deep learning-based
Press, Cambridge. https://doi.org/10.1017/CBO9780511815706 stress prediction for bottom-up SLA 3D printing process. Int J
67. Hughes Thomas JR, Hulbert GM (1988) Space-time finite element Adv Manuf Technol 102(5):2555–2569. https://doi.org/10.1007/
methods for elastodynamics: formulations and error estimates. s00170-019-03363-4
Comput Methods Appl Mech Eng 66(3):339–363. https://doi.org/ 85. Zhenguo N, Haoliang J, Burak KL (2020) Stress field prediction
10.1016/0045-7825(88)90006-0 in cantilevered structures using convolutional neural networks. J
68. Alsalman M, Colvert B, Kanso E (2018) Training bioinspired sen- Comput Inform Sci Eng 20(1):011002. https://doi.org/10.1115/1.
sors to classify flows. Bioinspir Biomimet 14(1):016009. https:// 4044097
doi.org/10.1088/1748-3190/aaef1d 86. Guo X, Li W, Iorio F (2016) Convolutional neural networks
69. Colvert B, Alsalman M, Kanso E (2018) Classifying vortex wakes for steady flow approximation. In: Proceedings of the 22nd
using neural networks. Bioinspir Biomimet 13(2):025003. https:// ACM SIGKDD international conference on knowledge discov-
doi.org/10.1088/1748-3190/aaa787 ery and data mining. ACM, pp 481–490. https://doi.org/10.1145/
70. Pierret S, Van Den Braembussche RA (1999) Turbomachinery 2939672.2939738
blade design using a Navier–Stokes solver and artificial neural 87. Zhang Z, Jaiswal P, Rai R (2018) FeatureNet: machining feature
network. J Turbomach 121(2):326–332. https://doi.org/10.1115/ recognition based on 3D convolution neural network. Comput
1.2841318 Aided Des 101:12–22. https://doi.org/10.1016/j.cad.2018.03.006
71. Vurtur Badarinath P, Chierichetti M, Davoudi Kakhki F (2021) 88. Williams G, Meisel NA, Simpson TW, McComb C (2019)
A machine learning approach as a surrogate for a finite element Design repository effectiveness for 3d convolutional neural
analysis: status of research and application to one dimensional networks: application to additive manufacturing. J Mech Des
systems. Sensors 21(5):1654. https://doi.org/10.3390/s21051654 141(11):111701. https://doi.org/10.1115/1.4044199
72. Lee C, Kim J, Babcock D, Goodman R (1997) Application of 89. Wu Y, Lin Y, Zhou Z (2018) Inversionet: accurate and efficient
neural networks to turbulence control for drag reduction. Phys seismic-waveform inversion with convolutional neural networks.
Fluids 9(6):1740–1747. https://doi.org/10.1063/1.869290 In: SEG technical program expanded abstracts 2018. Society of
73. Jambunathan K, Hartle SL, Ashforth-Frost S, Fontama VN (1996) Exploration Geophysicists, Anaheim, pp 2096–2100. https://doi.
Evaluating convective heat transfer coefficients using neural net- org/10.1190/segam2018-2998603.1
works. Int J Heat Mass Transfer 39(11):2329–2332. https://doi. 90. Wang W, Yang F, Ma J (2018) Velocity model building with a
org/10.1016/0017-9310(95)00332-0 modified fully convolutional network. In: SEG technical program
74. Tracey BD, Duraisamy K, Alonso JJ (2015) A machine learn- expanded abstracts 2018. Society of Exploration Geophysicists,
ing strategy to assist turbulence model development. In: 53rd Anaheim, pp 2086–2090. https://doi.org/10.1190/segam2018-
AIAA aerospace sciences meeting. American Institute of Aero- 2997566.1
nautics and Astronautics, Kissimmee. https://doi.org/10.2514/6. 91. Yang F, Ma J (2019) Deep-learning inversion: a next-
2015-1287 generation seismic velocity model building method. Geophysics
75. Ramuhalli P, Udpa L, Udpa SS (2002) Electromagnetic NDE sig- 84(4):R583–R599. https://doi.org/10.1190/geo2018-0249.1
nal inversion by function-approximation neural networks. IEEE 92. Zheng Y, Zhang Q, Yusifov A, Shi Y (2019) Applications of super-
Trans Magn 38(6):3633–3642. https://doi.org/10.1109/TMAG. vised deep learning for seismic interpretation and inversion. Lead
2002.804817 Edge 38(7):526–533. https://doi.org/10.1190/tle38070526.1
76. Araya-Polo M, Jennings J, Adler A, Dahlke T (2018) Deep- 93. Araya-Polo M, Farris S, Florez M (2019) Deep learning-driven
learning tomography. Lead Edge 37(1):58–66. https://doi.org/10. velocity model building workflow. Lead Edge 38(11):872–872.
1190/tle37010058.1 https://doi.org/10.1190/tle38110872a1.1
77. Kim Y, Nakata N (2018) Geophysical inversion versus machine 94. Das V, Pollack A, Wollner U, Mukerji T (2019) Convolutional
learning in inverse problems. Lead Edge 37(12):894–901. https:// neural network for seismic impedance inversion. Geophysics
doi.org/10.1190/tle37120894.1 84(6):R869–R880. https://doi.org/10.1190/geo2018-0838.1
78. Hoang V-N, Nguyen N-L, Tran DQ, Vu Q-V, Nguyen-Xuan H 95. Wang W, Ma J (2020) Velocity model building in a cross-
(2022) Data-driven geometry-based topology optimization. Struct well acquisition geometry with image-trained artificial neural
Multidiscip Optim 65(2):69. https://doi.org/10.1007/s00158- networks. Geophysics 85(2):U31–U46. https://doi.org/10.1190/
022-03170-8 geo2018-0591.1
79. Zhang X, Garikipati K (2023) Label-free learning of elliptic 96. Li S, Liu B, Ren Y, Chen Y, Yang S, Wang Y, Jiang P (2020)
partial differential equation solvers with generalizability across Deep-learning inversion of seismic data. IEEE Trans Geosci
boundary value problems. Comput Methods Appl Mech Eng. Remote Sens 58(3):2135–2149. https://doi.org/10.1109/TGRS.
https://doi.org/10.1016/j.cma.2023.116214 2019.2953473
80. Thuerey N, Weißenow K, Prantl L, Xiangyu H (2020) Deep learn- 97. Bangyu W, Meng D, Wang L, Liu N, Wang Y (2020)
ing methods for Reynolds-averaged Navier–Stokes simulations Seismic impedance inversion using fully convolutional resid-
of airfoil flows. AIAA J 58(1):25–36. https://doi.org/10.2514/1. ual network and transfer learning. IEEE Geosci Remote
J058291 Sens Lett 17(12):2140–2144. https://doi.org/10.1109/LGRS.
81. Li-Wei C, Cakal Berkay A, Xiangyu H, Nils T (2021) Numerical 2019.2963106
investigation of minimum drag profiles in laminar flow using deep 98. Park MJ, Sacchi MD (2020) Automatic velocity analysis using
learning surrogates. J Fluid Mech 919:A34. https://doi.org/10. convolutional neural network and transfer learning. Geophysics
1017/jfm.2021.398 85(1):V33–V43. https://doi.org/10.1190/geo2018-0870.1
82. Chen X, Zhao X, Gong Z, Zhang J, Zhou W, Chen X, Yao W (2021) 99. Ye J, Toyama N (2022) Automatic defect detection for ultrasonic
A deep neural network surrogate modeling benchmark for temper- wave propagation imaging method using spatio-temporal convo-
ature field prediction of heat source layout. Sci China Phys Mech lution neural networks. Struct Health Monit 21(6):2750–2767.
Astron 64(11):1. https://doi.org/10.1007/s11433-021-1755-6 https://doi.org/10.1177/14759217211073503

123
Computational Mechanics (2024) 74:281–331 315

100. Jing R, Fangshu Y, Huadong M, Stefan K, Ernst R (2023) 117. Qi CR, Su H, Mo K, Guibas LJ (2017) PointNet: deep
Quantitative reconstruction of defects in multi-layered bonded learning on point sets for 3D classification and segmentation.
composites using fully convolutional network-based ultrasonic arXiv:1612.00593
inversion. J Sound Vib 542:117418. https://doi.org/10.1016/j.jsv. 118. Groueix T, Fisher M, Kim VG, Russell BC, Aubry M (2018) Atlas-
2022.117418 Net: a Papier-Mâché approach to learning 3D surface generation.
101. Qiyin L, Jun H, Zheng L, Baotong L, Jihong W (2018) arXiv:1802.05384 [cs]
Investigation into the topology optimization for conductive 119. Cunningham JD, Simpson TW, Tucker CS (2019) An investiga-
heat transfer based on deep learning approach. Int Com- tion of surrogate models for efficient performance-based decoding
mun Heat Mass Transfer 97:103–109. https://doi.org/10.1016/j. of 3D point clouds. J Mech Des 141(12):121401. https://doi.org/
icheatmasstransfer.2018.07.001 10.1115/1.4044597
102. Yonggyun Yu, Hur T, Jung J, Jang IG (2019) Deep learning for 120. Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer
determining a near-optimal topological design without any itera- series in statistics. Springer, New York
tion. Struct Multidiscip Optim 59(3):787–799. https://doi.org/10. 121. Tobias H, Hans-Peter M (2009) Statistical shape models for
1007/s00158-018-2101-5 3D medical image segmentation: a review. Med Image Anal
103. Abueidda Diab W, Seid K, Sobh Nahil A (2020) Topology opti- 13(4):543–563. https://doi.org/10.1016/j.media.2009.05.004
mization of 2D structures with nonlinearities using deep learning. 122. Bhattacharya K, Hosseini B, Kovachki NB, Stuart AM (2021)
Comput Struct 237:106283. https://doi.org/10.1016/j.compstruc. Model reduction and neural networks for parametric PDEs. SMAI
2020.106283 J Comput Math 7:121–157. https://doi.org/10.5802/smai-jcm.74
104. Nakamura K, Suzuki Y (2020) Deep learning-based topologi- 123. Berkooz G, Holmes P, Lumley JL (1993) The proper orthog-
cal optimization for representing a user-specified design area. onal decomposition in the analysis of turbulent flows. Annu
arXiv:2004.05461 Rev Fluid Mech 25(1):539–575. https://doi.org/10.1146/annurev.
105. Zhang Y, Peng B, Zhou X, Xiang C, Wang D (2020) A deep con- fl.25.010193.002543
volutional neural network for topology optimization with strong 124. Muñoz D, Allix O, Chinesta F, Ródenas JJ, Nadal E (2023) Man-
generalization ability. arXiv:1901.07761 [cs, stat] ifold learning for coherent design interpolation based on geomet-
106. Zheng S, He Z, Liu H (2021) Generating three-dimensional rical and topological descriptors. Comput Methods Appl Mech
structural topologies via a U-Net convolutional neural network. Eng 405:115859. https://doi.org/10.1016/j.cma.2022.115859
Thin-Walled Struct 159:107263. https://doi.org/10.1016/j.tws. 125. Liang L, Liu M, Martin C, Sun W (2018) A deep learning approach
2020.107263 to estimate stress distribution: a fast and accurate surrogate of
107. Shuai Z, Haojie F, Ziyu Z, Zhiqiang T, Kang J (2021) Accu- finite-element analysis. J R Soc Interface 15(138):20170844.
rate and real-time structural topology prediction driven by deep https://doi.org/10.1098/rsif.2017.0844
learning under moving morphable component-based framework. 126. Ali M, Ahmed B, Jiwon K, Yara M, Mofrad Mohammad RK
Appl Math Modell 97:522–535. https://doi.org/10.1016/j.apm. (2019) Bridging finite element and machine learning modeling:
2021.04.009 stress prediction of arterial walls in atherosclerosis. J Biomech
108. Wang D, Xiang C, Pan Y, Chen A, Zhou X, Zhang Y (2022) A deep Eng 141(8):084502. https://doi.org/10.1115/1.4043290
convolutional neural network for topology optimization with per- 127. Muravleva E, Oseledets I, Koroteev D (2018) Application of
ceptible generalization ability. Eng Optim 54(6):973–988. https:// machine learning to viscoplastic flow modeling. Phys Fluids
doi.org/10.1080/0305215X.2021.1902998 30(10):103102. https://doi.org/10.1063/1.5058127
109. Jun Y, Zhang Qi X, Qi FZ, Haijiang L, Wei S, Guangyuan W 128. Liang L, Liu M, Martin C, Sun W (2018) A machine learning
(2022) Deep learning driven real time topology optimisation based approach as a surrogate of finite element analysis-based inverse
on initial stress learning. Adv Eng Inform 51:101472. https://doi. method to estimate the zero-pressure geometry of human thoracic
org/10.1016/j.aei.2021.101472 aorta. Int J Numer Methods Biomed Eng 34(8):e3103. https://doi.
110. Seo J, Kapania RK (2023) Topology optimization with advanced org/10.1002/cnm.3103
CNN using mapped physics-based data. Struct Multidiscip Optim 129. Derouiche K, Garois S, Champaney V, Daoud M, Traidi
66(1):21. https://doi.org/10.1007/s00158-022-03461-0 K, Chinesta F (2021) Data-driven modeling for multiphysics
111. Ivan S, Ivan O (2019) Neural networks for topology optimization. parametrized problems-application to induction hardening pro-
Russian J Numer Anal Mathl Modell 34(4):215–223. https://doi. cess. Metals 11(5):738. https://doi.org/10.3390/met11050738
org/10.1515/rnam-2019-0018 130. Quercus H, Alberto B, Francisco C, Elías C (2023)
112. Joo Y, Yonggyun Yu, Jang IG (2021) Unit module-based conver- Thermodynamics-informed neural networks for physically realis-
gence acceleration for topology optimization using the spatiotem- tic mixed reality. Comput Methods Appl Mech Eng 407:115912.
poral deep neural network. IEEE Access 9:149766–149779. https://doi.org/10.1016/j.cma.2023.115912
https://doi.org/10.1109/ACCESS.2021.3125014 131. Hinton GE, Salakhutdinov RR (2006) Reducing the dimension-
113. Kallioras NA, Kazakis G, Lagaros ND (2020) Accelerated ality of data with neural networks. Science 313(5786):504–507.
topology optimization by means of deep learning. Struct Multi- https://doi.org/10.1126/science.1127647
discip Optim 62(3):1185–1212. https://doi.org/10.1007/s00158- 132. Michele M, Petros K (2002) Neural network modeling for near
020-02545-z wall turbulent flow. J Comput Phys 182(1):1–26. https://doi.org/
114. Sanchez-Gonzalez A, Godwin J, Pfaff T, Ying R, Leskovec J, 10.1006/jcph.2002.7146
Battaglia PW (2020) Learning to simulate complex physics with 133. Siddharth N, Walsh Timothy F, Greg P, Fabio S (2023) GRIDS-
graph networks. arXiv:2002.09405 Net: inverse shape design and identification of scatterers via
115. Pfaff T, Fortunato M, Sanchez-Gonzalez A, Battaglia PW geometric regularization and physics-embedded deep learning.
(2021) Learning mesh-based simulation with graph networks. Comput Methods Appl Mech Eng 414:116167. https://doi.org/
arXiv:2010.03409 10.1016/j.cma.2023.116167
116. Roberto P, Davide G, Vinamra A (2022) Graph neural networks for 134. Ana F-N, Diego Z-S, Omella Ángel J, David P, David G-S, Filipe
simulating crack coalescence and propagation in brittle materials. M (2022) Supervised deep learning with finite element simula-
Comput Methods Appl Mech Eng 395:115021. https://doi.org/10. tions for damage identification in bridges. Eng Struct 257:114016.
1016/j.cma.2022.115021 https://doi.org/10.1016/j.engstruct.2022.114016

123
316 Computational Mechanics (2024) 74:281–331

135. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolu- 150. Clark DLP, Lu L, Charles M, Em KG, Zaki Tamer A (2023) Neu-
tional networks for biomedical image segmentation. In Nassir N, ral operator prediction of linear instability waves in high-speed
Joachim H, Wells WM, Frangi AF (eds) Medical image comput- boundary layers. J Comput Phys 474:111793. https://doi.org/10.
ing and computer-assisted intervention—MICCAI 2015. Lecture 1016/j.jcp.2022.111793
notes in computer science. Springer, Cham, pp 234–241. https:// 151. Seid K, Abueidda Diab W (2023) Data-driven and physics-
doi.org/10.1007/978-3-319-24574-4_28 informed deep learning operators for solution of heat
136. Zhou Z, Siddiquee MMR, Tajbakhsh N, Liang J (2018) UNet++: conduction equation with parametric heat source. Int J
a nested U-Net architecture for medical image segmentation. Heat Mass Transfer 203:123809. https://doi.org/10.1016/j.
arXiv:1807.10165 [cs, eess, stat] ijheatmasstransfer.2022.123809
137. Lu L, Xuhui M, Shengze C, Zhiping M, Somdatta G, Zhongqiang 152. Liu C, He Q, Zhao A, Tao W, Song Z, Liu B, Feng C (2023) Oper-
Z, Em KG (2022) A comprehensive and fair comparison of two ator learning for predicting mechanical response of hierarchical
neural operators (with practical extensions) based on FAIR data. composites with applications of inverse design. Int J Appl Mech
Comput Methods Appl Mech Eng 393:114778. https://doi.org/ 15(04):2350028. https://doi.org/10.1142/S175882512350028X
10.1016/j.cma.2022.114778 153. Ahmed Shady E, Panos S (2023) A multifidelity deep opera-
138. Chen T, Chen H (1995) Universal approximation to nonlinear tor network approach to closure for multiscale systems. Comput
operators by neural networks with arbitrary activation functions Methods Appl Mech Eng 414:116161. https://doi.org/10.1016/j.
and its application to dynamical systems. IEEE Trans Neural Netw cma.2023.116161
6(4):911–917. https://doi.org/10.1109/72.392253 154. Wang S, Wang H, Perdikaris P (2021) Learning the solution
139. Lu L, Pengzhan J, Guofei P, Zhongqiang Z, Em KG (2021) Learn- operator of parametric partial differential equations with physics-
ing nonlinear operators via DeepONet based on the universal informed DeepONets. Sci Adv 7(40):8605. https://doi.org/10.
approximation theorem of operators. Nat Mach Intell 3(3):218– 1126/sciadv.abi8605
229. https://doi.org/10.1038/s42256-021-00302-5 155. Somdatta G, Yin Minglang Yu, Yue KG (2022) A physics-
140. Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K, informed variational DeepONet for predicting crack path in quasi-
Stuart A, Anandkumar A (2021) Fourier neural operator for para- brittle materials. Comput Methods Appl Mech Eng 391:114587.
metric partial differential equations. arXiv:2010.08895 https://doi.org/10.1016/j.cma.2022.114587
141. Chensen L, Martin M, Zhen L, Em KG (2021) A seamless mul- 156. Goswami S, Bora A, Yu Y, Karniadakis GE (2022) Physics-
tiscale operator neural network for inferring bubble dynamics. J informed deep neural operator networks. arXiv:2207.05748 [cs,
Fluid Mech 929:A18. https://doi.org/10.1017/jfm.2021.866 math]
142. Mao Zhiping LL, Olaf M, Zaki Tamer A, Em KG (2021) 157. Kovachki N, Lanthaler S, Mishra S (2021) On universal approx-
DeepM&Mnet for hypersonics: predicting the coupled flow and imation and error bounds for Fourier neural operators. J Mach
finite-rate chemistry behind a normal shock using neural-network Learn Res 22(1):290:13237-290:13312
approximation of operators. J Comput Phys 447:110698. https:// 158. Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K,
doi.org/10.1016/j.jcp.2021.110698 Stuart A, Anandkumar A (2020) Neural operator: graph kernel
143. Clark DLP, Lu L, Meneveau C, Karniadakis G, Zaki TA (2021) network for partial differential equations. arXiv:2003.03485 [cs,
DeepONet prediction of linear instability waves in high-speed math, stat]
boundary layers. arXiv:2105.08697 [physics] 159. Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K,
144. Shengze C, Wang Zhicheng LL, Zaki Tamer A, Em KG Stuart A, Anandkumar A (2020) Multipole graph neural operator
(2021) DeepM&Mnet: inferring the electroconvection multi- for parametric partial differential equations. In: Proceedings of the
physics fields based on operator approximation by neural net- 34th international conference on neural information processing
works. J Comput Phys 436:110296. https://doi.org/10.1016/j.jcp. systems, NIPS’20. Curran Associates Inc., Red Hook, pp 6755–
2021.110296 6766
145. Chensen L, Li Zhen LL, Shengze C, Martin M, Em KG (2021) 160. Cao Q, Goswami S, Karniadakis GE (2023) LNO: laplace neural
Operator learning for predicting multiscale bubble growth dynam- operator for solving differential equations. arXiv:2303.10528 [cs]
ics. J Chem Phys 154(10):104118. https://doi.org/10.1063/5. 161. Zhu C, Ye H, Zhan B (2021) Fast solver of 2D Maxwell’s
0041203 equations based on Fourier neural operator. In: 2021 Photon-
146. Minglang Y, Ehsan B, Rego Bruno V, Enrui Z, Cristina C, ics and electromagnetics research symposium (PIERS). IEEE,
Humphrey Jay D, Em KG (2022) Simulating progressive intra- Hangzhou, pp 1635–1643. https://doi.org/10.1109/PIERS53385.
mural damage leading to aortic dissection using DeepONet: 2021.9695119
an operator-regression neural network. J Roy Soc Interface 162. Chao S, Yanghua W (2022) High-frequency wavefield extrapola-
19(187):20210670. https://doi.org/10.1098/rsif.2021.0670 tion using the Fourier neural operator. J Geophys Eng 19(2):269–
147. Osorio Julian D, Zhicheng W, George K, Shengze C, Chrys 282. https://doi.org/10.1093/jge/gxac016
C, Mayank P, Mayank H (2022) Forecasting solar-thermal sys- 163. Wei W, Li-Yun F (2022) Small-data-driven fast seismic sim-
tems performance under transient operation using a data-driven ulations for complex media using physics-informed Fourier
machine learning approach based on the deep operator network neural operators. Geophysics 87(6):T435–T446. https://doi.org/
architecture. Energy Convers Manag 252:115063. https://doi.org/ 10.1190/geo2021-0573.1
10.1016/j.enconman.2021.115063 164. Mehran RM, Tanu P, Souvik C, Anoop Krishnan NM (2022)
148. Goswami S, Li DS, Rego BV, Latorre M, Humphrey JD, Kar- Learning the stress-strain fields in digital composites using
niadakis GE (2022) Neural operator learning of heterogeneous Fourier neural operator. iScience 25(11):105452. https://doi.org/
mechanobiological insults contributing to aortic aneurysms. J 10.1016/j.isci.2022.105452
R Soc Interface 19(193):20220410. https://doi.org/10.1098/rsif. 165. Kai Z, Yuande Z, Hanjun Z, Ma Xiaopeng G, Jianwei WJ, Yongfei
2022.0410 Y, Chuanjin Y, Jun Y (2022) Fourier neural operator for solving
149. Seid K, Asha V, Abueidda Diab W, Sobh Nahil A, Kamran K subsurface oil/water two-phase flow partial differential equation.
(2023) Deep learning operator network for plastic deformation SPE J 27(03):1815–1830. https://doi.org/10.2118/209223-PA
with variable loads and material properties. Eng Comput. https:// 166. Bicheng Y, Bailian C, Dylan RH, Wei J, Pawar Rajesh J (2022) A
doi.org/10.1007/s00366-023-01822-x robust deep learning workflow to predict multiphase flow behavior
during geological CO2 sequestration injection and Post-Injection

123
Computational Mechanics (2024) 74:281–331 317

periods. J Hydrol 607:127542. https://doi.org/10.1016/j.jhydrol. J Comput Phys 403:109056. https://doi.org/10.1016/j.jcp.2019.

2022.127542 109056
167. Gege W, Zongyi L, Kamyar A, Anima A, Benson Sally M 183. Chang MB, Ullman T, Torralba A, Tenenbaum JB (2017) A com-
(2022) U-FNO—an enhanced Fourier neural operator-based positional object-based approach to learning physical dynamics.
deep-learning model for multiphase flow. Adv Water Resour arXiv:1612.00341
163:104180. https://doi.org/10.1016/j.advwatres.2022.104180 184. Mrowca D, Zhuang C, Wang E, Haber N, Fei-Fei L, Tenenbaum
168. Wenhui P, Zelong Y, Jianchun W (2022) Attention-enhanced JB, Yamins DL (2018) Flexible neural representation for physics
neural network models for turbulence simulation. Phys Fluids prediction. arXiv:1806.08047
34(2):025111. https://doi.org/10.1063/5.0079302 185. Sanchez-Gonzalez A, Heess N, Springenberg JT, Merel J, Ried-
169. You H, Zhang Q, Ross Colton J, Lee CH, Yu Y (2022) Learning miller M, Hadsell R, Battaglia P (2018) Graph networks as learn-
deep implicit Fourier neural operators (IFNOs) with applica- able physics engines for inference and control. arXiv:1806.01242
tions to heterogeneous material modeling. Comput Methods 186. Li Y, Wu J, Zhu JY, Tenenbaum JB, Torralba A, Tedrake R
Appl Mech Eng 398:115296. https://doi.org/10.1016/j.cma.2022. (2019) Propagation networks for model-based control under par-
115296 tial observation. arXiv:1809.11169
170. Tie K, Jianqiao L, Zhilin Y, Hongbin J, Yubo L, Zhengkai L, 187. Lino M, Cantwell C, Bharath AA, Fotiadis S (2021) Simulating
Huanquan P (2023) Fast and robust prediction of multiphase flow continuum mechanics with multi-scale graph neural networks.
in complex fractured reservoir using a Fourier neural operator. arXiv:2106.04900
Energies 16(9):3765. https://doi.org/10.3390/en16093765 188. Alfarraj M, AlRegib G (2018) Petrophysical-property estimation
171. Alexandre CRP, Joseph JS, Victor OS, Aliabadi Amir A, Jesse from seismic data using recurrent neural networks. In: SEG tech-
VGT, Bahram G (2023) Deep neural network modeling for CFD nical program expanded abstracts 2018. Society of Exploration
simulations: benchmarking the Fourier neural operator on the lid- Geophysicists, Anaheim, pp 2141–2146. https://doi.org/10.1190/
driven cavity case. Appl Sci 13(5):3165. https://doi.org/10.3390/ segam2018-2995752.1
app13053165 189. Adler A, Araya-Polo M, Poggio T (2019) Deep recurrent archi-
172. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez tectures for seismic tomography. In: 81st EAGE conference
AN, Kaiser L, Polosukhin I (2017) Attention is all you need. and exhibition 2019, pp 1–5. https://doi.org/10.3997/2214-4609.
In: Advances in neural information processing systems, vol 30. 201901512
Curran Associates, Inc. https://papers.nips.cc/paper_files/paper/ 190. Fabien-Ouellet G, Sarkar R (2020) Seismic velocity estimation: a
2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html deep recurrent neural-network approach. Geophysics 85(1):U21–
173. Cao S (2021) Choose a transformer: Fourier or Galerkin. U29. https://doi.org/10.1190/geo2018-0786.1
arXiv:2105.14995 [cs, math] 191. Vlachas PR, Byeon W, Wan ZY, Sapsis TP, Koumoutsakos P
174. Li Z, Zheng H, Kovachki N, Jin D, Chen H, Liu B, Azizzadenesheli (2018) Data-driven forecasting of high-dimensional chaotic sys-
K, Anandkumar A (2023) Physics-informed neural operator for tems with long short-term memory networks. Proc Roy Soc
learning partial differential equations. arXiv:2111.03794 [cs, A Math Phys Eng Sci 474(2213):20170844. https://doi.org/10.
math] 1098/rspa.2017.0844
175. Marcati C, Opschoor JAA, Petersen PC, Schwab C (2023) Expo- 192. Hou W, Darakananda D, Eldredge J (2019) Machine learning
nential ReLU neural network approximation rates for point and based detection of flow disturbances using surface pressure mea-
edge singularities. Found Comput Math 23(3):1043–1127. https:// surements. In: AIAA Scitech 2019 forum. American Institute
doi.org/10.1007/s10208-022-09565-9 of Aeronautics and Astronautics, San Diego. https://doi.org/10.
176. Lukas G, Christoph S (2023) Deep ReLU neural networks 2514/6.2019-1148
overcome the curse of dimensionality for partial integrodifferen- 193. Heindel L, Hantschke P, Kästner M (2021) A virtual sensing
tial equations. Anal Appl 21(01):1–47. https://doi.org/10.1142/ approach for approximating nonlinear dynamical systems using
S0219530522500129 LSTM networks. PAMM 21(1):e202100119. https://doi.org/10.
177. Marcati C, Schwab C (2023) Exponential convergence of deep 1002/pamm.202100119
operator networks for elliptic partial differential equations. 194. Heindel L, Hantschke P, Kästner M (2022) A data-driven approach
SIAM J Numer Anal 61(3):1513–1545. https://doi.org/10.1137/ for approximating non-linear dynamic systems using LSTM net-
21M1465718 works. Proc Struct Integr 38:159–167. https://doi.org/10.1016/j.
178. Álvarez-Aramberri J, Vicent D, Caro F, Pardo D (2023) Gen- prostr.2022.03.017
eration of massive databases for deep learning inversion: a 195. Freitag S, Cao BT, Ninić J, Meschke G (2018) Recurrent neu-
goal-oriented hp-adaptive strategy. In: International conference ral networks and proper orthogonal decomposition with interval
on adaptive modeling and simulation (ADMOS 2023), applica- data for real-time predictions of mechanised tunnelling pro-
tions of goal-oriented error estimation and adaptivity. https://doi. cesses. Comput Struct 207:258–273. https://doi.org/10.1016/j.
org/10.23967/admos.2023.027 compstruc.2017.03.020
179. Bolager EL, Burak I, Datar I, Sun Q, Dietrich F (2023) Sampling 196. Cao BT, Obel M, Freitag S, Mark P, Meschke G (2020) Arti-
weights of deep neural networks. arXiv:2306.16830 [cs, math] ficial neural network surrogate modelling for real-time pre-
180. Ballakur AA, Arya A (2020) Empirical evaluation of gated recur- dictions and control of building damage during mechanised
rent neural network architectures in aviation delay prediction. tunnelling. Adv Eng Softw 149:102869. https://doi.org/10.1016/
In: 2020 5th International conference on computing, commu- j.advengsoft.2020.102869
nication and security (ICCCS), pp 1–7. https://doi.org/10.1109/ 197. Cao BT, Obel M, Freitag S, Heußner L, Meschke G, Mark P (2022)
ICCCS49678.2020.9276855 Real-time risk assessment of tunneling-induced building dam-
181. Chen Q, Kong L, Dugast F, To A (2023) Using the transformer age considering polymorphic uncertainty. ASCE-ASME J Risk
model for physical simulation: an application on transient thermal Uncertain Eng Syst Part A Civ Eng 8(1):04021069. https://doi.
analysis for 3D printing process simulation. https://openreview. org/10.1061/AJRUA6.0001192
net/forum?id=tuXhnv6pgo 198. Anthony G, Gunzburger Max J, Lili WZ (2022) A comparison of
182. Geneva N, Zabaras N (2020) Modeling the dynamics of PDE neural network architectures for data-driven reduced-order mod-
systems with physics-constrained deep auto-regressive networks. eling. Comput Methods Appl Mech Eng 393:114764. https://doi.
org/10.1016/j.cma.2022.114764

123
318 Computational Mechanics (2024) 74:281–331

199. Gonzalez FJ, Balajewicz M (2018) Deep convolutional recurrent In: Proceedings of the 32nd international conference on neural
autoencoders for learning low-dimensional feature dynamics of information processing systems, NIPS’18. Curran Associates Inc,
fluid systems. arXiv:1808.01346 [physics] Red Hook, pp 9278–9288
200. Holden D, Duong BC, Datta S, Nowrouzezahrai D (2019) Sub- 216. Lusch B, Nathan Kutz J, Brunton SL (2018) Deep learning for
space neural physics: fast data-driven interactive simulation. In: universal linear embeddings of nonlinear dynamics. Nat Commun
Proceedings of the 18th annual ACM SIGGRAPH/Eurographics 9(1):4950. https://doi.org/10.1038/s41467-018-07210-0
symposium on computer animation, SCA ’19. Association for 217. Otto SE, Rowley CW (2019) Linearly recurrent autoencoder net-
Computing Machinery, New York, pp 1–12. https://doi.org/10. works for learning dynamics. SIAM J Appl Dyn Syst 18(1):558–
1145/3309486.3340245 593. https://doi.org/10.1137/18M1177846
201. Stefania F, Andrea M, Luca D, Alfio Q (2020) Deep learning- 218. Cohn D, Ghahramani Z, Jordan M (1994) Active learning with
based reduced order models in cardiac electrophysiology. statistical models. In: Advances in neural information processing
PLoS ONE 15(10):e0239416. https://doi.org/10.1371/journal. systems, vol 7. MIT Press, Cambridge
pone.0239416 219. Liu X, Athanasiou CE, Padture NP, Sheldon BW, Gao H
202. Fresca S, Dede’ L, Manzoni A (2021) A comprehensive deep (2021) Knowledge extraction and transfer in data-driven fracture
learning-based approach to reduced order modeling of nonlin- mechanics. Proc Natl Acad Sci 118(23):e2104765118. https://doi.
ear time-dependent parametrized PDEs. J Sci Comput 87(2):61. org/10.1073/pnas.2104765118
https://doi.org/10.1007/s10915-021-01462-7 220. Haasdonk B, Kleikamp H, Ohlberger M, Schindler F, Wenzel T
203. Stefania F, Andrea M (2022) POD-DL-ROM: enhancing deep (2023) A new certified hierarchical and adaptive RB-ML-ROM
learning-based reduced order models for nonlinear parametrized surrogate model for parametrized PDEs. SIAM J Sci Comput
PDEs by proper orthogonal decomposition. Comput Methods 45(3):A1039–A1065. https://doi.org/10.1137/22M1493318
Appl Mech Eng 388:114181. https://doi.org/10.1016/j.cma.2021. 221. Kalina KA, Linden L, Brummund J, Kästner M (2023) FE
114181 ANN: an efficient data-driven multiscale approach based on
204. Ren P, Chengping R, Yang L, Jian-Xun W, Hao S (2022) Phy- physics-constrained neural networks and automated data mining.
CRNet: physics-informed convolutional-recurrent network for Comput Mech 71(5):827–851. https://doi.org/10.1007/s00466-
solving spatiotemporal PDEs. Comput Methods Appl Mech Eng 022-02260-0
389:114399. https://doi.org/10.1016/j.cma.2021.114399 222. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE
205. Hu C, Martin S, Dingreville R (2022) Accelerating phase-field Trans Knowl Data Eng 22(10):1345–1359. https://doi.org/10.
predictions via recurrent neural networks learning the microstruc- 1109/TKDE.2009.191
ture evolution in latent space. Comput Methods Appl Mech Eng 223. Yosinski J, Clune J, Bengio Y, Lipson H (2014) How transfer-
397:115128. https://doi.org/10.1016/j.cma.2022.115128 able are features in deep neural networks? In: Proceedings of the
206. Kookjin L, Carlberg Kevin T (2020) Model reduction of dynam- 27th international conference on neural information processing
ical systems on nonlinear manifolds using deep convolutional systems, vol 2, NIPS’14. MIT Press, Cambridge, pp 3320–3328
autoencoders. J Comput Phys 404:108973. https://doi.org/10. 224. Kollmannsberger S, Singh D, Herrmann L (2023) Transfer
1016/j.jcp.2019.108973 learning enhanced full waveform inversion. arXiv:2302.11259
207. Shen S, Yin Y, Shao T, Wang H, Jiang C, Lan L, Zhou K (2021) [physics]
High-order differentiable autoencoder for nonlinear model reduc- 225. Liu Z, Chen Y, Du Y, Tegmark M (2021) Physics-augmented
tion. arXiv:2102.11026 [cs] learning: a new paradigm beyond physics-informed learning.
208. Schmid Peter J (2010) Dynamic mode decomposition of numer- arXiv:2109.13901 [physics]
ical and experimental data. J Fluid Mech 656:5–28. https://doi. 226. Zhu Y, Zabaras N, Koutsourelakis PS, Perdikaris P (2019)
org/10.1017/S0022112010001217 Physics-constrained deep learning for high-dimensional surrogate
209. Tu JH, Rowley CW, Luchtenburg DM, Brunton SL, Nathan KJ modeling and uncertainty quantification without labeled data. J
(2013) On dynamic mode decomposition: theory and applications. Comput Phys 394:56–81. https://doi.org/10.1016/j.jcp.2019.05.
arXiv:1312.0041 [physics] 024. arXiv: 1901.06314
210. Koopman BO (1931) Hamiltonian systems and transformation in 227. Eichelsdörfer J, Kaltenbach S, Koutsourelakis PS (2021)
Hilbert space. Proc Natl Acad Sci 17(5):315–318. https://doi.org/ Physics-enhanced neural networks in the small data regime.
10.1073/pnas.17.5.315 arXiv:2111.10329 [physics, stat] version: 1
211. Williams MO, Kevrekidis IG, Rowley CW (2015) A data-driven 228. Raissi M (2018) Deep hidden physics models: deep learning
approximation of the Koopman operator: extending dynamic of nonlinear partial differential equations. arXiv:1801.06637 [cs,
mode decomposition. J Nonlinear Sci 25(6):1307–1346. https:// math, stat]
doi.org/10.1007/s00332-015-9258-5 229. Em KG, Kevrekidis Ioannis G, Lu L, Paris P, Sifan W, Liu Y (2021)
212. Li Q, Dietrich F, Bollt EM, Kevrekidis IG (2017) Extended Physics-informed machine learning. Nat Rev Phys 3(6):422–440.
dynamic mode decomposition with dictionary learning: a data- https://doi.org/10.1038/s42254-021-00314-5
driven adaptive spectral decomposition of the Koopman operator. 230. Cuomo S, Cola VSD, Giampaolo F, Rozza G, Raissi M, Piccialli
Chaos Interdiscip J Nonlinear Sci 27(10):103111. https://doi.org/ F (2022) Scientific machine learning through physics-informed
10.1063/1.4993854 neural networks: where we are and what’s next. J Sci Comput
213. Yeung E, Kundu S, Hodas N (2019) Learning deep neural network 92(3):88. https://doi.org/10.1007/s10915-022-01939-z
representations for koopman operators of nonlinear dynamical 231. Hao Z, Liu S, Zhang Y, Ying C, Feng Y, Su H, Zhu J (2022)
systems. In: 2019 American Control Conference (ACC), pp 4832– Physics-informed machine learning: a survey on problems, meth-
4839. https://doi.org/10.23919/ACC.2019.8815339 ods and applications. arXiv:2211.08064
214. Takeishi N, Kawahara Y, Yairi T (2017) Learning Koopman invari- 232. Ehsan H, Ruben J (2021) SciANN: a Keras/tensorflow wrapper
ant subspaces for dynamic mode decomposition. In: Proceedings for scientific computations and physics-informed deep learning
of the 31st international conference on neural information pro- using artificial neural networks. Comput Methods Appl Mech Eng
cessing systems, NIPS’17. Curran Associates Inc, Red Hook, pp 373:113552. https://doi.org/10.1016/j.cma.2020.113552
1130–1140 233. Hennigh O, Narasimhan S, Nabian MA, Subramaniam A, Tangsali
215. Morton J, Witherden FD, Jameson A, Kochenderfer MJ (2018) K, Fang Z, Rietmann M, Byeon W, Choudhry S (2021) NVIDIA
Deep dynamical modeling and control of unsteady fluid flows. SimNet: an AI-accelerated multi-physics simulation framework.

123
Computational Mechanics (2024) 74:281–331 319

In: Paszynski M, Kranzlmüller D, Krzhizhanovskaya VV, Don- 251. Ferrari S, Jensenius M (2008) A constrained optimization
garra JJ, Sloot PMA (eds) Computational science—ICCS 2021. approach to preserving prior knowledge during incremental train-
Lecture notes in computer science. Springer, Cham, pp 447–461. ing. IEEE Trans Neural Netw 19(6):996–1009. https://doi.org/10.
https://doi.org/10.1007/978-3-030-77977-1_36 1109/TNN.2007.915108
234. Lu L, Meng X, Mao Z, Karniadakis GE (2021) DeepXDE: a 252. Rudd K, Di Muro G, Ferrari S (2014) A constrained backprop-
deep learning library for solving differential equations. SIAM Rev agation approach for the adaptive solution of partial differential
63(1):208–228. https://doi.org/10.1137/19M1274067 equations. IEEE Trans Neural Netw Learn Syst 25(3):571–584.
235. Zhiqiang C, Jingshuang C, Min L, Xinyu L (2020) Deep https://doi.org/10.1109/TNNLS.2013.2277601
least-squares methods: an unsupervised learning-based numeri- 253. Keith R, Silvia F (2015) A constrained integration (CINT)
cal method for solving elliptic PDEs. J Comput Phys 420:109707. approach to solving partial differential equations using artificial
https://doi.org/10.1016/j.jcp.2020.109707 neural networks. Neurocomputing 155:277–285. https://doi.org/
236. Justin S, Konstantinos S (2018) DGM: a deep learning algo- 10.1016/j.neucom.2014.11.058
rithm for solving partial differential equations. J Comput Phys 254. Wang F, Jiang M, Qian C, Yang S, Li C, Zhang H, Wang X, Tang
375:1339–1364. https://doi.org/10.1016/j.jcp.2018.08.029 X (2017) Residual attention network for image classification. In:
237. Kharazmi E, Zhang Z, Karniadakis GE (2019) Variational 2017 IEEE conference on computer vision and pattern recog-
physics-informed neural networks for solving partial differential nition (CVPR), pp 6450–6458. https://doi.org/10.1109/CVPR.
equations. arXiv:1912.00873 [physics, stat] 2017.683
238. Ehsan K, Zhongqiang Z, Karniadakis George EM (2021) hp- 255. Zhang S, Yang J, Schiele B (2018) Occluded pedestrian detection
VPINNs: variational physics-informed neural networks with through guided attention in CNNs. In: 2018 IEEE/CVF confer-
domain decomposition. Comput Methods Appl Mech Eng ence on computer vision and pattern recognition, pp 6995–7003.
374:113547. https://doi.org/10.1016/j.cma.2020.113547 https://doi.org/10.1109/CVPR.2018.00731
239. Morokoff William J, Caflisch Russel E (1995) Quasi-Monte Carlo 256. Jim M, Deep R, Hesthaven Jan S, Christian R (2020) Constraint-
integration. J Comput Phys 122(2):218–230. https://doi.org/10. aware neural networks for Riemann problems. J Comput Phys
1006/jcph.1995.1209 409:109345. https://doi.org/10.1016/j.jcp.2020.109345
240. 14-Monte Carlo integration I: basic concepts (2004). In: Pharr M, 257. Nandwani Y, Pathak AM, Singla P (2019) A primal dual for-
Humphreys G (eds) Physically based rendering. Morgan Kauf- mulation for deep learning with constraints. In: Advances in
mann, Burlington, pp 631–660. https://doi.org/10.1016/B978- neural information processing systems, vol 32. Curran Asso-
012553180-1/50016-8 ciates, Inc. https://papers.nips.cc/paper_files/paper/2019/hash/
241. Novak E, Ritter K (1996) High dimensional integration of smooth cf708fc1decf0337aded484f8f4519ae-Abstract.html
functions over cubes. Numer Math 75(1):79–97. https://doi.org/ 258. McClenny L, Braga-Neto U (2022) Self-adaptive physics-
10.1007/s002110050231 informed neural networks using a soft attention mechanism.
242. Rivera Jon A, Taylor Jamie M, Omella Angel J, David P (2022) arXiv:2009.04544 [cs, stat]
On quadrature rules for solving partial differential equations using 259. Lu L, Pestourie R, Yao W, Wang Z, Verdugo F, Johnson SG
neural networks. Comput Methods Appl Mech Eng 393:114710. (2021) Physics-informed neural networks with hard constraints
https://doi.org/10.1016/j.cma.2022.114710 for inverse design. SIAM J Sci Comput 43(6):B1105–B1132.
243. Yaohua Z, Gang B, Xiaojing Y, Haomin Z (2020) Weak adversar- https://doi.org/10.1137/21M1397908
ial networks for high-dimensional partial differential equations. 260. Zeng Q, Kothari Y, Bryngelson SH, Schäfer F (2022) Competitive
J Comput Phys 411:109409. https://doi.org/10.1016/j.jcp.2020. physics informed networks. arXiv:2204.11144 [cs, math]
109409 261. Philipp M, Wolfgang F, Stefan T, Isabell G, Michael G (2023)
244. Minh N-TV, Xiaoying Z, Timon R (2019) A deep energy method Modeling of 3D blood flows with physics-informed neural net-
for finite deformation hyperelasticity. Eur J Mech A Solids. works: comparison of network architectures. Fluids 8(2):46.
https://doi.org/10.1016/j.euromechsol.2019.103874 https://doi.org/10.3390/fluids8020046
245. Weinan E, Bing Yu (2018) The Deep Ritz method: a deep learning- 262. Han J, Tao J, Wang C (2020) FlowNet: a deep learning framework
based numerical algorithm for solving variational problems. for clustering and selection of streamlines and stream surfaces.
Commun Math Stat 6(1):1–12. https://doi.org/10.1007/s40304- IEEE Trans Visual Comput Graph 26(4):1732–1744. https://doi.
018-0127-z org/10.1109/TVCG.2018.2880207
246. Grossmann TG, Komorowska UJ, Latz J, Schönlieb CB (2023) 263. Bhatnagar S, Afshar Y, Pan S, Duraisamy K, Kaushik S (2019)
Can physics-informed neural networks beat the finite element Prediction of aerodynamic flow fields using convolutional neural
method? arXiv:2302.04107 networks. Comput Mech 64(2):525–545. https://doi.org/10.1007/
247. Ali K, Tapan M (2022) Physics-informed PointNet: a deep learn- s00466-019-01740-0
ing solver for steady-state incompressible flows and thermal 264. Han G, Luning S, Jian-Xun W (2021) PhyGeoNet: physics-
fields on multiple sets of irregular geometries. J Comput Phys informed geometry-adaptive convolutional neural networks for
468:111510. https://doi.org/10.1016/j.jcp.2022.111510 solving parameterized steady-state PDEs on irregular domain.
248. Jens B, Kaj N (2018) A unified deep artificial neural network J Comput Phys 428:110079. https://doi.org/10.1016/j.jcp.2020.
approach to partial differential equations in complex geometries. 110079
Neurocomputing 317:28–41. https://doi.org/10.1016/j.neucom. 265. Wandel N, Weinmann M, Neidlin M, Klein R (2022) Spline-
2018.06.056 PINN: approaching PDEs without data using fast, physics-
249. Alexander H, Henning W, Rolf M (2022) Physics informed neu- informed hermite-spline CNNs. arXiv:2109.07143 [physics]
ral networks for continuum micromechanics. Comput Methods 266. Han G, Zahr Matthew J, Jian-Xun W (2022) Physics-informed
Appl Mech Eng 393:114790. https://doi.org/10.1016/j.cma.2022. graph neural Galerkin networks: a unified framework for solving
114790 PDE-governed forward and inverse problems. Comput Methods
250. Lagaris IE, Likas AC, Papageorgiou DG (2000) Neural-network Appl Mech Eng 390:114502. https://doi.org/10.1016/j.cma.2021.
methods for boundary value problems with irregular boundaries. 114502
IEEE Trans Neural Netw 11(5):1041–1049. https://doi.org/10. 267. Möller M, Toshniwal D, Van Ruiten F (2021) Physics-
1109/72.870037 informed machine learning embedded into isogeometric analysis.
Mathematics: key enabling technology for scientific machine

123
320 Computational Mechanics (2024) 74:281–331

learning. https://platformwiskunde.nl/wp-content/uploads/2021/ Springer, Cham, pp 55–84. https://doi.org/10.1007/978-3-030-

11/Math_KET_SciML.pdf 76587-3_5
268. Hughes TJR, Cottrell JA, Bazilevs Y (2005) Isogeometric anal- 285. Anton D, Wessels H (2021) Identification of material parameters
ysis: CAD, finite elements, NURBS, exact geometry and mesh from full-field displacement data using physics-informed neural
refinement. Comput Methods Appl Mech Eng 194(39):4135– networks. https://doi.org/10.13140/RG.2.2.24558.89924/1
4195. https://doi.org/10.1016/j.cma.2004.10.008 286. Yifei Z, QiZhi H, Tartakovsky Alexandre M (2023) Improved
269. Meethal RE, Obst B, Khalil M, Ghantasala A, Kodakkal A, training of physics-informed neural networks for parabolic differ-
Bletzinger KU, Wüchner R (2022) Finite element method- ential equations with sharply perturbed initial conditions. Comput
enhanced neural network for forward and inverse problems. Methods Appl Mech Eng 414:116125. https://doi.org/10.1016/j.
arXiv:2205.08321 [cs, math] cma.2023.116125
270. Hughes TJR (2000) The finite element method: linear static and 287. Yu Jeremy LL, Xuhui M, Em KG (2022) Gradient-enhanced
dynamic finite element analysis. Dover Publications, Mineola physics-informed neural networks for forward and inverse PDE
271. Bathe K-J (ed) (2014) Finite element procedures, 2nd edn. K.J. problems. Comput Methods Appl Mech Eng 393:114823. https://
Bathe, Watertown doi.org/10.1016/j.cma.2022.114823
272. Berrone S, Canuto C, Pintore M (2022) Variational physics 288. Taylor Jamie M, David P, Ignacio M (2023) A deep Fourier resid-
informed neural networks: the role of quadratures and test func- ual method for solving PDEs using neural networks. Comput
tions. J Sci Comput 92(3):100. https://doi.org/10.1007/s10915- Methods Appl Mech Eng 405:115850. https://doi.org/10.1016/
022-01950-4 j.cma.2022.115850
273. Badia S, Li W, Martín AF (2023) Finite element interpolated 289. Pao-Hsiung C, Cheng WJ, Chinchun O, Ha DM, Yew-Soon O
neural networks for solving forward and inverse problems. (2022) CAN-PINN: a fast physics-informed neural network based
arXiv:2306.06304 [cs, math] on coupled-automatic-numerical differentiation method. Comput
274. Alireza Yazdani LL, Raissi M, Karniadakis GE (2020) Systems Methods Appl Mech Eng 395:114909. https://doi.org/10.1016/j.
biology informed deep learning for inferring parameters and hid- cma.2022.114909
den dynamics. PLoS Comput Biol 16(11):e1007575. https://doi. 290. Jagtap Ameya D, Kenji K, Em KG (2020) Adaptive activation
org/10.1371/journal.pcbi.1007575 functions accelerate convergence in deep and physics-informed
275. Carlos U, David P, Javier OA (2022) A finite element based deep neural networks. J Comput Phys 404:109136. https://doi.org/10.
learning solver for parametric PDEs. Comput Methods Appl Mech 1016/j.jcp.2019.109136
Eng 391:114562. https://doi.org/10.1016/j.cma.2021.114562 291. Guang-Bin H, Qin-Yu Z, Chee-Kheong S (2006) Extreme learning
276. Jagtap Ameya D, Ehsan K, Em KG (2020) Conservative physics- machine: theory and applications. Neurocomputing 70(1):489–
informed neural networks on discrete domains for conservation 501. https://doi.org/10.1016/j.neucom.2005.12.126
laws: applications to forward and inverse problems. Comput 292. Huang G-B, Wang DH, Lan Y (2011) Extreme learning machines:
Methods Appl Mech Eng 365:113028. https://doi.org/10.1016/ a survey. Int J Mach Learn Cybern 2(2):107–122. https://doi.org/
j.cma.2020.113028 10.1007/s13042-011-0019-y
277. Khemraj S, Jagtap Ameya D, Em KG (2021) Parallel physics- 293. Suchuan D, Zongwei L (2021) Local extreme learning machines
informed neural networks via domain decomposition. J Comput and domain decomposition for solving linear and nonlinear par-
Phys 447:110683. https://doi.org/10.1016/j.jcp.2021.110683 tial differential equations. Comput Methods Appl Mech Eng
278. Jagtap Ameya D, Em KG (2020) Extended physics-informed 387:114129. https://doi.org/10.1016/j.cma.2021.114129
neural networks (XPINNs): a generalized space-time domain 294. Suchuan D, Jielin Y (2022) Numerical approximation of par-
decomposition based deep learning framework for nonlinear tial differential equations by a variable projection method with
partial differential equations. Commun Comput Phys 28(5):2002– artificial neural networks. Comput Methods Appl Mech Eng
2041. https://doi.org/10.4208/cicp.OA-2020-0164 398:115284. https://doi.org/10.1016/j.cma.2022.115284
279. Chen X, Gong C, Wan Q, Deng L, Wan Y, Liu Y, Chen B, Liu 295. Ehsan H, Maziar R, Adrian M, Hector G, Ruben J (2021) A
J (2021) Transfer learning for deep neural network-based partial physics-informed deep learning framework for inversion and sur-
differential equations solving. Adv Aerodyn 3(1):36. https://doi. rogate modeling in solid mechanics. Comput Methods Appl Mech
org/10.1186/s42774-021-00094-7 Eng 379:113741. https://doi.org/10.1016/j.cma.2021.113741
280. Goswami S, Anitescu C, Chakraborty S, Rabczuk T (2019) 296. Jinshuai B, Hyogu J, Batuwatta-Gamage CP, Shusheng X,
Transfer learning enhanced physics informed neural network for Qingxia W, Rathnayaka CM, Laith A, Liu Gui-Rong G, Yuan-
phase-field modeling of fracture. arXiv:1907.02531 [cs, stat] tong S (2023) An introduction to programming physics-informed
281. He J, Chadha C, Kushwaha S, Koric S, Abueidda D, Jasiuk neural network-based computational solid mechanics. Int J Com-
I (2023) Deep energy method in topology optimization appli- put Methods. https://doi.org/10.1142/S0219876223500135
cations. Acta Mech 234(4):1365–1379. https://doi.org/10.1007/ 297. Georgios K, Yibo Y, Eileen H, Witschey Walter R, Detre John
s00707-022-03449-3 A, Paris P (2020) Machine learning in cardiovascular flows mod-
282. Nabian MA, Gladstone RJ, Meidani H (2021) Efficient training eling: predicting arterial blood pressure from non-invasive 4D
of physics-informed neural networks via importance sampling. flow MRI data using physics-informed neural networks. Comput
Comput Aided Civ Infrastruct Eng 36(8):962–977. https://doi. Methods Appl Mech Eng 358:112623. https://doi.org/10.1016/j.
org/10.1111/mice.12685 cma.2019.112623
283. Hanna John M, Aguado José V, Sebastien C-C, Ramzi A, 298. Raissi M, Yazdani A, Karniadakis GE (2020) Hidden fluid
Domenico B (2022) Residual-based adaptivity for two-phase flow mechanics: learning velocity and pressure fields from flow visual-
simulation in porous media using physics-informed Neural Net- izations. Science 367(6481):1026–1030. https://doi.org/10.1126/
works. Comput Methods Appl Mech Eng 396:115100. https://doi. science.aaw4741
org/10.1016/j.cma.2022.115100 299. Luning S, Han G, Shaowu P, Jian-Xun W (2020) Surrogate
284. Kollmannsberger S, D’Angella D, Jokeit M, Herrmann L (2021) modeling for fluid flows based on physics-constrained deep learn-
Physics-informed neural networks. In: Stefan K, Davide D, Moritz ing without simulation data. Comput Methods Appl Mech Eng
J, Leon H (eds) Deep learning in computational mechanics: 361:112732. https://doi.org/10.1016/j.cma.2019.112732
an introductory course, studies in computational intelligence. 300. Xiaowei J, Shengze C, Hui L, Em KG (2021) NSFnets
(Navier–Stokes flow nets): physics-informed neural networks

123
Computational Mechanics (2024) 74:281–331 321

for the incompressible Navier–Stokes equations. J Comput Phys lems in unsaturated groundwater flow. Georisk Assess Manag
426:109951. https://doi.org/10.1016/j.jcp.2020.109951 Risk Eng Syst Geohazards 16(1):21–36. https://doi.org/10.1080/
301. Shengze C, Zhicheng W, Frederik F, Jin JY, Callum G, Em KG 17499518.2021.1971251
(2021) Flow over an espresso cup: inferring 3-D velocity and 318. Chen X, Trung CB, Yong Y, Günther M (2023) Transfer learning
pressure fields from tomographic background oriented Schlieren based physics-informed neural networks for solving inverse prob-
via physics-informed neural networks. J Fluid Mech 915:A102. lems in engineering structures under different loading scenarios.
https://doi.org/10.1017/jfm.2021.135 Comput Methods Appl Mech Eng 405:115852. https://doi.org/
302. Fraces Cedric G, Hamdi T (2021) Physics informed deep learning 10.1016/j.cma.2022.115852
for flow and transport in porous media. OnePetro. https://doi.org/ 319. Yubiao S, Ushnish S, Matthew J (2023) Physics-informed
10.2118/203934-MS deep learning for simultaneous surrogate modeling and PDE-
303. Wenbo Z, Li David S, Tan B-T, Sacks Michael S (2022) Simulation constrained optimization of an airfoil geometry. Comput Methods
of the 3D hyperelastic behavior of ventricular myocardium using a Appl Mech Eng 411:116042. https://doi.org/10.1016/j.cma.2023.
finite-element based neural-network approach. Comput Methods 116042
Appl Mech Eng 394:114871. https://doi.org/10.1016/j.cma.2022. 320. Rasht-Behesht M, Huber C, Shukla K, Karniadakis GE (2022)
114871 Physics-informed neural networks (PINNs) for wave propagation
304. Wang Jeremy CH, Jean-Pierre H (2023) FluxNet: a physics- and full waveform inversions. J Geophys Res Solid Earth. https://
informed learning-based Riemann solver for transcritical flows doi.org/10.1029/2021JB023120
with non-ideal thermodynamics. Comput Methods Appl Mech 321. Zehnder J, Li Y, Coros S, Thomaszewski B (2021) NTopo: mesh-
Eng 411:116070. https://doi.org/10.1016/j.cma.2023.116070 free topology optimization using implicit neural representations.
305. Sina AN, Ehsan H, Trevor C, Anoush P, Reza V (2021) Physics- arXiv:2102.10782
informed neural network for modelling the thermochemical 322. Di Lorenzo D, Champaney V, Marzin JY, Farhat C, Chinesta
curing process of composite-tool systems during manufacture. F (2023) Physics informed and data-based augmented learning
Comput Methods Appl Mech Eng 384:113959. https://doi.org/ in structural health diagnosis. Comput Methods Appl Mech Eng
10.1016/j.cma.2021.113959 414:116186. https://doi.org/10.1016/j.cma.2023.116186
306. Zhu Q, Liu Z, Yan J (2021) Machine learning for metal addi- 323. Jens B, Kaj N (2019) Data-driven discovery of PDEs in complex
tive manufacturing: predicting temperature and melt pool fluid datasets. J Comput Phys 384:239–252. https://doi.org/10.1016/j.
dynamics using physics-informed neural networks. Comput Mech jcp.2019.01.036
67(2):619–635. https://doi.org/10.1007/s00466-020-01952-9 324. Udrescu S-M, Tegmark M (2020) AI Feynman: a physics-inspired
307. Markidis S (2021) The old and the new: can physics-informed method for symbolic regression. Sci Adv 6(16):2631. https://doi.
deep-learning replace traditional linear solvers? Front Big Data. org/10.1126/sciadv.aay2631
https://doi.org/10.3389/fdata.2021.669097 325. Feynman Richard P, Leighton Robert B, Sands Matthew L (2011)
308. Liangliang L, Li Yunzhu D, Qiuwan LT, Yonghui X (2022) ReF- The Feynman lectures on physics. Basic Books, New York
nets: physics-informed neural network for Reynolds equation 326. Xuhui M, Zhen L, Dongkun Z, Em KG (2020) PPINN: parareal
of gas bearing. Comput Methods Appl Mech Eng 391:114524. physics-informed neural network for time-dependent PDEs. Com-
https://doi.org/10.1016/j.cma.2021.114524 put Methods Appl Mech Eng 370:113250. https://doi.org/10.
309. Chen Yuyao LL, Em KG, Dal NL (2020) Physics-informed neural 1016/j.cma.2020.113250
networks for inverse problems in nano-optics and metamaterials. 327. Revanth M, Susanta G (2022) A novel sequential method to
Optics Express 28(8):11618. https://doi.org/10.1364/OE.384875 train physics informed neural networks for Allen Cahn and Cahn
310. Ruiyang Z, Yang L, Hao S (2020) Physics-informed multi-LSTM Hilliard equations. Comput Methods Appl Mech Eng 390:114474.
networks for metamodeling of nonlinear structures. Comput https://doi.org/10.1016/j.cma.2021.114474
Methods Appl Mech Eng 369:113226. https://doi.org/10.1016/ 328. Iserles A (2008) A first course in the numerical analysis of differ-
j.cma.2020.113226 ential equations. Cambridge University Press
311. Shukla K, Di Leoni PC, Blackshire J, Sparkman D, Karniadakis 329. Henning W, Christian W, Peter W (2020) The neural parti-
GE (2020) Physics-informed neural network for ultrasound non- cle method—an updated Lagrangian physics informed neural
destructive quantification of surface breaking cracks. J Nondestr network for computational fluid dynamics. Comput Methods
Eval 39(3):61. https://doi.org/10.1007/s10921-020-00705-1 Appl Mech Eng 368:113127. https://doi.org/10.1016/j.cma.2020.
312. Anton D, Wessels H (2022) Physics-informed neural networks 113127
for material model calibration from full-field displacement data. 330. Jinshuai B, Ying Z, Yuwei M, Hyogu J, Haifei Z, Charith R, Sauret
arXiv:2212.07723 Emilie G (2022) A general neural particle method for hydrody-
313. Herrmann L, Bürchner T, Dietrich F, Kollmannsberger S (2023) namics modeling. Comput Methods Appl Mech Eng 393:114740.
On the use of neural networks for full waveform inversion. Com- https://doi.org/10.1016/j.cma.2022.114740
put Methods Appl Mech Eng 415:116278. https://doi.org/10. 331. González-García R, Rico-Martínez R, Kevrekidis IG (1998) Iden-
1016/j.cma.2023.116278 tification of distributed parameter systems: a neural net based
314. Rojas Carlos JG, Bitterncourt ML, Boldrini JL (2021) Parameter approach. Comput Chem Eng 22:S965–S968. https://doi.org/10.
identification for a damage model using a physics informed neural 1016/S0098-1354(98)00191-4
network. arXiv:2107.08781 332. Long Z, Lu Y, Ma X, Dong B (2018) PDE-Net: learning PDEs
315. Li W, Lee K-M (2021) Physics informed neural network for from data. In: Proceedings of the 35th international conference
parameter identification and boundary force estimation of compli- on machine learning. PMLR, pp 3208–3216. https://proceedings.
ant and biomechanical systems. Int J Intell Robot Appl 5(3):313– mlr.press/v80/long18a.html
325. https://doi.org/10.1007/s41315-021-00196-x 333. Long Zichao L, Yiping DB (2019) PDE-Net 2.0: learning PDEs
316. Zhang E, Dao M, Karniadakis GE, Suresh S (2022) Analyses from data with a numeric-symbolic hybrid deep network. J Com-
of internal structures and defects in materials using physics- put Phys 399:108925. https://doi.org/10.1016/j.jcp.2019.108925
informed neural networks. Sci Adv 8(7):0644. https://doi.org/10. 334. Hua BS, Tran MK, Yeung SK (2018) Pointwise convolutional
1126/sciadv.abk0644 neural networks. arXiv:1712.05245 [cs]
317. Depina I, Jain S, Mar Valsson S, Gotovac H (2022) Appli- 335. Brunton SL, Proctor JL, Nathan Kutz J (2016) Discovering gov-
cation of physics-informed neural networks to inverse prob- erning equations from data by sparse identification of nonlinear

123
322 Computational Mechanics (2024) 74:281–331

dynamical systems. Proc Natl Acad Sci 113(15):3932–3937. uation of steel plates by neural networks. IEEE Trans Appl
https://doi.org/10.1073/pnas.1517384113 Supercond 9(2):3475–3478. https://doi.org/10.1109/77.783778
336. Rudy SH, Brunton SL, Proctor JL, Nathan Kutz J (2017) 354. Ovcharenko O, Kazei V, Kalita M, Peter D, Alkhalifah T (2019)
Data-driven discovery of partial differential equations. Sci Adv Deep learning for low-frequency extrapolation from multioffset
3(4):e1602614. https://doi.org/10.1126/sciadv.1602614 seismic data. Geophysics 84(6):R989–R1001. https://doi.org/10.
337. Schaeffer H (2017) Learning partial differential equations via data 1190/geo2018-0884.1
discovery and sparse optimization. Proc Roy Soc A Math Phys 355. Sun H, Demanet L (2020) Extrapolated full waveform inversion
Eng Sci 473(2197):20160446. https://doi.org/10.1098/rspa.2016. with deep learning. Geophysics, 85(3):R275–R288. https://doi.
0446 org/10.1190/geo2019-0195.1. arXiv:1909.11536
338. Champion K, Lusch B, Nathan Kutz J, Brunton SL (2019) Data- 356. Sun H, Demanet L (2022) Deep learning for low-frequency
driven discovery of coordinates and governing equations. Proc extrapolation of multicomponent data in elastic FWI. IEEE Trans
Natl Acad Sci 116(45):22445–22451. https://doi.org/10.1073/ Geosci Remote Sens 60:1–11. https://doi.org/10.1109/TGRS.
pnas.1906995116 2021.3135790
339. Paolo C, Giorgio G, Stefania F, Andrea M, Attilio F (2023) 357. Lewis W, Vigh W (2017) Deep learning prior models from
Reduced order modeling of parametrized systems through autoen- seismic images for full-waveform inversion. In: SEG techni-
coders and SINDy approach: continuation of periodic solutions. cal program expanded abstracts 2017. Society of Exploration
Comput Methods Appl Mech Eng 411:116072. https://doi.org/ Geophysicists, Houston, pp 1512–1517. https://doi.org/10.1190/
10.1016/j.cma.2023.116072 segam2017-17627643.1
340. Raissi M, Perdikaris P, Karniadakis GE (2018) Multistep neural 358. Dyck DN, Lowther DA, McFee S (1992) Determining an approxi-
networks for data-driven discovery of nonlinear dynamical sys- mate finite element mesh density using neural network techniques.
tems. arXiv:1801.01236 [nlin, physics:physics, stat] IEEE Trans Magn 28(2):1767–1770. https://doi.org/10.1109/20.
341. Kim B, Azevedo VC, Thuerey N, Kim T, Gross M, Solenthaler B 124047
(2019) Deep fluids: a generative network for parameterized fluid 359. Chedid R, Najjar N (1996) Automatic finite-element mesh gen-
simulations. Comput Graph Forum 38(2):59–70. https://doi.org/ eration using artificial neural networks-part I: prediction of mesh
10.1111/cgf.13619 density. IEEE Trans Magn 32(5):5173–5178. https://doi.org/10.
342. Julia L, Reese J, Jeremy T (2016) Machine learning strategies for 1109/20.538619
systems with invariance properties. J Comput Phys 318:22–35. 360. Triantafyllidis DG, Labridis DP (2000) An automatic mesh
https://doi.org/10.1016/j.jcp.2016.05.003 generator for handling small features in open boundary power
343. Julia L, Andrew K, Jeremy T (2016) Reynolds averaged tur- transmission line problems using artificial neural networks. Com-
bulence modelling using deep neural networks with embedded mun Numer Methods Eng 16(3):177–190
invariance. J Fluid Mech 807:155–166. https://doi.org/10.1017/ 361. Zhang Z, Wang Y, Jimack PK, Wang H (2020) MeshingNet:
jfm.2016.615 a new mesh generation method based on deep learning. In:
344. Smith GF (1965) On isotropic integrity bases. Arch Ration Mech Krzhizhanovskaya VV, Závodszky G, Lees MH, Dongarra JJ,
Anal 18(4):282–292. https://doi.org/10.1007/BF00251667 Sloot PMA, Brissos S, Teixeira J (eds) Computational science—
345. Lutter M, Listmann K, Peters J (2019) Deep Lagrangian networks ICCS 2020, vol 12139. Lecture notes in computer science.
for end-to-end learning of energy-based control for under-actuated Springer, Cham, pp 186–198. https://doi.org/10.1007/978-3-030-
systems. In: 2019 IEEE/RSJ international conference on intelli- 50420-5_14
gent robots and systems (IROS), pp 7718–7725. https://doi.org/ 362. Lock C, Hassan O, Sevilla R, Jones J (2023) Meshing using neural
10.1109/IROS40897.2019.8968268 networks for improving the efficiency of computer modelling. Eng
346. Lutter M, Ritter C, Peters J (2019) Deep Lagrangian networks: Comput. https://doi.org/10.1007/s00366-023-01812-z
using physics as model prior for deep learning. arXiv:1907.04490 363. Bernd F (1994) Growing cell structures—a self-organizing net-
[cs, eess, stat] work for unsupervised and supervised learning. Neural Netw
347. Cranmer M, Greydanus S, Hoyer S, Battaglia P, Spergel D, Ho S 7(9):1441–1460. https://doi.org/10.1016/0893-6080(94)90091-
(2020) Lagrangian neural networks. arXiv:2003.04630 [physics, 4
stat] 364. Alfonzetti S, Coco S, Cavalieri S, Malgeri M (1996) Automatic
348. Greydanus S, Dzamba M, Yosinski J (2019) Hamiltonian neural mesh generation by the let-it-grow neural network. IEEE Trans
networks. arXiv:1906.01563 [cs] Magn 32(3):1349–1352. https://doi.org/10.1109/20.497496
349. Zhang L, Yang F, Daniel Zhang Y, Zhu YJ (2016) Road crack 365. Triantafyllidis DG, Labridis DP (2002) A finite-element mesh
detection using deep convolutional neural network. In: 2016 IEEE generator based on growing neural networks. IEEE Trans Neu-
international conference on image processing (ICIP), pp 3708– ral Netw 13(6):1482–1496. https://doi.org/10.1109/TNN.2002.
3712. https://doi.org/10.1109/ICIP.2016.7533052 804223
350. Chen F-C, Jahanshahi MR (2018) NB-CNN: deep learning-based 366. Lefik M, Schrefler BA (2003) Artificial neural network as an incre-
crack detection using convolutional neural network and Naïve mental non-linear constitutive model for a finite element code.
Bayes data fusion. IEEE Trans Ind Electron 65(5):4392–4400. Comput Methods Appl Mech Eng 192(28):3265–3283. https://
https://doi.org/10.1109/TIE.2017.2764844 doi.org/10.1016/S0045-7825(03)00350-5
351. Jaeger BE, Schmid S, Grosse CU, Gögelein A, Elischberger F 367. Phill JD, Piemaan F, Whan YJ (2021) Machine learning-based
(2022) Infrared thermal imaging-based turbine blade crack clas- constitutive model for J2- plasticity. Int J Plast 138:102919.
sification using deep learning. J Nondestr Eval 41(4):74. https:// https://doi.org/10.1016/j.ijplas.2020.102919
doi.org/10.1007/s10921-022-00907-9 368. Lin YC, Jun Z, Jue Z (2008) Application of neural networks to
352. Korshunova N, Jomo J, Lékó G, Reznik D, Balázs P, Kollmanns- predict the elevated temperature flow behavior of a low alloy
berger S (2020) Image-based material characterization of complex steel. Comput Mater Sci 43(4):752–758. https://doi.org/10.1016/
microarchitectured additively manufactured structures. Comput j.commatsci.2008.01.039
Math Appl 80(11):2462–2480. https://doi.org/10.1016/j.camwa. 369. Li Hong-Ying H, Ji-Dong WD-D, Xiao-Feng W, Yang-Hua L
2020.07.018 (2012) Artificial neural network and constitutive equations to
353. Hall Barbosa C, Bruno AC, Vellasco M, Pacheco M, Wikswo predict the hot deformation behavior of modified 2.25Cr-1Mo
JP, Ewing AP (1999) Automation of SQUlD nondestructive eval-

123
Computational Mechanics (2024) 74:281–331 323

steel. Mater Des 42:192–197. https://doi.org/10.1016/j.matdes. modeling by deep learning. J Comput Phys 429:110010. https://
2012.05.056 doi.org/10.1016/j.jcp.2020.110010
370. Daoping L, Hang Y, Elkhodary KI, Shan T, Kam LW, Guo 387. Mozaffar M, Bostanabad R, Chen W, Ehmann K, Cao J, Bessa
X (2022) Mechanistically informed data-driven modeling of MA (2019) Deep learning predicts path-dependent plasticity. Proc
cyclic plasticity via artificial neural networks. Comput Methods Natl Acad Sci 116(52):26414–26420. https://doi.org/10.1073/
Appl Mech Eng 393:114766. https://doi.org/10.1016/j.cma.2022. pnas.1911815116
114766 388. Ling W, Ludovic N (2022) Recurrent neural networks (RNNs)
371. Unger Jörg F, Carsten K (2009) Neural networks as material mod- with dimensionality reduction and break down in computational
els within a multiscale approach. Comput Struct 87(19):1177– mechanics; application to multi-scale localization step. Comput
1186. https://doi.org/10.1016/j.compstruc.2008.12.003 Methods Appl Mech Eng 390:114476. https://doi.org/10.1016/j.
372. Gabriel H, Luiz SA (2015) Contact stiffness estimation in ANSYS cma.2021.114476
using simplified models and artificial neural networks. Finite Elem 389. Abueidda Diab W, Seid K, Sobh Nahil A, Huseyin S (2021)
Anal Des 97:43–53. https://doi.org/10.1016/j.finel.2015.01.003 Deep learning for plasticity and thermo-viscoplasticity. Int J Plast
373. Atsuya O, Shinobu Y (1970) A new local contact search method 136:102852. https://doi.org/10.1016/j.ijplas.2020.102852
using a multi-layer neural network. Comput Model Eng Sci 390. Hsu Yu-Chuan Yu, Chi-Hua BM (2020) Using deep learning to
21(2):93–104. https://doi.org/10.3970/cmes.2007.021.093 predict fracture patterns in crystalline solids. Matter 3(1):197–
374. Oishi A, Yagawa G (2020) A surface-to-surface contact search 211. https://doi.org/10.1016/j.matt.2020.04.019
method enhanced by deep learning. Comput Mech 65(4):1125– 391. Lew AJ, Yu CH, Hsu YC, Buehler MJ (2021) Deep learning model
1147. https://doi.org/10.1007/s00466-019-01811-2 to predict fracture mechanisms of graphene. Npj 2D Mater Appl
375. Singh AP, Medida S, Duraisamy K (2017) Machine-learning- 5(1):1–8. https://doi.org/10.1038/s41699-021-00228-x
augmented predictive modeling of turbulent separated flows 392. Minliang L, Liang L, Wei S (2020) A generic physics-informed
over airfoils. AIAA J 55(7):2215–2227. https://doi.org/10.2514/ neural network-based constitutive model for soft biological tis-
1.J055595 sues. Comput Methods Appl Mech Eng 372:113402. https://doi.
376. Maulik R, San O, Rasheed A, Vedula P (2019) Subgrid modelling org/10.1016/j.cma.2020.113402
for two-dimensional turbulence using neural networks. J Fluid 393. Weber P, Geiger J, Wagner W (2021) Constrained neural net-
Mech 858:122–144. https://doi.org/10.1017/jfm.2018.770 work training and its application to hyperelastic material mod-
377. Arnau F, Joan B, Ramon C (2022) Finite element approximation eling. Comput Mech 68(5):1179–1204. https://doi.org/10.1007/
of wave problems with correcting terms based on training artificial s00466-021-02064-8
neural networks with fine solutions. Comput Methods Appl Mech 394. Leng Y, Tac V, Calve S, Tepole AB (2021) Predicting the
Eng 399:115280. https://doi.org/10.1016/j.cma.2022.115280 mechanical properties of biopolymer gels using neural net-
378. Le BA, Yvonnet J, He Q-C (2015) Computational homogeniza- works trained on discrete fiber network data. Comput Methods
tion of nonlinear elastic materials using neural networks. Int J Appl Mech Eng 387:114160. https://doi.org/10.1016/j.cma.2021.
Numer Method Eng 104(12):1061–1084. https://doi.org/10.1002/ 114160. arXiv:2101.11712 [cs, q-bio]
nme.4953 395. Vahidullah T, Francisco SC, Tepole Adrian B (2022) Data-driven
379. Xiaoxin L, Giovanis DG, Yvonnet J, Papadopoulos V, Detrez tissue mechanics with polyconvex neural ordinary differential
F, Bai J (2019) A data-driven computational homogenization equations. Comput Methods Appl Mech Eng 398:115248. https://
method based on neural networks for the nonlinear anisotropic doi.org/10.1016/j.cma.2022.115248
electrical response of graphene/polymer nanocomposites. Com- 396. Linden L, Klein DK, Kalina KA, Brummund J, Weeger O, Käst-
put Mech 64(2):307–321. https://doi.org/10.1007/s00466-018- ner M (2023) Neural networks meet hyperelasticity: a guide to
1643-0 enforcing physics. arXiv:2302.02403 [cs]
380. Huang Daniel Z, Kailai X, Charbel F, Eric D (2020) Learning 397. Klein Dominik K, Rogelio O, Jesús M-F, Oliver W (2022) Finite
constitutive relations from indirect observations using deep neural electro-elasticity with physics-augmented neural networks. Com-
networks. J Comput Phys 416:109491. https://doi.org/10.1016/j. put Methods Appl Mech Eng 400:115501. https://doi.org/10.
jcp.2020.109491 1016/j.cma.2022.115501
381. Kun W, WaiChing S (2018) A multiscale multi-permeability poro- 398. Klein Dominik K, Mauricio F, Martin Robert J, Patrizio N, Oliver
plasticity model linked by recursive homogenizations and deep W (2022) Polyconvex anisotropic hyperelasticity with neural net-
learning. Comput Methods Appl Mech Eng 334:337–380. https:// works. J Mech Phys Solids 159:104703. https://doi.org/10.1016/
doi.org/10.1016/j.cma.2018.01.036 j.jmps.2021.104703
382. Li B, Zhuang X (2020) Multiscale computation on feedforward 399. As’ad F, Farhat C (2023) A mechanics-informed neural network
neural network and recurrent neural network. Front Struct Civ Eng framework for data-driven nonlinear viscoelasticity. In: AIAA
14(6):1285–1298. https://doi.org/10.1007/s11709-020-0691-7 SCITECH 2023 forum. American Institute of Aeronautics and
383. Vlassis Nikolaos N, Ran M, WaiChing S (2020) Geometric deep Astronautics, National Harbor. https://doi.org/10.2514/6.2023-
learning for computational mechanics part I: anisotropic hypere- 0949
lasticity. Comput Methods Appl Mech Eng 371:113299. https:// 400. Vahidullah T, Rausch Manuel K, Francisco SC, Buganza TA
doi.org/10.1016/j.cma.2020.113299 (2023) Data-driven anisotropic finite viscoelasticity using neu-
384. Frankenreiter I, Rosato D, Miehe C (2011) Hybrid micro- ral ordinary differential equations. Comput Methods Appl Mech
macro-modeling of evolving anisotropies and length scales in Eng 411:116046. https://doi.org/10.1016/j.cma.2023.116046
finite plasticity of polycrystals: hybrid micro-macro-modeling of 401. Amos B, Xu L, Zico KJ (2017) Input convex neural networks.
evolving anisotropies and length scales in finite plasticity of poly- In: Proceedings of the 34th international conference on machine
crystals. PAMM 11(1):515–518. https://doi.org/10.1002/pamm. learning. PMLR, pp 146–155. https://proceedings.mlr.press/v70/
201110249 amos17b.html
385. Fish J (2013) Practical multiscaling. Wiley, Chichester 402. Chen Ricky TQ, Rubanova Y, Bettencourt J, Duvenaud D (2019)
386. Kevin L, Markus H, Abdolazizi Kian P, Aydin Roland C, Mikhail Neural ordinary differential equations. arXiv:1806.07366
I, Cyron Christian J (2021) Constitutive artificial neural networks: 403. Peiyi C, Johann G (2022) Polyconvex neural networks for hyper-
a fast and general approach to predictive data-driven constitutive elastic constitutive models: a rectification approach. Mech Res

123
324 Computational Mechanics (2024) 74:281–331

Commun 125:103993. https://doi.org/10.1016/j.mechrescom. deep learning from instrumented indentation. Proc Natl Acad Sci
2022.103993 117(13):7052–7062. https://doi.org/10.1073/pnas.1922210117
404. Filippo M, Ioannis S, Paolo V, Victor M-B (2021) 418. Xuhui M, Em KG (2020) A composite neural network that learns
Thermodynamics-based artificial neural networks for con- from multi-fidelity data: application to function approximation
stitutive modeling. J Mech Phys Solids 147:104277. https://doi. and inverse PDE problems. J Comput Phys 401:109020. https://
org/10.1016/j.jmps.2020.104277 doi.org/10.1016/j.jcp.2019.109020
405. Masi F, Stefanou I, Vannucci P, Maffi-Berthier V (2021) Material 419. Xing L, Athanasiou Christos E, Padture Nitin P, Sheldon Brian
modeling via thermodynamics-based artificial neural networks. W, Huajian G (2020) A machine learning approach to fracture
In: Barbaresco F, Nielsen F (eds) Geometric structures of sta- mechanics problems. Acta Mater 190:105–112. https://doi.org/
tistical physics, information geometry, and learning. Springer 10.1016/j.actamat.2020.03.016
proceedings in mathematics and statistics. Springer, Cham, pp 420. Hambli R, Katerchi H, Benhamou C-L (2011) Multiscale
308–329. https://doi.org/10.1007/978-3-030-77957-3_16 methodology for bone remodelling simulation using coupled
406. Filippo M, Ioannis S (2022) Multiscale modeling of inelastic finite element and neural network computation. Biomech Model
materials with thermodynamics-based artificial neural networks Mechanobiol 10(1):133–145. https://doi.org/10.1007/s10237-
(TANN). Comput Methods Appl Mech Eng 398:115190. https:// 010-0222-x
doi.org/10.1016/j.cma.2022.115190 421. Moritz F, Siddhant K, Laura DL (2021) Unsupervised discovery
407. Ladeveze P, Nedjar D, Reynier M (1994) Updating of finite ele- of interpretable hyperelastic constitutive laws. Comput Methods
ment models using vibration tests. AIAA J 32(7):1485–1491. Appl Mech Eng 381:113852. https://doi.org/10.1016/j.cma.2021.
https://doi.org/10.2514/3.12219 113852
408. Basile M, Ludovic C, Christian R (2019) Parameter identifica- 422. Robert T (1996) Regression shrinkage and selection via the lasso.
tion and model updating in the context of nonlinear mechanical J Roy Stat Soc Ser B Methodol 58(1):267–288
behaviors using a unified formulation of the modified constitu- 423. Flaschel M, Kumar S, De Lorenzis L (2022) Discovering plasticity
tive relation error concept. Comput Methods Appl Mech Eng models without stress data. npj Comput Mater, 8(1):91. https://
345:1094–1113. https://doi.org/10.1016/j.cma.2018.09.008 doi.org/10.1038/s41524-022-00752-4. arXiv:2202.04916 [cs]
409. Nam NH, Ludovic C, Cuong HM (2022) mCRE-based parameter 424. Enzo M, Moritz F, Siddhant K, Laura DL (2023) Auto-
identification from full-field measurements: consistent frame- mated identification of linear viscoelastic constitutive laws with
work, integrated version, and extension to nonlinear material EUCLID. Mech Mater 181:104643. https://doi.org/10.1016/j.
behaviors. Comput Methods Appl Mech Eng 400:115461. https:// mechmat.2023.104643
doi.org/10.1016/j.cma.2022.115461 425. Moritz F, Siddhant K, Laura DL (2023) Automated discovery
410. Benady A, Baranger E, Chamoin L (2023) NN-mCRE: a modified of generalized standard material models with EUCLID. Comput
constitutive relation error framework for unsupervised learning Methods Appl Mech Eng 405:115867. https://doi.org/10.1016/j.
of nonlinear state laws with physics-augmented neural networks. cma.2022.115867
https://doi.org/10.13140/RG.2.2.32171.00804 426. Akshay J, Prakash T, Yiwen Z, Maxime E, Moritz F, Laura DL,
411. Benady AB, Chamoin LC, Baranger EB (2023) A modi- Siddhant K (2022) Bayesian-EUCLID: discovering hyperelastic
fied constitutive relation error (mCRE) framework to learn material laws with uncertainties. Comput Methods Appl Mech
nonlinear constitutive models from strain measurements with Eng 398:115225. https://doi.org/10.1016/j.cma.2022.115225
thermodynamics-consistent neural networks. In: International 427. Kevin L, Sarah P, Kuhl E (2023) Automated model discovery for
conference on adaptive modeling and simulation (ADMOS 2023), human brain using constitutive artificial neural networks. Acta
advanced techniques for data assimilation, inverse analysis, and Biomater 160:134–151. https://doi.org/10.1016/j.actbio.2023.01.
data-based enrichment of simulation models. https://doi.org/10. 055
23967/admos.2023.020 428. Kevin L, Ellen K (2023) A new family of constitutive artificial
412. Xueyang L, Roth Christian C, Dirk M (2019) Machine-learning neural networks towards automated model discovery. Comput
based temperature- and rate-dependent plasticity model: applica- Methods Appl Mech Eng 403:115731. https://doi.org/10.1016/
tion to analysis of fracture experiments on DP steel. Int J Plast j.cma.2022.115731
118:320–344. https://doi.org/10.1016/j.ijplas.2019.02.012 429. Atsuya O, Genki Y (2017) Computational mechanics enhanced
413. Prakash T, Akshay J, Yiwen Z, Yiwen F, Laura DL, Siddhant K by deep learning. Comput Methods Appl Mech Eng 327:327–351.
(2022) NN-EUCLID: deep-learning hyperelasticity without stress https://doi.org/10.1016/j.cma.2017.08.040
data. J Mech Phys Solids 169:105076. https://doi.org/10.1016/j. 430. Jaeho J, Kyungho Y, Phill-Seung L (2020) Deep learned finite
jmps.2022.105076 elements. Comput Methods Appl Mech Eng 372:113401. https://
414. Xiang L, Zhanli L, Shaoqing C, Chengcheng L, Chenfeng L, Zhuo doi.org/10.1016/j.cma.2020.113401
Z (2019) Predicting the effective mechanical property of hetero- 431. Bar-Sinai Y, Hoyer S, Hickey J, Brenner MP (2019) Learning data-
geneous materials by image based modeling and deep learning. driven discretizations for partial differential equations. Proc Natl
Comput Methods Appl Mech Eng 347:735–753. https://doi.org/ Acad Sci USA 116(31):15344–15349. https://doi.org/10.1073/
10.1016/j.cma.2019.01.005 pnas.1814058116
415. Henkes A, Caylak I, Mahnken R (2021) A deep learning 432. Panos P, Mobasher Mostafa E (2023) Integrated finite ele-
driven pseudospectral PCE based FFT homogenization algo- ment neural network (I-FENN) for non-local continuum dam-
rithm for complex microstructures. Comput Methods Appl Mech age mechanics. Comput Methods Appl Mech Eng 404:115766.
Eng 385:114070. https://doi.org/10.1016/j.cma.2021.114070. https://doi.org/10.1016/j.cma.2022.115766
arXiv:2110.13440 433. Arcones DA, Meethal RE, Obst B, Wüchner R (2022) Neural
416. Minliang L, Liang L, Wei S (2019) Estimation of in vivo network-based surrogate models applied to fluid–structure inter-
constitutive parameters of the aortic wall using a machine learn- action problems. In: WCCM-APCOM 2022, 1700 data science,
ing approach. Comput Methods Appl Mech Eng 347:201–217. machine learning and artificial intelligence. https://doi.org/10.
https://doi.org/10.1016/j.cma.2018.12.030 23967/wccm-apcom.2022.080
417. Lu L, Dao M, Kumar P, Ramamurty U, Karniadakis GE, Suresh S 434. Changnian H, Peng Z, Danny B, Guojing C, Yuefan D (2021)
(2020) Extraction of mechanical properties of materials through Artificial intelligence for accelerating time integrations in multi-

123
Computational Mechanics (2024) 74:281–331 325

scale modeling. J Comput Phys 427:110053. https://doi.org/10. Appl Mech Eng 412:115991. https://doi.org/10.1016/j.cma.2023.
1016/j.jcp.2020.110053 115991
435. Tomasz S, Mateusz D, Anna P, Ignacio M, Marcin Ł, Maciej 452. Plessix R-E (2006) A review of the adjoint-state method for com-
P (2023) Automatic stabilization of finite-element simulations puting the gradient of a functional with geophysical applications.
using neural networks and hierarchical matrices. Comput Meth- Geophys J Int 167(2):495–503. https://doi.org/10.1111/j.1365-
ods Appl Mech Eng 411:116073. https://doi.org/10.1016/j.cma. 246X.2006.02978.x
2023.116073 453. Dan G (2021) A tutorial on the adjoint method for inverse prob-
436. Mariusz B, Salman YM, Nathan Z, Duane D, Stefan M, Satchit R, lems. Comput Methods Appl Mech Eng 380:113810. https://doi.
Thiago R, Fabian D (2023) Learning hyperparameter predictors org/10.1016/j.cma.2021.113810
for similarity-based multidisciplinary topology optimization. Sci 454. Keshavarzzadeh V, Kirby RM, Narayan A (2021) Robust topology
Rep 13(1):14856. https://doi.org/10.1038/s41598-023-42009-0 optimization with low rank approximation using artificial neu-
437. Casadei F, Rimoli JJ, Ruzzene M (2013) A geometric multiscale ral networks. Comput Mech 68(6):1297–1323. https://doi.org/10.
finite element method for the dynamic analysis of heterogeneous 1007/s00466-021-02069-3
solids. Comput Methods Appl Mech Eng 263:56–70. https://doi. 455. Qian C, Ye W (2021) Accelerating gradient-based topology
org/10.1016/j.cma.2013.05.009 optimization design with dual-model artificial neural networks.
438. Oztoprak O, Paolini A, D’Acunto P, Rank E, Kollmannsberger S Struct Multidiscip Optim 63(4):1687–1707. https://doi.org/10.
(2023) Two-scale analysis of spaceframes with complex additive 1007/s00158-020-02770-6
manufactured nodes. Eng Struct 289:116283. https://doi.org/10. 456. Heng C, Yuyu Z, Elaine TTL, Lucia M, Livio D, Le S, Paulino
1016/j.engstruct.2023.116283 Glaucio H (2021) Universal machine learning for topology opti-
439. Arnd K, Franz B, Bernd M (2020) An intelligent nonlinear meta mization. Comput Methods Appl Mech Eng 375:112739. https://
element for elastoplastic continua: deep learning using a new doi.org/10.1016/j.cma.2019.112739
time-distributed residual U-Net architecture. Comput Methods 457. Aulig N, Olhofer M (2013) Evolutionary generation of neural
Appl Mech Eng 366:113088. https://doi.org/10.1016/j.cma.2020. network update signals for the topology optimization of struc-
113088 tures. In: Proceedings of the 15th annual conference companion on
440. German C, Rimoli Julian J (2019) Smart finite elements: a novel genetic and evolutionary computation, GECCO ’13 Companion.
machine learning application. Comput Methods Appl Mech Eng Association for Computing Machinery, New York, pp 213–214.
345:363–381. https://doi.org/10.1016/j.cma.2018.10.046 https://doi.org/10.1145/2464576.2464685
441. Taichi Y, Hiroshi O (2021) Zooming method for FEA using a 458. Aulig N, Olhofer M (2014) Topology optimization by predicting
neural network. Comput Struct 247:106480. https://doi.org/10. sensitivities based on local state features. https://congress.cimne.
1016/j.compstruc.2021.106480 com/iacm-eccomas2014/admin/files/filePaper/p437.pdf
442. Minglang Y, Zhang Enrui Yu, Yue KG (2022) Interfacing finite 459. Aulig N, Olhofer M (2015) Neuro-evolutionary topology opti-
elements with deep neural operators for fast multiscale model- mization with adaptive improvement threshold. In: Mora AM,
ing of mechanics problems. Comput Methods Appl Mech Eng Squillero G (eds) Applications of evolutionary computation. Lec-
402:115027. https://doi.org/10.1016/j.cma.2022.115027 ture notes in computer science. Springer, Cham, pp 655–666.
443. Sigmund O (2011) On the usefulness of non-gradient approaches https://doi.org/10.1007/978-3-319-16549-3_53
in topology optimization. Struct Multidiscip Optim 43(5):589– 460. Zhang Y, Chi H, Chen B, Tang TLE, Mirabella L, Song L,
596. https://doi.org/10.1007/s00158-011-0638-7 Paulino GH (2021) Speeding up computational morphogenesis
444. Holl P, Koltun V, Thuerey N (2020) Learning to control PDEs with online neural synthetic gradients. arXiv:2104.12282
with differentiable physics. arXiv:2001.07457 [physics, stat] 461. Hunter TH, Hulsoff SH, Sitaram A (2023) SuperAdjoint: super-
445. Um K, Brand R, Fei Y, Holl P, Thuerey N (2020) Solver-in-the- resolution neural networks in adjoint-based output error esti-
loop: learning from differentiable physics to interact with iterative mation. In: International conference on adaptive modeling and
PDE-solvers. In: Proceedings of the 34th international conference simulation (ADMOS 2023), recent developments in methods
on neural information processing systems, NIPS’20. Curran Asso- and applications for mesh adaptation. https://doi.org/10.23967/
ciates Inc, Red Hook, pp 6111–6122 admos.2023.058
446. Um K, Brand R, Yun F, Holl P, Thuerey P (2021) Solver-in-the- 462. Kai F, Koji F, Kunihiko T (2021) Machine-learning-based spatio-
loop: learning from differentiable physics to interact with iterative temporal super resolution reconstruction of turbulent flows. J
PDE-solvers. arXiv:2007.00016 [physics] Fluid Mech 909:A9. https://doi.org/10.1017/jfm.2020.948
447. Jensen CA, Reed RD, Marks RJ, El-Sharkawi MA, Jung J-B, 463. Senhora Fernando V, Heng C, Yuyu Z, Lucia M, Elaine TTL,
Miyamoto RT, Anderson GM, Eggen CJ (1999) Inversion of feed- Paulino Glaucio H (2022) Machine learning for topology opti-
forward neural networks: algorithms and applications. Proc IEEE mization: physics-based learning through an independent training
87(9):1536–1549. https://doi.org/10.1109/5.784232 strategy. Comput Methods Appl Mech Eng 398:115116. https://
448. Chi-Hua Yu, Qin Z, Buehler MJ (2019) Artificial intelligence doi.org/10.1016/j.cma.2022.115116
design algorithm for nanocomposites optimized for shear crack 464. Hsieh JT, Zhao S, Eismann S, Mirabella L, Ermon S (2019)
resistance. Nano Futures 3(3):035001. https://doi.org/10.1088/ Learning neural PDE solvers with convergence guarantees.
2399-1984/ab36f0 arXiv:1906.01200 [cs, stat]
449. Chen C-T, Grace XG (2020) Generative deep neural networks 465. Hong-Ling Y, Ji-Cheng L, Bo-Shuai Y, Nan W, Yun-Kang
for inverse materials design using backpropagation and active S (2021) Acceleration design for continuum topology opti-
learning. Adv Sci 7(5):1902607. https://doi.org/10.1002/advs. mization by using Pix2pix neural network. Int J Appl Mech
201902607 13(04):2150042. https://doi.org/10.1142/S1758825121500423
450. Tanyu DN, Ning J, Freudenberg T, Heilenkötter N, Rademacher 466. Hoyer S, Sohl-Dickstein J, Greydanus S (2019) Neural reparam-
A, Iben U, Maass Pr (2022) Deep learning methods for partial dif- eterization improves structural optimization. arXiv:1909.04240
ferential equations and related parameter identification problems. 467. Xu K, Darve E (2019) The neural network approach to inverse
arXiv:2212.03130 problems in differential equations. arXiv:1901.07758
451. Zohdi TI (2023) A machine-learning digital-twin for rapid large- 468. Jens B, Kaj N (2021) Neural networks as smooth priors for inverse
scale solar-thermal energy system design. Comput Methods problems for PDEs. J Comput Math Data Sci 1:100008. https://
doi.org/10.1016/j.jcmds.2021.100008

123
326 Computational Mechanics (2024) 74:281–331

469. Chen L, Shen MHH (2021) A new topology optimization 486. Kai F, Koji F, Kunihiko T (2019) Super-resolution reconstruction
approach by physics-informed deep learning process. Adv Sci of turbulent flows with machine learning. J Fluid Mech 870:106–
Technol Eng Syst J 6(4):233–240. https://doi.org/10.25046/ 120. https://doi.org/10.1017/jfm.2019.238
aj060427 487. Nicholas N, Sai-Aksharah S, Tran Huy T, James Kai A (2020) An
470. Alex H, Flavio CL, Alexander H (2021) An artificial intelligence- artificial neural network approach for generating high-resolution
assisted design method for topology optimization without pre- designs from low-resolution input in topology optimization. J
optimized training data. Appl Sci 11(19):9041. https://doi.org/ Mech Des 142(1):011402. https://doi.org/10.1115/1.4044332
10.3390/app11199041 488. Wang C, Yao S, Wang Z, Jie H (2021) Deep super-
471. Deng H, Albert CT (2020) Topology optimization based on resolution neural network for structural topology optimiza-
deep representation learning (DRL) for compliance and stress- tion. Eng Optim 53(12):2108–2121. https://doi.org/10.1080/
constrained design. Comput Mech 66(2):449–469. https://doi. 0305215X.2020.1846031
org/10.1007/s00466-020-01859-5 489. Xue L, Liu J, Wen G, Wang H (2021) Efficient, high-resolution
472. Chandrasekhar A, Suresh K (2021) TOuNN: topology optimiza- topology optimization method based on convolutional neural net-
tion using neural networks. Struct Multidiscip Optim 63(3):1135– works. Front Mech Eng 16(1):80–96. https://doi.org/10.1007/
1149. https://doi.org/10.1007/s00158-020-02748-4 s11465-020-0614-2
473. Chandrasekhar A, Suresh K (2021) Length scale control in 490. Oishi A, Yagawa G (2021) Finite elements using neural networks
topology optimization using fourier enhanced neural networks. and a posteriori error. Arch Comput Methods Eng 28(5):3433–
arXiv:2109.01861 3456. https://doi.org/10.1007/s11831-020-09507-0
474. Aaditya C, Krishnan S (2021) Multi-material topology optimiza- 491. Ohrt EM, Niels A, Andreas BJ, Ole S (2022) De-homogenization
tion using neural networks. Comput Aided Des 136:103017. using convolutional neural networks. Comput Methods Appl
https://doi.org/10.1016/j.cad.2021.103017 Mech Eng 388:114197. https://doi.org/10.1016/j.cma.2021.
475. Park JJ, Florence P, Straub J, Newcombe R, Lovegrove S (2019) 114197
DeepSDF: learning continuous signed distance functions for 492. Wan ZY, Vlachas P, Koumoutsakos P, Sapsis T (2018) Data-
shape representation. In: 2019 IEEE/CVF conference on com- assisted reduced-order modeling of extreme events in complex
puter vision and pattern recognition (CVPR), pp 165–174. https:// dynamical systems. PLoS ONE 13(5):e0197704. https://doi.org/
doi.org/10.1109/CVPR.2019.00025 10.1371/journal.pone.0197704
476. Michalkiewicz M, Pontes JK, Jack D, Baktashmotlagh M, Eriks- 493. Sato S, Dobashi Y, Kim T, Nishita T (2018) Example-based tur-
son A (2019) Implicit surface representations as layers in neural bulence style transfer. ACM Trans Graph 37(4):84:1-84:9. https://
networks. In: 2019 IEEE/CVF international conference on com- doi.org/10.1145/3197517.3201398
puter vision (ICCV), pp 4742–4751. https://doi.org/10.1109/ 494. Chu M, Thuerey N (2017) Data-driven synthesis of smoke
ICCV.2019.00484 flows with CNN-based feature descriptors. ACM Trans Graph
477. Gropp A, Yariv L, Haim N, Atzmon M, Lipman Y (2020) Implicit 36(4):69:1-69:14. https://doi.org/10.1145/3072959.3073643
geometric regularization for learning shapes. In: Proceedings of 495. Yildiz AR, Öztürk N, Kaya N, Öztürk F (2003) Integrated
the 37th international conference on machine learning, vol 119 of optimal topology design and shape optimization using neural net-
ICML’20, pp 3789–3799. JMLR.org works. Struct Multidiscip Optim 25(4):251–260. https://doi.org/
478. Sitzmann V, Martel Julien NP, Bergman AW, Lindell DB, Wet- 10.1007/s00158-003-0300-0
zstein G (2020) Implicit neural representations with periodic 496. Chyi-Yeu L, Shin-Hong L (2005) Artificial neural network
activation functions. arXiv:2006.09661 [cs, eess] based hole image interpretation techniques for integrated topol-
479. Huang Z, Bai S, Zico KJ (2021) (Implicit)2 : implicit ogy and shape optimization. Comput Methods Appl Mech Eng
layers for implicit representations. In: Advances in neu- 194(36):3817–3837. https://doi.org/10.1016/j.cma.2004.09.005
ral information processing systems, vol 34. Curran Asso- 497. Chen G, Fidkowski K (2020) Output-based error estimation and
ciates, Inc., pp 9639–9650. https://papers.nips.cc/paper/2021/ mesh adaptation using convolutional neural networks: applica-
hash/4ffbd5c8221d7c147f8363ccdc9a2a37-Abstract.html tion to a scalar advection-diffusion problem. In: AIAA Scitech
480. Deng H, To AC (2021) A parametric level set method for 2020 forum. American Institute of Aeronautics and Astronautics,
topology optimization based on deep neural network (DNN). Orlando. https://doi.org/10.2514/6.2020-1143
arXiv:2101.03286 498. Ramuhalli P, Udpa L, Udpa SS (2005) Finite-element neural net-
481. Zeyu Z, Li Yu, Weien Z, Xiaoqian C, Wen Y, Yong Z (2021) works for solving differential equations. IEEE Trans Neural Netw
TONR: an exploration for a novel way combining neural network 16(6):1381–1392. https://doi.org/10.1109/TNN.2005.857945
with topology optimization. Comput Methods Appl Mech Eng 499. Sikora R, Sikora J, Cardelli E, Chady T (1999) Artificial neural
386:114083. https://doi.org/10.1016/j.cma.2021.114083 network application for material evaluation by electromagnetic
482. Biswas R, Sen MK, Das V, Mukerji T (2019) Prestack and methods. In: International joint conference on neural networks.
poststack inversion using a physics-guided convolutional neu- Proceedings (Cat. No.99CH36339), IJCNN’99, vol 6, pp 4027–
ral network. Interpretation 7(3):161–174. https://doi.org/10.1190/ 4032. https://doi.org/10.1109/IJCNN.1999.830804
INT-2018-0236.1 500. Xu G, Littlefair G, Penson R, Callan R (1999) Application of
483. Alfarraj M, AlRegib G (2019) Semi-supervised learning for FE-based neural networks to dynamic problems. In: ICONIP’99.
acoustic impedance inversion. In: SEG technical program ANZIIS’99 & ANNES’99 & ACNN’99. 6th International con-
expanded abstracts 2019. Society of Exploration Geophysicists, ference on neural information processing. Proceedings (Cat.
San Antonio, pp 2298–2302. https://doi.org/10.1190/segam2019- No.99EX378), vol 3, pp 1039–1044. https://doi.org/10.1109/
3215902.1 ICONIP.1999.844679
484. Dong C, Loy CC, He K, Tang X (2014) Learning a deep convo- 501. Guo F, Zhang P, Wang F, Ma X, Qiu G (1999) Finite element anal-
lutional network for image super-resolution. In: Fleet D, Pajdla ysis based Hopfield neural network model for solving nonlinear
T, Schiele B, Tuytelaars T (eds) Computer vision—ECCV 2014. electromagnetic field problems. In: International joint confer-
Lecture notes in computer science. Springer, Cham, pp 184–199. ence on neural networks. Proceedings (Cat. No.99CH36339),
https://doi.org/10.1007/978-3-319-10593-2_13 IJCNN’99, vol 6, pp 4399–4403. https://doi.org/10.1109/IJCNN.
485. Dong C, Loy CC, He K, Tang X (2015) Image super-resolution 1999.830877
using deep convolutional networks. arXiv:1501.00092 [cs]

123
Computational Mechanics (2024) 74:281–331 327

502. Hyuk L, Seok KI (1990) Neural algorithm for solving differen- Comput Methods Appl Mech Eng 363:112892. https://doi.org/
tial equations. J Comput Phys 91(1):110–131. https://doi.org/10. 10.1016/j.cma.2020.112892
1016/0021-9991(90)90007-N 518. Mishra RK, Hall PS (2005) NFDTD concept. IEEE Trans Neural
503. Kalkkuhl J, Hunt KJ, Fritz H (1999) FEM-based neural-network Netw 16(2):484–490. https://doi.org/10.1109/TNN.2004.841799
approach to nonlinear modeling with application to longitudinal 519. Richardson A (2018) Seismic full-waveform inversion using deep
vehicle dynamics control. IEEE Trans Neural Netw 10(4):885– learning tools and techniques. arXiv:1801.07232
897. https://doi.org/10.1109/72.774241 520. Sun J, Niu Z, Innanen KA, Li J, Trad DO (2020) A theory-guided
504. Chao X, Wang C, Ji F, Yuan X (2012) Finite-element neural deep-learning formulation and optimization of seismic waveform
network-based solving 3-D differential equations in MFL. IEEE inversion. Geophysics 85(2):R87–R99. https://doi.org/10.1190/
Trans Magn 48(12):4747–4756. https://doi.org/10.1109/TMAG. geo2019-0138.1
2012.2207732 521. Hughes TW, Williamson IAD, Minkov M, Fan S (2019)
505. Yang Z, Ruess M, Kollmannsberger S, Düster A, Rank E (2012) Wave physics as an analog recurrent neural network. Sci Adv
An efficient integration technique for the voxel-based finite cell 5(12):6946. https://doi.org/10.1126/sciadv.aay6946
method: efficient integration technique for finite cells. Int J Numer 522. Liu Zeliang WCT, Koishi M (2019) A deep material network for
Methods Eng 91(5):457–471. https://doi.org/10.1002/nme.4269 multiscale topology learning and accelerated nonlinear modeling
506. Zhang L, Cheng L, Li H, Gao J, Cheng Yu, Domel R, Yang Y, Tang of heterogeneous materials. Comput Methods Appl Mech Eng
S, Liu WK (2021) Hierarchical deep-learning neural networks: 345:1138–1168. https://doi.org/10.1016/j.cma.2018.09.020
finite elements and beyond. Comput Mech 67(1):207–230. https:// 523. Liu Zeliang WCT (2019) Exploring the 3D architectures of deep
doi.org/10.1007/s00466-020-01928-9 material network in data-driven multiscale mechanics. J Mech
507. Sourav S, Zhengtao G, Lin C, Jiaying G, Kafka Orion L, Xiaoyu Phys Solids 127:20–46. https://doi.org/10.1016/j.jmps.2019.03.
X, Hengyang L, Mahsa T, Alicia Kim H, Kam LW (2021) Hier- 004
archical deep learning neural network (HiDeNN): an artificial 524. Haber E, Ruthotto L (2018) Stable architectures for deep neu-
intelligence (AI) framework for computational science and engi- ral networks. Inverse Problems, 34(1):014004. https://doi.org/10.
neering. Comput Methods Appl Mech Eng 373:113452. https:// 1088/1361-6420/aa9a90. arXiv:1705.03341 [cs, math]
doi.org/10.1016/j.cma.2020.113452 525. Ruthotto L, Haber E (2018) Deep neural networks motivated by
508. Zhang Lei L, Ye TS, Kam LW (2022) HiDeNN-TD: reduced-order partial differential equations. arXiv:1804.04272 [cs, math, stat]
hierarchical deep learning neural networks. Comput Methods 526. Lu Y, Zhong A, Li Q, Dong B (2020) Beyond finite layer neural
Appl Mech Eng 389:114414. https://doi.org/10.1016/j.cma.2021. networks: bridging deep architectures and numerical differential
114414 equations. arXiv:1710.10121 [cs, stat]
509. Liu Y, Park C, Ye L, Mojumder S, Liu WK, Qian D (2023) 527. Pontriagin LS, Neustadt LW, Pontriagin LS (1986) The math-
HiDeNN-FEM: a seamless machine learning approach to nonlin- ematical theory of optimal processes. In: Classics of Soviet
ear finite element analysis. Comput Mech 72(1):173–194. https:// mathematics. Gordon and Breach Science Publishers, New York
doi.org/10.1007/s00466-023-02293-z 528. Yu Y, Yao H, Liu Y (2018) Physics-based learning for aircraft
510. Ye L, Li H, Zhang L, Park C, Mojumder S, Knapik S, Sang Z, Tang dynamics simulation. In: Annual conference of the PHM society.
S, Apley DW, Wagner GJ, Liu WK (2023) Convolution hierarchi- https://doi.org/10.36001/phmconf.2018.v10i1.513
cal deep-learning neural networks (C-HiDeNN): finite elements, 529. Rishikesh R, Chris H, Jay P (2021) DiscretizationNet: a
isogeometric analysis, tensor decomposition, and beyond. Com- machine-learning based solver for Navier–Stokes equations using
put Mech 72(2):333–362. https://doi.org/10.1007/s00466-023- finite volume discretization. Comput Methods Appl Mech Eng
02336-5 378:113722. https://doi.org/10.1016/j.cma.2021.113722
511. Park C, Ye L, Saha S, Xue T, Guo J, Mojumder S, Apley DW, Wag- 530. Foster D (2023) Generative deep learning: teaching machines to
ner GJ, Liu WK (2023) Convolution hierarchical deep-learning paint, write, compose, and play, 2nd edn. O’Reilly Media Incor-
neural network (C-HiDeNN) with graphics processing unit (GPU) porated, Sebastopol
acceleration. Comput Mech 72(2):383–409. https://doi.org/10. 531. Mosser L, Dubrule O, Blunt MJ (2017) Reconstruction of three-
1007/s00466-023-02329-4 dimensional porous media using generative adversarial neural
512. Li H, Knapik S, Li Y, Park C, Guo J, Mojumder S, Ye L, networks. Phys Rev E 96(4):043309. https://doi.org/10.1103/
Chen W, Apley DW, Liu WK (2023) Convolution hierarchical PhysRevE.96.043309
deep-learning neural network tensor decomposition (C-HiDeNN- 532. Feng J, He X, Teng Q, Ren C, Chen H, Li Y (2019) Recon-
TD) for high-resolution topology optimization. Comput Mech struction of porous media from extremely limited information
72(2):363–382. https://doi.org/10.1007/s00466-023-02333-8 using conditional generative adversarial networks. Phys Rev E
513. Grosse IR, Katragadda P, Benoit J (1992) An adaptive accuracy- 100(3):033308. https://doi.org/10.1103/PhysRevE.100.033308
based a posteriori error estimator. Finite Elem Anal Des 12(1):75– 533. Reza S, Mohsen M, Bozorgmehry BR, Blunt Martin J (2020)
90. https://doi.org/10.1016/0168-874X(92)90008-Z Coupled generative adversarial and auto-encoder neural networks
514. Zhu JZ, Zienkiewicz OC (1997) A posteriori error estima- to reconstruct three-dimensional multi-scale porous media. J
tion and three-dimensional automatic mesh generation. Finite Petrol Sci Eng 186:106794. https://doi.org/10.1016/j.petrol.2019.
Elem Anal Des 25(1):167–184. https://doi.org/10.1016/S0168- 106794
874X(96)00037-6 534. Xia P, Bai H, Zhang T (2022) Multi-scale reconstruction of porous
515. Möller M, Kuzmin D (2006) Adaptive mesh refinement for high- media based on progressively growing generative adversarial net-
resolution finite element schemes. Int J Numer Meth Fluids works. Stoch Env Res Risk Assess 36(11):3685–3705. https://doi.
52(5):545–569. https://doi.org/10.1002/fld.1183 org/10.1007/s00477-022-02216-z
516. Yao H, Ren Y, Liu Y (2019) FEA-Net: a deep convolutional neural 535. Alexander H, Henning W (2022) Three-dimensional microstruc-
network with physicsprior for efficient data driven PDE learning. ture generation using generative adversarial neural networks in
In: AIAA Scitech 2019 forum. American Institute of Aeronau- the context of continuum micromechanics. Comput Methods
tics and Astronautics, San Diego. https://doi.org/10.2514/6.2019- Appl Mech Eng 400:115497. https://doi.org/10.1016/j.cma.2022.
0680 115497
517. Houpu Y, Yi G, Yongming L (2020) FEA-Net: a physics-guided
data-driven model for efficient mechanical response prediction.

123
328 Computational Mechanics (2024) 74:281–331

536. Rawat S, Herman Shen MH (2019) A novel topology design 553. Tinghao G, Lohan Danny J, Ruijin C, Yi RM, Allison James
approach using an integrated deep learning network architecture. T (2018) An indirect design representation for topology opti-
arXiv:1808.02334 mization using variational autoencoder and style transfer. In:
537. Kentaro Y, Shintaro Y, Kikuo F (2022) Data-driven multifi- AIAA/ASCE/AHS/ASC structures, structural dynamics, and
delity topology design using a deep generative model: application materials conference. AIAA SciTech Forum American Institute
to forced convection heat transfer problems. Comput Methods of Aeronautics and Astronautics. https://doi.org/10.2514/6.2018-
Appl Mech Eng 388:114284. https://doi.org/10.1016/j.cma.2021. 0804
114284 554. Vulimiri Praveen S, Hao D, Florian D, Xiaoli Z, To Albert C
538. Lee KH, Yun GJ (2023) Microstructure reconstruction using (2021) Integrating geometric data into topology optimization via
diffusion-based generative models. arXiv:2211.10949 [cond-mat, neural style transfer. Materials 14(16):4551. https://doi.org/10.
physics:physics] 3390/ma14164551
539. Christian D, Paul S, Dennis R, Stephanie H, Markus K, Maik 555. Gatys L, Ecker A, Bethge M (2016) A neural algorithm of artistic
G (2023) Conditional diffusion-based microstructure reconstruc- style. J Vis 16(12):326. https://doi.org/10.1167/16.12.326
tion. Mater Today Commun 35:105608. https://doi.org/10.1016/ 556. Dommaraju N, Bujny M, Menzel S, Olhofer M, Duddeck F (2023)
j.mtcomm.2023.105608 Evaluation of geometric similarity metrics for structural clusters
540. Vlassis Nikolaos N, WaiChing S (2023) Denoising diffusion generated using topology optimization. Appl Intell 53(1):904–
algorithm for inverse design of microstructures with fine-tuned 929. https://doi.org/10.1007/s10489-022-03301-0
nonlinear material properties. Comput Methods Appl Mech Eng 557. Achlioptas P, Diamanti O, Mitliagkas I, Guibas L (2018) Learn-
413:116126. https://doi.org/10.1016/j.cma.2023.116126 ing representations and generative models for 3D point clouds.
541. Junxi F, Qizhi T, Bing L, Xiaohai H, Honggang C, Yang L (2020) In: Proceedings of the 35th international conference on machine
An end-to-end three-dimensional reconstruction framework of learning. PMLR, pp 40–49. https://proceedings.mlr.press/v80/
porous media from a single two-dimensional image based on deep achlioptas18a.html
learning. Comput Methods Appl Mech Eng 368:113043. https:// 558. Yang Y, Feng C, Shen Y, Tian D (2018) FoldingNet: point
doi.org/10.1016/j.cma.2020.113043 cloud auto-encoder via deep grid deformation, pp 206–215.
542. Steve K, Cooper Samuel J (2021) Generating three-dimensional https://openaccess.thecvf.com/content_cvpr_2018/html/Yang_
structures from a two-dimensional slice with generative adver- FoldingNet_Point_Cloud_CVPR_2018_paper.html
sarial network-based dimensionality expansion. Nat Mach Intell 559. Shahroz K, Kosa G-L, Konstantinos K, Panagiotis K (2023) Shi-
3(4):299–305. https://doi.org/10.1038/s42256-021-00322-1 pHullGAN: a generic parametric modeller for ship hull design
543. Li Y, Jian P, Han G (2022) Cascaded progressive generative adver- using deep convolutional generative model. Comput Methods
sarial networks for reconstructing three-dimensional grayscale Appl Mech Eng 411:116051. https://doi.org/10.1016/j.cma.2023.
core images from a single two-dimensional image. Front Phys. 116051
https://doi.org/10.3389/fphy.2022.716708 560. Qiuyi C, Jun W, Phillip P, Chen W, Fuge M (2022) Inverse design
544. Fan Z, Xiaohai H, Teng Qizhi W, Xiaohong DX (2022) 3D- of two-dimensional airfoils using conditional generative models
PMRNN: Reconstructing three-dimensional porous media from and surrogate log-likelihoods. J Mech Des 144(2):021712. https://
the two-dimensional image with recurrent neural network. J doi.org/10.1115/1.4052846
Petrol Sci Eng 208:109652. https://doi.org/10.1016/j.petrol.2021. 561. Chen W, Fuge M (2021) BézierGAN: automatic generation of
109652 smooth curves from interpretable low-dimensional parameters.
545. Zheng Q, Zhang D (2022) RockGPT: reconstructing three- arXiv:1808.08871 [cs, stat]
dimensional digital rocks from single two-dimensional slice with 562. Wei C, Faez A (2021) MO-PaDGAN: reparameterizing engineer-
deep learning. Comput Geosci 26(3):677–696. https://doi.org/10. ing designs for augmented multi-objective optimization. Appl
1007/s10596-022-10144-8 Soft Comput 113:107909. https://doi.org/10.1016/j.asoc.2021.
546. Johan P, Leonardo R, Gabriel K, Frank L (2022) Size-invariant 107909
3D generation from a single 2D rock image. J Petrol Sci Eng 563. Richardson A (2018) Generative adversarial networks for
215:110648. https://doi.org/10.1016/j.petrol.2022.110648 model order reduction in seismic full-waveform inversion.
547. Fan Z, Qizhi T, Honggang C, Xiaohai H, Xiucheng D (2021) Slice- arXiv:1806.00828 [physics]
to-voxel stochastic reconstructions on porous media with hybrid 564. Zhang Y, Seibert P, Otto A, Raßloff A, Ambati M, Kästner M
deep generative model. Comput Mater Sci 186:110018. https:// (2023) DA-VEGAN: differentiably augmenting VAE-GAN for
doi.org/10.1016/j.commatsci.2020.110018 microstructure reconstruction from extremely small data sets.
548. Rawat S, Shen MHH (2019) Application of adversarial networks arXiv:2303.03403 [cs]
for 3D structural topology optimization, pp 2019-01-0829. https:// 565. Wei C, Faez A (2021) PaDGAN: learning to generate high-quality
doi.org/10.4271/2019-01-0829 novel designs. J Mech Des 143(3):031703. https://doi.org/10.
549. Rawat S, Herman SMH (2019) A novel topology optimization 1115/1.4048626
approach using conditional deep learning. arXiv:1901.04859 566. Kulesza A, Taskar B (2012) Determinantal point processes for
550. Herman Shen MH, Chen L (2019) A new CGAN technique for machine learning. Found Trends Mach Learn 5(2–3):123–286.
constrained topology design optimization. arXiv:1901.07675 https://doi.org/10.1561/2200000044. arXiv:1207.6083 [cs, stat]
551. Henning W, Christoph B, Fadi A, Markus H, Michael H, Ludger 567. Bates SJ, Sienz J, Langley DS (2003) Formulation of the
L, Peter W (2022) Computational homogenization using convo- Audze–Eglais uniform latin hypercube design of experiments.
lutional neural networks. In: Fadi A, Blaž H, Meisam S, Henning Adv Eng Softw 34(8):493–506. https://doi.org/10.1016/S0965-
W, Christian W, Michele M (eds) Current trends and open prob- 9978(03)00042-5
lems in computational mechanics. Springer, Cham, pp 569–579. 568. Heyrani Nobari A, Rashad MF, Ahmed F (2021) CreativeGAN:
https://doi.org/10.1007/978-3-030-87312-7_55 editing generative adversarial networks for creative design syn-
552. Mosser L, Dubrule O, Blunt MJ (2020) Stochastic seismic thesis. In: 47th Design automation conference (DAC), page
waveform inversion using generative adversarial networks as a V03AT03A002, virtual, vol 3. American Society of Mechanical
geological prior. Math Geosci 52(1):53–79. https://doi.org/10. Engineers. https://doi.org/10.1115/DETC2021-68103
1007/s11004-019-09832-6 569. Bau D, Liu S, Wang T, Zhu JY, Torralba A (2020) Rewriting a
deep generative model. arXiv:2007.15646 [cs]

123
Computational Mechanics (2024) 74:281–331 329

570. Elgammal A, Liu B, Elhoseiny M, Mazzone M (2017) CAN: 587. Xie Y, Franz E, Chu M, Thuerey N (2018) tempoGAN: a tempo-
creative adversarial networks, generating “art” by learning about rally coherent, volumetric GAN for super-resolution fluid flow.
styles and deviating from style norms. arXiv:1706.07068 [cs] ACM Trans Graph 37(4):95:1-95:15. https://doi.org/10.1145/
571. Oh S, Jung Y, Kim S, Lee I, Kang N (2019) Deep genera- 3197517.3201304
tive design: integration of topology optimization and generative 588. Pang G, Shen C, Cao L, Van Den Hengel A (2022) Deep learning
models. J Mech Des 141(11):111405. https://doi.org/10.1115/1. for anomaly detection: a review. ACM Comput Surv 54(2):1–38.
4044229. arXiv:1903.01548 https://doi.org/10.1145/3439950
572. Greminger M (2020) Generative adversarial networks with syn- 589. Hawkins S, He H, Williams G, Baxter R (2002) Outlier detection
thetic training data for enforcing manufacturing constraints on using replicator neural networks. In: Kambayashi Y, Winiwarter
topology optimization. In: 46th Design automation conference W, Arikawa M (eds) Data warehousing and knowledge discovery.
(DAC), vol 11A, p V11AT11A005. American Society of Mechan- Lecture notes in computer science. Springer, Berlin, pp 170–180.
ical Engineers. https://doi.org/10.1115/DETC2020-22399 https://doi.org/10.1007/3-540-46145-0_17
573. Yoo S, Lee S, Kim S, Hwang KH, Park JH, Kang N (2021) Inte- 590. Thomas S, Philipp S, Waldstein Sebastian M, Ursula S-E, Georg L
grating deep learning into CAD/CAE system: generative design (2017) Unsupervised anomaly detection with generative adversar-
and evaluation of 3D conceptual wheel. Struct Multidiscip Optim ial networks to guide marker discovery. In: Niethammer M, Styner
64(4):2725–2747. https://doi.org/10.1007/s00158-021-02953-9 M, Aylward S, Zhu H, Oguz I, Yap P-T, Shen D (eds) Information
574. Weisheng Z, Wang Yue D, Zongliang LC, Sung-Kie Y, Guo processing in medical imaging. Lecture notes in computer sci-
X (2023) Machine-learning assisted topology optimization ence. Springer, Cham, pp 146–157. https://doi.org/10.1007/978-
for architectural design with artistic flavor. Comput Methods 3-319-59050-9_12
Appl Mech Eng 413:116041. https://doi.org/10.1016/j.cma.2023. 591. Zenati H, Foo CS, Lecouat B, Manek G, Chandrasekhar
116041 VR (2019) Efficient GAN-based anomaly detection.
575. Bendsøe MP, Sigmund O (2003) Topology optimization: theory, arXiv:1802.06222 [cs, stat]
methods, and applications. Springer, New York 592. Thomas S, Philipp S, Waldstein Sebastian M, Georg L, Ursula
576. Yang F, Ma J (2023) FWIGAN: full-waveform inversion via S-E (2019) f-AnoGAN: fast unsupervised anomaly detection
a physics-informed generative adversarial network. J Geophys with generative adversarial networks. Med Image Anal 54:30–
Res Solid Earth 128(4):e2022JB025493. https://doi.org/10.1029/ 44. https://doi.org/10.1016/j.media.2019.01.010
2022JB025493 593. Henkes A, Herrmann L, Wessels H, Kollmannsberger S (2023)
577. Radhakrishnan S, Bharadwaj V, Manjunath V, Srinath R (2018) Gan enables outlier detection and property monitoring for additive
Creative intelligence—automating car design studio with gener- manufacturing of complex structures. Preprint https://www.ssrn.
ative adversarial networks (GAN). In: Holzinger A, Kieseberg com/abstract=4627723
P, Tjoa AM, Weippl E (eds) Machine learning and knowledge 594. Duddeck F (2008) Multidisciplinary optimization of car bod-
extraction. Lecture notes in computer science. Springer, Cham, ies. Struct Multidiscip Optim 35(4):375–389. https://doi.org/10.
pp 160–175. https://doi.org/10.1007/978-3-319-99740-7_11 1007/s00158-007-0130-6
578. Wei C, Mark F (2019) Synthesizing designs with interpart depen- 595. David S, Julian S, Karen S, Ioannis A, Aja H, Arthur G, Thomas
dencies using hierarchical generative adversarial networks. J H, Lucas B, Matthew L, Adrian B, Yutian C, Timothy L, Fan H,
Mech Des 141(11):111403. https://doi.org/10.1115/1.4044076 Laurent S, van den Driessche G, Graepel T, Hassabis D (2017)
579. Nie Z, Lin T, Jiang H, Kara LB (2020) TopologyGAN: topol- Mastering the game of Go without human knowledge. Nature
ogy optimization using generative adversarial networks based on 550(7676):354–359. https://doi.org/10.1038/nature24270
physical fields over the initial domain. https://doi.org/10.48550/ 596. Oriol V, Igor B, Czarnecki Wojciech M, Michaël M, Andrew D,
arXiv.2003.04685. arXiv:2003.04685v2 Junyoung C, Choi David H, Richard P, Timo E, Petko G, Junhyuk
580. Nathan H, Buskohl Philip R, Andrew G, Kumar V, Sam A (2021) O, Dan H, Manuel K, Ivo D, Aja H, Laurent S, Trevor C, Agapiou
Generative adversarial network for early-stage design flexibility John P, Max J, Vezhnevets Alexander S, Rémi L, Tobias P, Valentin
in topology optimization for additive manufacturing. J Manuf Syst D, David B, Yury S, James M, Paine Tom L, Caglar G, Ziyu
59:675–685. https://doi.org/10.1016/j.jmsy.2021.04.007 W, Tobias P, Yuhuai W, Roman R, Dani Y, Dario W, Katrina
581. Heyrani Nobari A, Chen W, Ahmed F (2021) RANGE-GAN: MK, Oliver S, Tom S, Timothy L, Koray K, Demis H, Chris A,
design synthesis under constraints using conditional generative David S (2019) Grandmaster level in StarCraft II using multi-
adversarial networks. J Mech Des 10(1115/1):4052442 agent reinforcement learning. Nature 575(7782):350–354. https://
582. Jun W, Wei C, Da D, Fuge M, Rai R (2022) IH-GAN: a conditional doi.org/10.1038/s41586-019-1724-z
generative model for implicit surface-based inverse design of cel- 597. Kober J, Andrew Bagnell J, Peters J (2013) Reinforcement learn-
lular structures. Comput Methods Appl Mech Eng 396:115060. ing in robotics: a survey. Int J Robot Res 32(11):1238–1274.
https://doi.org/10.1016/j.cma.2022.115060 https://doi.org/10.1177/0278364913495721
583. Duque L, Gutiérrez G, Arias C, Rüger A, Jaramillo H (2019) 598. Kim H, Jordan M, Sastry S, Ng A (2003) Autonomous
Automated velocity estimation by deep learning based seismic- helicopter flight via reinforcement learning. In: Advances
to-velocity mapping. Eur Assoc Geosci Eng. https://doi.org/10. in neural information processing systems, vol 16. MIT
3997/2214-4609.201901523 Press. https://papers.nips.cc/paper_files/paper/2003/hash/
584. Yu-Qing W, Wang Qi L, Wen-Kai GQ, Xin-Fei Y (2022) Seismic b427426b8acd2c2e53827970f2c2f526-Abstract.html
impedance inversion based on cycle-consistent generative adver- 599. Abbeel P, Coates A, Quigley M, Ng A (2006). An applica-
sarial network. Pet Sci 19(1):147–161. https://doi.org/10.1016/j. tion of reinforcement learning to aerobatic helicopter flight.
petsci.2021.09.038 In: Advances in neural information processing systems, vol
585. Zhu JY, Park T, Isola P, Efros AA (2020) Unpaired image-to- 19. MIT Press. https://proceedings.neurips.cc/paper/2006/hash/
image translation using cycle-consistent adversarial networks. 98c39996bf1543e974747a2549b3107c-Abstract.html
arXiv:1703.10593 [cs] 600. Abbeel P, Coates A, Andrew YN (2010) Autonomous heli-
586. Baotong L, Congjia H, Xin L, Shuai Z, Jun H (2019) Non-iterative copter aerobatics through apprenticeship learning. Int J Robot Res
structural topology optimization using deep learning. Comput 29(13):1608–1639. https://doi.org/10.1177/0278364910371999
Aided Des 115:172–180. https://doi.org/10.1016/j.cad.2019.05. 601. Novati G, Verma S, Alexeev D, Rossinelli D, van Rees WM,
038 Koumoutsakos P (2017) Synchronised swimming of two fish.

123
330 Computational Mechanics (2024) 74:281–331

Bioinspir Biomimet 12(3):036001. https://doi.org/10.1088/1748- 618. Rabault J, Kuhnle A (2019) Accelerating deep reinforcement
3190/aa6311. arXiv:1610.04248 [physics] learning strategies of flow control through a multi-environment
602. Verma S, Novati G, Koumoutsakos P (2018) Efficient collective approach. Phys Fluids 31(9):094105. https://doi.org/10.1063/1.
swimming by harnessing vortices through deep reinforcement 5116415. arXiv:1906.10382 [physics]
learning. Proc Natl Acad Sci 115(23):5849–5854. https://doi.org/ 619. Novati G, de Laroussilhe HL, Koumoutsakos P (2020) Automat-
10.1073/pnas.1800923115 ing turbulence modeling by multi-agent reinforcement learning.
603. Ma P, Tian Y, Pan Z, Ren B, Manocha D (2018) Fluid directed arXiv:2005.09023 [physics]
rigid body control using deep reinforcement learning. ACM 620. Liu X-Y, Wang J-X (2021) Physics-informed Dyna-style model-
Trans Graph 37(4):96:1-96:11. https://doi.org/10.1145/3197517. based deep reinforcement learning for dynamic control. Proc Roy
3201334 Soc A Math Phys Eng Sci 477(2255):20210618. https://doi.org/
604. Jean R, Miroslav K, Atle J, Ulysse R, Nicolas C (2019) Artifi- 10.1098/rspa.2021.0618
cial neural networks trained through deep reinforcement learning 621. Haotian S, Zhou Yang W, Keshu CS, Bin R, Qinghui N (2023)
discover control strategies for active flow control. J Fluid Mech Physics-informed deep reinforcement learning-based integrated
865:281–302. https://doi.org/10.1017/jfm.2019.62 two-dimensional car-following control strategy for connected
605. Fan D, Yang L, Wang Z, Triantafyllou MS, Karniadakis GE (2020) automated vehicles. Knowl-Based Syst 269:110485. https://doi.
Reinforcement learning for bluff body active flow control in org/10.1016/j.knosys.2023.110485
experiments and simulations. Proc Natl Acad Sci 117(42):26091– 622. Ramesh A, Ravindran B (2023) Physics-informed model-based
26098. https://doi.org/10.1073/pnas.2004939117 reinforcement learning. arXiv:2212.02179 [cs]
606. Jie X, Tao D, Foshey M, Li B, Zhu B, Schulz A, Matusik W (2019) 623. Colin R, Phanindra T (2023) Physics-informed reinforcement
Learning to fly: computational controller design for hybrid UAVs learning for motion control of a fish-like swimming robot. Sci
with reinforcement learning. ACM Trans Graph 38(4):42:1-42:12. Rep 13(1):10754. https://doi.org/10.1038/s41598-023-36399-4
https://doi.org/10.1145/3306346.3322940 624. Nielsen MA (2015) Neural networks and deep learning. Determi-
607. Lee XY, Balu A, Stoecklein D, Ganapathysubramanian B, Sarkar nation Press. http://neuralnetworksanddeeplearning.com
S (2018) Flow shape design for microfluidic devices using deep 625. Duvenaud DK, Maclaurin D, Iparraguirre J, Bombarell R, Hirzel
reinforcement learning. arXiv:1811.12444 [cs, stat] T, Aspuru-Guzik A, Adams RP (2015) Convolutional networks
608. Kun W, WaiChing S (2019) Meta-modeling game for deriving on graphs for learning molecular fingerprints. In: Advances in
theory-consistent, microstructure-based traction-separation laws neural information processing systems, vol 28. Curran Asso-
via deep reinforcement learning. Comput Methods Appl Mech ciates, Inc. https://papers.nips.cc/paper_files/paper/2015/hash/
Eng 346:216–241. https://doi.org/10.1016/j.cma.2018.11.026 f9be311e65d81a9ad8150a60844bb94c-Abstract.html
609. Bendsøe MP (1989) Optimal shape design as a material distri- 626. Bird S, Klein E, Loper E (2009) Natural language processing with
bution problem. Struct Optim 1(4):193–202. https://doi.org/10. Python, 1st edn. Beijing, Cambridge
1007/BF01650949 627. Hobson L, Cole H, Max HH (2019) Natural language process-
610. Martin P (2004) Bendsøe and ole sigmund, topology optimization. ing in action: understanding, analyzing, and generating text with
Springer, Berlin. https://doi.org/10.1007/978-3-662-05086-6 Python. Manning Publications Co, Shelter Island
611. Hayashi K, Ohsaki M (2020) Reinforcement learning and graph 628. Jurafsky D, Martin JH, Norvig P, Russell SJ (2009) Speech and
embedding for binary truss topology optimization under stress and language processing: an introduction to natural language process-
displacement constraints. Front Built Environ. https://doi.org/10. ing, computational linguistics, and speech recognition. Prentice
3389/fbuil.2020.00059 Hall series in artificial intelligence, 2nd edn. Prentice Hall, Pear-
612. Shaojun Z, Makoto O, Kazuki H, Xiaonong G (2021) Machine- son Education International, Upper Saddle River
specified ground structures for topology optimization of binary 629. Olah C (2015) Understanding LSTM networks. http://colah.
trusses using graph embedding policy network. Adv Eng Softw github.io/posts/2015-08-Understanding-LSTMs/
159:103032. https://doi.org/10.1016/j.advengsoft.2021.103032 630. Le Cun Y, Françoise F-S (1987) Modèles connexionnistes
613. Hongbo S, Ling M (2020) Generative design by using exploration de l’apprentissage. Intellectica 2(1):114–143. https://doi.org/10.
approaches of reinforcement learning in density-based structural 3406/intel.1987.1804
topology optimization. Designs 4(2):10. https://doi.org/10.3390/ 631. Bourlard H, Kamp Y (1988) Auto-association by multilayer
designs4020010 perceptrons and singular value decomposition. Biol Cybern
614. Seowoo J, Soyoung Y, Namwoo K (2022) Generative design by 59(4):291–294. https://doi.org/10.1007/BF00332918
reinforcement learning: enhancing the diversity of topology opti- 632. Hinton GE, Zemel R (1993) Autoencoders, minimum descrip-
mization designs. Comput Aided Des 146:103225. https://doi.org/ tion length and helmholtz free energy. In: Advances in
10.1016/j.cad.2022.103225 neural information processing systems, vol 6. Morgan-
615. Jiequn H, Arnulf J, Weinan E (2018) Solving high-dimensional Kaufmann. https://proceedings.neurips.cc/paper/1993/hash/
partial differential equations using deep learning. Proc Natl 9e3cfc48eccf81a0d57663e129aef3cb-Abstract.html
Acad Sci 115(34):8505–8510. https://doi.org/10.1073/pnas. 633. Shuangshuang C, Wei G (2023) Auto-encoders in deep learning–
1718942115 a review with new perspectives. Mathematics 11(8):1777. https://
616. Weinan E, Han J, Jentzen A (2017) Deep learning-based numer- doi.org/10.3390/math11081777
ical methods for high-dimensional parabolic partial differential 634. Nash JF (1950) Equilibrium points in n-person games. Proc Natl
equations and backward stochastic differential equations. Com- Acad Sci 36(1):48–49. https://doi.org/10.1073/pnas.36.1.48
mun Math Stat 5(4):349–380. https://doi.org/10.1007/s40304- 635. Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A,
017-0117-6. arXiv:1706.04702 Chen X, Chen X (2016) Improved techniques for training GANs.
617. Yang J, Dzanic T, Petersen B, Kudo J, Mittal K, Tomov Vl Camier In: Advances in neural information processing systems, vol 29.
JS, Zhao T, Zha H, Kolev T, Anderson R, Faissol D (2023) Rein- Curran Associates, Inc. https://papers.nips.cc/paper_files/paper/
forcement learning for adaptive mesh refinement. In: Proceedings 2016/hash/8a3363abe792db2d8761d6403605aeb7-Abstract.
of The 26th international conference on artificial intelligence and html
statistics PMLR, pp 5997–6014. https://proceedings.mlr.press/ 636. Srivastava A, Valkov L, Russell C, Gutmann MU, Sutton C
v206/yang23e.html (2017) VEEGAN: reducing mode collapse in GANs using implicit
variational learning. In: Proceedings of the 31st international

123
Computational Mechanics (2024) 74:281–331 331

conference on neural information processing systems, NIPS’17. 653. Heess N, Wayne G, Silver D, Lillicrap T, Erez T, Tassa Y (2015)
Curran Associates Inc, Red Hook, pp 3310–3320 Learning continuous control policies by stochastic value gra-
637. Arjovsky M, Chintala S, Bottou L (2017) Wasserstein GAN. dients. In: Advances in neural information processing systems,
arXiv:1701.07875 [cs, stat] vol 28. Curran Associates, Inc., https://papers.nips.cc/paper_
638. Mirza M, Osindero S (2014) Conditional generative adversarial files/paper/2015/hash/148510031349642de5ca0c544f31b2ef-
nets. arXiv:1411.1784 [cs, stat] Abstract.html
639. Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P 654. Clavera I, Fu V, Abbeel P (2020) Model-augmented actor-critic:
(2016) InfoGAN: interpretable representation learning by infor- backpropagating through paths. arXiv:2005.08068 [cs, stat]
mation maximizing generative adversarial nets. In: Proceedings 655. Hafner D, Lillicrap T, Ba J, Norouzi M (2020) Dream to control:
of the 30th international conference on neural information pro- learning behaviors by latent imagination. arXiv:1912.01603 [cs]
cessing systems, NIPS’16. Curran Associates Inc, Red Hook, pp 656. Hafner D, Lillicrap T, Norouzi M, Ba J (2022) Mastering atari
2180–2188 with discrete world models. arXiv:2010.02193 [cs, stat]
640. Bridle JS, Heading Anthony JR, MacKay David JC (1991) Unsu- 657. Williams RJ (1992) Simple statistical gradient-following algo-
pervised classifiers, mutual information and ’phantom targets. In: rithms for connectionist reinforcement learning. Mach Learn
Proceedings of the 4th international conference on neural informa- 8(3):229–256. https://doi.org/10.1007/BF00992696
tion processing systems, NIPS’91. Morgan Kaufmann Publishers 658. Sutton RS, McAllester D, Singh S, Mansour Y (1999) Policy gra-
Inc., San Francisco, pp 1096–1101 dient methods for reinforcement learning with function approxi-
641. Larsen ABL, Sønderby SK, Larochelle H, Winther O (2016) mation. In: Advances in neural information processing systems,
Autoencoding beyond pixels using a learned similarity metric. In: vol 12. MIT Press. https://papers.nips.cc/paper_files/paper/1999/
Proceedings of the 33rd international conference on international hash/464d828b85b0bed98e80ade0a5c43b0f-Abstract.html
conference on machine learning, vol 48, ICML’16, pp 1558–1566 659. Kakade S (2001) A natural policy gradient. In: Advances
642. Sohl-Dickstein J, Weiss E, Maheswaranathan N, Ganguli S (2015) in neural information processing systems, vol 14. MIT
Deep unsupervised learning using nonequilibrium thermodynam- Press. https://papers.nips.cc/paper_files/paper/2001/hash/
ics. In: Proceedings of the 32nd international conference on 4b86abe48d358ecf194c56c69108433e-Abstract.html
machine learning. PMLR, pp 2256–2265. https://proceedings.mlr. 660. Silver D, Lever G, Heess N, Degris T, Wierstra T, Riedmiller M
press/v37/sohl-dickstein15.html (2014) Deterministic policy gradient algorithms. In: Proceedings
643. Ho J, Jain A, Abbeel P (2020) Denoising diffusion prob- of the 31st international conference on machine learning. PMLR,
abilistic models. In: Advances in neural information pp 387–395. https://proceedings.mlr.press/v32/silver14.html
processing systems, vol 33. Curran Associates, Inc., pp 661. Schulman J, Levine S, Abbeel P, Jordan M, Moritz P (2015) Trust
6840–6851. https://proceedings.neurips.cc/paper/2020/hash/ region policy optimization. In: Proceedings of the 32nd interna-
4c5bcfec8584af0d967f1ab10179ca4b-Abstract.html tional conference on machine learning. PMLR, pp 1889–1897.
644. Nichol A, Dhariwal P (2021) Improved denoising diffusion prob- https://proceedings.mlr.press/v37/schulman15.html
abilistic models. arXiv:2102.09672 [cs, stat] 662. Watkins CJ, Dayan P (1992) Q-learning. Mach Learn 8:279–292.
645. Rezende D, Mohamed S (2015) Variational inference with nor- https://doi.org/10.1007/BF00992698
malizing flows. In: Proceedings of the 32nd international con- 663. van Hasselt H, Guez A, Silver D (February) Deep reinforcement
ference on machine learning. PMLR, pp 1530–1538. https:// learning with double Q-learning. In: Proceedings of the thirti-
proceedings.mlr.press/v37/rezende15.html eth AAAI conference on artificial intelligence, AAAI’16. AAAI
646. Ivan K, Prince Simon JD, Brubaker Marcus A (2021) Normalizing Press, Phoenix, pp 2094–2100
flows: an introduction and review of current methods. IEEE Trans 664. Wang Z, Schaul T, Hessel M, Van Hasselt H, Lanctot M, De Freitas
Pattern Anal Mach Intell 43(11):3964–3979. https://doi.org/10. N (2016) Dueling network architectures for deep reinforcement
1109/TPAMI.2020.2992934 learning. In: Proceedings of the 33rd international conference on
647. Sutton RS (1991) Dyna, an integrated architecture for learning, international conference on machine learning, vol 48, ICML’16,
planning, and reacting. ACM SIGART Bull 2(4):160–163. https:// New York, pp 1995–2003
doi.org/10.1145/122344.122377 665. Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017)
648. Janner M, Fu J, Zhang M, Levine S (2019) When to trust your Proximal policy optimization algorithms. arXiv:1707.06347 [cs]
model: model-based policy optimization. In: Proceedings of the 666. Richard B (1957) A Markovian decision process. J Math Mech
33rd international conference on neural information processing 6(5):679–684
systems, vol 1122. Curran Associates Inc., Red Hook, pp 12519– 667. Capuzzo Dolcetta I, Ishii H (1984) Approximate solutions of
12530 the bellman equation of deterministic control theory. Appl Math
649. Lukasz K, Mohammad B, Piotr M, Blazej O, Campbell RH, Kon- Optim 11(1):161–181. https://doi.org/10.1007/BF01442176
rad C, Dumitru E, Chelsea F, Piotr K, Sergey L, Afroz M, Ryan S, 668. Sutton RS (1988) Learning to predict by the methods of tempo-
George T, Henryk M (2020) Model-based reinforcement learning ral differences. Mach Learn 3(1):9–44. https://doi.org/10.1007/
for Atari. arXiv:1903.00374 [cs, stat] BF00115009
650. Luo Y, Xu H, Li Y, Tian Y, Darrell T, Ma T (2021) Algorithmic 669. Bradtke SJ, Barto AG (1996) Linear least-squares algorithms for
framework for model-based deep reinforcement learning with the- temporal difference learning. Mach Learn 22(1):33–57. https://
oretical guarantees. arXiv:1807.03858 [cs, stat] doi.org/10.1007/BF00114723
651. Deisenroth MP, Rasmussen CE (2011) PILCO: a model-based
and data-efficient approach to policy search. In: Proceedings of
the 28th international conference on international conference on
Publisher’s Note Springer Nature remains neutral with regard to juris-
machine learning, ICML’11. Omnipress, Madison, pp 465–472
dictional claims in published maps and institutional affiliations.
652. Levine S, Abbeel P (2014) Learning neural network poli-
cies with guided policy search under unknown dynamics. In:
Advances in neural information processing systems, vol 27.
Curran Associates, Inc. https://papers.nips.cc/paper_files/paper/
2014/hash/6766aa2750c19aad2fa1b32f36ed4aee-Abstract.html

123

Boundary Representation Modelling Techniques - Ian Stroud PDF
No ratings yet
Boundary Representation Modelling Techniques - Ian Stroud PDF
789 pages
3D FEA Simulations in Machining
No ratings yet
3D FEA Simulations in Machining
95 pages
Dragon Bundle: It's Time To & Make Projects That Matter
No ratings yet
Dragon Bundle: It's Time To & Make Projects That Matter
18 pages
Bott Curt Noce 18
No ratings yet
Bott Curt Noce 18
89 pages
2501.05165v1
No ratings yet
2501.05165v1
325 pages
Master Thesis
No ratings yet
Master Thesis
151 pages
Master Thesis Lipovsky
No ratings yet
Master Thesis Lipovsky
76 pages
Dimensioning, Cell Site Planning PDF
No ratings yet
Dimensioning, Cell Site Planning PDF
145 pages
On The Sizing of Analog Integrated Circuits Towards Lifetime Robustness
No ratings yet
On The Sizing of Analog Integrated Circuits Towards Lifetime Robustness
124 pages
Yinpeng Wang, Qiang Ren - Deep Learning-Based Forward Modeling and Inversion Techniques For Computational Physics Problems (2024, CRC Press) - Libgen - Li
No ratings yet
Yinpeng Wang, Qiang Ren - Deep Learning-Based Forward Modeling and Inversion Techniques For Computational Physics Problems (2024, CRC Press) - Libgen - Li
199 pages
Metalearning - A Tutorial: Christophe Giraud-Carrier December 2008
No ratings yet
Metalearning - A Tutorial: Christophe Giraud-Carrier December 2008
45 pages
RMT ML Book-1
No ratings yet
RMT ML Book-1
446 pages
MLbook Extract
No ratings yet
MLbook Extract
14 pages
Lu Yi
No ratings yet
Lu Yi
147 pages
09-03-31 Go Kawakita EESB Thesis 2008
No ratings yet
09-03-31 Go Kawakita EESB Thesis 2008
85 pages
Extreme Events - Dynamics, Statistics and Prediction
No ratings yet
Extreme Events - Dynamics, Statistics and Prediction
56 pages
A Study of Selected Security Issues
No ratings yet
A Study of Selected Security Issues
173 pages
Whalen Ewhalen SM Ccse 2021 Thesis
No ratings yet
Whalen Ewhalen SM Ccse 2021 Thesis
81 pages
Krynauw DD
No ratings yet
Krynauw DD
176 pages
How the MITRE ATT&CK Framework
No ratings yet
How the MITRE ATT&CK Framework
92 pages
2014 03 PHD Pini PDF
No ratings yet
2014 03 PHD Pini PDF
164 pages
Lecture Notes- Machine Learning for the Sciences
No ratings yet
Lecture Notes- Machine Learning for the Sciences
84 pages
Where can buy (Ebook) Data-Driven Engineering Design 2022nd Edition by Liu, Ang, Wang, Yuchen, Wang, Xingzhi ISBN 9783030881801, 3030881806 ebook with cheap price
100% (5)
Where can buy (Ebook) Data-Driven Engineering Design 2022nd Edition by Liu, Ang, Wang, Yuchen, Wang, Xingzhi ISBN 9783030881801, 3030881806 ebook with cheap price
81 pages
Thesis Final
0% (1)
Thesis Final
186 pages
b401 PDF
No ratings yet
b401 PDF
89 pages
2411.01843v1
No ratings yet
2411.01843v1
180 pages
Combining Machine Learning and Computational Chemistry For Predictive Insights Into Chemical Systems
No ratings yet
Combining Machine Learning and Computational Chemistry For Predictive Insights Into Chemical Systems
150 pages
MPLS Thesis
No ratings yet
MPLS Thesis
119 pages
Structure and Interpretation of Computer Programs Second Edition Hal Abelson And Gerald Jay Sussman - The full ebook set is available with all chapters for download
100% (1)
Structure and Interpretation of Computer Programs Second Edition Hal Abelson And Gerald Jay Sussman - The full ebook set is available with all chapters for download
46 pages
PHD Thesis 2004 - Efficient, Transparent, and Comprehensive Runtime Code Manipulation PHD
No ratings yet
PHD Thesis 2004 - Efficient, Transparent, and Comprehensive Runtime Code Manipulation PHD
306 pages
Optimization Methods For Large-Scale Machine Learnig2
No ratings yet
Optimization Methods For Large-Scale Machine Learnig2
95 pages
Mathematical applications in continuum and structural mechanics
No ratings yet
Mathematical applications in continuum and structural mechanics
275 pages
Numerical Optimal Control: July 2011
No ratings yet
Numerical Optimal Control: July 2011
123 pages
(Ebook) Test Signal Generation for Service Diagnosis Based on Local Structure Graphs by Michael Ungermann ISBN 9783832595982, 3832595988 instant download
100% (2)
(Ebook) Test Signal Generation for Service Diagnosis Based on Local Structure Graphs by Michael Ungermann ISBN 9783832595982, 3832595988 instant download
60 pages
Marinelli Full
No ratings yet
Marinelli Full
257 pages
StackGAN and AttnGAN
No ratings yet
StackGAN and AttnGAN
78 pages
Anomaly Detection in Big Data
No ratings yet
Anomaly Detection in Big Data
148 pages
Relaxed Abduction Robust Information Interpretation for Industrial Applications 1st Edition Thomas Hubauer (Auth.) instant download
100% (3)
Relaxed Abduction Robust Information Interpretation for Industrial Applications 1st Edition Thomas Hubauer (Auth.) instant download
68 pages
Masuda Jmasuda Meng Eecs 2024 Thesis
No ratings yet
Masuda Jmasuda Meng Eecs 2024 Thesis
76 pages
High-Performance and Low-Power Clock Network Synthesis in The Presence of Variation
No ratings yet
High-Performance and Low-Power Clock Network Synthesis in The Presence of Variation
174 pages
Coisas Prati Cas J Fuzzy Logic
No ratings yet
Coisas Prati Cas J Fuzzy Logic
94 pages
Dynamic Systems and Causal Structures in Psychology, Connecting Data and Theory 2020
No ratings yet
Dynamic Systems and Causal Structures in Psychology, Connecting Data and Theory 2020
252 pages
Integrated Color Sensor in Standard CMOS
No ratings yet
Integrated Color Sensor in Standard CMOS
97 pages
Buy ebook Data-Driven Engineering Design 2022nd Edition Liu cheap price
100% (4)
Buy ebook Data-Driven Engineering Design 2022nd Edition Liu cheap price
50 pages
MRLucas Thesis
No ratings yet
MRLucas Thesis
124 pages
Model Predictive Control in Thermal Management of Multiprocessor Systems-On-chip
No ratings yet
Model Predictive Control in Thermal Management of Multiprocessor Systems-On-chip
335 pages
Structure and Interpretation of Computer Programs Second Edition Hal Abelson And Gerald Jay Sussman All Chapters Instant Download
100% (1)
Structure and Interpretation of Computer Programs Second Edition Hal Abelson And Gerald Jay Sussman All Chapters Instant Download
51 pages
Immediate download Structure and Interpretation of Computer Programs Second Edition Hal Abelson And Gerald Jay Sussman ebooks 2024
100% (4)
Immediate download Structure and Interpretation of Computer Programs Second Edition Hal Abelson And Gerald Jay Sussman ebooks 2024
61 pages
GPU Methodologies For Numerical Partial Differential Equations Andrew Gloster
No ratings yet
GPU Methodologies For Numerical Partial Differential Equations Andrew Gloster
158 pages
(Ebook) Test Signal Generation for Service Diagnosis Based on Local Structure Graphs by Michael Ungermann ISBN 9783832595982, 3832595988 - Read the ebook online or download it for the best experience
100% (1)
(Ebook) Test Signal Generation for Service Diagnosis Based on Local Structure Graphs by Michael Ungermann ISBN 9783832595982, 3832595988 - Read the ebook online or download it for the best experience
51 pages
Deep Learning
No ratings yet
Deep Learning
91 pages
MasterarbeitAmelieNuesse_final(1)
No ratings yet
MasterarbeitAmelieNuesse_final(1)
100 pages
DylanEveringham MScThesis
No ratings yet
DylanEveringham MScThesis
64 pages
Statistical Machine Learning For Information Retrieval - Adam Berger PDF
No ratings yet
Statistical Machine Learning For Information Retrieval - Adam Berger PDF
147 pages
AI and Machine Learning Report Sample 3
No ratings yet
AI and Machine Learning Report Sample 3
59 pages
Systems_Engineering_and_Its_Application
No ratings yet
Systems_Engineering_and_Its_Application
7 pages
A Comparison of Deep Learning Methods For Time Series Forecasting With Limited Data
No ratings yet
A Comparison of Deep Learning Methods For Time Series Forecasting With Limited Data
55 pages
Evaluation of Text Transformers for Classifying Sentiment of Revi
No ratings yet
Evaluation of Text Transformers for Classifying Sentiment of Revi
104 pages
(Production and Logistics) Dr. Dirk C. Mattfeld (auth.) - Evolutionary Search and the Job Shop_ Investigations on Genetic Algorithms for Production Scheduling-Physica-Verlag Heidelberg (1996)
No ratings yet
(Production and Logistics) Dr. Dirk C. Mattfeld (auth.) - Evolutionary Search and the Job Shop_ Investigations on Genetic Algorithms for Production Scheduling-Physica-Verlag Heidelberg (1996)
162 pages
FULLTEXT01
No ratings yet
FULLTEXT01
68 pages
Continuum Scale Simulation of Engineering Materials: Fundamentals - Microstructures - Process Applications
From Everand
Continuum Scale Simulation of Engineering Materials: Fundamentals - Microstructures - Process Applications
Dierk Raabe
No ratings yet
New_fracture mechanics problems
No ratings yet
New_fracture mechanics problems
30 pages
4. Haghigat_2021_DeepLearning_SurrogateModel (1)
No ratings yet
4. Haghigat_2021_DeepLearning_SurrogateModel (1)
22 pages
3. cmame_paper
No ratings yet
3. cmame_paper
28 pages
6. PINNs_mDEM
No ratings yet
6. PINNs_mDEM
17 pages
PINNS_GADEM
No ratings yet
PINNS_GADEM
15 pages
New_cylinderr
No ratings yet
New_cylinderr
11 pages
Recent Advances in Quantum Computing for Drug Discovery and Development
No ratings yet
Recent Advances in Quantum Computing for Drug Discovery and Development
5 pages
VR Facial Animation Via Multiview Image Translation PDF
No ratings yet
VR Facial Animation Via Multiview Image Translation PDF
16 pages
Prompts Pour Chat GPT & Midjourney
No ratings yet
Prompts Pour Chat GPT & Midjourney
9 pages
2024 - Image and Video Tokenization With Binary Spherical Quantization - Zhao Et Al
No ratings yet
2024 - Image and Video Tokenization With Binary Spherical Quantization - Zhao Et Al
23 pages
Unit 2 v1.
No ratings yet
Unit 2 v1.
41 pages
DL Notes
No ratings yet
DL Notes
35 pages
LAKe-Net Topology-Aware Point Cloud Completion by Localizing Aligned Keypoints
No ratings yet
LAKe-Net Topology-Aware Point Cloud Completion by Localizing Aligned Keypoints
10 pages
Automatic Detection of Welding Defects Using Deep PDF
No ratings yet
Automatic Detection of Welding Defects Using Deep PDF
11 pages
Complete Download Deep Learning Applications, Volume 2 M. Arif Wani PDF All Chapters
100% (4)
Complete Download Deep Learning Applications, Volume 2 M. Arif Wani PDF All Chapters
52 pages
Seasonal Crops Disease Prediction and Classification Using Deep Convolutional Encoder Network
No ratings yet
Seasonal Crops Disease Prediction and Classification Using Deep Convolutional Encoder Network
19 pages
Algorithem Cheat Sheet
No ratings yet
Algorithem Cheat Sheet
25 pages
Deep Learning-Based MIMO Communications
No ratings yet
Deep Learning-Based MIMO Communications
9 pages
Malware Classification Using Deep Learning: Mohd Shahril
No ratings yet
Malware Classification Using Deep Learning: Mohd Shahril
48 pages
A Deep Community Detection Approach in Real Time Networks
No ratings yet
A Deep Community Detection Approach in Real Time Networks
14 pages
Chaos, Solitons and Fractals: Suat Toraman, Talha Burak Alakus, Ibrahim Turkoglu
No ratings yet
Chaos, Solitons and Fractals: Suat Toraman, Talha Burak Alakus, Ibrahim Turkoglu
11 pages
Neural Contextual Anomaly Detection For Time Series
No ratings yet
Neural Contextual Anomaly Detection For Time Series
22 pages
19. 100 Days of Machine Learning
No ratings yet
19. 100 Days of Machine Learning
45 pages
IJIRSET Paper Sample
No ratings yet
IJIRSET Paper Sample
4 pages
m13. Feuerriegel Et Al. (2024)
No ratings yet
m13. Feuerriegel Et Al. (2024)
28 pages
Hair Segmentation and Removal in Dermoscopic Images Using Deep Learning
No ratings yet
Hair Segmentation and Removal in Dermoscopic Images Using Deep Learning
11 pages
Ai ML
No ratings yet
Ai ML
1 page
Deep Learning with Python 2nd Edition François Chollet - The complete ebook version is now available for download
No ratings yet
Deep Learning with Python 2nd Edition François Chollet - The complete ebook version is now available for download
28 pages
Deep Learning Glossary
No ratings yet
Deep Learning Glossary
30 pages
2023 - An Advanced Deep Neural Network for Fundus Image Analysis and Enhancing Diabetic Retinopathy Detection
No ratings yet
2023 - An Advanced Deep Neural Network for Fundus Image Analysis and Enhancing Diabetic Retinopathy Detection
19 pages
Eric Jang - Tutorial - Categorical Variational Autoencoders Using Gumbel-Softmax
No ratings yet
Eric Jang - Tutorial - Categorical Variational Autoencoders Using Gumbel-Softmax
8 pages
Rahil Merged
No ratings yet
Rahil Merged
27 pages
Data Science Course Syllabus 01
100% (1)
Data Science Course Syllabus 01
20 pages
Inferring The Time-Varying Coupling of Dynamical Systems With Temporal
No ratings yet
Inferring The Time-Varying Coupling of Dynamical Systems With Temporal
17 pages
23 DeepLearning PDF
No ratings yet
23 DeepLearning PDF
74 pages

Deep learning in computational mechanics a review

Uploaded by

Deep learning in computational mechanics a review

Uploaded by

Computational Mechanics (2024) 74:281–331

Deep learning in computational mechanics: a review

Contents 3.3.1 Algorithm enhancement . . . . . . . . . . . . . . 298

1 Introduction for material mechanics, [32] for constitutive modeling, [33]

(a) Publications in all fields (b) Publications within computational mechanics

N [u; λ] = 0, on × T , (2) 2.1.1 Space-time approaches

Fig. 2 Conceptual illustration

2.1.1.1. Fully connected neural networks

û(x) = FF N N (λ; x; θ ) (7)

of the data, which for GNNs are edges connecting vertices

• training data: generalization error,

A lack of sufficient training data leads to poor generaliza-

B N min max C, (36)

Fig. 6 Refinement expressed

with a NN. Sparsity is ensured with a L 1 -regularization. u n+ci

to identify fundamental laws in the data. If unsuccessful, a u n+1 = u n + t b j N T [u n+c j ], (44)

estimation of effective bone properties from the boundary

The coefficients αi are predicted by NNs from the current

Multiscale methods have been proposed to efficiently inte-

[457, 459]. In [457, 459], an evolutionary algorithm was

of time steps. Thus, a simple coarse-to-fine post-processing K i j u j − bi = 0, i = 1, 2, . . . , N . (64)

apply a correction at each time step before the coarse solver

displacement over the entire domain u is obtained by super-

Fig. 10 HiDeNN with

Lastly, another approach related to finite elements was

r = f −K ·u (72) FEA-Net is able to outperform classical finite elements for

Fig. 12 Analog RNN

Here, f is the evaluation of one recurrent unit in the RNN.

Fig. 13 A single building block of the deep material network [522]

5 Generative approaches is presented in [547] to reconstruct three-dimensional voxel

27 Note however, that the discriminator does not guarantee an accurate

in order to best match simulation and measurement [608],

Open Access This article is licensed under a Creative Commons

28 An undirected graph can also be considered by treating it as a bi-

Fig. 16 An unrolled RNN. The

B Generative approaches B.2 Generative adversarial networks

B.1 Autoencoders GANs [19] emulate data distributions by setting up a two-

that corresponds to the distribution q(x ), a forward noising

V (s) = max Vπ (s), (87) ∞

periods. J Hydrol 607:127542. https://doi.org/10.1016/j.jhydrol. J Comput Phys 403:109056. https://doi.org/10.1016/j.jcp.2019.

learning. https://platformwiskunde.nl/wp-content/uploads/2021/ Springer, Cham, pp 55–84. https://doi.org/10.1007/978-3-030-

You might also like