
Neural Networks 162 (2023) 472–489

Contents lists available at ScienceDirect

Neural Networks
journal homepage: www.elsevier.com/locate/neunet

Deep learning-accelerated computational framework based on Physics Informed Neural Network for the solution of linear elasticity

Arunabha M. Roy a,∗, Rikhi Bose b, Veera Sundararaghavan c, Raymundo Arróyave a,d
a Department of Materials Science and Engineering, Texas A&M University, 3003 TAMU, College Station, TX 77843, USA
b Mechanical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
c Department of Aerospace Engineering, University of Michigan, Ann Arbor, MI 48109, USA
d Department of Mechanical Engineering, Texas A&M University, 3003 TAMU, College Station, TX 77843, USA

Article history:
Received 5 July 2022
Received in revised form 7 February 2023
Accepted 8 March 2023
Available online 13 March 2023

Keywords:
Physics Informed Neural Networks (PINNs)
Artificial neural networks (ANNs)
Linear elasticity
Bi-harmonic equations
Deep learning

Abstract

The paper presents an efficient and robust data-driven deep learning (DL) computational framework developed for linear continuum elasticity problems. The methodology is based on the fundamentals of the Physics Informed Neural Networks (PINNs). For an accurate representation of the field variables, a multi-objective loss function is proposed. It consists of terms corresponding to the residual of the governing partial differential equations (PDE), constitutive relations derived from the governing physics, various boundary conditions, and data-driven physical knowledge fitting terms across randomly selected collocation points in the problem domain. To this end, multiple densely connected independent artificial neural networks (ANNs), each approximating a field variable, are trained to obtain accurate solutions. Several benchmark problems including the Airy solution to elasticity and the Kirchhoff–Love plate problem are solved. Performance in terms of accuracy and robustness illustrates the superiority of the current framework, showing excellent agreement with analytical solutions. The present work combines the benefits of the classical methods, which depend on the physical information available in analytical relations, with the superior capabilities of DL techniques in the data-driven construction of lightweight, yet accurate and robust neural networks. The models developed herein can significantly boost computational speed using minimal network parameters with easy adaptability to different computational platforms.

© 2023 Elsevier Ltd. All rights reserved.

1. Introduction

In recent years, driven by the advancement of big data-based architectures (Khan et al., 2022), deep learning (DL) techniques (LeCun, Bengio, & Hinton, 2015) have shown great promise in computer vision (Roy & Bhaduri, 2021, 2022; Roy, Bose & Bhaduri, 2022c), object detection (Chandio et al., 2022; Roy, Bhaduri, Kumar, & Raj, 2022a, 2022b), image/signal classification (Irfan et al., 2021; Jamil, Abbas, & Roy, 2022; Jamil & Roy, 2023), damage detection (Glowacz, 2021, 2022), brain–computer interfaces (Roy, 2022a, 2022b, 2022c), and across various scientific applications (Bose & Roy, 2022; Ching et al., 2018).

The success of these methods, such as various classes of Neural Networks (NNs), can be largely attributed to their capacity for mining large volumes of data to establish complex high-dimensional non-linear relations between input features and output (Kutz, 2017). However, the availability of sufficient data is a major bottleneck for analyzing various complex physical systems (Butler, Davies, Cartwright, Isayev, & Walsh, 2018; Ching et al., 2018). Consequently, the majority of state-of-the-art machine learning algorithms lack robustness in predicting these systems. Given sufficient data, these methods have also garnered considerable success in problems governed by physics, such as dynamical systems (Dana & Wheeler, 2020), geosciences (Jahanbakht, Xiang, & Azghadi, 2022; Racca & Magri, 2021; Saha, Dash, & Mukhopadhyay, 2021), materials science and informatics (Batra, Song, & Ramprasad, 2021; Butler et al., 2018; Määttä et al., 2021), fluid mechanics (Brunton, Noack, & Koumoutsakos, 2020; Kutz, 2017), various constitutive modeling tasks (Tartakovsky, Marrero, Perdikaris, Tartakovsky, & Barajas-Solano, 2018; Xu, Huang, & Darve, 2021), etc. Their applicability, however, may be further enhanced by utilizing physical information available by mathematical/analytical means. The recent endeavor of the scientific and engineering community has been to incorporate such physical information within predictive schemes in small-data regimes.

∗ Corresponding author. E-mail addresses: [email protected], [email protected] (A.M. Roy).

The incorporation of physical information into the DL framework may have several advantages. First, as previously mentioned, in the absence of sufficient data, it may be possible to solely

https://doi.org/10.1016/j.neunet.2023.03.014

utilize physical knowledge for solving such problems (Raissi, Perdikaris, & Karniadakis, 2019), or at the least, enhance solutions in a data-driven predictive scheme (Karniadakis et al., 2021; Raissi, Yazdani, & Karniadakis, 2020). For example, in Sirignano and Spiliopoulos (2018), a high-dimensional Hamilton–Jacobi–Bellman PDE has been solved by approximating the solution with a DNN trained to satisfy the differential operator, initial condition, and boundary conditions. In incompressible fluid mechanics, the use of the solenoidality condition of the velocity field restricts the solution space of the momentum equations. Therefore, this condition may be used as a constraint for solving the governing equations (conventional solvers are generally developed in a way to satisfy this constraint through the Poisson equation for pressure), or at least to improve the predictions in a data-driven approach. Second, physical systems are often governed by laws that must satisfy certain properties, such as invariance under translation, rotation, reflection, etc. In a purely data-driven approach, it is almost impossible for a DL algorithm to inherit those properties entirely from data without explicit external forcing. Embedding such properties in the DL algorithm might automatically improve the accuracy of the predictions. For example, Ling, Kurzawski, and Templeton (2016) used a Tensor-based Neural Network (TBNN) to embed Galilean invariance that improved NN models for Reynolds-averaged Navier–Stokes (RANS) simulations for the prediction of turbulent flows. And lastly, any scientific problem is governed by some underlying mechanism dictated by physical laws. Neglecting such physical information in a purely data-driven framework is, therefore, an unsophisticated approach, if not an ignorant one.

Partial differential equations (PDEs) represent underlying physical processes governed by first principles such as conservation of mass, momentum, and energy. In most cases, analytical solutions to these PDEs are not obtainable. Various numerical methods, such as finite-difference (Sengupta, 2013), finite element (FE) (Zienkiewicz & Taylor, 2005), and Chebyshev and Fourier spectral methods (Boyd, 2001), are used to obtain approximate solutions. However, such techniques are often computationally expensive and suffer from various sources of error due to the complex nature of the underlying PDEs, numerical discretization and integration schemes, iterative convergence techniques, etc. Moreover, the solution of inverse problems is a current endeavor of the engineering community, which requires complex formulations and is often prohibitively expensive computationally. The use of NNs in solving/modeling the PDEs governing physical processes in forward/inverse problems is an important challenge worth pursuing, as these methods have the capacity to provide accurate solutions using limited computational resources in a significantly more robust framework relative to conventional methods. In this paper, we explore the possibility of using NNs to obtain solutions to such PDEs governing linear continuum elasticity problems applicable in solid mechanics.

There has been a recent thrust in developing machine learning (ML) approaches to obtain the solution of governing PDEs (Karniadakis et al., 2021; von Rueden et al., 2019). The idea is to combine traditional scientific computational modeling with a data-driven ML framework to embed scientific knowledge into neural networks (NNs) and thereby improve the performance of learning algorithms (Karniadakis et al., 2021; Lagaris, Likas, & Fotiadis, 1998; Raissi & Karniadakis, 2018). Physics Informed Neural Networks (PINNs) (Lagaris et al., 1998; Raissi et al., 2019, 2020) were developed for the solution and discovery of non-linear PDEs, leveraging the capabilities of deep neural networks (DNNs) as universal function approximators, and have achieved considerable success in solving forward and inverse problems in different physical settings such as fluid flows (Jin, Cai, Li, & Karniadakis, 2021; Sun, Gao, Pan, & Wang, 2020), multi-scale flows (Lou, Meng, & Karniadakis, 2021), heat transfer (Cai, Wang, Wang, Perdikaris, & Karniadakis, 2021; Zhu, Liu, & Yan, 2021), poroelasticity (Haghighat, Amini, & Juanes, 2022), material identification (Shukla, Jagtap, Blackshire, Sparkman, & Karniadakis, 2021), geophysics (bin Waheed, Alkhalifah, Haghighat, & Song, 2022; bin Waheed, Haghighat, Alkhalifah, Song, & Hao, 2021), supersonic flows (Jagtap, Mao, Adams, & Karniadakis, 2022), and various other applications (Bekar, Madenci, Haghighat, Waheed, & Alkhalifah, 2022; Waheed, Haghighat, Alkhalifah, Song, & Hao, 2020). Contrary to traditional DL approaches, PINNs enforce the underlying PDEs and the boundary conditions in the solution domain, ensuring the correct representation of the governing physics of the problem. Learning of the governing physics is ensured by the formulation of the loss function that includes the underlying PDEs; therefore, labeled data to learn the mapping between inputs and outputs is no longer necessary. Such architectural construction can be utilized for complex forward and inverse (parameter-finding) solutions of various systems of ODEs and PDEs (Karniadakis et al., 2021). Additionally, the feed-forward neural networks utilize graph-based automated differentiation (AD) (Baydin, Pearlmutter, Radul, & Siskind, 2018) to approximate the derivative terms in the PDEs. Various PINNs architectures, notably self-adaptive PINNs (McClenny & Braga-Neto, 2020) and extended PINNs (XPINN) (De Ryck, Jagtap, & Mishra, 2022; Hu, Jagtap, Karniadakis, & Kawaguchi, 2021), have been proposed and demonstrate superior performance. Moreover, multiple DNN-based solvers such as cPINN (Jagtap, Kharazmi, & Karniadakis, 2020), XPINNs (Jagtap & Karniadakis, 2021), and a PINNs framework for solid mechanics (Haghighat, Raissi, Moure, Gomez & Juanes, 2021) have been developed that provide important advancements in terms of both robustness and faster computation. In this regard, Haghighat, Raissi, Moure, Gomez, and Juanes (2020) and Haghighat, Raissi et al. (2021) have been the breakthrough works geared towards developing a DL-based solver for inversion and surrogate modeling in solid mechanics for the first time utilizing PINNs theory. Additionally, PINNs have been successfully applied to solution and discovery in linear elastic solid mechanics (Guo & Haghighat, 2020; Haghighat, Bekar, Madenci & Juanes, 2021; Rezaei, Harandi, Moeineddin, Xu, & Reese, 2022; Roy & Bose, 2023; Samaniego et al., 2020; Vahab, Haghighat, Khaleghi, & Khalili, 2021; Zhang, Dao, Karniadakis, & Suresh, 2022; Zhang, Yin, & Karniadakis, 2020), elastic-viscoplastic solids (Arora, Kakkar, Dey, & Chakraborty, 2022; Frankel, Tachida, & Jones, 2020; Goswami, Yin, Yu, & Karniadakis, 2022), elastoplastic materials (Roy & Guha, 2022, 2023), brittle fracture (Goswami, Anitescu, Chakraborty, & Rabczuk, 2020), and computational elastodynamics (Rao, Sun, & Liu, 2021), etc. The solution of PDEs corresponding to elasticity problems can be obtained by minimizing the network's loss function, which comprises the residual error of the governing PDEs and the initial/boundary conditions. In this regard, PINNs can be utilized as a computational framework for the data-driven solution of PDE-based linear elasticity problems that can significantly boost computational speed with limited network parameters. The potential of the PINNs framework in achieving computational efficiency beyond the capacity of conventional computational methods for solving complex problems in linear continuum elasticity is the main motivation behind the present work.

In the present work, an efficient data-driven deep learning computational framework has been presented based on the fundamentals of PINNs for the solution of the linear elasticity problem in continuum solid mechanics. In order to efficiently incorporate physical information for the elasticity problem, an improved multi-objective loss considering additional physics-constrained terms has been carefully formulated; it consists of the residual of the governing PDE, various boundary conditions, and data-driven physical knowledge fitting terms that demonstrate the

efficacy of the model by accurately capturing the elasticity solution. Several benchmark problems, including the Airy solution to an elastic plane-stress problem for an end-loaded cantilever beam and a simply supported rectangular Kirchhoff–Love thin plate under transverse sinusoidal loading, have been solved, which illustrates the superiority of the proposed model in terms of accuracy and robustness by revealing excellent agreement with analytical solutions. The employed models consist of independent multi-layer ANNs that are separately trained on minimizing the prescribed loss function specific to the problem under consideration. The performance of PINNs has been evaluated for different activation functions and network architectures. Furthermore, we have illustrated the applicability of data-driven enhancement using smart initialization of a data-driven learning-based approach in reducing training time while simultaneously improving the accuracy of the model, which is not possible in conventional numerical algorithms. Such an approach would be important in achieving computational efficiency beyond the capacity of conventional computational methods for solving complex linear elasticity problems. The present study also demonstrates the contribution of analytical solutions to the data-driven construction of an accurate and robust PINNs framework that can significantly boost computational speed utilizing minimal trainable network parameters.

The paper is organized as follows: Section 2 introduces the background of PINNs theory and the generalized idea of implementing a multi-objective loss function into the PINNs framework; Section 3 presents a brief overview of the theory of linear elasticity; Section 4 introduces the extension of the proposed PINNs framework for the Airy solution to an elastic plane-stress problem for an end-loaded cantilever beam; in Section 5, the proposed PINNs framework is extended to the solution of the Kirchhoff–Love thin plate governed by a biharmonic PDE; Section 7 deals with the relevant findings and prospects of the current work. Finally, the conclusions are discussed in Section 7.

2. Physics-informed neural networks

The core of training a NN in the PINNs framework is the construction of the loss function. The loss function is intended to embed the underlying physics, which is represented in mathematical terms by the PDEs and the associated boundary conditions. In this section, we discuss the construction of the proposed multi-objective loss functions for embedding a data-driven physical model that has been associated with the PINNs framework.

Let us consider a fully connected NN defined by

N^(k+1)(N^k) = ς^k(W^k · N^k + b^k)    (1)

where k ∈ {0, 1, ..., N} represents the layer number of the NN. N^m is a nonlinear map defined by N^m(x^m) := ς^m(W^m · x^m + b^m) for the mth layer, where W^m and b^m represent the weights and biases of this transformation, respectively; ς(·) is the non-linear transformer or activation function acting on a vector element-wise. Therefore, k = 0 represents the input layer of the NN taking in the input x^0.

Also consider a steady-state general nonlinear partial differential operator G operated on a scalar solution variable φ(x⃗) such that

G φ(x⃗) = 0,  x⃗ ∈ R^ndim    (2)

Since G is a differential operator, in general, Eq. (2) is accompanied by appropriate boundary conditions to ensure the existence and uniqueness of a solution. Let us assume it is subjected to the boundary condition B φ(∂Γ⃗) = τ(∂x⃗) on the boundary Γ⃗ in the domain Ω ∈ R^ndim, ndim being the spatial dimension. In a PINNs framework, the solution to Eq. (2), φ(x), subjected to the aforementioned boundary condition, may be approximated for an input x = x⃗ by constructing a feed-forward NN expressed mathematically as

φ̂ = N^N ⊚ N^(N−1) ⊚ · · · ⊚ N^0(x)    (3)

where φ̂ is the approximate solution to Eq. (2); ⊚ denotes the general compositional construction of the NN; the input to the NN, N^0 := x^0 = x⃗ = (x1, x2, ..., x_ndim), is the spatial coordinate at which the solution is sought. Following Eqs. (1) and (3), if W^i and b^i are all collected in θ = ⋃_{i=0}^{N} (W^i, b^i), the output layer N^N contains the approximate solution φ̂(x⃗) to the PDE such that

N^(k+1) = φ̂[x, θ] = [φ̂1, φ̂2, ..., φ̂m]    (4)

The spatial dependence of φ̂ is implicitly contained in the NN parameters θ. In the internal/hidden layers of the NN, several variants of the nonlinear transformer or activation function ς may be used, such as the hyperbolic-tangent function tanh(ξ), the sigmoid function ς(ξ) = 1/(1 + e^(−ξ)), the rectified linear unit (ReLU) ς(ξ) = max(0, ξ), etc. The activation in the final layer is generally taken to be linear for the regression-type problems considered here.

2.1. Embedding constraints in NN

This section briefly describes the general idea of embedding linear constraints into a NN (Du & Zaki, 2021; Lagaris et al., 1998). Let us consider U and A, two complete normed vector spaces, where the NN function class M ⊂ U needs to be constrained. A linear constraint on φ ∈ M can be expressed as:

P φ(x) = 0,  φ ∈ M    (5)

where P : U → A expresses a linear operator on U. Generally, such a constraint can be realized for solving PDEs in most DL frameworks by minimizing the following functional

J_A = ∥P φ∥_A,  φ ∈ M    (6)

where ∥·∥_A denotes the norm corresponding to the space A. It is noteworthy that the aforementioned procedure only approximately enforces the linear constraint in Eq. (5). The accuracy of the imposed constraint relies on the relative weighting between the constraint and the other objectives involved in the training, including the satisfaction of the governing PDEs or the integration of data-driven schemes.

2.2. Multiple objective loss functions

In order to incorporate physical information of the problem, one possibility is to impose Eq. (2) as a hard constraint in x ∈ Ω while training the NN on the physical data. Mathematically, such a condition is imposed by formulating a constrained optimization problem, which can be expressed as (Krishnapriyan, Gholami, Zhe, Kirby, & Mahoney, 2021),

min_θ ∆_L(x, θ)  s.t.  G φ(x⃗) = 0.    (7)

where ∆_L represents the data-driven physical knowledge fitting term, which includes the imposed initial and boundary conditions. G φ(x⃗) denotes the constraint corresponding to the PDE residual, imposing the governing PDE itself. Thus, it is important to carefully impose appropriate constraints for the NN to realize the underlying physics of the problem.

In the present work, we propose a multi-objective loss function that consists of residuals of the governing PDEs, various boundary conditions, and data-driven physical knowledge fitting terms, which can be expressed in the following general form:

∆_L(x, θ) = ϕ∥G φ(x) − 0̂∥_Ω + β_u∥B_Γu φ − g_Γu∥_Γu

    + β_t∥B_Γt φ − g_Γt∥_Γt + α∥φ − φ̂∥_Ω + · · ·    (8)

where ∆_L(x, θ) is the total loss function; the symbol ∥·∥ represents the mean squared error norm, i.e., ∥·∥ = MSE(·) for the regression-type problem; ∥G φ(x) − 0̂∥_Ω denotes the residual of the governing differential relation in Eq. (2) for x ∈ Ω; Γu and Γt are the Dirichlet and Neumann boundaries subjected to the conditions B_Γu φ = g_Γu and B_Γt φ = g_Γt, respectively. The values of g_Γu and g_Γt are specific to the problem under consideration and are therefore pre-specified as inputs to the problem/loss function. Note that ϕ, β_u, and β_t are weights associated with each loss term, regularizing the emphasis on each term (the higher the relative value, the more emphasis on satisfying the relation). The remaining task is to utilize standard optimization techniques to tune the parameters of the NN, minimizing the proposed objective/loss function ∆_L(x, θ) in Eq. (8).

However, even with a large volume of training data, such an approach may not guarantee that the NN strictly obeys the conservation/governing equations in Eq. (2). Thus, additional loss terms to fit the observation data can be introduced. Hence, in the proposed objective loss function, additional loss terms such as ∥φ − φ̄∥_Ω have been included that represent the data-driven physical knowledge fitting term for the state variable φ(x⃗). Here, φ̄ is the true (target) value of φ provided from either the analytical solution (if available), numerical simulation, or experimental observations. α is the weight associated with the data-driven physical knowledge fitting term for φ(x⃗). In the NN approximation, various degrees of differentials of the state variable φ(x) (i.e., φ′(x), φ′′(x), ...) can also be included (if known) for stronger coupling in the data-driven approach. The partial differentials of φ(x) may be evaluated utilizing graph-based automatic differentiation (Baydin et al., 2018) with multiple hidden layers representing the nonlinear response in PINNs. Following the same steps, the initial conditions can also be incorporated in Eq. (8). The loss from the initial conditions is not included herein due to the quasi-static nature of the elasticity problem. In a more general case, the additional loss term ∥φ0 − φ̂0∥_Ω evaluated at t = t0 should be added for the loss contribution from the initial condition.

Finally, the optimal network parameters θ̃ can be obtained by optimizing the loss function in Eq. (8) as

θ̃ = arg min_{θ ⊂ R^Nt} ∆_L(X̄, θ).    (9)

where θ̃ := ⋃_{i=0}^{N} (W̃^i, b̃^i) is the set of optimized network parameters; Nt is the total number of trainable parameters; and X̄ ∈ R^(Nc×Nt) is the set of Nc collocation points used for optimization.

3. Theory of linear elastic solid

Consider an undeformed configuration B of an elastic body bounded in the domain Ω ⊂ R^ndim (1 ≤ ndim ≤ 3) with boundary Γ = Γu ∪ Γt, where Γu ≠ ∅ is the Dirichlet boundary, Γt is the Neumann boundary, and Γu ∩ Γt = ∅. With respect to the undeformed surface, the elastic body can be subjected to a prescribed displacement ū on Γ_D and a prescribed surface traction t̄ ∈ [L2(Γt)]^ndim. Additionally, a body force of density B ∈ [L2(Ω)]^ndim in Ω can be prescribed with respect to the undeformed volume. Using a standard basis {e_i} in R^ndim, we can express the displacement, u = u_i e_i, and its gradient, ∇u = ½(u_i,j + u_j,i) e_i ⊗ e_j, where ⊗ denotes the tensor product. Second-order symmetric tensors are linear transformations in S, defined as S := {ξ : R^ndim → R^ndim | ξ = ξ^T} with inner product ξ : ξ = tr[ξξ^T] ≡ ξ_ij ξ_ij. Therefore, the stress tensor can be expressed as σ := σ_ij e_i ⊗ e_j. For infinitesimal strain, the displacement gradient tensor ∇u can be decomposed as ∇u = ε + ω, where ε := ½[∇u + ∇(u)^T] is the infinitesimal strain tensor with ∇ × ε = e_ijk ε_rj,i e_k ⊗ e_r, and ω := ½[∇u − ∇(u)^T] is the infinitesimal rotation tensor.

3.1. Compatibility condition

In the context of infinitesimal strain theory, we seek to find u : Ω → R^ndim and the corresponding ε : Ω → R^(ndim×ndim) and σ : Ω → R^(ndim×ndim) for a given infinite elastic solid satisfying the following compatibility condition (Marsden & Hughes, 1994):

R := ∇ × (∇ × ε)^T = 0;    (10)

where R is the Saint-Venant compatibility tensor. Alternatively, the elastic solid should satisfy the Navier–Cauchy equations, which can be expressed as (Lurie, 2010):

(λ + µ)∇(∇ · u) + µ∆u + B = 0,  in Ω
u|_ΓD = ū;    (11)

where u = (u1, u2, ..., u_ndim) is the unknown displacement field; µ > 0 and λ > −µ are the Lamé constants; ∇, ∆, and ∇· represent the gradient, the Laplacian, and the divergence operators, respectively. Eq. (11) satisfies the continuity of the displacement field u and the Dirichlet boundary condition.

3.2. Equilibrium condition

In addition, the equilibrium condition and the Neumann boundary condition should be satisfied, which can be expressed as (Marsden & Hughes, 1994):

∇ · σ + B = 0,  in Ω
t := Tu = σ|_Γt n̂ = t̄,  on Γt    (12)

where t̄ is a prescribed function on Γt; n̂ is the unit normal field on Γt. Eq. (12) satisfies the momentum equation and the Neumann boundary condition, where T follows the conormal derivative operator such that (Atkin & Fox, 2005)

Tu = λ(∇ · u) n̂ + 2µ ∂u/∂n̂ + µ n̂ × (∇ × u)    (13)

3.3. Constitutive relation

Subsequently, the elastic constitutive relation can be expressed from generalized Hooke's law (Timoshenko, 1970) as:

σ = C : ε    (14)

where the fourth-order stiffness tensor C = C_ijkl e_i ⊗ e_j ⊗ e_k ⊗ e_l denotes the constitutive relation that maps the displacement gradient ∇u to the Cauchy stress tensor σ. For an isotropic linearly elastic material, C_ijkl = λδ_ij δ_kl + µ(δ_ik δ_jl + δ_il δ_jk), where δ_ij is the Kronecker delta. The components of the stress tensor σ and the strain tensor ε are expressed as:

σ_ij(u) = λ δ_ij Σ_{k=1}^{ndim} ε_kk(u) + 2µ ε_ij(u),  ε_ij(u) = ½(∂u_i/∂x_j + ∂u_j/∂x_i),  i, j = 1, 2, ..., ndim.    (15)

Note that σ is the Cauchy stress tensor in linear elasticity, applicable under small deformation. The compatibility condition in terms of strain can be written as

ε_ij,kl + ε_kl,ij − ε_ik,jl − ε_jl,ik = 0,  i, j, k, l ∈ {1, 2, ..., ndim}.    (16)

The equations governing a linear elastic boundary value problem (BVP) are defined by Eqs. (11)–(16), where the field variables u, σ, ε can be obtained for given material constants (Atkin & Fox, 2005; Lurie, 2010).
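As a concrete check of the constitutive map in Eqs. (14)–(15), the isotropic stress–strain relation can be written in a few lines of NumPy; the Lamé constants and the displacement gradient below are illustrative values, not taken from the paper:

```python
import numpy as np

def hookes_stress(eps, lam, mu):
    """Eq. (15): sigma_ij = lam * delta_ij * eps_kk + 2 * mu * eps_ij (isotropic)."""
    return lam * np.trace(eps) * np.eye(eps.shape[0]) + 2.0 * mu * eps

# Small-strain tensor from a displacement gradient: eps = (grad_u + grad_u^T) / 2
grad_u = np.array([[1.0e-3, 2.0e-4],
                   [4.0e-4, -5.0e-4]])
eps = 0.5 * (grad_u + grad_u.T)

sigma = hookes_stress(eps, lam=1.0, mu=0.5)
assert np.allclose(sigma, sigma.T)  # Cauchy stress stays symmetric
```

This same map is what a constitutive loss term penalizes when the stress-network output deviates from the stress computed through the strain obtained from the displacement network.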

Fig. 1. PINNs network architecture for solving the linear elasticity problem, consisting of multiple ANNs (NN_i ∀ i = 1, k), one for each output variable ũx^NN(x), ũy^NN(x), σ̃xx^NN(x), σ̃yy^NN(x), σ̃xy^NN(x), ε̃xx^NN(x), ε̃yy^NN(x), and ε̃xy^NN(x), with the independent variable x = (x, y) as input features.
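The layout of Fig. 1, one independent densely connected network per field variable sharing the input x = (x, y), can be mocked up framework-agnostically. The following is a minimal NumPy sketch of the forward pass only (the paper's actual implementation uses SciANN/Keras, and the layer widths and point counts here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(layers):
    """Parameters of one dense network, Eq. (1): N^{k+1} = act(W^k N^k + b^k)."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(layers[:-1], layers[1:])]

def forward(params, x):
    """tanh hidden layers; linear output layer for the regression-type problem."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

# One independent ANN per output field variable, as in Fig. 1
fields = ["ux", "uy", "sxx", "syy", "sxy", "exx", "eyy", "exy"]
nets = {f: init_mlp([2, 30, 30, 1]) for f in fields}  # input features x = (x, y)

X = rng.random((100, 2))                 # sample collocation points in the domain
preds = {f: forward(nets[f], X) for f in fields}
```

Training would jointly minimize the multi-objective loss of Eq. (19) over all of these networks; automatic differentiation (not shown) would supply the PDE residual terms.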

4. PINNs formulation for continuum linear elasticity

The proposed PINNs framework is applied to linearly elastic solids. A two-dimensional (ndim = 2) problem is considered. The input features (variables) to the models are the spatial coordinates x = (x, y). A separate NN is used to approximate each output field variable. As shown in Fig. 1, the displacement u(x), stress σ(x), and strain ε(x) fields are obtained by densely connected independent ANNs. For ndim = 2, considering the symmetry of the stress and strain tensors, the u(x), σ(x), and ε(x) fields can be approximated as:

u(x) ≃ Ξu^NN(x) = [ũx^NN(x), ũy^NN(x)]^T    (17)

σ(x) ≃ Ξσ^NN(x) = [σ̃xx^NN(x), σ̃xy^NN(x); σ̃yx^NN(x), σ̃yy^NN(x)];  ε(x) ≃ Ξε^NN(x) = [ε̃xx^NN(x), ε̃xy^NN(x); ε̃yx^NN(x), ε̃yy^NN(x)]    (18)

Here Ξu^NN(x), Ξσ^NN(x), and Ξε^NN(x) denote the NN approximations for u(x), σ(x), and ε(x), respectively.

4.1. Loss function

To define the loss function for the linear elasticity problem, the governing equations, including the compatibility conditions, equilibrium conditions, constitutive relations, and boundary conditions that fully describe the problem, have been considered. Additionally, as in a data-driven approach, the field variables in Eq. (8) have been included. The generalized multi-objective loss functional ∆_L can be expressed as:

∆_L(x, θ) = ϕ ∆_L^Ω + ϕe ∆_L^e + ϕc ∆_L^c + βu ∆_L^Γu + βt ∆_L^Γt + αu ∆_L^u + ασ ∆_L^σ + αε ∆_L^ε    (19)

where ∆_L^e, ∆_L^c, and ∆_L^Ω are the loss components from the equilibrium condition (Eq. (12)), the constitutive relation (Eq. (14)), and the compatibility condition (Eq. (15)), respectively; ∆_L^Γu and ∆_L^Γt represent the loss components computed at the Dirichlet boundary Γu and the Neumann boundary Γt (Eq. (11)), respectively; ∆_L^u, ∆_L^σ, and ∆_L^ε are the loss components for the fields u(x), σ(x), and ε(x), respectively, when a data-driven approach is pursued. The coefficients ϕ, ϕe, ϕc, βu, βt, αu, ασ, and αε are the weights associated with each loss term that dictate the emphasis on each penalty term. Evidently, the terms in the cost function are measures of the errors in the displacement and stress fields, the momentum balance, and the constitutive law. The explicit expressions for each term in ∆_L(x, θ) are,

∆_L^Ω = (1/Nc^Ω) Σ_{l=1}^{Nc^Ω} ∥∇ · Ξσ^NN(x_l|Ω) + B(x_l|Ω)∥    (20)

∆_L^c = (1/Nc^Ω) Σ_{l=1}^{Nc^Ω} ∥Ξσ^NN(x_l|Ω) − C : ∇Ξu^NN(x_l|Ω)∥    (21)

∆_L^Γu = (1/Nc^Γu) Σ_{k=1}^{Nc^Γu} ∥Ξu^NN(x_k|Γu) − ū(x_k|Γu)∥    (22)

∆_L^Γt = (1/Nc^Γt) Σ_{j=1}^{Nc^Γt} ∥Ξσ^NN(x_j|Γt) n̂ − t̄(x_j|Γt)∥    (23)

∆_L^u = (1/Nc^Ω) Σ_{l=1}^{Nc^Ω} ∥Ξu^NN(x_l|Ω) − û(x_l|Ω)∥    (24)

∆_L^σ = (1/Nc^Ω) Σ_{l=1}^{Nc^Ω} ∥Ξσ^NN(x_l|Ω) − σ̂(x_l|Ω)∥    (25)

∆_L^ε = (1/Nc^Ω) Σ_{l=1}^{Nc^Ω} ∥Ξε^NN(x_l|Ω) − ε̂(x_l|Ω)∥    (26)

where {x_1|Ω, ..., x_Nc^Ω|Ω} are randomly chosen collocation points over the domain Ω; {x_1|Γu, ..., x_Nc^Γu|Γu} and {x_1|Γt, ..., x_Nc^Γt|Γt} are those chosen randomly along the boundaries Γu and Γt,

Fig. 2. (a) Elastic plane-stress problem for an end-loaded cantilever beam of length L, height 2a and out-of-plane thickness b which has been clamped at x = L; (b)
distributions of total collocations points Nc = 5000 on the problem domain and various boundaries during PINNs training.

respectively. The terms û(xl|Ω), σ̂(xl|Ω), and ε̂(xl|Ω) represent the true (target) values obtained by means of an analytical solution or a high-fidelity simulation. The weights ϕ, ϕe, ϕc ∈ R⁺ are the weights corresponding to the compatibility, equilibrium, and constitutive relations, respectively. In general, these coefficients can be prescribed as 1 for solving a relatively less complex problem, whereas βu and βt are binary (i.e., either 0 or 1) integers. The weights αi = 1 ∀ i = u, σ, ε for a completely data-driven approach for u(x), σ(x), and ε(x), respectively, at the collocation points NcΩ. However, we prescribe αi = 0 ∀ (i = u, σ, ε) when labeled training data are unavailable, which may not guarantee the accuracy of the PINNs solutions.

The forward problem is studied herein, where the displacement, stress, and strain fields are obtained as the PINNs solutions assuming the material properties λ and µ remain constant. However, the loss functional in Eq. (19) can also be utilized in an inverse problem for parameter identification, where λ and µ can be treated as network outputs which may vary during training (Fig. 1). For the network construction in the PINNs framework, SciANN (Haghighat & Juanes, 2021), a convenient high-level Keras (Chollet et al., 2015) wrapper for PINNs, is used.

4.2. Solution for linear elasticity problem

For this study, an end-loaded isotropic linearly elastic cantilever beam of height 2a, length L, and thickness b (assuming b ≪ a) has been considered to ensure a state of plane stress, as shown in Fig. 2. The left edge of the beam is subjected to a resultant force P, whereas the right-hand end is clamped. The top and bottom surfaces of the beam, y = ±a, are traction free. An approximate solution to the problem can be obtained from the Airy function discussed next.

4.2.1. The Airy solution to the end-loaded cantilever beam

The Airy solution in Cartesian coordinates Ω ⊂ R² can be found from the Airy potential φ(x, y) that satisfies (Bower, 2009)
\[
\nabla^4 \varphi = \frac{\partial^4 \varphi}{\partial x^4} + 2\frac{\partial^4 \varphi}{\partial x^2 \partial y^2} + \frac{\partial^4 \varphi}{\partial y^4} = C(\nu)\left(\frac{\partial b_x}{\partial x} + \frac{\partial b_y}{\partial y}\right) \tag{27}
\]
where
\[
C(\nu) =
\begin{cases}
\dfrac{1-\nu}{1-2\nu} & \text{(plane strain)}\\[2mm]
\dfrac{1}{1-\nu} & \text{(plane stress)}
\end{cases} \tag{28}
\]
Here, the body forces bx, by have the form ρ0 bx = ∂Ω/∂x, ρ0 by = ∂Ω/∂y, where Ω(x, y) is a positional scalar function. The solution of the Airy function can be expressed in the polynomial form $\varphi(x, y) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} A_{mn} x^m y^n$. For m + n ≤ 3, the terms automatically satisfy the biharmonic equation for any Amn. Additionally, φ must satisfy the following traction boundary conditions on Ω:
\[
\frac{\partial^2 \varphi}{\partial y^2} n_x - \frac{\partial^2 \varphi}{\partial x \partial y} n_y = t_x; \qquad \frac{\partial^2 \varphi}{\partial x^2} n_y - \frac{\partial^2 \varphi}{\partial x \partial y} n_x = t_y \tag{29}
\]
Here, (nx, ny) are the components of a unit vector normal to the boundary. For the end-loaded cantilever beam, the Airy function can be formulated as
\[
\varphi = -\frac{3P}{4ab}\,xy + \frac{P}{4a^3 b}\,xy^3 \tag{30}
\]
where σxx = ∂²φ/∂y² − Ω; σyy = ∂²φ/∂x² − Ω; σxy = σyx = −∂²φ/∂x∂y with Ω = 0. At the clamped end, x1 = L, the displacement boundary conditions are ux = uy = ∂uy/∂x = 0. The top and bottom surfaces of the beam (i.e., y = ±a) are traction free, σij ni = 0, which requires σyy = σxy = 0. The resultant of the traction acting on the surface at x = 0 is −P ey, with traction vector $t_i = \sigma_{ij} n_j = -\sigma_{xy}\delta_{iy} = -\frac{3P}{4ab}\left(1 - \frac{y^2}{a^2}\right)\delta_{iy}$. The resultant force can be obtained as $F_i = b\int_{-a}^{a} -\frac{3P}{4ab}\left(1 - \frac{y^2}{a^2}\right)\delta_{iy}\, \mathrm{d}x_2 = -P\delta_{iy}$. On satisfaction of the aforementioned conditions, approximate analytical solutions for the displacements ux, uy, the strain fields εxx, εyy, εxy, and the stress fields σxx, σyy, σxy can be expressed as:
\[
u_x = \frac{3P}{4Ea^3 b}x^2 y - \frac{(2+\mu)P}{4Ea^3 b}y^3 + \frac{3(1+\mu)Pa^2}{2Ea^3 b}y - \frac{3PL^2}{4Ea^3 b}y \tag{31}
\]
\[
u_y = -\frac{3\mu P}{4Ea^3 b}xy^2 - \frac{P}{4Ea^3 b}x^3 + \frac{3PL^2}{4Ea^3 b}x - \frac{PL^3}{2Ea^3 b} \tag{32}
\]
\[
\varepsilon_{xx} = \frac{3P}{2Ea^3 b}xy; \quad \varepsilon_{yy} = -\frac{3\mu P}{2Ea^3 b}xy; \quad \varepsilon_{xy} = \frac{3P(1+\mu)}{4Eab}\left(1 - \frac{y^2}{a^2}\right) \tag{33}
\]
\[
\sigma_{xx} = \frac{3P}{2a^3 b}xy; \quad \sigma_{yy} = 0; \quad \sigma_{xy} = \frac{3P}{4ab}\left(1 - \frac{y^2}{a^2}\right) \tag{34}
\]
These analytical solutions for u(x), σ(x), and ε(x) have been used as û(xl|Ω), σ̂(xl|Ω), and ε̂(xl|Ω) at the collocation points for data-driven enhancement in Eqs. (24)–(26), respectively, for solving the field variables in the proposed PINNs framework.

4.2.2. PINNs solutions for linear elasticity problem

For the benchmark end-loaded cantilever beam problem, L = 3 m, a = 0.5 m, and b = 0.001 m have been considered. The material properties are Young's modulus E = 1 GPa and Poisson's ratio ν = 0.25, as shown in Fig. 2(a). Unless otherwise stated, a total of Nc = 5000 randomly distributed collocation points over the domain and boundaries have been used for training the PINNs model, as shown in Fig. 2(b). During training, the optimization loop was run for 500 epochs using the Adam optimization scheme with a learning rate of 0.001 and a batch size of 32 for optimal accuracy and faster convergence.

The Airy solutions for the various fields, including the displacements ux, uy, stresses σxx, σyy, σxy, and strains εxx, εyy, εxy, as in Eqs.

Fig. 3. (a) The Airy solutions for displacements ux, uy, stresses σxx, σyy, σxy, strains εxx, εyy, εxy; (b) corresponding PINNs solutions for ũNNx, ũNNy, σ̃NNxx, σ̃NNyy, σ̃NNxy, ε̃NNxx, ε̃NNyy, and ε̃NNxy; (c) absolute error between the Airy solutions and the PINNs predictions associated with each field variable for an end-loaded cantilever beam.

(31)–(34), are shown in Fig. 3(a). The corresponding PINNs approximations using the tanh activation function are shown in Fig. 3(b). Additionally, in Fig. 3(c), the absolute error between the Airy solutions and the PINNs predictions for each field variable is shown. The overall results from PINNs are in excellent agreement with the Airy solutions. The PINNs approximations attained satisfactory accuracy with low absolute errors for all field variables. For the displacement fields, the absolute error is relatively high near the clamped edge for ux. For uy, the absolute error is maximum at the midsection and near the horizontal edges, as shown in Fig. 3(c). This is due to the approximate nature of the Airy solutions at the clamped end x1 = L for the displacement boundary conditions ux = uy = ∂uy/∂x = 0. Such differences also propagate through the solutions of the stress and strain fields, where the PINNs predictions slightly deviate from the Airy solutions, in particular near the free vertical and horizontal edges, as shown in Fig. 3(c). However, according to Saint-Venant's principle, these deviations do not sufficiently influence the solution far from the end, which is reflected in the result. Overall, the proposed PINNs


Fig. 4. Comparison of (a) total loss ∆L; (b) constitutive loss ∆ΩL for tanh, sigmoid and ReLU activation functions for network parameters N = 20, Ln = 5.
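The three activations compared in Fig. 4 can be exercised on a small stand-in for the dense sub-networks used in the PINNs model (N = 20 neurons per hidden layer, Ln = 5 hidden layers). This is an illustrative numpy sketch, not the SciANN implementation; all function names are hypothetical:

```python
import numpy as np

# Candidate activations compared for the PINNs sub-networks.
ACTIVATIONS = {
    "relu": lambda z: np.maximum(z, 0.0),
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),
    "tanh": np.tanh,
}

def init_mlp(n_in=2, n_hidden=20, n_layers=5, n_out=1, seed=0):
    """Initialize weights for a fully connected network x -> field value."""
    rng = np.random.default_rng(seed)
    sizes = [n_in] + [n_hidden] * n_layers + [n_out]
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x, activation="tanh"):
    """Evaluate the network at collocation points x of shape (batch, 2)."""
    act = ACTIVATIONS[activation]
    h = x
    for W, b in params[:-1]:
        h = act(h @ W + b)
    W, b = params[-1]  # linear output layer
    return h @ W + b

# 5000 collocation points in a unit square, purely for illustration.
x = np.random.default_rng(1).uniform(0.0, 1.0, (5000, 2))
params = init_mlp()
u_tanh = forward(params, x, "tanh")
u_relu = forward(params, x, "relu")
```

In the actual framework, the loss gradients flow through these forward passes via automatic differentiation; swapping the activation only changes `act`.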

model can capture the distributions of the various fields accurately from the solution of the Airy stress function.

4.2.3. Suitable activation function

The impact of the use of various activation functions on training the PINNs models, in predicting the field variables and in the epoch evolution of the various components of the loss function, is explored. The ReLU, sigmoid, and tanh activation functions are compared; the network architecture remains the same: the number of neurons in each layer N = 20 with the total number of hidden layers Ln = 5 in the PINNs model. The evolution of the total loss ∆L and the constitutive loss ∆ΩL are depicted in Fig. 4. Additionally, the values of the various loss components and the training times ttr at the end of training are compared in Table 1. Evidently, the tanh activation provides the best performance in terms of the value of the total loss at the end of training. The final constitutive loss with tanh activation is significantly lower compared to the other two activations, illustrating the suitability of the tanh activation for the PINNs model for solving the elasticity problem herein. In addition, all other loss components obtained are lowest upon using the tanh activation, as shown in Table 1.

Comparing the evolution of ∆L, the convergence characteristics of the ReLU activation are better than those of tanh, with fewer fluctuations and a rapid decrease in loss values, as shown in Fig. 4(a). However, tanh illustrates better adaptability in the constitutive loss, with an excellent convergence rate, in Fig. 4(b). Out of the three activations, ReLU performs the worst there, possibly due to its derivative being discontinuous. However, the total loss for all three activations is negligible (loss values in the range below 10−4 to 10−5) within 200 epochs, indicating the adaptability of the proposed PINNs framework to any of these activations provided the models are trained sufficiently long. Comparing the training times, the tanh activation takes longer for the same number of epochs than the other two. This coincides with the fact that the evolution of its total loss has a higher degree of discontinuity. The model with the ReLU activation trains the fastest, possibly due to its linear nature. From the comparison, it can be concluded that although tanh is the best in terms of accuracy, ReLU can be an optimal choice of activation considering both accuracy and training time for solving the elasticity problem in the proposed PINNs framework.

4.2.4. Influence of network complexity

It is worth mentioning that the PINNs approximations are sensitive to the network architecture, including the depth of the hidden layers and the number of network parameters. In this section, the influence of the network architecture parameters, i.e., the number of neurons in each hidden layer N and the number of hidden layers Ln, on the accuracy and the efficiency of the PINNs solution is explored. Since the tanh activation performs the best in terms of accuracy (see the previous section), it is chosen as the activation for the different networks used in the following experiments.

In the current study, four different networks considering the combinations N = 20, 40 and Ln = 5, 10 are tested, and the values of the different loss components at the end of training, the training duration (ttr), along with the model complexities in terms of network parameters (np), for these architectures are presented in Table 2. For a fair comparison, Nc = 5000 for all experiments. The evolution of the total loss ∆L and the constitutive loss ∆ΩL for these networks is shown in Fig. 5. From the comparisons, for the chosen number of collocation points the relatively shallow network N = 20, Ln = 5 provides the best performance in terms of ∆L and ∆ΩL at the end of training. Additionally, the time required for training is shorter due to a significantly lower number of network parameters. However, for a relatively deeper network, N = 20, Ln = 10, with increased network complexity, the performance of the model degrades with respect to the loss values, as shown in Table 2, possibly due to an increase in variability and a reduction in bias. Interestingly, an increase in the number of neurons to N = 40 while maintaining the depth of the network (Ln = 5) leads to the worst performance, which can be attributed to over-fitting (Bilbao & Bilbao, 2017; Jabbar & Khan, 2015). The epoch evolution of the loss for the various network architectures demonstrates the efficacy of a relatively shallow network with significantly faster training for solving elasticity problems in the proposed PINNs framework.

5. PINNs formulation for linear elastic plate theory

In this section, the PINNs framework is expanded for the solution of the classical Kirchhoff–Love thin plate (Timoshenko & Woinowsky-Krieger, 1959) subjected to a transverse loading in linearly elastic plate theory. In the subsequent section, the Kirchhoff–Love theory is briefly described, and the PINNs formulation for solving the governing fourth-order biharmonic partial differential equation (PDE) for the solution of the thin plate is elaborated. For a benchmark problem, the proposed PINNs

Table 1
Influence of different activation functions on the final values of various loss components (in 10−09) and training times ttr in the proposed PINNs model for solving the linear elastic beam problem.

Activation function   ∆ΩL     ∆cL    ∆ΓuL    ∆ΓtL    ∆uL    ∆σL     ∆εL     ∆L       ttr (min)
ReLU                  107.16  43.43  14.51   36.75   24.97  1.07    5.48    233.37   9.4
Sigmoid               30.96   54.33  517.38  126.14  37.85  124.51  592.82  1483.99  13.8
tanh                  4.56    0.73   31.47   25.64   3.11   9.60    10.45   85.56    15.7
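The columns of Table 1 are the individual terms of the weighted multi-objective loss in Eq. (19). Their composition can be sketched in plain Python; the residual arrays below are random placeholders for the actual PDE, boundary, and data residuals, which the framework assembles via automatic differentiation:

```python
import numpy as np

def mse(residual):
    """Mean squared norm of a residual evaluated at collocation points."""
    return float(np.mean(residual ** 2))

def total_loss(terms, weights):
    """Weighted multi-objective loss, cf. Eq. (19)."""
    return sum(weights[k] * mse(v) for k, v in terms.items())

rng = np.random.default_rng(0)
keys = ("pde", "equilibrium", "constitutive", "bc_u", "bc_t",
        "data_u", "data_sigma", "data_eps")
terms = {k: rng.normal(0.0, 1e-3, 5000) for k in keys}

# phi = phi_e = phi_c = 1 and beta_u = beta_t = 1; setting alpha_i = 0
# switches off the data-driven terms when no labeled data are available.
w_no_data = {k: (0.0 if k.startswith("data_") else 1.0) for k in keys}
w_full = {k: 1.0 for k in keys}

loss_no_data = total_loss(terms, w_no_data)
loss_full = total_loss(terms, w_full)
```

The weights dictate the emphasis on satisfying each term, exactly as the ϕ, β, and α coefficients do in Eq. (19).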

Table 2
Influence of network parameters N and Ln on training times ttr and final values of various loss components (in 10−09) for tanh activation.

Network identifier     np       ttr (min)  ∆ΩL   ∆cL    ∆ΓuL   ∆ΓtL   ∆uL    ∆σL    ∆εL    ∆L
N-1 (N = 20, Ln = 5)   22,706   15.7       4.56  0.73   31.47  25.64  3.11   9.60   10.45  85.56
N-2 (N = 40, Ln = 5)   113,530  23.8       2.21  90.39  77.73  59.58  4.29   24.16  78.39  336.75
N-3 (N = 20, Ln = 10)  54,494   18.3       6.89  0.89   12.73  65.42  13.01  17.19  4.67   120.8
N-4 (N = 40, Ln = 10)  272,472  32.3       2.78  3.67   18.78  12.63  24.19  43.10  2.49   107.64
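The closed-form Airy fields of Eqs. (31)–(34), which serve as the targets û, σ̂, ε̂ for the data-driven loss terms, can be evaluated directly at collocation points and sanity-checked. A sketch assuming the beam data of Section 4.2.2 with an illustrative unit end load P:

```python
import numpy as np

# Beam data from Section 4.2.2; P = 1 N is an illustrative choice.
L, a, b = 3.0, 0.5, 0.001
P = 1.0

def airy_stresses(x, y):
    """Stress fields of Eq. (34) for the end-loaded cantilever."""
    sxx = 3.0 * P / (2.0 * a**3 * b) * x * y
    syy = np.zeros_like(x)
    sxy = 3.0 * P / (4.0 * a * b) * (1.0 - y**2 / a**2)
    return sxx, syy, sxy

# Traction-free top/bottom surfaces: sigma_xy vanishes at y = +/- a.
_, _, sxy_top = airy_stresses(np.array([1.0]), np.array([a]))

# Resultant shear traction on the loaded end x = 0 integrates to P.
y = np.linspace(-a, a, 20001)
_, _, sxy_end = airy_stresses(np.zeros_like(y), y)
resultant = b * np.sum((sxy_end[:-1] + sxy_end[1:]) * np.diff(y) / 2.0)
```

These checks mirror the boundary conditions the Airy potential was constructed to satisfy in Section 4.2.1.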

Fig. 5. Comparison of (a) total loss ∆L; (b) constitutive loss ∆ΩL for various combinations of network parameters N and Ln considering tanh activation function.

approach is applied for the solution of a simply supported rectangular plate under a transverse sinusoidal loading condition.

5.1. Kirchhoff–Love thin plate theory

Thin plates are structurally planar elements that have a small thickness relative to their in-plane dimensions, which can be simplified as a two-dimensional plate problem. According to the Kirchhoff–Love theory, the kinetics of a thin plate under the effect of a distributed transverse loading q = q(x, y) can be described by a fourth-order differential equation (Reddy, 2006; Timoshenko & Woinowsky-Krieger, 1959).
\[
\Delta(D\Delta w) = q \tag{35}
\]
When the elastic plate is bounded in the domain Ω ⊂ R², Eq. (35) is known as the Kirchhoff–Love equation. In Cartesian coordinates, w = w(x, y) represents the transverse displacement field, D = D(x, y) is the bending stiffness of the plate, and ∆ = ∂²/∂x² + ∂²/∂y² is the Laplace operator. Considering a homogeneous and isotropic plate (i.e., D ≡ constant), Eq. (35) becomes the biharmonic equation (Szilard & Nash, 1974; Timoshenko & Woinowsky-Krieger, 1959)
\[
D\Delta^2 w = D\left(\frac{\partial^4 w}{\partial x^4} + 2\frac{\partial^4 w}{\partial x^2 \partial y^2} + \frac{\partial^4 w}{\partial y^4}\right) = q \tag{36}
\]
Under appropriate boundary conditions, and with D(x, y) > 0 and q(x, y) ≥ 0, both being known, the problem possesses a unique solution for the displacement w(x, y). The set of solution variables includes the primitive variable, the deflection w, and the derived quantities, the moments Mxx, Myy, Mxy = −Myx, and the shearing forces Qxx, Qyy. The expressions for the derived fields are
\[
M_{xx} = -D\left(\frac{\partial^2 w}{\partial x^2} + \nu \frac{\partial^2 w}{\partial y^2}\right); \quad
M_{yy} = -D\left(\frac{\partial^2 w}{\partial y^2} + \nu \frac{\partial^2 w}{\partial x^2}\right); \quad
M_{xy} = -D(1-\nu)\frac{\partial^2 w}{\partial x \partial y} \tag{37}
\]
\[
Q_{xx} = \frac{\partial M_{yx}}{\partial y} + \frac{\partial M_{xx}}{\partial x} = -D\frac{\partial}{\partial x}\left(\frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2}\right); \quad
Q_{yy} = \frac{\partial M_{yy}}{\partial y} - \frac{\partial M_{xy}}{\partial x} = -D\frac{\partial}{\partial y}\left(\frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2}\right) \tag{38}
\]

5.2. PINNs formulation for the biharmonic equation

For solving the biharmonic equation using the PINNs framework, the input features are the spatial coordinates x := (x, y); the field variables w(x), M(x), and Q(x) are obtained using multiple densely connected independent ANNs, with each network approximating one of the outputs, as shown in Fig. 6. The different field variables approximated by the NNs are as follows:
\[
w(\mathbf{x}) \simeq \Xi_{w}^{NN} = \tilde{w}^{NN}(\mathbf{x}) \tag{39}
\]
\[
M(\mathbf{x}) \simeq \Xi_{M}^{NN} =
\begin{bmatrix}
\tilde{M}_{xx}^{NN}(\mathbf{x}) & \tilde{M}_{xy}^{NN}(\mathbf{x})\\
\tilde{M}_{yx}^{NN}(\mathbf{x}) & \tilde{M}_{yy}^{NN}(\mathbf{x})
\end{bmatrix};
\]


Fig. 6. PINNs network architecture for solving the Kirchhoff–Love thin plate problem governed by the biharmonic equation, consisting of multiple ANNs (NNi ∀ i = 1, k) for each of the field variables w̃NN(x), M̃NNxx(x), M̃NNxy(x), M̃NNyy(x), Q̃NNxx(x), and Q̃NNyy(x), with the independent variable x = (x, y) as input features.

Fig. 7. Benchmark problem setup for the Kirchhoff–Love plate: (a, b) simply supported rectangular plate of a = 200 cm and b = 300 cm with thickness t = 1 cm subjected to transverse sinusoidal loading of intensity q0 = 9.806 × 10−4 MPa; (b) distribution of the total Nc = 10,000 collocation points on the problem domain and various boundaries during PINNs training.

\[
Q(\mathbf{x}) \simeq \Xi_{Q}^{NN} =
\begin{bmatrix}
\tilde{Q}_{xx}^{NN}(\mathbf{x})\\
\tilde{Q}_{yx}^{NN}(\mathbf{x})
\end{bmatrix} \tag{40}
\]
where $\Xi_{w}^{NN}$, $\Xi_{M}^{NN}$, and $\Xi_{Q}^{NN}$ are the neural network approximations. From the NN approximations of the fields, the multi-objective loss function ∆L(x, θ) can be defined as:
\[
\Delta_L(\mathbf{x}, \theta) = \phi \Delta_L^{\Omega} + \beta_u \Delta_L^{\Gamma_u} + \beta_t \Delta_L^{\Gamma_t} + \alpha_w \Delta_L^{w} + \alpha_M \Delta_L^{M} + \alpha_Q \Delta_L^{Q} \tag{41}
\]
where ∆ΩL, ∆ΓuL, ∆ΓtL are the losses in the domain Ω and along the boundaries Γu and Γt, respectively. Their expressions are
\[
\Delta_L^{\Omega} = \frac{1}{N_c^{\Omega}} \sum_{l=1}^{N_c^{\Omega}} \left\| \nabla^2 \nabla^2 w - \frac{\hat{q}}{D} \right\| \tag{42}
\]
\[
\Delta_L^{\Gamma_u} = \frac{1}{N_c^{\Gamma_u}} \sum_{k=1}^{N_c^{\Gamma_u}} \left\| \Xi_{w}^{NN}(\mathbf{x}_{k|\Gamma_u}) - \bar{w}(\mathbf{x}_{k|\Gamma_u}) \right\| \tag{43}
\]
\[
\Delta_L^{\Gamma_t} = \frac{1}{N_c^{\Gamma_t}} \sum_{j=1}^{N_c^{\Gamma_t}} \left\| \Xi_{M}^{NN}(\mathbf{x}_{j|\Gamma_t}) - \bar{M}(\mathbf{x}_{j|\Gamma_t}) \right\| \tag{44}
\]
where $\{\mathbf{x}_{1|\Omega}, \ldots, \mathbf{x}_{N_c^{\Omega}|\Omega}\}$, $\{\mathbf{x}_{1|\Gamma_u}, \ldots, \mathbf{x}_{N_c^{\Gamma_u}|\Gamma_u}\}$, and $\{\mathbf{x}_{1|\Gamma_t}, \ldots, \mathbf{x}_{N_c^{\Gamma_t}|\Gamma_t}\}$ are the collocation points over the domain Ω and along the boundaries Γu and Γt, respectively; ϕ ∈ R⁺ is the penalty coefficient for imposing the biharmonic relation in Eq. (36). Additionally, data-driven estimates of w(x), M(x), and Q(x) at the collocation points across Ω are used to define ∆L(x, θ):
\[
\Delta_L^{w} = \frac{1}{N_c^{\Omega}} \sum_{l=1}^{N_c^{\Omega}} \left\| \Xi_{w}^{NN}(\mathbf{x}_{l|\Omega}) - \hat{w}(\mathbf{x}_{l|\Omega}) \right\| \tag{45}
\]
\[
\Delta_L^{M} = \frac{1}{N_c^{\Omega}} \sum_{l=1}^{N_c^{\Omega}} \left\| \Xi_{M}^{NN}(\mathbf{x}_{l|\Omega}) - \hat{M}(\mathbf{x}_{l|\Omega}) \right\| \tag{46}
\]


\[
\Delta_L^{Q} = \frac{1}{N_c^{\Omega}} \sum_{l=1}^{N_c^{\Omega}} \left\| \Xi_{Q}^{NN}(\mathbf{x}_{l|\Omega}) - \hat{Q}(\mathbf{x}_{l|\Omega}) \right\| \tag{47}
\]
Here, ŵ(xl|Ω), M̂(xl|Ω), and Q̂(xl|Ω) are obtained by means of analytical or high-fidelity numerical solutions. Note, αi = 1 ∀ i = w, M, Q for data-driven enhancement coupled with physics-informed regression by forcing the PDE constraints in Eqs. (36)–(38), whereas αi = 0 switches off the data-driven enhancement of the accuracy of the NN approximations. The loss function in Eq. (41) can either be used for obtaining PINNs approximations of w(x), M(x), and Q(x) (i.e., the forward problem), or for the identification of the model parameters λ and µ (i.e., the inverse problem).

5.3. Simply supported Kirchhoff–Love plate

A simply supported rectangular plate of size (a × b) under a sinusoidal load q(x, y) = q0 sin(πx/a) sin(πy/b) is considered in Cartesian coordinates, as shown in Fig. 7. Here, q0 is the intensity of the load at the center of the plate.

The following boundary conditions are applied at the simply supported (SS) edges:
\[
w = 0; \quad \frac{\partial^2 w}{\partial x^2} = 0 \quad \text{for } x = 0 \text{ and } x = a \tag{48}
\]
\[
w = 0; \quad \frac{\partial^2 w}{\partial y^2} = 0 \quad \text{for } y = 0 \text{ and } y = b \tag{49}
\]

5.3.1. Analytical solution

Along with the governing equation in Eq. (36) and the boundary conditions in Eqs. (48)–(49), the analytical solution for w is obtained as:
\[
w = \frac{q_0}{\pi^4 D \left(\frac{1}{a^2} + \frac{1}{b^2}\right)^2} \sin\frac{\pi x}{a} \sin\frac{\pi y}{b} \tag{50}
\]
Utilizing Eqs. (37)–(38), the analytical solutions for the moments Mxx, Myy, Mxy and the shearing forces Qxx, Qyy are obtained as:
\[
M_{xx} = \frac{q_0}{\pi^2 \left(\frac{1}{a^2} + \frac{1}{b^2}\right)^2} \left(\frac{1}{a^2} + \frac{\nu}{b^2}\right) \sin\frac{\pi x}{a} \sin\frac{\pi y}{b} \tag{51}
\]
\[
M_{yy} = \frac{q_0}{\pi^2 \left(\frac{1}{a^2} + \frac{1}{b^2}\right)^2} \left(\frac{\nu}{a^2} + \frac{1}{b^2}\right) \sin\frac{\pi x}{a} \sin\frac{\pi y}{b} \tag{52}
\]
\[
M_{xy} = \frac{q_0 (1-\nu)}{\pi^2 \left(\frac{1}{a^2} + \frac{1}{b^2}\right)^2 ab} \cos\frac{\pi x}{a} \cos\frac{\pi y}{b} \tag{53}
\]
\[
Q_{xx} = \frac{q_0}{\pi a \left(\frac{1}{a^2} + \frac{1}{b^2}\right)} \cos\frac{\pi x}{a} \sin\frac{\pi y}{b} \tag{54}
\]
\[
Q_{yy} = \frac{q_0}{\pi b \left(\frac{1}{a^2} + \frac{1}{b^2}\right)} \sin\frac{\pi x}{a} \cos\frac{\pi y}{b} \tag{55}
\]
These analytical solutions, w(x), M(x), and Q(x), have been utilized as ŵ(xl|Ω), M̂(xl|Ω), and Q̂(xl|Ω) for data-driven enhancement in Eqs. (45)–(47), respectively, for the PINNs approximations of the field variables.

5.4. PINNs solutions for the biharmonic equation

For the benchmark problem, a rectangular plate (a = 200 cm, b = 300 cm) with thickness t = 1 cm is considered with the following material properties: Young's modulus of elasticity E = 202017.03 MPa, Poisson's ratio ν = 0.25, and flexural rigidity D = 17,957 N m. The sinusoidal load intensity q0 = 9.806 × 10−4 MPa is prescribed as shown in Fig. 7. A similar problem has been also solved in the recent work (Vahab et al., 2021). Unless otherwise stated, a total number of Nc = 10,000 randomly distributed collocation points is used during the training of the PINNs model. Additionally, a learning rate of 0.001 and a batch size of 50 were prescribed for optimal accuracy and faster convergence of the optimization scheme. For better accuracy during training, the Adam optimization scheme is employed with 1000 epochs. In the present study, three different activation functions were tested (see Section 5.4.1).

In Fig. 8(a–f), the analytical solutions for the various fields, including the plate deflection w, the moments Mxx, Myy, Mxy, and the shearing forces Qxx and Qyy in Eqs. (50)–(55), are shown. The corresponding approximations from PINNs for the various activation functions are shown in Fig. 8(g–l), which illustrate the efficacy of the proposed model in terms of accuracy and robustness, as excellent agreement with the analytical solutions is evident.

5.4.1. Influence of the activation function

The accuracy of the field variables and the epoch evolution of the loss functions are explored for various activation functions for solving the fourth-order biharmonic PDE. To this end, three different activations, i.e., ReLU, sigmoid, and tanh, are selected; the network used is defined by N = 20, Ln = 5. The corresponding results are depicted in Fig. 8(g–l). Based on the results, all the activations perform well, as the NN approximations are in good agreement with the analytical solutions both qualitatively and quantitatively. For further insight into the influence of the activation function on the accuracy of the solutions, the absolute error between the analytical solutions and the PINNs approximations for each field variable is compared for the solutions obtained with the different activations in Fig. 9(a–f). From the comparison, ReLU provides the lowest absolute error distributions in solving the biharmonic equation for the simply supported plate. Although the sigmoid activation provides the best result for |Mxy − M̃NNxy|, the absolute error for the rest of the fields is higher compared to the solutions obtained with ReLU. Because of the sinusoidal nature of the solution, it was expected that the tanh activation might be specifically suitable for this problem. Surprisingly, tanh provides worse results compared to the ReLU and sigmoid activations. This can be due to the complex nature of the solution space, where ReLU can provide better adaptability during training. Furthermore, in Fig. 10, the epoch evolution of the total loss ∆L and the constitutive loss ∆ΩL is compared for the different activation functions. For a particular epoch, ReLU performs better than the other two activations for ∆L. For ∆ΩL, the tanh activation shows better convergence and the lowest loss value at the end of training due to the sinusoidal nature of the solution of the biharmonic PDE. However, the fluctuations in the loss curve for tanh have a relatively higher variance compared to ReLU and sigmoid. As reported in Table 3, overall, the performance in terms of the various loss components at the end of training is superior for the ReLU activation for solving the biharmonic PDE using the proposed PINNs framework. Additionally, the model with the ReLU activation requires the least training time ttr, indicating better convergence and faster computation of the forward and backpropagation steps.

5.4.2. Influence of network parameters

As was found for the linear elasticity problem, the PINNs solutions are sensitive to the NN architecture. The influence of the parameters of the NN architecture, the number of neurons in each hidden layer N and the total number of hidden layers Ln, on the accuracy of the model and the efficiency of training has been explored herein. Because of its superior performance for this problem, ReLU is chosen as the activation function. Four different networks with combinations N = 20, 40

Fig. 8. Solution of field variables obtained from (a–f) analytical solutions (left to right): w, Mxx, Myy, Mxy, Qxx, and Qyy; (g–l) proposed PINNs results (left to right): w̃NN, M̃NNxx, M̃NNxy, M̃NNyy, Q̃NNxx, and Q̃NNyy for activation functions (i) ReLU, (ii) sigmoid, and (iii) tanh.

Table 3
Influence of different activation functions on the final values of various loss components (in 10−05) and training times ttr in the proposed PINNs model for solving the biharmonic PDE.

Activation function  ∆ΩL    ∆ΓtL     ∆ΓuL     ∆wL      ∆ML     ∆QL     ∆L        ttr (min)
ReLU                 5.34   132.31   1672.91  278.43   498.76  101.36  2689.11   23.1
Sigmoid              63.07  980.67   4601.60  1707.50  987.89  117.56  8458.29   25.8
tanh                 0.12   7138.43  9807.31  6809.34  397.89  500.37  24653.46  34.6
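As an independent check of the analytical solution in Eq. (50), the deflection field can be verified to satisfy D∆²w = q by applying a second-order finite-difference Laplacian twice. Illustrative values (a = 2, b = 3, D = 1, q0 = 1), not the benchmark units, are assumed:

```python
import numpy as np

a, b, D, q0 = 2.0, 3.0, 1.0, 1.0
nx, ny = 241, 361
x = np.linspace(0.0, a, nx)
y = np.linspace(0.0, b, ny)
hx, hy = x[1] - x[0], y[1] - y[0]
X, Y = np.meshgrid(x, y, indexing="ij")

# Eq. (50): w = q0 / (pi^4 D (1/a^2 + 1/b^2)^2) sin(pi x/a) sin(pi y/b)
k = np.pi**4 * D * (1.0 / a**2 + 1.0 / b**2) ** 2
w = q0 / k * np.sin(np.pi * X / a) * np.sin(np.pi * Y / b)

def laplacian(f):
    """Second-order central-difference Laplacian on the interior."""
    return ((f[2:, 1:-1] - 2 * f[1:-1, 1:-1] + f[:-2, 1:-1]) / hx**2
            + (f[1:-1, 2:] - 2 * f[1:-1, 1:-1] + f[1:-1, :-2]) / hy**2)

lap2 = laplacian(laplacian(w))  # discrete biharmonic of w
q = q0 * np.sin(np.pi * X / a) * np.sin(np.pi * Y / b)
err = np.max(np.abs(D * lap2 - q[2:-2, 2:-2])) / q0  # relative residual
```

The residual `err` is of the order of the O(h²) truncation error of the stencil, confirming that Eq. (50) satisfies the biharmonic PDE and the simply supported edge condition w = 0.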

and Ln = 5, 10 were trained. The corresponding network parameters (np), model training times (ttr), and values of the different loss components at the end of training are presented in Table 4. Comparisons of the absolute error between the analytical solutions and the PINNs approximations for each field are shown in Fig. 11. Comparisons of the total loss ∆L and the constitutive loss ∆ΩL for the various combinations of the network parameters N and Ln are shown in Fig. 12.

Based on the comparisons shown in Fig. 11, increased network depth improves the accuracy of the PINNs approximations for all variables. Predictions by both networks with Ln = 10 are superior with respect to the analytical solutions for the chosen number of collocation points. On the other hand, an increase in the number of neurons in each layer increases the model prediction variance, which is reflected in the higher absolute error comparisons for N = 20, 40 and Ln = 10. Similar conclusions may be drawn based on Fig. 12 and Table 4. The total and constitutive losses are minimum for N = 40 and Ln = 10 at the end of training. However, the approximations by this model have higher variance. Expectedly, more complex models (higher Ln), or those with larger np, require longer training times ttr. For the chosen number of collocation points, Ln = 10 is optimal.

5.4.3. Smart initialization of data-driven enhancement

In this section, we explore the applicability of data-driven enhancement in the proposed PINNs framework to improve the accuracy of the solution. Initially, the network is trained with a relatively low Nc = 10,000. The pre-trained model is then trained on the larger collocation datasets Nc = 15,000 and Nc = 20,000 to further improve the model accuracy. The
and Nc = 20,000 to further improve the model accuracy. The

Fig. 9. Absolute error of field variables between analytical solution and PINNs results (a) |w − w̃NN|; (b) |Mxx − M̃NNxx|; (c) |Myy − M̃NNyy|; (d) |Mxy − M̃NNxy|; (e) |Qxx − Q̃NNxx|; and (f) |Qyy − Q̃NNyy| for activation functions (i) ReLU, (ii) sigmoid, and (iii) tanh.

Fig. 10. Comparison of (a) total loss ∆L; (b) constitutive loss ∆ΩL during training for tanh, sigmoid and ReLU activation functions for network parameters N = 20, Ln = 5.

Table 4
Influence of network parameters N and Ln on training times ttr and final values of various loss components (in 10−05) for ReLU activation.

Network identifier     np       ttr (min)  ∆ΩL    ∆ΓuL    ∆ΓtL     ∆wL     ∆ML     ∆QL     ∆L
N-1 (N = 20, Ln = 5)   12,940   23.1       5.34   132.31  1672.91  278.43  498.76  101.36  2689.11
N-2 (N = 40, Ln = 5)   52,760   29.8       0.47   35.13   467.34   128.38  198.11  40.29   869.72
N-3 (N = 20, Ln = 10)  32,056   31.7       0.07   82.15   86.84    77.82   298.01  10.17   555.06
N-4 (N = 40, Ln = 10)  126,224  42.8       0.009  0.67    5.12     4.21    0.53    0.17    10.709

idea is to speed up the training by utilizing pre-trained weights; the initial states of the PINNs models in the later phases of training are not random anymore. The speed-up is reflected in Figs. 13(a, b): the convergence of the loss curves (∆L and ∆ΩL) for the pre-trained models corresponding to Nc = 15,000 and Nc = 20,000 is much improved compared to the first training phase with Nc = 10,000. In Fig. 13(c), the absolute errors between the approximations and the analytical solutions are shown, which demonstrate significant improvement of the PINNs approximations with the increase in Nc. Additionally, parameters related to the efficiency of the network training processes with initialization of data-driven enhancement are reported in Table 5. The loss terms quickly reduce by orders of magnitude in the second training phase, which indicates that for the considered network architecture, Nc = 15,000 is possibly optimal.
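The two-phase training described above (pre-train on a smaller collocation set, then continue from the pre-trained weights on a larger one) can be illustrated on a toy least-squares problem; plain gradient descent on a linear model is used here, and the whole setup is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_points(n):
    """Toy 'collocation' data: fit y = 3x - 1 from noisy samples."""
    x = rng.uniform(-1.0, 1.0, n)
    return x, 3.0 * x - 1.0 + 0.01 * rng.normal(size=n)

def loss(theta, x, y):
    return float(np.mean((theta[0] * x + theta[1] - y) ** 2))

def train(theta, x, y, epochs, lr=0.1):
    for _ in range(epochs):
        r = theta[0] * x + theta[1] - y
        grad = np.array([np.mean(2 * r * x), np.mean(2 * r)])
        theta = theta - lr * grad
    return theta

# Phase 1: train from scratch on a small point set (cf. Nc = 10,000).
x1, y1 = make_points(200)
theta0 = np.zeros(2)
theta1 = train(theta0, x1, y1, epochs=1000)

# Phase 2: enlarge the point set (cf. Nc = 15,000) and continue from
# the pre-trained weights rather than a fresh initialization.
x2, y2 = make_points(300)
warm_start_loss = loss(theta1, x2, y2)
cold_start_loss = loss(theta0, x2, y2)
theta2 = train(theta1, x2, y2, epochs=250)
```

The warm start begins phase 2 near the optimum, so far fewer epochs are needed, mirroring the reduced epoch counts in Table 5.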


Fig. 11. Absolute error of field variables between analytical solution and PINNs results (a) |w − w̃NN|; (b) |Mxx − M̃NNxx|; (c) |Myy − M̃NNyy|; (d) |Mxy − M̃NNxy|; (e) |Qxx − Q̃NNxx|; and (f) |Qyy − Q̃NNyy| for various network parameters (i) N = 20, Ln = 5, (ii) N = 40, Ln = 5, (iii) N = 20, Ln = 10, and (iv) N = 40, Ln = 10.

Fig. 12. Comparison of (a) total loss ∆L; (b) constitutive loss ∆ΩL for various combinations of network parameters N and Ln considering ReLU activation.
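Throughout, randomly distributed collocation points are drawn over the domain and its boundaries (e.g., Nc = 10,000 for the plate). A minimal sampler, assuming an illustrative 80/20 split between domain and boundary points:

```python
import numpy as np

def sample_collocation(n_total, a, b, boundary_frac=0.2, seed=0):
    """Uniform random collocation points on [0,a]x[0,b]: interior points
    plus points distributed over the four edges."""
    rng = np.random.default_rng(seed)
    n_bnd = int(n_total * boundary_frac)
    n_dom = n_total - n_bnd
    dom = rng.uniform([0.0, 0.0], [a, b], (n_dom, 2))
    # Distribute boundary points over the four edges:
    # 0 = bottom, 1 = top, 2 = left, 3 = right.
    t = rng.uniform(0.0, 1.0, n_bnd)
    edge = rng.integers(0, 4, n_bnd)
    bx = np.where(edge == 0, t * a, np.where(edge == 1, t * a,
                  np.where(edge == 2, 0.0, a)))
    by = np.where(edge == 0, 0.0, np.where(edge == 1, b, t * b))
    bnd = np.column_stack([bx, by])
    return dom, bnd

dom, bnd = sample_collocation(10_000, a=200.0, b=300.0)
```

The same sampler could be swapped for a Latin Hypercube or Sobol sequence, the alternatives raised in the discussion below, without changing the rest of the pipeline.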

6. Discussions

In the current study, a generalized PINNs framework for solving problems in linear continuum elasticity in the field of solid mechanics is presented. The fundamentals of the PINNs framework involve the construction of a loss function for physics-informed learning of the NNs through the embedding of the linear constraint during training. Following the PINNs philosophy to solve the linear elastic problem accurately, a multi-objective loss function has been formulated and implemented. The proposed multi-objective loss function consists of the residual of the governing PDE, various boundary conditions, and data-driven physical knowledge fitting terms. Additionally, the weights corresponding to the terms in the loss function dictate the emphasis on satisfying the specific loss terms. To demonstrate the efficacy of the framework, the Airy solution to an end-loaded cantilever beam and the Kirchhoff–Love plate theory governed by a fourth-order biharmonic PDE have been solved. The proposed PINNs framework is shown to accurately solve the different fields in both problems. Parametric investigations on activation functions and network architectures highlight the scope of improvement in terms of solution accuracy and performance. Data-driven enhancement

Fig. 13. Influence of smart initialization of data-driven enhancement on (a) total loss ∆L; (b) constitutive loss ∆ΩL for increasing Nc considering ReLU activation; (c) absolute error of field variables between analytical solution and PINNs results for (i) Nc = 10,000, (ii) Nc = 15,000 TL, and (iii) Nc = 20,000 TL.

Table 5
Network parameters, training time, and the components of the loss for the different smart-initialization data-driven enhancement models.

Network identifier  Nc      Epochs  ∆ΩL    ∆ΓtL    ∆ΓuL     ∆wL     ∆ML     ∆QL     ∆L       ttr (min)
N-1                 10000   1000    5.34   132.31  1672.91  278.43  498.76  101.36  2689.11  23.1
N-TL1               15000   250     0.025  1.31    17.34    1.43    13.11   9.89    43.11    5.1
N-TL2               20000   250     0.005  0.71    2.96     2.01    2.56    0.87    9.11     7.2

of the PINNs approximations using analytical solutions significantly boosts accuracy and speed using only minimal network parameters. Therefore, such an approach can be employed to enhance the solution accuracy for complex PDEs. Additionally, the applicability of a smart initialization of the data-driven enhancement learning-based approach, quickening the training process and also improving the model accuracy, has been illustrated. Such an approach would be key in achieving computational efficiency beyond conventional computational methods for solving linear continuum elasticity. The proposed PINNs elasticity solvers utilize TensorFlow as the backend, which can be easily deployed on CPU/GPU clusters, whereas conventional algorithms lack such adaptability. Thus, it opens new possibilities for solving complex elasticity problems that have remained unsolved by conventional numerical algorithms in the regime of continuum mechanics. It is, however, worth noting that the exploitation of the computational advantages of the PINNs framework depends on various factors, including the choice of the network architecture, hyperparameter tuning, the sampling technique (distribution) of the collocation points, etc. It has been shown that appropriate combinations of such factors significantly improve the training process and the trained models.

In the present study, random sampling of the collocation points has been considered, which is simple yet powerful and can lead to a significantly better reconstruction of the elastic fields. Importantly, this approach does not increase the computational complexity, and it is easy to implement. However, in elastic/elastoplastic PDE problems which exhibit local behavior (e.g., in the presence of sharp, or very localized, features) or problems with singularities, the performance of PINNs may vary drastically with various sampling procedures (Daw, Bu, Wang, Perdikaris, & Karpatne, 2022; Leiteritz & Pflüger, 2021). To overcome such an issue, a failure-informed adaptive enrichment strategy such as failure-informed PINNs (FI-PINNs) can be employed, which adopts the failure probability as the posterior error indicator to generate new training points in the failure region (Gao, Yan, & Zhou, 2022). Furthermore, the basic resampling scheme can be further improved with a gradient-based adaptive scheme that relocates the collocation points through cosine-annealing to areas with a higher loss gradient, without increasing the total number of points, which has demonstrated significant improvement with relatively fewer collocation points and sharper forcing functions (Subramanian, Kirby, Mahoney, & Gholami, 2022). In addition, the evolutionary sampling (Evo) method (Daw et al., 2022), which can incrementally accumulate collocation points in regions of high PDE residuals, can be an efficient choice for solving various time-dependent PDEs with little to no computational overhead. Instead of using a random approach such as Latin Hypercube sampling, in the future, different deterministic and pseudo-random sampling strategies such as Sparse Grid sampling or Sobol sequences can be employed to further improve the performance of the model.

Furthermore, it is critical to obtain the statistics of saturation along different parts of the solution domain during the training of DNNs (Glorot & Bengio, 2010; Rakitianskaia & Engelbrecht, 2015b). Saturation occurs when the hidden units of a DNN predominantly output values close to the asymptotic ends of the activation function range, which reduces the particular PINNs

486
A.M. Roy, R. Bose, V. Sundararaghavan et al. Neural Networks 162 (2023) 472–489

model to a binary state, thus limiting the overall information Acknowledgment


capacity of the NN (Bai, Zhou, Li, & Li, 2019; Rakitianskaia &
Engelbrecht, 2015a). The saturated units can make gradient de- The support of the Aeronautical Research and Development
scent learning slow and inefficient due to small derivative values Board (Grant No. DARO/08/1051450/M/I) is gratefully acknowl-
near the asymptotes which can hinder the training PINNs effi- edged. AMR and RA also would like to acknowledge the support
ciently (Bai et al., 2019). Thus, in the future, NN saturation can be of National Science Foundation (NSF) through Grant No. 2001333
studied quantitatively in relation to the ability of NNs to learn, and Grant No. 2119103.
generalize, and the degree of regression accuracy. In addition,
various weighting coefficients of the loss terms in Eq. (8) and References
implementation of second-order optimization techniques (Tan
& Lim, 2019) can accelerate the training significantly. Based on Arora, R., Kakkar, P., Dey, B., & Chakraborty, A. (2022). Physics-informed neural
the performance of the PINNs framework herein, further studies networks for modeling rate-and temperature-dependent plasticity. arXiv
quantifying the computational gains of the PINNs approach com- preprint arXiv:2201.08363.
pared to conventional numerical methods are in order. The pro- Atkin, R. J., & Fox, N. (2005). An introduction to the theory of elasticity. Courier
Corporation.
posed approach can be extended to the solution in various com-
Bai, W., Zhou, Q., Li, T., & Li, H. (2019). Adaptive reinforcement learning neural
putational mechanics problems such as soil plasticity (Bousshine, network control for uncertain nonlinear system with input saturation. IEEE
Chaaba, & De Saxce, 2001; Chen & Baladi, 1985), strain-gradient Transactions on Cybernetics, 50(8), 3433–3443.
plasticity (Guha, Sangal, & Basu, 2013, 2014), composite mod- Batra, R., Song, L., & Ramprasad, R. (2021). Emerging materials intelligence
eling (Roy, 2021c) etc. Furthermore, the present model can be ecosystems propelled by machine learning. Nature Reviews Materials, 6(8),
655–678.
employed to predict microstructure evolution in Phase-field (PF)
Baydin, A. G., Pearlmutter, B. A., Radul, A. A., & Siskind, J. M. (2018). Automatic
approach including various solid–solid phase transitions (PTs) differentiation in machine learning: a survey. Journal of Machine Learning
(Levitas & Roy, 2015; Levitas, Roy, & Preston, 2013; Roy, 2020a, Research, 18.
2020b, 2020c), solid–solid PT via intermediate melting (Levitas & Bekar, A. C., Madenci, E., Haghighat, E., Waheed, U. b., & Alkhalifah, T.
Roy, 2016; Roy, 2021a, 2021b, 2021d, 2021e, 2021f, 2022d), and (2022). Solving the eikonal equation for compressional and shear waves
in anisotropic media using peridynamic differential operator. Geophysical
various other applications (Jamil & Roy, 2022; Khan, Raj, Kumar,
Journal International, 229(3), 1942–1963.
Roy & Luo, 2022; Roy & Bhaduri, 2023; Singh, Raj, Kumar, Verma Bilbao, I., & Bilbao, J. (2017). Overfitting problem and the over-training in the era
& Roy, 2023; Singh, Ranjbarzadeh, Raj, Kumar & Roy, 2023). of data: Particularly for artificial neural networks. In 2017 eighth international
conference on intelligent computing and information systems (pp. 173–177).
IEEE.
7. Conclusions bin Waheed, U., Alkhalifah, T., Haghighat, E., & Song, C. (2022). A holistic
approach to computing first-arrival traveltimes using neural networks. In
Advances in subsurface data analytics (pp. 251–278). Elsevier.
Summarizing, the current work presents a deep learning
bin Waheed, U., Haghighat, E., Alkhalifah, T., Song, C., & Hao, Q. (2021).
framework based on the fundamentals of PINNs theory for the Pinneik: Eikonal solution using physics-informed neural networks. Computers
solution of linear elasticity problems in continuum mechanics. & Geosciences, 155, Article 104833.
A multi-objective loss function is proposed for the linear elastic Bose, R., & Roy, A. (2022). Accurate deep learning sub-grid scale models for large
solid problems that include governing PDE, Dirichlet, and Neu- eddy simulations. Bulletin of the American Physical Society.
mann boundary conditions across randomly chosen collocation Bousshine, L., Chaaba, A., & De Saxce, G. (2001). Softening in stress–strain
curve for Drucker–Prager non-associated plasticity. International Journal of
points in the problem domain. Multiple deep network mod- Plasticity, 17(1), 21–46.
els trained to predict different fields result in a more accurate Bower, A. F. (2009). Applied mechanics of solids. CRC Press.
representation. Traditional ML/ DL approaches that only rely Boyd, J. P. (2001). Chebyshev and Fourier spectral methods. Courier Corporation.
on fitting a model that establishes complex, high-dimensional, Brunton, S. L., Noack, B. R., & Koumoutsakos, P. (2020). Machine learning for
non-linear relationships between the input features and outputs, fluid mechanics. Annual Review of Fluid Mechanics, 52, 477–508.
are unable to incorporate rich information available through Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O., & Walsh, A. (2018). Machine
learning for molecular and materials science. Nature, 559(7715), 547–555.
governing equations/ physics-based mathematical modeling of
Cai, S., Wang, Z., Wang, S., Perdikaris, P., & Karniadakis, G. E. (2021). Physics-
physical phenomena. Conventional computational techniques on informed neural networks for heat transfer problems. ournal of Heat Transfer,
the other hand rely completely on such physical information 143(6).
for prediction. The PINNs approach combines the benefits of the Chandio, A., Gui, G., Kumar, T., Ullah, I., Ranjbarzadeh, R., Roy, A. M., et al. (2022).
DL techniques in the extraction of complex relations from data Precise single-stage detector. arXiv preprint arXiv:2210.04252.
Chen, W.-F., & Baladi, G. Y. (1985). Soil plasticity: theory and implementation.
with the advantages of the conventional numerical techniques
Elsevier.
for physical modeling. The proposed method may be extended to Ching, T., Himmelstein, D. S., Beaulieu-Jones, B. K., Kalinin, A. A., Do, B. T., Way, G.
nonlinear elasticity, viscoplasticity, elastoplasticity, and various P., et al. (2018). Opportunities and obstacles for deep learning in biology and
other mechanics and material science problems. The present medicine. Journal of the Royal Society Interface, 15(141), Article 20170387.
work builds a solid foundation for new promising avenues for Chollet, F., et al. (2015). Keras.
future work in machine learning applications in solid mechanics. Dana, S., & Wheeler, M. F. (2020). A machine learning accelerated FE
homogenization algorithm for elastic solids. arXiv preprint arXiv:2003.11372.
Daw, A., Bu, J., Wang, S., Perdikaris, P., & Karpatne, A. (2022). Rethinking the
Declaration of competing interest importance of sampling in physics-informed neural networks. arXiv preprint
arXiv:2207.02338.
De Ryck, T., Jagtap, A. D., & Mishra, S. (2022). Error estimates for physics
The authors declare that they have no known competing finan- informed neural networks approximating the Navier-Stokes equations. arXiv
cial interests or personal relationships that could have appeared preprint arXiv:2203.09346.
to influence the work reported in this paper. Du, Y., & Zaki, T. A. (2021). Evolutional deep neural network. Physical Review E,
104, Article 045303.
Frankel, A., Tachida, K., & Jones, R. (2020). Prediction of the evolution of the
Data availability stress field of polycrystals undergoing elastic-plastic deformation with a
hybrid neural network model. Machine Learning: Science and Technology, 1(3),
Article 035005.
The data that support the findings of this study are available Gao, Z., Yan, L., & Zhou, T. (2022). Failure-informed adaptive sampling for PINNs.
from the corresponding author upon reasonable request. arXiv preprint arXiv:2210.00279.
Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the thirteenth international conference on artificial intelligence and statistics (pp. 249–256). JMLR Workshop and Conference Proceedings.
Glowacz, A. (2021). Fault diagnosis of electric impact drills using thermal imaging. Measurement, 171, Article 108815.
Glowacz, A. (2022). Thermographic fault diagnosis of shaft of BLDC motor. Sensors, 22(21), 8537.
Goswami, S., Anitescu, C., Chakraborty, S., & Rabczuk, T. (2020). Transfer learning enhanced physics informed neural network for phase-field modeling of fracture. Theoretical and Applied Fracture Mechanics, 106, Article 102447.
Goswami, S., Yin, M., Yu, Y., & Karniadakis, G. E. (2022). A physics-informed variational DeepONet for predicting crack path in quasi-brittle materials. Computer Methods in Applied Mechanics and Engineering, 391, Article 114587.
Guha, S., Sangal, S., & Basu, S. (2013). Finite element studies on indentation size effect using a higher order strain gradient theory. International Journal of Solids and Structures, 50(6), 863–875.
Guha, S., Sangal, S., & Basu, S. (2014). On the fracture of small samples under higher order strain gradient plasticity. International Journal of Fracture, 187(2), 213–226.
Guo, M., & Haghighat, E. (2020). An energy-based error bound of physics-informed neural network solutions in elasticity. arXiv preprint arXiv:2010.09088.
Haghighat, E., Amini, D., & Juanes, R. (2022). Physics-informed neural network simulation of multiphase poroelasticity using stress-split sequential training. Computer Methods in Applied Mechanics and Engineering, 397, Article 115141.
Haghighat, E., Bekar, A. C., Madenci, E., & Juanes, R. (2021). A nonlocal physics-informed deep learning framework using the peridynamic differential operator. Computer Methods in Applied Mechanics and Engineering, 385, Article 114012.
Haghighat, E., & Juanes, R. (2021). Sciann: A keras/tensorflow wrapper for scientific computations and physics-informed deep learning using artificial neural networks. Computer Methods in Applied Mechanics and Engineering, 373, Article 113552.
Haghighat, E., Raissi, M., Moure, A., Gomez, H., & Juanes, R. (2020). A deep learning framework for solution and discovery in solid mechanics. arXiv preprint arXiv:2003.02751.
Haghighat, E., Raissi, M., Moure, A., Gomez, H., & Juanes, R. (2021). A physics-informed deep learning framework for inversion and surrogate modeling in solid mechanics. Computer Methods in Applied Mechanics and Engineering, 379, Article 113741.
Hu, Z., Jagtap, A. D., Karniadakis, G. E., & Kawaguchi, K. (2021). When do extended physics-informed neural networks (XPINNs) improve generalization? arXiv preprint arXiv:2109.09444.
Irfan, M., Iftikhar, M. A., Yasin, S., Draz, U., Ali, T., Hussain, S., et al. (2021). Role of hybrid deep neural networks (HDNNs), computed tomography, and chest X-rays for the detection of COVID-19. International Journal of Environmental Research and Public Health, 18(6), 3056.
Jabbar, H., & Khan, R. Z. (2015). Methods to avoid over-fitting and under-fitting in supervised machine learning (comparative study). Computer Science, Communication and Instrumentation Devices, 70.
Jagtap, A. D., & Karniadakis, G. E. (2021). Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In AAAI spring symposium: MLPS.
Jagtap, A. D., Kharazmi, E., & Karniadakis, G. E. (2020). Conservative physics-informed neural networks on discrete domains for conservation laws: Applications to forward and inverse problems. Computer Methods in Applied Mechanics and Engineering, 365, Article 113028.
Jagtap, A. D., Mao, Z., Adams, N., & Karniadakis, G. E. (2022). Physics-informed neural networks for inverse problems in supersonic flows. arXiv preprint arXiv:2202.11821.
Jahanbakht, M., Xiang, W., & Azghadi, M. R. (2022). Sediment prediction in the great barrier reef using vision transformer with finite element analysis. Neural Networks, 152, 311–321.
Jamil, S., Abbas, M. S., & Roy, A. M. (2022). Distinguishing malicious drones using vision transformer. AI, 3(2), 260–273.
Jamil, S., & Roy, A. M. (2022). Robust pcg-based vhd detection model using d-cnns, nature-inspired algorithms, and vision transformer. Available at SSRN: https://dx.doi.org/10.2139/ssrn.4316752, 41.
Jamil, S., & Roy, A. M. (2023). An efficient and robust phonocardiography (pcg)-based valvular heart diseases (vhd) detection framework using vision transformer (vit). Computers in Biology and Medicine, 106734.
Jin, X., Cai, S., Li, H., & Karniadakis, G. E. (2021). Nsfnets (Navier-Stokes flow nets): Physics-informed neural networks for the incompressible Navier-Stokes equations. Journal of Computational Physics, 426, Article 109951.
Karniadakis, G. E., Kevrekidis, I. G., Lu, L., Perdikaris, P., Wang, S., & Yang, L. (2021). Physics-informed machine learning. Nature Reviews Physics, 3(6), 422–440.
Khan, W., Kumar, T., Cheng, Z., Raj, K., Roy, A. M., & Luo, B. (2022). SQL and NoSQL databases software architectures performance analysis and assessments–a systematic literature review. arXiv preprint arXiv:2209.06977.
Khan, W., Raj, K., Kumar, T., Roy, A. M., & Luo, B. (2022). Introducing urdu digits dataset with demonstration of an efficient and robust noisy decoder-based pseudo example generator. Symmetry, 14(10), 1976.
Krishnapriyan, A., Gholami, A., Zhe, S., Kirby, R., & Mahoney, M. W. (2021). Characterizing possible failure modes in physics-informed neural networks. Advances in Neural Information Processing Systems, 34.
Kutz, J. N. (2017). Deep learning in fluid dynamics. Journal of Fluid Mechanics, 814, 1–4.
Lagaris, I. E., Likas, A., & Fotiadis, D. I. (1998). Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks, 9(5), 987–1000.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.
Leiteritz, R., & Pflüger, D. (2021). How to avoid trivial solutions in physics-informed neural networks. arXiv preprint arXiv:2112.05620.
Levitas, V. I., & Roy, A. M. (2015). Multiphase phase field theory for temperature- and stress-induced phase transformations. Physical Review B, 91(17), Article 174109.
Levitas, V. I., & Roy, A. M. (2016). Multiphase phase field theory for temperature-induced phase transformations: Formulation and application to interfacial phases. Acta Materialia, 105, 244–257.
Levitas, V. I., Roy, A. M., & Preston, D. L. (2013). Multiple twinning and variant-variant transformations in martensite: phase-field approach. Physical Review B, 88(5), Article 054113.
Ling, J., Kurzawski, A., & Templeton, J. (2016). Reynolds averaged turbulence modelling using deep neural networks with embedded invariance. Journal of Fluid Mechanics, 807, 155–166.
Lou, Q., Meng, X., & Karniadakis, G. E. (2021). Physics-informed neural networks for solving forward and inverse flow problems via the Boltzmann-BGK formulation. Journal of Computational Physics, 447, Article 110676.
Lurie, A. I. (2010). Theory of elasticity. Springer Science & Business Media.
Määttä, J., Bazaliy, V., Kimari, J., Djurabekova, F., Nordlund, K., & Roos, T. (2021). Gradient-based training and pruning of radial basis function networks with an application in materials physics. Neural Networks, 133, 123–131.
Marsden, J. E., & Hughes, T. J. (1994). Mathematical foundations of elasticity. Courier Corporation.
McClenny, L., & Braga-Neto, U. (2020). Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544.
Racca, A., & Magri, L. (2021). Robust optimization and validation of echo state networks for learning chaotic dynamics. Neural Networks, 142, 252–268.
Raissi, M., & Karniadakis, G. E. (2018). Hidden physics models: Machine learning of nonlinear partial differential equations. Journal of Computational Physics, 357, 125–141.
Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707.
Raissi, M., Yazdani, A., & Karniadakis, G. E. (2020). Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science, 367(6481), 1026–1030.
Rakitianskaia, A., & Engelbrecht, A. (2015a). Measuring saturation in neural networks. In 2015 IEEE symposium series on computational intelligence (pp. 1423–1430). IEEE.
Rakitianskaia, A., & Engelbrecht, A. (2015b). Saturation in PSO neural network training: Good or evil? In 2015 IEEE congress on evolutionary computation (CEC) (pp. 125–132). IEEE.
Rao, C., Sun, H., & Liu, Y. (2021). Physics-informed deep learning for computational elastodynamics without labeled data. Journal of Engineering Mechanics, 147(8), Article 04021043.
Reddy, J. N. (2006). Theory and analysis of elastic plates and shells. CRC Press.
Rezaei, S., Harandi, A., Moeineddin, A., Xu, B.-X., & Reese, S. (2022). A mixed formulation for physics-informed neural networks as a potential solver for engineering problems in heterogeneous domains: comparison with finite element method. Computer Methods in Applied Mechanics and Engineering, 401, Article 115616.
Roy, A. M. (2020a). Effects of interfacial stress in phase field approach for martensitic phase transformation in NiAl shape memory alloys. Applied Physics A, 126(7), 1–12.
Roy, A. M. (2020b). Evolution of martensitic nanostructure in NiAl alloys: tip splitting and bending. Material Science Research India (Online), 17(special 1), 03–06.
Roy, A. M. (2020c). Influence of interfacial stress on microstructural evolution in NiAl alloys. JETP Letters, 112(3), 173–179.
Roy, A. M. (2021a). Barrierless melt nucleation at solid-solid interface in energetic nitramine octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine. Materialia, 15, Article 101000.
Roy, A. M. (2021b). Energetics and kinematics of undercooled nonequilibrium interfacial molten layer in cyclotetramethylene-tetranitramine crystal. Physica B: Condensed Matter, 615, Article 412986.
Roy, A. M. (2021c). Finite element framework for efficient design of three dimensional multicomponent composite helicopter rotor blade system. Eng, 2(1), 69–79.
Roy, A. M. (2021d). Formation and stability of nanosized, undercooled propagating intermediate melt during β → δ phase transformation in HMX nanocrystal. Europhysics Letters, 133(5), 56001.
Roy, A. M. (2021e). Influence of nanoscale parameters on solid–solid phase transformation in octogen crystal: Multiple solution and temperature effect. JETP Letters, 113(4), 265–272.
Roy, A. M. (2021f). Multiphase phase-field approach for solid–solid phase transformations via propagating interfacial phase in HMX. Journal of Applied Physics, 129(2), Article 025103.
Roy, A. M. (2022a). Adaptive transfer learning-based multiscale feature fused deep convolutional neural network for EEG MI multiclassification in brain–computer interface. Engineering Applications of Artificial Intelligence, 116, Article 105347.
Roy, A. M. (2022b). An efficient multi-scale CNN model with intrinsic feature integration for motor imagery EEG subject classification in brain-machine interfaces. Biomedical Signal Processing and Control, 74, Article 103496.
Roy, A. M. (2022c). A multi-scale fusion CNN model based on adaptive transfer learning for multi-class MI-classification in BCI system. BioRxiv.
Roy, A. M. (2022d). Multiphase phase-field approach for virtual melting: a brief review. Mat. Sci. Res. India, 18(2).
Roy, A. M., & Bhaduri, J. (2021). A deep learning enabled multi-class plant disease detection model based on computer vision. AI, 2(3), 413–428.
Roy, A. M., & Bhaduri, J. (2022). Real-time growth stage detection model for high degree of occultation using DenseNet-fused YOLOv4. Computers and Electronics in Agriculture, 193, Article 106694.
Roy, A. M., & Bhaduri, J. (2023). A computer vision enabled damage detection model with improved yolov5 based on transformer prediction head. arXiv preprint arXiv:2303.04275.
Roy, A. M., Bhaduri, J., Kumar, T., & Raj, K. (2022a). A computer vision-based object localization model for endangered wildlife detection. Ecological Economics, Forthcoming.
Roy, A. M., Bhaduri, J., Kumar, T., & Raj, K. (2022b). WilDect-YOLO: An efficient and robust computer vision-based accurate object localization model for automated endangered wildlife detection. Ecological Informatics, Article 101919.
Roy, A. M., & Bose, R. (2023). Physics-aware deep learning framework for linear elasticity. arXiv preprint arXiv:2302.09668.
Roy, A. M., Bose, R., & Bhaduri, J. (2022c). A fast accurate fine-grain object detection model based on YOLOv4 deep neural network. Neural Computing and Applications, 1–27.
Roy, A. M., & Guha, S. (2022). Elastoplastic physics-informed deep learning approach for j2 plasticity. Available at SSRN: https://ssrn.com/abstract=4332254, 48.
Roy, A. M., & Guha, S. (2023). A data-driven physics-constrained deep learning computational framework for solving von mises plasticity. Engineering Applications of Artificial Intelligence, 122, 106049.
Saha, P., Dash, S., & Mukhopadhyay, S. (2021). Physics-incorporated convolutional recurrent neural networks for source identification and forecasting of dynamical systems. Neural Networks, 144, 359–371.
Samaniego, E., Anitescu, C., Goswami, S., Nguyen-Thanh, V. M., Guo, H., Hamdia, K., et al. (2020). An energy approach to the solution of partial differential equations in computational mechanics via machine learning: Concepts, implementation and applications. Computer Methods in Applied Mechanics and Engineering, 362, Article 112790.
Sengupta, T. (2013). High accuracy computing methods: fluid flows and wave phenomena. Cambridge University Press.
Shukla, K., Jagtap, A. D., Blackshire, J. L., Sparkman, D., & Karniadakis, G. E. (2021). A physics-informed neural network for quantifying the microstructural properties of polycrystalline nickel using ultrasound data: A promising approach for solving inverse problems. IEEE Signal Processing Magazine, 39(1), 68–77.
Singh, A., Raj, K., Kumar, T., Verma, S., & Roy, A. M. (2023). Deep learning-based cost-effective and responsive robot for autism treatment. Drones, 7(2), 81.
Singh, A., Ranjbarzadeh, R., Raj, K., Kumar, T., & Roy, A. M. (2023). Understanding EEG signals for subject-wise definition of armoni activities. arXiv preprint arXiv:2301.00948.
Sirignano, J., & Spiliopoulos, K. (2018). DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics, 375, 1339–1364.
Subramanian, S., Kirby, R. M., Mahoney, M. W., & Gholami, A. (2022). Adaptive self-supervision algorithms for physics-informed neural networks. arXiv preprint arXiv:2207.04084.
Sun, L., Gao, H., Pan, S., & Wang, J.-X. (2020). Surrogate modeling for fluid flows based on physics-constrained deep learning without simulation data. Computer Methods in Applied Mechanics and Engineering, 361, Article 112732.
Szilard, R., & Nash, W. (1974). Theory and analysis of plates: classical and numerical methods.
Tan, H. H., & Lim, K. H. (2019). Review of second-order optimization techniques in artificial neural networks backpropagation. In IOP conference series: materials science and engineering, Vol. 495 (1). IOP Publishing, Article 012003.
Tartakovsky, A. M., Marrero, C. O., Perdikaris, P., Tartakovsky, G. D., & Barajas-Solano, D. (2018). Learning parameters and constitutive relationships with physics informed deep neural networks. arXiv preprint arXiv:1808.03398.
Timoshenko, S. (1970). Theory of elastic stability 2e. Tata McGraw-Hill Education.
Timoshenko, S., & Woinowsky-Krieger, S. (1959). Theory of plates and shells.
Vahab, M., Haghighat, E., Khaleghi, M., & Khalili, N. (2021). A physics-informed neural network approach to solution and identification of biharmonic equations of elasticity. Journal of Engineering Mechanics, 148(2), Article 04021154.
von Rueden, L., Mayer, S., Beckh, K., Georgiev, B., Giesselbach, S., Heese, R., et al. (2019). Informed machine learning–a taxonomy and survey of integrating knowledge into learning systems. arXiv preprint arXiv:1903.12394.
Waheed, U., Haghighat, E., Alkhalifah, T., Song, C., & Hao, Q. (2020). Eikonal solution using physics-informed neural networks. In EAGE 2020 annual conference & exhibition online (pp. 1–5). European Association of Geoscientists & Engineers.
Xu, K., Huang, D. Z., & Darve, E. (2021). Learning constitutive relations using symmetric positive definite neural networks. Journal of Computational Physics, 428, Article 110072.
Zhang, E., Dao, M., Karniadakis, G. E., & Suresh, S. (2022). Analyses of internal structures and defects in materials using physics-informed neural networks. Science Advances, 8(7), eabk0644.
Zhang, E., Yin, M., & Karniadakis, G. E. (2020). Physics-informed neural networks for nonhomogeneous material identification in elasticity imaging. arXiv preprint arXiv:2009.04525.
Zhu, Q., Liu, Z., & Yan, J. (2021). Machine learning for metal additive manufacturing: predicting temperature and melt pool fluid dynamics using physics-informed neural networks. Computational Mechanics, 67(2), 619–635.
Zienkiewicz, O. C., & Taylor, R. L. (2005). The finite element method for solid and structural mechanics. Elsevier.