
GNN-Computer aided Civil Eng - 2022 - Song - Elastic structural analysis based on graph neural network without labeled data


DOI: 10.1111/mice.12944

RESEARCH ARTICLE

Elastic structural analysis based on graph neural network without labeled data

Ling-Han Song¹, Chen Wang¹, Jian-Sheng Fan¹,², Hong-Ming Lu¹

¹ Department of Civil Engineering, Tsinghua University, Beijing, China
² Key Laboratory of Civil Engineering Safety and Durability of the Ministry of Education, Tsinghua University, Beijing, China

Correspondence
Chen Wang, Department of Civil Engineering, Tsinghua University, 100084 Beijing, China.
Email: [email protected]

Funding information
The National Natural Science Foundation of China, Grant/Award Number: 52121005; China National Postdoctoral Program for Innovative Talents, Grant/Award Number: BX20220177; China Postdoctoral Science Foundation, Grant/Award Number: 2022M711864

Abstract
Artificial intelligence is gaining increasing popularity in structural analysis. However, at the structural system level, the appropriateness of data representation, the paucity of data, and the physical interpretability of results are rarely studied and remain profound challenges. To fill such gaps, a physics-informed model named StructGNN-E (i.e., structural analysis based on graph neural network [GNN]–elastic) based on the GNN architecture, which is capable of implementing the elastic analysis of structural systems without labeled data, is proposed in this study. The systems with structural topologies and member configurations are organized as graph data and later processed by a modified graph isomorphism network. Moreover, to avoid dependence on big data, a novel physics-informed paradigm is proposed to incorporate mechanics into deep learning (DL), ensuring the theoretical correctness of the results. Numerical experiments and ablation studies demonstrate the unique effectiveness of StructGNN-E against common DL models, with an average accuracy of 99% and excellent computational efficiency. Due to its differentiability, StructGNN-E is promising for bidirectionally linking structural parameters and analysis results, paving the way for a new end-to-end structural optimization method in the future.

1 INTRODUCTION

The emergence of new-generation artificial intelligence technologies represented by machine learning (ML) and deep learning (DL; Dong et al., 2021; Goodfellow et al., 2016; LeCun et al., 2015) is attracting an increasing number of scholars. These technologies have been applied in civil engineering, particularly in the fields of computational structural analysis and computational material analysis (Amezquita-Sancheza et al., 2020; Sun et al., 2021). It is anticipated that the performance of these technologies will surpass the performance of traditional numerical methods to realize the efficient simulation of engineering structures in the digital world (Lee et al., 2018; Salehi & Burgueño, 2018).

Currently, research on ML/DL-based computation in civil engineering has addressed all levels of analysis scenarios. At the construction material level, Rafiei et al. (2017) presented a deep restricted Boltzmann machine to estimate concrete properties based on mixture proportions. Nguyen et al. (2018) adopted artificial neural networks (ANNs) for predicting the strength of foamed concrete. C. Wang et al. (2022) first introduced the Seq2Seq framework and then the attention mechanism (Wang et al., 2022) to simulate the cyclic behavior of low-yield-point steel. Kang et al. (2021) predicted the strength of steel

© 2022 Computer-Aided Civil and Infrastructure Engineering.

Comput Aided Civ Inf. 2023;38:1307–1323. wileyonlinelibrary.com/journal/mice 1307


14678667, 2023, 10, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/mice.12944 by University Of Connecticut, Wiley Online Library on [07/06/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
1308 SONG et al.

fiber-reinforced concrete using 11 ML algorithms and analyzed the influential factors. At the structural member level, Olalusi and Awoyera (2021) focused on the shear capacity of slender RC structures with steel fibers and used Gaussian process regression for prediction. Wakjira et al. (2022) employed ensemble ML to estimate the shear capacity of fiber-reinforced-polymer (FRP) reinforced concrete. Naderpour et al. (2021) utilized a decision tree and ANNs to determine the failure mode of RC columns and achieved exceptional performance. At the structural system level, Rafiei and Adeli (2017) combined ML classification algorithms with an optimization process for earthquake prediction. Zhu et al. (2021) attempted to use ANNs and support vector regression to calculate the buckling load of imperfect reticulated shells, avoiding costly computation in nonlinear analyses. Recently, the deep long short-term memory (LSTM) model proposed by Hochreiter and Schmidhuber (1997) has shown great potential in seismic analysis. Zhang et al. (2019) developed an LSTM model to compute the seismic responses of a specific building, and the results agreed well with the reference data. Torky and Ohno (2021) combined a convolutional neural network (CNN) with LSTM to predict the seismic response of an industrial-level building.

Valuable results have been acquired through relevant studies and have helped engineers efficiently build complicated quantitative relationships in structural computation (Feng et al., 2020; Guan et al., 2021; Graf et al., 2012; Kabir et al., 2021; Oh et al., 2020) and optimization (Hung & Jan, 2002; Messner et al., 1994; Parvin & Serpen, 1999; Yin & Zhu, 2019). However, at the structural system level, existing studies have the following limitations. (1) The data representation of structural systems has rarely been a point of focus in the current literature. Structural systems can possess various topologies due to different structural member configurations and their connectivity, resulting in exponentially increasing complexity compared to that of materials and structural members. The linear data structure used in current studies cannot comprehensively describe the feature information of structural systems. (2) Applications at this level face severe data scarcity. Most ML/DL models adopt a data-driven paradigm, which relies heavily on big data for learning underlying patterns. Nevertheless, there is little experimental data concerning structural systems, and data generation using classic finite element (FE) analysis is inordinately time-consuming and cannot cover the entire parameter space. It is well-known that common data-driven models suffer from data sparsity (Martins et al., 2020) and are not directly applicable at this level. (3) The theoretical correctness of the analysis results is difficult to guarantee. The inference process of data-driven models often neglects distinctive mechanical backgrounds of structural analysis. Accordingly, researchers and engineers cannot interpret the analysis results (Naser, 2021) or evaluate the correctness of the models based on the test set; this is unacceptable for engineering applications where safety is the primary goal.

To address these limitations, an innovative physics-informed DL model named StructGNN-E (i.e., structural analysis based on graph neural network [GNN]–elastic), which is able to implement elastic analysis of structural systems without labeled data, is proposed in this study. The remainder of this paper is organized as follows. In Section 2, we analyze the data representations of structural systems and elaborate on how to utilize non-Euclidean data (the graph) to organize their feature information. On this basis, we develop a GNN architecture to handle the structural analysis of elastic structural systems in Section 3, wherein the basic knowledge of GNNs is introduced and a variant of the graph isomorphism network (GIN) adapting to the structural analysis scenario is illustrated. In addition, we propose a physics-informed paradigm, which integrates fundamental mechanical formulations into DL, to resolve the paucity of data at the structural level. In Section 4, we validate the StructGNN-E model with randomly generated frame structures of different scales. In Section 5, we carry out ablation studies, which further demonstrate the unique effectiveness of the StructGNN-E model, and conceptually discuss the potential application in the field of structural optimization. Finally, we conclude the current work and discuss the outlook of our future work in Section 6.

2 DATA REPRESENTATION USING GRAPH STRUCTURES

The mechanism of structural analysis can be abstracted as predicting structural responses $\mathcal{Y} = (\mathbf{Y}_{t0}, \ldots, \mathbf{Y}_{tn})$ that conform to physical laws given structural properties $\mathcal{C} = (\mathbf{C}_1, \mathbf{C}_2, \ldots)$ and external stimuli $\mathcal{X} = (\mathbf{X}_{t0}, \ldots, \mathbf{X}_{tn})$ (C. Wang et al., 2020). At the structural system level, such as a steel frame structure, the structural properties $\mathbf{C}_i$ correspond to the overall spatial topologies, section geometries of underlying steel members, material properties of structural steel, and so forth. The stimulus-response pair $\langle \mathbf{X}_{tj}, \mathbf{Y}_{tj} \rangle$ represents structural analysis results such as the load-displacement relationship. The subscript $tj$ indicates the stimulus/response at location $j$ at time $t$.

To address the problem, traditional methods, including numerical approaches operating at the level of structural matrices and mechanical approaches operating at the level of nodes or FE elements, follow the idea of solving a large problem by decomposing it into a number of small sub-problems and have been proven useful in civil engineering. Methods belonging to the former category

include singular value decomposition (Pellegrino, 1993), the subspace method (Pellegrino & Calladine, 1986), and group representation theory for symmetric structures (Kangwai & Guest, 2000), which were comprehensively studied before the widespread application of finite element method (FEM) and still play important roles in many algorithms of FE platforms (Lu, 2009). The latter approaches are now the most common approaches for mechanical analysis, ranging from simulations of material behaviors to computations of large-scale structure systems (Reddy, 2019; Zienkiewicz & Robert, 2005).

However, high requirements on mathematical and mechanical knowledge of matrix manipulation, time consumption in building a reasonable FE model, and barriers between different computation platforms make these approaches sometimes unsatisfactory for novel applications. From another perspective, our goal of structural analysis is to build a mapping from the input conditions $(\mathcal{C}, \mathcal{X})$ to the output responses $\mathcal{Y}$ based on artificial intelligence techniques.

Very few intrinsic structural properties have been described in existing studies (Esteghamati & Flint, 2021; Morfidis & Kostinakis, 2017; Hwang et al., 2021), and in some studies, only different loading protocols were considered (Huang & Chen, 2021; Kim et al., 2019; Lagaros & Papadrakakis, 2012), resulting in a great loss of topological information and thus poor generalization capability across different structures. Reviewing classical numerical studies and engineering design experience, the feature information of structural systems involves two aspects. (1) Structural member connectivity: A structural system comprises numerous structural members, whose location information and connectivity constitute the most intuitive characteristics of the system. Figure 1 shows two frame structures with the same geometric outlines and structural members. However, due to different topological connectivity in the first two spans, these two frame structures exhibit disparate internal force distributions under the same loading conditions, demonstrating the importance of connectivity in structural analysis. (2) Structural member configurations: The configurations of underlying structural members, including what member types to use and their intrinsic properties, significantly affect the overall mechanical responses of the structural system. For example, larger member sections lead to higher structural stiffness. Structural members also serve as the physical carriers of external stimuli $\mathcal{X}$.

FIGURE 1 Two examples of frame structures. (a) Frame structure No. 1 and (b) Frame structure No. 2

Accordingly, compared to Euclidean data structures (such as vectors, sequences, and grids), the feature information of structural systems exhibits two properties. (1) Nonsequential: We cannot define a proper sequential relationship among the structural members, and a physical start or end does not exist. Therefore, DL algorithms that address sequences such as the recurrent neural network (RNN) family are not applicable at the structural system level. (2) Translation variant: Due to architectural functionalities and diverse member placements, structural systems usually have topologically different structures at different locations; thus, organizing the feature information into grids or vectors resembling images or text is not feasible. Moreover, the scale of structural systems varies greatly, and it is difficult to define a metric of Euclidean distance between two samples. In summary, common data structures used in existing studies are incapable of comprehensively describing structural systems; thus, a more advanced data representation method is necessary.

Here, we introduce the graph structure, a type of non-Euclidean data (Zafeiriou et al., 2022), to represent structural systems and capture their distinctive properties. Graphs are a kind of data structure consisting of nodes and edges that characterize objects and their relationships, respectively. Recently, graph structures have received increasing attention in various fields, such as social networks in recommender systems (Ying et al., 2018), molecular modeling in chemistry (Wang et al., 2022), and protein interactions in biology (Tsubaki et al., 2019), due to their extraordinary expressive power. These interdisciplinary explorations have inspired us to propose an innovative data paradigm for structural systems. In general, graph structures are classified into directed graphs and undirected graphs in terms of edge orientations. Considering that it is difficult to predetermine the force transfer paths inside a structural system, we adopt undirected graphs:

$$G = (V, E) \tag{1}$$

where $G$ represents a structural system sample, $V$ are nodes (vertices) representing structural joints, and $E$ are edges that correspond to structural members connecting to the joints. Figure 2 shows the graph counterpart of a planar frame structure with braces (the numbers marked in the figure indicate node identity (i.e., node ID) rather than the sequential relationship of nodes), wherein the nodes are beam–column joints, and the edges are beams, columns, and braces. Let $v_i \in V$ denote a node and $e_{ij} = (v_i, v_j) \in E$

denote an edge connecting $v_i$ and $v_j$. The neighborhood of a node $v$ is defined as

$$N(v) = \{u \in V \mid (v, u) \in E\} \tag{2}$$

FIGURE 2 Graph representation of a planar frame structure. (a) Schematic illustration of a planar frame structure and (b) graph representation

One of the most commonly used graph representations is an adjacency matrix $A$ with $A_{ij} = 1$ if $e_{ij} \in E$ and $A_{ij} = 0$ if $e_{ij} \notin E$. To save storage space, we can extend the adjacency matrix $A$ to record structural properties such as section geometries and stiffness by utilizing the symmetry of undirected graphs. For example, the extended adjacency matrix of the planar structure in Figure 2 can be written as

$$
A = \left[\begin{array}{ccccccccccc}
-1 & EI_{0,1} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
EA_{0,1} & 0 & EI_{1,2} & 0 & EI_{1,4} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & EA_{1,2} & 0 & 0 & 0 & EI_{2,5} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & EI_{3,4} & 0 & 0 & 0 & EI_{3,8} & 0 & 0 \\
0 & EA_{1,4} & 0 & EA_{3,4} & 0 & EI_{4,5} & 0 & 0 & EI_{4,8} & EI_{4,9} & 0 \\
0 & 0 & EA_{2,5} & 0 & EA_{4,5} & 0 & EI_{5,6} & 0 & 0 & EI_{5,9} & EI_{5,10} \\
0 & 0 & 0 & 0 & 0 & EA_{5,6} & 0 & 0 & 0 & 0 & EI_{6,10} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & EI_{7,8} & 0 & 0 \\
0 & 0 & 0 & EA_{3,8} & EA_{4,8} & 0 & 0 & EA_{7,8} & 0 & EI_{8,9} & 0 \\
0 & 0 & 0 & 0 & EA_{4,9} & EA_{5,9} & 0 & 0 & EA_{8,9} & 0 & EI_{9,10} \\
0 & 0 & 0 & 0 & 0 & EA_{5,10} & EA_{6,10} & 0 & 0 & EA_{9,10} & 0
\end{array}\right] \tag{3}
$$

where each diagonal element $A_{ii}$ indicates the boundary condition of node $v_i$ ($-1$ for a confined node); each element in the upper triangle $A_{ij,\,i<j}$ represents the bending rigidity of the structural member between node $v_i$ and node $v_j$; and each element in the lower triangle $A_{ij,\,i>j}$ represents the axial rigidity.

Apart from the fundamental representation above, both the nodes and edges can have attributes $\mathbf{X}_v \in \mathbb{R}^{n \times d}$ ($n$ is the number of nodes and $d$ is the dimension of node features) and $\mathbf{X}_e \in \mathbb{R}^{m \times c}$ ($m$ is the number of edges and $c$ is the dimension of edge features) to enhance the expressiveness of the graph. For structural systems, the coordinates of joints are important node features that determine the physical positions of each structural member and their spatial information, such as their lengths. As the subsequent sections will show, node features of coordinates play a critical role in structural analysis.

Based on the concepts and notations of graph data, we can properly digitalize arbitrary structural systems and preserve their feature information with high fidelity, facilitating intelligent structural analysis by downstream DL models.

3 STRUCTGNN-E MODEL

In practice, most engineering structures adopt elastic analysis solutions under static loads during the design stage. Furthermore, elastic cases also serve as the foundations of elasto-plastic cases, which provide key techniques and help develop the general framework. Therefore, this study takes elastic structural systems under static loads as the research object.

Adapting to the graph data representation, we propose a DL structural analysis model named StructGNN-E (i.e., structural analysis based on GNN–elastic) within the GNN framework. The model architecture is shown in Figure 3. The feature information of structural systems is transformed into graph data and sent to a variant of GIN, which computes the internal force distribution using a message-passing mechanism. The whole model is driven by structural mechanics without labeled data, overcoming the data scarcity problem at the structural system level.

3.1 Basics of GNNs

GNNs are a kind of emerging DL technique that extends classical neural networks such as CNNs and RNNs to process non-Euclidean graph data. A fundamental assumption underlying GNNs is that the targets for prediction

should be invariant to the order of graph nodes. Therefore, GNN parameters are independent of the node ordering and are shared across the entire graph. In general, GNNs use the graph structure along with node and edge features for representation learning of nodes or the entire graph. The framework of most modern GNNs can be formulated using a message-passing mechanism (Zhou et al., 2020), wherein an $L$-layered GNN is expressed as:

$$\boldsymbol{h}_v^{(0)} = \boldsymbol{x}_v, \quad \forall v \in V \tag{4}$$

$$\boldsymbol{m}_{uv}^{(l)} = \mathrm{MSG}^{(l)}\left(\boldsymbol{h}_v^{(l-1)}, \boldsymbol{h}_u^{(l-1)}\right), \quad \forall (u, v) \in E \tag{5}$$

$$\boldsymbol{a}_v^{(l)} = \mathrm{AGG}^{(l)}\left(\left\{\boldsymbol{m}_{uv}^{(l)} \mid u \in N(v)\right\}\right), \quad \forall v \in V \tag{6}$$

$$\boldsymbol{h}_v^{(l)} = \mathrm{UPT}^{(l)}\left(\boldsymbol{h}_v^{(l-1)}, \boldsymbol{a}_v^{(l)}\right), \quad \forall v \in V \tag{7}$$

where the operations $\mathrm{MSG}^{(l)}$, $\mathrm{AGG}^{(l)}$, and $\mathrm{UPT}^{(l)}$ are parameterized functions with subscripts denoting "message," "aggregate," and "update," respectively, and $\boldsymbol{h}_v^{(l)}$ is the node representation (or embedding vector) at the $l$th layer. Concretely, as shown in Figure 4, all the nodes in the graph carry a "message" (initially, the message is the original node feature), and at every layer, each node aggregates "messages" from its neighborhood nodes to update the next message or embedding. At the output head, GNNs implement the specified task, such as node classification and link prediction, based on the final representation $\boldsymbol{h}_v^{(L)}$ (Wu et al., 2020).

FIGURE 3 Architecture of the StructGNN-E (structural analysis based on graph neural network [GNN]–elastic) model. GIN, graph isomorphism network

FIGURE 4 Message passing mechanism of GNNs

Typically, $\mathrm{MSG}^{(l)}$ is embedded into the aggregation function $\mathrm{AGG}^{(l)}$ or simply preserves the representation from the previous layer. The choices of $\mathrm{AGG}^{(l)}$ and $\mathrm{UPT}^{(l)}$ are crucial and derive different GNN models. For instance, the graph convolutional network (Kipf & Welling, 2017) selects mean pooling as the aggregation function and a single-layer neural network as the updater:

$$\boldsymbol{h}_v^{(l)} = \mathrm{ReLU}\left(\boldsymbol{W}^{(l)} \cdot \mathrm{MEAN}\left\{\boldsymbol{h}_u^{(l-1)} \mid \forall u \in N(v) \cup \{v\}\right\}\right) \tag{8}$$

Alternatively, in GraphSAGE (Hamilton et al., 2017), a max pooling aggregator is used, and the updater adds a concatenation operation before the linear mapping:

$$\boldsymbol{h}_v^{(l)} = \mathrm{ReLU}\left(\boldsymbol{W}_{\mathrm{UPT}}^{(l)} \cdot \left[\boldsymbol{h}_v^{(l-1)}, \boldsymbol{a}_v^{(l)}\right]\right) \tag{9}$$

$$\boldsymbol{a}_v^{(l)} = \mathrm{MAX}\left(\left\{\mathrm{ReLU}\left(\boldsymbol{W}_{\mathrm{AGG}}^{(l)} \cdot \boldsymbol{h}_u^{(l-1)}\right), \forall u \in N(v)\right\}\right) \tag{10}$$

The ReLU operation in Equations (8) to (10) is the rectified linear unit:

$$\mathrm{ReLU}(x) = x \odot \mathbf{1}_{x \geq 0} \tag{11}$$

where $\mathbf{1}_{condition}$ is the indicator function and $\odot$ denotes the Hadamard product.

3.2 Modification of the GIN

In the structural analysis scenario, an effective GNN should be able to distinguish different nodes by mapping them to different representations in the embedding space. Figure 5 displays an asymmetric frame structure, wherein nodes $i$ and $j$ have the same degree (i.e., the number of edges connecting to the node). However, despite sharing the same local structure, the internal forces at these two nodes react differently in most load cases. Moreover, since these two nodes are located in the middle of the entire structure, very deep layers are required to reach the boundary so that the model can discriminate the subtrees rooted in them, which is infeasible in practice. Therefore, how GNNs can identify different nodes with similar local structures is a challenging problem in structural analysis and imposes high demands on selecting an appropriate aggregation function within the message-passing framework.

To address this problem, we first augment the graph data by introducing physical coordinates into the node features. In this manner, nodes that have the same local structure, such as node $i$ and node $j$ in Figure 5, can be distinguished easily because the messages (i.e., the coordinates) passed from their neighborhood nodes differ and thus can be projected to discriminative representations through simple mathematical operations. This method corresponds to a feature augmentation strategy in which node IDs are used to enhance the expressiveness of graph data, which is popular in recommender systems (J. Wang et al., 2018).

For nodes with different degrees, such as nodes $s$ and $j$, we resort to carefully designing the message aggregator. We revisit classical aggregators (sum pooling, average pooling, and max pooling) to analyze their effectiveness. On the right side of Figure 5, we illustrate the "messages" from the neighborhood of nodes $s$ and $j$. We calculate the aggregation results using classical aggregators as shown in Table 1. By comparing the results at two nodes, we can determine the cases in which sum pooling and mean pooling fail to distinguish the local structures. For example, when $a = 7d$, $b = 7h$, $x = 9d$, and $y = 9h$, the sum pooling aggregator yields the same message information of $(36d, 36h)$ at both nodes. Max pooling seems the most powerful in this situation. However, in Figure 5, nodes $s$ and $r$, which share the same maximal neighborhood coordinate $(a + d, b + h)$, will be confused. In summary, classical aggregators are incapable of adapting to all cases in structural systems.

Distinguishing different nodes with similar local structures corresponds to the well-known graph isomorphism problem (Weisfeiler & Leman, 1968). An eligible GNN is expected to map different multisets of node features to different representations, requiring the aggregation function to be injective. Considering that the neighborhood nodes in structural systems are countable and finite, we adopt the philosophy of GIN (Xu et al., 2019), which states that for a countable multiset, there exists a function $f: \mathcal{X} \to \mathbb{R}^n$ so that

$$h(v, X) = (1 + \epsilon) \cdot f(v) + \sum_{x \in X} f(x) \tag{12}$$

is unique, where $v \in \mathcal{X}$ and $X \subset \mathcal{X}$ is a multiset of bounded size. In addition, any function $g$ that acts on such pairs can be decomposed as follows:

$$g(v, X) = \phi\left[(1 + \epsilon) \cdot f(v) + \sum_{x \in X} f(x)\right] \tag{13}$$

Based on this statement, GIN uses multilayer perceptrons (MLPs) to model the composite function $f^{(l)} \circ \phi^{(l-1)}$ by leveraging the universal approximation theorem (Hornik et al., 1989, 1990). The original GIN thus updates node representations as follows:

FIGURE 5 Illustration of message passing in a frame structure

TABLE 1 Failure cases of classical aggregators

               Sum pooling                 Mean pooling                Max pooling
Node s         (5a + d, 5b + h)            (a + d/5, b + h/5)          (a + d, b + h)
Node j         (4x, 4y)                    (x, y)                      (x + d, y + h)
Failure case   5a + d = 4x, 5b + h = 4y    5a + d = 5x, 5b + h = 5y    –
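The failure cases in Table 1 can be checked numerically with a short sketch. Figure 5 is not reproduced here, so the neighbor layouts below are an assumption chosen only because they reproduce the Table 1 entries: node $s$ at $(a, b)$ is given five neighbors (left, right, below, above, and an upper-right brace end), and node $j$ at $(x, y)$ is given four (left, right, below, above). The helper names `neighbors_s`, `neighbors_j`, and `collides` are hypothetical.

```python
import numpy as np

AGGREGATORS = {"sum": np.sum, "mean": np.mean, "max": np.max}


def neighbors_s(a, b, d, h):
    # Assumed layout for node s: left, right, below, above, upper-right brace.
    return np.array([[a - d, b], [a + d, b], [a, b - h],
                     [a, b + h], [a + d, b + h]], dtype=float)


def neighbors_j(x, y, d, h):
    # Assumed layout for node j: left, right, below, above.
    return np.array([[x - d, y], [x + d, y], [x, y - h], [x, y + h]],
                    dtype=float)


def collides(kind, a, b, x, y, d, h):
    """True if the aggregator maps both neighborhoods to the same message."""
    agg = AGGREGATORS[kind]
    msg_s = agg(neighbors_s(a, b, d, h), axis=0)
    msg_j = agg(neighbors_j(x, y, d, h), axis=0)
    return bool(np.allclose(msg_s, msg_j))


# Sum pooling fails when 5a + d = 4x and 5b + h = 4y (here d = h = 1).
sum_fails = collides("sum", a=7, b=7, x=9, y=9, d=1, h=1)
# Mean pooling fails when 5a + d = 5x and 5b + h = 5y (here d = h = 5).
mean_fails = collides("mean", a=7, b=7, x=8, y=8, d=5, h=5)
```

Under these assumed layouts, the first configuration is the $a = 7d$, $x = 9d$ example from the text, where both sums equal $(36d, 36h)$ while the mean and max aggregates still differ.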

$$\boldsymbol{h}_v^{(l)} = \mathrm{MLP}^{(l)}\left(\left(1 + \epsilon^{(l)}\right) \cdot \boldsymbol{h}_v^{(l-1)} + \sum_{u \in N(v)} \boldsymbol{h}_u^{(l-1)}\right) \tag{14}$$

where $\epsilon^{(l)}$ can be either learnable or fixed. Equation (12) is applicable when node features are organized as one-hot encodings in the first iteration, which makes their summation injective. To accommodate structural system scenarios, we modify the first layer to:

$$\boldsymbol{h}_v^{(1)} = \mathrm{MLP}^{(1)}\left(\left(1 + \epsilon^{(1)}\right) \cdot \mathrm{MLP}^{(0)}\left(\boldsymbol{x}_v\right) + \sum_{u \in N(v)} \mathrm{MLP}^{(0)}\left(\boldsymbol{x}_u\right)\right) \tag{15}$$

Then, the GIN model can preserve injectiveness while recursively updating each node's feature embedding to capture the graph structure.

3.3 Physics-informed modeling

The mainstream DL models are driven by big data, mining the latent patterns underlying the dataset. Accordingly, existing ML/DL studies on structural analysis rely heavily on data from experiments or generated using FE analysis to train the model parameters, which corresponds to a supervised learning scheme. However, in contrast to construction materials and structural members such as beams and columns, experimental data are very scarce at the structural system level. Meanwhile, data generation through FE analysis will consume a large amount of time and, more importantly, cannot cover the entire parameter space due to the high complexity of feature information of structural systems. In addition to the data paucity problem, data-driven models are criticized for their poor interpretability and neglect the physical background of structural engineering. As a result, researchers and engineers cannot evaluate the correctness of the prediction results, which impedes the promotion of DL methods in engineering projects.

Structural analysis has a solid theory in mechanics. Specifically, elastic cases are governed by three mechanical equations: the equilibrium equation, the deformation compatibility equation, and the elastic constitutive

equation. According to structural mechanics, these three equations are theoretically complete, implying that if they are satisfied everywhere in an elastic structural system, the solutions of internal forces and deformation are correct and unique. Based on this concept, we innovatively employ a self-supervised learning scheme to free the model from the demand in data and propose a new training mode driven by structural mechanics.

Analogous to classical solutions in structural mechanics, we focus on the deformation and internal forces at member nodes. By virtue of graph representation, we can maintain node degrees of freedom (DOFs) as extra node features and easily realize the deformation compatibility equation because it is convenient to condense the DOFs according to rigid or hinge connections. The elastic constitutive relationship is stored in the extended adjacency matrix or edge features. Therefore, the DL model only needs to address equilibrium equations, from which we can construct a mechanistic-based loss function for optimization:

$$\mathcal{L} = \left\| \boldsymbol{F}_{in} + \boldsymbol{F}_{ex} \right\| \tag{16}$$
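A minimal numerical sketch of Equation (16) is given below. It is deliberately simplified and rests on assumptions that are not part of the paper: the members are pin-jointed axial (truss) elements rather than frame elements, the displacements are passed in directly instead of being predicted by a GNN, and the names `internal_forces` and `equilibrium_loss` are hypothetical. The sign convention takes the internal force as the force that the members exert on each joint, so that the residual vanishes at equilibrium.

```python
import numpy as np


def internal_forces(coords, members, ea, disp):
    """Forces exerted by axial members on the joints (assumed truss model)."""
    f_in = np.zeros_like(coords, dtype=float)
    for (i, j), rigidity in zip(members, ea):
        delta = coords[j] - coords[i]
        length = np.linalg.norm(delta)
        direction = delta / length
        elongation = direction @ (disp[j] - disp[i])
        axial = (rigidity / length) * elongation  # tension positive
        f_in[i] += axial * direction  # a member in tension pulls i toward j
        f_in[j] -= axial * direction  # and pulls j toward i
    return f_in


def equilibrium_loss(coords, members, ea, disp, f_ex, free):
    """Mean squared residual of F_in + F_ex over the unrestrained joints."""
    residual = internal_forces(coords, members, ea, disp) + f_ex
    return float(np.mean(residual[free] ** 2))


# Two-bar truss: joints 0 and 1 are supports, joint 2 carries a downward load.
coords = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]])
members = [(0, 2), (1, 2)]
ea = [1000.0, 1000.0]
f_ex = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, -10.0]])
free = np.array([False, False, True])

disp_zero = np.zeros_like(coords)
disp_exact = np.zeros_like(coords)
disp_exact[2, 1] = -10.0 * np.sqrt(2.0) / 1000.0  # closed-form solution

loss_at_zero = equilibrium_loss(coords, members, ea, disp_zero, f_ex, free)
loss_at_exact = equilibrium_loss(coords, members, ea, disp_exact, f_ex, free)
```

For this example the loss is nonzero for the undeformed guess and vanishes (up to round-off) at the closed-form displacement, which illustrates why driving the residual to zero is equivalent to solving the equilibrium equations without any labeled response data.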


In Equation (16), $\boldsymbol{F}_{in}$ denotes the internal force vectors computed by the model, $\boldsymbol{F}_{ex}$ denotes the external input load vectors, and $\|\cdot\|$ is a norm metric, such as the mean squared error (MSE) loss with the L2-norm. Accordingly, solving equilibrium equations is equivalent to minimizing Equation (16). In contrast to the "train-test-generalize" paradigm of data-driven models, the convergence of the training process of our physics-informed model corresponds to the completion of the solution, whose correctness is guaranteed by the theoretical completeness of the three mechanical equations.

Figure 6 displays the algorithm of the StructGNN-E model in force-driven cases, where the external loads are the inputs, and the model attempts to predict the node deformation according to the structural topological information and node (as well as edge) features. The entire procedure of the physics-informed model no longer requires labeled data (processed data labeled with prediction targets by humans, often used for training and validation of data-driven models), which fundamentally alleviates the problem of the severe data paucity of structural systems and poses no restrictions on the structures to be simulated, leading to great generalization. Moreover, the resolution mainly relies on the topological messages from the underlying representation, giving full play to the advantages of the graph data structure, which can be interpreted as automatic identification of the force transmission path inside the structural system through DL.

FIGURE 6 StructGNN-E model algorithm in force-driven cases. GIN, graph isomorphism network

To further illustrate the process, we demonstrate the setup of an example of StructGNN-E with a 2-hop GIN:

The input $I_v$ includes node coordinates $(x_v, y_v)$ and nodal load $\boldsymbol{F}_{ex}$:

$$I_v = (x_v, y_v, F_{vx}, F_{vy}, M_{vy}) \tag{17}$$

The output $O_v$ is set as the global nodal displacement:

$$O_v = (dx_v, dy_v, d\theta_v) \tag{18}$$

With a 2-hop GIN, the input is processed as follows:

$$h_v^{(0)} = \mathrm{Linear}_1(I_v) \tag{19}$$

$$h_v^{(1)} = \mathrm{ReLU}\left[\mathrm{MLP}^{(1)}\left(h_v^{(0)} + \sum_{n} A_{v,u}\, h_u^{(0)}\right)\right] \tag{20}$$

$$h_v^{(2)} = \mathrm{MLP}^{(2)}\left(h_v^{(1)} + \sum_{n} A_{v,u}\, h_u^{(1)}\right) \tag{21}$$

$$O_v = \mathrm{Linear}_2\left(h_v^{(2)}\right) = (dx_v, dy_v, d\theta_v) \tag{22}$$

$\boldsymbol{F}_{in} = (F_x, F_y, M)$ is then computed by the output with constitutive relationships of structure members.

Thus far, we have elaborated on the mechanism of the StructGNN-E model, which is capable of recognizing

different node representations with similar local topologies and performing data-free structural analysis driven by structural mechanics.

4 VALIDATION OF THE STRUCTGNN-E MODEL

To validate the effectiveness and efficiency of the StructGNN-E model for elastic structural analysis quickly and informatively, we carry out a numerical experiment using multifloor frame structures with diverse configurations. We compare the results with numerical results produced by the FE method to show its correctness.

4.1 Data preparation

To prepare the data for the numerical experiment, we develop a subroutine that can randomly generate steel frame structures with braces conforming to the common engineering design. The number of floors and spans and the member sections can be manipulated in the subroutine to vary the configurations of the generated structures. For example, Figure 7 shows a 14-floor five-span steel frame structure with braces generated by the subroutine, wherein the floor height is 3.6 m and the span length is 8 m. For simplicity, all columns have an I-section of 500 × 180 × 14 × 12 (hw × bf × tf × tw mm), all beams have an I-section of 400 × 150 × 14 × 12 (mm), and all braces have an I-section of 300 × 150 × 14 × 12 (mm). All structural members adopt Q345B structural steel with an elastic modulus of 200 GPa. External loads are also generated randomly, including node forces and moments. Intuitively, the number of nodes can reflect the scale of a structural system, which also indicates the size of the GNN model. We validate our proposed model on different-scale problems with 10, 100, and 1000 nodes.

FIGURE 7 A frame structure generated for the numerical experiment

TABLE 2 Number of parameters for different problem scales (number of nodes N)

Problem scale (N)   First layer dimension   Second layer dimension   #Trainable parameters (P)
10                  8                       32                       3411
100                 16                      64                       12,867
1000                32                      128                      49,472

4.2 Training configurations

The StructGNN-E model can convert the generated structures into graph data (including an extended adjacency matrix and node and edge features) and build a GIN model with a proper dimension. We adopt a two-layer GIN model that covers two hops with two-layer MLPs. The dimensions of the hidden states and the total number of trainable parameters for different problem scales are listed in Table 2. Although the learnable parameters increase as the problem scales up, the size of our model is still much smaller than that of common DL models.

The model is trained using the Adam algorithm (Kingma & Ba, 2014) with default settings of β1 = 0.9, β2 = 0.999, and ε = 10^−8 and employs a learning rate schedule that decays the initial learning rate (lr0 = 0.01) with epochs. The MSE loss is adopted to measure the residual forces in Equation (16). For each problem scale (number of nodes N), 100 cases of different loading conditions, including the lateral load and vertical weight (randomly generated), are calculated, each of which stops when the number of epochs reaches 10,000 or the relative residual forces are less than 1%. The StructGNN-E model is developed based on the PyTorch platform and is run on an i7-8700 CPU and an RTX Titan GPU (graphic processing unit) to test the efficiency.

4.3 Model validation

As shown in Figure 8a, the StructGNN-E model converges in all cases and presents reasonable results. The accuracy

TABLE 3 Comparison of the computational efficiency

Problem scale (N)   SAP2000     StructGNN-E (CPU)   StructGNN-E (GPU)
10                  0.15 s      0.61 s              0.42 s
100                 3.20 s      3.36 s              1.94 s
1000                68.27 s     71.94 s             45.69 s
10,000              2185.20 s   1390.63 s           848.54 s

is defined according to Equation (16) as

𝐴𝑐𝑐𝑢 = 1 − ‖𝑭in + 𝑭ex ‖ ∕ ‖𝑭ex ‖ (23)
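
The residual of Equation (16) and the accuracy of Equation (23) can be illustrated with a minimal sketch. The function names and the force values below are hypothetical and chosen only for illustration; the mean-squared form of the loss is one possible reading of "MSE loss with the L2-norm", not a confirmed detail of the authors' implementation.

```python
import math


def residual(f_in, f_ex):
    """Unbalanced nodal forces F_in + F_ex, component by component."""
    return [a + b for a, b in zip(f_in, f_ex)]


def l2_norm(vec):
    """Euclidean norm of a flat list of force components."""
    return math.sqrt(sum(c * c for c in vec))


def equilibrium_loss(f_in, f_ex):
    """Mean squared residual force (assumed reading of Equation (16))."""
    r = residual(f_in, f_ex)
    return sum(c * c for c in r) / len(r)


def accuracy(f_in, f_ex):
    """Equation (23): 1 - ||F_in + F_ex|| / ||F_ex||."""
    return 1.0 - l2_norm(residual(f_in, f_ex)) / l2_norm(f_ex)


# Hypothetical (Fx, Fy, M) components for two nodes; not taken from the paper.
f_ex = [10.0, 0.0, 0.0, 0.0, -20.0, 5.0]
f_in = [-9.9, 0.0, 0.0, 0.1, 19.8, -5.0]

print(equilibrium_loss(f_in, f_ex))
print(accuracy(f_in, f_ex))
```

When the internal forces balance the external loads exactly, the residual vanishes and the accuracy equals 1; any unbalanced force lowers it in proportion to the magnitude of the applied load.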

The StructGNN-E model achieves an average accuracy of 99%
on different problem scales with random loading conditions,
indicating that our proposed model and
theory-driven paradigm are exceedingly effective and are
capable of accurately implementing elastic structural anal-
ysis. Figure 8b–d shows the influences of the number of
learnable parameters P on the performance, showing that
with the increase in problem scale, the model also needs to
expand its size to adapt to higher complexity. Although a
model size P that is too small may limit the convergence,
quantitative analysis using logarithmic transformation indicates
that the suitable model size ensuring convergence is
approximately C⋅N^0.58 in terms of the problem scale N
(number of nodes), where C is a constant. For the same
problem scale, larger models yield higher accuracy and
slightly faster convergence.
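
The logarithmic transformation mentioned above can be sketched as a straight-line fit of log P against log N. The snippet uses the three (N, P) rows listed in Table 2; whether the authors fitted exactly these points is an assumption, and the function name `fit_power_law` is hypothetical.

```python
import math

# (N, P) pairs as listed in Table 2.
sizes = [(10, 3411), (100, 12867), (1000, 49472)]


def fit_power_law(pairs):
    """Least-squares line through (log N, log P); returns (C, exponent)."""
    xs = [math.log(n) for n, _ in pairs]
    ys = [math.log(p) for _, p in pairs]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    exponent = sxy / sxx
    constant = math.exp(y_mean - exponent * x_mean)
    return constant, exponent


C, k = fit_power_law(sizes)
print(f"P ~ {C:.0f} * N^{k:.2f}")
```

For these three rows the fitted exponent comes out at about 0.58, which is consistent with the scaling quoted in the text.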
To show the performance more clearly, we compare the
StructGNN-E model with the commonly used FE package
SAP2000. The analysis results are displayed in Figures 9
and 10, where both the bending moment diagram and the
virtual work diagram computed by the model agree very
well with the FE model, verifying the high accuracy of the
StructGNN-E model. Table 3 shows a comparison of the
computational efficiency of the different models. Gener-
ally, the models are run on GPUs to utilize their optimized
computing power. However, to ensure comparability with
the FE software, we add a column to show the compu-
tational efficiency of the StructGNN-E model run on one
CPU. Table 3 indicates that our proposed model exhibits
competitive efficiency on the CPU compared with the FE
software, and even surpasses the latter when N = 10,000,
saving up to 36% of the time cost. The model run on the
GPU is much faster than expected, which verifies that the
StructGNN-E model is efficient for implementing structural analysis, especially when the problem scale is large.

In summary, the numerical experiment verifies the effectiveness of the StructGNN-E model, which can solve the elastic analysis problem of structural systems of any scale with high accuracy and excellent computational efficiency, making our proposed model applicable to engineering practice.

FIGURE 8 Average convergence curves of different model sizes on different problem scales. (a) Average convergence curves for different problem scales, (b) comparison of convergence curves of different model sizes on problem N = 10, (c) comparison of convergence curves of different model sizes on problem N = 100, and (d) comparison of convergence curves of different model sizes on problem N = 1000

FIGURE 9 Comparison of bending moment diagrams. (a) Results using the StructGNN-E model and (b) results using SAP2000

FIGURE 10 Comparison of virtual work diagrams. (a) Results using the StructGNN-E model and (b) results using SAP2000

5 DISCUSSION

This section presents ablation studies to further demonstrate the unique effectiveness of the StructGNN-E model at the structural system level and conceptually discuss its potential application in the field of structural optimization.

5.1 Analysis between GNN and classical DL models

The GNN architecture developed on the graph data structure is the skeleton of the StructGNN-E model.

Accordingly, we first compare the GNN architecture with classical DL models, including a feedforward deep neural network (DNN), CNN, and RNN.

To accommodate classical DL models, the data inputs need to be modified. Intuitively, the structural analysis seeks to build a mapping relation between two mechanical variables (C. Wang et al., 2020), such as the load and displacement. However, as shown in Figure 11, many nodes have the same local topologies and the same input loads (or no load) in a structure, resulting in the same inputs into deep learning models such as a DNN or CNN. Since the classical DL models are deterministic, they can only yield the same results on these nodes, which obviously fails to predict correct displacements. An RNN is an exception because it can complement inputs by receiving historical states $\boldsymbol{h}_i^{(l)}$ from the previous unit (shown in Figure 12), helping to distinguish the nodes to some extent. Therefore, we use node coordinates as the inputs to the DNN and CNN, whereas we use external loads as the inputs to the RNN. All the outputs are nodal displacements in Equation (18). The process remains the same as shown in Figure 6 except for the GIN part.

FIGURE 11 Two similar nodes without external loads in a frame structure

FIGURE 12 DNN/convolutional neural network (CNN)/RNN model architectures. (a) DNN model. (b) CNN model. (c) RNN model

We experiment on two structures with N = 100 and 1000. In each case, the learnable parameters in each model are kept similar, as listed in Table 4. The models all adopt a physics-informed mode. Other experimental configurations remain the same, including the optimization algorithm and the learning rate scheme.

The comparison results are presented in Figure 13 and Table 4, wherein only the StructGNN-E model manages to converge to acceptable accuracy. For the CNN, as seen in Figure 14, the convolutional kernel maps the coordinates of "neighborhood nodes" to new representations. Such a "neighborhood" relationship is determined by data organization. For example, Nodes 2 and 3 are adjacent in the input data, but they actually have no direct mechanical interaction in the structural system. Consequently, the local information extracted by the CNN is useless for the final solution. In contrast to the CNN, the DNN intends to capture all information by full neural connections (as shown in Figure 14), thus achieving better performance. However, full connections also introduce many irrelevant signals that disturb the model as it attempts to find the

optimal solution, resulting in model divergence. Augmented
by the historical states, the RNN relies on the
information transmitted from the previous unit with an
underlying assumption that the nodes in the structure
have order. As a result, if we permute the node IDs,
the RNN will predict different deformations, which is
inconsistent with the actual situation.
In summary, this comparison experiment again reveals
the nonsequential and translation-variant properties of
structural systems and demonstrates the infeasibility of
classical DL models in handling structural systems. On the
other hand, the graph data structure and the GNN archi-
tecture developed on it succeed in capturing the distinctive
characteristics of structural systems and exhibit unique
validity.
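
The point about node ordering can be illustrated with a small sketch. It compares one sum-aggregation hop of the kind used in Equations (20) and (21) with a simple order-dependent recurrence under a relabeling of node IDs. The four-node chain, the scalar features, and the decaying scan that stands in for an RNN are all illustrative assumptions and are not the models used in the experiment.

```python
def aggregate(adj, feats):
    """One sum-aggregation hop: h_v + sum_u A[v][u] * h_u (scalar features)."""
    n = len(feats)
    return [feats[v] + sum(adj[v][u] * feats[u] for u in range(n))
            for v in range(n)]


def recurrence(feats, decay=0.5):
    """Order-dependent scan: each state depends on the previous node's state."""
    out, state = [], 0.0
    for x in feats:
        state = decay * state + x
        out.append(state)
    return out


def permute(values, perm):
    """Relabel nodes: new position i holds old node perm[i]."""
    return [values[p] for p in perm]


def permute_adj(adj, perm):
    """Apply the same relabeling to rows and columns of the adjacency matrix."""
    return [[adj[p][q] for q in perm] for p in perm]


# Hypothetical 4-node chain 0-1-2-3 with scalar node features.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
feats = [1.0, 2.0, 4.0, 8.0]
perm = [2, 0, 3, 1]

# Relabeling before or after the graph hop gives the same per-node result.
graph_ok = (aggregate(permute_adj(adj, perm), permute(feats, perm))
            == permute(aggregate(adj, feats), perm))
# The sequential scan does not commute with the relabeling.
scan_ok = recurrence(permute(feats, perm)) == permute(recurrence(feats), perm)

print(graph_ok, scan_ok)  # prints: True False
```

The graph hop assigns each node the same value regardless of its ID, whereas the scan changes its per-node output when the IDs are permuted.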

5.2 Analysis between data-driven modeling and physics-informed modeling

Another focal point in the StructGNN-E model is the physics-informed paradigm, which resolves the data scarcity problem and ensures the theoretical correctness of the computational results. To discuss the superiority of physics-informed modeling over data-driven modeling at the structural system level, we use SAP2000 to generate the load-displacement data of two structures with N = 100 and 1000 for training the supervised learning model. We prepare a dataset with a total size of 1000 (1000 different loading conditions randomly generated) for each case and divide 800 pieces of data into the training dataset and the rest into the test dataset. In each case, the graph representation and the GIN model of the same dimension are adopted. For the physics-informed paradigm, the convergence of the training process of our physics-informed model corresponds to the completion of the solution, which means that no labeled data or division of training/test is necessary. Other experimental configurations also remain the same for comparability.

FIGURE 13 Average convergence curves of different models on different problem scales. (a) Comparison of convergence curves of different models on problem N = 100 and (b) comparison of convergence curves of different models on problem N = 1000
FIGURE 14 Illustration of the DNN and CNN operating on a frame structure

TABLE 4 Comparison of the StructGNN-E model and classical deep learning models

Scale (N)                 DNN      Convolutional neural network   RNN      GNN
100        #Parameters    4739     3171                           3699     3411
           Accuracy       83.1%    89.9%                          75.8%    99.2%
1000       #Parameters    17,667   12,099                         15,603   12,867
           Accuracy       84.2%    87.6%                          74.3%    99.1%

The results are presented in Figure 15 and Table 5. Figure 15 indicates that the data-driven model converges on the training dataset, realizing accuracies of 99.2% and 99.1% on the two structures. However, on the test dataset,

the data-driven model exhibits very poor generalization and only achieves accuracies of 74.6% and 73.2%, far lower than those on the training dataset. In contrast, the physics-informed model achieves superb performance on the test dataset, demonstrating its effectiveness compared with the data-driven counterpart. We speculate that the failure of the data-driven model is ascribed to the large hypothesis space at the structural system level. For a small-scale planar frame structure with 100 nodes, even considering only three types of node loads (i.e., x-direction force Fx, y-direction force Fy, and moment M), there are 2^3 = 8 possibilities of load conditions at each node, resulting in 8^100 load conditions for the entire structure. The displacement space also has a similar scale of possibilities. Therefore, the hypothesis space of mapping from the load space to the displacement space (or vice versa) can be very large and too difficult for the data-driven method to learn a well-generalized model with limited data. The physics-informed model overcomes this difficulty by imposing strict physical constraints derived from structural mechanics on the hypothesis space, wherein the equilibrium equation is deemed a strong regularization that prunes the searching processes to a large extent. Accordingly, the physics-informed method is able to reach the optimal solution more quickly and easily.

In summary, the data-driven paradigm is inapplicable to structural systems, whereas the physics-informed model exhibits unique effectiveness.

TABLE 5 Comparison of data-driven modeling and physics-informed modeling

Scale (N)                       Data-driven modeling   Physics-informed modeling
100        Training accuracy    99.2%                  –
           Test accuracy        74.6%                  99.2%
1000       Training accuracy    99.0%                  –
           Test accuracy        73.2%                  99.1%

FIGURE 15 Accuracy of the data-driven model on the training dataset

5.3 Potential application in end-to-end structural optimization

We have detailed the advantages of StructGNN-E in the above analysis. In addition to its unique effectiveness and excellent computational accuracy and efficiency, it is worth noting that the StructGNN-E model is capable of propagating differentiability due to the backpropagation algorithm in DL, which has great potential to reform the structural optimization procedures.

Structural optimization is a popular problem in the engineering design and construction stages, wherein engineers attempt to adjust the structural configurations according to engineering indices, such as a balanced combination of safety and economy. Traditionally, structural optimization was implemented manually by iteratively varying the input parameters (such as structural sections) and checking whether the results were acceptable. Consequently, it was very difficult to acquire a real optimal solution with a compromise in engineering applications. In recent decades, researchers have been dedicated to investigating automatic computer algorithms as a substitute for manual labor. However, obstructed by conventional computational methods represented by FE analysis (shown in the upper part of Figure 16), the influences of the input parameters on the target indices cannot be traced back, especially when the coupling effect exists in the input space. Due to this restriction, heuristic algorithms such as the genetic algorithm (Gholizadeh et al., 2008; Jenkins, 1991; Liu et al., 2008) were widely adopted by researchers because they were model-agnostic and automatically mutated the encodings of the input parameters to minimize the engineering targets, wherein only the forward direction was needed so that the entire procedure conformed to the unidirectional paradigm of traditional methods. These classical algorithms are inefficient because the results provide a bare minimum of guidance for searching optimal parameters, and the variation of parameters is more or less random. In addition, the parameterization of traditional structural analysis models is poor. Changes in input parameters usually mean building a new model from scratch, which hinders the realization of automatic optimization. Therefore, classical algorithms have not been promoted for use in structural optimization problems.
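
The limitation described above, that the influence of an input parameter on a target index cannot be traced back, disappears once the analysis is differentiable. The following minimal sketch shows the idea on a deliberately tiny case: a single axial bar whose closed-form elongation u = FL/(EA) stands in for the structural analysis, with the section area A updated by gradient descent. The load, target elongation, learning rate, and function names are hypothetical; the analytic chain rule replaces the backpropagation a DL framework would perform, and only the modulus (200 GPa) and length (3.6 m) are borrowed from Section 4.1 for scale.

```python
E = 200.0        # elastic modulus in kN/mm^2 (200 GPa)
L = 3600.0       # member length in mm
F = 100.0        # axial load in kN (hypothetical)
U_TARGET = 1.2   # desired elongation in mm (hypothetical)


def displacement(area):
    """Forward pass: elongation u = F L / (E A) of an axial bar."""
    return F * L / (E * area)


def objective(area):
    """Squared gap between computed and target elongation."""
    return (displacement(area) - U_TARGET) ** 2


def objective_grad(area):
    """Backward pass: dJ/dA by the chain rule, with du/dA = -F L / (E A^2)."""
    du_da = -F * L / (E * area ** 2)
    return 2.0 * (displacement(area) - U_TARGET) * du_da


def optimize(area, lr=2.0e5, steps=200):
    """Plain gradient descent on the design parameter A (mm^2)."""
    for _ in range(steps):
        area -= lr * objective_grad(area)
    return area


area_opt = optimize(1000.0)
print(round(area_opt, 1), round(displacement(area_opt), 4))
```

Starting from A = 1000 mm^2, the iterate approaches FL/(E·u_target) = 1500 mm^2, and every update is guided by the gradient of the result rather than by random variation of the parameter.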

FIGURE 16 Structural optimization procedures with traditional methods and deep learning methods. FE, finite element

In contrast, the StructGNN-E model can modify the traditional unidirectional paradigm to a new bidirectional paradigm. As shown in Figure 16, not only can the input parameters be transformed into the final results, but the gradients of the results can also be propagated back to update the input parameters. The backward flow is implemented by assimilating the input parameters as a part of learnable parameters in DL, which can be co-optimized by the gradient descent method. Due to the excellent parameterization of DL, there is no need to rebuild the model when the input parameters are changed. This new paradigm is much more efficient because, on the one hand, the information from the results is fully utilized to guide the optimization direction of the input parameters, and on the other hand, the optimizer integrated into the DL framework, such as the Adam algorithm, is more advanced than traditional algorithms. Moreover, since the optimization and the structural analysis share the same training process, we can accomplish them simultaneously and avoid implementing two separate algorithms. Therefore, the StructGNN-E model possesses great potential to innovate the field of structural optimization, deriving a new end-to-end optimization paradigm with high efficiency and automaticity.

6 CONCLUSION

To fill the research gap of ML/DL-based analysis at the structural system level, we propose a physics-informed DL model named StructGNN-E based on the graph data structure. A numerical experiment and ablation studies demonstrate that our proposed model is uniquely effective compared with classical DL models and exhibits superb performance in both computational accuracy and efficiency.

The main conclusions are summarized as follows:

1. We innovatively utilize the graph data structure to organize the feature information with nonsequential and translation-variant properties, which realizes the data representation of structural systems with high fidelity.
2. We propose the StructGNN-E model within the GNN architecture to adapt to the graph representation, wherein we modify GIN to distinguish the nodes with similar local topologies.
3. We propose a novel physics-informed paradigm to resolve the data scarcity problem at the structural system level. The paradigm incorporates structural mechanics into DL and converts the training process to solving the physical equations, implementing the structural analysis without labeled data and ensuring the theoretical correctness.
4. The numerical experiment verifies that StructGNN-E converges to accurate results of structures with different scales and exhibits excellent computational efficiency.
5. Ablation studies demonstrate the unique effectiveness of StructGNN-E at the structural system level compared with classical DL models and the data-driven paradigm.
6. The StructGNN-E model can derive a new end-to-end structural optimization method with high efficiency and automaticity, which is anticipated to have great potential to reform the field of structural optimization.

Nevertheless, the current framework is applicable only to elastic analysis and frame-like structures. The implementation of the physics-informed paradigm can be complex

for elasto-plastic scenarios when the memory effect is taken into account. Structures with high-dimensional members (such as shear walls) or nonlinear deformation are difficult to fit into the current framework. Despite these limitations, StructGNN-E provides a valuable technical framework for intelligent computation at the structural system level, including a graph data representation to describe the structural topologies, a GNN architecture to gather the information from neighborhood nodes, and a physics-informed paradigm to alleviate the dependence on big data, which paves an inspiring avenue for our future work on extending it to elasto-plastic structural analysis.

ACKNOWLEDGMENTS
We gratefully acknowledge the financial support provided by the National Natural Science Foundation of China (Grant No. 52121005), China National Postdoctoral Program for Innovative Talents (Award No. BX20220177), and China Postdoctoral Science Foundation (Grant No. 2022M711864).

REFERENCES
Amezquita-Sancheza, J. P., Valtierra-Rodriguez, M., & Adeli, H. (2020). Machine learning in structural engineering. Scientia Iranica, 27(6), 2645–2656.
Dong, S., Wang, P., & Abbas, K. (2021). A survey on deep learning and its applications. Computer Science Review, 240, 100379.
Esteghamati, M. Z., & Flint, M. M. (2021). Developing data-driven surrogate models for holistic performance-based assessment of mid-rise RC frame buildings at early design. Engineering Structures, 245, 112971.
Feng, D. C., Liu, Z. T., Wang, X. D., Chen, Y., Chang, J. Q., Wei, D. F., & Jiang, Z. M. (2020). Machine learning-based compressive strength prediction for concrete: An adaptive boosting approach. Construction and Building Materials, 230, 117000.
Gholizadeh, S., Salajegheh, E., & Torkzadeh, P. (2008). Structural optimization with frequency constraints by genetic algorithm using wavelet radial basis function neural network. Journal of Sound and Vibration, 312(1-2), 316–331.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. The MIT Press.
Graf, W., Freitag, S., Sickert, J. U., & Kaliske, M. (2012). Structural analysis with fuzzy data and neural network based material description. Computer-Aided Civil and Infrastructure Engineering, 27(9), 640–654.
Guan, X., Burton, H., Shokrabadi, M., & Yi, Z. (2021). Seismic drift demand estimation for steel moment frame buildings: From mechanics-based to data-driven models. Journal of Structural Engineering, 147(6), 04021058.
Hamilton, W., Ying, Z., & Leskovec, J. (2017). Inductive representation learning on large graphs. Advances in Neural Information Processing Systems, 30, 1025.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359–366.
Hornik, K., Stinchcombe, M., & White, H. (1990). Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Networks, 3(5), 551–560.
Huang, P., & Chen, Z. (2021). Deep learning for nonlinear seismic responses prediction of subway station. Engineering Structures, 244, 112735.
Hung, S., & Jan, J. (2002). Machine learning in engineering analysis and design: An integrated fuzzy neural network learning mode. Computer-Aided Civil and Infrastructure Engineering, 14(3), 207–219.
Hwang, S. H., Mangalathu, S., Shin, J., & Jeon, J. S. (2021). Machine learning-based approaches for seismic demand and collapse of ductile reinforced concrete building frames. Journal of Building Engineering, 34, 101905.
Jenkins, W. M. (1991). Towards structural optimization via the genetic algorithm. Computers & Structures, 40(5), 1321–1327.
Kabir, M. A. B., Hasan, A. S., & Billah, A. H M. (2021). Failure mode identification of column base plate connection using data-driven machine learning techniques. Engineering Structures, 240, 112389.
Kang, M. C., Yoo, D. Y., & Gupta, R. (2021). Machine learning-based prediction for compressive and flexural strengths of steel fiber-reinforced concrete. Construction and Building Materials, 266, 121117.
Kangwai, R. D., & Guest, S. D. (2000). Symmetry-adapted equilibrium matrices. International Journal of Solids and Structures, 37(11), 1525–1548.
Kim, T., Kwon, O. S., & Song, J. (2019). Response prediction of nonlinear hysteretic systems by deep neural networks. Neural Networks, 111, 1–10.
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. The 3rd International Conference on Learning Representations, San Diego, CA.
Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. International Conference on Learning Representations, Toulon, France.
Lagaros, N. D., & Papadrakakis, M. (2012). Neural network based prediction schemes of the non-linear seismic response of 3D buildings. Advances in Engineering Software, 44(1), 92–115.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.
Lee, S., Ha, J., Zokhirova, M., Moon, H., & Lee, J. (2018). Background information of deep learning for structural engineering. Archives of Computational Methods in Engineering, 25(1), 121–129.
Liu, X., Yi, W. J., Li, Q. S., & Shen, P. S. (2008). Genetic evolutionary structural optimization. Journal of Constructional Steel Research, 64(3), 305–311.
Lu, J., Luo, Y., & Li, N. (2009). An incremental algorithm to trace the non-linear equilibrium paths of pin-jointed structures using the singular value decomposition of the equilibrium matrix. Proceedings of the Institution of Mechanical Engineers, 223(7), 881–890.
Martins, G. B., Papa, J. P., & Adeli, H. (2020). Deep learning techniques for recommender systems based on collaborative filtering. Expert Systems, 37(6), e12647.

Messner, J. I., Sanvido, V. E., & Kumara, S. R. (1994). StructNet: a neural network for structural system selection. Computer-Aided Civil and Infrastructure Engineering, 9(2), 109–118.
Morfidis, K., & Kostinakis, K. (2017). Seismic parameters’ combinations for the optimum prediction of the damage state of RC buildings using neural networks. Advances in Engineering Software, 106, 1–16.
Naderpour, H., Mirrashid, M., & Parsa, P. (2021). Failure mode prediction of reinforced concrete columns using machine learning methods. Engineering Structures, 248, 113263.
Naser, M. Z. (2021). An engineer’s guide to eXplainable Artificial Intelligence and Interpretable Machine Learning: Navigating causality, forced goodness, and the false perception of inference. Automation in Construction, 129, 103821.
Nguyen, T., Kashani, A., Ngo, T., & Bordas, S. (2018). Deep neural network with high-order neuron for the prediction of foamed concrete strength. Computer-Aided Civil and Infrastructure Engineering, 34(4), 316–332.
Oh, B. K., Park, Y., & Park, H. S. (2020). Seismic response prediction method for building structures using convolutional neural network. Structural Control and Health Monitoring, 27(5), e2519.
Wang, C., Song, L., & Fan, J. (2022). End-to-end structural analysis in civil engineering based on deep learning. Automation in Construction, 138, 104255.
Wang, C., Xu, L., & Fan, J. (2020). A general deep learning framework for history-dependent response prediction based on UA-Seq2Seq model. Computer Methods in Applied Mechanics and Engineering, 372, 113357.
Wang, J., Huang, P., Zhao, H., Zhang, Z., Zhao, B., & Lee, D. L. (2018). Billion-scale commodity embedding for e-commerce recommendation in Alibaba. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK (pp. 839–848).
Wang, Y., Wang, J., Cao, Z., & Farimani, A. B. (2022). Molecular contrastive learning of representations via graph neural networks. Nature Machine Intelligence, 4, 279–287.
Weisfeiler, B., & Leman, A. (1968). A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsiya, 2(9), 12–16.
Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., & Yu, P. S. (2020). A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 32(1), 4–24.
Olalusi, O. B., & Awoyera, P. O. (2021). Shear capacity prediction Xu, K., Hu, W., Leskovec, J., & Jegelka, S. (2019). How powerful
of slender reinforced concrete structures with steel fibers using are graph neural networks? International Conference on Learning
machine learning. Engineering Structures, 227, 111470. Representations, New Orleans, LA.
Parvin, A., & Serpen, G. (1999). Recurrent neural networks for Yin, T., & Zhu, H. P. (2019). An efficient algorithm for architecture
structural optimization. Computer-Aided Civil and Infrastructure design of Bayesian neural network in structural model updating.
Engineering, 14(6), 445–451. Computer-Aided Civil and Infrastructure Engineering, 35(4), 354–
Pellegrino, S. (1993). Structural computations with the singular value 372.
decomposition of the equilibrium matrix. International Journal of Ying, R., He, R., Chen, K., Eksombatchai, P., Hamilton, W. L., &
Solids and Structures, 30(21), 3025–3035. Leskovec, J. (2018). Graph convolutional neural networks for web-
Pellegrino, S., & Calladine, C. R. (1986). Matrix analysis of stati- scale recommender systems. Proceedings of the 24th ACM SIGKDD
cally and kinematically indeterminate frameworks. International International Conference on Knowledge Discovery & Data Mining,
Journal of Solids and Structures, 22(4), 409–428. London, UK (pp. 974–983).
Rafiei, M. H., & Adeli, H. (2017). NEEWS: A novel earthquake early Zafeiriou, S., Bronstein, M., Cohen, T., Vinyals, O., Leskovec, J., Liò,
warning system using neural dynamic classification and neu- P., Bruna, J., & Gori, M. (2022). Guest editorial: Non-euclidean
ral dynamic optimization model. Soil Dynamics and Earthquake machine learning. IEEE Transactions on Pattern Analysis &
Engineering, 100, 417–427. Machine Intelligence, 44(2), 723–726.
Rafiei, M. H., Khushefati, W. H., Demirboga, R., & Adeli, H. (2017). Zhang, R., Chen, Z., Chen, S., Zheng, J., Büyüköztürk, O., & Sun,
Supervised deep restricted Boltzmann machine for estimation of H. (2019). Deep long short-term memory networks for nonlinear
concrete compressive strength. ACI Materials Journal, 114(2), 237– structural seismic response prediction. Computers & Structures,
244. 220, 55–68.
Reddy, J. N. (2019). Introduction to the finite element method (4th ed.). Zhou, J., Cui, G., Hu, S., Zhang, Z., Yang, C., Liu, Z., Wang, L., Li, C.,
McGraw-Hill Education. & Sun, M. (2020). Graph neural networks: A review of methods
Salehi, H., & Burgueño, R. (2018). Emerging artificial intelligence and applications. AI Open, 1, 57–81.
methods in structural engineering. Engineering Structures, 171, Zhu, S., Ohsaki, M., & Guo, X. (2021). Prediction of non-linear buck-
170–189. ling load of imperfect reticulated shell using modified consistent
Sun, H., Burton, H. V., & Huang, H. (2021). Machine learning applica- imperfection and machine learning. Engineering Structures, 226,
tions for building structural design and performance assessment, 111374.
state-of-the-art review. Journal of Building Engineering, 33, 101816. Zienkiewicz, O. C., & Robert, L. T. (2005). The finite element method
Torky, A. A., & Ohno, S. (2021). Deep learning techniques for predict- for solid and structural mechanics. Elsevier.
ing nonlinear multi-component seismic responses of structural
buildings. Computers & Structures, 252, 106570.
Tsubaki, M., Tomii, K., & Sese, J. (2019). Compound–protein inter- How to cite this article: Song, L.-H., Wang, C.,
action prediction with end-to-end learning of neural networks for Fan, J.-S., & Lu, H.-M. (2023). Elastic structural
graphs and sequences. Bioinformatics, 35(2), 309–318. analysis based on graph neural network without
Wakjira, T. G., Al-Hamrani, A., Ebead, U., & Alnahhal, W. (2022). labeled data. Computer-Aided Civil and
Shear capacity prediction of FRP-RC beams using single and Infrastructure Engineering, 38, 1307–1323.
ensenble ExPlainable machine learning models. Composite Struc- https://doi.org/10.1111/mice.12944
tures, 287, 115381.
