https://doi.org/10.1007/s11831-024-10199-z
REVIEW
Abstract
Artificial intelligence (AI) has become a buzzword since Google’s AlphaGo beat a world champion in 2017. In the past five years, machine learning, a subset of the broader category of AI, has attracted considerable attention in the granular materials research community. This work offers a detailed review of recent advances in machine learning-aided studies of granular materials, from particle-particle interaction at the grain level to macroscopic simulations of granular flow. It starts with the application of machine learning to microscopic particle-particle interaction and the associated contact models. Then, different neural networks for learning the constitutive behaviour of granular materials are reviewed and compared. Finally, macroscopic simulations of practical engineering or boundary value problems based on the combination of neural networks and numerical methods are discussed. We hope that this comprehensive review gives readers a clear picture of the development of machine learning-aided modelling of granular materials.
* Y. T. Feng
[email protected]
* Min Wang
[email protected]

1 Zienkiewicz Centre for Computational Engineering, Faculty of Science and Engineering, Swansea University, Swansea, Wales SA1 8EP, UK
2 Department of Civil, Architecture and Environmental Engineering, University of Texas at Austin, Austin, Texas 78701, USA
3 Department of Civil and Environmental Engineering, Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong SAR, China
4 Fluid Dynamics and Solid Mechanics Group, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA

1 Introduction

Granular materials, viewed as macroscopic continua, exhibit complicated features under external loading, including anisotropy [125, 155], strain localization [155, 158], non-coaxiality [154], and path and state dependence [2, 41, 79], owing to their micro-discrete nature. Traditional numerical techniques, such as the finite element method (FEM), finite difference method (FDM), material point method (MPM), discrete element method (DEM), and smoothed particle hydrodynamics (SPH), have been widely employed to investigate the micro/discrete and macro/continuous duality [173] of granular media. Among them, the FEM, SPH, and MPM solve the mechanical response of granular materials at the macroscopic scale, while the DEM focuses on the mechanical behaviour of granular media at the microscale.

In mesh-based numerical methods such as the FEM [196], the research domain is discretized into finite element meshes that incorporate Gaussian points or grids, and the local mechanical response of the material is characterized by continuum-theory-based phenomenological models embedded in each Gaussian point. The classic constitutive models used in these methods include linear elasticity (e.g. Hooke’s law), nonlinear elasticity (e.g. Duncan-Chang [43]), elastic-perfectly-plastic models (e.g. the Mohr-Coulomb model [153], the Drucker-Prager model, and the hardening soil (HS) model [23]), and critical-state-based models (e.g. modified Cam-clay (MCC) [137], UH [179, 180], and the hypoplastic model [118, 177]).

Different from the mesh-based methods, the MPM, a hybrid Eulerian-Lagrangian meshless approach [71, 186], governs the macroscopic deformation of the target body via a set of discretized material points which carry local physical features (e.g. mass, density, and velocity) of the material. In the MPM, the information stored on each material point is first projected to the nodes of the Eulerian mesh, on which the equation of motion is solved. The updated nodal kinematic features are then interpolated back to the corresponding material points to update the deformation of the material, making the
M. Wang et al.
MPM especially superior in solving the large deformation and flow problems of granular materials.

Unlike the continuum-based methods, the DEM, proposed in Cundall and Strack’s works [37–39, 147] in the 1970s, represents the overall behaviour of particle systems by explicitly modelling interactions between particles, where adhesive/frictional contact plays a crucial role. In the early framework of the DEM, the discrete particles were normally simplified as discs for two-dimensional (2D) problems. With the advancement of non-spherical models [22, 54, 94, 165, 172] and corresponding contact theories [47, 50, 194] over the last four decades, the DEM has been firmly established as a powerful numerical tool to model the deformation [116, 151] or flow [42, 185] of collections of grains from the microscopic scale in engineering and scientific problems.

However, these methods have their intrinsic deficiencies. For mesh-based methods, most of the constitutive models they rely on tend to homogenize or smear out the underlying discrete features of granular assemblies [68], which limits their ability to simulate catastrophic instabilities [180] (e.g. landslides and liquefaction) caused by collective particle motion. Furthermore, to fit one particular experimental phenomenon closely, constitutive models have become increasingly sophisticated and inevitably introduce assumptions [180, 183] or parameters that have no physical meaning yet need much effort to calibrate [129, 182, 191], which limits their further application to other tests. It is also worth mentioning that the constitutive models developed for granular flow are more complicated than the above-mentioned ones. Typical work can be found in [187], where the effect of particle rotation at the grain level results in the Jaumann derivative in the evolution equation of the macroscopic contact stress. Additionally, when addressing large deformation problems, the local mesh may suffer from severe distortion and thus deteriorate the modelling results. While the mesh distortion problem can be avoided in the MPM, it requires an information exchange process between the background (Eulerian) grid and the material points, resulting in information loss and significantly increasing the computational complexity.

For the DEM, although the discrete nature of granular particles can be taken into account, it still confronts some challenges. One is the prohibitive computational expense primarily attributed to the contact detection procedure in DEM simulations of particle collections [171], and this issue is further exacerbated in non-spherical particle assemblies where contact detection involves an iteration-based optimization method [94, 105]. Related to this, to minimize the computational cost, irregularly shaped particles are typically simplified as spheres (3D) or circles (2D) [9, 93], which may lead to spurious bulk mechanical properties of particle assemblies.

In addition to the aforementioned numerical methods used for the modelling of granular media, the requirement for large amounts of grain-scale computation has also fostered the emergence of multiscale approaches, such as the FEM-DEM [4, 5, 68, 69, 121] and MPM-DEM [104] approaches, which bridge the micro-features to the computation of the macroscopic response of granular materials. Although these methods can compensate for the defects of continuum methods to some extent, they are also subject to the problem of prohibitive computing costs in large-scale simulations.

More recently, the machine learning (ML) method, which has resurged in various fields [3, 98, 99, 166], appears to be a promising scheme to circumvent the aforementioned shortcomings of traditional numerical methods in simulating the behaviour of granular materials because of the following advantages: 1) the ML or neural network-based surrogate model can directly extract mechanical features of granular materials from raw data without any prior assumption; 2) due to its remarkable high-dimension mapping capability [40, 78], ML models can approximate desired constitutive laws at high precision from sufficient training data rather than from sophisticated formulae and carefully calibrated, physically meaningless parameters; 3) ML models are highly computationally efficient and can rapidly update the micro- or macro-states of materials according to the received deformation information once their training parameters are determined. Resorting to these advantages of the ML method, it is possible to develop ML-based stress–strain models to replace the traditional constitutive models used in continuum-level computational approaches. Meanwhile, employing ML algorithms to establish surrogate contact/interaction models for both micro-grains and macroscopic material points also shows promise in alleviating the computational burden of mesh-based and meshless numerical methods (e.g. MPM and DEM).

This paper primarily aims to provide a comprehensive review of the latest advances in ML-assisted granular material modelling at both microscopic and macroscopic levels from the following aspects: 1) the application of ML models in microscopic grain-scale computation; 2) the development of ML-based constitutive models of granular materials; 3) the development of ML-aided macroscopic simulation of granular materials, including both the macroscopic kinematic features-based ML model and the application of the ML method in mesh-based numerical methods (e.g. the FEM-ML framework).

The remainder of the paper is structured as follows: Sect. 2 compares and summarises the architecture and features of some typically used neural networks in ML-aided modelling of granular materials. A brief review of grain information-based ML simulations, including ML-based inter-particle contact feature modelling and kinematic feature modelling of grain systems, is provided in Sect. 3, covering the advantages and deficiencies of each aspect. Section 4 showcases the latest development of ML-based constitutive studies of granular
Machine Learning Aided Modeling of Granular Materials: A Review
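The particle-to-grid and grid-to-particle transfer that the Introduction describes for the MPM can be illustrated with a minimal one-dimensional sketch. It assumes linear (hat) shape functions on a uniform grid and skips the solution of the equation of motion on the grid, so it only shows the two projection steps. The function name `p2g_g2p` and all variable names are hypothetical and are not taken from the cited MPM references.

```python
import numpy as np

def p2g_g2p(xp, vp, mp, n_nodes, dx):
    """Project particle mass/momentum to grid nodes, then interpolate velocity back.

    Assumes every particle lies in [0, (n_nodes - 1) * dx) so that both
    neighbouring nodes exist. Linear shape functions are an assumption.
    """
    left = np.floor(xp / dx).astype(int)   # node index to the left of each particle
    w_right = xp / dx - left               # weight of the right-hand node
    w_left = 1.0 - w_right                 # weight of the left-hand node

    grid_mass = np.zeros(n_nodes)
    grid_mom = np.zeros(n_nodes)
    # Particle-to-grid: scatter mass and momentum (np.add.at handles repeated indices).
    np.add.at(grid_mass, left, w_left * mp)
    np.add.at(grid_mass, left + 1, w_right * mp)
    np.add.at(grid_mom, left, w_left * mp * vp)
    np.add.at(grid_mom, left + 1, w_right * mp * vp)

    # Nodal velocity; a full MPM step would update grid_mom with nodal forces here.
    grid_vel = np.divide(grid_mom, grid_mass,
                         out=np.zeros_like(grid_mom), where=grid_mass > 0)

    # Grid-to-particle: interpolate the nodal velocity back to the particles.
    vp_new = w_left * grid_vel[left] + w_right * grid_vel[left + 1]
    return grid_mass, grid_vel, vp_new

# Three particles moving with the same velocity on a three-node grid.
xp = np.array([0.25, 0.60, 1.40])
vp = np.full(3, 2.0)
mp = np.ones(3)
grid_mass, grid_vel, vp_new = p2g_g2p(xp, vp, mp, n_nodes=3, dx=1.0)
print(grid_mass.sum(), vp_new)
```

In this sketch the total grid mass equals the total particle mass and a uniform velocity field survives the round trip; non-uniform fields are smoothed by the two interpolations, which is one way to read the information exchange cost mentioned above.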
As shown in Fig. 2, the input vector $x = (x^{(0)}, \ldots, x^{(t)})$ is fed into each RNN neuron featured with the recurrent connection (the orange dashed line), which can be envisioned as a succession of unrolled cells across time steps, and the neuron outputs the hidden state $H^{(t)}$ of the current time step. In each cell, the hidden state of the last time step and the input of the current time step are considered together to generate the hidden state of the next time step until the hidden state of time step $t$ is obtained, enabling the history information to be integrated into the output layer to predict the current time step.

The feedforward process in one RNN neuron can be expressed as:

$$H^{(t)} = \tanh\left(W^{(t)} x^{(t)} + U^{(t)} H^{(t-1)} + b\right) \tag{3}$$

$$O^{(t)} = g\left(W_2 H^{(t)} + b_2\right) \tag{4}$$

However, the basic RNN architectures normally suffer from gradient vanishing or exploding problems along the timeline, resulting in a weak capability to capture long-term dependencies. Therefore, more elaborate RNN architectures, such as the long short-term memory (LSTM) [77] and the gated recurrent unit (GRU) [31], have been developed and are more commonly used.

As illustrated in Fig. 3a, which shows the memory cell of the LSTM, the key distinction between the standard RNN and the LSTM is the introduction of the internal state $C_t$ and the novel inclusion of multiplicative gates, i.e. the input gate $I_t$, input node $\tilde{C}_t$, forget gate $F_t$, and output gate $O_t$, in each RNN neuron, which can be calculated as:

$$F_t = \sigma\left(W_F x^{(t)} + U_F H^{(t-1)} + b_F\right) \tag{5}$$

$$I_t = \sigma\left(W_I x^{(t)} + U_I H^{(t-1)} + b_I\right) \tag{6}$$
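As a rough illustration of the basic recurrence in Eqs. (3) and (4), the following NumPy sketch unrolls one RNN layer over a short sequence. It assumes weights shared across time steps, a zero initial hidden state, and an identity output activation $g$; these choices, the function name `rnn_forward`, and the array shapes are assumptions made for illustration, not details given in the review.

```python
import numpy as np

def rnn_forward(x_seq, W, U, b, W2, b2, g=lambda z: z):
    """Unroll the recurrence of Eqs. (3)-(4) over x_seq with shape (T, n_in)."""
    H = np.zeros(U.shape[0])                  # assumed zero initial hidden state
    for x_t in x_seq:
        H = np.tanh(W @ x_t + U @ H + b)      # Eq. (3): new hidden state
    O = g(W2 @ H + b2)                        # Eq. (4): output at the last step
    return O, H

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 2, 4, 1, 5
W = rng.normal(size=(n_hidden, n_in))
U = rng.normal(size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)
W2 = rng.normal(size=(n_out, n_hidden))
b2 = np.array([0.5])
x_seq = rng.normal(size=(T, n_in))

O, H = rnn_forward(x_seq, W, U, b, W2, b2)
print(O.shape, H.shape)
```

Because each hidden state depends on the previous one, the loop over time steps cannot be parallelized, which is the parallelism limitation of RNNs noted in this section.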
Here, $I_t$ and $\tilde{C}_t$ determine how much new information is adopted, and $F_t$ governs how much of the old internal state $C_{t-1}$ is inherited to update the new internal state $C_t$:

$$C_t = F_t \odot C_{t-1} + I_t \odot \tilde{C}_t \tag{9}$$

The current hidden state $H^{(t)}$, which relies on the obtained internal state $C_t$ and the output gate $O_t$, is then updated by:

$$H^{(t)} = O_t \odot \tanh\left(C_t\right) \tag{10}$$

where $\odot$ denotes the elementwise product operator and $\sigma$ represents the sigmoid activation function.

On the other hand, the gated recurrent unit (GRU) neural network shown in Fig. 3b provides a streamlined version of the LSTM with comparable performance, but speeds up computation by introducing only two gates, the update gate $Z_t$ and the reset gate $R_t$, to the standard RNN cell. The output of these two gates can be expressed as:

$$R_t = \sigma\left(W_R x^{(t)} + U_R H^{(t-1)} + b_R\right) \tag{11}$$

$$Z_t = \sigma\left(W_Z x^{(t)} + U_Z H^{(t-1)} + b_Z\right) \tag{12}$$

where $R_t$ determines how much of the previous hidden state is kept in the current candidate hidden state $\tilde{H}^{(t)}$:

$$\tilde{H}^{(t)} = \tanh\left[W x^{(t)} + U\left(R_t \odot H^{(t-1)}\right)\right] \tag{13}$$

and $Z_t$ controls the extent to which the new hidden state $H^{(t)}$ matches the old state $H^{(t-1)}$ and how similar it is to the new candidate state $\tilde{H}^{(t)}$, which can be expressed as:

$$H^{(t)} = Z_t \odot H^{(t-1)} + (1 - Z_t) \odot \tilde{H}^{(t)} \tag{14}$$

Incorporating more sophisticated gate architectures in the standard RNN can effectively alleviate the gradient vanishing or explosion problem, but such structures also lead to the rapid growth of training parameters and excessive memory usage to store the result of each cell as the number of time steps increases. Furthermore, the fact that the prediction of the current hidden state must wait for the completion of its predecessors limits the parallelism of RNNs.

2.2.2 The Temporal Convolutional Neural Network

The temporal convolutional neural network (TCNN) [11], a variant of the convolutional neural network (CNN) which will be introduced in detail later in Sect. 2.3, is another ML model that can be used in time-sequence forecasting problems. Different from the RNN, which uses memory cells to integrate history information, the TCNN extracts the past information by scanning the input data along the time-step direction using filters with the shape of kernel size × depth, in which trainable parameters are stored.

Figure 4 provides a detailed feedforward process in the TCNN. The input x is a 2D array with one dimension representing time steps and the other representing depth (i.e. the number of columns of the input data). To guarantee that only the past and current data are involved, and that the input and output columns of each convolution layer have equal length, the causal convolution (CC) is adopted, in which the front of the input data is padded with zeros along the direction of the time sequence; the number of padded zero rows is equal to kernel size − 1. In the CC procedure, n filters simultaneously compute dot products with the corresponding elements of the input data along the direction of the time step with a fixed stride s and obtain n corresponding output series $(o_1, o_2, \ldots, o_n)$. The obtained output vectors are then flattened to one column and fed into the output neuron to predict the value of the current time step.

As the TCNN filters are independent, it is easy to parallelize the training process across GPU cores [11]. Furthermore, the number of training parameters is solely dependent on the number of filters, which means when the time steps of the
training data increase, there are no extra training parameters introduced. In addition, the transfer learning process is easy to conduct between two TCNN models as long as the number and shape of filters remain constant.

However, the TCNN also has some notable disadvantages. Unlike the LSTM or GRU, in which the history information needed to predict the current time step is adjustable, the TCNN cannot filter out redundant historical information when transferring a model from a domain where only little memory is needed (i.e., small kernel size) to a domain where much longer memory (i.e., large kernel size) is required. In addition, the development of one TCNN normally needs several CC layers, which may need a large memory to store the trained parameters, resulting in a lower training efficiency.

2.3 The Geometry Information-Based Neural Networks

The prediction of the granular mechanical response is traditionally considered a time-sequential problem, and granular information is mainly treated as vectors to develop the ML surrogate model, which ignores the rich spatial features of the granular assembly. The recent development of ML technology also enables researchers to directly extract physical information of particles from figures or graphs to predict the behaviour of granular media. The two most commonly used geometry information-based neural networks are the CNN and the graph neural network (GNN).

2.3.1 The Convolutional Neural Network

The advent of the CNN [57, 167] has revolutionized fields like image recognition [34, 35] and video analysis [8, 83]. Leveraging CNNs, it becomes feasible to predict granular properties by encoding the structural characteristics of particle assemblies into a pixel matrix. Similar to the TCNN, which extracts temporal features with filters, the CNN collects spatial features of figures with filters.

Figure 5 illustrates the whole feedforward process in the CNN. Unlike the TCNN, which requires input data in a sequential time-based format, the CNN regards an image with abundant spatial information as the input data; thus, the time dimension is not required. Instead, the input is represented by its height (h), width (w), and channel (c) dimensions, where c represents the number of input images. Correspondingly, the filter becomes 3D, with its height and width both equal to the kernel size k. Furthermore, unlike the TCNN, where the filter can only move along the direction of the time step, the filter in the CNN scans the input data along both the height and width directions at a fixed stride. It is worth noting that, to keep the same shape between the input and output, zero-padding around the input pixels is still required, similar to the approach used in the TCNN.

Compared to the TCNN, CNNs have some unique features. The first is their interpretability, as CNNs can be visualized to show which part of the input is significant for the prediction. The second notable feature is location invariance. Filters are trained to detect various spatial or shape features, such as edges or corners, enabling CNNs to recognize objects in images regardless of their position.

However, CNNs also have some limitations. For example, CNNs are not suitable for non-grid input data. Furthermore, their performance can be impacted when input images are occluded or contain significant noise. In addition, training one CNN to recognize meaningful patterns normally needs
a large amount of labelled data, which is a great challenge when data is expensive to acquire.

2.3.2 Graph Neural Network

In addition to the CNN, GNNs [120, 140, 176] are also capable of learning the structural information and topological features of graphs, which makes them a powerful tool for capturing the dynamic features of grain assemblies [6, 32].

An example architecture of the GNN is illustrated in Fig. 6. The GNN takes one graph consisting of six nodes as the input. In the input layer, also called the 0th layer, the node feature of each vertex is represented by $x_v$, and all node information $h_v^0$ $(h_1^0, h_2^0, \ldots, h_6^0)$ is transported to GNN layer 1. In GNN layer 1, the node feature is alternately updated and the newly generated output $h_v^1$ $(h_1^1, h_2^1, \ldots, h_6^1)$ is passed to the next GNN layer. This process is repeated until the output layer acquires the prediction result $z_v$, where $z_v = h_v^K$ and $K$ denotes the output layer.

In the GNN, the feature of each node $h_v^k$ is updated by the message passing (MP) process, which aggregates the information received from its adjacent nodes $u$. There are normally three different ways to integrate the message from the neighbouring nodes, i.e. mean, sum, and max. Taking the mean method as an example, the MP procedure of the node $v$ in the $k$th GNN layer can be expressed as:

$$h_v^{k} = f\left(W^{k} \sum_{u \in N(v)} \frac{h_u^{k-1}}{N(v)} + b^{k} h_v^{k-1}\right), \quad \forall k \in \{1, \ldots, K\} \tag{15}$$

where $h_u^{k-1}$ represents the output features, from the $(k-1)$th layer, of the nodes that connect to the vertex $v$. $N(v)$ denotes the number of adjacent nodes of vertex $v$. The term $\sum_{u \in N(v)} h_u^{k-1} / N(v)$ aggregates the messages received by node $v$ from the previous layer by averaging. $W^k$ and $b^k$ are trainable parameters stored on each edge in the $k$th GNN layer. $f$ represents the activation function (e.g. ReLU).

Compared with the MLP, RNNs, and CNNs, the GNN has some distinct advantages. Different from RNNs and CNNs, whose input data needs to be arranged in a certain order, the GNN acquires knowledge by updating the message stored in each node and thus disregards the order of the input data. In addition, the interaction relationship between different objects can be captured by the edges of graphs, whereas networks like the MLP and RNNs cannot explicitly reflect this dependency relationship. Therefore, the GNN is quite suitable for analyzing kinematic features of structural scenarios, e.g. granular systems.

However, it is worth noting that the GNN also confronts many challenges. One obvious limitation is its vulnerability to the modification of graph structure. When nodes and edges are added or removed, the GNN cannot adaptively adjust the network structure. Furthermore, similar to CNNs,
the GNN is also not robust to noise. Introducing slight noise into the graph via node perturbation or addition/deletion of edges can cause an adversarial impact on the output of GNNs. In addition, in graphs consisting of numerous nodes, a large number of training parameters is required to represent relationships between adjacent objects, which results in a low training efficiency of the GNN.

2.4 Summary

The networks used in the ML-aided granular material simulation have their unique architectures, advantages, and shortcomings, which are summarized in Table 1. It is found that all these neural networks excel at solving non-linear or high-dimensional problems. Meanwhile, to enhance their versatility in addressing various complex tasks, neural network architectures tend to become more sophisticated, for example, the evolution of the basic RNN to the LSTM and GRU. The growing structural complexity of neural networks does enhance their ability to approximate or predict more intricate functions but meanwhile increases the number of training parameters, potentially reducing the training efficiency.

Furthermore, different neural networks possess distinct advantages and limitations. For instance, the MLP is particularly useful in approximating one-to-one or many-to-one mappings, while RNNs (including LSTM and GRU) and TCNNs are more adept at time series forecasting tasks, owing to their unique architectures. CNNs and GNNs, in turn, perform well at extracting spatial features from the input data. Rather than looking for a single best architecture, the focus should be on identifying the most suitable neural network for the designated task based on its specific strengths and weaknesses.

In addition to the seven typical neural networks used in ML-assisted granular materials modelling, other neural networks, such as the radial basis function neural network (RBFNN), Bi-LSTM, residual CNN, etc., have also been developed based on the aforementioned typical neural networks. These
Table 1 Features of seven typically used neural networks in the ML-aided granular materials simulation

| Neural network | Advantages | Disadvantages |
|---|---|---|
| MLP | (1) Simple architecture; (2) High training efficiency | (1) Requiring artificially added history variables in time-sequence problems; (2) Gradient vanishing or explosion; (3) Sensitive to noise and irregularities |
| RNN | (1) No extra history variables are needed in sequential prediction; (2) Strong non-linear mapping capability | (1) More complex network architecture than MLP; (2) More training parameters than MLP; (3) Gradient vanishing or explosion; (4) Weak ability to record long-history information; (5) Weak parallelism |
| LSTM | (1) No extra history variables are needed in sequential prediction; (2) Strong non-linear mapping capability; (3) Eliminate the gradient vanishing or explosion by the gate structure; (4) Strong ability to capture long-term dependencies | (1) Increased structural complexity than RNN; (2) More training parameters than RNN; (3) Weak parallelism |
| GRU | (1) Similar to LSTM; (2) Simpler gate structures than LSTM | Similar to LSTM |
| TCNN | (1) No extra history variables are needed in sequential prediction; (2) Strong non-linear mapping capability; (3) Excellent parallelism; (4) Good portability of trained parameters; (5) Suitable for longer history information | (1) Numerous training parameters; (2) Weak ability to filter out the redundant history information |
| CNN | (1) Strong high-dimensional mapping capability; (2) Excellent parallelism; (3) Good portability of trained parameters; (4) Good visualization; (5) Location invariance of input; (6) Strong ability to get rich spatial features | (1) Numerous training parameters; (2) Not available in non-grid input data; (3) Weaker resistance to noise; (4) Requirement for large amounts of labelled data |
| GNN | (1) Available in non-grid input data; (2) Strong ability to get rich spatial features; (3) Strong ability to reflect the relationship of adjacent objects | (1) Numerous training parameters; (2) Vulnerability to the modification of graph structure; (3) Less resistance to noise |
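The "more training parameters" entries for the recurrent networks in Table 1 can be made concrete with a small count. The sketch below assumes that every gate or node of a cell owns one weight block of the same form as Eq. (3), i.e. an input matrix, a recurrent matrix, and a bias vector. Under that assumption a basic RNN cell has one block, an LSTM cell has four (forget gate, input gate, output gate, input node), and a GRU cell has three (reset gate, update gate, candidate state). The helper name `cell_params` and the layer sizes are hypothetical, and the bias in the GRU candidate state is an assumption, since Eq. (13) is written without one.

```python
def cell_params(n_in, n_hidden, n_blocks):
    """Trainable parameters of a recurrent cell built from n_blocks (W, U, b) blocks."""
    per_block = n_hidden * n_in + n_hidden * n_hidden + n_hidden  # W, U and b
    return n_blocks * per_block

n_in, n_hidden = 3, 16            # illustrative sizes only
rnn = cell_params(n_in, n_hidden, n_blocks=1)
lstm = cell_params(n_in, n_hidden, n_blocks=4)
gru = cell_params(n_in, n_hidden, n_blocks=3)
print(rnn, lstm, gru)             # 320 1280 960
```

For the same input and hidden sizes, the LSTM therefore carries four times and the GRU three times the parameters of the basic RNN cell, in line with the qualitative ranking in the table.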
networks inherit the strengths and limitations of their predecessors, but will not be discussed in detail here.

3 The Microscopic Grain Information-Based ML Models

In the field of granular mechanics, the DEM has been very popular for modelling the mechanical behaviour of various granular materials and related engineering problems, such as landslides [109], shear deformation of soil and sand [92, 152], failure of tunnel surfaces [184], and fluidized beds [111]. While these methods can reflect the discrete nature of granular media at the grain scale to a certain degree, they typically suffer from intensive computational costs. The advent of deep learning methods offers a potential way to alleviate the computational burden by integrating ML models with these conventional (microscopic) grain/particle-based techniques. In this section, a concise review of relevant studies in this area is presented. Before that, a brief introduction to the basic framework of the DEM is given.

3.1 The Basic Framework of the Discrete Element Method

In contrast to the continuum approach, the DEM represents granular material as an assembly of distinct particle entities, and the overall (macroscopic) behaviour of the system is governed by inter-particle contacts over time, making it well suited to large-deformation problems.

Figure 7 illustrates the fundamental computational steps of the DEM, encompassing the following stages:

(a) System initialization: this step involves 1) assigning material properties $\lambda$ (e.g. mass $m$ and friction coefficient $\mu$) to particles within the current system; and 2) initializing the geometry features $X^{(t)}$ (e.g. shape, size, and positions) [36, 51, 94, 106, 126] and state parameters $\boldsymbol{\chi}^{(t)}$ (e.g. velocity $v^{(t)}$, angular velocity $w^{(t)}$, acceleration $a^{(t)}$, and angular acceleration $\boldsymbol{\alpha}^{(t)}$) of each particle.
(b) Contact detection and resolution: pairs of particles that are in contact are identified through collision detection algorithms [49]. Subsequently, the contact features (e.g. contact normal $n$, tangent $t$, and corresponding inter-particle overlaps $\delta_n$, $\Delta\delta_t$) between each pair are computed using collision resolution algorithms.
(c) Contact force and torque calculation: utilizing the acquired contact features and the contact model [39, 76, 85], the resultant contact force $F^{(t)}$ and contact torque $M^{(t)}$, which respectively govern the translational and rotational motion of particles, are calculated.
(d) State variable update: the acceleration $a^{(t+\Delta t)}$ and the angular acceleration $\boldsymbol{\alpha}^{(t+\Delta t)}$ are respectively updated with Newton’s second law using the obtained contact force and torque. This update process further refreshes the state variables, including both $v^{(t+\Delta t)}$ and $w^{(t+\Delta t)}$.
(e) Position update: the geometry information of the particles (i.e. their location) is updated based on the newly obtained state variables. After the particle positions are updated within the current time step, a new iteration begins.

3.2 The ML-Aided Discrete Element Modeling

In the traditional DEM calculation process, contact detection and resolution are the most computationally intensive steps [95, 171]. Therefore, leveraging the superior computational efficiency inherent in ML models, the development of ML-based models for contact detection and resolution shows significant potential in accelerating the calculation process of the DEM.

Within the ML-enhanced DEM framework, the contact detection and resolution processes are achieved through a classification and a regression neural network, respectively. As shown in Fig. 8 and Fig. 9, the key difference between these two networks is their output layers. In the classification network, the output of the final layer is 0 or 1, indicating the contact status of two particles (one considered as an object grain and the other as a cue particle). The regression network, by contrast, outputs contact features such as contact
point, normal, and overlap between two particles given their requisite geometric features.

Lai et al. [95] extracted the contact status and contact features of particles with elliptical and arbitrary shapes by a classification network and a regression network, respectively. Both networks take the same parameters as input, including the shape parameters of the object grain and the size, shape, and position parameters of the cue particles. The two trained networks are then embedded into the DEM algorithm to model cases such as random packing, oedometric compression, angle of repose, and the packing and compression of arbitrarily irregular-shaped particles. Similarly, Hwang et al. [81] used one classification and one regression ANN to respectively predict the contact states (i.e. detached or intersected) and contact features of two identical irregular particles, including the mean contact point, averaged normal vector of overlapped vertices, and inter-penetration depth. In their work, to train these two networks, the labelled contact properties of two particles were computed by the deepest point method according to their relative position and orientation, which were fed into the ML models as the input.

3.3 The ML-Based Grain-Level Kinematic Features Simulations

The evolution of the microstructure in granular materials, influenced by the kinematic features of particles, significantly affects the mechanical behaviour of the material. Therefore, many researchers have endeavoured to develop ML models that can directly capture the particle motion laws without calculating the contact forces using empirical or analytical contact models.

Based on the ML method, both the collision law of sparse objects and the deformation of dense grain media have been widely investigated. For instance, in the work of Katerina et al. [53], the trajectory of a billiard ball is predicted via a CNN which takes the current and previous image information of the system as input and outputs the velocity of the goal object in future time steps. Meanwhile, a novel object-centric prediction method is used to enhance the translational invariance of the physical laws acquired by the ML model. Wu et al. [175] integrated the ML model into a physical engine to perceive dynamic features of objects in the future time step according to the received physical information (e.g. position, friction coefficient, shape) of objects at previous time steps. Battaglia et al. [18] embedded the physical state of the system at the previous step into one graph to construct an interaction network which can reason about how objects in a complicated system interact and collide in the future step. Based on the ML method, Chang et al. [28] built a Neural Physics Engine (NPE) which takes the past pairwise velocities of both the goal object and its adjacent objects as input to predict the velocity of objects at the next time step in the system.

On the other hand, ML models trained with the data generated by DEM simulations have been employed to accelerate DEM simulations by replacing the contact models with ML models. In the work of
Ummenhofer et al. [156] and Lu et al. [111], a CNN that can predict the inter-particle collision laws was constructed with continuous filters to substitute the direct computation of particle-particle/boundary collisions in DEM modelling and thereby accelerate the simulation of grain flow. Li et al. [102] designed a GNN that takes the static microstructure of a grain packing acquired from DEM simulation as input to predict the contact forces of the grain system under the uniaxial compression condition. Cheng et al. [30] used the particle and contact kinematical data obtained from a DEM simulator to train a GNN which can estimate the contact forces of granular assemblies under the uniaxial compression condition. Bapst et al. [14] trained a GNN with solely the initial position information of the grain system to represent the long-term evolution of a glassy system under the shear condition. Mayr et al. [119] developed a Boundary-Graph Neural Network (BGNN) to model the interaction of particles with complex boundary conditions. Virtual nodes are inserted into the graph dynamically to represent the boundary surface regions within the cutoff radius of particles and have features encoding the orientation of the triangles representing the boundary. The authors successfully replicated the simulation of 3D granular flows through hoppers, rotating drums, and mixers. Kumar et al. [90] hypothesized that the GNN messages encode latent representations that preserve the underlying interaction laws between particles in a DEM simulation. The sparse representation of the GNN messages $e_k \leftarrow \phi(e_k, v_{r_k}, v_{s_k}, u)$ is a learned linear combination of the true forces. To learn a maximally sparse message representation, the authors sort the message vector components by standard deviation and enforce an L1 regularization, forcing the GNN to describe the messages in a minimal vector space. With this approach, they successfully recover the fundamental linear spring interaction law $F_n = k_n \cdot |\Delta x - r_i - r_j|$ based only on the kinematics of the DEM particles.
We can see that most kinematic features-based ML models take the position and velocity of individual grains at the previous and current time steps as input and directly output the acceleration of the corresponding particle in the next step to update the current grain state of the whole system, which bypasses the contact force and torque calculation, thereby significantly improving the computational efficiency. However, the existing research indicates that the majority of the obtained models are trained using data collected from granular systems composed of circular or spherical particles, and thus ignore the influence of particle shape on the grain trajectory.
3.4 Summary
This section summarizes the work related to the grain information-based ML models from two aspects. The first is developing the ML-based contact model based on contact state and contact geometric features in particle-based numerical methods, which has received comparatively less attention from researchers. The second aspect is leveraging appropriate ML models (e.g. CNN or GNN) to predict the kinematic features for both sparse and dense grain systems at the microscopic scale. Both directions show promise in reducing the intensive computation cost in particle/grain-based numerical techniques. A summary of the advantages and limitations of ML-based microscopic grain information models is provided in Table 2.
Compared to the traditional contact models, ML-based contact models offer one significant advantage by directly outputting contact features and relationships between two particles given their positions, bypassing the traditional contact detection and resolution process, which accelerates the overall computational process of the DEM. Furthermore,
the ML-based contact models have a strong capability to account for the influence of particle shape, as the trained ML model can instantly provide contact features when given the necessary geometric information of particles, regardless of their shape. This represents an improvement over traditional contact detection methods, which often simplify grains into spheres (3D) or circles (2D) to improve computational efficiency but overlook the influence of the rotation resistance of particles. Additionally, although more advanced contact detection algorithms and contact models with rigorous theoretical foundations have been developed, their implementation remains challenging for engineers and researchers unfamiliar with computational geometry [47, 48].
In ML-aided grain-level kinematic feature simulations, the fine-tuned geometry information networks can simultaneously predict the acceleration of all grains based on their positions from the previous time step. This approach eliminates the need for the iterative calculation of particle accelerations typically required in traditional discrete element modeling, thereby significantly enhancing simulation efficiency.
However, it is also important to acknowledge the limitations associated with each aspect. The development of ML-based contact models is primarily hindered by the completeness and quality of the training data, including contact geometry information. The contact state and contact features are directly related to the relative position and geometric shape of two particles. To create an ML model that can accurately predict contact information for any contact position between two particles, the training data must encompass a wide range of contact conditions, which typically requires significant time and effort to generate, especially for grains with complex shapes. In addition, the training data generation relies on the theoretical contact models used. For example, the definitions of contact geometry features, e.g. the overlap distance and the direction of the contact normal, are empirical and can vary between contact models. This disparity further limits the development of ML-based contact models.
On the other hand, when employing GNNs or CNNs to predict the rollouts of particle systems, error accumulation becomes unavoidable. This occurs because the dynamic update at the current time step depends on the particle positions from the previous step. Furthermore, apart from the position and translational velocity, particle rotation and angular velocity are also degrees of freedom of an individual particle/grain of granular materials. It seems that current kinematic features-based ML models are unable to reflect these fundamental physics. Additionally, it should be noted that geometry information-based networks, such as GNNs, are not well-suited for systems in which the number of particles changes during the simulation. This limitation arises from the inherent structure of graph-based networks, which require a fixed number of nodes to maintain consistency throughout the simulation.
In addition to the aforementioned discussion, several promising avenues for future research are also proposed in this section, aiming to address current limitations and enhance existing methodologies:
(1) Enhancing training data for ML-based contact models: The critical challenge in developing one universal ML-based contact model is the quality and completeness of the training data. Consequently, future research could focus on developing new methodologies to define contact features beyond traditional empirical models, thereby minimizing inherent assumptions in the training datasets and improving the overall data quality. Additionally, efforts could be directed toward generating more comprehensive and diverse training datasets to enhance the generalization capability of ML-based contact models across a wider range of contact scenarios.
(2) Mitigating error accumulation in particle system rollouts: Error accumulation is a key limitation when utilising geometry information-based networks, particularly in long-term simulations of particle systems. Therefore, future works could explore error correction mechanisms, such as active learning, to dynamically adjust the model prediction during simulations to eliminate error propagation.
(3) Incorporating rotational dynamics into ML-based kinematic feature models: Current ML-based models for grain-level kinematic feature simulations focus primarily on positions and translational velocities, ignoring rotational dynamics. Future work could focus on integrating rotational degrees of freedom, such as particle rotation and angular velocity, into ML models, which may involve extending current CNN or GNN architectures to account for these additional physical factors.
4 The ML‑Based Constitutive Models of Granular Materials
The development of the ML-based constitutive model of granular materials can be traced back to the 1990s [60, 124, 143], and it is undeniably one of the most prominent subjects in ML-assisted numerical methods. The construction of ML surrogate stress–strain models for grain media depends on numerous factors, such as the hyperparameters, optimization algorithm, loss function, etc., used in neural networks [190]. However, the two fundamental factors that govern the performance of the ML-based constitutive model are the data resource and the feature selection of input–output corresponding to the used networks.
Table 3 The development of ML-based constitutive models for granular materials
Material | Reference | Experiment | Loading | Drained | Data source | Neural network
Sand | [143] | Triaxial | M | D | Experiment | MLP
Sand | [60] | Triaxial | M | D+U | Experiment | MLP
Soil | [195] | Triaxial | M | D | Experiment | MLP
Sand | [124] | Triaxial | M | D | Experiment | MLP
– | [17] | / | C | / | Synthetic data and experiment | MLP
Coarse sand | [136] | Triaxial | M | | Experiment | RNN
– | [16] | / | C | / | Synthetic data | MLP
Lateritic gravel | [70] | Triaxial | M | D | Experiment | MLP
Sand | [13] | Triaxial | M | U | Experiment | MLP
Ballast | [142] | Triaxial | M | D | Experiment | MLP
– | [55] | Triaxial | M | U | Synthetic data (MCC) | MLP
Sand | [73] | Triaxial | M | D | Experiment | MLP
Soil | [74] | Triaxial | M | U | Experiment | MLP
Lateritic gravel | [84] | Triaxial | M | D | Experiment | MLP
Soil | [114] | Triaxial | M | U | Experiment | MLP
Sand | [141] | Direct shear | M | / | Experiment | MLP
Rockfill | [7] | Triaxial | M | D | Experiment | MLP
Sand | [134] | Triaxial | M | D | Experiment | MLP
– | [146] | Triaxial | M | D | Synthetic data (HS) | MLP
Sand | [88] | Triaxial | M | D | Experiment | MLP
– | [100] | / | M | / | Synthetic data (DEM) | MLP
– | [161] | Simple shear | C | / | Synthetic data (DEM) | LSTM
/ | [160] | Tension-shear | C | / | Synthetic data (DEM) | GRU
– | [188] | / | M | / | Synthetic data (MCC) | LSTM
– | [192] | Triaxial | C | D+U | Synthetic data (EM) | LSTM
– | [128] | Triaxial | C | / | Synthetic data (DEM) | GRU
– | [163] | Triaxial | C | / | Synthetic data (DEM) | TCNN
– | [115] | Triaxial | M | / | Synthetic data (DEM) | LSTM
Table 3 offers a summary of part of the work on ML-based constitutive models of granular materials over the past decades. There are mainly two types of data resources used when developing the ML constitutive models of granular materials. One is the experiment data, and the other is synthetic data.
The experiment data, which directly reflects the stress–strain response of granular materials, implicitly encapsulates the most authentic constitutive laws without any assumption, and thus the ML models developed from the experiment data can reveal the most essential mechanical features of granular materials. As listed in Table 3, the mechanical response of
different granular materials, such as soil, sand, clay, ballast, and rockfill, has been investigated by different neural networks, where most ML models focus on the mechanical behaviour of granular materials under the drained and undrained triaxial test, while the remaining studies are dedicated to developing ML surrogate models which can represent the mechanical features of granular media under direct shearing [141], simple shearing [161], and tension-shear [160].
While the experiment data can provide reliable inputs for neural networks to extract underlying principles governing the behaviour of materials, the limitations of the experiment data should also be taken into consideration. The training of neural networks normally requires a sufficient amount of data samples, making a purely experimental data-driven approach expensive. Additionally, restricted by the experimental facility, most experiment data used for training ML models are generated under specific shearing or triaxial compression conditions, covering only a partial range of stress–strain space and material types, and thus the robustness of trained ML models is limited.
4.1.2 The Synthetic Data
Synthetic data can be a cost-effective alternative to experimental data for developing ML-based constitutive models for granular materials, as it is not bound by experimental constraints and can span a wider range of stress–strain space. As demonstrated in Table 3, the synthetic data can generally be acquired in two ways. The first is from phenomenological constitutive models, such as the critical state-based models [180, 183] and the deviatoric hardening model [123, 127], and the other is from particle-based numerical techniques, such as the discrete element method (DEM).
Given these advantages, it is possible to generate extensive amounts of synthetic data encompassing various materials and a more extensive stress–strain space to establish more robust machine learning models. Ma et al. [115] provide an example of this, where one ML model capable of simulating the stress–strain response of granular materials with different particle size distributions (PSDs) and initial void ratios ($e_0$) under random loading paths was obtained through DEM-generated data. In addition, the development of the synthetic data-based ML model can also provide prior knowledge for constructing experiment data-based machine learning models. In reference [16], several mapping methods based on synthetic data were compared before selecting the true sequential dynamic mapping method for simulating the cyclic behaviour of soils with experiment data.
While ML models derived from synthetic data can capture the fundamental mechanical characteristics of granular materials under specific loading paths, they are limited in uncovering deeper constitutive laws, since synthetic data are generated under some assumptions (e.g. the homogenization in theoretical models) and simplifications (e.g. the simplified shape of grains in DEM), which results in the loss of some intrinsic physical information of granular materials. However, there is no doubt that synthetic data could be a cost-effective supplement to experimental data in the development of ML-based constitutive models.
4.2 The Training Strategy for Different ML‑Based Constitutive Models
4.2.1 The History‑Dependent Features of ML‑Based Constitutive Models
Under the quasi-static condition, as shown in Fig. 10a, the relationship between the strain tensor $\boldsymbol{\varepsilon}^{(t)}$ and stress tensor
$\boldsymbol{\sigma}^{(t)}$ of each time step in the monotonic loading cases keeps the one-to-one mapping, which can be expressed as:
$$\boldsymbol{\sigma}^{(t)} = f\left(\boldsymbol{\varepsilon}^{(t)}, W, b\right) \tag{16}$$
However, when subjected to loading reversal, as demonstrated in Fig. 10b, a reversal loading may result in two identical strains (e.g. $\varepsilon_2 = \varepsilon_4$) corresponding to different stress states, and therefore, it is necessary to introduce the history variables to differentiate the loading state (i.e. the unloading and reloading).
Different from the traditional theoretical constitutive models, in which loading states can be explicitly formulated, ML models depend on other methods to differentiate the loading history under cyclic loading conditions. Time-sequence neural networks draw history information from their input data consisting of discrete strain–stress data sequences:
$$\hat{\boldsymbol{\sigma}}^{(t)} = f^{ML}\left(\left\{\boldsymbol{\varepsilon}^{(t-n)}, \ldots, \boldsymbol{\varepsilon}^{(t-2)}, \boldsymbol{\varepsilon}^{(t-1)}, \boldsymbol{\varepsilon}^{(t)}\right\}, W, b\right) \tag{17}$$
For instance, RNNs learn loading history by recording the hidden state encoded in the input data with RNN neurons, while TCNN identifies different loading states by the causal convolution calculation with trainable filters.
However, single-step-based neural networks like the MLP cannot distinguish the loading history by themselves. Therefore, they need to introduce artificially added history/internal variables as extra input to transform the one-to-many mapping into a surjective mapping between the input and output variables, which can be expressed as:
$$\hat{\boldsymbol{\sigma}}^{(t)} = f^{MLP}\left(\boldsymbol{\varepsilon}^{(t)}, \boldsymbol{\varphi}^{(t)}, W, b\right) \tag{18}$$
where $\boldsymbol{\sigma}^{(t)}$ and $\boldsymbol{\varepsilon}^{(t)}$ represent the current stress and strain states of granular materials, $\boldsymbol{\varphi}^{(t)}$ means the history variables and consists of $\{\varphi_1^{(t)}, \ldots, \varphi_m^{(t)}\}$; $W$ and $b$ denote the trainable weights and biases.
The variables used as the input and output of the data-driven model vary according to specific problems. As listed in Table 3, ML models are constructed with single-step-based or time-sequence neural networks to simulate the behaviour of granular materials with cyclic or monotonic loading data. Thanks to their specific architectures, time-sequence neural networks like the LSTM, GRU, and TCNN inherently acquire historical information from their input data, which comprises a sequence of discrete data [115, 128, 163]. Thus their input variables always consist of material and state parameters, whether in monotonic or cyclic loading.
In single-step-based networks such as MLP, to consider the influence of the loading history on the constitutive relationship, there are generally two approaches to selecting history variables. The first one is regarding the predicted state parameters of the last time step as the history variables to predict the current state, as in works [16, 73, 143, 146]. As illustrated in Fig. 11a, the current predicted stress $\hat{\boldsymbol{\sigma}}^{(t)}$ relies not only on the current state variable strain $\boldsymbol{\varepsilon}^{(t)}$ but also on the strain $\boldsymbol{\varepsilon}^{(t-1)}$ and the predicted stress $\hat{\boldsymbol{\sigma}}^{(t-1)}$ at the $(t-1)$th time step. The process can be formulated as:
$$\hat{\boldsymbol{\sigma}}^{(t)} = f^{ML}\left(\boldsymbol{\lambda}, g\left(\boldsymbol{\varepsilon}^{(t)}, \boldsymbol{\sigma}^{(t)}\right), \varphi\left\{\boldsymbol{\varepsilon}^{(t-1)}, \hat{\boldsymbol{\sigma}}^{(t-1)}\right\}, W, b\right) \tag{19}$$
where $\boldsymbol{\lambda}$ represents the vector consisting of material parameters. However, one inevitable issue is that any error in the stress predicted by the neural network at the previous step can lead to error accumulation in the ML system.
As demonstrated in Fig. 11b, an alternative scheme is directly extracting history variables $\boldsymbol{\varphi}^{(t)}$ from the state variable of the current step, such as using the absolute accumulated strain increment $\Delta\boldsymbol{\varepsilon}^{(t)}$ of the current time step as the history variable [66, 80], which can be computed as:
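As a minimal sketch of such a history variable, assuming it is the running sum of absolute strain increments along a scalar strain path (the function name `accumulated_abs_strain` and the sample path are illustrative and not taken from the cited works), the following shows how two identical strains before and after a load reversal receive different history values:

```python
from itertools import accumulate


def accumulated_abs_strain(strain_path):
    """Running sum of |eps(i) - eps(i-1)| along a scalar strain path.

    Assumption: the history variable is the accumulated absolute strain
    increment; the first step has no predecessor, so it starts at zero.
    """
    if not strain_path:
        return []
    increments = [0.0] + [abs(b - a) for a, b in zip(strain_path, strain_path[1:])]
    return list(accumulate(increments))


# Illustrative path: load, unload, reload (values chosen to be exact in binary).
strain_path = [0.0, 0.25, 0.5, 0.25, 0.5]
history = accumulated_abs_strain(strain_path)

# Steps 2 and 4 share the same strain but differ in the history variable,
# so the pair (strain, history) is unique at every step of this path.
pairs = list(zip(strain_path, history))
```

Because the accumulated value never decreases, feeding it alongside the current strain gives a single-step network an input that distinguishes unloading and reloading states without relying on a previously predicted stress.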
Fig. 11 Two different methods for the selection of history variables in the MLP
$$\hat{\boldsymbol{\sigma}}^{(t)} = f^{MLP1}\left(\boldsymbol{\varepsilon}^{(t)}, \left\{\boldsymbol{\varepsilon}^{(t-1)}, \hat{\boldsymbol{\sigma}}^{(t-1)}\right\}, W, b\right) \tag{21}$$
$$\hat{\boldsymbol{\sigma}}^{(t)} = f^{MLP2}\left(\boldsymbol{\varepsilon}^{(t)}, \Delta\boldsymbol{\varepsilon}^{(t)}, W, b\right) \tag{22}$$
Furthermore, one of the time-sequence neural networks (i.e. GRU) is also trained and tested with the same data. As a time-sequential network, the history-loading information is encapsulated in the input strain sequence $\{\boldsymbol{\varepsilon}^{(t-n+1)}, \boldsymbol{\varepsilon}^{(t-n+2)}, \ldots, \boldsymbol{\varepsilon}^{(t)}\}$ of GRU and can be identified by its RNN cells. Therefore, there is no need for the assistance of the extra history variable to learn the constitutive laws of granular materials, which can be expressed as:
$$\hat{\boldsymbol{\sigma}}^{(t)} = f^{GRU}\left(\left\{\boldsymbol{\varepsilon}^{(t-n+1)}, \boldsymbol{\varepsilon}^{(t-n+2)}, \ldots, \boldsymbol{\varepsilon}^{(t)}\right\}, W, b\right) \tag{23}$$
where $n$ represents the number of time steps in the input strain sequence. The hyperparameters used in GRU and MLP are listed in Table 4.
Figure 13 gives an intuitive demonstration of the performance of the trained MLP1 by six representative prediction cases. In CTC and CP loading cases, in which one reversal loading is included, although the MLP1 can capture the constitutive laws of granular materials in some cases (CTC case 2), an error accumulation phenomenon as mentioned in Sect. 4.2.2 occurs, especially in the reloading stage, see CTC case 1, CP case 1, and CP case 2. Furthermore, it is also found that uncontrollable accumulated errors can eventually result in the breakdown of the entire prediction system (see CP case 2), while controllable accumulated errors can be gradually diminished (see CP case 1) because the relationship between the strain and stress changes from a one-to-many mapping to a surjective mapping after the reversal loading. Additionally, the trained MLP1 performs well in all CB cases, as the relationship between stress and strain remains strictly surjective in these cases, and thus the selection of the history variables has less influence on the performance of the MLP1.
The identical cases in Fig. 13 are also employed to test the trained MLP2 and GRU. The obtained results are showcased in Fig. 14, and the predicted average mean absolute error (MAE) of these two models for the 80 test cases is listed in Table 5. The outcomes, as illustrated in Fig. 14 and detailed in Table 5, reveal that the MLP can achieve comparable performance to the GRU using the parameterized history variables. Furthermore, a comparison between the prediction results in Fig. 14 and their counterparts in Fig. 13 indicates that the use of parameterized history variables effectively mitigates the problem of error accumulation.
From the perspective of engineering application, the strain intervals are normally different and random. Therefore, the sensitivity analysis of different neural networks to the loading strain interval is also conducted in this work. The strain intervals of the 80 test cases are randomized, and the trained MLP2 and GRU are tested again with the newly generated test data. The prediction results of the same cases in Fig. 13 and Fig. 14 are plotted in Fig. 15. The results illustrate that the single-step-based MLP remains effective in capturing the stress–strain response despite variations in the loading interval. In contrast, the time-sequential neural network GRU exhibits notable sensitivity to changes in the loading interval, resulting in poorer performance under such conditions.
4.4 Summary
This section provides an overview of ML-based constitutive models for granular materials, focusing on two key aspects: the sources of training data and the training strategies employed by different neural networks, and a concise summary of the advantages and limitations of each aspect is provided in Table 6. In addition, Sect. 4.3 presents a detailed case study that illustrates the predictive capabilities of
various neural networks under different loading conditions. The following conclusions can be drawn from the analysis:
The datasets used to train ML-based constitutive models can be categorized into two types: experimental data and synthetic data. Experimental data are normally generated under conditions that closely resemble real-world loading scenarios. However, these datasets are typically costly and time-consuming to collect, resulting in limited coverage of the sampling space. Additionally, experimental data often contain noise introduced by measurement devices, which can adversely affect the training of neural networks.
Table 5 The prediction results of all test cases with the GRU and MLP
Loading paths | ML models | Average MAE
CB | GRU | 0.0140
CB | MLP2 | 0.0159
CP | GRU | 0.0174
CP | MLP2 | 0.0183
CTC | GRU | 0.0168
CTC | MLP2 | 0.0185
In contrast, the data obtained from existing phenomenological constitutive models are more cost-effective, enabling the generation of diverse and extensive datasets that cover a wide range of loading scenarios. This diversity can help produce more general neural network models. However, their fidelity fully depends on the selected constitutive models, which may have restrictions in reproducing certain characteristics of granular materials. To the best knowledge of the authors, no constitutive model is perfect. For the data obtained from DEM simulations, reliability is also a problem that needs attention. Apart from the accuracy of the method itself, the microscopic parameters of the grains/particles are often selected in an ad-hoc manner and calibrated primarily against certain cases, which may not be applicable to a generic problem.
On the other hand, the constitutive law of granular materials is inherently state and history-dependent, and both single-step and time-sequence (multi-step) neural networks can be employed to predict such constitutive behavior. Single-step-based neural networks, such as MLPs, offer the advantage of a simple architecture, which typically leads to high training and prediction efficiency. However, these models lack the inherent ability to capture
loading history, necessitating the inclusion of artificially added internal or history variables. There are generally two approaches for incorporating history variables in MLPs: (a) state parameters predicted at the previous time step, which can lead to an accumulation of errors over time, and (b) variables extracted directly from the current state parameters, which aim to provide more immediate historical context without the propagation of previous errors.
Due to their unique architecture, time-sequence neural networks are inherently capable of distinguishing different loading histories without requiring additional assistance. This architecture enhances their ability to capture long-term historical dependencies and perform complex non-linear mappings. However, the increased complexity of these networks, compared to single-step-based models, results in more training parameters. Consequently, time-sequence neural networks are more challenging to tune and exhibit lower training and prediction efficiency. Additionally, according to the prediction results showcased in Sect. 4.3, it is also found that time-sequential networks are sensitive to the change of strain intervals, while the single-step-based network still performs well under the same loading cases, indicating that time-sequential networks are more suitable for data with a fixed loading step.
Fig. 15 The prediction results of GRU and MLP2 in cases with different loading intervals
Table 6 A concise summary of the ML-based constitutive models for granular materials
ML-based constitutive models | Types | Advantages | Disadvantages
Data sources used | Experiment data | (1) Reflecting the most authentic and realistic constitutive laws of granular materials; (2) Representative of real-world loading conditions | (1) High cost and time-consuming; (2) Incorporating noise; (3) Incomplete datasets
Data sources used | Synthetic data | (1) Easily generated in large quantities; (2) Wide coverage of different loading scenarios | (1) Smearing out some intrinsic features of granular materials; (2) Lack of authenticity of experimental data
Neural network type | Single-step-based | (1) Simple architecture; (2) High training and prediction efficiency; (3) Robust to the change of loading step | (1) Requirement for artificially added internal/history variables; (2) Error accumulation problem
Neural network type | Multi-step-based | (1) No extra history variables are required in identifying the loading history; (2) Stronger non-linear mapping capability than single-step-based networks | (1) Complex structure; (2) Lower training efficiency; (3) Sensitive to the change of the loading interval
Based on the discussion of ML-based constitutive models of granular materials, some promising avenues for future research, particularly from the perspectives of transfer learning and optimization of the training strategy, could be potential ways to solve the challenges mentioned above:
(1) Application of transfer learning in ML-based constitutive modeling: Given the high cost and limited availability of experimental data, transfer learning offers a promising approach to enhance the generalization capabilities of ML-based constitutive models. By leveraging knowledge from models pre-trained on synthetic datasets, transfer learning can fine-tune these models using smaller amounts of high-fidelity experimental data, which has the potential to improve the accuracy and authenticity of ML-based constitutive models across a broader range of loading conditions.
(2) Optimization of training strategies: Single-step-based neural networks are inherently limited in their ability to capture loading history, necessitating the development of parameterization schemes for internal variables which aim to represent loading history using the fewest physically meaningful variables possible. Additionally, future research could focus on improving the performance of time-sequence neural networks under varying loading intervals. Approaches such as interpolation or data augmentation may serve as effective strategies to enhance the resilience of these networks to non-uniform loading intervals, thereby broadening their applicability in real-world scenarios.
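The interpolation idea in item (2) can be sketched as a preprocessing step. The example below is only an illustration under stated assumptions, not a method from the reviewed works: it takes a scalar strain–stress record with non-uniform strain intervals and linearly interpolates it onto equal increments of accumulated absolute strain, so that a time-sequence network trained with a fixed loading step sees evenly spaced inputs. The function name `resample_uniform` and the sample values are hypothetical.

```python
from bisect import bisect_right


def resample_uniform(strain, stress, n_points):
    """Linearly resample a loading record onto equal steps of accumulated |d_strain|.

    Assumptions: scalar strain/stress lists of equal length (>= 2) and n_points >= 2.
    The accumulated absolute strain is used as the interpolation coordinate
    because it never decreases, which keeps load reversals in their order.
    """
    if len(strain) != len(stress) or len(strain) < 2 or n_points < 2:
        raise ValueError("need two equally long records and n_points >= 2")

    # Monotone coordinate: running sum of absolute strain increments.
    path = [0.0]
    for a, b in zip(strain, strain[1:]):
        path.append(path[-1] + abs(b - a))

    strain_u, stress_u = [], []
    for k in range(n_points):
        target = min(path[-1] * k / (n_points - 1), path[-1])
        # Segment [i, j] of the original record that contains the target.
        j = min(bisect_right(path, target), len(path) - 1)
        i = j - 1
        span = path[j] - path[i]
        w = 0.0 if span == 0.0 else (target - path[i]) / span
        strain_u.append(strain[i] + w * (strain[j] - strain[i]))
        stress_u.append(stress[i] + w * (stress[j] - stress[i]))
    return strain_u, stress_u


# Illustrative record with uneven strain intervals (1, 3, 2) and one reversal.
strain = [0.0, 1.0, 4.0, 2.0]
stress = [0.0, 10.0, 25.0, 5.0]
strain_u, stress_u = resample_uniform(strain, stress, n_points=7)
```

Linear interpolation is the simplest choice here; whether it is accurate enough near a load reversal, or whether training on randomised intervals (data augmentation) is preferable, is a question the sketch does not address.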
mation of each SPH particle can be iteratively computed by forces (e.g. body forces, viscosity forces, etc.) acting on them.
Relying on macroscopic particle kinematic features to govern the deformation of the target domain allows MPM and SPH to effectively avoid mesh entanglement and distortion, showcasing notable superiority in handling large-strain issues. However, updating the dynamic features for each particle is computationally expensive, particularly in simulations involving a substantial number of particles.
5.1.2 The Framework of Macroscopic Kinematic Features‑Based ML Models
The advancement of data-driven simulators in the past two decades [145] offers one promising way to solve the above-mentioned problem in MPM and SPH methods. Given one specific position, ML-based simulators can output the necessary dynamic features simultaneously, making them far superior in computational efficiency [75] by bypassing the complex physical solution process.
In MPM and SPH methods, the system consists of an unordered set of macroscopic interpolation particles with rich interaction relations with one another. These particle features and particle-particle interactions can be naturally expressed with a graph that can be modelled by graph network simulators (GNS), but are not convenient to transfer into the regular sequences or vectors which single-step networks, time-sequential networks [169], and specific tree networks [91] can extract knowledge from.
GNS operates on graphs to learn the physics of the dynamic system and predict rollouts. The graph network spans the system domain with nodes representing a collection of particles and the links connecting the nodes representing the local interaction between the material points. As shown in Fig. 18, GNS ($S_\theta$) consists of two parts: one is the learnable approximator $d_\Theta$, and the other is the updater. The current physical state of each particle in MPM or SPH simulations is represented as a set of $x_i^t \in X^t$ in $S_\theta$, where $x_i^t$ is a vector, normally consisting of the position $p_i^t$, velocity $\dot{p}_i^t$, boundary information $b_i^t$, and type $f_i$ (i.e. deformable or rigid) of each material or SPH point, which can be expressed as:
$$x_i^t = \left[p_i^t, \left\{\dot{p}_i^{t-n+1}, \dot{p}_i^{t-n+2}, \ldots, \dot{p}_i^t\right\}, b_i^t, f_i\right] \tag{25}$$
where $n$ represents the number of time steps in the velocity history. Instead of the physical solving process, the approximator $d_\Theta$ takes the current state of the particle system $X^t$ as input to predict its dynamic feature $Y^t$. The updater then computes the physical state $X^{t+1}$ at the next time step with $X^t$ and the predicted $Y^t$.
The approximator $d_\Theta$ comprises three parts, including Encoder, Processor, and Decoder as demonstrated in Fig. 18. In the prediction process of $d_\Theta$, the Encoder is responsible for embedding $X^t$ into the initial latent graph $G_0(V_0, E_0)$ by assigning the node to each particle (i.e. node encoder $\delta_\Theta^v$) and adding edges $r_{i,j}^t$ between particles within
Machine Learning Aided Modeling of Granular Materials: A Review
one interaction radius $R$ (i.e. edge encoder $\delta_\Theta^e$), which is formulated as:

$$ v_i^t = \delta_\Theta^v\left(x_i^t\right) \tag{26} $$

$$ e_{i,j}^t = \delta_\Theta^e\left(r_{i,j}^t\right) \tag{27} $$

where $v_i^t \in V_0$ represents the node state in the graph and $e_{i,j}^t \in E_0$ represents the pair-wise relationship between nodes in the graph. Then, the Processor, consisting of $M$ GNN layers, performs $M$ rounds of message passing on $G_0$. Finally, the Decoder outputs the dynamic features of the system $y_i^t \in Y^t$ from the final graph $G_M$ of the Processor to update the velocity $\dot{p}_i^{t+1}$ and position $p_i^{t+1}$ of each node by:

$$ \dot{p}_i^{t+1} = \dot{p}_i^{t} + y_i^t \Delta t \tag{28} $$

$$ p_i^{t+1} = p_i^{t} + \dot{p}_i^{t+1} \Delta t \tag{29} $$

So far, the application of GNS in particle-based numerical methods (e.g. SPH and MPM) is still in the nascent stage. For example, ML models trained with the data generated from MPM simulations have been used to effectively predict the macroscale deformation of granular systems. Sanchez-Gonzalez et al. [139] developed a graph network-based simulator (GNS) via the ‘encode-process-decode’ scheme to model the dynamic features of physical systems of different materials, including sand and goop, using the data generated from both MPM and SPH simulations. Aoyama et al. [6] constructed one GNN with the data generated from the MPM simulator to predict the stacked shape of the grain collection within different containers. Choi and Kumar [32] introduced physics-inspired inductive biases, such as an inertial frame that allows learning algorithms to prioritize one solution (constant gravitational acceleration) over another, reducing learning time. The GNS implementation uses semi-implicit Euler integration to update the next state based on the predicted accelerations. The GNS was trained on 26 CB-Geo MPM simulations of granular flow trajectories, each with 2500 particles and 400 time steps. At each time step, a graph network is created by linking all the material points within an interaction radius of 0.03 m, which results in connecting each node to 128 neighbouring particles. GNS uses the velocities from the previous five steps to predict the acceleration at the next time step. Figure 19 shows the GNS rollout prediction compared to MPM simulations. GNS successfully replicates the granular flow within a 5% error in the trajectory estimation for a condition outside the training regime [33].

5.2 The Macroscopic Modelling with ML Models as Constitutive Models

In addition to the works mentioned above, ML models can also represent the mechanical response of the material at each Gaussian integration point of finite elements in mesh-based numerical methods, such as the FEM. In the following, we will focus on the combination of ML with the most popular FEM and the related FEM-DEM multiscale technique, because the idea and workflow of combining other methods with ML are similar. In the framework of FEM, the ML model can substitute the phenomenological model or the representative volume element (RVE) embedded in each
M. Wang et al.
Fig. 19 Comparison of granular flow interaction with a barrier between MPM and GNS. The colour represents the magnitude of displacement.
The GNS prediction is outside the training regime
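The rollout compared in Fig. 19 rests on two simple operations described above: linking particles within an interaction radius, and the semi-implicit Euler update of Eqs. (28) and (29). The following is a minimal pure-Python sketch of those two operations only. The function names are hypothetical, and `constant_gravity` is a stand-in for the trained approximator, which is not reproduced here.

```python
import math


def radius_edges(positions, radius):
    """Return directed edges (i, j) for all particle pairs within `radius`."""
    edges = []
    for i, p in enumerate(positions):
        for j, q in enumerate(positions):
            if i != j and math.dist(p, q) <= radius:
                edges.append((i, j))
    return edges


def semi_implicit_euler(positions, velocities, accelerations, dt):
    """Eqs. (28)-(29): update the velocity first, then move with the new velocity."""
    new_velocities = [
        tuple(v + a * dt for v, a in zip(vel, acc))
        for vel, acc in zip(velocities, accelerations)
    ]
    new_positions = [
        tuple(x + v * dt for x, v in zip(pos, vel))
        for pos, vel in zip(positions, new_velocities)
    ]
    return new_positions, new_velocities


def rollout(positions, velocities, predict_acceleration, radius, dt, steps):
    """Repeatedly rebuild the graph, predict accelerations, and integrate."""
    trajectory = [positions]
    for _ in range(steps):
        edges = radius_edges(positions, radius)
        acc = predict_acceleration(positions, velocities, edges)
        positions, velocities = semi_implicit_euler(positions, velocities, acc, dt)
        trajectory.append(positions)
    return trajectory


def constant_gravity(positions, velocities, edges):
    """Hypothetical stand-in for the learned model: gravity only, edges unused."""
    return [(0.0, -9.81) for _ in positions]


trajectory = rollout([(0.0, 1.0)], [(0.0, 0.0)], constant_gravity,
                     radius=0.03, dt=0.1, steps=2)
```

Because each step consumes the positions produced by the previous one, any error in the predicted acceleration is carried forward through the whole trajectory.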
Gaussian point of the FEM to provide the stress–strain response when solving boundary value problems (BVPs). In the following subsections, the basic framework of the FEM and FEM-DEM method is first introduced concisely, and then the relevant work on the development of the FEM-ML framework is discussed in detail.

5.2.1 The Framework of the FEM and the Coupled FEM‑DEM

In the FEM, the whole macro domain $\Omega$ is first discretized into a finite number of subdomains (elements), and the phenomenological model (e.g. the critical state-based model shown in Fig. 20a) is embedded into the Gaussian points of each FE element to describe the local stress–strain relationship. Without considering the body force, the macroscopic deformation of the geometry domain can be obtained by solving for the nodal information of each FE element with the governing equation:

$$ \int_\Omega \mathbf{B}^T \boldsymbol{\sigma}^{t} \, \mathrm{d}\Omega = \int_\Gamma \mathbf{N}^T \mathbf{t} \, \mathrm{d}\Gamma = \mathbf{F}^{t} \tag{30} $$

where $\boldsymbol{\sigma}^{t}$ and $\mathbf{F}^{t}$ represent the stress tensor and external force of the current load step; $\mathbf{t}$ is the boundary traction imposed on the Neumann boundary $\Gamma$ of $\Omega$; $\mathbf{N}$ represents the shape function; $\mathbf{B} = \mathbf{L}\mathbf{N}$, and $\mathbf{L}$ is the differential operator.

In each loading step, as shown in Fig. 20a, the finite element algorithm fulfils the governing formulation via the
following steps: 1) The gradient of the nodal displacement $\nabla \mathbf{u}^{t}$ is calculated and passed into the constitutive model of each Gaussian point to output the nodal stress $\boldsymbol{\sigma}^{t}$ and tangential operator $\mathbf{D}^{t}$. 2) Given $\mathbf{D}^{t}$ (also referred to as the constitutive matrix), $\boldsymbol{\sigma}^{t}$ and the boundary conditions, the displacement increment $\Delta \mathbf{u}^{t}$ is calculated. 3) The nodal displacement is updated according to the obtained $\Delta \mathbf{u}^{t}$. However, granular materials normally exhibit pronounced plasticity. Therefore, the Newton–Raphson method is leveraged to solve for $\Delta \mathbf{u}^{t}$ iteratively, ensuring that the residual force $\mathbf{R}^{t}$ of the system remains minimized, which can be expressed as:

$$ \mathbf{R}^{t} = \int_\Gamma \mathbf{N}^T \mathbf{t} \, \mathrm{d}\Gamma - \int_\Omega \mathbf{B}^T \boldsymbol{\sigma}^{(t,m-1)} \, \mathrm{d}\Omega = \mathbf{K}^{t} \Delta \mathbf{u}^{(t,m)} \tag{31} $$

where $\Delta \mathbf{u}^{(t,m)}$ represents the solved displacement increment at the $m$th iteration of the current load step, and the global stiffness matrix $\mathbf{K}^{t}$ is assembled by:

$$ \mathbf{K}^{t} = \int_\Omega \mathbf{B}^T \mathbf{D}^{t} \mathbf{B} \, \mathrm{d}V \tag{32} $$

The iteration is repeated until $\mathbf{R}^{t}$ converges to a sufficiently small value.

As shown in Fig. 20b, in the coupled FEM-DEM algorithm, the local constitutive relationship is provided by the representative volume element (RVE) solved by the DEM. The nodal displacement is calculated by the FEM solver and imposed on each RVE as its boundary condition (i.e. $\nabla \mathbf{u}^{t}$). Then, with the received boundary conditions, the stress $\boldsymbol{\sigma}^{t}$ and tangential operator $\mathbf{D}^{t}$ of each RVE are calculated based on various homogenization methods (e.g. Voigt’s hypothesis [89], Reuss’s hypothesis [27, 181]). Finally, the $\boldsymbol{\sigma}^{t}$ and $\mathbf{D}^{t}$ solved by the DEM are returned to the FEM solver to calculate the displacement increment of each node and update the nodal displacement.

Although both the FEM and FEM-DEM can predict the macroscopic characteristics of granular materials, they have their respective limitations. The constitutive models used in the FEM are continuum theory-based, potentially obscuring the discrete nature of granular materials. Furthermore, models utilized in FEM often incorporate multiple variables that lack physical significance but require extensive calibration efforts. Moreover, constitutive models tend to be sophisticated to accommodate specific phenomena and are thus difficult to apply to new data cases. In addition, once the constitutive model is determined in FEM, it remains unadjustable, which restricts the further applicability of FEM to new tests. Regarding the FEM-DEM techniques, although they can offset the deficiency of the FEM in capturing the discrete features of materials to some extent, they still suffer from excessive computational costs.

It is worth noting that, to avoid the deficiency of FEM in simulating large material deformation, MPM-DEM has also been proposed as a promising way to investigate the large deformation of granular materials [103]. To overcome the computational issues of the RVE encountered in large deformation, an adaptive RVE model was recently proposed [129, 164]. Similar to the FEM-DEM framework, the MPM-ML framework can also be developed.

5.2.2 The FEM‑ML Framework

ML models can be improved continuously and are computationally efficient in prediction. Integrating ML surrogate models into traditional numerical methods is thus a potential way to circumvent the challenges mentioned above. As illustrated in Fig. 21, in FEM or coupled FEM-DEM, the trained neural networks replace the constitutive models or RVEs and offer the necessary tangential operator or stress state according to the received local deformation information of each FE element, accelerating the computational process in BVPs.

Generally, according to the data used to develop the neural networks, the FEM-ML framework can be categorized into two different types: 1) the phenomenological model data-based FEM-ML framework, and 2) the DEM data-based FEM-ML framework, both of which are described in detail below.

Over the past decades, numerous efforts have been made to represent the material response with ML models in BVPs. The earliest work on constitutive data-based FEM-ML can be traced back to the 1990s, in which the trained ML model performs the stress calculation in FEM to simulate the macroscopic material behaviour under triaxial [143], biaxial, uniaxial cyclic compression [58], and non-uniform loading conditions [59]. Subsequently, numerous scholars have carried out related works. For instance, an NN model is developed in [72] as an alternative to the constitutive model to accelerate the convergence of the Newton iteration in the FE algorithm; in [86], a rate-dependent NN constitutive model is developed to capture the time-dependent feature of geomaterials in the FEM; Xu et al. [178] compute the tangential matrix with a Cholesky-factored symmetric positive-definite neural network. Based on a dimensionality reduction method (Proper Orthogonal Decomposition), Huang et al. [80] described the intrinsic history-dependent feature of plastic materials with a feedforward neural network and state variables in FEM; Nikolaos N et al. [157] leveraged a deep learning network to replicate one elastoplastic model, where the plastic flow and yield surface of the model are accurately portrayed through a series of deep network predictions, to obtain a seamless FEM simulation.

Similarly, numerous endeavours have been devoted to the DEM data-based FEM-ML framework to tackle multiscale simulations. For example, Le et al. [96] utilized the neural
network to approximate the effective potential, which is used to derive the stress and tangent elastic tensor of materials in the multiscale computation of non-linear elastic materials, to reduce the computational cost. A recurrent neural network is trained with DEM data and integrated into the Gaussian points of the FE meshes in the works of Ghavamian and Simone [61] and Logarzo et al. [110] to improve the efficiency of the multiscale scheme; the performance of the Gaussian process and the ANN as the agent in the multi-scale BVP was compared in the work of Fuhg et al. [56]; replacing RVEs, Guan et al. [66] utilized a feedforward neural network to predict the material matrix and local stress to complete the non-linear iteration in the multiscale computation process; Qu et al. [130] developed an active learning-based data-driven modeling framework which can prioritise the most informative data for a trained model and improve it progressively via interactive model training and data labelling; Qu et al. [131] further developed a transfer learning strategy that can combine the use of well-established phenomenological models, numerical models and physical experiments for data-driven material modeling, thereby reducing the data demands for certain material modeling tasks; replacing the DEM solver embedded in each Gaussian point with one well-trained neural network, Rangel et al. [133] proposed a data-driven FVE-ML multiscale framework to explore the thermomechanical behavior of granular materials subjected to thermal expansion.

However, the development of the FEM-ML framework still confronts some problems, which are discussed here from the aspects of the training data and the neural networks used. Limited completeness of the training samples, e.g. a fixed time step and the limited stress–strain space of the training data, deteriorates the computational stability [110, 189] and the robustness of the FEM-ML framework. Furthermore, in the DEM data-based FEM-ML framework, no explicit history variables are calculated to characterize the loading states in DEM solvers, which represent the history- and state-dependent features of the particle assembly through the fabric tensor or energy-based parameters [44, 87]. Therefore, parameterizing history states into the macroscopic loading information (e.g. strain and stress) to develop the DEM data-based FEM-ML framework is also a tricky issue, especially when employing a single-step-based neural network to construct the FEM-ML framework. Although multi-timestep deep learning models can avoid this problem [65, 67], they are incompatible with the standard FEM computational framework, which operates on a single step.

5.3 An Example: The DEM Data‑Based FEM‑ML Modelling

A detailed study on time-sequential networks and the constitutive data-based FEM-ML framework can be found in the literature [65]. Therefore, in this section, we offer an example of developing an MLP-based FEM-ML framework with DEM data, for which no explicit internal variables are available, via a 2D biaxial compression simulation.
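A single-step network sees only the current strain state, so the loading history has to be supplied as an extra input. The sketch below illustrates one such parameterization, the absolute accumulated strain increment that Sect. 5.3.1 adopts as the history variable. The function names, the component-wise accumulation, and the list-of-tuples strain format are assumptions made for illustration, not the authors' implementation.

```python
def accumulated_abs_strain(strain_path):
    """Running component-wise sum of |strain increment| along a loading path."""
    n_comp = len(strain_path[0])
    history = [tuple(0.0 for _ in range(n_comp))]
    for prev, curr in zip(strain_path, strain_path[1:]):
        last = history[-1]
        history.append(tuple(h + abs(c - p) for h, c, p in zip(last, curr, prev)))
    return history


def single_step_features(strain_path):
    """Pair each strain state with its history variable to form the network input."""
    history = accumulated_abs_strain(strain_path)
    return [eps + hist for eps, hist in zip(strain_path, history)]


# Hypothetical 1D loading-unloading path: the second and fourth states share
# the same strain but differ in accumulated history, so a single-step network
# can tell loading from unloading.
path = [(0.0,), (0.01,), (0.02,), (0.01,)]
features = single_step_features(path)
```

The accumulated measure only grows along the path, which is what lets it separate states that coincide in strain space.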
5.3.1 The Process of Developing the FEM‑ML Framework

To develop the FEM-ML framework, one biaxial compression FEM-DEM multiscale simulation is performed, and the stress $\boldsymbol{\sigma}^{(t)}$, strain $\boldsymbol{\varepsilon}^{(t)}$, internal variable $\boldsymbol{\varphi}^{(t)}$, as well as the tangential operator $\mathbf{D}^{t}$ of each RVE, are recorded as the training data of the ML models. The absolute accumulated strain increment $\Delta\boldsymbol{\varepsilon}^{(t)}$ is adopted as the history variable [80] according to the result in Sect. 4.3.2.

As illustrated in Fig. 22, during the multiscale modelling process, the bottom boundary of the coarse-meshed geometry is fixed in the vertical and horizontal directions, and both the left and right boundaries of the mesh are confined with a constant horizontal pressure ($P_0$ = 100 kPa). A displacement-controlled axial loading is applied to the top surface of the rectangular domain with a fixed velocity (0.001 m/load step) until the axial strain reaches 0.05 under the reversal loading 1 (Reld1) path shown in Fig. 23. The parameters used in each RVE directly refer to the work of Guo and Zhao [68]. The detailed grain-scale parameters are listed in Table 7.

Table 7 The detailed grain-scale parameters of each RVE in the FEM-DEM scheme

Parameter                        Value
Particle number                  450
Particle size range (mm)         6–12
Young’s modulus (MPa)            600
Density (kg/m3)                  2650
Particle frictional coefficient  0.5
Damping ratio                    0.1

Different from continuum-based phenomenological models, whose constitutive function is sufficiently smooth for the tangential operator to be obtained by automatic differentiation (AD) methods [1, 60, 61, 72, 86], the stress–strain curve of granular materials fluctuates strongly, and thus the AD method cannot provide the proper tangential operator during the Newton–Raphson iteration in FEM. Alternatively, the method proposed in the work of Guan et al. [66] is adopted in this case, i.e. using the neural network to provide the material matrix. Consequently, two MLP models $NN_{\sigma}$ and $NN_{D}$ are trained to provide the necessary stress $\boldsymbol{\sigma}^{(t)}$ and material matrix $\mathbf{D}^{t}$ for the FE algorithm, which can be expressed as:

$$ \boldsymbol{\sigma}^{(t)} = NN_{\sigma}\left( \boldsymbol{\varepsilon}^{(t)}, \Delta\boldsymbol{\varepsilon}^{(t)}, \mathbf{W}, \mathbf{b} \right) \tag{33} $$

$$ \mathbf{D}^{t} = NN_{D}\left( \boldsymbol{\varepsilon}^{(t)}, \Delta\boldsymbol{\varepsilon}^{(t)}, \mathbf{W}, \mathbf{b} \right) \tag{34} $$
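To make the two-network arrangement of Eqs. (33) and (34) concrete, the following pure-Python sketch evaluates two independent MLPs on the same input, the strain concatenated with the accumulated strain history. Everything specific here is an assumption for illustration: the layer sizes, the tanh activation, the three-component 2D Voigt layout, and the random untrained weights. It shows only the data flow at a Gaussian point, not the trained models of the study.

```python
import math
import random


def init_mlp(sizes, seed):
    """Create (weights, biases) per layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((W, b))
    return layers


def mlp_forward(layers, x):
    """Affine layers with tanh on hidden layers and a linear output layer."""
    for k, (W, b) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
        if k < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def predict_stress_and_tangent(nn_sigma, nn_D, strain, history):
    """Query both surrogates with the same (strain, history) feature vector."""
    features = list(strain) + list(history)
    stress = mlp_forward(nn_sigma, features)
    flat = mlp_forward(nn_D, features)
    n = len(strain)
    D = [flat[r * n:(r + 1) * n] for r in range(n)]  # reshape to an n x n matrix
    return stress, D


# Hypothetical sizes: 3 strain + 3 history inputs, 3 stresses, 3 x 3 matrix.
nn_sigma = init_mlp([6, 16, 3], seed=0)
nn_D = init_mlp([6, 16, 9], seed=1)
stress, D = predict_stress_and_tangent(
    nn_sigma, nn_D, (0.01, -0.005, 0.0), (0.01, 0.005, 0.0)
)
```

Predicting the material matrix with its own network avoids differentiating the noisy stress prediction, which is the motivation given above for not relying on AD.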
5.4 Summary
Fig. 24 The solved axial displacement field under the Reld2 path in fine meshes [162]
enhanced given sufficient training data, enabling it to handle increasingly complex tasks more effectively.

However, it is important to note that training geometry information-based networks typically requires significant computational memory due to the large number of particles in a single simulation case. Furthermore, when using Graph Neural Networks (GNNs) to predict system rollouts over extended periods, error accumulation becomes a challenge. This issue arises because the dynamic information at each time step is updated based on the particle positions from the preceding time step, leading to the propagation of errors throughout the simulation.

In the development of the FEM-ML scheme, both single-step and multi-step surrogate models serve as alternatives to phenomenological models or RVEs embedded at Gaussian points within the FEM framework. While time-sequence neural networks can naturally capture the history-dependent constitutive behavior of granular materials, they are sensitive to changes in loading intervals, and their multi-step nature is not aligned with the conventional FEM algorithms. This discrepancy requires additional efforts to integrate time-sequence networks into a FEM solver. In contrast, single-step neural networks are inherently compatible with the FEM algorithms and robust to variations in loading intervals. However, they still face the challenge of identifying appropriate internal variables to capture different loading histories adequately.

Additionally, the generalization capability of current FEM-ML strategies is often limited by the completeness of training data. For instance, the simulation result obtained in
Fig. 25 The solved shear strain field under the Reld2 path in fine meshes [162]
Sect. 5.3 demonstrates that the developed FEM-ML approach has a certain extrapolation capability. Although the mesh density of the solving domain increases and a different but similar loading path is applied, the FEM-ML scheme can still acquire a comparable solution to the corresponding FEM-DEM simulation. However, when encountering a new situation where the geometry domain and boundary conditions vary considerably, the performance of the current FEM-ML solver will deteriorate [162]. This limitation makes many FEM-ML approaches case-specific and restricts their applicability to more general scenarios.

Based on the summary presented above, several promising directions for future research in ML-aided macroscopic simulation methods can be identified. These avenues include employing active learning to identify informative data for training trustworthy ML models, Bayesian or Gaussian processes for uncertainty quantification, and the development of hybrid computational frameworks that integrate neural networks with traditional numerical methods, which are detailed as follows:

1) Application of active learning to mitigate error accumulation: One of the key challenges in using ML-based simulators, particularly GNS, is the accumulation of errors during long-term system rollouts. Active learning could be employed to dynamically select new data points that the model finds most uncertain, enabling iterative retraining of the model to correct errors. This approach would improve the robustness of ML models over extended simulations and ensure more reliable results, especially in complex, multi-step scenarios.
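The selection step of such an active learning loop can be sketched briefly. The text does not specify how uncertainty is measured, so the example below assumes one common choice, the disagreement (variance) among an ensemble of models. The function names and the three toy models are hypothetical.

```python
from statistics import pvariance


def ensemble_uncertainty(models, x):
    """Variance of the ensemble members' predictions at input x."""
    return pvariance([m(x) for m in models])


def select_most_uncertain(models, candidates, k):
    """Return the k candidate inputs on which the ensemble disagrees most."""
    ranked = sorted(
        candidates,
        key=lambda x: ensemble_uncertainty(models, x),
        reverse=True,
    )
    return ranked[:k]


# Toy ensemble: three linear models whose disagreement grows with |x|.
models = [lambda x: 1.0 * x, lambda x: 1.1 * x, lambda x: 0.9 * x]
candidates = [0.0, 0.5, 2.0, 1.0]
picked = select_most_uncertain(models, candidates, k=2)
```

In a full loop, the selected inputs would be labelled by the reference solver (e.g. MPM or DEM), added to the training set, and the models retrained before the next selection round.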
learning seeks to understand the mappings between entire function spaces. This is achieved by training on pairs of input–output functions, rather than discrete data points. The incorporation of spectral methods allows the model to operate in infinite-dimensional function spaces without being constrained by mesh resolutions. By representing functions in the Fourier space, FNOs can efficiently learn operators by applying Fourier transforms and diagonal matrix multiplications within their layers. This not only reduces the need for excessive parametrization but also ensures that the learned operator behaves smoothly across the function space. Furthermore, the inherent properties of the Fourier space, such as smoothness and regularization, enable FNOs to efficiently capture global patterns and generalize well within the scope of the training data, making them a promising tool for a wide range of scientific applications.

Another significant advancement in operator learning is the Deep Operator Network (DeepONet; [63, 112, 117]). This architecture is grounded in the Universal Approximation Theorem for Operators, which extends the classical neural network theory to function spaces. The theorem posits that certain neural network architectures can approximate a wide class of nonlinear operators between function spaces with arbitrary precision. DeepONet implements this theorem through a novel two-part structure: a branch network and a trunk network. The branch network processes input functions, while the trunk network handles the evaluation points of output functions. DeepONet approximates operators by representing them as a sum of basis functions multiplied by coefficients, implemented through a dot product operation between the outputs of these networks. The trunk network learns the basis functions, while the branch network computes the coefficients based on the input function. This architecture allows DeepONet to solve multi-scale problems [107] and complex non-linear systems [117].

While FNOs and DeepONet operate in general Banach spaces, the Basis-to-Basis (B2B) approach [82] leverages the geometric properties of Hilbert spaces to generate more interpretable and generalizable operators. By exploiting the inner product structure, spectral theory, and optimization geometry of Hilbert spaces, B2B learns basis functions for both input and output spaces, and then maps between their coefficients. This structure allows B2B to handle variable input locations, interpolate effectively, and potentially extrapolate beyond the training domain, while offering improved interpretability and analytical tractability.

Although relatively unexplored in the space of granular media, operator learning approaches like FNO, DeepONet, and B2B have the potential to revolutionize and accelerate numerical simulations in this field. These methods offer promising avenues for modeling complex granular systems, predicting their behavior under various conditions, and potentially uncovering new insights into the fundamental physics governing granular media.

7 Conclusion

This work reviewed the major advancements in ML-aided modelling of granular media, specifically including the application of the ML method in microscopic grain-scale computation, the development of the data-driven constitutive model for granular materials, and the ML-aided macroscopic simulation of granular media.

On the microscopic scale, we mainly focus on the recent development of two different types of grain information-based ML models. The first one is the contact information-based ML model, and the other is the grain-level kinematic features-based ML model. The high computational efficiency of the ML algorithm makes these models promising ways to accelerate the computational process of particle/grain-based numerical techniques, and the black-box characteristics make these ML models very user-friendly. However, it is essential to recognize that the development of such grain contact information-based ML models is also subject to certain challenges. The variations of particle shape, diverse contact situations, and the absence of consistent contact theories make it nearly impossible to generate a high-quality and comprehensive training set that covers all contact conditions. Consequently, this limitation hinders the generalization of the ML-based contact surrogate models in practical engineering applications. On the other hand, it has been observed that a majority of existing kinematic features-based ML models ignore some essential physical motion information of grains, e.g. particle rotation and angular velocity. These two main issues are the primary obstacles impeding the continued advancement of particle information-based ML models at the grain level.

Both time-sequential and single-step-based neural networks can be leveraged to derive the constitutive relationship of granular materials using various types of training data. The distinctive architecture of time-sequence neural networks inherently enables them to capture the history-dependent stress–strain response of granular materials. For single-step-based networks, to achieve comparable performance to time-sequence neural networks, the incorporation of artificially introduced internal/historical variables is necessary to identify loading states. Since the ML-based constitutive models directly acquire the constitutive laws from generated stress–strain data, they can naturally circumvent assumptions and intricate mathematical formulations and evolve continuously with the expansion of the training dataset, which lets them overcome the problems confronted by the traditional continuum-theory-based phenomenological constitutive models. However,
21. Beuth L, Więckowski Z, Vermeer P (2011) Solution of quasi-static large-strain problems by the material point method. Int J Numer Anal Methods Geomech 35:1451–1465
22. Bowman ET, Soga K, Drummond W (2001) Particle shape characterisation using Fourier descriptor analysis. Geotechnique 51:545–554
23. Brinkgreve RB (2005) Selection of soil models and parameters for geotechnical engineering application. In: Proceedings of Geo-Frontiers 2005, Austin, Texas, pp 69–98
24. Bui HH, Fukagawa R, Sako K, Ohno S (2008) Lagrangian meshfree particles method (SPH) for large deformation and failure flows of geomaterial using elastic-plastic soil constitutive model. Int J Numer Anal Methods Geomech 32:1537–1570
25. Bui HH, Sako K, Fukagawa R (2007) Numerical simulation of soil-water interaction using smoothed particle hydrodynamics (SPH) method. J Terramech 44:339–346
26. Bui HH, Sako K, Fukagawa R, Wells J (2008) SPH-based numerical simulations for large deformation of geomaterial considering soil-structure interaction. In: The 12th international conference of international association for computer methods and advances in geomechanics (IACMAG), pp 570–578
27. Chang CS, Liao CL (1994) Estimates of elastic modulus for media of randomly packed granules. Appl Mech Rev. https://doi.org/10.1115/1.3122814
28. Chang MB, Ullman T, Torralba A, Tenenbaum JB (2016) A compositional object-based approach to learning physical dynamics. arXiv preprint arXiv:1612.00341
29. Chen X, Wang LG, Meng F, Luo ZH (2021) Physics-informed deep learning for modelling particle aggregation and breakage processes. Chem Eng J 426:131220
30. Cheng Z, Wang J (2022) Estimation of contact forces of granular materials under uniaxial compression based on a machine learning model. Granul Matter 24:1–14
31. Cho K, Van Merrienboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259
32. Choi Y, Kumar K (2024) Graph neural network-based surrogate model for granular flows. Comput Geotech 166:106015
33. Choi Y, Kumar K (2024) Inverse analysis of granular flows using differentiable graph neural network simulator. Comput Geotech 171:106374
34. Ciregan D, Meier U, Schmidhuber J (2012) Multi-column deep neural networks for image classification. In: 2012 IEEE conference on computer vision and pattern recognition, IEEE, pp 3642–3649
35. Ciresan DC, Meier U, Masci J, Gambardella LM, Schmidhuber J (2011) Flexible, high performance convolutional neural networks for image classification. In: Twenty-second international joint conference on artificial intelligence, Citeseer
36. Cleary PW, Sawley ML (2002) DEM modelling of industrial granular flows: 3D case studies and the effect of particle shape on hopper discharge. Appl Math Model 26:89–111
37. Cundall PA (1971) A computer model for simulating progressive, large-scale movement in blocky rock system. In: Proceedings of the international symposium on rock mechanics, pp 129–136
38. Cundall PA (1974) Rational design of tunnel supports: a computer model for rock mass behaviour using interactive graphics for the input and output of geometrical data. Technical Report
39. Cundall PA, Strack OD (1979) A discrete numerical model for granular assemblies. Geotechnique 29:47–65
40. Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signals Syst 2:303–314
41. Das SK, Das A (2019) Influence of quasi-static loading rates on crushable granular materials: a DEM analysis. Powder Technol 344:393–403
42. Deen N, Annaland MVS, Van der Hoef MA, Kuipers J (2007) Review of discrete particle modeling of fluidized beds. Chem Eng Sci 62:28–44
43. Duncan JM, Chang CY (1970) Nonlinear analysis of stress and strain in soils. J Soil Mech Found Division 96:1629–1653
44. Eggersmann R, Kirchdoerfer T, Reese S, Stainier L, Ortiz M (2019) Model-free data-driven inelasticity. Comput Methods Appl Mech Eng 350:81–99
45. Eghbalian M, Pouragha M, Wan R (2023) A physics-informed deep neural network for surrogate modeling in classical elasto-plasticity. Comput Geotech 159:105472
46. Elman JL (1990) Finding structure in time. Cognit Sci 14:179–211
47. Feng Y (2021) An energy-conserving contact theory for discrete element modelling of arbitrarily shaped particles: Basic framework and general contact model. Comput Methods Appl Mech Eng 373:113454
48. Feng Y (2021) A generic energy-conserving discrete element modeling strategy for concave particles represented by surface triangular meshes. Int J Numer Methods Eng 122:2581–2597
49. Feng Y (2023) Thirty years of developments in contact modelling of non-spherical particles in DEM: a selective review. Acta Mech Sinica 39:722343
50. Feng Y, Gao W (2021) On the strain energy distribution of two elastic solids under smooth contact. Powder Technol 389:376–382
51. Feng Y, Han K, Owen D (2017) A generic contact detection framework for cylindrical particles in discrete element modelling. Comput Methods Appl Mech Eng 315:632–651
52. Fern J, Rohe A, Soga K, Alonso E (2019) The material point method for geotechnical engineering: a practical guide. CRC Press, Boca Raton
53. Fragkiadaki K, Agrawal P, Levine S, Malik J (2015) Learning visual predictive models of physics for playing billiards. arXiv preprint arXiv:1511.07404
54. Fu P, Walton OR, Harvey JT (2012) Polyarc discrete element for efficiently simulating arbitrarily shaped 2D particles. Int J Numer Methods Eng 89:599–617
55. Fu Q, Hashash YM, Jung S, Ghaboussi J (2007) Integration of laboratory testing and constitutive modeling of soils. Comput Geotech 34:330–345
56. Fuhg JN, Marino M, Bouklas N (2022) Local approximate Gaussian process regression for data-driven constitutive models: development and comparison with neural networks. Comput Methods Appl Mech Eng 388:114217
57. Fukushima K (1980) Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 36:193–202
58. Ghaboussi J, Garrett J Jr, Wu X (1991) Knowledge-based modeling of material behavior with neural networks. J Eng Mech 117:132–153
59. Ghaboussi J, Pecknold DA, Zhang M, Haj-Ali RM (1998) Autoprogressive training of neural network constitutive models. Int J Numer Methods Eng 42:105–126
60. Ghaboussi J, Sidarta DE (1998) New nested adaptive neural networks (NANN) for constitutive modeling. Comput Geotech 22:29–52
61. Ghavamian F, Simone A (2019) Accelerating multiscale finite element simulations of history-dependent materials using a recurrent neural network. Comput Methods Appl Mech Eng 357:112594
62. Gingold RA, Monaghan JJ (1977) Smoothed particle hydrodynamics: theory and application to non-spherical stars. Mon Not Royal Astron Soc 181:375–389
63. Goswami S, Bora A, Yu Y, Karniadakis GE (2023) Physics-informed deep neural operator networks. Machine learning in
modeling and simulation: methods and applications. Springer, Cham, pp 219–254
64. Graves A (2013) Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850
65. Guan Q, Yang Z, Guo N, Hu Z (2023) Finite element geotechnical analysis incorporating deep learning-based soil model. Comput Geotech 154:105120
66. Guan S, Qu T, Feng Y, Ma G, Zhou W (2023) A machine learning-based multi-scale computational framework for granular materials. Acta Geotech 18:1699–1720
67. Guan S, Zhang X, Ranftl S, Qu T (2023) A neural network-based material cell for elastoplasticity and its performance in fe analyses of boundary value problems. Int J Plastic 171:103811
68. Guo N, Zhao J (2014) A coupled fem/dem approach for hierarchical multiscale modelling of granular media. Int J Numer Methods Eng 99:789–818
69. Guo N, Zhao J (2016) 3d multiscale modeling of strain localization in granular media. Comput Geotech 80:360–372
70. Habibagahi G, Bamdad A (2003) A neural network framework for mechanical behavior of unsaturated soils. Can Geotech J 40:684–693
71. Harlow FH (1964) The particle-in-cell computing method for fluid dynamics. Methods Comput Phys 3:319–343
72. Hashash Y, Jung S, Ghaboussi J (2004) Numerical implementation of a neural network based material model in finite element analysis. Int J Numer Methods Eng 59:989–1005
73. Hashash Y, Song H (2008) The integration of numerical modeling and physical measurements through inverse analysis in geotechnical engineering. KSCE J Civil Eng 12:165–176
74. He S, Li J (2009) Modeling nonlinear elastic behavior of reinforced soil using artificial neural networks. Appl Soft Comput 9:954–961
75. He S, Li Y, Feng Y, Ho S, Ravanbakhsh S, Chen W, Poczos B (2019) Learning to predict the cosmological structure formation. Proceed Nat Acad Sci 116:13825–13832
76. Hertz H (1882) Ueber die berührung fester elastischer körper
77. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735–1780
78. Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2:359–366
79. Hu X, Zhang Y, Guo L, Wang J, Cai Y, Fu H, Cai Y (2018) Cyclic behavior of saturated soft clay under stress path with bidirectional shear stresses. Soil Dyn Earthq Eng 104:319–328
80. Huang D, Fuhg JN, Weißenfels C, Wriggers P (2020) A machine learning based plasticity model using proper orthogonal decomposition. Comput Methods Appl Mech Eng 365:113008
81. Hwang S, Pan J, Sunny AA, Fan LS (2022) A machine learning-based particle-particle collision model for non-spherical particles with arbitrary shape. Chem Eng Sci 251:117439
82. Ingebrand T, Thorpe AJ, Goswami S, Kumar K, Topcu U (2024) Basis-to-basis operator learning using function encoders. arXiv preprint arXiv:2410.00171
83. Ji S, Xu W, Yang M, Yu K (2012) 3d convolutional neural networks for human action recognition. IEEE Trans Pattern Anal Mach Intel 35:221–231
84. Johari A, Javadi A, Habibagahi G (2011) Modelling the mechanical behaviour of unsaturated soils using a genetic algorithm-based neural network. Comput Geotech 38:2–13
85. Johnson KL (1987) Contact mechanics. Cambridge University Press, Cambridge
86. Jung S, Ghaboussi J (2006) Neural network constitutive model for rate-dependent materials. Comput Struct 84:955–963
87. Karapiperis K, Stainier L, Ortiz M, Andrade JE (2021) Data-driven multiscale modeling in mechanics. J Mech Phys Solids 147:104239
88. Kohestani V, Hassanlourad M (2016) Modeling the mechanical behavior of carbonate sands using artificial neural networks and support vector machines. Int J Geomech 16:04015038
89. Kruyt NP, Rothenburg L (1998) Statistical theories for the elastic moduli of two-dimensional assemblies of granular materials. Int J Eng Sci 36:1127–1142
90. Kumar K, Choi Y (2023) Accelerating particle and fluid simulations with differentiable graph networks for solving forward and inverse problems. In: Proceedings of the SC’23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 60–65
91. Ladicky L, Jeong S, Solenthaler B, Pollefeys M, Gross M (2015) Data-driven fluid simulations using regression forests. ACM Trans Gr (TOG) 34:1–9
92. Lai Z, Chen Q (2017) Characterization and discrete element simulation of grading and shape-dependent behavior of jsc-1a martian regolith simulant. Granul Matter 19:69
93. Lai Z, Chen Q (2019) Reconstructing granular particles from x-ray computed tomography using the tws machine learning tool and the level set method. Acta Geotech 14:1–18
94. Lai Z, Chen Q, Huang L (2020) Fourier series-based discrete element method for computational mechanics of irregular-shaped particles. Comput Methods Appl Mech Eng 362:112873
95. Lai Z, Chen Q, Huang L (2022) Machine-learning-enabled discrete element method: contact detection and resolution of irregular-shaped particles. Int J Numer Anal Methods Geomech 46:113–140
96. Le B, Yvonnet J, He QC (2015) Computational homogenization of nonlinear elastic materials using neural networks. Int J Numer Methods Eng 104:1061–1084
97. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444
98. Lee S, Ha J, Zokhirova M, Moon H, Lee J (2018) Background information of deep learning for structural engineering. Arch Comput Methods Eng 25:121–129
99. Li H, Yu H, Cao N, Tian H, Cheng S (2021) Applications of artificial intelligence in oil and gas development. Arch Comput Methods Eng 28:937–949
100. Li Z, Chow JK, Wang YH (2017) Applying the artificial neural network to predict the soil responses in the dem simulation. In: IOP Conference Series: Materials Science and Engineering, IOP Publishing. p. 012040
101. Li Z, Huang DZ, Liu B, Anandkumar A (2023) Fourier neural operator with learned deformations for pdes on general geometries. J Mach Learn Res 24:1–26
102. Li Z, Li X, Zhang H, Huang D, Zhang L (2023) The prediction of contact force networks in granular materials based on graph neural networks. J Chem Phys 158:5
103. Liang W, Zhao J (2019) Coupled mpm/dem multiscale modelling geotechnical problems involving large deformation. In: 16th Asian Regional Conference on Soil Mechanics and Geotechnical Engineering
104. Liang W, Zhao J (2019) Multiscale modeling of large deformation in geomechanics. Int J Numer Anal Methods Geomech 43:1080–1114
105. Lim KW, Andrade JE (2013) Granular element method for computational particle mechanics. Comput Methods Appl Mech Eng 241:262–274
106. Lim KW, Andrade JE (2014) Granular element method for three-dimensional discrete element calculations. Int J Numer Anal Methods Geomech 38:167–188
107. Liu L, Cai W (2021) Multiscale deeponet for nonlinear operators in oscillatory function spaces for building seismic wave responses. arXiv preprint arXiv:2111.04860
108. Liu M, Liu G (2010) Smoothed particle hydrodynamics (sph): an overview and recent developments. Arch Comput Methods Eng 17:25–76
109. Liu Z, Su L, Zhang C, Iqbal J, Hu B, Dong Z (2020) Investigation of the dynamic process of the xinmo landslide using the discrete element method. Comput Geotech 123:103561
110. Logarzo HJ, Capuano G, Rimoli JJ (2021) Smart constitutive laws: inelastic homogenization through machine learning. Comput Methods Appl Mech Eng 373:113482
111. Lu L, Gao X, Dietiker JF, Shahnam M, Rogers WA (2021) Machine learning accelerated discrete element modeling of granular flows. Chem Eng Sci 245:116832
112. Lu L, Jin P, Pang G, Zhang Z, Karniadakis GE (2021) Learning nonlinear operators via deeponet based on the universal approximation theorem of operators. Nat Mach Intel 3:218–229
113. Lucy LB (1977) A numerical approach to the testing of the fission hypothesis. Astronom J 82:1013–1024
114. Lv Y, Nie L, Xu K (2011) Study of the neural network constitutive models for turfy soil with different decomposition degree. In: 2011 Second International Conference on Mechanic Automation and Control Engineering, IEEE. pp. 6111–6114
115. Ma G, Guan S, Wang Q, Feng Y, Zhou W (2022) A predictive deep learning framework for path-dependent mechanical behavior of granular materials. Acta Geotech 17:3463–3478
116. Ma X, Zhang DZ (2006) Statistics of particle interactions in dense granular material under uniaxial compression. J Mech Phys Solids 54:1426–1448
117. Mandl L, Goswami S, Lambers L, Ricken T (2024) Separable deeponet: Breaking the curse of dimensionality in physics-informed machine learning. arXiv preprint arXiv:2407.15887
118. Mašín D (2005) A hypoplastic constitutive model for clays. Int J Numer Anal Methods Geomech 29:311–336
119. Mayr A, Lehner S, Mayrhofer A, Kloss C, Hochreiter S, Brandstetter J (2023) Boundary graph neural networks for 3d simulations. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 9099–9107
120. Micheli A (2009) Neural network for graphs: a contextual constructive approach. IEEE Trans Neural Netw 20:498–511
121. Nitka M, Combe G, Dascalu C, Desrues J (2011) Two-scale modeling of granular materials: a dem-fem approach. Granul Matter 13:277–281
122. Oñate E, Idelsohn SR, Del Pin F, Aubry R (2004) The particle finite element method–an overview. Int J Comput Methods 1:267–307
123. Pande G, Pietruszczak S, Wang M (2020) Role of gradation curve in description of mechanical behavior of unsaturated soils. Int J Geomech 20:04019159
124. Penumadu D, Zhao R (1999) Triaxial compression behavior of sand and gravel using artificial neural networks (ann). Comput Geotech 24:207–230
125. Petalas AL, Dafalias YF, Papadimitriou AG (2020) Sanisand-f: sand constitutive model with evolving fabric anisotropy. Int J Solids Struct 188:12–31
126. Peters JF, Hopkins MA, Kala R, Wahl RE (2009) A poly-ellipsoid particle for non-spherical discrete element method. Eng Comput 26:645–657
127. Poorooshasb HB, Pietruszczak S (1985) On yielding and flow of sand; a generalized two-surface model. Comput Geotech 1:1
128. Qu T, Di S, Feng Y, Wang M, Zhao T (2021) Towards data-driven constitutive modelling for granular materials via micromechanics-informed deep learning. Int J Plastic 144:103046
129. Qu T, Feng Y, Wang M (2021) An adaptive granular representative volume element model with an evolutionary periodic boundary for hierarchical multiscale analysis. Int J Numer Methods Eng 122:2239–2253
130. Qu T, Guan S, Feng Y, Ma G, Zhou W, Zhao J (2023) Deep active learning for constitutive modelling of granular materials: From representative volume elements to implicit finite element modelling. Int J Plastic 164:103576
131. Qu T, Zhao J, Guan S, Feng Y (2023) Data-driven multiscale modelling of granular materials via knowledge transfer and sharing. Int J Plastic 171:103786
132. Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707
133. Rangel RL, Franci A, Oñate E, Gimenez JM (2024) Multiscale data-driven modeling of the thermomechanical behavior of granular media with thermal expansion effects. Comput Geotech 176:106789
134. Rashidian V, Hassanlourad M (2014) Application of an artificial neural network for modeling the mechanical behavior of carbonate soils. Int J Geomech 14:142–150
135. Roberts N, Khodak M, Dao T, Li L, Ré C, Talwalkar A (2021) Learning operations for neural pde solvers. In: Proc. ICLR SimDL Workshop
136. Romo MP, García SR, Mendoza MJ, Taboada-Urtuzuástegui V (2001) Recurrent and constructive-algorithm networks for sand behavior modeling. Int J Geomech 1:371–387
137. Roscoe K, Burland JB (1968) On the generalized stress-strain behavior of wet clay. In: Heyman J, Leckie F (eds) Engineering plasticity. Cambridge University Press, Cambridge, pp 535–609
138. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536
139. Sanchez-Gonzalez A, Godwin J, Pfaff T, Ying R, Leskovec J, Battaglia P (2020) Learning to simulate complex physics with graph networks. In: International conference on machine learning, PMLR. pp. 8459–8468
140. Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2008) The graph neural network model. IEEE Trans Neural Netw 20:61–80
141. Sezer A (2011) Prediction of shear development in clean sands by use of particle shape information and artificial neural networks. Expert Syst Appl 38:5603–5613
142. Shahin MA, Indraratna B (2006) Modeling the mechanical behavior of railway ballast using artificial neural networks. Can Geotech J 43:1144–1152
143. Sidarta D, Ghaboussi J (1998) Constitutive modeling of geomaterials from non-uniform material tests. Comput Geotech 22:53–71
144. Sołowski W, Sloan S (2015) Evaluation of material point method for use in geotechnics. Int J Numer Anal Methods Geomech 39:685–701
145. Spengler M (1999) Fast neural network emulation and control of physics-based models. In: Proceedings of the 25th annual conference on computer graphics and interactive techniques, Orlando, Florida, pp. 9–20
146. Stefanos D, Gyan P (2015) On neural network constitutive models for geomaterials. J Civil Eng Res 5:106–113
147. Strack O, Cundall PA (1978) The distinct element method as a tool for research in granular media. University of Minnesota, Minnesota
148. Sulsky D, Zhou SJ, Schreyer HL (1995) Application of a particle-in-cell method to solid mechanics. Comput Phys Commun 87:236–252
149. Sutskever I, Martens J, Hinton GE (2011) Generating text with recurrent neural networks. In: Proceedings of the 28th international conference on machine learning (ICML-11), pp. 1017–1024
150. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Adv Neural Inf Process Syst 27:1409
151. Tavarez FA, Plesha ME (2007) Discrete element method for modelling solid and particulate materials. Int J Numer Methods Eng 70:379–404
152. Thakur MM, Penumadu D (2020) Triaxial compression in sands using fdem and micro-x-ray computed tomography. Comput Geotech 124:103638
153. Ti KS, Huat B, Noorzaei J, Jaafar MS, Sew GS (2009) A review of basic soil constitutive models for geotechnical application. Electron J Geotech Eng 14:1–18
154. Tian Y, Yao YP (2017) Modelling the non-coaxiality of soils from the view of cross-anisotropy. Comput Geotech 86:219–229
155. Ueda K, Iai S (2019) Constitutive modeling of inherent anisotropy in a strain space multiple mechanism model for granular materials. Int J Numer Anal Methods Geomech 43:708–737
156. Ummenhofer B, Prantl L, Thuerey N, Koltun V (2019) Lagrangian fluid simulation with continuous convolutions. In: International conference on learning representations
157. Vlassis NN, Sun W (2021) Sobolev training of thermodynamic-informed neural networks for interpretable elasto-plasticity models with level set hardening. Comput Methods Appl Mech Eng 377:113695
158. Voyiadjis GZ, Alsaleh MI, Alshibli KA (2005) Evolving internal length scales in plastic strain localization for granular materials. Int J Plastic 21:2000–2024
159. Wang J, Chan D (2014) Frictional contact algorithms in sph for the simulation of soil-structure interaction. Int J Numer Anal Methods Geomech 38:747–770
160. Wang K, Sun W (2019) Meta-modeling game for deriving theory-consistent, microstructure-based traction-separation laws via deep reinforcement learning. Comput Methods Appl Mech Eng 346:216–241
161. Wang L, Cai Y, Liu D (2018) Multiscale reliability-based topology optimization methodology for truss-like microstructures with unknown-but-bounded uncertainties. Comput Methods Appl Mech Eng 339:358–388
162. Wang M, Feng Y, Guan S, Qu T (2024) Multi-layer perceptron-based data-driven multiscale modelling of granular materials with a novel frobenius norm-based internal variable. J Rock Mech Geotech Eng. https://doi.org/10.1016/j.jrmge.2024.02.003
163. Wang M, Qu T, Guan S, Zhao T, Liu B, Feng Y (2022) Data-driven strain-stress modelling of granular materials via temporal convolution neural network. Comput Geotech 152:105049
164. Wang M, Zhang DZ (2021) Deformation accommodating periodic computational domain for a uniform velocity gradient. Comput Methods Appl Mech Eng 374:113607
165. Wang X, Yin ZY, Su D, Xiong H, Feng Y (2021) A novel arcs-based discrete element modeling of arbitrary convex and concave 2d particles. Comput Methods Appl Mech Eng 386:114071
166. Wang Z, Liu K, Li J, Zhu Y, Zhang Y (2019) Various frameworks and libraries of machine learning and deep learning: a survey. Arch Comput Methods Eng 1:1–24
167. Weng JJ, Ahuja N, Huang TS (1993) Learning recognition and segmentation of 3-d objects from 2-d images. In: 1993 (4th) International Conference on Computer Vision, IEEE. pp. 121–128
168. Werbos PJ (1990) Backpropagation through time: what it does and how to do it. Proceed IEEE 78:1550–1560
169. Wiewel S, Becher M, Thuerey N (2019) Latent space physics: towards learning the temporal evolution of fluid flow. Comput Gr forum. Wiley Online Library, Hoboken, pp 71–82
170. Więckowski Z (2004) The material point method in large strain engineering problems. Comput Methods Appl Mech Eng 193:4417–4438
171. Williams JR, O’Connor R (1999) Discrete element simulation and the contact problem. Arch Comput Methods Eng 6:279–304
172. Williams JR, Pentland AP (1992) Superquadrics and modal dynamics for discrete elements in interactive design. Eng Comput 9:115–127
173. Wood DM (2017) Geotechnical modelling. CRC Press, Boca Raton
174. Wriggers P (2008) Nonlinear finite element methods. Springer Science & Business Media, Cham
175. Wu J, Yildirim I, Lim JJ, Freeman B, Tenenbaum J (2015) Galileo: Perceiving physical object properties by integrating a physics engine with deep learning. Adv Neural Inf Process Syst 28:1
176. Wu L, Cui P, Pei J, Zhao L, Guo X (2022) Graph neural networks: foundation, frontiers and applications. In: Proceedings of the 28th ACM SIGKDD conference on knowledge discovery and data mining, pp. 4840–4841
177. Wu W, Bauer E, Kolymbas D (1996) Hypoplastic constitutive model with critical state for granular materials. Mech Mater 23:45–69
178. Xu K, Huang DZ, Darve E (2021) Learning constitutive relations using symmetric positive definite neural networks. J Comput Phys 428:110072
179. Yao Y, Sun D, Luo T (2004) A critical state model for sands dependent on stress and density. Int J Numer Anal Methods Geomech 28:323–337
180. Yao YP, Hou W, Zhou AN (2009) Uh model: three-dimensional unified hardening model for overconsolidated clays. Géotechnique 59:451–469
181. Yimsiri S, Soga K (2000) Micromechanics-based stress-strain behaviour of soils at small strains. Géotechnique 50:559–571
182. Yin ZY, Jin YF (2019) Practice of optimisation theory in geotechnical engineering. Springer, Cham
183. Yin ZY, Karstunen M, Chang CS, Koskinen M, Lojander M (2011) Modeling time-dependent behavior of soft sensitive clay. J Geotech Geoenviron Eng 137:1103–1113
184. Yin ZY, Wang P, Zhang F (2020) Effect of particle shape on the progressive failure of shield tunnel face in granular soils by coupled fdm-dem method. Tunnel Undergr Space Technol 100:103394
185. You Z (2003) Development of a micromechanical modeling approach to predict asphalt mixture stiffness using the discrete element method. University of Illinois at Urbana-Champaign, Champaign
186. Zhang DZ, Ma X, Giguere PT (2011) Material point method enhanced by modified gradient of shape function. J Comput Phys 230:6379–6398
187. Zhang DZ, Rauenzahn RM (2000) Stress relaxation in dense and slow granular flows. J Rheol 44:1019–1041
188. Zhang N, Shen SL, Zhou A, Xu YS (2019) Investigation on performance of neural networks using quadratic relative error cost function. IEEE Access 7:106642–106652
189. Zhang P, Yang Y, Yin ZY (2021) Bilstm-based soil-structure interface modeling. Int J Geomech 21:04021096
190. Zhang P, Yin ZY, Jin YF (2021) State-of-the-art review of machine learning applications in constitutive modeling of soils. Arch Comput Methods Eng 28:3661–3686
191. Zhang P, Yin ZY, Jin YF, Liu XF (2021) Modelling the mechanical behaviour of soils using machine learning algorithms with explicit formulations. Acta Geotech 1:1–20
192. Zhang P, Yin ZY, Jin YF, Ye GL (2020) An ai-based model for describing cyclic characteristics of granular materials. Int J Numer Anal Methods Geomech 44:1315–1335
193. Zhang S, Lan P, Li HC, Tong CX, Sheng D (2022) Physics-informed neural networks for consolidation of soils. Eng Comput 39:2845–2865
194. Zhou W, Huang Y, Ng TT, Ma G (2018) A geometric potential-based contact detection algorithm for egg-shaped particles in discrete element modeling. Powder Technol 327:152–162
195. Zhu JH, Zaman MM, Anderson SA (1998) Modelling of shearing behaviour of a residual soil with recurrent neural network. Int J Numer Anal Methods Geomech 22:671–687
196. Zienkiewicz OC, Taylor RL, Zhu JZ (2005) The finite element method: its basis and fundamentals. Elsevier, Amsterdam