
Physics-Informed Neural Networks for Quantum Eigenvalue Problems

Henry Jin
School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
[email protected]

Marios Mattheakis
School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
[email protected]

Pavlos Protopapas
School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
[email protected]

arXiv:2203.00451v1 [cs.LG] 24 Feb 2022

Abstract—Eigenvalue problems are critical to several fields of science and engineering. We expand on the method of using unsupervised neural networks for discovering eigenfunctions and eigenvalues for differential eigenvalue problems. The obtained solutions are given in an analytical and differentiable form that identically satisfies the desired boundary conditions. The network optimization is data-free and depends solely on the predictions of the neural network. We introduce two physics-informed loss functions. The first, called ortho-loss, motivates the network to discover pair-wise orthogonal eigenfunctions. The second loss term, called norm-loss, requests the discovery of normalized eigenfunctions and is used to avoid trivial solutions. We find that embedding even or odd symmetries to the neural network architecture further improves the convergence for relevant problems. Lastly, a patience condition can be used to automatically recognize eigenfunction solutions. This proposed unsupervised learning method is used to solve the finite well, multiple finite wells, and hydrogen atom eigenvalue quantum problems.

Index Terms—neural networks, eigenvalue, eigenfunction, differential equation

I. INTRODUCTION

Differential equations are prevalent in every field of science and engineering, ranging from physics to economics. Thus, extensive research has been done on developing numerical methods for solving differential equations. With the unprecedented availability of computational power, neural networks hold promise in redefining how computational problems are solved or improving existing numerical methods. Among other applications in scientific computing, neural networks are capable of efficiently solving differential equations [1]–[4]. These neural network solvers pose several advantages over numerical integrators: the obtained solutions are analytical and differentiable [3], numerical errors are not accumulated [4], networks are more robust against the 'curse of dimensionality' [5], [6], a family of solutions corresponding to different initial or boundary conditions can be constructed [7], the neural solutions can be transferred for fast discovery of new solutions [8], [9], inverse problems can be solved systematically [10], [11], and available data can be incorporated into the loss function to improve the network's performance [12].

Eigenvalue differential equations with certain boundary conditions appear in a wide range of problems of applied mathematics and physics, including quantum mechanics and electromagnetism. Lagaris et al. [1] have shown that neural networks are able to solve eigenvalue problems and proposed a partially iterative method that solves a differential equation with a fixed eigenvalue at each iteration. More recently, Li et al. [13] showed that neural networks can solve the stationary Schrödinger equation for systems of coupled quantum oscillators. This is a variational approach where the eigenvalue is indirectly calculated from the predicted eigenfunction. Our work expands on the unsupervised neural network eigenvalue solver presented by Jin et al. [14], which simultaneously and directly learns the eigenvalues and the associated eigenfunctions using a scanning mechanism. Here, we introduce physics-informed improvements to the regularization loss terms: orthogonal loss (ortho-loss) and normalization loss (norm-loss). We further design special neural network architectures with embedded symmetries that ensure the prediction of perfectly even or odd eigenfunctions. Furthermore, a modified parameterization is introduced to handle problems with non-zero boundary conditions. The proposed technique is an extension to physics-informed neural network differential equation solvers and, consequently, inherits all the benefits that neural network solvers have over numerical integrators. Moreover, our method has an additional advantage over integrators in that it discovers solutions that identically satisfy the boundary conditions. We assess the performance of the proposed architecture by solving a number of standard eigenvalue problems of quantum mechanics: the single finite square well, multiple finite square wells, and the hydrogen atom.

II. BACKGROUND

This study extends the method presented in [14], where a fully connected neural network architecture was proposed,
with a single output corresponding to the predicted eigenfunction, and with a constant input node designed to learn constant eigenvalues through backpropagation. To identically satisfy the boundary conditions, a parametric function was used. In order for the network to find non-trivial solutions to the differential eigenvalue equation, the two regularization loss functions

L_f = \frac{1}{f(x, \lambda)^2}, \qquad L_\lambda = \frac{1}{\lambda^2},   (1)

were used to penalize trivial eigenfunctions and zero eigenvalues, respectively. Moreover, a scanning mechanism allows the network to search the eigenvalue space for eigenfunctions of different eigenvalues, enabled by the loss term

L_{drive} = e^{-\lambda + c},   (2)

where c is a value that changes during training through scheduled increases and is used to control the scanning.

The research by Li et al. [13] on neural network-based multi-state solvers is also relevant to this study. However, we present some novelties and differences in methodology. Specifically, we assign a trainable network parameter to discover the eigenvalue instead of indirectly calculating it through the expectation of the Hamiltonian of the system. Our approach avoids the repeated calculation of an integral (i.e., for the expectation value), which would otherwise be evaluated at every training epoch. The second novelty of our approach is the embedding of physical symmetries in the network architecture. The symmetry of the wavefunctions can be determined by the symmetry of the given potential function. We design a specialized architecture with embedded even or odd symmetry that significantly improves the overall network optimization. Finally, we suggest a parameterization that identically satisfies non-zero boundary conditions, which is necessary to solve the radial equation of the hydrogen atom.

Orthogonality loss is also used in [13], where it is leveraged to simultaneously produce multiple eigenvalue solution outputs that are pair-wise orthogonal. This differs from our method, since our neural network outputs one solution at a time, and the orthogonality loss term is used to prevent us from finding the same solution multiple times.

III. METHODOLOGY

We consider an eigenvalue problem of the form:

\mathcal{L} f(x) = \lambda f(x),   (3)

where x is the spatial variable, \mathcal{L} is a differential operator that depends on x and its derivatives, f(x) is the eigenfunction, and λ is the associated eigenvalue. For the finite square well problems, we assume homogeneous Dirichlet boundary conditions at the left and right boundaries x_L and x_R, respectively, such that f(x_L) = f(x_R) = f_b, where f_b is a given constant boundary value. On the other hand, for the hydrogen atom problem, a single Dirichlet boundary condition f(x_R) = f_b is enforced.

We expand on the network architecture proposed by [14] and shown in Fig. 1. This feed-forward neural network is capable of solving Eq. (3) when both f(x) and λ are unknown. The network takes two inputs, the variable x and a constant input of ones. The constant input feeds into a single linear neuron (affine transformation) that is updated through optimization, allowing the network to find a constant λ. Afterwards, x and λ are inputs to a fully-connected feed-forward neural network that returns an output function N(x, λ). The predicted eigenfunctions f(x, λ) are defined using a parametric trick, similar to [4], according to the equation:

f(x, \lambda) = f_b + g(x) N(x, \lambda).   (4)

By choosing an appropriate g(x), the predicted eigenfunction identically satisfies certain boundary conditions.

Fig. 1: Physics-informed neural architecture for solving eigenvalue problems.

Our aim is to discover pairs of f(x, λ) and λ that approximately satisfy Eq. (3). This is achieved by minimizing, during the network optimization, a loss function L constructed from Eq. (3) as:

L = L_{DE} + L_{reg} = \frac{1}{M} \sum_{i=1}^{M} \big( \mathcal{L} f(x_i, \lambda) - \lambda f(x_i, \lambda) \big)^2 + L_{reg},   (5)

where the averaging with respect to x_i takes place in L_{DE} over M training sample points, namely x = (x_1, ..., x_M). Any derivative with respect to x_i contained in \mathcal{L} is calculated using the auto-differentiation technique [15]. The L_{reg} term in Eq. (5) contains regularization loss terms. In this work, we introduce and apply a regularization function that consists of three terms: L_{reg} = \nu_{norm} L_{norm} + \nu_{orth} L_{orth} + \nu_{drive} L_{drive}. Empirically, for the problems discussed below, we found the optimal regularization coefficients \nu_{norm} = \nu_{orth} = 1. The normalization loss L_{norm} encourages normalized eigenfunctions, avoiding the discovery of trivial eigenfunctions and eigenvalues, since it enforces a non-zero solution as well as constraining the eigenfunction's squared integral to be finite. The L_{orth} term motivates the network to scan for orthogonal eigenfunctions and can replace or assist the non-physical scanning (L_{drive}) method used in [14]. L_{drive} corresponds to the scanning method used to guide the model's eigenvalue weight and is given by Eq. (2). However, for the experiments presented in this study, we use L_{orth} as a physics-informed regularization term that replaces the non-physical scanning method with L_{drive}, and thus \nu_{drive} is set to 0.
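To make the architecture and loss concrete, the following is a minimal PyTorch sketch of the Fig. 1 setup and the L_DE term of Eq. (5). It is an illustration under stated assumptions, not the published implementation (which is linked in Sec. IV): the names EigenNet, Sine, and de_loss are ours, and the operator is hard-coded to the stationary Schrödinger form used later in Eq. (10).

```python
import torch
import torch.nn as nn

class Sine(nn.Module):
    """sin(.) activation, as used in the paper."""
    def forward(self, x):
        return torch.sin(x)

class EigenNet(nn.Module):
    """Fig. 1 sketch: a constant input of ones feeds one linear neuron
    whose weight becomes the eigenvalue lambda; (x, lambda) then feed a
    fully connected network that outputs N(x, lambda)."""
    def __init__(self, hidden=50):
        super().__init__()
        self.lam = nn.Linear(1, 1, bias=False)  # trainable eigenvalue node
        self.body = nn.Sequential(
            nn.Linear(2, hidden), Sine(),
            nn.Linear(hidden, hidden), Sine(),
            nn.Linear(hidden, 1),
        )
    def forward(self, x):
        lam = self.lam(torch.ones_like(x))            # constant lambda
        return self.body(torch.cat([x, lam], dim=1)), lam

def de_loss(model, x, V, g, fb=0.0):
    """L_DE of Eq. (5) for the stationary Schrodinger operator with
    hbar = m = 1; g(x) enforces the boundary conditions via Eq. (4)."""
    x = x.clone().requires_grad_(True)
    N, lam = model(x)
    f = fb + g(x) * N                                 # Eq. (4)
    ones = torch.ones_like(f)
    df = torch.autograd.grad(f, x, ones, create_graph=True)[0]
    d2f = torch.autograd.grad(df, x, ones, create_graph=True)[0]
    res = -0.5 * d2f + V(x) * f - lam * f             # L f - lambda f
    return (res ** 2).mean(), f, lam
```

The second derivative is obtained by two nested autograd calls with create_graph=True, which is what makes the differential-equation residual itself differentiable with respect to the network weights.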
A. Normalization Loss

Our contribution includes a novel approach to solving the trivial-solution problem. While [14] employed the non-trivial eigenfunction and non-trivial eigenvalue loss terms L_f and L_λ of Eq. (1), these loss terms cannot numerically converge to 0 without scaling the solutions to infinity, and thus they introduce numerical error. While they were effective for preventing the network from converging to trivial f(x) and λ, they hold no physical meaning. We present a physics-aware regularization loss function that not only prevents trivial solutions, but also motivates the eigenfunction's inner product with itself to approach a specific constant, which is the normalization constraint physically required of eigenfunctions in quantum mechanics. Thus, L_norm is given by

L_{norm} = \left( f(x, \lambda) \cdot f(x, \lambda) - \frac{M}{x_R - x_L} \right)^2,   (6)

where the dot denotes the inner product, f(x, λ) represents the network solution evaluated on the sample points, M is the number of samples, and x_R − x_L is the training range. The loss function in Eq. (6) drives the network to find solutions with non-zero integrals; specifically, it motivates the network solution to have a squared integral equal to one, since the Riemann-sum approximation \int f^2 \, dx \approx (f \cdot f)(x_R - x_L)/M equals 1 exactly when f \cdot f = M/(x_R - x_L). Unlike L_f and L_λ, L_norm can strictly reach zero and can also satisfy the normality constraint for eigenfunction solutions of Schrödinger's equation.
demonstrate below.
B. Orthogonality Loss

An orthogonality loss regularization function is included as part of L_reg to motivate the network to find different eigensolutions of the Schrödinger equation. This presents a physics-informed approach whereby we can motivate a network to solve for orthogonal solutions for problems where it is known that solutions are orthogonal, a fundamental property of linear differential eigenvalue problems. Schrödinger's equation is one such example, but this mechanism can be extended to any Hermitian operator. This serves as a replacement or an improvement over solely relying on the scanning mechanism L_drive presented in [14]. While a scanning search through the eigenvalue space using L_drive can be useful for providing control over the model's search for eigenfunction solutions, solving equations that are known to be Hermitian (such as the Schrödinger equation) allows the use of an orthogonality loss term, since eigenfunctions of Hermitian operators are orthogonal. In this paper, we show that the neural network is able to find orthogonal eigenfunction solutions based solely on the orthogonality loss. This loss term is given by

L_{orth} = \psi_{eigen} \cdot \psi,   (7)

where ψ_eigen denotes the sum of all eigenfunctions that have already been discovered by the network during training, and ψ is the current network prediction. This regularization term embeds the network with a physics-informed predisposition towards finding orthogonal solutions to a Hermitian operator, serving as a more physics-aware loss term than the brute-force scanning approach.

Following the network's convergence to a new solution, the new eigenfunction is added to ψ_eigen, which is thus the linear combination of all the discovered solutions. Hence, a single orthogonality loss term is computed for each learning gradient, as opposed to separate orthogonality computations for each learned eigenfunction. This reduces computational cost, since only one dot product is computed per training iteration instead of one dot product per found eigenfunction.
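A sketch of Eq. (7), assuming the accepted eigenfunctions are stored as tensors evaluated on the same sample grid as the current prediction (the list name found and the function name are ours):

```python
def ortho_loss(f, found):
    """Eq. (7): dot product of the current prediction with the running
    sum psi_eigen of all previously accepted eigenfunctions."""
    if not found:
        return torch.zeros((), dtype=f.dtype)
    psi_eigen = torch.stack([p.flatten() for p in found]).sum(dim=0)
    return torch.dot(psi_eigen, f.flatten())
```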
C. Embedding Even and Odd Symmetry

For certain differential equations where prior information about the potential dictates even or odd symmetric eigenfunctions, the neural network architecture can be embedded with a physics-informed modification that enforces the correct symmetry in the eigenfunction output. As demonstrated by Mattheakis et al. in [16] and extended by [17], symmetry can be embedded by feeding a negated input stream in parallel to the original input and then combining the streams before the final dense layer. Adding the streams leads to even symmetric outputs, while subtracting gives rise to odd symmetric predictions.

We found that embedding symmetry into our model significantly accelerates the convergence to a solution. This is relevant for the multiple finite square wells problem, as we demonstrate below.
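A hedged sketch of this symmetry hub, following the construction in [16]; the module name and internals are our illustrative choices, not the published implementation. In the full solver, the eigenvalue λ would be concatenated to both streams as in the EigenNet sketch above, and switching the model symmetry (Algorithm 1 below) amounts to toggling the even flag.

```python
class SymmetricNet(nn.Module):
    """Even/odd embedding: pass x and -x through shared layers, then
    add (even) or subtract (odd) the streams before the final layer."""
    def __init__(self, hidden=50, even=True):
        super().__init__()
        self.even = even
        self.shared = nn.Sequential(
            nn.Linear(1, hidden), Sine(),
            nn.Linear(hidden, hidden), Sine(),
        )
        # no bias, so the odd combination stays exactly odd
        self.head = nn.Linear(hidden, 1, bias=False)
    def forward(self, x):
        h = (self.shared(x) + self.shared(-x)) if self.even \
            else (self.shared(x) - self.shared(-x))
        return self.head(h)
```

Because the final layer is linear and bias-free, the output satisfies N(−x) = ±N(x) exactly, not just approximately.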
D. Parametric Function

Selecting an appropriate parametric function g(x) is necessary for enforcing boundary conditions. The following parametric equation enforces the f(x_L) = f(x_R) = 0 boundary conditions:

g(x) = \left( 1 - e^{-(x - x_L)} \right) \left( 1 - e^{-(x - x_R)} \right).   (8)

As demonstrated in [14], this parametric function is suitable for problems where the eigenfunctions are fixed to or converge to zero, as in the case of the infinite square well and the harmonic oscillator problems. In the following experiments, we employ the parametric function of Eq. (8) for the finite square well problems, as they similarly require eigenfunctions that taper to zero at the domain limits.

The differential eigenvalue equation for the hydrogen atom, however, has a single zero boundary condition at x → ∞, as the fundamental solution is not fixed to 0 at the origin. For such problems, where a single Dirichlet boundary condition is required, we use the following parametric function:

g(x) = 1 - e^{-(x - x_R)}.   (9)
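Both parametric functions are one-liners (the names are ours):

```python
def g_two_sided(x, x_L, x_R):
    """Eq. (8): vanishes at both boundaries x_L and x_R."""
    return (1 - torch.exp(-(x - x_L))) * (1 - torch.exp(-(x - x_R)))

def g_one_sided(x, x_R):
    """Eq. (9): vanishes at the single boundary x_R."""
    return 1 - torch.exp(-(x - x_R))
```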
generality, we used Schrodinger’s equation. Nevertheless, the this study, we are interested in solving the one-dimensional
method is valid for any differential equation eigenvalue prob- stationary Schrodinger’s equation defined as:
lem. Considering that perfect eigenvalue solutions will have
~2 ∂ 2
 
an LDE loss equal to zero, we claim that a solution is found − + V (x) ψ(x) = Eψ(x), (10)
when LDE falls below a chosen threshold, which is a hyper- 2m ∂x2
parameter in the training process. where ~ and m stand for the reduced Planck constant and
The patience condition describes the model’s training the mass respectively, which without loss of generality, can
progress. When solving for a solution, the model initially be set to ~ = m = 1. Equation (10) defines an eigenvalue
improves very quickly, resulting in a fast decrease of LDE . problem where ψ(x) and E denote the eigenfunction f (x, λ)
However, over the course of converging to a solution, the rate and eigenvalue λ pair. The differential equation loss for this
of decrease in LDE decreases as well. Thus, we use the rate of one-dimensional stationary Schrodinger’s equation is given by
decrease in LDE as another condition for solution recognition. Equation (11), and henceforth we call this the Schrodinger
If the rolling average during the training iterations of the equation loss.
successive differences in LDE over a specified window hyper-
parameter falls below a chosen threshold hyper-parameter, we M
~2 ∂ 2
 
consider the patience condition to be met. 1 X 2
LDE = − 2
+ V (x i ) f (xi , E) − Ef (xi , E) .
When both the LDE condition (LDE falling below a thresh- M i=1 2m ∂x
old) and the patience condition are satisfied, we consider an (11)
eigenvalue solution to have been found. On the other hand,
A boundary condition eigenvalue problem is defined by
if only the patience condition is satisfied, then we interpret
considering a certain potential function V (x) and bound-
this to mean that the model has converged to a false solution.
ary conditions for ψ(x). We assess the performance of the
Consequently, we switch the symmetry (from even to odd
proposed network architecture by solving Eq. (10) for the
symmetry or vice versa) of the model to motivate the network
potential functions of the single finite well, multiple coupled
to search for other solutions. This approach of switching the
finite wells, and the radial equation for the hydrogen atom, all
symmetry of the model upon converging to a false solution
of which have known analytical solutions.
was inspired by our finding that the network’s function output
after converging to false solutions resembled true solutions, For the training, a batch of xi points in the interval [xL , xR ]
but of the opposite symmetry. Upon adopting this switching is selected as input. In every training iteration (epoch) the
approach, we found that the model was able to resume finding input points are perturbed by a Gaussian noise to prevent the
true solutions. The above method is described by Algorithm network from learning the solutions only at fixed points. Adam
1. optimizer is used with a learning rate of 8 · 10−3 . We use two
hidden layers of 50 neurons per layer with trigonometric sin(·)
Algorithm 1: The Physics-Informed Neural Eigenvalue Solver Algorithm

1: Instantiate model with even symmetry
2: while training do
3:   Generate training samples x_i
4:   Compute L_DE, L_norm
5:   Compute L_orth using all stored eigenfunctions
6:   Backpropagate and step
7:   if patience condition and L_DE < threshold then
8:     Store copy of model
9:   else if patience condition then
10:    Switch model symmetry
11:  end if
12: end while
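A sketch of the two acceptance criteria; the window size and both tolerances are hyper-parameters whose values here are placeholders, not the ones used in the experiments:

```python
from collections import deque

class SolutionRecognizer:
    """Tracks L_DE during training. patience() fires when the rolling
    average of successive L_DE differences over a window drops below a
    tolerance; solved() additionally requires L_DE below its threshold."""
    def __init__(self, window=500, patience_tol=1e-6, de_tol=1e-3):
        self.history = deque(maxlen=window + 1)
        self.patience_tol, self.de_tol = patience_tol, de_tol
    def update(self, L_DE):
        self.history.append(float(L_DE))
    def patience(self):
        if len(self.history) < self.history.maxlen:
            return False
        h = list(self.history)
        diffs = [abs(b - a) for a, b in zip(h, h[1:])]
        return sum(diffs) / len(diffs) < self.patience_tol
    def solved(self):
        return self.patience() and self.history[-1] < self.de_tol
```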
IV. EXPERIMENTS

We evaluate the effectiveness of the proposed method by solving eigenvalue problems defined by Schrödinger's equation. Schrödinger's equation is the fundamental equation in quantum mechanics that describes the state wavefunction ψ(x) and the associated energy E of a quantum system. In this study, we are interested in solving the one-dimensional stationary Schrödinger's equation, defined as:

\left( -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} + V(x) \right) \psi(x) = E \psi(x),   (10)

where ℏ and m stand for the reduced Planck constant and the mass, respectively, which without loss of generality can be set to ℏ = m = 1. Equation (10) defines an eigenvalue problem where ψ(x) and E denote the eigenfunction f(x, λ) and eigenvalue λ pair. The differential equation loss for this one-dimensional stationary Schrödinger's equation is given by Eq. (11), and henceforth we call this the Schrödinger equation loss:

L_{DE} = \frac{1}{M} \sum_{i=1}^{M} \left( \left( -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} + V(x_i) \right) f(x_i, E) - E f(x_i, E) \right)^2.   (11)

A boundary condition eigenvalue problem is defined by considering a certain potential function V(x) and boundary conditions for ψ(x). We assess the performance of the proposed network architecture by solving Eq. (10) for the potential functions of the single finite well, multiple coupled finite wells, and the radial equation of the hydrogen atom, all of which have known analytical solutions.

For the training, a batch of x_i points in the interval [x_L, x_R] is selected as input. In every training iteration (epoch) the input points are perturbed by Gaussian noise to prevent the network from learning the solutions only at fixed points. The Adam optimizer is used with a learning rate of 8·10⁻³. We use two hidden layers of 50 neurons per layer with the trigonometric sin(·) activation function. The use of sin(·) instead of more common activation functions, such as Sigmoid(·) and tanh(·), significantly accelerates the network's convergence to a solution [4]. We implemented the proposed neural network in PyTorch [15] and published the code on GitHub (https://github.com/henry1jin/quantumNN).
ψ(x) and the associated energy E of a quantum system. In 1 https://github.com/henry1jin/quantumNN
For regions where the eigenvalue E is smaller than the the quantum finite well. The lower left panel outlines the
potential energy, the solution’s general form is LDE during the training. The red curve in the upper left
graph demonstrates the predicted energies where the plateaus
p
2m(V0 − E) indicate the discovery of an eigenstate; the dashed black lines
−αx αx
ψ = Ce + De , α= . (14) show the ground truth energies. On the right side, the four
~
predicted eigenfunctions are represented by blue lines; the
The solutions then for Eq. (12) is the following piece-wise bottom graph corresponds to the ground state. In particular,
eigenfunction, where constants c1 , c2 , and δ, are determined the neural network finds for the ground state solution with
by the requirement that the eigenfunction is continuous, con- energy E = 0.3586. After the first solution is found, we
tinuously differentiable, and normalized. introduce the orthogonal loss term into the training, motivating

αx the network to find a new eigenfunction. Consequently, the
c1 e x ≤ 0,
eigenvalue weight departs from its first value and rises to find

ψ(x) = c2 sin(kx + δ) 0 < x ≤ `, (15) the next even-symmetry solution with eigenvalue E = 3.2132
 −αx

c1 e x>` (the third graph on the right side in Fig. 2, counting from the
bottom). Once the patience condition is reached, the network
The ψ(x) eigenfunctions must decay to infinity outside the
automatically adds the latest solution to the orthogonal loss,
walls, implying the boundary conditions ψ(−∞) = ψ(∞) =
motivating the network to once again depart its solution in
0. In numerical methods, infinity is approximated with large
search of the next orthogonal solution. The model converges
values relative to the potentials. We adopt the approximate
to an eigenvalue of around E = 1.8, however it does not
boundary conditions of ψ(xL ) = ψ(xR ) = 0 with the choice
meet both conditions for solution acceptance. In particular,
xL = xR = 6`, for ` = 1 and V0 = 20. The proposed model
it does not meet the LDE condition. We take this to mean
with the orthogonal loss term is capable of solving for all
that, while the model has converged, it has converged to a
bound eigenstates. In the following we use the neural network
false solution. So the symmetry of the model is switched
to approximate the first four eigenfunctions and the associated
to odd symmetry. The next two solutions found are odd-
energies.
symmetric and correspond to the eigenvalues of E = 1.4322
and E = 5.6873 shown respectively by the second and fourth
images in the right panel of Fig. 2.

Fig. 2: The plot on the upper left displays the model's eigenvalue weight (i.e. energy) during the training process. Horizontal dashed black lines demarcate the true eigenvalues, which our network accurately finds (plateaus of the red line). The lower left plot shows the corresponding Schrödinger equation loss L_DE over epochs during training. At 15000 epochs, orthogonal loss with the first eigenfunction is introduced. Each following spike in L_DE indicates the point where the model reaches the patience condition and the latest eigenfunction is added to the orthogonal loss term. The column of plots on the right shows the resulting eigenfunctions that the model finds when the predicted energy converges to a plateau.

We start the network optimization by using a neural network with even symmetry embedded. Figure 2 summarizes the results for the discovery of the first four eigenstates of the quantum finite well. The lower left panel outlines L_DE during the training. The red curve in the upper left graph demonstrates the predicted energies, where the plateaus indicate the discovery of an eigenstate; the dashed black lines show the ground truth energies. On the right side, the four predicted eigenfunctions are represented by blue lines; the bottom graph corresponds to the ground state. In particular, the neural network finds the ground state solution with energy E = 0.3586. After the first solution is found, we introduce the orthogonal loss term into the training, motivating the network to find a new eigenfunction. Consequently, the eigenvalue weight departs from its first value and rises to find the next even-symmetry solution, with eigenvalue E = 3.2132 (the third graph on the right side of Fig. 2, counting from the bottom). Once the patience condition is reached, the network automatically adds the latest solution to the orthogonal loss, motivating the network to once again depart from its solution in search of the next orthogonal solution. The model converges to an eigenvalue of around E = 1.8; however, it does not meet both conditions for solution acceptance. In particular, it does not meet the L_DE condition. We take this to mean that, while the model has converged, it has converged to a false solution, so the symmetry of the model is switched to odd symmetry. The next two solutions found are odd-symmetric and correspond to the eigenvalues E = 1.4322 and E = 5.6873, shown respectively by the second and fourth images in the right panel of Fig. 2.

B. Multiple Finite Square Wells

The single finite well potential can be repeatedly spaced to create a potential function that consists of multiple square wells as follows:

V(x) = \begin{cases} 0 & 2n\ell \le x \le (2n+1)\ell \\ V_0 & \text{otherwise} \end{cases},   (16)

where n is an element of a subset of nonzero integers. Like the single finite well, solutions to the multiple square wells are piece-wise constructed by solving for each discrete region and 'stitching' the solutions. The general forms of the solutions in each region, namely inside and outside a well, are once again given by Eq. (13) and Eq. (14), respectively. The boundary conditions at infinity are also approximated by large values of x relative to the potential, that is, ψ(x → ±∞) = 0.

Our deep learning technique applied to the multiple wells solves for an arbitrary number of the solutions. Figure 3 shows our neural network finding the four lowest-energy (i.e. lowest-eigenvalue) states of the double finite square well. Similar to the single finite well problem, the model here uses the physics-informed approach of solving for orthogonal eigenfunctions with the orthogonal loss term, given the knowledge that solutions to Hermitian differential operators must be orthogonal.

Fig. 3: The top left plot shows the network's eigenvalue over epochs during training, with true eigenvalues shown by the dotted horizontal lines. The bottom left plot shows the model's corresponding Schrödinger equation loss at each training point. The column of plots on the right shows the resulting eigenfunctions that the model finds.
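A sketch of the two potentials, Eq. (12) and Eq. (16); the choice n ∈ {0, 1} for the double well is our reading of the experiment, not stated explicitly:

```python
def V_single(x, ell=1.0, V0=20.0):
    """Eq. (12): zero inside [0, ell], V0 elsewhere."""
    inside = (x >= 0) & (x <= ell)
    return torch.where(inside, torch.zeros_like(x), torch.full_like(x, V0))

def V_double(x, ell=1.0, V0=20.0, ns=(0, 1)):
    """Eq. (16): zero on [2n*ell, (2n+1)*ell] for each n, V0 elsewhere."""
    inside = torch.zeros_like(x, dtype=torch.bool)
    for n in ns:
        inside |= (x >= 2 * n * ell) & (x <= (2 * n + 1) * ell)
    return torch.where(inside, torch.zeros_like(x), torch.full_like(x, V0))
```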
C. Symmetry vs No Symmetry

Embedding symmetry into the network for problems where the solutions are known to be either even or odd symmetric proved to greatly improve the solution accuracy. Figure 4 compares the eigenvalues (energies) predicted by the two models, one with embedded symmetry (blue line) and one without (red line). While the symmetry-embedded model is able to smoothly transition from one correct eigenvalue to the next one, the model without embedded symmetry converges to an incorrect eigenvalue.

Fig. 4: Predicted eigenvalue energies of the quantum multiple wells during the training of a network with embedded symmetry (blue) and a network without any embedded symmetry (red). Dotted black horizontal lines are the analytical, ground-truth eigenvalues.

D. Hydrogen Atom

In quantum mechanics, the hydrogen atom is described by the three-dimensional Schrödinger equation with a Coulomb potential energy. While the hydrogen atom is a three-dimensional problem, the equation can be decomposed into radial and angular components via separation of variables. The angular equation yields the spherical harmonics solutions, while the radial component R(r) equation to be solved is given by:

\frac{d^2 R}{dr^2} + \frac{2}{r} \frac{dR}{dr} = -\frac{2\mu}{\hbar^2} \left( E + \frac{Z e^2}{4 \pi \epsilon_0 r} - \frac{l(l+1)}{r^2} \right) R,   (17)

where r is the radial variable, ℏ is the reduced Planck constant, µ is the reduced mass, Z is the number of protons, ε₀ denotes the vacuum permittivity, and the variable l denotes the angular momentum of the system and takes non-negative integer values. We employ the proposed neural network to solve Eq. (17) for l = 0, 1, 2, 3.

We note that Eq. (17) becomes singular at r = 0. Consequently, training sample points close to r = 0 lead to numerical instability. To avoid this problem, we allocate the region r ∈ [0, 10⁻¹] to be a no-train zone. Thus, any training sample points that are generated are greater than r = 10⁻¹. Without this constraint, the numerical instability caused by sample points close to 0 disrupts the network's ability to converge to solutions.

The analytical eigenvalue energies of the hydrogen atom are given by

E_n = -\frac{\mu Z^2 e^4}{32 \epsilon_0^2 \hbar^2 \pi^2 n^2},   (18)

where n denotes the order of the solutions. Namely, for n = 1 we get the ground energy. We notice that the eigenvalue energies do not depend on the system's angular momentum l, but only on the system's order of excitation n.

The full three-dimensional solution to the Schrödinger equation for the hydrogen atom creates probability densities. The densities have not only radial dependence, but also angular dependence. For our work, we focused solely on the radial component of the Schrödinger equation. Furthermore, without loss of generality, we set \frac{\mu Z^2 e^4}{8 \epsilon_0^2 h^2} = \frac{1}{2}, so that E_n = -\frac{1}{2 n^2}.

We demonstrate that our method solves for the first few eigenfunctions for four different angular momentum values l = 0, 1, 2, 3. Figure 5 shows our model's solutions, arranged in a grid with the angular momentum l running along the vertical axis and the energy level n running along the horizontal axis.

Our method is able to solve for the lowest eigenvalue-eigenfunction pairs with good accuracy. We analysed the accuracy of our method's solutions in comparison to the true solutions, which are analytically known. Table I shows these results.
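A sketch of the no-train-zone sampling and of the analytic energies under the rescaling above (the upper bound r_max is a placeholder, and the closed form for E_n follows from our reading of the rescaled Eq. (18)):

```python
def sample_radial(M=200, r_min=0.1, r_max=20.0):
    """Uniform samples in [r_min, r_max]; [0, r_min] is the no-train
    zone that avoids the r = 0 singularity of Eq. (17)."""
    return r_min + (r_max - r_min) * torch.rand(M, 1)

def E_hydrogen(n):
    """Analytic energies under the rescaling above: E_n = -1/(2 n^2)."""
    return -0.5 / n ** 2
```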
V. CONCLUSION

In recent years, there has been a growing interest in the application of neural networks to study differential equations. In this study, we introduced a neural network that is capable of discovering eigenvalues and eigenfunctions for boundary-conditioned differential eigenvalue problems. The obtained solutions identically satisfy the given boundary conditions via a parametric function. We imposed even and odd symmetry in the network structure for problems that require such solutions, such as the single and multiple finite wells. We also introduced an orthogonality loss, which allows the network to learn new eigenfunctions that are orthogonal to all previously learned eigenfunctions. Furthermore, a normalization loss was used to enforce that the learned solutions are not trivial solutions, and that the quantum physical interpretation of eigenfunctions as probability distributions can be supported. The optimization solely depends on the network's predictions, constituting an unsupervised learning method. We demonstrated the capability of the proposed architecture and training methodologies by solving the finite well, multiple finite wells, and hydrogen atom quantum problems.
VI. FUTURE RESEARCH

For future work, we will generalise our method in two ways. One generalisation is towards more dimensions. For instance, the full solutions to the hydrogen atom Schrödinger equation are three-dimensional. We believe such generalisations will also more clearly reveal the advantages of solving such equations with neural networks. It is also possible to extend into the temporal dimension and solve the time-dependent Schrödinger equation. The other avenue for future research is to apply our method to more general eigenvalue differential equations. This paper focuses on the Schrödinger equation, which belongs to the Sturm-Liouville family. This study lays the groundwork for using neural networks to solve any eigenvalue differential equation.
Fig. 5: Upper triangular plots show the model's predicted eigenfunctions for angular momentum values l = [0, 3] inclusive, and for energy levels n = [1, 4] also inclusive. The rightmost plot shows the true eigenvalues in dashed black, with our method's found solutions in colors. The red line is the Coulomb potential function that describes the radial hydrogen atom problem.
TABLE I: Performance results (%). This table presents comparisons of our model's solutions with the true analytical solutions. We find that our model consistently recovers the true eigenvalues to approximately 1% error. Mean squared error denotes the mean squared error of the eigenfunction divided by the maximum of the absolute value of the true eigenfunction.

Schrödinger Problem | Eigenvalue Err. | Mean Squared Err.
Single (n = 1, s)   | 0.00 | 8.9e-4
Single (n = 1, a)   | 0.25 | 3.7e-4
Single (n = 2, s)   | 0.25 | 9.1e-4
Single (n = 2, a)   | 0.46 | 4.8e-4
Double (n = 1, s)   | 0.25 | 6.3e-4
Double (n = 1, a)   | 0.32 | 4.8e-4
Double (n = 2, s)   | 0.56 | 8.7e-4
Double (n = 2, a)   | 0.61 | 7.1e-4
H (l = 0, n = 1)    | 1.46 | 4.0e-3
H (l = 0, n = 2)    | 0.08 | 5.8e-3
H (l = 0, n = 3)    | 0.70 | 8.2e-3
H (l = 0, n = 4)    | 1.10 | 1.0e-2
H (l = 1, n = 2)    | 4.08 | 3.8e-5
H (l = 1, n = 3)    | 2.38 | 1.8e-4
H (l = 1, n = 4)    | 2.52 | 2.2e-4
H (l = 2, n = 3)    | 0.18 | 6.9e-6
H (l = 2, n = 4)    | 0.48 | 6.2e-4
H (l = 3, n = 4)    | 1.44 | 6.5e-3

VII. BROADER IMPACT

This work is valuable for computational physicists and applied mathematicians, as well as in any field where differential eigenvalue problems may arise. We have demonstrated our method's success for the one-dimensional Schrödinger equation, but the technique can be generalised to Sturm-Liouville problems, as well as higher-dimensional equations (e.g. the 3D Schrödinger and Helmholtz equations). We strongly believe that this study will serve as the groundwork for future work in the area of solving differential equations using deep learning methods. We neither foresee nor desire our research results to be used for any kind of discrimination.
REFERENCES

[1] I. Lagaris, A. Likas, and D. Fotiadis, "Artificial neural network methods in quantum mechanics," Computer Physics Communications, vol. 104, no. 1, pp. 1–14, 1997. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0010465597000544
[2] M. Magill, F. Qureshi, and H. W. de Haan, "Neural networks trained to solve differential equations learn general representations," in NeurIPS, 2018.
[3] I. Lagaris, A. Likas, and D. Fotiadis, "Artificial neural networks for solving ordinary and partial differential equations," IEEE Transactions on Neural Networks, pp. 987–1000, 10 1998.
[4] M. Mattheakis, D. Sondak, A. S. Dogra, and P. Protopapas, "Hamiltonian neural networks for solving differential equations," 2020.
[5] J. Han, A. Jentzen, and E. Weinan, "Solving high-dimensional partial differential equations using deep learning," Proceedings of the National Academy of Sciences of the United States of America, vol. 115, no. 34, pp. 8505–8510, 2017.
[6] J. A. Sirignano and K. Spiliopoulos, "DGM: A deep learning algorithm for solving partial differential equations," Journal of Computational Physics, vol. 375, pp. 1339–1364, 2018.
[7] C. Flamant, P. Protopapas, and D. Sondak, "Solving differential equations using neural network solution bundles," arXiv:2006.14372, 2020.
[8] M. Mattheakis, H. Joy, and P. Protopapas, "Unsupervised reservoir computing for solving ordinary differential equations," 2021.
[9] S. Desai, M. Mattheakis, H. Joy, P. Protopapas, and S. Roberts, "One-shot transfer learning of physics-informed neural networks," 2021.
[10] Y. Chen, L. Lu, G. E. Karniadakis, and L. D. Negro, "Physics-informed neural networks for inverse problems in nano-optics and metamaterials," Opt. Express, vol. 28, no. 8, pp. 11618–11633, Apr 2020. [Online]. Available: http://www.osapublishing.org/oe/abstract.cfm?URI=oe-28-8-11618
[11] A. Paticchio, T. Scarlatti, M. Mattheakis, P. Protopapas, and M. Brambilla, "Semi-supervised neural networks solve an inverse problem for modeling Covid-19 spread," 2020.
[12] M. Raissi, P. Perdikaris, and G. Karniadakis, "Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations," Journal of Computational Physics, vol. 378, pp. 686–707, 2019. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0021999118307125
[13] H. Li, Q. Zhai, and J. Z. Y. Chen, "Neural-network-based multistate solver for a static Schrödinger equation," Phys. Rev. A, vol. 103, p. 032405, Mar 2021. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.103.032405
[14] H. Jin, M. Mattheakis, and P. Protopapas, "Unsupervised neural networks for quantum eigenvalue problems," arXiv:2010.05075, 2020.
[15] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, "Automatic differentiation in PyTorch," 2017.
[16] M. Mattheakis, P. Protopapas, D. Sondak, M. D. Giovanni, and E. Kaxiras, "Physical symmetries embedded in neural networks," 2020.
[17] A. Bhattacharya, M. Mattheakis, and P. Protopapas, "Encoding involutory invariance in neural networks," CoRR, vol. abs/2106.12891, 2021. [Online]. Available: https://arxiv.org/abs/2106.12891
