Parthasarathy H. Advanced Probability and Statistics... 2023
Harish Parthasarathy
Professor
Electronics & Communication Engineering
Netaji Subhas Institute of Technology (NSIT)
New Delhi, Delhi-110078
First published 2023
by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
and by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
© 2023 Harish Parthasarathy and Manakin Press
CRC Press is an imprint of Informa UK Limited
The right of Harish Parthasarathy to be identified as author of this work has been asserted in
accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilised in any
form or by any electronic, mechanical, or other means, now known or hereafter invented,
including photocopying and recording, or in any information storage or retrieval system,
without permission in writing from the publishers.
For permission to photocopy or use material electronically from this work, access www.
copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact
[email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks,
and are used only for identification and explanation without intent to infringe.
Print edition not for sale in South Asia (India, Sri Lanka, Nepal, Bangladesh, Pakistan or
Bhutan).
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
A catalog record has been requested
ISBN: 9781032384375 (hbk)
ISBN: 9781032384382 (pbk)
ISBN: 9781003345060 (ebk)
DOI: 10.1201/9781003345060
Typeset in Arial, MinionPro, Symbol, CalisMTBol, TimesNewRoman, RupeeForadian,
Wingdings, ZapDingbats, Euclid, MT-Extra
by Manakin Press, Delhi
Preface
This book is primarily a book on advanced probability and statistics that
could be useful for undergraduate and postgraduate students of physics, engi-
neering and applied mathematics who desire to learn about the applications of
classical and quantum probability to problems of classical physics, signal pro-
cessing and quantum physics and quantum field theory. The prerequisites for
reading this book are basic measure theoretic probability, linear algebra, differ-
ential equations, stochastic differential equations, group representation theory
and quantum mechanics. The book deals with classical and quantum probabili-
ties including a decent discussion of Brownian motion, Poisson process and their
quantum non-commutative analogues. The basic results of measure theoretic
integration which are important in constructing the expectation of random vari-
ables are discussed. The Kolmogorov consistency theorem for the existence of
stochastic processes having given consistent finite dimensional probability dis-
tributions is also outlined. For doing quantum probability in Boson Fock space,
we require the construction of the tensor product between Hilbert spaces. This
construction based on the GNS principle, Schur’s theorem of positive definite
matrices and Kolmogorov’s consistency theorem has been outlined. The laws of
large numbers for sums of independent random variables are introduced here and
we state the fundamental inequalities and properties of Martingales originally
due to J.L.Doob culminating finally in the proof of the Martingale convergence
theorem based on the downcrossing/upcrossing inequalities. Doob’s Martingale
inequality can be used to give an easy proof of the strong law of large numbers
and we mention it here. Doob’s optional stopping theorem for submartingales is
proved and applied to calculating the distribution of the hitting times of Brownian
motion. We give another proof of this distribution based on the reflection
principle of Désiré André, which once again rests on the strong Markov property
of Brownian motion.
fields and hence is a problem in advanced stochastic field theory. We also discuss
some applications of advanced probability to general electromagnetics and ele-
mentary quantum mechanics like what is the statistics of the far field radiation
pattern produced by a random current source and how when this random elec-
tromagnetic field is incident upon an atom modeled by the Schrodinger or Dirac
equation, the stochastically averaged transition probability can be computed in
terms of the classical current correlations. Many other problems in statistical
signal processing like prediction and filtering of stationary time series, Kalman,
Extended Kalman and Unscented Kalman filters, the MUSIC and ESPRIT al-
gorithms for estimating the directions of random signal emitting sources, the
recursive least squares lattice algorithm for order and time recursive prediction
and filtering are discussed. We have also included some material on superstring
theory since it is closely connected with the theory of operators in Boson and
Fermion Fock spaces which is now an integral component of non-commutative
probability theory. Some aspects of supersymmetry have also been discussed
in this book with the hope that supersymmetric quantum systems can be used
to design quantum gates of very large size. Various aspects and applications
of large deviation theory have also been included in this book as it forms an
integral part of modern probability theory which is used to calculate the prob-
ability of rare events like the probability of a stochastic dynamical system with
weak noise exiting the stability zone. The computation of such probabilities
enables us to design controllers that will minimize this deviation probability.
The chapter on applied differential equations focuses on problems in robotics
and other engineering or physics problems wherein stochastic processes and field
inevitably enter into the description of the dynamical system. The chapter on
circuit theory and device physics has also been included since it tells us how to
obtain the governing equations for diodes and transistors from the band struc-
ture of semiconductors. When a circuit is built using such elements and thermal
noise is present in the resistances, then the noise gets distorted and even am-
plified by the nonlinearity of the device and the mathematical description of
such circuits can be calculated by perturbatively solving associated nonlinear
stochastic differential equations. Quantum scattering theory has also been in-
cluded since it tells us how quantum effects make the probability distribution of
scattered particles different from that obtained using classical scattering theory.
Thus, quantum scattering theory is an integral part of quantum probability.
Many discussions on the Boltzmann kinetic transport equation in a plasma are
included in this book since the Boltzmann distribution function at each time
t can be viewed as an evolving probability density function of a particle in
phase space. In fact, the Boltzmann equation is so fundamental that it can be
used to derive not only more precise forms of the fluid dynamical equations but
also describe the motion of conducting fluids in the presence of electromagnetic
fields. Any book on applications of advanced probability theory must therefore
necessarily include a discussion of the Boltzmann equation. It can be used to
derive the Fokker-Planck equation for diffusion processes after making approx-
imations and by including the nonlinear collision terms, it can also be used to
prove the H-theorem, i.e., the second law of thermodynamics. The section on the
Atiyah-Singer index theorem has been included because it forms an integral part
of calculating anomalies in quantum field theory which is in turn a branch of
non-commutative probability theory.
At this juncture, it must be mentioned that this book in the course of dis-
cussing applied probability and statistics, also surveys some of the research
work carried out by eminent scientists in the field of pure and applied prob-
ability, quantum probability, quantum scattering theory, group representation
theory and general relativity. In some cases, we also indicate the train of thought
processes by which these eminent scientists arrived at their fundamental con-
tributions. To start with, we review the axiomatic foundations of probability
theory due to A.N.Kolmogorov and how the Indian school of probabilists and
statisticians used this theory effectively to study a host of applied probability
and statistics problems like parameter estimation, convergence of a sequence
of probability distributions, martingale characterization of diffusions enabling
one to extend the scope of the Ito stochastic differential equations to situations
when the drift and diffusion coefficients do not satisfy Lipschitz conditions,
generalization of the large deviation principle and its application to problems
involving random environments, interacting particle systems, etc. We then discuss the work
of R.L.Hudson along with K.R.Parthasarathy on developing a coherent theory
of quantum noise and apply it to study in a rigorous mathematical way the
Schrodinger equation with quantum noise. This gives us a better understanding
of open quantum systems, i.e., systems in which the system gets coupled to a
bath, with the joint system-bath universe following a unitary evolution; after
carrying out a partial trace over the environment, one ends up with the standard
Gorini-Kossakowski-Sudarshan-Lindblad (GKSL) equation for the system state
alone. This is a non-unitary evolution. The name of George Sudarshan stands
out here as not only one of the creators of open quantum system theory but also
as the physicist involved in developing the non-orthogonal
resolution of the identity operator in Boson Fock space which enables one to
effectively solve the GKSL equation. We then discuss the work of K.B.Sinha
along with W.O.Amrein in quantum scattering theory especially in the devel-
opment of the time delay principle which computes the average time spent by
the scattered particle in a scattering state relative to the time spent by it in the
free state. We discuss the substantial contributions of the Indian school of general
relativists, like the Nobel laureate Subrahmanyan Chandrasekhar, on developing
perturbative tools for solving the Einstein-Maxwell equations in a fluid (post-
Newtonian hydrodynamics) and also the work of Abhay Ashtekar and Ashoke
Sen on canonical quantization of the gravitational field and superstring theory.
We discuss the train of thought that led the famous Indian probabilist
S.R.S.Varadhan to develop along with Daniel W.Stroock the martingale charac-
terization of diffusions and along with M.D.Donsker to develop the variational
formulation of the large deviation principle which plays a fundamental role in
assessing the role of weak noise on a system to cause it to exit a stability zone
by computing the probability of this rare event. We then discuss the work of
the legendary Indian mathematician Harish-Chandra on group representation
theory especially his creation of the discrete series of representations for groups
having both non-compact and compact Cartan subgroups to finally obtain the
Plancherel formula for such semisimple Lie groups. We discuss the impact of
Harish-Chandra’s work on modern group theoretical image processing, for ex-
ample in estimating the element of the Lorentz group that transforms a given
image field into a moving and rotating image field. We then discuss the con-
tributions of the famous Indian probabilist Gopinath Kallianpur to developing
non-linear filtering theory in its modern form along with the work of some other
probabilists like Kushner and Striebel. Kallianpur's theory of nonlinear filtering
is the most general one known today, since it is applicable to situations in which
the process and measurement noises are correlated. Kallianpur's martingale
approach to this problem has in fact directly led to the development
of the quantum filter of V.P.Belavkin as a non-commutative generalization of
the classical version. Nonlinear filtering theory has been applied in its linearized
approximate form, the Extended Kalman filter, to problems in robotics and EEG-
MRI analysis of medical data. It is applicable in fact to all problems where the
system dynamics is described by a noisy differential or difference equation and
one desires to estimate both the state and parameters of this dynamical system
on a real time basis using partial noisy measurement data. We conclude this
work with brief discussions of some of the contributions of the Indian school
of robotics and quantum signal processing to image denoising, robot control
via teleoperation and to artificial intelligence/machine learning algorithms for
estimating the nature of brain diseases from slurred speech data. This review
also includes the work of K.R.Parthasarathy on the noiseless and noisy Shan-
non coding theorems in information theory especially to problems involving the
transmission of information in the form of stationary ergodic processes through
finite memory channels and the computation of the Shannon capacity for such
problems. It also includes the pedagogical work of K.R.Parthasarathy in
simplifying the proof of Andreas Winter and A.S.Holevo on computing the capacity
of iid classical-quantum channels wherein classical alphabets are encoded into
quantum states and decoding positive operator valued measures are used in the
decoding process. We also include here the work of K.R.Parthasarathy on realiz-
ing via a quantum circuit, the recovery operators in the Knill-Laflamme theorem
for recovering the input quantum state after it has been transmitted through a
noisy quantum channel described in the form of Choi-Kraus-Stinespring operators.
Some generalizations of the single qubit error detection algorithm of Peter
Shor based on group theory due to K.R.Parthasarathy are also discussed here.
This book also contains some of the recent work of the Indian school of robotics
involving modeling the motion of rigid 3-D links in a robot using Lie group-Lie
algebra theory and differential equations on such Lie groups. After setting up
these kinematic differential equations in the Lie algebra domain of SO(3)⊗n , we
include weak noise terms coming from the torque and develop a large deviation
principle for computing the approximate probability of exit of the robot from
the stability zone. We include feedback terms to this robot system and optimize
this feedback controller so that the probability of stability zone exit computed
using the large deviation rate function is as small as possible.
Table of Contents
8. Superconductivity 245–248
We note that
$$\mathrm{Tr}(\rho\,\chi_F) = \sum_n \int_F |e_n(\omega)|^2\,dP(\omega)$$
2 Advanced Probability and Statistics: Applications to Physics and Engineering
In particular, if we choose the $e_n$'s such that $\sum_n |e_n(\omega)|^2 = 1$ for $P$-a.e. $\omega$, then
we get
$$\mathrm{Tr}(\rho\,\chi_F) = P(F),\quad F \in \mathcal{F}$$
so that $(H, \mathcal{F}, P)$ is a quantum probability space.
to choose a state such that X can be localised, then in the same state Y will
have infinite variance, so that the two can never be simultaneously measured.
Another way to state this is that if we define the joint characteristic function of
X and Y in the state ρ as
$$\psi(t,s) = \mathrm{Tr}(\rho\,\exp(i(tX + sY))),\quad t,s \in \mathbb{R}$$
then ψ will in general not be positive definite, and hence by Bochner's theorem
its inverse bivariate Fourier transform will not generally be a probability
distribution. If however X and Y commute, then ψ will always be positive definite:
$$\sum_{k,m} c_k\bar{c}_m\,\psi(t_k - t_m, s_k - s_m) = \mathrm{Tr}\Big(\rho\Big(\sum_k c_k \exp(i(t_k X + s_k Y))\Big)\Big(\sum_m c_m \exp(i(t_m X + s_m Y))\Big)^*\Big) \ge 0$$
[5] Proofs of the main theorems in classical integration theory:
[a] Monotone convergence, [b] Fatou's lemma, [c] Dominated convergence,
[d] Fubini's theorem.
If (Ω, F, μ) is a σ-finite measure space and $f_n$ is an increasing sequence of
measurable functions bounded below by an integrable function, then the
monotone convergence theorem states that
$$\lim_n \int f_n\,d\mu = \int \lim_n f_n\,d\mu$$
If $f_n$ is any sequence of measurable functions bounded below by an integrable
function, then by applying the monotone convergence theorem to the increasing
sequence $\inf_{k\ge n} f_k$, we can establish Fatou's lemma:
$$\liminf_n \int f_n\,d\mu \ge \int \liminf_n f_n\,d\mu$$
$$K : S\times S \to \mathbb{C}$$
by
$$K((x_1,\ldots,x_n),(y_1,\ldots,y_n)) = \prod_{k=1}^n \langle x_k, y_k\rangle_k$$
[8] The weak law of large numbers and the central limit theorem for sums of
independent random variables.
If X(n), n = 1, 2, ... is a sequence of independent r.v.'s with corresponding
means $\mu_n$ and corresponding finite variances $\sigma_n^2$, and if
$$\sum_{k=1}^n \sigma_k^2 = o(n^2),$$
then, writing $S_n = X(1)+\cdots+X(n)$, we have
$$\mathrm{Var}(S_n) = \sum_{k=1}^n \sigma_k^2$$
and hence, by Chebyshev's inequality, for any ε > 0,
$$P(|S_n/n - (\mu_1+\cdots+\mu_n)/n| > \epsilon) \le \mathrm{Var}(S_n/n)/\epsilon^2 = (\sigma_1^2+\cdots+\sigma_n^2)/(n^2\epsilon^2) \to 0$$
i.e., we get
$$P(|S_n/n - \bar\mu_n| > \epsilon) \to 0,\quad \bar\mu_n = (\mu_1+\cdots+\mu_n)/n$$
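As a numerical illustration of the Chebyshev estimate above, the following sketch (the Gaussian summands, the sample size n, ε and the trial count are arbitrary choices, not from the text) checks that the empirical exceedance frequency stays below the bound $(\sigma_1^2+\cdots+\sigma_n^2)/(n^2\epsilon^2)$:

```python
import numpy as np

# Numerical illustration of the Chebyshev bound behind the weak law:
# P(|S_n/n - (mu_1+...+mu_n)/n| > eps) <= (sigma_1^2+...+sigma_n^2)/(n^2 eps^2).
rng = np.random.default_rng(0)
n, eps, trials = 2000, 0.1, 500
mu = rng.uniform(-1.0, 1.0, size=n)        # means mu_k
sigma = rng.uniform(0.5, 1.5, size=n)      # standard deviations sigma_k

X = mu + sigma * rng.standard_normal((trials, n))   # each row: one realization of X(1..n)
dev = np.abs(X.mean(axis=1) - mu.mean())            # |S_n/n - average of the means|
empirical = np.mean(dev > eps)                      # empirical exceedance frequency
chebyshev = np.sum(sigma**2) / (n**2 * eps**2)      # Chebyshev upper bound
```

As n grows with the variances bounded, the bound itself tends to zero, which is the weak law.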
[9] Sums of independent random variables and the strong law of large num-
bers.
[10] The weak and strong laws of large numbers and the Levy-Khintchine
theorem representing the characteristic function of an infinitely divisible
probability distribution in its most general form, and its subsequent generalization
to infinitely divisible distributions for Hilbert space valued random variables by
S.R.S.Varadhan.
and then prove that if $X_n$ is an iid sequence of N(0,1) r.v.'s, then the sequence of
continuous processes $B_n(t) = \sum_{k=1}^n X_k\phi_k(t)$, n = 1, 2, ..., converges uniformly
almost surely over [0,1], and hence the limit is a Gaussian process with almost
surely continuous sample paths. This limiting process has all the properties of
Brownian motion. Some of the fundamental properties of BM proved using the
Borel-Cantelli lemmas are (a)
$$P\Big(\lim_{h\to 0}\sup_{t\in[0,1-h]} |B(t+h)-B(t)|/\sqrt{C\,h\log(1/h)}\ \le 1\Big) = 1$$
sets to any given degree of accuracy with respect to any probability distribution
on Rn . The proof also uses a fundamental result in topology on compact sets,
namely, given a nested sequence of non-empty compact sets, the intersection
of all these sets is then non-empty. The consistency theorem states that given
a consistent sequence of probability distributions Fn on Rn , for n = 1, 2, ...,
there exists a probability space (Ω, F, P ) and an infinite sequence of real valued
random variables Xn , n = 1, 2, ... on this probability space such that
Then,
$$x_{n+1}(t) - x_n(t) = \int_0^t (\mu(s,x_n(s)) - \mu(s,x_{n-1}(s)))\,ds + \int_0^t (\sigma(s,x_n(s)) - \sigma(s,x_{n-1}(s)))\,dB(s)$$
Thus, writing
$$\Delta_{n+1}(t) = \max_{0\le s\le t} |x_{n+1}(s) - x_n(s)|^2$$
we get
$$E(\Delta_{n+1}(t)) \le 2K^2 t\int_0^t E(\Delta_n(s))\,ds + 8K^2\int_0^t E(\Delta_n(s))\,ds = K^2(2t+8)\int_0^t E(\Delta_n(s))\,ds \le (2T+8)K^2\int_0^t E(\Delta_n(s))\,ds,\quad 0\le t\le T$$
from which we get by iteration that
$$\sum_{j=n}^{n+r-1} E\big[\max_{0\le t\le T}|x_{j+1}(t) - x_j(t)|\big] \le \sum_{j=n}^{n+r-1} (E(\Delta_j(T)))^{1/2} \le \sum_{j=n}^{n+r-1}\sqrt{C_T^j/j!}$$
[15a] Paul Levy's construction of Brownian motion using the Haar basis.
Let
$$D(n) = \{k/2^n : 0\le k\le 2^n\},\quad I(n) = D(n)\setminus D(n-1) = \{k/2^n : 0\le k\le 2^n,\ k\ \text{odd}\}$$
Let $\{\xi(n,k) : k\in I(n),\ n\ge 0\}$ be iid N(0,1) random variables. Define the Haar
wavelet functions $H_{n,k}(t)$, $t\in[0,1]$, by $H_{n,k}(t) = 2^{(n-1)/2}$ for $(k-1)/2^n < t < k/2^n$
and $H_{n,k}(t) = -2^{(n-1)/2}$ for $k/2^n < t < (k+1)/2^n$. Clearly $\{H_{n,k} : n\ge 0,\ k\in I(n)\}$
is an onb for $L^2[0,1]$. Define the Schauder functions
$$S_{n,k}(t) = \int_0^t H_{n,k}(s)\,ds$$
Then,
$$|S_{n,k}(t)| \le 2^{-(n+1)/2}$$
We evaluate
$$\sum_{n,k} H_{n,k}(t)H_{n,k}(s) = \delta(t-s)$$
$$\le (2C/n)\exp(-n^2/2)$$
and since
$$\sum_{n\ge 1} (2^n/n)\exp(-n^2/2) < \infty$$
the Borel-Cantelli lemma applies. Hence, for a.e. ω, there exists an integer N(ω) such that n > N(ω) implies
b(n) ≤ n. Then
$$\sum_{n>N(\omega)}\sum_{k\in I(n)} |\xi(n,k)|\,S_{n,k}(t) \le \sum_{n\ge 1} n\,2^{-(n+1)/2} < \infty$$
Note that
$$\sum_{k\in I(n)} S_{n,k}(t)$$
has a minimum value of zero and a maximum value of $2^{-(n+1)/2}$ over the interval
[0,1]. Its graph consists of nonoverlapping triangles of height $2^{-(n+1)/2}$ and base
widths $2^{-(n-1)}$.
The above argument implies that for a.e. ω and for all n > N(ω),
$$\sup_{t\in[0,1]} |B_{n+r}(t,\omega) - B_n(t,\omega)| \le \sum_{m>n} m\,2^{-(m+1)/2}$$
which converges to zero as n → ∞, uniformly in r. This means that for a.e. ω, the processes
$B_n(\cdot,\omega)$, n ≥ 1, converge uniformly over the interval [0,1], and since each of
these processes is continuous, the limiting process $B(\cdot,\omega)$ is also continuous,
with autocorrelation min(t,s). Hence we have explicitly constructed a mean
zero Gaussian process over the time interval [0,1] that is a.e. continuous and
has autocorrelation min(t,s). In other words, the limiting process is Brownian
motion over the time interval [0, 1].
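The construction just outlined can be probed numerically; the following sketch (the truncation level, time grid and path count are arbitrary choices) builds approximate paths from the Schauder expansion and checks that B(0) = 0 and that the variance of B(t) is close to t:

```python
import numpy as np

# Sketch of the Levy-Haar construction above: an approximate Brownian path is
# B_n(t) = xi_0*t + sum_{m<=n} sum_{k odd} xi(m,k) S_{m,k}(t), truncated at n_levels.
def schauder_basis(n_levels, t):
    rows = [t]                               # level 0 contributes xi_0 * t
    for m in range(1, n_levels + 1):
        w, h = 2.0 ** (-m), 2.0 ** (-(m + 1) / 2)
        for k in range(1, 2 ** m, 2):        # odd k, i.e. k/2^m in I(m)
            # triangle of height 2^{-(m+1)/2} centred at k/2^m, base width 2^{-(m-1)}
            rows.append(h * np.clip(1.0 - np.abs(t - k * w) / w, 0.0, None))
    return np.array(rows)                    # one row per basis function

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 129)
S = schauder_basis(10, t)
coeff = rng.standard_normal((4000, S.shape[0]))  # iid N(0,1) coefficients
paths = coeff @ S                                # each row: one approximate BM path
# sanity: B(0) = 0 exactly, and Var B(t) ~ t (checked at t = 1/2 and t = 1)
```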
by
$$P(D) = P_{t_1\ldots t_n}(B)$$
Then by the Caratheodory theorem, it is sufficient to prove that P is countably
additive on C. To prove this, it is in turn sufficient to show that if $D_n \in C$ and
$D_n \downarrow \emptyset$, then $P(D_n)\downarrow 0$. Suppose that $P(D_n)\downarrow \delta > 0$; then we must arrive
at a contradiction. Since $D_{n+1}\subset D_n$, it follows that if we write
$$D_n = \{\omega : (\omega(t_1),\ldots,\omega(t_n))\in B_n\}$$
where the $B_n$ are Borel sets, then
$$B_m \subset B_n\times\mathbb{R}^{m-n},\quad m\ge n$$
Thus, we can pad sets $D_{n,1},\ldots,D_{n,m-n-1}$ in between the sets $D_n$ and $D_m$ so
that, after relabeling,
$$B_n \in \mathcal{B}(\mathbb{R}^n),\quad B_{n+1}\subset B_n\times\mathbb{R}$$
Now by the regularity of probability measures on $\mathbb{R}^n$, we can choose for each n
a non-empty compact set $K_n\subset B_n$ such that
$$P_{t_1\ldots t_n}(B_n - K_n) \le \delta/2^n$$
Now define
$$E_n = \bigcap_{m=1}^n \{\omega : (\omega(t_1),\ldots,\omega(t_m))\in K_m\} = \bigcap_{m=1}^n F_m$$
where
$$F_m = \{\omega : (\omega(t_1),\ldots,\omega(t_m))\in K_m\}$$
Then $E_n\downarrow$ and
$$P(D_n - E_n) = P\Big(\bigcup_{m=1}^n (D_n - F_m)\Big) \le P\Big(\bigcup_{m=1}^n (D_m - F_m)\Big) \le \sum_{m=1}^n P(D_m - F_m) = \sum_{m=1}^n P_{t_1\ldots t_m}(B_m - K_m) \le \sum_{m=1}^n \delta/2^m < \delta$$
contains the point $\{\omega : (\omega(t_1),\ldots,\omega(t_m)) = (x(1),\ldots,x(m))\}$ for each m. Thus,
$\bigcap_{m\ge 1} F_m$ contains the point $(x(1), x(2),\ldots)$, and in particular this set is non-empty.
But $F_m\subset D_m$ by our construction, and hence $\bigcap_{m\ge 1} D_m$ is non-empty,
which is a contradiction since $D_n\downarrow\emptyset$. This completes the proof of Kolmogorov's existence
theorem.
i.e., a continuous Gaussian stochastic process with zero mean and autocorrelation
min(t,s). Let X(t) be a stochastic process such that
$$E|X(t) - X(s)|^a \le C|t-s|^{1+b}$$
for all t, s ∈ [0,1], with C, a, b positive constants. Then X(.) has a continuous
modification, i.e., there exists a continuous stochastic process Y(t) defined on the
same probability space such that for any distinct $t_1,\ldots,t_n\in[0,1]$, n = 1, 2, ...,
we have
$$P(X(t_j) = Y(t_j),\ j = 1, 2,\ldots, n) = 1$$
In particular, all the finite dimensional distributions of X(.) coincide with the
corresponding distributions of Y. The idea behind the use of this theorem
to construct Brownian motion is to first construct, using infinite sequences of
iid Gaussian random variables, a zero mean Gaussian process X(t) having the
same autocorrelation function as that of Brownian motion, then prove that X(.)
satisfies the conditions of this theorem, and hence use the theorem to deduce the
existence of a continuous modification of the X(.) process; i.e., we get a process
Y with continuous trajectories having the same finite dimensional distributions
as Brownian motion, and hence conclude that Y is a Brownian motion
process.
Proof of the theorem: Let
$$D(n) = \{k/2^n : 0\le k\le 2^n\}$$
Thus, by the union bound and Chebyshev's inequality, for any γ > 0,
$$\sum_{k=0}^{2^n-1} P(|X((k+1)/2^n) - X(k/2^n)| > 2^{-n\gamma}) \le 2^n\cdot C\,2^{n\gamma a}\,2^{-n(1+b)} = C\,2^{-n(b-\gamma a)}$$
it follows that
and matrices equal to vectors and matrices multiplied by the time domain spec-
tral projection and then derive the quantum Ito formula using the commutation
relations between the creation and annihilation operator fields. Explain using
basic physical principles why this derivation shows that the Ito formula can be
alternatively viewed as a manifestation of the Heisenberg uncertainty principle
between position and momentum.
[18] Ito's formula for Brownian motion and Poisson processes and their quantum
generalizations. Take a Brownian motion process B(t) and verify the Levy
oscillation property
$$\lim_{N\to\infty}\sum_{n=0}^{N-1} (B((n+1)t/N) - B(nt/N))^2 = t$$
in the mean square sense by computing the mean and variance of the lhs. Explain
intuitively how this relationship can be cast in the form
$$(dB(t))^2 = dt$$
Likewise, for the Poisson process N(t), show that
$$\lim_{M\to\infty} E\Big[\sum_{k=0}^{M-1}(N((k+1)h) - N(kh))^2 - \sum_{k=0}^{M-1}(N((k+1)h) - N(kh))\Big]^2 = 0$$
To prove this, make use of the independent increment property of the Poisson
process and calculate $E[N(h)^2]$ and $E[N(h)^4]$ using
By taking limits, prove in the mean square sense that if f(t) is a continuous
function, then
$$E\Big(\int_0^T f(t)\,dB(t)\Big)^2 = \int_0^T f^2(t)\,dt$$
where the stochastic integral $\int_0^T f(t)\,dB(t)$ is interpreted as the limit of
$$\sum_{n=0}^{N-1} f(nT/N)(B((n+1)T/N) - B(nT/N))$$
Show that
$$df(B(t)) = f'(B(t))\,dB(t) + (1/2)f''(B(t))\,dt$$
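The Levy oscillation property above is easy to probe numerically; in this sketch (the values of t, N and the trial count are arbitrary choices) the dyadic sums of squared increments concentrate at t, consistent with $(dB(t))^2 = dt$:

```python
import numpy as np

# Mean-square check of the Levy oscillation property: the sums
# sum_{n<N} (B((n+1)t/N) - B(nt/N))^2 have mean t and variance 2 t^2 / N,
# so they concentrate at t as N grows.
rng = np.random.default_rng(2)
t, N, trials = 2.0, 4000, 300
dB = np.sqrt(t / N) * rng.standard_normal((trials, N))  # independent BM increments
qv = np.sum(dB**2, axis=1)                              # quadratic-variation sums
# E[qv] = t, Var(qv) = 2 t^2 / N -> 0
```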
Suppose $E_n$ is a sequence of events with $\sum_n P(E_n) < \infty$. Then
$$P(E_n\ \text{i.o.}) = 0$$
Proof:
$$\{E_n\ \text{i.o.}\} = \bigcap_n\bigcup_{k\ge n} E_k$$
Thus,
$$P(E_n\ \text{i.o.}) = \lim_{n\to\infty} P\Big(\bigcup_{k\ge n} E_k\Big) \le \sum_{k\ge n} P(E_k) \to 0$$
If instead the $E_n$ are independent and $\sum_n P(E_n) = \infty$, then
$$P(E_n\ \text{i.o.}) = 1$$
In fact,
$$1 - P(E_n\ \text{i.o.}) = P\Big(\bigcup_n\bigcap_{k\ge n} E_k^c\Big)$$
$$f(x) = C\exp\Big(\sum_{k=1}^K c(k)g_k(x)\Big)$$
[2] Suppose ρ is a mixed quantum state such that its von Neumann entropy
H(ρ) = −Tr(ρ log ρ) is a maximum subject to the constraints
$$\mathrm{Tr}(\rho H_k) = \mu_k,\quad k = 1, 2,\ldots, K$$
$$\delta\rho = \rho\,\big((1 - \exp(-\mathrm{ad}(Z)))/\mathrm{ad}(Z)\big)(\delta Z)$$
So, writing ρ = exp(Z),
$$\delta(\rho\log\rho) = \delta(Z\exp(Z)) = \delta Z\,\exp(Z) + Z\exp(Z)\big((1 - \exp(-\mathrm{ad}(Z)))/\mathrm{ad}(Z)\big)(\delta Z)$$
We may assume that Z is a Hermitian matrix. Then,
$$\delta(\mathrm{Tr}(\rho\log\rho)) =$$
[3] Let μ and ν be two probability measures on the same measurable space.
Define
$$H = \sup_f\Big(\int f\,d\nu - \log\Big(\int\exp(f)\,d\mu\Big)\Big)$$
The supremum is attained at
$$f = \log(d\nu/d\mu) + C$$
for any constant C, and equals the relative entropy $D(\nu\|\mu) = \int\log(d\nu/d\mu)\,d\nu$.
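On a finite alphabet this variational formula can be verified directly; the sketch below (the alphabet size 6 and the constant C = 0.7 are arbitrary choices) checks that the functional equals $D(\nu\|\mu)$ at $f = \log(d\nu/d\mu) + C$ and never exceeds it for other f:

```python
import numpy as np

# Finite-alphabet check of the variational formula
#   sup_f ( int f dnu - log int exp(f) dmu ) = D(nu||mu),
# with the supremum attained at f = log(dnu/dmu) + C.
rng = np.random.default_rng(3)
mu = rng.random(6); mu /= mu.sum()           # reference probability vector
nu = rng.random(6); nu /= nu.sum()           # target probability vector

def functional(f):
    # f is a real function on the 6-point space, represented as a vector
    return np.dot(nu, f) - np.log(np.dot(mu, np.exp(f)))

kl = np.dot(nu, np.log(nu / mu))             # D(nu||mu)
f_opt = np.log(nu / mu) + 0.7                # optimizer, up to an additive constant
gap = kl - max(functional(rng.standard_normal(6)) for _ in range(200))
# functional(f_opt) == kl, and random f never exceed it (gap >= 0)
```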
[4] Let ρ, σ be two quantum states in a given Hilbert space and let X vary
over all observables (Hermitian matrices) in the same Hilbert space. Compute
$\sup_X(\mathrm{Tr}(\rho X) - \log(\mathrm{Tr}(\sigma\exp(X))))$. Writing $X = \sum_k p(k)|e_k\rangle\langle e_k|$ in its spectral
decomposition, we have
$$\mathrm{Tr}(\rho X) - \log(\mathrm{Tr}(\sigma\exp(X))) = \sum_k p(k)\langle e_k|\rho|e_k\rangle - \log\Big(\sum_k \exp(p(k))\langle e_k|\sigma|e_k\rangle\Big)$$
hint: Let τ be the first time at which the process hits either zero or y. Then τ
is a finite stop-time, and hence by Doob's optional stopping theorem, E[B(τ)] = x,
and hence
$$P(B(\tau) = y) = x/y$$
Now let s(t) = min(B(u) : u ≤ t). Then
$$P_x(B(t)\in dy,\ T > t) = (2\pi t)^{-1/2}\big(\exp(-(y-x)^2/2t) - \exp(-(y+x)^2/2t)\big)\,dy,\quad y > 0$$
Let X(t) be a d-dimensional diffusion process with drift μ(x) and diffusion
coefficient σ(x). Suppose f(x) satisfies
$$\mu(x)^T\nabla f(x) + \frac{1}{2}\mathrm{Tr}\big(\sigma(x)\sigma(x)^T\nabla^2 f(x)\big) = 0$$
Then f(X(t)) is a Martingale, and hence if τ denotes the first time at which
f(X(t)) hits either a or b starting from x with f(x) ∈ (a, b), then by Doob's optional
stopping theorem,
$$P(f(X(\tau)) = a) + P(f(X(\tau)) = b) = 1$$
Thus, the probability that X(t) hits $f^{-1}(\{a\})$ before it hits $f^{-1}(\{b\})$ starting at
x is given by
$$P(f(X(\tau)) = a) = (f(x) - f(b))/(f(a) - f(b))$$
and the probability that X(t) hits $f^{-1}(\{b\})$ before it hits $f^{-1}(\{a\})$ is given by
$$P(f(X(\tau)) = b) = (f(a) - f(x))/(f(a) - f(b))$$
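The hitting probability obtained from optional stopping can be checked by simulation; the sketch below uses a symmetric simple random walk as the discrete analogue of the martingale argument (the values of x, y and the trial count are arbitrary choices):

```python
import numpy as np

# Monte Carlo check: a symmetric simple random walk started at x in (0, y)
# reaches y before 0 with probability x/y (the optional-stopping answer).
rng = np.random.default_rng(4)
x, y, trials = 3, 10, 20000
hits = 0
for _ in range(trials):
    pos = x
    while 0 < pos < y:                       # run until absorption at 0 or y
        pos += 1 if rng.random() < 0.5 else -1
    hits += (pos == y)
est = hits / trials                          # should be close to x/y = 0.3
```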
we get, taking
$$Z = 2^{n/2}(B((k+1)/2^n) - B(k/2^n)),\quad a = \sqrt{c\log 2}\,\sqrt{n}$$
that
$$P(A_{n,k}) \le C\,n^{-1/2}\exp(-cn\log(2)/2) = C\,n^{-1/2}\,2^{-nc/2}$$
Thus,
$$P\Big(\bigcup_{k=1}^{2^n} A_{n,k}\Big) \le \sum_{k=1}^{2^n} P(A_{n,k}) \le C\,n^{-1/2}\,2^{n(1-c/2)}$$
So for c > 2,
$$\lim_{n\to\infty} P\Big(\bigcup_{k=1}^{2^n} A_{n,k}\Big) = 0$$
and we deduce from the above equation and the continuity of the Brownian
paths that
$$P\Big(\limsup_{h\to 0}\sup_{0<t<1-h} |B(t+h) - B(t)|/\sqrt{c\,h\log(1/h)} > 1\Big) = 0$$
for all c > 2. Note that this is equivalent to the statement that
$$\lim_{\delta\to 0} P\Big(\sup_{0<h<\delta}\sup_{0<t<1-h} |B(t+h) - B(t)|/\sqrt{c\,h\log(1/h)} > 1\Big) = 0\quad\forall c > 2$$
for all sufficiently large n. This is not summable; hence, by the second Borel-Cantelli
lemma, with probability one the events $\{B(q^n) - B(q^{n-1}) > \psi(q^n - q^{n-1})\}$, n = 1, 2, ..., occur infinitely often. On the other hand, for any
a > 1,
[26] (For Rohit Singh). Estimating the image intensity field when the noise is
Poisson plus Gaussian, with the mean of the Poisson field being the true intensity
field.
The image field has the model
$$u(x,y) = N(x,y) + W(x,y)$$
where N(x,y) is Poisson with unknown mean $u_0(x,y)$ and W(x,y) is N(0,σ²).
Further, we assume that N(x,y), W(x,y), x, y = 1, 2, ..., M are all independent
r.v.'s. $\{u_0(x,y)\}$ is the denoised image field and is to be estimated from
measurements of $\{u(x,y)\}$.
Remark: We write
$$u(x,y) = u_0(x,y) + \big(N(x,y) - u_0(x,y) + W(x,y)\big)$$
and interpret $u_0(x,y)$ as the signal/denoised image field and $N(x,y) - u_0(x,y) + W(x,y)$ as the noise. The pdf of the noisy image field is
$$p(u|u_0) = \prod_{x,y=1}^M \sum_{n\ge 0}\phi((u(x,y) - n)/\sigma)\exp(-u_0(x,y))\,u_0(x,y)^n/n!$$
and hence the log-likelihood function to be maximized, after taking into account
a regularization term that minimizes the error energy in the image field gradient,
i.e., reduces prominent edges, is given by
$$L(u|u_0) = \log(p(u|u_0)) - E(u_0) = \sum_{x,y=1}^M \log\Big(\sum_{n\ge 0}\phi((u(x,y)-n)/\sigma)\exp(-u_0(x,y))u_0(x,y)^n/n!\Big) - c\sum_{x,y=1}^M |\nabla u_0(x,y)|$$
where
$$|\nabla u_0(x,y)|^2 = (u_0(x+1,y) - u_0(x,y))^2 + (u_0(x,y+1) - u_0(x,y))^2$$
Setting the gradient of this w.r.t. $u_0(x,y)$ to zero gives us the optimality equation
$$\Big[\sum_n \phi((u(x,y)-n)/\sigma)\exp(-u_0(x,y))u_0(x,y)^n/n!\Big]^{-1}\times\Big[\sum_n \phi((u(x,y)-n)/\sigma)\exp(-u_0(x,y))\big(u_0(x,y)^{n-1}/(n-1)! - u_0(x,y)^n/n!\big)\Big] - c\,\mathrm{div}\big(\nabla u_0(x,y)/|\nabla u_0(x,y)|\big) = 0$$
which can be solved by the gradient ascent iteration
$$u_0(t+1,x,y) = u_0(t,x,y) + \mu\Big[\Big[\sum_n \phi((u(x,y)-n)/\sigma)\exp(-u_0(t,x,y))u_0(t,x,y)^n/n!\Big]^{-1}\times\Big[\sum_n \phi((u(x,y)-n)/\sigma)\exp(-u_0(t,x,y))\big(u_0(t,x,y)^{n-1}/(n-1)! - u_0(t,x,y)^n/n!\big)\Big] - c\,\mathrm{div}\big(\nabla u_0(t,x,y)/|\nabla u_0(t,x,y)|\big)\Big]$$
Note that
$$\phi(x) = (\sqrt{2\pi})^{-1}\exp(-x^2/2)$$
Simulating this algorithm: First, we explain how to simulate a Poisson random
variable with given mean λ. We use the fact that a binomial random variable
with parameters (n, p = λ/n) converges in distribution, as n → ∞, to a Poisson random
variable with mean λ. Further, a binomial random variable with parameters
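A minimal sketch of this binomial-to-Poisson simulation (the choices n = 500, λ = 4 and the sample count are arbitrary):

```python
import numpy as np

# A Binomial(n, lambda/n) count of n Bernoulli trials is approximately
# Poisson(lambda) for large n; summing indicator variables simulates it.
def approx_poisson(lam, n, size, rng):
    # each row of uniforms yields one Binomial(n, lam/n) sample
    return (rng.random((size, n)) < lam / n).sum(axis=1)

rng = np.random.default_rng(5)
lam = 4.0
samples = approx_poisson(lam, n=500, size=20000, rng=rng)
# Poisson(lam) has mean lam and variance lam; the binomial approximation
# has mean lam and variance lam*(1 - lam/n), close to lam for large n
```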
The joint pdf of all the pixel intensities in the image is then
$$P(u|u_0) = \prod_{x,y=1}^M p(u(x,y)|u_0)$$
and in this case, we do not introduce any regularization. Thus, the problem
amounts to constructing the mle of $u_0$ given the matrix of measurements $U = ((u(x,y)))$. We therefore consider the problem of estimating the parameter θ on
which the pdf of $X_1$ depends, given an iid sequence $X_1,\ldots,X_n$, and ask how this
estimator behaves as n → ∞. Let $L(x|\theta) = \log p(x|\theta)$ denote the log-likelihood.
We write
$$\theta = \theta_0 + \delta\theta$$
where $\theta_0$ is the true value of θ, and then note that
$$L(X_k|\theta_0 + \delta\theta) \approx L(X_k|\theta_0) + L'(X_k|\theta_0)\delta\theta + (1/2)L''(X_k|\theta_0)(\delta\theta)^2$$
so that
$$\hat\theta[n] = \theta_0 + \delta\hat\theta[n]$$
where
$$\delta\hat\theta[n] = \mathrm{argmax}_{\delta\theta}\sum_{k=1}^n\big[L'(X_k|\theta_0)\delta\theta + (1/2)L''(X_k|\theta_0)(\delta\theta)^2\big]$$
Thus,
$$\delta\hat\theta[n] = -\Big[\sum_{k=1}^n L''(X_k|\theta_0)\Big]^{-1}\Big[\sum_{k=1}^n L'(X_k|\theta_0)\Big]$$
since
$$E[L'(X_1|\theta_0)] = \int \frac{\partial_\theta p(X_1|\theta)|_{\theta_0}}{p(X_1|\theta_0)}\,p(X_1|\theta_0)\,dX_1 = \int \partial_\theta p(X_1|\theta)|_{\theta_0}\,dX_1 = 0$$
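The one-step correction $\delta\hat\theta[n]$ above can be compared with an exact maximum likelihood estimate; the sketch below uses the exponential model $p(x|\theta) = \theta\exp(-\theta x)$ (an arbitrary choice for illustration), for which $L'(x|\theta) = 1/\theta - x$ and $L''(x|\theta) = -1/\theta^2$:

```python
import numpy as np

# One-step estimator from the quadratic log-likelihood expansion:
# delta = -[sum_k L''(X_k|theta0)]^{-1} [sum_k L'(X_k|theta0)],
# for the exponential model L(x|theta) = log(theta) - theta*x.
rng = np.random.default_rng(6)
theta0, n = 2.0, 200000
X = rng.exponential(1.0 / theta0, size=n)   # iid samples with true parameter theta0

score = np.sum(1.0 / theta0 - X)            # sum of L'(X_k|theta0)
hess = -n / theta0**2                       # sum of L''(X_k|theta0)
delta = -score / hess                       # the correction delta-theta-hat[n]
theta_hat = theta0 + delta
exact_mle = 1.0 / X.mean()                  # closed-form MLE, for comparison
```

For large n the two estimates agree to high order, and the empirical mean of L' is near zero, as the unbiasedness of the score requires.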
The LDP tells us at what rate $\delta\hat\theta[n]$ converges to zero. From the contraction
principle of the LDP, if I(x,y) is the rate function of $n^{-1}\sum_{k=1}^n (L'(X_k|\theta_0), L''(X_k|\theta_0))$,
$$H = \mathrm{diag}(0_r, I_{N-r}) = \begin{pmatrix} 0_r & 0_{r\times(N-r)} \\ 0_{(N-r)\times r} & I_{N-r}\end{pmatrix}$$
Define
$$K = (-1)^H = \mathrm{diag}(I_r, -I_{N-r}) = \begin{pmatrix} I_r & 0_{r\times(N-r)} \\ 0_{(N-r)\times r} & -I_{N-r}\end{pmatrix}$$
Then consider the Boson Fock space
$$\Gamma_s(L^2(\mathbb{R}_+)\otimes\mathbb{C}^N)$$
Note that
$$K^2 = I_N,\quad K^* = K$$
We shall now prove that for s < t,
$$G(t)\,d\Lambda_{ab}(s) = (-1)^{\sigma_{ab}}\,d\Lambda_{ab}(s)\,G(t)$$
Indeed, we have for s < t,
This proves the claim. Now define the process $\xi_{ab}(t)$ by
The grading of the $\xi_{ab}(t)$ process is $\sigma_{ab} = \sigma(E_{ab})$. From (1), it follows on integration
that
$$[d\xi_{ab}(t), \xi_{cd}(s)]_S = 0,\quad s < t$$
Note that we have used the easily proved identity
$$G(t)G(s) = G(s)G(t)\quad\forall s, t$$
Now consider
Thus we get
$$[d\xi_{ab}(t), d\xi_{cd}(t)]_S = \epsilon_{ad}\,d\xi_{cb} - (-1)^{\sigma_{ab}\sigma_{cd}}\,\epsilon_{cb}\,d\xi_{ad}$$
Now let A, B be $(N+1)\times(N+1)$ matrices. We define
$$d\xi_A(t) = A_{ab}\,d\xi_{ab}(t)$$
where summation over the repeated indices a, b is implied. Then we have, from
the above,
$$[d\xi_A(t), d\xi_B(t)]_S = A_{ab}B_{cd}\,\epsilon_{ad}\,d\xi_{cb}(t) - (-1)^{\sigma_{ab}\sigma_{cd}}A_{ab}B_{cd}\,\epsilon_{cb}\,d\xi_{ad}(t)$$
$$p\times W = \mathrm{diag}[p(x)W_x,\ x\in A]$$
$$p\otimes W_p = \mathrm{diag}[p(x)W_p,\ x\in A],\quad\text{where } W_p = \sum_x p(x)W_x$$
$$D(p\times W\,\|\,p\otimes W_p) = \mathrm{Tr}\big[(p\times W)\big(\ln(p\times W) - \ln(p\otimes W_p)\big)\big]$$
$$= \sum_x p(x)\mathrm{Tr}(W_x\ln(p(x)W_x)) - \sum_x p(x)\mathrm{Tr}\big(W_x(\ln(p(x)) + \ln(W_p))\big)$$
$$= \sum_x p(x)\mathrm{Tr}\big(W_x(\ln(p(x)) + \ln(W_x))\big) - \sum_x p(x)\big(\ln(p(x)) + \mathrm{Tr}(W_x\ln(W_p))\big)$$
$$= \sum_x p(x)\big(\ln(p(x)) + \mathrm{Tr}(W_x\ln(W_x))\big) - \sum_x p(x)\big(\ln(p(x)) + \mathrm{Tr}(W_x\ln(W_p))\big)$$
$$= -\sum_x p(x)H(W_x) + H(W_p) = I(p, W)$$
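The identity $D(p\times W\|p\otimes W_p) = H(W_p) - \sum_x p(x)H(W_x)$ can be verified numerically; the sketch below (randomly generated 2×2 states and an arbitrary input distribution) computes both sides via eigendecompositions:

```python
import numpy as np

# Check that sum_x p(x) D(W_x || W_p) = H(W_p) - sum_x p(x) H(W_x) = I(p, W)
# for a classical-quantum ensemble of 2x2 density matrices.
def vn_entropy(rho):
    # von Neumann entropy -Tr(rho ln rho), via eigenvalues
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def logm_h(rho):
    # matrix logarithm of a positive definite Hermitian matrix
    w, V = np.linalg.eigh(rho)
    return V @ np.diag(np.log(w)) @ V.conj().T

def rand_state(rng, d=2):
    A = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = A @ A.conj().T                     # positive definite
    return rho / np.trace(rho).real          # unit trace

rng = np.random.default_rng(7)
p = np.array([0.2, 0.5, 0.3])
W = [rand_state(rng) for _ in p]
W_bar = sum(px * Wx for px, Wx in zip(p, W))         # W_p = sum_x p(x) W_x

# block-diagonal relative entropy reduces to sum_x p(x) D(W_x || W_p)
D = sum(px * np.trace(Wx @ (logm_h(Wx) - logm_h(W_bar))).real
        for px, Wx in zip(p, W))
holevo = vn_entropy(W_bar) - sum(px * vn_entropy(Wx) for px, Wx in zip(p, W))
```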
u(x, y) → u0 (x, y)
$$+\,(1/2)\exp(-u(x,y)^2/2\sigma^2) - (B_2/2)\exp(-u(x,y)^2/2\sigma^2)\big[\ln(u_0(x,y)/\zeta) - (\zeta u(x,y)/\sigma^2)\big]$$
For large values of z, the Γ(z + 1) factor appearing in the denominator becomes
very large and hence the integral above can be well approximated by
T
A [exp(−(u(x, y) − zζ)2 /2σ 2 )exp(z.ln(u0 (x, y)/ζ)]dz
0
where T is finite and A represents some sort of an average of 1/Γ(z + 1) over
[0, T ]. Further, for z >> T , the integrand above goes rapidly to zero as
exp(−z 2 ζ 2 /2σ 2 ). Hence the above integral can further be well approximated
by
A∫_0^∞ [exp(−(u(x, y) − zζ)²/2σ²)·exp(z·ln(u0(x, y)/ζ))]dz
Further, for small ζ, we have that u0 (x, y)/ζ > 1 and hence ln(u0 (x, y)/ζ) > 0.
Then for negative values of z, exp(−(u(x, y) − zζ)2 /2σ 2 ).exp(z.ln(u0 (x, y)/ζ))
becomes very small and hence the above integral can be extended to the range
(−∞, ∞) without causing too much error. This means that, finally, the function to be maximized is
E(u0 (x, y)|u(x, y)) ≈
A·exp(−u0(x, y)/ζ)·∫_{−∞}^∞ [exp(−(u(x, y) − zζ)²/2σ²)·exp(z·ln(u0(x, y)/ζ))]dz
+(1/2)exp(−u0 (x, y)/ζ).exp(−u(x, y)2 /2σ 2 )[1−B2 .(ln(u0 (x, y)/ζ)−(ζu(x, y)/σ 2 ))]
Using the standard Gaussian integral
∫_{−∞}^∞ exp(−z²/2σ²)·exp(az)dz = σ√(2π)·exp(σ²a²/2)
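This standard Gaussian integral is easy to verify numerically; the values of σ and a below are arbitrary illustrative choices:

```python
import numpy as np

sigma, a = 1.3, 0.7
z = np.linspace(-40.0, 40.0, 400001)     # wide grid; the integrand decays fast
dz = z[1] - z[0]
g = np.exp(-z**2/(2*sigma**2))*np.exp(a*z)

numeric = float(np.sum(g[1:] + g[:-1])*dz/2)            # trapezoidal rule
closed  = float(sigma*np.sqrt(2*np.pi)*np.exp(sigma**2*a**2/2))
```

The trapezoidal value matches the closed form to high relative accuracy.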
we easily evaluate the above integral to get
∫_{−∞}^∞ exp(−(u − zζ)²/2σ²)·exp(z·ln(u0/ζ))dz
= exp(−u²/2σ²)∫_{−∞}^∞ exp(−z²ζ²/2σ²)·exp(z·(uζ/σ² + ln(u0/ζ)))dz
= (σ√(2π)/ζ)·exp(−u²/2σ²)·exp((σ²/2ζ²)(uζ/σ² + ln(u0/ζ))²)
and hence
E(u0(x, y)|u(x, y)) ≈
exp(−u0/ζ)·exp(−u²/2σ²)·[A·(σ√(2π)/ζ)·exp((σ²/2ζ²)(uζ/σ² + ln(u0/ζ))²)
Σ_{k=0}^{n−1} [(x − k − 1/2)f(x)|_k^{k+1} − ∫_k^{k+1} f(x)dx]
= Σ_{k=0}^{n−1} (f(k + 1) + f(k))/2 − ∫_0^n f(x)dx
= Σ_{k=0}^n f(k) − (f(0) + f(n))/2 − ∫_0^n f(x)dx
or equivalently,
Σ_{k=0}^n f(k) = ∫_0^n f(x)dx + (f(0) + f(n))/2 + ∫_0^n f′(x)(x − [x] − 1/2)dx
Define
P1 (x) = x − [x] − 1/2
Then,
∫_0^n f′(x)P1(x)dx = f′(x)P2(x)|_0^n − ∫_0^n f″(x)P2(x)dx
= f′(n)P2(n) − ∫_0^n f″(x)P2(x)dx
where
P2(x) = ∫_0^x P1(x)dx = ∫_0^x (x − [x] − 1/2)dx
we get that
Σ_{k=0}^n f(k) =
∫_0^n f(x)dx + (f(0) + f(n))/2 + P2(n)f′(n) − P3(n)f″(n) + ... + (−1)^N f^{(N−1)}(n)P_N(n) + (−1)^{N+1}∫_0^n f^{(N)}(x)P_N(x)dx
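The first-order case of this summation formula, Σ_{k=0}^n f(k) = ∫_0^n f(x)dx + (f(0) + f(n))/2 + ∫_0^n f′(x)P1(x)dx, can be sanity-checked numerically; f(x) = x² and n = 10 below are arbitrary test choices:

```python
import numpy as np

f  = lambda x: x**2
fp = lambda x: 2.0*x        # f'
n  = 10

lhs = float(sum(f(k) for k in range(n + 1)))     # sum of squares 0..10 = 385

x  = np.linspace(0.0, n, 200001)
dx = x[1] - x[0]
P1 = x - np.floor(x) - 0.5                       # sawtooth P1(x) = {x} - 1/2
trap = lambda y: float(np.sum(y[1:] + y[:-1])*dx/2)   # trapezoidal integral

rhs = trap(f(x)) + (f(0) + f(n))/2 + trap(fp(x)*P1)
```

The residual comes only from the quadrature near the jumps of P1 and is small on this grid.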
= 0 + ∫_0^{{x}} (u − 1/2)du = (1/2)(({x} − 1/2)² − 1/4) = (1/2)({x}² − {x})
which is a polynomial in {x} = x − [x]. Note that all the P_n(x)'s are polynomials in {x} and so we can write
X = diag[f1 , f2 , ..., fn ]
ρ = diag[p1 , ..., pn ]
Then
Ep (f ) = T r(ρ.X)
It is clear that for non-diagonal ρ and non-diagonal X, the expected value of X defined by Tr(ρX) in quantum probability is a generalization of the classical scenario.
[2] If X, Y are two random variables in classical probability and both of them
assume discrete values, then X + Y will also assume only discrete values. This
is not so in quantum probability. We can have two observables (ie, self-adjoint
operators) A, B in the same Hilbert space assuming discrete values, ie, having
discrete spectrum, but A + B can have a continuous spectrum. For example, the hydrogen atom Hamiltonian H1 = p²/2m − e²/r has both discrete and continuous spectrum and the 3-D harmonic oscillator H2 = p²/2m + Kr²/2 has discrete spectrum, but their difference
H = H2 − H1 = Kr²/2 + e²/r
has a purely discrete spectrum.
X(Y − Z) ≤ 1 − Y Z, X(Z − Y ) ≤ 1 − Y Z
This is called Bell’s inequality. This does not hold in quantum probability. For
example, consider the three observables
X = (σ, a), Y = (σ, b), Z = (σ, c)
where a, b, c are unit vectors. Then, X, Y, Z all have only ±1 as their eigenvalues, but
XY = (a, b) + i(σ, a × b),
Y Z = (b, c) + i(σ, b × c),
ZX = (c, a) + i(σ, c × a)
Thus, if ρ is any state in which all the three Pauli matrices have zero mean,
then in this state
Tr(ρ(XY + Y X)/2) = (a, b),
Tr(ρ(Y Z + ZY)/2) = (b, c),
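The product rule for Pauli observables and the resulting correlation Tr(ρ(XY + YX)/2) = (a, b) in the maximally mixed state can be verified directly; the unit vectors a, b below are illustrative:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def sdot(a):
    # (sigma, a) = a1*sx + a2*sy + a3*sz
    return a[0]*sx + a[1]*sy + a[2]*sz

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 1.0, 0.0])/np.sqrt(2)
X, Y = sdot(a), sdot(b)

# XY = (a, b) I + i (sigma, a x b)
ok_product = np.allclose(X @ Y, np.dot(a, b)*np.eye(2) + 1j*sdot(np.cross(a, b)))

# in the maximally mixed state all Pauli means vanish, and the symmetrized
# correlation reduces to the Euclidean inner product (a, b)
rho = np.eye(2)/2
corr = float(np.real(np.trace(rho @ (X @ Y + Y @ X)/2)))
```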
Now, if U is another observable having spectral measure EU(.), then g(U) has the spectral measure EU(g^{−1}(.)). If X and U commute, then their spectral
measures also commute and we can talk of the joint probability distribution of
X, U in any state ρ as
or equivalently, as
Note that this is a well defined probability distribution since EX (.) commutes
with EU (.). To see that this is non-negative, we write it as
This is possible only because [EX (A), EU (B)] = 0 for any two Borel sets A, B.
If they do not commute, this does not define a probability distribution, in fact,
T r(ρ.EX (A).EU (B)) may even be a complex number with non-vanishing imag-
inary part.
then the event P ∩ Q is defined as the orthogonal projection onto R(P ) ∩ R(Q)
and likewise the event P ∪ Q is the orthogonal projection onto the closure
R(P) + R(Q). To see how De Morgan's law can fail in quantum probability,
let the Hilbert space be R², P the orthogonal projection onto the line y = m1x and Q the orthogonal projection onto y = m2x where m2 ≠ ±m1. Then R(P) + R(Q) = R² and hence P ∪ Q = I. Thus, if S is the orthogonal projection onto the line y = m3x, we get that S ∩ (P ∪ Q) = S. On the other hand, S ∩ P = 0, S ∩ Q = 0 if m3 ≠ ±m1, ±m2 and hence in this case, (S ∩ P) ∪ (S ∩ Q) = 0.
Then Gleason proved that there exists a unique positive semidefinite operator ρ
in H having unit trace such that
Note that we can choose ρ to be positive semidefinite, since the above definition
implies that
⟨f, ρf⟩ = Tr(ρ·|f⟩⟨f|) = ∫ |f(ω)|² P(dω) ≥ 0
Let X, Y be two random vectors defined on the same probability space such
that
E(Y|X) = X
For example, let
X = [x1 , ..., xn ]T
have a pdf pX(x) and let Y = [y1, ..., yn]^T be such that
yk = ζk·Pk(xk/ζk) + wk, 1 ≤ k ≤ n
where, given xk, Pk(xk/ζk) is a Poisson r.v. with mean xk/ζk and W = ((wk)) is independent of X with zero mean and pdf pW(w). Then
E(Y|X) = X
and
pY(y|x) = Σ_{m1,...,mn=0}^∞ pW(y1 − m1ζ1, ..., yn − mnζn)·Π_{k=1}^n [exp(−xk/ζk)(xk/ζk)^{mk}(mk!)^{−1}]
Note that the best linear estimator of X based on the noisy data Y is given by
X̂ = AY + b
where A, b are chosen to minimize the mean square error
E[||X − X̂||²]
The optimal A, b satisfy the orthogonality conditions
E[(X − AY − b)Y^T] = 0,
E[X − AY − b] = 0
or equivalently,
A·R_Y + b·μ_Y^T = R_XY,
μ_X − A·μ_Y − b = 0
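A small numerical sketch of these normal equations: with made-up second-order statistics, b is eliminated via b = μ_X − Aμ_Y, which reduces the first equation to A·C_Y = C_XY in terms of covariances; both normal equations are then checked:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
M = rng.normal(size=(n, n))
C_Y  = M @ M.T + n*np.eye(n)                 # covariance of Y (positive definite)
C_XY = rng.normal(size=(n, n))               # cross-covariance of X and Y
mu_X, mu_Y = rng.normal(size=n), rng.normal(size=n)

# second-moment matrices R_Y = E[Y Y^T], R_XY = E[X Y^T]
R_Y  = C_Y  + np.outer(mu_Y, mu_Y)
R_XY = C_XY + np.outer(mu_X, mu_Y)

# eliminating b turns the first normal equation into A C_Y = C_XY
A = np.linalg.solve(C_Y.T, C_XY.T).T
b = mu_X - A @ mu_Y

res1 = float(np.abs(A @ R_Y + np.outer(b, mu_Y) - R_XY).max())  # first equation
res2 = float(np.abs(mu_X - A @ mu_Y - b).max())                 # second equation
```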
Now,
R_XY = E(XY^T) = E[E(XY^T|X)]
= E[X·E(Y^T|X)] = E(XX^T) = R_XX,
R_Y = E[YY^T]
In the special case of the Poisson model above, when W = 0, so that conditioned
on X, Y is independent Poissonian with mean X = ((xk )), we have that
E(yi yj|X) = xi xj, i ≠ j,
It should be noted that the first problem solved in scattering theory was by
Ernest Rutherford when he bombarded a gold foil with α-particles and from
the scattering pattern w.r.t the angles of deflection, he was able to predict
that the gold foil consisted of atoms with a positively charged nucleus. For
this, he made use of classical scattering theory which is based on calculating
the angle of deflection of a particle from its initial line of approach towards a
scattering centre that produces a repulsive inverse square law of force in terms
of the initial velocity of the particle and the distance of its line of approach
from the scattering centre. From this formula, it is easy to derive a relationship
between the deflection angle and the scattering cross section, ie, the number
of particles scattered per unit solid angle divided by the incident flux. This is
called classical Coulomb scattering. One of the striking facts in quantum theory
is that the wave operators and hence the scattering cross section are not defined
in Coulomb scattering and one has to modify the definition of the wave operator
by multiplying it by a time varying unitary operator so that its asymptotic limit
exists. These facts can be found in W. O. Amrein's book.
[2] Quantum scattering theory: Scattering cross sections
Derivation of the wave operators, average time spent by the projectile in a
set and inside a cone in terms of the scattering matrix.
Let H = H0 + V ,
U0 (t) = exp(−itH0 ), U (t) = exp(−itH),
|f > is the in-state of the free particle and after encountering the scatterer, it
goes into the in-scattered state Ω− |f > which evolves with time as U (t). Thus,
||F(C)U(t)Ω−f||² equals the probability that at time t, the scattered particle's position will fall within the set C and hence P(f, C) represents the probability that after a very long time, the particle's position eventually falls within the set C. It is easy to see that if C is a cone with apex at the origin and opening out to r = +∞ subtending a solid angle of Ω0, then tC = C ∀ t > 0. Now, we have
U (t)Ω− = Ω− U0 (t)
and hence,
P(f, C) = lim_{t→∞} ||F(C)Ω− U0(t)f||²
We have
| ||F(C)U(t)Ω−f||² − ||F(C)U0(t)Sf||² |
= (||F(C)U(t)Ω−f|| + ||F(C)U0(t)Sf||)·| ||F(C)U(t)Ω−f|| − ||F(C)U0(t)Sf|| |
≤ 2||f||·||F(C)(U(t)Ω−f − U0(t)Sf)||
≤ 2||f||·||U(t)Ω−f − U0(t)Sf||
Now,
(U (t)Ω− − U0 (t)S)f
R(Ω− ) ⊂ R(Ω+ )
is satisfied, then
Ω+ Ω∗+ Ω− = Ω−
holds good and then we get
and hence
so that we get
F (C)U0 (t)Sf = χC (2t.P )Zt (Q)Sf
where
Zt (Q) = exp(iQ2 /4t)
is a multiplicative unitary operator. Taking C as the cone described above, we
have that χC (2t.P ) = χC (P ), ∀t > 0 and hence
= ||χC(P)Sf||²
since
Zt (Q) → 1, t → ∞
By using the fact that Fourier transformation F preserves the norm, we get
P(f, C) = ∫_C |FSf(k)|² d^n k
H0 = P², H = H0 + V(Q), X_t = ∫_0^t V(2Ps)ds, U(t) = exp(−itH), U0(t) = exp(−itH0)
d/dt (U(t)*U0(t)exp(−iX_t)f)
= iU(t)*(V(Q) − V(2Pt))U0(t)·exp(−iX_t)f
||d/dt (U(t)*U0(t)exp(−iX_t)f)|| =
||(V(Q) − V(2Pt))U0(t)exp(−iX_t)f|| =
||(U0(−t)V(Q)U0(t) − V(2Pt))exp(−iX_t)f||
U0 (−t)QU0 (t) = exp(itad(P 2 ))(Q) = Q + 2tP
U0 (−t)V (Q).U0 (t) = V (Q + 2P t)
So
||d/dt (U(t)*U0(t)exp(−iX_t)f)|| =
||(V(Q + 2Pt) − V(2Pt))exp(−iX_t)f||
Alternately,
V (Q + 2P t) = Zt (Q).V (2P t)Zt (Q)∗
where
Zt = Zt (Q) = exp(−iQ2 /4t)
since
exp(−iad(Q2 )/4t)(P ) = P + Q/2t
Thus,
||d/dt (U(t)*U0(t)exp(−iX_t)f)|| =
||∫_0^1 d/dx(exp(−ix·ad(Q²)/4t)(V(2Pt)))·exp(−iX_t)f·dx||
≤ ∫_0^1 ||d/dx(exp(−ix·ad(Q²)/4t)(V(2Pt)))·exp(−iX_t)f||·dx
= (4t)^{−1} ∫_0^1 ||[Q², V(2Pt)]·Z(t/x)*·exp(−iX_t)f||·dx
Now,
[Q², V(2Pt)] = Q·[Q, V(2Pt)] + [Q, V(2Pt)]·Q
= 2it(Q·V′(2Pt) + V′(2Pt)·Q) = 2it([Q, V′(2Pt)] + 2V′(2Pt)·Q)
= −(2t)²V″(2Pt) + 4itV′(2Pt)·Q = (2t)²V″(2Pt) + 4itQ·V′(2Pt)
Now, with g_t = exp(−iX_t)f, we have
V″(2Pt)Z(t/x)*·g_t = V″(xQ)U0(t/x)g_t
and
V′(2Pt)·Q·Z(t/x)*·g_t = V′(xQ)·U0(t/x)Qg_t
Now,
[Q², exp(−iX_t)] = Q·[Q, exp(−iX_t)] + [Q, exp(−iX_t)]·Q
= Q·X′_t(P)exp(−iX_t) + X′_t(P)·exp(−iX_t)Q
where
X′_t(P) = ∫_0^t 2s·V′(2sP)ds
For
V(Q) = K/|Q|
we get
X_t(P) = (K/|2P|)·ln(t) = V(2P)·ln(t)
[Q, exp(−iX_t)] = X′_t(P)exp(−iX_t)
or equivalently,
exp(iX_t)Q·exp(−iX_t) = Q + X′_t(P)
Then,
exp(iX_t)F(Q)·exp(−iX_t) = F(Q + X′_t(P))
so that
where
K(x) = (2π)^{−nx/2}
Thus we obtain by taking 2/(1 − x) = p, 2/(1 + x) = q that
||Ff||_p ≤ K(1 − 2/p)·||f||_q
An application:
Let p, q be such that
1 = 1/p + 1/q, p ≥ 2
Let q , q be defined by
Then, q ≥ 2,
1/2 = 1/q − 1/p
ie,
1/q = 1/2 + 1/p
and we have by Holder’s and Hausdorff-Young inequalities,
where χN is the indicator of the closed ball in Rn with centre at the origin and
radius N . It is readily seen that
||φ − φN||_p → 0, N → ∞,
||ψ − ψN||_p → 0, N → ∞
Further,
||φ(P)ψ(Q) − φN(P)ψN(Q)||
= ||(φ(P) − φN(P))ψ(Q) + φN(P)(ψ(Q) − ψN(Q))||
≤ ||(φ(P) − φN(P))ψ(Q)|| + ||φN(P)(ψ(Q) − ψN(Q))||
≤ K||φ − φN||_p·||ψ||_p + K||φN||_p·||ψ − ψN||_p → 0, N → ∞
In other words, φN (P )ψN (Q) converges in the operator norm to φ(P )ψ(Q).
Thus, if we can prove that φN (P )ψN (Q) is compact for each N > 0, then
we would establish that φ(P )ψ(Q) is a compact operator. Equivalently, by
the unitary equivalence of Q, P via the Fourier transform, we would get that
φ(Q)ψ(P ) is compact. Now, for any f ∈ L2 (Rn ), we have
F φN(P)ψN(Q)f(k) = φN(k)∫ ψ̃N(k − k′)f̃(k′)d^n k′
and hence, in the Fourier domain, F φN(P)ψN(Q)F^{−1} has the kernel φN(k)ψ̃N(k − k′).
Thus its Hilbert-Schmidt norm is given by
||φN(P)ψN(Q)||²_HS = ∫ |φN(k)ψ̃N(k − k′)|² d^n k·d^n k′
= (∫ |φN(k)|² d^n k)·(∫ |ψ̃N(k)|² d^n k)
= (∫ |φN(k)|² d^n k)·(∫ |ψN(k)|² d^n k) < ∞
Hence for each finite positive N , φN (P )ψN (Q) is Hilbert-Schmidt and hence
compact. This proves that φ(P )ψ(Q) and φ(Q)ψ(P ) are compact operators
whenever φ, ψ ∈ Lp (Rn ) for some p ≥ 2.
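A finite-dimensional sketch of the Hilbert-Schmidt computation: for a kernel of the form φ_N(k)ψ̃_N(k − k′) (here a cyclic discrete analogue with made-up vectors), the squared HS norm factorizes into the product of the two squared l² norms, and the operator norm is dominated by the HS norm:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
phi  = rng.normal(size=N)      # stands in for phi_N on a frequency grid
psit = rng.normal(size=N)      # stands in for the Fourier transform of psi_N

# discrete analogue of the kernel phi_N(k) psi~_N(k - k'), indices mod N
idx = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
K = phi[:, None]*psit[idx]

hs2  = float(np.sum(K**2))                      # squared Hilbert-Schmidt norm
prod = float(np.sum(phi**2)*np.sum(psit**2))    # the claimed factorization
opn  = float(np.linalg.norm(K, 2))              # operator norm
```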
||V″(Q)U0(t/x)exp(−iX_t)f||
≤ C2·||⟨Q⟩^{−3}U0(t/x)exp(−iX_t)f||
||V′(Q)U0(t/x)·Q·exp(−iX_t)f|| ≤
C1·Σ_j ||⟨Q⟩^{−2}U0(t/x)Qj·exp(−iX_t)f||
We now consider
P·QU0(s)exp(−iX_t) = P·([Q, U0(s)] + U0(s)Q)exp(−iX_t)
U0(s)exp(−iX_t)f
= P·QU0(s)exp(−iX_t)(2sH0)^{−1}f + C·ln(t)(2s)^{−1}U0(s)exp(−iX_t)H0^{−3/2}f
Also
P·Q = Q·P − [Q, P] = Q·P − in
so
U0(s)exp(−iX_t)f
= Q·U0(s)exp(−iX_t)(2sH0)^{−1}Pf − in(2s)^{−1}U0(s)exp(−iX_t)H0^{−1}f + C·ln(t)(2s)^{−1}U0(s)exp(−iX_t)H0^{−3/2}f
Also note that
||⟨Q⟩^{−2}U0(s)Qj·exp(−iX_t)f||
= ||⟨Q⟩^{−2}([U0(s), Qj] + Qj·U0(s))exp(−iX_t)f||
= ||⟨Q⟩^{−2}(−2sPj + Qj)U0(s)exp(−iX_t)f||
Suppose we make the inductive hypothesis that for large t,
Then we easily deduce from the above inequalities that we can take any fk+1 , ck+1
satisfying
fk+1(t)·ck+1(f) ≥
t^{−1}·fk(t)ck((2H0)^{−1}Pj·f) + fk(t)ck(n(2H0)^{−1}f) + C·(ln(t)/t)·fk(t)ck(2^{−1}H0^{−3/2}f)
Remark: We have made use of
||⟨Q⟩^{−1}|| = 1
and therefore,
||⟨Q⟩^{−k−1}φ|| ≤ ||⟨Q⟩^{−k}φ||
C·exp(−ikz) = C·exp(−ikr·cos(θ))
= C·Σ_{l≥0} r^{−1}(a_l(k)·exp(ikr) + b_l(k)·exp(−ikr))P_l(cos(θ))
The total wave function after scattering is in the asymptotic limit r → ∞ given by
C·exp(−ikr·cos(θ)) + f(θ)·exp(ikr)/r
= r^{−1}·exp(ikr)(f(θ) + C·Σ_l a_l(k)·P_l(cos(θ)))
+ r^{−1}·exp(−ikr)·C·Σ_l b_l(k)·P_l(cos(θ))
and we get on matching these two expressions for the asymptotic wave function,
f(θ) + C·Σ_l a_l(k)·P_l(cos(θ)) = Σ_l c_l(k)·P_l(cos(θ))
C·Σ_l b_l(k)·P_l(cos(θ)) = Σ_l d_l(k)·P_l(cos(θ))
≤ R·||Tm − Tn|| → 0, m, n → ∞
where
R = sup_n ||xn||
Thus, {yn } is a Cauchy sequence and hence convergent, say
yn → y
Finally, we have
||T zk − y|| = ||(T − Tm)zk + Tm zk − ym + ym − y||
≤ ||T − Tm||·||zk|| + ||Tm zk − ym|| + ||ym − y||
We then get
limsup_k ||T zk − y|| ≤ R·||T − Tm|| + ||ym − y||
Letting m → ∞ then gives us
lim_k ||T zk − y|| = 0
ie, the subsequence {zk } of {xn } has the property that T zk → y proving that
T is compact.
and so
||Tx||² ≤ (Σ_n ||T en||·|⟨en, x⟩|)²
≤ (Σ_n ||T en||²)·(Σ_n |⟨en, x⟩|²) = ||T||²_HS·||x||²
T_N = Σ_{n,m=1}^N |en⟩⟨en|T|em⟩⟨em|
||T − T_N||² ≤ ||T − T_N||²_HS
= Σ_{n>N or m>N} |⟨en|T|em⟩|² → 0, N → ∞
since
Σ_{n,m≥1} |⟨en|T|em⟩|² = Σ_n ⟨en|T*T|en⟩ = Σ_n ||T en||² = ||T||²_HS < ∞
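The bound ||Tx|| ≤ ||T||_HS ||x|| and the operator-norm convergence of the finite-rank truncations T_N can be illustrated on a matrix with rapidly decaying entries (the decay profile is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 50
n = np.arange(1, d + 1)
# an operator whose matrix elements <e_n|T|e_m> decay rapidly
T = rng.normal(size=(d, d))/(n[:, None]**2*n[None, :]**2)

hs = float(np.linalg.norm(T, 'fro'))             # Hilbert-Schmidt norm
x  = rng.normal(size=d)
bound_ok = np.linalg.norm(T @ x) <= hs*np.linalg.norm(x) + 1e-12

# finite-rank truncations T_N keep only the top-left N x N block
op_tail, fro_tail = [], []
for Ncut in (5, 10, 20):
    TN = np.zeros_like(T)
    TN[:Ncut, :Ncut] = T[:Ncut, :Ncut]
    op_tail.append(float(np.linalg.norm(T - TN, 2)))      # operator norm
    fro_tail.append(float(np.linalg.norm(T - TN, 'fro'))) # HS norm
```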
W*·exp(it(H + V)) = exp(itH)·W*
W = I + Γ(V W) − − − (1)
where
Γ(X) = i∫_0^∞ exp(itH)V X·exp(−itH)dt
V = Σ_{k=1}^p c(k)|ek⟩⟨ek|, c(k) ∈ R, ⟨ek|em⟩ = δ(k, m)
Note that
Γ(XV) = i·Σ_{k=1}^p c(k)∫_0^∞ exp(itH)|ek⟩⟨ek|X·exp(−itH)dt
and hence
⟨u|Γ(XV)|v⟩ = i·Σ_{k=1}^p c(k)∫_0^∞ ⟨u|exp(itH)|ek⟩⟨ek|X·exp(−itH)|v⟩dt
Show that W transforms Hac (H) into Hac (H + V ). To do this part, we note
that
W.exp(itH) = exp(it(H + V ))W, t ∈ R
and hence for any Borel set B ⊂ R, we have that
where EH (.) is the spectral measure of H while EH+V (.) is the spectral measure
of H + V. Therefore, in particular, the measure B → ||EH(B)|u⟩||² is absolutely continuous iff the measure B → ||EH+V(B)W|u⟩||² is absolutely continuous which is equivalent to saying that |u⟩ ∈ Hac(H) iff W|u⟩ ∈ Hac(H + V).
Note that
exp(itH) = ∫ exp(itx)EH(dx)
Also,
∫_0^∞ θ(t)exp(−itx)dt = 1/(ix) + π·δ(x)
This formula implies that
π·δ(H − y) − iR(y, H) = ∫_0^∞ θ(t)exp(−it(H − y))dt
We have
∫_{a+0}^b (R(H, x + iε) − R(H, x − iε))dx
= ∫_R dE(y)∫_{a+0}^b ((y − x − iε)^{−1} − (y − x + iε)^{−1})dx
= ∫_R dE(y)∫_{a+0}^b 2iε·((y − x)² + ε²)^{−1} dx
which converges as ε → 0+ to
2πi·∫_R dE(y)∫_{a+0}^b δ(y − x)dx = 2πi·∫_{a+0}^b dE(x)
= 2πi·E((a, b])
Thus we have a formula that expresses directly the spectral measure in terms
of the resolvent for a self-adjoint operator.
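This resolvent formula for the spectral measure (Stone's formula) can be checked on a finite-dimensional self-adjoint "operator": on each spectral subspace the integrand is (1/π)ε/((y − x)² + ε²), whose x-integral is an arctan difference that converges to the indicator of (a, b]. The spectrum below is chosen by hand:

```python
import numpy as np

rng = np.random.default_rng(3)
Qm, _ = np.linalg.qr(rng.normal(size=(5, 5)))
lam = np.array([-2.0, -1.0, 0.5, 1.5, 3.0])   # spectrum, chosen by hand
H = Qm @ np.diag(lam) @ Qm.T
a, b, eps = 0.0, 2.0, 1e-6

# (2 pi i)^{-1} int_a^b (R(H, x+i eps) - R(H, x-i eps)) dx, evaluated in
# closed form on each eigenspace as an arctan difference
w = (np.arctan((b - lam)/eps) - np.arctan((a - lam)/eps))/np.pi
E_eps = Qm @ np.diag(w) @ Qm.T

# the exact spectral projection E((a, b])
E_exact = Qm @ np.diag(((lam > a) & (lam <= b)).astype(float)) @ Qm.T
err = float(np.abs(E_eps - E_exact).max())
```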
Ω+ = s·lim_{t→∞} exp(itH)·exp(−itH0)
≤ ∫_0^T ||(Ω+ − exp(itH)·exp(−itH0))f||·ε·exp(−εt)dt
+ ∫_T^∞ ||(Ω+ − exp(itH)·exp(−itH0))f||·ε·exp(−εt)dt
Now by hypothesis, for any δ > 0, there exists a finite T = T(δ) such that
≤ 2εT·||f|| + δ·||f||·exp(−εT)
Letting first ε → 0 and then δ → 0 yields the desired result.
[3] Every symmetric operator is closed since for such an operator A, we have
that A ⊂ A∗ and A∗ is closed. We wish to explore two issues. One, what are all
the closed symmetric extensions of A and two, when does A have a self-adjoint
extension. To this end, define the deficiency indices ν± by
We note that
ν− = dimR(A − i)⊥ , ν+ = dimR(A + i)⊥
Note that (A ± i)−1 exist and are bounded. This follows for example from
||(A + i)f||² = ||Af||² + ||f||², f ∈ D(A)
A + i thus maps D(A) = D(A + i) = R((A + i)^{−1}) onto R(A + i). The bounded
operator U = (A + i)(A − i)−1 maps R(A − i) onto R(A + i). The former
subspace has a co-dimension of ν− while the latter subspace has a co-dimension
of ν+ . It is easy to see that U is an isometry. It is in fact easy to see that U
is a unitary operator between the space R(A − i) and R(A + i). Now let V be
a unitary operator between a subspace M1 of R(A − i)⊥ and a subspace M2
of R(A + i)⊥ . Then W = U ⊕ V is a unitary operator between the subspace
R(A − i) ⊕ M1 and the subspace R(A + i) ⊕ M2 . We can recover A from its
Cayley transform U via the equation
A = i(U − 1)−1 (U + 1)
so that
(U − 1)Ag = i(U + 1)g, g ∈ D(A)
We have
(1 + U )(A − i)g = (A + i)g + (A − i)g = 2Ag, g ∈ D(A)
and
(1 − U )(A − i)g = (A − i)g − (A + i)g = −2ig, g ∈ D(A)
These give
Ag = −i(1 + U )(1 − U )−1 g, g ∈ D(A)
Thus,
D(A) = D((1 − U )−1 )) = R(1 − U )
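In finite dimensions a Hermitian matrix has deficiency indices (0, 0), and the Cayley transform together with its inversion Ag = −i(1 + U)(1 − U)^{−1}g can be checked directly (the matrix A below is a random Hermitian test case):

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(4, 4)) + 1j*rng.normal(size=(4, 4))
A = (B + B.conj().T)/2                        # self-adjoint test operator
I = np.eye(4)

U = (A + 1j*I) @ np.linalg.inv(A - 1j*I)      # Cayley transform
unitary_err = float(np.abs(U @ U.conj().T - I).max())

# recover A from its Cayley transform: A = -i(1+U)(1-U)^{-1}
A_back = -1j*(I + U) @ np.linalg.inv(I - U)
recover_err = float(np.abs(A_back - A).max())
```

Note that 1 − U is always invertible here since (λ + i)/(λ − i) ≠ 1 for real λ.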
Let A be closed and symmetric. We then wish to determine the structure of D(A*) and A*.
We claim that
D(A∗ ) = D(A) + R(A − i)⊥ + R(A + i)⊥
with f ∈ D(A) implying A∗ f = Af , f ∈ R(A − i)⊥ implying A∗ f = −if and
f ∈ R(A + i)⊥ implying A∗ f = if . To see that A∗ is well defined, suppose
f ∈ D(A), g ∈ R(A − i)⊥ = N (A∗ + i), h ∈ R(A + i)⊥ = N (A∗ − i) are such
that
f +g+h=0
Then, g + h ∈ D(A) and we have
A*(f + g + h) = Af − ig + ih = −A(g + h) − ig + ih = −(A + i)g − (A − i)h
= −(A* + i)g − (A* − i)h = 0
and hence A∗ is consistently defined. We have further, for f ∈ D(A), g ∈
N (A∗ + i), h ∈ N (A∗ − i) and u ∈ D(A) that
< f + g + h, Au >=< A∗ (f + g + h), u >=< Af − ig + ih, u >
Suppose now that f ∈ D(A∗ ). We wish to show that f can be expressed as the
sum of three vectors in D(A), N (A∗ + i) and N (A∗ − i) respectively. We first
project (A* + i)f onto R(A + i) and hence we can write
(A* + i)f = (A + i)f1 + f2, f1 ∈ D(A), f2 ∈ R(A + i)⊥ = N(A* − i)
Then,
(A* + i)(f − f1) = f2
since D(A) ⊂ D(A∗ ) by the assumed symmetry of A. Now,
(A∗ − i)f2 = 0
and hence
(A∗ + i)(f − f1 ) + (A∗ + i)if2 /2 = 0
or equivalently,
(A∗ + i)(f − f1 + if2 /2) = 0
and therefore,
f − f1 + if2 /2 ∈ N (A∗ + i)
Chapter 3
[3] If (X, μ) is a σ-finite measure space and Y is a normed vector space, then the set of all measurable maps f : X → Y with ∫_X ||f(x)||^p dμ(x) < ∞ is a normed vector space and in fact a Banach space with norm
||f||_p = (∫ ||f(x)||^p dμ(x))^{1/p}
Prove that this is indeed a norm and establish completeness under this norm.
[4] If X is a normed linear space and Y is a Banach space, then the set of all bounded continuous maps f : X → Y with the norm
||f|| = sup_{x∈X} ||f(x)||
is again a Banach space.
[9] Tensor products of Hilbert spaces and properties of the tensor product.
The Boson and Fermion Fock spaces with application to describing the nature
of the state of bosons and Fermions. Gelfand-Naimark-Segal (GNS) principle
for construction of the tensor product between Hilbert spaces based on Schur’s
positivity theorem for matrices and the Kolmogorov consistency theorem for
stochastic processes. (This proof is due to Professor K.R.Parthasarathy).
[12] The unit open ball in a finite dimensional normed vector space is compact
while that in an infinite dimensional normed vector space is non-compact. Proof
of this fact.
[13] Proof of the fact that all norms on a finite dimensional vector space
are equivalent, ie generate the same topology which is not the case for infinite
dimensional vector spaces.
[14] Vector space, Banach space and Hilbert space isomorphisms with exam-
ples.
[15] Properties of linear operators in a vector space: Rank-nullity theorem,
range, nullspace of infinite dimensional operators, examples of unbounded oper-
ators like position, momentum, creation, annihilation, conservation, angular mo-
mentum energy operators in quantum mechanics, the domain of an unbounded
operator.
[16] Adjoint of an operator, uniqueness of the adjoint when the operator is
densely defined, closed operators, closable operators and closure of an operator
in the unbounded case.
[17] Proof of the fact that if an operator defined in a Banach space or even
a normed linear space X has a dense domain and is bounded, then it can be
uniquely extended to a bounded operator on the whole of X .
[18] The open mapping/closed graph theorem: If a closed operator (ie, an
operator with a closed graph) in a Banach space X has domain the whole of X,
then it is bounded. If the operator is closed and has an inverse, then its inverse
is also closed and hence by the above theorem, the operator has a bounded
inverse, ie, the operator maps open sets onto open sets.
[19] The spectral theorem for normal operators in a finite dimensional Hilbert
space with proof.
[20] Statement of the spectral theorem for compact normal operators in an
infinite dimensional Hilbert space.
[21] Statement of the spectral theorem for bounded and unbounded self-
adjoint operators in an infinite dimensional Hilbert space.
[22] A short survey of spectral measures on a measurable space and spectral
integration with applications to the description of quantum mechanical observ-
ables like position, momentum, energy and angular momentum. Description of
[30] The matrix inversion lemma and its application to the recursive least
squares algorithm for real time estimation of parameters in LIP systems.
[31] Inverting a matrix when one row and one column is appended to it and
its application to the recursive least squares lattice algorithm for forward and
backward prediction of a time series in a time and order recursive manner.
[2] Wavelets
Note that since φ(x − k), k ∈ Z is a basis for V0 , it follows from the above
assumption that φ(2x − k), k ∈ Z is a basis for V1 . u[k] is called the scaling
sequence. Let
ψ(x) = Σ_k v[k]·φ(2x − k)
Then,
ψ̂(ω) = (1/2)Σ_k v[k]·exp(−jkω/2)·φ̂(ω/2)
= v̂(ω/2)φ̂(ω/2)
and likewise,
φ̂(ω) = û(ω/2)φ̂(ω/2)
In these equations,
φ̂(ω) = ∫_R φ(x)·exp(−jωx)dx,
ψ̂(ω) = ∫_R ψ(x)·exp(−jωx)dx,
û(ω) = Σ_n u[n]·exp(−jωn),
v̂(ω) = Σ_n v[n]·exp(−jωn)
Thus,
⟨ψ(x), φ(x − k)⟩ = ⟨ψ̂(ω), exp(−jkω)φ̂(ω)⟩
= ∫ exp(−jkω)·v̂(ω)*·û(ω)·|φ̂(ω/2)|² dω
If we take
v[k] = (−1)k u[−k − 1]
then,
v̂(ω) = Σ_k (−1)^k u[k]·exp(jω(k + 1)) = exp(jω)·û(ω + π)*
k
and hence
û(ω)v̂(ω)∗ = exp(−jω)û(ω)û(ω + π)
Thus,
û(ω + π)·v̂(ω + π)* = −exp(−jω)·û(ω)·û(ω + π)
= −û(ω)·v̂(ω)*
Thus,
û(ω)·v̂(ω)* + û(ω + π)·v̂(ω + π)* = 0
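The quadrature-mirror identity û(ω)v̂(ω)* + û(ω + π)v̂(ω + π)* = 0 with v[k] = (−1)^k u[−k − 1] can be verified for the Haar scaling sequence u[0] = u[1] = 1 (an illustrative choice):

```python
import numpy as np

u = {0: 1.0, 1: 1.0}                                 # Haar scaling sequence
v = {-k - 1: (-1.0)**(-k - 1)*u[k] for k in u}       # v[k] = (-1)^k u[-k-1]

def dtft(c, w):
    # discrete-time Fourier transform of a finitely supported sequence
    return sum(ck*np.exp(-1j*w*k) for k, ck in c.items())

w = np.linspace(0.0, 2*np.pi, 101)
qmf = dtft(u, w)*np.conj(dtft(v, w)) + dtft(u, w + np.pi)*np.conj(dtft(v, w + np.pi))
max_abs = float(np.abs(qmf).max())
```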
We then find that S[k] = ⟨ψ(x), φ(x − k)⟩ is given by
S[k] = ∫ ψ(x)φ(x − k)dx = ∫ ψ̂(ω)*·φ̂(ω)·exp(−jωk)dω
= ∫ v̂(ω/2)*·û(ω/2)·|φ̂(ω/2)|²·exp(−jωk)dω
= ∫ exp(−jω(k − 1/2))·û(ω/2)·û(ω/2 + π)·|φ̂(ω/2)|² dω
= ∫ exp(−jω(2k − 1))·û(ω)·û(ω + π)·|φ̂(ω)|² dω
= (2N + 1)^{−1}·Σ_{n=−N}^N (−1)^n ∫ exp(−jω(2k − 1))·û(ω)·û(ω + π)·|φ̂(ω + nπ)|² dω
= −(2N + 1)^{−1}·Σ_{n=−N}^N (−1)^n ∫ exp(−jω(2k − 1))·û(ω)·û(ω + π)·|φ̂(ω + (n + 1)π)|² dω
Now define
χN(ω) = Σ_{n=−N}^N (−1)^n |φ̂(ω + nπ)|²
Then, we get
S[k] = (2N + 1)^{−1} ∫ exp(−jω(2k − 1))·û(ω)·û(ω + π)·χN(ω)dω
Also
Thus, denoting
χ(ω) = limN →∞ (2N + 1)−1 χN (ω),
we get
χ(ω + π) = −χ(ω)
and
S[k] = exp(−jω(2k − 1))u(ω)u(ω + π)χ(ω)dω
so if we assume that
limω→∞ |φ̂(ω)|2
exists, then
and this proves that ψ(x) is orthogonal to φ(x−k) for every integer k. It follows
easily from this that the subspace W1 = Clspan{ψ(x − k) : k ∈ Z} is orthogonal
to the subspace V0 = Clspan{φ(x − k) : k ∈ Z}. We claim further that
V1 = V0 ⊕ W1
where
W1 = Clspan{ψ(x − k) : k ∈ Z}
Orthogonality of this sum has been proved. It remains only to show that any
f ∈ V1 can be expressed as a finite/infinite linear combination of elements from
V0 and W1 . Since φ(2x − k), k ∈ Z is a basis for V1 , it suffices to show that
φ(2x − k) = Σ_m a[m]·φ(x − m) + Σ_m b[m]·ψ(x − m)
for some sequences a[m], b[m]. Taking Fourier transforms, we get that it suffices
to show that
for some functions A, B having period 2π. Thus, it amounts to proving that
there exist 2π-periodic functions A, B such that if F (ω) denotes the rhs of (a),
then F (ω + 2π) = (−1)k F (ω), ie,
has period 2π. This follows immediately from the 2π-periodicity of û, v̂. Thus,
having proved that
V1 = V0 ⊕ W1
is an orthogonal direct sum, it follows by applying the unitary scaling operator S : f(x) → √2·f(2x) to this direct sum n times that
Vn+1 = Vn ⊕ Wn+1 , n ∈ Z
where
W_{n+1} = span{2^{n/2}·ψ(2^n x − k) : k ∈ Z}
and hence we get the direct sum decomposition
L²(R) = ⊕_{n∈Z} W_n
as an orthogonal direct sum where in deriving this we use the MRA properties
L²(R) = Cl(∪_n V_n), {0} = ∩_n V_n, V_n ⊂ V_{n+1}, S(V_n) = V_{n+1}
with the additional property that ψ_{n,k} ⊥ ψ_{m,l}, n ≠ m. However to get an onb wavelet basis for L²(R) we require in addition that ψ_{n,k} ⊥ ψ_{n,l}, k ≠ l. We shall prove this under the assumption that φ(x − k), k ∈ Z is an onb (orthonormal basis) for V0, not merely a basis. Thus, we have the relations
⟨φ(x), φ(x − k)⟩ = ∫ φ(x)φ(x − k)dx = δ[k]
Then
φ(x) = Σ_m u[m]·φ(2x − m)
implies
δ[k] = ⟨φ(x), φ(x − k)⟩ = Σ_{m,n} u[m]u[n]·⟨φ(2x − m), φ(2x − 2k − n)⟩
= (1/2)Σ_{m,n} u[m]u[n]·⟨φ(x − m), φ(x − 2k − n)⟩ = (1/2)Σ_{m,n} u[m]u[n]·δ[2k + n − m]
= (1/2)Σ_n u[n]·u[n + 2k] − − − (b)
where
Rm u[n] = u[n − m]
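Condition (b), (1/2)Σ_n u[n]u[n + 2k] = δ[k], can be checked for a concrete scaling sequence; the Daubechies D4 coefficients below (normalized so that Σu[k] = 2) are a standard example:

```python
import numpy as np

s3 = np.sqrt(3.0)
# Daubechies D4 scaling sequence, normalized so that sum(u) = 2
u = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3])/4

# (1/2) sum_n u[n] u[n+2k] should be 1 for k = 0 and 0 otherwise
vals = []
for k in range(3):
    vals.append(0.5*sum(u[n]*u[n + 2*k] for n in range(len(u) - 2*k)))
```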
Taking the DTFT of (b) then gives
(b) and (c) are equivalent. Conversely, if u satisfies (b) or (c) and φ(x) satisfies
the functional equation
φ(x) = Σ_k u[k]·φ(2x − k) − − − (d)
then φ(x − k), k ∈ Z will be an orthogonal set with the same norm and hence after appropriate normalization, these will form an onb for V0 = Clspan{φ(x − k) : k ∈ Z} and by the above procedure, we can then obtain an orthonormal wavelet basis ψ_{n,k}(x) = 2^{n/2}·ψ(2^n x − k), n, k ∈ Z for L²(R). To solve (d) for φ(x) in terms of the scaling sequence u[k], we take the Fourier transform on both sides
to get
φ̂(ω) = û(ω/2)·φ̂(ω/2)
which gives on iteration,
φ̂(ω) = φ̂(0)·Π_{n=1}^∞ û(ω/2^n)
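The iterated product formula can be illustrated for the Haar case. Here the symbol is normalized as m0(ω) = (1/2)Σ_k u[k]e^{−jωk} so that m0(0) = 1 (the text leaves such constants implicit); the infinite product then reproduces the Fourier transform of the indicator of [0, 1]:

```python
import numpy as np

# normalized Haar symbol m0(w) = (1 + e^{-jw})/2, with m0(0) = 1
m0 = lambda w: (1 + np.exp(-1j*w))/2

w = np.linspace(0.1, 20.0, 200)
prod = np.ones_like(w, dtype=complex)
for n in range(1, 41):               # truncate the infinite product at 40 factors
    prod = prod*m0(w/2.0**n)

# closed form: Fourier transform of the indicator of [0, 1]
closed = np.exp(-1j*w/2)*np.sin(w/2)/(w/2)
max_err = float(np.abs(prod - closed).max())
```

The agreement follows from Viète's product Π cos(θ/2^n) = sin(θ)/θ applied at θ = ω/2.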
Now, suppose u[k] satisfies (c) (or equivalently (b)) and we define
or equivalently,
Σ_n v[n]·u[n + 2k] = 0
Taking the Fourier transform of this shows that we require only to show that
But this follows immediately from (e), the 2π periodicity of û and the fact that
exp(jπ) = −1.
M·2^n ≤ ω_m,
[3] Some prerequisites for making a transition from linear algebra to infinite
dimensional functional analysis
[1]
[a] General Topology, notion of a topological space, continuity of a func-
tion between two topological spaces, convergence of a sequence in a topological
space. Compactness of a topological space. The product topology, Tychonoff’s
compactness theorem and the axiom of choice.
[b] metric spaces, Cauchy sequences and convergence in metric spaces, the
topology on a metric space induced by a metric, open balls and closed balls in
a metric space. Convergence and continuity of sequences and functions on a
metric space. Equivalence of metrics in terms of the same topology induced by
two different metrics. Complete metric spaces. Completion of a metric space.
Sequential compactness of a metric space in terms of existence of a convergent
subsequence. Equivalence of sequential compactness to total boundedness (existence of a finite ε-net for any ε > 0) and completeness. Hence, equivalence of
sequential compactness to compactness of a metric space regarded as a topolog-
ical space. Rn as a metric space under any norm. Compactness of a subset of
Rn being equivalent to the closedness and boundedness of the subset of Rn
N^{m−1}v_{2,k} = Σ_{j=1}^{p1} c_j·N^{m−1}v_{1,j}
and hence v_{2,k} may be replaced by v_{2,k} − Σ_{j=1}^{p1} c_j·v_{1,j} without affecting the
required linear independence. Likewise, it is clear that
for R(N m−3 ). Further, it is clear that we can arrange matters so that
N^{m−2}v_{3,k} ∈ span{N^{m−1}v_{1,k1}, N^{m−2}v_{1,k1}, N^{m−2}v_{2,k2} : k1 = 1, 2, ..., p1, k2 = 1, 2, ..., p2}
and hence we can replace v_{3,k} by v_{3,k} minus a linear combination of the vectors N·v_{1,k1}, v_{1,k1}, v_{2,k2}, k1 = 1, 2, ..., p1, k2 = 1, 2, ..., p2 yielding the desired basis.
[3] An identity regarding resolvents of linear operators. The aim is to show
that if the resolvents of two operators are close to each other at a certain point
in the complex plane, then they are close to each other at all points in the
complex plane.
R(T, z) = (T − z)^{−1} = (T − z0 − (z − z0))^{−1} = R(T, z0)·(I − (z − z0)R(T, z0))^{−1} = Σ_{n≥0} (z − z0)^n·R(T, z0)^{n+1}
such that
f_i = Σ_j a_{ij}·e_j, 1 ≤ i ≤ n
Prove that
T(f_i) = Σ_j a_{ij}·T(e_j) = Σ_{j,k} a_{ij}t_{kj}·e_k = Σ_j s_{ji}·f_j
= Σ_{j,k} s_{ji}a_{jk}·e_k
and hence
Σ_j a_{ij}t_{kj} = Σ_j s_{ji}a_{jk}
and hence
[T]_B·A^T = A^T·[T]_{B′}
or equivalently,
[T]_{B′} = A^{−T}·[T]_B·A^T
[2] Let V be vector space and let {e1 , ..., en } and {f1 , ..., fm } be two bases
for V , ie, both are maximal linearly independent subsets of V . Prove that both
are minimal spanning sets for V and that n = m.
hint: To prove that n = m, write
f_i = Σ_j a_{ij}·e_j, e_j = Σ_k b_{jk}·f_k
x0′(t) = F(x0(t)), x1′(t) = F′(x0(t))x1(t) + w(t),
x2′(t) = F′(x0(t))x2(t) + (1/2)F″(x0(t))(x1(t) ⊗ x1(t))
Define
J(t) = F′(x0(t)),
Φ(t, s) = I + Σ_{n≥1} ∫_{s<tn<...<t1<t} J(t1)...J(tn)·dt1...dtn
Show that
||Φ(t, s)|| ≤ exp(∫_s^t ||J(u)||du)
and that
x1(t) = ∫_0^t Φ(t, s)w(s)ds,
||x1(t)|| ≤ (∫_0^t ||Φ(t, s)||ds)·sup_{0≤s≤t}||w(s)||
Also deduce that
x2(t) = ∫_0^t Φ(t, s)K(s)(x1(s) ⊗ x1(s))ds
where
K(s) = (1/2)F″(x0(s))
Define
|T| = √(T²), T+ = (|T| + T)/2, T− = (|T| − T)/2
show that
|T| = Σ_a |λ(a)|·P_a, T+ = Σ_{a:λ(a)>0} λ(a)·P_a,
T− = −Σ_{a:λ(a)<0} λ(a)·P_a
Define E(c) to be the orthogonal projection onto N((T − c)+). Show that
E(c) = Σ_{a:λ(a)≤c} P_a
Show that
lim_{c↓d} E(c) = E(d), c, d ∈ R
but
lim_{c↑d} E(c)
need not be E(d). Show that if λ(a), a = 1, 2, ..., r are arranged in ascending order, so that λ(a − 1) < λ(a) for all a, then
lim_{c↓λ(a)} E(c) = Σ_{b:b≤a} P_b
while
lim_{c↑λ(a)} E(c) = Σ_{b:b≤a−1} P_b
We write
lim_{c↓x} E(c) = E(x + 0), lim_{c↑x} E(c) = E(x − 0)
Thus, we have proved that
then
p_k(c_j) = δ(k, j)
Show that if f(t) ∈ C[t] and deg f < r, then
f(t) = Σ_{k=1}^r f(c_k)·p_k(t)
p(t) = Π_{j=1}^r (t − c_j)
p_j(T)p_k(T) = 0, j ≠ k, p_j(T)² = p_j(T)
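The projections p_j(T) built from the Lagrange interpolation polynomials can be verified numerically on a diagonalizable matrix with distinct eigenvalues (the spectrum below is an arbitrary choice); they are idempotent, mutually orthogonal, and sum to the identity:

```python
import numpy as np

c = np.array([1.0, 2.0, 4.0])            # distinct eigenvalues c_1, ..., c_r
rng = np.random.default_rng(6)
Qm, _ = np.linalg.qr(rng.normal(size=(3, 3)))
T = Qm @ np.diag(c) @ Qm.T               # diagonalizable with spectrum c

def p(k, M):
    # Lagrange polynomial p_k evaluated at the matrix M
    out = np.eye(3)
    for j in range(3):
        if j != k:
            out = out @ (M - c[j]*np.eye(3))/(c[k] - c[j])
    return out

P = [p(k, T) for k in range(3)]
idem_err = max(float(np.abs(Pk @ Pk - Pk).max()) for Pk in P)
orth_err = max(float(np.abs(P[j] @ P[k]).max())
               for j in range(3) for k in range(3) if j != k)
resolution_err = float(np.abs(sum(P) - np.eye(3)).max())
```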
hint: Take an onb {e1, e2, ...} for H and define T(e_j) = e_{2j}, j ≥ 1. Show that this T is an isometry, ie, T*T = I but T is not unitary.
[3] Let H be a separable Hilbert space, ie, H has a countable dense subset
D. Then, by applying the Gram-Schmidt process to the elements of D, deduce
that H has a countable orthonormal basis.
Show that
0 ≤ (T − c1 )+ ≤ (T − c2 )+
so the required isometries are easily constructed. The result of this problem is
known as the polar decomposition.
[3] Examples: The algebra of n×n matrices with values in a field, the ring of polynomials, the fields Fp, p a prime, R, C, the field of ratios of polynomials, the group of permutations of a set, the vector space F^n, F being any field, the
the group of permutations of a set, the vector space Fn , F being any field, the
vector space spanned by a set of functions on a set with values in a field. The
notion of algebraically closed and non-closed fields with examples.
[8] Inner product and Hilbert spaces in finite and infinite dimensions.
[10] The rank-nullity theorem, range and nullspace of a linear operator, Ech-
elon form of a matrix and its application to solving linear systems of equations.
[b] The spectral theorem for normal, Hermitian and unitary operators in
finite dimensional vector spaces.
[c] Bounded and unbounded operators in a Hilbert space, the spectral theo-
rem for bounded Hermitian operators in an infinite dimensional Hilbert space.
Proof based on construction of the square root of a positive operator.
[g] The LDU and U DL decomposition for positive definite matrices with
application to linear prediction theory.
[11] The matrix inversion lemma and its application to the RLS-Lattice
algorithm for time and order recursive prediction of a discrete time signal.
[12]
[a] Applications of vector space and linear transformation theory to quantum
mechanics: Pure state, mixed state, observables, Schrodinger and Heisenberg
dynamics, interaction picture dynamics, Dyson series solution to Schrodinger
evolution. Computing probabilities of events in quantum mechanics, projection
valued and positive operator valued measurements, collapse of a state following
a measurement, time independent perturbation theory,
[b] Scattering theory in quantum mechanics, the wave operators, Lippmann-
Schwinger equations for the scattered states, Born approximation.
[13] Basic quantum information theory over finite dimensional Hilbert spaces.
[a] Proof of the Shannon noiseless and noisy coding theorems and their converses
in classical communication theory based on typical sequences and the Feinstein-
Khintchine fundamental lemma.
[16] Application of the singular value decomposition to the data matrix based
approach to MUSIC and ESPRIT algorithms.
[17] Definition and properties of tensor product of vector spaces and opera-
tors.
[a] Definition of the tensor product of vector spaces and specialization to the
Kronecker tensor product.
[b] Symmetric and antisymmetric tensor products of vector spaces.
[c] Application of tensor products to Maxwell-Boltzmann, Bose-Einstein and
Fermi-Dirac statistics of elementary particles.
[d] Tensor product of infinite dimensional Hilbert spaces, construction based
on the GNS principle combined with Kolmogorov’s consistency theorem.
[18] Basics of classical and quantum filtering theory and control. (This comes
under the heading "stochastic calculus in Boson-Fock space", a special
kind of Hilbert space constructed using multiple tensor products.)
[a] The Hudson-Parthasarathy quantum stochastic calculus and the HP noisy
Schrodinger equation.
[b] Derivation of the Belavkin filter for a mixture of quadrature and photon
counting measurements.
[c] Quantum control applied to the Belavkin filter for reduction of Lindblad
noise.
[c] The generalized Lipshitz conditions for proving existence and uniqueness
of solutions to sde’s driven by discontinuous semi-martingales.
[d] Applications of stochastic calculus to mathematical finance.
[e] Stochastic optimal control for Markov processes.
[f] Stochastic nonlinear filtering for Markov processes in the presence of Levy
measurement noise.
ef (t|p) = x(t) − x̂p (t) = ∑_{k=0}^{p} ap (t, k)x(t − k), ap (t, 0) = 1
The prediction filter coefficients ap (t, k) are determined from the normal
equations
< ef (t|p), x(t − k) >= 0, k = 1, 2, ..., p
or equivalently,
R(t, t − k) = − ∑_{m=1}^{p} ap (t, m)R(t − m, t − k), 1 ≤ k ≤ p
These are obtained by minimizing < ef (t|p), ef (t|p) >. It is easy to see that
ef (t|p) is orthogonal to ef (t − k|p − k), k = 1, 2, ..., p. Then, we can write
where Up (t) is the above upper triangular matrix. Forming the correlation
matrix on both sides then gives
where
Rp (t) = ((R(t − k, t − m)))0≤k,m≤p
is a (p + 1) × (p + 1) positive definite matrix and
with
Ef (t|p) =< ef (t|p), ef (t|p) >
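For a wide-sense stationary signal, R(t − m, t − k) depends only on k − m and the normal equations become a Toeplitz system; a minimal sketch (the AR(1) autocovariance used below is an illustrative assumption, and the sign convention ap (t, 0) = 1 is the one used above):

```python
import numpy as np

# Solve R(k) = -sum_m a_p(m) R(k-m), 1 <= k <= p, for the stationary
# forward prediction coefficients a_p(1..p), with a_p(0) = 1.
def forward_predictor(Rv, p):
    """Rv[k] = autocovariance R(k), k = 0..p. Returns a_p(1..p)."""
    R = np.array([[Rv[abs(k - m)] for m in range(1, p + 1)]
                  for k in range(1, p + 1)])       # Toeplitz normal matrix
    r = np.array([Rv[k] for k in range(1, p + 1)])
    return -np.linalg.solve(R, r)

# AR(1) process x(t) = 0.8 x(t-1) + w(t): autocovariance R(k) ∝ 0.8^|k|
Rv = [0.8 ** k for k in range(6)]
a = forward_predictor(Rv, 2)
# the optimal order-2 predictor of an AR(1) uses only one past sample
assert np.allclose(a, [-0.8, 0.0], atol=1e-10)
```

The prediction error is then ef (t|2) = x(t) − 0.8x(t − 1), as expected for an AR(1) model.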
This decomposition can also be expressed as
E(ρ⊗m , δ) = ∑_{(j1 ,...,jm )∈TB (m,q,δ)} |j1 , ..., jm >< j1 , ..., jm |
Define
ρ(u) = ⊗x∈A ρ(x)N (x|u) , u ∈ An
where N (x|u) denotes the number of times that the element x occurs in the se-
quence u. Prove the following statements:
[a]
p(n) (TB (n, p, δ)) ≥ 1 − a/δ 2
[b]
p(n) (u) = 2^{−nH(p)+δ.O(√n)} , u ∈ TB (n, p, δ)
where
H(p) = − ∑_{x∈A} p(x).log(p(x))
[c]
μ(TB (n, p, δ)) = 2^{n.H(p)+δ.O(√n)}
[d]
2^{−mH(ρ)−δ.O(√m)} E(ρ⊗m , δ) ≤ E(ρ⊗m , δ)ρ⊗m E(ρ⊗m , δ) ≤ 2^{−mH(ρ)+δ.O(√m)} E(ρ⊗m , δ)
where
H(ρ) = −T r(ρ.log(ρ)) = − ∑_j q(j).log(q(j))
[e]
2^{−n.∑_x p(x)H(ρ(x))−δ.O(√n)} E(n, u, δ) ≤ E(n, u, δ)ρ(u)E(n, u, δ)
≤ 2^{−n.∑_x p(x)H(ρ(x))+δ.O(√n)} E(n, u, δ)
[i]
M = Mn ≥ 2^{n.I(p,ρ)−δ.O(√n)}
so that
liminfn→∞ log(Mn )/n ≥ I(p, ρ)
where
I(p, ρ) = H( ∑_x p(x)ρ(x)) − ∑_x p(x)H(ρ(x))
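Statement [b] is the familiar asymptotic equipartition property; a quick Monte Carlo sketch of it (the distribution p below is an arbitrary illustrative choice):

```python
import numpy as np

# For u drawn iid from p, the per-symbol log-probability -log2 p^(n)(u) / n
# concentrates at the entropy H(p), so p^(n)(u) ≈ 2^{-nH(p)} for typical u.
rng = np.random.default_rng(2)
p = np.array([0.5, 0.25, 0.25])
H = -np.sum(p * np.log2(p))            # H(p) = 1.5 bits
n = 100000
u = rng.choice(len(p), size=n, p=p)    # a sample sequence u of length n
log_prob = np.sum(np.log2(p[u]))       # log2 p^(n)(u)
assert abs(-log_prob / n - H) < 0.05   # u is typical with high probability
```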
[2] Calculate the row reduced echelon form of the following 3 × 3 matrix by
indicating the sequence of row and column operations:
A = ⎛ 0 a12 a13 ⎞
    ⎜ 0 a22 a23 ⎟
    ⎝ 0  0   0  ⎠
where
a12 , a13 , a22 , a23 and a12 a23 − a13 a22 are all nonzero
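A hand-rolled row reduction shows the pivots landing in columns 2 and 3 (a sketch; the concrete entries a12 = 1, a13 = 2, a22 = 3, a23 = 4 are an illustrative choice with a12 a23 − a13 a22 ≠ 0):

```python
import numpy as np

def rref(A, tol=1e-12):
    """Row-reduce A to reduced row echelon form by Gaussian elimination
    with partial pivoting; returns (R, pivot_columns)."""
    R = A.astype(float).copy()
    pivots, row = [], 0
    for col in range(R.shape[1]):
        if row >= R.shape[0]:
            break
        p = row + np.argmax(np.abs(R[row:, col]))   # partial pivot
        if abs(R[p, col]) < tol:
            continue                                # no pivot in this column
        R[[row, p]] = R[[p, row]]                   # row swap
        R[row] /= R[row, col]                       # scale pivot row to 1
        for r in range(R.shape[0]):                 # clear the pivot column
            if r != row:
                R[r] -= R[r, col] * R[row]
        pivots.append(col)
        row += 1
    return R, pivots

A = np.array([[0, 1, 2], [0, 3, 4], [0, 0, 0]])  # a12 a23 - a13 a22 = -2 != 0
R, piv = rref(A)
assert piv == [1, 2]                             # rank 2; pivots in columns 2, 3
assert np.allclose(R, [[0, 1, 0], [0, 0, 1], [0, 0, 0]])
```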
where
a11 , a12 , a22 , a33 ∈ R
Use this decomposition to evaluate exp(tA).
[4] State and prove the Cayley-Hamilton theorem. Use it to evaluate cos(A)
where
A = ⎛ a b ⎞
    ⎝ c d ⎠
where a, b, c, d are complex numbers such that the two eigenvalues of A are
distinct.
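A numerical sketch of this use of the Cayley-Hamilton theorem (the concrete matrix A below, with distinct eigenvalues 4 and −1, is an illustrative choice): since A² is a linear combination of I and A, cos(A) = αI + βA, where α, β interpolate cos at the two eigenvalues.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 2.0]])     # eigenvalues 4 and -1 (distinct)
lam = np.linalg.eigvals(A)
# solve cos(lambda_i) = alpha + beta * lambda_i for alpha, beta
alpha, beta = np.linalg.solve(np.array([[1, lam[0]], [1, lam[1]]]),
                              np.cos(lam))
cosA = alpha * np.eye(2) + beta * A

# cross-check against the eigendecomposition definition of cos(A)
w, S = np.linalg.eig(A)
ref = (S * np.cos(w)) @ np.linalg.inv(S)
assert np.allclose(cosA, ref)
```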
p(t) = (t − λ1 )2 (t − λ2 )
such that (U2 − c2 )e2 ∈ W for some scalar c2 , ie, (T2 − c2 )e2 ∈ W . Obviously,
(T1 − c1 )e2 ∈ W by the definition of V1 . Now let V2 denote the set of all x such
that (T1 − c1 )x ∈ W, (T2 − c2 )x ∈ W . Then, V2 is F-invariant and contains both
e2 and W . In other words, V2 properly contains W and hence there exists an
e3 ∉ W and a scalar c3 such that (T3 − c3 )e3 ∈ W . Obviously, by the definition
of V2 , we have that (T1 − c1 )e3 ∈ W, (T2 − c2 )e3 ∈ W . Continuing in this
way, after finitely many steps, we obtain a vector en ∉ W and scalars c1 , ..., cn such
that (Tj − cj )en ∈ W, j = 1, 2, ..., n and this completes the proof of the theorem.
Note that we have made use of the following result:
Result: Let T be a linear operator in V and let W be a proper T -invariant
subspace. Then, there exists a vector α ∉ W such that T α ∈ span{α, W }. To
prove this, we choose a β ∉ W and note that if p0 is the monic generator of the
ideal of all polynomials p for which p(T )β ∈ W (this set is not empty since it
contains the minimal polynomial of T ), then we can write p0 (t) = (t − c)q(t)
where c is a scalar and q is another polynomial, and by the minimality of deg p0 ,
it follows that α = q(T )β ∉ W while (T − c)α ∈ W . This completes the proof.
Note that we have used the T -invariance of W in stating that the above set is
an ideal in the algebra of polynomials.
is a Hilbert space, choose any vn ∈ Mn ∩ M⊥_{n−1} having unit norm. Now, vn /cn is
bounded and hence T (vn )/cn has a convergent subsequence. But m < n implies
(T − cn )(vn ) ∈ Mn−1
and hence,
T (vn )/cn − T (vm )/cm ∈ vn + Mn−1
Thus,
‖T (vn )/cn − T (vm )/cm ‖ ≥ d(vn , Mn−1 ) = 1
which contradicts the hypothesis that T (vn )/cn has a convergent subsequence.
[b] Suppose c = 0 is not an eigenvalue of T . Then R(T − c) is closed. Indeed,
suppose un is a bounded sequence such that (T − c)un → v. We must show that
v ∈ R(T − c). Since T is compact and un is bounded, there is a subsequence
up(n) of un such that T (up(n) ) → w. Then,
cup(n) = T (up(n) ) − (T − c)(up(n) ) → w − v
or
up(n) → (w − v)/c = x
say, and we get from the continuity of T that T (up(n) ) → T (x) = w,
and therefore,
T (x) − v = cx
or equivalently,
v = (T − c)(x) ∈ R(T − c)
proving the claim. Now suppose un is a sequence with ‖un ‖ → ∞ and (T −
c)(un ) → v. We must show that v ∈ R(T − c). But defining wn = un /‖un ‖,
we get that wn is a bounded sequence and
(T − c)(wn ) → 0
Since T is compact, there is a subsequence wp(n) such that
T (wp(n) ) → w
Then,
cwp(n) → w
or equivalently,
wp(n) → w/c ≠ 0
and hence
T (wp(n) ) → T (w)/c
and therefore
T (w)/c = w
ie,
(T − c)(w) = 0
ie, c is an eigenvalue of T , a contradiction which is resolved by going back to
the previous case.
Let B be a closable operator in a Hilbert space. Then R(B̄)⊥ = R(B)⊥ .
Indeed, suppose x ∈ R(B)⊥ . It suffices to show that x ∈ R(B̄)⊥ . We have that
< x, By >= 0 ∀y ∈ D(B). Choose z ∈ R(B̄). Then there is a sequence yn ∈ D(B) such
that yn → y, Byn → z and z = B̄y. Then < x, z >= limn < x, Byn >= 0, so x ∈ R(B̄)⊥ .
Let A be essentially self-adjoint. Then R(A ± i)⊥ = 0, ie, R(A ± i) is dense
in H. Indeed, suppose z ∈ R(A + i)⊥ . Then
(Ā − i)z = 0
(Note that D(A) is dense in H). Taking the norm on both sides and using the
fact that Ā is self-adjoint, we get
‖Āz‖2 + ‖z‖2 = 0
and hence z = 0. Thus, R(A + i)⊥ = 0. Likewise, we establish that R(A − i)⊥ =
0.
Show that A∗ is a well defined unique linear operator in H. For proving the
uniqueness, you must use the density of D(A).
[2] Let A be a linear operator in H with dense domain D(A). Show that A∗
is a closed operator in H.
hint: fn ∈ D(A∗ ), fn → f, A∗ fn → g imply that for all h ∈ D(A),
< fn , Ah >→< f, Ah >
and
< A∗ fn , h >→< g, h >
proving that
< f, Ah >=< g, h > ∀h ∈ D(A)
and therefore,
g = A∗ f
so that A∗ is closed.
[3] If an operator A in H is closed and D(A) = H, then A is bounded.
Equivalently, if A is a closed invertible operator in H with range H, then A−1
is bounded, ie, A maps open sets into open sets. These two equivalent versions
of the same result are respectively known as the closed graph theorem and the
open mapping theorem. Its proof is based on Baire’s category theorem in general
topology.
Let A be closed with D(A) = H. Let S = A−1 (B(0, 1)). Then, H is the
union of nS, n = 1, 2, .... Hence, by the category theorem, S̄ contains a ball
K = B[u, r], say. If ‖x‖ < 2r, we can write x = u1 − u2 with
u1 , u2 ∈ K, ie ‖uk − u‖ ≤ r, k = 1, 2. Since u1 , u2 ∈ S̄, it follows that
there exist sequences u1 (n), u2 (n) ∈ S such that uk (n) → uk , k = 1, 2. Then,
since ‖A(uk (n))‖ < 1, we have ‖A(u1 (n) − u2 (n))‖ < 2 and hence
u1 (n) − u2 (n) ∈ 2S
Thus taking limits,
u1 − u2 ∈ 2S̄
By scaling, this argument implies that x ∈ B(0, λr) implies x ∈ λS̄ for every λ > 0.
Now fix ε ∈ (0, 1) and choose any x ∈ B(0, r). Then x ∈ S̄ and hence there exists
v1 ∈ S such that ‖x − v1 ‖ < εr. Then, it follows that x − v1 ∈ εS̄. So there exists
v2 ∈ εS such that ‖x − v1 − v2 ‖ < ε2 r and hence x − v1 − v2 ∈ ε2 S̄. Continuing
this way, we get for each n = 1, 2, ... a vn ∈ ε^{n−1} S such that
‖x − v1 − v2 − ... − vn ‖ < ε^n r. Thus,
∑_{j=1}^{n} vj → x,
and, since ‖A(vj )‖ ≤ ε^{j−1} ,
‖A( ∑_{j=m+1}^{n} vj )‖ ≤ ∑_{j=m+1}^{n} ε^{j−1} → 0, m, n → ∞
Thus, A( ∑_{j=1}^{n} vj ), n = 1, 2, ... is Cauchy in H and hence by the closedness of
A, it converges to A(x). This means that
‖A(x)‖ ≤ ∑_{j=1}^{∞} ‖A(vj )‖ ≤ ∑_{j=1}^{∞} ε^{j−1} = (1 − ε)−1
It follows that
sup_{x∈B(0,r)} ‖Ax‖ ≤ (1 − ε)−1
thereby establishing that A is a bounded operator.
is defined via an iterative process, ie, as the operator norm limit of a sequence
of polynomials in T . Define
T+ = (|T | + T )/2, T− = (|T | − T )/2
Then,
T± ≥ 0, T = T+ − T− , |T | = T+ + T−
Let E denote the projection onto N (T+ ). Then since
T+ T− = 0
it follows that
R(T− ) ⊂ R(E)
and hence
ET− = T−
Taking adjoints gives us
T− E = T−
and hence,
T− = ET− = T− E = ET− E
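For a Hermitian matrix these objects can be formed directly from the eigendecomposition; a minimal numerical sketch (the random symmetric T below is an illustrative choice):

```python
import numpy as np

# For Hermitian T, |T| = U diag(|w|) U^*, and T_+ = (|T| + T)/2,
# T_- = (|T| - T)/2 are the positive and negative parts.
rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
T = (B + B.T) / 2                       # real symmetric (Hermitian) T

w, U = np.linalg.eigh(T)
absT = (U * np.abs(w)) @ U.T            # |T|
Tp, Tm = (absT + T) / 2, (absT - T) / 2

assert np.allclose(Tp - Tm, T)          # T = T_+ - T_-
assert np.allclose(Tp + Tm, absT)       # |T| = T_+ + T_-
assert np.allclose(Tp @ Tm, 0)          # T_+ T_- = 0
assert np.min(np.linalg.eigvalsh(Tp)) >= -1e-12   # T_+ >= 0
```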
Let [S, T ] = 0 with S ≥ 0, S ≥ T . Then S ≥ T+ . To see this we note that
[S, T+ ] = 0 and hence
T+ SE = ST+ E = 0
so that
SE = ESE
and hence
(1 − E)SE = 0
Thus, SE = ES, so that S = ESE + (1 − E)S(1 − E) ≥ (1 − E)S(1 − E) ≥
(1 − E)T (1 − E) = T+ , since ESE ≥ 0.
Note that E projects onto N (T+ ) = R(T+ )⊥ and hence 1 − E projects onto
R(T+ ). Hence (1−E)T+ = T+ and taking adjoints, we get T+ (1−E) = T+ . This
also directly follows from T+ E = 0. Also T+ T− = T− T+ = (|T |2 − T 2 )/4 = 0
implies ET− = T− and hence (1 − E)T− = T− (1 − E) = 0. Thus,
For λ ≤ μ,
T − μ ≤ T − λ ≤ (T − λ)+
and hence, by the preceding result applied with S = (T − λ)+ ,
(T − λ)+ ≥ (T − μ)+
E(λ) ≤ E(μ)
Then,
and hence
(T − λ)(E(μ) − E(λ)) ≥ 0
and therefore from the above two inequalities,
‖T − ∑_{j=0}^{n−1} cj (E(cj+1 ) − E(cj ))‖2
= ‖ ∑_{j=0}^{n−1} (T − cj )(E(cj+1 ) − E(cj ))‖2
≤ ‖ ∑_{j=0}^{n−1} (cj+1 − cj )(E(cj+1 ) − E(cj ))‖2
≤ ∑_{j=0}^{n−1} (cj+1 − cj )2 ≤ 2M |P | → 0, |P | → 0
Note that we have used the fact that, since E(cj ) ≤ E(cj+1 ) for all j and since the
E(cj )′ s are orthogonal projections, the subspaces R(E(cj+1 ) −
E(cj )), j = 0, 1, 2, ..., n − 1 are all mutually orthogonal.
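The partition argument can be watched numerically for a Hermitian matrix (a sketch; the 5×5 random symmetric T below is an illustrative choice):

```python
import numpy as np

# E(c) = projection onto the span of eigenvectors with eigenvalue <= c;
# sum_j c_j (E(c_{j+1}) - E(c_j)) approximates T in operator norm
# with error bounded by the mesh |P| of the partition.
rng = np.random.default_rng(4)
B = rng.standard_normal((5, 5))
T = (B + B.T) / 2
w, U = np.linalg.eigh(T)
M = np.max(np.abs(w)) + 0.1             # spectrum contained in (-M, M)

def E(c):
    """Spectral projection E(c) of T."""
    sel = w <= c
    return (U[:, sel]) @ (U[:, sel]).T

for n in [10, 100, 1000]:
    cs = np.linspace(-M, M, n + 1)      # partition P with mesh 2M/n
    S = sum(cs[j] * (E(cs[j + 1]) - E(cs[j])) for j in range(n))
    err = np.linalg.norm(T - S, 2)      # operator (spectral) norm
    assert err <= 2 * M / n + 1e-10     # error bounded by the mesh |P|
```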
z = x− < y, x > y
whence
ψ(x) =< u, x >, u = ψ̄(y).y
‖ ∑_{k=1}^{N} x(k)ek ‖ ≤ K. ∑_{k=1}^{N} |x(k)|
where
K = max{‖ek ‖ : 1 ≤ k ≤ N }
holds good. So it suffices to show that there exists a δ > 0 such that for all
x(1), ..., x(N ) ∈ C, we have
δ. ∑_{k=1}^{N} |x(k)| ≤ ‖ ∑_{k=1}^{N} x(k)ek ‖
Suppose that this is false. Then there exist non-zero sequences (x(1, n), ..., x(N, n)),
n = 1, 2, ... in CN and δn ↓ 0 such that
δn . ∑_{k=1}^{N} |x(k, n)| > ‖ ∑_{k=1}^{N} x(k, n)ek ‖
Defining
y(k, n) = x(k, n)/ ∑_{k=1}^{N} |x(k, n)|
we find that
y(n) = (y(1, n), ..., y(N, n))
is a bounded sequence in CN such that
δn > ‖ ∑_{k=1}^{N} y(k, n)ek ‖, ∀n
By the Bolzano-Weierstrass theorem, there is a subsequence y(., n(k)) converging to
some z = (z(1), ..., z(N )) ∈ CN , and then
‖ ∑_{k=1}^{N} z(k)ek ‖ = limk→∞ ‖ ∑_{k=1}^{N} y(k, n(k))ek ‖ ≤ lim δn(k) = 0
But
∑_{k=1}^{N} |z(k)| = 1
in view of
∑_{k=1}^{N} |y(k, n)| = 1, ∀n
so that z ≠ 0, and ∑_{k=1}^{N} z(k)ek = 0 then contradicts the linear independence
of e1 , ..., eN .
This proves the theorem. Note that this proof fails if the dimension N →
∞, since the iterative procedure of choosing convergent subsequence does not
terminate and in fact there will generally not be any "final subsequence".
[17] Proof of the theorem that a Hilbert space is finite dimensional
iff the closed unit ball is compact
Let H be a finite dimensional Hilbert space, say dimH = N < ∞. Let
B = {x ∈ H : x ≤ 1}
Since as proved earlier, all norms on a finite dimensional vector space are equiv-
alent, there exist finite positive numbers K1 , K2 such that
K1 . ∑_{k=1}^{N} |x(k)| ≤ ‖ ∑_{k=1}^{N} x(k)ek ‖ ≤ K2 . ∑_{k=1}^{N} |x(k)|
for all x(k) ∈ C, k = 1, 2, ..., N where {e1 , ..., eN } is a basis for H. Then, let
zn = ∑_{k=1}^{N} xn (k)ek
be a bounded sequence in H. Then we get from the left hand inequality above
that ∑_{k=1}^{N} |xn (k)| is a bounded sequence in R and hence, by the Bolzano-
Weierstrass theorem, there is a subsequence xn(m) (k) converging to some v(k)
for each k. Then
‖ ∑_{k=1}^{N} xn(m) (k)ek − ∑_{k=1}^{N} v(k)ek ‖
≤ K2 . ∑_{k=1}^{N} |xn(m) (k) − v(k)| → 0, m → ∞
which proves that zn = ∑_{k=1}^{N} xn (k)ek has a convergent subsequence in H and hence
we have proved that the closed unit ball in H is compact. Note that the closed
unit ball is a bounded subset of H. Conversely, suppose H is infinite dimen-
sional. Then, we can choose an infinite sequence of linearly independent vectors
x1 , x2 , ... in H such that
xn+1 ∉ Mn = span{x1 , ..., xn }, n = 1, 2, ...
Define
en+1 = (xn+1 − Pn xn+1 )/‖xn+1 − Pn xn+1 ‖
where Pn is the orthogonal projection onto Mn . Then {en : n = 1, 2, ...} forms
an orthonormal set in H and hence does not have any Cauchy subsequence
which proves that the closed unit ball is non-compact.
[18] Syllabus for end-sem exam for SPC01, Applied linear algebra
[1] Schur’s lemmas and the Peter-Weyl theorem with proofs based on the
Schur orthogonality relations and the completeness proof based on the spectral
theory for compact self-adjoint operators in a Hilbert space. Intertwining op-
erators, irreducible representations, completely reducible representations, proof
that unitary representations of a group are completely reducible. Left and right
invariant Haar measures on locally compact groups.
[2] Proof of the Cq Shannon coding theorem. Proof of Schumacher’s noiseless
quantum coding theorem.
[3] Differential equations satisfied by the scale factor S(t) and the density
and pressure ρ(t), p(t) in an expanding universe that is homogeneous, isotropic
and spatially flat.
[4] The general form of the energy-momentum tensor of the matter fluid
taking into account viscous and thermal effects. Derivation based on the second
law of thermodynamics.
[5] The linearized Einstein field equations applied to the derivation of the
propagation of inhomogeneities in the form of metric perturbations, density and
pressure perturbations and the velocity field perturbations.
M (λ) = E[exp(λX1 )]
where
I(x) = supλ (λx − log(M (λ)))
Show that
inf {I(x) : |x| > δ} = supλ≥0 (λδ − log(M (λ)))
What is the meaning of this result?
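For X1 ∼ N(0, σ²) the identity can be checked numerically, with the sign convention sup_{λ≥0} (λδ − log M(λ)), which is the form the identity takes for a symmetric distribution (σ and δ below are illustrative choices):

```python
import numpy as np

# For N(0, sigma^2): log M(lambda) = lambda^2 sigma^2 / 2 and
# I(x) = x^2 / (2 sigma^2), so both sides equal delta^2 / (2 sigma^2).
sigma, delta = 1.5, 0.7
lam = np.linspace(0, 10, 200001)
sup_side = np.max(lam * delta - lam**2 * sigma**2 / 2)   # sup over lambda >= 0
inf_side = delta**2 / (2 * sigma**2)                     # inf_{|x| > delta} I(x)
assert abs(sup_side - inf_side) < 1e-6
```

The meaning is a large-deviation tail estimate: P(|Sn /n| > δ) decays like exp(−n · inf{I(x) : |x| > δ}).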
[2] (a) State and prove the contraction principle in large deviation theory.
Specifically show that if I(x) is the rate function for the family of random
variables Z( ), → 0, then the rate function for the family ψ(Z( )), → 0 is
given by
Iψ (x) = inf {I(z), z : ψ(z) = x}
Let X1 , X2 , ... be iid N (0, σ 2 ) random variables and define
Sn = X1 + ... + Xn
−inf { ∑_{n=0}^{N−1} (yn+1 − f (yn ))2 /2σ 2 : (y1 , ..., yN ) ∈ B}
we have
[a(u), a(v)] = 0, [a(u), a(v)∗ ] =< u, v >
Now define for an infinite matrix H = ((Hnm )), the operator
Λ(H) = ∑_{n,m} Hnm a∗n am
Show that
[a(u), Λ(H)] = a(Hu)
where
(Hu)n = ∑_m Hnm um
Now let {|en >} be an orthonormal basis for the Hilbert space H ⊗ L2 (R+ ) and
define
At (u) = ∑_n < uχ[0,t] , en > an
where
u ∈ H ⊗ L2 (R+ )
Prove the quantum Ito formula:
You may do this problem by considering the matrix elements on both sides
w.r.t coherent vectors |e(w) > defined by
First prove that such a solution exists. Write down the first order perturbed
field equations taking the perturbed metric tensor as
1 + δv0 , δvr , r = 1, 2, 3,
ρ(t) + δρ(t, r)
p(t) + δp(t, r)
ρ = ⎛ ρ1 0 ⎞ , ρ1 ∈ Cr×r
    ⎝ 0  0 ⎠
The noise manifold N consists of all matrices having the block structure
N = ⎛ 0  X ⎞
    ⎝ cY Z ⎠
[24] Let A be the set {0, 1, ..., N − 1} with addition modulo N . For a, b ∈ A,
define
< a, b >= exp(2πiab/N )
Let |a > denote the N ×1 vector having a one in the a+1th position and zeros at
all the other positions. Thus, L2 (A) = CN has the onb {|a >: a = 0, 1, ..., N −1}
and we define the operators U (a), V (a) on this space by
U (a)|b >= |a + b >, V (a)|b >=< a, b > |b >, a, b ∈ A
Prove that W (a, b) = U (a)V (b), a, b ∈ A are unitary operators in L2 (A) and
that
V (b)U (a) =< b, a > U (a)V (b), a, b ∈ A
Prove that
T r(W (a, b)∗ W (u, v)) = N δa,u δb,v
and hence that {N −1/2 W (a, b), a, b ∈ A} forms an orthonormal basis for the
space B(L2 (A)) of all linear operators in L2 (A) with inner product < X, Y >= T r(X ∗ Y ).
Write down the spectral representations of the Abelian unitary groups of op-
erators {U (a), a ∈ A} and {V (a), a ∈ A}. Specifically, determine the spectral
families {P (a), a ∈ A} and {Q(a), a ∈ A} such that
U (a) = ∑_{b∈A} < a, b > P (b), V (a) = ∑_{b∈A} < a, b > Q(b)
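The finite Weyl system is easy to realize with matrices (a sketch; N = 5 and the specific a, b are illustrative choices, and U (a), V (b) are taken in the standard shift/modulation form stated above):

```python
import numpy as np

# U(1) cyclically shifts the onb of C^N; V(1) multiplies by the character.
N = 5
omega = np.exp(2j * np.pi / N)
U1 = np.roll(np.eye(N), 1, axis=0)      # U(1)|c> = |c+1 mod N>
V1 = np.diag(omega ** np.arange(N))     # V(1)|c> = <1,c>|c>

def W(a, b):
    """W(a,b) = U(a)V(b)."""
    return np.linalg.matrix_power(U1, a) @ np.linalg.matrix_power(V1, b)

a, b = 2, 3
# commutation relation V(b)U(a) = <b,a> U(a)V(b)
assert np.allclose(W(0, b) @ W(a, 0), omega ** (a * b) * W(a, 0) @ W(0, b))
# orthogonality Tr(W(a,b)* W(u,v)) = N delta_{a,u} delta_{b,v}
for u in range(N):
    for v in range(N):
        tr = np.trace(W(a, b).conj().T @ W(u, v))
        expected = N if (u, v) == (a, b) else 0.0
        assert np.allclose(tr, expected)
```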
[1] Weyl’s character formula for the characters of compact semisimple Lie groups.
[2] Weyl's integration formula for compact groups. Let G be a compact
semisimple group and H a Cartan subgroup. Let X = G/H and define the
map ψ : X × H → G by ψ(gH, h) = ghg −1 . It is clear that the map ψ is well
defined. The differential of ψ can be obtained as follows:
ghg −1 (1 + g.δh.g −1 )
on the one hand and on the other,
δh → g.δh.g −1
The reason for the factor |W |−1 appearing on the rhs is that if w ∈ W , then
gwH(gw)−1 = gHg −1 so in the process of integration, each element of G/H is
being counted |W | times. Equivalently ψ(w.H, h) = whw−1 ∈ H which means
that the number of permutations of each element in H is |W |. Equivalently
the number of permutations of each element in each coset of H is |W | in the
expression ghg −1 , ie, we are counting each element in each coset of H |W | times.
and hence ‖f (g)‖2 is a well defined function on G/H whenever condition (a)
is satisfied. We define the induced representation UL = indG H L to act in H so
that
(UL (g1 )f )(g) = f (g1−1 g), g, g1 ∈ G, f ∈ H
A second way to define the induced representation is as follows. Let γ(.) be
a cross section for G/H, ie, for each x ∈ G/H, let γ(x) ∈ G satisfy γ(H) =
e, γ(x)H = x and for any distinct x, y ∈ G/H, γ(x) ≠ γ(y). Then define for
x ∈ G/H, g ∈ G,
Note that some of the characters χj , j = 1, 2, ..., N may be the same so we denote
by N̂ the set of distinct characters of N and let P (χ) the orthogonal projection
onto the common eigenspace of V (n), n ∈ N with corresponding eigenvalues
χ(n), n ∈ N . Then by the spectral theorem,
V (n) = ∑_{χ∈N̂} χ(n)P (χ)
and hence, for h ∈ H,
W (h)V (n)W (h)−1 = V (hnh−1 ) = ∑_{χ∈N̂} (h−1 .χ)(n)P (χ) = ∑_{χ∈N̂} χ(n)P (h.χ)
O(χ0 ) = {h.χ0 : h ∈ H}
V (χ) = P (χ)H, χ ∈ N̂
For n ∈ N , we have V (n)V (χ) = χ(n)V (χ), and for h ∈ H,
W (h)V (χ) = W (h)P (χ)H = P (h.χ)H = V (h.χ)
Thus H0 = {V (χ) : χ ∈ O(χ0 )} is invariant under the representation W of
H. We define H0 to be the set of all h ∈ H for which h.χ0 = χ0 . then H0
is a group. Let V0 be any proper subspace of V (χ0 ). It is clearly invariant
under the representation V of N . Note that we have seen above that V (χ0 )
is invariant under
V . It is clear then that {W (h)V0 : h ∈ H} is a proper
subspace of {V (χ) : χ ∈ O(χ0 )} that is U -invariant. Note that W (h)V0 ⊂
V (h.χ0 ), h ∈ H. Thus, U |H0 is irreducible if V (χ0 ) has no proper invariant
subspace under the representation W of H or equivalently (since W (h)V (χ0 ) =
V (h.χ0 ) has zero intersection with V (χ0 ) unless h.χ0 = χ0 ), if V (χ0 ) has no
proper invariant subspace under the representation W of H0 . The converse
is also easily seen to be true. Specifically, suppose S is a proper subspace of
H0 that is invariant under U . Then it is easily seen that V0 = S ∩ V (χ0 )
is a proper subspace of V (χ0 ) that is invariant under H0 because V (χ0 ) and
V (h.χ0 ) are unitarily isomorphic under W (h). Hence the representation U |H0
of G is irreducible under G iff the representation W of H0 is irreducible. Now
let L be any irreducible representation of H0 . Then it is easy to see that
nh → (χ0 ⊗ L)(n, h) = χ0 (n)L(h) is an irreducible representation of N ⊗s H0
and hence, by the above argument, U = indG_{N ⊗s H0} (χ0 ⊗ L) is irreducible.
G = KM AN
Now,
∫K f (k)dk = ∫ f (x[u])(dx[u]/du)du
where
dx[u]/du = μB (b(xu))−1
∫ f (ana′ )δ(a)2 dadn = ∫ f (aa′ .a′−1 na′ )δ(a)2 dadn
= ∫ f (an)δ(aa′−1 )2 dad(a′ na′−1 ) = ∫ f (an)δ(a)2 δ(a′ )−2 δ(a′ )2 dadn
= ∫ f (an)δ(a)2 dadn
= |W (A)|−1 ∫ |Δ(h)|2 f (xhx−1 )( ∑_{s∈W (A)} χ(sh)/Δ(h))dxdh
= ∫G f (x)θχ (x)dx = Tχ (f )
Then for g ∈ G,
(fˆ)(gXg −1 ) = ∫g f (Y )exp(iB(gXg −1 , Y ))dY
= ∫g f (Y )exp(iB(X, g −1 Y g))dY
= ∫g f (gY g −1 )exp(iB(X, Y ))d(gY g −1 )
= ∫g f (Y )exp(iB(X, Y ))dY = fˆ(X)
f (gXg −1 ) = f (X), ∀g ∈ G, X ∈ g
It follows from this that dX is invariant under adjoint action of G. More pre-
cisely, the Euclidean measure in N -dimensional Euclidean space remains in-
variant under orthogonal transformations and X → gXg −1 is an orthogonal
transformation w.r.t the inner product B(., .). Now observe that
so
Lχ (α)f (u) = ( ∫G α(x)Lχ (x)dx)f (u) =
∫G α(x)χ̄(a(x−1 u))δ(a(x−1 u))−1 f (x−1 [u])dx
= ∫G α(x)χ̄(a(x−1 u))δ(a(x−1 u))−1 f (k(x−1 u))dx
= ∫G α(ux−1 )χ̄(a(x))δ(a(x))−1 f (k(x))dx
= ∫ α(un−1 a−1 k −1 )χ̄(a)δ(a)−1 f (k)δ(a)2 dkdadn
Hence,
y[x[u]] = (yx)[u]
Now, we define
a(x : u) = a(xu), x ∈ G, u ∈ K
Then,
k(xu)a(x : u)n(xu) = xu
or equivalently,
x[u].a(x : u)n(xu) = xu
and also
⎛ t + z   x + iy ⎞  − − − (1)
⎝ x − iy  t − z  ⎠
χm (at ) = exp(mt)/(exp(t) − exp(−t)), at = exp(tH) ∈ L
and
χm (θ) = exp(imθ)/(exp(iθ) − exp(−iθ)), u(θ) = exp(θ.(X − Y )) ∈ B
In this formula, L is the non-compact Cartan subgroup of G generated by the
Lie algebra R.H and B is the compact Cartan subgroup of G generated by the Lie
algebra R.(X − Y ) where
H = ⎛ 1  0 ⎞ , X = ⎛ 0 1 ⎞ , Y = ⎛ 0 0 ⎞
    ⎝ 0 −1 ⎠       ⎝ 0 0 ⎠     ⎝ 1 0 ⎠
Note that {X, Y, H} is a basis for the Lie algebra of SL(2, R) and these satisfy
the standard Lie algebra commutation relations:
[H, X] = 2X, [H, Y ] = −2Y, [X, Y ] = H
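The standard sl(2, R) relations can be checked directly on the matrices above:

```python
import numpy as np

# Basis {X, Y, H} of sl(2,R) as given in the text.
H = np.array([[1, 0], [0, -1]])
X = np.array([[0, 1], [0, 0]])
Y = np.array([[0, 0], [1, 0]])

def comm(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return A @ B - B @ A

assert np.array_equal(comm(H, X), 2 * X)   # [H, X] = 2X
assert np.array_equal(comm(H, Y), -2 * Y)  # [H, Y] = -2Y
assert np.array_equal(comm(X, Y), H)       # [X, Y] = H
```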
+ ∫G/B×B |exp(iθ) − exp(−iθ)|2 f (xux−1 )dxdu
where
h = at = exp(tH), u = u(θ) = exp(θ.(X − Y ))
We have
Am (f1 og, f2 og) = ∫ f1 (gg1 )f2 (gg2 )χm (g2−1 g1 )dg1 dg2
= ∫ f1 (g1 )f2 (g2 )χm (g2−1 g1 )dg1 dg2 = Am (f1 , f2 ), g ∈ SL(2, R)
u(θ) = ⎛ cos(θ)  sin(θ) ⎞ = exp(θ(X − Y ))
       ⎝ −sin(θ) cos(θ) ⎠
a(t) = ⎛ exp(t)    0     ⎞ = exp(tH)
       ⎝   0     exp(−t) ⎠
n(s) = ⎛ 1 s ⎞
       ⎝ 0 1 ⎠
Then,
∂g/∂s = g.X,
∂g/∂t = u(θ)a(t)Hn(s) = g(n(s)−1 Hn(s)),
∂g/∂θ = u(θ)(X − Y )a(t)n(s) = g(n(−s)a(−t)(X − Y )a(t)n(s))
Now,
n(s)−1 Hn(s) = ⎛ 1  s ⎞ ⎛ 1 s ⎞ = ⎛ 1 2s ⎞ = H + 2sX
               ⎝ 0 −1 ⎠ ⎝ 0 1 ⎠   ⎝ 0 −1 ⎠
so
∂/∂t → H + 2sX,
n(−s)a(−t)(X − Y )a(t)n(s) =
⎛ 1 −s ⎞ ⎛ exp(−t)   0    ⎞ ⎛ 0  1 ⎞ ⎛ exp(t)    0     ⎞ ⎛ 1 s ⎞
⎝ 0  1 ⎠ ⎝   0     exp(t) ⎠ ⎝ −1 0 ⎠ ⎝   0     exp(−t) ⎠ ⎝ 0 1 ⎠
So,
∂/∂θ → s.exp(2t)H + (s2 .exp(2t) + exp(−2t)).X − exp(2t).Y
and
∂/∂s → X
Solving these equations gives us the correspondence
Actually, this expression has to be multiplied by one half if we use the Iwasawa
decomposition of G = SL(2, R) in the form
g = kman, G = KM AN
Let Ge denote the elliptic subgroup and Gh the hyperbolic subgroup of G. Thus
G is the union of these two groups and Ge ∩ Gh = {I}. We note that any
g ∈ Ge is conjugate to an element of the form u(θ) while any element g ∈ Gh is
conjugate to an element of the form a(t). Thus,
∫G f (g)dg = ∫Ge f (g)dg + ∫Gh f (g)dg
where
Δ(θ) = exp(iθ) − exp(−iθ)
and dx̄ is the invariant measure on G/T where T is the one dimensional torus.
For any g ∈ G = SL(2, R), we also have the singular value decomposition
g = u(θ1 )a(t).u(θ2 )
We compute the Haar measure using this representation as above and obtain
∂g/∂θ2 = g.(X − Y )
so
∂/∂θ2 = X − Y
∂g/∂t = u(θ1 )a(t)Hu(θ2 )
= g.u(−θ2 )Hu(θ2 ) = g.(cos(2θ2 ).H + sin(2θ2 ).(X + Y ))
so we can write
∂/∂t → cos(2θ2 ).H + sin(2θ2 ).(X + Y )
and finally,
∂g/∂θ1 = u(θ1 )(X − Y )a(t)u(θ2 )
= g.u(−θ2 )a(−t)(X − Y )a(t)u(θ2 ) =
g.[sinh(2t)sin(2θ2 ).H + (exp(2t)sin2 (θ2 ) + exp(−2t)cos2 (θ2 ))X − Y ]
Thus,
[4] Let B denote the elliptic Cartan subgroup of G and L the hyperbolic
Cartan subgroup of the same. Determine the invariant measures d(G/B) and
d(G/L) on G/B and G/L respectively. Note that the Iwasawa decomposition
for G is G = KAN so that G/B can be identified with AN and every element
in L is conjugate to an element in A while every element in B is conjugate to
an element in K. Note that K = exp(R.(X − Y )) while A = exp(R.H). Given
a function f (g) on G we use Weyl’s integration formula:
∫G f (g)dg = ∫G/K×K |ΔK (k)|2 f (xkx−1 )dx̄dk
+ ∫G/A×A |ΔA (a)|2 f (yay −1 )dȳda
Here dx̄ is the invariant measure on G/K while dȳ is the invariant measure
on G/A. Suppose we write g ∈ G using the singular value decomposition g =
k1 ak2 , k1 , k2 ∈ K, a ∈ A. The invariant measure on G can be computed using
this decomposition as
dg = P (k1 , a, k2 )dk1 dadk2
Now given a function f on G/K, we can write it as f (gK) and we have that
∫K×A×K f (k1 ak2 K)P (k1 , a, k2 )dk1 dadk2 is an invariant integral on G. It can
be expressed as ∫ f (k1 aK)P (k1 , a)dk1 da where P (k1 , a) = ∫K P (k1 , a, k2 )dk2 .
It is clear then that this integral defines an invariant integral on G/K. An-
other way to do this is to use the decomposition G = N AK to write dg =
Q(n, a, k)dndadk for the invariant measure on G and then note that ∫ f (nak)Q(n, a)dnda
is the invariant integral on G/A where Q(n, a) = ∫K Q(n, a, k)dk. Likewise, sup-
pose we use the decomposition G = KN A to write the invariant measure on
G as dg = R(k, n, a)dkdnda. Then defining R(k, n) = ∫A R(k, n, a)da, we can
express the invariant integral of f on G/A as ∫ f (kna)R(k, n)dkdn.
Now let χ(g) be a character of G. We wish to evaluate I = I(f, χ) =
∫G f (g)χ(g)dg. We have, using Weyl's integration formula,
I = ∫G/K×K |ΔK (k)|2 f (xkx−1 )χ(k)dxdk + ∫G/A×A |ΔA (a)|2 f (xax−1 )χ(a)dxda
= ∫K×A×K |ΔK (k)|2 f (k′ ak a−1 k′−1 )χ(k)P (k′ , a)dk′ dadk
+ ∫K×N ×A |ΔA (a)|2 f (knan−1 k −1 )χ(a)R(k, n)dkdnda
This method can be used to compute image pair invariants for the Lorentz group
in the txz space or equivalently for SL(2, R).
[7] Computing Haar integrals
[1] Suppose that G is a semisimple Lie group with H as its Cartan subgroup.
We consider the mapping ψ : (x̄, h) → x̄hx̄−1 from G/H × H into G. Note
that the mapping is well defined since if x1 ∈ G, h1 ∈ H are arbitrary, then
(x1 h1 )h(x1 h1 )−1 = x1 hx1−1 , since H is abelian. Hence, it does not matter what
representative of the coset x̄ is chosen.
where
L(x̄,h) (x̄′ , h′ ) = (x̄, h)o(x̄′ , h′ ) = ψ −1 (ψ(x̄, h)oψ(x̄′ , h′ ))
= ψ −1 (gog ′ ), g = ψ(x̄, h), g ′ = ψ(x̄′ , h′ )
Thus,
L′(x̄,h) (x̄′ , h′ )|x̄′ =H,h′ =e = ψ ′−1 (g).L′g (e)ψ ′ (H, e)
where
Lg (g ′ ) = gog ′
and hence
|det(L′(x̄,h) (x̄′ , h′ )|x̄′ =H,h′ =e )|−1 dx̄dh =
|dg/d(x̄, h)|.|L′g (e)|−1 |ψ ′ (H, e)|−1 dx̄dh
= |dψ(x̄, h)/d(x̄, h)||ψ ′ (H, e)|−1 dx̄dh
since |L′g (e)|−1 dg = dg is the Haar measure on G and hence |L′g (e)| = 1. More
precisely, if G is also parametrized, then its invariant measure should be taken as
dg/|L′g (e)| and the above calculation will then show that the invariant measure
on G can be written in the G/H × H parametrization as
where the constant |ψ (H, e)|−1 has been set equal to unity without any loss
of generality. In this formula, dx̄dh is any parametrization of a measure on
G/H × H. Now we observe the following: In this parametrization, the left
invariant measure on G/H is dx̄/|Lx̄ (e)| where Lx̄ (ȳ) = x̄oȳ. Then we can
write the above as
and if we denote the invariant measure on G/H by dx̄, ie we denote dx̄/|Lx̄ (e)|
by dx̄ so that dx̄ becomes the invariant measure on G/H, then we have for the
invariant measure on G,
and in particular,
(d/dȳ)χ(x̄oy)|ȳ=H = χ (x̄)Lx̄ (e)
where dx̄ is now the invariant measure on G/H and dh is the invariant measure
on H.
Remark: We can give a more precise proof using parametrizations of G
and G/H × H in terms of Euclidean measures. Specifically, in terms of these
parametrizations we require
∫ f (g)dg/|L′g (e)| = ∫ f (ψ(x̄, h))|L′(x̄,h) (H, e)|−1 dx̄dh
Note that dx̄ is now a Euclidean measure on G/H relative to this parametriza-
tion, not the invariant measure on G/H and by the above calculation,
where now ψ(x̄, h) = x̄hx̄−1 . Also the invariant measure on G in terms of these
Euclidean coordinates is therefore
with dx̄/|L′x̄ (H)| being the invariant measure on G/H so that if we use the
notation dx̄ in place of dx̄/|L′x̄ (H)|, then we can write the above formula as
The lhs of this equation gives the invariant measure on G in terms of its coor-
dinate parametrization while the rhs gives the same invariant measure in terms
of the parametrization of G using coordinates on G/H × H with g = x̄hx̄−1 .
It is clear that for the groups, SL(n, R), SL(n, C), U (n, R), U (n, C) and their
subgroups, |Lg (e)| = 1 and hence we can derive using the above formula, the
celebrated Weyl integration formula.
g = h ⊕ ∑_{α∈Δ} gα = h ⊕ n+ ⊕ n−
where h is a Cartan subalgebra and
n+ = ∑_{α∈P} gα , n− = ∑_{α∈P} g−α
where P is the set of positive roots w.r.t a fixed Weyl chamber and we note that
Δ = P ∪ (−P ) is the set of all roots. Let
g=t⊕p
Then
θ(X + Y ) = X − Y, X ∈ t, Y ∈ p
Note that if X ∈ t, then ad(X) has all imaginary eigenvalues and hence B(X, X) <
0. Likewise, if Y ∈ p, then ad(Y ) has all real eigenvalues and hence B(Y, Y ) > 0.
Now since B(X, Y ) = 0 for X ∈ t, Y ∈ p, it follows that
and this means that (U, V ) → −B(U, θ(V )) is a positive definite form on g × g.
We have
fˆ(Y ) = ∫G/A×h f (Ad(x)H)exp(B(Ad(x)(H), Y ))|π(H)|2 dxdH
Ad(x)(H) = H − tα(H)Xα
and hence
B(Ad(x)H, H ′ ) = B(H, H ′ )
since
B(Xα , H ′ ) = 0
for any H ′ ∈ h and α any root. Thus, it is clear that B(Ad(x)H, H ′ ) is non-
zero only when Ad(x)H ∈ h. This can happen when either x = exp(tXα ) with
α(H) = 0 or else when Ad(x) belongs to the Weyl group of h so that we get
Ad(x)H ∈ h. For a given H ∈ h, the set of linear functionals λ on h for which
λ(H) = 0 has zero measure and hence it easily follows from the above argument
that the Fourier transform of f at H can be expressed as a superposition of
functions of the form (the superposition is over different H0 ′s)
fH0 (H ′ ) = ∑_{s∈W} c(s, H0 )exp(B(sH0 , H ′ ))
= −B(Z, Y )fˆ(Y )
and
FH0 (X, p) = χ(p)FH0 (X), p ∈ Z
∂(Ad(x)X)f (Y ) = df (Y + tAd(x)X)/dt|t=0
= df^{x−1} (Ad(x−1 )Y + tX)/dt|t=0
= (∂(X)f^{x−1} )(Ad(x−1 )Y ) = ((∂(X)f^{x−1} )^x )(Y )
= T (x)∂(X)T (x−1 )f (Y )
or more precisely,
(∂(p)f )(H) = (π −1 .∂(p̄).π.f )(H)
The Casimir element: Consider
B(X, Y ) = T r(ad(X).ad(Y ))
gij = B(Xi , Xj )
ω = ∑_{i,j=1}^{n} g^{ij} Xi Xj
Define the dual basis X^i by B(X^i , Xj ) = δ^i_j . Then clearly,
g^{ij} = B(X^i , X^j )
and
∑_{i,j} g^{ij} B(Ad(g)Xi , X).B(Ad(g)Xj , Y ) = ∑_{i,j} g^{ij} B(Xi , Ad(g −1 )X).B(Xj , Ad(g −1 )Y )
= ∑_j B(X^j , Ad(g −1 )X).B(Xj , Ad(g −1 )Y ) = B(Ad(g −1 )X, Ad(g −1 )Y ) = B(X, Y )
Ad(g)ω = ω, g ∈ G
ie,
ω∈Z
Let
F (x : X) = F (Ad(x)X), x ∈ G, X ∈ g
Then let U, V ∈ g. We have
∂t1 ...∂tr ∂s1 ...∂sm F (Ad(x.exp(t1 U1 )...exp(tr Ur )).(X +s1 V1 +...+sm Vm ))|t=0,s=0
We have
F(H; ∂(H1) : H′) = ∫_G B(Ad(x)H1, H′) exp(B(Ad(x)H, H′)) dx
F(sH : H′) = F(H : sH′) = F(H : H′), H, H′ ∈ h, s ∈ W
Therefore,
c(s) = c0, s ∈ W
is a constant, ie,
F(H : H′) = c0 ∑_{s∈W} exp(B(sH, H′))
f1 (x : H) = f (x : H; π −1 ∂(p̄)π)
F (x : X) = F (Ad(x)X), x ∈ G, X ∈ g
[∂(([U1 , [U2 , ψ(t )X] + ad(U1 ).ad(U2 ))ψ(t )p)F (ψ(t )X)
[∂(ad(U1 )([U2 , X]+ad(U2 ))p)F (X)+∂([U2 , X]).∂(([U1 , X]+ad(U1 ))p)+∂([U1 , X][U2 , p])]F (X)
= ∂([U1 , [U2 , X]]p + [U1 , [U2 , p]])F (X) + ∂([U1 , X].[U2 , X]p)F (X)
A simple case:
F (x; U1 U2 : X; V ) = ∂ 3 /∂t1 ∂t2 ∂s(F (Ad(x.exp(t1 U1 ).exp(t2 U2 )).(X+sV )))|t1 =t2 =s=0
χ(Ad(g)H) = χ(H), g ∈ G, H ∈ h
We wish to prove two things: One that χ is a finite integer linear combination
of characters of the torus h and two that denoting χ above by χΛ , we have that
∫_h |Δ(H)|² χΛ(H) χ̄μ(H) dH = δ(Λ, μ)
where μ is any other dominant integral weight. To prove this last statement,
it suffices to show that if s ∈ W differs from the identity, then s(Λ + ρ) − ρ is
integral but not dominant integral. For then it would follow that
∫_h exp(s(Λ + ρ)(H)).exp(−s′(μ + ρ)(H)) dH
= ∫_h exp(s′^{-1}s.(Λ + ρ)(H) − ρ(H)).exp(−μ(H)) dH = 0
(sα(Λ + ρ) − ρ)(Hα) = −Λ(Hα) − 2ρ(Hα)
which is a non-positive integer. A better way to see this is to note that since α
is simple, sα (P − {α}) = P − {α} and sα α = −α. Thus, sα ρ = ρ − α and we
get sα ρ − ρ = −α from which the result follows.
References:
[1] Harish-Chandra, Collected Papers, edited by V. S. Varadarajan, Springer.
[2] V. S. Varadarajan, "Harmonic Analysis on Semisimple Lie Groups", Cambridge University Press.
[o] The equations of motion of several rigid bodies pivoted to each other in the
presence of external non-random torque plus a small random torque component.
Computing the approximate mean and variance propagation equations using
perturbation theory. Computing the dynamics in coordinate free Lie algebra
domain using the formula for the differential of the exponential map. Large
deviation analysis of the perturbed motion by calculating the rate functional for
the perturbed Lie algebra process. We first compute the moment generating functional, then the limiting logarithmic moment generating functional of the Lie algebra process, and finally its Fenchel-Legendre transform, and then apply the Gartner-Ellis theorem to calculate the asymptotic probability
(in the limit of small noise amplitude) for the perturbed Lie algebra element to
remain within a small neighbourhood of the zero matrix (stability zone). We
do this calculation after introducing control feedback torque into the system in
the form of the error between a desired Lie algebra process and the actual Lie
algebra process. The coefficients of this control torque are then designed so that
the probability of deviation of the process from the stability zone (calculated
using large deviation theory) is as small as possible.
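As a toy scalar illustration of the last step (not from the text), the Fenchel-Legendre transform of a limiting logarithmic moment generating functional can be computed numerically on a grid; for the Gaussian case Λ(λ) = σ²λ²/2 assumed here, the rate function is I(x) = x²/2σ², which the sketch below checks.

```python
import numpy as np

def legendre_transform(Lam, lam_grid, x):
    # I(x) = sup_lambda (lambda * x - Lam(lambda)), approximated on a grid
    return np.max(lam_grid * x - Lam(lam_grid))

sigma2 = 0.5
Lam = lambda lam: 0.5 * sigma2 * lam ** 2     # limiting log-MGF of a centered Gaussian
lam_grid = np.linspace(-50.0, 50.0, 200001)

for x in (0.0, 0.3, 1.0):
    # closed form I(x) = x^2/(2 sigma^2) is special to the Gaussian choice
    assert abs(legendre_transform(Lam, lam_grid, x) - x ** 2 / (2 * sigma2)) < 1e-3
```

The same grid transform applies to any convex limiting log-MGF produced by the Gartner-Ellis limit; only the closed-form comparison is special to the Gaussian assumption.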
π(h−1 ) = π(h)∗ , h ∈ H
and
P² = ∫_{H×H} π(h1 h2) dh1 dh2 = ∫_{H×H} π(h) dh1 dh = P
by using the left invariance of the measure dh. Let Ĝ denote the collection of
all inequivalent irreducible representations of G. We expand a function f on G
using the Plancherel formula
f(g) = ∫_{Ĝ} Tr(fˆ(π)π(g)) dν(π)
and hence
< f1, f2 > = ∫_G f̄1(g)f2(g) dg =
∑_{a,b,c,d} ∫_{G×Ĝ×Ĝ} fˆ1ab(π)* fˆ2cd(σ) πba(g)* σcd(g) dg dν(π) dν(σ)
then we get
< f1, f2 > = ∑_{π,a,b} ω(π) fˆ1ab(π)* fˆ2ab(π)
and in particular,
‖f‖² = ∫_G |f(g)|² dg = ∑_{π∈Ĝ} ω(π)‖fˆ(π)‖²
where
‖A‖² = Tr(A*A)
is the Frobenius norm square of the matrix A. In this case, the Plancherel
measure ν on Ĝ is discrete, which is true in particular if the group is compact.
In the general case, we must replace the above by
∫_G πba(g)* σcd(g) dg = ω(π)δ(π, σ)δbc δad
where now δ(π, σ) is not the Kronecker delta function, but rather the Dirac
delta function w.r.t. the measure ν. Thus, in this case, we get
‖f‖² = ∫_{Ĝ} ‖fˆ(π)‖² ω(π) dν(π)
where
f1 (g) = f (gx0 ), g ∈ G
and then we get for g ∈ G, h ∈ H that
f(gx0) = f1(gh) = f1(g) = ∫_H f1(gh) dh = ∫_{Ĝ} Tr(fˆ1(π)π(g)P) dν(π)
which is the desired expansion for functions defined on M in terms of the func-
tions
ψab(x) = (π(γ(x))P)ab = ∑_c [π(γ(x))]ac Pcb
where the χk's are integral linear functions on h. It is an easy result in Fourier
series that the functions exp(2πiχk(H)), k = 1, 2, ... are orthogonal on h w.r.t. the
Euclidean measure on h. From the irreducibility of χλ and Weyl’s integration
formula, we have
|W|^{-1} ∫_H χλ(h).χμ(h)* |Δ(h)|² dh = δ(λ, μ)
for any two dominant integral weights λ, μ. Here, H = exp(h) is the Cartan
subgroup of G corresponding to the Cartan sub-algebra h. In this formula,
Δ(h) = ∑_{s∈W} ε(s).exp(s.ρ(H)), h = exp(H)
so that
(α(H) + β(H)).B(X, Y ) = 0∀H ∈ h
which implies that B(X, Y ) = 0 since by hypothesis α + β = 0 and hence there
exists an H ∈ h such that α(H) + β(H) = 0. Therefore since gα ⊥ h, it follows
that gα ⊥ g which contradicts the non-degeneracy of the symmetric bilinear
form for a semisimple Lie algebra (Cartan’s theorem on semisimplicity).
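The Weyl integration formula above can be checked numerically in the simplest case G = SU(2), where h is one-dimensional, |W| = 2, the torus elements are diag(e^{it}, e^{-it}) with |Δ(t)|² = 4 sin²t, and χ_n(t) = sin((n+1)t)/sin t is the character of the (n+1)-dimensional irreducible representation; a minimal sketch:

```python
import numpy as np

# chi_n(t) * conj(chi_m(t)) * |Delta(t)|^2 = 4 sin((n+1)t) sin((m+1)t),
# integrated against the normalized Haar measure dt/(2 pi), divided by |W| = 2.
t = np.arange(100000) * 2.0 * np.pi / 100000

def weyl_inner(n, m):
    return 0.5 * (4.0 * np.sin((n + 1) * t) * np.sin((m + 1) * t)).mean()

assert abs(weyl_inner(2, 2) - 1.0) < 1e-6   # delta(lambda, mu) when lambda = mu
assert abs(weyl_inner(2, 5)) < 1e-6         # and zero when lambda != mu
```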
for all integers k, it follows that there exist two non-negative integers p, q such
that
⊕_{k=−q}^{p} g_{β+kα} is the unbroken α-string through β, ie, g_{β+kα} ≠ 0 precisely when −q ≤ k ≤ p.
Remark:
[Xα , X−α ] = c.Hα
where Hα ∈ h is defined so that
B(Hα , H) = α(H), H ∈ h
Indeed, B([Xα, X−α], H) = B(Xα, [X−α, H]) = α(H).B(Xα, X−α)
and hence,
c = B(Xα , X−α )
We then define
H̄α = 2.Hα / < α, α >
where
< α, α >= α(Hα ) = B(Hα , Hα )
Then,
[H̄α, Xα] = α(H̄α)Xα = 2.Xα −−− (a),
[H̄α, X−α] = −2.X−α −−− (b),
Note that we can rescale Xα and X−α without affecting the commutation relations (a) and (b), so that B(Xα, X−α) = 2/< α, α >, or equivalently, c.Hα = H̄α. We then get
[H̄α, Xα] = 2.Xα,
[H̄α, X−α] = −2.X−α,
[Xα, X−α] = H̄α
In other words, {H̄α, Xα, X−α} form a canonical sl(2, C) triple and as such ρα, defined as the adjoint representation of the Lie algebra spanned by this triple acting on ⊕_{k=−q}^{p} g_{β+kα}, is an irreducible representation. Note that dim gα = 1 for any root α. To see this, we consider the subspace (X−α is an arbitrary non-zero element in g−α)
Vα = span{X−α} ⊕ h ⊕ (⊕_{k>0} g_{kα})
0 = Tr(ad(H̄α)|Vα) = −α(H̄α) + 0 + ∑_{k>0} k.α(H̄α).dim g_{kα}
= −2 + ∑_{k≥1} 2k.dim g_{kα}
or equivalently, dim gα + ∑_{k≥2} k.dim g_{kα} = 1, which forces dim gα = 1 and g_{kα} = 0 for k ≥ 2.
Remark: The subspace g_{β+kα} has weight given by (β + kα)(H̄α) because if X is any vector in this subspace, then [H̄α, X] = (β + kα)(H̄α).X.
We also note that if α is a root and c is any complex number such that cα
is a root, then c = ±1.
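The canonical sl(2, C) relations above can be verified concretely in the defining 2 × 2 representation; a minimal check:

```python
import numpy as np

H = np.array([[1, 0], [0, -1]])   # plays the role of H_bar_alpha
X = np.array([[0, 1], [0, 0]])    # X_alpha
Y = np.array([[0, 0], [1, 0]])    # X_{-alpha}
comm = lambda A, B: A @ B - B @ A

assert np.array_equal(comm(H, X), 2 * X)    # relation (a)
assert np.array_equal(comm(H, Y), -2 * Y)   # relation (b)
assert np.array_equal(comm(X, Y), H)        # [X_alpha, X_{-alpha}] = H_bar_alpha
```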
where dy, dz denote the area measure on S 2 , ie, in spherical polar coordinates,
if x = (θ, φ), then dx = sin(θ)dθ.dφ. We assume that K1 , K2 are G-invariant
kernels where G = SO(3). This means that
We wish then to determine the most general forms of K1 , K2 . Let Ylm (x) denote
the spherical harmonics on S 2 , m = −l, −l + 1, ..., l − 1, l, l = 0, 1, 2, .... We can
expand
K1(x, y) = ∑_{l,l′,m,m′} K1(l, m, l′, m′) Ylm(x) Ȳl′m′(y)
We now require analogously a similar condition for the G-invariance of the kernel
K2 . The condition is
K2(x, y, z) = ∑_{l,l′,l″,m,m′,m″} K2(l, m, l′, m′, l″, m″) Ylm(x) Ȳl′m′(y) Ȳl″m″(z)
with K2 satisfying
∑_{m,m′,m″} K2(l, m, l′, m′, l″, m″)[πl(g)]m1 m [π̄l′(g)]m′1 m′ [π̄l″(g)]m″1 m″ = K2(l, m1, l′, m′1, l″, m″1)
or equivalently,
∑_s πl(g)[K2(l, l′, l″)](s)Tl′,l″(s, k)πk(g^{-1}) = ∑_m [K2(l, l′, l″)](m)Tl′,l″(m, k), g ∈ G
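As a simplified numerical analogue (on the circle rather than S², to avoid special-function dependencies), a kernel built from harmonics with coefficients depending only on the irreducible label is automatically invariant under rotations; the coefficients below are arbitrary illustrative values:

```python
import numpy as np

# On S^1 with G = SO(2), the harmonics are e^{i m x}; a kernel
# K(x, y) = sum_m c_{|m|} e^{i m x} conj(e^{i m y}) with c depending only on |m|
# is G-invariant, exactly as for the spherical-harmonic expansions of K1, K2.
c = {0: 1.0, 1: 0.5, 2: 0.25}

def K(x, y):
    return sum(c[abs(m)] * np.exp(1j * m * (x - y)) for m in range(-2, 3))

x, y, a = 0.3, 1.1, 0.7   # arbitrary points and rotation angle
assert abs(K(x + a, y + a) - K(x, y)) < 1e-12
```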
where δXk (t), k = 1, 2 are skew-symmetric matrices and represent the effect of
noise. We first derive linearized equations for δXk . For this we note that upto
linear orders in the δXk s, we have
Rk(t) = R^{(0)}_k(t)(1 + δXk(t)) = Ak(t)(1 + δXk(t))
+ x1m(t)(A1 Lm J1 + J1 Lm A1^T)]
+ ∑_{m=1}^{3} [x2m(t)(A2 Lm k^T + kLm A2^T) + 2x2m(t)(A2 Lm k^T + kLm A2^T)
+ x2m(t)(A1 Lm k^T + kLm A1^T)] = W1(t)
with another equation obtained by interchanging the subscripts 1 and 2. Here,
We note that in the above equation, the coefficient matrices of xkm, x′km, x″km, k =
1, 2, m = 1, 2, 3 are all skew-symmetric real matrices. Hence we obtain a set of
3 + 3 = 6 linearly independent sde's for the six variables xkm, k = 1, 2, m =
1, 2, 3. These six scalar equations can be obtained by multiplying the above
matrix sde’s with Ln , n = 1, 2, 3 respectively and taking the trace.
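The extraction of the scalar equations by multiplying with the basis Ln and taking traces can be sketched as follows, using the standard so(3) basis (Li)_{jk} = −ε_{ijk}, for which Tr(Lm Ln) = −2δ_{mn} (the coefficients below are illustrative):

```python
import numpy as np

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0
L = [-eps[i] for i in range(3)]            # (L_i)_{jk} = -epsilon_{ijk}, skew-symmetric

a = np.array([0.7, -1.2, 2.5])             # illustrative scalar coefficients
S = sum(ai * Li for ai, Li in zip(a, L))   # skew-symmetric matrix-valued equation side

# multiplying by L_n and taking the trace isolates each scalar component
a_rec = np.array([-np.trace(S @ Ln) / 2 for Ln in L])
assert np.allclose(a_rec, a)
```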
Choose a symmetric bilinear form (.|.) for SO(2n, C) and a basis u1 , ..., u2n
for C2n so that
(ui |uj+n ) = δ(i, j), 1 ≤ i, j ≤ n,
(ui |uj ) = (un+i |un+j ) = 0, 1 ≤ i, j ≤ n
This is equivalent to saying that u1, ..., u2n is an onb w.r.t. the weight matrix
W = [[0, In], [In, 0]]
One way to construct such a basis, is to start with the standard basis e1 , ..., en
for Cn and then to define
X = [[A, B], [C, D]],
D = −A^T, B^T = −B, C^T = −C
We denote this matrix by X(A, B, C). Let g denote this Lie algebra. Let E(p, q)
denote the n × n matrix whose (p, q)th entry is a one and whose all other entries
are zeros. Then, define
F (p, q) = X(0, (E(p, q) − E(q, p))/2, 0), G(p, q) = X(0, 0, (E(p, q) − E(q, p))/2),
We leave it as an exercise to verify that these are precisely the set of all linearly
independent eigenvectors of ad(h) in g.
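A quick numerical check of this block description (a sketch, with randomly generated blocks): X(A, B, C) satisfies X^T W + W X = 0 for the weight matrix W above, and the bracket of two such matrices satisfies the same relation.

```python
import numpy as np

n = 3
I, Z = np.eye(n), np.zeros((n, n))
W = np.block([[Z, I], [I, Z]])              # the weight matrix of the bilinear form

def X_of(A, B, C):
    return np.block([[A, B], [C, -A.T]])    # D = -A^T, with B, C skew-symmetric

rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n)); B = (B - B.T) / 2
C = rng.standard_normal((n, n)); C = (C - C.T) / 2
X = X_of(A, B, C)

assert np.allclose(X.T @ W + W @ X, 0)      # X preserves the form (.|.)

E = lambda p, q: np.eye(n)[:, [p]] @ np.eye(n)[[q], :]
F = X_of(Z, (E(0, 1) - E(1, 0)) / 2, Z)     # the basis element F(0, 1) from the text
br = X @ F - F @ X
assert np.allclose(br.T @ W + W @ br, 0)    # brackets stay inside the Lie algebra
```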
Chapter 5
y0[Lm + n] = x[n], 0 ≤ n ≤ L − 1, 0 ≤ m ≤ K − 1
Note that this "periodized signal" y0 is defined over the time interval 0 ≤ t ≤
KL − 1. Now add noise to this periodized signal to get a noisy signal y1[n] = y0[n] + w[n].
After this, delay this noisy signal over each slot by a different amount. Let τ (m)
denote the delay over the mth slot. The resulting signal is then given by z[n]
where
z[mL + n] = y1[mL + n − τ(m)], τ(m) ≤ n ≤ L − 1 (y1 being the noisy periodized signal),
and
z[mL + n] = 0, 0 ≤ n ≤ τ(m) − 1
Now take the DFT of z over each slot and estimate the τ(m)'s by comparing
this DFT with the DFT of the original signal x. Reconstruct the signal x from
knowledge of these delay estimates and the noisy signal z[.] and determine the
reconstruction error variance.
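A minimal simulation of this slot-delay scheme (L, K, the delays and the noise level are all illustrative assumptions): a delay by τ multiplies the slot DFT by exp(−i2πωτ/L), so each τ(m) can be estimated by maximizing the circular correlation computed from the DFTs.

```python
import numpy as np

rng = np.random.default_rng(1)
L, K = 128, 4
x = rng.standard_normal(L)          # one period of the underlying signal
tau_true = [0, 5, 11, 3]            # per-slot delays (illustrative)

z = np.zeros(K * L)
for m, tau in enumerate(tau_true):
    slot = np.zeros(L)
    slot[tau:] = x[:L - tau] + 0.05 * rng.standard_normal(L - tau)  # delayed + noise
    z[m * L:(m + 1) * L] = slot

Xf = np.fft.fft(x)
tau_est = []
for m in range(K):
    Zf = np.fft.fft(z[m * L:(m + 1) * L])
    corr = np.real(np.fft.ifft(Zf * np.conj(Xf)))   # circular cross-correlation
    tau_est.append(int(np.argmax(corr)))

assert tau_est == tau_true
```

With the delays recovered, each slot can be realigned against x and the reconstruction error variance estimated empirically.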
[3] Let X be a random vector having the pdf p(X|θ). Let X[n], n = 1, 2, ...
be iid samples of X so that (X[n], n = 1, 2, ..., M) has the pdf Π_{n=1}^{M} p(X[n]|θ).
Estimate θ from measurements of X[n], n = 1, 2, ..., M by the maximum likeli-
hood method. Denote this estimate by θ̂[M ]. Calculate using the large deviation
principle
lim_{M→∞} M^{-1}.log(P(|θ̂[M] − θ| > δ))
Make appropriate approximations.
hint: Write θ̂[M] = θ + δθ[M] and expand
log(p(X|θ + δθ)) ≈ log p(X|θ) + δθ.(p′(X|θ)/p(X|θ))
+ ((δθ)²/2)((p″(X|θ)/p(X|θ)) − (p′(X|θ)/p(X|θ))²)
Use this formula to approximately maximize ∑_{n=1}^{M} log(p(X[n]|θ + δθ)) w.r.t. δθ
and then apply the LDP.
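A simulation sketch of the statement, using the Gaussian location family N(θ, σ²) as an assumed concrete example (there the ML estimate is the sample mean and the rate is I(δ) = δ²/2σ²): the tail probability P(|θ̂[M] − θ| > δ) is bounded by the Chernoff/LDP expression 2.exp(−M.I(δ)), and −M^{-1} log P approaches I(δ) as M grows.

```python
import numpy as np

rng = np.random.default_rng(2)
theta, sigma, delta = 0.0, 1.0, 0.5
rate = delta ** 2 / (2 * sigma ** 2)      # LDP rate for the Gaussian mean

def tail_prob(M, trials=200000):
    # ML estimate of theta for N(theta, sigma^2) is the sample mean
    th_hat = rng.normal(theta, sigma, size=(trials, M)).mean(axis=1)
    return np.mean(np.abs(th_hat - theta) > delta)

for M in (5, 10, 20):
    p = tail_prob(M)
    assert p <= 2 * np.exp(-M * rate)     # Chernoff/LDP upper bound holds
```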
where
U.V = η_{μν}U^μ V^ν
Consider a perturbing Lagrangian
ΔL = ∫_0^1 (F.∂τX + G.∂σX) dσ
where F^μ(τ, σ), G^μ(τ, σ) are random fields. Calculate the perturbation in the
solution to the string field equations caused by these random perturbations and
evaluate using the LDP the probability that these perturbations will cause the
string deviation from its unperturbed solution to be more than a given threshold
δ.
(∇2 + k 2 )ψ(r) = 0
Substitute
ψ(r) = A(r).exp(iφ(r))
where A, φ are real functions into the Helmholtz equation and derive nonlinear
pde’s for the two functions A, φ. Hence by making appropriate approximations,
derive a nonlinear Schrodinger equation for φ with the z coordinate taking the
role of the time variable.
hint:
∇ψ = (∇A + iA∇φ)exp(iφ)
∇2 ψ = (∇2 A + iA∇2 φ + 2i(∇A, ∇φ) − A|∇φ|2 )exp(iφ)
and hence the Helmholtz equation reads on neglecting the single and double
partial derivatives of A (which corresponds to the situation when the amplitude
of the wave does not vary rapidly with position, only the phase varies rapidly)
i∇2 φ − |∇φ|2 + k 2 = 0
We now write
φ(r) = kz + χ(r)
and note that
|∇φ|2 = k 2 + |∇χ|2 + 2k∂χ/∂z
so that the above equation becomes
i∇²χ − |∇χ|² − 2k.∂χ/∂z = 0
Then,
P,t(t, x) = ψ*ψ,t + ψ*,tψ = iψ*ψ,xx/2 − iψψ*,xx/2 = −J,x(t, x)
where
J(t, x) = (1/2i)(ψ*ψ,x − ψψ*,x) = Im(ψ*ψ,x)
Therefore if P is to increase with time, then J must decrease with increasing
x. It is plausible to believe from the Gibbs principle, that if the potential V
is large at a point, then the probability of the particle to be at that point is
smaller. However, the basis of that result is a part of quantum statistics which
involves bringing in the notion of a mixed state. We can bring in quasi-classical
quantum mechanics to understand the quantum neural network better. Let
then
P (t, x) = A(t, x)2
Substituting this into the above Schrodinger equation gives us
iA,t − A.S,t = V A − (1/2)(A″ − A.S′² + 2iA′S′ + iA.S″)
and hence
A,t = −A′S′ − A.S″/2, S,t = −V + A″/2A − S′²/2
Also,
P,t = −Im(ψ*ψ,x),x = −(A²S′)′
= 2A.A,t = −A(2A′S′ + A.S″)
If we fix our energy E, then we get
EA = V A − (1/2)(A″ − A.S′² + 2iA′S′ + iA.S″)
or, on neglecting A″ and separating the real part,
S′² = 2(E − V)
while the imaginary part gives the equation
2A′S′ + A.S″ = 0
which gives
A²S′ = C
where C is a constant so that
A = C0/√S′ = C0/(2(E − V))^{1/4}
or
P = A² = C0²/√(2(E − V))
This equation implies that if V < E and V increases, then P will also increase
and this fact is at the heart of the neural network.
p(y(.)|{τk })p({τk })
p(τ1, ..., τn) = λⁿ Π_{k=0}^{n−1} exp(−λ(τk+1 − τk))
and
p(y(t), t ∈ [0, T]|{τk}) = C.exp(−(2σv²)^{-1} ∫_0^T (y(t) − ∑_{k≥0} x(t − τk))² dt)
Thus, the posterior negative log likelihood function of the delay times is
L({τk }|y(.)) =
(1/2σv²) ∫_0^T (y(t) − ∑_{k≥0} x(t − τk))² dt + λ ∑_k (τk+1 − τk)
The last sum evaluates to τN where N is the number of Poisson jumps in the time
interval [0, T ]. In order to formulate this more clearly, we require to estimate
both N = N (T ), the number of arrival times in the time interval [0, T ] as well
as the arrival times τk , k = 1, 2, ..., N . This is done by the following evaluation:
(1/2σv²) ∫_0^T (y(t) − ∑_{0≤k≤N} x(t − τk))² dt − N.log(λ) + log(N!)
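A discretized sketch of this evaluation (the pulse shape x, rate λ, noise level and candidate delay sets are all illustrative assumptions): the objective (1/2σv²)∫(y − Σ x(t − τk))²dt − N log λ + log N! is computed for several candidate configurations, and the true one scores lowest.

```python
import numpy as np
from math import lgamma, log

rng = np.random.default_rng(3)
T, dt, sigma2, lam = 10.0, 0.01, 0.04, 0.5
t = np.arange(0.0, T, dt)
x = lambda u: np.exp(-((u - 1.0) / 0.2) ** 2) * (u >= 0)   # illustrative pulse shape

tau_true = [0.0, 3.0, 6.5]
y = sum(x(t - tau) for tau in tau_true) + 0.02 * rng.standard_normal(t.size)

def neg_log_post(taus):
    # (1/2 sigma_v^2) int (y - sum_k x(t - tau_k))^2 dt - N log(lambda) + log(N!)
    N = len(taus)
    resid = y - sum(x(t - tau) for tau in taus)
    return (resid ** 2).sum() * dt / (2 * sigma2) - N * log(lam) + lgamma(N + 1)

best = neg_log_post(tau_true)
for cand in ([0.0, 3.0], [0.0, 3.5, 6.5], [0.0, 3.0, 6.5, 8.0]):
    assert best < neg_log_post(cand)   # missing, shifted, or spurious delays score worse
```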
[9] Random delay estimation for arbitrary renewal process arrival times.
Assume that the delay times τk , k = 1, 2, ..., τ0 = 0 are such that the inter-
delay times τk+1 − τk , k = 0, 1, 2, ... are iid with probability distribution F (.)
concentrated on R+ . Then, let N (T ) denote the number of arrival times in the
time interval [0, T ]. Thus,
N = N (T ) = max(n ≥ 0 : τn ≤ T ),
This gives us the negative log likelihood function for the delays given the mea-
surement signal over the time interval [0, T ] as
and minimizing this function w.r.t N, τ1 , ..., τN gives us the optimum delay esti-
mates. We can compute this likelihood function approximately using Parseval’s
theorem on Fourier analysis. Let X(ω), Y(ω) denote the Fourier transforms of
x(t), y(t) respectively. Then,
y(t) = (2π)^{-1} ∫_R Y(ω).exp(iωt) dω,
∑_{k=0}^{N} x(t − τk) = (1/2π) ∫_R X(ω) ∑_{k=0}^{N} exp(iω(t − τk)) dω
(1/2π) ∫_R |Y(ω) − X(ω).∑_{k=0}^{N} exp(−iωτk)|² dω − ∑_{k=0}^{N−1} log(f(τk+1 − τk))
The sources of various rhythms present within the brain generate signals of
different frequencies, say ω1 , ..., ωp . When some stimuli in the form of light,
sound, skin perturbations etc. are applied on the person, then these frequencies
get phase coupled. Usually these stimuli are modeled by a discrete Poisson
train, so that if τ1 , τ2 , ... are the arrival times of a Poisson process with rate
λ (or equivalently, the inter-arrival times τk+1 − τk, k = 1, 2, ... are independent
exponential random variables with mean 1/λ), and the original EEG signal is
f(t) = ∑_{k=1}^{p} A(k)cos(ωk t + φk)
then after the application of this discrete stimulus, the EEG signal becomes
g(t) = ∑_k f(t − τk) = ∑_{k,m} A(k).cos(ωk t − ωk τm + φk)
and it can be shown easily that if the φk's are iid uniform over [0, 2π), then the
triple correlations E(f(t)f(t + t1)f(t + t2)) vanish but E(g(t)g(t + t1)g(t + t2))
does not. In fact, f(t), g(t) are stationary processes with f(t) Gaussian provided
that the A(k)'s are iid Rayleigh, but g(t) is a non-Gaussian process.
This suggests that by estimating the bispectrum (bivariate Fourier transform
of the triple correlations) of g(t), we can get information about the stimulus
rate λ. Moreover, if there is some sort of a brain disease, then even without the
stimulus, the different phases {φk} in f(t) may not be statistically independent,
causing the bispectrum of f(t) to be non-zero. One model for a brain disease
could be that the original harmonic signal f (t) comprising independent phases
gets non-linearly distorted leading to the resulting output having linear phase
relations causing its bispectrum to be non-zero. Thus by estimating the signal
bispectrum and noting the frequency pairs at which it peaks, we get information
about which frequencies in the rhythms are phase coupled. For example if the
non-linearity is of the Volterra type
y(t) = ∫ h1(τ)f(t − τ)dτ + ∫∫ h2(τ1, τ2)f(t − τ1)f(t − τ2)dτ1 dτ2
where
H1(ω) = ∫ h1(τ).exp(−jωτ)dτ,
H2(ω1, ω2) = ∫∫ h2(τ1, τ2).exp(−j(ω1τ1 + ω2τ2))dτ1 dτ2
E(g(t)g(t + t1)g(t + t2)) =
∑_{k,m} A(k)²A(m)²|H1(ωk)||H1(ωm)||H2(ωk, ωm)|cos(ωk t1 + ωm t2 + φH2(ωk, ωm) − φH1(ωk) − φH1(ωm))
which means that the bispectrum of g is
Bg(Ω1, Ω2) = ∫∫ E(g(t)g(t + t1)g(t + t2))exp(−j(Ω1t1 + Ω2t2))dt1 dt2
= ∑_{k,m} A(k)²A(m)²|H1(ωk)||H1(ωm)||H2(ωk, ωm)| exp(jφH(ωk, ωm)) δ(Ω1 − ωk) δ(Ω2 − ωm)
plus similar terms with ωm replaced by −ωm and φH1 (ωm ) replaced by −φH1 (ωm ).
In this expression, we have defined
φH (ωk , ωm ) = φH2 (ωk , ωm ) − φH1 (ωk ) − φH1 (ωm )
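A numerical illustration of this phase-coupling mechanism (two tones and a quadratic Volterra-type distortion, all values illustrative): the ensemble triple correlation of the independent-phase signal f vanishes, while that of the distorted signal g does not. The phase averages are computed exactly on a uniform phase grid, since the integrands are trigonometric polynomials.

```python
import numpy as np

w1, w2 = 2.0, 3.0
ph = np.arange(64) * 2 * np.pi / 64
P1, P2 = np.meshgrid(ph, ph, indexing="ij")   # grid of iid-uniform phase pairs

def f(t):
    return np.cos(w1 * t + P1) + np.cos(w2 * t + P2)

def triple(sig, t1, t2):
    # ensemble average E[sig(0) sig(t1) sig(t2)] over the phases
    return (sig(0.0) * sig(t1) * sig(t2)).mean()

g = lambda t: f(t) + 0.5 * f(t) ** 2          # quadratic distortion couples the phases

assert abs(triple(f, 0.3, 0.7)) < 1e-10       # independent phases: vanishes
assert abs(triple(g, 0.3, 0.7)) > 1e-3        # phase-coupled: non-vanishing
```

Peaks of the resulting bispectrum at frequency pairs (ωk, ωm) then flag the coupled rhythms, as described above.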
where
H(ω, x) = ∫_R h(t, x).exp(−jωt)dt
Lf(t, x) = s(t, x)
where L is a partial differential operator in space and time. L^{-1} would then be
an integral kernel and we would get, by solving the above pde,
f(t, x) = ∫ L^{-1}(t − τ, x, y)s(τ, y)dτ dy
which would be a signal of the above form when s(t, x) is a harmonic process
w.r.t time. We can also model this signal using a dynamic model and apply the
EKF to estimate the parameters of the operator L. For example, suppose
L = ∂/∂t + M
where M is time independent; then our dynamic model would be
∂f(t, x)/∂t + ∫ M(x, y|θ)f(t, y)dy = s(t, x) + w(t, x)
where w(t, x) is white Gaussian noise. θ are the parameters of the EEG signal
and they can be estimated using the EKF applied to spatially discrete measure-
ment data:
g(t, xk ) = f (t, xk ) + v(t, xk ), k = 1, 2, ..., d
obtained by placing sensors at the discrete points xk , k = 1, 2, ..., d on the brain’s
surface. If we take non-linearities into account caused by the disease, then the
dynamic model for the signal would be
∂f(t, x)/∂t + ∫ M1(x, y|θ)f(t, y)dy + ∫∫ M2(t, x, y, z|θ)f(t, y)f(t, z)dydz = s(t, x) + w(t, x)
and once again, we could apply the EKF to estimate the model parameters θ
from which the nature of the brain disease could be classified.
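A sketch of the EKF parameter estimation for a single spatial mode of the linear model (the scalar dynamics, noise levels and input below are illustrative assumptions): the state is augmented with the unknown parameter θ and both are estimated jointly from the noisy measurements.

```python
import numpy as np

rng = np.random.default_rng(4)
dt, T, theta_true = 0.05, 4000, 1.0
qw, rv = 1e-4, 2.5e-3

# simulate one mode: f_{t+1} = f_t + dt*(-theta*f_t + s_t) + w_t,  g_t = f_t + v_t
fval, gs, ss = 0.0, [], []
for t in range(T):
    s = 1.5 + np.sin(0.3 * t * dt)                     # persistently exciting input
    fval = fval + dt * (-theta_true * fval + s) + np.sqrt(qw) * rng.standard_normal()
    gs.append(fval + np.sqrt(rv) * rng.standard_normal())
    ss.append(s)

# EKF on the augmented state xi = [f, theta]
xi, P = np.array([0.0, 0.3]), np.diag([1.0, 1.0])      # crude initial theta guess
Q, H = np.diag([qw, 1e-8]), np.array([[1.0, 0.0]])
for t in range(T):
    fh, th = xi
    xi_pred = np.array([fh + dt * (-th * fh + ss[t]), th])
    F = np.array([[1 - dt * th, -dt * fh], [0.0, 1.0]])  # Jacobian of the dynamics
    P = F @ P @ F.T + Q
    y = gs[t] - xi_pred[0]
    S = (H @ P @ H.T)[0, 0] + rv
    Kg = (P @ H.T)[:, 0] / S
    xi = xi_pred + Kg * y
    P = P - np.outer(Kg, H @ P)

assert abs(xi[1] - theta_true) < 0.35   # parameter estimate has converged near theta
```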
Application of neural networks and artificial intelligence in the classification
of brain disease. Suppose we take a recurrent neural network RNN governed by
the state equations
where X(k) are the signals at the various layers, W (k) are the weights and u(k)
is the input vector. The measurement is taken on the final layer, ie, the Lth
layer:
y(k) = XL (k) + v(k) = HX(k) + v(k)
where
X(k) = [X1 (k)T , ..., XL (k)T ]T
with Xj (k) being the signal vector at the j th layer and H a matrix of the form
[0, 0, ..., 0, I]. The weights W (k) are governed by the evolution
W (k + 1) = W (k) + noise
We apply a vector stimulus signal u(t) to the diseased brain and record the
output signal vector yd(t) measured on the surface of the brain. We then use
the EKF for the above neural network model to estimate the weights W(k) with
the driving input for the EKF taken as yd(k). Once the weights have converged,
we regard these converged weights as the characteristic features of the disease.
The use of quantum neural networks in estimating the brain signal pdf.
Quantum neural networks are nature inspired algorithms that naturally generate
a whole family of probability densities and hence can be used to estimate the
joint pdf of the EEG signal on the brain surface.
[11] Modeling speech signals based on distorted MRI data and estimating the MRI parameters from distorted speech using the EKF
This is just a special case of synthesizing a higher dimensional signal from a
lower dimensional signal.
Synthesis of MRI data from speech/EEG signals
The MRI space-time signal M (t, x, y) = M (t, x, y|θ) modeled as a function
of time and space is
∂t M(t, x, y|θ) = ∑_{k,m=0}^{p} akm(θ)∂x^k ∂y^m M(t, x, y|θ) + z(t)χ(t, x, y) + w(t, x, y)
In the special case when the brain is normal, θ = θ0 and in the absence of noise
z(t) = z0 (t) so that we have an identity
∂t M0(t, x, y|θ0) = ∑_{k,m=0}^{p} akm(θ0)∂x^k ∂y^m M0(t, x, y|θ0) + z0(t)χ(t, x, y)
This identity determines the normal brain, noise free MRI data M0(t, x, y|θ0) as
a function of the computer generated speech data z0 (t). Our aim is to estimate
the brain disease parameters θ from recordings of the speech data z(t). This we
propose to do using the EKF.
θ = θ0 + δθ
Likewise, if z(t) deviates slightly from z0 (t), we call this deviation δz(t):
∂t δM(t, x, y) =
∑_{k,m=0}^{p} (a′km(θ0)δθ(t))∂x^k ∂y^m M0(t, x, y|θ0) + w(t, x, y)
dδθ(t) = d θ (t),
and the linearized measurement model is
dδz(t) = (∫ φ(t, x, y)δM(t, x, y)dxdy)dt + dv(t)
The problem now amounts to estimating δθ(t) dynamically from δz(s), s ≤ t and
this can be achieved via the Kalman filter. Another model that could describe
this situation involves considering that the MRI image data within the brain
depends on the distorted brain parameters θ as well as on the distorted speech
data z(t). In this case, the state and measurement models will be intertwined.
However, such a model is not too realistic since it is natural to suppose that
first the brain acquires a disease independent of the spoken speech so that the
corresponding distorted parameters θ have nothing to do with the spoken speech
but could depend on the noiseless speech that the computer has stored. One
may also assume that this MRI data is independent of the noiseless speech but
the spoken speech depends on the MRI data and the computer stored noiseless
speech. In this case, a state and measurement model would be
∂t M(t, x, y|θ) = ∑_{k,m=0}^{p} akm(θ)∂x^k ∂y^m M(t, x, y|θ) + w(t, x, y)
dz(t) = dz0(t) + (∫ φ(t, x, y)M(t, x, y|θ)dxdy)dt + dv(t)
The role of the bispectrum in modeling brain disease: Brain disease is usually
manifested by the generation of nonlinear mechanisms which distort the speech
data. So, if M (t, x, y|θ) is the MRI image field data, then the spoken speech
z(t) would satisfy an sde of the form
dz(t) = (δ.∫ f(M(t, x, y|θ), z0(t))φ(t, x, y)dxdy)dt + dz0(t) + dv(t)
where z0 (t) is the computer recorded speech data. We are assuming that when
the patient carrying the brain disease varies, the parameter t will also vary and
for the tth patient, z0 (t) is fixed. Actually to be more precise the forcing term
in the above pde should depend on the entire speech record z0 (.) taken over the
finite duration [0, T ] and therefore we should write
After discretizing this pde, in both time and space, we obtain a partial difference
equation
M[t + 1, x, y|θ] = M[t, x, y|θ] + δ(∑_k θ[k]Ld,k)M[t, x, y|θ] + ψ[t, x, y, z0]
where δ is the time discretization step size and Ld,k is the discrete matrix repre-
sentation of the partial differential operator Lk after spatial discretization. For
example if Lk = ∂x^r ∂y^s, then Ld,k = Δ^{−(r+s)}.Dx^r Dy^s
where Δ is the spatial discretization step size and Dx , Dy are the partial differ-
ence operators
Dx f (x, y) = f (x + 1, y) − f (x, y), Dy f (x, y) = f (x, y + 1) − f (x, y)
where x + 1 means x + Δ. More precisely, the integer x corresponds to the
spatial length xΔ and likewise for y. Thus,
∑_{x,y=0}^{N−1} (Dx^r Dy^s M)(t, x, y) ex ⊗ ey = A.M(t), with M(t) = ∑_{x,y=0}^{N−1} M(t, x, y) ex ⊗ ey
where A is the N 2 × N 2 matrix
A = ∑_{x,y=0}^{N−1} (ex ⊗ ey)Dx^r Dy^s (ex ⊗ ey)^T = ∑_{x,y=0}^{N−1} (ex ⊗ ey)(Dx^r ex ⊗ Dy^s ey)^T
where
A(θ) = ∑_{k=1}^{p} θ[k]Ak
M[t|θ] = (I + A(θ))^t M[0] + ∑_{k=0}^{t−1} (I + A(θ))^{t−k−1} ψ[k, z0]
for k = 1, 2, ..., p.
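The Kronecker structure of the discretized operator can be checked numerically: with the row-major (ex ⊗ ey) ordering used above, the N² × N² matrix representing Dx^r Dy^s is kron(Dr, Ds), where Dr, Ds are powers of the one-dimensional forward-difference matrix. A sketch:

```python
import numpy as np

N, r, s = 8, 1, 2
D = np.eye(N, k=1) - np.eye(N)          # forward difference: (D f)(x) = f(x+1) - f(x)
Dr = np.linalg.matrix_power(D, r)
Ds = np.linalg.matrix_power(D, s)

M = np.arange(N * N, dtype=float).reshape(N, N) ** 1.5   # an arbitrary test field

# direct application: x-differences act on rows, y-differences on columns
direct = Dr @ M @ Ds.T

# Kronecker form acting on the row-major vectorization of M
A = np.kron(Dr, Ds)
assert np.allclose(A @ M.ravel(), direct.ravel())
```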
We assume that
y[t] = x[t] + v[t]
where v[.] is noise and get
y[t + 1] − f (t, y[t], θ[t]) = x[t + 1] + v[t + 1] − f (t, x[t] + v[t], θ[t])
+ ∑_t ‖M(t + 1) − A4(φ)M(t) − A5(φ)ζ0(t + 1) − a6(φ)‖²]
Now while running the EKF to estimate θ dynamically from the measurement
model
z(t) = Hζ(t) + v(t)
we assume that θ(t) = θ0 + δθ(t) and linearize the above model around θ0 and
then estimate δθ(t) using the EKF. For the EKF, the extended state vector is
taken as
ξ(t) = [ζ(t)^T, M(t)^T, δθ(t)^T]^T
and the EKF estimate of this gives us M̂ (t|t), ie, the MRI data estimate as
a by-product. It also gives us δ̂θ(t|t) and hence θ̂(t|t) which characterizes the
nature of the disease.
[15] Detecting brain diseases using EEG data based on group in-
variant processing
The sources of various rhythms present within the brain generate signals of
different frequencies, say ω1 , ..., ωp . When some stimuli in the form of light,
sound, skin perturbations etc. are applied on the person, then these frequencies
get phase coupled. Usually these stimuli are modeled by a discrete Poisson
train, so that if τ1 , τ2 , ... are the arrival times of a Poisson process with rate
λ (or equivalently, the inter-arrival times τk+1 − τk, k = 1, 2, ... are independent
exponential random variables with mean 1/λ), and the original EEG signal is
f(t) = ∑_{k=1}^{p} A(k)cos(ωk t + φk)
then after the application of this discrete stimulus, the EEG signal becomes
g(t) = ∑_k f(t − τk) = ∑_{k,m} A(k).cos(ωk t − ωk τm + φk)
and it can be shown easily that if the φk's are iid uniform over [0, 2π), then the
triple correlations E(f(t)f(t + t1)f(t + t2)) vanish but E(g(t)g(t + t1)g(t + t2))
does not. In fact, f(t), g(t) are stationary processes with f(t) Gaussian provided
that the A(k)'s are iid Rayleigh, but g(t) is a non-Gaussian process.
This suggests that by estimating the bispectrum (bivariate Fourier transform
of the triple correlations) of g(t), we can get information about the stimulus
rate λ. Moreover, if there is some sort of a brain disease, then even without the
stimulus, the different phases {φk} in f(t) may not be statistically independent,
causing the bispectrum of f(t) to be non-zero. One model for a brain disease
could be that the original harmonic signal f (t) comprising independent phases
gets non-linearly distorted leading to the resulting output having linear phase
relations causing its bispectrum to be non-zero. Thus by estimating the signal
bispectrum and noting the frequency pairs at which it peaks, we get information
about which frequencies in the rhythms are phase coupled. For example if the
non-linearity is of the Volterra type
y(t) = ∫ h1(τ)f(t − τ)dτ + ∫∫ h2(τ1, τ2)f(t − τ1)f(t − τ2)dτ1 dτ2
where
H1(ω) = ∫ h1(τ).exp(−jωτ)dτ,
H2(ω1, ω2) = ∫∫ h2(τ1, τ2).exp(−j(ω1τ1 + ω2τ2))dτ1 dτ2
E(g(t)g(t + t1)g(t + t2)) =
∑_{k,m} A(k)²A(m)²|H1(ωk)||H1(ωm)||H2(ωk, ωm)|cos(ωk t1 + ωm t2 + φH2(ωk, ωm) − φH1(ωk) − φH1(ωm))
which means that the bispectrum of g is
Bg(Ω1, Ω2) = ∫∫ E(g(t)g(t + t1)g(t + t2))exp(−j(Ω1t1 + Ω2t2))dt1 dt2
= ∑_{k,m} A(k)²A(m)²|H1(ωk)||H1(ωm)||H2(ωk, ωm)| exp(jφH(ωk, ωm)) δ(Ω1 − ωk) δ(Ω2 − ωm)
plus similar terms with ωm replaced by −ωm and φH1 (ωm ) replaced by −φH1 (ωm ).
In this expression, we have defined
φH (ωk , ωm ) = φH2 (ωk , ωm ) − φH1 (ωk ) − φH1 (ωm )
where
H(ω, x) = ∫_R h(t, x).exp(−jωt)dt
which would be a signal of the above form when s(t, x) is a harmonic process
w.r.t time. We can also model this signal using a dynamic model and apply the
EKF to estimate the parameters of the operator L. For example, suppose
L = ∂/∂t + M
where M is time independent, then our dynamic model would be
∂f(t, x)/∂t + ∫ M(x, y|θ)f(t, y)dy = s(t, x) + w(t, x)
where w(t, x) is white Gaussian noise. θ are the parameters of the EEG signal
and they can be estimated using the EKF applied to spatially discrete measure-
ment data:
g(t, xk ) = f (t, xk ) + v(t, xk ), k = 1, 2, ..., d
obtained by placing sensors at the discrete points xk , k = 1, 2, ..., d on the brain’s
surface. If we take non-linearities into account caused by the disease, then the
dynamic model for the signal would be
∂f(t, x)/∂t + ∫ M1(x, y|θ)f(t, y)dy + ∫∫ M2(t, x, y, z|θ)f(t, y)f(t, z)dydz = s(t, x) + w(t, x)
and once again, we could apply the EKF to estimate the model parameters θ
from which the nature of the brain disease could be classified.
Now suppose that the functions M1 and M2 are G-invariant. Then the
implementation of the EKF is greatly simplified. Here, G is a Lie group of
transformations acting on the head surface manifold M. Let (g, x) → g.x
denote the group action from G × M → M. Then we denote by U the unitary
representation of G in L2 (M) induced by this group action and a G-invariant
measure on M. Specifically
L²(M) = ⊕_{n≥1} Hn
where Hn is a subspace invariant and irreducible under U (.). We can write for
any f (x) ∈ L2 (M),
f(x) = ∑_{n≥1} Pn f(x)
Pn f = ∑_{k=1}^{dn} |en,k >< en,k, f >
where
< en,k, f > = ∫_M ēn,k(x)f(x)dμ(x)
A G-invariant kernel can then be constructed as K(x, y) = ∑_{n,k} k(n)en,k(x)ēn,k(y) = ∑_n k(n)Pn(x, y), where k(n) are constants dependent only on n and Pn(x, y) is the kernel of the orthogonal projection operator onto Hn. To verify the G-invariance of K(., .), we note that
K(g^{-1}x, g^{-1}y) = ∑_{n,k} k(n)en,k(g^{-1}x).ēn,k(g^{-1}y)
= ∑_{n,k,k′,k″} k(n)[πn(g)]k′k en,k′(x).[π̄n(g)]k″k ēn,k″(y)
= ∑_{n,k} k(n)en,k(x)ēn,k(y) = K(x, y)
since
∑_k [πn(g)]k′k [π̄n(g)]k″k = δk′k″
s(t, x) = ∑_{n,k} s[t, n, k]en,k(x),
M(x, y|θ) = ∑_{n,k} M[n|θ]en,k(x)ēn,k(y)
and substituting these expressions into the differential equation gives us
∂f[t, n, k]/∂t + M[n|θ]f[t, n, k] = s[t, n, k]
Thus, by measuring the input s(t, x) and the response f (t, x) at different values
of (t, x) ∈ R+ × M, we can estimate θ. Note that by exploiting our apriori
knowledge that M is G-invariant, we have reduced the computation, since in
view of this G-invariance, M[n|θ] depends only on n and not on both (n, k).
θ can be extracted from i/o data by the following elementary algorithm:
elementary algorithm:
θ̂ = argmin_θ ∑_{t,n,k} |f[t, n, k] − ∫_0^t exp(−(t − τ)M[n|θ])s[τ, n, k]dτ|²
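A sketch of this algorithm for a scalar θ with the assumed illustrative form M[n|θ] = θn (the data below are synthetic): the modal responses are generated by discretizing f′ + M[n|θ]f = s, and θ is recovered by a grid search on the squared error.

```python
import numpy as np

rng = np.random.default_rng(5)
dt, T, modes = 0.01, 400, [1, 2, 3]
theta_true = 0.8
s = {n: rng.standard_normal(T) for n in modes}   # per-mode input s[t, n]

def response(theta, n):
    # discretized f[t] = int_0^t exp(-(t - tau) M[n|theta]) s[tau] dtau
    f = np.zeros(T)
    for t in range(1, T):
        f[t] = f[t - 1] * np.exp(-theta * n * dt) + s[n][t - 1] * dt
    return f

data = {n: response(theta_true, n) for n in modes}
grid = np.linspace(0.1, 2.0, 191)
err = [sum(np.sum((data[n] - response(th, n)) ** 2) for n in modes) for th in grid]
theta_hat = grid[int(np.argmin(err))]
assert abs(theta_hat - theta_true) < 0.05
```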
Suppose that s(t, x) is a random process with G-invariant correlation function, ie
E[s(t, gx)s(t′, gx′)] = E[s(t, x).s(t′, x′)] = Ks(t, x, t′, x′), g ∈ G
then we can expand
s(t, x) = ∑_{n,k} s[t, n, k]en,k(x)
[17] The use of quantum neural networks in estimating the brain signal pdf.
Quantum neural networks are nature inspired algorithms that naturally generate
a whole family of probability densities and hence can be used to estimate the
joint pdf of the EEG signal on the brain surface.
where
Yi(ω, k) = L^{-1} ∑_{t=0}^{L−1} yi(kL + t)exp(−i2πωt/L), ω = 0, 1, ..., L − 1
x(t) = ∑_{ω=0}^{L−1} A(ω)cos(2πωt/L + φ(ω))
with the A(ω)'s independent, and the φ(ω)'s independent and uniform over [0, 2π). The
output is modeled as
z(t) = Σ_{k=1}^{r} Σ_{m1,...,mk} hk(m1, ..., mk|θ)x(t − m1)...x(t − mk)
where
hk(m1, ..., mk|θ) = Σ_{l=1}^{s} θ(l)ψk(m1, ..., mk, l)
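A minimal simulation of this Volterra input-output model can be written directly from the two displayed formulas. In the sketch below r = 2, and the basis functions ψk, the kernel memory length and the parameter values are all hypothetical choices for illustration only.

```python
import math

# Second-order (r = 2) Volterra model
#   z(t) = sum_m h1(m|theta) x(t-m) + sum_{m1,m2} h2(m1,m2|theta) x(t-m1) x(t-m2)
# with kernels linear in the parameters: h_k = sum_l theta(l) psi_k(..., l).
# The basis functions psi1, psi2 below are illustrative, not from the text.

L = 64
theta = [0.8, -0.3]                               # s = 2 parameters
x = [math.cos(2 * math.pi * 5 * t / L) for t in range(L)]

def psi1(m, l):
    return math.exp(-(l + 1) * m)                 # hypothetical basis

def psi2(m1, m2, l):
    return math.exp(-(l + 1) * (m1 + m2))

M = 4                                             # kernel memory length
h1 = [sum(theta[l] * psi1(m, l) for l in range(2)) for m in range(M)]
h2 = [[sum(theta[l] * psi2(m1, m2, l) for l in range(2))
       for m2 in range(M)] for m1 in range(M)]

def output(t):
    lin = sum(h1[m] * x[t - m] for m in range(M))
    quad = sum(h2[m1][m2] * x[t - m1] * x[t - m2]
               for m1 in range(M) for m2 in range(M))
    return lin + quad

z = [output(t) for t in range(M, L)]
print(len(z))
```

Because the kernels are linear in θ, the output depends polynomially on the parameters, which is what the polyspectral identities below exploit.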
It is not hard to see that the (r + 1)-th order polyspectrum of z(.) has a term of
the form
Pz (ω1 , ..., ωr ) = (A(ω1 )...A(ωr ))2 Re(H̄r (ω1 , ..., ωr )H1 (ω1 )...H1 (ωr ))
where
Hm(ω1, ..., ωm) = Hm(ω1, ..., ωm|θ) = Σ_{l=1}^{s} θ(l)ψ̂m(ω1, ..., ωm, l)
= Σ_{l1,...,lm=−∞}^{∞} hm(l1, ..., lm|θ)exp(−i2π(ω1l1 + ... + ωmlm)/L)
where
ψ̂k(ω1, ..., ωk, l) = Σ_{m1,...,mk=0}^{L−1} ψk(m1, ..., mk, l)exp(−(i2π/L)(m1ω1 + ... + mkωk))
We can express the above polyspectrum equation in the presence of noisy mea-
surements after taking into account the fact that the parameters θ can have
slow time variations:
P (ω|t) = F (ω, θ(t)) + W (t)
or equivalently in vector form as
P(t) = F(θ(t)) + V(t) ∈ R^{L^r × 1}
where
P(t) = Σ_ω P(ω|t)e(ω), the sum ranging over all L^r frequency vectors ω,
where
e(ω) = f (ω1 ) ⊗ ... ⊗ f (ωr )
with
ω = (ω1 , ..., ωr ), f (ω) = [δ(ω, 1), ..., δ(ω, L)]T
Here V(t) is measurement noise and
F(θ) = Σ_ω F(ω|θ)e(ω)
with
F(ω|θ) = (A(ω1)...A(ωr))² Re(H̄r(ω1, ..., ωr|θ)H1(ω1|θ)...H1(ωr|θ))
We can write
F(θ) = ψθ⊗r
where
ψ = Σ_{ω1,...,ωr,l0,l1,...,lr} (A(ω1)...A(ωr))² Re(ψ̄r(ω1, ..., ωr, l0)
Linearizing about θ0, ψ.θ̂^{⊗r} ≈ ψ.θ0^{⊗r} + K.δθ̂
where
K = ψ.(θ0^{⊗(r−1)} ⊗ I + θ0^{⊗(r−2)} ⊗ I ⊗ θ0 + ... + θ0^{⊗(r−k)} ⊗ I ⊗ θ0^{⊗(k−1)} + ... + I ⊗ θ0^{⊗(r−1)})
δθ̂ = N^{−1} Σ_{t=1}^{N} (K^T K)^{−1}K^T(Kδθ + √ε V(t))
= δθ + √ε N^{−1}(K^T K)^{−1}K^T Σ_{t=1}^{N} V(t)
where
I(v) = (1/2)v^T Qv, Q = Nσv²(K^T K)^{−1}
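The averaged least-squares step can be checked numerically. In this sketch the matrix K, the perturbation δθ and the noise level ε are made-up values; the point is only that the estimator returns δθ plus a noise term that averages out over the N measurements.

```python
import random

# Numeric check of
#   delta_theta_hat = N^{-1} sum_t (K^T K)^{-1} K^T (K delta_theta + sqrt(eps) V(t))
# which equals delta_theta plus a zero-mean noise term of size sqrt(eps/N).
random.seed(0)

K = [[1.0, 0.5], [0.2, 1.0], [0.3, 0.7]]        # illustrative 3x2 matrix
dtheta = [0.4, -0.1]                            # "true" perturbation
eps = 1e-4
N = 2000

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

Kt = transpose(K)
pinv = matmul(inv2(matmul(Kt, K)), Kt)          # (K^T K)^{-1} K^T, 2x3

est = [0.0, 0.0]
for _ in range(N):
    V = [random.gauss(0.0, 1.0) for _ in range(3)]
    y = [a + eps ** 0.5 * b for a, b in zip(matvec(K, dtheta), V)]
    e = matvec(pinv, y)                         # per-measurement LS estimate
    est = [u + v / N for u, v in zip(est, e)]
print(est)
```

The residual fluctuation of the averaged estimate is what the Gaussian rate function I(v) above quantifies.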
Now we come to the case when the parameter vector θ has slow time variations.
Let θ0(t) be our initial guess for this parameter at time t and denote the estimate
of the deviation δθ(t) = θ(t) − θ0(t) by δθ̂(t). Then our measurement model acquires the form
δP(t) = K(t)δθ(t) + √ε V(t)
where
K(t) = ψ.(θ0(t)^{⊗(r−1)} ⊗ I + θ0(t)^{⊗(r−2)} ⊗ I ⊗ θ0(t) + ... + θ0(t)^{⊗(r−k)} ⊗ I ⊗ θ0(t)^{⊗(k−1)} + ... + I ⊗ θ0(t)^{⊗(r−1)})
The dynamical model for δθ(t) is given by
δθ(t + 1) = δθ(t) + W (t + 1)
and its EKF estimate is denoted by δ θ̂(t|t). The LDP must be applied to
estimate the probability
The distorted speech/EEG signal z(t) is related to the MRI data via a "measurement
model"
dz(t) = (∫ φ(t, x, y)M(t, x, y|θ)dxdy)dt + dv(t)
In the special case when the brain is normal, θ = θ0 and in the absence of noise
z(t) = z0 (t) so that we have an identity
∂t M0(t, x, y|θ0) = Σ_{k,m=0}^{p} akm(θ0)∂x^k ∂y^m M0(t, x, y|θ0) + z0(t)χ(t, x, y)
This identity determines the normal-brain, noise-free MRI data M0(t, x, y|θ0) as
a function of the computer generated speech data z0 (t). Our aim is to estimate
the brain disease parameters θ from recordings of the speech data z(t). This we
propose to do using the EKF.
If the diseased brain parameters deviate slightly from the normal ones θ0, we write θ = θ0 + δθ.
Likewise, if z(t) deviates slightly from z0 (t), we call this deviation δz(t):
∂t δM(t, x, y) = Σ_{k,m=0}^{p} (∇θ akm(θ0).δθ(t))∂x^k ∂y^m M0(t, x, y|θ0) + w(t, x, y),
dδθ(t) = dwθ(t),
and the linearized measurement model is
dδz(t) = (∫ φ(t, x, y)δM(t, x, y)dxdy)dt + dv(t)
The problem now amounts to estimating δθ(t) dynamically from δz(s), s ≤ t and
this can be achieved via the Kalman filter. Another model that could describe
this situation involves considering that the MRI image data within the brain
depends on the distorted brain parameters θ as well as on the distorted speech
data z(t). In this case, the state and measurement models will be intertwined.
However, such a model is not too realistic since it is natural to suppose that
first the brain acquires a disease independent of the spoken speech so that the
corresponding distorted parameters θ have nothing to do with the spoken speech
but could depend on the noiseless speech that the computer has stored. One
may also assume that this MRI data is independent of the noiseless speech but
the spoken speech depends on the MRI data and the computer stored noiseless
speech. In this case, a state and measurement model would be
∂t M(t, x, y|θ) = Σ_{k,m=0}^{p} akm(θ)∂x^k ∂y^m M(t, x, y|θ) + w(t, x, y)
dz(t) = dz0(t) + (∫ φ(t, x, y)M(t, x, y|θ)dxdy)dt + dv(t)
The role of the bispectrum in modeling brain disease: Brain disease is usually
manifested by the generation of nonlinear mechanisms which distort the speech
data. So, if M (t, x, y|θ) is the MRI image field data, then the spoken speech
z(t) would satisfy an sde of the form
dz(t) = (δ. ∫ f(M(t, x, y|θ), z0(t))φ(t, x, y)dxdy)dt + dz0(t) + dv(t)
Now, expanding f in a Taylor series about z0 = 0,
y(t) = Σ_{k≥0} (∫ φ(t, x, y)f^{(k)}(M(t, x, y|θ), 0)dxdy)z0(t)^k /k!
[20] Historic remark: It was Kalman who created the theory of real time/time
recursive estimation of the state of a dynamical system. His theory was however
based on linear state and measurement models. It was only after G. Kallianpur
and Kushner generalized Kalman's work to include nonlinear state and mea-
surement models that engineers were able to solve a variety of signal estimation
problems where the objective was to do real time processing. Kallianpur's
work was very mathematical involving measure valued random processes but
engineers were able to simplify it and obtain approximate implementable al-
gorithms. The Kushner-Kallianpur filter is an infinite dimensional filter as it
talks about how to estimate the conditional pdf of a signal given measurements
recursively. The EKF is a finite dimensional approximation to this.
Reference: Vijay Upreti, Sagar, Vijyant Agrawal and Harish Parthasarathy,
paper communicated.
If the brain has no disease, θ = 0 and if further there is no noise, S(t) = S0 (t), ie,
in this case there is no slurring of the speech so that the spoken speech coincides
with the computer recorded speech. The MRI dynamics also depends upon the
brain parameters θ and the computer recorded speech S(t):
g(t + 1) = Σ_{k=1}^{p} θ(k)Dk g(t) + BS0(t) + Wg(t + 1)
and reverberations within the brain caused by signals coming from neurons connected
to other parts of the body, or disturbances caused by changes in the positions
and movement of the measuring apparatus. By incorporating such disturbance
terms in the dynamical model, we can design a disturbance observer that will
provide real time estimates of the disturbance and hence subtract this distur-
bance estimate from the dynamical model.
[22] Stochastic instability of a stable system caused by small random spikes,
analysis of the probability of diffusion exit from a domain using large deviation
theory. Application of the diffusion exit problem to computing the probability of
exit of the EKF for MRI data from the stability domain. Consider a dynamical
system
x[n + 1] = f(x[n]) + √ε w[n + 1]
where x[n] ∈ R^p and w[n] is an iid N(0, σ²Ip) sequence. Let x0 be a fixed
point of the noiseless dynamical system, ie,
f(x0) = x0
Assume that G is a connected open set containing x0 with the property that if
x(t) is the trajectory of the noiseless system with x(0) ∈ G, then limt→∞ x(t) =
x0 . In other words, x0 is an asymptotically stable fixed point. Then the
large deviation principle implies that the probability of the trajectory of the
noisy system exiting G for the first time at n = N is given for small values
of ε by the formula exp(−V(N)/ε), where V(N) equals the minimum of
(2σ²)^{−1} Σ_{n=0}^{N−1} ‖x[n + 1] − f(x[n])‖²
over all those trajectories {x[n] : n = 0, 1, ..., N} for which x[0] = x0,
x[n] ∈ G, 1 ≤ n ≤ N − 1 and x[N] ∉ G. It follows easily from this result
that if τ denotes the first time of exit of the noisy system from G, then for small ε,
E(τ) ≈ exp(V̄/ε)
where
V̄ = inf_N V(N)
The precise result is that
lim_{ε→0} ε.log(E(τ)) = V̄
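The exit-time asymptotics can be illustrated by direct simulation; the map f(x) = 0.5x, the domain G = (−1, 1) and the two noise levels below are arbitrary illustrative choices, not from the text.

```python
import math, random

# Monte-Carlo illustration of the diffusion-exit scaling E(tau) ~ exp(Vbar/eps):
# for the stable scalar map x -> 0.5 x driven by sqrt(eps) w[n] noise, the mean
# first-exit time from G = (-1, 1) grows rapidly as eps decreases.
random.seed(1)

def mean_exit_time(eps, trials=400, cap=50000):
    total = 0
    for _ in range(trials):
        x, n = 0.0, 0
        while abs(x) < 1.0 and n < cap:
            x = 0.5 * x + math.sqrt(eps) * random.gauss(0.0, 1.0)
            n += 1
        total += n
    return total / trials

t_big = mean_exit_time(0.3)      # larger noise: quick exit
t_small = mean_exit_time(0.15)   # smaller noise: much slower exit
print(t_big, t_small)
```

Halving ε should roughly square the mean exit time, in line with the exp(V̄/ε) law; the simulation only exhibits the monotone trend, not the exact constant.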
S(t + 1) = S0(t + 1) + Σ_{k=1}^{p} θ(k)Ck S(t) + Ws(t) + ds(t + 1)
g(t + 1) = Σ_{k=1}^{p} θ(k)Dk g(t) + BS0(t) + Wg(t + 1) + dg(t + 1)
or, in block form,
[S(t + 1); g(t + 1)] = Σ_{k=1}^{p} θ(k)[Ck, 0; 0, Dk][S(t); g(t)] + [ds(t + 1); dg(t + 1)]
or equivalently, writing
X(t) = [S(t); g(t)], Fk = [Ck, 0; 0, Dk],
as
X(t + 1) = Σ_{k=1}^{p} θ(k)Fk X(t) + W(t + 1) + d(t + 1) = FX(t) + d(t + 1) + W(t + 1)
Disturbance observer:
d̂(t + 1) = d̂(t) + L(t)(d(t + 1) − d̂(t))
= d̂(t) + L(t)(X(t + 1) − Σ_{k=1}^{p} θ̂(k, t)Fk X(t) − d̂(t))
Then,
d̂(t + 1) = z(t + 1) + PX(t + 1) = d̂(t) + L(FX(t) − d̂(t) + L^{−1}PX(t + 1))
Thus,
E(z) = D(z) − D̂(z) = [(1 − z)(L − 1)/(z + L − 1)]D(z) − (z/(z + L − 1))W (z)
It is clear then from the final value theorem of Z-transform theory that if (z −
1)²D(z) → 0 as z → 1, which is equivalent to d(n + 1) − d(n) → 0, then the
disturbance estimation error converges to a finite limit determined by the noise,
provided that lim w(n) exists. In the special case of zero noise, the disturbance
estimation error converges to zero.
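A scalar sketch of the disturbance observer makes the convergence claim concrete; the gain L, the system coefficient F and the constant disturbance below are illustrative values, and with zero noise the error contracts by the factor (1 − L) at every step.

```python
# Scalar disturbance observer
#   dhat(t+1) = dhat(t) + L (X(t+1) - F X(t) - dhat(t))
# applied to X(t+1) = F X(t) + d(t+1) + W(t+1). With zero noise and a constant
# disturbance, the innovation equals d - dhat, so the error decays like (1-L)^t.

F = 0.7          # known system coefficient (scalar stand-in for the matrix F)
L = 0.4          # observer gain; need 0 < L < 2 for stability
d_true = 1.5     # constant disturbance, zero process noise

X = 0.0
dhat = 0.0
for t in range(200):
    X_next = F * X + d_true                  # noiseless dynamics
    dhat = dhat + L * (X_next - F * X - dhat)
    X = X_next
print(abs(dhat - d_true))                    # ~ (1-L)^200, essentially zero
```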
[25] Time varying spectrum and bispectrum estimation from the EKF: Con-
sider an AR time series model satisfied by the speech-MRI vector process X(t) =
[s(t), g(t)T ]T :
we get
Y (t + 1) = A(a)Y (t) + bd(t) + W (t)
where A is a matrix dependent upon the AR parameters a. Estimation of a,Y(t)
using an EKF gives us â(t). If we assume that a varies slowly and is therefore
almost constant over time slots of duration L, we get the spectral and bispectral
estimates of X(t) over each slot as
where Σw is the power spectral density matrix of W. More generally, we can al-
low the AR parameters a(k) to be matrices and then the power spectral estimate
of X(t) would be given by
How to incorporate wavelet based state and parameter estimation into the EKF.
X(t + 1) = F (θ)X(t) + W (t + 1)
Can this parameter estimation be made time recursive? After a long duration
has elapsed, it is clear that if the signal is transient, its wavelet transform
taken upto time t will not change significantly with time and hence if we have
an apriori idea about what dominant frequencies are present in the signal over
each time slot, we need retain only those wavelet coefficients of the signal having
resolution indices corresponding to these dominant frequencies. This knowledge
therefore saves us from storing excessive data during the estimation process. For
example, suppose that we solve the above state variable system as
X(t) = Σ_{k=0}^{t−1} F(θ)^{t−k−1}W(k), t = 0, 1, 2, ...
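A quick consistency check, in the scalar case, that this closed-form solution matches the recursion X(t + 1) = F X(t) + W(t) started from X(0) = 0 (the value of F and the noise sequence are arbitrary, and the noise is indexed from 0 here):

```python
import random

# Verify X(t) = sum_{k=0}^{t-1} F^{t-k-1} W(k) against the recursion
# X(t+1) = F X(t) + W(t), X(0) = 0, in the scalar case.
random.seed(2)

F = 0.9
W = [random.gauss(0, 1) for _ in range(50)]

X_rec = [0.0]                                   # recursion
for t in range(49):
    X_rec.append(F * X_rec[-1] + W[t])

X_cf = [sum(F ** (t - k - 1) * W[k] for k in range(t))  # closed form
        for t in range(50)]

err = max(abs(a - b) for a, b in zip(X_rec, X_cf))
print(err)
```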
We take the wavelet transform over the time slot t ∈ [rL, (r + 1)L − 1] retaining
only those coefficients which fall in the index range Dr and then use this data for
different slot numbers r to estimate θ. However, this is not a recursive parameter
estimation scheme. Again, suppose that we have a dynamically varying three
dimensional signal field X(t, x, y, z) = X(t, r). Suppose that its dynamics is
described by a pde
or equivalently,
∂t WX(t, n) = Σ_m WX(t, m) ∫ ψm(r)L(θ)*ψn(r)d³r + noise
or equivalently,
∂t WX(t, n) = Σ_m a(n, m, θ)WX(t, m) + noise
where
a(n, m, θ) = ∫ ψn(r)L(θ)ψm(r)d³r
We require to store this wavelet domain dynamical information only for a few
dominant wavelets in order to estimate θ efficiently, and the resulting estimation
algorithm can then be carried out with lower computational complexity.
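The storage-saving idea can be illustrated with a hand-rolled orthonormal Haar transform: transform the signal, keep only the dominant coefficients, and invert. The signal and the number of retained coefficients below are illustrative choices, not taken from the text.

```python
import math

# Keep only the dominant coefficients of an orthonormal Haar wavelet transform.

def haar(v):
    # full orthonormal Haar decomposition of a length-2^k list
    out = list(v)
    n = len(out)
    while n > 1:
        half = n // 2
        tmp = [(out[2*i] + out[2*i+1]) / math.sqrt(2) for i in range(half)] + \
              [(out[2*i] - out[2*i+1]) / math.sqrt(2) for i in range(half)]
        out[:n] = tmp
        n = half
    return out

def ihaar(c):
    # inverse of haar()
    out = list(c)
    n = 1
    while n < len(out):
        tmp = []
        for i in range(n):
            tmp.append((out[i] + out[n+i]) / math.sqrt(2))
            tmp.append((out[i] - out[n+i]) / math.sqrt(2))
        out[:2*n] = tmp
        n *= 2
    return out

N = 64
x = [math.sin(2 * math.pi * t / N) for t in range(N)]
c = haar(x)
# retain only the 8 largest-magnitude coefficients ("dominant wavelets")
thresh = sorted((abs(v) for v in c), reverse=True)[7]
c_kept = [v if abs(v) >= thresh else 0.0 for v in c]
x_rec = ihaar(c_kept)
err = max(abs(a - b) for a, b in zip(x, x_rec))
print(sum(1 for v in c_kept if v != 0.0), err)
```

Storage drops from 64 numbers to roughly 8, at the price of a small reconstruction error; this is the trade-off the paragraph above describes.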
Chapter 6
Electromagnetism and
Quantum Field Theory
[1] Stochastic analysis of electromagnetic wave propagation in curved
space-time
Problem statement: The background metric is gμν(x). The permittivity-
permeability tensor is ε^{μν}_{αβ}(x); this is assumed to be a Gaussian random field of
the form
δ^μ_α δ^ν_β + δ.χ^{μν}_{αβ}(x)
where χ^{μν}_{αβ}(x) is a zero mean Gaussian field with known correlations
E[χ^{μν}_{αβ}(x)χ^{μ′ν′}_{α′β′}(x′)]
The aim is to solve for the electromagnetic potentials upto O(δ 2 ) and express its
correlation as well as the correlation of the antisymmetric electromagnetic field
tensor upto O(δ 2 ) terms. The next problem is to consider in addition to such a
random permittivity-permeability tensor, the condition that the metric tensor of
the gravitational field is also a small zero-mean random Gaussian perturbation
of a non-random background metric tensor:
gμν(x) = g^{(0)}_{μν}(x) + δgμν(x)
[2] Let T be the planar region
T = R1 ∪ R2
where
R1 = {(x, y) : 0 ≤ x ≤ a, 0 ≤ y ≤ b}
R2 = {(x, y) : a ≤ x ≤ c, −d ≤ y ≤ b + d}
determine the electromagnetic fields within the box having T as its upper sur-
face at z = L/2 and the same surface at z = −L/2 with all the walls being
perfectly conducting surfaces. For doing this problem, you may regard this
cavity resonator as the union of two rectangular boxes, determine the fields
within each rectangular box assuming standard formulae for rectangular cavity
resonators and then apply the continuity condition on the interface between
the two rectangular boxes.
[3] Various kinds of feed structures are discussed by the author. In any case,
once the probe shape that feeds into the cavity antenna is known, the Maxwell
equations within the cavity must be supplemented apart from the boundary wall
condition with the boundary condition on the probe surface. Specifically, the
tangential magnetic field on the probe surface should equal the surface current
density on it. I would like to see a satisfactory reply to this at the time of the
viva-voce exam.
[4] The following typical one dimensional example may be used to explain
how resonance occurs when there is a transition in the permittivity from within
the rdra to outside. Assume that the region 0 ≤ x ≤ a has permittivity ε1 while
the regions x < 0 and x > a have permittivity ε2. Thus the one dimensional
Helmholtz equations in these regions are (a) ψ″(x) + k1²ψ(x) = 0, 0 ≤ x ≤ a,
(b) ψ″(x) + k2²ψ(x) = 0, x < 0 or x > a, where k1² = ω²ε1μ0, k2² = ω²ε2μ0. The
solutions in the three regions, taking into account the absence of reflection for x < 0
and x > a, are given by
[5] The Helmholtz equations for Ez within and outside the rdra, when the dielectric
within has permittivity ε1 and that outside has permittivity ε2, are given by
outside the boundary where the z dependence is exp(±γz). Note that γ has to
be the same for the fields within and outside in view of the continuity of Ez at
the walls. Here
∇⊥² = ∂²/∂x² + ∂²/∂y²
and
k1² = ω²ε1μ0, k2² = ω²ε2μ0
Since Ez vanishes at the top and bottom surfaces, we must have
γ = jπp/d
and outside as
ψ(x, y) = (C1.cos(β1x) + C2.sin(β1x)).(D1.cos(β2y) + D2.sin(β2y))
II:
a3 < x < a3 + a4, a2 < y < a2 + a5,
The interface between these two regions is
y = a2, a3 < x < a3 + a4
Ez(t, x, y, z) = Σ_{mnp} sin(mπ(x − a3)/a4)sin(nπ(y − a2)/a5)cos(pπz/d) Re(cII(mnp)exp(jωII(mnp)t))
where
ωI(mnp)²/c² = (mπ/a1)² + (nπ/a2)² + (pπ/d)²,
ωII(mnp)²/c² = (mπ/a4)² + (nπ/a5)² + (pπ/d)²
It is clear that Ez in both the regions vanishes at the interface and so does
∂Ez /∂x. Thus to ensure continuity of all the components of the em field at
the interface, we require that ∂Ez /∂y should be continuous at the interface.
Since this is only an approximate analysis, we shall require that the coefficients
cI (mnp) and cII (mnp) be such that if we denote by EzI and EzII the above
expressions for the z-components of the electric field in the two regions, then
∫ |∂EzI(t, x, a2, z)/∂y − ∂EzII(t, x, a2, z)/∂y|²dxdzdt
be a minimum subject to other constraints given by for example the feed points.
The region of integration is the same as that in the above discussed T M case.
Now consider another example in which there are three connected rectangu-
lar patches given by the regions (0 < x < a1 , 0 < y < a2 ), (a3 < x < a3 +a4 , a2 <
y < a2 + a5 ) and (a6 < x < a6 + a7 , a2 + a5 < y < a2 + a5 + a8 ).
[5] The Abelian and non-Abelian anomalies in effective quantum
field theories
D = ∂ + ieA(x)
ie,
γ.D = γ.∂ + ieγ.A(x) = γ μ ∂μ + ieγ μ Aμ (x)
Here, we are considering the Abelian U (1) anomaly, ie, the field considered is the
electromagnetic field. Note that f ((γ.D)2 /M 2 ) is an operator that converges
to the identity operator as M → ∞. So its trace converges to the trace of the
identity, ie, ∫ δ⁴(x − x)d⁴x, as M → ∞. We find that for large M
and
T r((γ.D)2 ) = 0
effectively, since
(γ.D)² = γ^μγ^ν(∂μ + iAμ)(∂ν + iAν)
= (1/2)□ − (1/2)A² + iγ^μγ^ν(2Aν∂μ + iAν,μ)
Now exp((1/2)Tr(□)) contributes a field independent constant to the factor
by which the path measure is to be multiplied after the chiral transformation,
while exp(−(1/2)Tr(A²)) may also be taken as unity after an appropriate gauge
transformation of the electromagnetic four potential. Also Tr(∂μ) = 0 since ∂μ
is a skew-Hermitian operator. Finally,
iTr(γ^μγ^ν ∫ Aν,μ(x)d⁴x)
is zero since
γ^μγ^ν Aν,μ = (2η^{μν} − γ^νγ^μ)Aν,μ = 2Aν,ν − γ^νγ^μAν,μ
so that after choosing a gauge so that Aν,ν = 0, we get
Tr(γ^μγ^ν Aν,μ) = 0
Thus, the only term that effectively contributes to the anomaly is the fourth
degree term f″(0)Tr((γ.D)⁴)/2M⁴ and the non-zero part of this evaluates to
(f″(0)/2M⁴) ∫ Tr(γ.∂ + iγ.A)⁴d⁴x = (f″(0)/2M⁴) ∫ Tr(γ^μγ^ν Aμ,ν)²d⁴x
Now,
γ μ γ ν Aμ,ν = −γ ν γ μ Aμ,ν
(since Aμ,μ = 0) and thus,
When the infinitesimal symmetry transformation becomes finite, this calculation
amounts to a change in the action by the quantity
∫ ε(μναβ)Fμν(x)Fαβ(x)d⁴x
where
ε(μναβ) = Tr(γ^μγ^νγ^αγ^β)
The non-Abelian anomaly: Here, the gauge group is SU(N) with N > 1. Let
ta, a = 1, 2, ..., N² − 1 be Hermitian generators of this gauge group and take
into account chirality, ie, the Lie algebra is a direct sum of (1 + γ5)su(N) and
(1 − γ5)su(N). The factor defining the change in the measure now gets modified
to
T r(f ((γ.D)2 /M 2 ))
or equivalently as
T r((γ.D)4 )
where
γ.D = γ.∂ + iγ.A
where
A(x) = ta Aa (x)
ie
γ.A(x) = γ μ ta Aaμ (x)
By γ μ ta we mean γ μ ⊗ ta and we get by a similar calculation as for the Abelian
case,
−i(γ.D)2 = −i(γ.∂ + iγ.A)2
= [γ.∂.γ.A + γ.A.γ.∂] + i(γ.A)2
so that effectively (ie, after neglecting factors that have zero trace),
[γ.∂.γ.A + γ.A.γ.∂]² =
γ.A.(γ.∂)².γ.A
+ γ.∂.γ.A.γ.∂.γ.A
+ γ.A.γ.∂.γ.A.γ.∂
+ γ.∂.(γ.A)².γ.∂
Taking the trace,
−Tr((γ.D)⁴) = iTr((A²)²) + iTr(AμA^μ.[γ.∂.γ.A + γ.A.γ.∂])
+ Tr[(γ.A)²(γ.∂)²] + 2Tr[γ.A.γ.∂.γ.A.γ.∂]
Dμ = ∂μ + iAμ , Aμ = Aaμ ta
(γ.D)2 = γ μ γ ν Dμ Dν
= (2η μν − γ ν γ μ )Dμ Dν
= 2D2 − [γ ν γ μ [Dμ , Dν ] + γ ν γ μ Dν Dμ ]
= 2D2 − γ ν γ μ Fμν − γ μ γ ν Dμ Dν
Therefore,
2(γ.D)2 = 2D2 − γ ν γ μ Fμν
and we get on squaring this followed by taking the trace,
4Tr((γ.D)⁴) = ∫ Tr((γ^νγ^μFμν(x))²)d⁴x, effectively, and this
= ∫ ε(μναβ)Tr(Fμν(x).Fαβ(x))d⁴x
Note that
D² = (∂μ + iAμ).(∂^μ + iA^μ) = □ − A² + i(2A^μ(x)∂μ)
so
Tr(A^μA^ν∂μ∂ν) = C ∫ A²(x)d⁴x
which is a pure gauge term and can be made to vanish by an appropriate choice
of the gauge. Again,
Tr(A^μAν,μ∂^ν) = ∫ A^μ(x)Aν,μ(x)δ,ν(0)d⁴x = 0
as follows. Assume
ψ(x) = u(x).exp(−ax2 /2)
where a is determined so that the coefficient of x2 cancels out in the differential
equation for u(x). Then, write down an infinite series expansion for u(x) and
show that if the infinite series does not terminate with some term being zero,
then ψ(x) = exp(−ax2 /2).u(x) explodes as |x| → ∞ and hence cannot be
square integrable. Show that the infinite series terminates to a polynomial iff
E = (n + 1/2)ħω for some non-negative integer n.
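The series argument can be checked mechanically. Substituting ψ = u(x)exp(−x²/2) into the dimensionless Schrödinger equation leads, in the standard Hermite form (an assumed normalization with ħω set to 1), to the recurrence c_{k+2} = c_k(2k + 1 − 2E)/((k + 1)(k + 2)), which terminates iff E = n + 1/2:

```python
# Hermite-type recurrence for the coefficients of u(x):
#   c_{k+2} = c_k * (2k + 1 - 2E) / ((k+1)(k+2))
# The series terminates (u is a polynomial) iff E = n + 1/2.

def series_coeffs(E, kmax=20, even=True):
    c = [0.0] * (kmax + 1)
    k = 0 if even else 1
    c[k] = 1.0
    while k + 2 <= kmax:
        c[k + 2] = c[k] * (2 * k + 1 - 2 * E) / ((k + 1) * (k + 2))
        k += 2
    return c

terminating = series_coeffs(2.5)       # E = 2 + 1/2: coefficients vanish from c4 on
nonterminating = series_coeffs(2.3)    # generic E: the series never terminates
print(terminating[2:8])
print(nonterminating[2:8])
```

For E = 2.5 every coefficient from c4 onwards is exactly zero (u is the n = 2 Hermite polynomial up to scale), while for E = 2.3 the coefficients never vanish, which is the non-normalizable case.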
[17] Computation of the photon and electron propagators using the Green’s
function satisfying an inhomogeneous pde driven by the four dimensional Dirac
delta function.
[18] Spectral function sum rules: From general principles of Lorentz invari-
ance, the form of the propagator of any scalar field can be expressed as an
integral of the propagator of a particle of mass μ over all possible masses μ
with respect to a measure on the space of masses μ. This is also called the
Kallen-Lehmann representation.
[19] Notion of the quantum effective action derived from the Legendre trans-
form of the logarithm of the path integral for a field in the presence of interaction
between a current field J and the field φ. Properties of the quantum effective
action like its invariance under a gauge transformation when the original action
is invariant under the gauge transformation. How to express the quantum equa-
tions of motion in terms of the quantum effective action. How to compute the
quantum effective action for superconductivity as a function of the gap function
and the external magnetic vector potential and from this function, how to derive
the basic properties of a superconductor.
[20] The ADM action for the gravitational field. How to express the Einstein-
Hilbert action in terms of canonical ADM position and momentum fields. How
to quantize the gravitational field by introducing commutation relations for the
canonical position and momentum fields in the ADM action. While introducing
the canonical commutation relations, we should bear in mind that the ADM
Hamiltonian has constraints like some of the momenta vanish and hence one
should use the Dirac bracket rather than the Poisson/Lie bracket for the com-
mutation rules.
[21] Proof of the general fact that spontaneous symmetry breaking leads to
the generation of massless Goldstone bosons and that the addition of a sym-
metry breaking action to the original invariant action leads to the generation
of particles having nonzero masses. In other words, approximate symmetry
breaking gives masses to massless particles and this has applications to the elec-
troweak theory which explains not only how the gauge bosons that propagate
nuclear forces acquire masses but also how the electron acquires mass when the
matter and gauge fields get coupled to the scalar Higgs field.
[22] The thermal emission of particles from a blackhole via Hawking radiation
and how this phenomenon can be used to communicate from within the critical
radius to outside it by the use of entangled states. First we start with a state
on the tensor product between the Hilbert spaces for fields with support within
the critical radius and without. Then by using local operations combined with
classical communication, we can generate an entangled state between the interior
and exterior of a blackhole using which we can transmit quantum states from
inside to the outside of the blackhole and vice-versa. By local operations, we
mean the following: Let Ha , a = 1, 2 be two Hilbert spaces and let H = H1 ⊗ H2
be their tensor product. Let ρ be a state in H. I choose a POVM {Mk } acting
in H1 . After applying this measurement to ρ, I note the outcome, say a. Then
This formula means that (Σ_m tnm φ0m)n is an eigenvector of the mass matrix
with zero eigenvalue. Thus, spontaneous symmetry breaking which occurs when
the physics is viewed from the ground state leads to the generation of massless
particles, called massless Goldstone Bosons.
Remark: Let
exp(W(J)) = ∫ exp(iI(φ) + iJ.φ)Dφ
where
J.φ = ∫ Jφd⁴x
We have
δW(J)/δJ(x) = i < φ(x) >J = i ∫ exp(iI(φ) + iJ.φ)φ(x)Dφ / ∫ exp(iI(φ) + iJ.φ)Dφ
Let
< φ >J = ψ
Then define
iV (ψ) = ExtK (iK.ψ − W (K))
The extremum is attained when
iψ(x) = δW(K)/δK(x)
Also,
δ²W(J)/δJ(x)δJ(y) = iδ < φ(x) >J /δJ(y) = −ΔJ(x, y)
where ΔJ is the propagator at current J, and hence,
ΔJ(x, y) = −δ²W(J)/δJ(x)δJ(y)
Now if
< φ >J = ψ
we get, or equivalently,
−ΔJ = δψ/δJ = (δJ/δψ)^{−1}|ψ=0 = (δ²V(ψ)/δψ ⊗ δψ)^{−1}|ψ=0
In other words, the Hessian of the quantum effective potential is the negative
of the inverse of the propagator kernel evaluated at zero mean field.
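A finite-dimensional caricature makes this duality concrete (this is an illustrative sketch, not the field-theoretic statement itself): for a quadratic W(J) = (1/2)J^T C J, the Legendre transform V(ψ) = sup_J (J.ψ − W(J)) has Hessian C^{−1}, i.e. the Hessian of the effective potential is the inverse of the Hessian of W, which plays the role of the propagator.

```python
# Check numerically that the Hessian of the Legendre transform of a quadratic
# W(J) = (1/2) J^T C J is the inverse of C (the "propagator").

C = [[2.0, 0.5], [0.5, 1.0]]        # stand-in for the propagator kernel

def W(J):
    CJ = [C[0][0]*J[0] + C[0][1]*J[1], C[1][0]*J[0] + C[1][1]*J[1]]
    return 0.5 * (J[0]*CJ[0] + J[1]*CJ[1])

def V(psi):
    # Legendre transform sup_J (J.psi - W(J)) by gradient ascent
    J = [0.0, 0.0]
    for _ in range(2000):
        g = [psi[0] - (C[0][0]*J[0] + C[0][1]*J[1]),
             psi[1] - (C[1][0]*J[0] + C[1][1]*J[1])]
        J = [J[0] + 0.1*g[0], J[1] + 0.1*g[1]]
    return J[0]*psi[0] + J[1]*psi[1] - W(J)

def hessV(i, j, h=1e-3, p=(0.3, -0.2)):
    # central second difference; V is quadratic, so the base point is arbitrary
    def shift(si, sj):
        q = [p[0], p[1]]
        q[i] += si * h
        q[j] += sj * h
        return q
    return (V(shift(1, 1)) - V(shift(1, -1))
            - V(shift(-1, 1)) + V(shift(-1, -1))) / (4 * h * h)

H = [[hessV(0, 0), hessV(0, 1)], [hessV(1, 0), hessV(1, 1)]]
print(H)   # close to C^{-1} = [[1, -0.5], [-0.5, 2]] / 1.75
```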
Then
dYo (t) = dYi (t) + U (t)∗ (c̄L∗2 dt + cSdA
Application of the quantum Ito formula gives
dYo(t) = jt(cL2 + c̄L2*)dt + jt(c + bL2* + cS)dA + jt(c̄ + b̄L2 + c̄S*)dA* + jt(b + bS* + b̄S)dΛ
Clearly,
P [1] = cL2 + c̄L∗2 , Q[1] = c + bL∗2 + cS, R[1] = c̄ + b̄L2 + c̄S ∗ , S[1] = b + bS ∗ + b̄S
gives us
and likewise for H⊥ (t, r). It follows that the total energy of the em field within
the cavity resonator can be expressed as
U(t) = ∫₀^a ∫₀^b ∫₀^d [(ε/2)|E(t, r)|² + (μ/2)|H(t, r)|²]d³r
and its time average is the same as its instantaneous value in view of the or-
thogonality of the modes, ie the energy within the guide is a constant:
U(t) = < U > = lim_{T→∞} T^{−1} ∫₀^T U(t)dt = Σ_{mnp} [α(mnp)|cmnp(0)|² + β(mnp)|dmnp(0)|²]
= Σ_{mnp} [α(mnp)|cmnp(t)|² + β(mnp)|dmnp(t)|²]
In the case of a cavity with arbitrary curvilinear cross-section, the form of the
energy is the same:
U(t) = < U > = Σ_n [α(n)|cn(t)|² + β(n)|dn(t)|²]
Note that taking the cavity quantum Hamiltonian H = U , we get on using the
commutation relations
It should be noted that in a cavity of arbitrary cross sectional shape, the char-
acteristic frequencies of oscillation for the T E and T M modes are different
owing to the different kinds of boundary conditions involved. Specifically, in
the T M case, the boundary condition is of the Dirichlet type, which follows
from the condition that Ez vanish on the boundary, while in the T E case,
the boundary condition is of the Neumann type, which follows from the condi-
tion that the normal derivative of the longitudinal magnetic field, ∂Hz/∂n̂,
vanishes on the boundary. Now suppose that the cavity is placed within a
noisy bath described by the annihilation, creation and conservation processes
An (t), An (t)∗ , Λn (t), n = 1, 2, .... The magnetic vector potential of the em field
within the guide has the form
Asys(t, r) = 2Re Σ_n [cn(t)ψsys,n(r) + dn(t)χsys,n(r)]
= Σ_n [cn(t)ψsys,n(r) + cn(t)*ψsys,n(r)* + dn(t)χsys,n(r) + dn(t)*χsys,n(r)*]
and hence if there are photocells in the system whose noise current is propor-
tional to the number of bath photons, then this noise current will in turn gener-
ate a noisy em field within the cavity and bath whose corresponding magnetic
vector potential has the form
Aph(t, r) = Σ_n Λn(t)ψph,n(r)
The total electric field within the system is the sum of the system and bath
components:
E(t, r) = Esys (t, r) + Ebath (t, r) =
−∂Asys (t, r)/∂t − ∂Abath (t, r)/∂t − ∂Aph (t, r)/∂t
and likewise for the magnetic field:
B(t, r) = Bsys (t, r)+Bbath (t, r) = ∇×Asys (t, r)+∇×Abath (t, r)+∇×Aph (t, r)
The total em field energy of the system and bath is then given by
It is the component H(t) = Usys + Uint (t) which affects the system dynamics
and this concerns us here. It has the form
H(t) = Σ_n [α(n)|cn(t)|² + β(n)|dn(t)|²]
+ Σ_{n,m} (L1(n, m)cn(t) + L2(n, m)cn(t)* + L3(n, m)dn(t) + L4(n, m)dn(t)*)Am(t)
+ Σ_{n,m} (L5(n, m)cn(t) + L6(n, m)cn(t)* + L7(n, m)dn(t) + L8(n, m)dn(t)*)Am(t)*
+ Σ_{n,m} (L9(n, m)cn(t) + L10(n, m)cn(t)* + L11(n, m)dn(t) + L12(n, m)dn(t)*)Λm(t)
out of the surface of a sphere of radius R centred at the origin both directly and
using Gauss’ divergence theorem. Prove that these two expressions agree. You
may use the following formula for the divergence in spherical polar coordinates:
div F = (1/r²)∂(r²Fr)/∂r + (1/(r.sin(θ)))∂(sin(θ)Fθ)/∂θ + (1/(r.sin(θ)))∂Fφ/∂φ
You may also use the expression dS = R2 sin(θ)dθ.dφ for the surface element
on the spherical surface.
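The exercise can be carried out numerically for a simple concrete field; the choice F = r r̂ below (Fr = r, Fθ = Fφ = 0) is an illustrative example, not one given in the text. Here div F = (1/r²)∂(r³)/∂r = 3, so the divergence theorem predicts a flux of 3·(4/3)πR³ = 4πR³, which we verify by direct surface integration using dS = R²sin(θ)dθdφ.

```python
import math

# Direct surface integral of F.rhat over the sphere of radius R for F = r rhat,
# compared against the divergence-theorem value 4 pi R^3.

R = 2.0
n = 200
dtheta = math.pi / n
dphi = 2 * math.pi / n
flux = 0.0
for i in range(n):
    theta = (i + 0.5) * dtheta               # midpoint rule in theta
    for j in range(n):
        # F.rhat on the surface equals F_r = R; dS = R^2 sin(theta) dtheta dphi
        flux += R * (R * R * math.sin(theta)) * dtheta * dphi

print(flux, 4 * math.pi * R ** 3)
```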
[3] Two concentric rings of radii a < b exist in the plane with centre at the
origin. If the potential on the inner ring is V1 (φ) and that on the outer ring
is V2 (φ) with 0 ≤ φ < 2π, then by solving Laplace’s equation in the annulus
a < ρ < b and applying the boundary conditions, calculate the potential in this
annulus. Express your solution in the form:
V(ρ, φ) = ∫₀^{2π} [K1(ρ, φ, φ′)V1(φ′) + K2(ρ, φ, φ′)V2(φ′)]dφ′
[5] Write down the general equation of continuity, Gauss' law and Ohm's law
in terms of ρ, J, E, σ, ε. By manipulating these, show that ρ decays with time
as exp(−σt/ε). Using this formula, define and determine the relaxation time.
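A worked instance of the relaxation-time result: since ρ(t) = ρ(0)exp(−σt/ε), the relaxation time is τ = ε/σ. For copper, taking the textbook approximations σ ≈ 5.8×10⁷ S/m and ε ≈ ε0, τ comes out of order 10⁻¹⁹ s, so free charge in a good conductor migrates to the surface almost instantaneously.

```python
# Relaxation time tau = eps/sigma from rho(t) = rho(0) exp(-sigma t / eps).

eps0 = 8.854e-12        # F/m, permittivity of free space
sigma_cu = 5.8e7        # S/m, approximate conductivity of copper
tau = eps0 / sigma_cu
print(tau)              # ~1.5e-19 seconds
```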
[2] Let D denote the cross sectional region in the xy plane of a cylindrical
waveguide with axis along the z-axis. Choose an orthogonal curvilinear coor-
dinate system (q1 , q2 ) in the xy plane and express the transverse components
of the electric and magnetic field in the q1 − q2 system in terms of the first
order partial derivatives of their z-components w.r.t q1 and q2 . Formulate the
Helmholtz equations for Ez , Hz with appropriate boundary conditions in the
q1 − q2 system. Assume now that the boundary of the guide is q1 = constant
and explain how by drawing a triangular grid in the q1 − q2 system and apply-
ing the finite element method combined with the Dirichlet boundary conditions
on Ez and the Neumann boundary conditions on Hz , you can approximately
determine the modal eigenvalues and eigenfunctions for these fields as a matrix
generalized eigenvalue problem.
Show that
r̂.Xlm = 0
Show that the Helmholtz equation
leads to an ode for f (r) parametrized by l and denote two of its linearly inde-
pendent solutions by fl (r), gl (r). Consider solving the Maxwell equations
Show that fl (r)Xlm (r̂) satisfies the Helmholtz equation and that this vector is
perpendicular to the radial direction. Thus, we can define a multipole transverse
electric field component as
Hlm = fl (r)Xlm(r̂)
Show then that the general solution for the radiation fields can be expressed as
E = Σ_{lm} [a(l, m)fl(r)Xlm(r̂) + b(l, m)(1/jωε)∇ × (fl(r)Xlm(r̂))]
H = Σ_{lm} [(−1/jωμ)a(l, m)∇ × (fl(r)Xlm(r̂)) + b(l, m)fl(r)Xlm(r̂)]
Using the orthogonality relations and eigenfunction relations for the spherical
harmonics, prove orthogonality relations for the vector spherical harmonics Xlm .
Using these orthogonality relations, explain how if you know r̂.E and r̂.H on the
surface of a sphere of radius R with centre at the origin, then you can compute
the coefficients a(l, m), b(l, m) in this multipole expansion.
[4] Explain via the Boltzmann kinetic transport equation, how a plasma
interacts with an electromagnetic field and derive approximate dispersion rela-
tions that describe the propagation of perturbations in the plasma distribution
function as well as those in the electromagnetic field.
where hμν is of the first order of smallness. The bath electromagnetic field is
described by an electromagnetic four potential of the form
Aμ(t, r) = ∫ Σ_s [a(k, s)ψμ(k, s, t, r) + a(k, s)*ψμ(k, s, t, r)*]d³k
+ Σ_m [Am(t)φμ(m, r) + Am(t)*φμ(m, r)* + Λm(t)χm(r)]
where a(k, s), a(k, s)∗ are the usual annihilation and creation operator fields in
momentum space within the bath satisfying the CCR
and Am (t), Am (t)∗ , Λm (t) are the annihilation, creation and conservation noise
processes in the bath satisfying the usual Hudson-Parthasarathy quantum Ito
formula:
dAm(t)dAn(t)* = δmn dt, (dΛm(t))² = dΛm(t),
dAm(t)dΛn(t) = δmn dAm(t), dΛm(t)dAn(t)* = δmn dAn(t)*
The quantum field operators a(k, s), a(k, s)∗ are assumed to commute with the
quantum noise operators Am (t), Am (t)∗ , Λm (t). The bath is assumed to be in
the following pseudo-coherent state:
|ψ(v), φ(u) >
defined so that the above quantum field and quantum noise operators have the
following action:
a(k, s)|ψ(v), φ(u) >= v(k, s)|ψ(v), φ(u) >,
< ψ(v′), φ(u′)|a(k, s)|ψ(v), φ(u) >= v(k, s) < ψ(v′), φ(u′)|ψ(v), φ(u) >=
v(k, s)exp(< v′|v >)exp(< u′|u >)
It should be noted that
< v′|v >= ∫ v′(k, s)*v(k, s)d³k, < u′|u >= ∫ u′m(t)*um(t)dt
with the summation over the repeated index s = 1, 2 being implied and
likewise over the repeated index m. We have therefore
< ψ(v′), φ(u′)|a(k, s)*|ψ(v), φ(u) >= v′(k, s)*.exp(< v′|v >)exp(< u′|u >)
and further,
Am(t)|ψ(v), φ(u) >= (∫₀^t um(t′)dt′)|ψ(v), φ(u) >,
so that
< ψ(v′), φ(u′)|Am(t)|ψ(v), φ(u) >= (∫₀^t um(t′)dt′).exp(< v′|v >).exp(< u′|u >)
< ψ(v′), φ(u′)|Am(t)*|ψ(v), φ(u) >= (∫₀^t u′m(t′)*dt′).exp(< v′|v >).exp(< u′|u >)
and finally,
< ψ(v′), φ(u′)|Λm(t)|ψ(v), φ(u) >= (∫₀^t u′m(t′)*um(t′)dt′)exp(< v′|v >).exp(< u′|u >)
where the summation over the repeated spin indices s1 , ..., sn is implied and
|0 > is the vacuum state while C(v) is a normalization constant. We have for
example
a(k, s) ∫ v(k1, s1)v(k2, s2) a(k1, s1)* a(k2, s2)* d³k1 d³k2 |0>
= ∫ v(k1, s1)v(k2, s2)([a(k, s), a(k1, s1)*] + a(k1, s1)* a(k, s)) a(k2, s2)* d³k1 d³k2 |0>
= ∫ v(k1, s1)v(k2, s2) δ_{ss1} δ³(k − k1) a(k2, s2)* d³k1 d³k2 |0>
+ ∫ v(k1, s1)v(k2, s2) a(k1, s1)*([a(k, s), a(k2, s2)*] + a(k2, s2)* a(k, s)) d³k1 d³k2 |0>
= v(k, s) ∫ v(k2, s2) a(k2, s2)* d³k2 |0> + v(k, s) ∫ v(k1, s1) a(k1, s1)* d³k1 |0>
= 2v(k, s) ∫ v(k1, s1) a(k1, s1)* d³k1 |0>
since
a(k, s)|0> = 0
This result is easily extended to give
a(k, s) ∫ v(k1, s1)...v(kn, sn) a(k1, s1)*...a(kn, sn)* d³k1...d³kn |0>
= n v(k, s) ∫ v(k1, s1)...v(k_{n−1}, s_{n−1}) a(k1, s1)*...a(k_{n−1}, s_{n−1})* d³k1...d³k_{n−1} |0>
where
Fμν (x) = Aν,μ (x) − Aμ,ν (x)
where
F μν = g μα g νβ Fαβ
by replacing
g μν ≈ ημν − hμν
where
hμν = ημα ηνβ hαβ
and the far field magnetic vector potential produced by this density is given by
A_a(ω, r) = K (exp(−jkr)/r) ∫ J_a(ω, r') exp(jk r̂·r') d³r'
We get
E(Ĵ_a(ω, r) Ĵ_b(ω', r')*) = ∫ R_{ab}(t − t', r − r') exp(−j(ωt − ω't')) dt dt'
Now suppose that an atom with a single electron is excited by this random
electromagnetic field.
The interaction Hamiltonian in the Dirac picture is given by
V (t) = e(α, A(t, r)) − eΦ(t, r) = −eαμ Aμ (t, r)
The transition probability in time [0, T ] from an initial state u(r) to a final state
v(r) is given upto O(e2 ) in perturbation theory by
P_T(u → v) = e² E[|∫_0^T <v|α^μ A_μ(t, r)|u> dt|²]
= e² ∫_{[0,T]²} <v(r)|α^μ|u(r)> <u(r')|α^ν|v(r')> E[A_μ(t, r)A_ν(t', r')] dt dt' d³r d³r'
where
A_μ(t, r) = K ∫ J_μ(x') δ((x − x')²) d⁴x'
where x = (t, r) and x2 = t2 − r2 with c = 1. Thus, the above formula for the
transition probability can be expressed in four dimensional notation as
P_T(u → v) =
K²e² ∫ <v(r1)|α^μ|u(r1)> <u(r2)|α^ν|v(r2)> E(J_μ(x1')J_ν(x2')) δ((x1 − x1')²) δ((x2 − x2')²) d⁴x1' d⁴x2' d⁴x1 d⁴x2
= K²e² ∫ <v(r1)|α^μ|u(r1)> <u(r2)|α^ν|v(r2)> R^J_{μν}(x1' − x2') δ((x1 − x1')²) δ((x2 − x2')²) d⁴x1' d⁴x2' d⁴x1 d⁴x2
[14] Problems on physics in a curved background metric
[1] Write down the Maxwell equations in the curved metric
where
h(r) = f (r) − 1
We note that
dr = (x.dx + y.dy + z.dz)/r
so that
Thus writing
x1 = x, x2 = y, x3 = z
we can write
dτ² = dt² − S²(t)(δ_{ab} + ε h_{ab}(r)) dx^a dx^b
where ε is a small perturbation parameter. The Maxwell equations can be expressed as
F_{μν} = A_{ν,μ} − A_{μ,ν},
(F^{μν} √(−g))_{,ν} = 0
We have
g_{00} = 1, g_{0a} = 0, g_{ab} = −S²(t)(δ_{ab} + ε h_{ab}(r))
Thus,
g^{00} = 1, g^{0a} = 0, g^{ab} = −S(t)^{−2}(δ_{ab} − ε h_{ab}) + O(ε²)
The Maxwell equations up to O(ε) terms read as follows:
F^{μν} = g^{μα} g^{νβ} F_{αβ}
F^{0a} = g^{ab} F_{0b} = −S^{−2} F_{0a} + ε S^{−2} h_{ab} F_{0b}
F^{ab} = g^{ak} g^{bm} F_{km} = S^{−4}(δ_{ak} − ε h_{ak})(δ_{bm} − ε h_{bm}) F_{km}
= S^{−4}(F_{ab} − ε(h_{ak}F_{kb} + h_{bk}F_{ak}))
g = −S⁶(1 + εh), h = h_{aa}
√(−g) = S³(1 + εh/2)
√(−g) F^{0a} = S³(1 + εh/2)(−S^{−2})(F_{0a} − ε h_{ab}F_{0b})
= −S(F_{0a} + ε(h F_{0a}/2 − h_{ab}F_{0b}))
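The first order determinant expansion used here, g = −S⁶(1 + εh) and √(−g) ≈ S³(1 + εh/2), can be spot-checked numerically; S, ε and the symmetric perturbation h_ab below are arbitrary test data, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
S, eps = 1.3, 1e-6
h = rng.standard_normal((3, 3))
h = (h + h.T) / 2                       # symmetric perturbation h_ab

# Metric: g_00 = 1, g_0a = 0, g_ab = -S^2 (δ_ab + ε h_ab)
g = np.zeros((4, 4))
g[0, 0] = 1.0
g[1:, 1:] = -S**2 * (np.eye(3) + eps * h)

exact = np.sqrt(-np.linalg.det(g))
first_order = S**3 * (1 + eps * np.trace(h) / 2)
print(abs(exact - first_order) < 1e-8)  # True: the residual is O(ε²)
```

Here tr(h) plays the role of h = h_aa in the text, and the discrepancy between the exact value and the first order expansion is of order ε².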
[15] A problem in electrodynamics
A perfectly conducting cylindrical surface of radius R and length L is placed
with its axis along the z axis and extending from z = −L/2 upto z = L/2.
An electromagnetic wave with electric field Ei(ω, ρ, θ, φ) is incident upon this cylinder. Let
Js (φ, z) = Jsφ (z, φ)φ̂ + Jsz (φ, z)ẑ
be the induced surface current density on the cylindrical surface. Derive integral
equations satisfied by it at frequency ω.
Note that
G(R, φ, z|φ', z') =
(μ/4π) exp(−jK√(4R² sin²((φ − φ')/2) + (z − z')²)) / √(4R² sin²((φ − φ')/2) + (z − z')²)
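The quantity under the square root is just the squared chord (straight-line) distance between the source and field points on the cylinder, as one can confirm numerically; the test values below are arbitrary:

```python
import numpy as np

def chord(R, phi1, z1, phi2, z2):
    # 4R^2 sin^2((φ-φ')/2) + (z-z')^2 under the square root of G
    return np.sqrt(4 * R**2 * np.sin((phi1 - phi2) / 2)**2 + (z1 - z2)**2)

def euclid(R, phi1, z1, phi2, z2):
    # straight-line distance between the two points in Cartesian coordinates
    p = np.array([R * np.cos(phi1), R * np.sin(phi1), z1])
    q = np.array([R * np.cos(phi2), R * np.sin(phi2), z2])
    return np.linalg.norm(p - q)

R, phi1, z1, phi2, z2 = 2.0, 0.3, -1.0, 2.1, 0.4
print(abs(chord(R, phi1, z1, phi2, z2) - euclid(R, phi1, z1, phi2, z2)) < 1e-12)  # True
```

The identity follows from R²(cos φ − cos φ')² + R²(sin φ − sin φ')² = 2R²(1 − cos(φ − φ')) = 4R² sin²((φ − φ')/2).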
The corresponding electric field in space produced by this surface current density
is given by
Es(ω, ρ, φ, z) = (jωε)^{−1}(∇(divA) + K²A)
Since the tangential components Ez , Eφ of the total electric field E = Ei + Es
must vanish when ρ = R, |z| ≤ L/2, we get two integral equations
where m is a constant.
[2] Study the dynamics of small metric perturbations around the Schwarzschild solution by linearizing the Einstein field equations:
δRμν = 0
hint:
δR_{μν} = (δΓ^α_{μα})_{:ν} − (δΓ^α_{μν})_{:α}
where the covariant derivatives are carried out w.r.t the unperturbed metric.
Further,
(δΓ^α_{μα})_{:ν} = (δΓ^α_{μα})_{,ν} − Γ^β_{μν} δΓ^α_{αβ}
and the second order part of the Ricci tensor contains the quadratic terms
−(Γ^α_{μν})^{(1)}(Γ^β_{αβ})^{(1)} + (Γ^α_{μβ})^{(1)}(Γ^β_{να})^{(1)}
With
g_{μν} = η_{μν} + h_{μν}(x)
the second order Christoffel component is
(Γ^α_{μα})^{(2)} = −(1/2)h^{αβ}(h_{βμ,α} + h_{βα,μ} − h_{μα,β})
Thus, the remaining factors (Γ^α_{μα,ν})^{(2)} and (Γ^β_{αβ})^{(1)} are obtained from the same expansion. Note that
R^{(1)}_{μν} = (Γ^α_{μα,ν})^{(1)} − (Γ^α_{μν,α})^{(1)} = (1/2)[h_{,μν} − h^α_{μ,αν} − h^α_{ν,αμ} + □h_{μν}]
G^{(2)}_{μν} = (R_{μν} − (1/2)R g_{μν})^{(2)}
= R^{(2)}_{μν} − (1/2)R^{(2)} η_{μν} − (1/2)R^{(1)} h_{μν}
Now,
R^{(2)} = (g^{μν} R_{μν})^{(2)} = η^{μν} R^{(2)}_{μν} − h^{μν} R^{(1)}_{μν}
Substituting these, it is easy to see that G^{(2)}_{μν} has the same form as R^{(2)}_{μν} as given in equation (1) but with a different set of constants C1(.), C2(.). Then the quadratic component in the contravariant part of G^{μν} is given by
G^{μν(2)} = [(η^{μα} − h^{μα})(η^{νβ} − h^{νβ}) G_{αβ}]^{(2)}
= η^{μα} η^{νβ} G^{(2)}_{αβ} − (η^{μα} h^{νβ} + η^{νβ} h^{μα}) G^{(1)}_{αβ}
It is easy to see once again that this has the same form of the rhs of (1) but
with different constants C1 (.), C2 (.).
where Sμν (x) is proportional to Tμν −(1/2)T ημν with Tμν as the energy-momentum
tensor of the matter plus electromagnetic field. We take
Tμν = ρvμ vν − (1/4)Fαβ F αβ ημν + Fμα Fνβ ηαβ
The pressure terms in the energy-momentum tensor of the matter field are
taken care of in a generalized form in the energy-momentum tensor of the elec-
tromagnetic field Fμν . Note that the energy-momentum tensor of the em field
is traceless. Thus,
T =ρ
In the frequency domain, we have
h_{μν}(ω, r) = ∫ S_{μν}(ω, r') exp(−jω|r − r'|) d³r'/|r − r'|
Γ^r_{00} = g^{r0}Γ_{000} + g^{rs}Γ_{s00} = −Γ_{r00} = (−1/2)(2g_{r0,0}(3) − g_{00,r}(2)) = Γ^r_{00}(2) + Γ^r_{00}(4)
where
Γ^r_{00}(2) = (1/2)g_{00,r}(2), Γ^r_{00}(4) = −g_{r0,0}(3)
Γ^r_{sm} = g^{r0}Γ_{0sm} + g^{rk}Γ_{ksm}
= (1/2)g^{rk}(2)(g_{ks,m}(2) + g_{km,s}(2) − g_{sm,k}(2)) − (1/2)(g_{rs,m}(2) + g_{rm,s}(2) − g_{sm,r}(2))
= Γ^r_{sm}(2) + Γ^r_{sm}(4)
where
Γ^r_{sm}(2) = −(1/2)(g_{rs,m}(2) + g_{rm,s}(2) − g_{sm,r}(2)),
Γ^r_{sm}(4) = (1/2)g^{rk}(2)(g_{ks,m}(2) + g_{km,s}(2) − g_{sm,k}(2))
Γ^r_{0m} = g^{r0}Γ_{00m} + g^{rs}Γ_{s0m} = −Γ_{r0m}(3) = −(1/2)(g_{r0,m}(3) + g_{rm,0}(2) − g_{m0,r}(3))
Γ⁰_{00} = Γ_{000}(3) = (1/2)g_{00,0}(3)
Γ⁰_{sm} = Γ_{0sm}(3) = (1/2)(g_{0s,m}(3) + g_{0m,s}(3) − g_{sm,0}(2))
Now we compute the perturbation terms of the Ricci tensor components upto
fourth order.
R^{(1)}_{μν} = (1/2)[h_{,μν} − h^α_{μ,αν} − h^α_{ν,αμ} + □h_{μν}]
Thus,
R^{(1)} = η^{μν}R^{(1)}_{μν} = □h − h^{αβ}_{,αβ}
and
G^{(1)}_{μν} = R^{(1)}_{μν} − (1/2)R^{(1)}η_{μν}
= (1/2)[□h_{μν} + h_{,μν} − h^α_{μ,αν} − h^α_{ν,αμ} − □h η_{μν} + h^{αβ}_{,αβ} η_{μν}]
If we impose the harmonic gauge condition
h^ν_{μ,ν} − (1/2)h_{,μ} = 0
then we get
G^{(1)}_{μν} = (1/2)□(h_{μν} − (1/2)h η_{μν})
dτ² = g_{μν}dx^μ dx^ν = g_{00}dt² + 2g_{0r}dx^r dt + g_{rs}dx^r dx^s
The light ray takes a time dt = dx0 to travel from xr to xr + dxr where dx0
satisfies
dτ 2 = 0
Thus,
dx⁰ = dt = h_r dx^r + (γ_{rs} dx^r dx^s)^{1/2}
where
h_r = −g_{0r}/g_{00}, γ_{rs} = (g_{0r}g_{0s} − g_{00}g_{rs})/g_{00}²
Likewise the light ray starting from x^r + dx^r and travelling back to x^r takes a time
(dx⁰)' = −h_r dx^r + (γ_{rs} dx^r dx^s)^{1/2}
Hence
[dx⁰ − (dx⁰)']/2 = h_r dx^r
and this correction must be taken into account while performing synchronization
of a moving particle. Specifically, the corrected proper time interval should be
taken as
√
g00 (dx0 − hr dxr )
and hence the three velocity of a moving particle must be defined as
w^r = dx^r/[√g_{00}(dx⁰ − h_s dx^s)] = v^r/[√g_{00}(1 − h_s v^s)]
where
v r = dxr /dx0
We find that
dτ² = g_{00}dt² + 2g_{0r}dt dx^r + g_{rs}dx^r dx^s
= g_{00}dt² + 2g_{0r}dt dx^r + ((g_{rs}g_{00} − g_{0r}g_{0s})/g_{00})dx^r dx^s + (g_{0r}dx^r)²/g_{00}
= g_{00}dt² − 2g_{00}h_r dt dx^r − g_{00}dl² + g_{00}(h_r dx^r)²
= g_{00}(dt − h_r dx^r)² − g_{00}dl²
where
dl² = γ_{rs} dx^r dx^s
is the three-length element. We define w² = γ_{rs} w^r w^s. Thus,
u^r = dx^r/dτ = w^r/(1 − w²)^{1/2}
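The chain of identities above can be spot-checked numerically for an arbitrary metric with g00 > 0; the random test metric and displacement below are illustrative data, not from the text:

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.standard_normal((4, 4))
g = (g + g.T) / 2                      # symmetric metric g_{μν}
g[0, 0] = abs(g[0, 0]) + 1.0           # ensure g00 > 0
dx = rng.standard_normal(4)            # (dt, dx^1, dx^2, dx^3)

dtau2 = dx @ g @ dx                    # g_{μν} dx^μ dx^ν

# h_r = -g_{0r}/g00, γ_rs = (g_{0r}g_{0s} - g00 g_rs)/g00²
g00, g0 = g[0, 0], g[0, 1:]
h = -g0 / g00
gamma = (np.outer(g0, g0) - g00 * g[1:, 1:]) / g00**2
dl2 = dx[1:] @ gamma @ dx[1:]

rhs = g00 * (dx[0] - h @ dx[1:])**2 - g00 * dl2
print(abs(dtau2 - rhs) < 1e-9)         # True: exact algebraic identity
```

The identity holds for every symmetric metric with g00 > 0; the random draw merely exercises a generic case.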
and the geodesic equations of motion can be expressed as
or equivalently,
<0|T(ψ(x)ψ(y)*)|0> = θ(x⁰ − y⁰)<0|ψ(x)ψ(y)*|0> − θ(y⁰ − x⁰)<0|ψ(y)*ψ(x)|0>
Thus,
□D'_{μν}(x) = δ⁴(x)η_{μν} − e<0|T(ψ(x)*α_μψ(x)A_ν(0))|0>
(iγ^μ∂_μ − m)S'(x) = iδ⁴(x) − iγ^μ<0|A_μ(x)ψ(x)ψ(0)*|0>
Note that D'_{μν}(x) is the exact photon propagator while
D_{μν}(x) = F^{−1}D_{μν}(k), D_{μν}(k) = η_{μν}/(k² + i0)
is the photon propagator for the free photon field, ie, in the absence of the Dirac current. Likewise S'(x) is the exact electron propagator while
and taking the Minkowski space 4-D Fourier transform on both sides gives us
∫<0|T(ψ(x)*α^μψ(x)A^ν(0))|0> exp(ik·x) d⁴x = (∫Tr(α^μ S'(p + k)Γ^ρ(p + k, p)S'(p)) d⁴p) D'_{νρ}(k)
The equation
□D'_{μν}(x) = δ⁴(x)η_{μν} − e<0|T(ψ(x)*α_μψ(x)A_ν(0))|0>
thus determines the exact photon propagator in terms of the vertex function.
Likewise, we can derive an equation for the exact electron propagator in terms
of the vertex function as follows: We start with
Thus,
(γ·k − m)S'(k) = i − i[∫γ_μ S'(p + k)Γ_ν(p + k, k)D'^{μν}(p) d⁴p]S'(k)
or equivalently,
(γ·p − m)S'(p) = i − i[∫γ_μ S'(p + k)Γ_ν(p + k, p)D'^{μν}(k) d⁴k]S'(p)
or equivalently,
S'(p) = S(p) + S(p)Σ(p)S'(p)
where
Σ(p) = −∫γ_μ S'(p + k)Γ_ν(p + k, p)D'^{μν}(k) d⁴k
where
G(z) = e∫δ(q²) exp(−iq·z) d⁴q
where we now regard ψ(.) to be the free electron-positron wave field, ie, the
Dirac field in the absence of the photon field:
(iγ.∂ − m)ψ(x) = 0
or equivalently,
ψ(x) = Σ_σ ∫(a(p, σ)u(p, σ)exp(−ip·x) + b(p, σ)* v(p, σ)exp(ip·x)) d³p
with a(p, σ) and b(p, σ) being respectively the electron and positron annihilation
operator fields in momentum-spin space. We evaluate this term by the usual
Wick relations:
< 0|T (ψa (x)ψb (y)∗ ψ(w)∗ αμ ψ(w))|0 >
= (αμ )cd < 0|T (ψa (x)ψb (y)∗ ψc (w)∗ ψd (w))|0 >
= (αμ )cd (Sab (x − y)Scd (w − w) + Sac (x − w)Sdb (w − y))
or after neglecting the infinite constant matrix S(w − w) = S(0), it evaluates to
so that
∫S(p')Γ_ν(p', p)S(p)D_{μν}(p' − p) exp(−ip'·x + ip·y − i(p − p')·z) d⁴p d⁴p'
≈ ∫G(z − w)S(x − w)α_μ S(w − y) d⁴w
or equivalently,
D_{μν}(p' − p)Γ^ν(p', p) = G(p' − p)α_μ
ie,
Γ_ν(p', p) = (p' − p)² G(p' − p)α_ν = 0
So in order to calculate Γν (p , p), we must go to the next degree of approxima-
tion. This approximation is given by
D_{μν}(p' − p)S'(p')Γ^ν(p', p)S'(p) = G(p' − p)S(p')α_μ S(p),
and
S'(p) = S(p) + S(p)Σ(p)S(p),
Σ(p) = −∫γ_μ S(p + k)Γ_ν(p + k, p)D_{μν}(k) d⁴k
Thus
[22] The gravitational field in the presence of an external quantum
photon field
Metric:
gμν (x) = ημν + hμν (x)
Choose the coordinate system so that
h0μ = 0
and also neglecting terms that are quadratic or higher in the metric perturba-
tions. The photon field Fμν (x) can be expanded as a linear superposition of the
creation and annihilation operator fields of the photon in momentum space:
F_{μν}(x) = ∫[a(K, s)e_{μν}(K, s)exp(−iK·x) + a(K, s)* ē_{μν}(K, s)exp(iK·x)] d³K
∇(divE) − ∇²E = curl curl E = −jω curl(μH)
= −jω∇μ × H + ω²με E
Likewise,
∇(divH) − ∇²H = curl curl H = jω curl(εE)
= jω(∇ε × E + ε curlE) = jω∇ε × E + ω²με H
Also, div(εE) = 0 gives
ε divE + (∇ε, E) = 0, or divE = −(∇ε, E)/ε
and likewise
divH = −(∇μ, H)/μ
So,
(∇² + ω²με)E + ∇((∇log ε, E)) − jω∇μ × H = 0
(∇² + ω²με)H + ∇((∇log μ, H)) + jω∇ε × E = 0
Writing
ε(ω, r) = ε₀(1 + δ.χe(ω, r)),
μ(ω, r) = μ₀(1 + δ.χm(ω, r))
we get by first order perturbation theory,
(∇² + ω²μ₀ε₀)(Ei, Hi) = 0,
(∇² + ω²μ₀ε₀)Es + ω²μ₀ε₀(χe + χm)Ei + ∇((∇χe, Ei)) − jωμ₀∇χm × Hi = 0,
(∇² + ω²μ₀ε₀)Hs + ω²μ₀ε₀(χe + χm)Hi + ∇((∇χm, Hi)) + jωε₀∇χe × Ei = 0
We note that the inverse (∇² + k²)^{−1} of the operator (∇² + k²), where k² = ω²μ₀ε₀, has the kernel
G_k(r, r') = −exp(−jk|r − r'|)/4π|r − r'|
and
Es(ω, r) = ∫G_k(r, r')[−k²(χe(ω, r') + χm(ω, r'))Ei(ω, r') − ∇'((∇'χe(ω, r'), Ei(ω, r'))) + jωμ₀∇'χm(ω, r') × Hi(ω, r')] d³r'
Hs(ω, r) = ∫G_k(r, r')[−k²(χe(ω, r') + χm(ω, r'))Hi(ω, r') − ∇'((∇'χm(ω, r'), Hi(ω, r'))) − jωε₀∇'χe(ω, r') × Ei(ω, r')] d³r'
In other words, we can write for the scattered em field when the incident field
is fixed and known,
Fs (ω, r) = (Es (ω, r), Hs (ω, r))
Fs(ω, r) = F0(ω, r, χe, χm) − − − (1)
with the rhs being a linear functional of χe , χm . The noisy measurement data
therefore has the model
χe(ω, r) = Σ_{n=1}^N θe(n)ψn(ω, r),
χm(ω, r) = Σ_{n=1}^N θm(n)ψn(ω, r)
Substituting these expressions into (1), the model assumes the form
Fs(ω, r) = F0(ω, r, Σ_n θe(n)ψn, Σ_n θm(n)ψn) + v(ω, r)
Rather than storing this entire measurement data Fs (ω, r), (ω, r) ∈ D we com-
press it by storing only its dominant wavelet coefficients
W_F(n, m) = ∫_D Fs(ω, r)φ_{n,m}(ω, r) dω d³r = G0(n, m, θe, θm), (n, m) ∈ I
where
θe = ((θe (n))), θm = ((θm (n)))
and we can express this measurement model in vector form
WF = G0 (θe , θm ) + v
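When first order perturbation theory is used, G0 is linear in (θe, θm) and the recovery of the parameters from W_F is ordinary least squares; the following schematic sketch uses a random matrix as a stand-in for the linearized forward model (all names, dimensions and data here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n_coeff, n_param = 50, 8               # |I| wavelet coefficients, 2N parameters

G0 = rng.standard_normal((n_coeff, n_param))   # linearized forward model
theta_true = rng.standard_normal(n_param)      # stacked (θe, θm)
v = 0.01 * rng.standard_normal(n_coeff)        # measurement noise

WF = G0 @ theta_true + v                       # compressed measurement model
theta_hat, *_ = np.linalg.lstsq(G0, WF, rcond=None)
print(np.linalg.norm(theta_hat - theta_true))  # small, of the order of the noise
```

For p > 1 the model G0 becomes polynomial in the parameters, and the least-squares step would be replaced by the nonlinear (for instance neural-network based) inversion described in the text.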
More generally, if we use pth order perturbation theory for p > 1, then in this
measurement model, we will have G0 as a polynomial function of θe , θm . The
problem of characterizing the disease is to obtain θe , θm from WF . In the noise-
less case, we can do this by a neural network in which we give sample values of
(θe , θm ) and train the weights of the network to take as input WF = G0 (θe , θm )
and output (θe, θm). Then, when presented with general measurement data,
we compute its wavelet coefficient vector WF and input this into the network
which will output the corresponding parameters (θe , θm ). Now we wish to eval-
uate the robustness of this algorithm with respect to measurement noise. We
select a parameter vector (θe, θm) and input G0(θe, θm) + v into the neural network with the trained weights. If Q(.) denotes the network function, then the output parameter estimate is given by
where I(.) is the rate function for v. Then the contraction principle gives
[1] Discuss the irreducible representations of the permutation group using the
group algebra method and Young diagrams. Explain how this theory can be used
to obtain the characters of the permutation group and how by using the duality
between the action of the unitary group in its standard tensor representation and
the corresponding action of the permutation group on tensors, one can derive a
formula for the generating function for the characters of the permutation group
provided that one uses Weyl’s character formula for the characters of the unitary
group.
R_{μν} = [ω^{mn}_{ν,μ} − ω^{mn}_{μ,ν} + [ω_μ, ω_ν]^{mn}] Γ_{mn}
Note that {Γ_{mn}} satisfy the Lorentz Lie algebra commutation relations and that we can write
R_{μν} = R^{mn}_{μν} Γ_{mn}
where
R^{mn}_{μν} e^α_m e^β_n = R^{αβ}_{μν}
is the standard Riemann-Christoffel curvature tensor in the coordinate basis. The curvature scalar is therefore
R = R^{mn}_{μν} e^μ_m e^ν_n
where
χa = eμa χμ
and
Dμ = ∂μ + ωμab Γab
is the gravitational spinor covariant derivative. The supersymmetry transfor-
mation of ωμmn is not required here, since ωμmn is assumed to be determined by
the field equation that it satisfies obtained by setting the variational derivative
of the supergravity action w.r.t. it to zero. This equation turns out to be a
purely algebraic equation for ωμmn which determines it in terms of the tetrad
field eμn and the gravitino field χa or equivalently χμ .
Remark: In the special relativistic Yang-Mills theory, F^a_{μν} are the fields obtained from the gauge Boson fields A^a_μ by
ieF^a_{μν} = [∂_μ + ieA_μ, ∂_ν + ieA_ν]^a
with
A_μ = A^a_μ τ_a
where the τ_a s are Hermitian generators of the gauge group. A gauge invariant option for the Lagrangian density, obtained by adding matter action terms to the Yang-Mills Lagrangian, is
L = (1/2)F^a_{μν}F^{μνa} + ψ̄[Γ^a(i∂_a + eA_a) − m]ψ + η̄Γ^{ab}ηB_{ab}
where
Γ^{ab} = [Γ^a, Γ^b]
It remains to determine the global supersymmetry transformations which change
this Lagrangian by a total differential.
where
Γ_{mn} = [Γ_m, Γ_n]
The curvature tensor in spinor notation is
R_{μν} = [∂_μ + (1/4)ω^{mn}_μ Γ_{mn}, ∂_ν + (1/4)ω^{rs}_ν Γ_{rs}]
= (1/4)(ω^{mn}_{ν,μ} − ω^{mn}_{μ,ν})Γ_{mn} + (1/16)ω^{mn}_μ ω^{rs}_ν [Γ_{mn}, Γ_{rs}]
Now using the anticommutator
{Γ_m, Γ_n} = 2η_{mn}
we get
[Γ_{mn}, Γ_{rs}] = 4(η_{ms}Γ_{nr} + η_{nr}Γ_{ms} − η_{mr}Γ_{ns} − η_{ns}Γ_{mr})
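This commutation relation can be verified numerically in an explicit Dirac representation of the Γ_m; the representation and the all-upper-index placement below are our choices for the check, with η = diag(1, −1, −1, −1):

```python
import numpy as np

I2 = np.eye(2)
Z = np.zeros((2, 2))
sig = [np.array([[0, 1], [1, 0]]), np.array([[0, -1j], [1j, 0]]),
       np.array([[1, 0], [0, -1]])]
g0 = np.block([[I2, Z], [Z, -I2]])
gam = [g0] + [np.block([[Z, s], [-s, Z]]) for s in sig]   # Dirac rep, {γ^m,γ^n}=2η^{mn}
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# Γ_mn = [Γ_m, Γ_n]
G = [[gam[m] @ gam[n] - gam[n] @ gam[m] for n in range(4)] for m in range(4)]

def comm(A, B):
    return A @ B - B @ A

# [Γ_mn, Γ_rs] = 4(η_ms Γ_nr + η_nr Γ_ms - η_mr Γ_ns - η_ns Γ_mr)
ok = all(np.allclose(comm(G[m][n], G[r][s]),
                     4 * (eta[m, s] * G[n][r] + eta[n, r] * G[m][s]
                          - eta[m, r] * G[n][s] - eta[n, s] * G[m][r]))
         for m in range(4) for n in range(4)
         for r in range(4) for s in range(4))
print(ok)   # True
```

Since both sides of the relation are representation independent consequences of {Γ_m, Γ_n} = 2η_{mn}, any faithful representation would do for the check.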
Thus
R_{μν} = (1/4)(ω^{mn}_{ν,μ} − ω^{mn}_{μ,ν})Γ_{mn} + (1/4)ω^{mn}_μ ω^{rs}_ν (η_{ms}Γ_{nr} + η_{nr}Γ_{ms} − η_{mr}Γ_{ns} − η_{ns}Γ_{mr})
This can be expressed as
R_{μν} = (1/4)R^{mn}_{μν} Γ_{mn}
where
R^{mn}_{μν} = ω^{mn}_{ν,μ} − ω^{mn}_{μ,ν} − ω^{rn}_μ ω^{ms}_ν η_{rs} + ω^{ms}_μ ω^{rn}_ν η_{sr} + ω^{sn}_μ ω^{rm}_ν η_{sr} − ω^{mr}_μ ω^{ns}_ν η_{rs}
= ω^{mn}_{ν,μ} − ω^{mn}_{μ,ν} + η_{rs}(−ω^{rn}_μ ω^{ms}_ν + ω^{ms}_μ ω^{rn}_ν + ω^{sn}_μ ω^{rm}_ν − ω^{mr}_μ ω^{ns}_ν)
= ω^{mn}_{ν,μ} − ω^{mn}_{μ,ν} + 2η_{rs}(ω^{mr}_μ ω^{sn}_ν − ω^{mr}_ν ω^{ns}_μ)
It is easily shown that when the spinor connection ω^{mn}_μ for the gravitational field is appropriately chosen so that the Dirac equation in curved space-time remains invariant under both diffeomorphisms and local Lorentz transformations, then it must be chosen so that the covariant derivative of the tetrad e^n_μ, having one spinor index and one vector index, is zero:
0 = D_ν e^n_μ = e^n_{μ,ν} − Γ^α_{μν} e^n_α + ω^{nm}_ν e_{mμ}
This is an algebraic equation for ωμmn and is easily solved. However when there
are spinor fields like the gravitino in addition to the gravitational field specified
by the tetrad eμn (ie, the graviton), then the definition of the spinor connection
has to be modified and it is expressed in terms of both the graviton and the
gravitino fields. This equation is obtained by first considering the supergravity
Lagrangian in four space-time dimensions
c1 eR + iχ̄μ Γμνρ Dν χρ
This is obtained by totally antisymmetrizing the product Γμ1 ...Γμk over all its
k indices. The basic property of a Majorana Fermionic operator field ψ(x) is
that apart from all its components anticommuting with each other, it has four
components and satisfies
(ψ(x)∗ )T = ψ T Γ5 Γ0
where if
ψ(x) = [ψ1(x), ψ2(x), ψ3(x), ψ4(x)]^T
then
ψ(x)* = [ψ1(x)*, ψ2(x)*, ψ3(x)*, ψ4(x)*]^T
with ψk(x)* denoting the operator adjoint of ψk(x) in the Fock space on which it acts. Also we define
e = iσ₂ = [[0, 1], [−1, 0]], e² = −I
so that, with
Γ5 = [[e, 0], [0, −e]], Γ0 = [[0, I], [I, 0]]
we have
Γ5Γ0 = [[0, e], [−e, 0]]
Thus, the condition for ψ to be a Majorana Fermion can be stated as
ψ*_{1:2} = eψ_{3:4}, ψ*_{3:4} = −eψ_{1:2}
or equivalently,
(ψ*_{1:2})^T = −(ψ_{3:4})^T e, (ψ*_{3:4})^T = (ψ_{1:2})^T e
Also, the matrices Γ0Γn, Γ0Γμ, ΓnΓ0, ΓμΓ0 are Hermitian.
ψ̄ = (ψ*)^T Γ0 = ψ^T Γ5 Γ0
so that
iχ̄_μ Γ^{μνρ} D_ν χ_ρ = iχ^{*T}_μ Γ0 Γ^{μνρ} D_ν χ_ρ = iχ^T_μ Γ5 Γ0 Γ^{μνρ} D_ν χ_ρ
We can verify that apart from a perfect divergence, this quantity is a Hermitian
operator. First observe that
(Γ0 Γμ Γν Γρ )∗ =
(Γ0 Γμ Γν Γ0 Γ0 Γρ )∗ =
Γ0 Γρ Γν Γ0 Γ0 Γμ
= Γ0 Γρ Γν Γμ
so that on antisymmetrizing over the three indices, we get
Thus,
(iχ^{*T}_μ Γ0 Γ^{μνρ} ∂_ν χ_ρ)* = iχ^{*T}_{ρ,ν} Γ0 Γ^{μνρ} χ_μ
= iχ^{*T}_{μ,ν} Γ0 Γ^{ρνμ} χ_ρ
= −iχ^{*T}_{μ,ν} Γ0 Γ^{μνρ} χ_ρ
= ∂_ν(−iχ^{*T}_μ Γ0 Γ^{μνρ} χ_ρ) + iχ^{*T}_μ Γ0 Γ^{μνρ} χ_{ρ,ν}
proving our claim provided that we replace Dν by ∂ν . If we take the connection
into account, ie
D_ν χ_ρ = ∂_ν χ_ρ + (1/4)ω^{mn}_ν Γ_{mn} χ_ρ − Γ^α_{ρν} χ_α
then it follows that we must prove the Hermitianity of the operator fields
χ^{*T}_μ Γ0 Γ^{μνρ} Γ_{mn} χ_ρ ω^{mn}_ν − − − (a)
and
χ^{*T}_μ Γ0 Γ^{μνρ} Γ^α_{ρν} χ_α − − − (b)
However the field (b) is identically zero since Γ^{μνρ} is antisymmetric in (ν, ρ) while Γ^α_{ρν} is symmetric in (ν, ρ). Hence, we have to prove only the Hermitianity of the field
χ^{*T}_μ Γ0 Γ^{μνρ} Γ_{mn} χ_ρ − − − (c)
Now,
Γ^{μνρ} Γ_{mn} = [Γ^{μνρ}, Γ_{mn}] + Γ_{mn} Γ^{μνρ}
and
[Γ^{pqr}, Γ_{mn}] = [Γ^p Γ^{qr} + Γ^q Γ^{rp} + Γ^r Γ^{pq}, Γ_{mn}]
Now,
[Γ^p Γ^{qr}, Γ_{mn}] = Γ^p [Γ^{qr}, Γ_{mn}] + [Γ^p, Γ_{mn}] Γ^{qr}
= 4Γ^p (η_{qn}Γ_{rm} + η_{rm}Γ_{qn} − η_{qm}Γ_{rn} − η_{rn}Γ_{qm}) + 4(η_{pm}Γ_n − η_{pn}Γ_m)Γ^{qr}
Summing this equation over cyclic permutations of (pqr) gives us
[Γ^{pqr}, Γ_{mn}] = 4Σ_{(pqr)} η_{mq}(Γ^p Γ^{nr} + Γ^r Γ^{pn} + Γ^n Γ^{rp}) + 4Σ_{(pqr)} η_{nq}(Γ^p Γ^{rm} + Γ^r Γ^{mp} + Γ^m Γ^{pr})
= 4Σ_{(pqr)}(η_{mq}Γ^{pnr} + η_{nq}Γ^{prm})
Note that this quantity is antisymmetric w.r.t. interchange of (m, n). It thus follows that
e^q_ν χ^{*T}_μ Γ0 [Γ^{μνρ}, Γ_{mn}] χ_ρ = χ^{p*T} Γ0 [Γ^{pqr}, Γ_{mn}] χ^r
= 4Σ_{(pqr)} [η_{mq} χ^{p*T} Γ0 Γ^{pnr} χ^r + η_{nq} χ^{p*T} Γ0 Γ^{prm} χ^r]
Now,
(χ^{p*T} Γ0 Γ^{pnr} χ^r)* = χ^{r*T} Γ0 Γ^{rnp} χ^p = χ^{p*T} Γ0 Γ^{pnr} χ^r
which proves the Hermitianity. Another way to see the Hermitianity of this is
to use the Majorana Fermion property of χp to write
χ^{p*T} Γ0 Γ^{pnr} χ^r = χ^{pT} Γ5 Γ^{pnr} χ^r
using the relations
Γ^T_n = Γ_n, Γ_n Γ5 = −Γ5 Γ_n
Now consider
X = χ^{*T}_μ Γ0 {Γ^{μνρ}, Γ_{mn}} χ_ρ
where {., .} denotes the anticommutator. We have X = X1 + X2 where
X1 = χ^{*T}_μ Γ0 Γ^{μνρ} Γ_{mn} χ_ρ,
X2 = χ^{*T}_μ Γ0 Γ_{mn} Γ^{μνρ} χ_ρ
We have
X1* = χ^{*T}_ρ Γ0 Γ_{mn} Γ^{μνρ} χ_μ = −χ^{*T}_μ Γ0 Γ_{mn} Γ^{μνρ} χ_ρ = −X2
which shows that X* = −X, ie, X is skew-Hermitian. Note that we have used the fact that
(Γ0 Γp Γq Γr Γm Γn)* = (Γ0 Γp Γ0 Γ0 Γq Γr Γ0 Γ0 Γm Γ0 Γ0 Γn)* = Γ0 Γn Γm Γ0 Γ0 Γr Γ0 Γ0 Γq Γp Γ0 Γ0 = Γ0 Γn Γm Γr Γq Γp
since Γ0Γn, ΓnΓ0, Γ0 are Hermitian and Γ0² = I. Thus, by antisymmetrizing over (pqr) and over (mn), we get
(Γ0 Γ^{pqr} Γ^{mn})* = Γ0 Γ^{mn} Γ^{pqr}
Consider now the supersymmetry variation
δ_χ(χ̄_μ Γ^{μνρ} D_ν χ_ρ) = δχ̄_μ Γ^{μνρ} D_ν χ_ρ + χ̄_μ Γ^{μνρ} D_ν δχ_ρ
= \overline{D_μ ε(x)} Γ^{μνρ} D_ν χ_ρ + χ̄_μ Γ^{μνρ} D_ν D_ρ ε(x)
where δχ_μ = D_μ ε(x), with ε(x) the local supersymmetry parameter.
The term in this quantity that is quadratic in {ωμmn } is given by
with θA (τ ) being Fermionic coordinates. The sum is over A and each θA is there-
fore a D dimensional Majorana Fermion. We wish to determine the supersym-
metry transformation under which S is invariant. We consider an infinitesimal
supersymmetry transformation
δθ^A = γ⁰γ^μ p_μ k^A = α^μ p_μ k^A
δθ^A = k^A, δX^μ = k^{AT} γ^μ θ^A
so that
δp^μ_α = 2k^{AT}_{,α} γ^μ θ^A
This is zero iff k^A is a constant Grassmannian parameter, in which case it follows that the action
S = ∫ η_{μν} h^{αβ} √h p^μ_α p^ν_β d²σ
is supersymmetry invariant.
where
iF_{μν} = [∂_μ + iA_μ, ∂_ν + iA_ν]
or equivalently,
F^a_{μν} = A^a_{ν,μ} − A^a_{μ,ν} + C(abc)A^b_μ A^c_ν
where C(abc) are the structure constants of a set of Hermitian basis for the
Lie algebra of the gauge group. Dμ , the gauge covariant derivative acts in the
adjoint representation on the gaugino fields ψ a :
Dμ ψ a = ∂μ ψ a + iC(abc)Abμ ψ c
∂μ ψ a + i[Abμ τb , ψ c τc ]a
= ∂μ ψ a + C(abc)Abμ ψ c
Note that Aaμ is a gauge Boson field and its superpartner ψ a is a gauge Fermion
field, also called a gaugino field. Now, we must introduce infinitesimal supersym-
metry transformations under which the action Ld4 x is invariant. We assume
that such a transformation has the form
δA^a_μ = k^T γ_μ ψ^a,
δψ^a = (γ^{μν} k) F^a_{μν}
for any three vector fields X, Y, Z. It is not hard to prove that the connection ∇
induced by the metric is uniquely determined by the metric provided we assume
in addition that the torsion of the connection is zero, ie,
∇X Y − ∇Y X − [X, Y ] = 0
for any two vector fields X, Y . An elliptic differential operator on the Rieman-
nian manifold is a second order differential operator of the form
where we have chosen and fixed the coordinate system x and have defined g ij (x)
so that
g ij (x) = ((gij (x)))−1
where
gij (x)X i (x)Y j (x) = g(X, Y )(x)
A Dirac operator on the Riemannian manifold is an operator of the form
where Vaμ (x) is the Vierbein of the metric gμν (x). We can define the space-time
dependent Dirac matrices
Γμ (x) = Vaμ (x)γ a
Then, we can define the Dirac operator as
where now Aμ (x) is a matrix valued four vector potential. It takes into account
both the Yang-Mills connection Abμ (x)τb and the spinor connection
Γ_μ(x) = ω^{ab}_μ(x) [γ_a, γ_b]
The Dirac operator D acts on the space of N-component spinor fields defined
on the manifold M.
[7] Integration on a differentiable manifold
Let M be an n-dimensional manifold and ω a k-form on M, with k ≤ n.
Stokes’ theorem states that
∫_{∂M} ω = ∫_M dω
In terms of coordinates,
ω(x) = ωμ1 ...μk (x)dxμ1 ∧ ... ∧ dxμk
Then,
dω(x) = ωμ1 ...μk ,m (x)dxm ∧ dxμ1 ∧ ... ∧ dxμk
Then
∫_M dω = Σ_{m, μ1, ..., μk distinct and in increasing order} sgn(m, μ1, ..., μk) ∫ ω_{μ1...μk,m}(x) dx^m dx^{μ1} ... dx^{μk}
where sgn(m, μ1, ..., μk) is the signature of the permutation that takes the sequence (m, μ1, ..., μk) to the sequence obtained by arranging m, μ1, ..., μk in increasing order.
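Both ingredients of this formula, the boundary-versus-interior equality and the signature factor, can be checked in a toy case; the 1-form ω = x dy on the unit square is our own example, not the text's:

```python
# Stokes' theorem ∫_{∂M} ω = ∫_M dω on M = [0,1]² for ω = x dy,
# where dω = dx ∧ dy and hence ∫_M dω = area(M) = 1.

n = 1000
dy = 1.0 / n
# ∮ x dy: only the vertical edges contribute (dy = 0 on horizontal edges).
right = sum(1.0 * dy for _ in range(n))     # edge x = 1, y increasing
left = sum(0.0 * (-dy) for _ in range(n))   # edge x = 0, y decreasing
boundary = right + left
print(abs(boundary - 1.0) < 1e-9)           # True: matches ∫_M dx ∧ dy

# The signature sgn(m, μ1, ..., μk): parity of the permutation that sorts
# the index sequence into increasing order, counted here by inversions.
def sgn(seq):
    s = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                s = -s
    return s

print(sgn((2, 1, 3)), sgn((3, 1, 2)))       # -1 1
```

The inversion count gives the same parity as the transposition count, so either may be used for sgn.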
[8] Index of an Elliptic operator
Kt = exp(−tP )
We have
Tr(exp(−tP)) = ∫_M K_t(x, x) dx
Let c1 , ..., ck be the positive eigenvalues of P and ck+1 , ..., cN its negative eigen-
values. Then,
index(P ) = k − (N − k) = 2k − N
On the other hand,
Tr(exp(−tP)) = Σ_{k=1}^N exp(−t c_k)
We claim that index(Q) = dim(N(Q)) − dim(N(Q*)). This can be proved for example
using the singular value decomposition. We write
Q = U DV
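The claim index(Q) = dim N(Q) − dim N(Q*) can be illustrated with the SVD of a finite-dimensional stand-in for Q, since both kernel dimensions are determined by the rank; the matrix below is toy data:

```python
import numpy as np

rng = np.random.default_rng(3)
# Q: a 5x7 matrix of rank 4, standing in for the operator Q.
Q = rng.standard_normal((5, 4)) @ rng.standard_normal((4, 7))

rank = np.linalg.matrix_rank(Q)        # common rank of Q and Q*
dim_ker_Q = Q.shape[1] - rank          # dim N(Q)
dim_ker_Qstar = Q.shape[0] - rank      # dim N(Q*)
index = dim_ker_Q - dim_ker_Qstar
print(index, Q.shape[1] - Q.shape[0])  # 2 2: the rank cancels out
```

Because the rank cancels in the difference, the index depends only on the "shape" of the operator and is stable under perturbations, which is the finite-dimensional shadow of the index theorem.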
+ ∫ ψ^{μT} ρ^α ∂_α ψ_μ
where ψ^μ(τ, σ) is a Majorana Fermion field and the ρ^α s are skew-symmetric matrices. It should be noted that for Majorana Fermion fields ψ on R⁴, we have
ψ ∗ = ψ T γ5 γ 0
and
ψ̄ = ψ ∗ γ 0
so that
ψ̄γ μ ∂μ ψ = ψ T γ5 γ μ ∂μ ψ
We note that γ5 γ μ and γ μ are skew-symmetric matrices and we may as well
therefore remove the γ5 factor to get the Fermionic contribution to the La-
grangian density as
ψ T γ μ ∂μ ψ
For strings, however, space-time is two dimensional and hence we should replace
γ μ by skew-symmetric matrices ρα to obtain the Fermionic contribution to the
Lagrangian density as
ψ T ρα ∂ α ψ
The infinitesimal supersymmetry transformations that leave the total super-
string action invariant are
δX μ = k T ψ μ , δψ μ = ραT k∂α X μ
ρα ρβ + ρβ ρα = 2hαβ
operators. Thus the quantum superstring acts in a tensor product of Boson and
Fermion Fock space. The Hamiltonian of the superstring can be expressed as
H = Σ_{n≥1} c(n) α(−n)α(n) + Σ_{n≥1} d(n) β(−n)β(n)
where
α(−n) = α(n)∗ , β(−n) = β(n)∗
and the α s satisfy the CCR while the β s satisfy the CAR:
It should be noted that the unperturbed action for the superstring is given by
S[X, ψ] = (1/2) ∫ X^μ_{,α} X_μ^{,α} dτ dσ + (1/2) ∫ ψ^{μT} ρ^α ∂_α ψ_μ dτ dσ
δX^μ = −k^T ψ^μ, δψ^μ = ρ^α k X^μ_{,α}
ρα ρβ + ρβ ρα = 2η αβ
where ((η αβ )) = diag[1, −1] is the string sheet metric. This form of the metric
can be obtained by the application of an appropriate Weyl scaling. ρα are
skew-symmetric matrices just as in four space-time dimensions γ μ are skew-
symmetric matrices (See S.Weinberg, Vol.III, Supersymmetry). We shall now
verify invariance of the above superstring action under the above infinitesimal
supersymmetry transformation:
δ((1/2)X^μ_{,α} X_μ^{,α}) = X_{μ,α} δX^{μ,α} = −2X_{μ,α} k^T ψ^{μ,α}
δ(ψ^{μT} ρ^α ψ_{μ,α}) = δψ^{μT} ρ^α ψ_{μ,α} + ψ^{μT} ρ^α δψ_{μ,α}
= X^μ_{,β} k^T ρ^{βT} ρ^α ψ_{μ,α} + ψ^{μT} ρ^α ρ^β k X_{μ,βα}
= −ψ^{μT}_{,α} ρ^α ρ^β k X_{μ,β} + ψ^{μT} ρ^α ρ^β k X_{μ,βα}
= −(ψ^{μT} ρ^α ρ^β k X_{μ,β})_{,α} + 2ψ^{μT} ρ^α ρ^β k X_{μ,βα}
= −(ψ^{μT} ρ^α ρ^β k X_{μ,β})_{,α} + ψ^{μT} {ρ^α, ρ^β} k X_{μ,βα}
= 2η^{αβ} ψ^{μT} k X_{μ,βα}
where a total two divergence has been neglected since such a term does not contribute to the action integral. Adding the two terms, we find that, on neglect of total divergence terms and using the anticommuting property of the Grassmannian parameters, the variation in the action is given by
δS = ∫ (−2X_{μ,α} k^T ψ^{μ,α} − X_μ^{,α}{}_{,α} k^T ψ^μ) dτ dσ = 0
p^μ_α = X^μ_{,α} − θ^{AT} γ^μ θ^A_{,α}
where A = 1, 2. Define
L1 = (1/2) p^μ_α p^α_μ = (1/2) η_{μν} p^μ_α p^{να}
where α is raised using the worldsheet metric diag[1, −1]. Also define
L2 = c1 ε^{αβ} X^μ_{,α}(θ^{1T} γ_μ θ^1_{,β} − θ^{2T} γ_μ θ^2_{,β}) + c2 ε^{αβ}(θ^{1T} γ^μ θ^1_{,α})(θ^{2T} γ_μ θ^2_{,β})
The local supersymmetry transformations are
δX μ = k AT γ μ θA , δθA = k A
[12] Spinors
Let V be a vector space and Q a quadratic form on V so that for any u, v ∈ V ,
we have a multiplication u.v satisfying
uv + vu = B(u, v)
and hence
B(u, v) = Q(u + v) − Q(u) − Q(v)
or equivalently,
B(u, v) = (Q(u + v) − Q(u − v))/2
We assume that the product u1...un is defined for any u1, ..., un ∈ V. The formal linear span of all such products is denoted by C(V), and (C(V), Q), ie, C(V) equipped with the quadratic form Q, is called a Clifford algebra over V. C(V)⁺ denotes the subalgebra of C(V) spanned by products of an even number of elements of V. One example of a Clifford algebra is as follows. Let ∧ be an antisymmetric tensor product on V. For u ∈ V, let a(u)* act on ∧V by
and let a(u) act on the same by the adjoint operation, ie, contraction:
a(u)w1 ∧ ... ∧ wn = c(n) Σ_{k=1}^n <u, wk> (−1)^k w1 ∧ ... ∧ w_{k−1} ∧ w_{k+1} ∧ ... ∧ wn
W = span{e1 , ..., en }
we get that
γ(u)γ(v) + γ(v)γ(u) = (u|v), u, v ∈ V
Thus, we get a Clifford structure on V and it is clear that the dimension of this
Clifford algebra is 22n .
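The creation/annihilation construction of γ(u) = a(u) + a(u)* can be realized concretely with 2^n × 2^n matrices; in the sketch below the basis ordering and the Jordan-Wigner sign convention are implementation choices, and with this normalization the anticommutator comes out as 2δ_jk times the identity, the constant being absorbable into the c(n) factors of the text:

```python
import numpy as np

n = 2                                  # dim W; the Fock space ∧W has dim 2^n

def creation(k, n):
    """a_k^* on ∧W in the occupation-number basis, with Jordan-Wigner signs."""
    dim = 2 ** n
    A = np.zeros((dim, dim))
    for s in range(dim):
        if not (s >> k) & 1:           # mode k currently empty
            sign = (-1) ** bin(s & ((1 << k) - 1)).count('1')
            A[s | (1 << k), s] = sign
    return A

a_star = [creation(k, n) for k in range(n)]
a = [A.T for A in a_star]              # real matrices: adjoint = transpose
gamma = [a[k] + a_star[k] for k in range(n)]

# γ(e_j)γ(e_k) + γ(e_k)γ(e_j) = 2 δ_jk I with this normalization.
for j in range(n):
    for k in range(n):
        anti = gamma[j] @ gamma[k] + gamma[k] @ gamma[j]
        assert np.allclose(anti, (2.0 if j == k else 0.0) * np.eye(2 ** n))
print("Clifford relations verified")
```

The 2^n-dimensional Fock space carries the γ(u) faithfully, and taking products of the γ(e_j) spans an algebra of dimension 2^{2n}, matching the dimension count in the text.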
where
α(n)^{μ*} = α(−n)^μ, n ≠ 0
in order that X^μ be a Hermitian operator field, and this string field satisfies the wave equation
X^μ_{,ττ} − X^μ_{,σσ} = 0
that is derived from the action
S[X] = (1/2) ∫ (X^μ_{,τ} X_{μ,τ} − X^μ_{,σ} X_{μ,σ}) dτ dσ
The space-time Lorentz group is SO(1, D − 1). The canonical momentum field
is
P_μ = δS/δX^μ_{,τ} = X_{μ,τ}
and hence the CCR gives
[X^μ(τ, σ), P_ν(τ, σ')] = iδ^μ_ν δ(σ − σ')
or equivalently,
[X^μ(τ, σ), X^ν_{,τ}(τ, σ')] = iη^{μν} δ(σ − σ')
and hence we get
[Σ_{n≠0}(α(n)^μ/n) exp(2πin(τ − σ)) + (β(n)^μ/n) exp(2πin(τ + σ)), Σ_{m≠0}(α(m)^μ/m) ...]
where
p² = p_μ p^μ = (p⁰)² − p^i p^i
The condition that the state of the system have total energy E gives (H −
E)|φ >= 0, or formally, on such states, H = E. On the other hand, from
relativistic mechanics, we know that the mass is given by M 2 = p2 = pμ pμ .
Thus, we get the result that the mass operator of the string is given by
2M² + 4π² Σ_{n≥1}(α(−n)·α(n) + β(−n)·β(n)) = E
or equivalently,
M² = E/2 − 2π² Σ_{n≥1}(α(−n)·α(n) + β(−n)·β(n))
n≥1
where a is a constant. We are absorbing the other modes β(n) into the α(n) s.
We can evaluate the matrix elements of the propagator as follows.
[α(n)μ , α(−n)ν ] = nη μν
whence
<m1 m2 ...|z^L|k1 k2 ...> = z^{−a+Σ_n n k_n} δ[m − k]
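Since z^L is diagonal in the occupation number basis with eigenvalue exponent Σ_n n k_n − a, the matrix element is pure bookkeeping; in the sketch below the dict encoding of occupation numbers is our own device, as are the test values:

```python
# <m|z^L|k> = z^(Σ_n n k_n - a) δ[m - k]: z^L is diagonal in the occupation
# number basis, with level Σ_n n k_n and normal-ordering constant a.

def matrix_element(z, a, m, k):
    """m, k: dicts mapping mode number n -> occupation number k_n."""
    if m != k:
        return 0                       # the δ[m - k] factor
    level = sum(n * kn for n, kn in k.items())
    return z ** (level - a)

print(matrix_element(2.0, 0, {1: 2, 3: 1}, {1: 2, 3: 1}))  # 2^5 = 32.0
print(matrix_element(2.0, 0, {1: 1}, {2: 1}))              # 0
```

The level 5 in the first example comes from 1·2 + 3·1, ie mode 1 doubly occupied and mode 3 singly occupied.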
B(w) = A(z)(dz/dw)J
or equivalently,
so that, under an infinitesimal conformal map z → z + ε(z),
δA(z) = A'(z)ε(z) − J A(z)ε'(z)
This is the condition for A to have conformal weight J. Now, consider the
following vertex function for a Bosonic string:
V(k, z) = :exp(ik·X(z)): = exp(k·Σ_{n≤−1} α(n)z^n/n) exp(k·Σ_{n≥1} α(n)z^n/n)
We wish to compute its conformal weight. First we introduce the Fourier com-
ponents of the energy-momentum tensor:
L_n = (1/2) Σ_m α(n − m) α(m)
[15] Super-strings
S[X, ψ] = (1/2) ∫ √h h^{αβ} X^μ_{,α} X_{μ,β} d²σ − ∫ ψ^a ρ^α ψ^a_{,α} d²σ
Weyl scaling: First, we can choose our string coordinates (τ, σ) so that
((h_{αβ})) = exp(φ) diag[1, −1]
and then
h = exp(2φ), ((h^{αβ})) = exp(−φ) diag[1, −1], ((√h h^{αβ})) = diag[1, −1]
and so there is no need for Weyl scaling. Already, in our system of coordinates,
the Bosonic part of the string action is
(1/2) ∫ η^{αβ} X^μ_{,α} X_{μ,β} d²σ
we get that
J^{μν} = Σ_{n≠0} n^{−1}(α^μ(n)α^ν(−n) − α^ν(n)α^μ(−n)) + Σ_{n≠0} n^{−1}(β^μ(n)β^ν(−n) − β^ν(n)β^μ(−n))
ρ^α ψ^a_{,α} = 0
ie,
ρ⁰ ψ^a_{,τ} + ρ¹ ψ^a_{,σ} = 0
We first start with the Fermionic Lagrangian density
L_F = ψ^a ρ^α ψ^a_{,α} = ψ^a ρ⁰ ψ^a_{,τ} + ψ^a ρ¹ ψ^a_{,σ}
The Majorana condition here reads
ψ^{a*} = ψ^{aT} ρ⁰
and hence
ψ̄^a = ψ^{a*} ρ⁰ = ψ^{aT}
We can expand the solution as
ψ^a(τ, σ) = Σ_n S^a(n, τ) exp(−2πinσ)
and hence
S a (n, τ ) = exp(2πinτ A)S a (n)
where
A = ρ0 ρ1
The eigenvalues of A are ±1 and therefore as in the Bosonic case, we again
get the result that in the Fermionic case, the wave field is a superposition of a
forward and a backward travelling wave:
Let e0, e1 be unit eigenvectors of A for the eigenvalues +1 and −1 respectively, and set
P0 = e0 e0^T, P1 = e1 e1^T
Then,
ψ^a(τ, σ) = Σ_n [P0 S^a(n) exp(2πin(τ − σ)) + P1 S^a(n) exp(−2πin(τ + σ))]
where
ρ0 = σ1 , ρ1 = iσ2
Thus
(ρ0 )2 = I, ρ0 ρ1 = iσ1 σ2 = −σ3
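With this choice of ρ matrices, the two dimensional Clifford relation ρ^α ρ^β + ρ^β ρ^α = 2η^{αβ} with η = diag[1, −1], together with the product ρ⁰ρ¹ = −σ₃ just stated, can be confirmed directly:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]])
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]])

rho = [s1, 1j * s2]                    # ρ^0 = σ1, ρ^1 = iσ2 (a real matrix)
eta = np.diag([1.0, -1.0])             # worldsheet metric

# Clifford relation ρ^α ρ^β + ρ^β ρ^α = 2 η^{αβ} I
for a in range(2):
    for b in range(2):
        anti = rho[a] @ rho[b] + rho[b] @ rho[a]
        assert np.allclose(anti, 2 * eta[a, b] * np.eye(2))

assert np.allclose(rho[0] @ rho[1], -s3)   # ρ^0 ρ^1 = -σ3
print("worldsheet Clifford relations verified")
```

Note that ρ¹ = iσ₂ is real and skew-symmetric, consistent with the skew-symmetry of the ρ^α asserted earlier in the text.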
Then writing
ψ^μ = [ψ^μ_+, ψ^μ_−]^T
we get that the Fermionic component of the superstring Lagrangian density is given by
L_F = ψ^{μT} ρ⁰ (ρ⁰∂_0 + ρ¹∂_1) ψ_μ = ψ^{μT}(I∂_0 − σ₃∂_1)ψ_μ = ψ^μ_+ ∂_− ψ_{μ+} + ψ^μ_− ∂_+ ψ_{μ−}
where
∂ ± = ∂0 ± ∂ 1
Note that we may define
x_+ = (τ + σ)/2, x_− = (τ − σ)/2
and then
∂_0 = (1/2)(∂/∂x_+ + ∂/∂x_−), ∂_1 = (1/2)(∂/∂x_+ − ∂/∂x_−)
and hence
∂/∂x_+ = ∂_0 + ∂_1 = ∂_+, ∂/∂x_− = ∂_0 − ∂_1 = ∂_−
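The light-cone chain rule can be confirmed with a quick finite-difference check on a smooth test field (the field F below is an arbitrary illustrative choice):

```python
import math

# finite-difference check of the light-cone chain rule on a smooth test field
def F(tau, sigma):                        # arbitrary illustrative field
    return math.sin(tau) * math.exp(0.3 * sigma)

def G(xp, xm):                            # same field; tau = xp+xm, sigma = xp-xm
    return F(xp + xm, xp - xm)

tau, sigma = 0.7, 0.2
xp, xm = (tau + sigma) / 2, (tau - sigma) / 2
h = 1e-6

d0 = (F(tau + h, sigma) - F(tau - h, sigma)) / (2 * h)    # partial_0
d1 = (F(tau, sigma + h) - F(tau, sigma - h)) / (2 * h)    # partial_1
dxp = (G(xp + h, xm) - G(xp - h, xm)) / (2 * h)           # d/dx_+
dxm = (G(xp, xm + h) - G(xp, xm - h)) / (2 * h)           # d/dx_-

assert abs(dxp - (d0 + d1)) < 1e-6        # d/dx_+ = partial_0 + partial_1
assert abs(dxm - (d0 - d1)) < 1e-6        # d/dx_- = partial_0 - partial_1
```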
The Euler-Lagrange equations for the Fermionic components are easily seen to
be
∂+ ψ− = 0, ∂− ψ+ = 0
where ψ_+ is an abbreviation for ((ψ^μ_+)) and likewise ψ_− is an abbreviation for ((ψ^μ_−)). These equations imply that ψ_+ is any function of τ − σ only and ψ_− is any function of τ + σ only. The solutions to these equations are of two kinds depending upon the boundary conditions. These boundary conditions can be described as follows. The spatial string variable σ is assumed to vary over [0, π]. The variation in S_F = ∫_0^π L_F dσ gives us the following boundary term on integration by parts:
ψ_+ δψ_+ − ψ_− δψ_−
This must vanish at the boundary, ie, at σ = 0, π. To make it vanish at σ = 0,
we assume the boundary condition that
ψ+ (τ, 0) = ψ− (τ, 0)
and hence
δψ+ (τ, 0) = δψ− (τ, 0)
To make it vanish at σ = π, we may assume the boundary condition that either
ψ_+(τ, π) = ψ_−(τ, π)
and hence
δψ+ (τ, π) = δψ− (τ, π)
or else
ψ+ (τ, π) = −ψ− (τ, π)
and hence
δψ+ (τ, π) = −δψ− (τ, π)
The first kind of boundary conditions implies that ψ_± admit the modal expansions
ψ^μ_+(τ, σ) = Σ_{n∈Z} b^μ(n)exp(in(τ − σ)), ψ^μ_−(τ, σ) = Σ_{n∈Z} b^μ(n)exp(in(τ + σ))
while the second kind implies the half-integral modal expansions
ψ^μ_+(τ, σ) = Σ_{n∈Z+1/2} b^μ(n)exp(in(τ − σ)), ψ^μ_−(τ, σ) = Σ_{n∈Z+1/2} b^μ(n)exp(in(τ + σ))
This is the situation for general strings, open or closed. If we further restrict to
closed strings, then we must impose the periodic boundary conditions, that for
any solutions, the value at σ = 0 must coincide with the value at σ = π. This
additional restriction implies that we obtain the following modal expansions:
ψ^μ_+(τ, σ) = Σ_{n∈Z} b^μ(n)exp(2in(τ − σ))
and
ψ^μ_−(τ, σ) = Σ_{n∈Z} b^μ(n).exp(2in(τ + σ))
for the first kind of boundary conditions and for the second kind, we do not
have any periodic solutions unless we extend the spatial domain to the range
[−π, π] in which case, we get the (2π periodic) solution
ψ^μ_+(τ, σ) = Σ_{n∈Z+1/2} b^μ(n)exp(2in(τ − σ))
and
ψ^μ_−(τ, σ) = Σ_{n∈Z+1/2} b^μ(n).exp(2in(τ + σ))
The components of the Fermionic energy-momentum tensor are
T_F^{++} = (∂L_F/∂∂_+ψ_−).∂_+ψ_− = ψ_−.∂_+ψ_− = 0,
T_F^{−+} = (∂L_F/∂∂_−ψ_+).∂_+ψ_+ = ψ_+.∂_+ψ_+,
T_F^{+−} = (∂L_F/∂∂_+ψ_−).∂_−ψ_− = ψ_−.∂_−ψ_−
and finally,
T_F^{−−} = (∂L_F/∂∂_−ψ_+).∂_−ψ_+ = ψ_+.∂_−ψ_+ = 0
where the vanishing entries follow from the equations of motion ∂_+ψ_− = 0, ∂_−ψ_+ = 0.
Now, we choose a coordinate system followed by a Weyl rescaling so that the
metric becomes
[[g_{++}, g_{+−}], [g_{−+}, g_{−−}]] = [[0, 1], [1, 0]]
ie,
g_{++} = g_{−−} = 0, g_{+−} = g_{−+} = 1
and then we get that the covariant components of the Fermionic part of the energy-momentum tensor are given by
T_{F++} = g_{+−}T_F^{−+} = ψ_+ ∂_+ ψ_+
Likewise,
T_{F−−} = g_{−+}T_F^{+−} = ψ_− ∂_− ψ_−
T_{F+−} = g_{+−}T_F^{−−} = ψ_+ ∂_− ψ_+ = 0
T_{F−+} = g_{−+}T_F^{++} = 0
We note the Fermionic energy-momentum conservation laws. Since
∂^+ = g^{+−}∂_− = ∂_−, ∂^− = g^{−+}∂_+ = ∂_+
we have
∂^+ T_{F++} + ∂^− T_{F−+} = ∂^+ T_{F++} = ∂_− T_{F++} = ∂_−(ψ_+ ∂_+ ψ_+) = 0
since
∂_− ψ_+ = 0, ∂_− ∂_+ = ∂_+ ∂_−
Likewise,
∂^− T_{F−−} + ∂^+ T_{F+−} = ∂_+ T_{F−−} = ∂_+(ψ_− ∂_− ψ_−) = 0
since
∂ + ψ− = 0
We now also observe that the Noether theorem applied to the supersymmetry
invariance of the combined Boson-Fermion action implies the conservation of the
supercurrent. The components of the supercurrent are obtained by observing
that the variation of the total action under an infinitesimal local supersymmetry transformation is given by an expression of the form
δ_{ε,susy}S = ∫(∂_α ε(σ))J^α(σ)d²σ
and hence if the equations of motion are satisfied, then the above variation must vanish for all infinitesimal local parameters ε(σ) and hence the supercurrent must be conserved:
∂_α J^α(σ) = 0
This conservation law can be stated in an alternate form as
∂− J+ = 0, ∂+ J− = 0
where
J+ = ∂+ X μ ψμ+ , J− = ∂− X μ ψμ−
In this form, it is immediate to see that these currents are conserved from the equations of motion. Since
∂_−∂_+ = ∂^α ∂_α
we have
∂_−∂_+ X^μ = 0, ∂_− ψ_{μ+} = 0
and hence ∂_− J_+ = (∂_−∂_+X^μ)ψ_{μ+} + ∂_+X^μ.∂_−ψ_{μ+} = 0, and similarly ∂_+ J_− = 0.
D = γ5 ∂_θ − γ^μ θ ∂_μ
D_L = (1 + γ5)D/2, D_R = (1 − γ5)D/2
V^A(x, θ) is the gauge superfield which, in the Wess-Zumino gauge, can be expressed as
V^A(x, θ) = θ^T γ^μ θ.V^A_μ(x)
We put
θ_L = (1 + γ5)θ/2, θ_R = (1 − γ5)θ/2
Then,
D_L = ∂_{θL} − γ^μ θ_R ∂_μ, D_R = −∂_{θR} − γ^μ θ_L ∂_μ
where we have used the identity,
γ5 γ μ + γ μ γ5 = 0,
t.V = tA V A
where summation over the non-Abelian gauge index A is implied. {tA } form
a complete set of Hermitian generators for the gauge group assumed to be a
subgroup of U (N ). We define the left Chiral fields
W^A_a(x, θ) = D_R^T.D_R(exp(−t.V).D_{La} exp(t.V))
A superfield Φ is left Chiral iff it is a function of only θ_L and
x^μ_+ = x^μ + θ_R^T γ^μ θ_L
Note that the two contributions to D_R x^μ_+ cancel:
D_R x^μ_+ = −γ^μ θ_L + γ^μ θ_L = 0
and likewise, Φ is right Chiral iff it is a function of only θR and
x^μ_− = x^μ − θ_R^T γ^μ θ_L
We note that
DL xμ− = 0
Also,
DR θLa = − ∂θR θLa = 0
and likewise,
DL θRa = 0
since
(1 + γ5 )(1 − γ5 ) = (1 − γ5 )(1 + γ5 ) = 0
These relations can be expressed in matrix notation as
D_L θ_R^T = D_R θ_L^T = 0
Note also that
θ_R^T γ^μ θ_L = θ^T(1 − γ5)γ^μ θ/2
Then,
V^A(x, θ) = θ^T γ^μ θ.V^A_μ(x)
We shall use the identity
∂_θ(θ^T θ.θ^T w) = 2θ.θ^T w + θ^T θ.w
D = α^T γ5 L
where
L = γ5 ∂_θ + γ^μ θ ∂_μ
Then the change in the component fields under such an infinitesimal supersymmetry transformation is obtained by equating
θ^T θ.δM + θ^T γ5 θ.δN + θ^T γ^μ θ.δV_μ
= α^T γ5[γ^μ θ.θ^T ω_{,μ} + γ5 ∂_θ(θ^T θ.θ^T γ5(λ + aγ^μ ω_{,μ}))]
= α^T γ5[γ^μ θ.θ^T ω_{,μ} + γ5(2θ.θ^T γ5(λ + aγ^μ ω_{,μ}) + (θ^T θ)γ5(λ + aγ^μ ω_{,μ}))]
= α^T γ5 γ^μ θ.θ^T ω_{,μ} − 2α^T θ.θ^T γ5(λ + aγ^μ ω_{,μ}) − (θ^T θ)α^T γ5(λ + aγ^μ ω_{,μ})
Writing
θθ^T = c1 θ^T θ + c2 θ^T γ5 θ.γ5 + c3 θ^T γ^μ θ.γ_μ
we get on equating coefficients of θ^T θ, θ^T γ5 θ and θ^T γ^μ θ respectively, the equations
δM = α^T γ5(2c1 λ + ((2c1 − 1)a − c1)γ^μ ω_{,μ}),
δN = 2c2 α^T λ + c2(1 + 2a)α^T γ^μ ω_{,μ} = c2 α^T(2λ + (1 + 2a)γ^μ ω_{,μ})
[20] Bosonic string theory: Derivation of the Einstein field equations for
gravitation in vacuum based on conformal invariance of the string action.
(τ, σ) represent respectively the time variable and the length variable along the string, τ ≥ 0, 0 ≤ σ ≤ 1. The ambient space-time is assumed to be D dimensional, so that any point on the string world sheet is parameterized as X^μ = X^μ(τ, σ), μ = 1, 2, ..., D. The metric on the string world sheet is a two dimensional metric h_{αβ}(τ, σ). Thus, the string action functional is given by
S1(X) = ∫ h^{αβ}(τ, σ)√(−h) g_{μν}(X(τ, σ)) X^μ_{,α} X^ν_{,β} dτ dσ
Then writing
∂^α = h^{αβ} ∂_β
we get
∂^0 = ∂_0 = ∂/∂τ, ∂^1 = −∂_1 = −∂/∂σ
and then the string action can be expressed as
S1[X] = ∫ g_{μν}(X) ∂_α X^μ.∂^α X^ν dτ dσ
The propagator is the time-ordered two-point function
< T(X^μ(τ, σ)X^ν(τ′, σ′)) > = θ(τ − τ′) < X^μ(τ, σ).X^ν(τ′, σ′) > + θ(τ′ − τ) < X^ν(τ′, σ′).X^μ(τ, σ) >
and it can be evaluated using the equations of motion and the equal time Bosonic commutation relations, since in a normal coordinate system around X0, the first order partial derivatives of g_{μν} vanish at X0. Further, from the above calculation of the propagator, using dimensional regularization,
< x^ρ(τ, σ).x^σ(τ′, σ′) > = η^{ρσ} lim_{τ′→τ, σ′→σ} ∫ d^{2+ε}k.exp(i(k1(τ − τ′) ...
and the condition for this variation to be zero, ie, conformal invariance of the quantum averaged string action, is that
R_{μν} = 0
which are the Einstein field equations for gravitation in vacuum.
Superconductivity
To calculate the quantum effective action, we must first evaluate the path inte-
gral
Γ(A, V) = ∫ exp(iS[ψ])Dψ.Dψ*
and then derive formulas for the superconductivity current etc. from this effec-
tive action. We add another term to this action, namely
ΔS[ψ, Ψ] = −∫ V_{ab}(x, y)(ψ_a(x)*ψ_b(y)* − Ψ_{ab}(x, y)*)(ψ_a(x)ψ_b(y) − Ψ_{ab}(x, y))d⁴x d⁴y
Remark: ψa (x) describes the Fermion wave field of a-type particles at the
space-time point x while ψa (x)∗ ψa (x) describes the number density operator
for the Fermion field of a-type particles at the space-time point x. V_{ab}(x, y) describes the interaction potential between one particle of a type located at x and another particle of b type located at y. Ψ_{ab}(x, y) describes the Cooper pair field formed by a Fermion of type a located at x with another Fermion of type b located at y. In terms of matrices, we can write the various components
of the above Lagrangian density as
L1 = δ(y − x)ψ_a(y)*(i∂_t − E(−i∇ + eA(x)) + eV(x))ψ_a(x)
which is quadratic in the field variables (ψ(x), ψ(x)*). The associated path integral for this component taken over ψ, ψ* has the form
det([[ω − E(−i∇ + eA) + eV, V ⊗ Ψ], [V ⊗ Ψ*, ω − E(−i∇ + eA) + eV]]).exp(−i∫ V_{ab}(x, y)Ψ_{ab}(x, y)*Ψ_{ab}(x, y)d⁴x d⁴y)
where ω = i∂_t is in the frequency domain. The logarithm of this quantity is the quantum effective action. It is therefore given by
Γ(A, V, Ψ) = ∫ V_{ab}(x, y)Ψ_{ab}(x, y)*Ψ_{ab}(x, y)d⁴x d⁴y + i.log(det([[ω − E(−i∇ + eA) + eV, V ⊗ Ψ], [V ⊗ Ψ*, ω − E(−i∇ + eA) + eV]]))
Some properties of the quantum effective potential: Consider the path integral
for an action S[φ] with a coupling current:
Z(J) = ∫ exp(iS[φ] + i∫ Jφ.d⁴x)Dφ
We have
δlog(Z(J))/δJ(x) = i < φ >J (x)
Consider the equation (as in large deviation theory)
(δ/δJ(x))(i∫ Jψ d⁴x − log(Z(J))) = 0
This gives
ψ(x) = < φ >_J(x)
Let the solution to this equation be given by
J(x) = J_ψ(x)
Then, defining the quantum effective action Γ(ψ) = i∫ J_ψ(x)ψ(x)d⁴x − log(Z(J_ψ)), we get
δΓ(ψ)/δψ(x) = iJ_ψ(x)
Now suppose that the original action and path measure is invariant under an infinitesimal transformation φ → φ(x) + χ(x). Then what can we say about a corresponding invariance of the quantum effective action?
Simulating path integrals for fields using MATLAB. Consider for example the KG path integral in an external electromagnetic field:
Z[A_μ] = ∫ exp(i∫[(1/2)(∂_μ + ieA_μ(x))φ(x).(∂^μ − ieA^μ(x))φ(x) − (m²/2)φ(x)²]d⁴x)Dφ
Reference: D. Swaroop and H. Parthasarathy, "Simulation using MATLAB of the quantum effective action for Cooper pairs given the electromagnetic field", Technical Report, NSUT, 2019.
where
ψ_n(x) = ρ_n(x).exp(iφ_n(x))
Substituting this into the Schrodinger Lagrangian, we get
(δ_{nm}∇ + ieAt_{nm}/h)ψ_m = (∇ρ_n + iρ_n∇φ_n)exp(iφ_n) + (ieA/h)t_{nm}ρ_m exp(iφ_m)
Summation here over the repeated index m is implied. For a single Cooper pair field, this reduces to
|(∇ + ieA/h)ψ|² = (∇ρ)² + ρ²|∇φ + eA/h|²
When we minimize this integral, then it naturally follows that |∇φ + eA/h| will
become very small, ie, the magnetic vector potential will be close to a perfect
gradient which means that the magnetic field B = ∇ × A will nearly be expelled
from the body of the superconductor. This is precisely the Meissner effect. In
the general case of several Cooper pair fields, we have
Σ_n |(δ_{nm}∇ + ieAt_{nm}/h)ψ_m|² =
Σ_n (∇ρ_n)² + Σ_{k,n,m,s}(δ_{nm}ρ_n∇φ_n + (eA/h)t_{nm}ρ_m, δ_{ks}ρ_k∇φ_k + (eA/h)t_{ks}ρ_s)cos(φ_m − φ_s)
+ 2 Σ_{n,m}(∇ρ_n, δ_{nm}ρ_n∇φ_n + (eA/h)t_{nm}ρ_m)sin(φ_n − φ_m)
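A toy one-dimensional discretization (my own illustrative sketch, not the multi-field model above) shows why minimizing the |∇φ + eA/h|² term forces A toward a pure gradient: on a ring only the mean (flux) part of eA/h survives the minimization over the phase:

```python
import numpy as np

# Toy 1-D Meissner argument: on a discrete ring, minimize sum_j (dphi_j + a_j)^2
# over phase increments dphi_j that sum to zero (single-valued phase). The
# minimizer cancels the gradient part of a_j = e*A_j/h; only the mean of a
# (the flux, which is not a gradient) survives in the residual.
rng = np.random.default_rng(0)
N = 200
a = rng.normal(size=N) + 0.5

dphi = -(a - a.mean())                # constrained least-squares solution
assert abs(dphi.sum()) < 1e-10        # phase remains single valued
residual = np.sum((dphi + a) ** 2)
assert np.isclose(residual, N * a.mean() ** 2)
```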
converges uniformly in time over [0, T] in probability to ∫_0^t ds ∫_0^1 J′(θ)ρ(s, θ)(1 − ρ(s, θ))dθ, where the measure ρ(t, θ)dθ is the weak limit of the sequence of random measures
μ_{N,t} = N^{−1} Σ_{x∈Z_N} η_t(x)δ_{x/N}
Alternate proofs of the hydrodynamic scaling limit without the use of entropic principles are given by Varadhan et al., based on the local averaging principle. This involves expressing the above time differential of N^{−1}Σ_x J(x/N)η_t(x) in terms of a drift component and a Poisson martingale component, proving by standard martingale arguments that the martingale component converges to zero as N → ∞ and then applying the local averaging principle to arrive at the hydrodynamic scaling limit. The averaging principle roughly states that (2Nε + 1)^{−1} Σ_{y:|y−x|≤Nε} f(τ_y η) converges to f̂(ρ(x)) as N → ∞ followed by ε → 0, where the η(x)′s are independent Bernoulli with means ρ(x). Varadhan has also used large deviation theory to establish
the hydrodynamic scaling limits for a system of N particles following Hamilto-
nian dynamics in phase space. This is very similar to how one derives the fluid
dynamical equations in statistical mechanics from the Boltzmann kinetic trans-
port equation with the difference that the Boltzmann distribution function is a
function of just one position and one velocity variable, but in Varadhan’s work,
one starts with the joint distribution of N particles in 6N -dimensional phase
space and then forms the empirical averages of the positions and momenta of
the N particles to arrive at the hydrodynamical equations of Euler including
an energy equation. Varadhan clearly states that this derivation is not rigorous
since it assumes the validity of the averaging principle but he states and proves
clearly that if noise is present in the Hamiltonian dynamics, then the averaging
principle is valid. In fact the presence of random noise ensures ergodic behaviour
of the system.
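As a rough illustration of the objects involved (an informal sketch, not Varadhan's argument), one can simulate a symmetric simple exclusion process on the discrete ring Z_N and track the empirical measure μ_{N,t}; its total mass is conserved by the dynamics:

```python
import random

# symmetric simple exclusion on the ring Z_N: pick a site and a direction;
# a particle moves only if the target site is empty
random.seed(1)
N = 100
eta = [1 if random.random() < 0.4 else 0 for _ in range(N)]   # initial profile
mass0 = sum(eta)

for _ in range(20000):
    x = random.randrange(N)
    y = (x + random.choice((-1, 1))) % N
    if eta[x] == 1 and eta[y] == 0:
        eta[x], eta[y] = 0, 1

# empirical measure mu_{N,t} = N^{-1} sum_x eta_t(x) delta_{x/N}:
# its total mass N^{-1} sum_x eta_t(x) is invariant in time
assert sum(eta) == mass0
print("conserved density:", sum(eta) / N)
```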
where w is a noise random field, say a mixture of a Gaussian and a Poisson field. When ε = 0, ie, in the absence of noise, the solution is f0 = L^{−1}(s). When noise is present, write f = f0 + δf, so that δf approximately satisfies the pde
L0(δf) + δL(f0) = √ε.w
In fluid dynamics:
Consider a curved p dimensional surface M on which we introduce curvilinear coordinates x^i, i = 1, 2, ..., p and we embed this surface in N > p dimensional Euclidean space by introducing N cartesian coordinates y^n(x), n = 1, 2, ..., N. The metric tensor on this surface is
g_{μν}(x) = Σ_{n=1}^N (∂y^n/∂x^μ).(∂y^n/∂x^ν)
Consider a curve t → x(t) on M, so that the tangent vector to this curve at x(t) is specified by the partial differential operator
X_{x(t)} = (dx^μ(t)/dt).∂/∂x^μ
This operator has the interpretation of the velocity operator, ie, writing v^μ = dx^μ/dt, the velocity field in component form, the velocity field in the vector field differential operator formalism can be expressed as v(x) = v^μ(x)∂_μ. The energy momentum tensor of the fluid on this curved surface is
T^{μν} = (ρ + p)v^μ v^ν − pg^{μν}
and then, taking into account the fluid viscosity, the Navier-Stokes equations of motion of the fluid on this curved surface can be expressed as
T^{μν}_{:ν} = f^μ(x)
where f^μ(x) is the external random forcing field. We can express these equations as
((ρ + p)v^μ v^ν)_{:ν} − g^{μν}p_{,ν} = f^μ
To be precise, we take x^0 = t and the space-time metric as
g_{00} = 1, g_{0k} = 0, g_{km} = y^n_{,k} y^n_{,m}, k, m = 1, 2, ..., p
so that
dτ² = dt² − Σ_{k,m=1}^p g_{km}(x)dx^k dx^m
Consider the scaled logarithmic moment generating functional
N^{−1} log(E[exp(Σ_{i=1}^N f(X_i, X_{i+1}, ..., X_{i+r−1}))]) − − − (1)
Now suppose that this process is a Markov process with transition probability measure π(x, dy) and we wish to evaluate the rate function for the sequence of empirical measures μ_N, N ≥ 1 at ν, where ν is a probability measure on R^r, and then let r → ∞ so that ν is a stationary, ie shift invariant, probability measure on R^Z. We have that (1) equals, for large N, approximately
N^{−1} log(E exp(N.∫ f(x)dμ_ω(x)))
almost surely, where ω = (X_n)_{n≥1} is the Markov path. So if ν is any measure on R^r, then the rate function for μ_N, N ≥ 1 is given by
I(ν) = sup_f [∫ f(x)dν(x) − lim_{N→∞} N^{−1}.log(E[exp(N.∫ f(x)dμ_ω(x))])]
Next, consider the nonlinear time series model
x[n] = −Σ_{k=1}^p a[k]x[n − k] + δ.f(x[n − k], 1 ≤ k ≤ p) + w[n + 1]
We can construct a Markov state model for this process by defining the state vector
X[n] = [x[n − 1], ..., x[n − p]]^T
and then writing the above time series model as
X[n + 1] = AX[n] + δ.F(X[n]) + W[n + 1]
where A is the companion matrix built from the coefficients a[k] and W[n + 1] = [w[n + 1], 0, ..., 0]^T.
We estimate the parameters a[k] in A along with the state X[n] using the EKF.
We can in fact, also incorporate some parameters θ into the function F and
estimate it also using the EKF. Suppose that the parameter estimates of a[k], θ
are nearly constant over time slots of duration L. Then, within each such slot, we
can use these estimates to estimate the spectrum and higher order spectrum of
x[n] approximately. For example suppose a[k], θ denote the parameter estimate
over a given time slot. Then, we have
x[n] + Σ_{k=1}^p a[k]x[n − k] = δ.F(x[n − 1], ..., x[n − p], θ) + w[n]
and writing
x[n] = x0[n] + δ.x1[n] + δ².x2[n] + ...
we get
x0[n] + Σ_{k=1}^p a[k]x0[n − k] = w[n],
x1[n] + Σ_{k=1}^p a[k]x1[n − k] = F(x0[n − 1], ..., x0[n − p], θ)
so that
x1[n] = Σ_{k≥0} h[k]F(x0[n − k − 1], ..., x0[n − k − p], θ)
where
H(z) = Σ_{k≥0} h[k]z^{−k} = A(z)^{−1}
and
A(z) = 1 + Σ_{k=1}^p a[k]z^{−k}
From these expressions, we deduce the approximate spectrum and higher order spectra of x[n].
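The expansion can be checked numerically; the following sketch uses a toy p = 1 model with F = sin (my own illustrative choices, not the estimated model of the text):

```python
import math, random

# First-order perturbation check, x ~ x0 + delta*x1, for a toy p = 1 model
random.seed(0)
a, delta, T = 0.5, 0.01, 500
w = [random.gauss(0.0, 1.0) for _ in range(T)]

x = [0.0] * T     # exact nonlinear recursion
x0 = [0.0] * T    # zeroth order: AR(1) driven by the noise alone
x1 = [0.0] * T    # first order: AR(1) driven by F(x0)
for n in range(1, T):
    x[n] = -a * x[n - 1] + delta * math.sin(x[n - 1]) + w[n]
    x0[n] = -a * x0[n - 1] + w[n]
    x1[n] = -a * x1[n - 1] + math.sin(x0[n - 1])

err = max(abs(x[n] - (x0[n] + delta * x1[n])) for n in range(T))
assert err < 10 * delta ** 2      # the residual is O(delta^2)
```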
Large deviation analysis of the EKF error process. The state model is
x[n + 1] = f(x[n], n) + g(x[n], n)w[n + 1]
and the measurement model is
z[n] = h(x[n], n) + v[n]
The EKF is
x̂[n + 1|n] = f(x̂[n|n], n),
x̂[n + 1|n + 1] = x̂[n + 1|n] + K[n + 1](z[n + 1] − h(x̂[n + 1|n], n + 1))
where K[n + 1] is the Kalman gain computed from the dynamics linearized around the current state estimate.
Problem: Write down the linearized stochastic difference equation for the
error process e[n] = x[n] − x̂[n|n] and using formulas for the rate function of
a Gaussian process, calculate the LDP rate function for the process e[n]. Note
that in the difference equation for e[n], the coefficients will generally be functions
of x̂[n|n]. If we assume that tracking is good, then the stochastic process x̂[n|n]
is replaced by the deterministic process xd [n] in the linear difference equation
for e[n] thus guaranteeing that e[n] is a Gaussian process. If we wish to be more
accurate, then we expand functions of x̂[n|n] around xd [n] in which case, we
obtain nonlinear difference equations satisfied by the two error processes x[n] − x̂[n|n] and xd[n] − x̂[n|n] that are driven by a white Gaussian process and hence
by applying the contraction principle, we can obtain a more accurate formula for
the joint rate function of these two error processes from which the approximate
formula for the probability of deviation of both of these error processes by a
threshold value around zero can be calculated and controllers can be designed
to minimize this deviation probability.
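It may help to see the EKF recursion concretely before analyzing its error process; the following is a minimal scalar sketch (the model f, h and the noise levels are my own illustrative assumptions, not the book's example):

```python
import math, random

# Minimal scalar EKF for x[n+1] = f(x[n]) + w[n+1], z[n] = h(x[n]) + v[n]
random.seed(2)
f = lambda x: 0.9 * x + 0.2 * math.sin(x)
fp = lambda x: 0.9 + 0.2 * math.cos(x)            # f'
h = lambda x: x
q, r = 0.1, 0.2                                   # process / measurement variance

x, xh, P = 0.5, 0.0, 1.0
errs = []
for n in range(300):
    x = f(x) + random.gauss(0.0, math.sqrt(q))    # true state
    F = fp(xh)
    xp, Pp = f(xh), F * P * F + q                 # predict
    z = h(x) + random.gauss(0.0, math.sqrt(r))
    K = Pp / (Pp + r)                             # gain minimizing posterior var
    xh, P = xp + K * (z - h(xp)), (1.0 - K) * Pp  # update
    errs.append(x - xh)

assert 0.0 < P < r                                # covariance settles below r
assert abs(sum(errs) / len(errs)) < 0.5           # tracking error has small mean
```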
From the Neyman-Pearson decision theory, it is known that the optimal test that minimizes the false alarm probability P(H1|H0) for a given miss probability P(H0|H1) = ε is given by the following: Decide H1 if p^{⊗n}(x)/q^{⊗n}(x) > c(n) and decide H0 if the same is < c(n), where c(n) is chosen so that
∫_{Z_n} q^{⊗n}(x)dx = ε
where
Z_n = {x : p^{⊗n}(x)/q^{⊗n}(x) > c(n)}
Define the time averaged log likelihood ratio
L_n(x) = n^{−1} Σ_{k=1}^n log(p(x_k)/q(x_k)) = n^{−1}.log(p^{⊗n}(x)/q^{⊗n}(x))
Then by Cramer's theorem,
n^{−1} log(P(L_n(x) > R|H0)) ≈ −inf{I0(x) : x > R}
and
n^{−1} log(P(L_n(x) < R|H1)) ≈ −inf{I1(x) : x < R}
where
I0(x) = sup_{s∈R} {sx − log(Σ_x q^{1−s}(x)p^s(x))}
and
I1(x) = sup_{s∈R} {−sx − log(Σ_x p^{1−s}(x)q^s(x))}
Define
F0(s) = log(Σ_x q^{1−s}(x)p^s(x)), F1(s) = log(Σ_x p^{1−s}(x)q^s(x))
F0(ε) = −ε.H(q|p) + O(ε²), ε ↓ 0,
F0(1 − ε) = −ε.H(p|q) + O(ε²), ε ↓ 0,
F1(ε) = −ε.H(p|q) + O(ε²),
F1(1 − ε) = −ε.H(q|p) + O(ε²)
where
H(p|q) = Σ_x p(x)log(p(x)/q(x))
is the relative entropy between the pdfs p, q. Note that H(p|q), H(q|p) ≥ 0 with equality iff both the pdfs p, q coincide.
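The identities F0(0) = F0(1) = 0 and the first-order expansions above can be checked numerically for any pair of pmfs (the pmfs below are illustrative):

```python
import math

# two pmfs on a 3-letter alphabet (illustrative numbers)
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

H = lambda a, b: sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))
F0 = lambda s: math.log(sum(qi ** (1 - s) * pi ** s for pi, qi in zip(p, q)))
F1 = lambda s: math.log(sum(pi ** (1 - s) * qi ** s for pi, qi in zip(p, q)))

assert H(p, q) > 0 and H(q, p) > 0               # relative entropies positive
assert abs(F0(0.0)) < 1e-12 and abs(F0(1.0)) < 1e-12

eps = 1e-5                                        # first-order expansions
assert abs(F0(eps) + eps * H(q, p)) < 1e-3 * eps
assert abs(F0(1 - eps) + eps * H(p, q)) < 1e-3 * eps
assert abs(F1(eps) + eps * H(p, q)) < 1e-3 * eps
assert abs(F1(1 - eps) + eps * H(q, p)) < 1e-3 * eps
```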
We have
I0(x) ≥ εx − F0(ε) = εx + εH(q|p) + O(ε²)
and therefore, for the choice R = −H(q|p) + δ,
inf{I0(x) : x > R} ≥ εR + εH(q|p) + O(ε²) = εδ + O(ε²)
and hence
−inf{I0(x) : x > R} ≤ −εδ + O(ε²)
This means that for this choice of R, we have that
P(H1|H0) → 0, n → ∞
Likewise,
I0(x) ≥ (1 − ε)x − F0(1 − ε) = (1 − ε)x + εH(p|q) + O(ε²)
so that
inf{I0(x) : x > R} ≥ (1 − ε)R + εH(p|q) + O(ε²) = R + ε(H(p|q) − R) + O(ε²)
and hence,
−inf{I0(x) : x > R} ≤ −R − εδ + O(ε²)
for the choice R = H(p|q) − δ. By choosing ε sufficiently small, the decay exponent can be brought arbitrarily close to −R, which means that for the choice of R = H(p|q) − δ, we can get the probability of false alarm to approach zero at a rate arbitrarily close to −H(p|q). Moreover, for this choice of R,
I1(x) ≥ −εx − F1(ε)
so that
inf{I1(x) : x < R} ≥ −εR − F1(ε) = ε(H(p|q) − R) + O(ε²)
and so
−inf{I1(x) : x < R} ≤ ε(R − H(p|q)) + O(ε²) = −εδ + O(ε²)
which means that for this same choice R = H(p|q) − δ, the miss probability converges to zero. On the other hand, now take R = H(p|q) + δ. Then, we have
by the same logic,
F1′(s) = −x
ie,
exp(−F1(s)) Σ_x p^{1−s}(x)q^s(x).log(p(x)/q(x)) = x − − − (a)
Let s0(x) denote the solution to (a). Then
−inf{I1(x) : x < R} = sup_{x<R} inf_s {sx + F1(s)} = sup_{x<R} {s0(x)x + F1(s0(x))}
Now for δ small and R = H(p|q) + δ, s0(R) is close to zero and negative.
Consider the state model
x(n + 1) = f(x(n)) + g(x(n))w(n + 1)
where w(n) are iid standard normal random vectors. We make a measurement
z(n) = h(x(n)) + v(n)
where v(n) are iid normal independent of w(.), and construct the EKF:
x̂(n + 1|n) = f(x̂(n|n)), x̂(n + 1|n + 1) = x̂(n + 1|n) + K(n)(z(n + 1) − h(x̂(n + 1|n)))
where the gain K(n) is chosen so that Tr(P(n + 1|n + 1)) is a minimum, P(n + 1|n) and P(n + 1|n + 1) being the prediction and filtering error covariances. Now Tr(P(n + 1|n + 1)) can be minimized easily w.r.t K(n) by a variational principle in matrix calculus. Also note that
e(n + 1|n) = x(n + 1) − x̂(n + 1|n) ≈ f′(x̂(n|n))e(n|n) + g(x̂(n|n))w(n + 1)
and so
P(n + 1|n) = F(n)P(n|n)F(n)^T + G(n).G(n)^T
where
F(n) = f′(x̂(n|n)), G(n) = g(x̂(n|n))
Now consider the error sequence e(n) = e(n|n) = x(n) − x̂(n|n). From the above equations, it is clear that its covariance P(n) = P(n|n) satisfies approximately a quadratic (Riccati type) difference equation. The question is, under what conditions does P(n) converge to a constant matrix P(∞) and if so, then at what rate does it converge to this limiting error covariance?
Let S_n = X_1 + ... + X_n where the X_k are independent, and define Z_n = S_n/n. Assume for the moment that X_n is real valued. Then,
E[exp(λS_n)] = Π_{k=1}^n M_k(λ)
where M_k is the moment generating function of X_k. We form
n^{−1}Λ_n(nλ) = n^{−1} Σ_{k=1}^n log(M_k(λ))
and we can apply the Gartner-Ellis theorem to obtain the LDP for {Z_n}.
where V(r) is the static nuclear potential. The solution to this equation can formally be written as
ψ(t, r) = T{exp(−i∫_0^t H(s)ds)}ψ(0, r)
where
H(t) = −(2m)^{−1}(∇ + ieA(t, r))² + V(r)
is a random operator valued stationary stochastic process. Define the random unitary operator valued stochastic process
U(t) = T{exp(−i∫_0^t H(s)ds)}
Then, we wish to formulate a large deviation principle for U (t). The transition
probability amplitude per unit time from a state |u > to another state |v > is
given by
Kt (u, v) = t−1 < v|U (t)|u >
and the transition probability per unit time between the same states is given by |K_t(u, v)|². We can write H(t) = H0 + eV1(t) + e²V2(t), where
H0 = −∇²/2m + V(r),
V1(t) = (−i/2m)(divA_t + 2(A_t, ∇)), V2(t) = A_t²/2m
Upto O(e²), we get
U(t) = U0(t) − ie∫_0^t U0(t − s)V1(s)U0(s)ds − e² ∫∫_{0<s2<s1<t} U0(t − s1)V1(s1)U0(s1 − s2)V1(s2)U0(s2)ds1 ds2 − ie² ∫_0^t U0(t − s)V2(s)U0(s)ds
V1 , V2 are random operator valued stationary stochastic processes. A special
case in which the A(t, r) is white w.r.t the time variable implies that the oper-
ators V1 (t), V2 (t) are also white noises with V1 being white Gaussian. However,
V2 (t) being the square of a white noise process is not precisely defined. So
we pass over to discrete time in order to formulate the LDP precisely. The
discretized form of the above Dyson series solution is
n
W [n] = U0 [n]−1 [U [n] − U0 [n]] = −ieΔ U0 [−k]V1 [k]U0 [k]
k=0
n
−e2 (iΔ U0 [−k]V2 [k]U0 [k] + Δ2 U0 [−m]V1 [m]U0 [m − k]V1 [k]U0 [k])
k=0 0≤k≤m≤n
In this expression, {(V1 [k], V2 [k])} is an iid bivariate operator sequence. Let us
first consider only the linear term:
W[n] = −ie Σ_{k=0}^n U0[−k]V1[k]U0[k]
Let
E[exp(Tr(XV1[k]))] = M1(X)
where X is a Hermitian matrix.
Then, the error probability for the test T where ρ has a-priori probability P1 and σ has a-priori probability P2 is given by
Pr(e, T) = Σ_{k,m} P1 p(m)|< u_m|v_k >|² + Σ_{k,m}(P2 q(k)< u_m|v_k >< v_k|T|u_m > − P1 p(m)< v_k|u_m >< u_m|T|v_k >)
For a projection T,
Tr(σT) = Σ_{k,m} q(k)< v_k|T|u_m >< u_m|T|v_k > = Σ_{k,m} q(m)|< u_k|T|v_m >|²
and likewise Tr(ρ(1 − T)) = Σ_{k,m} p(k)|< u_k|1 − T|v_m >|². Further,
p(k)|< u_k|1 − T|v_m >|² + q(m)|< u_k|T|v_m >|² ≥ min(p(k), q(m)).(|< u_k|1 − T|v_m >|² + |< u_k|T|v_m >|²)
≥ (1/2)α(k, m)|< u_k|1 − T|v_m > + < u_k|T|v_m >|² = (1/2)α(k, m)|< u_k|v_m >|²
where
α(k, m) = min(p(k), q(m))
Thus,
Tr(ρ(1 − T)) + Tr(σT) ≥ (1/2) Σ_{k,m} α(k, m)|< u_k|v_m >|²
and hence Pr(e, T) ≥ min(P1, P2).(1/2) Σ_{k,m} α(k, m)|< u_k|v_m >|².
Note that this result is true for any two positive definite matrices ρ, σ. Now let ρ, σ be two states and define T_n to be the orthogonal projection onto the subspace where exp(nR)ρ^{⊗n} − σ^{⊗n} > 0. Let
ω = (k1, ..., kn, m1, ..., mn)
and
P^{⊗n}(ω) = p(k1)...p(kn).Π_{r=1}^n |< u_{kr}|v_{mr} >|²
Q^{⊗n}(ω) = q(m1)...q(mn).Π_{r=1}^n |< u_{kr}|v_{mr} >|²
Then,
Tr(ρ^{⊗n}(1 − T_n)) + Tr(σ^{⊗n}T_n) ≥ (1/2) Σ_{k1,...,kn,m1,...,mn} min(p(k1)...p(kn), q(m1)...q(mn)).Π_{r=1}^n |< u_{kr}|v_{mr} >|²
= (1/2) Σ_ω min(P^{⊗n}(ω), Q^{⊗n}(ω))
≥ (1/2)P^{⊗n}{ω : P^{⊗n}(ω) < Q^{⊗n}(ω)} + (1/2)Q^{⊗n}{ω : P^{⊗n}(ω) > Q^{⊗n}(ω)}
Note that P^{⊗n}(ω) is the n-fold product measure whose one dimensional marginals are P1(k, m) = p(k)|< u_k|v_m >|² and Q^{⊗n}(ω) is the n-fold product measure whose one dimensional marginals are Q1(k, m) = q(m)|< u_k|v_m >|². Large deviation theory, or more precisely Cramer's theorem in the classical setting, can now be applied to this problem to derive an asymptotic exponential lower bound for the error probability.
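The classical lower bound above can be checked numerically against the optimal test for a pair of qubit states (a sketch with randomly generated states; the optimal value 1 − ||ρ − σ||₁/2 of Tr(ρ(1−T)) + Tr(σT) over all tests is standard):

```python
import numpy as np

# Check Tr(rho(1-T)) + Tr(sigma T) >= (1/2) sum min(p_k, q_m)|<u_k|v_m>|^2
# against the optimal test, for random qubit states (illustrative states)
def rand_state(seed):
    rng = np.random.default_rng(seed)
    M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = M @ M.conj().T
    return rho / np.trace(rho).real

rho, sigma = rand_state(0), rand_state(1)
p, U = np.linalg.eigh(rho)                       # rho = sum_k p_k |u_k><u_k|
q, V = np.linalg.eigh(sigma)
overlap2 = np.abs(U.conj().T @ V) ** 2           # |<u_k|v_m>|^2

bound = 0.5 * sum(min(p[k], q[m]) * overlap2[k, m]
                  for k in range(2) for m in range(2))
opt = 1.0 - 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()
assert bound <= opt + 1e-12                      # the lower bound holds
```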
Let G = SO(3) act on an image field f(r) to produce another image field
g(r) = f(R^{−1}r) + √ε.w(r), R ∈ SO(3)
where w(r) is a zero mean Gaussian noise field that has G-invariant correlations. Let Y_{lm}(r̂) denote the standard spherical harmonics. Then the noise correlations have the form
K_w(r, r′) = Σ_{l,m} σ(l)² Y_{lm}(r̂).Y_{lm}(r̂′)*
Let
f_{lm} = ∫ f(r)Y_{lm}(r̂)* dΩ(r̂), f_l = ((f_{lm}))_{−l≤m≤l}
Then,
g_l = π_l(R)f_l + √ε.w_l, l = 0, 1, 2, ...
and moreover, by the orthogonality of the spherical harmonics for different l, it is easy to see that w_l, l = 0, 1, 2, ... are independent complex Gaussian random vectors with zero mean and covariance E[w_l w_l*] = σ(l)²I.
We can write the rotation to be estimated as
S = exp(X)R
and then
R̂ = exp(X̂)R
where
X̂ = argmin_X Σ_l σ(l)^{−2} ||g_l − exp(X_l).π_l(R)f_l||²
with
X_l = dπ_l(X)
ie, dπ_l is the representation of the Lie algebra of G corresponding to the representation π_l of G:
π_l(exp(X)) = exp(dπ_l(X))
Our aim is to use the LDP to determine approximately the probability P(||X̂|| > δ) in terms of a rate function I(X), ie,
P(||X̂|| > δ) ≈ exp(−ε^{−1} inf(I(X) : ||X|| > δ))
and then determine an optimal image field f(r) so that this "error probability" is as small as possible subject to the constraint that f belongs to a class of image fields. We make an approximate calculation noting that X ≈ 0:
exp(X_l) ≈ 1 + X_l + X_l²/2
Then,
||g_l − exp(X_l).π_l(R)f_l||² ≈ ||g_l − π_l(R)f_l − X_l π_l(R)f_l − X_l² π_l(R)f_l/2||²
≈ ||g_l − π_l(R)f_l||² + ||X_l π_l(R)f_l||² − 2Re(< g_l − π_l(R)f_l, X_l π_l(R)f_l >) − Re(< g_l − π_l(R)f_l, X_l² π_l(R)f_l >)
≈ ||g_l − π_l(R)f_l||² + Tr(X_l π_l(R)f_l f_l* π_l(R^{−1})X_l*)
− 2 Σ_{i=1}^3 x_i.Re(Tr[(g_l − π_l(R)f_l)f_l* π_l(R^{−1})Z_{il}*])
− Σ_{i,j=1}^3 x_i x_j.Re(Tr[(g_l − π_l(R)f_l)f_l* π_l(R^{−1})Z_{il}* Z_{jl}*])
where we have written X_l = Σ_{i=1}^3 x_i Z_{il} with Z_{il} = dπ_l(Z_i) for a basis Z_1, Z_2, Z_3 of the Lie algebra of G.
Setting the gradient of the weighted sum w.r.t. x to zero gives, for i = 1, 2, 3,
Σ_l Σ_{j=1}^3 σ(l)^{−2} Tr(Z_{il} π_l(R)f_l f_l* π_l(R^{−1})Z_{jl}*)x_j − Σ_l Σ_{j=1}^3 σ(l)^{−2} Re(Tr[(g_l − π_l(R)f_l)f_l* π_l(R^{−1})Z_{il}* Z_{jl}*])x_j
= Σ_l σ(l)^{−2} Re(Tr[(g_l − π_l(R)f_l)f_l* π_l(R^{−1})Z_{il}*])
ie,
x = (x_i) = A^{−1}b
where
A = ((A(i, j))) ∈ R^{3×3}, b = ((b(i))) ∈ R^{3×1}
are respectively defined by
A(i, j) = Σ_l σ(l)^{−2} Tr(Z_{il} π_l(R)f_l f_l* π_l(R^{−1})Z_{jl}*) − Σ_l σ(l)^{−2} Re(Tr[(g_l − π_l(R)f_l)f_l* π_l(R^{−1})Z_{il}* Z_{jl}*])
and
b(i) = Σ_l σ(l)^{−2} Re(Tr[(g_l − π_l(R)f_l)f_l* π_l(R^{−1})Z_{il}*])
We can write
A = A1 + √ε.A2, b = √ε.c
where
A1(i, j) = Σ_l σ(l)^{−2} Tr(Z_{il} π_l(R)f_l f_l* π_l(R^{−1})Z_{jl}*)
A2(i, j) = −Σ_l σ(l)^{−2} Re(Tr[w_l f_l* π_l(R^{−1})Z_{il}* Z_{jl}*])
c(i) = Σ_l σ(l)^{−2} Re(Tr[w_l f_l* π_l(R^{−1})Z_{il}*])
When ε = 0 we get
x = A^{−1}b = 0
since ε = 0 implies
A = A1, b = 0
with A1 being non-singular provided that we base our estimates on choosing a sufficiently large number of indices l. Note that A, b depend on f, R. Now, we have the approximate solution
x = √ε.A1^{−1}c + O(ε)
or more precisely,
x = √ε(A1^{−1} − √ε.A1^{−1}A2 A1^{−1})c + O(ε^{3/2}) = √ε.A1^{−1}c − ε.A1^{−1}A2 A1^{−1}c + O(ε^{3/2})
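A toy numerical version of this linearized estimator in the l = 1 (vector) representation, where π_l(R) = R and dπ_l(X) = X (the image modes and rotations below are my own illustrative choices, with the noise switched off so the estimate should recover X up to O(||X||²)):

```python
import numpy as np

Z = [np.array([[0., 0., 0.], [0., 0., -1.], [0., 1., 0.]]),
     np.array([[0., 0., 1.], [0., 0., 0.], [-1., 0., 0.]]),
     np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]])]   # so(3) basis

def expm3(M, terms=30):                   # matrix exponential by power series
    out, P = np.eye(3), np.eye(3)
    for k in range(1, terms):
        P = P @ M / k
        out = out + P
    return out

rng = np.random.default_rng(3)
R = expm3(sum(c * z for c, z in zip([0.4, -0.2, 0.7], Z)))
x_true = np.array([0.005, -0.01, 0.008])  # small rotation parameters
S = expm3(sum(c * z for c, z in zip(x_true, Z)))

fs = [rng.normal(size=3) for _ in range(4)]        # image "modes" f_l
gs = [S @ R @ f for f in fs]                       # noiseless observations

# linearization g_l - R f_l ~ sum_i x_i Z_i R f_l  ->  least squares in x
M = np.vstack([np.column_stack([z @ R @ f for z in Z]) for f in fs])
rhs = np.concatenate([g - R @ f for f, g in zip(fs, gs)])
x_hat, *_ = np.linalg.lstsq(M, rhs, rcond=None)
assert np.linalg.norm(x_hat - x_true) < 1e-3       # error is O(||X||^2)
```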
From the standard LDP for Gaussian random vectors, the approximate rate function for x is given by
I(x) = (1/2)x^T (A1^{−1}R_{cc}A1^{−1})^{−1} x = (1/2)x^T A1 R_{cc}^{−1} A1 x
where
R_{cc} = E(cc^T) = ((R_{cc}(i, j)))
with
R_{cc}(i, j) = Σ_{l,k} σ(l)^{−2}σ(k)^{−2} Cov(Re(Tr[w_l f_l* π_l(R^{−1})Z_{il}*]), Re(Tr[w_k f_k* π_k(R^{−1})Z_{jk}*]))
= Σ_l σ(l)^{−4} Cov(Re(f_l* π_l(R^{−1})Z_{il}* w_l), Re(f_l* π_l(R^{−1})Z_{jl}* w_l))
= (1/2) Σ_l σ(l)^{−2} Re(f_l* π_l(R^{−1})Z_{il}* Z_{jl} π_l(R)f_l)
Now
inf(I(x) : ||x|| > δ) = inf((1/2)x^T A1 R_{cc}^{−1} A1 x : ||x|| > δ) = (δ²/2)λ_{min}(T)
where λ_{min}(T) is the minimum eigenvalue of the 3 × 3 positive definite matrix
T = A1 R_{cc}^{−1} A1
Define
Z_n = S_n/n, S_n = X1 + ... + X_n
and let Λ_n(.) be the logarithmic moment generating function of Z_n. Clearly, for iid samples,
Λ_n(nη) = nΛ(η)
Then let μ_n(.) denote the probability distribution of Z_n and for some fixed η ∈ R, let ν_n(.) denote the probability distribution on R defined by
dν_n(z) = exp(nηz − nΛ(η))dμ_n(z)
The mean of ν_n is
∫ z dν_n(z) = exp(−nΛ(η))(d/d(nη))(exp(Λ_n(nη))) = (d/d(nη))(Λ_n(nη)) = Λ_n′(nη) = Λ′(η) = y
say. We have
μ_n(B(y, δ)) = ∫_{B(y,δ)} exp(−nηz + nΛ(η))dν_n(z)
= ∫_{B(y,δ)} exp(−nη(z − y) + nΛ(η) − nηy)dν_n(z)
= exp(n(Λ(η) − yη)) ∫_{B(y,δ)} exp(−nη(z − y))dν_n(z)
≥ exp(n(Λ(η) − yη)).exp(−n|η|δ).ν_n(B(y, δ))
so that
n^{−1} log(μ_n(B(y, δ))) ≥ n^{−1} log(ν_n(B(y, δ))) + Λ(η) − yη − |η|δ
By the weak law of large numbers applied under the tilted measure,
ν_n(B(y, δ)) → 1
and hence, letting δ ↓ 0 and maximizing over η,
lim inf_n n^{−1} log(μ_n(B(y, δ))) ≥ −sup_η(yη − Λ(η)) = −Λ*(y)
This completes the proof of the lower bound in Cramer's theorem on large deviations for sample averages of iid random variables.
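The lower bound, together with the matching Chernoff upper bound, can be checked exactly for fair coin flips (an illustrative computation):

```python
import math

# Exact check of Cramer's theorem for fair Bernoulli variables:
# P(Z_n >= a) = 2^{-n} sum_{k >= an} C(n, k), and n^{-1} log of this
# should decrease to -I(a), with I(a) = a log(2a) + (1-a) log(2(1-a))
a = 0.7
I = a * math.log(2 * a) + (1 - a) * math.log(2 * (1 - a))

def rate(n):
    tail = sum(math.comb(n, k) for k in range(math.ceil(a * n), n + 1))
    return -(math.log(tail) - n * math.log(2)) / n

r50, r400 = rate(50), rate(400)
assert I < r400 < r50          # Chernoff upper bound: rate(n) > I for every n
assert r400 - I < 0.02         # and the rate approaches I as n grows
```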
More generally, assume only that
n^{−1}Λ_n(nη) → Λ(η)
exists. The mean of ν_n is again asymptotically y = Λ′(η), and the same computation gives
n^{−1}.log(μ_n(B(y, δ))) ≥ n^{−1}.log(ν_n(B(y, δ))) + n^{−1}Λ_n(nη) − yη − ηδ
The rate function of the tilted measures ν_n at a point x is Λ*(x) + Λ(η) − ηx ≥ 0, with equality iff x = y. By the large deviation upper bound (proved easily using the Chebyshev-Markov inequality), ν_n(B(y, δ)^c) decays exponentially, by hypothesis on y. Thus,
ν_n(B(y, δ)^c) → 0
and hence we infer that
ν_n(B(y, δ)) → 1
from which it follows that the lower bound holds in this more general setting as well.
As a special case of this general theorem, let X(t) be a process whose empirical measure L_X(t, .) defined by
L_X(t, B) = t^{−1} ∫_0^t χ_B(X(s))ds, t > 0
satisfies the LDP as t → ∞ with rate I(ν). Then Varadhan's integral lemma implies that
t^{−1}.log E exp(t∫ V(x)L_X(t, dx)) = t^{−1}.log E[exp(∫_0^t V(X(s))ds)] → sup_ν (∫ V(x)dν(x) − I(ν))
where the supremum is over probability distributions ν and f(x) = dν(x)/dx is the probability density of the probability distribution ν.
Chapter 10
Contributions of some Indian Scientists
had applied this theory to give a rigorous derivation of the general form of ob-
servables like position, momentum, angular momentum, spin and energy of a
quantum mechanical system using the projective unitary representations of the
Galilean group. Varadarajan, after mastering Mackey's theory, returned to ISI Calcutta and gave a course to the research students on all this. Simultaneously,
he gave courses at the ISI on the structure theory of Lie groups and Lie alge-
bras and on Von-Neumann’s celebrated work on operator theory in quantum
mechanics, all from a rigorous mathematical angle. Varadarajan was thus a
pioneer in Indian Mathematics even at a very young age when he injected se-
rious mathematics and mathematical physics like functional analysis, operator
theory, Lie groups, Lie algebras and their representations and the mathematical
foundations of quantum mechanics into an otherwise dull environment where
only probability and statistics existed. The outcome of Varadarajan’s lectures
on these new subjects can be seen today when many research students in his
audience at that time have flowered into celebrated mathematicians and math-
ematical physicists whose works in statistical mechanics, group representation
theory and quantum noise and non-commutative probability theory are known
all over the world.
While Varadarajan was visiting America, G.W. Mackey had advised him to read the "terrifying algebraic machinery" of Harish-Chandra for constructing irreducible representations of semisimple Lie groups and Lie algebras and deriving Plancherel formulae for semisimple Lie groups that have more than one non-conjugate Cartan subgroup, extending by leaps and bounds the work of Gelfand on the representation theory of complex semisimple Lie groups (which have just a single non-conjugate Cartan subgroup). So Varadarajan proceeded to Princeton, met Harish-Chandra and after many discussions with the master, he mastered the general theory of roots developed by Cartan leading to
the classification of all simple Lie algebras, distributional characters, Harish-
Chandra’s (g, K) modules and the Discrete series for real semisimple Lie groups,
the outcome of which was two marvellous books, one, a textbook on Lie groups,
Lie algebras and their representations and another on Harmonic analysis on
semisimple Lie groups. The former gives a detailed account accessible to the
graduate student on Harish-Chandra’s 1951 paper in which the master had de-
rived beautiful algebraic formulae for finite and infinite dimensional irreducible
representations of a semisimple Lie algebra corresponding to each dominant in-
tegral Cartan weight using quotients of the universal enveloping algebra by a
maximal ideal. In this textbook, Varadarajan gives a detailed and easily acces-
sible account of Cartan’s theory of roots and weights for classifying all simple
Lie algebras and their irreducible representations. This book culminates in the
proof of the celebrated Weyl character formula for all irreducible representations of a compact Lie group for a given dominant integral weight. Innumerable exercises are present in this book, all based on Varadarajan's original research. Exercises include the later discovered Verma module for
constructing infinite dimensional irreducible representations of semisimple Lie
algebras and the Frobenius-Young theory of irreducible characters of the permu-
tation groups. Hints are provided wherever required which enable the student
Advanced Probability and Statistics: Applications to Physics and Engineering 275
to understand the theory much better than if he were given the complete solu-
tion. It is no exaggeration to say that any student who is able to work through
all these exercises can easily begin research in this field especially on infinite
dimensional representations and representations of Lie super-algebras with ap-
plications to supersymmetry. The second book is a priceless gem as it is all
about presenting Harish-Chandra’s original work on distributional characters,
the discrete series and Plancherel’s formula for real semisimple Lie groups in a
readily accessible form by considering the prototype example of SL(2, R), which is the simplest example of a semisimple Lie group having more than one non-conjugate Cartan subgroup. Varadarajan, in this book, begins with the work of Gelfand and Naimark on the representation theory of complex semisimple Lie groups and their Plancherel formula. For this, the principal series representation of SL(n, C), namely that induced by the characters of the upper triangular
subgroup is first explained in full detail along with a proof of its irreducibil-
ity based on Mackey’s theory of induced representations. The supplementary
series discovered by Gelfand is then introduced in terms of an invariant inner
product on functions defined on C2 . Varadarajan then explains the startling
result of Gelfand that only the principal series appear in the Plancherel for-
mula for the complex semisimple Lie groups and notes that this was the reason
why Gelfand totally missed out the Discrete series while attempting to derive
a Plancherel formula for the real semisimple Lie groups. Varadarajan then
explains via the theory of Harish-Chandra’s (g, K) modules, the infinite dimen-
sional irreducible representations (more precisely (g, K) modules) of SL(2, R)
and how apart from the principal series, the discrete series also appears. Later
on in the book, Varadarajan introduces the discrete series modules from an
analytic viewpoint by realizing it within the principal series of representations.
Using the powerful theory of orbital integrals developed by Gelfand for arriving
at the Plancherel formula for the complex semisimple Lie groups, Varadarajan
remarks that Harish-Chandra perfected Gelfand’s orbital integral theory on the
group to orbital integral theory on the Lie algebra using the method of Fourier
transforms on a Lie algebra and using this theory arrived at the Plancherel
formula for real semisimple Lie groups like SL(2, R). Here, unlike the complex case where the orbital integral theory is a generalization of Weyl's integration formula for compact Lie groups, the orbital theory involves integration over the different non-conjugate Cartan subgroups; as in the complex case, the orbital integrals can be used directly to obtain the distributional characters of the principal and discrete series. The discrete series are infinite
dimensional irreducible representations of SL(2, R), first introduced in this spe-
cial case by Bargmann, which when included along with the principal series
yields the Plancherel formula for SL(2, R). All this, along with many interesting stories, is told in Varadarajan's book, like how Gelfand missed the discrete series because he adopted the Lie group approach rather than the Lie algebraic approach and hence failed in deriving the Plancherel formula for groups having more than one non-conjugate Cartan subgroup like SL(n, R), n ≥ 2. Whilst
dealing with the discrete series representations of SL(n, R) for n > 1 Harish-
Chandra introduced parabolic subgroups and parabolic induction to construct
of the proper orthochronous Lorentz group acting in the 2-D plane × time axis and hence can be used for pattern recognition for two dimensional time varying image fields. Many engineers working in signal and image processing today
learn about the representation theory of SL(2, C) and SL(2, R) from Varadara-
jan’s book and apply it successfully to pattern recognition and estimation of
the Lorentz transformation from the initial object field and the final object
field corrupted with noise. Further, the representation theory of the symplectic group is important in doing pattern recognition for classical mechanical systems. For example, H(q, p) = (1/2)p^T Ap + (1/2)q^T Bq, where A, B are positive definite matrices, is the Hamiltonian of a system of linearly coupled harmonic oscillators.
Under this Hamiltonian dynamics, (q(t)T , p(t)T )T is obtained by applying a lin-
ear symplectic transformation to (q(0)^T, p(0)^T)^T. Now consider an observable f(x), x = (q, p), on phase space. This observable may for example represent a signal field produced by the particles of the oscillator system dependent on their phase space configuration, ie positions and momenta; for example, if they emit
electromagnetic waves of definite frequencies, then the measured field amplitude
will depend on their positions and the measured field frequencies will depend
on their momenta in view of the Doppler effect. After time t, the measured signal field becomes f(T_t^{-1} x) where T_t is a symplectic matrix dependent on the matrices A, B. When this signal field gets corrupted with noise, we can apply the representation theory of the symplectic group to pull the matrix T_t out of the observable function and hence estimate the parameters on which A, B depend
easily. Varadarajan’s exposition of the representation theory of the symplectic
group is very useful. I have myself benefited a great deal by reading his text
book and applying it to this system feature estimation problem.
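As a minimal numerical sketch of this setup, one can take the assumed Hamiltonian H(q, p) = (1/2)p^T Ap + (1/2)q^T Bq (the matrices A, B and the time t below are illustrative choices, not from the text) and check that the resulting linear flow map T_t is symplectic and energy preserving:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 3
# illustrative positive definite matrices (assumptions, not from the text)
A = np.eye(n)
R = rng.standard_normal((n, n))
B = R @ R.T + n * np.eye(n)

# Hamilton's equations q' = A p, p' = -B q give a linear flow
# d/dt (q, p) = M (q, p) with M = [[0, A], [-B, 0]]
M = np.block([[np.zeros((n, n)), A], [-B, np.zeros((n, n))]])
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

t = 0.7
Tt = expm(M * t)                     # flow map (q(0), p(0)) -> (q(t), p(t))

# Tt is a symplectic matrix: Tt^T J Tt = J
assert np.allclose(Tt.T @ J @ Tt, J)

# and the Hamiltonian is conserved along the flow
x0 = rng.standard_normal(2 * n)
x1 = Tt @ x0
H = lambda x: 0.5 * x[n:] @ A @ x[n:] + 0.5 * x[:n] @ B @ x[:n]
assert np.isclose(H(x0), H(x1))
```

Estimating the parameters inside A, B from a noisy observation of f(T_t^{-1} x) then amounts to estimating T_t, which is where the representation theory enters.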
Even before meeting Harish-Chandra, just after meeting Mackey and lectur-
ing at the ISI on the axiomatic foundations of quantum mechanics, Varadarajan
had summarized his ISI lectures on the mathematical foundations of quantum
mechanics in the form of an impeccable book titled ”Geometry of quantum
theory” which later on matured into a volume published by Springer. In this
book, Varadarajan talks about various kinds of quantum logics that do not
obey Boolean rules, like for example, the logic based on orthogonal projections
in Hilbert space. He then deals with the precise formulation of Mackey’s imprim-
itivity theorem which states that upto a unitary isomorphism, every imprimitiv-
ity system is equivalent to a certain canonical imprimitivity system whose uni-
tary representation is obtained by inducing a representation of a smaller group
in a certain ”smaller” Hilbert space. Varadarajan’s book on quantum theory
also contains a nice proof of Gleason’s theorem which is one of the cornerstone
theorems in the foundations of quantum mechanics. This theorem states that
every probability measure on the lattice of orthogonal projections in a Hilbert
space can be obtained via a density operator/density matrix. Then Varadarajan
in this book discusses Wigner's theorem on representing automorphisms of the projection lattice by unitary and antiunitary operators in the Hilbert space and shows that the theorem is intimately connected with Mackey's imprimitivity
system. More precisely, Mackey’s system consists of covariant observables under
a group action which assumes that given a group G and a map g → τ (g) from
where

J(θ) = −E[∂² log p(X|θ)/∂θ ∂θ^T] = E[(∂ log p(X|θ)/∂θ)·(∂ log p(X|θ)/∂θ^T)]
This lower bound states that no matter how one may construct an estimate of
the parameter, one cannot obtain an accuracy beyond a certain limit. This is
just like the uncertainty principle of Heisenberg in quantum mechanics which
states that no matter what the state of a quantum system is, one cannot simulta-
neously measure two non-commuting observables like position and momentum
with infinite accuracy. Recently, mathematical physicists have shown that it
is possible to derive the Heisenberg uncertainty principle from the CRLB and
vice-versa. The idea is roughly to choose a wave function ψ(x) on R and regard
it as the position space wave function. We then shift this wave function by u so
that it becomes ψ(x|u) = ψ(x − u). The corresponding momentum space wave
function is its Fourier transform ψ̂(p|u) = exp(−ipu)ψ̂(p). The probability den-
sity of the position observable in this state is |ψ(x|u)|2 = |ψ(x − u)|2 while the
probability density of the momentum observable in this state is |ψ̂(p)|2 . When
we try to estimate the position shift u from measurement of the position x, we
denote this estimate by û(x). Then the CRLB implies that

∫(û(x) − u)² |ψ(x|u)|² dx ≥ 1/∫(∂ log(|ψ(x|u)|²)/∂u)² |ψ(x|u)|² dx

or equivalently,

(∫(û(x) − u)² |ψ(x − u)|² dx)·(∫((∂/∂u)|ψ(x − u)|²)²/|ψ(x − u)|² dx) ≥ 1

If ψ(x) is a real wave function, then we can easily see by appropriate choice of the estimator û(x) that this inequality implies

(∫x² ψ(x)² dx)·(∫ψ′(x)² dx) ≥ 1/4
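A quick numerical illustration of the CRLB in the Gaussian shift model, the one case where the bound is attained exactly by the sample mean (the particular values of σ, u, n below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, u, n, trials = 2.0, 1.5, 50, 20000

# Fisher information for the location parameter of N(u, sigma^2) is
# 1/sigma^2 per sample, so the CRLB for n iid samples is sigma^2/n
crlb = sigma**2 / n

# the sample mean is an unbiased estimator of the shift u
X = rng.normal(u, sigma, size=(trials, n))
u_hat = X.mean(axis=1)
mse = np.mean((u_hat - u) ** 2)

# the sample mean attains the CRLB (it is efficient in this model)
assert abs(mse / crlb - 1) < 0.05
```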
ie, r does not depend on θ. This implies that knowledge of T (X) alone suffices to
determine all the information required to estimate θ from X. C.R.Rao then did
a great deal of work on estimation of parameters in linear models using various
kinds of generalized inverses of matrices and algorithms for the construction of
such generalized inverses and their properties. For example, if X = Hθ + V is a linear model and we wish to estimate θ from X by minimizing ‖X − Hθ‖², where the norm may be taken w.r.t. any positive semidefinite matrix W, then we may have a unique or non-unique solution. If the solution is non-unique and we choose that solution having minimum norm, then the estimate of θ is uniquely determined, and we can write this solution as θ̂ = pinv(H)X, where pinv(H) is termed the pseudo-inverse, also called the Moore-Penrose inverse, of H. C.R. Rao derived necessary and sufficient conditions for the non-unique least squares generalized inverse, and the unique pseudo-inverse. Further, if the
solution to the linear system X = Hθ exists for θ, then it may be non-unique,
but if we choose that solution having the minimum norm, then the solution
set becomes smaller and is given by the ”minimum-norm generalized inverse”
of H applied to X. C.R.Rao derived necessary and sufficient conditions that
the minimum norm generalized inverses of a given matrix must satisfy. It turns
out that all these generalized inverses can be expressed very easily in terms of
the singular value decomposition of the matrix. C.R.Rao did a variety of work
on singular Gaussian distributions which occur commonly in statistics. This means that we have a Gaussian random vector amongst whose components one or more non-trivial linear relations exist. This is equivalent to saying that the covariance
matrix of the Gaussian vector is singular. In this case, we cannot get an explicit
representation of the joint probability density of the Gaussian vector but we
can get an explicit representation for the joint characteristic function of the
Gaussian vector. One way to analyze such singular Gaussian distributions is to
express the Gaussian vector as a rectangular matrix acting on a non-singular
Gaussian random vector of a smaller size and to look at the density of the smaller
sized vector. However, this is not a coordinate free approach since there exists an infinite number of such representations. The coordinate free approach developed
by C.R.Rao is based on his theory of generalized inverses of singular matrices and
is well summarized in his celebrated book ”Linear statistical inference and its
applications”. C.R.Rao has proved a variety of theorems concerning asymptotic
efficiency of statistical estimators of parameters. For example, if we are given an
infinite sequence X1 , X2 , ... of iid random variables having common pdf p(x|θ)
then a natural question to ask is whether this sequence of estimators will have a variance that asymptotically decays as the CRLB, given by
[E[(∂log(p(X1 , ..., Xn |θ))/∂θ)2 ]]−1 .
He developed analytic tools for examining such problems also in the stationary
dependent case. A statistical estimator of a parameter is said to be efficient if its variance equals the CRLB; it is said to be asymptotically efficient if the ratio of its variance to the CRLB converges to unity almost surely as the number of data
samples goes to infinity. It is a well known fact that if there exists an estimator
that is efficient then the maximum likelihood estimator is one such. The proof
of this fact is contained in the appendix. C.R.Rao’s work on the CRLB has been
extended to quantum systems. The problem with quantum systems unlike clas-
sical systems is that given a state, it is not clear what observable to measure or
equivalently, what commuting family of observables to measure or equivalently,
what complete orthogonal family of orthogonal projections (PVM or projection
valued measurement) to measure in order to get an estimator that will have
the least variance. Researchers working in quantum information theory have
therefore arrived at the most general kind of measurement, namely POVMs (an abbreviation for positive operator valued measurement), which contain all the above kinds of measurements as special cases. Such a measurement is given by
M = {M_α : α ∈ I} where I is a countable index set such that M_α ≥ 0 ∀α ∈ I and Σ_{α∈I} M_α = 1. If ρ is a state in the same Hilbert space on which these measurement operators are defined, then on making the measurement M, the probability of getting the outcome α ∈ I is given by

P_M(α) = Tr(ρ·M_α)
Thus, PM is a probability distribution on I. If we map I into the real line in any
way, then we can regard I as a countable subset of R and if ρ = ρ(θ) depends
on a parameter θ, we can write
P_M(α|θ) = Tr(ρ(θ)M_α)
Based on this measurement system, an estimator of θ is a function θ̂(α) of
the outcome α and by the classical CRLB, its mean square error based on the
measured outcome satisfies
E_M[(θ̂(α) − θ)²] ≥ J_M(θ)^{-1}
where

J_M(θ) = Σ_α P_M(α|θ) (∂ log P_M(α|θ)/∂θ)²
The optimal measurement M in order to estimate θ is obviously that POVM
M for which JM (θ) is a maximum and clearly, this depends on the parameter
θ itself. This result is readily extended to the vector parameter case.
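A toy numerical sketch of these formulas for an assumed one-parameter qubit family ρ(θ) = (I + θσ_z)/2 measured with the σ_z projective POVM; this standard model, with classical Fisher information 1/(1 − θ²), is an illustrative assumption and not an example from the text:

```python
import numpy as np

# qubit state family rho(theta) = (I + theta * sigma_z)/2, |theta| < 1
sz = np.diag([1.0, -1.0])
rho = lambda th: 0.5 * (np.eye(2) + th * sz)

# two-outcome POVM: projections onto the sigma_z eigenbasis
M = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
assert np.allclose(sum(M), np.eye(2))        # completeness: sum_a M_a = 1

theta = 0.3
p = np.array([np.trace(rho(theta) @ Ma).real for Ma in M])
assert np.isclose(p.sum(), 1.0)              # Tr(rho M_a) is a probability distribution

# classical Fisher information J_M of the outcome distribution,
# via a numerical derivative in theta
eps = 1e-6
dp = (np.array([np.trace(rho(theta + eps) @ Ma).real for Ma in M]) - p) / eps
J = np.sum(dp**2 / p)
assert np.isclose(J, 1.0 / (1.0 - theta**2), rtol=1e-3)   # analytic value here
```

Optimizing J_M over all POVMs M, as described above, would in this toy model single out the measurement basis best adapted to the parameter θ.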
probability measure P on the space of continuous paths such that when under
P , the process Mf,X (t) is a Martingale for all f ∈ C 2 , then in the language of
Stroock and Varadhan, we say that the Martingale problem is well posed and
the solution X having the probability law P is then called an Ito process. The
form of L_ω is

L_ω = Σ_i μ_i(t, ω) ∂/∂x_i + (1/2) Σ_{i,j} a_{ij}(t, ω) ∂²/∂x_i∂x_j
where ω is the coordinate process and μi , aij are progressively measurable pro-
cesses. Stroock and Varadhan determine existence and uniqueness conditions
for Ito processes and as one clearly expects, such processes are constructed by
patching together scaled Brownian motion processes with drift over infinitesi-
mal time intervals. The Stroock-Varadhan theory therefore extends the scope
of the Ito theory of stochastic differential equations by including drift and diffu-
sion coefficients satisfying only a boundedness condition and with the resulting process not necessarily satisfying an sde in the Ito pathwise sense, but rather being defined by a probability measure on the space of continuous paths which is uniquely determined by the solution to the Martingale problem.
After this work, Donsker and Varadhan in a series of historic papers developed
the theory of large deviations for general random variables taking values in a
metric space or even more generally, taking values in a topological vector space.
This includes process valued random variables. Varadhan while starting his
research in this field, was first inspired by the following question of Donsker:
Suppose B(t) is Brownian motion and consider evaluating the expectation

u(t, x) = E[exp(∫_0^t V(x + B(s))ds)]

It is a well established result of Feynman and Kac that u satisfies the pde

∂u/∂t = (1/2)∂²u/∂x² + V(x)u, u(0, x) = 1

The solution to this can formally be expressed as

u(t, x) = Σ_{n=1}^∞ c_n u_n(x) exp(λ_n t)
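The expectation above can be sketched by straightforward Monte Carlo over Brownian paths; the check against the exactly solvable constant-V case (where u(t, x) = exp(ct)) is a sanity test, while the path counts and step sizes are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)

def u_mc(V, t, x, paths=5000, steps=200):
    """Monte Carlo estimate of u(t, x) = E[ exp( int_0^t V(x + B_s) ds ) ]."""
    dt = t / steps
    incr = rng.normal(0.0, np.sqrt(dt), (paths, steps))
    B = np.concatenate([np.zeros((paths, 1)), np.cumsum(incr, axis=1)], axis=1)
    vals = V(x + B)                       # V evaluated along each Brownian path
    # trapezoidal rule for the pathwise time integral of V(x + B_s)
    integral = dt * (0.5 * vals[:, 0] + vals[:, 1:-1].sum(axis=1) + 0.5 * vals[:, -1])
    return np.exp(integral).mean()

# sanity check: for constant V = c the expectation is exp(c t) exactly
c, t = 0.5, 1.0
est = u_mc(lambda y: c * np.ones_like(y), t, 0.0)
assert abs(est - np.exp(c * t)) < 1e-9
```

For a non-constant V the same routine gives a stochastic representation of the solution of the Feynman-Kac pde at a single point (t, x).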
Donsker asked Varadhan the question, how to generalize this result to arbitrary
stochastic processes, not necessarily Brownian motion ? The result was a re-
markable result due to Varadhan, known as Varadhan's integral lemma, which states that if a family of random variables Z(ε), ε → 0, satisfies a large deviation principle with rate function I(x), ie,

Pr(Z(ε) ∈ E) ≈ exp(−inf_{x∈E} I(x)/ε)
[6] Freidlin and Wentzell created the theory of large deviations for diffusion
processes driven by weak white Gaussian noise, or equivalently, for stochastic
differential equations driven by Brownian motion with small amplitude. In a
sense these results can be obtained by using the contraction principle according
to which if I1 (x) is the rate function for a random variable/random process
x and if y = f (x) is another random variable/random process, then the rate
function for y is
I2 (y) = inf {I1 (x) : f (x) = y}
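The contraction principle can be illustrated numerically with a toy choice, I1(x) = x²/2 (the Gaussian rate function) and f(x) = x², where the infimum over the level set {x : x² = y} is attained at x = ±√y and gives I2(y) = y/2; both the rate function and the map are illustrative assumptions:

```python
import numpy as np

# contraction principle: I2(y) = inf { I1(x) : f(x) = y }
I1 = lambda x: 0.5 * x**2       # toy Gaussian rate function
f = lambda x: x**2              # toy contraction map

xs = np.linspace(-5, 5, 200001)

def I2(y, tol=1e-3):
    mask = np.abs(f(xs) - y) < tol      # approximate level set {x : f(x) = y}
    return I1(xs[mask]).min()

# analytic answer: the infimum over x = +-sqrt(y) gives I2(y) = y/2
for y in [0.25, 1.0, 4.0]:
    assert abs(I2(y) - y / 2) < 1e-2
```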
Varadhan has also posed the question that if the original process is any sta-
tionary process with probability measure P , then can the rate function for the
empirical density of this process be expressed as
I(Q|P) = ∫ dQ(x0, x−1, x−2, ...) log(dQ(x0|x−1, x−2, ...)/dP(x0|x−1, x−2, ...))
There are difficulties with the existence of such a rate function. For example if
both P and Q are ergodic processes, then P will be concentrated on the set

{(x0, x−1, x−2, ...) | lim_{n→∞} n^{-1} Σ_{k=0}^{n−1} x_{−k} = ∫x0 dP = μ(P)}

and Q on the set

{(x0, x−1, x−2, ...) | lim_{n→∞} n^{-1} Σ_{k=0}^{n−1} x_{−k} = ∫x0 dQ = μ(Q)}

so if μ(P) ≠ μ(Q), then the Radon-Nikodym derivative dQ(x0|x−1, x−2, ...)/dP(x0|x−1, x−2, ...) will not exist.
[10] Varadhan obtained the rate function for random walks in a random
environment. Specifically, the environment consists of a lattice such that the
transition probability from x to x + z is a random function of z. More generally,
we can consider a random environment such that the transition probability from
x to x + z is of the form π(ω, x, z) such that for each z, the map x → π(ω, x, z)
is a stationary process on the lattice.
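A minimal simulation of such a random environment in one dimension: the right-jump probability at each lattice site is drawn iid (here uniform on [0.7, 0.9], an illustrative choice that puts the walk in the ballistic regime), so the environment seen from any site is stationary:

```python
import numpy as np

rng = np.random.default_rng(3)

# random environment on Z: at site x, jump x -> x+1 with probability p(x),
# x -> x-1 otherwise; the p(x) are iid, so x -> p(x) is stationary
L = 4001                          # lattice window, walk starts at its centre
p_env = rng.uniform(0.7, 0.9, size=L)

def walk(steps):
    x = L // 2
    for _ in range(steps):
        x += 1 if rng.random() < p_env[x] else -1
    return x - L // 2

steps, walkers = 500, 500
disp = np.array([walk(steps) for _ in range(walkers)])

assert np.all((disp - steps) % 2 == 0)   # displacement has the parity of steps
assert disp.mean() > 0.2 * steps         # ballistic drift to +infinity here
```

Large-deviation questions for such walks concern the probability that the empirical velocity disp/steps deviates from its almost sure limit.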
[5] Survey of some of the work in quantum probability by the Indian
school of probabilists
Computing scattering probabilities in quantum mechanics
Work of Amrein, Sinha and Jauch on the determination of time
delay in scattering processes
Let H, H0 be two Hamiltonians. H0 corresponds to the free particle Hamilto-
nian while H = H0 +V is the Hamiltonian of the free particle plus its interaction
energy V with the scattering centre. Let Ω+, Ω− denote the wave operators and S the scattering matrix, ie, S = Ω+^* Ω−. Then, we have
Ω− .exp(−itH0 ) = exp(−itH)Ω− , ∀t
and also
Ω+ .exp(−itH0 ) = exp(−itH).Ω+ , ∀t
so that
exp(itH0 ).S.exp(−itH0 ) = S, ∀t
which means that [S, H0 ] = 0. Now let |f > be a free particle state. Regarding
this as the input free state, the corresponding in-scattered state is Ω− |f >. The
average time spent by the particle in this state within the ball B = B(0, R) =
B(R) of radius R in position space is then
T(R) = ∫_0^∞ ‖χ_B(Q) exp(−itH) Ω− f‖² dt = ∫_0^∞ ‖χ_B(Q) Ω− exp(−itH0) f‖² dt
Let

X_t = ∫ V(2Pt) dt = ∫ K dt/|2Pt| = K log(t)/(2|P|)

So to ensure that this limit exists, we require that the function of time

Now,

[Q², V(2Pt)] = 2it(Q V′(2Pt) + V′(2Pt) Q) = 2it(2it V″(2Pt) + 2 V′(2Pt) Q) = −4t² V″(2Pt) + 4it V′(2Pt) Q
Further,
function of an electron at a later time given that at time t = 0, all the proba-
bility mass of the wave function was concentrated within the critical blackhole
radius. The general form of the metric of a rotating blackhole is given by
dτ 2 = A(r, θ)dt2 − B(r, θ)dr2 − C(r, θ)dθ2 − D(r, θ)(dφ − ω(r, θ)dt)2
Thus,
g00 = A − Dω 2 , g11 = −B, g22 = −C, g33 = −D,
g03 = Dω
The Klein-Gordon equation in this metric is

(√(−g) g^{μν} ψ_{,ν})_{,μ} + m²√(−g) ψ = 0
d²δx^μ/dτ² + Γ^μ_{αβ,ρ}(x) δx^ρ (dx^α/dτ)(dx^β/dτ) + 2Γ^μ_{αβ} (dx^α/dτ)(dδx^β/dτ) = 0
L_t f(x) = 0

We look for solutions of the form

f(x) = Re(h(t) exp(ik·r))

and find on substituting this into the above pde, the following matrix ode for h(t):

L0(t)h(t) + L1(t)h′(t) + ik_m L2m(t)h(t) + ik_m L3m(t)h′(t) − k_p k_m L4pm(t)h(t) + L5(t)h″(t) = 0
In the special case of a non-expanding background metric, the L matrices are all constant, and assuming h(t) = Re(h(0)exp(iωt)) yields the following dispersion relation relating the frequency ω to the wave-vector (k_m)_{m=1,2,3}:

det(L0 + iωL1 + ik_m L2m − k_m ω L3m − k_p k_m L4pm − ω² L5) = 0
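For constant L matrices and a fixed wave-vector k, the dispersion relation is a quadratic (in ω) determinantal equation, det(A + ωB + ω²C) = 0, which can be solved by the standard companion linearization to a generalized eigenvalue problem. The L matrices below are random stand-ins, not ones derived from any particular metric:

```python
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(4)
n = 3
# hypothetical constant L-matrices (illustrative stand-ins)
L0 = rng.standard_normal((n, n))
L1 = rng.standard_normal((n, n))
L2 = rng.standard_normal((3, n, n))     # L2m, m = 1, 2, 3
L3 = rng.standard_normal((3, n, n))     # L3m
L4 = rng.standard_normal((3, 3, n, n))  # L4pm
L5 = rng.standard_normal((n, n))

k = np.array([0.3, -0.2, 0.5])

# group det(L0 + i w L1 + i k_m L2m - k_m w L3m - k_p k_m L4pm - w^2 L5) = 0
# as det(A + w B + w^2 C) = 0
A = L0 + 1j * np.einsum('m,mij->ij', k, L2) - np.einsum('p,m,pmij->ij', k, k, L4)
B = 1j * L1 - np.einsum('m,mij->ij', k, L3)
C = -L5

# companion linearization: [[0, I], [-A, -B]] z = w [[I, 0], [0, C]] z
Z, I = np.zeros((n, n)), np.eye(n)
w, _ = eig(np.block([[Z, I], [-A, -B]]), np.block([[I, Z], [Z, C]]))
w = w[np.isfinite(w)]

# each finite eigenvalue w solves the dispersion relation:
# A + w B + w^2 C must be singular there
for wi in w:
    s = np.linalg.svd(A + wi * B + wi**2 * C, compute_uv=False)
    assert s[-1] < 1e-6 * s[0]
```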
In the time varying case, ie, when the background metric describes an expanding
isotropic and homogeneous universe, we assume that the frequency varies slowly
with time and obtain approximations for it using time dependent perturbation
theory. Specifically, we assume that each of the L matrices is the sum of a large
time independent part and a small time dependent part. Such a decomposition
is obtained by writing the scale factor as
S(t) = 1 + δ.(S(t) − 1)
and considering δ(S(t) − 1) and all its derivatives to be of the first order of
smallness ie of O(δ) and then decomposing the frequency ω also in the same
way:
ω(t) = ω0 + δω1 (t)
and solving for ω0 , ω1 (t) by equating coefficients of δ 0 = 1 and δ 1 = δ on both
sides. Padmanabhan also talks about how statistical features like second and higher order metric, velocity and density correlations evolve with time given their initial values.
[f] Ashoke Sen on quantum gravity using string and superstring theories.
The usual way of describing the physics of systems in the world is either to regard an object as an ensemble of point particles and describe the dynamics of each particle in terms of differential equations, if we adopt the classical viewpoint, or else, if we adopt the quantum viewpoint, to describe the dynamics of the wave function of an ensemble of a finite number of particles or else the wave functional of a quantum field. In either case, the dynamics of particles or systems of interacting particles is motion described by the time variable. In string
theory, there are no point particles only strings. Thus, to describe classical
physics, rather than considering the world line of a particle xμ (τ ) as a function
of the proper time τ which is a single parameter, consider the world sheet of a
string X μ (τ, σ) where τ is some real parameter which we may call proper time
and σ is another parameter which may be assumed to vary over the interval
[0, 1] after an appropriate scaling. For each time τ, the map σ → X^μ(τ, σ) from [0, 1] into R^{D+1}, μ = 0, 1, ..., D, is the parametric equation of a string in D + 1 dimensional space-time, and as the time τ varies, this mapping traces out a two-dimensional world sheet.
The metric of space-time in general relativity has the form g_{μν}(x)dx^μ dx^ν along a particle trajectory, while on a world sheet one considers the two-form

g_{μν}(X) dX^μ ∧ dX^ν

In the former case, g_{μν}dx^μdx^ν determines an infinitesimal proper time interval while in the latter case, g_{μν}dX^μ ∧ dX^ν describes an infinitesimal area element. Just as minimizing the proper time interval ∫dτ for point particles gives a geodesic trajectory in the former case, so also minimizing the proper area ∫g_{μν}(X)dX^μ ∧ dX^ν gives a geodesic world sheet trajectory, ie, it tells us the sheet traced out by the string. We note that
and setting its variation to zero gives us the world sheet equation:

∂_τ(∂L/∂X^μ_{,τ}) + ∂_σ(∂L/∂X^μ_{,σ}) − ∂L/∂X^μ = 0

where

L = L(X^μ, X^μ_{,τ}, X^μ_{,σ}) = g_{μν}(X)(X^μ_{,τ} X^ν_{,σ} − X^μ_{,σ} X^ν_{,τ})
For quantizing the world-sheet dynamics of a string, we may start with this Lagrangian density L and compute the corresponding Hamiltonian density

H = P_μ X^μ_{,τ} − L

where

P_μ = ∂L/∂X^μ_{,τ}
The classical Hamiltonian equations that are equivalent to the above Euler-Lagrange equations are

X^μ_{,τ} = ∂H/∂P_μ, P_{μ,τ} = −∂H/∂X^μ + ∂_σ(∂H/∂X^μ_{,σ})

The quantum mechanical wave equation for the wave functional of a string has the form

i∂_τ ψ(τ, X) = Hψ(τ, X)

where

X = {X(σ) : 0 ≤ σ ≤ 1}

and

H = ∫_0^1 H(X(σ), X_{,σ}(σ), P(σ)) dσ

with

P_μ(σ) = −i∂/∂X^μ(σ)
[g] Numerical general relativity based on the ADM formalism carried out at
the NSUT school of astronomy
The ADM action expresses the Einstein-Hilbert action for the gravitational
field in terms of a spatial component and a temporal component. The idea is
basically to start with a space-time manifold having coordinates X μ and imbed
into this four dimensional manifold a family Σt of spatial manifolds for different
times t. Let gμν denote the metric w.r.t the coordinates X μ and g̃μν that w.r.t
the space-time coordinates xμ , x0 = t that parametrize Σt . Then, we write
g^{μν} = g̃^{αβ} X^μ_{,α} X^ν_{,β} = g̃^{ab} X^μ_{,a} X^ν_{,b} + 2g̃^{a0} X^μ_{,a} X^ν_{,t} + g̃^{00} X^μ_{,t} X^ν_{,t}

Here T^μ = X^μ_{,t} is the coordinate time flow vector, and the shift vector N^a is defined by requiring that T^μ − N^a X^μ_{,a} be normal to Σ_t:

g_{μν}(T^μ − N^a X^μ_{,a}) X^ν_{,b} = 0

This gives

g̃_{ab} N^a = g_{μν} T^μ X^ν_{,b} = g̃_{0b}
This is a system of three linear equations for the three components N^a and its inversion is easy. Using this identity, it is easily proven from the above identities that

g^{μν} = g̃^{ab} X^μ_{,a} X^ν_{,b} + n^μ n^ν

ie, the metric tensor in the X^μ system decomposes into the sum of a purely spatial part and a purely normal part with the cross terms cancelling out. Note that the cross term in the expansion of g^{μν} is given by

g̃^{a0} N (X^μ_{,a} n^ν + X^ν_{,a} n^μ) + g̃^{00} N (N^μ n^ν + N^ν n^μ)

with N^μ = N^a X^μ_{,a}, and this vanishes provided that

g̃^{a0} + g̃^{00} N^a = 0

Indeed, from g̃_{ab} N^b = g̃_{a0} and the identities satisfied by the inverse metric,

N^c − g̃^{c0} g̃_{0b} N^b + g̃^{c0} g̃_{00} = 0

or equivalently,

N^c + g̃^{c0}/g̃^{00} = 0
which is what we sought to prove. We write q_{ab} for g̃_{ab} and

q^{μν} = g̃^{ab} X^μ_{,a} X^ν_{,b}

so that

q^{μν} n_ν = 0

The idea is first to compute

L_{μν} = ∇_μ n_ν − ∇_ν n_μ = n_{ν,μ} − n_{μ,ν}

and then to note that since n_μ is normal to a three dimensional surface, it can be expressed as

n_μ = F G_{,μ}

where F, G are scalar fields. Then,

q^μ_ν X^ν_{,a} = (δ^μ_ν − n^μ n_ν) X^ν_{,a} = X^μ_{,a}

Now

K_{ab} = X^μ_{,a} X^ν_{,b}(n_{ν,μ} − Γ^σ_{νμ} n_σ)
= n_{ν,a} X^ν_{,b} − (1/2) n^σ X^μ_{,a} X^ν_{,b}(g_{σν,μ} + g_{σμ,ν} − g_{μν,σ})
= −n_ν X^ν_{,ab} − (1/2) n^σ X^μ_{,a} X^ν_{,b}(g_{σν,μ} + g_{σμ,ν} − g_{μν,σ})

since n_ν X^ν_{,b} = 0. Further,

X^μ_{,a} X^ν_{,b} g_{σν,μ} = g_{σν,a} X^ν_{,b} = (g_{σν} X^ν_{,b})_{,a} − g_{σν} X^ν_{,ab} = (q_{σν} X^ν_{,b})_{,a} − g_{σν} X^ν_{,ab}

and likewise,

X^μ_{,a} X^ν_{,b} g_{σμ,ν} = q_{σa,b} − g_{σμ} X^μ_{,ab}

Further,

X^μ_{,a} X^ν_{,b} g_{μν,σ} = q_{μν,σ} X^μ_{,a} X^ν_{,b}

Substituting all these expressions and making appropriate cancellations gives us the expression for K_{ab}. Now,

n^σ = (X^σ_{,0} − N^σ)/N, N^σ = N^c X^σ_{,c}

We thus get

N n^σ q_{σb,a} = q_{σb,a} X^σ_{,0} − q_{σb,a} N^σ = g̃_{0b,a} − q_{σb} X^σ_{,0a} − q_{σb,a} N^c X^σ_{,c}

and similarly

N n^σ q_{σa,b} = g̃_{0a,b} − q_{σa} X^σ_{,0b} − q_{σa,b} N^c X^σ_{,c}

and further,

N n^σ q_{μν,σ} X^μ_{,a} X^ν_{,b} = q_{μν,σ} X^μ_{,a} X^ν_{,b} X^σ_{,0} − q_{μν,σ} X^μ_{,a} X^ν_{,b} X^σ_{,c} N^c
= q_{μν,0} X^μ_{,a} X^ν_{,b} − q_{μν,c} X^μ_{,a} X^ν_{,b} N^c
= q_{ab,0} − q_{μν}(X^μ_{,a} X^ν_{,b})_{,0} − q_{ab,c} N^c + q_{μν}(X^μ_{,a} X^ν_{,b})_{,c} N^c
= q_{ab,0} − q_{aν} X^ν_{,b0} − q_{bν} X^ν_{,a0} − q_{ab,c} N^c + q_{aν} X^ν_{,bc} N^c + q_{bν} X^ν_{,ac} N^c
[g] Abhay Ashtekar on quantum gravity using canonical Ashtekar variables.
quantum conversion of noisy and noiseless image field pairs, so that each pair
of such classical image fields is represented by a pair of pure states. The next
step is to approximate each pure state in each such pair by a quantum Gaussian
state (mixed) and then to purify all these Gaussian states by choosing the
Boson-Fock space appearing in the Hudson-Parthasarathy quantum stochastic
calculus as the reference Hilbert space. In this purification, the ”system part” of
the purified state is expressed in terms of the eigenstates of harmonic oscillators
because the Gaussian states are all diagonal w.r.t. these eigenstates provided
that one assumes that the Gaussian states are in diagonal form, ie they commute
with the Hamiltonian or equivalently with the number operator of each of these
harmonic oscillators. The reference states appearing as a tensor product term in
the purification of quantum Gaussian states are all obtained by orthonormalizing
a set of exponential/coherent vectors in the Boson Fock space. The reason for
choosing such a purification is simple. We are designing our unitary processor
based on the Hudson-Parthasarathy Schrodinger equation in which the Lindblad
system operators are linear combinations of creation and annihilation operators
of harmonic oscillators appearing also in the expression of the quantum Gaussian
state in the system Hilbert space. The noise operator processes on the other
hand are the creation and annihilation operator processes described in terms of
families of operators in the Boson Fock space of the reference system required
for purification. Now the system space creation and annihilation operators have
an easy to describe action on the number states appearing as the first term
in the tensor product that describes the purification of the Gaussian state.
On the other hand the noise operator processes of the Hudson-Parthasarathy
quantum stochastic calculus have an easy to describe action on the exponential
vectors of the reference Boson Fock space that appear as the second term in
the tensor product that describes the purification. Thus, the overall quantum
noisy generator of the unitary evolution in the Hudson-Parthasarathy noisy
Schrodinger equation has an easy to describe action on the purification of the
quantum Gaussian state. Specifically,
(a ⊗ dA(t))|n > ⊗|e(u) > = √n u(t)dt |n − 1 > ⊗|e(u) >
Applied Differential Equations
[1] Construction of adaptive parameter updates and controller gains that guar-
antee Lyapunov stability of a linearized robotic system as regards tracking error
and parameter estimation error.
where θ0 is the exact parameter value (ie link masses and lengths). Let θ(t) be
the parameter estimate at time t and define δθ(t) = θ(t) − θ0 , e(t) = qd (t) − q(t)
where qd (t) is the desired robot trajectory. We define τ (t) to be the computed
control torque for trajectory tracking:
Note that q, q′ are measurable at any instant of time, perhaps via an EKF observer based on noisy measurements. Then, the above differential equation becomes, after neglecting quadratic and higher terms in e(t), δθ(t),

M(q, θ0)q″ + N(q, q′, θ0) =
297
where

W(t, q, q′, q″) = −M(q, θ0)^{-1}[(∂M(q, θ0)/∂θ)(I ⊗ q″) + (∂N(q, q′, θ0)/∂θ)]

Note that in (1), q, q′, q″ on the rhs may be replaced by q_d, q_d′, q_d″ in view of the linearized approximation. We now assume that Q1, Q2 are two positive definite matrices and propose the adaptive estimation law
[2] Analyze the teleoperation of two robot systems with delayed teleoperation feedback using infinite dimensional stochastic differential equations, and use perturbation theory for approximately solving stochastic differential equations to compute the statistical moments of the fluctuations of the robot trajectories around their non-random mean values.
[3] Study the dynamics of several robots on a lattice in simple exclusion
interacting with each other via electromagnetic forces. Each robot at a lattice
site is described by its azimuth and elevation angle at time t and a current
flows along this axis thereby generating an electromagnetic field which interacts
with the currents in the robots at other lattice sites. A lattice site may or may
not be occupied by a robot and a robot can jump from one site to another
in accordance with the simple exclusion model. For a given configuration of
sites, the total kinetic energy of each robot as well as its potential energy of
interaction with other robots is calculated and thereby the Lagrangian of this
system of interacting robots is formed. This Lagrangian is a function of the configuration of the lattice, i.e., it depends on which sites in the lattice are occupied and which are not. The statistical average of this Lagrangian is then taken
w.r.t the probability distribution of the occupation numbers of the sites which
follow the simple exclusion stochastic differential equation triggered by Poisson
clocks. Approximations are then made to obtain approximate solutions to the
link angles of the robots at the different sites.
dδXm(t) = Fm(Xm(t), t)δXm(t)dt + Σ_{k=0}^{p} Cm[k]ψm(Xs(t − (k+1)T))dt
These equations are of the following general form, ie, linear delay stochastic
differential equations:
dξ(t) = Σ_{k=0}^{N} Fk(t)ξ(t − kT)dt + G(t)dB(t)
= F0(t)ξ(t)dt + Σ_{k=1}^{N} Fk(t)ξ(t − kT)dt + G(t)dB(t)
Here, ξ(t) is a vector valued process and the Fk(t)'s are matrices. We can solve this using perturbation theory by considering the delay terms to be of the first order of smallness. The solution up to first order of smallness is then given by
ξ(t) = Φ(t, 0)ξ(0) + Σ_{k=1}^{N} ∫_0^t Φ(t, s)Fk(s)Φ(s − kT, 0)ξ(0)ds + ∫_0^t Φ(t, s)G(s)dB(s)
where
∂Φ(t, s)/∂t = F0 (t)Φ(t, s), t ≥ s, Φ(s, s) = I
This formula can be used to evaluate statistical correlations up to second order of smallness.
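The first-order perturbation formula above can be checked numerically in the simplest case. The following sketch uses an illustrative scalar example with one delay term and no noise (F0 = −1, F1 = 0.1, T = 0.5 are assumptions, not values from the text); it compares the perturbative solution, with Φ(t, s) = exp(F0(t − s)), against direct Euler integration of the delay differential equation.

```python
import numpy as np

# Scalar, noiseless sketch of the first-order perturbation formula
#   xi(t) ~ Phi(t,0)xi(0) + int_0^t Phi(t,s) F1 Phi(s-T,0) xi(0) ds
# for d xi = F0 xi dt + F1 xi(t-T) dt, with Phi(t,s) = exp(F0(t-s)).
F0, F1, T, xi0, t_end = -1.0, 0.1, 0.5, 1.0, 1.0

# Perturbative solution: zeroth-order flow plus first-order delay correction
s = np.linspace(0.0, t_end, 4001)
ds = s[1] - s[0]
pert = np.exp(F0 * t_end) * xi0 \
    + F1 * xi0 * np.sum(np.exp(F0 * (t_end - s)) * np.exp(F0 * (s - T))) * ds

# Direct Euler integration of the delay ODE with the consistent
# pre-history xi(t) = exp(F0 t) xi0 for t <= 0
dt = 1e-4
N, nd = int(round(t_end / dt)), int(round(T / dt))
traj = np.empty(N + nd + 1)
traj[: nd + 1] = np.exp(F0 * np.linspace(-T, 0.0, nd + 1)) * xi0
for i in range(nd, nd + N):
    traj[i + 1] = traj[i] + dt * (F0 * traj[i] + F1 * traj[i - nd])
```

The two answers differ only by terms of second order of smallness in F1 (plus discretization error), consistent with the claim that the formula is accurate to first order.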
[3] Disturbance observer construction for speech models.
x[n] = f (x[n − 1], ..., x[n − p]) + d[n]
We construct the disturbance observer using the following recursive algorithm:
d̂[n] = d̂[n − 1] + Ln(d[n] − d̂[n − 1])
We write
ε[n] = d[n] − d̂[n], δ[n] = d[n] − d[n − 1]
and then get
ε[n] = δ[n] + ε[n − 1] − Ln(δ[n] + ε[n − 1]) = (1 − Ln)δ[n] + (1 − Ln)ε[n − 1]
If Ln = L is a constant matrix, then we can solve the above difference equation to get
ε[n] = (1 − L)^n ε[0] + Σ_{k=0}^{n−1} (1 − L)^{n−k} δ[k]
Assuming that 1 − L has all singular values within the unit circle, it follows that when |δ[n]| ≤ K ∀n, then
limsup_{n→∞} |ε[n]| ≤ K·Σ_{k=0}^{∞} ‖1 − L‖^{k+1} = K·‖1 − L‖/(1 − ‖1 − L‖)
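The geometric-series bound on the steady-state observer error can be checked in the scalar case. The sketch below (illustrative values L = 0.6, K = 0.5, not from the text) runs the error recursion with the worst-case increment δ[n] = K at every step, which drives ε[n] exactly to the bound K(1 − L)/(1 − (1 − L)).

```python
# Scalar check of limsup |eps[n]| <= K*|1-L|/(1-|1-L|) for the
# disturbance-observer error recursion eps[n] = (1-L)(delta[n] + eps[n-1]).
L, K = 0.6, 0.5
eps = 1.0                      # eps[0], arbitrary starting error
for n in range(200):
    eps = (1 - L) * (K + eps)  # worst case: delta[n] = K at every step
bound = K * (1 - L) / (1 - (1 - L))
```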
[4] The UKF: The basic logic behind the UKF is that the EKF does not yield
accurate state estimates since it involves taking the expectation operator inside
a nonlinear function of a Gaussian state. This is not justified. A nonlinear
transformation of a Gaussian vector is non-Gaussian and hence we must make
use of the law of large numbers to calculate its expectation by simulating inde-
pendent realizations of it conditioned on the past observations. The philosophy
of the UKF is precisely based on using the law of large numbers to approximate
conditional expectations of nonlinear functions of Gaussian random vectors.
The state equations are
where B(.), V (.) are independent vector valued Brownian motions. In dis-
cretized form these are of the form
We then evaluate
X̂[n + 1|n] = E[Fn(X[n])|Zn] ≈ (1/K)·Σ_{k=1}^{K} Fn(X̂[n|n] + P[n|n]^{1/2} ξ[k])
where
P [n|n] = Cov(X[n]|Zn ) = Cov(X[n] − X̂[n|n]|Zn )
and ξ[k], k = 1, 2, ..., K are iid standard normal random vectors. Further,
P[n + 1|n] = (1/K)·Σ_{k=1}^{K} (Fn(X̂[n|n] + P[n|n]^{1/2} ξ[k]) − X̂[n + 1|n])·(·)ᵀ + Q
where
R = cov(V [n])
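The point about pushing the expectation inside a nonlinearity can be made concrete with a scalar toy case. The sketch below (F(x) = x², X ~ N(m, P) with illustrative m, P) compares the law-of-large-numbers estimate (1/K)Σ F(m + √P ξ[k]) against the EKF-style approximation F(m); for this F the exact answer E[F(X)] = m² + P is known in closed form.

```python
import numpy as np

# Monte-Carlo (law of large numbers) propagation of a Gaussian through a
# nonlinearity versus the approximation E[F(X)] ~ F(E[X]).
rng = np.random.default_rng(0)
m, P, Ksamp = 1.0, 0.5, 200_000
xi = rng.standard_normal(Ksamp)            # iid standard normal samples
mc = np.mean((m + np.sqrt(P) * xi) ** 2)   # (1/K) sum F(m + sqrt(P) xi[k])
ekf = m ** 2                               # expectation pushed inside F
exact = m ** 2 + P                         # exact value for F(x) = x^2
```

The sampled estimate converges to the exact conditional expectation while the naive approximation misses the variance contribution entirely, which is the justification given above for the UKF philosophy.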
[5] The kinetic energy of the two 3-D link robot system is given by
K(t) = (1/2)Tr(R′(t)J1R′(t)ᵀ) + (1/2)Tr(S′(t)J2S′(t)ᵀ) + Tr(R′(t)J3S′(t)ᵀ)
where R(t), S(t) ∈ SO(3). Here, R(t) is the rotation experienced by the first
link and S(t)R(t) that by the second link. The potential energy of the system
can be expressed as
V (t) = aT1 R(t)b1 + aT2 S(t)b2
where a1 , b1 , a2 , b2 ∈ R3 . Thus the Lagrangian of this system of two rigid bodies
in the absence of external torques is given by
(1/2)Tr(R′(t)J1R′(t)ᵀ) + (1/2)Tr(S′(t)J2S′(t)ᵀ) + Tr(R′(t)J3S′(t)ᵀ) − a1ᵀR(t)b1 − a2ᵀS(t)b2
Problem: Taking into account the constraints R(t)T R(t) = S(t)T S(t) = I
by writing R(t) = exp(X(t)), S(t) = exp(Y (t)) where X(t), Y (t) are real 3 × 3
skew-symmetric matrices and making use of the differential of the exponential
map
R′(t) = R(t)·((I − exp(−ad(X(t))))/ad(X(t)))(X′(t))
write down the above Lagrangian in terms of Lie algebra coordinates, ie, choose a fixed set of three linearly independent real 3 × 3 skew-symmetric matrices L1, L2, L3 and write X(t) = x1(t)L1 + x2(t)L2 + x3(t)L3, Y(t) = y1(t)L1 + y2(t)L2 + y3(t)L3, and then express the above Lagrangian in terms of xk(t), x′k(t), yk(t), y′k(t), k = 1, 2, 3.
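The differential of the exponential map used in this problem can be verified numerically on SO(3), where ad_x is the hat map and exp is given by the Rodrigues formula. The sketch below (test vectors x0, xdot are arbitrary illustrative choices) compares a central finite difference of exp(X(t)) against R(t)·hat(J(x)x′) with J(x) = (I − exp(−ad_x))/ad_x expanded as a power series.

```python
import math
import numpy as np

def hat(v):
    # so(3) hat map: ad_v on vector coordinates
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def expm_so3(v):
    # Rodrigues formula for exp(hat(v))
    th = np.linalg.norm(v)
    H = hat(v)
    if th < 1e-12:
        return np.eye(3) + H
    return np.eye(3) + (np.sin(th) / th) * H + ((1 - np.cos(th)) / th**2) * (H @ H)

def dexp_right(v, terms=20):
    # J(v) = (I - exp(-ad_v))/ad_v = sum_k (-ad_v)^k / (k+1)!
    J, M, ad = np.zeros((3, 3)), np.eye(3), hat(v)
    for k in range(terms):
        J += M / math.factorial(k + 1)
        M = M @ (-ad)
    return J

x0 = np.array([0.3, -0.2, 0.5])     # X(0) in vector coordinates
xdot = np.array([0.1, 0.4, -0.3])   # X'(0)
h = 1e-6
num = (expm_so3(x0 + h * xdot) - expm_so3(x0 - h * xdot)) / (2 * h)
ana = expm_so3(x0) @ hat(dexp_right(x0) @ xdot)
```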
References: See the published papers by Rohit Singla, Vijyant Agrawal and
Harish Parthasarathy, Rohit Singla and Harish Parthasarathy, Vijyant Agrawal
and Harish Parthasarathy, Rohit Rana, Vijyant Agrawal, Prerna Gaur and
Harish Parthasarathy and the papers communicated by Rohit Rana, Harish
Parthasarathy, Vijyant Agrawal and Prerna Gaur.
y0 (k + 1) = V.tanh(W y0 (k))
where d is disturbance and w is noise. The weight matrices follow the dynamics
of being approximately constant except for some small weight noise:
It should be noted that the plant dynamics function F is not known, although we are able to take noisy measurements of some linear transformation of the state vector y(k) generated by it. Our goal is to use these measurements in the NN to get a reasonably good approximation to the original plant dynamics. The model for the disturbance estimate is
model for the disturbance estimate is
d̂(k + 1) = d̂(k) + Lk(d(k + 1) − d̂(k) + w(k + 1))
in the special case when H is the identity matrix. We assume that the disturbance estimation error d(k + 1) − d̂(k) = w1(k + 1) is nearly white noise so that w(k) + w1(k) = w2(k) is also white noise whose covariance is given by Q0 = Cov(w1(k)) + Cov(w(k)). Then the disturbance estimate follows the stochastic model
d̂(k + 1) = d̂(k) + Lk w2(k + 1)
and we construct an EKF for estimating d̂(k), Vk, Wk, y(k) based on the noisy measurements z(k). To this end, we define the extended state vector
ξ(k) = [y0(k)ᵀ, Vec(Vk)ᵀ, Vec(Wkᵀ)ᵀ, d̂(k)ᵀ]ᵀ
ξ(k + 1) = ψk (ξ(k)) + w3 (k + 1)
where
ψk(ξ(k)) = [(Vk·tanh(Wk y0(k)) + d̂(k))ᵀ, Vec(Vk)ᵀ, Vec(Wkᵀ)ᵀ, d̂(k)ᵀ]ᵀ
and
The fact that we are considering d(k + 1) − d̂(k) to be white noise means that the disturbance must be slowly time varying, or more precisely, it must asymptotically converge to a constant vector so that
d(k + 1) − d(k) → 0, k → ∞
This equation implies that if w(k) = 0 and d(k) → d(∞) and d̂(k) → d̂(∞), then
d̂(∞) = (1 − L)d̂(∞) + Ld(∞)
which implies
d̂(∞) = d(∞)
a result which states that if, asymptotically, the disturbance converges to a constant dc vector and its estimate also converges, then the limiting value of the disturbance estimation error is zero. So the question arises: if noise is absent and the disturbance is asymptotically constant, then under what conditions on L will its estimate also be asymptotically constant? Taking the one sided Z-transform of (a) gives us
(z − 1 + L)D̂(z) = zL(D(z) + W(z)) + z·d̂(0) − − − (b)
d̂(k) = L·Σ_{m=0}^{k} (1 − L)^{k−m}(d(m) + w(m)) + (1 − L)^k d̂(0)
Thus, if we assume that ‖1 − L‖ < 1 (in spectral norm), then (1 − L)^k → 0 and the transient term
‖(1 − L)^k d̂(0)‖ ≤ ‖1 − L‖^k ‖d̂(0)‖ → 0, k → ∞
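The agreement between the recursive observer and the geometric-sum closed form can be verified directly in the scalar case. The sketch below assumes the convention that d̂(0) is given and the measurements d(m) + w(m) enter from m = 1 onwards (the text's sum from m = 0 differs only in how the initial term is counted); all numerical values are illustrative.

```python
import numpy as np

# Recursion dhat[k] = (1-L)dhat[k-1] + L(d[k]+w[k]) versus the closed form
# dhat[k] = (1-L)^k dhat[0] + L sum_m (1-L)^(k-m) (d[m]+w[m]).
rng = np.random.default_rng(1)
L, n = 0.3, 50
d = rng.normal(size=n + 1)   # arbitrary disturbance sequence
w = rng.normal(size=n + 1)   # measurement noise
dhat = np.empty(n + 1)
dhat[0] = 0.7                # arbitrary initial estimate
for k in range(1, n + 1):
    dhat[k] = (1 - L) * dhat[k - 1] + L * (d[k] + w[k])
closed = (1 - L) ** n * dhat[0] + L * sum(
    (1 - L) ** (n - m) * (d[m] + w[m]) for m in range(1, n + 1)
)
```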
Writing
E(z) = D(z) − D̂(z), e(k) = d(k) − d̂(k)
we get
(z − 1 + L)E(z) = (z − 1)(1 − L)D(z) − zLW(z)
which gives on Z-transform inversion,
e(k) = Σ_{m=0}^{k−1} (1 − L)^{k−m}(d(m) − d(m − 1)) − L·Σ_{m=0}^{k} (1 − L)^{k−m} w(m)
This formula clearly shows that if d(m) − d(m − 1) → 0 and noise is absent, then e(k) → 0 as required. It is also clear that in the absence of noise, the rate at which e(k) converges to zero is −log(‖1 − L‖). To increase this rate, we must make ‖1 − L‖ as small as possible. But then, if L is close to unity, in the presence of noise the d.o. estimation error variance will become large, as shown by the above formula. Explicitly, we choose an ε > 0 and a positive integer N = N(ε) such that ‖d(m) − d(m − 1)‖ ≤ ε for m > N; then
‖e(k)‖ ≤ Σ_{m=0}^{N} ‖1 − L‖^{k−m}‖d(m) − d(m − 1)‖ + ε·Σ_{m=N+1}^{k−1} ‖1 − L‖^{k−m}
→ ε·Σ_{r=1}^{∞} ‖1 − L‖^r = ε·‖1 − L‖/(1 − ‖1 − L‖), k → ∞
Then letting ε → 0, we conclude that in the absence of noise, e(k) → 0. In the presence of noise,
E[‖e(k)‖²] ≤ 2·Σ_{m=0}^{k−1} (‖1 − L‖^{k−m}‖d(m) − d(m − 1)‖)² + ‖L‖²·Σ_{m=0}^{k} ‖1 − L‖^{2(k−m)} σw²
→ ‖L‖² σw²/(1 − ‖1 − L‖²), k → ∞
Hence, to reduce this noise variance bound and simultaneously to get a decently fast rate of convergence of the d.o. estimation error to zero, we must choose L so that a cost function of the form
C(L) = a·‖1 − L‖ + b·σw²‖L‖²/(1 − ‖1 − L‖²)
is minimized, where a, b > 0 are fixed weights.
If instead we only assume ‖d(m) − d(m − 1)‖ ≤ K for all m, then
E[‖e(k)‖²] ≤ 2K²(Σ_{m=0}^{k−1} ‖1 − L‖^{k−m})² + ‖L‖²·Σ_{m=0}^{k} ‖1 − L‖^{2(k−m)} σw²
≤ 2K²‖1 − L‖²/(1 − ‖1 − L‖)² + ‖L‖² σw²/(1 − ‖1 − L‖²)
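The trade-off embodied in C(L) can be explored numerically in the scalar case. The sketch below (illustrative weights a = b = σw² = 1, not from the text) grid-searches C(L) over 0 < L < 1; for these weights the interior minimizer can also be found analytically at L* = 2 − √2.

```python
import numpy as np

# Scalar trade-off C(L) = a|1-L| + b*sigma_w^2*L^2/(1-(1-L)^2):
# small L gives slow error decay, L near 1 amplifies measurement noise.
a, b, sw2 = 1.0, 1.0, 1.0
Ls = np.linspace(0.01, 0.99, 981)
C = a * (1 - Ls) + b * sw2 * Ls**2 / (1 - (1 - Ls) ** 2)
Lopt = Ls[np.argmin(C)]   # interior minimizer of the cost
```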
The EKF equations are
ξ̂(k + 1|k) = ψk(ξ̂(k|k)),
where
K(k + 1) = P(k + 1|k)H_{k+1}ᵀ(H_{k+1}R_W H_{k+1}ᵀ + R_V)^{−1}
and finally,
+ K(k + 1)P_V K(k + 1)ᵀ
[8] Some remarks on fracture analysis of materials
A brittle object like a piece of chalk is in some sense an elastic material. When a twisting or deforming torque is applied to it, it suddenly gets fractured along certain curves on its boundary surface. Let D denote the surface of the
material. On applying the twisting torque, the material gets partitioned into
disjoint open sets Dk , k = 1, 2, ..., p. The boundaries on the material surface
are therefore Djk = Cl(Dj ) ∩ Cl(Dk ) where 1 ≤ j < k ≤ p where Cl(E)
denotes the closure of the set E. Within each domain Dj , the equations of
elasticity are valid. Thus if D0 denotes the union of the disjoint open sets
Dj , j = 1, 2, ..., p, then in D, the material has a displacement field u(t, r) with
cartesian components uk (t, r), k = 1, 2, 3. The strain tensor of the material in
D0 is then
ujk = (1/2)(uj,k + uk,j )
and if C(jklm) denote the elastic constants, then the Lagrangian density of the
material is given by
L(uj, uj,k, uj,t) = (ρ/2)·Σ_{j=1}^{3} u_{j,t}² − (1/2)·Σ_{jklm} C(jklm)ujk ulm
Applying the variational principle to this Lagrangian density gives us the wave
equation
ρuj,tt = C(jklm)ul,mk
where the Einstein summation convention has been used, ie, summation over repeated indices. This equation is solved within each open set Dj, resulting in waves with unknown coefficients. Then, on the boundary Djk between two regions Dj and Dk, a discontinuity condition is imposed; for example, we may impose the condition that the difference between the displacements is prescribed on this boundary. Such a discontinuity condition must be imposed because it is precisely this that characterises the nature of the fracture. To make this analysis more precise, we take the spatial
Fourier transform of the above wave equation leading to
We absorb the density ρ within the elastic constants and then solve the above
equation to get
û(t, k) = exp(itF(k))a(k) + exp(−itF(k))b(k)
where F(k) is a square root of the matrix ((C(jplm)km kp))_{1≤j,l≤3}. The constant vectors a(k), b(k) depend on the region Dj under consideration. Specifically, the inverse spatial Fourier transform of the above equation has the general form
u(t, r) = uj(t, r) = ∫_{Dj} G(t, r − r′)Aj(r′)d³r′, r ∈ Dj
Now suppose we take into account an external force/torque applied to the ma-
terial. Then, the above anisotropic wave equation gets replaced with a wave
equation with source:
uj,tt (t, r) = C(jplm)ul,mp (t, r) + fj (t, r)
and then after time T , this solution can be matched at the boundaries Djk to
the appropriate discontinuity condition.
x=c
Then the wave equations in the two regions x < c and x > c are the same:
where
k 2 = α2 + (mπ/b)2
Applying the boundary condition,
û(k, c + 0, y) − û(k, c − 0, y) = f̂(k, y) = Σ_m f̂(k, m)sin(mπy/b)
we get that
A2 (k, m).sin(α(m)(c − a)) − A1 (k, m)sin(α(m)c) = fˆ(k, m)
This gives us one relationship between the coefficients A1 (k, m), A2 (k, m). How-
ever, in these two examples, we’ve not taken into account the external deforming
forces. We shall now do so. In the first 1-D example, our equations of motion
are
u,tt (t, x) − u,xx (t, x) = g(t, x)
which gives on taking the temporal Fourier transform,
where
B2(k) + ∫_a^b (sin(k(x − x′))/(k(x − x′)))ĝ(k, x′)dx′ = 0
The fracture boundary condition is
û(k, c + 0) − û(k, c − 0) = d̂(k)
and that gives one relationship between the coefficients A1 (k), A2 (k), B2 (k).
Now we come to the fracture analysis of general anisotropic materials. It
is clear that to describe the fracture using domains in the spatial regions, we
should not use spatial Fourier transforms. Rather, we must use temporal Fourier
transforms. This leads to the generalized anisotropic Helmholtz equation
det(k 2 I3 − F (K)) = 0
where F(K) is the matrix ((C(lpmn)Kp Km))_{l,n}. Writing formally the solutions as k = ±k^r(K), r = 1, 2, 3, with the corresponding amplitude vectors A^r(K), B^r(K), r = 1, 2, 3, we get
un(t, r) = Σ_{r=1}^{3} ∫ (A_n^r(K)exp(ik^r(K)t) + B_n^r(K)exp(−ik^r(K)t))exp(iK·r)d³K
The functions A_n^r(K), B_n^r(K) are different for the different domains and a discontinuity matching condition at the boundary has to be applied. Since un(t, r) is a real function, we can write
un(t, r) = Σ_{r=1}^{3} ∫ Re(A_n^r(K)·exp(i(k^r(K)t − K·r)))d³K
or equivalently, up to a proportionality factor, in the temporal frequency domain,
un(k, r) = Σ_{r=1}^{3} ∫ [A_n^r(K)exp(−iK·r)δ(k − k^r(K)) + Ā_n^r(K)exp(iK·r)δ(k + k^r(K))]d³K
Note that k r (−K) = k r (K) since C(lpmn)Kp Km does not change sign if the
3-vector K is replaced by −K. In the j th region, we therefore have the solution
un(t, r, j) = Σ_{r=1}^{3} ∫ Re(A_n^r(K, j)·exp(i(k^r(K)t − K·r)))d³K, r ∈ Dj
we get
Σ_{r=1}^{3} ∫ Re((A_n^r(K, j) − A_n^r(K, l))·exp(i(k^r(K)t − K·r)))d³K − dn(t, r, j, l) = 0,
= ua,t
Thus, the Hamiltonian density is given by
H(t, r, ua , ua,b , πa ) =
πa ua,t − L =
We now calculate the Hamiltonian for this elasticity field inside a cube of side-length L:
H(t, ua, πa) = ∫_{[0,L]³} H d³r
for twice the kinetic energy and for twice the elastic potential energy,
2V(t) = C(abcd)∫_B uab(t, r)ucd(t, r)d³r = 4C(abcd)∫_B ua,b uc,d d³r = 4π²L·C(abcd)·Σ_n ua(t, n)ūc(t, n)nb nd
where
V ∗ ∈ Rn×p , W ∗ ∈ Rp×n − − − (3)
To improve upon this approximation, we make NN weights adaptive so that the
NN approximation to the chaotic system is
and
V ec(Ŵk+1 ) = V ec(Ŵk ) − K2 ỹ(k + 1)
We know that
V ec(ỹ(k+1)) = ỹ(k+1) = (tanh(Wk y(k))T ⊗In )V ec(Ṽk )+(y(k)T ⊗Vk Dk )V ec(W̃k )
To implement the EKF, we use D̂k in place of Dk and also use Ŵk , V̂k in place
of Wk , Vk respectively. Thus, the EKF observer becomes
V ec(V̂k+1 ) = V ec(V̂k )
−K1 ((tanh(Ŵk y(k))T ⊗In )V ec(V̂k −V ∗ )+(y(k)T ⊗Vk D̂k )V ec(Ŵk −W ∗ ))),
and
V ec(Ŵk+1 ) = V ec(Ŵk )
−K2 ((tanh(Ŵk y(k))T ⊗In )V ec(V̂k −V ∗ )+(y(k)T ⊗Vk D̂k )V ec(Ŵk −W ∗ ))),
where
D̂k = diag[sech²((Ŵk y(k))(i)) : i = 1, 2, ..., p]
Taking noise effects into account, the Kalman gains K1 , K2 must be chosen so
as to minimize
T rE(V ec(Ṽk+1 ).V ec(Ṽk+1 )T ) + T rE(V ec(W̃k+1 ).V ec(W̃k+1 )T )
where
Ṽk+1 = V̂k+1 − V ∗ , W̃k+1 = Ŵk+1 − W ∗
We observe that with these definitions, the weight update equations can be
expressed as
Vec(Ṽk+1) = Vec(Ṽk) − K1((tanh(Ŵk y(k))ᵀ ⊗ In)Vec(Ṽk) + (y(k)ᵀ ⊗ Vk D̂k)Vec(W̃k)),
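The vec/Kronecker identity underlying these weight update equations can be verified numerically. The sketch below (illustrative sizes n = 3, p = 4 and random test matrices) checks that for f(V, W) = V tanh(Wy), a small perturbation satisfies df ≈ (tanh(Wy)ᵀ ⊗ In)vec(dV) + (yᵀ ⊗ V D)vec(dW) with D = diag(sech²(Wy)), using column-major vec as in the identity vec(ABC) = (Cᵀ ⊗ A)vec(B).

```python
import numpy as np

# First-order (vec/Kronecker) linearization of f(V, W) = V tanh(W y)
rng = np.random.default_rng(2)
n, p = 3, 4
V = rng.normal(size=(n, p))
W = rng.normal(size=(p, n))
y = rng.normal(size=n)
dV = 1e-6 * rng.normal(size=(n, p))   # small weight perturbations
dW = 1e-6 * rng.normal(size=(p, n))

t = np.tanh(W @ y)
D = np.diag(1.0 / np.cosh(W @ y) ** 2)   # sech^2 diagonal
num = (V + dV) @ np.tanh((W + dW) @ y) - V @ t
lin = np.kron(t, np.eye(n)) @ dV.flatten("F") \
    + np.kron(y, V @ D) @ dW.flatten("F")
```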
and, subtracting W* from both sides of the Ŵ update,
Vec(W̃k+1) = Vec(W̃k) − K2((tanh(Ŵk y(k))ᵀ ⊗ In)Vec(Ṽk) + (y(k)ᵀ ⊗ Vk D̂k)Vec(W̃k)),
Also for the disturbance observer, we take
d̂(k + 1) = d̂(k) + L(d(k + 1) − d̂(k))
and the total output power in the band {ω : |ω| > ω0 } is given by
Py(ω0) = 2∫_{ω0}^{∞} Sx(ω)/ω² dω
To obtain the analogue of the 3-dB bandwidth, we choose a fraction 0 < α < 1, for example 1/2, and ask for what value of ω0 is Py(ω0) smaller than α·Py(ε). The minimum value of such an ω0 satisfies the equation
Py(ω0) = α·Py(ε)
For example, choosing x(t) to be D^β w(t) where w(t) is white Gaussian noise (Sw(ω) = 1 ∀ω), we find that Sx(ω) = |ω|^{2β}. Here, β is any real number and D^β is a fractional differentiator if β > 0 or a fractional integrator if β < 0. If 2β − 2 < −1, ie, β < 1/2, we have
Py(ε) = 2∫_{ε}^{∞} dω·ω^{2β−2} = (2/(1 − 2β))ε^{2β−1}
and hence the α-bandwidth of this lowpass filter for such an x(t) is given by
1/ω0^{1−2β} = α/ε^{1−2β}
or equivalently,
ω0 = ε·α^{1/(2β−1)} > ε
More generally, we can choose an input
x(t) = Σ_{m=1}^{N} c(m)D^{βm} wm(t)
where wm (.), m = 1, 2, ..., N are independent white noise processes and βm <
1/2, m = 1, 2, ..., N . Then the input power spectral density is
Sx(ω) = Σ_{m=1}^{N} |c(m)|²|ω|^{2βm}
and corresponding to this input signal the α-bandwidth ω0 satisfies the equation
Σ_{m=1}^{N} (|c(m)|²/(1 − 2βm))ω0^{2βm−1} = α·Σ_{m=1}^{N} (|c(m)|²/(1 − 2βm))ε^{2βm−1}
In the discrete time case, the integrator is replaced by the accumulator, whose transfer function is
H(z) = (1 − z^{−1})^{−1}, |z| > 1
There is a singularity at z = 1 which prevents one from defining the DTFT, and it can be sorted out as follows:
H(e^{jω}) = lim_{r→1} (1 − r^{−1}exp(jω))/(1 + r^{−2} − 2r^{−1}cos(ω))
Since the singularity is at ω = 0, we consider the above expression for small |ω|. It is approximately given by
(1 − r^{−1} − jω)/((1 − r^{−1})² + ω²)
Note that unlike the continuous time case, here, the frequency range is [−π, π).
Now we take a random signal x(t) with power spectral density Sx(ω) that is zero when |ω| < ε. The total output power after passing this signal through the above low pass filter is given by
Py(ε) = 2∫_{ε}^{π} |1 − exp(−jω)|^{−2} Sx(ω)dω = ∫_{ε}^{π} (1 − cos(ω))^{−1} Sx(ω)dω
and the α-bandwidth ω0 satisfies
Py(ω0) = α·Py(ε)
To get closed form solutions for the α-bandwidth, we may take Sx (ω) = (1 −
cos(ω))n where n is a positive integer and then corresponding to this input
spectrum, the α-bandwidth ω0 satisfies
∫_{ω0}^{π} (1 − cos(ω))^{n−1} dω = α·∫_{ε}^{π} (1 − cos(ω))^{n−1} dω
The integrals in this expression are readily evaluated and the α-bandwidth de-
termined. Note that in both the discrete and continuous time situations, we
must set a threshold on the lower frequency region, so that the input spectrum
does not have any frequency components smaller than this threshold because of
the singularity of the integrator at zero frequency. This is equivalent to requir-
ing that the input signal does not contain any d.c. component since a constant
or dc signal when integrated or summed over an infinite time range gives an
infinite output.
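The discrete-time α-bandwidth condition can be solved numerically. The sketch below takes the illustrative case n = 2, α = 1/2, for which the antiderivative of (1 − cos ω) is ω − sin ω, and bisects for the ω0 satisfying the tail-power condition (the ε → 0 limit is used, which is harmless here since the integrand is integrable at ω = 0 for n ≥ 1).

```python
import numpy as np

# Solve  int_{w0}^pi (1-cos w)^(n-1) dw = alpha * int_0^pi (1-cos w)^(n-1) dw
# for n = 2: F(w) = w - sin(w) is the antiderivative, so F(w0) = (1-alpha)F(pi).
n, alpha = 2, 0.5
F = lambda w: w - np.sin(w)          # increasing on [0, pi]
target = (1.0 - alpha) * F(np.pi)    # tail condition rearranged
lo, hi = 0.0, np.pi
for _ in range(60):                  # bisection
    mid = 0.5 * (lo + hi)
    if F(mid) < target:
        lo = mid
    else:
        hi = mid
w0 = 0.5 * (lo + hi)
```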
d̂(t + 1) = d̂(t) + L(d(t + 1) − d̂(t)) = d̂(t) + L·ε(t + 1)
is given by
x̂(t + 1|t) = V̂(t|t)·tanh(Ŵ(t|t)x̂(t|t)) + d̂(t|t),
Ŵ(t + 1|t) = Ŵ(t|t), V̂(t + 1|t) = V̂(t|t),
d̂(t + 1|t) = d̂(t|t)
or equivalently,
ξ̂(t + 1|t) = ψ(ξ̂(t|t))
where
ψ(ξ) = [(V·tanh(Wx) + d̂)ᵀ, Vec(W)ᵀ, Vec(V)ᵀ, d̂ᵀ]ᵀ
for
ξ = [xT , V ec(W )T , V ec(V )T , dˆT ]T
and the next step in the EKF is
ξ̂(t + 1|t + 1) = ξ̂(t + 1|t) + K·(z(t + 1) − H ξ̂(t + 1|t))
where
H = [1, 0T ]T
Note that
z(t) = Hξ(t) + v(t)
The Kalman gain vector K is chosen so that T r(P (t + 1|t + 1)) is a minimum,
where
P(t + 1|t + 1) = cov(ξ(t + 1) − ξ̂(t + 1|t + 1))
= cov(ξ̂(t + 1|t) + e(t + 1|t) − ξ̂(t + 1|t) − K(He(t + 1|t) + v(t + 1)))
Also,
e(t + 1|t) = ψ′(ξ̂(t|t))e(t|t) + noise
so that
P(t + 1|t) = ψ′(ξ̂(t|t))P(t|t)·ψ′(ξ̂(t|t))ᵀ + Q
Now suppose that we run this EKF iteration loop for T iterations. Then, the converged weight matrices are W* = Ŵ(T|T), V* = V̂(T|T), and our approximated dynamical system is then
We wish to compare this dynamical system with the original dynamical system,
or more precisely, estimate the parameter θ of the original dynamical system
based on this approximated system. For that purpose, we apply the EKF to the
original dynamical system to estimate θ based on output measurements ỹ(t) of
the approximated dynamical system. Define the state vector
η(t) = [y(t), y(t − 1), ..., y(t − q), θ(t), d̂(t)]ᵀ
η(t + 1) = φ(η(t)) + noise
The EKF for this is
η̂(t + 1|t) = φ(η̂(t|t))
η̂(t + 1|t + 1) = η̂(t + 1|t) + K0 .(z̃(t + 1) − H0 η̂(t + 1|t))
Note that the measurement z̃(t), is taken from the approximated plant:
Here,
H0 = [1, 0T ]T ,
and the Kalman gain K0 and the error correlations P (t|t), P (t + 1|t) are com-
puted as usual but based on the original plant φ(.):
[12] Dual EKF with neural network for (a) modeling the plant dynamics by
a neural network and (b) using the output of the neural network to estimate
the plant parameters. The plant dynamics is given by
and the neural network that approximates this plant has a dynamics given by
Based on z(n), we construct the EKF for estimating the weights W(n), V(n) and based on z1(n), we construct another EKF for estimating the plant parameters θ = (b, c). These two EKFs are run in parallel and constitute the dual EKF. Write the plant dynamics in state variable form
xs(n + 1) = ψ(xs(n)) + εs(n + 1)
where
εd(n + 1) = d(n + 1) − d̂(n)
the first component of εs being εd, and
xs(n) = [y(n), y(n − 1), ..., y(n − q), θ(n)ᵀ, d̂(n)]ᵀ
d̂(n + 1) = d̂(n) + L·εd(n + 1)
Thus,
ψ(xs) = [F(xs,1(n), ..., xs,q+1(n)|xs,q+2(n), ..., xs,q+p+1(n)), xs,1(n), ...,
xs,1(n) = y(n), ..., xs,q(n) = y(n − q + 1), θ(n) = [xs,q+2(n), ..., xs,q+p+1(n)]ᵀ,
d̂(n) = xs,q+p+2(n)
The DEKF:
where
xN = [y1, V, W, d̂]ᵀ, φ(xN) = [V·tanh(W y1), V, W, d̂]ᵀ
Write
K′t(x, y) = ∂Kt(x, y)/∂x, K′′t(x, y) = ∂²Kt(x, y)/∂x²
The Kushner-Kallianpur filter without any approximations is given by
where
πt (φ) = E[φ(x(t))|Zt ]
Here, the measurement process is
where
μt (x) = Kt (x, y)ydy
so that
πt(Kt(x²)) = ∫E(Kt(x(t), y)|Zt)y² dy
≈ ∫Kt(x̂(t), y)y² dy + (1/2)(∫K′′t(x̂(t), y)y² dy)P(t)
Further,
πt (x2 ) = E(x(t)2 |Zt ) = x̂(t)2 + P (t),
πt (h) ≈ h(x̂(t))
πt(x²h(x)) ≈ x̂(t)²h(x̂(t)) + (1/2)(2h(x̂(t)) + 4h′(x̂(t))x̂(t) + h′′(x̂(t))x̂(t)²)P(t)
Thus,
πt(x²h(x)) − πt(x²)πt(h(x)) ≈ 2h′(x̂(t))x̂(t)P(t)
Thus, we get with these approximations,
dP(t) + 2x̂(t)dx̂(t) + (dx̂(t))² = Mt(x̂(t))dt + Lt(x̂(t))P(t)dt + 2σv^{−2}h′(x̂(t))x̂(t)P(t)(dz(t) − h(x̂(t))dt)
or equivalently, using Ito's formula,
dP(t) + 2x̂(t)(μt(x̂(t))dt + σv^{−2}P(t)h′(x̂(t))(dz(t) − h(x̂(t))dt)) + σv^{−2}P(t)²h′(x̂(t))²dt
Consider now the special situation in which x(t) satisfies the sde
where N (.) is a Poisson process with rate λ. Then, the generator of x(.) is given
by
Kt φ(x) = μ(x)φ (x) + λ(φ(x + σ(x)) − φ(x))
We get
Kt(x) = μ(x) + λ·σ(x),
Kt(x²) = 2μ(x)x + λ(σ(x)² + 2xσ(x))
We have
πt(Kt(x)) ≈ μ(x̂(t)) + λ·σ(x̂(t))
πt(Kt(x²)) ≈ 2μ(x̂(t))x̂(t) + 2μ′(x̂(t))P(t)
πt(x·h(x)) − πt(x)·πt(h(x)) ≈ h′(x̂(t))P(t)
and the EKF in this case simplifies to
The estimation error covariance P (t) must be shown to remain bounded with
time. This will guarantee the validity of our approximations. More precisely,
we have to show that the variance of
is bounded where x̂(t) satisfies the EKF while x(t) satisfies the above sde driven
by the Poisson process N (.). We cannot use P (t) for the variance of e(t) since
it is based on linearization of the original sde around the state estimate. We
shall do better by expanding the functions μ(x), σ(x) around x̂(t) upto second
degree terms in e(t) and then calculating how E(e(t)2 |Zt ) evolves with time.
This will be a more accurate analysis of the error than that obtained using only
P (t). In order to proceed with this computation, we shall make a central limit
approximation that e(t) is zero mean Gaussian with variance P (t) conditioned
on Zt :
The aim is to choose the control input u(n) as a non-random function of x(n)
so that
E[Σ_{n=0}^{N} L(n, x(n), u(n))]
is minimized,
and the resulting u(k) that optimizes this will automatically be guaranteed to be a function of x(k) and k only. This optimal u(k) is precisely the control input at time k. We can write this equation, also called the stochastic Bellman-Hamilton-Jacobi equation, as
V(k, x) = min_u (L(k, x, u) + ∫V(k + 1, f(x, w, u))p(w)dw), k = N − 1, N − 2, ..., 0
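The backward recursion can be illustrated on a finite toy model. The sketch below (a 3-state chain with dynamics f(x, w, u) = (x + u + w) mod 3, noise w ∈ {0, 1} with p = (0.7, 0.3), stage cost x² + 0.5u and terminal cost x²; all of these are illustrative assumptions, not from the text) runs the Bellman recursion and compares it against the value of the fixed policy u ≡ 0, which by optimality it can never exceed.

```python
import numpy as np

# Finite-horizon dynamic programming:
#   V(k,x) = min_u ( L(k,x,u) + sum_w V(k+1, f(x,w,u)) p(w) )
nS, nU, N = 3, 2, 4
pw = [0.7, 0.3]
V = np.array([x**2 for x in range(nS)], dtype=float)   # V(N, x) terminal cost
Vfix = V.copy()                                        # value of fixed policy u = 0
for k in range(N - 1, -1, -1):
    Vnew, Vfixnew = np.empty(nS), np.empty(nS)
    for x in range(nS):
        Q = [x**2 + 0.5 * u
             + sum(pw[w] * V[(x + u + w) % nS] for w in range(2))
             for u in range(nU)]
        Vnew[x] = min(Q)                               # Bellman minimization
        Vfixnew[x] = x**2 + sum(pw[w] * Vfix[(x + w) % nS] for w in range(2))
    V, Vfix = Vnew, Vfixnew
```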
V(t, r(t), v(t)) = min_{u(t)} (L(r(t), v(t), u(t), t)dt + E[V(t + dt, r(t) + dr(t), v(t) + dv(t))|r(t), v(t)])
or equivalently,
D(u) = vT ∂/∂r + f (t, r, v, u)T ∂/∂v + (1/2)T r(σ(t, r, v)σ(t, r, v)T ∂ 2 /∂v∂vT )
where
Fk(r) = Fk(r(q)), δrk = Σ_{j=1}^{p} (∂rk/∂qj)δqj
j=1
and hence the equations of motion, taking into account the constraints are
Σ_{k=1}^{n} [mk(d²rk/dt², ∂rk/∂qj) − (Fk, ∂rk/∂qj)] = 0, j = 1, 2, ..., p
In the special case when the forces are derived from a potential V (r), ie,
where
drk/dt = Σ_{j=1}^{p} (∂rk/∂qj)q′j
where
Mjk(q) = Σ_{m=1}^{n} mm(∂rm/∂qj, ∂rm/∂qk)
and
U (q) = V (r(q))
Remark: The constraints physically mean that there exist constraint forces
that are normal to the p dimensional surface defined by the equations r = r(q),
ie rk = rk (q1 , ..., qp ), k = 1, 2, ..., n that cause the particles to remain always
on this surface or equivalently, cause the particles to move tangential to this
surface or in other words, Newton’s equations of motion hold after projecting
both sides of it onto the tangent plane to the surface at each point. Now consider
a situation in which we wish to model this constrained dynamics using a neural
network. Let W1 , ..., Wn−1 denote the weight matrices of the first, second,...
nth layers respectively. Then, the states of the NN can be expressed as
where X(k) is the signal vector at the k th layer. u = X(1) is the input signal
vector and y = X(n) is the output signal vector. In the discretized Lagrangian
dynamics, we assume that noise is present. Therefore, the discretized dynamics
can be expressed as
w.r.t W. Suppose that we already have a guess value W0 for the weight matrix.
Let the optimal weights be a small perturbation of this, ie,
W = W0 + δW
and we get
E[‖F(u, w) − G(W, u)‖²] ≈ E[‖F(u, w) − G(W0, u)‖²]
+ δWᵀE[G′(W0, u)ᵀG′(W0, u)]δW
+ 2δWᵀE[G′(W0, u)ᵀ(F(u, w) − G(W0, u))]
− (δW ⊗ δW)ᵀE[G′′(W0, u)ᵀ(F(u, w) − G(W0, u))]
In these equations, it is being assumed that the input process u is also a random
process just as the dynamical system noise w is. We now define the column
vector
c = E[G′′(W0, u)ᵀ(F(u, w) − G(W0, u))]
Then,
(δW ⊗ δW)ᵀc = Σ_{a,b=1}^{K} δWa·δWb·c(K(a − 1) + b) = δWᵀCδW
where
C = ((c(K(a − 1) + b)))_{1≤a,b≤K}
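The identity (δW ⊗ δW)ᵀc = δWᵀCδW is a pure reshaping statement and can be checked numerically. In the sketch below (K = 3 and random test data, purely illustrative), C is obtained from c by a row-major K × K reshape, which matches the 1-indexed entry rule C(a, b) = c(K(a − 1) + b).

```python
import numpy as np

# (dW kron dW)^T c  versus  dW^T C dW  with C the K x K reshape of c
rng = np.random.default_rng(3)
K = 3
c = rng.normal(size=K * K)
dW = rng.normal(size=K)
lhs = np.kron(dW, dW) @ c        # (dW ⊗ dW)^T c
C = c.reshape(K, K)              # row-major reshape: C[a,b] = c[K*a + b]
rhs = dW @ C @ dW
```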
[16] Quantum neural network for estimating the joint pdf of a ran-
dom signal
y(t), t = 0, 1, 2, ... is a given stationary random process. We wish to estimate
the joint probability density of y(t) = (y(t + τk ), k = 1, 2, ..., p)T .
Let f(y, t) be the tentative joint pdf. We wish to improve upon it using quantum mechanics. It is known that if the initial wave function ψ(y, 0) of a quantum system evolves according to Schrodinger's equation with a real potential, then ∫|ψ(y, 0)|²d^p y = 1 implies ∫|ψ(y, t)|²d^p y = 1 ∀t > 0. This suggests to us that we improve upon our knowledge about the wave function by defining an "error potential"
and the wave function required for approximating the data pdf evolves according
to the Schrodinger equation
This Schrodinger evolution will always guarantee that |ψ(y, t)|2 remains a prob-
ability density function. The idea is that if the neural weight W (y, t) is large
positive, then the decay term −β1 W in the weight learning algorithm will guar-
antee a decrease in the same provided that |ψ(y, t)|2 < f (y, t). On the other
hand, if f (y, t) >> |ψ(y, t)|2 , then the second term in the weight learning algo-
rithm will guarantee rapid increase of the weight W (y, t) causing the potential
V (y, t) to get large and then the Schrodinger equation will guarantee increase
of |ψ(y, t)|2 so that it gets closer to f (y, t).
Chapter 12
[16] Quantum teleportation using entangled states as a fast means for trans-
mitting a d-qubit quantum state by transmitting only 2d classical bits.
[17] Quantum image processing using the Hudson-Parthasarathy unitary
evolution operator.
[18] Quantum entropy of a state and quantum relative entropy between two
states with application to the proof of the classical-quantum Shannon coding
theorem by encoding classical alphabets into density matrices, ie, mixed states.
[19] Generation of entangled states from mixed states for fast communication
using the Schrodinger unitary dynamics.
[2 a] Wave operators given two unbounded Hermitian operators with appli-
cation to scattering theory. Application of scattering theory to the design of
quantum gates. Choosing the control scattering potential so that the S-matrix
at a given energy E is as close as possible to a given unitary matrix.
[2 b] Scattering theory for quantum field theory in which the two Hamiltoni-
ans are functionals of a set of quantum fields like the creation and annihilation
operators in momentum space for electrons, positrons and photons.
[3] An introduction to quantum stochastic calculus and quantum filtering
and control theory.
[a] Boson Fock space,
[b] Exponential vectors in Boson Fock space.
[c] The Weyl operator and its role in the construction of the fundamental
noise operator fields on a Hilbert space and on the space of Hermitian operators
in a Hilbert space.
[d] The Weyl operator and its role in the construction of the fundamental
quantum noise processes.
[e] Alternative construction of the fundamental quantum noise processes
using the algebra generated by an infinite sequence of independent harmonic
oscillators.
[f] The Hudson-Parthasarathy noisy Schrodinger equation.
[g] Non-demolition measurements in the sense of Belavkin.
[h] Derivation of the Belavkin quantum filter.
[i] The equations of motion of a spin operator in the presence of quantum
noisy processes and its estimation in real time based on non-demolition mea-
surements.
the canonical commutation relations between the position and momentum den-
sity fields associated with the Lagrangian density of the gravitational field ap-
proximated upto quadratic orders in the fluctuating metric. Cubic terms in this
Lagrangian density are also considered as small perturbations (self-interacting
terms) to the quadratic component of the Lagrangian density and its effect on
graviton propagator corrections is derived.
where the integrals appearing on the rhs are standard quantum stochastic inte-
grals. To this end, we construct a sequence of adapted processes Xn (t), n ≥ 0
such that X0 (t) = X(0) and
Xn+1(t) = X(0) + ∫_0^t (L1Xn(s)dA(s) + L2Xn(s)dA(s)* + L3Xn(s)ds), n ≥ 0
we get
‖Fn(t)f e(u)‖² = ‖∫_0^t L2Dn(s)dA(s)*|f e(u) >‖²
= < ∫_0^t L2Dn(s)dA(s)* f e(u), ∫_0^t L2Dn(s)dA(s)* f e(u) >
= 2Re(∫_0^t < L2Dn(s)f e(u), Fn(s)f e(u) > u(s)ds) + ∫_0^t ‖L2Dn(s)f e(u)‖²ds
≤ K(u)∫_0^t ‖Fn(s)f e(u)‖²ds + ∫_0^t ‖L2Dn(s)f e(u)‖²ds + ∫_0^t ‖L2‖²‖Dn(s)f e(u)‖²ds
≤ 2‖L2‖²∫_0^t ‖Dn(s)f e(u)‖²ds + K(u)∫_0^t ‖Fn(s)f e(u)‖²ds
where
K(u) = sup_{s≥0} |u(s)|²
Combining all these inequalities, we get
Δn+1 (t) ≤ 3t‖u‖² ‖L1 ‖² ∫_0^t Δn (s)ds + 3‖Fn (t)f e(u)‖² + 3t‖L3 ‖² ∫_0^t Δn (s)ds
where
‖Fn (t)f e(u)‖² ≤ 2‖L2 ‖² ∫_0^t Δn (s)ds + K(u) ∫_0^t ‖Fn (s)f e(u)‖² ds
sup_{t∈[0,T ]} ‖Xn+p (t) − Xn (t)‖² ≤ p. Σ_{k=n+1}^{n+p} δ(k)
Then writing
c(t) = ∫_0^t b(s)ds
we have
c′ (t) ≤ a(t) + K.c(t)
and hence
c(t) ≤ ∫_0^t exp(K(t − s))a(s)ds ≤ exp(tK). ∫_0^t a(s)ds
Equivalently, iterating the inequality for b,
b(t) ≤ ... ≤ a(t) + Σ_{n=1}^N (K^n /(n − 1)!) ∫_0^t (t − s)^{n−1} a(s)ds + (K^{N+1} /N !) ∫_0^t (t − s)^N b(s)ds
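The chain of estimates above is the integral form of the Gronwall lemma. A small numerical check (my own sketch, with the simple choice a(t) = b(t) = 1 and K = 1, for which b ≤ a + K·c holds and the exponential bound has the closed form (e^{Kt} − 1)/K) confirms c(t) ≤ ∫_0^t exp(K(t−s))a(s)ds:

```python
import numpy as np

# Gronwall check: if b(t) <= a(t) + K*c(t) with c(t) = int_0^t b(s) ds,
# then c(t) <= int_0^t exp(K(t-s)) a(s) ds.  Here a = b = 1, K = 1.
K = 1.0
t = np.linspace(0.0, 3.0, 3001)
dt = t[1] - t[0]
b = np.ones_like(t)
# c(t) = int_0^t b(s) ds via cumulative trapezoid (here simply c(t) = t)
c = np.concatenate(([0.0], np.cumsum((b[1:] + b[:-1]) / 2) * dt))
# For a(t) = 1 the Gronwall bound integral is (exp(K t) - 1)/K
bound = (np.exp(K * t) - 1.0) / K
assert np.all(c <= bound + 1e-9)
print(c[-1], bound[-1])   # c(3) = 3, bound = e^3 - 1
```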
where
|a(m, n)| = X(m, n), |b(m, n)| = √(1 − |a(m, n)|²)
where X(m, n) has been shifted and normalized so that it falls in the range
[0, 1]. Specifically if
by a pure state ρ belonging to some family F, like say the family of all Gaussian
states:
ρ = argmaxσ∈F < ψ|σ|ψ >
This is equivalent to minimizing
‖σ − |ψ >< ψ| ‖²
over σ ∈ F where the norm used is the Frobenius norm. To process this quantum
image state, we first take the tensor product of this with the bath coherent state
|φ(u) >< φ(u)| where
and then allow the resulting state to evolve under the HP noisy Schrodinger
dynamics so that after time t, the state of the system⊗bath becomes
ρ(t) = U (t)(ρ ⊗ |φ(u) >< φ(u)|)U (t)∗
Normally, in standard quantum image processing theory, we would take the
partial trace of this evolved state after time T , construct its pure state approxi-
mation and then transform the resulting state into a classical image field by the
reverse of the process described above involving conversion of a classical image
field to a pure quantum state. In order to do this, however, we need to measure
the quantum state ρ(T ) after time T . An accurate measurement of this state
is however difficult and hence we adopt Belavkin’s quantum filtering method in
which we take non-demolition measurements of the form Yo (t) = U (t)∗ Yi (t)U (t), where
Yi (t) = c1 A(t) + c̄1 A(t)∗ + c2 Λ(t)
and then based on ηo (t) = {Yo (s) : s ≤ t}, we estimate the state using Belavkin’s
filter as ρB (t), which is defined by
T r(ρB (t)X) = πt (X)
where
πt (X) = E[jt (X)|ηo (t)], jt (X) = U (t)∗ XU (t)
is the Belavkin filtered observable at time t. Here, X is a system observable and
ρ is the system state at time t = 0. ρB (t) should be interpreted as a random
system space density matrix. We can write in terms of dual operators,
ρB (t) = πt∗ (ρ)
The Belavkin filter for observables is
dπt (X) = πt (LX)dt + (πt (Mt X + XMt∗ ) − πt (X)πt (Mt + Mt∗ ))(dYo (t) − πt (Mt + Mt∗ )dt)
and by duality, for states, it is
dπt∗ (ρ) = L∗ πt∗ (ρ)dt + [πt∗ (ρ)Mt + Mt∗ πt∗ (ρ) − T r(πt∗ (ρ)(Mt + Mt∗ ))πt∗ (ρ)][dYo (t) − T r(πt∗ (ρ)(Mt + Mt∗ ))dt]
After thus estimating the system state at time t, we can construct its optimum
pure state approximation and then do a quantum → classical conversion. We
note that if ρ is the initial system state and expectations are taken w.r.t the
probability distribution of the Abelian family Yo (t), t ≥ 0 in the state |f ⊗
φ(u) >, then with T rs denoting trace over the system Hilbert space, we have
[1] Let a1 , ..., ap be operators in a Hilbert space such that [ak , a∗m ] = δkm
and [ak , am ] = 0. Consider the Gaussian state
ρ = C(Q).exp(− Σ_{k,m=1}^p Qkm a∗k am )
where
C(Q) = det(1 − exp(−Q))
Here, Q is a positive definite p × p matrix. Prove that T r(ρ) = 1 and that ρ ≥ 0.
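For a diagonal Q the normalization T r(ρ) = 1 can be checked numerically by truncating each number basis (a sketch with illustrative eigenvalues; for a general positive definite Q one would first diagonalize it, as in problem [2]):

```python
import numpy as np

# Check Tr(rho) = 1 for rho = C(Q) exp(-sum_k q_k a_k^* a_k) with
# C(Q) = det(1 - exp(-Q)) and Q = diag(q), q_k > 0 (illustrative values).
q = np.array([0.8, 1.5])                 # eigenvalues of Q
C = np.prod(1.0 - np.exp(-q))            # C(Q) = det(1 - exp(-Q))
N = 200                                  # truncation of each number basis
n = np.arange(N)
trace = C * np.prod([np.exp(-qk * n).sum() for qk in q])
print(trace)                             # 1 up to (negligible) truncation error
```

Each mode contributes a geometric series Σ_n e^{−q_k n} = (1 − e^{−q_k})^{−1}, which is exactly what the determinant factor cancels.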
Calculate the quantum Fourier transform of ρ, ie,
ρ̂(z) = T r(ρ.W (z))
where
W (z) = exp(Σ_{k=1}^p (z̄k ak − zk a∗k )) = exp(a(z) − a(z)∗ )
and hence
W (z) = C(p) ∫ |φ(u) >< φ(u)|W (z)|φ(v) >< φ(v)|dudv
so that
ρ̂(z) = C(p) ∫ < φ(v)|ρ|φ(u) >< φ(u)|W (z)|φ(v) > dudv
with
< φ(u)|W (z)|φ(v) >= exp(−|z|2 /2− < z, v >)exp(< u|v+z >)exp(−|u|2 /2−|v|2 /2)
we obtain a sequence of linear equations for the matrix elements < n|ρ|m > of ρ
between the number states of the independent harmonic oscillators. Alternately,
once we know < φ(u)|ρ|φ(v) > by inverting a classical Fourier transform, we
can determine ρ by the formula
ρ = C(p) ∫ |φ(u) >< φ(u)|ρ|φ(v) >< φ(v)|dudv
[2] Evaluate the matrix elements of exp(− Σ_{k,m} Qkm a∗k am ) between the
occupation number states |n >.
hint: Write the spectral decomposition of Q as
Q = Σ_k c(k)ek e∗k , c(k) ≥ 0, e∗k em = δkm
Then
Σ_{k,m} Qkm a∗k am = Σ_k c(k)(a∗ ek )(e∗k a) = Σ_k c(k)a(ek )∗ a(ek )
where
a(ek ) = Σ_m ēk (m)am
and clearly
[a(ek ), a(em )∗ ] = δkm , [a(ek ), a(em )] = 0
Now let |f (n) > be the normalized joint eigenstate of the a(ek )∗ a(ek ) with eigenvalues nk :
a(ek )∗ a(ek )|f (n) >= nk |f (n) >, k = 1, 2, ..., p, < f (m)|f (n) >= δ[n − m]
We write
|f (n) >= Σ_m K(n, m)|m >
and therefore,
nk K(n, q) = Σ_{r,s,m} ek (r)ēk (s)K(n, m) < q|a∗r as |m >
= Σ_{r,s,m} ek (r)ēk (s)K(n, m) √(qr ) √(ms ) δ[q − m + us − ur ]
= Σ_{r,s} ek (r)ēk (s) √(qr (q + us − ur )s ) K(n, q + us − ur )
where ur is the p × 1 vector with a one in its rth position and zeros at all the
other positions. This set of linear equations has to be solved for the kernel
K(n, m), n, m ∈ Z^p_+ .
[3] Evaluate the matrix element of exp(a(z)∗ a(z)) between two coherent
states or equivalently between two occupation number states where
a(z) = Σ_{k=1}^p z̄k ak , [ak , a∗j ] = δkj , [ak , aj ] = 0
W (u)a(z)|e(v) >=< z, v > W (u)|e(v) >=< z, v > exp(−|u|2 /2− < u, v >)|e(v+u) >
a(z)W (u)|e(v) >= a(z)exp(−|u|2 /2− < u, v >)|e(v + u) >
= exp(−|u|2 /2− < u, v >) < z, v + u > |e(v + u) >
Thus,
[W (u), a(z)]|e(v) >= − < z, u > W (u)|e(v) >
and hence
W (u)a(z)W (u)∗ = a(z)− < z, u >
Then,
W (u)f (a(z), a(v)∗ )W (u)∗ = f (a(z)− < z, u >, a(v)∗ − < u, v >)
In particular,
= ρ̂(Az).exp((−1/2)R(z)T KR(z))
Now since ρ is a Gaussian state, we have
ρ̂(z) = exp(−(1/2)R(z)T SR(z) + mT R(z))
where S is a complex Hermitian matrix of size 2p× 2p and m is a complex 2p × 1
vector. Hence T ∗ transforms Gaussian states into Gaussian states. The problem
is to realize the transformation T using the GKSL equation. This problem was
first solved by K.R.Parthasarathy who considered a qsde that yields a family
of unitary operators which are dilations of T and hence by tracing out over
the bath, the unitary evolution becomes a GKSL equation that transforms a
Gaussian state after any time t into another Gaussian state.
[5] Let z ∈ Cp . Choose a unitary operator U on Cp so that U z = |z|e1 . For
example, writing
z = z1 e1 + ... + zp ep
where {e1 , ..., ep } is the standard onb for Cp , we define
since
Γ(U )a(z)|e(v) >=< z, v > |e(U v) >
on the one hand while on the other,
a(U z)Γ(U )|e(v) >= a(U z)|e(U v) >=< U z, U v > |e(U v) >=< z, v > |e(U v) >
so that,
Γ(U )a(z) = a(U z)Γ(U )
or equivalently, Γ(U )a(z)Γ(U )∗ = a(U z).
More generally, let |f1 >, ..., |fp > be any onb for Cp . Choose a unitary operator
in Cp so that U |fk >= |ek >, k = 1, 2, ..., p. We have
p
U= |ek >< fk |
k=1
Then,
< e(u)|exp(− Σ_{k=1}^p βk a(fk )∗ a(fk ))|e(v) >=
< e(U u)|exp(− Σ_k βk a(ek )∗ a(ek ))|e(U v) >=
Σ_{n1 ,...,np ≥0} exp(− Σ_{k=1}^p βk nk ) < e(U u)|n1 ...np >< n1 ...np |e(U v) >
where
< n1 ...np |e(z) >= z1^{n1} ...zp^{np} / √(n1 !...np !)
Thus,
< e(u)|exp(− Σ_{k=1}^p βk a(fk )∗ a(fk ))|e(v) >=
= Σ_{n1 ,...,np} exp(− Σ_{k=1}^p βk nk ) conj((U u)_1^{n1} ...(U u)_p^{np} ) (U v)_1^{n1} ...(U v)_p^{np} /(n1 !...np !)
so defining
Q = Σ_k βk |fk >< fk |
we have
U ∗ .exp(−D).U = exp(−Q), where D = diag[β1 , ..., βp ],
and thus,
< e(u)|exp(− Σ_{k=1}^p βk a(fk )∗ a(fk ))|e(v) >= exp(< u|exp(−Q)|v >)
It then follows that on defining the quantum Gaussian state
ρ = C.exp(− Σ_{k=1}^p βk a(fk )∗ a(fk ))
where
C = det(1 − exp(−Q))
we have
ρ̂(z) = T r(ρ.W (z)) = π^{−2p} ∫ exp(−|u|²/2 − |v|²/2)
LT JL = J
where
J = [[0, Ip ], [−Ip , 0]]
Clearly det L = ±1. Now let ρ be a Gaussian state. We wish to show that
Γ(L)ρΓ(L)∗ is also a Gaussian state where Γ(L) is the unique unitary operator
acting in L2 (Rp ) = Γs (Cp ) defined by the equation
or equivalently,
be positive definite for all z1 , ..., zN ∈ Cp and for all N = 1, 2, .... Now the Weyl
commutation relations give
Thus
Pab = exp(−i.Im(< za , zb >))T r(ρ.W (za − zb ))
= exp([mT1 , mT2 ][R(za −zb )T , I(za −zb )T ]T −iIm(< za , zb >)−[R(za −zb )T ,
V = [v1 , ..., v2p ] = [V1 |V2 ], V1 = [v1 , ..., vp ], V2 = [vp+1 , ..., v2p ]
Also define
C = diag[c1 , ..., cp ]
so we get
N T N V1 = V1 C, N T N V2 = V2 C −1 , N T N V = V.diag[C, C −1 ]
Note that
V T V = V V T = I2p
ie V is a real orthogonal matrix. This is possible since N T N is real symmetric.
Define
W = [V2 |V1 ]
Then,
N T N W = W.diag[C −1 , C]
Now,
N T N JV = J(N T N )−1 V = JV.diag[C −1 , C]
or equivalently,
N^T N Jvk = ck^{−1} Jvk , N^T N Jvk+p = ck Jvk+p , 1 ≤ k ≤ p
Hence, if we assume that the ck s are all distinct and further all differ from unity,
or more specifically, we may assume that c1 , ..., cd < 1 so that c1^{−1} , ..., cd^{−1} > 1,
then
Jvk = αk vk+p , Jvk+p = −αk^{−1} vk , 1 ≤ k ≤ p
where α1 , ..., αp are non-zero real numbers and since J T J = I, and the vk s are
normalized, it follows that
αk2 = 1, k = 1, 2, .., p
ie,
αk = ±1, 1 ≤ k ≤ p
Thus, we have
JV = W A, A = diag[A1 , −A1^{−1} ]
where
A1 = diag[α1 , ..., αp ]
and
V^T JV = V^T W A
Now,
W^T V = [V2 | V1 ]^T [V1 | V2 ] = [[0, Ip ], [Ip , 0]] = K = V^T W
say. Thus,
V^T JV = KA
Then, taking transposes,
AK = (KA)^T = −V^T JV = −KA
or, since A is diagonal and K is symmetric,
KA + AK = 0
or equivalently,
A1 = A1^{−1}
which merely tells us what we already know, ie, αk = ±1. We can actually,
without loss of generality, assume that αk = 1, k = 1, 2, ..., p. In fact, this
simply amounts to changing the sign of those eigenvectors vk+p , k = 1, 2, ..., p for
which αk = −1. Equivalently, we may first define v1 , ..., vp as real orthonormal
eigenvectors of N^T N with eigenvalues c1 , ..., cp respectively, and then define
vk+p = Jvk , k = 1, 2, ..., p. Then by symplecticity of N^T N , it follows that
N^T N vk+p = N^T N Jvk = J(N^T N )^{−1} vk = ck^{−1} Jvk = ck^{−1} vk+p , k = 1, 2, ..., p. We then
easily get A = diag[I, −I] and hence,
V T JV = J
N T N = V.diag[C, C −1 ]V T , V T V = I = V V T
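The reciprocal-pair structure of the eigenvalues of N^T N for a symplectic N can be verified numerically; the construction of the symplectic test matrix below (a block-diagonal factor times a symmetric shear) is my own choice of illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3
I, Z = np.eye(p), np.zeros((p, p))
J = np.block([[Z, I], [-I, Z]])

# A fairly generic symplectic N: [[A,0],[0,A^{-T}]] times a shear [[I,B],[0,I]]
A = rng.standard_normal((p, p)) + 3 * I             # invertible block
B = rng.standard_normal((p, p)); B = B + B.T        # symmetric shear block
N = np.block([[A, Z], [Z, np.linalg.inv(A).T]]) @ np.block([[I, B], [Z, I]])
assert np.allclose(N.T @ J @ N, J)                  # N is symplectic

# Eigenvalues of N^T N (sorted ascending) come in reciprocal pairs (c, 1/c):
c = np.sort(np.linalg.eigvalsh(N.T @ N))
assert np.allclose(c * c[::-1], 1.0)
print(c)
```

This is exactly the diag[C, C^{−1}] structure used in the text's diagonalization N^T N = V·diag[C, C^{−1}]·V^T.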
ρ = C.exp(−(1/2)[q^T , p^T ]Q[q^T , p^T ]^T )
Then,
< e(u)|ρ|e(v) >= C < e(u)|Γ(L)∗ Γ(L).exp(−(1/2)[q^T , p^T ]Q[q^T , p^T ]^T )Γ(L)∗ Γ(L)|e(v) >
Writing
ak = (qk + ipk )/√2, a∗k = (qk − ipk )/√2
we get that
[ak , aj ] = 0, [a∗k , a∗j ] = 0, [ak , a∗j ] = δkj
and then since
qk2 + p2k = 2a∗k ak + 1
we get
< e(u)|ρ|e(v) >= C. < e(L̃u)|exp((−1/2) Σ_{k=1}^n dk (2a∗k ak + 1))|e(L̃v) >
= C exp(−(1/2)T r(D)). < e(L̃u)|exp(− Σ_{k=1}^n dk a∗k ak )|e(L̃v) >
= C exp((−1/2)T r(D)) Σ_n < e(L̃u)|n >< n|e(L̃v) > exp(−d.n)
= C exp((−1/2)T r(D)) Σ_n exp(−d.n)conj((L̃u)^n ).(L̃v)^n /n!
T r(exp((−1/2) Σ_k dk (2a∗k ak + 1))) =
exp((−1/2)T r(D)) Σ_n exp(−d.n) = exp((−1/2)T r(D))det(1 − exp(−D))^{−1}
It follows that
and hence ρ is a Gaussian state. Note that the Fourier transform of ρ is given
by
ρ̂(z) = T r(ρW (z)) =
= π^{−n} ∫ < e(u)|ρ|e(v) >< e(v)|W (z)|e(u) > exp(−|u|²/2 − |v|²/2)dudv
= π^{−n} det(1 − exp(−D)) ∫ exp((−1/2) < u|L̃∗ exp(−D)L̃|v >)
Here, c(k, j), d(k, j) are complex numbers. The GKSL equation in Heisenberg
matrix mechanics is
where
θ(X) = Σ_k (L∗k Lk X + XL∗k Lk − 2L∗k XLk ) = Σ_k (L∗k [Lk , X] + [X, L∗k ]Lk )
k
We write
θT (X) = i[H, X] − (1/2)θ(X)
Then
X(t) = Tt (X(0)), Tt = exp(tθT )
Note that
Tt+s = Tt oTs , t, s ≥ 0
Tt∗ is a CPTP map. If ρ is the state at time 0, then ρ(t) = Tt∗ (ρ) is the state at
time t:
T r(ρ(t)X) = T r(Tt∗ (ρ)X) = T r(ρ.Tt (X))
So
ρ̂(t, z) = T r(ρ(t)W (z)) = T r(ρ.Tt (W (z)))
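The claims that Tt∗ is CPTP and that Tt+s = Tt ∘ Ts can be checked numerically on a small example. The generator below is a single-qubit amplitude-damping Lindbladian (H = 0, one Lindblad operator L = σ−; my own choice of example, not taken from the text), written as a superoperator in column-stacking convention:

```python
import numpy as np
from scipy.linalg import expm

sm = np.array([[0, 0], [1, 0]], dtype=complex)   # sigma_minus
I2 = np.eye(2, dtype=complex)
LdL = sm.conj().T @ sm

# GKSL generator on states: d rho/dt = L rho L* - (1/2){L*L, rho};
# column-stacking identity vec(A rho B) = (B^T kron A) vec(rho)
gen = (np.kron(sm.conj(), sm)
       - 0.5 * (np.kron(I2, LdL) + np.kron(LdL.T, I2)))

rho0 = np.array([[0.3, 0.2 - 0.1j], [0.2 + 0.1j, 0.7]])
for tval in (0.5, 1.0, 2.0):
    rho_t = expm(tval * gen) @ rho0.reshape(4, order='F')
    rho_t = rho_t.reshape(2, 2, order='F')
    assert abs(np.trace(rho_t) - 1) < 1e-12            # trace preserving
    assert np.min(np.linalg.eigvalsh(rho_t)) > -1e-12  # positivity preserved
# semigroup property T_{t+s} = T_t o T_s
assert np.allclose(expm(1.5 * gen), expm(1.0 * gen) @ expm(0.5 * gen))
print("CPTP and semigroup checks passed")
```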
We shall derive a pde satisfied by ρ̂(t, z), the quantum Fourier transform of the
state at time t under the GKSL dynamics, ie dynamics of a quantum system
coupled to a bath. We have
[H, W (z)] = Σ_k ωk [a∗k ak , W (z)] = Σ_k ωk ([a∗k , W (z)]ak + a∗k [ak , W (z)])
= Σ_k ωk (z̄k W (z)ak + zk a∗k W (z))
= Σ_k ωk (−|zk |² + z̄k ak + zk a∗k )W (z)
Also,
[Lk , W (z)] = Σ_j (c(k, j)[aj , W (z)] + d(k, j)[a∗j , W (z)])
= Σ_j (c(k, j)zj + d(k, j)z̄j )W (z)
So,
Σ_k L∗k [Lk , W (z)] = Σ_{k,j} (c(k, j)zj + d(k, j)z̄j )L∗k W (z)
= Σ_{k,j,m} [c(k, j)zj (c̄(k, m)a∗m + d̄(k, m)am ) + d(k, j)z̄j (c̄(k, m)a∗m + d̄(k, m)am )]W (z)
Likewise,
[W (z), L∗k ] = Σ_j [W (z), c̄(k, j)a∗j + d̄(k, j)aj ]
= − Σ_j (c̄(k, j)z̄j + d̄(k, j)zj )W (z)
so that
Σ_k [W (z), L∗k ]Lk =
= − Σ_{k,j} (c̄(k, j)z̄j + d̄(k, j)zj )W (z)Lk
= − Σ_{k,j} (c̄(k, j)z̄j + d̄(k, j)zj )([W (z), Lk ] + Lk W (z))
= [ Σ_{k,j,m} c̄(k, j)z̄j (c(k, m)zm + d(k, m)z̄m − c(k, m)am − d(k, m)a∗m )
+ Σ_{k,j,m} d̄(k, j)zj (c(k, m)zm + d(k, m)z̄m − c(k, m)am − d(k, m)a∗m )]W (z)
where
q0 (z) = (−1/2) Σ_{k,j,m} [c̄(k, j)c(k, m)z̄j zm + d̄(k, j)c(k, m)zj zm + c̄(k, j)d(k, m)z̄j z̄m + d̄(k, j)d(k, m)zj z̄m ]
is quadratic in z, z̄ and
ψm (z) = (1/2) Σ_{k,j} [(c(k, j)c̄(k, m)zj + d(k, j)c̄(k, m)z̄j ) − (d̄(k, j)d(k, m)zj + c̄(k, j)d(k, m)z̄j )]
is linear in z, z̄. It follows that
θT (W (z)) = (q(z) + Σ_m (φm (z)am + χm (z)a∗m ))W (z)
where
q(z) = q0 (z) − i Σ_k ωk |zk |²,
∂ ρ̂(t, z)/∂t =
Now
exp(δt. Σ_m (φ̄m (z)am − φm (z)a∗m ))W (z)
and hence
Thus,
= δt.q(z).ρ̂(t, z) + T r(ρ(t).exp(δt.(φ̄(z).a − φ(z).a∗ ))W (z))
= δt.q(z).ρ̂(t, z) + exp(−(δt/2)(φ̄(z).z − φ(z).z̄)).T r(ρ(t).W (z + δtφ(z)))
= δt.q(z)ρ̂(t, z) + exp(−(δt/2)(φ̄(z).z − φ(z).z̄)).ρ̂(t, z + δt.φ(z))
with neglect of O(δt2 ) terms. Thus,
∂ ρ̂(t, z)/∂t =
LALT = diag[D, D]
B = U [[0, D], [−D, 0]] U^T
Thus without loss of generality, we may assume that λ1 , ..., λn > 0. Then, define
and moreover,
uk = (vk + v̄k )/2, wk = (vk − v̄k )/2i
implies in conjunction with the fact that since vk is an eigenvector of the Her-
mitian matrix −iB with eigenvalue λk while v̄k is an eigenvector of −iB with
eigenvalue −λk , we have vkT vj = 0, k, j = 1, 2, ..., n. Then
and further,
LT AL = diag[D, D]
It is clear that LJLT = J implies J = L−1 JL−T which in turn implies on taking
inverse that LT JL = J. Thus, L is a symplectic matrix satisfying the desired
conditions.
so
< e(v)|Γ(L)∗ W (z)Γ(L)|e(u) >=< e(L̃v)|W (z)|e(L̃u) >
< e(v)|W (L̃z)|e(u) >= exp(−|L̃z|2 /2− < L̃z|u > + < v|L̃z + u >)
Question: What is the class of symplectic matrices L for which this identity is
true ?
[11] A set of prerequisites for understanding the mathematical foundations
of quantum mechanics, quantum stochastics and quantum scattering theory.
[1] Unbounded operators in a Hilbert space.
[2] Born scattering
[3] Basics of quantum field theory.
[4] Statistics of Brownian motion, Poisson processes and other stochastic
processes derived from these.
[5] Quantum stochastic integration and quantum stochastic calculus.
[7] Lippmann-Schwinger equations in quantum scattering theory.
|Φa > are the free particle states and |Ψa > are the corresponding scattered
states. The scattered state |Ψ+a > at energy Ea satisfies
|Ψ+a >= |Φa > −(H0 − Ea + iε)^{−1} V |Ψ+a > − − −(1)
where
Tba =< Φb |V |Ψ+a >
If the contour for the integral w.r.t a = Ea on the rhs is taken over the infinite
lower semicircle, then the pole at Ea = Eb + iε is not enclosed and the resulting
contour integral is zero. However, in this case, as t → ∞, it is clear that the
contribution to the contour integral from the semicircular arc goes to zero, since
the real part of −iEa t, for Ea having a negative imaginary part, goes to −∞ as t → ∞. Thus,
the contour integral is the same as the integral over R and hence, we deduce
that
∫ g(a)exp(−iEa t)|Ψ+a > da − ∫ g(a)exp(−iEa t)|Φa > da
converges to zero as t → ∞. This proves that |Ψ+a > defined by the Lippmann-Schwinger
equation (1) is an out scattered state, ie, its time dependent version
converges as t → ∞ to the corresponding time dependent version of the free
particle state |Φa >. A similar argument shows that if |Ψ−a > is defined by
|Ψ−a >= |Φa > −(H0 − Ea − iε)^{−1} V |Ψ−a > − − −(2)
then |Ψ−a > is an in-scattered state, ie, its time dependent version
∫ g(a)exp(−iEa t)|Ψ−a > da
where
R(E) = 2πi(V − V (H − E + iε)^{−1} V )
To see this, we start with the Lippmann-Schwinger equations
or equivalently,
R(b, a)(Eb − Ea ) = V (H − Eb − iε)^{−1} (V − Eb + Ea ) − (Eb − Ea + V )(H − Ea − iε)^{−1} V
Thus,
R(b, a)(Eb − Ea )δ(Eb − Ea ) =
−(Eb − Ea )δ(Eb − Ea ).[V (H − Ea − iε)^{−1} + (H − Ea − iε)^{−1} V ]
This equation is the same as
V − V (H − E + iε)^{−1} V = −V (H − E + iε)^{−1} (V − (H − E + iε))
= −V (H − E + iε)^{−1} (E − H0 − iε)
where P x^{−1} equals zero when x = 0 and x^{−1} otherwise, we get on multiplying
both sides of the above equation by δ(Eb − Ea )(Eb − Ea + iε)^{−1} = −iπ.δ(Eb − Ea ),
and likewise,
−2πi < Ea |V − V (H − Ea + iε)^{−1} V |Ea > dEa =< Ea |V (H − Ea + iε)^{−1} |Ea > dEa
Hψ(r) = Eψ(r)
within the box and that ψ vanishes at the boundary. Show that this energy
spectrum is given by
Use this fact to solve the quantum mechanical tunneling problem: If the potential
is V (x) = V1 , x < 0, V (x) = V2 , 0 < x < L, V (x) = V3 , x > L and if the
particle comes from x = −∞ with an energy E satisfying V1 , V3 < E < V2 , then the
stationary state Schrodinger equation for this particle is given by (take h/2π = 1)
The solution corresponding to plane waves from the left getting partly reflected
and partly transmitted at x = 0 and getting transmitted from x = L to x → ∞
is given by
ψ(x) = C1 .exp(ik1 x) + C2 .exp(−ik1 x), x < 0,
ψ(x) = C3 exp(αx) + C4 exp(−αx), 0 < x < L,
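Matching ψ and ψ′ at x = 0 and x = L determines the coefficients. A hedged numerical solution for the special case V1 = V3 = 0, V2 = V0 (ħ = m = 1; parameter values are my own), checked against the standard closed-form transmission coefficient:

```python
import numpy as np

E, V0, L = 0.5, 1.0, 1.5
k = np.sqrt(2 * E)              # k1 = k3 since V1 = V3 = 0
al = np.sqrt(2 * (V0 - E))      # alpha in the barrier

# unknowns: (C2, C3, C4, t') with C1 = 1 and t' = t*exp(i k L)
M = np.array([
    [1, -1, -1, 0],                                # psi continuous at 0
    [-1j * k, -al, al, 0],                         # psi' continuous at 0
    [0, np.exp(al * L), np.exp(-al * L), -1],      # psi continuous at L
    [0, al * np.exp(al * L), -al * np.exp(-al * L), -1j * k],  # psi' at L
], dtype=complex)
rhs = np.array([-1, -1j * k, 0, 0], dtype=complex)
r, A, B, tp = np.linalg.solve(M, rhs)

T = abs(tp) ** 2                # transmission probability
R = abs(r) ** 2                 # reflection probability
T_exact = 1.0 / (1.0 + V0**2 * np.sinh(al * L)**2 / (4 * E * (V0 - E)))
assert abs(R + T - 1) < 1e-12   # probability conservation
assert abs(T - T_exact) < 1e-12
print(T)
```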
For proving that x, p are self-adjoint, you may use von Neumann’s condition:
Verify that the operators (x + i)−1 , (p + i)−1 are bounded operators in L2 (R),
and hence x + i, p + i both have range equal to the whole of L2 (R).
[5] Path integrals
[a] If H = p²/2m + V (x) is the Hamiltonian of a non-relativistic particle,
the associated Lagrangian is L(q, q′ ) = mq′²/2 − V (q). Evaluating
ψ1 (q2 ) = C. ∫ exp(iΔ.L(q1 , (q2 − q1 )/Δ))ψ0 (q1 )dq1
up to O(Δ), show that with the constant C chosen appropriately, we have
Conclude that if T > 0, we can express the solution to the Schrodinger evolution
equation after time T as
ψT (q2 ) = ∫ KT (q2 , q1 )ψ0 (q1 )dq1
where
KT (q, q0 ) = lim_{n→∞} Cn . ∫ exp(i Σ_{k=0}^{n−1} Δ.L(qk , (qk+1 − qk )/Δ))dq1 ...dqn−1
where
Δ = Δn = T /n, qn = q
for an appropriate sequence of constants {Cn }. Justify that this solution to the
Schrodinger evolution kernel can be expressed as a path integral
KT (q, q0 ) = ∫_{q(0)=q0 ,q(T )=q} exp(i ∫_0^T L(q(t), q′ (t))dt) Π_{0<t<T} dq(t)
[b] Evaluate the Schrodinger evolution kernel for a 1-D quantum harmonic
oscillator with Hamiltonian
H = p2 /2m + mω 2 q 2 /2
using path integrals and verify that it agrees with that evaluated by actually
solving the stationary Schrodinger equation for the eigenfunctions ψn (x), n =
0, 1, 2, ... with respective energy eigenvalues En , n = 0, 1, 2, ... and then forming
the evolution kernel
KT (x2 , x1 ) = Σ_{n=0}^∞ exp(−iEn T )ψn (x2 )ψ̄n (x1 )
Now expand q(t) as a Fourier sinewave series over [0, T ] keeping the end points
fixed:
q(t) = a + bt + Σ_{n≥1} qn √(2/T ).sin(nπt/T )
where
a = x1 , a + bT = x2
Then justify that the path measure Π0<t<T dq(t), after an appropriate normal-
ization, can be replaced by the product measure Πn≥0 dqn and then substitute
the Fourier series expansion for q(t) into the action integral and evaluate the
path integral using standard Gaussian integrals.
[c] Evaluate using the above method, the path integral for a forced harmonic
oscillator described by the Lagrangian
L(q, q′ , t) = mq′²/2 − mω²q²/2 + f (t)q, 0 ≤ t ≤ T
where q′ , p′ are real vectors and q, p are the position and momentum operator
vectors. Show that this kernel can be expressed as
Note that now H(t, q′ , p′ ) is a real/complex number, not an operator. Now show
that the position space wave functions that are eigen-functions of the momentum
operator are
< q′ |p′ >= C.exp(i(q′ , p′ ))
Hence, the above kernel becomes
∫ exp(ip′ .(q − q′ ) − iΔ.H(t, q′ , p′ ))dp′
Deduce using this result that the finite time evolution kernel corresponding to
this Hamiltonian is the path integral
∫ exp(i ∫_0^T (p(t).q′ (t) − H(t, q(t), p(t)))dt) Π_{0<t<T} dq(t)dp(t)
Deduce that in the special case when H(t, q, p) = p2 /2m + V (q), we get on
integration w.r.t p in this path integral the earlier result for non-relativistic
quantum mechanics involving a path integral only over q.
Then, the transition probability from state |n > to state |m > in time t is given
by
Pt (m|n, ε) = ε² | ∫_0^t f (s) < m|V (s)|n > ds|² + O(ε³ )
and hence
IT (x) = (1/2) ∫_{[0,T ]²} Rf^{−1} (t, s)x(t)x(s)dtds
c²Q(x0 ) = z
so
c = √(z/Q(x0 ))
and hence
IZ (z) = IT (cx0 ) = z.IT (x0 )/Q(x0 ) = μ0 z
This is the rate function for the family Z(ε), ε → 0.
H0 = Σ_{k=1}^p ω(k)a∗k ak ,
V = f (a∗ , a)
where
a = (a1 , ..., ap ), a∗ = (a∗1 , ..., a∗p )
f is assumed to be a polynomial function. The Bosonic commutation relations
are
[ak , a∗m ] = δkm
We have
[W (z), ak ] = −zk W (z), [W (z), a∗k ] = −z̄k W (z)
Equivalently,
and hence,
W (z)f (a∗ , a)W (−z) = f (a∗ + z̄, a + z)
which gives
[f (a∗ , a), W (z)] = −(W (z)f (a∗ , a)W (−z) − f (a∗ , a))W (z)
Let
Lk = lk .a + mk .a∗ , L∗k = ¯lk .a∗ + m̄k .a
or equivalently, when written in full,
Lk = Σ_{j=1}^p (lk (j)aj + mk (j)a∗j ),
L∗k = Σ_{j=1}^p (l̄k (j)a∗j + m̄k (j)aj )
We shall more generally consider Lindblad operators that are arbitrary functions
of the creation and annihilation operators:
Lk = Fk (a∗ , a)
In the special case of Lk s that are linear in the a, a∗ , we have as observed earlier,
= −Fk∗ (Fk (a∗ + z̄, a + z) − Fk )W (z) − (Fk (a∗ − z̄, a − z)∗ − Fk∗ )Fk W (z)
Then taking H = V = f (a∗ , a), we get
and
χ(z, a∗ , a) = −f (a∗ + z̄, a + z) + (1/2) Σ_k [Fk (a∗ , a)∗ Fk (a∗ + z̄, a + z) + Fk (a∗ − z̄, a − z)∗ Fk (a∗ , a)]
Then we can write
θT (W (z)) = [ψ(a∗ , a) + χ(z, a∗ , a)]W (z)
Thus,
Tt (W (z)) = exp(t(ψ(a∗ , a) + χ(z, a∗ , a)))W (z)
and hence
ρ̂(t, z) = T r(ρ(0).exp(t(ψ(a∗ , a) + χ(z, a∗ , a)))W (z))
We have then,
∂ ρ̂(t, z)/∂t =
T r(ρ(t).(ψ(a∗ , a) + χ(z, a∗ , a))W (z))
Now write
ψ(a∗ , a) + χ(z, a∗ , a) = ∫ K(z, u).exp(ū.a − u.a∗ )dudū
Then,
∂ ρ̂(t, z)/∂t =
= ∫ K(z, u).T r(ρ(t).exp(ū.a − u.a∗ )W (z))dudū
Now,
exp(ū.a − u.a∗ )W (z) = W (u)W (z)
W (z)|e(v) >= exp(−|z|2 /2− < z, v >)|e(v + z) >
W (u)W (z)|e(v) >= exp(−|z|2 /2− < z, v > −|u|2 /2− < u, v+z >)|e(v+z+u) >
= exp(−|z + u|2 /2− < z + u, v > +Re(< z, u >)− < u, z >)|e(v + z + u) >
= exp(−iIm(< u, z >))W (u + z)|e(v) >
and hence,
∂ ρ̂(t, z)/∂t =
= ∫ K(z, u).T r(ρ(t).exp(ū.a − u.a∗ )W (z))dudū
= ∫ K(z, u)exp(−iIm(< u, z >))ρ̂(t, u + z)dudū
H = (α, p) + βm + V (r)
Define
αr = (α, n), n = r/r
Then,
αr² = 1,
Note that
α = [[σ, 0], [0, −σ]]
We also denote by σ, the 4 × 4 matrix vector
diag[σ, σ] = I2 ⊗ σ
αr (α, p) = diag[(σ, n)(σ, p), (σ, n)(σ, p)]
= pr + i(σ, n × p) = pr + ir^{−1} (σ, L)
where
pr = (n, p) = −i∂/∂r, L = r × p
Thus,
(α, p) = αr (pr + ir−1 (σ, L))
and we get
H = αr (pr + ir−1 (σ, L)) + βm + V
Note that
[αr , pr ] = [r^{−1} (α, r), r^{−1} (r, p)]
= r^{−2} αk xm [xk , pm ] + r^{−1} [r^{−1} , (r, p)](α, r)
= ir^{−2} (α, r) + r^{−1} xk [r^{−1} , pk ](α, r)
= ir^{−1} αr − ir^{−1} (xk xk /r³ )(α, r) = 0
Now define the observable
k = β((σ, L) + 1)
where by σ, we mean diag[σ, σ]. We have
k = [[0, (σ, L) + 1], [(σ, L) + 1, 0]]
Then,
[αr , k] = [diag[(σ, n), −(σ, n)], k] = [[0, {(σ, n), (σ, L) + 1}], [−{(σ, n), (σ, L) + 1}, 0]]
where {., .} denotes the anticommutator. Now,
{(σ, n), (σ, L)} = (n, L) + (L, n) + i(σ, n × L + L × n) = i(σ, n × L + L × n)
since
n.L = L.n = 0
Now,
(n × L + L × n)a = ε(abc)(nb Lc − Lc nb ) = ε(abc)[nb , Lc ]
But,
[nb , Lc ] = ε(crs)[nb , xr ps ] = ε(crs)xr [nb , ps ] = iε(crs)xr ∂nb /∂xs = iε(crb)xr /r
and thus,
ε(abc)[nb , Lc ] = iε(abc)ε(crb)xr /r = 2ixa /r = 2ina
It follows that
{(σ, n), (σ, L)} = −2(σ, n)
and hence, we deduce that
[αr , k] = 0
+O(dt3 )
The authors simplify the computation of this evolution operator by passing over
to the momentum representation where the kinetic energy operator T becomes
diagonal. To go over from the position to the momentum representation, they
use the quantum Fourier transform. Finally, to simulate the reaction dynamics,
the authors use a 3-qubit system based on discretizing space into eight pixels
and representing the wave function in space by an 8 × 1 column vector that
varies with time. The paper is interesting both from a theoretical and an ap-
plication viewpoint. Some indication of higher order in dt approximations of
the Schrodinger unitary evolution operator U (t + dt, t) may be provided in the
where the tensor product is taken in the lexicographic order. Define the column
vector
|a(u, v) >= ⊗_{u′ ,v′ =1}^K (δ(u − u′ , v − v′ )|1 > +(1 − δ(u − u′ , v − v′ ))|0 >)
Then clearly
< a(u, v)|ψp,q >= Xp,q (u, v)
and therefore,
|ψp,q >= Σ_{u,v=1}^K Xp,q (u, v)|a(u, v) >
We call this a diagonal Gaussian state since it is diagonal w.r.t. the canonical
basis |n >= |n1 , ..., nR >, nk = 0, 1, ... with
a∗k ak |n >= nk |n >, 1 ≤ k ≤ R
We approximate ρ by its truncated version
ρ̃ = Σ_{n1 ,...,nR =0}^{Q−1} |n1 , ..., nR > p(n1 , ..., nR |λ) < n1 , ..., nR |
where
p(n1 , ..., nR |λ) = exp(−(λ1 n1 + ... + λR nR ))/ZQ (λ)
where
ZQ (λ) = Σ_{n1 ,...,nR =0}^{Q−1} exp(−(λ1 n1 + ... + λR nR )) = Π_{k=1}^R ZQ (λk )
with
ZQ (x) = Σ_{n=0}^{Q−1} exp(−nx) = (1 − exp(−Qx))/(1 − exp(−x))
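The factorization ZQ(λ) = Π_k ZQ(λk) and the closed form of the truncated geometric sum can be checked by brute-force enumeration (R = 2 here; the parameter values are my own illustrative choices):

```python
import numpy as np

Q = 4
lam = np.array([0.3, 0.9])

def ZQ(x):
    """Truncated geometric partition sum (1 - e^{-Qx})/(1 - e^{-x})."""
    return (1 - np.exp(-Q * x)) / (1 - np.exp(-x))

brute = sum(np.exp(-(lam[0] * n1 + lam[1] * n2))
            for n1 in range(Q) for n2 in range(Q))
assert abs(brute - ZQ(lam[0]) * ZQ(lam[1])) < 1e-12
print(brute)
```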
Now consider the Boson Fock space Γs (L2 (R+ )). Choose distinct vectors
u1 , ..., uP in L2 (R+ ) so that P = QR . Now consider the vectors
|f (r) >= Σ_{s=1}^P c(r, s)|e(us ) >, r = 1, 2, ..., P
or equivalently,
C̄WC = IP
where
C = ((c(r, s))), W = ((exp(< us |us′ >)))
One way to choose C is to take
C = W−1/2
One purification of ρ̃ is given by
|φ >= Σ_{n1 ,...,nR =0}^{Q−1} √(p(n|λ))|n > ⊗|f (r(n)) >
where
annihilation differentials in Boson Fock space relative to the vectors |f (r(n)) >,
n ∈ {0, 1, ..., Q − 1}^R :
< f (r(m))|dA(t)|f (r(n)) >= Σ_{s,s′} c̄(r(m), s).c(r(n), s′ ) < e(us )|dA(t)|e(us′ ) >
= dt Σ_{s,s′} c̄(r(m), s).c(r(n), s′ )us′ (t).exp(< us |us′ >)
and likewise,
< f (r(m))|dA(t)∗ |f (r(n)) >= Σ_{s,s′} c̄(r(m), s).c(r(n), s′ ) < e(us )|dA(t)∗ |e(us′ ) >
= dt Σ_{s,s′} c̄(r(m), s).c(r(n), s′ )ūs (t).exp(< us |us′ >)
|ψ >= Σ_{n1 ,...,nR =0}^{Q−1} √(q(n|λ))|n > ⊗|f (r(n)) >
We consider the following second order Dyson series approximation for the HP
equation in the absence of a Hamiltonian:
W = I − i ∫_0^T (LdA(t) − L∗ dA(t)∗ ) − ∫_{0<s<t<T} (LdA(t) − L∗ dA(t)∗ )(LdA(s) − L∗ dA(s)∗ ) − iT LL∗ /2
We then evaluate the matrix element
[18] EKF applied to state estimation in quantum systems. The state vector
follows the noisy Schrodinger dynamics
where f (t) is a random process. If f (t) is white noise, then we must formulate
it as an sde with an Ito correction term that guarantees unitary dynamics:
Measurements are taken at discrete times t1 < t2 < ... < tn < ... on taking into
account the notion of state collapse. Let Ma , a = 1, 2, ..., N define a POVM,
ie Ma > 0, Σa Ma = I. Then, after the measurement at time tn is taken, the
state collapses to
ψ(tn + 0) = Man ψ(tn − 0)/ < ψ(tn − 0)|Man |ψ(tn − 0) >1/2
provided that the measured outcome is an . It should be noted that the proba-
bility of this outcome is
Then the state at time tn + 0 following the measurement and after noting the
outcome is given by
ψ(tn + 0) = M (η(n))ψ(tn − 0)/√(pn (η(n)))
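A sketch of this measurement/collapse step with a randomly generated POVM (all operators below are my own illustrative constructions). Since the text's convention divides by < ψ|Ma |ψ >^{1/2}, the collapsed vector here uses the Kraus operator Ka = Ma^{1/2}, for which that normalization is exact:

```python
import numpy as np

rng = np.random.default_rng(1)
d, Nout = 4, 3
E = []
for _ in range(Nout):
    h = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    E.append(h @ h.conj().T)                    # positive operators
w, U = np.linalg.eigh(sum(E))
S_mhalf = U @ np.diag(w ** -0.5) @ U.conj().T   # S^{-1/2}
M = [S_mhalf @ e @ S_mhalf for e in E]          # POVM: M_a > 0, sum_a M_a = I
assert np.allclose(sum(M), np.eye(d))

psi = rng.standard_normal(d) + 1j * rng.standard_normal(d)
psi /= np.linalg.norm(psi)
probs = np.array([np.vdot(psi, m @ psi).real for m in M])
assert abs(probs.sum() - 1) < 1e-12             # outcome probabilities sum to 1

a = int(np.argmax(probs))                       # suppose outcome a is observed
wa, Ua = np.linalg.eigh(M[a])
Ka = Ua @ np.diag(np.sqrt(np.clip(wa, 0, None))) @ Ua.conj().T   # M_a^{1/2}
post = Ka @ psi / np.sqrt(probs[a])             # collapsed, normalized state
assert abs(np.linalg.norm(post) - 1) < 1e-10
print(probs)
```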
It should be noted that during the time interval (tn , tn+1 ), the dynamics of
the state is the above noisy Schrodinger dynamics. If we adopt a discrete time
version of this state and measurement model, we have
X(n + 1) = A1 X(n) + w(n)A2 X(n),
Alternate models:
X(2n + 2) = Z(2n + 1)
where
P (η(2n + 1) = a|X(2n + 1)) = q(a|X(2n + 1)), a = 1, 2, ..., p
This is one particular model for quantum state measurement in which state
evolution from an even time instant to the next (odd) time instant takes place
according to noisy Schrodinger dynamics while measurement followed by state
collapse takes place from an odd time instant to the next (even) time instant.
In another model, the state evolution under noisy Schrodinger dynamics takes
place from time Kn to time K(n + 1) − 1 and measurement followed by state
collapse takes place from time K(n + 1) − 1 to time K(n + 1). The difference
equation model for this is
X(m + 1) = (A1 + w[m + 1]A2 )X(m), Kn ≤ m ≤ K(n + 1) − 1
In other words, since the state after the measurement, collapses to the measured
state, there is no need for a filtering algorithm here. However, if there is an
additive noise present during the measurement process, we would then require
a filtering algorithm.
In yet another simplified model, the state evolves without collapse and mea-
surements on the state are made at every time instant with the measured state
at any given time instant being given by a function of the state collapse at the
previous time instant and the current state. For this model,
= N/D
where
N = ∫ p(Z(n + 1)|X(n + 1), X(n))p(X(n + 1)|X(n))p(X(n)|Yn )dX(n)
and
D = ∫ N dX(n + 1)
Now,
∫ φ(Z(n + 1))p(Z(n + 1)|X(n + 1), X(n))dZ(n + 1)
or equivalently,
E(φ(X(2n + 1))|Y2n ) = E[ ∫ φ((A1 + wA2 )X(2n))p(w)dw|Y2n ]
where
P (2n + 2|2n + 2) =
and
iψ2,t (t, r) = (−1/2m)∇² ψ2 (t, r) − (2e²/r)ψ2 (t, r) + Ke² ( ∫ |ψ1 (t, r′ )|² d³ r′ /|r − r′ |)ψ2 (t, r)
Plugging this into the equation and using the prime notation for time derivatives,
we get a sequence of differential equations by equating coefficients
of each power of ε:
iψ′1,0 (t) = H0 ψ1,0 (t), iψ′2,0 (t) = H0 ψ2,0 (t),
iψ′1,1 (t) = H0 ψ1,1 (t) + V (ψ2,0 (t))ψ1,0 (t),
iψ′2,1 (t) = H0 ψ2,1 (t) + V (ψ1,0 (t))ψ2,0 (t),
iψ′1,2 (t) = H0 ψ1,2 (t) + V (ψ2,0 (t))ψ1,1 (t),
iψ′2,2 (t) = H0 ψ2,2 (t) + V (ψ1,0 (t))ψ2,1 (t)
etc.
Now to simulate this system, we design a 2-layer neural network. The zeroth
layer is the initial state ψ1 (0), ψ2 (0) and the
[20] Neural network for simulating N -electron atom using the Hartree-Fock
equations derived from a variational principle.
For the N -electron atom with Hamiltonian
H = Σ_a Ha + Σ_{a<b} Vab
with the constraint that ψ1 , ..., ψN are orthonormal vectors. Then, with
H0 = Σ_a Ha ,
< ψ|H0 |ψ >= N C² Σ_{σ,ρ} sgn(σρ) < ψσ1 |H1 |ψρ1 >< ψσ2 |ψρ2 > ... < ψσN |ψρN >
= N C² Σ_{σ,ρ} [sgn(σρ) < ψσ1 |H1 |ψρ1 > δ(σ2, ρ2)...δ(σN, ρN )]
= N C² Σ_σ < ψσ1 |H1 |ψσ1 >
= N (N − 1)!C² Σ_a < ψa |H1 |ψa >= Σ_a < ψa |H1 |ψa >
(since C² = 1/N !)
Further, with
V = Σ_{a<b} Vab ,
we have
< ψ|V |ψ >= (N (N − 1)/2) < ψ|V12 |ψ >
= (N (N − 1)/2)C² Σ_{σ,ρ} < ψσ1 ⊗ ... ⊗ ψσN |V12 |ψρ1 ⊗ ... ⊗ ψρN >
= (N (N − 1)/2)C² Σ_{σ,ρ} [sgn(σρ). < ψσ1 ⊗ ψσ2 |V12 |ψρ1 ⊗ ψρ2 >
[21] Using entangled states to communicate between the interior and exterior
of a Schwarzschild blackhole.
Let ψ0 (r) be the initial state of the system and ψ(t, r) the state after time
t. The Schrodinger equation for the state is based on replacing the Laplacian
by the Laplace-Beltrami operator w.r.t the spatial metric:
γab = (g0a g0b − g00 gab )/g00
This spatial metric is derived as follows. Let a light pulse start at time t from
r and arrive at time t + dt1 at r + dr. Likewise, it starts at time t from r + dr
and arrives at time t + dt2 at r. It now easily follows that dt1 is the positive
root of the quadratic equation
g00 dt² + 2g0a dxa dt + gab dxa dxb = 0
Thus,
dt1 = [−g0a dxa + √((g0a g0b − g00 gab )dxa dxb )]/g00
and
dt2 = [g0a dxa + √((g0a g0b − g00 gab )dxa dxb )]/g00
and
Σ_{k=1}^L Rk N |ψ >< ψ|N ∗ Rk∗ = λ(N, ψ)|ψ >< ψ| ∀ψ ∈ C, ∀N ∈ N − − −(2)
Indeed it is clear that (1) implies (2). Conversely, suppose (2) holds. Then for
any |φ >⊥ |ψ >, we have
Σ_k < φ|Rk N |ψ >< ψ|N ∗ Rk∗ |φ >= 0
or equivalently,
Σ_k | < φ|Rk N |ψ > |² = 0
and hence
< φ|Rk N |ψ >= 0 ∀k, ∀|φ >⊥ |ψ >
Thus,
Rk N |ψ >= λk (N, ψ)|ψ >
proving the claim.
Remark: It is easy to see that if \mathcal{C} corrects \mathcal{N} with R_k, k = 1, 2, ..., L as recovery operators, then for any mixed state \rho whose range is contained in \mathcal{C}, and for any N_m \in \mathcal{N}, m = 1, 2, ..., K, we have that
\sum_k R_k \left( \sum_m N_m \rho N_m^* \right) R_k^* = \lambda \rho
384 Advanced Probability and Statistics: Applications to Physics and Engineering
and the converse is also trivially true. In fact, more generally, it is immediate to see that if T is a quantum noisy channel of the form
T(\rho) = \sum_k N_k \rho N_k^*, \quad N_k \in \mathcal{N}, \quad \sum_{k=1}^L N_k^* N_k = I
\mathcal{N}_0 = \{ N \in \mathcal{N} : \lambda(N^* N) = 0 \}
and taking
N_3 = c_1 N_1 + c_2 N_2
results in
c_1 N_1 + c_2 N_2 \in \mathcal{N}_0
Now let \mathcal{M} = \mathcal{N}/\mathcal{N}_0 and define
N_1' = N_1 + M_1, \quad N_2' = N_2 + M_2, \quad M_1, M_2 \in \mathcal{N}_0
then we get
\lambda(N_1'^* N_2') = \lambda(N_1^* N_2)
because
|\lambda(N_1^* M_2)|^2 \le \lambda(N_1^* N_1)\,\lambda(M_2^* M_2) = 0
ie,
\lambda(N_1^* M_2) = 0
and likewise,
\lambda(M_1^* N_2) = \lambda(M_1^* M_2) = 0
The positive definiteness of \langle \cdot, \cdot \rangle follows from the following: Let
N \in \mathcal{N}, N \notin \mathcal{N}_0, \quad \tilde{N} = N + \mathcal{N}_0
Then
\langle \tilde{N}, \tilde{N} \rangle = \lambda(N^* N) > 0
Now choose an onb \tilde{N}_1, ..., \tilde{N}_p for \mathcal{M}. We can write
\tilde{N}_k = N_k + \mathcal{N}_0, \quad k = 1, 2, ..., p
Then obviously
\lambda(N_k^* N_j) = \delta_{kj}
Define
P_k = N_k P N_k^*, \quad k = 1, 2, ..., p, \quad Q = I - \sum_{k=1}^p P_k
Then,
P_k^* = P_k, \quad P_k P_j = N_k P N_k^* N_j P N_j^* = \lambda(N_k^* N_j) N_k P N_j^* = \delta_{kj} P_k
and hence \{P_k : k = 1, 2, ..., p\} \cup \{Q\} is an orthogonal resolution of the identity. Define
R_k = P N_k^*, \quad k = 1, 2, ..., p, \quad R_{p+1} = Q
We have
\sum_{k=1}^{p+1} R_k^* R_k = \sum_{k=1}^p N_k P N_k^* + Q = \sum_{k=1}^p P_k + Q = I,
and secondly, if
N \in \mathcal{N}, \quad |\psi\rangle \in \mathcal{C}
then
R_k N |\psi\rangle = P N_k^* N P |\psi\rangle = \lambda(N_k^* N) |\psi\rangle
Finally, for N_0 \in \mathcal{N}_0,
R_{p+1} N_0 |\psi\rangle = 0
since
N_0 |\psi\rangle = 0
because
\langle\psi|N_0^* N_0|\psi\rangle = \lambda(N_0^* N_0) = 0
Note that we are taking |\psi\rangle \in \mathcal{C}, so P|\psi\rangle = |\psi\rangle is true. This completes the proof of the Knill-Laflamme theorem.
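The Knill-Laflamme condition P N_i^* N_j P = \lambda_{ij} P that underlies the construction above can be checked numerically on a small example. The code below uses the standard 3-qubit bit-flip repetition code (code space span\{|000\rangle, |111\rangle\}) with single-qubit bit-flip noise operators; this particular code and noise set are assumptions chosen for illustration, not taken from the text:

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def kron(*ops):
    # Tensor product of a list of operators
    out = np.array([[1.0]])
    for o in ops:
        out = np.kron(out, o)
    return out

# Code projector P for the 3-qubit bit-flip code: span{|000>, |111>}
e0 = np.zeros(8); e0[0] = 1.0
e7 = np.zeros(8); e7[7] = 1.0
P = np.outer(e0, e0) + np.outer(e7, e7)

# Noise operators: identity and single-qubit bit flips
Ns = [kron(I2, I2, I2), kron(X, I2, I2), kron(I2, X, I2), kron(I2, I2, X)]

# Knill-Laflamme condition: P Ni* Nj P = lambda_ij P for all i, j
ok = True
for Ni in Ns:
    for Nj in Ns:
        M = P @ Ni.conj().T @ Nj @ P
        lam = np.trace(M) / 2  # code space has dimension 2
        ok = ok and np.allclose(M, lam * P)
print(ok)  # True
```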
(\sigma, L) = \sigma_x L_x + \sigma_y L_y + \sigma_z L_z = \sigma_- L_+ + \sigma_+ L_- + \sigma_z L_z
where
\sigma_+ = (\sigma_x + i\sigma_y)/2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \sigma_- = (\sigma_x - i\sigma_y)/2 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
Thus,
(\sigma, p)\chi = (\sigma, p)(\sigma, r) [d_1 Y_{l,m_1}(\hat{r}), d_2 Y_{l,m_2}(\hat{r})]^T v(r)
= [(r p_r - 3i) v(r) + v(r)(\sigma_+ L_- + \sigma_- L_+ + \sigma_z L_z)] [d_1 Y_{l,m_1}, d_2 Y_{l,m_2}]^T
= (r p_r - 3i) v(r) [d_1 Y_{l,m_1}, d_2 Y_{l,m_2}]^T + v(r) [d_2 L_- Y_{l,m_2}, d_1 L_+ Y_{l,m_1}]^T + v(r) [d_1 Y_{l,m_1}, -d_2 Y_{l,m_2}]^T
= (r p_r - 3i) v(r) [d_1 Y_{l,m_1}, d_2 Y_{l,m_2}]^T + v(r) [d_2 b(l, m_2) Y_{l,m_1}, d_1 a(l, m_1) Y_{l,m_2}]^T
= (E + eV - m)\phi = (E + eV - m) u(r) [c_1 Y_{l,m_1}, c_2 Y_{l,m_2}]^T
These are four ordinary differential equations for the two functions u(r), v(r), and hence, for consistency, some relations between the constants c_1, c_2, d_1, d_2 are required. These are as follows: for (1) and (2) to correspond to the same equation, we need
and for (3) and (4) to correspond to the same equation, we need
Let
\alpha = d_1/c_1 = d_2/c_2, \quad \beta = c_2/c_1
Then, the above conditions are equivalent to
= \sum_{lm} V_j E_j |\psi_{lm}\rangle\langle\psi_{lm}| N |\psi\rangle
= \sum_l V_j |\psi_{lj}\rangle\langle\psi_{lj}| N |\psi\rangle = \sum_l |\psi_l\rangle\langle\psi_{lj}| N |\psi\rangle
We wish to show that the noise operators N_{lj} can be chosen to be independent of the index l. In fact, we have
proving the reconstruction property for \{R_j\}. Further, we have from the above that
\sum_j \langle\phi|N_2^* R_j^* R_j N_1|\psi\rangle = \sum_j \bar{a}(j, N_2) a(j, N_1) \langle\phi|\psi\rangle
and
\sum_j \lambda(N_2^* N_{1j}) \lambda(N_{1j}^* N_1) = \sum_j \bar{a}(j, N_2) a(j, N_1) = a(N_1, N_2)
Thus,
\sum_j \langle\phi|N_2^* R_j^* R_j N_1|\psi\rangle = \lambda(N_2^* N_1) \langle\phi|\psi\rangle = \langle\phi|N_2^* N_1|\psi\rangle
This is the same as saying that \sum_{j=1}^p R_j^* R_j equals I on \mathcal{N}\mathcal{C}, ie on the subspace span\{N|\psi\rangle : N \in \mathcal{N}, |\psi\rangle \in \mathcal{C}\}. We can now add another operator R_{p+1} so that it is zero on \mathcal{N}\mathcal{C} and R_{p+1}^* R_{p+1} = I on (\mathcal{N}\mathcal{C})^\perp, thereby guaranteeing the reconstruction property that R_m N |\psi\rangle is proportional to |\psi\rangle for all m = 1, 2, ..., p+1 and simultaneously
\sum_{m=1}^{p+1} R_m^* R_m = I
[1] Derive, using the Kronig-Penney model for the periodic potential of a lattice, the Bloch wave functions and hence the existence of energy bands in a solid.
[2] Derive the total diffusion plus drift current in a doped semiconductor in the presence of an external electric field. What does the equation of charge conservation/continuity give for the density of electrons/holes in the presence of a charge generation term?
[3] When a potential is applied across a doped pn junction semiconductor
with the space charge regions having definite widths on both sides of the junc-
tion, then write down Poisson’s equation for the potential and evaluate the space
charge widths in terms of the applied potential and the concentration of donors
and acceptors.
[4] Prove using the Gibbs distribution that the current in a pn junction diode is given by I = I_0(\exp(eV/kT) - 1).
[5] If in a material the potential is V(r), then by the Gibbs distribution principle, the charge density is \rho_0 \exp(-qV(r)/kT) and V satisfies Poisson's equation
\nabla^2 V(r) = -(\rho_0/\epsilon) \exp(-qV(r)/kT)
where \epsilon is the permittivity. Obtain a perturbative series solution for this equation.
[6] Derive the conductivity of a plasma in a weak electric field using the
Boltzmann kinetic transport equation.
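For problem [5] above, a minimal numerical sketch of the successive-approximation version of the perturbation expansion, in one dimension with Dirichlet boundary conditions; all constants are illustrative assumptions:

```python
import numpy as np

# 1-D fixed-point (successive approximation) solution of
#   V''(x) = -(rho0/eps) * exp(-q V(x) / kT),  V(0) = V(L) = 0.
# All constants below are illustrative assumptions.
L, n = 1.0, 201
x = np.linspace(0.0, L, n)
h = x[1] - x[0]
rho0_over_eps = 1.0
q_over_kT = 0.5

# Second-difference operator on the interior points (Dirichlet BCs)
m = n - 2
D2 = (np.diag(np.full(m, -2.0)) + np.diag(np.ones(m - 1), 1)
      + np.diag(np.ones(m - 1), -1)) / h**2

V = np.zeros(n)
for _ in range(50):
    # Evaluate the exponential source at the previous iterate, then solve
    rhs = -rho0_over_eps * np.exp(-q_over_kT * V[1:-1])
    V_new = np.zeros(n)
    V_new[1:-1] = np.linalg.solve(D2, rhs)
    if np.max(np.abs(V_new - V)) < 1e-12:
        V = V_new
        break
    V = V_new
print(V[n // 2])  # midpoint potential, close to the zeroth-order value rho0 L^2 / (8 eps)
```

The zeroth iterate reproduces the linearized (constant-charge) solution, and each further iterate adds the next correction of the exponential nonlinearity.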
[2] Ph.D. thesis report for the thesis "Design of voltage/current mode analog circuits using second generation current conveyor".
The author begins by noting that current mode designs of amplifiers are
based on replacing biasing voltages and other dc voltages by biasing dc currents
and other dc currents. Voltage mode based amplifier/oscillator designs are usu-
ally carried out using BJT transistors, and nonlinear and small signal analysis of such circuits is based on the Ebers-Moll model for the transistor. On the
other hand, current mode designs are usually carried out using MOSFET transistors. The reason for this may kindly be highlighted at the time of the final presentation. The candidate's argument must be based on a comparison of the Ebers-Moll model for the BJT with the following model for MOSFETs: BJT:
I = Vi/(Ri + RL)
J - I1(t) = C0 dvo(t)/dt
so that
C0 dvo(t)/dt = K - K (vi(t) - vo(t) - VP)^2
Solving this differential equation for vo(t) with vi(t) as a square wave gives us immediately the slew rate. On the other hand, for a voltage mode BJT amplifier circuit with one collector resistance and one load capacitance, the KCL gives, with I1(t) as the collector current,
IS = K (VG - VS - VP)^2, ID = IG + IS,
IG = CGS d(VG - VS)/dt + (VG - VS)/RGS + CGD d(VG - VD)/dt + (VG - VD)/RGD
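The slew-rate computation described above can be sketched numerically with a forward-Euler integration of C0 dvo/dt = K - K (vi - vo - VP)^2 driven by a square wave; all parameter values below are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

# Forward-Euler integration of  C0 * dvo/dt = K - K*(vi - vo - VP)^2
# with a square-wave input vi(t).  All parameter values are illustrative.
C0, K, VP = 1e-6, 1e-3, 1.0   # farad, A/V^2, volt (assumed)
T, dt = 2e-3, 1e-7            # total time and time step (s)
period = 1e-3                 # square-wave period (s)

vo = 0.0
trace = []
for i in range(int(T / dt)):
    t = i * dt
    vi = 1.5 if (t % period) < period / 2 else 0.5   # square wave
    dvo_dt = (K - K * (vi - vo - VP) ** 2) / C0
    vo += dt * dvo_dt
    trace.append(vo)

# Slew rate estimated as the largest |dvo/dt| observed during the run
slew = np.max(np.abs(np.diff(trace))) / dt
print(slew)  # V/s
```

The largest output-voltage rate occurs just after an input transition, which is exactly the slew-rate limit the text refers to.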
Some clear equations need to be given for two port models of MOSFETs designed using SOI technology. Issues like leakage current, sensitivity, and stability of the current source to temperature fluctuations need to be addressed.
Derive the squarer equation given on p.67 from the circuit given in fig.4.4 from first principles. Explain clearly the derivation. For each MOSFET transistor, use only the algebraic relations and then explain how capacitances between gate and source and between gate and drain introduce transient effects in the squarer, ie we get a squarer plus some small memory terms.
In the fifth chapter, the author states that low power circuits can be designed using some special kinds of current conveyors. Two kinds of current conveyors are compared, one BD and the other BDQFG. Some reasons for preferring the latter over the former are given. Some theoretical proofs of these may be presented
during the viva-voce exam. The basic description given by the author of a current conveyor is that there are two input nodes and two output nodes, and that the current in the output nodes must match that in one of the input nodes; one of the input nodes must have a low impedance, while the other input node and both the output nodes must have very high impedance. Why such a requirement is imposed must be stated.
In conclusion, I congratulate the author for having proposed many new techniques regarding the design of current mode transistor circuits in preference to voltage mode circuits. I must say that the author has put in a lot of effort
to analyze and design current mode circuits using standard circuit software.
This study will motivate other researchers to study current mode based cir-
cuits more carefully and try to get improved two port models for MOSFETs
and apply these models to analyzing MOSFET circuits. The candidate well
deserves a PhD degree for this mammoth task. Before awarding her the degree,
I would however appreciate it if she can present some partial answers to the
above queries.
Remark: The Ph.D. student for this thesis was Mrs. Bindu Thakral and her supervisor was Dr. Arthi Vaish.
Chapter 14
[3] Survey of the work in quantum probability by the Indian school of prob-
abilists.
[a] The work of R.L.Hudson and K.R.Parthasarathy on quantum Ito’s for-
mula and the precise meaning of a noisy Schrodinger equation in quantum me-
chanics.
[b] The work of K.R.Parthasarathy on quantum Markov processes defined
in terms of star unital homomorphisms.
[c] Quantum stochastic differential equations as unitary dilations of the quan-
tum master equation of Gorini, Kossakowski, Sudarshan and Lindblad.
[d] The work of W.O.Amrein, K.B.Sinha and Jauch on time delay in quantum
scattering theory. To determine expressions for the difference in the average
times spent by the particle in a Borel subset of space before the scattering and
after the scattering.
[e] The work on defining the Scattering matrix for Coulomb scattering after
noting that the wave operators do not exist for the Coulomb potential.
[4] Probability and statistics in general relativity and cosmology; The work of
the Indian school of general relativists. How small random perturbations in the
positions and velocities of stars in a galaxy evolve under Newton’s inverse square
law of gravitation and under Einstein’s law of gravitation into clusters having
specific shapes like globular clusters, spiral galaxies etc. Also if we solve the
Einstein-Maxwell field equations in the presence of matter under initial random
conditions on the metric perturbations, the velocity and density perturbations
and the electromagnetic four potential perturbations, then what will be the
mean square fluctuations in the same quantities as time progresses?
[a] Work on simulating the Belavkin filter for mixtures of quantum Gaussian and Poisson noise measurements and computing the entropy evolution of the filter using Lie algebraic methods.
[b] Work on applying classical nonlinear filtering techniques for estimating
a quantum electromagnetic field using a time varying windowed version of this
field to excite a quantum mechanical system.
[c] Work on quantum image processing, specifically, to transform a classical
image field into a quantum state vector having many more degrees of freedom
and then processing this quantum state using optimal unitary operators and
finally converting the processed quantum state into a classical image field.
[d] Work on quantum image processing using Gaussian states. This involves
first transforming a classical image field into a quantum state, then approxi-
mating this quantum state by a quantum Gaussian state and applying stan-
dard processing algorithms on quantum Gaussian states based on the Hudson-
Parthasarathy noisy Schrodinger equation.
\mu_{N,p} = N^{-1} \sum_{n=1}^N \delta_{(X(n), X(n+1), ..., X(n+p-1))}
Then compute
\lim_{N\to\infty} N^{-1} \log P(\mu_{N,p} \in B)
where B is a subset of measures on R^p. When B is a subset of stationary measures on R^{\mathbb{Z}}, then compute the above probability when p \to \infty. The solution
to this problem is a generalization of Sanov’s theorem that the rate function for
the empirical distribution of iid random variables equals the relative entropy
between the distribution of a random variable in the sequence and the value
assumed by the empirical distribution.
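The empirical p-tuple measure \mu_{N,p} defined above can be computed directly from a sample path. A sketch on a two-state Markov chain, where the transition matrix and run length are illustrative assumptions, comparing the empirical pair frequencies with the stationary pair law \pi(i) P(i, j):

```python
import numpy as np
from collections import Counter

# Empirical measure of overlapping p-tuples,
#   mu_{N,p} = N^{-1} sum_{n=1}^{N} delta_{(X(n),...,X(n+p-1))},
# illustrated on a two-state Markov chain (transition matrix assumed).
rng = np.random.default_rng(1)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
N, p = 50_000, 2

# Simulate the chain for N + p - 1 steps
x = np.zeros(N + p - 1, dtype=int)
for n in range(1, N + p - 1):
    x[n] = rng.choice(2, p=P[x[n - 1]])

# Empirical measure of overlapping p-tuples
mu = Counter(tuple(x[n:n + p]) for n in range(N))
mu = {k: v / N for k, v in mu.items()}

# Compare with pi(i) * P(i, j), pi = (2/3, 1/3) the stationary distribution
pi = np.array([2 / 3, 1 / 3])
for (i, j), freq in sorted(mu.items()):
    print((i, j), round(freq, 3), round(pi[i] * P[i, j], 3))
```

For an ergodic chain the empirical pair measure concentrates on \pi(i) P(i, j); the large deviation rate for deviations from it is what the generalized Sanov theorem quantifies.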
[20] Application of the quantum Belavkin filter to image processing prob-
lems. We first convert the classical image field into a pure quantum state. After
coupling this quantum state to a coherent state of the bath, we then process this
state using a family of unitary operators satisfying the HP noisy Schrodinger
equation with the Lindblad parameters selected according to some optimum
criteria (training the Lindblad parameters). We then take noisy measurements
on the system plus bath satisfying the non-demolition property to estimate the
processed quantum state and we use this processed quantum state to reconstruct
the quantum processed classical image field. The design of the optimal quantum
processor is based on the coherent state vector and not on the creation, annihilation and conservation processes that drive the qsde. Therefore, the only way of estimating the processed quantum state must involve some real time algorithm for removing the quantum noise from the non-demolition measurements, and the Belavkin filter does precisely that.
[21] Some remarks and problems on quantum Gaussian states. The main
focus of this section is to determine how the Weyl operator in system space
transforms under the GKSL equation with the Hamiltonian and Lindblad oper-
ators being some functions of the creation and annihilation operators in system
Hilbert space and to use this transformation to determine how the quantum
Fourier transform of a state evolves with time. This result can then be used to
show that if the Hamiltonian in the GKSL equation is a quadratic function of
the creation and annihilation operators ie the Hamiltonian of a system of har-
monic oscillators and further, if the Lindblad operators in the GKSL equation
are linear functions of the creation and annihilation operators, then Gaussianity
of a state is preserved under the GKSL dynamics.
[a] Intuitive derivation of the form of the energy and momentum operators in wave mechanics using Planck's quantum hypothesis and De-Broglie's wave-particle duality.
[b] Energy spectrum of a free particle in a 3-D box and verification of the orthogonality of the stationary state eigenfunctions.
[c] Proof of the continuity of the wave function and its spatial gradient across
a boundary for finite potentials starting from Schrodinger’s equation.
[d] Definition of the (maximal) domains of the position and momentum oper-
ators in quantum mechanics and proofs of the self-adjointness of these operators.
[e] Proof that Feynman's path integral for the evolution kernel satisfies the Schrodinger equation. Also includes a discussion of the infinite normalization constant involved in defining the path integral.
[f] Evaluating the path integral for the forced harmonic oscillator using two
methods. One by direct discretization of the Gaussian path integral, evaluating
the finite dimensional Gaussian integral and then passing over to the limit, and
two, by expressing the path integral in the frequency domain using expansion
of the position process as a Fourier series over the finite time interval [0, T ] and
using the transformation of the path measure to the measure on the countable
Fourier coefficient space.
[g] As Planck’s constant approaches zero, prove that the path integral reduces
to a single phase factor with the phase proportional to the action integral along
the classical path. This proves that in the limit of zero Planck’s constant,
quantum mechanics reduces to classical mechanics, or equivalently, interference
terms in quantum mechanics arising from a superposition over different paths
disappear reducing to just a contribution from just a single classical path.
[h] Large deviation theory in quantum mechanics. We consider a quantum
mechanical system with a small randomly time varying potential perturbing
the Hamiltonian. We assume that this perturbation is a Gaussian process and
calculate approximately the probability of transition between two stationary
states under this small randomly time varying perturbation. Using the LDP
rate function for a Gaussian process and the contraction principle, we then
calculate the rate at which this transition probability approaches zero in the
limit when the perturbation amplitude tends to zero.
[i] Evaluating the evolution of the quantum Fourier transform of a state under the dynamics of an open quantum system whose Hamiltonian and Lindblad operators are functions of only the creation and annihilation operators of a sequence of independent harmonic oscillators. The main idea in this calculation is to exploit the commutation relations of the creation and annihilation operators with the Weyl operators to derive the basic equations that describe the evolution of the Weyl operator under the Heisenberg dynamics of the open quantum system.
constant tensor. The position field is the strain tensor and the kinetic energy is
a quadratic form in the velocity field, ie, in the time derivative of the displace-
ment vector while the potential energy is a quadratic form in the strain tensor.
The strain tensor is a symmetric linear function of the spatial derivatives of the
displacement vector and the quadratic form for the potential energy is one half
of the inner product between the strain tensor and the stress tensor. The stress tensor is obtained by multiplying the strain tensor of second rank with the fourth rank elastic constant tensor. Finally, we discuss the canonical quantization of
this elastic field theory based on regarding the displacement field as the canon-
ical position fields, the partial derivative of the Lagrangian density w.r.t the
time derivative of the displacement field as the canonical momentum fields, in-
troducing canonical commutation relations between the canonical position and
canonical momentum fields and then formulating the functional Schrodinger
equation in which the canonical position fields become multiplication opera-
tors and the canonical momentum fields become partial functional/variational
derivatives w.r.t. the canonical position fields. This idea could be of use when
we are interested in determining the quantum probability laws for fracture of
molecular bonds on the Angstrom scale.
Reference: I wish to acknowledge my debt to my colleague Dr. Abhishek Tevatia for suggesting this problem to me, based on his research into the theoretical and experimental aspects of fracture of materials.
[29] Neural network based EKF with disturbance observer for chaotic system
modeling. The crucial idea is to identify the chaotic system plant function by
approximating it with a neural network whose weights are updated using the
EKF driven by the noisy output of the original chaotic system. The fact that the output of the chaotic system drives the neural EKF guarantees that after several iterations on the neural weights, the neural network will well approximate the original plant dynamics.
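A minimal sketch of this idea: the network weights form the EKF state, the measurement is the noisy plant output, and the measurement Jacobian is the network's gradient in the weights. The logistic-map plant, the RBF network, and all tuning constants below are illustrative assumptions, not the construction used in the text:

```python
import numpy as np

# EKF over the weights w of a small RBF network f_w(x) = w . feats(x):
# state equation w_{n+1} = w_n, measurement y_n = f_w(x_n) + v_n, and the
# measurement Jacobian is feats(x_n).  The logistic-map plant, the RBF
# features, and all tuning constants are illustrative assumptions.
rng = np.random.default_rng(0)
centers = np.linspace(0.0, 1.0, 15)

def feats(x):
    # Gaussian RBF features; also the Jacobian d f_w / d w at x
    return np.exp(-((x - centers) ** 2) / (2 * 0.05 ** 2))

w = np.zeros(len(centers))
Pw = 10.0 * np.eye(len(centers))   # weight covariance
R = 1e-4                           # measurement noise variance (assumed)

x = 0.3
for _ in range(3000):
    x_next = 4.0 * x * (1.0 - x)              # chaotic plant (logistic map)
    y = x_next + rng.normal(0.0, 1e-2)        # noisy plant output
    H = feats(x)
    S = H @ Pw @ H + R
    Kg = Pw @ H / S
    w = w + Kg * (y - w @ H)                  # EKF measurement update
    Pw = Pw - np.outer(Kg, H @ Pw)
    x = x_next

# The identified network should now track the plant map on [0, 1]
err = max(abs(w @ feats(xg) - 4.0 * xg * (1.0 - xg))
          for xg in np.linspace(0.05, 0.95, 9))
print(err)
```

Because the chaotic orbit keeps visiting all of [0, 1], the EKF sees informative data everywhere, which is the mechanism the paragraph above appeals to.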
[30] Some remarks on classical and quantum entropy.
[31] Large deviations in classical and quantum hypothesis testing problems.
This section deals with the problem that if we have many independent copies each of two quantum states and we apply a decision POVM to discriminate between these two tensor product states, with the POVM selected according to the Neyman-Pearson rule, ie minimize the probability of false alarm keeping the miss probability fixed, then what is the maximum possible rate at which the false alarm error probability converges to zero as the number of copies in the tensor product tends to infinity, if we assume that the miss probability converges to zero at some positive rate?
[32] Large deviations applied to some problems of stochastic control theory.
[33a] Large deviation principle in super-conductivity. When the vector po-
tential that drives the Fermionic fields in superconductivity has a small random
component that is modeled as a stationary stochastic process in time, then we
wish to compute the rate at which the superconductivity current fluctuations
converge to zero. The super-conductivity current density is computed using partial differential operators acting on the temperature Green's function with the
[86] Quantum neural network for estimating the joint pdf of a random sig-
nal. Focuses on using the Schrodinger equation to generate in time a family of
probability densities that are smooth approximations of a given family of prob-
ability densities. The focus is on using the fact that for any time varying real
potential, the wave function in Schrodinger equation has a magnitude square
equal to a probability density because of unitary evolution and hence by just
manipulating the potential in accordance with a given error, we can generate a
family of pdf's that track a given family of pdf's. In other algorithms for pdf tracking, we have to impose at each iteration the condition that the pdf remains non-negative and integrates to unity, whereas in a quantum neural network, that is naturally guaranteed.
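The normalization property invoked above (unitary evolution keeps |\psi|^2 a probability density for any real potential, however the potential is manipulated) can be checked with a split-step Fourier integration of a 1-D Schrodinger equation; the grid, step size, and time-varying potential below are illustrative assumptions:

```python
import numpy as np

# Split-step Fourier evolution of the 1-D Schrodinger equation
#   i dpsi/dt = (-(1/2) d^2/dx^2 + V(t, x)) psi    (hbar = m = 1)
# with a real, time-varying potential; checks that |psi|^2 stays a
# normalized probability density.  Grid, step, and potential are assumed.
n, Lbox, dt = 256, 20.0, 1e-3
dx = Lbox / n
x = np.linspace(-Lbox / 2, Lbox / 2, n, endpoint=False)
k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)

psi = np.exp(-x ** 2).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)   # normalize the density

for step in range(1000):
    t = step * dt
    V = 0.5 * x ** 2 + np.sin(t) * x            # real time-varying potential
    psi *= np.exp(-1j * V * dt / 2)             # half potential step
    psi = np.fft.ifft(np.exp(-1j * k ** 2 * dt / 2) * np.fft.fft(psi))
    psi *= np.exp(-1j * V * dt / 2)             # half potential step

norm = np.sum(np.abs(psi) ** 2) * dx
print(norm)  # equal to 1 up to round-off
```

Each factor in the split-step scheme is unitary, so the total probability is conserved exactly, which is the constraint that conventional pdf-tracking algorithms must enforce by hand.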
[87] Problems in electrodynamics related to general relativity. Focuses on
two problems. One, evaluation of the Ricci tensor components for a general
spherically symmetric metric and using these expressions to solve the Maxwell
equations in a spherically symmetric background metric. Two, study the dy-
namics of small perturbations around the Schwarzschild metric and use these
perturbed metric coefficients to determine the corresponding perturbation in
the electromagnetic field by solving Maxwell’s equations in the perturbed back-
ground metric using perturbation theory.
[88] Syllabus for a course on quantum computation.
[89] Some remarks on operator theory related to quantum scattering. The
aim here is to derive a far field formula for the scattered wave-function for an
incident plane wave function in terms of the Legendre polynomials and hence
obtain a formula for the total scattering cross section by integrating over all
solid angles. Azimuthal symmetry around the direction of the incident plane wave is assumed; indeed, this is a consequence of the radial character of the interaction potential and the azimuthal symmetry of the incident plane wave.
[90] Some properties of compact operators in the context of scattering theory.
[91] Gravitational radiation.
[92] The post-Newtonian equations of hydrodynamics. Focuses on expand-
ing the metric tensor as well as the energy-momentum tensor of the matter field
in powers of the characteristic velocity of the system (or equivalently in terms of
the square root of the characteristic mass of the system) to derive a sequence of
linear equations for terms of each order of perturbation in terms of the lower or-
der perturbation terms and thereby obtain corrections to the Newtonian theory
of a fluid evolving under its own gravitational potential.
[93] Scattering of a gravitational wave by an electromagnetic field. This
problem focuses on determining the metric perturbations caused by the energy-
momentum tensor of the background electromagnetic field. The relevant equa-
tions that describe these perturbations are obtained by linearizing the Einstein
field equations with the energy-momentum tensor of the electromagnetic field
being treated as a first order perturbation.
six components of the metric tensor to be solved for rather than the original
ten. We compute the three velocity in terms of the four velocity and write
down the geodesic equation in terms of the three velocity. The time coordinate
required in the definition of the three velocity is the synchronized coordinate
time obtained by separating the proper time differential square in terms of a
spatial component and a time component.
[98] The Knill-Laflamme theorem. This theorem derives necessary and suf-
ficient conditions for the existence of recovery operators on the range of the set
of mixed states in a finite dimensional Hilbert space in relationship to a noise
manifold of operators so that the state after being transmitted through a noisy
quantum channel constructed out of the noise manifold operators in the form
of the Choi-Kraus/Stinespring representation can be decoded without any error
by passing this output state through another quantum channel built out of the
recovery operators.
[100] The use of group representation theory in inferring brain disease. The brain surface signal is modeled by a partial differential equation
in space-time driven by a noise source. The coefficients in this pde model are
the parameters to be estimated which give us information about the brain dis-
ease. From noisy measurements on the output of this pde, we can estimate
these parameters using the EKF on a real time basis. When, further, the pde operators are invariant under a group G of transformations acting on the curved brain surface, and the driving noise also has G-invariant statistics, then the pde signal model can be considerably simplified by using the group theoretic Fourier transform. A special case of this is when the brain surface is
modeled as a sphere with the group of rotations acting on it. In this case under
G-invariance of the system pde dynamics and noise statistics, taking the group
theoretic Fourier transform amounts to expanding the signal and noise fields in
terms of spherical harmonics and in this domain, the signal representation and
computational complexity is considerably reduced.
[102] The use of quantum neural networks in estimating the brain signal pdf.
Quantum neural networks are nature inspired algorithms that naturally generate
a whole family of probability densities and hence can be used to estimate the
joint pdf of the EEG signal on the brain surface.
[103] The Dyson-Schwinger equations and their connection with the vacuum polarization tensor and the electron self-energy. This section considers first writing down the Maxwell equations driven by the Dirac four current density source and the Dirac equation driven by the electromagnetic potential connection term. The Dirac wave function and the electromagnetic potentials are treated as wave
field operators in the second quantized formalism. We then derive using these
field equations, exact differential equations for the electromagnetic field propa-
gator and the Dirac field propagator. Extra interaction terms appear in these
propagator differential equations in the form of trilinear vertex terms which
are vacuum expectations of the time ordered product of three field operators,
namely a Dirac field operator, its adjoint and an electromagnetic field operator.
These are known as the Dyson-Schwinger equations and can be used to develop
power series expansions for the exact photon and electron propagators.
[104] Syllabus for end-sem exam for SPC01, Applied linear algebra
well known engineers like Jerry Mendel, Georgios B. Giannakis and Ananthram Swami developed a host of applications of polyspectra to signal and image processing algorithms. The primary feature of higher order statistics and polyspectra is that they can detect non-Gaussianity and nonlinearity, and can determine when two collections of random variables are mutually statistically independent (test of independence), etc.