Main 2122

R. M. Potvliege

Contents

1 Introduction
  1.1 What this course is about
  1.2 Revision: The hydrogen atom and the Stern-Gerlach experiment
  1.3 Quantum states and vector spaces
  1.4 Mathematical note

3 Operators (I)
  3.1 Linear operators
  3.2 Matrix representation of an operator
  3.3 Adding and multiplying operators
  3.4 The inverse of an operator
  3.5 Commutators
  3.6 Eigenvalues and eigenvectors
  3.7 The adjoint of an operator

5 Operators (II)
  5.1 Hermitian operators
  5.2 Projectors and the completeness relation
  5.3 Bases of eigenvectors: I. Finite-dimensional spaces
  5.4 Bases of eigenvectors: II. Infinite-dimensional spaces

7 Wave functions, position and momentum
  7.1 Introduction
  7.2 The Fourier transform and the Dirac delta function
  7.3 Eigenfunctions of the momentum operator
  7.4 Normalization to a delta function
  7.5 Probability densities
  7.6 Eigenfunctions of the position operator
  7.7 The position representation and the momentum representation
  7.8 The commutator of Q and P
  7.9 Position and momentum operators in 3D space
  7.10 Continua of energy levels
  7.11 Probabilities and wave functions
  7.12 The parity operator

11 Time evolution
  11.1 The Schrödinger equation
  11.2 The evolution operator
  11.3 The Schrödinger picture and the Heisenberg picture
  11.4 Constants of motion
1 Introduction

H = -\frac{\hbar^2}{2\mu}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\,\frac{1}{r},   (1.1)
where µ is the reduced mass of the electron-nucleus system, e is the charge of
the electron in absolute value (e > 0), r is its distance to the nucleus, ε₀ is the
vacuum permittivity, ℏ is the reduced Planck constant, and ∇² is the square of
the gradient operator with respect to the coordinates of the electron. The latter
can be taken to be x, y and z, in which case
\nabla^2 \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.   (1.2)
Alternatively, one can specify the position of the electron by the spherical polar
coordinates r, θ and φ, in which case ∇2 is a more complicated combination of
first and second order partial derivatives.
As defined, H is an operator which acts on functions and transforms a func-
tion of x, y and z (or a function of r, θ, φ) into another function of the same
variables. H is actually a combination of several operators: ∇2 is an operator
which, when acting on a function φ(x, y, z), transforms this function into the
sum of its second order derivatives with respect to x, y and z. The 1/r term in
the Hamiltonian is also an operator, which simply transforms φ(x, y, z) into the
product of φ(x, y, z) with the potential energy. In particular, H acts on wave
functions ψ(r, θ, φ) representing bound states of the electron – nucleus system,
i.e., states in which there is a vanishingly small probability that the electron is
arbitrarily far from the nucleus. These wave functions can be normalized in
such a way that
\int_0^\infty dr\, r^2 \int_0^\pi d\theta\, \sin\theta \int_0^{2\pi} d\phi\, |\psi(r,\theta,\phi)|^2 = 1.   (1.3)
As you have seen in the Term 1 Quantum Mechanics course, this Hamilto-
nian has discrete eigenvalues En (n = 1, 2, . . .) corresponding to bound energy
eigenstates (the eigenvalues of the Hamiltonian are often called eigenenergies).
I.e., in the notation used in that course, there exist bound state wave functions
ψnlm(r, θ, φ) such that the function obtained by letting H act on ψnlm(r, θ, φ)
is simply ψnlm (r, θ, φ) multiplied by a constant En :
H\psi_{nlm}(r,\theta,\phi) = E_n\, \psi_{nlm}(r,\theta,\phi).   (1.4)

The constants En are the eigenvalues of the above Hamiltonian. These eigenenergies thus form a discrete distribution
of infinitely many energy levels (discrete meaning that these energy levels are
separated by a gap from each other).¹
The eigenvalues of the Hamiltonian of Eq. (1.1) (the eigenenergies En ) are re-
lated in a very simple way to the energy of the photons emitted by excited hy-
drogen atoms: each photon is emitted in a transition from one energy eigenstate
to another, and the photon energy is almost equal to the difference between
the respective eigenenergies. The photon energy is not exactly equal to this dif-
ference for a variety of reasons — e.g., because describing the atom by way of
this Hamiltonian amounts to neglecting spin-orbit coupling and other relativis-
tic effects. However, what is important here is that there is a very close relation
between the eigenvalues of the Hamiltonian and the results which would be
found in an actual measurement of the energy of the photons.
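As a concrete illustration (a standard result quoted here for orientation, not taken from this part of the notes), the eigenenergies of the Hamiltonian (1.1) are

E_n = -\frac{\mu e^4}{2(4\pi\epsilon_0)^2 \hbar^2}\,\frac{1}{n^2} \approx -\frac{13.6\ \mathrm{eV}}{n^2},

so a transition from the n = 2 level to the n = 1 level releases a photon of energy E₂ − E₁ ≈ 10.2 eV, the Lyman-α line.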
Example 2: The Stern-Gerlach experiment
We refer here to an experiment of historical importance done by the two physi-
cists whose names have remained associated with this type of measurements
ever since, Otto Stern and Walther Gerlach, in the early 1920s. The aim of Stern
and Gerlach was to test the predictions of the Bohr model of atomic structure in
regards to the magnetic moment of atoms (modern Quantum Mechanics had not
yet been developed). The principle of the experiment was simple: since particles
with a magnetic moment µ in a magnetic field B experience a force F equal to
the gradient of µ·B, the magnetic moment of an atom can be inferred from how
its trajectory is deflected when it passes through an inhomogeneous magnetic
field. Stern and Gerlach directed a beam of silver atoms through a specially de-
signed magnet producing a magnetic field B = Bz ẑ such that ∇Bz ≠ 0. (ẑ is a
unit vector in the z-direction.) Each of these atoms was therefore submitted to a
force equal to µz ∇Bz , with µz the z-component of its magnetic moment. They
observed that this beam was split into two when passing through this magnet,
from which they concluded that only two values of µz were possible for these
atoms. (The splitting is very visible in the image of the spatial distribution of
the atoms shown in the first lecture of the course. A copy of this image can be
found in the Lecture 1 folder.)

¹ Besides being eigenfunctions of H, the energy eigenfunctions ψnlm(r, θ, φ), as defined,
are also eigenfunctions of the angular momentum operators L² and Lz. The quantum num-
bers l and m identify the corresponding eigenvalues. In the case of the Hamiltonian defined
by Eq. (1.1), energy eigenfunctions with different values of l or m but the same value of the
principal quantum number n correspond to the same eigenenergy En.
At the time of this experiment Stern and Gerlach would not have been able to
formulate these results in terms of modern concepts of Quantum Mechanics.
However, we now know (1) that the magnetic moment of a silver atom orig-
inates almost entirely from the magnetic moment of the electrons it contains
(the magnetic moment of the nucleus is comparatively much smaller and can
be ignored in good approximation); (2) that for silver atoms in their ground state
the contribution of the electrons to the total magnetic moment can be written
in terms of a spin operator; and (3) that the two values of µz found in the ex-
periment correspond to two different eigenvalues of this spin operator.
More specifically, one can say that observing whether an atom is deflected
in one direction or the other amounts to a measurement of its spin in the z-
direction. We can represent the spin state of the atoms deflected in one direc-
tion by a column vector χ+ and the spin state of the atoms deflected in the other
direction by a column vector χ− , and set
\chi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \chi_- = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.   (1.5)
As can be easily checked, these two column vectors are eigenvectors of the
matrix
S_z = \begin{pmatrix} \hbar/2 & 0 \\ 0 & -\hbar/2 \end{pmatrix}.   (1.6)
In this formulation of the problem, this matrix is the spin operator mentioned
in the previous paragraph. (A 2 × 2 matrix is an operator which transforms 2-
component column vectors into 2-component column vectors since multiplying
a 2-component column vector by a 2 × 2 matrix gives a 2-component column
vector as a result. See Appendix A of these course notes for a reminder of the
rules of matrix multiplication.) The matrix Sz has two eigenvalues, ℏ/2 and
−ℏ/2. Accordingly, in the experiment of Stern and Gerlach, the only possible
values of µz were γℏ/2 and −γℏ/2, where γ is a certain constant whose value
is not important for this discussion.
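If you want to check Eqs. (1.5) and (1.6) numerically, here is a minimal sketch in Python with numpy (an added illustration, not part of the original notes; hbar is set to 1 for convenience):

    import numpy as np

    hbar = 1.0  # work in units where hbar = 1
    Sz = np.array([[hbar/2, 0.0],
                   [0.0, -hbar/2]])
    chi_plus = np.array([1.0, 0.0])
    chi_minus = np.array([0.0, 1.0])

    # S_z chi_pm = pm (hbar/2) chi_pm
    print(Sz @ chi_plus)    # [ 0.5  0. ]  =  (hbar/2) chi_plus
    print(Sz @ chi_minus)   # [ 0.  -0.5]  = -(hbar/2) chi_minus

    # The two eigenvalues of Sz are +hbar/2 and -hbar/2.
    print(np.linalg.eigvalsh(Sz))   # [-0.5  0.5]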
Before it enters the magnet, an individual atom could be in the spin state repre-
sented by the column vector χ+ or in the spin state represented by the column
vector χ− . More generally, it could also be in a superposition state represented
8
by a column vector of the form c+χ+ + c−χ−, where c+ and c− are two complex
numbers such that

|c_+|^2 + |c_-|^2 = 1.   (1.7)
In the latter case, there would be a probability |c+|² that it would be found to
have µz = γℏ/2 and a probability |c−|² that it would be found to have µz =
−γℏ/2: only ±γℏ/2 could be found for µz, even if the atom is initially in a
superposition state. Supposing that neither |c+|² nor |c−|² is zero, then one
cannot predict the value of µz which would be found for that individual atom.
Only the probability of each of the two possible values of µz can be predicted.
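For instance (an illustrative choice of coefficients, not one used elsewhere in the notes), an atom prepared in the state (χ+ + iχ−)/√2 has c+ = 1/√2 and c− = i/√2, so |c+|² = |c−|² = 1/2: it is equally likely to be deflected either way, yet no measurement will ever return a value of µz other than ±γℏ/2.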
Important observations
The quantum systems considered in these two examples are obviously quite
different, both physically and mathematically — e.g., the mathematical objects
representing the state of the system are functions of position co-ordinates in the
first example and column vectors of numbers in the second example. Nonethe-
less, they are similar in regards to key aspects of their theoretical description:
• In both cases, measurable physical quantities are represented by operators, and
the values a physical quantity could be found to have in a measurement are
given by the eigenvalues of the corresponding operator.

• Each of these values has a certain probability to be found, and these prob-
abilities can be calculated from the wave function or column vector rep-
resenting the state of the system.
experiment, we have set the column vectors χ+ and χ− and the matrix Sz
to be as given by Eqs. (1.5) and (1.6). However, we could equally well have
decided to represent the two spin states by the column vectors
\chi'_+ = \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \quad \text{and} \quad \chi'_- = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}.   (1.8)
As can be checked without difficulty, χ′+ and χ′− are eigenvectors of S′z
corresponding, respectively, to the eigenvalues ℏ/2 and −ℏ/2, exactly as
χ+ and χ− are eigenvectors of Sz corresponding to the same eigenvalues.
Formulating the problem as per Eqs. (1.8) and (1.9) instead of (1.5) and
(1.6) changes the details of the mathematics involved in the calculations.
However, these two formulations are equivalent from a Physics point of
view.
As another example, consider the ground state wave function of a linear
harmonic oscillator, ψ0 (x). You have seen in the Term 1 Quantum Me-
chanics course that
\psi_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \exp(-m\omega x^2/2\hbar),   (1.10)
where m is the mass of the oscillator and ω its angular frequency. Clearly,
this function is continuous and is such that the integral
\int_{-\infty}^{\infty} |\psi_0(x)|\, dx
exists. You may remember from one of your maths courses that these two
mathematical facts guarantee that ψ0 (x) can be written in the form of a
Fourier integral. I.e., there exists a function φ0 (p) such that
\psi_0(x) = \frac{1}{(2\pi\hbar)^{1/2}} \int_{-\infty}^{\infty} \phi_0(p)\, \exp(ipx/\hbar)\, dp.   (1.11)
In fact, the function φ0 (p) can be obtained by taking the inverse transfor-
mation:
\phi_0(p) = \frac{1}{(2\pi\hbar)^{1/2}} \int_{-\infty}^{\infty} \psi_0(x)\, \exp(-ipx/\hbar)\, dx.   (1.12)
Thus knowing φ0 (p) is knowing ψ0 (x) and knowing ψ0 (x) is knowing
φ0 (p). In other words, the ground state is represented by the function
φ0 (p) as well as by the function ψ0 (x): it is possible to use φ0 (p) rather
than ψ0 (x) if this would be convenient in some calculations, and the two
formulations are completely equivalent from a Physics point of view. This
topic will be discussed further in the course. [We just note at this stage
that ψ0(x) is the “ground state wave function in position space” whereas
φ0(p) is the “ground state wave function in momentum space”.]
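For this particular state the integral (1.12) can be done in closed form (a standard Gaussian integral; the result is quoted here as an added illustration, not derived in the notes):

\phi_0(p) = \left(\frac{1}{\pi m\omega\hbar}\right)^{1/4} \exp(-p^2/2m\omega\hbar),

i.e., the momentum-space wave function of the ground state is also a Gaussian, normalized so that ∫|φ₀(p)|² dp = 1.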
2 Vector spaces and Hilbert spaces
More precisely, a real vector space is a set V in which are defined a vector
addition and a multiplication by a scalar subject to the following axioms.
1. The vector addition associates one and only one element of V with
each pair v 1 , v 2 of elements of V . This element is called the sum of
v 1 and v 2 and is denoted by v 1 + v 2 . (The elements of V are called
vectors. One says that V is closed under vector addition, meaning that
the sum of two elements of V is always an element of V .)
2. The vector addition is associative. I.e., for any elements v1, v2 and v3 of V,

(v1 + v2) + v3 = v1 + (v2 + v3).   (2.1)

3. V contains a zero vector. I.e., there is an element 0 of V such that

v + 0 = v   (2.2)

for any element v of V. The vector 0 which has this property is called
the zero vector (or the null vector).
4. Every element v of V has one and only one inverse element in V ,
namely a vector −v such that
v + (−v) = 0. (2.3)
(If there is a risk of confusion with other meanings of the word inverse,
one can say that the vector −v is the additive inverse of v.)
5. The vector addition is commutative. I.e., for any two elements v 1 and
v 2 of V ,
v1 + v2 = v2 + v1. (2.4)
6. The multiplication by a scalar associates one and only one element of V,
denoted by α v, with each scalar α and each element v of V. (One says
that V is closed under multiplication by a scalar.)

7. This operation is distributive. I.e., for any real numbers α and β and
any elements v, v1 and v2 of V,

(α + β) v = α v + β v   (2.5)

and

α (v1 + v2) = α v1 + α v2.   (2.6)

It is also associative with the multiplication between scalars. I.e.,

α (β v) = (αβ) v.   (2.7)

8. Multiplying a vector by the number 1 leaves this vector unchanged:

1 v = v.   (2.8)
The definition of a complex vector space is identical, except that the scalars
are taken to be complex numbers, not real numbers.
+ These axioms ensure that whatever the elements of a vector space are,
calculations involving these elements follow all the rules you routinely
use when adding vectors representing positions, forces or velocities and
multiplying those by numbers. For example, these axioms imply that
for any vector v, (1) 0 v = 0 and (2) (−1) v = −v.
Proof: (1) Let w = 0 v. Setting α = β = 0 in Eq. (2.5) gives 0 v =
0 v + 0 v, i.e., w = w + w. Adding the vector −w to each side of this
equation gives w + (−w) = (w + w) + (−w). By virtue of Eqs. (2.1),
(2.2) and (2.3), this last equation simplifies to 0 = w, which shows that
indeed 0 v = 0 for any vector v. (2) Setting α = 1 and β = −1 in
Eq. (2.5) gives 0 v = 1 v + (−1) v, i.e., 0 = v + (−1) v. Hence (−1) v
fulfills the equation defining the additive inverse of v. As, by axiom 4, −v
is the only vector which fulfills this equation, (−1) v is necessarily −v.
+ One can also define more general vector spaces in which the scalars
multiplying the vectors are not specifically real or complex numbers.
Instead, these scalars are taken to be the elements of what mathemati-
cians call a field (not to be confused with a vector field), namely a set of
numbers or other mathematical objects endowed with two operations
following the same rules as the ordinary addition and multiplication
between real numbers.
Notation
We will normally represent 3D geometric vectors (e.g., position vectors, velocity
vectors, angular momentum vectors, etc.) by boldface upright letters (e.g., v),
and elements of other vector spaces (e.g., vectors representing quantum states)
by normal fonts or some other symbols (e.g., χ+ or |ψ⟩). Also, we will generally
use the symbol 0 to represent the zero vector, unless it would be desirable to
emphasize the difference between this vector and the number 0.
Examples of vector spaces
• 3D geometric vectors
By 3D geometric vectors we mean the “arrow vectors” you have often
used to describe physical quantities which have both a magnitude and a
direction. Suppose that v1 and v2 are two such vectors (e.g., two different
forces acting on a same particle). You are familiar with the fact that v1
can be summed to v2 and that the result is also a geometric vector (the
vector given by the parallelogram rule). One can also multiply a geomet-
ric vector by a real number: by definition of this operation, the product
of a vector v by a real number α is a vector of the same direction as v (if
α > 0) or the opposite direction (if α < 0) and whose length is |α| times
the length of v. Geometric vectors form a real vector space under these
two operations.
+ As an exercise, check that these two operations have all the properties
required by the definition of a vector space. The zero vector here is
the vector whose length is zero (and whose direction is therefore un-
defined). Moreover, any geometric vector v has an additive inverse,
−v, which can be obtained by multiplying v by −1. (If v is not the
zero vector, then −v is a vector of same length as v but of opposite
direction.)
spin-1/2 particle. Such column vectors can be added together to give a
column vector of the same form: by definition of this operation, if a, a0 , b
and b0 are complex numbers,
\begin{pmatrix} a \\ b \end{pmatrix} + \begin{pmatrix} a' \\ b' \end{pmatrix} = \begin{pmatrix} a + a' \\ b + b' \end{pmatrix}.   (2.9)
One can also multiply a column vector of complex numbers by a com-
plex number, the result being a column vector of complex numbers. For
example,
(2 + 3i) \begin{pmatrix} a \\ b \end{pmatrix} \equiv \begin{pmatrix} 2a + 3ia \\ 2b + 3ib \end{pmatrix}.   (2.10)
+ It is easy to see that these two operations have all the properties re-
quired by the definition of a vector space. The zero vector, here, is a
column vector of zeros, since adding a column vector of zeros to an-
other column vector does not change the latter:
\begin{pmatrix} a \\ b \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}.   (2.11)
such functions and multiplication by a scalar as the ordinary product by
a number. At the risk of being pedantic:
+ Again, check, as an exercise, that this set and these two operations
fulfill the definition of a vector space. The function 0(x) whose value
is zero for all values of x plays the role of the zero vector for this vector
space, since
f (x) + 0(x) ≡ f (x) (2.15)
for any function f (x) if 0(x) ≡ 0. Moreover, it is also the case that
to any function f (x) corresponds a function −f (x) ≡ (−1)f (x) such
that
f (x) + [−f (x)] ≡ 0(x). (2.16)
[Don’t be confused by the terminology: If f (x) is regarded as an ele-
ment of this vector space, then −f (x) is its inverse element (its additive
inverse), not its inverse function. The latter is the function f⁻¹(y) such
that if f (x) = y then f −1 (y) = x.]
• Square-integrable functions of a real variable
By a square-integrable function of a real variable we mean a function f (x)
(possibly taking complex values) such that the integral
\int_a^b |f(x)|^2\, dx

exists and is finite. For example,
\int_1^b \frac{dx}{x^\alpha} = \begin{cases} (b^{1-\alpha} - 1)/(1 - \alpha) & \alpha \neq 1, \\ \log b & \alpha = 1. \end{cases}   (2.18)
+ The question of whether the set of all square-integrable functions forms a
vector space is rather subtle and cannot be fully addressed without using
mathematical concepts outside the scope of this course. Recall that the
axioms of a vector space require that the sum of any two elements of a
vector space V is also an element of V , and likewise for the product of
any element of V by any number. In other words, they require that V is
closed under vector addition and multiplication by a scalar. It is clear that
multiplying a square-integrable function f (x) by a real or complex finite
number α always results in a square-integrable function, in agreement
with the axioms of a vector space: since
Z ∞ Z ∞
|αf (x)|2 dx = |α|2 |f (x)|2 dx, (2.19)
−∞ −∞
the integral of |αf (x)|2 exists and is finite if the integral of |f (x)|2 exists
and is finite.
For the set of all square-integrable functions to be a vector space, it is also
necessary that the sum of any two square-integrable functions is a square-
integrable function. This is not difficult to prove if we consider only
functions that are continuous everywhere (which is
not restrictive for us, as wave functions used in Quantum Mechanics are
normally continuous).
Proof: The sum of two continuous functions f (x) and g(x) is a continuous
function and so are the functions |f (x)|2 , |g(x)|2 and |f (x) + g(x)|2 . The
case where these functions are defined and continuous on a closed interval
[a, b] is simple, as continuity on a closed interval implies integrability on
that interval; hence, f (x) + g(x) is square-integrable on [a, b]. The case
of functions defined and continuous on the infinite interval (−∞, ∞)
is not as simple, however, since not all functions that are continuous on
(−∞, ∞) are also integrable on (−∞, ∞). Let us assume that f (x) and
g(x) are both square-integrable and continuous. Hence, the integrals
\int_{-\infty}^{\infty} |f(x)|^2\, dx \quad \text{and} \quad \int_{-\infty}^{\infty} |g(x)|^2\, dx

both exist. Now, at every point x,

|f(x) + g(x)|^2 = |f(x)|^2 + |g(x)|^2 + 2\,\mathrm{Re}[f^*(x)\,g(x)].

Similarly,

|f(x) - g(x)|^2 = |f(x)|^2 + |g(x)|^2 - 2\,\mathrm{Re}[f^*(x)\,g(x)].

From this last equation, and from the fact that |f(x) − g(x)|² ≥ 0, we
deduce that

2\,\mathrm{Re}[f^*(x)\,g(x)] \leq |f(x)|^2 + |g(x)|^2,

so that |f(x) + g(x)|² ≤ 2|f(x)|² + 2|g(x)|² at every point. Hence

\int_{-\infty}^{\infty} |f(x) + g(x)|^2\, dx \leq 2\int_{-\infty}^{\infty} |f(x)|^2\, dx + 2\int_{-\infty}^{\infty} |g(x)|^2\, dx.   (2.24)
2.2 Subspaces
A subspace of a vector space V is a subset of V which itself forms a vector space
under the same operations of vector addition and scalar multiplication as in V .
Examples
• We have seen that the set of all N -component column vectors of real
numbers is a real vector space. For N = 3, these column vectors take the
following form,
\begin{pmatrix} a \\ b \\ c \end{pmatrix},
where a, b and c are three real numbers. Amongst these vectors are those
whose third component is zero, i.e., column vectors of the form
\begin{pmatrix} a \\ b \\ 0 \end{pmatrix}.
+ It is clear that summing any two column vectors whose third compo-
nent is zero gives a column vector whose third component is zero, and
likewise, that multiplying any of them by a number also gives a col-
umn vector whose third component is zero. This set is therefore closed
under vector addition and under multiplication by a scalar, as required
by the axioms of a vector space.
By contrast, the set of all 3-component column vectors whose third
component is 1 is not closed under these two operations, and is there-
fore not a vector space.
• As seen at the end of the previous section, the set of all functions of a real
variable is a vector space and the set of all square-integrable functions
of a real variable is also a vector space. The set of all square-integrable
functions is a subset of the set of all functions since all square-integrable
functions are functions but not all functions are square-integrable. Corre-
spondingly, the set formed by all square-integrable functions is a subspace
of the vector space formed by all functions.
Many other subspaces of this vector space can be considered, e.g., the sub-
space formed by all continuous functions, the subspace of that subspace
formed by all functions which are both continuous and have a continuous
first order derivative, the subspace formed by the periodic functions with
period 2π, etc.
is the direct sum of the space formed by the 3-component column vec-
tors
\begin{pmatrix} a \\ b \\ 0 \end{pmatrix}
and the space formed by the 3-component column vectors
\begin{pmatrix} 0 \\ 0 \\ c \end{pmatrix}.
For example, a sum of the form

c_1 \begin{pmatrix} a \\ b \end{pmatrix} + c_2 \begin{pmatrix} a' \\ b' \end{pmatrix}

is a linear combination of two-component complex column vectors;
2 sin x + cos 3x
is a linear combination of the functions sin x and cos 3x; etc.
The general form of a linear combination of N vectors is
c1 v1 + c2 v2 + c3 v3 + · · · + cN vN
where v1 , v2 , v3 , . . . , vN are vectors and c1 , c2 , c3 , . . . , cN are scalars.
+ Extending this definition to a linear combination of an infinite number
of vectors is fraught with mathematical difficulties. However, the ex-
tension is not impossible. For instance, a convergent Fourier series is
in effect an infinite linear combination of vectors, each vector being a
complex exponential (or a sine or cosine function).
Examples
is the vector space formed by all 3-component column vectors of the form
\begin{pmatrix} a \\ b \\ 0 \end{pmatrix}.
+ The two numbers a and b are real if we exclude linear combinations in-
volving complex numbers. In this case, we can take these two numbers
to be the x- and y-coordinates of a point in 3D space, the z-coordinate
of this point being 0. If we do so, the column vectors
\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}
then represent unit vectors starting at the origin and oriented, respec-
tively, in the x- and y-directions, and saying that these two column
vectors span the space formed by all column vectors of the form
\begin{pmatrix} x \\ y \\ 0 \end{pmatrix}
is the same as saying that these two unit vectors span the whole xy-
plane.
• As you have seen in a previous course, the spherical harmonics Ylm (θ, φ)
are certain functions of the polar angles θ and φ labelled by the quantum
numbers l and m. You may remember that the quantum number m can
only take the values −1, 0 and 1 for l = 1. These three functions span
the set of all the functions of the form
f (θ, φ) = c−1 Y1−1 (θ, φ) + c0 Y10 (θ, φ) + c1 Y11 (θ, φ), (2.25)
2.5 Linear independence
A set formed by N vectors v1 , v2 , . . . , vN , and these N vectors themselves, are
said to be linearly independent if none of these vectors can be written as a linear
combination of the other vectors of the set. In the opposite case, one says that
the set is linearly dependent (or that these N vectors are linearly dependent).
Examples
Non-zero vectors v1, v2, . . . , vN are linearly independent if and only if the equation

c1 v1 + c2 v2 + · · · + cN vN = 0   (2.27)

is possible only if all the coefficients c1, c2, . . . , cN are zero.

Proof: The condition is necessary: suppose that Eq. (2.27) would hold with
at least one non-zero coefficient (we can always relabel the vectors
vn so that n = 1 for this non-zero coefficient). We could then rewrite
this equation as

v1 = −(c2/c1) v2 − · · · − (cN/c1) vN,   (2.28)
which shows that v1 must then be the zero vector or a non-zero lin-
ear combination of the vectors v2 , . . . , vN . These two possibilities are
in contradiction with the hypothesis that the N vectors v1 , v2 , . . . , vN
are both non-zero and linearly independent. The condition is thus nec-
essary. The condition is also sufficient, as if, e.g., the vector v1 was a
linear combination of the vectors v2 , . . . , vN we could write
v1 = a2 v2 + · · · + aN vN,   (2.29)

in which case Eq. (2.27) would hold with the non-zero coefficient c1 = −1
(take c2 = a2, . . . , cN = aN).
+ The Exchange Theorem implies, for example, that the three column
vectors

\begin{pmatrix} a \\ b \\ 0 \end{pmatrix}, \quad \begin{pmatrix} a' \\ b' \\ 0 \end{pmatrix}, \quad \begin{pmatrix} a'' \\ b'' \\ 0 \end{pmatrix}

are always linearly dependent, irrespective of the values of a, a′, a″, b,
b′ and b″, since these three vectors belong to a vector space spanned by
fewer than three vectors (see Section 2.4).
2.6 Dimension of a vector space
A vector space may be finite-dimensional or infinite-dimensional. A vector
space is finite-dimensional and has a dimension N if it contains a linearly inde-
pendent set of N vectors but no linearly independent set of more than N vec-
tors. It is infinite-dimensional if it contains an arbitrarily large set of linearly
independent vectors. (Note that the dimension of a vector space is unrelated to
the number of elements this vector space contains. Finite-dimensional or not,
vector spaces always contain an infinite number of vectors.2 )
A vector space spanned by N linearly independent vectors is finite-dimensional
and its dimension is N .
Proof: The dimension of this vector space cannot be less than N since by
construction it contains a linearly independent set of N vectors. Moreover, it
cannot be larger than N since any set of more than N vectors belonging to
this vector space is necessarily linearly dependent by virtue of the Exchange
Theorem.
A corollary of this theorem is that a vector space spanned by N linearly in-
dependent vectors cannot also be spanned by a set of fewer than N linearly
independent vectors.
Examples
• As mentioned above, the set of all 2π-periodic functions is a vector space.
This set includes, in particular, the complex exponentials exp(inx) (n =
0, ±1, ±2, . . .). [These functions are 2π-periodic since, for any real x and
any integer n, exp[in(x + 2π)] = exp(inx) exp(2nπi) = exp(inx).] As
we will see later, the set formed by the complex exponentials exp(inx)
(n = 0, ±1, ±2, . . . , ±nmax ) is linearly independent. This set is arbitrarily
large since nmax can be as large as one wants. Therefore the vector space
of all 2π-periodic functions is infinite-dimensional.
2.7 Bases
A basis of a finite-dimensional vector space is a set of linearly independent
vectors such that any vector belonging to this vector space can be written as a
linear combination of these basis vectors. (The concept of basis for an infinite-
dimensional vector space is more complicated; it is addressed in Section 2.11.)
Although certain bases are more convenient than others, the choice of basis
vectors is arbitrary as long as these vectors are linearly independent. In fact,
any set of N linearly independent vectors belonging to a vector space of
dimension N is a basis for this vector space. Infinitely many such bases can
therefore be constructed.
that element could be joined to these N basis vectors to form a linearly
independent set of N + 1 vectors, which is in contradiction with the
hypothesis that the dimension of V is N .
Given a basis of a vector space, there is one and only one way of writing each
element of that space as a linear combination of these basis vectors.
Proof: Suppose that there would be more than one way of writing a
vector a as a linear combination of basis vectors v1 , v2 ,. . . , vN . I.e., one
could write
a = c1 v1 + c2 v2 + · · · + cN vN (2.30)
and also
a = c01 v1 + c02 v2 + · · · + c0N vN (2.31)
with cn ≠ c′n for at least one value of n. However, subtracting these
two equations gives

0 = (c1 − c′1) v1 + (c2 − c′2) v2 + · · · + (cN − c′N) vN.   (2.32)

Since the basis vectors are linearly independent, this last equation is possible
only if cn = c′n for all n, in contradiction with the hypothesis.
Examples
• The two column vectors

\begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix}

form a basis for the vector space of the two-component column vectors
since

\begin{pmatrix} a \\ b \end{pmatrix} = a \begin{pmatrix} 1 \\ 0 \end{pmatrix} + b \begin{pmatrix} 0 \\ 1 \end{pmatrix}   (2.33)

for any complex numbers a and b.
This basis is not unique. For instance, the two column vectors
\begin{pmatrix} 1 \\ 2 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 3 \\ 4 \end{pmatrix}

also form a basis for this vector space since for any complex numbers a
and b one can always find a number α and a number β such that

\begin{pmatrix} a \\ b \end{pmatrix} = \alpha \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \beta \begin{pmatrix} 3 \\ 4 \end{pmatrix}.   (2.34)
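Explicitly (a small worked solution added for concreteness), Eq. (2.34) amounts to the pair of equations α + 3β = a and 2α + 4β = b, whose unique solution is

\alpha = -2a + \tfrac{3}{2}\, b, \qquad \beta = a - \tfrac{1}{2}\, b;

the existence and uniqueness of (α, β) for every (a, b) is exactly what makes these two column vectors a basis.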
2.8 Inner product
The vector spaces most relevant to Quantum Mechanics are inner product spaces,
namely vector spaces equipped with an operation called inner product (or scalar
product) which extends the familiar dot product between geometric vectors to
more general vector spaces.
We will denote the dot product of two geometric vectors v and w by the usual
symbol v · w, and the inner product of two other vectors v and w by (v, w) or
a similar symbol. Other notations are sometimes used in Mathematics.
2. For any vectors v1, v2 and w and for any complex numbers α and β,

(w, α v1 + β v2) = α (w, v1) + β (w, v2).
3. (v, v) = 0 if and only if v is the zero vector, and (v, v) > 0 if v is not
the zero vector.
Proof: In view of Eqs. (2.37) and (2.38),
The inner product is defined in the same way for real vector spaces, the only
difference being that (v, w) and the scalars α and β are real numbers in the
case of a real vector space.
Warning: The definition of the inner product used in this course differs
from a similar notation widely used in Mathematics. As defined here, the
inner product (v, w) is antilinear in the vector written on the left whereas
in Mathematics (v, w) is usually taken to be antilinear in the vector writ-
ten on the right. I.e., for us, (α v1 + β v2 , w) = α∗ (v1 , w) + β ∗ (v2 , w) and
(v, α w1 + β w2 ) = α(v, w1 ) + β(v, w2 ), whereas mathematicians would in-
stead write (α v1 + β v2 , w) = α(v1 , w) + β(v2 , w) and (v, α w1 + β w2 ) =
α∗ (v, w1 ) + β ∗ (v, w2 ).
Examples
• For the 2-component complex column vectors used to represent spin states,
the inner product is defined in the following way: If
v = \begin{pmatrix} a \\ b \end{pmatrix}   (2.43)

and

w = \begin{pmatrix} a' \\ b' \end{pmatrix},   (2.44)

then

(v, w) = \begin{pmatrix} a^* & b^* \end{pmatrix} \begin{pmatrix} a' \\ b' \end{pmatrix} = a^* a' + b^* b'.   (2.45)
See Appendix A for a reminder of how to multiply a column vector by a
row vector. Note that the components of the column vector v, which is
on the left of (v, w), are complex conjugated when this inner product is
calculated, and also that this column vector is written as a row vector in
the product.
The rule is the same for column vectors of any number of components: We
will always calculate the inner product (v, w) by transforming the com-
plex conjugate of the column vector v into a row vector and multiplying
the column vector w by this row vector. (Other ways of calculating the
inner product are possible in principle but are not often used in Quantum
Mechanics.)
Changing the column vector appearing on the left into a row vector can
be taken as a convention. It differs from the one you probably follow
when calculating the dot product of two geometric vectors, in which this
calculation is written as a dot product of two column vectors formed by
the x-, y- and z-components of the respective geometric vectors. These
two conventions are entirely equivalent, although we will see that there
are mathematical reasons for preferring the one adopted in this course.
• The inner product of two square-integrable functions f (x) and g(x) is
defined as follows:
(f, g) = \int_{-\infty}^{\infty} f^*(x)\, g(x)\, dx,   (2.46)
or, for functions which are square-integrable on a finite interval [a, b],
(f, g) = \int_a^b f^*(x)\, g(x)\, dx.   (2.47)
2.9 Norm of a vector
The norm of a vector v, which we will represent by the symbol ||v|| (or |v| for
3D geometric vectors) is the real number defined by the following equation:

||v|| = \sqrt{(v, v)}   (2.50)

(or |v| = \sqrt{v \cdot v} for geometric vectors).
A vector is said to be normalized if it has unit norm. (A normalized vector is
also called a unit vector.)
Any non-zero vector can be normalized by multiplying it by the inverse of its
norm: If u = v/||v||, then

||u|| = \sqrt{(v, v)}/||v|| = 1.   (2.51)
(Clearly, the zero vector has a zero norm and cannot be normalized. By defi-
nition of an inner product, the zero vector is the only vector which has a zero
norm.)
Example
The vector

v = \begin{pmatrix} 1/\sqrt{2} \\ i/\sqrt{2} \end{pmatrix}   (2.52)

is a unit vector since

(v, v) = \begin{pmatrix} 1/\sqrt{2} & -i/\sqrt{2} \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} \\ i/\sqrt{2} \end{pmatrix} = 1/2 + (-i)(i)/2 = 1.   (2.53)
+ As you know, v · w = |v| |w| cos θ where θ is the angle between the
vectors v and w. Since cos θ is a number between −1 and 1, its absolute
value is never larger than 1. Hence |v · w| ≤ |v| |w|. This result is a
particular case of the more general inequality
|(v, w)| \leq ||v||\, ||w||,

valid in any inner product space (the Cauchy-Schwarz inequality).
Proof: If v = 0 or w = 0 (i.e., v or w is the zero vector), then (v, w) = 0,
||v|| = 0 or ||w|| = 0, and the inequality reduces to 0 ≤ 0, which is true.
If neither v nor w is the zero vector, then we can define the normalized
vectors v′ = v/||v|| and w′ = w/||w|| such that (v′, v′) = (w′, w′) = 1,
and set u′ = v′ − (w′, v′) w′. Now, since (w′, v′) = (v′, w′)*,
two integers and m ≠ n, then m − n is a non-zero integer and therefore
\int_0^{2\pi} [\exp(inx)]^* \exp(imx)\, dx = \int_0^{2\pi} \exp[i(m-n)x]\, dx   (2.60)

= \left. \frac{\exp[i(m-n)x]}{i(m-n)} \right|_0^{2\pi}   (2.61)

= \frac{\exp[2(m-n)\pi i] - 1}{i(m-n)}   (2.62)

= \frac{1 - 1}{i(m-n)}   (2.63)

= 0.   (2.64)
(We used the fact that exp(2kπi) = 1 for any integer k.)
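A quick numerical cross-check of Eqs. (2.60)-(2.64) can be made with a short Python sketch (an added illustration, not part of the notes; the integral is approximated by the trapezoidal rule on a fine grid):

    import numpy as np

    def inner(n, m, num=20001):
        # (e^{inx}, e^{imx}) = integral over [0, 2*pi] of e^{-inx} e^{imx} dx
        x = np.linspace(0.0, 2*np.pi, num)
        integrand = np.exp(-1j*n*x) * np.exp(1j*m*x)
        return np.trapz(integrand, x)

    print(inner(2, 5))   # ~0: orthogonal when m != n
    print(inner(3, 3))   # ~2*pi: these functions are orthogonal but not normalized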
• The zero vector is orthogonal to every other vector.
Gram-Schmidt orthogonalization
A set of linearly independent vectors can always be transformed into a set of
vectors orthogonal to each other by using a method known as Gram-Schmidt
orthogonalization.
+ The orthogonalization method we are talking about differs from the
similar method of Gram-Schmidt orthonormalization you may have
studied in other courses.
The method is perhaps best explained by way of examples. First, let us consider
just two geometric vectors (arrow vectors), a and b, say. We want to make b
orthogonal to a. To this effect, we write b as the sum of a vector b∥ parallel to a
and a vector b⊥ orthogonal to a. Clearly, b∥ is the unit vector in the direction of a,
â, multiplied by b · â. Since â = a/(a · a)^{1/2}, altogether b∥ = [(b · a)/(a · a)] a.
Therefore b⊥ = b − [(b · a)/(a · a)] a.
As a second example, suppose that we want to form three vectors a0 , b0 and
c0 orthogonal to each other, starting from three linearly independent non-zero
column vectors a, b and c. For the sake of the illustration, let us imagine that
a = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad b = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad c = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}.   (2.65)
We can decide to include one of the latter amongst our set of orthogonal vectors.
For instance, let us take a′ to be the vector a:

a′ = a = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.   (2.66)

We then form a vector b′ orthogonal to a′ by subtracting from b its component
along a′:

b′ = b − [(a′, b)/(a′, a′)]\, a′.   (2.67)

Here

(a′, b) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = 2 \quad \text{and} \quad (a′, a′) = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = 3,

and thus

b′ = b − \frac{2}{3}\, a′ = \begin{pmatrix} 1/3 \\ 1/3 \\ -2/3 \end{pmatrix}.   (2.68)
Note that b′ cannot be the zero vector, as otherwise the vectors a and b would
not be linearly independent. We then form a vector c′ orthogonal to both a′ and
b′ by the same process. I.e., we set

c′ = c − [(a′, c)/(a′, a′)]\, a′ − [(b′, c)/(b′, b′)]\, b′.

Here (a′, c) = 2, (a′, a′) = 3, (b′, c) = −1/3 and (b′, b′) = 6/9 = 2/3, and thus

c′ = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} - \frac{2}{3} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1/3 \\ 1/3 \\ -2/3 \end{pmatrix} = \begin{pmatrix} 1/2 \\ -1/2 \\ 0 \end{pmatrix}.   (2.72)
As can be checked easily, the three vectors so obtained,

a′ = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad b′ = \begin{pmatrix} 1/3 \\ 1/3 \\ -2/3 \end{pmatrix} \quad \text{and} \quad c′ = \begin{pmatrix} 1/2 \\ -1/2 \\ 0 \end{pmatrix},   (2.73)

are orthogonal to each other.
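The same orthogonalization is easy to reproduce numerically. A minimal sketch in Python with numpy (an added illustration, not part of the original notes):

    import numpy as np

    # The three linearly independent vectors of Eq. (2.65).
    a = np.array([1.0, 1.0, 1.0])
    b = np.array([1.0, 1.0, 0.0])
    c = np.array([1.0, 0.0, 1.0])

    def project(u, v):
        # Component of v along u, i.e. [(u, v)/(u, u)] u.
        return (np.dot(u, v) / np.dot(u, u)) * u

    # Gram-Schmidt orthogonalization as described above.
    a_p = a
    b_p = b - project(a_p, b)
    c_p = c - project(a_p, c) - project(b_p, c)

    print(b_p)   # [ 1/3  1/3 -2/3 ]
    print(c_p)   # [ 1/2 -1/2  0   ]
    # All pairwise inner products vanish:
    print(np.dot(a_p, b_p), np.dot(a_p, c_p), np.dot(b_p, c_p))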
Suppose that v and w are two vectors of an N-dimensional vector space, written
in terms of basis vectors u1, u2, . . . , uN as

v = \sum_{j=1}^N c_j u_j \quad \text{and} \quad w = \sum_{j=1}^N d_j u_j.   (2.74)

Suppose further that the vectors uj are orthonormal — i.e., that (ui, uj) = δij,
where δij is the Kronecker delta:

\delta_{ij} = \begin{cases} 1 & i = j, \\ 0 & i \neq j. \end{cases}   (2.75)
The coefficients cj and dj can then be obtained as the inner product of the re-
spective vector with the corresponding basis vector: cj = (uj , v) and dj =
(uj, w). Moreover,

(v, w) = \sum_{j=1}^N c_j^* d_j   (2.76)

and therefore

||v||^2 = (v, v) = \sum_{j=1}^N |c_j|^2 \quad \text{and} \quad ||w||^2 = (w, w) = \sum_{j=1}^N |d_j|^2.   (2.77)
Proof: Taking the inner product of v with u1, we see that

(u_1, v) = \sum_{j=1}^N c_j (u_1, u_j) = \sum_{j=1}^N c_j \delta_{1j} = c_1.   (2.78)
Orthogonal subspaces
A subspace V 0 of a vector space V is said to be orthogonal to another subspace
V 00 of V if all the vectors of V 0 are orthogonal to all the vectors of V 00 .
For example, the space spanned by the two column vectors

\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}

is orthogonal to the space spanned by the column vector

\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}

since

\begin{pmatrix} a^* & b^* & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ c \end{pmatrix} = 0   (2.80)
for any complex numbers a, b and c.
+ We have seen, in Section 2.4, that the space spanned by these first two
column vectors can be represented geometrically by the xy-plane. Sim-
ilarly, the one-dimensional space spanned by the third column vector
can be represented by the z-axis, and saying that these two spaces are
orthogonal is the same as saying that the z-axis is orthogonal to the
xy-plane.
Proof: As seen in Section 2.5, we can be sure that N vectors are linearly
independent if the equation
c1 v1 + c2 v2 + · · · + cN vN = 0 (2.81)
is possible only if all the coefficients cn are zero. Suppose that the N
vectors vn are mutually orthogonal and non-zero. Suppose further that
the equation would be possible with some of the coefficients being non-
zero. Let cj be a non-zero coefficient. Taking the inner product of the
vector vj with each side of the equation gives

(v_j, c_1 v_1 + c_2 v_2 + \cdots + c_N v_N) = c_j (v_j, v_j) = 0,

which is impossible since cj ≠ 0 and (vj, vj) > 0 for a non-zero vector vj. Hence
all the coefficients must be zero, and the N vectors are linearly independent.
For example, taken as functions defined on the interval [0, 2π], the complex
exponentials exp(inx) and exp(imx) are linearly independent for any inte-
gers n and m with m ≠ n since, as established above, these functions are orthogonal
on that interval.
Such vector spaces are called Hilbert spaces. (Recall that inner product spaces
are vector spaces in which an inner product is defined. For the mathematical
property called completeness, see below.)
We have already encountered several Hilbert spaces: The vector spaces formed
by N -component column vector of complex numbers (e.g., column vectors rep-
resenting spin states) are Hilbert spaces, and so are the vector spaces formed by
square-integrable functions (e.g., wave functions).
What completeness is is not important for this course. Briefly, if one takes an
infinite set of vectors belonging to a vector space V and this infinite set forms a
convergent sequence (in a precise mathematical sense, see the example in the
note below for more information), then for V to be complete the limit of this
sequence must also be a vector belonging to V. This property plays an impor-
tant role in the mathematical theory of infinite-dimensional vector spaces.
One can show that any finite-dimensional inner product space is complete
and is therefore a Hilbert space. However, not all infinite-dimensional inner
product spaces are Hilbert spaces.
Thus ||fn − fm|| goes to zero for n and m → ∞, which means that
the sequence converges (loosely speaking, the difference between the
functions fn(x) and fm(x) becomes vanishingly small when n and m
increase). If you have taken a course in Analysis, you may have rec-
ognized that these functions actually form a Cauchy sequence, which
is the mathematically rigorous way convergence is defined in this con-
text: for any positive number ε one can find an integer N such that
||fm − fn|| < ε for all m and n > N. Although this sequence con-
verges and only contains functions continuous everywhere on [0, 1],
the function it converges to is not continuous on [0, 1]: the
limit of x^{1/n} for n → ∞ is indeed the discontinuous function

f(x) = \begin{cases} 1 & 0 < x \leq 1, \\ 0 & x = 0. \end{cases}   (2.87)
Therefore the vector space of all the functions continuous on [0, 1] and
equipped with this inner product is not complete and does not qualify
as a Hilbert space.
One needs to enlarge the space in order to complete it, in particular by
including functions that are not continuous everywhere. How to do this
is well outside the scope of the course, but the result can be stated rela-
tively simply: the space of all square-integrable functions on an interval
[a, b] or on the infinite interval (−∞, ∞) is complete with respect to the
inner products defined by Eqs. (2.44) and (2.45). (For this result to hold,
however, these integrals must be understood as being Lebesgue inte-
grals — see page 20. Recall that square-integrable functions are called
L2 functions.) The mathematical theory of the corresponding Hilbert
spaces underpins the whole of wave quantum mechanics.
number of such functions, e.g., s0(x), s1(x), s2(x), . . . , with

s_N(x) = \sum_{n=-N}^{N} c_n \exp(inx).   (2.88)

If ||f − sN|| → 0 for N → ∞, one writes

f(x) = \sum_{n=-\infty}^{\infty} c_n \exp(inx),

it being understood that this equality does not necessarily hold at all val-
ues of x. It turns out that for any L2 function on [0, 2π] one can find
a set of coefficients cn such that the function can be expanded in such
a way. In this sense, the complex exponentials exp(inx) form a basis
for this Hilbert space. (There would be much more to say about the
concept of basis on infinite-dimensional vector spaces, but saying much
more would go well beyond the scope of this course.)
in column vectors, e.g.,

\begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix},
Eq. (2.90) defines a one-to-one correspondence between the elements of the vec-
tor space of 3D geometric vectors and the vector space of 3-component column
vectors of real numbers. In fact, adding geometric vectors or multiplying them
by a number is equivalent to adding the corresponding column vectors or mul-
tiplying these by the same number. Also, the dot product of any two geometric
vectors can be calculated as the inner product of the corresponding column
vectors: If v = vx x̂ + vy ŷ + vz ẑ and w = wx x̂ + wy ŷ + wz ẑ, then

v \cdot w = v_x w_x + v_y w_y + v_z w_z = \begin{pmatrix} v_x & v_y & v_z \end{pmatrix} \begin{pmatrix} w_x \\ w_y \\ w_z \end{pmatrix}.   (2.91)
(In writing the row vector, we have assumed that the components vx , vy and vz
are real numbers and therefore equal to their complex conjugate.)
In recognition of these facts, one says that the vector space of 3D geometric
vectors and the vector space of 3-component column vectors of real numbers
are isomorphic (which means, literally, that they have the same form). In a
sense, there is only one such vector space, and its elements can be represented
equally well by arrow vectors as by column vectors.
As another example, take the vector space spanned by the three spherical har-
monics Y1−1 (θ, φ), Y10 (θ, φ) and Y11 (θ, φ) (see Section 2.4). You have seen in
the Term 1 course that these three functions are orthonormal; in fact, for any l,
l0 , m and m0 ,
\int_0^\pi d\theta\, \sin\theta \int_0^{2\pi} d\phi\, Y_{lm}^*(\theta,\phi)\, Y_{l'm'}(\theta,\phi) = \delta_{ll'}\,\delta_{mm'}.   (2.92)
These three spherical harmonics are thus linearly independent. Any element
f (θ, φ) of this vector space is a linear combination of the form
f (θ, φ) = c−1 Y1−1 (θ, φ) + c0 Y10 (θ, φ) + c1 Y11 (θ, φ), (2.93)
where c−1 , c0 and c1 are three complex numbers. Since the three spherical har-
monics are linearly independent, each of these functions can be written in only
one way as a linear combination of that form. These functions are therefore in
one-to-one correspondence with the column vectors
\begin{pmatrix} c_{-1} \\ c_0 \\ c_1 \end{pmatrix}
and the vector space spanned by these three spherical harmonics is isomorphic
to the vector space of 3-component column vectors of complex numbers.
Exercise: Let
f (θ, φ) = c−1 Y1−1 (θ, φ) + c0 Y10 (θ, φ) + c1 Y11 (θ, φ), (2.94)
g(θ, φ) = d−1 Y1−1 (θ, φ) + d0 Y10 (θ, φ) + d1 Y11 (θ, φ). (2.95)
Show that the inner product of these two functions, defined as the integral
\int_0^\pi d\theta\, \sin\theta \int_0^{2\pi} d\phi\, f^*(\theta,\phi)\, g(\theta,\phi),
is c∗−1 d−1 + c∗0 d0 + c∗1 d1 . Show that the same result is also obtained by
taking the inner product of the corresponding column vectors,
\begin{pmatrix} c_{-1} \\ c_0 \\ c_1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} d_{-1} \\ d_0 \\ d_1 \end{pmatrix}.
3 Operators (I)
This operator transforms functions into functions. For instance, it trans-
forms the function
\psi(x) = \exp(-\alpha^2 x^2/2),   (3.6)
where α is a real constant, into the function
H\psi(x) = -\frac{\hbar^2}{2m}\,\frac{d^2\psi}{dx^2} + V(x)\,\psi(x)   (3.7)

= \left[ -\frac{\hbar^2}{2m}\left( \alpha^4 x^2 - \alpha^2 \right) + \frac{m\omega^2}{2}\, x^2 \right] \exp(-\alpha^2 x^2/2).   (3.8)
Domain of an operator
This last example illustrates the need for a careful definition of the vector
space in which an operator acts. Since it includes a term in d2 /dx2 , the
Hamiltonian operator of Eq. (3.4) is defined only when acting on functions
which can be differentiated twice. Such functions form a subspace of the vec-
tor space of all functions of x. Thus H maps twice-differentiable functions
to functions (the latter may or may not be twice-differentiable themselves,
hence the mapping is not from the space of all twice-differentiable functions
to the space of all twice-differentiable functions).
More generally, an operator is a mapping from a subspace V 0 of a vector space
V to the vector space V itself. The vector space V 0 in which the operator acts
is called the domain of this operator.
It might be that V 0 = V , in which case the domain of the operator is the
whole of V . (By definition of a subspace, V 0 is a subset of V ; however, as
a subset of a set can be this set itself, V 0 can be the whole of V .) This is
the case of the operator Sy defined in the first example: any 2-component
column vector can be multiplied by a 2 × 2 matrix, hence the domain of this
operator is the whole of the vector space of 2-component column vectors.
However, Quantum Mechanics also makes use of operators whose domain
is smaller than the vector space they map to — typically, these are operators
acting on wave functions, such as the Hamiltonian operator H of Eq. (3.4).
Linear operators
Operators representing measurable physical quantities, and most of the other
operators used in Quantum Mechanics, have the important mathematical prop-
erty of being linear. An operator A is said to be linear if it fulfils the following two
conditions: (1) A(v1 + v2) = Av1 + Av2 for any vectors v1 and v2 the operator
can act on, and (2) A(cv) = c Av for any such vector v and any scalar c.
These two conditions can be summarized into the single condition that for any
vector v1 and v2 and any scalar c1 and c2 ,
A(c1 v1 + c2 v2 ) = c1 Av1 + c2 Av2 . (3.9)
For example, the differential operator d/dx is linear since, if c1 and c2 are con-
stants,

\frac{d}{dx}\left[ c_1 f_1(x) + c_2 f_2(x) \right] = c_1 \frac{df_1}{dx} + c_2 \frac{df_2}{dx}.   (3.10)
Throughout the rest of the course we will always assume that the operators we
are talking about are linear operators.
+ Not all operators are linear. For example, consider an operator O which
would multiply vectors by their norm, i.e., such that
Ov = ||v|| v (3.11)
for any vector v of the space in which this operator acts. This operator
violates the conditions operators must fulfil to qualify as linear opera-
tors. In particular, it is not the case that O(cv) = c Ov for any scalar c
and any vector v, since
O(cv) = ||cv|| cv, (3.12)
and ||cv|| cv ≠ c ||v|| v unless c = 0, |c| = 1 or v = 0. Hence O is not
a linear operator.
+ Linear operators are particular instances of more general mappings
called linear transformations.
The identity operator
The identity operator, which is usually denoted by the letter I, is the operator
which maps any vector into itself:
Iv = v (3.13)
for any vector v. We will use this operator from time to time.
Suppose that an operator A acting in an N-dimensional vector space transforms
a vector v into a vector w:

w = Av.   (3.14)

Suppose also that, in an orthonormal basis {u1, u2, . . . , uN},

v = c1 u1 + c2 u2 + · · · + cN uN,   (3.15)

w = d1 u1 + d2 u2 + · · · + dN uN,   (3.16)

and that cn = (un, v) and dn = (un, w), n = 1, 2, . . . , N. (See Section 2.10 for
further information about orthonormal bases.) Eq. (3.14) can be directly written
as a relation between the coefficients cn and the coefficients dn: One can
organise these two sets of coefficients into two column vectors, c and d, such
that

c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{pmatrix} \quad \text{and} \quad d = \begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_N \end{pmatrix},   (3.18)
and in terms of these two column vectors Eq. (3.14) reads
d = Ac (3.19)
where A is the N × N matrix

A = \begin{pmatrix}
(u_1, Au_1) & (u_1, Au_2) & \cdots & (u_1, Au_N) \\
(u_2, Au_1) & (u_2, Au_2) & \cdots & (u_2, Au_N) \\
\vdots & \vdots & \ddots & \vdots \\
(u_N, Au_1) & (u_N, Au_2) & \cdots & (u_N, Au_N)
\end{pmatrix}.   (3.20)
Proof: Written in terms of the basis set expansions of the vectors v and
w, Eq. (3.14) reads

\sum_{j=1}^N d_j u_j = \sum_{j=1}^N c_j\, Au_j.   (3.21)
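To make Eqs. (3.19) and (3.20) concrete, here is a small numerical sketch in Python (an added illustration, not part of the notes; the operator and the orthonormal basis are chosen at random):

    import numpy as np

    rng = np.random.default_rng(1)
    N = 3

    # A random operator on C^N (in the standard basis) and a random
    # orthonormal basis {u_1, ..., u_N} (the columns of U).
    A_op = rng.normal(size=(N, N)) + 1j*rng.normal(size=(N, N))
    U, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j*rng.normal(size=(N, N)))

    v = rng.normal(size=N) + 1j*rng.normal(size=N)
    w = A_op @ v                     # w = Av, Eq. (3.14)

    A = U.conj().T @ A_op @ U        # matrix elements (u_i, A u_j), Eq. (3.20)
    c = U.conj().T @ v               # c_n = (u_n, v)
    d = U.conj().T @ w               # d_n = (u_n, w)

    print(np.allclose(d, A @ c))     # True: d = A c, Eq. (3.19)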
The matrix A is said to represent the operator A in the basis {un }. Its elements
— i.e., the inner products (ui , Auj ) — are called the matrix elements of A in that
basis.
It is clear that the column vectors c and d representing the vectors v and w and
the matrix A representing the operator A all depend on the basis: changing the
basis from a set of orthonormal vectors {un } to a set of orthonormal vectors
{u′n} changes the column vectors c and d into column vectors c′ and d′ of ele-
ments c′n = (u′n, v) and d′n = (u′n, w) and changes the matrix A into a matrix
A′ of elements (u′i, Au′j). As long as the set {u′n} is also an orthonormal basis,
however, the column vector d′ is related to A′ and to c′ in the same way as d is
related to A and to c:

d′ = A′c′.   (3.26)

Many different sets of unit vectors can form an orthonormal basis.³ Therefore,
the same operator can be represented by many different matrices.
Examples
• The spherical harmonics Ylm (θ, φ) are orthonormal in the sense that
\int_0^\pi d\theta\, \sin\theta \int_0^{2\pi} d\phi\, Y_{lm}^*(\theta,\phi)\, Y_{l'm'}(\theta,\phi) = \delta_{ll'}\,\delta_{mm'}.   (3.27)
Therefore the three l = 1 spherical harmonics Y1−1 (θ, φ), Y10 (θ, φ) and
Y11 (θ, φ), constitute an orthonormal basis for the vector space of all linear
combinations of the form

f(\theta,\phi) = c_{-1} Y_{1-1}(\theta,\phi) + c_0 Y_{10}(\theta,\phi) + c_1 Y_{11}(\theta,\phi),
where c−1 , c0 and c1 are complex numbers (see Section 2.4 for this vector
space). You may remember from the Term 1 course that the spherical har-
monics are eigenfunctions of the angular momentum operator Lz . More
precisely,
L_z = -i\hbar\, \frac{\partial}{\partial\phi}   (3.28)
³ Actually, infinitely many different sets of unit vectors can form an orthonormal basis if the
dimension of the vector space is 2 or higher. There is no choice of basis set possible in spaces
of dimension 1.
and
L_z Y_{lm}(\theta,\phi) = m\hbar\, Y_{lm}(\theta,\phi).   (3.29)
Hence, when Lz acts on a linear combination of spherical harmonics with
l = 1, the result is also a linear combination of spherical harmonics with
l = 1; in fact,

L_z f(\theta,\phi) = -\hbar\, c_{-1} Y_{1-1}(\theta,\phi) + \hbar\, c_1 Y_{11}(\theta,\phi).   (3.30)
One can therefore represent the operator Lz by a 3×3 matrix Lz in the ba-
sis formed by the spherical harmonics Y1−1 (θ, φ), Y10 (θ, φ) and Y11 (θ, φ),
the elements of this matrix being the integrals
\int_0^\pi d\theta\, \sin\theta \int_0^{2\pi} d\phi\, Y_{lm}^*(\theta,\phi)\, L_z Y_{l'm'}(\theta,\phi).   (3.31)
These integrals are easy to calculate in view of Eqs. (3.27) and (3.29). The
calculation gives
L_z = \begin{pmatrix} -\hbar & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \hbar \end{pmatrix}.   (3.32)
Written in terms of this matrix, Eq. (3.30) reads
\begin{pmatrix} -\hbar & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \hbar \end{pmatrix} \begin{pmatrix} c_{-1} \\ c_0 \\ c_1 \end{pmatrix} = \begin{pmatrix} -\hbar c_{-1} \\ 0 \\ \hbar c_1 \end{pmatrix}.   (3.33)
It is worth noting that the column vectors representing the linear combi-
nations of spherical harmonics and the matrix representing the operator
Lz depend on the order of the basis functions in the basis set. Eqs. (3.32)
and (3.33) apply to the case where the basis is the ordered set
{Y1−1(θ, φ), Y10(θ, φ), Y11(θ, φ)}. Had we taken the basis to be the ordered
set {Y10(θ, φ), Y11(θ, φ), Y1−1(θ, φ)} instead,
these two equations would have been
L_z = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \hbar & 0 \\ 0 & 0 & -\hbar \end{pmatrix}   (3.34)

and

\begin{pmatrix} 0 & 0 & 0 \\ 0 & \hbar & 0 \\ 0 & 0 & -\hbar \end{pmatrix} \begin{pmatrix} c_0 \\ c_1 \\ c_{-1} \end{pmatrix} = \begin{pmatrix} 0 \\ \hbar c_1 \\ -\hbar c_{-1} \end{pmatrix}.   (3.35)
Other choices of basis functions are also possible. For instance, we can
work with the functions Y1x (θ, φ), Y1y (θ, φ) and Y1z (θ, φ) defined as fol-
lows:
Y_{1x}(\theta,\phi) = [Y_{1-1}(\theta,\phi) - Y_{11}(\theta,\phi)]/\sqrt{2},   (3.36)

Y_{1y}(\theta,\phi) = i[Y_{1-1}(\theta,\phi) + Y_{11}(\theta,\phi)]/\sqrt{2},   (3.37)

Y_{1z}(\theta,\phi) = Y_{10}(\theta,\phi).   (3.38)
It is not particularly difficult to show that these three functions also form
an orthonormal basis for the vector space spanned by the spherical har-
monics Y1−1(θ, φ), Y10(θ, φ) and Y11(θ, φ), and that in the basis
{Y1x(θ, φ), Y1y(θ, φ), Y1z(θ, φ)} the operator Lz is represented by the matrix

L'_z = \begin{pmatrix} 0 & i\hbar & 0 \\ -i\hbar & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.   (3.39)
• The five spherical harmonics Y2−2 (θ, φ), Y2−1 (θ, φ), Y20 (θ, φ), Y21 (θ, φ)
and Y22 (θ, φ) constitute an orthonormal basis for the vector space of all
linear combinations of the form
c−2 Y2−2 (θ, φ) + c−1 Y2−1 (θ, φ) + c0 Y20 (θ, φ) + c1 Y21 (θ, φ) + c2 Y22 (θ, φ),
where c−2, c−1, c0, c1 and c2 are complex numbers. In this basis, the
operator Lz is represented by the matrix
L_z = \begin{pmatrix}
-2\hbar & 0 & 0 & 0 & 0 \\
0 & -\hbar & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \hbar & 0 \\
0 & 0 & 0 & 0 & 2\hbar
\end{pmatrix}.   (3.40)
+ Although its analytical form [Eq. (3.28)] is the same as in the previous
example, this operator is represented by a 5 × 5 matrix here, not by
a 3 × 3 matrix. It may seem bizarre that the same operator can be rep-
resented by matrices of different sizes. However, technically, the Lz
operator considered here is not the same operator as the Lz operator
considered above: The mathematical definition of an operator includes
a specification of the vector space in which the operator acts, and this
vector space differs between these two examples.
• Since (ui , Iuj ) = (ui , uj ) = δij if I is the identity operator and the vectors
ui and uj are orthonormal, this operator is always represented by the unit
matrix in any orthonormal basis (specifically, by the N × N unit matrix
if the space is of dimension N ).4
If the Hilbert space is infinite-dimensional, an operator A is represented, in an
orthonormal basis {u1, u2, u3, . . .}, by an infinite matrix A whose elements Aij are
the inner products (ui, Auj). (Aij is the element of A located on the i-th row
and in the j-th column.) Likewise, a vector v of this Hilbert space is repre-
sented by a column vector c whose i-th component (i = 1, 2, 3, . . . ) is the
inner product (ui, v). In this representation, the vector Av is calculated as
the product of the column vector c by the matrix A.
Take, for example, the Hilbert space of all square-integrable functions of the
polar angles θ and φ, which is infinite-dimensional. (We are talking about
the Hilbert space of all square-integrable functions of these two angles, not
about a finite-dimensional Hilbert space of functions that can be written as
a linear combination of spherical harmonics with same l values as in the
previous examples.) One can show that the spherical harmonics Ylm (θ, φ)
form an orthonormal basis for this Hilbert space. In the basis
{Y00 (θ, φ), Y1−1 (θ, φ), Y10 (θ, φ), Y11 (θ, φ), Y2−2 (θ, φ), Y2−1 (θ, φ), . . .},
the angular momentum operator Lz is represented by the infinite matrix
L_z = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & -\hbar & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & \hbar & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & -2\hbar & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & -\hbar & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}.   (3.41)
Likewise, a function f (θ, φ) which can be expanded as
f(\theta,\phi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} c_{lm} Y_{lm}(\theta,\phi)   (3.42)
Warning: The mathematical theory of finite matrices does not extend
straightforwardly to infinite matrices. For example, a column vector of in-
finitely many components can be multiplied by a row vector of infinitely
many components only if the resulting sum of products of components con-
verges; the issue does not arise in the finite-dimensional case.
3.3 Adding and multiplying operators

The sum of two operators A and B, denoted by A + B, is the operator such that

(A + B)v = Av + Bv   (3.44)
for any vector v the operators A and B can both act on.
+ For example, the linear harmonic oscillator Hamiltonian given by Eq. (3.4)
can be seen as being the sum of the operator
-\frac{\hbar^2}{2m}\,\frac{d^2}{dx^2}
and the operator V (x), representing, respectively, the kinetic energy and
the potential energy of the oscillator. The latter is a multiplicative opera-
tor: seen as an operator, V (x) transforms a function ψ(x) into the function
V (x)ψ(x) [e.g., transforms the function exp(−α2 x2 /2) into the function
(mω 2 x2 /2) exp(−α2 x2 /2)].
Not surprisingly, the matrix representing the sum of two operators is the sum
of the corresponding matrices: E.g., if the operators A and A′ are represented
by the matrices

A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad \text{and} \quad A' = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix},   (3.45)

then the operator A + A′ is represented by the matrix

A + A' = \begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} = \begin{pmatrix} a + a' & b + b' \\ c + c' & d + d' \end{pmatrix}.   (3.46)
Multiplication by a scalar
Operators can also be multiplied by scalars. To state the obvious, if α is a num-
ber, the operator αA is defined as the operator such that

(\alpha A)v = \alpha\,(Av)   (3.47)

for any vector v the operator A can act on. Therefore one can make linear
combinations of operators in the same way as one can make linear combinations
of vectors. Examples of such linear combinations are the ladder operators Ĵ+
and Ĵ− used in the theory of the angular momentum, which are defined in terms
of the angular momentum operators Ĵx and Ĵy as Ĵ± = Ĵx ± iĴy (you may have
encountered these operators in Term 1, and we will come back to them later in
this course).
In terms of matrix representations, multiplying an operator by a number amounts
to multiplying the corresponding matrix by this number, which also amounts
to multiplying each of the elements of that matrix by this number:
\alpha \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \alpha a & \alpha b \\ \alpha c & \alpha d \end{pmatrix}.   (3.48)
Products of operators
Less obvious perhaps is that operators acting in the same space can also be
multiplied: If A and B are two operators, then, by definition, their product AB
is the operator such that
(AB)v = A(Bv) (3.49)
for any vector v (or more precisely, for any vector v for which the right-hand
side of this equation is defined). I.e., operating on a vector v with the product
operator AB is operating on v with B and then operating on the resulting vector
with A. Recall that the order of the operators in such products often matters:
in many cases, AB and BA are different operators (see Section 3.5 for further
information about non-commuting operators).
Clearly, an operator can be multiplied by itself to form the square of that opera-
tor, and this process can be iterated to form higher powers. For example, if A is
an operator, the operator A2 is the product AA, A3 is the product AA2 (which
can also be written AAA and A2 A), etc.
In terms of matrix representations, the matrix representing a product operator
AB in a given basis is the product of the matrix representing A with the matrix
representing B in the same basis. (These matrices do not commute if the oper-
ators A and B do not commute, in which case they must be multiplied in the
same order as the corresponding operators.)
+ Proof: The vector Bum can be written as a linear combination of the basis vectors u1, . . . , uN:
\[
B u_m = \sum_{i=1}^{N} c_i^{(m)} u_i, \tag{3.50}
\]
where the coefficients c_i^{(m)} are scalars. Thus
\[
(u_n, AB\, u_m) = \sum_{i=1}^{N} c_i^{(m)} (u_n, A u_i). \tag{3.51}
\]
However, c_i^{(m)} = (u_i, B u_m) since the basis is orthonormal. Therefore
\[
(u_n, AB\, u_m) = \sum_{i=1}^{N} (u_i, B u_m)(u_n, A u_i). \tag{3.52}
\]
Since the matrix elements (u_n, A u_i) and (u_i, B u_m) are numbers and numbers commute, Eq. (3.52) can also be written as
\[
(u_n, AB\, u_m) = \sum_{i=1}^{N} (u_n, A u_i)(u_i, B u_m). \tag{3.53}
\]
i=1
Eq. (3.53) says that the element on the n-th row and m-th column of
the matrix representing AB is obtained by multiplying the n-th row of
the matrix representing A by the m-th column of the matrix represent-
ing B. Hence, the matrix representing AB is the product of these two
matrices.
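+ A short numerical check of this row-times-column rule, Eq. (3.53) (Python/numpy; an illustration added to these notes, with arbitrary example matrices):

    import numpy as np

    rng = np.random.default_rng(0)
    N = 4
    A = rng.standard_normal((N, N))  # matrix representing A
    B = rng.standard_normal((N, N))  # matrix representing B

    # Element (n, m) of the matrix representing AB, computed as in Eq. (3.53):
    n, m = 1, 2
    element = sum(A[n, i] * B[i, m] for i in range(N))
    print(np.isclose(element, (A @ B)[n, m]))  # True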
+ You may have noticed that what is written above is a bit imprecise in
regards to the domain of the operators concerned. A product operator
AB may act only on vectors v which are in the domain of B and such
that Bv is in the domain of A.
Exponentials of operators
More complicated functions of operators can also be defined. One which often
crops up in Quantum Mechanics is the exponential of an operator. Recall that
if z is a number,
\[
\exp(z) = 1 + z + \frac{1}{2!} z^2 + \frac{1}{3!} z^3 + \cdots = \sum_{n=0}^{\infty} \frac{1}{n!} z^n. \tag{3.54}
\]
The exponential of an operator A is defined by the same power series: exp(A) is the operator \(\sum_{n=0}^{\infty} A^n / n!\), where A^0 is understood to be the identity operator.
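+ For matrices, the truncated power series can be evaluated directly. A minimal sketch (Python/numpy; this example is an addition to these notes, and the particular matrix is chosen because its exact exponential is known to be a rotation matrix, which gives an independent check):

    import numpy as np

    def exp_series(A, nmax=50):
        """Approximate exp(A) by truncating the power series at n = nmax."""
        result = np.eye(A.shape[0])
        term = np.eye(A.shape[0])
        for n in range(1, nmax + 1):
            term = term @ A / n  # term is now A^n / n!
            result = result + term
        return result

    theta = 0.3
    A = np.array([[0.0, theta], [-theta, 0.0]])
    R = np.array([[np.cos(theta), np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])  # exp(A) for this particular A
    print(np.allclose(exp_series(A), R))  # True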
1. If the operators A and B are both invertible, then the product AB is also invert-
ible and
(AB)−1 = B −1 A−1 . (3.57)
3. If the operator A is invertible, then Av is the zero vector only if v is the zero
vector.
Not quite all operators are invertible. Recall, for example, that a matrix whose determinant is zero is not invertible. One says that an operator is singular if it is not invertible.
3.5 Commutators
Two operators A and B are said to commute if ABv = BAv for any vector v these
operators may act on. (Recall from the previous section that ABv is the vector obtained
by first transforming v with B and then with A, while BAv is the vector obtained by
first transforming v with A and then with B.) If A and B commute then AB = BA
and AB − BA = 0. (The right-hand side of this last equation is the zero operator,
i.e., the operator which transforms any vector into the zero vector. Acting with this
operator on a vector amounts to a multiplication by the scalar 0.)
+ More precisely, A and B commute if ABv = BAv for all the vectors
v which are in the domain of AB as well as in the domain of BA. It is
possible for these different operators to have different domains if they
act in an infinite-dimensional vector space.
The commutator of two operators A and B is the operator AB − BA. This operator is
usually represented by the symbol [A, B]: by definition,
\[
[A, B] = AB - BA,
\]
and therefore
\[
[A, B]v = ABv - BAv. \tag{3.60}
\]
Clearly, the commutator of two commuting operators is zero.
For example, take the matrix operators
\[
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{and} \quad \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\]
which are used to represent spin operators (see later in the course). These two operators do not commute since σzσx and σxσz are different matrices:
\[
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{3.61}
\]
whereas
\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{3.62}
\]
Clearly,
\[
[\sigma_z, \sigma_x] = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} - \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 2 \\ -2 & 0 \end{pmatrix}. \tag{3.63}
\]
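+ The commutator (3.63) is easily verified numerically (Python/numpy; an illustration added here). As a further check, the result equals 2iσy, where σy is the third Pauli matrix (a fact not needed above):

    import numpy as np

    sigma_z = np.array([[1, 0], [0, -1]])
    sigma_x = np.array([[0, 1], [1, 0]])
    sigma_y = np.array([[0, -1j], [1j, 0]])

    commutator = sigma_z @ sigma_x - sigma_x @ sigma_z
    print(commutator)                             # [[0, 2], [-2, 0]]
    print(np.allclose(commutator, 2j * sigma_y))  # True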
+ Note: You may remember to have seen the equation [x, px ] = i~, where
x and px are, respectively, the position and momentum operators for the
x-direction. Since the left-hand side of this equation is an operator, the
correct (but pedantic) way of writing its right-hand side is i~I, where I
is the identity operator. Writing [x, px ] = i~ is completely acceptable,
though, unless there would be a risk of confusion.
• [A, A2 ] = 0, and more generally [A, f (A)] = 0 for any function f (A) of the
operator A [e.g., exp(A)].
• For any three operators A, B and C (this result is known as the Jacobi identity),
\[
[A, [B, C]] + [C, [A, B]] + [B, [C, A]] = 0. \tag{3.64}
\]
3.6 Eigenvalues and eigenvectors
Suppose that v is a non-zero vector such that
Av = λv (3.65)
for a certain scalar λ. One then says that λ is an eigenvalue and v an eigenvector of the
operator A. Or, more specifically, one says that v is an "eigenvector of A with eigenvalue λ" or an "eigenvector of A belonging to the eigenvalue λ". (Don't be confused by this
equation: the left-hand side represents the vector resulting from the action of A on
v while the right-hand side represents the vector obtained by multiplying v by the
number λ.)
Clearly, if the operator A and the vector v appearing in Eq. (3.65) are represented by a
matrix A and a column vector c, then one also has
Ac = λc. (3.66)
I.e., the column vector c is an eigenvector of the matrix A.
One often uses the term eigenfunction instead of eigenvector if the operator considered
acts on functions, as in the case, for example, of Eq. (3.29) of Section 3.2. It may be
worth stressing that the words “eigenfunction" and “wave function" do not mean the
same thing. An eigenfunction is what we just defined. A wave function is a function
representing a quantum state. An eigenfunction may or may not represent a certain
quantum state, and therefore may or may not also be a wave function. Similarly, a
wave function may or may not be an eigenfunction of some interesting operator. For
example, the time-dependent function
\[
\Psi(r, \theta, \phi, t) = \left[ \psi_{100}(r, \theta, \phi) \exp(-iE_1 t/\hbar) + \psi_{211}(r, \theta, \phi) \exp(-iE_2 t/\hbar) \right] / \sqrt{2} \tag{3.67}
\]
is a wave function representing a linear superposition of the 1s and 2pm=1 states of
atomic hydrogen. Although a valid wave function, solution of the time-dependent
Schrödinger equation, Ψ(r, θ, φ, t) is not an eigenfunction of the Hamiltonian, the an-
gular momentum operator L2 or the angular momentum operator Lz .
+ It should be noted that only vectors belonging to the space in which the
operator is defined are regarded as being eigenvectors of this operator
(eigenvectors in the sense normally given to this term in Mathemat-
ics). For example, the differential operator d/dx has infinitely many
eigenfunctions if regarded as an operator acting on any differentiable
function, since
\[
\frac{d}{dx} \exp(\lambda x) = \lambda \exp(\lambda x) \tag{3.68}
\]
for any real or complex λ. But d/dx has no eigenfunction if regarded
as an operator acting only on differentiable square-integrable functions
on (−∞, ∞), since all the solutions of the equation
\[
\frac{dy}{dx} = \lambda y(x) \tag{3.69}
\]
are of the form C exp(λx), where C is a constant, and none of these
solutions is square-integrable on (−∞, ∞). We will come back to this
issue in a later part of these notes.
Degenerate eigenvalues
It may happen that several linearly-independent vectors belong to a same eigenvalue.
This eigenvalue is said to be degenerate in that case.
In particular, one says that the eigenvalue λ is M -fold degenerate (or that its degree of
degeneracy is M) if there exist M linearly-independent vectors v1, v2, . . . , vM such that
\[
A v_i = \lambda v_i, \qquad i = 1, 2, \ldots, M, \tag{3.70}
\]
and if any other eigenvector belonging to that eigenvalue is necessarily a linear com-
bination of these M linearly-independent vectors.
For example, ignoring spin, the n = 2 eigenenergy of the non-relativistic Hamiltonian
of atomic hydrogen is 4-fold degenerate since (1) the 2s, 2pm=0 , 2pm=1 and 2pm=−1
wave functions all belong to this eigenenergy, (2) these four wave functions are mutu-
ally orthogonal and therefore linearly independent, and (3) it is not possible to find a
fifth n = 2 energy eigenfunction that would be orthogonal to all these four functions.
The words “linearly independent" are an important part of the definition above. If v is
an eigenvector of an operator A with eigenvalue λ, then any multiple of v is also an
eigenvector of A with that same eigenvalue since, for any scalar c,
\[
A(cv) = c\, Av = c \lambda v = \lambda (cv).
\]
Therefore it is always the case that infinitely many eigenvectors belong to a same eigen-
value. However, vectors multiple of each other are not linearly independent. (As we will
see, in Quantum Mechanics vectors multiple of each other represent the same quan-
tum state, while linearly independent vectors necessarily represent different states. An
eigenvalue is degenerate if it corresponds to several different quantum states.)
+ Suppose that the operator A appearing in Eq. (3.70) acts in a vector
space V . The M linearly-independent eigenvectors v1 , v2 ,. . . , vM then
span a M -dimensional subspace of V , called an invariant subspace,
whose elements are transformed by A into elements of the same sub-
space.
for any vector v and w. The symbol †, pronounced “dagger", is traditionally used in
Physics to denote the adjoint of an operator.
+ Saying, as above, that Eq. (3.72) must apply to any vector v and w of the
Hilbert space in which the operator A acts is unproblematic for finite-
dimensional Hilbert spaces. For infinite-dimensional spaces, however,
a mathematically sound definition of the adjoint requires a careful spec-
ification of the domains of the operators A and A† . One says that A† is
the adjoint of A if (v, Aw) = (w, A† v)∗ for any vector w in the domain
of A and any vector v in the domain of A† , the latter being defined as
the set of all the vectors v such that there exists a vector vA for which
(v, Aw) = (vA , w) for any vector w in the domain of A.
+ It can be shown that any operator has one and only one adjoint.
The adjoint has the following useful properties:

1. The adjoint of a sum of two operators is the sum of their adjoints:
\[
(A + B)^{\dagger} = A^{\dagger} + B^{\dagger}. \tag{3.73}
\]
2. If c is a scalar, the adjoint of cA is c∗A†:
\[
(cA)^{\dagger} = c^* A^{\dagger}. \tag{3.74}
\]
3. The adjoint of a product of two operators is the product of their adjoints in reverse order:
\[
(AB)^{\dagger} = B^{\dagger} A^{\dagger}. \tag{3.75}
\]
4. The adjoint of the adjoint of an operator is the operator itself:
\[
(A^{\dagger})^{\dagger} = A. \tag{3.76}
\]
+ The ladder operators a+ and a− used in the Term 1 course for calculat-
ing the energy levels of a linear harmonic oscillator are examples of an
operator and its adjoint: As we will see later on, a− = a†+ and a+ = a†− .
Of particular importance in Quantum Mechanics are operators that are identical to their
adjoint. We will explore their properties later in these notes.
4 Quantum states and the Dirac notation
Recall that Hilbert spaces are vector spaces and that their elements are called vectors.
At the beginning of the course we have seen an example of a system in which the vectors
describing the states of interest were square-integrable functions of r, θ and φ, and an
example where they were 2-component column vectors. States of quantum systems can
be represented in a more general way by what Dirac called ket vectors. These vectors
are usually denoted by the symbol | . . . i, with . . . standing for whatever label would
be used to identify the particular vector (e.g., |ψi, |ni, | ↑i, |x, +i, etc.).5 Ket vectors
are often called kets, in short, or state vectors. Depending on the system, they can in
turn be represented by a function of r, θ and φ, or by a 2-component column vector, or
by some other mathematical object appropriate for the problem at hand.
We stress that ket vectors themselves do not depend on specific coordinates. Only their
representations in terms of wave functions do. They do not depend on the choice of
basis vectors either, contrary to the column vectors representing them. Ket vectors are vectors in their own right, albeit vectors belonging to an abstract Hilbert space rather than to a Hilbert space of functions or column vectors.
Take, for example, the non-relativistic Hamiltonian of atomic hydrogen given by Eq. (1.1)
of Section 1.2 of these notes,
\[
H = -\frac{\hbar^2}{2\mu} \nabla^2 - \frac{e^2}{4\pi\epsilon_0} \frac{1}{r}, \tag{4.1}
\]
and the equation
Hψnlm (r, θ, φ) = En ψnlm (r, θ, φ) (4.2)
defining the energy levels and energy eigenfunctions of that Hamiltonian. This oper-
ator H acting on wave functions representing possible quantum states of a hydrogen
5 The zero ket vector is usually represented by the numeral 0 rather than by a |. . .⟩ symbol.
E.g., the combination |ψi + 0 represents the sum of the vector |ψi with the zero vector, not the
sum of the vector |ψi with the number 0, and |ψi + 0 = |ψi for any |ψi.
atom corresponds to an operator Ĥ acting on ket vectors representing the same quan-
tum states. Passing to this other formulation, Eq. (4.2) becomes
Ĥ |n, l, mi = En |n, l, mi, (4.3)
where Ĥ is a certain operator acting on the vectors |n, l, mi (operators are discussed
in Chapter 3). Neither |n, l, m⟩ nor Ĥ is given by some combination of the variables r, θ and φ. However, they can be represented by such combinations — i.e., by
the wave functions ψnlm (r, θ, φ) and the operator H of Eq. (4.1) — in the sense that
there is a one-to-one correspondence between the kets |n, l, mi and the wave functions
ψnlm (r, θ, φ). (In the mathematical terminology of Section 2.12, one would say that the
Hilbert space inhabited by these ket vectors is isomorphic to the one inhabited by these
wave functions.) In particular, if the ket vectors |ψa i and |ψb i are represented by the
wave functions ψa (r, θ, φ) and ψb (r, θ, φ), then their linear combination α|ψa i+β |ψb i
is represented by the wave function αψa (r, θ, φ) + βψb (r, θ, φ).
The same also applies to inner products: Inner products of ket vectors can be calcu-
lated as inner products of the wave functions or column vectors representing these ket
vectors. Suppose, for instance, that the ket vectors |ψa i and |ψb i describe certain states
of a linear harmonic oscillator and correspond to the wave functions ψa (x) and ψb (x).
The inner product of these two wave functions is the integral
\[
\int_{-\infty}^{\infty} \psi_a^*(x)\, \psi_b(x)\, dx.
\]
Because of the correspondence between the ket vectors |ψa i and |ψb i and the wave
functions ψa (x) and ψb (x), the inner product of the latter, (ψa , ψb ), is equal to the
inner product of the former, hψa |ψb i:
\[
\langle \psi_a | \psi_b \rangle = \int_{-\infty}^{\infty} \psi_a^*(x)\, \psi_b(x)\, dx. \tag{4.4}
\]
(The notation introduced in Chapter 2 for denoting the inner product of vectors is not
used for ket vectors: We denote the inner product of a ket |ψa i with a ket |ψb i by
the symbol hψa |ψb i, not by (|ψa i, |ψb i).) If |ψa i and |ψb i described quantum states
of atomic hydrogen instead, and corresponded to the wave functions ψa (r, θ, φ) and
ψb (r, θ, φ), we would have had, similarly,
\[
\langle \psi_a | \psi_b \rangle = (\psi_a, \psi_b) = \int_0^{\infty} dr\, r^2 \int_0^{\pi} d\theta\, \sin\theta \int_0^{2\pi} d\phi\; \psi_a^*(r, \theta, \phi)\, \psi_b(r, \theta, \phi). \tag{4.5}
\]
The same applies to column vectors: If the ket vectors |χ⟩ and |χ′⟩ describe spin states and are represented by the column vectors
\[
\chi = \begin{pmatrix} a \\ b \end{pmatrix} \quad \text{and} \quad \chi' = \begin{pmatrix} a' \\ b' \end{pmatrix}, \tag{4.6}
\]
then
\[
\langle \chi | \chi' \rangle = (\chi, \chi') = \begin{pmatrix} a^* & b^* \end{pmatrix} \begin{pmatrix} a' \\ b' \end{pmatrix}. \tag{4.7}
\]
The correspondence between ket vectors and wave functions or column vectors goes
both ways. The discussion above is phrased in terms of ket vectors being represented
by wave functions or column vectors, but one can say equally well that wave functions
or column vectors are represented by the corresponding ket vectors. (Seen in that way,
symbols such as |ψi and |χi may be thought of as being merely a simplified notation
for the corresponding wave functions or column vectors. However, it is useful to keep
in mind that these symbols actually refer to vectors in their own right.)
+ The rule stated at the beginning of this section relates the states of a quantum system to wave functions or some other vectors. But what does one mean by "states" in this context?
The meaning may seem almost obvious at first sight. For example, it
is an extremely well established experimental fact that an atom may
behave quite differently when exposed to a laser beam than when left
alone, although it is still the same atom (same electrons, same nucleus).
It is natural to say that the atom is in a different state when exposed to
the laser beam than when left alone. The rule says that each of these
states can be described by a wave function or some other vector, and
the same for any other state the atom could be in.
Digging a little deeper, however, one hits a difficulty in defining precisely what a "state" is in this context. The issue is perhaps best explained by a simple example. Take a classical system consisting
of a mass hanging from a spring, and suppose that this mass moves
only in the vertical direction (i.e., that it does not swing laterally like a
pendulum). One could say that the position and the momentum of this
mass define its state of motion: to know them is to know the ampli-
tude, frequency and phase of its oscillation, and in fact anything that
one might want to know about how the mass moves. The trajectory of
this mass can be represented by the function z(t) describing how its z-
coordinate varies as a function of time. Someone knowing this function
could predict, with complete accuracy, where the mass is at any given
time. Position measurements on several identical oscillators in exactly
the same state of motion would return exactly the same results if done
at exactly the same time (within experimental error, of course, but this
limitation is not fundamental). The same can be said for a system of
68
many particles. In Classical Mechanics, in general, a state is defined by
the positions and momenta of all its constituents.
The situation is quite different in the case of a quantum oscillator. A
measurement of the position of the mass would only return a random
result in this case (random within the distribution of probability de-
termined by the wave function). If position measurements were made
simultaneously on several identical oscillators, all prepared in exactly the same way, then a different result would normally be found for each of these oscillators (even if the measurements were so accurate and precise that the experimental error was negligible). In these cir-
cumstances, it would make little sense to say the mass follows a cer-
tain trajectory. Whether its position and momentum could be taken
as defining its state is altogether questionable, too, since its position
and momentum cannot be both assigned precise values at the same
time, neither experimentally nor theoretically (recall the uncertainty
relation).
Hence, what the rule stated at the beginning really means becomes
rather unclear if one goes beyond the intuitive notion of state men-
tioned above. In fact, this issue touches on the philosophy and interpre-
tation of Quantum Mechanics and is still a matter of controversy.
4.2 Bra vectors

• To each ket vector |ψ⟩ there must be a bra vector ⟨ψ|, and to each bra vector
hψ| there must be a ket vector |ψi (in other words, bra and ket vectors are in
one-to-one correspondence).
• If a ket vector |ψ⟩ is represented by a wave function ψ(x) or by a column vector with components a and b, the corresponding bra vector ⟨ψ| is represented by the complex-conjugate wave function or by the conjugate row vector:
\[
\langle \psi | \longleftrightarrow \psi^*(x), \quad \begin{pmatrix} a^* & b^* \end{pmatrix}.
\]
• If the ket vectors |ψi and |φi correspond to the bra vectors hψ| and hφ|, then the
ket vector |ψi + |φi must correspond to the bra vector hψ| + hφ|.
• If c is a complex number and the ket vector |ψi corresponds to the bra vector
hψ|, then the ket vector c|ψi corresponds to the bra vector c∗ hψ|. (Note that the
factor multiplying hψ| is the complex conjugate of c.)
+ A bra vector hψ| defines a mapping from ket vectors to complex num-
bers such that any ket vector |φi is mapped to the complex number
hψ|φi and that any linear combination c1 |φ1 i + c2 |φ2 i is mapped to
the complex number c1 hψ|φ1 i + c2 hψ|φ2 i. Such mappings are called
linear functionals. Mathematically, bra vectors are best seen as being
linear functionals on the vector space of ket vectors. The set of all lin-
ear functionals on a vector space V is also a vector space, called the
dual of V . The one-to-one correspondence between ket and bra vectors
mentioned above is guaranteed by a theorem of functional analysis, the
Riesz Representation Theorem.
4.3 Operators and the Dirac notation
As we have just seen, quantum states can be represented not only by wave functions or
column vectors of complex numbers, as befits the problem considered, but also by “ket
vectors" belonging to an abstract Hilbert space. We have also seen that each operator
acting on wave functions or column vectors has a counterpart in term of an operator
acting on ket vectors. To take the same example as in Section 4.1, the Hamiltonian
operator
\[
H = -\frac{\hbar^2}{2\mu} \nabla^2 - \frac{e^2}{4\pi\epsilon_0} \frac{1}{r}, \tag{4.8}
\]
which acts on wave functions depending on the space coordinates r, θ and φ corre-
sponds to an operator Ĥ acting on ket vectors. In particular, the eigenvalue equation
\[
H \psi_{nlm}(r, \theta, \phi) = E_n \psi_{nlm}(r, \theta, \phi) \tag{4.9}
\]
becomes
\[
\hat{H} |n, l, m\rangle = E_n |n, l, m\rangle \tag{4.10}
\]
when expressed in terms of ket vectors rather than wave functions. One can say that the
wave functions ψnlm (r, θ, φ) represent the ket vectors |n, l, mi and that the differential
operator H represents the operator Ĥ. More generally, if a ket vector |ψi is represented
by, e.g., a wave function ψ(r, θ, φ) and an operator  acting on |ψi is represented by an
operator A acting on ψ(r, θ, φ), then the ket vector Â|ψi is represented by the function
Aψ(r, θ, φ). For example, Eq. (3.71) defining the adjoint of an operator would read
\[
\langle \phi | \hat{A} | \psi \rangle = \langle \psi | \hat{A}^{\dagger} | \phi \rangle^* \tag{4.11}
\]
if written in terms of ket vectors |φ⟩ and |ψ⟩ rather than in terms of generic vectors v and w.
Given this correspondence, the matrix elements of an operator acting on ket vectors
can be calculated as the matrix elements of the corresponding operator acting on wave
functions or column vectors. That is to say, if the two orthonormal vectors ui and uj
(wave functions or column vectors) represent the orthonormal ket vectors |ui i and |uj i
and the operator A acting on ui and uj represents the operator  acting on |ui i and
|uj i, then the inner product of |ui i with Â|uj i — i.e., hui | Â|uj i — is nothing else than
the inner product of ui with Auj:
\[
\langle u_i | \hat{A} | u_j \rangle = (u_i, A u_j). \tag{4.12}
\]
For example, when written in terms of ket vectors, Eq. (3.20) reads
\[
A = \begin{pmatrix}
\langle u_1|\hat{A}|u_1\rangle & \langle u_1|\hat{A}|u_2\rangle & \cdots & \langle u_1|\hat{A}|u_N\rangle \\
\langle u_2|\hat{A}|u_1\rangle & \langle u_2|\hat{A}|u_2\rangle & \cdots & \langle u_2|\hat{A}|u_N\rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle u_N|\hat{A}|u_1\rangle & \langle u_N|\hat{A}|u_2\rangle & \cdots & \langle u_N|\hat{A}|u_N\rangle
\end{pmatrix}. \tag{4.13}
\]
Thus calculations involving operators acting on ket vectors can generally be reduced to calculations involving operators acting on wave functions or column vectors, should this be necessary for obtaining the result.
The reduction is often not necessary, though. Consider, for example, the following
problem of quantum optics: calculate the matrix element hα|â+ |αi, where α is a com-
plex number, the ket vector |αi is a normalized eigenvector of the operator â− with
eigenvalue α (â− |αi = α|αi and hα|αi = 1), and â+ and â− are ladder operators
(â+ = â†− ). This matrix element can be calculated immediately without passing to
a representation in terms of wave functions or column vectors: In view of Eq. (4.11),
of the fact that â†+ = (â†− )† = â− and of the assumptions that â− |αi = α|αi and
⟨α|α⟩ = 1,
\[
\langle \alpha | \hat{a}_+ | \alpha \rangle = \langle \alpha | \hat{a}_+^{\dagger} | \alpha \rangle^* = \langle \alpha | \hat{a}_- | \alpha \rangle^* = (\alpha \langle \alpha | \alpha \rangle)^* = \alpha^*.
\]
Two general rules of the Dirac notation are worth recalling here:
1. The inner product hφ|ψi is in general a complex number, and hφ|ψi = hψ|φi∗ .
2. The bra vector conjugate to the ket vector Â|ψ⟩ can be written as ⟨ψ|Â†. (Recall that any ket vector |ψ⟩ has a conjugate bra vector ⟨ψ|, see Section 4.2 above.)
+ That hφ|ψi = hψ|φi∗ simply follows from the axioms of the inner product.
To understand why the bra conjugate to Â|ψ⟩ can be written as ⟨ψ|Â†,
consider the ket vector |Aψi = Â|ψi and the inner product hφ|Aψi,
where |φ⟩ is arbitrary. Note that, by Eq. (4.11), ⟨φ|Aψ⟩ = ⟨φ|Â|ψ⟩ = ⟨ψ|Â†|φ⟩∗, so that ⟨Aψ|φ⟩ = ⟨φ|Aψ⟩∗ = ⟨ψ|Â†|φ⟩ for any |φ⟩. The bra ⟨Aψ| conjugate to Â|ψ⟩ therefore acts on any ket in the same way as ⟨ψ|Â† does, which is what we wanted to show.
+ Take, for example, hα|â†− . (hα| is the bra conjugate to the ket |αi and as
above â− |αi = α|αi.) Since α∗ hα| is the bra conjugate to the ket α|αi,
we see that hα|â†− = α∗ hα|. In other words, hα| is a left eigenvector of
â†− , in the same way as |αi is a right eigenvector of â− . The words left and
right can be taken as defining the direction in which the operator acts:
in â− |αi = α|αi, the operator â− acts “on the right" on |αi, whereas in
hα|â†− = α∗ hα| the operator â†− acts “on the left" on hα|.
The above calculation of the matrix element hα|â+ |αi could thus be done
quickly by letting â+ act “on the left" on hα|: since â+ = â†− ,
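+ This matrix element can also be checked numerically by truncating the Fock basis (Python/numpy; this sketch is an addition to these notes; the dimension N and the value of α are arbitrary, and the truncation introduces a tiny error):

    import numpy as np
    from math import factorial, sqrt

    N = 40                     # dimension of the truncated space
    # a_- |n> = sqrt(n) |n-1>, so the matrix of a_- has sqrt(n) above the diagonal:
    a_minus = np.diag([sqrt(n) for n in range(1, N)], k=1)
    a_plus = a_minus.conj().T  # a_+ is the adjoint of a_-

    alpha = 0.7 + 0.4j
    # Normalized coherent state |alpha> = e^{-|alpha|^2/2} sum_n alpha^n/sqrt(n!) |n>
    ket = np.array([alpha**n / sqrt(factorial(n)) for n in range(N)], dtype=complex)
    ket *= np.exp(-abs(alpha)**2 / 2)

    print(np.allclose(a_minus @ ket, alpha * ket))   # a_-|alpha> = alpha|alpha>
    print(ket.conj() @ (a_plus @ ket))               # approx alpha* = (0.7-0.4j)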
We will use the Dirac notation from now on, for simplicity, except when talking specifically about operators acting on wave functions or column vectors. However, all the results stated normally apply to any Hilbert space, not just spaces of ket vectors.
quantum system, then any linear combination of these two vectors also represents a
physically possible state of that system.
Take, for example, an atom of hydrogen. This atom can be in the ground state, whose
time-dependent wave function is ψ100 (r, θ, φ) exp(−iE1 t/~). It can also be in the 2s
state, whose time-dependent wave function is ψ200 (r, θ, φ) exp(−iE2 t/~). Therefore
an atom of hydrogen can also (at least in principle) be in a linear superposition of these
two states, e.g., in a state whose wave function is given by the equation
\[
\Psi(r, \theta, \phi, t) = \frac{1}{\sqrt{2}} \psi_{100}(r, \theta, \phi) \exp(-iE_1 t/\hbar) + \frac{1}{\sqrt{2}} \psi_{200}(r, \theta, \phi) \exp(-iE_2 t/\hbar). \tag{4.20}
\]
An atom in this state is, in a sense, both in the 1s state and in the 2s state. If checked,
this atom could be found to be in the ground state or, with the same probability, to
be in the 2s state. Note that the state described by Ψ(r, θ, φ, t) is neither the 1s state
nor the 2s state. Some of its physical properties are quite different; for instance, it is
not difficult to see that the probability density |Ψ(r, θ, φ, t)|2 varies in time whereas
|ψ100 (r, θ, φ) exp(−iE1 t/~)|2 and |ψ200 (r, θ, φ) exp(−iE2 t/~)|2 don’t.
As a second example, let us consider the spin states of the silver atoms in the Stern-Gerlach experiment. As mentioned in Section 1.2 of these notes, these spin states can
be represented by 2-component column vectors, e.g.,
\[
\chi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \chi_- = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{4.21}
\]
Later in the course, we will see that conventionally these two column vectors repre-
sent a state of spin up (χ+ ) or spin down (χ− ) in the z-direction. We can make linear
combinations of χ+ and χ− to form new column vectors, for instance
\[
\chi_a = \frac{1}{\sqrt{2}} \chi_+ + \frac{1}{\sqrt{2}} \chi_- = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}. \tag{4.22}
\]
Since χ+ and χ− represent possible spin states of an atom of silver, by the Principle of
Superposition χa also represents a possible spin state of that system. In fact, χa repre-
sents a state of “spin up" in the x-direction (not the z-direction, this will be explained
later in the course). Moreover, introducing vectors representing states of “spin up" and
“spin down" in the y-direction, respectively
\[
\chi_{y+} = \begin{pmatrix} 1/\sqrt{2} \\ i/\sqrt{2} \end{pmatrix} \quad \text{and} \quad \chi_{y-} = \begin{pmatrix} i/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}, \tag{4.23}
\]
we can write χa not only as a linear combination of the vectors χ+ and χ− but also as
a linear combination of the vectors χy+ and χy− : as can be checked easily,
\[
\chi_a = \frac{1}{2}(1 - i)\, \chi_{y+} + \frac{1}{2}(1 - i)\, \chi_{y-}. \tag{4.24}
\]
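+ The decomposition (4.24) is indeed easy to check, e.g., numerically (Python/numpy; this sketch is an addition to these notes and is nothing more than the arithmetic):

    import numpy as np

    s = 1 / np.sqrt(2)
    chi_plus = np.array([1.0, 0.0])
    chi_minus = np.array([0.0, 1.0])
    chi_yplus = np.array([s, 1j * s])
    chi_yminus = np.array([1j * s, s])

    chi_a = s * chi_plus + s * chi_minus                        # Eq. (4.22)
    rhs = 0.5 * (1 - 1j) * chi_yplus + 0.5 * (1 - 1j) * chi_yminus
    print(np.allclose(chi_a, rhs))                              # Eq. (4.24): True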
An atom of silver in the spin state described by χa can thus be understood as being in
a state of “spin up" in the x-direction, or as being in both the state of spin up and the
state of spin down in the z-direction, or also in both the state of “spin up" and the state
of “spin down" in the y-direction. These three descriptions are equivalent, and each
one is as good as the other two, even though they may seem contradictory.
5 Operators (II)

5.1 Hermitian operators

An operator Â is said to be Hermitian if
\[
\langle \phi | \hat{A} | \psi \rangle = \langle \psi | \hat{A} | \phi \rangle^* \tag{5.1}
\]
for any vectors |φ⟩ and |ψ⟩ this operator may act on. Many of the important operators of Quantum Mechanics are Hermitian, although not quite all of them.
 = † . (5.2)
Eq. (5.1) refers only to the action of  on the vectors in the domain of Â,
irrespective of the domain of † . Eq. (5.2) says that  not only satisfies
Eq. (5.1) but also that  and † have the same domain. An operator
satisfying Eq. (5.2) is said to be self-adjoint. An operator which is self-
adjoint is also Hermitian but the converse is not true; some operators
are Hermitian but not self-adjoint (an example is given below). The dis-
tinction between Hermiticity and self-adjointness is of key importance
in Mathematics. Physical quantities such as the energy, the linear mo-
mentum and the orbital angular momentum correspond to operators
that are self-adjoint, not merely Hermitian. Why this needs to be so
will be addressed later in the course.
Let us take an example. The z-component of the angular momentum of
a particle (specifically, the “orbital angular momentum" of that particle)
can be shown to correspond to an operator
\[
L_z = -i\hbar \frac{\partial}{\partial \phi}, \tag{5.3}
\]
where φ is the azimuth angle of the particle in spherical polar co-
ordinates. Suppose that we take the domain of Lz to be the space of all
differentiable square-integrable functions y(φ) defined on [0, 2π] and
such that y(0) = y(2π) = 0. Lz is Hermitian in that space, since, if
f (φ) and g(φ) are two such functions,
\[
\int_0^{2\pi} f^*(\phi)\, L_z\, g(\phi)\, d\phi = -i\hbar \int_0^{2\pi} f^*(\phi) \frac{\partial g}{\partial \phi}\, d\phi
= -i\hbar\, f^*(\phi) g(\phi) \Big|_0^{2\pi} + i\hbar \int_0^{2\pi} \frac{\partial f^*}{\partial \phi}\, g(\phi)\, d\phi
= \left[ -i\hbar \int_0^{2\pi} g^*(\phi) \frac{\partial f}{\partial \phi}\, d\phi \right]^*, \tag{5.4}
\]
where the boundary term vanishes since g(0) = g(2π) = 0, and therefore
\[
\int_0^{2\pi} f^*(\phi)\, L_z\, g(\phi)\, d\phi = \left[ \int_0^{2\pi} g^*(\phi)\, L_z\, f(\phi)\, d\phi \right]^*. \tag{5.5}
\]
However, Lz is not self-adjoint in that space: the adjoint of Lz is defined on the set of all the functions f(φ) for which the integration by parts above goes through, and this set includes functions f(φ) which are finite but non-zero at φ = 0 or at φ = 2π and therefore do not belong to the domain of Lz (the boundary term vanishes for any function f(φ) finite at φ = 0 and φ = 2π, as long as g(0) = g(2π) = 0). The domain of L†z is thus larger than the domain of Lz.
By contrast, Lz is not only Hermitian but also self-adjoint if we take its
domain to be the space of all differentiable square-integrable functions
y(φ) defined on [0, 2π] and such that y(0) = y(2π) (i.e., we no longer require that y(φ) vanishes at φ = 0 and φ = 2π). Indeed, in
order for the boundary term
\[
f^*(\phi)\, g(\phi) \Big|_0^{2\pi} = f^*(2\pi) g(2\pi) - f^*(0) g(0)
\]
to be zero for any function g(φ) such that g(0) = g(2π), it is necessary
that f (0) = f (2π). The domain of the adjoint of Lz coincides with the
domain of Lz in this case.
That the domain of Lz matters is illustrated by the fact that this op-
erator has no eigenfunctions if we require that y(0) = y(2π) = 0,
and that it has infinitely many eigenfunctions if we only require that
y(0) = y(2π). (Explanation: any eigenfunction of Lz must be of the
form exp[i(λ/~)φ]. There is no value of λ for which an exponential
function of this form vanishes both at φ = 0 and at φ = 2π; however,
exp[i(λ/~)0] = exp[i(λ/~)2π] for λ = 0, ±~, ±2~,. . . )
Examples
• One of the systems you have studied in your level 1 Quantum Mechanics course
was that formed by a particle of mass m submitted to no force between x = −a
and x = a but confined to that interval by impenetrable potential barriers at
x = ±a. The Hamiltonian for this problem can be taken to be
\[
H = -\frac{\hbar^2}{2m} \frac{d^2}{dx^2}, \tag{5.9}
\]
considered as an operator acting on twice-differentiable, square-integrable func-
tions of x which vanish at x = −a and x = a. This operator is Hermitian since
for any such functions,
\[
\begin{aligned}
\int_{-a}^{a} \phi^*(x)\, H\, \psi(x)\, dx
&= -\frac{\hbar^2}{2m} \int_{-a}^{a} \phi^*(x) \frac{d^2\psi}{dx^2}\, dx \\
&= -\frac{\hbar^2}{2m} \left[ \phi^*(x) \frac{d\psi}{dx} \Big|_{-a}^{a} - \int_{-a}^{a} \frac{d\phi^*}{dx} \frac{d\psi}{dx}\, dx \right] \\
&= -\frac{\hbar^2}{2m} \left[ \phi^*(x) \frac{d\psi}{dx} \Big|_{-a}^{a} - \frac{d\phi^*}{dx} \psi(x) \Big|_{-a}^{a} + \int_{-a}^{a} \frac{d^2\phi^*}{dx^2} \psi(x)\, dx \right] \\
&= -\frac{\hbar^2}{2m} \int_{-a}^{a} \frac{d^2\phi^*}{dx^2} \psi(x)\, dx \\
&= \left[ -\frac{\hbar^2}{2m} \int_{-a}^{a} \psi^*(x) \frac{d^2\phi}{dx^2}\, dx \right]^* = \left[ \int_{-a}^{a} \psi^*(x)\, H\, \phi(x)\, dx \right]^*. \tag{5.10}
\end{aligned}
\]
(The boundary terms vanish since φ and ψ vanish at x = ±a.)
• The ladder operators a+ and a− are not Hermitian. (The proof of this assertion is left as an exercise.)
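+ The Hermiticity expressed by Eq. (5.10) can also be illustrated numerically by discretizing the Hamiltonian (5.9) on a grid, with the wave functions forced to vanish at x = ±a. A minimal sketch (Python/numpy; an addition to these notes: the grid, the test functions and the finite-difference approximation are illustrative choices, in units where ħ²/2m = 1):

    import numpy as np

    a, M = 1.0, 200
    x = np.linspace(-a, a, M + 2)[1:-1]   # interior grid points; psi(+-a) = 0
    h = x[1] - x[0]
    # Finite-difference matrix for -d^2/dx^2 with zero boundary conditions:
    H = (2 * np.eye(M) - np.eye(M, k=1) - np.eye(M, k=-1)) / h**2

    print(np.allclose(H, H.conj().T))     # the matrix is Hermitian: True

    phi = np.exp(-5 * x**2)               # vanishes (numerically) at x = +-a
    psi = x * np.exp(-4 * x**2)
    lhs = h * (phi.conj() @ (H @ psi))    # discrete version of the left of (5.10)
    rhs = (h * (psi.conj() @ (H @ phi))).conj()
    print(np.isclose(lhs, rhs))           # True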
The eigenvalues of a Hermitian operator, if it has any, are real, and any two of its eigenvectors belonging to different eigenvalues are orthogonal.

Proof: Suppose that Â is a Hermitian operator, and also that Â|ψ1⟩ = λ1|ψ1⟩ and Â|ψ2⟩ = λ2|ψ2⟩ with λ1 ≠ λ2. Hermiticity applied to |ψ1⟩ alone gives ⟨ψ1|Â|ψ1⟩ = ⟨ψ1|Â|ψ1⟩∗, i.e., λ1⟨ψ1|ψ1⟩ = λ1∗⟨ψ1|ψ1⟩, so that λ1 is real (and likewise λ2). Moreover, ⟨ψ2|Â|ψ1⟩ = ⟨ψ1|Â|ψ2⟩∗ implies λ1⟨ψ2|ψ1⟩ = λ2∗⟨ψ1|ψ2⟩∗ = λ2⟨ψ2|ψ1⟩. Thus (λ1 − λ2)⟨ψ2|ψ1⟩ = 0, and since λ1 ≠ λ2, ⟨ψ2|ψ1⟩ = 0.
It is worth noting that not all Hermitian operators have eigenvalues and eigenvectors
(see Section 5.4) and that non-Hermitian operators may also have real eigenvalues or
orthogonal eigenvectors. However, if an operator is Hermitian, one can be sure that its
eigenvalues and eigenvectors (if any) have these two important properties.
Hermitian matrices
Matrices representing Hermitian operators are Hermitian. Recall that a Hermitian ma-
trix is one equal to its conjugate transpose — e.g., for 2 × 2 matrices, a matrix such that
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a^* & c^* \\ b^* & d^* \end{pmatrix}. \tag{5.13}
\]
(The proof of this assertion is left as a short exercise.)
5.2 Projectors and the completeness relation

Consider first ordinary vectors in three-dimensional space. Any such vector v can be written as the sum of a vector v′ parallel to the x-direction and a vector v′⊥ orthogonal to that direction (thus parallel to the yz-plane). In terms of Cartesian components, if
\[
\mathbf{v} = v_x \hat{\mathbf{x}} + v_y \hat{\mathbf{y}} + v_z \hat{\mathbf{z}} \tag{5.14}
\]
with x̂, ŷ and ẑ unit vectors in the respective directions, then v′ is necessarily vx x̂ and v′⊥ is necessarily vy ŷ + vz ẑ. Clearly, v′ is the projection of v onto the x-axis.
More generally, if we write each vector v of a Hilbert space V as the sum of a vector v′ belonging to a subspace V′ and a vector v′⊥ belonging to the orthogonal subspace V′⊥, mapping v to v′ amounts to projecting v onto the subspace V′. Operators effecting this transformation are called projection operators, or projectors. In the example above, projecting v onto the x-axis amounts to transforming this vector into the vector x̂ multiplied by the inner product of x̂ and v (since vx = x̂ · v).
Let us consider, for example, a ket vector of unit norm |φi and the operator P̂φ whose
action on any ket vector |ψi is to project it onto the 1-dimensional subspace spanned
by |φi. Thus P̂φ transforms |ψi into the ket vector |φi multiplied by the inner product
of |φi and |ψi:
P̂φ |ψi = hφ|ψi|φi. (5.15)
Since ⟨φ|ψ⟩ is just a scalar (a number), and scalars commute with vectors, this equation can also be written in the form
\[
\hat{P}_{\phi} = |\phi\rangle \langle \phi |, \tag{5.16}
\]
in the sense that the action of |φ⟩⟨φ| on a ket vector |ψ⟩ is to transform it into |φ⟩⟨φ|ψ⟩.
The operator P̂φ so defined is a projector.
It is easy to show that P̂φ is idempotent:
\[
\hat{P}_{\phi}^2 = |\phi\rangle \langle \phi | \phi \rangle \langle \phi | = |\phi\rangle \langle \phi | = \hat{P}_{\phi},
\]
where we have used our assumption that ⟨φ|φ⟩ = 1. It is also possible to show that P̂φ
is Hermitian.
Proof: P̂φ is Hermitian since ⟨ψa|P̂φ|ψb⟩ = ⟨ψb|P̂φ|ψa⟩∗ for any kets |ψa⟩, |ψb⟩:
\[
\langle \psi_a | \hat{P}_{\phi} | \psi_b \rangle = \langle \psi_a | \phi \rangle \langle \phi | \psi_b \rangle = \langle \phi | \psi_a \rangle^* \langle \psi_b | \phi \rangle^* = \left[ \langle \psi_b | \phi \rangle \langle \phi | \psi_a \rangle \right]^* = \langle \psi_b | \hat{P}_{\phi} | \psi_a \rangle^*.
\]
+ Suppose that we would use wave functions instead of ket vectors. E.g., |ψi
and |φi would correspond, respectively, to a wave function ψ(r) and a wave
function φ(r). In this case, the operator P̂φ would correspond to an operator
Pφ defined by the following equation:
\[
P_{\phi}\, \psi(\mathbf{r}) = \phi(\mathbf{r}) \int d^3 r'\; \phi^*(\mathbf{r}')\, \psi(\mathbf{r}'). \tag{5.20}
\]
Consider now an orthonormal basis {|φ1⟩, |φ2⟩, . . . , |φN⟩} of an N-dimensional Hilbert space. Any vector |ψ⟩ of this space can be expanded on this basis as
\[
|\psi\rangle = \sum_{n=1}^{N} \langle \phi_n | \psi \rangle\, |\phi_n\rangle,
\]
or, writing the scalar ⟨φn|ψ⟩ on the right rather than on the left of the vector |φn⟩ (this is just a change of notation),
\[
|\psi\rangle = \sum_{n=1}^{N} |\phi_n\rangle \langle \phi_n | \psi \rangle. \tag{5.24}
\]
Hence
\[
|\psi\rangle = \sum_{n=1}^{N} \hat{P}_n |\psi\rangle, \tag{5.25}
\]
where P̂n = |φn ihφn |. Since Eq. (5.25) must be true for any vector |ψi, the sum of the
projectors P̂n over all the vectors in the basis can only be the identity operator:
\[
\sum_{n=1}^{N} |\phi_n\rangle \langle \phi_n | = \hat{I}. \tag{5.26}
\]
This last equation is an important result called the completeness relation, or closure
relation. (We stress that the |φn i’s must be orthonormal and form a complete set for it
to hold.)
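+ The completeness relation is easily checked numerically for any orthonormal basis of column vectors (Python/numpy; an illustration added here — the basis below is an arbitrary example, obtained from the columns of a unitary matrix):

    import numpy as np

    rng = np.random.default_rng(1)
    N = 5
    Q, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
    basis = [Q[:, n] for n in range(N)]    # orthonormal kets |phi_n>

    # P_n = |phi_n><phi_n| is Hermitian and idempotent, and the P_n's sum to I:
    P = [np.outer(phi, phi.conj()) for phi in basis]
    print(np.allclose(P[0] @ P[0], P[0]), np.allclose(P[0], P[0].conj().T))
    print(np.allclose(sum(P), np.eye(N)))  # Eq. (5.26): True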
5.3 Bases of eigenvectors: I. Finite-dimensional spaces

+ Suppose that Â is a Hermitian operator acting in an N-dimensional Hilbert space H, and denote by HA the subspace of H spanned by the eigenvectors of Â and by NA the dimension of HA. If NA < N, one can choose N − NA linearly-independent vectors of H orthogonal to all these NA vectors (recall that linearly independent
vectors can always be made orthogonal to each other using the Gram-
Schmidt method). These N − NA vectors span H⊥ . Let us denote them
by |φn i, n = NA + 1, . . . , N . Joining them to the NA basis vectors
spanning HA gives a set of N basis vectors spanning the whole of H,
{|φn i}, n = 1, . . . , N .
By definition, any vector |γi in HA can be written as a linear combina-
tion of eigenvectors of Â:
\[
|\gamma\rangle = \sum_i c_i |\psi_i\rangle, \tag{5.27}
\]
where each vector |ψi⟩ is such that Â|ψi⟩ = λi|ψi⟩ for some number λi. But then
\[
\hat{A} |\gamma\rangle = \sum_i c_i \hat{A} |\psi_i\rangle = \sum_i c_i \lambda_i |\psi_i\rangle. \tag{5.28}
\]
The operator  thus transforms vectors belonging to HA into vectors
belonging to HA .
Next we show that  also transforms vectors belonging to H⊥ into
vectors belonging to H⊥ : Suppose, as above, that |γi is in HA . Then,
as just shown, |γA i = Â|γi is also in HA . Now, suppose that |βi is in
H⊥ and consider |βA i = Â|βi. Since  is Hermitian,
\[
\langle \gamma | \beta_A \rangle = \langle \gamma | \hat{A} | \beta \rangle = \langle \beta | \hat{A} | \gamma \rangle^* = \langle \beta | \gamma_A \rangle^*. \tag{5.29}
\]
However, since |γA⟩ is in HA and |β⟩ is in H⊥, ⟨β|γA⟩∗ = 0 and therefore ⟨γ|βA⟩ = 0. Since this is true for any vector |γ⟩ belonging to HA,
Â|βi must be in H⊥ if |βi is in H⊥ . Thus  transforms vectors in H⊥
into vectors in H⊥ .
The upshot is that ⟨φi|Â|φj⟩ = 0 if i ≤ NA and j ≥ NA + 1, or if i ≥ NA + 1 and j ≤ NA. The matrix A representing the operator Â in the {|φn⟩} basis is therefore block-diagonal:
\[
A = \begin{pmatrix}
\times & \times & \cdots & \times & \times & 0 & \cdots & 0 \\
\times & \times & \cdots & \times & \times & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & & \vdots \\
\times & \times & \cdots & \times & \times & 0 & \cdots & 0 \\
\times & \times & \cdots & \times & \times & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 & 0 & \times & \cdots & \times \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & 0 & \times & \cdots & \times
\end{pmatrix}, \tag{5.30}
\]
where the crosses indicate which elements of A may be non-zero. The
upper and lower diagonal blocks of A are, respectively, NA × NA and
(N − NA ) × (N − NA ) square matrices. Any eigenvector of A corre-
sponds to an eigenvector of Â; indeed, if the numbers cn , n = 1, . . . , N
are the components of a vector c such that Ac = λc, then Â|ψi = λ|ψi
for
\[
|\psi\rangle = \sum_{n=1}^{N} c_n |\phi_n\rangle. \tag{5.31}
\]
sponding to different eigenvalues are always orthogonal). As any vec-
tor of the Hilbert space in which this operator acts can be written as
a linear combination of its eigenvectors, this set of orthonormal eigen-
vectors is a basis for that space.
(We replaced hψi |ψj i by δij in the last step since the ket vectors |ψj i are assumed to
be orthonormal.) Therefore, the off-diagonal elements of the matrix A representing Â
in that basis are all zero, and the diagonal elements are the eigenvalues λj :
\[
A = \begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_N
\end{pmatrix}. \tag{5.33}
\]
This last equation expresses the operator  in terms of its eigenvalues and of the projec-
tors |ψn ihψn |. The right-hand side of this equation is called the spectral decomposition
of Â.
+ We have seen in Section 3.3 that the exponential of an operator can be
defined by a series of powers of that operator. More general functions
can also be defined in terms of the spectral decomposition of that op-
erator. I.e., given Eq. (5.34), a function f (Â) of the operator  can be
taken to be the operator defined by the equation
\[
f(\hat{A}) = \sum_{n=1}^{N} f(\lambda_n) |\psi_n\rangle \langle \psi_n |. \tag{5.35}
\]
For example,
\[
\frac{1}{\hat{A} - \lambda \hat{I}} = \sum_{n=1}^{N} \frac{1}{\lambda_n - \lambda} |\psi_n\rangle \langle \psi_n |. \tag{5.36}
\]
(It is clear from this last equation that the operator  − λIˆ is not in-
vertible when λ is an eigenvalue of Â.)
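+ The spectral decomposition, and functions of an operator defined through it, are easily explored numerically (Python/numpy; an illustration added to these notes, with an arbitrary Hermitian matrix as example):

    import numpy as np

    rng = np.random.default_rng(2)
    N = 4
    M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    A = (M + M.conj().T) / 2              # an N x N Hermitian matrix
    lam, U = np.linalg.eigh(A)            # eigenvalues, orthonormal eigenvectors

    # Eq. (5.34): A = sum_n lam_n |psi_n><psi_n|
    A_rebuilt = sum(lam[n] * np.outer(U[:, n], U[:, n].conj()) for n in range(N))
    print(np.allclose(A, A_rebuilt))      # True

    # Eq. (5.35) with f = exp: exp(A) has the same eigenvectors, eigenvalues exp(lam):
    expA = sum(np.exp(lam[n]) * np.outer(U[:, n], U[:, n].conj()) for n in range(N))
    print(np.allclose(np.linalg.eigvalsh(expA), np.exp(lam)))  # True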
1. If  and B̂ commute and |ψn i is an eigenvector of Â, then the vector B̂ |ψn i is
also an eigenvector of  corresponding to the same eigenvalue.
One can show that H(t) commutes with the angular momentum operator Lz .
Thus, if Ψ(r, θ, φ, t = t0 ) is an eigenfunction of Lz , then Ψ(r, θ, φ, t = t0 + dt)
is also an eigenfunction of Lz corresponding to the same eigenvalue. The upshot
is that the wave function Ψ(r, θ, φ, t) then remains an eigenfunction of Lz at all
times, even though it may change in a very complicated way under the effect of
this time-varying electric field.
+ Suppose that  and B̂ commute and that Â|ψn i = λn |ψn i. The eigen-
value λn may or may not be degenerate.
It follows from the fact that |ψn i and B̂ |ψn i are both eigenvectors of Â
corresponding to a same eigenvalue that this eigenvalue is degenerate
if these two vectors are linearly independent. In fact, most eigenvalue
degeneracies arise from the fact that the operator of interest commutes
with another operator.
On the other hand, if λn is not degenerate, then |ψn i is an eigenvector
of B̂ as well as of Â. Indeed, |ψn i and B̂ |ψn i cannot be linearly inde-
pendent if λn is not degenerate; instead, these two vectors must differ
at most by a scalar factor. Hence, if λn is not degenerate, there must
exist a number µn such that B̂ |ψn i = µn |ψn i, which means that |ψn i
is an eigenvector of B̂.
2. One can find a basis constructed from vectors which are eigenvectors both of
 and of B̂ if and only if  and B̂ commute. I.e., there exists a basis set {|ψ1 i,
|ψ2⟩, . . . , |ψN⟩} and scalars λn and µn, n = 1, 2, . . . , N, such that
\[
\hat{A} |\psi_n\rangle = \lambda_n |\psi_n\rangle \quad \text{and} \quad \hat{B} |\psi_n\rangle = \mu_n |\psi_n\rangle, \qquad n = 1, 2, \ldots, N.
\]
of the same subspace. Within that subspace B̂ is therefore equiv-
alent to a Hermitian operator acting in a Hilbert space of dimen-
sion M . Hence it is always possible to form an orthonormal basis
of eigenvectors of B̂ spanning this M -dimensional subspace, and
each vector in that basis is also an eigenvector of  correspond-
ing to the eigenvalue λ. This process can be repeated for each
eigenvalue of  in turn, resulting in a basis of N vectors that are
eigenvectors of both  and of B̂.
We now prove the converse, that  and B̂ commute if one can
find a basis constructed from vectors which are eigenvectors
both of  and of B̂. Suppose that there exists a basis set {|ψ1 i,
|ψ2 i, . . . , |ψN i} and scalars λn and µn , n = 1, 2 . . . , N , such
that Â|ψn⟩ = λn|ψn⟩ and B̂|ψn⟩ = µn|ψn⟩ for n = 1, 2, . . . , N. Any vector |ψ⟩ can be written as a linear combination |ψ⟩ = Σn cn|ψn⟩ of these basis vectors, and therefore
\[
\hat{A}\hat{B} |\psi\rangle = \sum_{n=1}^{N} c_n \hat{A}\hat{B} |\psi_n\rangle = \sum_{n=1}^{N} c_n \lambda_n \mu_n |\psi_n\rangle = \sum_{n=1}^{N} c_n \mu_n \lambda_n |\psi_n\rangle = \sum_{n=1}^{N} c_n \hat{B}\hat{A} |\psi_n\rangle = \hat{B}\hat{A} |\psi\rangle. \tag{5.39}
\]
Since this holds for any vector |ψ⟩, ÂB̂ = B̂Â.
If each pair of eigenvalues (λn , µn ) corresponds to a unique |ψn i (up to a scalar factor),
one says that the operators  and B̂ form a complete set of commuting operators. This
means, in practice, that any joint eigenvector of  and B̂ can be unambiguously defined
by a pair of quantum numbers.
This definition generalizes to sets of more than two operators and to infinite-dimensional spaces. For example, the non-relativistic Hamiltonian of atomic hydrogen, Ĥ, commutes with the angular momentum operators L̂² and L̂z, and the latter two also com-
mute with each other:
[Ĥ, L̂2 ] = [Ĥ, L̂z ] = [L̂2 , L̂z ] = 0. (5.40)
The wave functions Ψnlm (r, θ, φ) you have studied in Term 1 are simultaneous eigen-
functions of Ĥ, L̂2 and L̂z . The corresponding eigenvalues completely define these
joint eigenfunctions (up to a constant factor), and therefore Ĥ, L̂2 and L̂z form a com-
plete set of commuting operators.
As we will see in Chapter 6, the fact that two Hermitian operators representing measurable physical quantities commute means that there is no uncertainty relation limiting the
predictions one can make about what values these quantities could be found to have if
measured jointly.
1. As seen in Section 5.1, the Hamiltonian operator defined by Eq. (5.9) is Hermitian
if taken as acting in the Hilbert space of the square-integrable functions which
vanish at x = ±a. It is possible to form an orthonormal basis set of eigenfunc-
tions of that operator, such that any element φ(x) of that Hilbert space can be
written in the form of an expansion on that set of eigenfunctions. I.e., denoting
the orthonormal basis functions by ψn (x), n = 0, 1, 2, . . ., one can write φ(x)
in the form
\[
\phi(x) = \sum_{n=0}^{\infty} c_n \psi_n(x), \tag{5.41}
\]
where the ψn (x) functions are eigenfunctions of H of unit norm and are such
that
\[
\int_{-a}^{a} \psi_i^*(x)\, \psi_j(x)\, dx = \delta_{ij}. \tag{5.42}
\]
These basis functions can be taken to be
\[
\psi_n(x) = A_n \cos(k_n x) \ \ (n \text{ even}) \qquad \text{or} \qquad \psi_n(x) = A_n \sin(k_n x) \ \ (n \text{ odd}), \tag{5.43}
\]
with kn = (n + 1)π/(2a) and An a normalization factor. These
functions satisfy the differential equation
\[
-\frac{\hbar^2}{2m} \frac{d^2\psi_n}{dx^2} = E_n \psi_n(x), \tag{5.44}
\]
where En is a constant (in fact, En = ~2 kn2 /2m). Since they are
also square-integrable on [−a, a] and zero at x = ±a, they qualify
as eigenvectors of this operator. These functions are orthogonal to
each other since they all correspond to different eigenvalues and H
is Hermitian. Moreover, they can be made to have unit norm by
an appropriate choice of the normalization factors An . That any
function φ(x) belonging to that Hilbert space can be written as an
expansion in these ψn (x) functions is a standard result of Fourier
analysis.
2. By contrast, the same differential operator, taken as acting on the square-integrable functions defined on the whole real axis, has no eigenfunctions at all. Indeed, any eigenfunction would have to be a non-trivial solution of the equation
\[
-\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} = E\, \psi(x) \tag{5.45}
\]
for a certain value of the constant E and should also be square-
integrable on (−∞, ∞). (By non-trivial solution one means a solu-
tion other than ψ(x) ≡ 0.) Let k = (2mE/~2 )1/2 . Any non-trivial
solution of Eq. (5.45) is of the form
\[
\psi(x) = c_+ \exp(ikx) + c_- \exp(-ikx), \tag{5.46}
\]
with c+ and c− two arbitrary complex numbers (not both zero), and
such solutions exist for any real or complex value of E. However,
none of these solutions go to zero both for x → ∞ and x → −∞.
Hence, no function of that form is square-integrable on (−∞, ∞).
3. Adding a potential energy term mω 2 x2 /2 to the Hamiltonian of the previous
example changes it into the Hamiltonian of a linear harmonic oscillator,
\[
H = -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + \frac{m}{2} \omega^2 x^2. \tag{5.47}
\]
Taken as acting in the Hilbert space of the square-integrable functions on (−∞, ∞),
this Hamiltonian is Hermitian and has infinitely many eigenvalues. It is possible
to form an orthonormal basis set of eigenfunctions of that operator, such that
any square-integrable function can be written in the form of an expansion on
that set of eigenfunctions. (You have studied this system in the Term 1 Quantum
Mechanics course, and we will come back to it a little later in these notes.)
+ Advanced mathematical concepts are necessary to prove the impor-
tant result, stated above, that any square-integrable function can be
written in terms of eigenfunctions of that Hamiltonian.
We will come back to this issue later in the course, when studying the position and
momentum operators and other operators with a continuous spectrum.
6 Measurements and uncertainties
instead of “the atom is in the state described by the state vector |ψi".
It is important to understand that what is measured in experiments is not the wave
function or state vector describing a quantum state (although the results obtained in
measurements may of course reveal a lot about the state of a system). For example, there is no "spin-meter" which would return the values of the coefficients a and b in the ket
vector
|χi = a| ↑ i + b| ↓ i (6.1)
representing the spin state of a given atom of silver. A measurement of the spin in the
z-direction would only say whether each atom measured is found to be in a state of spin
up (| ↑ i) or a state of spin down (| ↓ i). It may be that information about the values of
a and b could be obtained by making well-chosen measurements on many atoms all
prepared in the same state, but no single measurement will return values for a and b.
The wave functions and state vectors are not measurable as such; they are theoretical
constructs which are used to calculate quantities that can be measured.7
The Born rule
Suppose that one would consider checking whether or not a quantum system prepared
in a state |ψ⟩ is in a state |φ⟩ (for example, checking whether or not an atom prepared in the spin state of Eq. (6.1) is in the state of spin up, |↑⟩). It might seem bizarre
that a system in a state |ψi could be found to be in a different state |φi in a careful
experiment; however, see Section 4.4, about the Principle of Superposition, and note
that we can always write |ψ⟩ as a linear combination of |φ⟩ and another vector:
\[
|\psi\rangle = |\phi\rangle + \left( |\psi\rangle - |\phi\rangle \right). \tag{6.2}
\]
Whether or not a measurement would find the system to be in state |φi cannot (nor-
mally) be predicted with certainty, but it is possible to calculate the probability Pr(|φi; |ψi)
of finding it in that state, given that it was in state |ψi just before the measurement. It
is a fundamental principle of Quantum Mechanics, often referred to as the Born rule or
Born postulate, that this probability is given by the following equation:
\[
\Pr(|\phi\rangle; |\psi\rangle) = \frac{|\langle \phi | \psi \rangle|^2}{\langle \phi | \phi \rangle \langle \psi | \psi \rangle}. \tag{6.3}
\]
[To keep the notation simple, Pr (|φi; |ψi) will usually be written as Pr (|φi) in the
lectures.]
7 This situation is not unique to Quantum Mechanics: e.g., in Classical Mechanics, it is not possible to measure the Lagrangian of a system of particles; the Lagrangian is merely an auxiliary function introduced for helping with the calculation of positions and velocities.
Note that the numerator and denominator of Eq. (6.3) would both be zero if |ψi or |φi
was the zero vector (the zero vector has zero norm and its inner product with any other
vector is zero). As Eq. (6.3) would then be meaningless, the zero vector never describes
a possible state of a quantum system.
The probability Pr (|φi; |ψi) is exactly the same whether the state of the system is
described by the vector |ψi or by the vector |cψi = c|ψi, where c is a non-zero complex
number. Indeed, |⟨φ|cψ⟩|² = |c|²|⟨φ|ψ⟩|² since ⟨φ|cψ⟩ = c⟨φ|ψ⟩, and ⟨cψ|cψ⟩ = |c|²⟨ψ|ψ⟩ since |cψ⟩ = c|ψ⟩ and ⟨cψ| = c∗⟨ψ|. Thus
\[
\Pr(|\phi\rangle; |c\psi\rangle) = \frac{|\langle \phi | c\psi \rangle|^2}{\langle \phi | \phi \rangle \langle c\psi | c\psi \rangle} = \frac{|\langle \phi | \psi \rangle|^2}{\langle \phi | \phi \rangle \langle \psi | \psi \rangle} = \Pr(|\phi\rangle; |\psi\rangle).
\]
From now on we will work only with normalized state vectors and eigenvec-
tors.
The inner product hφ|ψi is called the probability amplitude of finding the system in the
state |φi when it is initially prepared in the state |ψi. (Don’t be confused: A probability
amplitude is not a probability! It is, in general, a complex number. The probability Pr(|φ⟩; |ψ⟩) is a real number, equal to the square of the modulus of the probability amplitude.)
+ Even though they describe the same quantum state, |ψi and c|ψi are nonethe-
less two different vectors (unless of course c = 1). The value of c may thus
matter. For example, the linear combinations |ψ1 i + |ψ2 i and |ψ1 i − |ψ2 i
may describe very different states although the kets |ψ2 i and −|ψ2 i de-
scribe the same state.
+ Since multiplying the state vector by a non-zero overall factor does not
change anything to the predictions of the theory, as long as probabilities
are correctly calculated, it is sometimes said that quantum states are repre-
sented by rays rather than by vectors (given a vector |ψi, the correspond-
ing ray is the 1-dimensional subspace spanned by |ψi, excluding the zero
vector).
1. The probability Pr (|φi; |ψi) is zero if, and only if, the two states |ψi and |φi
are orthogonal (i.e., hφ|ψi = 0). Hence, aside from experimental errors, it is
impossible that a quantum system be found in a state orthogonal to that in which
it was immediately prior to the measurement.
2. The probability Pr(|φi; |ψi) is 1 if, and only if, |φi = c|ψi with c 6= 0 (see proof
below). The two ket vectors |φi and |ψi then describe the same quantum state.
As would be expected, aside from experimental errors, an experiment aiming
at finding whether a quantum system is in the state in which it was prepared
immediately before the measurement will certainly find it in that state.
3. In any other case, 0 < Pr(|φi; |ψi) < 1: the measurement may or may not find
the system to be in the state |φi, and whether it will find it to be in the state
|φi cannot be predicted with certainty. Referring to Eq. (6.2) above, there is a
non-zero probability that it be found in the state |φi and a non-zero probability
that it be found in the state |ψi − |φi; in that sense, one can say that the system
is simultaneously in the states |ψi, |φi and |ψi − |φi immediately before the
measurement (although one should be careful not to put too much meaning in
this interpretation: the theory only predicts probabilities, nothing more).
For example, if we take the spin state given by Eq. (6.1), the probability that the atom is
found to be in the state of spin up with respect to the z-direction is |h ↑ |χi|2 , assuming
that |χi and | ↑ i are normalized (hχ|χi = h ↑ | ↑ i = 1). Since h ↑ | ↓ i = h ↓ | ↑ i = 0
and h ↓ | ↓ i = 1, this probability is |a|2 , and furthermore |b|2 = 1 − |a|2 . (The latter
equation follows from the normalization condition hχ|χi = 1 and from the fact that
⟨χ|χ⟩ = |a|²⟨↑|↑⟩ + a∗b⟨↑|↓⟩ + ab∗⟨↓|↑⟩ + |b|²⟨↓|↓⟩ = |a|² + |b|².) Thus, in the case where a = 0, the probability of finding the atom in the state |↑⟩ is zero (in fact, the atom is then in the state |↓⟩, which is orthogonal to the state |↑⟩). In the case where a = 1, then b = 0 and |χ⟩ = |↑⟩: the atom will certainly be found in the state
| ↑ i. Otherwise, there is a non-zero probability |a|2 to find it in the state | ↑ i and a
non-zero probability |b|2 = 1 − |a|2 not to find it in that state.
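+ A numerical rendering of this spin example (Python/numpy; an illustration added to these notes — the values of a and b are arbitrary, chosen so that |a|² + |b|² = 1):

    import numpy as np

    a, b = 0.6, 0.8j
    chi = np.array([a, b])          # |chi> = a|up> + b|down>, Eq. (6.1)
    up = np.array([1.0, 0.0])       # |up> in the same basis

    amplitude = up.conj() @ chi     # probability amplitude <up|chi>
    print(abs(amplitude)**2)        # Born rule: Pr = |a|^2 = 0.36
    print(abs(a)**2 + abs(b)**2)    # normalization: 1.0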
• Given the Hilbert space of the state vectors or wave functions describing the
possible states of the system of interest, each dynamical variable of this system
is associated with a linear operator acting in this Hilbert space.
• The only values a dynamical variable may be found to have, when measured, are
the eigenvalues of the operator with which this variable is associated.
For example, the operator associated with the internal energy of an atom of hydrogen
is the Hamiltonian Ĥ of this system. This operator acts on ket vectors |ψi describing
the state of the atom. A measurement of the energy on an atom in a state |ψi would
return one of the eigenvalues of Ĥ as a result (within the experimental uncertainties,
of course).
As noted in Section 1.2, what is directly measured in an actual experiment is often not
the dynamical variable of interest but rather some other quantities the value of this dy-
namical variable can be inferred from. For example, what was directly measured in the
experiment of Stern and Gerlach was the position of each of the silver atoms recorded
after the magnet; however, measuring each of these positions amounted to a measure-
ment of the component of the spin of the respective atom in the direction of the mag-
netic field. Hence, the operator associated with this measurement can be taken to be the
relevant spin operator rather than an operator whose eigenvalues would correspond to
the possible positions at which atoms could have been found in the experiment.
We will soon see that it is important for the consistency of the theory that the opera-
tors associated with dynamical variables each have a complete set of eigenvectors. It
is also important that the eigenvalues of these operators are real (not complex) since
physical quantities such as the position, momentum, etc, are real. For these two rea-
sons, dynamical variables are represented by Hermitian operators whose eigenvectors form a complete set. Recall from Section 5.3 that Hermitian operators acting in a
finite-dimensional space always have a complete set of eigenvectors. Hermitian oper-
ators acting in an infinite-dimensional space may or may not have a complete set of
eigenvectors (here, by eigenvectors, we mean ordinary eigenvectors as well as the gen-
eralized eigenvectors associated with the continuous spectrum, see Chapter 8 of these
notes).
As the term “dynamical variable" is a bit heavy and a bit undescriptive, in the follow-
ing we will use the word “observable" to refer to a dynamical variable amenable (in
principle) to a measurement.
The question we are thus concerned with is calculating the probability that the outcome
of the measurement of a certain observable is a given eigenvalue of the Hermitian oper-
ator representing this observable in the theory. We will assume that the wave function
or state vector representing the state of the system is known. How such probabilities are calculated depends on whether the eigenvalues of interest form part of the discrete spectrum of the relevant operator or part of its continuous spectrum (if it has one). The latter situation may arise only in infinite-dimensional Hilbert spaces and is more fully
addressed in Chapter 8. At this stage we exclusively consider the case of dis-
crete eigenvalues. Much of what we will say for the discrete case also applies to the
continuous case, with the replacement of discrete summations by integrals. There is, however, an important difference: in the case of a continuous spectrum the calculation gives densities of probability of finding specific eigenvalues, whereas in the case of a discrete spectrum it gives probabilities of finding specific eigenvalues.
Suppose that a measurement of an observable A is made on a system in the state |ψi.
Suppose, also, that this observable is associated with a Hermitian operator Â. Consider
a discrete eigenvalue λn of Â, which for the time being we will assume to be non-
degenerate, and an eigenvector |ψn i of  corresponding to that eigenvalue (i.e., the
vector |ψn i is such that Â|ψn i = λn |ψn i.) We also assume that the state vector |ψi
and the eigenvector |ψn i are normalized: hψ|ψi = hψn |ψn i = 1. Then, the probability
Pr(λn; |ψ⟩) that the observable A is found to have the value λn is given by the equation
\[
\Pr(\lambda_n; |\psi\rangle) = |\langle \psi_n | \psi \rangle|^2. \tag{6.5}
\]
4. Obtaining one eigenvalue or another are mutually exclusive possibilities. There-
fore the probabilities Pr(λn; |ψ⟩) should sum to 1:
\[
\sum_n \Pr(\lambda_n; |\psi\rangle) = 1, \tag{6.8}
\]
where the summation runs over all the eigenvalues of Â. To keep the notation
simple, let us assume that the eigenvalues λn are all non-degenerate and that the
corresponding eigenvectors |ψn⟩ are orthonormal. Then Eq. (6.8) says that
\[
\sum_n |\langle \psi_n | \psi \rangle|^2 = \sum_n \langle \psi | \psi_n \rangle \langle \psi_n | \psi \rangle = 1.
\]
Since this equation must hold for any state |ψ⟩, we conclude that
\[
\sum_n |\psi_n\rangle \langle \psi_n | = \hat{I}, \tag{6.9}
\]
where Iˆ is the identity operator. Eq. (6.9) is the completeness relation, Eq. (5.26)
of Section 5.2 of these notes. For the probabilistic interpretation of the inner
products hψn |ψi to make sense, it is thus necessary that the eigenvectors |ψn i
form a complete set.
We know from Section 5.3 that if  is defined in a finite-dimensional space it is
always possible to form an orthonormal basis of eigenvectors of  spanning that
space; hence, in spaces of dimension N , the requirement that the eigenvectors
of  form a complete set does not add to the requirement that  is Hermitian.
The case where the space is infinite-dimensional is more complicated, though.
A Hermitian operator Â can represent an observable only if it has a complete
set of eigenvectors (including generalized eigenvectors in the sense of Chapter
8 of these notes).
+ That the theory would not be consistent if these eigenvectors did
not form a complete set can also be understood from the following
argument. Suppose that the eigenvectors of  would not form a
complete set — i.e., that there would exist a vector |ψi orthogonal
to all the eigenvectors of Â. If the system was prepared in that state,
there would then be a zero probability of obtaining any of the eigen-
values of  in a measurement of the dynamical variable associated
with Â, since the inner products hψn |ψi would be zero for all n; this
would be inconsistent with the rule that the only possible values of
this variable are the eigenvalues of Â.
+ Important note. The above considerations generalize to the case
of quantum states described by elements of an infinite-dimensional
Hilbert space; however, there are many subtleties and the mathe-
matically rigorous theory is far from straightforward. For example,
we will see, in Chapter 8, that in the case of quantum states de-
scribed by wave functions defined in 3D space, Eq. (6.9) may take
on the form
∑_n ψn (r) ψn*(r′) = δ(r − r′),    (6.10)
or the form
∫ φk (r) φk*(r′) d³k = δ(r − r′),    (6.11)
Strictly speaking, the mathematical property required of an operator representing an observable is self-adjointness, which is
a stronger condition than Hermiticity: an operator may be Hermi-
tian but not self-adjoint. Whether a Hermitian operator is or is not
self-adjoint depends on whether the domain of the adjoint of this op-
erator is or is not the same as the domain of the operator itself (which
is something that can be difficult to establish). In Quantum Me-
chanics, observables are always described by self-adjoint operators.
Contrary to mere Hermiticity, self-adjointness guarantees that the
probabilities derived from the theory as described above always sum
up to 1, and also that the operator has a spectral decomposition, i.e.,
can be expressed in terms of a (discrete or continuous) sum of pro-
jectors as seen in Section 5.3 in the finite-dimensional case. Both
are essential for the consistency of the theory.
Throughout the rest of this course, we will always assume that the
observables of interest correspond to self-adjoint operators.
One can show that
⟨ψ| Â|ψ⟩ = ∑_j λj Pr(λj ; |ψ⟩),    (6.13)
where Pr(λj ; |ψ⟩) is the probability that the value λj is found if the corresponding
physical quantity is measured on a system in the state |ψ⟩. (The proof is left as an
exercise.)
The reason why ⟨ψ| Â|ψ⟩ is called an "expectation value" is perhaps best grasped from
the following example: Suppose that you toss a coin with someone, repeatedly, and at
each toss bet one pound that you will get heads. Thus at each toss you gain one pound
if you get heads and lose one pound (i.e., "gain" minus one pound) if you get tails. In
probabilistic terms, the "expectation value" of your gain at each toss is the amount you
gain if you get heads times the probability of that outcome, plus the (negative) amount
you gain if you get tails times the probability of that outcome — i.e., (1 pound) × 0.5 +
(−1 pound) × 0.5 (obviously, this amounts to 0 pounds, assuming that the coin is fair).
Eq. (6.13) says just the same in a Quantum Mechanical context: The expectation value
of the observable A is the eigenvalue λ1 of the operator Â times the probability that the
outcome of the experiment is λ1 , plus the eigenvalue λ2 times the probability that the
outcome is λ2 , etc.
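+ A minimal numerical sketch of this statement (in Python with NumPy; the observable and the state are invented for the illustration): the probability-weighted sum of the eigenvalues reproduces ⟨ψ| Â|ψ⟩.

    import numpy as np

    A = np.array([[0, 1], [1, 0]], dtype=complex)  # "coin-toss" observable, eigenvalues +1 and -1
    psi = np.array([1, 0], dtype=complex)          # normalized state

    evals, evecs = np.linalg.eigh(A)
    probs = np.abs(evecs.conj().T @ psi) ** 2      # Pr(lambda_n; |psi>)
    print(np.sum(evals * probs))                   # sum_n lambda_n Pr(lambda_n) = 0
    print((psi.conj() @ A @ psi).real)             # <psi|A|psi> = 0, like the fair coin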
(It is clear that the values of ⟨A⟩ and (ΔA)² depend on the state of the system, |ψ⟩; for
simplicity, we do not specify this dependence in the notation.) ⟨A⟩ can also be written
in terms of the operator Â as
⟨A⟩ = ⟨ψ| Â|ψ⟩.
As we have seen in the previous section, ⟨ψ| Â|ψ⟩ is called the expectation value of Â
in the state |ψ⟩. The variance (ΔA)² can be written in two equivalent ways in terms
of Â:
(ΔA)² = ⟨ψ|(Â − ⟨A⟩Î)²|ψ⟩    (6.17)
and
(ΔA)² = ⟨ψ| Â²|ψ⟩ − ⟨A⟩².    (6.18)
+ Proof: First, note that ⟨ψ|(Â − ⟨A⟩)²|ψ⟩ should really be written as
⟨ψ|(Â − ⟨A⟩Î)²|ψ⟩, since ⟨A⟩ is a scalar, not an operator. Since ⟨ψ|ψ⟩ = 1,
expanding the square gives ⟨ψ|(Â − ⟨A⟩Î)²|ψ⟩ = ⟨ψ| Â²|ψ⟩ − 2⟨A⟩⟨ψ| Â|ψ⟩ + ⟨A⟩²
= ⟨ψ| Â²|ψ⟩ − ⟨A⟩², which shows that Eqs. (6.17) and (6.18) are equivalent.
∆A, the square root of the variance (∆A)2 , is often referred to as the uncertainty in
the variable A. Eqs. (6.17) and (6.18) provide a precise definition of this quantity.
Suppose we have N identical copies of the quantum system of interest (e.g., N atoms
of hydrogen), that all these copies are prepared in the same state |ψi, and that we
make the same measurement of the observable A on each of them. Experimental errors
put aside, the value found for A in each of these N measurements will be one of the
eigenvalues λn of Â. If the theoretical description of these measurements is correct,
the probability distribution Pr (λn ; |ψi) predicts the frequency distribution of these
experimental results.
Let λ(1) be the value of A found for system 1, λ(2) the value found for system 2, etc.,
and λ(N ) the value found for system N . These N results form a statistical sample of
mean λ̄ and standard deviation σ, with
λ̄ = (1/N) ∑_{j=1}^N λ^(j)   and   σ = [ (1/(N−1)) ∑_{j=1}^N (λ^(j) − λ̄)² ]^(1/2).
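+ The following Python/NumPy sketch simulates such a statistical sample (the observable, the state and the value of N are invented for the illustration); for large N, λ̄ approaches ⟨A⟩ and σ approaches ΔA.

    import numpy as np

    rng = np.random.default_rng(1)
    evals = np.array([0.0, 1.0, 3.0])              # eigenvalues of a diagonal observable
    psi = np.array([1, 1, 1], dtype=complex) / np.sqrt(3)

    probs = np.abs(psi) ** 2                       # Pr(lambda_n; |psi>) in the eigenbasis
    expA = np.sum(evals * probs)                   # <A>
    deltaA = np.sqrt(np.sum(evals**2 * probs) - expA**2)

    sample = rng.choice(evals, size=100000, p=probs)  # N simulated measurements
    print(sample.mean(), expA)                     # sample mean ~ <A>
    print(sample.std(ddof=1), deltaA)              # sample standard deviation ~ Delta A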
Apart from the experimental error, there is certainty about what value A would be found
to have if and only if ΔA = 0. This is the case if the state vector |ψ⟩ is an eigenvector
of Â, and only if it is an eigenvector of Â (see proof below); obviously, the value found
for A would then be the eigenvalue of Â this eigenvector corresponds to.
+ The proof of Eq. (6.20) is similar to the one you have seen in the Term 1
QM course for a particular case of this uncertainty relation. Before starting,
we recall that  and B̂ are self-adjoint operators (see the important note
at the end of Section 6.2). Let Â′ = Â − ⟨A⟩Î and B̂′ = B̂ − ⟨B⟩Î.
By definition of the uncertainties ΔA and ΔB, (ΔA)² = ⟨ψ| Â′²|ψ⟩ and
(ΔB)² = ⟨ψ| B̂′²|ψ⟩. Introducing the real number λ, we observe that
(Â′ + iλB̂′)|ψ⟩ is a vector and that the square of the norm of this vector
is necessarily non-negative (as is the case for any vector). Thus
0 ≤ ⟨ψ|(Â′ + iλB̂′)†(Â′ + iλB̂′)|ψ⟩.
Given that the operators Â and B̂ are self-adjoint and that ⟨A⟩ and ⟨B⟩
are real, it is also true that (Â′ + iλB̂′)† = Â′ − iλB̂′, so that, noting that [Â′, B̂′] = [Â, B̂],
⟨ψ|(Â′ − iλB̂′)(Â′ + iλB̂′)|ψ⟩ = (ΔA)² + iλ⟨ψ|[Â, B̂]|ψ⟩ + λ²(ΔB)².
That is,
λ²(ΔB)² − λC + (ΔA)² ≥ 0,  where C = −i⟨ψ|[Â, B̂]|ψ⟩ is a real number.
This inequality is fulfilled for λ = 0 and must be fulfilled for any real
value of λ. Thus the quadratic equation λ²(ΔB)² − λC + (ΔA)² = 0,
as an equation for λ, cannot have two distinct real solutions, which is the
case only if C² − 4(ΔA)²(ΔB)² ≤ 0. Eq. (6.20) follows.
+ Despite the minus sign, the right-hand side of this equation is always a
non-negative real number.
Proof: Since Â and B̂ are self-adjoint, ⟨ψ| ÂB̂|ψ⟩ = ⟨ψ| B̂†Â†|ψ⟩* =
⟨ψ| B̂Â|ψ⟩*. Thus ⟨ψ|[Â, B̂]|ψ⟩ = ⟨ψ| ÂB̂ − B̂Â|ψ⟩ = ⟨ψ| B̂Â −
ÂB̂|ψ⟩* = ⟨ψ|[B̂, Â]|ψ⟩* = −⟨ψ|[Â, B̂]|ψ⟩*. Now, ⟨ψ|[Â, B̂]|ψ⟩ is a
number, and we have shown that this number is equal to the negative of
its complex conjugate. This number must therefore be purely imaginary (i times
a real number), which implies that (⟨ψ|[Â, B̂]|ψ⟩)² is a real number that cannot be positive.
1. This inequality is a relation between the widths of two different probability dis-
tributions, under the assumption that each one is based on the same state |ψ⟩.
There is no assumption that the variables A and B are measured simultaneously
or even measured in the same experiment.
2. A state |ψi in which the product ∆A∆B is zero may exist if the right-hand side
of this inequality is zero. This is the case, in particular, if  commutes with B̂.
(In fact, if Â commutes with B̂, these two operators have common eigenstates,
and for such states the uncertainties ΔA and ΔB are both zero.) There is no
theoretical limit on how small ΔA and ΔB may both be in the same quantum
state when Â commutes with B̂.
3. The time-energy uncertainty relation is not amenable to this formulation. Time
is a parameter in Quantum Mechanics, not a dynamical variable associated with
a Hermitian operator.
+ One can also show that in finite-dimensional spaces, hψ|[Â, B̂]|ψi = 0 if |ψi
is an eigenvector of  or B̂. This result is consistent with the fact previously
mentioned that ∆A = 0 if |ψi is an eigenvector of Â. However, it does not
apply to the case where one would work in an infinite-dimensional space and
take |ψi to be a generalized eigenvector of  or B̂ (i.e., an eigenvector belong-
ing to the continuous spectrum of one of these operators, in the “physicists’
definition" of these, as discussed in Chapter 8).
If a second measurement is made on this system, the outcome of this second mea-
surement therefore needs to be calculated from the state vector |φi, in the same way
as the outcome of the first measurement needs to be calculated from the state vector
|ψ⟩. Thus, if the system is found to be in the state |φ⟩ and an identical measurement
is made immediately after the first, this second measurement will also find
that the system is in the state |φ⟩.
In some respects, this rule may seem completely intuitive: if the system is found to
be in state |φi, why would it not be found to be still in the same state if checked and
the state has not changed between the two measurements? Note, however, that the
collapse postulate implies that the state vector of the system changes abruptly upon
a measurement. Suppose that an atom is in the spin state |χ⟩ = a|↑⟩ + b|↓⟩
and one measures the z-component of the spin and finds it to be up;
at the point of this measurement the spin state of this atom changes from |χ⟩
to |↑⟩. It is often said that the measurement "collapses" the wave function (or here
the state vector) from a linear combination of several states to the state found in
the measurement. However, the situation is normally rather more complicated than
just stated and requires a detailed analysis of what a measurement really entails.
There has been much discussion about the status of the collapse postulate and the
interpretation of the abrupt change over from one state to another it describes — e.g.,
whether it corresponds to something “real" in the physical properties of the atom, or
whether it merely reflects a change in what we know about the state of the atom. In
many respects, however, the issue of how the collapse postulate is best interpreted
is a problem of Philosophy rather than of Quantum Theory, and is outside the scope
of this course.
Let us illustrate the above by yet another thought experiment on spin states. Suppose
that the spin state of a spin-1/2 atom (e.g., an atom of silver) is initially represented
by the column vector
χ = (1/√3) (1, √2)ᵀ.    (6.25)
As we will see later in the course, the column vectors
α = (1, 0)ᵀ   and   α⁽ˣ⁾ = (1/√2) (1, 1)ᵀ
are eigenvectors of, respectively, the z-component and the x-component of the spin
operator, and both correspond to a state of spin “up" in the respective direction (i.e.,
they both correspond to an eigenvalue of ħ/2 rather than −ħ/2). Note that these
three column vectors are normalized. Now, imagine an experiment in which one
checks whether an atom initially in the state χ is or is not in the state α (thus in the
state of spin up with respect to the z-direction). The probability of finding it in that
state is given by Eq. (6.5) as
|(1  0) · (√(1/3), √(2/3))ᵀ|² = |1 × √(1/3) + 0 × √(2/3)|² = 1/3.    (6.26)
Suppose that the atom is found to be in the state α. Upon this finding, its state
changes from χ to α. If this state is not perturbed by some interaction or another
measurement, a subsequent check of whether it is in the state α will find it to be in
that state with a probability of 1 since
|(1  0) · (1, 0)ᵀ|² = 1.    (6.27)
Instead of checking whether it is in the state α after the first measurement, let us
imagine that it would be checked whether it is in the state α(x) . It would be found
to be in that state with a probability of
|(√(1/2)  √(1/2)) · (1, 0)ᵀ|² = 1/2.    (6.28)
Assuming that it is found to be in that state, its state would then be α(x) after this
second measurement. If now a third measurement is made, on whether this atom is
or is not in the state α, the probability of finding it in that state is now 1/2, rather
than 1, since
|(1  0) · (√(1/2), √(1/2))ᵀ|² = 1/2.    (6.29)
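+ The following Python sketch reproduces this sequence of measurements numerically (probabilities 1/3, 1, 1/2 and 1/2); the collapse is implemented simply by replacing the state vector by the vector representing the outcome.

    import numpy as np

    chi = np.array([1.0, np.sqrt(2.0)]) / np.sqrt(3.0)  # initial state, Eq. (6.25)
    alpha = np.array([1.0, 0.0])                        # spin up along z
    alpha_x = np.array([1.0, 1.0]) / np.sqrt(2.0)       # spin up along x

    def prob(target, state):
        # Probability |<target|state>|^2 of finding `state` in `target`
        return abs(np.vdot(target, state)) ** 2

    print(prob(alpha, chi))      # 1/3, Eq. (6.26)
    state = alpha                # collapse onto alpha
    print(prob(alpha, state))    # 1,   Eq. (6.27)
    print(prob(alpha_x, state))  # 1/2, Eq. (6.28)
    state = alpha_x              # collapse onto alpha_x
    print(prob(alpha, state))    # 1/2, Eq. (6.29)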
The collapse postulate can be recast in terms of the projector operator P̂φ introduced
in Section 3.10, as we will now see. A system initially in a state |ψi can be found
to be in a state |φ⟩ only if ⟨φ|ψ⟩ ≠ 0. If so, and in view of Eq. (3.103), P̂φ|ψ⟩ ≠ 0.
Given that c|φi represents the same quantum state as |φi for any non-zero value of
the number c, we see that the collapse postulate can also be phrased as follows: “If
a system is initially in a state |ψi and is found to be in a state |φi in a measurement,
then immediately after the measurement this system is in a quantum state described
by the state vector P̂φ |ψi." Or, in other words,
|ψ⟩ → P̂φ|ψ⟩ / ||P̂φ|ψ⟩||.
The same applies to the case where what is measured is the value of a certain observ-
able. Suppose that a particular eigenvalue λn of the Hermitian operator  represent-
ing this observable is obtained as a result and that this eigenvalue is non-degenerate.
Then, if |ψn i is a normalized eigenvector of  corresponding to this eigenvalue, the
collapse postulate states that the measurement leaves the system in the state |ψn ⟩.
This principle can be generalized to the case where the eigenvalue λn is degenerate.
Recall that one can find a set of M orthonormal eigenvectors of  all belonging to the
eigenvalue λn if this eigenvalue is M -fold degenerate. Let |ψn1 i, |ψn2 i, . . . , |ψnM i
be such a set. These M orthonormal vectors span an M-dimensional subspace of the
whole Hilbert space, and the operator
P̂λn = ∑_{r=1}^{M} |ψnr ⟩⟨ψnr |    (6.30)
projects any vector of this Hilbert space onto that M -dimensional subspace. In gen-
eral, the collapse postulate states that if a system is initially in a state |ψ⟩, finding
the eigenvalue λn of Â in a measurement of the corresponding dynamical variable
projects |ψ⟩ onto the subspace of this eigenvalue. Correspondingly, the measure-
ment transforms the state vector of the system from |ψi to P̂λn |ψi (or, in terms of
a normalized vector, to P̂λn |ψi/||P̂λn |ψi||).
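+ A minimal numerical sketch of this projection rule (Python/NumPy; the observable, with a two-fold degenerate eigenvalue, and the state are invented for the illustration):

    import numpy as np

    # Observable with the two-fold degenerate eigenvalue 1 and the non-degenerate eigenvalue 2:
    A = np.diag([1.0, 1.0, 2.0])
    psi = np.array([3.0, 4.0, 12.0]) / 13.0         # a normalized state

    # Projector onto the lambda = 1 eigenspace, Eq. (6.30):
    P = np.zeros((3, 3))
    for v in (np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])):
        P += np.outer(v, v)

    print(psi @ P @ psi)                            # probability of finding lambda = 1 (25/169)
    post = P @ psi / np.linalg.norm(P @ psi)        # collapsed, renormalized state
    print(post)                                     # (0.6, 0.8, 0.0)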
The case of a pair of observables represented by commuting operators is particularly
noteworthy. Suppose that the observables A and B are represented by the opera-
tors  and B̂, respectively, with [Â, B̂] = 0, and suppose that A is measured. The
measurement transforms the initial state vector into a superposition of eigenvectors
of  belonging to the eigenvalue found in the measurement (λn , say):
|ψ⟩ → P̂λn |ψ⟩ = ∑_{r=1}^{M} cnr |ψnr ⟩.    (6.31)
Suppose that B is now measured on the system in this state; this second measurement
projects the state vector onto an eigenspace of B̂. However (the proof is left as an
exercise), because [Â, B̂] = 0, the resulting superposition is still an eigen-
vector of Â belonging to the eigenvalue λn. Therefore, if A is measured again, after
the measurement of B, λn is found with probability 1. In other words, a measure-
ment of B would not affect the result of a measurement of A. In view of this fact,
the observables A and B are said to be compatible. Two observables are compati-
ble when, and only when, they are represented by operators commuting with each
other.
7 Wave functions, position and momentum
7.1 Introduction
Wave functions are commonly used to describe the quantum state of atoms and molecules,
or more generally of systems characterized by the positions or the momenta of their
constituents. These wave functions are vectors belonging to a Hilbert space of square-
integrable functions. As noted previously in this course, such spaces are infinite di-
mensional, and the mathematics of infinite dimensional spaces is considerably more
complicated than that of finite dimensional spaces.
With regard to Quantum Mechanics, the most important difference between finite and
infinite dimensional Hilbert spaces concerns the eigenvalues and eigenvectors of the
Hermitian operators representing dynamical variables. For example, spin states can
be described by vectors belonging to a finite dimensional space, e.g., by N -component
column vectors where N is a finite number (N = 2 for spin-1/2 particles). Spin observ-
ables can be represented by N × N Hermitian matrices acting on these column vectors.
The eigenvectors of these matrices are themselves N -component column vectors and
the corresponding eigenvalues are finite in number and form a discrete distribution
(different eigenvalues are separated by a gap). Moreover, these sets of eigenvectors are
complete and span the whole of the Hilbert space in which these matrices act, which is
essential for the consistency of the probability calculations outlined in Chapter 4.
Because they act on vectors belonging to an infinite-dimensional space, however, the
operators representing such dynamical variables as the position, the momentum and
the energy do not have, or do not always have, a complete set of eigenvectors. By
eigenvectors, here, we mean square-integrable eigenfunctions f such that Af = λf ,
where A is the operator and λ is a constant. Hence, it is often not possible to calculate
probabilities for the corresponding measurements using the same method as for finite-
dimensional Hilbert spaces. A mathematically rigorous way round this difficulty, based
on projector operators, has been known since the early days of Quantum Mechanics.
However, the probabilities of interest can often be calculated by following a differ-
ent approach, in which the concept of eigenfunction is generalized to encompass non
square-integrable functions. The possible results of a measurement are then taken to
be the generalized eigenvalues associated with these generalized eigenfunctions. These
“eigenvalues" are normally infinite in number and may form a continuous rather than
a discrete distribution. This approach is illustrated in Section 7.3 by the example of
the momentum operator (the operator corresponding to the momentum of a particle,
p = mv in Classical Mechanics).
7.2 The Fourier transform and the Dirac delta function
We start with a review of important mathematical facts particularly relevant in this
context. In principle, this section contains nothing that you have not already seen pre-
viously. Please refer to your maths courses, to the term 1 Quantum Mechanics course,
to a maths or Quantum Mechanics textbook and/or to reliable online material if you
are unfamiliar with any of the results stated below.
The Fourier transform
Let ψ(x) be a function integrable on (−∞, ∞). The Fourier transform of ψ(x) is a
function φ(k) defined by the following equation:
φ(k) = (2π)^(−1/2) ∫_{−∞}^{∞} exp(−ikx) ψ(x) dx.    (7.1)
If φ(k) is the Fourier transform of ψ(x) then ψ(x) is the inverse Fourier transform of
φ(k) and
ψ(x) = (2π)^(−1/2) ∫_{−∞}^{∞} exp(ikx) φ(k) dk.    (7.2)
Not all functions have a Fourier transform or an inverse Fourier transform. The con-
ditions under which a function can be Fourier-transformed have been studied in great
detail by mathematicians, and it would be outside the scope of the course to state them
in full generality. A good understanding of this question would be necessary for an
in-depth knowledge of the maths of wave mechanics but is not required at the level
of this course. We just quote a useful result you have probably seen before: if ψ(x)
is a continuous function and |ψ(x)| can be integrated on (−∞, ∞), then ψ(x) has a
Fourier transform φ(k) defined by Eq. (7.1).
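+ A numerical sketch of Eq. (7.1) (in Python with NumPy), using a Gaussian as test function; with this convention, the Fourier transform of ψ(x) = exp(−x²/2) is φ(k) = exp(−k²/2), which the quadrature reproduces.

    import numpy as np

    x = np.linspace(-20.0, 20.0, 4001)
    dx = x[1] - x[0]
    psi = np.exp(-x**2 / 2)                          # psi(x) = exp(-x^2/2)

    def fourier_transform(k):
        # Eq. (7.1) evaluated by direct numerical quadrature
        return np.sum(np.exp(-1j * k * x) * psi) * dx / np.sqrt(2 * np.pi)

    for k in (0.0, 0.5, 1.0, 2.0):
        print(abs(fourier_transform(k)), np.exp(-k**2 / 2))  # numerical vs exact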
The Dirac delta function
The Dirac delta function δ(x − x0 ) is a mathematical object such that
∫_{−∞}^{∞} δ(x − x′) f(x′) dx′ = f(x)    (7.3)
for any integrable function f (x) of a real variable x. (The Dirac delta function is not
defined for complex arguments.) There is no difference between δ(x − x′) and δ(x′ − x),
so that Eq. (7.3) can also be written
∫_{−∞}^{∞} δ(x′ − x) f(x′) dx′ = f(x).
It is customary to refer to δ(x − x′) as a function. However, the mathematical symbol
δ(x − x′) does not represent a function. There is no function δ(x − x′), in the usual
sense of the word function, such that Eq. (7.3) could hold for any x and any f(x). The
delta “function" belongs to a different class of mathematical objects called distributions
or generalized functions.
The delta function δ(x − x′) can be represented by various mathematical expressions.
In particular,
(1/2π) ∫_{−∞}^{∞} exp[ik(x − x′)] dk = δ(x − x′).    (7.4)
+ Eq. (7.4) can be justified by the following argument: In view of Eq. (7.1),
Eq. (7.2) can also be written as
ψ(x) = (2π)^(−1/2) ∫_{−∞}^{∞} exp(ikx) [ (2π)^(−1/2) ∫_{−∞}^{∞} exp(−ikx′) ψ(x′) dx′ ] dk.
Exchanging the order of the integrations over k and x′ puts this equation in
the form of Eq. (7.3), with the left-hand side of Eq. (7.4) in the role of δ(x − x′).
+ The following results are explored in the fifth set of workshops and are
worth remembering. Please refer to the corresponding worksheet for proofs
and examples of applications.
1. For any x₀,
∫_{−∞}^{∞} δ(x − x₀) dx = 1    (7.6)
and
∫_a^b δ(x − x₀) f(x) dx = f(x₀) if x₀ ∈ (a, b), and = 0 if x₀ ∉ [a, b].
(The left-hand side of this last equation is not defined if x₀ = a or
x₀ = b.)
2. δ(αx) = δ(x)/|α| if α ≠ 0.
3. If F (x) is a differentiable function, then
δ[F(x)] = ∑_n δ(x − xₙ)/|F′(xₙ)|,    (7.7)
where the xₙ's are the zeros of F(x) (i.e., the values of x at which
F(x) = 0) and
F′(xₙ) = (dF/dx)|_{x=xₙ}.    (7.8)
δ[F (x)] has no mathematical meaning if it happens that F (x) and
F 0 (x) are simultaneously zero. For example, δ(x2 ) has no meaning.
4. For any q₁ and q₂ (q₁ ≠ q₂),
∫_{−∞}^{∞} δ(x − q₁) δ(q₂ − x) dx = δ(q₂ − q₁).    (7.9)
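+ These properties can be checked numerically by replacing δ(x) with a narrow normalized Gaussian (a "nascent" delta function). A Python sketch of the sifting property (7.3) and of property 2 above:

    import numpy as np

    x = np.linspace(-10.0, 10.0, 200001)
    dx = x[1] - x[0]

    def delta_eps(y, eps=1e-2):
        # Narrow normalized Gaussian standing in for delta(y)
        return np.exp(-y**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

    f = np.cos(x)
    print(np.sum(delta_eps(x - 0.7) * f) * dx, np.cos(0.7))  # sifting, Eq. (7.3)

    alpha = 3.0
    lhs = np.sum(delta_eps(alpha * x) * f) * dx              # integral of delta(alpha x) f(x)
    print(lhs, np.cos(0.0) / abs(alpha))                     # = f(0)/|alpha|, property 2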
such that the integral ∫_{−∞}^{∞} |ψp(x)|² dx
exists and is finite. (By non-trivial solution one means a solution ψp (x) which is not
zero for all values of x.) We can deduce from Eq. (7.12) that
ψp(x) = C exp(ipx/ħ),    (7.13)
with C a constant.9 Such solutions exist for any value of p, real or complex. However,
they never belong to the Hilbert space of square-integrable functions on (−∞, ∞), and
therefore they do not qualify as eigenvectors of the operator P in the mathematical
sense of this term.
+ In the mathematical literature, the spectrum of an operator Â acting in a
Hilbert space H is defined as the set of the scalars λ for which
the operator Â − λÎ is not invertible, where Î is the identity operator.
This set always contains all the eigenvalues of this operator (i.e., all the
scalars λ such that Â|ψi = λ|ψi for some vector |ψi belonging to H).
However, if H is infinite-dimensional, the spectrum may also contain
values of λ for which there is no vector |ψi such that Â|ψi = λ|ψi.
This is the case for the momentum operator: mathematicians would say
that the spectrum of P is R, the set of all real numbers, even though this
operator has no eigenvalues. The spectrum of P is thus a continuous
distribution of numbers. That Eq. (7.12) has no square-integrable solu-
tions is not an accident, as it can be shown that the continuous part of
the spectrum of an operator is never associated with square-integrable
eigenfunctions.
In that sense, any continuous and absolutely integrable wave
function can be expanded on the set of the functions ψp (x) de-
fined by Eq. (7.13) with C = (2π~)−1/2 and p real, and there is
no need to include functions ψp (x) with complex values of p in
this set to make it complete.
Absolute integrability is not the same as square-integrability,
though. One can find functions that are absolutely integrable but
not square-integrable. Likewise, one can imagine wave func-
tions that are square-integrable but not absolutely integrable. The
above result, about the existence of a Fourier transform, extends
to the more general case of any square-integrable wave function,
although the definition of the Fourier transform must then be
suitably modified (the straightforward definition in terms of ordi-
nary Riemann integrals you have probably seen in a maths course
does not apply to functions that are square-integrable but not ab-
solutely integrable).
(We have passed from the first to the second equation by changing variable from x to
ξ = x/~, and from the second to the last by using Eq. (7.4) with x − x0 replaced by
p0 − p and k replaced by ξ.) It is convenient in the applications to choose the constant
C such that this integral is exactly δ(p0 − p). Setting C = (2π~)−1/2 and writing
ψp(x) = (2πħ)^(−1/2) exp(ipx/ħ)    (7.18)
ensures that
∫_{−∞}^{∞} ψp*(x) ψp′(x) dx = δ(p′ − p).    (7.19)
Functions ψp (x) satisfying this equation are said to be “normalized to a delta function
in momentum space".
Instead of the functions ψp (x), it is often more convenient to introduce the wave num-
ber, k (k = p/~ and p = ~k), and use the functions
ψk(x) = (2π)^(−1/2) exp(ikx).    (7.20)
These functions satisfy
∫_{−∞}^{∞} ψk*(x) ψk′(x) dx = δ(k′ − k),    (7.22)
and are accordingly said to be "normalized to a delta function in k-space". Eqs. (7.19)
and (7.22) replace Eq. (7.16) for non-square-integrable functions such as ψp(x) and ψk(x).
7.5 Probability densities
The fact that the possible values of p are continuously distributed also affects how
probabilities are defined. As regards predicting the result of a measurement of the
x-component of the momentum, the quantity of interest is not the probability Pr(p) of
obtaining a given value p, but rather the probability Pr([p1 , p2 ]) of obtaining a result
between a certain value p1 and a certain value p2 . There are two reasons for this: (1) The
probability of obtaining a value specified to infinitely many digits in a measurement of a
continuously distributed variable is zero, in the same way as the probability of drawing
(at random) one particular ball from an urn containing infinitely many balls would
be zero. (2) Since no detectors have an infinite resolution, an actual measurement of a
variable distributed continuously can only determine a range of values for this variable.
The probability Pr([p1 , p2 ]) can be written as the integral of a certain density of prob-
ability (or probability density function) P(p):
Pr([p₁, p₂]) = ∫_{p₁}^{p₂} P(p) dp.    (7.23)
I.e., P(p) dp is the probability of obtaining a value of the momentum between p and
p+dp (or, which is equivalent because dp is an infinitesimal, the probability of obtaining
a value of the momentum between p − dp/2 and p + dp/2). We stress that P(p)
is a density of probability, not a probability. Pr ([p1 , p2 ]) is a probability and has no
physical dimensions. By contrast, P(p) has the physical dimensions of the inverse of
a momentum.
Suppose that the particle is in a state described by a normalized wave function ψ(x).
According to the rules of Quantum Mechanics, and assuming that the eigenfunctions
ψp (x) are normalized as per Eq. (7.19),
P(p) = |∫_{−∞}^{∞} ψp*(x) ψ(x) dx|² = |(2πħ)^(−1/2) ∫_{−∞}^{∞} exp(−ipx/ħ) ψ(x) dx|².    (7.24)
Rather than working with the functions ψp (x), it is often convenient to express the
momentum p in terms of the wavenumber k and work with the ψk (x) functions defined
by Eq. (7.20). These two formulations are equivalent. In particular, the probability
Pr([k1 , k2 ]) of obtaining a value between k1 = p1 /~ and k2 = p2 /~ in a measurement
of the particle’s wave number in the x-direction is
∫_{k₁}^{k₂} P(k) dk,
where P(k) = |φ(k)|2 with φ(k) being the Fourier transform of ψ(x):
φ(k) = ∫_{−∞}^{∞} ψk*(x) ψ(x) dx = (2π)^(−1/2) ∫_{−∞}^{∞} exp(−ikx) ψ(x) dx.    (7.25)
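+ A numerical sketch (Python/NumPy) for the normalized Gaussian wave packet ψ(x) = π^(−1/4) exp(−x²/2), whose k-space wave function is φ(k) = π^(−1/4) exp(−k²/2): the density P(k) = |φ(k)|² integrates to 1, and integrating it over a finite range gives the probability Pr([k₁, k₂]).

    import numpy as np

    x = np.linspace(-15.0, 15.0, 3001)
    dx = x[1] - x[0]
    psi = np.pi ** (-0.25) * np.exp(-x**2 / 2)      # normalized Gaussian packet

    k = np.linspace(-6.0, 6.0, 1201)
    dk = k[1] - k[0]
    phi = np.array([np.sum(np.exp(-1j * kk * x) * psi) * dx
                    for kk in k]) / np.sqrt(2 * np.pi)   # Eq. (7.25)

    Pk = np.abs(phi) ** 2                           # probability density P(k)
    print(np.sum(Pk) * dk)                          # ~1: total probability
    print(np.sum(Pk[np.abs(k) <= 1]) * dk)          # Pr([-1, 1]) ~ 0.84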
We can now do the integral over k easily, since
∫_{−∞}^{∞} exp(ikx) exp(−ikx′) dk = ∫_{−∞}^{∞} exp[ik(x − x′)] dk = 2π δ(x − x′).    (7.31)
Therefore
∫_{−∞}^{∞} P(k) dk = 2π|C|² ∫_{−∞}^{∞} dx ∫_{−∞}^{∞} dx′ ψ*(x) ψ(x′) δ(x − x′)
  = 2π|C|² ∫_{−∞}^{∞} ψ*(x) ψ(x) dx
  = 2π|C|².    (7.32)
If the particle is in a state described by a normalized wave function
ψ(x), the probability Pr([x₁, x₂]) of finding it between x₁ and x₂ can be written as
∫_{x₁}^{x₂} P(q) dq,
where
P(q) = |∫_{−∞}^{∞} ψq*(x) ψ(x) dx|²    (7.33)
with ψq (x) = δ(x − q). Indeed, replacing ψq (x) by δ(x − q) in this last equation gives
P(q) = |∫_{−∞}^{∞} δ(x − q) ψ(x) dx|² = |ψ(q)|²,    (7.34)
have no solutions in the Hilbert space of the square-integrable functions on (−∞, ∞).
However, let us proceed as if the vectors |qi and |pi existed, formed a complete set, and
were orthonormal in the sense that
⟨q|q′⟩ = δ(q′ − q) and ⟨p|p′⟩ = δ(p′ − p).
Then any vector |ψi could be written both as
|ψ⟩ = ∫_{−∞}^{∞} ψ(q) |q⟩ dq    (7.37)
and as
|ψ⟩ = ∫_{−∞}^{∞} φ(p) |p⟩ dp,    (7.38)
with, respectively,
ψ(q) = hq|ψi and φ(p) = hp|ψi. (7.39)
Eqs. (7.37) and (7.38) are formally equivalent to Eq. (7.36), except that their right-hand
sides are integrals rather than discrete sums. The coefficients ψ(q) and φ(p) of these
two expansions are wave functions. The functions ψ(q) (or ψ(x) if instead of q we
use the letter x to denote the position) are “wave functions in position space", and the
functions φ(p) are “wave functions in momentum space". The former are nothing else
than the wave functions you have already encountered in Year 1. We have already
mentioned their probabilistic interpretation: the density of probability that the particle
is found to be at a position q is |ψ(q)|² (i.e., |⟨q|ψ⟩|²). Likewise, the density of probability
that the particle is found to have a momentum p is |φ(p)|2 (i.e., |hp|ψi|2 ).
Both ψ(q) and φ(p) are sets of numbers representing the ket |ψi in the bases of the
“eigenvectors" of Q̂ and of those of P̂ , in the same way as {cn } is a set of numbers
representing |ψi in the {|ψn i} basis used to write Eq. (7.36). Working in the {|qi} basis
is working in the “position representation", and working in the {|pi} basis is working
in the “momentum representation".
+ In the same way as ket vectors are represented by wave functions, oper-
ators acting on ket vectors are represented by operators acting on wave
functions. For example, in a basis {|uj i, j = 1, 2, . . . , N } of orthonor-
mal ordinary ket vectors, the identity operator would be represented by
the matrix of elements hui | Iˆ|uj i, i.e., by the identity matrix (hui | Iˆ|uj i =
hui |uj i = δij since the basis is orthonormal). In the position representa-
tion, the identity operator would be represented by the operator of “ele-
ments" hq| Iˆ|q 0 i, i.e., by the delta function δ(q 0 − q) (note that hq| Iˆ|q 0 i =
hq|q 0 i = δ(q 0 − q)).
Let us call the position co-ordinate x now, rather than q. In the position representa-
tion, ket vectors are represented by functions of x, Q̂ by the operator multiplying any
function ψ(x) by x, and P̂ by the operator −i~ d/dx. Conversely, in the momentum
representation ket vectors are represented by functions of p, the momentum operator
P̂ by the operator multiplying any function φ(p) by p, and the position operator Q̂ by
the operator i~ d/dp (this last equation will be obtained in a workshop). Moreover, the
wave function in momentum space is the Fourier transform of the wave function in po-
sition space, and conversely the wave function in position space is the inverse Fourier
transform of the wave function in momentum space [see Eqs. (7.1) and (7.2)]:
φ(p) = (2πħ)^(−1/2) ∫_{−∞}^{∞} exp(−ipx/ħ) ψ(x) dx,    (7.40)
ψ(x) = (2πħ)^(−1/2) ∫_{−∞}^{∞} exp(ipx/ħ) φ(p) dp.    (7.41)
+ Let us justify these last two equations. We have seen that the functions
ψp(x) defined by Eq. (7.18) are normalized "eigenfunctions" of the momentum
operator. Hence we can write ψp(x) as ⟨x|p⟩ and ψp*(x) as ⟨p|x⟩:
⟨x|p⟩ = (2πħ)^(−1/2) exp(ipx/ħ),   ⟨p|x⟩ = (2πħ)^(−1/2) exp(−ipx/ħ).    (7.42)
Combining Eq. (7.37) with the identity φ(p) = hp|ψi yields
φ(p) = ∫_{−∞}^{∞} ψ(x) ⟨p|x⟩ dx,    (7.43)
which in view of Eq. (7.42) is nothing else than Eq. (7.40). Similarly,
Eq. (7.41) follows from the equation
ψ(x) = ∫_{−∞}^{∞} φ(p) ⟨x|p⟩ dp,    (7.44)
7.8 The commutator of Q and P
Recall that in the position representation the position operator Q and the momentum
operator P transform any wave function ψ(x) into, respectively, xψ(x) and −iħ dψ/dx.
As you have seen in the Term 1 course, these two operators do not commute; instead,
[Q, P] = iħ Î.
It follows from Eq. (6.20) that measurements of the corresponding dynamical variables
(x and p) are subject to the uncertainty relation ∆x∆p ≥ ~/2.
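+ A finite-difference check (Python/NumPy) that (QP − PQ)ψ = iħψ for a sample wave function; ħ is set to 1 and the wave function is an arbitrary smooth choice.

    import numpy as np

    hbar = 1.0
    x = np.linspace(-10.0, 10.0, 20001)
    psi = np.exp(-x**2 / 2) * np.cos(2 * x)        # an arbitrary smooth test function

    def P(f):
        # Momentum operator -i hbar d/dx, by central differences
        return -1j * hbar * np.gradient(f, x)

    comm = x * P(psi) - P(x * psi)                 # [Q, P] psi
    inner = slice(100, -100)                       # avoid the one-sided edge stencils
    print(np.allclose(comm[inner], 1j * hbar * psi[inner], atol=1e-4))  # True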
7.9 Position and momentum operators in 3D space
In 3D, the position operator for one Cartesian direction commutes with the momentum
operators for the two orthogonal directions:
[x̂, p̂y] = [x̂, p̂z] = 0,  [ŷ, p̂x] = [ŷ, p̂z] = 0,  [ẑ, p̂x] = [ẑ, p̂y] = 0.    (7.47)
Moreover, position operators commute with position operators and momentum oper-
ators with momentum operators:
[x̂, ŷ] = [x̂, ẑ] = [ŷ, ẑ] = [p̂x , p̂y ] = [p̂x , p̂z ] = [p̂y , p̂z ] = 0. (7.48)
All that we have seen for the 1D case generalizes to the 3D case. For example, in the posi-
tion representation, the 1D momentum operator −iħ d/dx becomes the 3D momentum
operator −iħ∇, where
∇ = x̂ ∂/∂x + ŷ ∂/∂y + ẑ ∂/∂z,    (7.50)
with x̂, ŷ and ẑ the unit vectors in the x-, y- and z-directions (the hats here indicate
that these vectors have unit norm, not that they are operators). Likewise, the functions
ψk (x) of Eq. (7.20) become the "plane waves"
ψk(r) = (2π)^(−3/2) exp(i k · r).    (7.51)
Therefore,
∫ ψk*(r) ψk′(r) d³r = (2π)^(−1) ∫_{−∞}^{∞} exp[i(k′x − kx)x] dx
  × (2π)^(−1) ∫_{−∞}^{∞} exp[i(k′y − ky)y] dy
  × (2π)^(−1) ∫_{−∞}^{∞} exp[i(k′z − kz)z] dz.    (7.53)
Hence,
∫ ψk*(r) ψk′(r) d³r = δ(k′x − kx) δ(k′y − ky) δ(k′z − kz).    (7.54)
More succinctly,
∫ ψk*(r) ψk′(r) d³r = δ(k′ − k),    (7.55)
7.10 Continua of energy levels
For systems described by state vectors belonging to a finite dimensional space, the
eigenvalues of the Hamiltonian (the eigenenergies of the system) are finite in number
and form a discrete distribution (i.e., each energy level is separated from the adjacent
levels by an energy gap). This may or may not be the case in infinite dimensional spaces.
For example, harmonic oscillators and infinite square wells only have discrete energy
levels. However, there are also systems for which the Hamiltonian has no eigenvalue
in the mathematical definition of the term but has generalized eigenvalues forming
a continuous distribution, and systems which have both discrete eigenvalues and a
continuous distribution of generalized eigenvalues.
A free particle in 3D is an example of a system with a continuous distribution of gener-
alized eigenvalues. That a particle of mass m is free means that its Hamiltonian can
be written as −(~2 /2m)∇2 , without a potential energy term. The functions ψk (r)
of Eq. (7.51) are generalized eigenfunctions of this Hamiltonian and the correspond-
ing generalized eigenenenergies are ~2 k 2 /(2m) with k = |k|: To see this, note that
∇ exp(i k · r) = ik exp(i k · r) and therefore ∇ · ∇ exp(i k · r) = (i)2 k · k exp(i k · r),
whence
−(ħ²/2m) ∇² ψk(r) = (ħ²k²/(2m)) ψk(r).    (7.57)
Since the wave number k varies continuously, these generalized eigenenergies are con-
tinuously distributed and form a continuum of energy levels.
Typically systems such as finite square wells, atoms, molecules and nuclei have both
discrete energy levels and a continuum of energy levels. Atomic hydrogen is a good
example of such systems. You have studied the bound states of that atom in Term 1.
The corresponding energy levels are discrete and correspond to wave functions which
go to zero for r → ∞, where r is the distance between the electron and the nucleus.
This means that the electron has a vanishingly small probability to be arbitrarily far
away from the nucleus (which is why we can say that in such states the electron is
bound to the nucleus).
The energy of these bound states is negative, which can be understood from the fol-
lowing argument: Suppose for an instant that the electron is a classical particle with a
well defined trajectory obeying the rules of Classical Mechanics. Its energy, E, would
then be the sum of its potential energy,
V(r) = −(e²/4πε₀) (1/r),    (7.58)
which is negative for all values of r, and its kinetic energy, T , which cannot be negative
(mv 2 /2 ≥ 0 since m > 0 and v 2 ≥ 0). Classically, an electron with a total energy E =
T +V (r) can be located at any distance r from the nucleus at which T = E −V (r) ≥ 0.
I.e., when E < 0 the electron cannot go beyond the distance rE at which V (rE ) = E,
according to the rules of Classical Mechanics.
According to the rules of Quantum Mechanics, however, the electron may go beyond
rE by “tunnelling" through the potential barrier, but the probability of finding it at a
distance r must then go to zero for r → ∞. Mathematically, the Schrödinger equation
has a solution finite everywhere and going to zero for r → ∞ only for certain values
of the energy; these values are the bound state eigenenergies you have found in Term
1.
There is no classical potential barrier for E > 0, though: Since V (r) < 0, the kinetic
energy, E − V (r), is positive at all values of r when E > 0. Therefore a classical
electron can go arbitrarily far from the nucleus if its total energy is positive. Corre-
spondingly, in Quantum Mechanics, the atom can be in an unbound state of positive
energy. As the electron can be arbitrarily far in such states, its wave function does not
need to go to zero for r → ∞. Therefore the boundary condition which restricts the
energy of bound states to discrete values does not apply for unbound states, with the
consequence that such states exist for any positive values of E. The corresponding
eigenenergies (in the sense of generalized eigenvalues of the Hamiltonian) form a con-
tinuous distribution. The corresponding wave functions can be obtained analytically,
but they are considerably more complicated than the bound state wave functions you
have studied in Term 1.
More generally, the energy eigenfunctions of a Hamiltonian H can correspond to bound
states or to continuum states. The eigenenergies of bound states form a discrete distri-
bution of energy levels and the corresponding eigenfunctions (ψi (r), say) are square-
integrable:
Hψi(r) = Ei ψi(r),  with  ∫ ψi*(r) ψj(r) d³r = δij.    (7.59)
(The index used here to label these eigenfunctions stands for the set of all the quantum
numbers necessary to identify each of the functions unambiguously. E.g., for atomic
hydrogen, i ≡ {n, l, m} where n is the principal quantum number, l the orbital angular
momentum quantum number and m the magnetic quantum number.)
The eigenenergies of continuum states are distributed continuously. The corresponding
eigenfunctions (ψk (r), say) are not square-integrable but can be normalized to a delta
function:
Hψk(r) = Ek ψk(r),  with  ∫ ψk*(r) ψk′(r) d³r = δ(k − k′).    (7.60)
(For simplicity, we label these continuum wave functions by a wave vector k; in some
applications of this formalism one may need additional quantum numbers or several
wave numbers to uniquely identify each of these eigenfunctions.)
Bound state wave functions are always orthogonal to continuum state wave functions
since these states correspond to different eigenenergies:
∫ ψn*(r) ψk(r) d³r = 0.    (7.61)
Moreover, any wave function ψ(r) can be expanded on this complete set of eigenfunctions:
ψ(r) = ∑_n cn ψn(r) + ∫ c(k) ψk(r) d³k,
with cn = ∫ ψn*(r′) ψ(r′) d³r′ and c(k) = ∫ ψk*(r′) ψ(r′) d³r′. Substituting these
coefficients into the expansion gives
ψ(r) = ∫ [ ∑_n ψn(r) ψn*(r′) + ∫ ψk(r) ψk*(r′) d³k ] ψ(r′) d³r′.
Since this relation must hold for any wave function ψ(r) and at any position vector r,
we see that
∑_n ψn(r) ψn*(r′) + ∫ ψk(r) ψk*(r′) d³k = δ(r′ − r).    (7.65)
This equation generalizes the completeness relation we have derived for the case of
a finite-dimensional Hilbert space in Section 5.2 of these notes. Here we work in an
infinite-dimensional Hilbert space, with the consequences that the summation over n
may encompass an infinite number of terms and that continuum eigenstates may need
to be included. Clearly, Eq. (7.65) includes an integral over k only if H has a continuous
spectrum and a sum over n only if H has a discrete spectrum.
7.11 Probabilities and wave functions
To recap, one can represent the quantum state of a particle by way of a wave function.
(We ignore spin here — the case of a particle with a non-zero spin is similar but slightly
more complicated as it involves column vectors of wave functions rather than a single
wave function.) One may choose to work in the position representation, in which case
the wave function is a function of the position of the particle (e.g., ψ(x) in 1D or ψ(r) in
3D). Alternatively one may choose to work in the momentum representation, in which
case the wave function is a function of the momentum of the particle (e.g., φ(p) in 1D
or φ(p) in 3D).
The squared modulus of the wave function is a density of probability. Integrating this
density of probability over a range of positions or momenta gives the probability that
the particle has a position or a momentum in that range. This fact is often expressed in
the following ways:
7.12 The parity operator
You may have come across the following terminology in a previous course: a func-
tion f (x) is said to be of even parity (or more simply, to be even) if f (−x) = f (x)
and of odd parity if f (−x) = −f (x). For example, cos x is an even function
since cos(−x) = cos x, whereas sin x is an odd function since sin(−x) = − sin x.
The function exp(x/a), with a a constant, is neither even nor odd, since changing
x into −x changes this function into exp(−x/a), which is neither exp(x/a) nor
− exp(x/a). The function xn is even (odd) for even (odd) integer values of n.
We are interested in the operator which transforms any function ψ(x) into the func-
tion ψ(−x). It is called the parity operator and we will denote it by Π:
Πψ(x) = ψ(−x).
The parity operator can also be defined for functions of several variables: it transforms
ψ(x₁, y₁, z₁, . . . , xN, yN, zN) into ψ(−x₁, −y₁, −z₁, . . . , −xN, −yN, −zN). Such
functions, too, may be even
or odd under this operation: As in 1D, ψ(x1 , y1 , z1 , . . . , xN , yN , zN ) is even if
ψ(−x1 , −y1 , −z1 , . . . , −xN , −yN , −zN ) = ψ(x1 , y1 , z1 , . . . , xN , yN , zN ) and is
odd if ψ(−x1 , −y1 , −z1 , . . . , −xN , −yN , −zN ) = −ψ(x1 , y1 , z1 , . . . , xN , yN , zN ).
+ In 1D, the eigenfunctions of the Hamiltonian are either even or odd under a
parity transformation if the potential energy is even, i.e., if V (x) = V (−x)
for all x. In 3D, they are either even or odd, or may be chosen to be either
even or odd, if the potential energy is even, i.e., if V(r) = V(−r) for any
r.
See Fig. 6 of Professor Cole’s notes for an example: the eigenfunctions of
the 1D harmonic oscillator with potential energy V (x) = (1/2)kx2 are
alternatively even and odd, as should be expected since V (x) = V (−x)
for this system.
Proof: For simplicity, we will only consider the case of a single particle
in 3D. The proof generalizes immediately to the case of N particles. Sup-
pose that ψ(x, y, z) is an eigenfunction of the Hamiltonian: Hψ(x, y, z) =
Eψ(x, y, z), with
H = −(ħ²/2m)(∂²/∂x² + ∂²/∂y² + ∂²/∂z²) + V(x, y, z).    (7.68)
We first note that Π commutes with each of the second derivatives in H;
for example, for any function φ(x, y, z),
Π [∂²φ/∂x²](x, y, z) = [∂/∂(−x)] [∂/∂(−x)] φ(−x, −y, −z)
  = (−1)² (∂/∂x)(∂/∂x) φ(−x, −y, −z)
  = (∂²/∂x²) Πφ(x, y, z).    (7.69)
Also, we note that Π commutes with V(x, y, z) if V(x, y, z) =
V(−x, −y, −z). Π then also commutes with H, and it follows from gen-
eral theorems (1) that Πψ(x, y, z) is also an eigenfunction of H, and
(2) that ψ(x, y, z) and Πψ(x, y, z) belong to the same eigenenergy (see
page 88).
Let us introduce the new functions ψ+ (x, y, z) = ψ(x, y, z) +
ψ(−x, −y, −z) and ψ− (x, y, z) = ψ(x, y, z) − ψ(−x, −y, −z). Clearly,
ψ+ (x, y, z) is of even parity and ψ− (x, y, z) is of odd parity, and
ψ+ (x, y, z) is orthogonal to ψ− (x, y, z). Given that Hψ = Eψ and
HΠψ = EΠψ, we also have that Hψ± = Eψ± .
If the eigenenergy E is not degenerate, ψ₊(x, y, z) and ψ₋(x, y, z) can-
not both be eigenfunctions belonging to E, since they are linearly independent
(unless one of them is identically zero). Thus either ψ₊(x, y, z) ≡ 0 or ψ₋(x, y, z) ≡ 0, in which case either
ψ(x, y, z) = −ψ(−x, −y, −z) or ψ(x, y, z) = ψ(−x, −y, −z).
If the eigenenergy E is degenerate, then any pair of linearly independent
eigenfunctions ψ(x, y, z) and Πψ(x, y, z) can always be replaced by the
linear combinations ψ(x, y, z) + Πψ(x, y, z) and ψ(x, y, z) − Πψ(x, y, z),
which are linearly independent and respectively even and odd.
The eigenenergies of a 3D system may or may not be degenerate (e.g.,
ignoring spin and relativistic effects, the ground state energy of atomic
hydrogen is not degenerate but all the other energy levels are). However,
in 1D the eigenenergies are never degenerate (proving this would require
a discussion) and as a consequence the energy eigenfunctions are always
of a well defined parity if the potential is even.
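+ A quick numerical check (Python/NumPy) that the lowest eigenfunctions of the 1D harmonic oscillator, ψn(x) ∝ Hn(x) exp(−x²/2), are alternately even and odd:

    import numpy as np
    from numpy.polynomial.hermite import hermval

    x = np.linspace(-5.0, 5.0, 1001)               # symmetric grid, so psi[::-1] is psi(-x)

    for n in range(6):
        coeffs = [0] * n + [1]                     # selects the Hermite polynomial H_n
        psi_n = hermval(x, coeffs) * np.exp(-x**2 / 2)   # unnormalized eigenfunction
        if np.allclose(psi_n, psi_n[::-1]):
            print(n, "even")
        elif np.allclose(psi_n, -psi_n[::-1]):
            print(n, "odd")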
8 Quantum harmonic oscillators
This part of the course is essentially a brief revision of what you have seen in the
Michaelmas Term, with an extension to 3D oscillators. Harmonic oscillators are of
great importance in Classical Mechanics, not least because the motion of systems
of particles in the vicinity of a configuration of stable equilibrium can often be described
in terms of coupled harmonic oscillators. This is also the case in Quantum Mechanics.
already seen several times:
H = −(ħ²/2m) d²/dx² + (1/2) mω²x².    (8.4)
It turns out that the time-independent Schrödinger equation can be solved exactly for
this Hamiltonian, either as a differential equation or by using algebraic methods based
on the ladder operators mentioned in the next section. The following is found:
1. The eigenenergies of the linear Harmonic oscillator and the corresponding en-
ergy eigenstates can be labelled by an integer n which can take any non-negative
value:
Ĥ |ψn i = En |ψn i, n = 0, 1, 2, . . . , (8.5)
or in the position representation, Hψn(x) = En ψn(x), with En = (n + 1/2)ħω.
These eigenenergies thus form a ladder of equally spaced energy levels. The
bottom “rung" — i.e., the ground state energy — is E0 = ~ω/2, and each level is
separated from the adjacent levels by an energy ~ω. (Note that the ground state
energy is non-zero.)
In terms of the ladder operators â and ↠defined in Eq. (8.12) below, the Hamiltonian
can be written as Ĥ = ħω(â†â + 1/2).
Note that this Hamiltonian is exactly the same as the Hamiltonian Ĥ of Eq. (8.3); the
only difference is that Ĥ is now written in terms of the operators â and ↠instead of
the operators x̂ and p̂x .
+ In Classical Mechanics, the Hamiltonian H(p, x) = p²/2m + mω²x²/2 of the
oscillator can be written as A∗A, where
A = (mω²/2)^(1/2) x + i (1/(2m))^(1/2) p.    (8.11)
Passing to the quantum mechanical Hamiltonian is then done by setting
a = A/√(ħω), writing H(p, x) as ħω(aa∗ + a∗a)/2, and replacing a by
the operator â and a∗ by the adjoint of â, with
â = (mω/2ħ)^(1/2) x̂ + i (1/(2mħω))^(1/2) p̂,   â† = (mω/2ħ)^(1/2) x̂ − i (1/(2mħω))^(1/2) p̂.    (8.12)
A and A∗ were divided by √(ħω) in the above so as to make the cor-
responding operators â and ↠dimensionless. Since a∗a ≡ aa∗,
ħω(aa∗ + a∗a)/2 ≡ ħω a∗a. However, ħω(â†â + ââ†)/2 ≠ ħω â†â since
â†â ≠ ââ†. Replacing a and a∗ by â and ↠in ħω(aa∗ + a∗a)/2 rather
than in ħω a∗a ensures that the correct quantum mechanical Hamilto-
nian is obtained. (Note that ħω(â†â + ââ†)/2 = ħω(â†â + 1/2) since
â↠= â†â + 1.)
As was shown in the term 1 QM course, the energy levels En can be deduced from
Eq. (8.9) and from the commutation relation of â and ↠by a purely algebraic method
(i.e., without solving the Schrödinger equation as a differential equation). A key result
from this approach is that it is possible to find a set of normalized energy eigenstates
{|ψn i, n = 0, 1, . . .} such that
â|ψn⟩ = √n |ψn−1⟩  with  â|ψ0⟩ = 0,    (8.13)
â†|ψn⟩ = √(n + 1) |ψn+1⟩.    (8.14)
A number of other results can be derived from this, e.g., that n̂ = ↠â is a number
operator:
n̂|ψn i = n|ψn i, (8.15)
and also that Ĥ |ψn i = ~ω(n + 1/2)|ψn i. Going up from the energy level En to the
one immediately above amounts to adding a “quantum of energy" ~ω to En . Compared
to the ground state |ψ0 i, the energy eigenstate |ψn i can be understood as containing n
quanta of energy. The operator n̂ thus “counts" the number of energy quanta contained
in the states it acts on.
+ Iterating Eq. (8.14) gives all the normalized energy eigenstates |ψn i in
terms of the ground state, |ψ0 i:
|ψn⟩ = (1/√(n!)) (â†)ⁿ |ψ0⟩.    (8.16)
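+ Eqs. (8.13)–(8.16) are easy to check numerically in a truncated basis {|ψ0⟩, . . . , |ψN−1⟩}. A Python/NumPy sketch (the truncation size N is arbitrary; only the last basis state feels the truncation):

    import numpy as np

    N = 8
    a = np.diag(np.sqrt(np.arange(1.0, N)), k=1)   # a|psi_n> = sqrt(n)|psi_{n-1}>, Eq. (8.13)
    adag = a.conj().T                              # creation operator, Eq. (8.14)

    n_op = adag @ a                                # number operator, Eq. (8.15)
    print(np.diag(n_op))                           # 0, 1, 2, ..., N-1

    psi3 = np.zeros(N); psi3[3] = 1.0              # the state |psi_3>
    H = n_op + 0.5 * np.eye(N)                     # H/(hbar omega)
    print(H @ psi3)                                # = 3.5 |psi_3>

    comm = a @ adag - adag @ a                     # [a, a^dagger] = 1, except at the edge
    print(np.allclose(comm[:-1, :-1], np.eye(N - 1)))  # True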
Because these various properties hold irrespective of whether the Schrödinger equation
can or cannot be written as a differential equation, they carry over to systems which
cannot be formulated in the position representation — e.g., to photon fields in quantum
electrodynamics. Similar ladder operators are also widely used in quantum field theory.
You may also remember that the eigenvalues of angular momentum operators can be
derived algebraically using ladder operators.
Extension to 3D
Similar ladder operators can be introduced for 3D harmonic oscillators, namely âx and
â†x (the same as â and ↠) and also ây , â†y , âz and â†z . The operators ây and âz and their
adjoints are related to the position operators ŷ and ẑ and to the momentum operators
p̂y and p̂z in the same way as âx and â†x are related to x̂ and p̂x . These operators are
such that
[âx , â†x ] = [ây , â†y ] = [âz , â†z ] = 1. (8.17)
Recall that position and momentum operators pertaining to orthogonal directions al-
ways commute with each other, and also that position operators always commute with
other position operators and momentum operators always commute with other mo-
mentum operators. Therefore ladder operators pertaining to orthogonal directions also
commute with each other. In particular,
[âx, ây] = [âx, â†y] = 0, and similarly for the other pairs of orthogonal directions.
8.3 The coherent states of a simple harmonic oscillator
As you may remember from a previous Quantum Mechanics course, the wave func-
tion Ψ(r, t) describing the quantum state of a free particle can always be written as
a superposition of plane waves — i.e., as an integral of the form
Ψ(r, t) = (2π)^(−3/2) ∫ φ(k) exp(ik · r) exp(−iEk t/ħ) d³k,    (8.21)
where Ek = ~2 k 2 /2m and φ(k) is a certain function of the wave vector k. (We stress
that this result applies to the case of a free particle, namely a particle not interact-
ing with anything. The potential energy of a free particle is the same everywhere
in space and at all times, and can be taken to be identically zero by choice of the
origin of the energy scale.) How the probability density |Ψ(r, t)|2 varies with r and
t depends on the function φ(k), and so does the uncertainty ∆x on the position of
this particle. However, whatever φ(k) is, it is always the case that ∆x will increase
without limit as t → ∞: a free particle always becomes more and more delocalized
at large times. This delocalization is often referred to as the spreading of the wave
packet.
Remarkably, non-spreading wave functions are possible for the case of a particle
trapped in a quadratic potential well (i.e., if the particle is a simple harmonic oscil-
lator). I.e., when the Hamiltonian is given by Eq. (8.4), the Schrödinger equation
iħ ∂Ψ/∂t = HΨ(x, t)    (8.22)
has solutions for which ∆x remains constant in time. These wave functions describe
a particular class of states called coherent states (the word “coherent" meaning that
the wave packet “coheres", i.e., remains together rather than spreads out). Here is
a list of interesting facts about coherent states (see the homework and workshop
problems associated with the course for proofs of many of these results):
1. The coherent states are the eigenstates of the ladder operator â. Any real or
complex number α is an eigenvalue of â and the coherent states are described
by the corresponding eigenvectors: If
â|α⟩ = α|α⟩,    (8.23)
then |α⟩ describes a coherent state. (In contrast, the operator ↠has no eigen-
vector.)
2. The symbol |αi is usually reserved for the normalized eigenvectors of â. The
coherent state |0i corresponding to α = 0 is the ground state of the oscillator.
(I.e., |0i = |ψ0 i in the notation used above.) Coherent states other than |0i are
not eigenstates of the Hamiltonian. Hence they depend on time (when one
works in the so-called Schrödinger representation, see Chapter 11 of these
notes). Within an overall constant factor,
|α⟩(t) = exp(−|α|²/2) ∑_{n=0}^{∞} (αⁿ/√(n!)) exp(−iEn t/ħ) |ψn⟩,    (8.24)
where the energies En and the energy eigenstates |ψn i are defined as in
Eqs. (8.7) and (8.16).
3. The expectation values of the position and of the momentum vary in time like
the position and the momentum of a simple harmonic oscillator in classical
mechanics. Specifically, in a state |αi,
⟨x⟩(t) = 2|α| (ħ/2mω)^(1/2) cos(ωt − arg α),    (8.25)
⟨p⟩(t) = −2|α| (mħω/2)^(1/2) sin(ωt − arg α).    (8.26)
The modulus and the argument of the complex number α thus define the
amplitude and the phase of the oscillation.
where C is a normalization constant. Since φα (x) is the ground state wave
function when α = 0, a coherent state can be described as a “displaced ground
state".
9 Tensor products of Hilbert spaces
Recall the Hamiltonian of a 1D harmonic oscillator:
H1D = −(ħ²/2m) d²/dx² + (1/2) mω²x².    (9.1)
Similarly, for a 3D isotropic oscillator,
H3D = −(ħ²/2m)(∂²/∂x² + ∂²/∂y² + ∂²/∂z²) + (1/2) mω²(x² + y² + z²),    (9.2)
or, more compactly,
H3D = −(ħ²/2m)∇² + (1/2) mω²r².    (9.3)
(Isotropic means “the same in every direction": the potential energy depends only on
r, the distance of the particle to the point of equilibrium, not on the polar angles θ and
φ describing its angular position.) For a 2D isotropic oscillator,
H2D = −(ħ²/2m)(∂²/∂x² + ∂²/∂y²) + (1/2) mω²(x² + y²).    (9.4)
We observe that H2D can also be written as the sum of two 1D Hamiltonians, one in x
and one in y:
H2D = H1Dx + H1Dy , (9.5)
where H1Dx and H1Dy are given by the following equations:
H1Dx = −(ħ²/2m) ∂²/∂x² + (1/2) mω²x²,   H1Dy = −(ħ²/2m) ∂²/∂y² + (1/2) mω²y².    (9.6)
(We write H1Dx and H1Dy in terms of partial derivatives, contrary to what we did in
Eq. (9.1), because we are now dealing with several independent variables. It is custom-
ary to write derivatives as total derivatives rather than partial derivatives when there
is only one independent variable.)
Since the eigenenergies of a 1D harmonic oscillator are always of the form ~ω(n +
1/2), where n is any non-negative integer, the eigenenergies of H1Dx and of H1Dy can
be written as ~ω(n + 1/2) and ~ω(n0 + 1/2), respectively, with n, n0 = 0, 1, 2, . . .
Let us denote by ψn (x) a normalized eigenfunction of H1Dx with eigenenergy En =
~ω(n + 1/2) and by ψn0 (y) a normalized eigenfunction of H1Dy with eigenenergy
En0 = ~ω(n0 + 1/2) (n, n0 = 0, 1, 2, . . .):
H1Dx ψn (x) = En ψn (x), H1Dy ψn0 (y) = En0 ψn0 (y). (9.7)
The sets {ψn (x)} and {ψn0 (y)} are both orthonormal since we assume that the ψn (x)’s
and ψn0 (y)’s are normalized and different values of n or n0 correspond to different
eigenenergies:
∫_{−∞}^{∞} ψn*(x) ψm(x) dx = δnm,   ∫_{−∞}^{∞} ψn′*(y) ψm′(y) dy = δn′m′.    (9.8)
Since ψn (x) is an eigenfunction of H1Dx with eigenenergy En and ψn0 (y) is an eigen-
function of H1Dy with eigenenergy En0 , the product ψn (x)ψn0 (y) is an eigenfunction
of H2D with eigenenergy En + En0 :
H2D ψn (x)ψn0 (y) = [H1Dx ψn (x)] ψn0 (y) + ψn (x) [H1Dy ψn0 (y)]
= En ψn (x)ψn0 (y) + En0 ψn (x)ψn0 (y)
= (En + En0 )ψn (x)ψn0 (y). (9.9)
In fact, one can show that any eigenfunction of the Hamiltonian H2D is either a product
of the form ψn (x)ψn0 (y) or a linear combination of such products.
Given Eq. (9.8), it is easy to see that the products ψn (x)ψn0 (y) form an orthonormal
set:
∫_{−∞}^{∞}∫_{−∞}^{∞} [ψn(x) ψn′(y)]* [ψm(x) ψm′(y)] dx dy
  = [∫_{−∞}^{∞} ψn*(x) ψm(x) dx] [∫_{−∞}^{∞} ψn′*(y) ψm′(y) dy] = δnm δn′m′.    (9.10)
It is also possible to show that given any square-integrable function f (x, y), there al-
ways exists a set of constants cnn0 such that
f(x, y) = ∑_{n=0}^{∞} ∑_{n′=0}^{∞} cnn′ ψn(x) ψn′(y).    (9.11)
The products ψn (x)ψn0 (y) thus form an orthonormal basis spanning the space of the
functions square-integrable on the xy-plane. The coefficients cnn0 do not depend on x
or y but may depend on other variables, e.g., time. In particular, any time-dependent
wave function Ψ(x, y, t) can be written as an expansion of the form
Ψ(x, y, t) = ∑_{n=0}^{∞} ∑_{n′=0}^{∞} cnn′(t) ψn(x) ψn′(y).    (9.12)
Note what we are doing here: we combine two 1D systems into a single 2D system, and
write the wave functions of the latter in terms of the wave functions of the former.
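+ A numerical rendering of this construction (Python/NumPy): in a truncated basis, the 2D Hamiltonian is built from two copies of the 1D one with Kronecker products, and its eigenvalues are the sums En + En′, mirroring Eq. (9.9); ħ = ω = 1 and the truncation size is arbitrary.

    import numpy as np

    N = 5
    H1 = np.diag(np.arange(N) + 0.5)               # 1D oscillator energies n + 1/2
    I = np.eye(N)

    H2D = np.kron(H1, I) + np.kron(I, H1)          # H2D = H1Dx + H1Dy, cf. Eq. (9.5)
    print(sorted(np.diag(H2D))[:6])                # 1.0, 2.0, 2.0, 3.0, 3.0, 3.0

    # A product eigenfunction psi_n(x) psi_n'(y) corresponds to a Kronecker product:
    psi0 = np.zeros(N); psi0[0] = 1.0
    psi1 = np.zeros(N); psi1[1] = 1.0
    state = np.kron(psi0, psi1)
    print(state @ H2D @ state)                     # E_0 + E_1 = 2.0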
Now, rather than a single particle confined to the xy-plane, consider two particles con-
fined to the x-axis — i.e., a particle of mass mA and coordinate xA and a particle of
mass mB and coordinate xB . We denote the Hamiltonian of the first particle by HA
and the Hamiltonian of the second particle by HB , and take
HA = −(ħ²/2mA) ∂²/∂xA² + (1/2) mA ωA² xA²,    (9.13)
HB = −(ħ²/2mB) ∂²/∂xB² + (1/2) mB ωB² xB².    (9.14)
It might well be possible to treat these two harmonic oscillators as if they were com-
pletely on their own, and, doing this, describe the quantum state of the first one by a
certain wave function ΨA (xA , t) and the quantum state of the other by a certain wave
function ΨB(xB, t). However, it would be necessary to treat them as forming a sin-
gle 2-particle system, rather than two 1-particle systems, if they were interacting with
each other. Typically, an interaction between the two oscillators would depend on the
position of particle B relative to particle A and would be represented by a potential
energy term V(xA − xB) in the Hamiltonian of the joint system, HAB:
HAB = HA + HB + V(xA − xB).
Treating the two oscillators as a single system implies that the quantum state of
this system is described by a wave function Ψ(xA , xB , t) rather than by separate wave
functions ΨA (xA , t) and ΨB (xB , t).
At this point, we note that HAB reduces to the sum HA + HB in the absence of this in-
teraction. HAB is then mathematically equivalent to the Hamiltonian H2D of Eq. (9.5),
apart from a trivial change of notation and the unimportant difference that the mass and
angular frequency of oscillators A and B may not be the same. Proceeding as above,
we can introduce a complete set of normalized eigenfunctions of HA and a complete
set of normalized eigenfunctions of HB , respectively {ψAn (xA ), n = 0, 1, 2, . . .} and
{ψBn′(xB), n′ = 0, 1, 2, . . .}, such that HA ψAn(xA) = EAn ψAn(xA) and
HB ψBn′(xB) = EBn′ ψBn′(xB).
Considering the two harmonic oscillators as a single system may be necessary even if
they are not interacting. For example, take the wave function ψA0 (xA )ψB1 (xB ), which
describes a state where the first oscillator is in its ground state and the second in its
lowest excited state, and the wave function ψA1 (xA )ψB0 (xB ), which describes a state
where the first oscillator is in its lowest excited state and the second in its ground state.
(We do not indicate a dependence on time, here, to keep the notation as simple as pos-
sible. This dependence is not important for our discussion. Assume, e.g., that we are
considering these wave functions at the instant t = 0.) These two wave functions de-
scribe possible states of the system formed by the two non-interacting oscillators. Hence, by
the Principle of Superposition, any linear combination of these wave functions [e.g.,
ψA0(xA)ψB1(xB) + ψA1(xA)ψB0(xB)] also describes a possible state of this system.
Such linear combinations link the state of the first oscillator to the state of the sec-
ond oscillator; therefore they do not describe quantum states in which the state of one
oscillator can be treated independently from the state of the other.
Terminology
In relation to this first example:
• These two oscillators, when considered as a single system, are said to form a bi-
partite quantum system (i.e., a quantum system composed of two distinct parts
which can be considered jointly or in isolation, depending on the circumstances).
Systems composed of more than two distinct parts are called multipartite sys-
tems.
• A quantum state of the joint system described by a wave function which can be
factorized in a product of the form ΨA (xA , t)ΨB (xB , t) is called a separable (or
product) state (e.g., the state described by the wave function ψA0 (xA )ψB1 (xB )
is separable).
• Quantum states that are not separable are called entangled states (e.g., the state
described by the wave function ψA0 (xA )ψB1 (xB ) + ψA1 (xA )ψB0 (xB ) is entan-
gled).
Suppose, for example, that the two oscillators are in a separable state described by a normalized wave function φAB (xA , xB ) which is a superposition of the four products ψAi (xA )ψBj (xB ) with i, j = 0 or 1, and that Alice measures whether oscillator A is in its ground state or its first excited state while, far away, Bob does the same for oscillator B. By the Born rule, the probability Pr(0, 0; φAB ) that both find their oscillator in the ground state is the squared modulus of the inner product of ψA0 (xA )ψB0 (xB ) with φAB (xA , xB ). The integral is readily calculated using the orthonormality of the products ψAi (xA )ψBj (xB ), with the result that Pr(0, 0; φAB ) = 2/15. Similarly, there is a probability of 4/15 that
Alice finds her oscillator to be in the excited state and Bob finds his to be in the ground
state, of 3/15 that Alice finds hers to be in the ground state and Bob finds his to be in
the excited state, and of 6/15 that both Alice and Bob find their oscillator to be in the
excited state:
$$\Pr(0, 0; \phi_{AB}) = 2/15, \quad \Pr(1, 0; \phi_{AB}) = 4/15, \quad \Pr(0, 1; \phi_{AB}) = 3/15, \quad \Pr(1, 1; \phi_{AB}) = 6/15. \qquad (9.22)$$
It is worth noting that there is no correlation between the results found by Alice and
those found by Bob: whether Bob finds his oscillator to be in the ground state or in
the excited state, the probability that Alice finds hers to be in the ground state is half
the probability that she finds it to be in the excited state, and similarly, whatever Alice
finds for her oscillator, the probability that Bob finds his to be in the ground state is 2/3
the probability that he finds it to be in the excited state.
We would arrive at the same conclusion for any separable state: in such states, the results of any measurement on A are completely independent of the results of any measurement on B.
Instead, let us now assume that these two oscillators are in an entangled state of nor-
malized wave function ψAB (xA , xB ), with
$$\psi_{AB}(x_A, x_B) = \frac{1}{\sqrt{2}}\left[\psi_{A0}(x_A)\psi_{B1}(x_B) + \psi_{A1}(x_A)\psi_{B0}(x_B)\right]. \qquad (9.23)$$
The probability that both Alice and Bob find their oscillator to be in the ground state is
now zero, since ψA0 (xA )ψB0 (xB ) is orthogonal to both ψA0 (xA )ψB1 (xB ) and ψA1 (xA )ψB0 (xB ).
Repeating the calculation for the other product functions gives

$$\Pr(1, 0; \psi_{AB}) = \Pr(0, 1; \psi_{AB}) = 1/2, \qquad \Pr(1, 1; \psi_{AB}) = 0. \qquad (9.24)$$
+ At first sight, the perfect anticorrelation between the results found by Alice and by Bob in the measurements mentioned above is no different from what one observes in everyday life. E.g., if Alice and Bob were tossing a coin between them, there would be a perfect anticorrelation between
ing a coin between them, there would be a perfect anticorrelation between
Alice’s wins and Bob’s wins: Bob would lose each time Alice would win
and Bob would win each time Alice would lose. This anticorrelation is
the same as that found in the measurements on the harmonic oscillators.
However, there is a profound difference between the quantum correlations
observed in measurements on entangled states and the classical correla-
tions observed in everyday life. This difference does not manifest itself in the measurements discussed above but may be (and has been) revealed by well-chosen experiments.
Saying more about this difference would take us far beyond the scope of
the course. However, its conceptual importance can be appreciated from
the following. When you are tossing a coin, you take it for granted that
one of the two sides faces upwards before you check which one does, and
if you find this side to be head then you take it for granted that it was
head before you checked. By analogy, you may think that if Alice finds
her oscillator to be in the ground state rather than in the excited state,
and Bob finds the opposite for his oscillator, this must have been because
these two oscillators were in these states before they checked them. The
alternatives seem bizarre: if they were not in these states, then perhaps the
measurement itself would force them to be in these states through some in-
teraction between the two oscillators; however, such an interaction would
need to propagate faster than the speed of light since Alice and Bob may be
millions of kilometers away. Or perhaps it is not possible to say that oscillator A is truly separated from oscillator B, however far apart they might be? Everyday intuition suggests instead that, if A is found to be in the ground state and B in the excited state, it is because A was in the ground state and B in the excited state immediately before the measurement. However, an in-depth discussion of correlations between results of
measurements on entangled states shows that this view is untenable: when
in the state ψAB (xA , xB ), neither of the two oscillators can be assigned a
definite state (ground or excited) prior to the measurement. But if this is
the case for a quantum object, why wouldn’t it also be the case for a coin?
Could it be that which of the two sides faces up becomes determined only
when someone looks at it? Or instead, could it be that all that the wave functions ψAB (xA , xB ), ψA0 (xA ), ψB1 (xB ), etc., refer to is what we can
say about the system, irrespective of what may actually be the case?
These issues of interpretation of Quantum Mechanics are deep, difficult
and unsettled.
Example 2
The two oscillators of Example 1 are distinct particles and are spatially separated.
However, what we have seen above with regard to the entanglement of the states of separated particles also applies to the entanglement of different degrees of freedom of the same particle — e.g., entanglement of its position and its spin, as we will now see.
Recall the Stern-Gerlach experiment: a beam of silver atoms divides into two branches when passing through an inhomogeneous magnetic field B. The beam splits in two because each atom has a magnetic dipole moment µ and experiences a force ∇(µ · B) when passing through the field, and because the component of µ in the direction of B has only two possible values. As these two values are spin-dependent,
the interaction with the magnetic field couples the spin state of each atom to its spa-
tial wave function. This coupling makes it necessary to consider spin and position
together rather than separately (in the same way as an interaction between two os-
cillators makes it necessary to consider them together rather than separately, i.e., as
a single system rather than as two separate systems).
For simplicity, we represent each atom by a mass point of coordinate R, ignoring
its internal structure. We describe its position by a time-dependent wave function
Ψ(R, t) and its spin state by a column vector χ. Let us assume that
$$\chi = a \begin{pmatrix} 1 \\ 0 \end{pmatrix} + b \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \qquad (9.25)$$
(We have seen previously that the two column vectors appearing in the right-hand
side describe the states of spin up and of spin down.) Before the atom has entered
the magnetic field, Ψ(R, t) and χ are uncoupled. For simplicity, we take the joint
spatial and spin state of the atom to be described by the product Ψ(R, t)χ, thus by
$$a\,\Psi(\mathbf{R}, t) \begin{pmatrix} 1 \\ 0 \end{pmatrix} + b\,\Psi(\mathbf{R}, t) \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
The interaction with the magnetic field transforms this state into one of the form
$$a\,\Psi_\alpha(\mathbf{R}, t) \begin{pmatrix} 1 \\ 0 \end{pmatrix} + b\,\Psi_\beta(\mathbf{R}, t) \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$
where the wave functions multiplying the two spin states now describe distinct dis-
tributions of position. Clearly, this transformed space + spin wave function does
not describe a separable state since it cannot be written as the product of a wave
function depending on R with a spin state: the interaction with the magnetic field
entangles the atom’s spatial and spin degrees of freedom.
Suppose that the probability density |Ψα (R, t)|2 is practically zero everywhere ex-
cept in a certain region A, and that |Ψβ (R, t)|2 is practically zero everywhere except
in a certain region B. If these two regions do not overlap, atoms found in the region
A are necessarily in a state of spin up and those found in the region B are necessarily
in a state of spin down. There would be no correlation between spin and position if
the joint spatial and spin state of the atom was separable: instead, finding the posi-
tion of the atom would reveal nothing about its spin. For the above entangled state,
measuring the position of an atom is in effect measuring whether the atom is in a
state of spin up or a state of spin down.
Example 3
Our last example is the system formed by the two electrons of a helium atom. You
will see in the level 3 Quantum Mechanics course that in the ground state of helium,
the two electrons are in a joint spin state described by the following combination of
column vectors:
" #
− 1 1 0 0 1
ψ12 = √ − . (9.26)
2 0 1 1 2 1 1 0 2
Here the subscript attached to each column vector indicates whether this column
vector represents a spin state of the first electron or one of the second electron.
(The superscript − is traditional for that state and is a reminder of the minus sign in the right-hand side. We neglect spin-orbit coupling here.) Thus $\psi_{12}^-$ is a linear combination of a state in which electron 1 is spin up and electron 2 is spin down with a state in which electron 1 is spin down and electron 2 is spin up. It is not possible to write $\psi_{12}^-$ as a single product of a spin state of electron 1 with a spin state of electron 2; therefore $\psi_{12}^-$ describes an entangled state.
It is important to realize that the products of column vectors appearing in Eq. (9.26)
are not dot products or inner products of some kind. They represent pairs of column
vectors, in which one of these vectors pertains to one part of the system (electron 1)
and the other to another part (electron 2). Note the analogy with Eq. (9.23), in which
the two products ψA0 (xA )ψB1 (xB ) and ψA1 (xA )ψB0 (xB ) also represent states of
individual parts of the joint system.
To illustrate this formalism, let us imagine a thought experiment in which you would prepare a pair of electrons in the state $\psi_{12}^-$ and measure whether electron 1 is or is not in a state of spin up and electron 2 is or is not in the spin state represented by the normalized column vector

$$\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}_{\!2}.$$

From the Born rule, the probability of finding electron 1 in the state of spin up and electron 2 in that particular spin state is $|(\phi_{12}, \psi_{12}^-)|^2$, the square of the modulus of the inner product of $\psi_{12}^-$ with the vector $\phi_{12}$, with

$$\phi_{12} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}_{\!1}\,\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}_{\!2}. \qquad (9.27)$$
We note, at this stage, that it would not make sense to take the inner product of a column vector describing a state of electron 1 with one describing a state of electron 2, any more than it would have made sense, in the first example, to calculate the inner product of a function of xA with a function of xB . Calculating $|(\phi_{12}, \psi_{12}^-)|^2$ is done by taking the inner products of column vectors pertaining to the same electron and combining the results:

$$(\phi_{12}, \psi_{12}^-) = \frac{1}{2}\left[\begin{pmatrix} 1 & 0 \end{pmatrix}_{\!1}\begin{pmatrix} 1 \\ 0 \end{pmatrix}_{\!1}\,\begin{pmatrix} 1 & -i \end{pmatrix}_{\!2}\begin{pmatrix} 0 \\ 1 \end{pmatrix}_{\!2} - \begin{pmatrix} 1 & 0 \end{pmatrix}_{\!1}\begin{pmatrix} 0 \\ 1 \end{pmatrix}_{\!1}\,\begin{pmatrix} 1 & -i \end{pmatrix}_{\!2}\begin{pmatrix} 1 \\ 0 \end{pmatrix}_{\!2}\right] = (1 \times (-i) - 0 \times 1)/2 = -i/2. \qquad (9.28)$$

Hence $|(\phi_{12}, \psi_{12}^-)|^2 = 1/4$.
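The same amplitude can be obtained numerically by representing the pairs of column vectors by Kronecker products. The following is a minimal sketch, assuming only numpy; the variable names are ours, not part of the formalism.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)      # state of spin up
down = np.array([0, 1], dtype=complex)    # state of spin down

# The entangled two-electron state of Eq. (9.26):
psi_12 = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

# The product state of Eq. (9.27): electron 1 spin up, electron 2 in (1, i)/sqrt(2):
phi_12 = np.kron(up, np.array([1, 1j]) / np.sqrt(2))

amplitude = np.vdot(phi_12, psi_12)       # inner product, conjugating phi_12
print(amplitude)                          # -0.5j, i.e. -i/2, as in Eq. (9.28)
print(abs(amplitude)**2)                  # 0.25, i.e. 1/4
```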
+ You have come across still another example of a bipartite system in the Term 1 course, although it was not presented as such: the hydrogen atom. Ig-
noring relativistic effects, the quantum state of an atom of hydrogen-1 can
be described by a wave function Ψ(rpr , rel , t), where rpr and rel are, re-
spectively, the position vector of the proton and the position vector of the
electron. The Hamiltonian of this 2-particle system is
$$H_{\rm at} = -\frac{\hbar^2}{2m_{\rm pr}}\nabla^2_{\rm pr} - \frac{\hbar^2}{2m_{\rm el}}\nabla^2_{\rm el} - \frac{e^2}{4\pi\epsilon_0}\frac{1}{|\mathbf{r}_{\rm pr} - \mathbf{r}_{\rm el}|}, \qquad (9.29)$$
with mpr the mass of the proton, mel the mass of the electron, and ∇2pr and
∇2el the Laplace operators with respect to the coordinates of the proton and
those of the electron:
$$\nabla^2_{\rm pr} = \frac{\partial^2}{\partial x_{\rm pr}^2} + \frac{\partial^2}{\partial y_{\rm pr}^2} + \frac{\partial^2}{\partial z_{\rm pr}^2}, \qquad \nabla^2_{\rm el} = \frac{\partial^2}{\partial x_{\rm el}^2} + \frac{\partial^2}{\partial y_{\rm el}^2} + \frac{\partial^2}{\partial z_{\rm el}^2}. \qquad (9.30)$$
However, instead of writing the wave functions and the Hamiltonian in
terms of the coordinates of these two particles, we can also write them
in terms of the coordinates of the centre of mass of the atom and of the
coordinates of the electron with respect to the proton. Let us denote by
rCM the position vector of the centre of mass and by r the position of the electron relative to the proton:
$$\mathbf{r}_{\rm CM} = \frac{m_{\rm pr}\mathbf{r}_{\rm pr} + m_{\rm el}\mathbf{r}_{\rm el}}{m_{\rm pr} + m_{\rm el}}, \qquad \mathbf{r} = \mathbf{r}_{\rm el} - \mathbf{r}_{\rm pr}. \qquad (9.31)$$
Using rCM and r instead of rpr and rel is a transformation of the coordi-
nates. It can be shown that this transformation separates Hat into the sum
of a Hamiltonian HCM depending only on the coordinates of the centre of
mass and a Hamiltonian H depending only on the relative coordinates:
Hat = HCM + H (9.32)
with
$$H_{\rm CM} = -\frac{\hbar^2}{2M}\nabla^2_{\rm CM}, \qquad H = -\frac{\hbar^2}{2\mu}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r}. \qquad (9.33)$$
In these equations, ∇2CM and ∇2 are the Laplace operators with respect to
rCM and to r, M is the mass of the atom (M = mpr + mel ) and µ is the
reduced mass of the electron - proton system,
$$\mu = \frac{m_{\rm pr}\, m_{\rm el}}{m_{\rm pr} + m_{\rm el}}. \qquad (9.34)$$
The operator H is the Hamiltonian you have studied in Term 1 when you
obtained the energy levels of hydrogen. As written in Eq. (9.32), the Hamil-
tonian of the atom does not contain a term coupling the motion of the
electron relative to the nucleus to the centre of mass motion. The atom
can thus be in a separable state whose wave function is the product of a
function of rCM and a function of r. For such states, it makes sense to talk
about the eigenenergies and eigenfunctions of the Hamiltonian H without
reference to the motion of the atom as a whole. However, this is not the
case when the atom is in an entangled state in which its internal state is
not independent from its state of motion — e.g., in a state described by a
wave function of the form ΨCM (rCM , t)Ψ(r, t) + ΦCM (rCM , t)Φ(r, t).
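As an aside, the reduced mass of Eq. (9.34) is easily evaluated numerically; the short sketch below assumes scipy is available and uses its CODATA values for the particle masses.

```python
from scipy.constants import m_e, m_p   # electron and proton masses (kg)

mu = m_p * m_e / (m_p + m_e)           # Eq. (9.34)
print(mu / m_e)                        # ~0.99946: mu is slightly smaller than m_e
```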
9.2 Tensor products
As shown by the examples discussed in the previous section, it is often the case that
the system of interest is formed of distinct parts which need to be considered jointly.
Consider a bipartite system consisting of two individual quantum systems, namely sys-
tem A and system B. Suppose that the quantum states of system A are described by
vectors belonging to a certain Hilbert space, HA say, and those of system B by vectors
belonging to another Hilbert space, HB , say. Consider these two systems jointly, as a single quantum system. The quantum states of the joint system are then described by
vectors belonging to still another Hilbert space, HAB , called the tensor product of HA
and HB .
How HAB is related to HA and HB is not difficult to understand from the examples
of the previous section. We will work with ket vectors here. Consider, for example, a
ket vector |ψiA representing a state of system A and a ket vector |φiB representing a
state of system B. To these two ket vectors we can associate a vector |ψiA |φiB which
represents a state of the joint system (a product state in which system A is in the state
|ψiA and system B is in the state |φiB ).
It should be noted that the symbol |ψiA |φiB does not represent a product in the usual
sense of the word, even though what it represents is commonly referred to as the prod-
uct of |ψiA and |φiB . Properly speaking, it denotes what is called the tensor product
of |ψiA and |φiB . An alternative notation is |ψiA ⊗ |φiB , which makes it clear that we
are not talking about a usual product (the symbol ⊗ stands for the tensor product). We will use this notation throughout the rest of this section, for clarity, but not later in the course. Both |ψiA |φiB and |ψiA ⊗ |φiB represent the same thing, which is the pair {|ψiA , |φiB }.
+ As long as it is clear that |ψiA and |φiB refer, respectively, to a state of system
A and a state of system B, the order in which these two vectors appear in the
product does not matter: |ψiA ⊗ |φiB ≡ |φiB ⊗ |ψiA .
A basis of HAB can be constructed by taking the tensor products |ψn iA ⊗ |φm iB of the basis vectors |ψn iA of HA with the basis vectors |φm iB of HB . Since each of the vectors |ψn iA gets paired with each of the different vectors |φm iB , HAB can be of a much larger dimension than HA and HB : if HA is N -dimensional and HB is M -dimensional, then HAB is (N × M )-dimensional.
Note that the vectors belonging to HAB include not only these product basis vectors
but also all the linear combinations that can be made of them. The terminology intro-
duced in the previous section applies generally: a state of the whole system is called a
separable state (or a product state) if it can be represented by the tensor product of a
vector of HA and a vector of HB , and the others are called entangled states.
Inner products and operators
The inner product of two vectors of HAB is defined in terms of the inner products for
HA and HB : If |ηiAB = |ψiA ⊗ |φiB and |η′iAB = |ψ′iA ⊗ |φ′iB , then the inner product of |ηiAB and |η′iAB is obtained by multiplying the inner product of |ψiA and |ψ′iA by the inner product of |φiB and |φ′iB :

$${}_{AB}\langle\eta|\eta'\rangle_{AB} = {}_{A}\langle\psi|\psi'\rangle_{A} \times {}_{B}\langle\phi|\phi'\rangle_{B}. \qquad (9.35)$$

(Both ${}_{A}\langle\psi|\psi'\rangle_{A}$ and ${}_{B}\langle\phi|\phi'\rangle_{B}$ are complex numbers, hence the right-hand side of this equation is the product of two complex numbers.)
For example, take |ψn iA ⊗ |φm iB and |ψn′ iA ⊗ |φm′ iB , two of the basis vectors formed by the tensor products of the orthonormal vectors |ψn iA with the orthonormal vectors |φm iB . The inner product of |ψn iA ⊗ |φm iB and |ψn′ iA ⊗ |φm′ iB is ${}_{A}\langle\psi_n|\psi_{n'}\rangle_{A}\,{}_{B}\langle\phi_m|\phi_{m'}\rangle_{B}$, which is δnn′ δmm′ since both the |ψn iA ’s and the |φm iB ’s are orthonormal. Hence the vectors |ψn iA ⊗ |φm iB are also orthonormal.
+ Since |ψiA and |φiB belong to different Hilbert spaces, their inner product is
not defined: the symbol A hψ|φiB has no mathematical meaning.
+ Tensor products of operators are also used in applications. By definition, the tensor product of ÂA and B̂B is the operator ÂA ⊗ B̂B such that

$$(\hat{A}_A \otimes \hat{B}_B)\,(|\psi\rangle_A \otimes |\phi\rangle_B) = (\hat{A}_A|\psi\rangle_A) \otimes (\hat{B}_B|\phi\rangle_B)$$

for any |ψiA the operator ÂA may act on and any |φiB the operator B̂B may act on. Since ÂA and B̂B always act on different vectors, it is clear that ÂA ⊗ B̂B ≡ B̂B ⊗ ÂA .
For simplicity, the symbol ⊗ is usually omitted: in the same way that |ψiA ⊗ |φiB is often written |ψiA |φiB , the operator ÂA ⊗ B̂B is often written ÂA B̂B .
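The defining property of the tensor product of two operators is easy to check numerically: in matrix form, the tensor product is the Kronecker product. The sketch below is an illustration with arbitrarily chosen matrices, assuming numpy; it verifies that ÂA ⊗ B̂B acting on |ψiA ⊗ |φiB gives (ÂA |ψiA ) ⊗ (B̂B |φiB ).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))   # an operator on a 2-dimensional H_A
B = rng.standard_normal((3, 3))   # an operator on a 3-dimensional H_B
psi = rng.standard_normal(2)      # a vector of H_A
phi = rng.standard_normal(3)      # a vector of H_B

lhs = np.kron(A, B) @ np.kron(psi, phi)   # (A ⊗ B) acting on |psi>|phi>
rhs = np.kron(A @ psi, B @ phi)           # (A|psi>) ⊗ (B|phi>)
print(np.allclose(lhs, rhs))              # True
```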
10 Unitary transformations
• The most important property of unitary transformations is that they conserve the inner product: If |ψ′i = Û |ψi, |φ′i = Û |φi and Û is unitary, then

$$\langle\phi'|\psi'\rangle = \langle\phi|\psi\rangle.$$
For example, the Fourier transformation is a unitary transformation: the inner product of ψa (x) and ψb (x) is equal to the inner product of their Fourier transforms.
Suppose, further, that the vector |ψi is mapped to the vector |ηi = Â|ψi by an operator Â, and that |ψi and |ηi are transformed into the vectors |ψ′i and |η′i by a certain unitary operator, Û :

$$|\psi'\rangle = \hat{U}|\psi\rangle, \qquad |\eta'\rangle = \hat{U}|\eta\rangle.$$

Then we see that |η′i = Û Â|ψi = Û ÂÛ † |ψ′i, where in the last step we have used the facts that |ψi = Û −1 |ψ′i and that Û −1 = Û † . I.e.,

$$|\eta'\rangle = \hat{A}'|\psi'\rangle \qquad (10.50)$$

with Â′ = Û ÂÛ † . Since this is the case for any |ψi on which Â acts, we can say that a unitary transformation which transforms ket vectors according to the equation

$$|\psi'\rangle = \hat{U}|\psi\rangle \qquad (10.51)$$

transforms operators according to the equation

$$\hat{A}' = \hat{U}\hat{A}\hat{U}^\dagger. \qquad (10.52)$$
The transformed operator Â′ has the same properties as Â in the following sense:
2. Sums and products of operators are transformed into sums and products of the transformed operators: e.g., if Â = αB̂ + β Ĉ D̂ where α and β are two complex numbers, then Â′ = αB̂ ′ + β Ĉ ′ D̂ ′ .
+ Proof: We use the general relation between operators and their transforms, and also Û Û † = Î:

$$\hat{A}' = \hat{U}\hat{A}\hat{U}^\dagger = \hat{U}(\alpha\hat{B} + \beta\hat{C}\hat{D})\hat{U}^\dagger = \alpha\hat{U}\hat{B}\hat{U}^\dagger + \beta\hat{U}\hat{C}(\hat{U}^\dagger\hat{U})\hat{D}\hat{U}^\dagger = \alpha\hat{U}\hat{B}\hat{U}^\dagger + \beta(\hat{U}\hat{C}\hat{U}^\dagger)(\hat{U}\hat{D}\hat{U}^\dagger) = \alpha\hat{B}' + \beta\hat{C}'\hat{D}'.$$
4. Â and Â′ = Û ÂÛ † have the same eigenvalues. (The proof is left as an exercise.)
5. hφ′| Â′ |ψ′i = hφ| Â|ψi for any |φi, |ψi. (The proof is also left as an exercise.)
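These properties can be illustrated numerically. The following sketch, which assumes numpy and scipy and uses an arbitrarily chosen unitary matrix, checks properties 4 and 5 for matrices on a 4-dimensional space.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
G = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = (G + G.conj().T) / 2          # a Hermitian matrix
U = expm(1j * H)                  # exp(i * Hermitian) is unitary
A = rng.standard_normal((4, 4))   # an arbitrary operator
A_prime = U @ A @ U.conj().T      # the transformed operator A' = U A U†

# Property 4: A and A' have the same eigenvalues.
print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                  np.sort_complex(np.linalg.eigvals(A_prime))))   # True

# Property 5: <phi'|A'|psi'> = <phi|A|psi>.
psi, phi = rng.standard_normal(4), rng.standard_normal(4)
print(np.isclose(phi.conj() @ A @ psi,
                 (U @ phi).conj() @ A_prime @ (U @ psi)))          # True
```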
Each vector |ψj i of the second basis can be expanded on the first basis,

$$|\psi_j\rangle = \sum_{i=1}^{N} M_{ij}\,|\phi_i\rangle.$$

(Note the order of the indices: the coefficient of |φi i is denoted Mij , not Mji .)
Since the |φi i are orthonormal, Mij = hφi |ψj i (see Section 2.10). Moreover Mij∗ =
hψj |φi i since hφi |ψj i∗ = hψj |φi i (see Section 2.8).
These complex coefficients can be arranged in an N × N matrix M in the usual way:
$$\mathsf{M} = \begin{pmatrix} M_{11} & M_{12} & \cdots & M_{1N} \\ M_{21} & M_{22} & \cdots & M_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ M_{N1} & M_{N2} & \cdots & M_{NN} \end{pmatrix}. \qquad (10.57)$$
+ Proof: Let us work out the ij-element of the matrix M† M. Using Eq. (10.55),

$$(\mathsf{M}^\dagger\mathsf{M})_{ij} = \sum_{k=1}^{N}(\mathsf{M}^\dagger)_{ik}\,M_{kj} = \sum_{k=1}^{N} M_{ki}^{*}\,M_{kj} = \sum_{k=1}^{N}\langle\psi_i|\phi_k\rangle\langle\phi_k|\psi_j\rangle = \langle\psi_i|\,\hat{I}\,|\psi_j\rangle = \langle\psi_i|\psi_j\rangle = \delta_{ij}.$$
Since the elements of the unit matrix I are equal to δij (the diagonal elements
of I are all equal to 1 and all the other elements are equal to 0), we see that
M† M = I. Proving that MM† = I can be done similarly.
We can thus pass from one basis to another by a unitary transformation. Transforming the basis from {|φn i} to {|ψn i} as per the matrix M transforms both the column vectors representing quantum states and the matrices representing operators. This transformation is also unitary: If in the {|φn i} basis the ket vector |ψi is represented by the column vector c and the operator Â is represented by the matrix A, and if the same ket vector and the same operator are represented by the column vector c′ and the matrix A′ in the {|ψn i} basis, then

$$\mathsf{c}' = \mathsf{M}^\dagger\mathsf{c} \qquad \text{and} \qquad \mathsf{A}' = \mathsf{M}^\dagger\mathsf{A}\,\mathsf{M}. \qquad (10.58)$$

(The proof of these equations is left as an exercise. Note that the elements of the column vectors c transform according to M† whereas the basis vectors transform
according to M.) These two equations can be brought to the same form as Eqs. (10.51)
and (10.52) by rewriting them in terms of the adjoint of the basis change matrix:
Setting U = M† ,
c′ = Uc and A′ = UAU† . (10.59)
However, note that these equations arise from a mere change of basis which has
no impact on the ket vectors representing quantum states, whereas Eqs. (10.51) and
(10.52) arise from a transformation of the ket vectors.
11 Time evolution
11.1 The Schrödinger equation

The time evolution of the quantum state of a system is governed by the Schrödinger equation, which can be written

$$i\hbar\,\frac{\partial\Psi}{\partial t} = H\,\Psi(x, y, z, t) \qquad (11.1)$$
if we describe the quantum states of the system by time-dependent wave functions
Ψ(x, y, z, t), or
$$i\hbar\,\frac{d}{dt}|\Psi(t)\rangle = \hat{H}\,|\Psi(t)\rangle \qquad (11.2)$$
if we describe them by time-dependent state vectors |Ψ(t)i.
You probably remember having seen most of the following results and definitions, if not all of them:
1. The operator Ĥ appearing in Eq. (11.2) is a Hermitian operator called the Hamil-
tonian of the system.
2. The inner product hΦ(t)|Ψ(t)i is constant if |Φ(t)i and |Ψ(t)i evolve in time
according to the Schrödinger equation: hΦ(t)|Ψ(t)i = hΦ(t0 )|Ψ(t0 )i for any t, t0 .
In particular, the norm of a state vector does not change under time evolution.
+ Proof: We see from Eq. (11.2) that to first order in δt,

$$|\Psi(t + \delta t)\rangle = |\Psi(t)\rangle + \frac{1}{i\hbar}\,\hat{H}\,|\Psi(t)\rangle\,\delta t. \qquad (11.3)$$

Similarly,

$$\langle\Phi(t + \delta t)| = \langle\Phi(t)| - \frac{1}{i\hbar}\,\langle\Phi(t)|\,\hat{H}^\dagger\,\delta t. \qquad (11.4)$$

Therefore, to first order in δt,

$$\langle\Phi(t + \delta t)|\Psi(t + \delta t)\rangle = \langle\Phi(t)|\Psi(t)\rangle + \frac{1}{i\hbar}\langle\Phi(t)|\,\hat{H}\,|\Psi(t)\rangle\,\delta t - \frac{1}{i\hbar}\langle\Phi(t)|\,\hat{H}^\dagger\,|\Psi(t)\rangle\,\delta t. \qquad (11.5)$$
The second and third terms in the right-hand side cancel since Ĥ is
Hermitian. Hence, to first order in δt,
Hence, to first order in δt,

$$\langle\Phi(t + \delta t)|\Psi(t + \delta t)\rangle = \langle\Phi(t)|\Psi(t)\rangle. \qquad (11.6)$$

Therefore
$$\lim_{\delta t \to 0}\frac{\langle\Phi(t + \delta t)|\Psi(t + \delta t)\rangle - \langle\Phi(t)|\Psi(t)\rangle}{\delta t} = 0, \qquad (11.7)$$
which means that dhΦ(t)|Ψ(t)i/dt = 0.
3. As the Schrödinger equation is linear and of first order in time, giving Ĥ and
specifying |Ψ(t)i at a time t0 determines |Ψ(t)i at all times (at least in principle,
in practice the Schrödinger equation may be impossible to solve to sufficient
accuracy).
4. Being Hermitian, Ĥ has eigenvectors and real eigenvalues: there exist state vectors |ψn i and real numbers En such that

$$\hat{H}\,|\psi_n\rangle = E_n\,|\psi_n\rangle, \qquad (11.8)$$

and similarly for the continuum eigenstates of Ĥ if there are any (but then in terms of generalized eigenvalues and generalized eigenvectors, see Section 6.9). The eigenvalues En are called the eigenenergies of the system. Since Ĥ is Hermitian, the eigenenergies are real (not complex).
5. Ĥ may or may not depend on time. For example, the Hamiltonian used in the
Term 1 course to describe an unperturbed hydrogen atom,
$$-\frac{\hbar^2}{2\mu}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r},$$
does not depend on time. By contrast, the Hamiltonian
$$-\frac{\hbar^2}{2\mu}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r} + e\,\mathbf{F}(t)\cdot\mathbf{r},$$
which describes an atom of hydrogen perturbed by a time-dependent electric
field F(t), does depend on time. The Hamiltonian is normally time-independent,
unless the system it represents is subject to a time-dependent interaction with
the rest of the world.
(a) The eigenenergies En and the energy eigenstates |ψn i are also time-independent,
and so is Eq. (11.8). This equation is often referred to as the time-independent
Schrödinger equation. Eq. (11.2) is the time-dependent Schrödinger equa-
tion.
(b) Given an energy eigenstate |ψn i and the corresponding eigenenergy En ,
the ket |ψn i exp(−iEn t/~) is a solution of Eq. (11.2):
$$i\hbar\,\frac{d}{dt}\Big[|\psi_n\rangle \exp(-iE_n t/\hbar)\Big] = i\hbar\,|\psi_n\rangle\,\frac{d}{dt}\exp(-iE_n t/\hbar) = i\hbar\,|\psi_n\rangle\,(-iE_n/\hbar)\exp(-iE_n t/\hbar) = E_n\,|\psi_n\rangle\exp(-iE_n t/\hbar) = \hat{H}\,|\psi_n\rangle\exp(-iE_n t/\hbar). \qquad (11.9)$$
This result shows that the probability of each possible outcome of a mea-
surement does not depend on the instant at which the measurement is
made. It is easy to see that we would arrive to the same conclusion for
the probability densities we would need to consider if the eigenvalues of
interest belonged to a continuum. As we have not assumed anything spe-
cific about the measured observable, the conclusion is that the probability
distribution of the results of any measurement which can be made on the
system is constant in time. In other words, the eigenvectors of the Hamil-
tonian describe stationary states, i.e., states whose physical properties are
the same at all times.
Stationary states are also states of well defined energy: Since the vector
|ψn i exp(−iEn t/~) is an eigenvector of Ĥ and eigenvectors correspond-
ing to different eigenenergies are orthogonal, a measurement of the energy
in the state |ψn i exp(−iEn t/~) would give En with probability 1. Corre-
spondingly, the uncertainty ∆E in the value of the energy is zero in that
state.
(d) Linear combinations of eigenvectors belonging to different eigenenergies
do not describe stationary states. In fact, any solution of Eq. (11.2) can be
written as an expansion on the eigenvectors and generalized eigenvectors
of Ĥ. Namely, if |Ψ(t)i is a time-dependent state vector and the Hamilto-
nian is time-independent, there exists a set of constant coefficients cn and
ck such that
$$|\Psi(t)\rangle = \sum_n c_n \exp(-iE_n t/\hbar)\,|\psi_n\rangle + \int c_k \exp(-iE_k t/\hbar)\,|\psi_k\rangle\, dk, \qquad (11.11)$$
where the |ψn i’s and |ψk i’s are, respectively, eigenvectors and generalized
eigenvectors of Ĥ corresponding to the energies En and Ek . (As usual, n
and k represent the sets of quantum numbers which must be specified to
identify each of these eigenstates unambiguously.)
(e) Suppose that we know |Ψ(t)i at a particular time, t = 0 say (for simplicity).
Eq. (11.11) tells us that
$$|\Psi(t = 0)\rangle = \sum_n c_n\,|\psi_n\rangle + \int c_k\,|\psi_k\rangle\, dk. \qquad (11.12)$$
Assuming that the eigenvectors |ψn i and generalized eigenvectors |ψk i are
orthonormal, the coefficients cn and ck can then be calculated by projec-
tion: cn = hψn |Ψ(t = 0)i and ck = hψk |Ψ(t = 0)i. Plugging the results
into Eq. (11.11) then gives |Ψ(t)i at all times.
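For a finite-dimensional model Hamiltonian, the procedure of items (d) and (e) can be carried out directly on a computer. The sketch below uses a hypothetical 2 × 2 Hamiltonian with ℏ set to 1, assuming numpy: it projects the initial state onto the eigenvectors and propagates each coefficient with its phase factor, as in Eq. (11.11).

```python
import numpy as np

hbar = 1.0
H = np.array([[1.0, 0.3], [0.3, 2.0]])       # a Hermitian Hamiltonian matrix
E, V = np.linalg.eigh(H)                     # eigenenergies E_n, eigenvectors as columns
psi0 = np.array([1.0, 0.0], dtype=complex)   # |Psi(t = 0)>

c = V.conj().T @ psi0                        # c_n = <psi_n|Psi(0)> by projection

def psi(t):
    # |Psi(t)> = sum_n c_n exp(-i E_n t / hbar) |psi_n>, as in Eq. (11.11)
    return V @ (c * np.exp(-1j * E * t / hbar))

print(np.linalg.norm(psi(3.7)))              # 1.0: the norm is conserved
```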
11.2 The evolution operator

Since giving Ĥ and specifying |Ψ(t0 )i at a time t0 determines |Ψ(t)i at all times, the time evolution can be described as a transformation of |Ψ(t0 )i into |Ψ(t)i, this transformation being effected by an operator Û (t, t0 ) depending on t0 and t:

$$|\Psi(t)\rangle = \hat{U}(t, t_0)\,|\Psi(t_0)\rangle. \qquad (11.13)$$

More precisely, we define Û (t, t0 ) as being the operator which maps any vector |Ψ(t0 )i to the vector |Ψ(t)i that |Ψ(t0 )i changes into under the time evolution governed by the Schrödinger equation, for any t0 and any t. This operator is called the evolution operator (or time evolution operator).
The requirement that |Ψ(t)i obeys the Schrödinger equation implies that
$$i\hbar\,\frac{d}{dt}\hat{U}(t, t_0) = \hat{H}\,\hat{U}(t, t_0). \qquad (11.14)$$
+ Proof: Differentiating Eq. (11.13) with respect to time and multiplying each
side by i~ yields
$$i\hbar\,\frac{d}{dt}|\Psi(t)\rangle = i\hbar\,\frac{d}{dt}\hat{U}(t, t_0)\,|\Psi(t_0)\rangle. \qquad (11.15)$$
We also have, from Eq. (11.2),
$$i\hbar\,\frac{d}{dt}|\Psi(t)\rangle = \hat{H}\,|\Psi(t)\rangle = \hat{H}\,\hat{U}(t, t_0)\,|\Psi(t_0)\rangle. \qquad (11.16)$$
Hence, for any |Ψ(t0 )i,
$$i\hbar\,\frac{d}{dt}\hat{U}(t, t_0)\,|\Psi(t_0)\rangle = \hat{H}\,\hat{U}(t, t_0)\,|\Psi(t_0)\rangle. \qquad (11.17)$$
Eq. (11.14) follows.
Note that solving Eq. (11.14) is, in general, as difficult as solving Eq. (11.2). The usefulness of the evolution operator lies primarily in the theoretical developments it makes possible. However, in the frequent case where
the Hamiltonian Ĥ is time-independent, a formal solution of Eq. (11.14) can be written
as
$$\hat{U}(t, t_0) = \exp[-i\hat{H}(t - t_0)/\hbar]. \qquad (11.18)$$
(See Section 3.3 for the definition of the exponential of an operator.) We stress that
Eq. (11.18) only applies to the case where Ĥ is time-independent. This equation is
generally not correct for time-dependent Hamiltonians.
Whether Eq. (11.18) applies or not, the evolution operator has the following properties:
1. Û (t0 , t0 ) = Î since we must have that |Ψ(t0 )i = Û (t0 , t0 )|Ψ(t0 )i for any |Ψ(t0 )i.
and must be true for any |Φ(t0 )i and any |Ψ(t0 )i. In particular, it must be true for any |Ψ(t0 )i and for

$$|\Phi(t_0)\rangle = \big[\hat{U}^\dagger(t, t_0)\,\hat{U}(t, t_0) - \hat{I}\,\big]\,|\Psi(t_0)\rangle.$$

However, with this choice of ket |Φ(t0 )i, Eq. (11.24) reduces to the equation hΦ(t0 )|Φ(t0 )i = 0, which implies that |Φ(t0 )i is the zero vector. The operator Û † (t, t0 )Û (t, t0 ) − Î thus maps every vector to the zero vector, which is possible only if Û † (t, t0 )Û (t, t0 ) = Î. Multiplying this equation on the right by the inverse of Û (t, t0 ) gives Û † (t, t0 ) = Û −1 (t, t0 ). Therefore Û (t, t0 ) is a unitary operator.
Let us come back to the assumption made at the start that the domain of Ĥ is the whole Hilbert space (this assumption was made necessary by our use of the Schrödinger equation, which makes sense only for vectors in the domain of the Hamiltonian). This assumption is not innocuous in infinite-dimensional Hilbert spaces, but removing it would require adding to the proof a detailed discussion of what the domains of Ĥ, of Ĥ † , of Û (t, t0 ) and of Û † (t, t0 ) actually are. However, this complication is unnecessary: time evolution can be defined from the outset as being a unitary transformation effected by a unitary operator Û (t, t0 ) obeying Eq. (11.19), and the Hamiltonian can then be introduced as the self-adjoint operator Ĥ such that Û (t, t0 ) satisfies Eq. (11.14). (The Hamiltonian is indeed a self-adjoint operator, i.e., Ĥ † = Ĥ, not merely a Hermitian operator.) The mathematical basis of this latter approach is an important theorem of functional analysis called Stone's theorem.
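Numerically, the unitarity of the evolution operator is easy to verify for a time-independent Hamiltonian, for which Eq. (11.18) gives Û (t, t0 ) explicitly. A minimal sketch, assuming scipy and an arbitrarily chosen Hermitian matrix (ℏ = 1):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
G = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = (G + G.conj().T) / 2                  # a Hermitian Hamiltonian matrix
t, t0 = 2.0, 0.5
U = expm(-1j * H * (t - t0))              # U(t, t0) = exp[-iH(t - t0)/hbar], Eq. (11.18)

print(np.allclose(U.conj().T @ U, np.eye(3)))       # True: U†U = I
print(np.allclose(U.conj().T, np.linalg.inv(U)))    # True: U† = U^(-1)
```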
11.3 The Schrödinger picture and the Heisenberg picture

hAi(t) can also be seen as the expectation value, in the time-independent state |Ψ(t0 )i, of the time-dependent operator

$$\hat{A}_H(t) = \hat{U}(t_0, t)\,\hat{A}\,\hat{U}^\dagger(t_0, t).$$
Remember what we have seen about calculating the probability of finding a par-
ticular value of a dynamical variable in a measurement — e.g., that the probabil-
ity of finding the eigenvalue λn of  is |hψn |Ψ(t)i|2 if λn is non-degenerate, if
Â|ψn i = λn |ψn i and if the ket vectors |ψn i and |Ψ(t)i are normalized. Since
ÂH (t) = Û (t0 , t)ÂÛ † (t0 , t) and Û (t0 , t) is a unitary operator, the operator ÂH (t)
has the same eigenvalues as the operator  (see page 160). Moreover, if Â|ψn i =
λn |ψn i, then (1) ÂH (t)|ψnH (t)i = λn |ψnH (t)i, where |ψnH (t)i = Û † (t, t0 )|ψn i (i.e.,
|ψnH (t)i is an eigenvector of ÂH (t) corresponding to the eigenvalue λn ), and also
(2) |hψn |Ψ(t)i|2 = |hψnH (t)|Ψ(t0 )i|2 . (The proof of these two assertions is left as an
exercise for the reader.) Thus

$$|\langle\psi_n|\Psi(t)\rangle|^2 = |\langle\psi_{nH}(t)|\Psi(t_0)\rangle|^2.$$

We see that the probability of finding λn can be calculated either in terms of the ket
vector |Ψ(t)i and an eigenvector of  or in terms of the ket vector |Ψ(t0 )i and an
eigenvector of ÂH (t), and these two approaches are completely equivalent. They cor-
respond to two alternative descriptions of quantum systems. The first one is probably the more familiar of the two: quantum states are described by time-dependent vectors
and observables by (usually) time-independent operators. This description is referred to
as the Schrödinger picture of Quantum Mechanics (or the Schrödinger representation).
The alternative description, in which the states are represented by time-independent
vectors and the observables by (usually) time-dependent operators, is referred to as the
Heisenberg picture of Quantum Mechanics (or Heisenberg representation).
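The equivalence of the two pictures can be checked numerically for a small model system. In the sketch below (arbitrarily chosen 2 × 2 Hamiltonian and observable, ℏ = 1, assuming numpy and scipy), the same expectation value is computed by evolving the state and by evolving the operator.

```python
import numpy as np
from scipy.linalg import expm

hbar, t0, t = 1.0, 0.0, 1.3
H = np.array([[0.0, 0.5], [0.5, 1.0]])        # a Hermitian Hamiltonian
A = np.array([[1.0, 2.0], [2.0, -1.0]])       # a Hermitian observable
psi0 = np.array([0.6, 0.8], dtype=complex)    # a normalized |Psi(t0)>

U = expm(-1j * H * (t - t0) / hbar)           # U(t, t0)
psi_t = U @ psi0                              # Schrodinger picture: the state evolves
A_H = U.conj().T @ A @ U                      # Heisenberg picture: the operator evolves

print(np.isclose(psi_t.conj() @ A @ psi_t,
                 psi0.conj() @ A_H @ psi0))   # True: both pictures agree
```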
In the Schrödinger picture, time evolution is governed by the time-dependent Schrödinger
equation, Eq. (11.2). Its counterpart in the Heisenberg picture is the Heisenberg equa-
tion of motion. If  does not depend on time, this equation reads
$$i\hbar\,\frac{d}{dt}\hat{A}_H(t) = [\hat{A}_H(t), \hat{H}_H(t)], \qquad (11.31)$$
where
$$\hat{H}_H(t) = \hat{U}(t_0, t)\,\hat{H}\,\hat{U}^\dagger(t_0, t). \qquad (11.32)$$
If the operator Â depends explicitly on time, the Heisenberg equation of motion takes the more general form

$$i\hbar\,\frac{d}{dt}\hat{A}_H(t) = [\hat{A}_H(t), \hat{H}_H(t)] + i\hbar\,\hat{U}(t_0, t)\,\frac{d\hat{A}}{dt}\,\hat{U}^\dagger(t_0, t). \qquad (11.33)$$
Proof: Let ÂH (t) = Û † (t, t0 )Â(t)Û (t, t0 ). Differentiating this product of operators is done by differentiating one operator at a time, as for ordinary functions:

$$i\hbar\,\frac{d}{dt}\hat{A}_H(t) = \left[i\hbar\,\frac{d}{dt}\hat{U}^\dagger(t, t_0)\right]\hat{A}(t)\,\hat{U}(t, t_0) + \hat{U}^\dagger(t, t_0)\left[i\hbar\,\frac{d\hat{A}}{dt}\right]\hat{U}(t, t_0) + \hat{U}^\dagger(t, t_0)\,\hat{A}(t)\left[i\hbar\,\frac{d}{dt}\hat{U}(t, t_0)\right]. \qquad (11.34)$$
Taking the adjoint of each side of Eq. (11.14) gives

$$-i\hbar\,\frac{d}{dt}\hat{U}^\dagger(t, t_0) = \hat{U}^\dagger(t, t_0)\,\hat{H}^\dagger = \hat{U}^\dagger(t, t_0)\,\hat{H}, \qquad (11.35)$$
dt
where in the last step we have used the fact, mentioned in the note at
the end of Section 11.2, that Ĥ is self-adjoint. Making use of these two
relations in Eq. (11.34) yields Eq. (11.33).
+ It follows from Eqs. (11.28) and (11.33) and from Eq. (11.37) of Section 11.4
that
$$\frac{d}{dt}\langle A\rangle(t) = \frac{1}{i\hbar}\,\langle\Psi(t)|[\hat{A}, \hat{H}]|\Psi(t)\rangle + \langle\Psi(t)|\,\frac{d\hat{A}}{dt}\,|\Psi(t)\rangle, \qquad (11.36)$$
which is a general form of the Ehrenfest theorem (you have encountered
this theorem in the Term 1 QM course).
11.4 Constants of motion

Suppose now that the operator Â does not depend on time. Since ÂH (t) and ĤH (t) are obtained from Â and Ĥ by the same unitary transformation,

$$[\hat{A}_H(t), \hat{H}_H(t)] = \hat{U}(t_0, t)\,[\hat{A}, \hat{H}]\,\hat{U}^\dagger(t_0, t). \qquad (11.37)$$
Thus [ÂH (t), ĤH (t)] ≡ 0 if [Â, Ĥ] ≡ 0. In that case, Eq. (11.31) says that
$$i\hbar\,\frac{d}{dt}\hat{A}_H(t) \equiv 0. \qquad (11.38)$$
Therefore ÂH (t) is constant in time and ÂH (t) ≡ Â if Â commutes with Ĥ. In turn, this implies that the probability of finding any given value of the observable A remains the same as t varies. On account of these facts, and by analogy with Classical Mechanics, a dynamical variable represented by an operator Â commuting with the Hamiltonian is said to be a constant of motion.
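This can be illustrated numerically. In the sketch below (ℏ = 1, assuming numpy and scipy), the expectation value of σz, which commutes with the Hamiltonian H = σz, stays constant, whereas that of σx, which does not commute with H, oscillates.

```python
import numpy as np
from scipy.linalg import expm

sz = np.array([[1, 0], [0, -1]], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
H = sz                                           # [sz, H] = 0, [sx, H] != 0
psi0 = np.array([1, 1], dtype=complex) / np.sqrt(2)

for t in (0.0, 0.5, 1.0):
    psi = expm(-1j * H * t) @ psi0
    print((psi.conj() @ sz @ psi).real,          # 0.0 at every t: constant of motion
          (psi.conj() @ sx @ psi).real)          # cos(2t): not conserved
```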
Consider, for example, an atom of hydrogen exposed to a time-dependent external electric field Fext (t) oriented in the z-direction, for which

$$H = -\frac{\hbar^2}{2\mu}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r} + e\,F_{\rm ext}(t)\,z. \qquad (11.39)$$
Because of the interaction term eFext (t)z, the solutions of the time-
dependent Schrödinger equation may be extremely complicated. However,
this Hamiltonian commutes with Lz , the z-component of the orbital an-
gular momentum operator. Hence, the z-component of the orbital angular
momentum of the electron is a constant of motion. For instance, if at some time the atom is in an eigenstate of Lz with eigenvalue m~, so that the
z-component of the electron’s orbital angular momentum is well defined
and equal to m~, then the atom will remain in an eigenstate of Lz corre-
sponding to that eigenvalue at all times, however complicated the wave
function might become as time increases.
12 Rotations and angular momentum
In Classical Mechanics, the orbital angular momentum of a mass point with respect to the origin of the coordinates is the vector

$$\mathbf{L}_{\rm cl} = \mathbf{r} \times \mathbf{p}_{\rm cl}, \qquad (12.1)$$

where r is the position vector of the mass point and pcl is its momentum.
As discussed previously, we can define a position operator x̂ and a momentum operator
p̂x for the x-direction, a position operator ŷ and a momentum operator p̂y for the y-
direction, and a position operator ẑ and a momentum operator p̂z for the z-direction.
These operators can be taken to be the x-, y- and z-components of a vector position
operator r̂ and a vector momentum operator p̂:
r̂ = x̂ x̂ + ŷ ŷ + ẑ ẑ, (12.2)
p̂ = p̂x x̂ + p̂y ŷ + p̂z ẑ. (12.3)
(As in the rest of these notes, x̂, ŷ and ẑ are unit vectors in the x-, y- and z-directions.
The hat sign in x̂, ŷ and ẑ indicates that these objects are unit vectors, not that they are
operators, whereas the hat sign in x̂, ŷ and ẑ indicates that these objects are operators,
not that they are unit vectors.)
Likewise, the orbital angular momentum operator, L̂, is a geometric vector whose x-, y- and z-components, denoted L̂x , L̂y and L̂z , are themselves operators:

$$\hat{\mathbf{L}} = \hat{L}_x\,\hat{\mathbf{x}} + \hat{L}_y\,\hat{\mathbf{y}} + \hat{L}_z\,\hat{\mathbf{z}}. \qquad (12.4)$$
This operator is related to the position operator r̂ and to the momentum operator p̂
in the same way as, in Classical Mechanics, the angular momentum of a mass point is
related to its position and its momentum:
L̂ = r̂ × p̂. (12.5)
The usual rules of the vector product apply, although the vectors are operators here. L̂
can thus be calculated as a determinant:
$$\hat{\mathbf{L}} = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ \hat{x} & \hat{y} & \hat{z} \\ \hat{p}_x & \hat{p}_y & \hat{p}_z \end{vmatrix}. \qquad (12.6)$$
This gives
$$\hat{L}_z = \hat{x}\,\hat{p}_y - \hat{y}\,\hat{p}_x, \qquad (12.7)$$

and similar expressions for L̂x and L̂y which can be obtained from Eq. (12.7) by circular permutation of the indices (x → y, y → z, z → x):

$$\hat{L}_x = \hat{y}\,\hat{p}_z - \hat{z}\,\hat{p}_y, \qquad \hat{L}_y = \hat{z}\,\hat{p}_x - \hat{x}\,\hat{p}_z. \qquad (12.8)$$
The operator L̂2 also plays an important role in the theory. It is defined as the dot product of L̂ with itself, i.e.,

$$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2. \qquad (12.9)$$
It is not particularly difficult to deduce from these definitions and from the properties
of the position and momentum operators that L̂x , L̂y , L̂z and L̂2 are Hermitian and
satisfy the following commutation relations:
[L̂x , L̂y ] = i~L̂z , [L̂y , L̂z ] = i~L̂x , [L̂z , L̂x ] = i~L̂y , (12.10)
and furthermore
[L̂x , L̂2 ] = [L̂y , L̂2 ] = [L̂z , L̂2 ] = 0. (12.11)
Given Eqs. (12.7) and (12.8) and how the momentum operators p̂x , p̂y and p̂z are rep-
resented, the operators L̂z , L̂x and L̂y take on the following forms in the position
representation:
$$L_z = -i\hbar\left(x\,\frac{\partial}{\partial y} - y\,\frac{\partial}{\partial x}\right), \qquad (12.12)$$

$$L_x = -i\hbar\left(y\,\frac{\partial}{\partial z} - z\,\frac{\partial}{\partial y}\right), \qquad (12.13)$$

$$L_y = -i\hbar\left(z\,\frac{\partial}{\partial x} - x\,\frac{\partial}{\partial z}\right). \qquad (12.14)$$
+ Passing to spherical polar coordinates (r, θ, φ) brings Lz to a particu-
larly simple form:
$$L_z = -i\hbar\,\frac{\partial}{\partial\phi}. \qquad (12.15)$$
However, somewhat more complicated expressions are found for Lx
and Ly :
$$L_x = -i\hbar\left(-\sin\phi\,\frac{\partial}{\partial\theta} - \cot\theta\,\cos\phi\,\frac{\partial}{\partial\phi}\right), \qquad (12.16)$$

$$L_y = -i\hbar\left(\cos\phi\,\frac{\partial}{\partial\theta} - \cot\theta\,\sin\phi\,\frac{\partial}{\partial\phi}\right). \qquad (12.17)$$
(We define the angles θ and φ in the usual way in Physics: θ is measured
from the positive z-axis and φ is measured in the xy-plane from the
positive x-axis. Lx , Ly and Lz do not depend on r or on derivatives
with respect to r.)
In spherical polar coordinates, the Laplacian reads

$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\cot\theta}{r^2}\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \qquad (12.19)$$
This last equation can be somewhat simplified, and made more trans-
parent, by using the fact that
$$L^2 = L_x^2 + L_y^2 + L_z^2 = -\hbar^2\left(\frac{\partial^2}{\partial\theta^2} + \cot\theta\,\frac{\partial}{\partial\theta} + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right). \qquad (12.20)$$
Namely,

$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} - \frac{1}{\hbar^2 r^2}\,L^2. \qquad (12.21)$$
Therefore

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(r) = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right) + \frac{L^2}{2mr^2} + V(r). \qquad (12.22)$$
Note that we take the potential to be central. I.e., we assume that V (r)
depends only on the distance of the particle to the origin, not on its
angular position. Such potentials are said to be central because the cor-
responding classical force, −∇V , is a vector directed towards or away
from a fixed point (the origin), the “centre of force". Remember that in
Classical Mechanics the angular momentum vector Lcl is a constant of
motion if the potential is central. Since H depends on θ and φ only through L̂2 , since L̂z commutes with L̂2 , and since L̂2 and L̂z do not depend on r, both L̂2 and L̂z commute with H. These two operators thus correspond to quantum mechanical constants of motion if the potential is central (see Section 11.4).
It is interesting to compare the Hamiltonian given by Eq. (12.22) with
the classical Hamiltonian for the same system,
$$H_{\rm cl} = \frac{1}{2m}\,p_r^2 + \frac{|\mathbf{L}_{\rm cl}|^2}{2mr^2} + V(r), \qquad (12.23)$$
where pr is the generalized momentum conjugate to the radial variable
r. In Quantum Mechanics as in Classical Mechanics, the radial motion
of the particle is affected both by the potential energy V (r) and by
an “angular momentum barrier", L2 /(2mr2 ) in the quantum case or
|L|2 /(2mr2 ) in the classical case, which plays the role of an additional
potential energy.
12.2 Spin
Physicists became aware in the 1920s that many quantum systems have an angular-momentum-like property, distinct from the orbital angular momentum. At first, it was theorized that this property could be related to some kind of self-rotation of the particles forming these systems, a bit as if electrons, protons, etc., were spinning tops. This property became referred to as “spin" for that reason. It was soon realized that associating spin with an actual rotation is completely incorrect — an electron is not a spinning top — but the word “spin" kept being used. The modern understanding of spin is that
this property has nothing to do with an actual motion and has no analogue in Classical
Mechanics.
There is a relation with the orbital angular momentum, though, in that spin is a dy-
namical variable described by a Hermitian vector operator whose components obey
the same commutation relations as those of the orbital angular momentum operator L̂.
We denote this vector operator by Ŝ and its x-, y- and z-components by Ŝx , Ŝy and Ŝz :

$$\hat{\mathbf{S}} = \hat{S}_x\,\hat{\mathbf{x}} + \hat{S}_y\,\hat{\mathbf{y}} + \hat{S}_z\,\hat{\mathbf{z}}. \qquad (12.24)$$

These components satisfy

$$[\hat{S}_x, \hat{S}_y] = i\hbar\,\hat{S}_z, \qquad [\hat{S}_y, \hat{S}_z] = i\hbar\,\hat{S}_x, \qquad [\hat{S}_z, \hat{S}_x] = i\hbar\,\hat{S}_y, \qquad (12.25)$$
and also
[Ŝx , Ŝ2 ] = [Ŝy , Ŝ2 ] = [Ŝz , Ŝ2 ] = 0, (12.26)
with the operator Ŝ2 being the dot product of Ŝ with itself:

$$\hat{S}^2 = \hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2.$$
However, the commutation relations for Ŝx , Ŝy and Ŝz cannot be derived from those of
the position and momentum operators, in contrast to those of L̂x , L̂y and L̂z . We will
see that they can be obtained from the rotational properties of quantum states.
It is important to understand that spin is unrelated to position, momentum or orbital angular momentum: the components of Ŝ commute with the position operators, with the momentum operators, and hence also with the components of L̂.
The order in which rotations are made matters when they are made about different axes.
+ Should you have any doubt about the above statement, try this experiment:
Define two orthogonal axes, fixed with respect to the room, e.g., a vertical
axis and a horizontal axis. Take a book, and rotate it first by 90 deg about
the vertical axis and then by 90 deg about the horizontal axis. Note its
new position. Then start again with the book in the same initial position
as before, but now rotate it first by 90 deg about the horizontal axis and
then by 90 deg about the vertical axis. Its new position won’t be the same
as what you found in the first sequence of rotations...
Clearly, rotations about the same axis do commute. For example, rotating a book by 20 deg about the vertical axis and then by 30 deg about the same axis is the same as rotating it first by 30 deg and then by 20 deg.
+ The following, rather obvious facts are also worth noting in view of their importance in the mathematical theory of these transformations: the composition of two rotations is also a rotation; composing rotations is associative; the identity transformation (a rotation by a zero angle) is a rotation; and every rotation can be undone by another rotation (its inverse). These four facts mean that these transformations form a group with respect to the composition of rotations (in the mathematical meaning of the word group).
As noted above, rotating a point transforms its coordinates, and this
transformation can be represented by a certain 3 × 3 matrix. Matrices
representing these rotations have three special features: they are real
(not complex), their transpose equal their inverse, and their determi-
nant is 1. A real invertible matrix whose transpose is its inverse is said
to be orthogonal (or unitary since for a real matrix the transpose is the
same as the conjugate transpose). Orthogonal matrices of unit deter-
minant form a group under matrix multiplication: the product of two
such matrices is also an orthogonal matrix of unit determinant, matrix
multiplication is associative, the identity matrix is an orthogonal matrix
of unit determinant, and any such matrix has an inverse, which is also
an orthogonal matrix of unit determinant. This group is called SO(3)
(SO stands for “special orthogonal", the word special referring to the
condition that the determinant is 1). Transformations of coordinates amounting to a rotation are in one-to-one correspondence with elements of SO(3).
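The non-commutativity of rotations and the defining properties of SO(3) matrices are easily checked numerically. A sketch assuming numpy, with the standard rotation matrices about the x- and z-axes:

```python
import numpy as np

def Rx(a):
    # Rotation by angle a about the x-axis
    return np.array([[1, 0, 0],
                     [0, np.cos(a), -np.sin(a)],
                     [0, np.sin(a),  np.cos(a)]])

def Rz(a):
    # Rotation by angle a about the z-axis
    return np.array([[np.cos(a), -np.sin(a), 0],
                     [np.sin(a),  np.cos(a), 0],
                     [0, 0, 1]])

a = np.pi / 2
print(np.allclose(Rx(a) @ Rz(a), Rz(a) @ Rx(a)))          # False: order matters
R = Rx(a) @ Rz(a)
print(np.allclose(R.T @ R, np.eye(3)), np.linalg.det(R))  # True 1.0: R is in SO(3)
```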
SO(3) is related to the group SU(2), the group of the unitary 2 × 2 com-
plex matrices of unit determinant. In fact general rotations in 3D space
are best described by SU(2) matrices rather than by SO(3) matrices. An
explanation of why this is the case would require a lengthy mathemat-
ical analysis of rotations, as would a detailed account of the mathemat-
ics of these 2 × 2 matrices and their relationship with SO(3) matrices.
We just note that these two descriptions may differ significantly for fi-
nite rotations, but they don’t for infinitesimal rotations (rotations by an
infinitesimal angle ε). For example, it is possible to show that Eqs. (S.12–S.14) imply that

$$R_x(\epsilon)\,R_y(\epsilon) - R_y(\epsilon)\,R_x(\epsilon) = R_z(\epsilon^2) - I + O(\epsilon^3), \qquad ({\rm S.30})$$

where the symbol O(ε3 ) means that terms cubic and of higher order in ε have been neglected. (See the homework problem QT2.6 for details.) Eq. (S.30) is a relation between the SO(3) matrices of Eqs. (S.12–S.14). However, exactly the same relation would be obtained for the SU(2) matrices describing the same rotations.
We start by looking at how wave functions transform when we map each point of
space to its image by a rotation about the x-axis. Consider, e.g., an atom of hydro-
gen in a state described by a wave function ψ(x, y, z), ignoring spin. At a point P of
coordinates (xP , yP , zP ), the value of this wave function is a certain complex num-
ber ψ(xP , yP , zP ). The rotation maps P to a point P 0 of coordinates (x0P , yP0 , zP0 ). In
general, the value of the wave function ψ(x, y, z) at P 0 differs from its value at P . How-
ever, we can define a function ψ 0 (x, y, z) whose value at P 0 is the same as the value of
ψ(x, y, z) at P . How to do this is simple: we take ψ 0 (x, y, z) to be the function whose
value at the point of coordinate (x, y, z) is ψ(x00 , y 00 , z 00 ), where (x00 , y 00 , z 00 ) are the co-
ordinates of the point sent to the point (x, y, z) by the rotation [e.g., (x00 , y 00 , z 00 ) would
be (xP , yP , zP ) if (x, y, z) was (x0P , yP0 , zP0 )]. Clearly, if ψ(x, y, z) describes an atomic
state oriented in the z-direction (e.g., the 2pm=0 state), ψ 0 (x, y, z) describes a state ori-
ented in a different direction but otherwise identical to that described by ψ(x, y, z).
Passing from ψ(x, y, z) to ψ′(x, y, z) is a transformation, which can be written formally as ψ′ = R̂ψ, with R̂ a rotation operator acting on wave functions.
+ The reader is referred to the model solution of Problem QT2.6 for the
principle of the calculation leading to Eq. (12.37).
The wave function ψ′(x, y, z) thus describes a state which is rotated about the x-axis compared to the state described by the wave function ψ(x, y, z). In a sense, going from ψ(x, y, z) to ψ′(x, y, z) is “rotating the atom" from one orientation to another. Imagine
an experiment in which the atom is excited from the ground state to the 2pm=0 state
by a laser beam whose electric field component is oriented in the z-direction (you will
study this process in the level 3 QM course). Let ψ(x, y, z) be the corresponding wave
function. Arranging for the electric field component of the laser to be oriented in a
direction rotated by an angle α about the x-axis will instead lead to an excited state
oriented in that different direction. This state can be described by the wave function
ψ′(x, y, z). The two states ψ(x, y, z) and ψ′(x, y, z) thus differ by a rotation R of the apparatus used to prepare them. Moreover, the predictions one can make about the results of measurements on the atom in the state ψ(x, y, z) are exactly the same as those for measurements on the atom in the state ψ′(x, y, z), provided the measuring apparatus is also rotated by R.
This carries over to more general systems, even to systems which are not amenable to
a description in terms of functions of x, y, and z. Imagine an experiment in which a
certain quantum system can be prepared and measured with the apparatus either in
an orientation A or in an orientation B differing from A by a rotation. Any quantum
state |ψi relevant for measurements made with the apparatus in the orientation A has
a counterpart |ψ 0 i for measurements made with the apparatus in the orientation B.
How each of the states |ψi is related to the “rotated state" |ψ 0 i depends on the axis and
the angle of the rotation which brings the apparatus from one orientation to the other.
This transformation can be expressed by the equation

$$|\psi'\rangle = \hat{R}\,|\psi\rangle,$$

where R̂ is an operator acting on the state vectors of the system.
As seen in the homework problem QT2.6, it follows from the commutation relation
between 3D rotations that the three operators Jˆx , Jˆy and Jˆz do not commute with each
other, and instead that
[Jˆx , Jˆy ] = i~Jˆz , [Jˆy , Jˆz ] = i~Jˆx , [Jˆz , Jˆx ] = i~Jˆy . (12.39)
+ The detail of the calculations leading to this key result can be found in
the model solution of this problem.
The derivation may seem to be based on an (a priori reasonable) assumption that if R1 (α1 ) and R2 (α2 ) are two rotation matrices describing geometric rotations of points in 3D space, and R̂1 (α1 ) and R̂2 (α2 ) are the corresponding rotation operators, then R̂2 (α2 )R̂1 (α1 ) is the rotation operator corresponding to the geometric rotation described by the matrix R2 (α2 )R1 (α1 ). In fact, this assumption would not be correct in general. It is correct for the transformations of wave functions discussed earlier, though, and it is always correct if the angles α1 and α2 are infinitesimal (which is the case considered in the derivation). The
are infinitesimal (which is the case considered in the derivation). The
issue is related to an important mathematical detail alluded to above,
which is that the geometric rotations of points in 3D space are described
by elements of the group SO(3) (the group of the orthogonal 3 × 3 real
matrices with determinant equal to 1) whereas the rotation operators
transforming quantum states are represented by elements of the related
group SU(2) (the group of the unitary 2 × 2 complex matrices with de-
terminant equal to 1).
+ A position vector r is transformed into the vector r+dr = r+ n̂×r dα
by an infinitesimal rotation of angle dα about an axis in the direction of
the unit vector n̂. If the x-, y- and z-components of n̂ are, respectively,
sin Θ cos Φ, sin Θ sin Φ and cos Θ, then
$$\mathbf{r} + d\mathbf{r} = \mathbf{r} + \sin\Theta\cos\Phi\,(\hat{\mathbf{x}}\times\mathbf{r})\,d\alpha + \sin\Theta\sin\Phi\,(\hat{\mathbf{y}}\times\mathbf{r})\,d\alpha + \cos\Theta\,(\hat{\mathbf{z}}\times\mathbf{r})\,d\alpha. \qquad (12.40)$$
Correspondingly,

$$\hat{R}_{\hat{\mathbf{n}}}(d\alpha) = \hat{I} - (i/\hbar)\left[\sin\Theta\cos\Phi\,\hat{J}_x + \sin\Theta\sin\Phi\,\hat{J}_y + \cos\Theta\,\hat{J}_z\right]d\alpha. \qquad (12.41)$$

Thus Jˆn̂ = n̂ · Ĵ with Ĵ = Jˆx x̂ + Jˆy ŷ + Jˆz ẑ.
+ Because the norm of the ket |ψ′i = R̂n̂ (α)|ψi ought to be the same as the norm of the ket |ψi, the rotation operator R̂n̂ (α) must be unitary:

$$\hat{R}_{\hat{\mathbf{n}}}^\dagger(\alpha)\,\hat{R}_{\hat{\mathbf{n}}}(\alpha) = \hat{R}_{\hat{\mathbf{n}}}(\alpha)\,\hat{R}_{\hat{\mathbf{n}}}^\dagger(\alpha) = \hat{I}. \qquad (12.42)$$
For infinitesimal rotations, as seen above,

$$\hat{R}_{\hat{\mathbf{n}}}(\alpha) = \hat{I} - \frac{i}{\hbar}\,\alpha\,\hat{J}_{\hat{\mathbf{n}}}. \qquad (12.45)$$
One says that the angular momentum operator Jˆn̂ is the infinitesimal generator of the
rotations about the axis n̂ in the Hilbert space of the state vectors of the system.
Eq. (12.45) does not apply to the case of a finite rotation angle. Instead, for any finite or
infinitesimal value of α,
$$\hat{R}_{\hat{\mathbf{n}}}(\alpha) = \exp\!\big(-i\alpha\hat{J}_{\hat{\mathbf{n}}}/\hbar\big). \qquad (12.46)$$
Remember that the exponential of an operator is defined by its Taylor series (see Section
3.3). Here,
$$\exp\!\big(-i\alpha\hat{J}_{\hat{\mathbf{n}}}/\hbar\big) = \hat{I} + \left(-i\frac{\alpha}{\hbar}\right)\hat{J}_{\hat{\mathbf{n}}} + \frac{1}{2!}\left(-i\frac{\alpha}{\hbar}\right)^{2}\hat{J}_{\hat{\mathbf{n}}}^{2} + \frac{1}{3!}\left(-i\frac{\alpha}{\hbar}\right)^{3}\hat{J}_{\hat{\mathbf{n}}}^{3} + \cdots \qquad (12.47)$$
Thus Eqs. (12.45) and (12.46) are consistent up to first order in α.
+ The momentum operator is the infinitesimal generator of translations in
space: as seen in a workshop problem,
exp(−ix0 P/~)ψ(x) = ψ(x − x0 ), (12.48)
where x0 is a length and P = −i~ d/dx.
Likewise, the Hamiltonian is the infinitesimal generator of translations in
time (see Section 11.2 of these notes).
$$\langle\psi'| = \langle\psi|\,\hat{R}_{\hat{\mathbf{n}}}^\dagger(\alpha) = \langle\psi|\,\hat{R}_{\hat{\mathbf{n}}}^{-1}(\alpha) = \langle\psi|\,\hat{R}_{\hat{\mathbf{n}}}(-\alpha). \qquad (12.51)$$
As seen above, R̂n̂ (±α) can be taken to be equal to Î ∓ iαJˆn̂ /~ if the angle α is infinitesimally small, with Jˆn̂ the component of the angular momentum operator in the n̂ direction and Î the identity operator. Denoting this infinitesimal angle by ε, we can therefore write

$$\langle\psi'|\,\hat{H}\,|\psi'\rangle = \langle\psi|\left(\hat{I} + \frac{i\epsilon}{\hbar}\hat{J}_{\hat{\mathbf{n}}}\right)\hat{H}\left(\hat{I} - \frac{i\epsilon}{\hbar}\hat{J}_{\hat{\mathbf{n}}}\right)|\psi\rangle = \langle\psi|\left(\hat{H} + \frac{i\epsilon}{\hbar}\hat{J}_{\hat{\mathbf{n}}}\hat{H} - \frac{i\epsilon}{\hbar}\hat{H}\hat{J}_{\hat{\mathbf{n}}} + \frac{\epsilon^2}{\hbar^2}\hat{J}_{\hat{\mathbf{n}}}\hat{H}\hat{J}_{\hat{\mathbf{n}}}\right)|\psi\rangle. \qquad (12.52)$$

Neglecting the term of order ε2 compared to the terms of order ε on account that ε is infinitesimally small, we obtain

$$\langle\psi'|\,\hat{H}\,|\psi'\rangle = \langle\psi|\left(\hat{H} + \frac{i\epsilon}{\hbar}\big[\hat{J}_{\hat{\mathbf{n}}}\hat{H} - \hat{H}\hat{J}_{\hat{\mathbf{n}}}\big]\right)|\psi\rangle = \langle\psi|\,\hat{H}\,|\psi\rangle + \frac{i\epsilon}{\hbar}\,\langle\psi|[\hat{J}_{\hat{\mathbf{n}}}, \hat{H}]|\psi\rangle. \qquad (12.53)$$
In view of Eq. (12.50), hψ|[Jˆn̂ , Ĥ]|ψi must be zero. Since this must be
the case for any state vector |ψi, we can conclude that [Jˆn̂ , Ĥ] = 0.
We see that the requirement that an isolated system is invariant under rotation, which
stems from the isotropy of space (space is identical in any direction), implies that the
angular momentum operator commutes with the Hamiltonian, hence that the angular
momentum is a constant of motion (see Section 11.4).
This relationship between the symmetry of the system (here, invariance under rotation)
and the existence of a conserved quantity (here, the angular momentum vector) is in
fact very general. For example, momentum is conserved if the system is invariant under
a spatial translation, and energy is conserved if the system is invariant under a “time
translation", t → t + τ with τ constant (there is invariance under a “time translation"
in the absence of any time-dependent interaction).
12.5 Angular momentum operators
A vector operator
Ĵ = Jˆx x̂ + Jˆy ŷ + Jˆz ẑ (12.54)
is said to be an angular momentum operator if its three components Jˆx , Jˆy and Jˆz are
Hermitian and satisfy the commutation relations
[Jˆx , Jˆy ] = i~Jˆz , [Jˆy , Jˆz ] = i~Jˆx , [Jˆz , Jˆx ] = i~Jˆy . (12.55)
The orbital angular momentum operator L̂ and the spin operator Ŝ are particular in-
stances of angular momentum operators.
• Angular momenta are often denoted by the letter J. Accordingly, in this section
we use J to denote general angular momentum operators. This letter is also often
used to represent, specifically, the “total angular momentum operator" L̂ + Ŝ;
what we cover here applies to L̂ + Ŝ as well as to any other angular momentum
operator.
• Knowing that [Jˆx , Jˆy ] = i~Jˆz , the other commutation relations can be obtained
by circular permutation of the indices (x → y, y → z, z → x).
It is not difficult to show that the commutation relations satisfied by Jˆx , Jˆy and Jˆz imply that these three operators commute with Ĵ2 :

$$[\hat{J}_x, \hat{\mathbf{J}}^2] = [\hat{J}_y, \hat{\mathbf{J}}^2] = [\hat{J}_z, \hat{\mathbf{J}}^2] = 0.$$

One can also establish, from the commutation relations alone, the following key results:

1. The eigenvalues of Ĵ2 are j(j + 1)~2 , where j is a non-negative integer (0, 1, 2,. . . ) or half-integer (1/2, 3/2, 5/2,. . . ). In particular, denoting by |j, mi a simultaneous eigenvector of Ĵ2 and Jˆz ,

$$\hat{\mathbf{J}}^2\,|j, m\rangle = j(j + 1)\hbar^2\,|j, m\rangle.$$
2. The eigenvalues of Jˆz are m~, where m is an integer (0, ±1, ±2,. . . ) or half-integer (±1/2, ±3/2, ±5/2,. . . ). In the case of simultaneous eigenvectors of Ĵ2 and Jˆz , the possible values of j and m are restricted to the range −j ≤ m ≤ j, with m running from −j to j by integer steps. Therefore, for any of the |j, mi’s,

$$\hat{J}_z\,|j, m\rangle = m\hbar\,|j, m\rangle \quad \text{with} \quad m = -j, -j + 1, \ldots, j - 1, j.$$
For example, in the case of an orbital angular momentum operator, the operators Jˆz and Ĵ2 correspond, in the position representation, to the operators Lz and L2 = L2x + L2y + L2z , and the eigenvectors |j, mi to the spherical harmonics Ylm (θ, φ):

$$L^2\,Y_{lm}(\theta, \phi) = l(l + 1)\hbar^2\,Y_{lm}(\theta, \phi), \qquad L_z\,Y_{lm}(\theta, \phi) = m\hbar\,Y_{lm}(\theta, \phi).$$
+ The choice of Jˆz , rather than another component of Ĵ, for defining the ba-
sis vectors |j, mi is purely conventional (however, it is a time-honoured convention and everybody abides by it). There is nothing special about the
z-direction. Instead of Jˆz , one could use, for example, the component Jˆn̂
of Ĵ in an arbitrary direction defined by the unit vector n̂ (Jˆn̂ = n̂ · Ĵ). All that we said above about Jˆz also applies to Jˆn̂ : irrespective of the direction n̂, Jˆn̂ commutes with Ĵ2 , the eigenvalues of Jˆn̂ are m~ with m = 0,
±1/2, ±1, etc., one can construct a basis of the Hilbert space with simulta-
neous eigenvectors of Ĵ2 and Jˆn̂ , and for these eigenvectors −j ≤ m ≤ j.
However, if n̂ is not in the z-direction, these simultaneous eigenvectors of
Ĵ2 and Jˆn̂ in general will not be the same as the eigenvectors of Ĵ2 and Jˆz
defined above.
+ Rotating an eigenvector of Jˆn̂ by an angle α about the axis n̂ simply mul-
tiplies this eigenvector by a phase factor exp(−imα). For example, rotate
the state |j, mi about the z-axis: Since Jˆz |j, mi = ~m|j, mi, R̂z (α)|j, mi =
exp(−iαJˆz /~)|j, mi = exp(−iαm)|j, mi. Intriguingly, this means that if j (and thus m) is a half-integer, then a rotation by 2π transforms |j, mi into −|j, mi: only a 4π rotation brings |j, mi back to itself...
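This sign change can be seen in a short numerical experiment. The sketch below (ℏ = 1, assuming numpy and scipy) applies R̂z (α) = exp(−iαJˆz /~) to the state of spin up:

```python
import numpy as np
from scipy.linalg import expm

Jz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)   # Jz for j = 1/2, hbar = 1
up = np.array([1, 0], dtype=complex)                    # |1/2, +1/2>

def rotate(angle, state):
    return expm(-1j * angle * Jz) @ state               # R_z(alpha) = exp(-i*alpha*Jz)

print(rotate(2 * np.pi, up))   # [-1, 0]: the state picks up a minus sign
print(rotate(4 * np.pi, up))   # [ 1, 0]: back to itself
```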
Since Jˆn̂ commutes with Ĵ2 , any power of Jˆn̂ also commutes with Ĵ2 . Therefore exp(−iαJˆn̂ /~) also commutes with Ĵ2 . Hence, if Ĵ2 |ψi = j(j + 1)~2 |ψi, then, for any rotation angle α and any rotation axis n̂,

$$\hat{\mathbf{J}}^2\,\hat{R}_{\hat{\mathbf{n}}}(\alpha)|\psi\rangle = j(j + 1)\hbar^2\,\hat{R}_{\hat{\mathbf{n}}}(\alpha)|\psi\rangle:$$

rotating the state does not change the value of the quantum number j.
+ The operators Jˆ+ and Jˆ− , defined as Jˆ± = Jˆx ± iJˆy , both commute with Ĵ2 . However, they do not commute with each other and they are not Hermitian — in fact, Jˆ+† = Jˆ− . These two operators play the role of ladder operators for angular momentum. In particular, one finds, through algebraic methods, that

$$\hat{J}_\pm\,|j, m\rangle = \hbar\sqrt{j(j + 1) - m(m \pm 1)}\;|j, m \pm 1\rangle.$$
For a general angular momentum, the simultaneous eigenvectors of Ĵ2 and Jˆz can thus be labelled
{|0, 0i, |1/2, −1/2i, |1/2, 1/2i, |1, −1i, |1, 0i, |1, 1i, |3/2, −3/2i, . . .}.
Within a subspace of given j, the operators Jˆx , Jˆy and Jˆz can be represented by matrices in the basis formed by the 2j + 1 eigenvectors |j, mi with m = −j, . . . , j. Therefore these op-
erators have 1-dimensional representations (for j = 0), 2-dimensional representations
(for j = 1/2), 3-dimensional representations (for j = 1), etc.
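These matrix representations can be constructed explicitly from the ladder-operator relations quoted above. The sketch below (ℏ = 1, assuming numpy; the function name is ours) builds Jx, Jy and Jz for an arbitrary j and checks the commutation relation and the value of Ĵ2. Note that the basis here is ordered by increasing m, unlike the {|+i, |−i} ordering used for the Pauli matrices below.

```python
import numpy as np

def angular_momentum_matrices(j):
    """Jx, Jy, Jz in the basis {|j,-j>, |j,-j+1>, ..., |j,j>}, with hbar = 1."""
    m = np.arange(-j, j + 1)
    # Subdiagonal of J+: <j, m+1|J+|j, m> = sqrt(j(j+1) - m(m+1))
    jplus = np.diag(np.sqrt(j * (j + 1) - m[:-1] * (m[:-1] + 1)), k=-1).astype(complex)
    jminus = jplus.conj().T             # J- is the adjoint of J+
    jx = (jplus + jminus) / 2           # Jx = (J+ + J-)/2
    jy = (jplus - jminus) / (2 * 1j)    # Jy = (J+ - J-)/(2i)
    jz = np.diag(m).astype(complex)
    return jx, jy, jz

jx, jy, jz = angular_momentum_matrices(3 / 2)    # a 4-dimensional representation
print(np.allclose(jx @ jy - jy @ jx, 1j * jz))   # True: [Jx, Jy] = i*hbar*Jz
j2 = jx @ jx + jy @ jy + jz @ jz
print(np.allclose(j2, (3/2) * (3/2 + 1) * np.eye(4)))   # True: J^2 = j(j+1) I
```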
The Pauli matrices
The j = 1/2 case is particularly important because electrons, quarks, protons and neu-
trons are spin-1/2 particles. As mentioned previously, spin corresponds to an angular
momentum operator Ŝ. The eigenvalues of Ŝ2 are s(s + 1)~2 with s = 0, 1/2, 1,. . .
The electron, the proton and the neutron are said to be spin-1/2 particles because the
kets representing their quantum state are always eigenvectors of Ŝ2 with s = 1/2
(otherwise these kets would be physically incorrect).
Let us consider the more general case of an angular momentum operator Ĵ and of a
system in an eigenstate of Ĵ2 with eigenvalue j = 1/2. When j = 1/2, the quantum
number m has only two possible values in simultaneous eigenstates of Ĵ2 and Jˆz , i.e.,
−1/2 and 1/2. The Jx , Jy and Jz operators are thus represented by 2 × 2 matrices in
that case.
It is customary to work in the basis {|1/2, 1/2i, |1/2, −1/2i}. (Note the order of the basis vectors in that set; the matrices representing the relevant operators would be different if we worked in the {|1/2, −1/2i, |1/2, 1/2i} basis.) As a reminder, the two basis vectors |1/2, 1/2i and |1/2, −1/2i are such that
$$\hat{J}_z\,|1/2, \pm 1/2\rangle = \pm\frac{\hbar}{2}\,|1/2, \pm 1/2\rangle. \qquad (12.68)$$
Alternative notations for |1/2, 1/2i (the “state of spin up") and |1/2, −1/2i (the “state of spin down") are |+i and |−i, |χ+ i and |χ− i, and | ↑ i and | ↓ i:

$$|1/2, 1/2\rangle \equiv |+\rangle \equiv |\chi_+\rangle \equiv |\uparrow\,\rangle, \qquad |1/2, -1/2\rangle \equiv |-\rangle \equiv |\chi_-\rangle \equiv |\downarrow\,\rangle.$$
(The terms “spin up" and “spin down" are conventional and do not reflect a particular
orientation with respect to the vertical direction.)
The matrix Jz representing Jˆz in the {|+i, |−i} basis is therefore diagonal. Specifically,
$$\mathsf{J}_z = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (12.70)$$
It can be shown that Jˆx and Jˆy are represented by the following matrices in this basis
(see Worksheet 9 for a proof):
$$\mathsf{J}_x = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \mathsf{J}_y = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \qquad (12.71)$$
Jx , Jy and Jz are often written in terms of the Pauli matrices σx , σy and σz :

$$\mathsf{J}_x = \frac{\hbar}{2}\sigma_x, \qquad \mathsf{J}_y = \frac{\hbar}{2}\sigma_y, \qquad \mathsf{J}_z = \frac{\hbar}{2}\sigma_z, \qquad (12.72)$$

where
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (12.73)$$
Any j = 1/2 eigenvector of Ĵ2 can be written as a linear combination of the vectors |+i and |−i. Namely, if |ψi is such that Ĵ2 |ψi = j(j + 1)~2 |ψi with j = 1/2, then there always exist two complex numbers α and β such that

$$|\psi\rangle = \alpha\,|+\rangle + \beta\,|-\rangle, \qquad (12.74)$$

so that |ψi is represented by the column vector (α, β)T in the {|+i, |−i} basis. Since h+|+i = h−|−i = 1 and h+|−i = h−|+i = 0,

$$\begin{pmatrix} \langle +|\psi\rangle \\ \langle -|\psi\rangle \end{pmatrix} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \qquad (12.75)$$
In particular, the states of spin up and spin down, |+⟩ and |−⟩, are represented, respectively, by the column vectors

\begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
Consider now two electrons, labelled 1 and 2, and focus on their spin degrees of freedom only. If electron 1 is in the state of spin up and electron 2 in the state of spin down, the pair is in the joint state |+⟩₁|−⟩₂. (The subscript on each ket indicates which electron it refers to: electron 1 is in the state |+⟩₁ and electron 2 in the state |−⟩₂.) Other possibilities for this bipartite system include |+⟩₁|+⟩₂, representing a state in which both electrons are in a state of spin up, and also |−⟩₁|+⟩₂ and |−⟩₁|−⟩₂.
However, by virtue of the principle of superposition, the state of this system can also be a linear combination of |+⟩₁|+⟩₂, |+⟩₁|−⟩₂, |−⟩₁|+⟩₂ and |−⟩₁|−⟩₂. In general, the two-electron system can be in a joint spin state represented by the ket vector

|ψ⟩₁₂ = α|+⟩₁|+⟩₂ + β|+⟩₁|−⟩₂ + γ|−⟩₁|+⟩₂ + δ|−⟩₁|−⟩₂,   (12.76)
where α, β, γ and δ are complex numbers. More generally, a joint state of two systems with angular momentum quantum numbers j₁ and j₂ can be expanded as

|ψ⟩₁₂ = Σ_{j₁m₁j₂m₂} c_{j₁m₁j₂m₂} |j₁, m₁⟩₁ |j₂, m₂⟩₂,   (12.77)

where the kets |j₁, m₁⟩₁ pertain to one part of the whole system and the kets |j₂, m₂⟩₂ to the other part. Note that the angular momentum operator Ĵ₁ acts only on the state vectors pertaining to part 1 of the whole system while Ĵ₂ acts only on the state vectors pertaining to part 2. For example,

Ĵ₂z |ψ⟩₁₂ = Σ_{j₁m₁j₂m₂} c_{j₁m₁j₂m₂} |j₁, m₁⟩₁ ( Ĵ₂z |j₂, m₂⟩₂ ).   (12.78)
One can show that under a rotation by an angle α about an axis n̂, |ψ⟩₁₂ → |ψ′⟩₁₂ = R̂n̂(α)|ψ⟩₁₂ with

R̂n̂(α) = exp[ −iα ( n̂ · Ĵ₁ + n̂ · Ĵ₂ ) / ħ ].   (12.79)
Let Ĵ = Ĵ₁ + Ĵ₂. Each component of Ĵ is the sum of the corresponding components of Ĵ₁ and Ĵ₂: Ĵx = Ĵ₁x + Ĵ₂x, Ĵy = Ĵ₁y + Ĵ₂y and Ĵz = Ĵ₁z + Ĵ₂z. It is not difficult to show that Ĵ is an angular momentum operator, i.e., that Ĵx, Ĵy and Ĵz are Hermitian and satisfy the commutation relations

[Ĵx, Ĵy] = iħĴz,  [Ĵy, Ĵz] = iħĴx,  [Ĵz, Ĵx] = iħĴy.   (12.83)
Since Ĵ is an angular momentum operator, each of its components commutes with Ĵ². In particular, [Ĵz, Ĵ²] = 0. It is not difficult to show that the operators Ĵ₁² and Ĵ₂² also commute with both Ĵ² and Ĵz, besides commuting with each other:

[Ĵ₁², Ĵ₂²] = [Ĵ₁², Ĵ²] = [Ĵ₂², Ĵ²] = [Ĵz, Ĵ₁²] = [Ĵz, Ĵ₂²] = [Ĵz, Ĵ²] = 0.   (12.84)

(However, Ĵ² does not commute with Ĵ₁z or Ĵ₂z.) Hence, it is possible to construct a basis of simultaneous eigenvectors of Ĵ₁², Ĵ₂², Ĵ² and Ĵz. We will denote such simultaneous eigenvectors by |j₁, j₂, J, M⟩₁₂. The quantum numbers j₁, j₂, J and M identify the corresponding eigenvalues:

Ĵ₁² |j₁, j₂, J, M⟩₁₂ = j₁(j₁+1)ħ² |j₁, j₂, J, M⟩₁₂,
Ĵ₂² |j₁, j₂, J, M⟩₁₂ = j₂(j₂+1)ħ² |j₁, j₂, J, M⟩₁₂,
Ĵ²  |j₁, j₂, J, M⟩₁₂ = J(J+1)ħ² |j₁, j₂, J, M⟩₁₂,
Ĵz  |j₁, j₂, J, M⟩₁₂ = Mħ |j₁, j₂, J, M⟩₁₂.
For given values of j₁ and j₂, the possible values of J and M in simultaneous eigenvectors of Ĵ₁², Ĵ₂², Ĵ² and Ĵz are restricted by the "triangular inequality",

|j₁ − j₂| ≤ J ≤ j₁ + j₂,
−J ≤ M ≤ J.   (12.89)

(Thus J can have any of the following values: |j₁ − j₂|, |j₁ − j₂| + 1, |j₁ − j₂| + 2, . . . , j₁ + j₂ − 1, j₁ + j₂. For a given J, M can have any of the following values: −J, −J + 1, . . . , J − 1, J.)
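These restrictions can be checked numerically for two spin-1/2 systems (j₁ = j₂ = 1/2). In the sketch below (ours; np.kron implements the tensor product of the two 2-dimensional spaces), the eigenvalues of Ĵ² on the four-dimensional joint space come out as J(J+1)ħ² with J = 0 once and J = 1 three times, exactly as the triangular inequality predicts:

    import numpy as np

    hbar = 1.0
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    S = [(hbar / 2) * s for s in (sx, sy, sz)]   # spin-1/2 matrices
    I2 = np.eye(2)

    # Components of J = J1 + J2: J1 acts on factor 1, J2 on factor 2
    J = [np.kron(Sk, I2) + np.kron(I2, Sk) for Sk in S]
    J2 = sum(Jk @ Jk for Jk in J)

    print(np.round(np.linalg.eigvalsh(J2), 10))
    # [0. 2. 2. 2.] in units of hbar^2, i.e. J(J+1) with J = 0 and J = 1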
Each of the eigenvectors |j₁, j₂, J, M⟩₁₂ can be written as a linear combination of the vectors |j₁, m₁⟩₁|j₂, m₂⟩₂. The coefficients of this superposition are real numbers called Clebsch-Gordan coefficients (note the spelling: Gordan, not Gordon). Following well-established tradition, we will write them ⟨j₁, j₂, m₁, m₂|J, M⟩:

|j₁, j₂, J, M⟩₁₂ = Σ_{m₁,m₂} ⟨j₁, j₂, m₁, m₂|J, M⟩ |j₁, m₁⟩₁ |j₂, m₂⟩₂.   (12.90)
Reciprocally,

|j₁, m₁⟩₁ |j₂, m₂⟩₂ = Σ_J ⟨j₁, j₂, m₁, m₂|J, M⟩ |j₁, j₂, J, M⟩₁₂.   (12.91)
It can be shown that the Clebsch-Gordan coefficient ⟨j₁, j₂, m₁, m₂|J, M⟩ is zero when M ≠ m₁ + m₂ (see Worksheet 9). Therefore, in Eq. (12.90), the double sum runs only over the values of m₁ and m₂ such that M = m₁ + m₂, and in Eq. (12.91) M is necessarily equal to m₁ + m₂. In both of these equations, the possible values of J are restricted by the triangular inequality mentioned above.
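For reference, Clebsch-Gordan coefficients can be evaluated symbolically with SymPy (an aside of ours, not part of the notes). The sketch below couples two spin-1/2 systems and prints a few of the coefficients that build the J = 1 and J = 0 states:

    from sympy import S
    from sympy.physics.quantum.cg import CG

    half = S(1) / 2

    # <j1, j2, m1, m2 | J, M> for j1 = j2 = 1/2
    for J, M, m1, m2 in [(1, 1, half, half),
                         (1, 0, half, -half),
                         (0, 0, half, -half)]:
        coeff = CG(half, m1, half, m2, S(J), S(M)).doit()
        print(f"<1/2,1/2,{m1},{m2}|{J},{M}> = {coeff}")

    # Prints 1, sqrt(2)/2 and sqrt(2)/2, respectively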
Supplement to Chapter 12
These notes complement Chapter 12. Section S.1 covers the same material as Part 1 of Lecture 17, and offers a simpler introduction to the maths of space symmetries than Section 12.3 of the notes. Sections S.2 and S.3 explain various mathematical facts referred to, in Section 12.3, as having been covered in "Problem QT2.6".
S.1 Translations and momentum

Consider the operator T(x₀) which transforms a wave function ψ(x) into the displaced wave function ψ(x − x₀):

T(x₀)ψ(x) = ψ(x − x₀).   (S.1)

One may recognize that this translation operator is the displacement operator introduced earlier in the course (Workshop 2 and Lecture 14), but let us pretend we did not know this. We note that for x₀ = 0, T(x₀) is necessarily the identity operator, I, since T(0)ψ(x) = ψ(x − 0) = ψ(x) for any x and any ψ(x). We also note that for an infinitesimal non-zero displacement dx₀, the translation operator T(dx₀) cannot be the identity operator, and must differ from it only by a term of first order in the displacement. Thus
T(dx₀) = I − (i/ħ) O dx₀,   (S.3)

where O is a certain operator which this equation defines (we could have included the factor of −i/ħ in the operator O, but have not done so, for conformity with standard practice). As it turns out, the operator O, so defined, is nothing else than the momentum operator P.
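A quick numerical way to see this (a sketch of ours; the grid and the Gaussian test function are arbitrary choices) is to compare ψ(x − dx₀) with [I − (i/ħ)P dx₀]ψ(x), where P acts on ψ as −iħ dψ/dx:

    import numpy as np

    hbar = 1.0
    x = np.linspace(-10.0, 10.0, 2001)
    psi = np.exp(-x**2)                  # an arbitrary smooth wave function
    dx0 = 1e-3                           # a small displacement

    # Exactly translated wave function psi(x - dx0)
    translated = np.exp(-(x - dx0)**2)

    # [I - (i/hbar) P dx0] psi, with P psi = -i hbar dpsi/dx
    P_psi = -1j * hbar * np.gradient(psi, x)
    first_order = psi - (1j / hbar) * dx0 * P_psi

    # The two agree up to terms of second order in dx0
    print(np.max(np.abs(translated - first_order)))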
S.2 Rotations and orbital angular momentum
First, recall that any rotation in 3D space can be described by a 3 × 3 matrix. For example, rotating a point by an angle θ about the z-axis changes its co-ordinates from (x, y, z) to (x′, y′, z′), with

\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}.   (S.11)
Rotations about the x- or y-axes can be represented similarly. Let us denote the corresponding rotation matrices by Rx(θ), Ry(θ) and Rz(θ):

R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix},   (S.12)

R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix},   (S.13)

R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.   (S.14)
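These matrices are easy to generate and test numerically. The sketch below (ours, not part of the notes) checks that each of them is orthogonal with unit determinant, and that successive rotations about one axis add their angles:

    import numpy as np

    def Rx(t):
        c, s = np.cos(t), np.sin(t)
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

    def Ry(t):
        c, s = np.cos(t), np.sin(t)
        return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

    def Rz(t):
        c, s = np.cos(t), np.sin(t)
        return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

    for R in (Rx(0.3), Ry(1.2), Rz(-0.7)):
        assert np.allclose(R.T @ R, np.eye(3))      # orthogonal: R^T R = I
        assert np.isclose(np.linalg.det(R), 1.0)    # proper rotation: det R = 1

    assert np.allclose(Rz(0.3) @ Rz(0.4), Rz(0.7))  # angles add for a fixed axis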
Now, consider a hydrogen atom in a state described by a wave function ψ(x, y, z), ignoring spin and other relativistic effects. Imagine that you rotate this atom by an angle θ about the z-axis of the system of coordinates, without otherwise disturbing it in any way. For simplicity, assume that the origin of the system of coordinates is at the centre of mass, so that rotations about the x-, y- or z-axis may change the orientation of the atom but not its location in 3D space. The rotation transforms the wave function ψ(x, y, z) into a new wave function ψ′(x, y, z), and in general ψ′(x, y, z) will not be the same function as ψ(x, y, z). However, ψ′(x, y, z) will not differ much from ψ(x, y, z) if the rotation is by a very small angle. More precisely, let us consider a rotation about the z-axis by an angle ε. Then,

ψ′(x, y, z) = ψ(x, y, z) + ε ( y ∂/∂x − x ∂/∂y ) ψ(x, y, z) + · · · ,   (S.15)

where the terms not written in the right-hand side are quadratic or of higher order in ε.
+ Proof: First, we note that rotating both the state and the co-ordinates amounts to no change. Hence ψ′(x, y, z) = ψ(x″, y″, z″) if (x″, y″, z″) are the initial co-ordinates of the point brought to the point of co-ordinates (x, y, z) by the rotation. For a rotation by an angle ε these two sets of co-ordinates are related to each other by Eq. (S.11) with θ = ε:

\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \cos\epsilon & -\sin\epsilon & 0 \\ \sin\epsilon & \cos\epsilon & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x'' \\ y'' \\ z'' \end{pmatrix}.   (S.16)

We can invert this equation to express x″, y″ and z″ in terms of x, y and z. We can do this simply by changing ε into −ε (a rotation by θ followed by a rotation by −θ is the same as no rotation at all, hence the product of Rz(θ) by Rz(−θ) has got to be the unit matrix). Therefore

\begin{pmatrix} x'' \\ y'' \\ z'' \end{pmatrix} = \begin{pmatrix} \cos\epsilon & \sin\epsilon & 0 \\ -\sin\epsilon & \cos\epsilon & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}.   (S.17)

Since we work to first order in ε, we can replace cos ε by 1 (recall that cos ε = 1 − ε²/2 + · · ·) and sin ε by ε. Thus, to first order in ε, x″ = x + εy, y″ = y − εx and z″ = z. Moreover, also to first order in ε,

ψ(x″, y″, z″) = ψ(x, y, z) + ε (dψ/dε)|ε=0   (S.18)
            = ψ(x, y, z) + ε [ (∂ψ/∂x″)(dx″/dε) + (∂ψ/∂y″)(dy″/dε) ]|ε=0.   (S.19)

Since dx″/dε = y and dy″/dε = −x,

ψ(x″, y″, z″) = ψ(x, y, z) + ε [ y ∂ψ/∂x″ − x ∂ψ/∂y″ ]|ε=0.   (S.20)

We also note that

(∂ψ(x″, y″, z″)/∂x″)|ε=0 = ∂ψ(x, y, z)/∂x,   (S.21)
(∂ψ(x″, y″, z″)/∂y″)|ε=0 = ∂ψ(x, y, z)/∂y.   (S.22)

Therefore, to first order in ε,

ψ(x″, y″, z″) = ψ(x, y, z) + ε ( y ∂/∂x − x ∂/∂y ) ψ(x, y, z),   (S.23)

which is Eq. (S.15).
The quadratic and higher order terms can be ignored in the right-hand side of Eq. (S.15) if ε is infinitesimal. Namely, for a rotation by an infinitesimal angle dα,

ψ′(x, y, z) = [ 1 + dα ( y ∂/∂x − x ∂/∂y ) ] ψ(x, y, z).   (S.24)
In analogy with Eq. (S.3), we write this equation in the form

ψ′(x, y, z) = [ 1 − (i/ħ) Jz dα ] ψ(x, y, z),   (S.25)

which defines the operator Jz. Clearly,

Jz = −iħ ( x ∂/∂y − y ∂/∂x ),   (S.26)

and we recognize that Jz is nothing else than Lz, the z-component of the orbital angular momentum operator.
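One can check Eq. (S.15), and hence this identification, on a concrete wave function. The SymPy sketch below (our own illustration; the Gaussian test function is an arbitrary choice) expands ψ(x″, y″) in powers of ε and compares the first-order term with (y ∂/∂x − x ∂/∂y)ψ:

    import sympy as sp

    x, y, eps = sp.symbols('x y epsilon', real=True)
    psi = sp.exp(-(x**2 + 2 * y**2))     # an arbitrary smooth test function

    # Pre-image co-ordinates, Eq. (S.17): x'' = x cos(eps) + y sin(eps), etc.
    x2 = x * sp.cos(eps) + y * sp.sin(eps)
    y2 = -x * sp.sin(eps) + y * sp.cos(eps)
    rotated = psi.subs({x: x2, y: y2}, simultaneous=True)

    # First-order term of the rotated function, and the claim of Eq. (S.15)
    first_order = sp.diff(rotated, eps).subs(eps, 0)
    expected = y * sp.diff(psi, x) - x * sp.diff(psi, y)

    print(sp.simplify(first_order - expected))   # prints 0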
We will also need the expansions of the rotation matrices to second order in the angle. Replacing cos ε by 1 − ε²/2 and sin ε by ε in Eqs. (S.12)-(S.14) gives

R_x(\epsilon) \approx \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1-\epsilon^2/2 & -\epsilon \\ 0 & \epsilon & 1-\epsilon^2/2 \end{pmatrix},   (S.27)

R_y(\epsilon) \approx \begin{pmatrix} 1-\epsilon^2/2 & 0 & \epsilon \\ 0 & 1 & 0 \\ -\epsilon & 0 & 1-\epsilon^2/2 \end{pmatrix},   (S.28)

R_z(\epsilon) \approx \begin{pmatrix} 1-\epsilon^2/2 & -\epsilon & 0 \\ \epsilon & 1-\epsilon^2/2 & 0 \\ 0 & 0 & 1 \end{pmatrix}.   (S.29)

One can then show that, to second order in ε,

R_y(-\epsilon)\,R_x(-\epsilon)\,R_y(\epsilon)\,R_x(\epsilon) \approx R_z(-\epsilon^2).   (S.30)
+ Proof: We start by calculating the matrix product Ry(ε)Rx(ε):

R_y(\epsilon)R_x(\epsilon) \approx \begin{pmatrix} 1-\epsilon^2/2 & 0 & \epsilon \\ 0 & 1 & 0 \\ -\epsilon & 0 & 1-\epsilon^2/2 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1-\epsilon^2/2 & -\epsilon \\ 0 & \epsilon & 1-\epsilon^2/2 \end{pmatrix}   (S.31)

\approx \begin{pmatrix} 1-\epsilon^2/2 & \epsilon^2 & \epsilon-\epsilon^3/2 \\ 0 & 1-\epsilon^2/2 & -\epsilon \\ -\epsilon & \epsilon-\epsilon^3/2 & 1-2(\epsilon^2/2)+\epsilon^4/4 \end{pmatrix}.   (S.32)

Keeping only the terms of order 0, 1 or 2 in ε results in

R_y(\epsilon)R_x(\epsilon) \approx \begin{pmatrix} 1-\epsilon^2/2 & \epsilon^2 & \epsilon \\ 0 & 1-\epsilon^2/2 & -\epsilon \\ -\epsilon & \epsilon & 1-\epsilon^2 \end{pmatrix}.   (S.33)

We now calculate the product of the four matrices, using the fact that the matrix Ry(−ε)Rx(−ε) is simply the matrix Ry(ε)Rx(ε) with ε changed into −ε. (It would have been as good, and more expedient, to drop terms of order higher than 2 as soon as they arise; the top left element of the product, for example, is 1 − 2(ε²/2) + ε⁴/4 + ε² before simplification, and the top middle element is 2ε² − 2(ε⁴/2) − ε².) We have

R_y(-\epsilon)R_x(-\epsilon)R_y(\epsilon)R_x(\epsilon) \approx \begin{pmatrix} 1-\epsilon^2/2 & \epsilon^2 & -\epsilon \\ 0 & 1-\epsilon^2/2 & \epsilon \\ \epsilon & -\epsilon & 1-\epsilon^2 \end{pmatrix} \begin{pmatrix} 1-\epsilon^2/2 & \epsilon^2 & \epsilon \\ 0 & 1-\epsilon^2/2 & -\epsilon \\ -\epsilon & \epsilon & 1-\epsilon^2 \end{pmatrix}   (S.34)

Carrying out the multiplication and dropping all terms of order higher than 2 in ε, one finds

R_y(-\epsilon)R_x(-\epsilon)R_y(\epsilon)R_x(\epsilon) \approx \begin{pmatrix} 1 & \epsilon^2 & 0 \\ -\epsilon^2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.   (S.36)
Hence, to order ε², Ry(−ε)Rx(−ε)Ry(ε)Rx(ε) = Rz(−ε²). We can be sure that all the terms of order 0, 1 or 2 in ε are taken into account in the above calculations because cos θ differs from 1 − θ²/2 only by terms of order θ⁴ and higher, and sin θ differs from θ only by terms of order θ³ and higher.
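The identity (S.30) is also easy to confirm numerically (a sketch of ours; the rotation matrices are redefined here so that the snippet is self-contained). The difference between the two sides shrinks like ε³:

    import numpy as np

    def Rx(t):
        c, s = np.cos(t), np.sin(t)
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

    def Ry(t):
        c, s = np.cos(t), np.sin(t)
        return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

    def Rz(t):
        c, s = np.cos(t), np.sin(t)
        return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

    for eps in (0.1, 0.01, 0.001):
        lhs = Ry(-eps) @ Rx(-eps) @ Ry(eps) @ Rx(eps)
        rhs = Rz(-eps**2)
        print(eps, np.max(np.abs(lhs - rhs)))   # residual of order eps**3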
Formally, the relation between the state of a rotated system and the corresponding state of the unrotated system can be expressed by an equation such as |ψ′⟩ = R̂z(θ)|ψ⟩, where R̂z(θ) is an operator corresponding to a rotation by an angle θ about the z-axis. Suppose that we were to rotate the system by an angle θ₁ about the x-axis and then by an angle θ₂ about the y-axis. Correspondingly, the initial state vector |ψ⟩ would be transformed into the state vector |ψ″⟩ = R̂y(θ₂)R̂x(θ₁)|ψ⟩. It can be shown that these rotation operators can be written in the following forms:

R̂x(θ) = exp(−iθĴx/ħ),  R̂y(θ) = exp(−iθĴy/ħ),  R̂z(θ) = exp(−iθĴz/ħ),

and, for a rotation about an arbitrary axis n̂, R̂n̂(θ) = exp(−iθĴn̂/ħ). The operators Ĵx, Ĵy, Ĵz and Ĵn̂ are defined by these equations. They have the same physical dimensions as ħ, which are the physical dimensions of an angular momentum.
Eq. (S.30) says that if ε is sufficiently small, then a rotation by ε about the x-axis followed by a rotation by ε about the y-axis followed by a rotation by −ε about the x-axis followed by a rotation by −ε about the y-axis is effectively the same as a rotation by −ε² about the z-axis. In terms of rotation operators, we must have, correspondingly,

R̂y(−ε) R̂x(−ε) R̂y(ε) R̂x(ε) = R̂z(−ε²)   (S.42)

to second order in ε. This equation implies that [Ĵx, Ĵy] = iħĴz, i.e., the operators Ĵx and Ĵy satisfy the commutation relation of angular momentum operators.
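The operator version can be checked numerically for j = 1/2 (a sketch of ours; scipy.linalg.expm computes the matrix exponential). The two sides of Eq. (S.42) agree up to terms of order ε³:

    import numpy as np
    from scipy.linalg import expm

    hbar = 1.0
    Jx = (hbar / 2) * np.array([[0, 1], [1, 0]], dtype=complex)
    Jy = (hbar / 2) * np.array([[0, -1j], [1j, 0]])
    Jz = (hbar / 2) * np.array([[1, 0], [0, -1]], dtype=complex)

    def R(J, angle):
        # rotation operator exp(-i angle J / hbar)
        return expm(-1j * angle * J / hbar)

    for eps in (0.1, 0.01):
        lhs = R(Jy, -eps) @ R(Jx, -eps) @ R(Jy, eps) @ R(Jx, eps)
        rhs = R(Jz, -eps**2)
        print(eps, np.max(np.abs(lhs - rhs)))   # decreases like eps**3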
+ Proof: To second order in ε,

R̂x(ε) ≈ Î − (i/ħ) ε Ĵx − (ε²/2ħ²) Ĵx²,   (S.43)

and similarly for R̂y(ε). We rewrite Eq. (S.42), replacing the rotation operators by these approximate expressions, and simplify the result. Let us start with R̂y(ε)R̂x(ε):

R̂y(ε)R̂x(ε) ≈ [ Î − (i/ħ) ε Ĵy − (ε²/2ħ²) Ĵy² ] [ Î − (i/ħ) ε Ĵx − (ε²/2ħ²) Ĵx² ]   (S.44)

≈ Î − (i/ħ) ε ( Ĵx + Ĵy ) − (ε²/ħ²) ( Ĵy Ĵx + Ĵx²/2 + Ĵy²/2 ).   (S.45)

Then, changing ε into −ε gives R̂y(−ε)R̂x(−ε), and multiplying the two products while keeping only the terms of order 0, 1 or 2 in ε yields

R̂y(−ε)R̂x(−ε)R̂y(ε)R̂x(ε) ≈ Î + (ε²/ħ²) ( Ĵx Ĵy − Ĵy Ĵx ).

Comparing with R̂z(−ε²) ≈ Î + (i/ħ) ε² Ĵz, as required by Eq. (S.42), gives [Ĵx, Ĵy] = iħĴz.
Appendix A: Multiplying matrices, column vectors and row vectors

This appendix is just a brief reminder of how to multiply two matrices, a column vector by a matrix, a column vector by a row vector, a matrix by a row vector, or a row vector by a column vector. The rule is always the same: one multiplies each element of a row by each element of a column and sums up the products.

1. Multiplying two matrices. The result is a matrix:

\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \alpha a + \beta c & \alpha b + \beta d \\ \gamma a + \delta c & \gamma b + \delta d \end{pmatrix}.   (A.1)

2. Multiplying a column vector by a matrix (the matrix on the left of the column vector). The result is a column vector:

\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \alpha a + \beta b \\ \gamma a + \delta b \end{pmatrix}.   (A.2)
3. Multiplying a column vector by a row vector (the row vector on the left of the column vector). The result is a number:

\begin{pmatrix} a & b \end{pmatrix} \begin{pmatrix} a' \\ b' \end{pmatrix} = a a' + b b'.   (A.3)
4. Multiplying a matrix by a row vector (the row vector on the left of the matrix). The result is a row vector:

\begin{pmatrix} a & b \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a\alpha + b\gamma & a\beta + b\delta \end{pmatrix}.   (A.4)
5. Multiplying a row vector by a column vector (the row vector on the right of the column vector). The result is a matrix:

\begin{pmatrix} a \\ b \end{pmatrix} \begin{pmatrix} a' & b' \end{pmatrix} = \begin{pmatrix} a a' & a b' \\ b a' & b b' \end{pmatrix}.   (A.5)
Appendix B: Complex numbers

Familiarity with complex numbers is essential in Quantum Mechanics, even though not every quantum mechanical calculation involves them. All the essentials are summarised here. In principle, this appendix contains nothing that you have not already seen previously. Please refer to your maths courses, to a maths textbook and/or to reliable online material if you are unfamiliar with any of the results stated below.
Common errors

1. Referring to complex numbers in general as imaginary numbers. (Not a bad error, but the practice suggests a lack of familiarity with complex numbers. See below.)

2. Confusing the squared modulus of a complex number with the square of this number. (A bad error, see below.)
• A complex number z can always be written in the form z = a + ib, where a and b are real numbers and i² = −1. a is called the real part of z and b is called the imaginary part of z. The corresponding symbols are Re z and Im z, or alternatively ℜ(z) and ℑ(z):

a = Re z,  b = Im z.   (B.1)

• The complex conjugate of z, denoted z∗, is the complex number a − ib. Note that

Re z = (z + z∗)/2,  Im z = (z − z∗)/(2i).   (B.2)
• The modulus of z, denoted |z|, is the real number √(a² + b²). Note that z and z∗ have the same modulus. Moreover, zz∗ = z∗z = |z|²:

zz∗ = (a + ib)(a − ib) = a² + b² = |z|².   (B.3)

Unless z is real, the modulus squared of z, |z|², is not the same number as the square of z, z². E.g.,

|2 + 3i|² = 2² + 3² = 13,   (B.4)

whereas

(2 + 3i)² = 2² + 2 × 2 × 3i + (3i)² = −5 + 12i.   (B.5)
Complex exponentials

The familiar exponential function can be generalized to a function of a complex variable, exp(z). This function crops up very often in applications.

• exp(a + ib) = exp(a) exp(ib). By Euler's formula, exp(ib) = cos b + i sin b, so that | exp(ib)| = 1 for real b; hence, if a and b are real, | exp(a + ib)| = exp(a).
• z can always be written as |z| exp(iα), where α is a real number called the argument of z. One can write
α = arg z. (B.9)
(The argument of a complex number is not unique: if α is such that z = |z| exp(iα),
then z = |z| exp[i(α + 2nπ)], n = 0, ±1, ±2, . . .) Note that if z = |z| exp(iα),
then z ∗ = |z| exp(−iα).
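Python has complex numbers built in, which makes these facts easy to experiment with (a small illustration of ours):

    import cmath

    z = 2 + 3j
    print(abs(z) ** 2)                 # ~13: the squared modulus |z|^2
    print(z ** 2)                      # (-5+12j): the square z^2, a different number
    print(z * z.conjugate())           # (13+0j): z times z* equals |z|^2
    r, alpha = abs(z), cmath.phase(z)  # modulus |z| and argument arg z
    print(r * cmath.exp(1j * alpha))   # recovers z = |z| exp(i alpha)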