Functional Analysis
Roman Vershynin
Department of Mathematics, University of Michigan, 530 Church
St., Ann Arbor, MI 48109, U.S.A.
E-mail address: [email protected]
Preface
These notes are for a one-semester graduate course in Functional Analysis,
which is based on measure theory. The notes correspond to the course Real Analysis
II, which the author taught at the University of Michigan in Fall 2010. The course
consists of about 40 lectures of 50 minutes each.
The student is assumed to be familiar with measure theory (both Lebesgue and
abstract), have a good command of basic real analysis (epsilon-delta) and abstract
linear algebra (linear spaces and transformations).
The course develops the theory of Banach and Hilbert spaces and bounded
linear operators. The main principles of functional analysis are covered in depth, including the Hahn-Banach theorem, the open mapping theorem, the closed graph theorem, the principle of uniform boundedness, and the Banach-Alaoglu theorem. Fourier series are developed for
general orthogonal systems in Hilbert spaces. Compact operators and basics of
Fredholm theory are covered.
Spectral theory for bounded operators is studied in the second half of the course.
This includes the spectral theory for compact self-adjoint operators, functional
calculus and basic spectral theory of general (non-compact) operators, although
the latter needs to be expanded a bit.
Topics not covered include: Krein-Milman theorem (although this can be done
with one extra lecture), unbounded linear operators, and the Fourier transform. Most
applications to ODE and PDE are not covered; however, integral operators serve
as the main example of operators in this course.
The material has been compiled from several textbooks, including Eidelman,
Milman and Tsolomitis, Functional Analysis; Kirillov and Gvishiani, Theorems
and problems in functional analysis; Reed and Simon, Methods of modern mathematical physics. I. Functional analysis; V. Kadets, A course in functional analysis (Russian); and P. Knyazev, Functional analysis. Minor borrowings are made
from Yosida, Functional analysis; Rudin, Functional analysis; and Conway, A
course in functional analysis. For some topics not covered, one may try R. Zimmer,
Essential results of functional analysis.
Acknowledgement. The author is grateful to his students in the Math 602
course Real Analysis II, Winter 2010, who suggested numerous corrections for these
notes. Special thanks go to Matthew Masarik for his numerous thoughtful remarks,
corrections, and suggestions, which improved the presentation of this material.
Contents
Preface
Bibliography
Index
CHAPTER 1
Pr s
x a,b
This is clearly a metric, so a function space becomes not only a linear vector space
but also a metric space. Such spaces will be called normed spaces later. Another
natural choice of a distance would be the
b
Also there are many natural examples of sequence spaces that are linear vector
spaces (check!):
1. $s = \{\text{all sequences of real numbers } (a_n)_{n=1}^\infty\}$. This space is too large, and is
never studied.
$\ell_1 \subset c_0 \subset c \subset \ell_\infty \subset s$.
1.1.4. Hamel basis. As we know, every finite dimensional linear vector space
$E$ has a basis $\{x_1, \dots, x_n\}$. A basis is a maximal linearly independent subset of
vectors in $E$. The number $n$ of basis elements is called the dimension of $E$; this
number is independent of the choice of the basis. Every vector $x \in E$ can be
uniquely expressed as a linear combination of the basis elements:
(1.1) $x = \sum_{k=1}^n a_k x_k$, for some $a_k \in \mathbb{R}$.
pq pq
ra, bs.
(1.2) $x = \sum_{k=1}^n a_k x_k$, $x_k \in \mathcal{X}$.
Exercise 1.1.5. Show that each of the following two statements gives
an equivalent definition of Hamel basis:
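(1) A Hamel basis is a maximal linearly independent$^2$ subset $\mathcal{X} \subseteq E$.
(2) A Hamel basis is a linearly independent subset of $E$ which spans
$E$. The latter means that the linear span of $\mathcal{X}$, defined as
$\operatorname{Span}(\mathcal{X}) := \Big\{ x = \sum_{k=1}^n a_k x_k : a_k \in \mathbb{R},\ x_k \in \mathcal{X},\ n \in \mathbb{N} \Big\}$,
coincides with $E$.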
Since we have no topology on E, we have to consider finite sums in (1.2). This
requirement is too strong to be put in practice, which makes Hamel bases essentially
impractical (except in theory). We will come across the more practical notion of
Schauder basis later.
Proposition 1.1.6. Every linear vector space E has a Hamel basis.
For finite dimensional spaces E, this result is usually proved in undergraduate
linear algebra using induction. One keeps adding linearly independent elements
to a set until it spans the whole E. This argument can be pushed into infinite
dimensions as well, where the usual induction is replaced by transfinite induction.
The transfinite induction is best done with Zorn's lemma (review a Wikipedia article
on Zorn's lemma if you are uncomfortable with any of the notions it uses):
Lemma 1.1.7 (Zorn's lemma). A partially ordered set in which every chain has
an upper bound contains a maximal element.
Proof of Proposition 1.1.6. Consider the family $\mathcal{F}$ of all linearly independent subsets of $E$, which is partially ordered by inclusion. We claim that $\mathcal{F}$ has a
maximal element; this would obviously complete the proof by Exercise 1.1.5. We
will get a maximal element from Zorn's lemma. Let us check its assumption. Consider a chain $(\mathcal{X}_\alpha)$ of elements in $\mathcal{F}$. The elements $\mathcal{X}_\alpha$ are linearly independent
subsets of $E$ totally ordered by inclusion. Therefore, their union $\bigcup_\alpha \mathcal{X}_\alpha$ is again a
linearly independent subset of $E$ (check!). Hence this union is an element of $\mathcal{F}$, and
it is clearly an upper bound for the chain $(\mathcal{X}_\alpha)$. The assumption of Zorn's lemma
is therefore satisfied, and the proof is complete.
As in the finite dimensional case, the cardinality of a Hamel basis of $E$ is called
the dimension of E; one can show that the dimension is independent of the choice
of a Hamel basis.
Example 1.1.8. Here we consider some of the examples of linear vector spaces
given in Section 1.1.2.
2Linear independence means that every finite subset of X is linearly independent in the
1. $\dim(\mathbb{R}^n) = n$, $\dim(\mathbb{C}^n) = n$.
2. $\dim(P_n(x)) = n + 1$; the monomials $\{1, x, x^2, \dots, x^n\}$ form a basis.
3. $\dim(P(x)) = \infty$; the monomials $\{1, x, x^2, \dots\}$ form a Hamel basis.
4. $\dim(c_{00}) = \infty$; the coordinate vectors $(0, \dots, 0, 1, 0, \dots)$ form a Hamel basis.
Remark 1.1.9. Unfortunately, the notion of Hamel basis is too strong. Except
in the spaces $P(x)$ and $c_{00}$ (which are isomorphic; why?), no explicit constructions are
known in any other infinite dimensional vector space. It would be great to have
a construction of a Hamel basis in $C[0, 1]$, for example. However, Hamel bases
usually have to be uncountable; see a later exercise.
1.1.5. Quotient spaces. The notion of quotient space allows one to easily
collapse some directions in linear vector spaces. One reason for doing this is when
one has unimportant directions and would like to neglect them; see the construction
of $L_1$ below.
Definition 1.1.10 (Quotient space). Let $E_1$ be a subspace of a linear vector
space $E$. Consider an equivalence relation on $E$ defined as
$x \sim y$ if $x - y \in E_1$.
The quotient space $E/E_1$ is then defined as the set of equivalence classes (cosets)
$[x]$ for all $x \in E$.
The quotient space is a linear space, with operations defined as
$[x] + [y] := [x + y]$, $a[x] := [ax]$ for $x, y \in E$, $a \in \mathbb{R}$.
Exercise 1.1.11. Prove that the operations above are well defined,
and that the quotient space is indeed a linear space.
Remark 1.1.12. 1. Observe that $[x]$ is an affine subspace:
$[x] = x + E_1 := \{x + h : h \in E_1\}$.
codimpE1 q dimpE q.
Example 1.1.13 (Space $L_1$). The notion of quotient space comes in handy when
we define the space of integrable functions $L_1 = L_1(\Omega, \Sigma, \mu)$ where $(\Omega, \Sigma, \mu)$ is an
arbitrary measure space. We first consider
$E := \{\text{all integrable functions } f \text{ on } (\Omega, \Sigma, \mu)\}$.
To identify functions that are equal $\mu$-almost everywhere, we consider the subspace
we would like to neglect:
$E_1 := \{\text{all functions } f = 0 \ \mu\text{-almost everywhere}\}$.
Then we define
$L_1 = L_1(\Omega, \Sigma, \mu) := E/E_1$.
This way, the elements of $L_1$ are, strictly speaking, not functions but
equivalence classes.$^3$ But in practice, one thinks of an $f \in L_1$ as a function, keeping in
mind that functions that coincide $\mu$-almost everywhere are the same.
Example 1.1.14 (Space $L_\infty$). A similar procedure is used to define the space
of essentially bounded functions $L_\infty = L_\infty(\Omega, \Sigma, \mu)$. A real valued function $f$ on $\Omega$
is called essentially bounded if there exists a bounded function $g$ on $\Omega$ such that
$f = g$ $\mu$-almost everywhere. Similar to the previous example, we consider the linear
vector space
$E := \{\text{all essentially bounded functions } f \text{ on } (\Omega, \Sigma, \mu)\}$
and the subspace we would like to neglect:
$E_1 := \{\text{all functions } f = 0 \ \mu\text{-almost everywhere}\}$.
Then we define
$L_\infty = L_\infty(\Omega, \Sigma, \mu) := E/E_1$.
for some $a \in \mathbb{R}$, $z \in c_0$.
This example shows that the space c0 makes up almost the whole space c,
except for one dimension given by the constant sequences. This explains why the
space c is rarely used in practice; one prefers to work with c0 which is almost the
same as c but has the advantage that we know the limits of all sequences there
(zero).
1.1.6. Linear operators. This is a quick review of the classical linear algebra
concept.
Definition 1.1.16 (Linear operator). A map $T : E \to F$ between two linear
vector spaces $E$ and $F$ is called a linear operator if it preserves the operations of
addition of vectors and multiplication by scalars, i.e.
$T(ax + by) = aT(x) + bT(y)$ for all $x, y \in E$, $a, b \in \mathbb{R}$.
$\operatorname{Im}(T) = \{Tx : x \in E\}$.
3 Even more strictly speaking, the representative functions $f$ in $L_1$ may take infinite values,
too. However, every integrable function is finite a.e. So every such function is equivalent to a
function that is finite everywhere.
4 One usually writes $Tx$ instead of $T(x)$.
Such an operator is well defined, e.g., on the space of polynomials, $T : P(x) \to P(x)$.
But usually one prefers to have a differential operator on a larger space; for example
$T : C^1[0, 1] \to C[0, 1]$ is also well defined.
Example 1.1.18 (Embedding and quotient map). Given a subspace $E_1$ of a
linear vector space $E$, there are two canonical linear operators associated with it:
1. Embedding $\iota : E_1 \to E$, which acts as the identity, $\iota(x) = x$;
2. Quotient map $q : E \to E/E_1$, which acts as $q(x) = [x]$.
Example 1.1.19 (Shifts on sequence spaces). On any sequence space such as
$c_{00}$, $c_0$, $c$, $\ell_\infty$, $\ell_1$, one can define the right and left shift operators respectively as
$R(x) = (0, x_1, x_2, \dots)$; $L(x) = (x_2, x_3, \dots)$.
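The two shifts can be tried out on finitely supported sequences. The sketch below is only an illustration: it assumes a sequence in $c_{00}$ is stored as a Python list of its leading terms (trailing zeros implicit), and the function names are ours, not notation from these notes. It shows that the left shift undoes the right shift, but not the other way around.

```python
def right_shift(x):
    """R(x) = (0, x1, x2, ...): prepend a zero."""
    return [0] + list(x)


def left_shift(x):
    """L(x) = (x2, x3, ...): drop the first coordinate."""
    return list(x[1:])


x = [1, 2, 3]  # stands for the sequence (1, 2, 3, 0, 0, ...)
print(left_shift(right_shift(x)))   # L(R(x)) recovers x
print(right_shift(left_shift(x)))   # R(L(x)) replaces the first coordinate by 0
```

So $L \circ R$ is the identity, while $R \circ L$ is not: the right shift is injective but not surjective, and the left shift is surjective but not injective.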
$x = x_1 + x_2$ for some $x_1 \in E_1$, $x_2 \in E_2$.
Exercise 1.1.24. [Injectivization] This is a linear version of the fundamental theorem on homomorphisms for groups. Consider a linear
operator $T : E \to F$ acting between linear spaces $E$ and $F$. The operator
$T$ may not be injective; we would like to make it into an injective operator. To this end, we consider the map $\tilde{T} : E/\ker T \to F$ which sends
every coset $[x]$ into the vector $Tx$, i.e. $\tilde{T}[x] = Tx$.
(i) Prove that $\tilde{T}$ is well defined, i.e. $[x_1] = [x_2]$ implies $Tx_1 = Tx_2$.
(ii) Check that $\tilde{T}$ is a linear and injective operator.
(iii) Check that if $T$ is surjective then $\tilde{T}$ is also surjective, and thus $\tilde{T}$ is a
linear isomorphism between $E/\ker T$ and $F$.
Axiom (iii) is called the triangle inequality for the following reason. Given an
arbitrary triangle in $E$ with vertices $x, y, z \in E$, its side lengths satisfy the inequality
(1.3) $\|x - z\| \le \|x - y\| + \|y - z\|$,
which follows from norm axiom (iii). For the usual Euclidean length on the plane,
this is the ordinary triangle inequality.
The normed space is naturally a metric space, with the metric defined by
$d(x, y) := \|x - y\|$.
The norm axioms, and in particular the triangle inequality (1.3), show that this is
indeed a metric (check!).
Exercise 1.2.2. [Normed spaces $\ell_\infty$, $c$, $c_0$, $\ell_1$, $C(K)$, $L_1$, $L_\infty$] Many of the
linear vector spaces introduced in Section 1.1.2 and Example 1.1.12 are
in fact normed spaces. Check the norm axioms for them:
1. The space of bounded sequences $\ell_\infty$ is a normed space, with the norm
defined as
(1.4) $\|x\|_\infty := \sup_i |x_i|$.
2. The spaces $c$ and $c_0$ are normed spaces, with the same sup-norm as in
(1.4).
3. The space of summable sequences $\ell_1$ is a normed space, with the norm
defined as
$\|x\|_1 := \sum_{i=1}^\infty |x_i|$.
4. The space $C(K)$ is a normed space, with the norm defined as
$\|f\|_\infty := \max_{x \in K} |f(x)|$.
5. The space $L_1 = L_1(\Omega, \Sigma, \mu)$ is a normed space, with the norm defined
as$^6$
$\|f\|_1 := \int |f(x)| \, d\mu$.
$\|f\|_\infty := \inf_{\mu(A) = 0} \ \sup_{t \in \Omega \setminus A} |f(t)|$.
rs
}r s}
| p q|
rs
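Definition 1.2.6 (Convex functions and sets). Let $E$ be a linear vector space.
A function $f : E \to \mathbb{R}$ is convex if for all $x, y \in E$, $\lambda \in [0, 1]$ one has
$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$.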
P BX implies x P BX
1. If the function $x \mapsto \|x\|$ is convex, then the triangle inequality is satisfied, and
$\|\cdot\|$ is a norm on $E$.
2. If the sublevel set $\{x \in E : \|x\| \le 1\}$ is convex, then $\|\cdot\|$ is a norm on $E$.
Proof. 1. Convexity ensures that for every $x, y \in E$, $\lambda \in [0, 1]$ we have
$\|\lambda x + (1 - \lambda) y\| \le \lambda \|x\| + (1 - \lambda) \|y\|$.
The triangle inequality follows from this for $\lambda = 1/2$.
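2. This statement is less trivial, and cannot be obtained from the first one
directly. Indeed, while it is true that the sublevel sets of a convex function are
convex sets, the converse statement may fail (construct an example!).
The assumption states that, for $u, v \in E$ we have:
if $\|u\| \le 1$, $\|v\| \le 1$ and $\lambda \in [0, 1]$, then $\|\lambda u + (1 - \lambda) v\| \le 1$.
Let $x, y \in E$; we want to show that $\|x + y\| \le \|x\| + \|y\|$. This is equivalent to
(1.5) $\Big\| \frac{x}{\|x\| + \|y\|} + \frac{y}{\|x\| + \|y\|} \Big\| \le 1$.
We obtain (1.5) by applying the assumption with
$u = \frac{x}{\|x\|}$, $v = \frac{y}{\|y\|}$, $\lambda = \frac{\|x\|}{\|x\| + \|y\|}$.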
Lec. 4: 09/15/2010
|f pxq|p d 8.
| pq
f t
g ptq| p
|f ptq|
|gptq|p ,
t P .
$\|f\|_p := \Big( \int |f(x)|^p \, d\mu \Big)^{1/p}$ for $f \in L_p(\Omega, \Sigma, \mu)$.
Proof. Norm axioms (i) and (ii) are straightforward. Axiom (iii), the triangle
inequality, will follow from Proposition 1.2.8. To this end, it suffices to check that
the sublevel set
$B_p := \{f \in L_p : \|f\|_p \le 1\}$
is a convex set. To prove this, let us fix $f, g \in B_p$ and $\lambda \in [0, 1]$. Since the function
$z \mapsto |z|^p$ is convex on $\mathbb{R}$ for $p \ge 1$, we have a pointwise inequality
f t
f t
g }p
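Theorem 1.2.11 (Minkowski inequality in $L_p$). Let $p \in [1, \infty)$. Then, for every
two functions $f, g \in L_p(\Omega, \Sigma, \mu)$ one has
$\Big( \int |f(t) + g(t)|^p \, d\mu \Big)^{1/p} \le \Big( \int |f(t)|^p \, d\mu \Big)^{1/p} + \Big( \int |g(t)|^p \, d\mu \Big)^{1/p}$.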
$\sum_{i=1}^\infty |x_i|^p < \infty$.
$\|x\|_p := \Big( \sum_{i=1}^\infty |x_i|^p \Big)^{1/p}$.
Writing down Minkowski inequality for this specific measure space, we obtain:
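Theorem 1.2.12 (Minkowski inequality in $\ell_p$). Let $p \in [1, \infty)$. Then, for every
two sequences $x, y \in \ell_p$ one has
$\Big( \sum_{i=1}^\infty |x_i + y_i|^p \Big)^{1/p} \le \Big( \sum_{i=1}^\infty |x_i|^p \Big)^{1/p} + \Big( \sum_{i=1}^\infty |y_i|^p \Big)^{1/p}$.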
A remarkable family of finite-dimensional spaces $L_p(\Omega, \Sigma, \mu)$ is formed by considering $\Omega$ to be a finite set, say $\{1, \dots, n\}$, and $\mu$ to be the counting measure on
$\Omega$. The resulting space is called $\ell_p^n$. The functions in $\ell_p^n$ can be obviously identified
with vectors in $\mathbb{R}^n$. Thus $\ell_p^n = (\mathbb{R}^n, \|\cdot\|_p)$ with the norm
$\|x\|_p := \Big( \sum_{i=1}^n |x_i|^p \Big)^{1/p}$.
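The Minkowski inequality in $\ell_p^n$ is easy to test numerically. The following is a minimal sketch, not part of the course material: the helper names `p_norm` and `minkowski_gap` are ours, vectors are plain Python lists, and a small tolerance is assumed for floating point rounding.

```python
import random


def p_norm(x, p):
    """The l_p^n norm (sum |x_i|^p)^(1/p) of a finite real vector."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def minkowski_gap(x, y, p):
    """Return ||x||_p + ||y||_p - ||x + y||_p; Minkowski says this is >= 0 for p >= 1."""
    s = [a + b for a, b in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(s, p)


random.seed(0)  # fixed seed so the experiment is reproducible
for p in (1, 1.5, 2, 3, 10):
    for _ in range(1000):
        x = [random.uniform(-1, 1) for _ in range(5)]
        y = [random.uniform(-1, 1) for _ in range(5)]
        assert minkowski_gap(x, y, p) >= -1e-12  # tolerance for rounding
```

The restriction $p \ge 1$ matters: for $p = 1/2$ and the coordinate vectors $x = (1, 0)$, $y = (0, 1)$ the same expression gives $\|x + y\|_p = 4 > 2 = \|x\|_p + \|y\|_p$, so the triangle inequality fails.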
$\|[x]\| := \inf_{y \in Y} \|x + y\|$.
a A
Then clearly
(1.6)
Proof. First we observe that the number $\|[x]\|$ is well defined, i.e. it does not
depend on the choice of a representative $x$ in the coset $[x]$. This clearly follows from
the geometric definition (1.6).
Next, we have to check the three norm axioms.
(i) Assume that $\|[x]\| = 0$. Then, from the geometric definition (1.6) we see
that $0$ is a limit point of $[x]$. Since $Y$ is closed, so is $[x] = x + Y$. Therefore $0 \in [x]$.
Hence $[x] = [0]$, which verifies norm axiom (i).
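(ii) Let $x \in X$ and $\lambda \in \mathbb{R}$. Then
$\|[\lambda x]\| = \inf_{y \in Y} \|\lambda x + y\| = \inf_{y \in Y} \|\lambda x + \lambda y\| = |\lambda| \inf_{y \in Y} \|x + y\| = |\lambda| \, \|[x]\|$.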
}x1
y1 } }rx1 s} }x1
y1 },
}x2
y2 } }rx2 s} }x2
y2 }.
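$\|x_1 + x_2 + y_1 + y_2\| \le \|x_1 + y_1\| + \|x_2 + y_2\| \le \|[x_1]\| + \|[x_2]\| + 2\varepsilon$.
We conclude that
$\|[x_1 + x_2]\| = \inf_{y \in Y} \|x_1 + x_2 + y\| \le \|x_1 + x_2 + y_1 + y_2\| \le \|[x_1]\| + \|[x_2]\| + 2\varepsilon$.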
}rf s} 21
t K
t K
for f
P C pK q.
Exercise 1.2.20. Derive the following formula for the norm in the
quotient space $\ell_\infty / c_0$:
$\|[a]\| = \limsup_{i \to \infty} |a_i|$ for $a = (a_i)_{i=1}^\infty \in \ell_\infty$.
`1 Y
$X \oplus_1 Y = \{(x, y) : x \in X,\ y \in Y\}$.
}x}K : inf t 0
Show that
is K.
} }K
: x{t P K .
}rxs} : }x}
for x P E.
$\sum_{k=1}^n \lambda_k x_k$
where $\lambda_k \ge 0$ are some numbers such that $\sum_{k=1}^n \lambda_k = 1$. Prove that $\operatorname{conv}(A)$
coincides with the set of all convex combinations of a finite number of
vectors from $A$.
$\|x_n - x_m\| < \varepsilon$ for all $n, m > N$.
$\|f_n - f_m\|_\infty \to 0$, $n, m \to \infty$.
Therefore, for every $t \in K$, we have $|f_n(t) - f_m(t)| \to 0$.
(1.7)
for all n, m N, t P K.
Lec.5: 09/17/10
A series
xk
k 1
8,
xk
xk
x.
k 1
$\sum_{k=1}^\infty \|x_k\| < \infty$.
(1.8) $\sum_{k=1}^\infty \|x_k\| < \infty$.
$\|s_n - s_m\| = \Big\| \sum_{k=n+1}^m x_k \Big\| \le \sum_{k=n+1}^m \|x_k\| \to 0$.
k 1
}xk } 12
1
22
1
23
1.
xk
wn 1 w1
k 1
diverge. So we have constructed an absolutely convergent series in X which diverges. This completes the proof.
Exercise 1.3.7. Validate the two missing steps in the proof of Theorem 1.3.6. Let $X$ be a normed space.
1. Let $(v_n)$ be a Cauchy sequence in $X$ which diverges. Prove that every
subsequence of $(v_n)$ diverges.
2. Let $(v_n)$ be a Cauchy sequence in $X$. Construct a rapidly Cauchy
subsequence $(w_n)$ of $(v_n)$, i.e. one that satisfies (1.9).
Theorem 1.3.8. For every $p \in [1, \infty)$, the space $L_p = L_p(\Omega, \Sigma, \mu)$ is a Banach
space.
$\sum_{k=1}^\infty \|f_k\|_p =: M < \infty$.
By the completeness criterion, Theorem 1.3.6, it suffices to show that the series
$\sum_k f_k$ converges in $L_p$.
Case 1: all $f_k \ge 0$ pointwise. The partial sums $\sum_{k=1}^n f_k$ form a pointwise non-decreasing
sequence of functions. Denote the pointwise limit by $\sum_{k=1}^\infty f_k$; it may
be infinite at some points.
The triangle inequality (which is Minkowski's inequality) implies that the partial sums are bounded:
$\Big\| \sum_{k=1}^n f_k \Big\|_p \le \sum_{k=1}^n \|f_k\|_p \le M$, that is, $\int \Big( \sum_{k=1}^n f_k \Big)^p d\mu \le M^p$.
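We apply the Lebesgue monotone convergence theorem for the sequence of functions
$\big( \sum_{k=1}^n f_k \big)^p$ and get
$\int \Big( \sum_{k=1}^n f_k \Big)^p d\mu \to \int \Big( \sum_{k=1}^\infty f_k \Big)^p d\mu$.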
k 1
fk
k n 1
fk : maxpfk , 0q
pointwise.
|fk |, we have
}fk }p
k 1
k 1
}fk }p 8.
8
So, by the first part of the argument, the series $\sum_{k=1}^\infty f_k^+$ converges in $L_p$. Similarly
we show that $\sum_{k=1}^\infty f_k^-$ converges in $L_p$. Therefore, $\sum_{k=1}^\infty f_k = \sum_{k=1}^\infty f_k^+ - \sum_{k=1}^\infty f_k^-$
converges in $L_p$. This completes the proof.
Lec.7: 09/22/10
} }} }
$\|f\|_{C^k} = \|f\|_\infty + \|f'\|_\infty + \cdots + \|f^{(k)}\|_\infty$.
Exercise 1.3.13. [Completeness of a direct sum] Let $X$ and $Y$ be Banach spaces. Show that the direct sum $X \oplus_1 Y$ defined in Exercise 1.2.21
is a Banach space.
1.4. Inner product spaces
1.4.1. Definition. Cauchy-Schwarz inequality. Hilbert spaces form an
important and simplest class of Banach spaces. Speaking imprecisely, Hilbert spaces
are those Banach spaces where the concept of orthogonality of vectors is defined.
Hilbert spaces will arise as complete inner product spaces.
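Definition 1.4.1 (Inner product space). Let $E$ be a linear space over $\mathbb{C}$. An
inner product on $E$ is a function $\langle \cdot, \cdot \rangle : E \times E \to \mathbb{C}$ which satisfies the following
three axioms:
(i) $\langle x, x \rangle \ge 0$ for all $x \in E$; $\langle x, x \rangle = 0$ if and only if $x = 0$;
(ii) $\langle ax + by, z \rangle = a \langle x, z \rangle + b \langle y, z \rangle$ for all $x, y, z \in E$ and $a, b \in \mathbb{C}$;
(iii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all $x, y \in E$.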
The space E with an inner product is called an inner product space.
Inner products over R are defined similarly, except there is no conjugation in
axiom (iii).
Remark 1.4.2. The inner product is (conjugate) linear in the second argument:
$\langle x, ay + bz \rangle = \bar{a} \langle x, y \rangle + \bar{b} \langle x, z \rangle$.
This follows from axioms (ii) and (iii) of the inner product.
Definition 1.4.3 (Orthogonality). If $\langle x, y \rangle = 0$, we say that the vectors $x$ and $y$
are orthogonal and write $x \perp y$.
xx, yy
xk yk .
k 1
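Proof. Case 1: $\langle x, y \rangle \in \mathbb{R}$. For every $t \in \mathbb{R}$
we have:
$0 \le \langle x + ty, x + ty \rangle = t^2 \langle y, y \rangle + 2t \langle x, y \rangle + \langle x, x \rangle$.
A quadratic polynomial that is everywhere non-negative must have a non-positive
discriminant, i.e.
$\langle x, y \rangle^2 - \langle x, x \rangle \langle y, y \rangle \le 0$.
This is precisely the Cauchy-Schwarz inequality.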
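Case 2: $\langle x, y \rangle \in \mathbb{C}$ arbitrary. We will multiply $y$ by a unit scalar so that $\langle x, y \rangle$
becomes a real number, and use the first part. Indeed, polar decomposition
implies that
$|\langle x, y \rangle| = \langle x, y' \rangle$ where $y' = e^{i \operatorname{Arg} \langle x, y \rangle} y$.
Now using the first part of the proof we conclude that
$|\langle x, y \rangle| = \langle x, y' \rangle \le \|x\| \|y'\| = \|x\| \|y\|$,
as required.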
Proof. Of the three norm axioms, only the triangle inequality is non-trivial.
Let us check it. For x, y P X we have by Cauchy-Schwarz inequality that
}x
y }2
}x
y }2
}x}2 }y}2 .
Remark 1.4.8 (Angle between vectors). The concept of inner product makes
it possible to define the angle between two vectors $x, y$ in an inner product space
$X$. Recall that in Euclidean space $\mathbb{R}^n$, the inner product can be computed by the
formula
$\langle x, y \rangle = \|x\| \|y\| \cos \theta(x, y)$
where $\theta(x, y)$ is the angle between $x$ and $y$. Therefore, in a general inner product
space $X$, it makes sense to define the angle between $x, y$ by
$\cos \theta(x, y) = \frac{\langle x, y \rangle}{\|x\| \|y\|}$.
The Cauchy-Schwarz inequality guarantees that the right hand side lies in $[-1, 1]$, so
the angle exists. Nevertheless, the concept of angle is rarely used; one prefers to
work with the inner product directly.
Lec.8: 09/24/10
For every $f, g \in L_2$, the quantity
$\langle f, g \rangle := \int f \bar{g} \, d\mu$
is finite, and it defines an inner product on $L_2$. This inner product obviously agrees
with the $L_2$ norm, i.e. $\|f\|_2 = \langle f, f \rangle^{1/2}$.
Proof. The only non-trivial fact to prove is that $\langle f, g \rangle$ is finite, i.e. that
$f \bar{g}$ is integrable. Since $f$, $g$ and $f + g$ belong to $L_2$, we have that $f^2$, $g^2$ and
$(f + g)^2 = f^2 + 2fg + g^2$ are integrable. Hence $fg$ is integrable, as required.
We can recast the Cauchy-Schwarz inequality in this specific space $L_2$ as follows.
Corollary 1.4.10 (Cauchy-Schwarz inequality in $L_2$). For every $f, g \in L_2$ one
has
$\Big| \int f \bar{g} \, d\mu \Big| \le \Big( \int |f|^2 \, d\mu \Big)^{1/2} \Big( \int |g|^2 \, d\mu \Big)^{1/2}$.
The left hand side of the Cauchy-Schwarz inequality can be replaced by the larger
quantity $\int |fg| \, d\mu$. (This can be seen by applying the Cauchy-Schwarz inequality to
$|f|$, $|g|$.) Thus the Cauchy-Schwarz inequality can be written as
$\|fg\|_1 \le \|f\|_2 \|g\|_2$.
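Let $p, q \in (1, \infty)$ be adjoint, i.e. $\frac{1}{p} + \frac{1}{q} = 1$. Then for every $f \in L_p$ and $g \in L_q$ one has
$\int |fg| \, d\mu \le \Big( \int |f|^p \, d\mu \Big)^{1/p} \Big( \int |g|^q \, d\mu \Big)^{1/q}$,
that is,
$\|fg\|_1 \le \|f\|_p \|g\|_q$.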
$|f(t) g(t)| \le \frac{|f(t)|^p}{p} + \frac{|g(t)|^q}{q}$ for all $t \in \Omega$.
Integrating yields
$\int |fg| \, d\mu \le \frac{1}{p} + \frac{1}{q} = 1$
as required.
Using Hölder's inequality we can clarify the scale of spaces $L_p = L_p(\Omega, \Sigma, \mu)$
for various $p$. Assume that $\mu$ is a finite measure (this is important!). Then $L_\infty$ is
the smallest space, $L_1$ is the largest, and all other $L_p$, $p \in [1, \infty)$ lie in between:
}f }r }f }s
for all f
P Ls Ls p, , q.
Lr .
}f }rr |f |r d |f |r 1 d
|f |s
r{s
|f |rs{r
r{s
1{q
1q d
pq }f }rs .
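$\langle x, y \rangle = \sum_{i=1}^\infty x_i \overline{y_i}$ for $x = (x_i)$, $y = (y_i) \in \ell_2$.
$\Big| \sum_{i=1}^\infty x_i \overline{y_i} \Big| \le \Big( \sum_{i=1}^\infty |x_i|^2 \Big)^{1/2} \Big( \sum_{i=1}^\infty |y_i|^2 \Big)^{1/2}$.
As before, the left hand side can be replaced by the larger quantity $\sum_{i=1}^\infty |x_i y_i|$.
Similarly, Hölder's inequality in this case takes the following form: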
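Corollary 1.4.14 (Hölder's inequality for sequences). Let $p, q \in (1, \infty)$ be
adjoint, i.e. $\frac{1}{p} + \frac{1}{q} = 1$. Then for every $x = (x_i) \in \ell_p$ and $y = (y_i) \in \ell_q$ one has
$\sum_{i=1}^\infty |x_i y_i| \le \Big( \sum_{i=1}^\infty |x_i|^p \Big)^{1/p} \Big( \sum_{i=1}^\infty |y_i|^q \Big)^{1/q}$.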
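Hölder's inequality for sequences can be checked on finite vectors, which are sequences with finitely many nonzero terms. The sketch below is only an illustration under that assumption; the names `p_norm` and `holder_gap` are ours. It also exhibits an equality case, $|y_i|^q = |x_i|^p$.

```python
def p_norm(x, p):
    """(sum |x_i|^p)^(1/p) for a finite real vector."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def holder_gap(x, y, p):
    """Right hand side minus left hand side of Hoelder's inequality, for p in (1, infinity)."""
    q = p / (p - 1.0)  # adjoint exponent, so that 1/p + 1/q = 1
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    return p_norm(x, p) * p_norm(y, q) - lhs


# Generic vectors: the gap is strictly positive.
print(holder_gap([1, 2, 3], [4, 5, 6], 2) > 0)

# Equality case for p = 3, q = 3/2: y_i = x_i^(p-1), hence |y_i|^q = |x_i|^p.
print(abs(holder_gap([1, 2], [1, 4], 3)) < 1e-9)
```

For $p = q = 2$ this is the Cauchy-Schwarz inequality in $\ell_2$ with absolute values inside the sum.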
However, Corollary 1.4.12 on the scale of $L_p$ (on finite measure spaces) does
not hold for $\ell_p$, because the counting measure on $\mathbb{N}$ is not finite. In fact, the scale
is completely reversed in this case: $\ell_1$ is the smallest space, $\ell_\infty$ is the largest, and
the other $\ell_p$, $p \in [1, \infty)$ lie in between:
Corollary 1.4.15 (Scale of $\ell_p$ spaces). Let $1 \le r \le s \le \infty$. Then
$\|x\|_s \le \|x\|_r$ for all $x \in \ell_r$.
In particular, $\ell_r \subseteq \ell_s$.
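A quick numerical illustration of the reversed scale, as a sketch with names of our own choosing (`p_norm`, `power_partial_sum`): the norms of a fixed vector decrease as the exponent grows, and the harmonic sequence $(1/i)$ shows that the inclusion $\ell_1 \subseteq \ell_2$ is strict, since its partial sums grow without bound for exponent 1 while staying below $\pi^2/6$ for exponent 2.

```python
def p_norm(x, p):
    """l_p norm of a finite vector; p = float('inf') gives the sup-norm."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def power_partial_sum(n, p):
    """Partial sum of sum_i (1/i)^p over i = 1..n."""
    return sum(1.0 / i ** p for i in range(1, n + 1))


x = [3.0, -4.0, 1.0, 0.5]
exponents = [1, 1.5, 2, 4, 10, float("inf")]
norms = [p_norm(x, p) for p in exponents]
# The norms are non-increasing in the exponent.
print(all(a >= b for a, b in zip(norms, norms[1:])))

# (1/i) belongs to l_2 but not to l_1.
print(power_partial_sum(10000, 1), power_partial_sum(10000, 2))
```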
(1.11) $\langle A, B \rangle := \operatorname{tr}(AB^*) = \sum_{i=1}^m \sum_{j=1}^n a_{ij} \overline{b_{ij}}$.
This is clearly an inner product. One way to see this is to identify $M_{m,n}$ with $\mathbb{C}^{mn}$
by concatenating the rows of a matrix $A \in M_{m,n}$ into a long vector in $\mathbb{C}^{mn}$. Then
the canonical inner product in $\mathbb{C}^{mn}$ is the same as the right hand side of (1.11).
The norm defined by the inner product on $M_{m,n}$ is called the Hilbert-Schmidt or
Frobenius norm of matrices:
(1.12) $\|A\|_{HS} = \Big( \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 \Big)^{1/2}$.
Note some similarity between the forms of the inner product in $L_2$, which is
$\int f \bar{g} \, d\mu$, and in $M_{m,n}$, which is $\langle A, B \rangle := \operatorname{tr}(AB^*)$: the integral is replaced
by the trace, functions by matrices, complex conjugation by transposition, and
product of functions by product of matrices.
xf, gy
EX
X p q dPp q.
$\frac{\operatorname{cov}(X, Y)}{\operatorname{Var}(X)^{1/2} \operatorname{Var}(Y)^{1/2}} = \frac{\mathbb{E} XY}{(\mathbb{E} X^2)^{1/2} (\mathbb{E} Y^2)^{1/2}} = \frac{\langle X, Y \rangle}{\|X\|_2 \|Y\|_2}$.
Hence the correlation coefficient is nothing else than the cosine of the angle between
random variables $X$ and $Y$ considered as vectors in $L_2$ (see Remark 1.4.8). This
demonstrates the geometric meaning of correlation: the more random variables $X$
and $Y$ are correlated, the smaller the angle between them, and vice versa.
1.4.6. Parallelogram law. The parallelogram law in planar geometry states
that for every parallelogram, the sum of squares of the diagonals equals the sum of
squares of the sides. This statement remains true in all inner product spaces:
Proposition 1.4.18 (Parallelogram law). Let $X$ be an inner product space.
Then for every $x, y \in X$ one has
(1.13) $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$.
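To see the parallelogram law at work, one can compare the Euclidean norm on $\mathbb{R}^2$ with the $\ell_1$ norm. This is a small sketch with helper names of our own; the quantity computed is the left hand side of (1.13) minus the right hand side.

```python
def p_norm(x, p):
    """(sum |x_i|^p)^(1/p) for a finite real vector."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def parallelogram_defect(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 in the l_p norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * p_norm(x, p) ** 2 - 2 * p_norm(y, p) ** 2)


x, y = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_defect(x, y, 2))  # zero up to rounding: the law holds
print(parallelogram_defect(x, y, 1))  # equals 4: the law fails in the l_1 norm
```

Since the law fails for $p = 1$, the $\ell_1$ norm on $\mathbb{R}^2$ is not induced by any inner product.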
yy
The parallelogram law characterizes inner product spaces. First recall that in
inner product spaces, the inner product determines the norm ($\|x\| = \langle x, x \rangle^{1/2}$).
Vice versa, the inner product is uniquely determined by the norm, and it can be
reconstructed through the polarization identity:
Proposition 1.4.19 (Polarization identity). Let $X$ be an inner product space.
Then for every $x, y \in X$ one has
(1.14) $\langle x, y \rangle = \frac{1}{4} \big( \|x + y\|^2 - \|x - y\|^2 + i \|x + iy\|^2 - i \|x - iy\|^2 \big)$.
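The polarization identity can be verified numerically in $\mathbb{C}^n$. The sketch below assumes the convention of Definition 1.4.1 (linear in the first argument, conjugate linear in the second); the function names are ours and the vectors are arbitrary test data.

```python
def inner(x, y):
    """Canonical inner product in C^n: sum of x_i * conj(y_i)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm_sq(x):
    """Squared norm induced by the inner product."""
    return inner(x, x).real


def combine(x, y, c):
    """The vector x + c*y."""
    return [a + c * b for a, b in zip(x, y)]


def polarization(x, y):
    """Right hand side of the polarization identity, using norms only."""
    return 0.25 * (norm_sq(combine(x, y, 1)) - norm_sq(combine(x, y, -1))
                   + 1j * norm_sq(combine(x, y, 1j))
                   - 1j * norm_sq(combine(x, y, -1j)))


x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
print(inner(x, y), polarization(x, y))  # the two values agree up to rounding
```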
Lec.9: 09/27
The proof of this result is deferred to the exercises for this section.
It follows from Theorem 1.4.21 that being an inner product space is a local
property, since checking the parallelogram law involves just two (arbitrary) vectors.
In particular, if all two-dimensional linear subspaces of a normed space $X$ are inner
product spaces (with respect to some inner product, possibly different for each
subspace), then $X$ is an inner product space (and then the inner product on all
subspaces is actually the same, induced from $X$!).
1.4.7. Additional Exercises.
Exercise 1.4.22. [Direct sum of inner product spaces] Let $X$, $Y$ be
inner product spaces. Show that their direct sum
$X \oplus_2 Y := \{(x, y) : x \in X,\ y \in Y\}$
are Hilbert
$A^\perp = \bigcap_{a \in A} \{a\}^\perp$.
Therefore it suffices to check that $\{a\}^\perp$ is a closed set for every $a \in A$. So we fix
$a \in A$ and consider a sequence $x_n \in \{a\}^\perp$ such that $x_n \to x$ for some $x \in X$. We
would like to show that $x \in \{a\}^\perp$. To this end, notice that continuity of the inner
10
AK
X A;
it follows that
1
}x y} ymin
1 PY }x y }.
P Y K.
}x yn } d.
To bound $\|y_n - y_m\|$, we use the parallelogram law. We apply it to the parallelogram
with vertices $x$, $y_n$, $y_m$ (and whose fourth vertex is determined by these three, see
the picture):
$\|y_n - y_m\|^2 + 4 \Big\| x - \frac{1}{2}(y_n + y_m) \Big\|^2 = 2\|x - y_n\|^2 + 2\|x - y_m\|^2$.
2d2
2d2 4d2
0.
xx y, y1 y 0
for some y 1
P Y.
$\|x - y\|^2 \le \|x - y + ty'\|^2 = \|x - y\|^2 + 2t \langle x - y, y' \rangle + t^2 \|y'\|^2$.
Lec.10: 09/29
In the proof of part (i) of Theorem 1.5.5, we used convexity rather than linearity
of $Y$. (Indeed, we needed that together with two points $y_n, y_m \in Y$ their midpoint
$\frac{1}{2}(y_n + y_m)$ is contained in $Y$.) Therefore, our argument implies the following more
general result:
Theorem 1.5.6 (Hilbert's projection theorem). Given a closed convex set $Y$
in a Hilbert space $X$ and a point $x \in X$, there exists a unique closest point $y \in Y$.
The map that takes $x$ into the closest point $y$ is called a projection onto the convex
set $Y$ and is abbreviated POCS. This map appears in several applied fields.
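For simple convex sets in $\mathbb{R}^n$ the projection is explicit. The sketch below is an illustration only: it treats two standard closed convex sets, a coordinate box and a Euclidean ball, with function names of our own, and compares the projected point against random competitors from the set.

```python
import math
import random


def project_onto_box(x, lo, hi):
    """Closest point to x in the box [lo, hi]^n: clamp each coordinate."""
    return [min(max(t, lo), hi) for t in x]


def project_onto_ball(x, radius=1.0):
    """Closest point to x in the closed Euclidean ball of the given radius."""
    n = math.sqrt(sum(t * t for t in x))
    return list(x) if n <= radius else [radius * t / n for t in x]


def distance(x, y):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


random.seed(1)  # fixed seed for reproducibility
x = [2.0, -3.0, 0.5]
p = project_onto_box(x, -1.0, 1.0)
# No random point of the box is closer to x than the projection p.
for _ in range(1000):
    y = [random.uniform(-1.0, 1.0) for _ in x]
    assert distance(x, p) <= distance(x, y) + 1e-12
```

In applications the abbreviation POCS often refers to alternating such projections between several convex sets; only the single projection of Theorem 1.5.6 is shown here.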
The orthogonality principle immediately implies that a Hilbert space $X$ can be
decomposed into the orthogonal sum of a subspace $Y$ and its complement $Y^\perp$:
Corollary 1.5.7 (Orthogonal decomposition). Let $X$ be a Hilbert space and
$Y$ be a closed subspace. Then every vector $x \in X$ can be uniquely represented as
$x = y + z$, $y \in Y$, $z \in Y^\perp$.
$X = Y \oplus Y^\perp$.
PY
PY K .
AK
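$f(t) = \sum_{k=-\infty}^{\infty} \hat{f}(k) e^{ikt}$
where the coefficients
(1.15) $\hat{f}(k) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ikt} \, dt$
are called the Fourier coefficients of $f$. In order to make Fourier analysis rigorous,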
one has to understand what functions f can be written as Fourier series, and in
what sense the Fourier series converges.
In order to do so, it is of great advantage to depart from this specific situation
and carry out Fourier analysis in an abstract Hilbert space. We will regard the
function $f(t)$ as a vector in the function space $L_2[-\pi, \pi]$. The exponential functions
$e^{ikt}$ will form a set of orthogonal vectors in this space. Fourier series will then
become an orthogonal decomposition of a vector f with respect to an orthogonal
system of coordinates.
1.6.1. Orthogonal systems.
Definition 1.6.1 (Orthogonal system). A sequence $(x_k)$ in a Hilbert space $X$
is called an orthogonal system if
$\langle x_k, x_l \rangle = 0$ for all $k \ne l$.
If additionally $\|x_k\| = 1$ for all $k$, the sequence $(x_k)$ is called an orthonormal system.
Equivalently, $(x_k)$ is an orthonormal system if
$\langle x_k, x_l \rangle = \delta_{kl}$
where $\delta_{kl}$ equals 1 if $k = l$ and 0 otherwise.
$x_k = (0, \dots, 0, 1, 0, \dots)$
whose coordinates are all zero except the $k$-th, which equals 1. The sequence $(x_k)_{k=1}^\infty$ is
clearly an orthonormal system in $\ell_2$.
Example 1.6.3 (Fourier basis in $L_2$). In the space$^{11}$ $L_2[-\pi, \pi]$, consider the
exponentials
(1.16) $e_k(t) = \frac{1}{\sqrt{2\pi}} e^{ikt}$, $t \in [-\pi, \pi]$.
Then $(e_k)_{k=-\infty}^{\infty}$ is an orthonormal system in $L_2[-\pi, \pi]$ (check!).
Example 1.6.4 (Trigonometric system in $L_2$). Closely related to the Fourier
basis is the trigonometric system. Note that we can write the exponentials from
the previous example as
$e_k(t) = \frac{1}{\sqrt{2\pi}} \big( \cos(kt) + i \sin(kt) \big)$.
Considering the real and imaginary parts separately, we see that the system
$\Big\{ \frac{1}{\sqrt{2\pi}}, \frac{1}{\sqrt{\pi}} \cos(t), \frac{1}{\sqrt{\pi}} \sin(t), \frac{1}{\sqrt{\pi}} \cos(2t), \frac{1}{\sqrt{\pi}} \sin(2t), \dots \Big\}$
is an orthonormal system in $L_2[-\pi, \pi]$ (check!).
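The orthonormality of the trigonometric system can be checked numerically. The sketch below is an illustration, with names of our own: it builds the first few functions of the system and approximates the $L_2[-\pi, \pi]$ inner product of real functions by the midpoint rule, which is very accurate for trigonometric polynomials of low degree over a full period.

```python
import math


def trig_system(n):
    """The first 2n + 1 functions of the trigonometric system on [-pi, pi]."""
    functions = [lambda t: 1.0 / math.sqrt(2 * math.pi)]
    for k in range(1, n + 1):
        functions.append(lambda t, k=k: math.cos(k * t) / math.sqrt(math.pi))
        functions.append(lambda t, k=k: math.sin(k * t) / math.sqrt(math.pi))
    return functions


def inner_l2(f, g, steps=2000):
    """Midpoint-rule approximation of the integral of f*g over [-pi, pi]."""
    h = 2 * math.pi / steps
    return h * sum(f(-math.pi + (j + 0.5) * h) * g(-math.pi + (j + 0.5) * h)
                   for j in range(steps))


system = trig_system(3)
gram = [[inner_l2(f, g) for g in system] for f in system]
# The Gram matrix should be the identity up to rounding error.
print(max(abs(gram[i][j] - (1.0 if i == j else 0.0))
          for i in range(len(system)) for j in range(len(system))))
```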
11 The space $L_2[-\pi, \pi]$ can be identified with $L_2(\mathbb{T})$ where $\mathbb{T}$ is the unit torus in $\mathbb{C}$. We can
think of elements of this space as $2\pi$-periodic functions.
(i) $\sum_k x_k$ converges in $X$;
(ii) $\sum_k \|x_k\|^2 < \infty$;
(iii) $\sum_k x_k$ converges unconditionally in $X$, i.e. for every reordering of terms.
In case of convergence, we have
(1.17) $\Big\| \sum_k x_k \Big\|^2 = \sum_k \|x_k\|^2$.
The proof of this result is based on its finite version, which may be called the
Pythagorean theorem in higher dimensions:
Lemma 1.6.7 (Pythagorean theorem). Let $(x_k)$ be an orthogonal system in a
Hilbert space $X$. Then for every $n \in \mathbb{N}$ one has
(1.18) $\Big\| \sum_{k=1}^n x_k \Big\|^2 = \sum_{k=1}^n \|x_k\|^2$.
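Proof. Using orthogonality, we see that the left hand side of (1.18) is
$\Big\langle \sum_{k=1}^n x_k, \sum_{k=1}^n x_k \Big\rangle = \sum_{k,l=1}^n \langle x_k, x_l \rangle = \sum_{k=1}^n \langle x_k, x_k \rangle = \sum_{k=1}^n \|x_k\|^2$.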
(1.19) $\Big\| \sum_{k=n}^m x_k \Big\|^2 \to 0$ as $n, m \to \infty$.
Note that by the Pythagorean theorem (Lemma 1.6.7), the quantity in (1.19) equals
$\sum_{k=n}^m \|x_k\|^2$. So using the Cauchy criterion again we see that (1.19) is equivalent to the
(ii) $\Rightarrow$ (iii). The scalar series $\sum_k \|x_k\|^2$ converges absolutely, therefore also
unconditionally (as we know from an analysis
course). Hence, by the equivalence
8
k
8 ak e
ikt
Lec.11: 10/01
xx, xk yxk .
k
xx, xk yxk .
k 1
$1, \dots, n$. We have
$\langle x - S_n(x), x_k \rangle = \langle x, x_k \rangle - \langle S_n(x), x_k \rangle$.
By definition of $S_n(x)$ and orthonormality of $(x_k)$ we see that $\langle S_n(x), x_k \rangle = \langle x, x_k \rangle$.
Therefore we conclude that $\langle x - S_n(x), x_k \rangle = 0$ as required.
Let us estimate the size of $S_n(x)$. Since $x - S_n(x) \perp S_n(x)$, by the Pythagorean
theorem we have $\|S_n(x)\|^2 + \|x - S_n(x)\|^2 = \|x\|^2$. Hence
$\|S_n(x)\|^2 \le \|x\|^2$.
On the other hand, by the Pythagorean theorem and orthonormality,
$\|S_n(x)\|^2 = \sum_{k=1}^n \|\langle x, x_k \rangle x_k\|^2 = \sum_{k=1}^n |\langle x, x_k \rangle|^2$.
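The argument above can be followed on a concrete example in $\mathbb{R}^3$. The sketch below is illustrative only: the orthonormal system and the vector are arbitrary choices of ours, and `partial_sum` is a hypothetical helper computing $S_n(x) = \sum_{k=1}^n \langle x, x_k \rangle x_k$ for real vectors.

```python
import math


def inner(x, y):
    """Canonical inner product in R^n."""
    return sum(a * b for a, b in zip(x, y))


def partial_sum(x, system):
    """S_n(x): the sum of <x, x_k> x_k over the given orthonormal vectors."""
    s = [0.0] * len(x)
    for h in system:
        c = inner(x, h)
        s = [si + c * hi for si, hi in zip(s, h)]
    return s


r = 1.0 / math.sqrt(2.0)
system = [[r, r, 0.0], [r, -r, 0.0]]   # an orthonormal system in R^3
x = [1.0, 2.0, 3.0]

s = partial_sum(x, system)
residual = [a - b for a, b in zip(x, s)]
coefficient_energy = sum(inner(x, h) ** 2 for h in system)

print(s)                                     # approximately (1, 2, 0)
print([inner(residual, h) for h in system])  # x - S_n(x) is orthogonal to each x_k
print(coefficient_energy, inner(x, x))       # sum of |<x, x_k>|^2 versus ||x||^2
```

Here the sum of squared coefficients is 5 while $\|x\|^2 = 14$; the inequality is strict because the two-element system is not complete in $\mathbb{R}^3$.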
This result along with the convergence criterion for orthogonal series, Theorem 1.6.6, shows that Fourier series always converge.
Corollary 1.6.12. Let $(x_k)$ be an orthonormal system in a Hilbert space $X$.
Then the Fourier series $\sum_k \langle x, x_k \rangle x_k$ of every vector $x \in X$ converges in $X$.
12
(
n Recall that the
k1 ak xk : ak P C .
q
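(i) The Fourier series $\sum_k \langle x, x_k \rangle x_k$ is the orthogonal projection of $x$ onto $\overline{\operatorname{Span}}(x_k)$
(the closure of the linear span).
$x = \sum_k \langle x, x_k \rangle x_k$.
$\|x\|^2 = \sum_k |\langle x, x_k \rangle|^2$.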
Proof. The first part follows from the Optimality Theorem 1.6.13, since by
completeness the orthogonal projection onto $\overline{\operatorname{Span}}(x_k) = X$ is the identity map in
$X$. Parseval's identity follows from the Fourier expansion (1.20), the Pythagorean identity
(1.17) for orthogonal series, and the normalization condition $\|x_k\| = 1$.
Exercise 1.6.17. Prove that Parseval's identity holds for an orthonormal system $(x_k)$ if and only if $(x_k)$ is complete. Therefore the equality
cases of Bessel's inequality hold exactly when the system is complete.
Now we describe some classical examples of complete sets and orthonormal
bases.
Example 1.6.18 (Monomials). The Weierstrass approximation theorem states that
the system of monomials $(t^k)_{k=0}^\infty$ is a complete system in $C[0, 1]$. We claim that
this is also a complete system in $L_2[0, 1]$.
Indeed, $C[0, 1]$ is dense in $L_2[0, 1]$. This means that for every $f \in L_2[0, 1]$
and $\varepsilon > 0$, there exists $g \in C[0, 1]$ such that $\|f - g\|_2 \le \varepsilon/2$. By the Weierstrass
approximation theorem, there exists $h \in \operatorname{Span}(t^k)_{k=0}^\infty$ such that $\|g - h\|_\infty \le \varepsilon/2$.
Hence $\|g - h\|_2 \le \|g - h\|_\infty \le \varepsilon/2$, so by the triangle inequality we conclude that
$\|f - h\|_2 \le \varepsilon/2 + \varepsilon/2 = \varepsilon$. We have proved that $\operatorname{Span}(t^k)_{k=0}^\infty$ is dense in $L_2[0, 1]$ as
required.
Example 1.6.19 (Exponentials). By a general version of the Weierstrass approximation theorem (called the Stone-Weierstrass theorem), the exponential monomials
$(e^{ikt})_{k \in \mathbb{Z}}$ form a complete system in $C[-\pi, \pi]$. Repeating the argument in Example 1.6.18, we can check that this is also a complete system in $L_2[-\pi, \pi]$.
Therefore, the system of exponentials
(1.21) $x_k(t) = \frac{1}{\sqrt{2\pi}} e^{ikt}$, $k \in \mathbb{Z}$,
forms an orthonormal basis of $L_2[-\pi, \pi]$. Reformulating Theorem 1.6.16 in this
case, we obtain a basic result in classical Fourier analysis:
Theorem 1.6.20 (Classical Fourier series). Every function $f \in L_2[-\pi, \pi]$ can
be represented by its Fourier series
$f(t) = \sum_{k=-\infty}^{\infty} \hat{f}(k) e^{ikt}$, where $\hat{f}(k) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ikt} \, dt$.
The coefficients $\hat{f}(k)$ are all finite; the Fourier series converges in $L_2[-\pi, \pi]$.
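As an illustration, the Fourier coefficients of $f(t) = t$ can be approximated numerically and compared with the closed form $\hat{f}(k) = i(-1)^k / k$ for $k \ne 0$, which follows from integration by parts. The sketch below uses a midpoint rule and names of our own; the step count is an arbitrary choice. In this normalization $\langle f, x_k \rangle = \sqrt{2\pi} \, \hat{f}(k)$, so Bessel's inequality reads $\sum_k |\hat{f}(k)|^2 \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f|^2 \, dt$.

```python
import cmath
import math


def fourier_coefficient(f, k, steps=4000):
    """Midpoint-rule approximation of (1 / 2 pi) * integral of f(t) e^{-ikt} over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0j
    for j in range(steps):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * k * t)
    return total * h / (2 * math.pi)


def identity(t):
    """The function f(t) = t."""
    return t


coefficients = {k: fourier_coefficient(identity, k) for k in range(-5, 6)}

# (1 / 2 pi) * integral of t^2 over [-pi, pi] equals pi^2 / 3.
mean_square = math.pi ** 2 / 3
bessel_sum = sum(abs(c) ** 2 for c in coefficients.values())
print(coefficients[1], bessel_sum, mean_square)
```

Taking more coefficients brings `bessel_sum` closer to `mean_square`, in line with Parseval's identity for the complete system of exponentials.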
Example 1.6.21 (Trigonometric system). In a similar way we can show that
the trigonometric system considered in Example 1.6.4 is an orthonormal basis in
$L_2[-\pi, \pi]$ (do this!). Therefore a version of Theorem 1.6.20 holds for the trigonometric system, and it reads as follows:
$f(t) = \frac{a_0}{2} + \sum_{k=1}^\infty \big( a_k \cos(kt) + b_k \sin(kt) \big)$
where
$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(kt) \, dt$, $b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(kt) \, dt$.
This again holds for every function $f \in L_2[-\pi, \pi]$; the coefficients $a_k$, $b_k$ are all
finite, and the Fourier series converges in $L_2[-\pi, \pi]$.
1.6.5. Gram-Schmidt orthogonalization. There is a general way of creating an orthonormal basis $(h_k)$ in a Hilbert space $X$ out of some other, possibly non-orthogonal system $(x_k)$. One orthogonalizes the system $(x_k)$ one element at a time. This procedure is called Gram-Schmidt orthogonalization.
So let us consider a linearly independent system of vectors $(x_k)_{k=1}^\infty$ in $X$. We define the system $(h_k)_{k=1}^\infty$ inductively as follows:
$h_1 = \frac{x_1}{\|x_1\|}, \qquad h_{n+1} = \frac{P_n x_{n+1}}{\|P_n x_{n+1}\|}, \quad n = 1, 2, \ldots$
where $P_n$ denotes the orthogonal projection in $X$ onto $\mathrm{Span}(h_1, \ldots, h_n)^\perp$. Geometrically, one rotates the new vector $x_{n+1}$ so it becomes orthogonal to the vectors $h_k$ constructed earlier, normalizes it, and calls the result $h_{n+1}$; see the picture.
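Lec.12: 10/04
One can effectively compute the vectors $P_n x_{n+1}$ used in this process. Indeed, by Lemma 1.6.10, the orthogonal projection of a vector $x$ onto $\mathrm{Span}(h_1, \ldots, h_n)$ is the partial sum of Fourier series:
$S_n(x) = \sum_{k=1}^{n} \langle x, h_k \rangle h_k.$
Hence the projection onto the orthogonal complement is
$P_n x = x - S_n(x) = x - \sum_{k=1}^{n} \langle x, h_k \rangle h_k.$
So
$P_n x_{n+1} = x_{n+1} - \sum_{k=1}^{n} \langle x_{n+1}, h_k \rangle h_k.$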
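The procedure can be sketched in a few lines for vectors in $\mathbb{R}^n$ with the standard inner product. This is a minimal illustration, not part of the original notes; the function names and the sample vectors are assumptions. Each new vector has its Fourier partial sum $\sum_k \langle x, h_k \rangle h_k$ subtracted and is then normalized.

```python
import math

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of vectors in R^n."""
    basis = []
    for x in vectors:
        # P_n x = x - sum_k <x, h_k> h_k  (projection onto Span(h_1..h_n)^perp)
        residual = list(x)
        for h in basis:
            c = inner(x, h)
            residual = [r - c * hi for r, hi in zip(residual, h)]
        norm = math.sqrt(inner(residual, residual))
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append([r / norm for r in residual])
    return basis

xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
hs = gram_schmidt(xs)
for h in hs:
    print(h)
```

In floating point arithmetic one often subtracts the projections from the running residual instead of from the original vector (the "modified" variant), which is an implementation choice and does not change the mathematics above.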
Spanpxk qnk1
for all n P N.
Exercise 1.6.25. Prove that the spaces $\ell_\infty$ and $L_\infty$ are not separable by constructing an uncountable separated subset $A$, i.e. such that
$\inf \{ \|x - y\| : x, y \in A, \ x \neq y \} > 0.$
Proof. Let $T$ be the map that takes $x_k$ to $y_k$. More precisely, define $T$ by
$T \Big( \sum_k a_k x_k \Big) = \sum_k a_k y_k.$
Note that every $x \in X$ has the form $x = \sum_k a_k x_k$ for some (Fourier) coefficients $a_k$, so the definition makes sense. Also, by Parseval's identity,
(1.23) $\Big\| \sum_k a_k x_k \Big\|^2 = \sum_k |a_k|^2 = \Big\| \sum_k a_k y_k \Big\|^2.$
ak yk
ak xk ,
$\|Tx\| = \|x\|$ for all $x \in X$.
$\|Tx - Ty\| = \|x - y\|$ for all $x, y \in X$.
1,
1,
t P r0, 1{2q
t P r1{2, 1q
0, 1, 2, . . . ,
l 0, 1, 2, . . . , 2k 1.
s,
P N Y t0u,
where $[\cdot]$ denotes the integer part of a number. This system of functions is called the Rademacher system. Show that the Rademacher system is an orthonormal system in $L_2[0,1]$, but is not complete (therefore not an orthonormal basis).
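A quick numerical sanity check of this exercise can be sketched as follows. It is not part of the original notes, and it assumes the standard definition $r_k(t) = (-1)^{[2^k t]}$, $k \in \mathbb{N} \cup \{0\}$. Since each $r_k$ is constant on dyadic intervals, a midpoint sum over a fine enough dyadic grid computes the integrals exactly. The product $r_1 r_2$ serves as a candidate for a nonzero function orthogonal to the system; the code only checks this against $r_0, \ldots, r_5$, the general statement is left to the reader.

```python
def rademacher(k, t):
    """r_k(t) = (-1)^[2^k t] for t in [0, 1), where [.] is the integer part."""
    return -1 if int(2 ** k * t) % 2 else 1

def integrate_dyadic(g, level):
    """Integral over [0, 1] of a function that is constant on each dyadic
    interval of length 2^-level (for such functions the midpoint sum is exact)."""
    n = 2 ** level
    return sum(g((j + 0.5) / n) for j in range(n)) / n

K = 6       # check r_0, ..., r_5
LEVEL = 6   # all of them are constant on dyadic intervals of length 2^-6

# Gram matrix of inner products <r_i, r_j>; orthonormality means identity.
gram = [[integrate_dyadic(lambda t: rademacher(i, t) * rademacher(j, t), LEVEL)
         for j in range(K)] for i in range(K)]

def witness(t):
    """Candidate function orthogonal to the Rademacher system."""
    return rademacher(1, t) * rademacher(2, t)

witness_norm_sq = integrate_dyadic(lambda t: witness(t) ** 2, LEVEL)
overlaps = [integrate_dyadic(lambda t: witness(t) * rademacher(k, t), LEVEL)
            for k in range(K)]

print(gram)
print(witness_norm_sq, overlaps)
```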
Exercise 1.6.36. [Walsh system] Consider the functions $w_A(t)$, $A \subset \mathbb{N} \cup \{0\}$ (indexed by subsets $A$ rather than numbers!) defined by
$w_A(t) = \prod_{k \in A} r_k(t)$
$\int P_k(t) P_l(t) w(t) \, dt = \delta_{kl}.$
Now consider the weight $w(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}$, i.e. the standard normal density. Prove that the orthogonal polynomials with respect to this weight form the system of Hermite polynomials
$P_k(t) = (-1)^k e^{t^2/2} \frac{d^k}{dt^k} e^{-t^2/2},$
up to normalization constants. More precisely, $P_k(t)$ form an orthogonal basis in $L_2(\mathbb{R}, w(t) \, dt)$ and $\|P_k\|_2^2 = k!$.
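The orthogonality relations and the normalization $\|P_k\|_2^2 = k!$ can be checked numerically before attempting a proof. The sketch below is not from the original notes. It evaluates the polynomials through the three-term recurrence $P_{j+1}(t) = t P_j(t) - j P_{j-1}(t)$, which the polynomials defined by the formula above are known to satisfy, and integrates against the normal density with a trapezoidal rule on $[-12, 12]$. The truncation interval, step size and function names are assumptions chosen for the illustration.

```python
import math

def hermite(k, t):
    """P_k(t) via the recurrence P_{j+1}(t) = t P_j(t) - j P_{j-1}(t), P_0 = 1."""
    prev, cur = 0.0, 1.0
    for j in range(k):
        prev, cur = cur, t * cur - j * prev
    return cur

def weight(t):
    """Standard normal density."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def weighted_inner(j, k, a=-12.0, b=12.0, n=6000):
    """Trapezoidal approximation of int P_j(t) P_k(t) w(t) dt over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n + 1):
        t = a + i * h
        end_factor = 0.5 if i in (0, n) else 1.0
        total += end_factor * hermite(j, t) * hermite(k, t) * weight(t)
    return total * h

gram = [[weighted_inner(j, k) for k in range(5)] for j in range(5)]
for row in gram:
    print([round(v, 8) for v in row])
```

Up to quadrature error, the printed matrix should be diagonal with entries $0!, 1!, 2!, 3!, 4!$.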
Exercise 1.6.39. [Space of almost periodic functions] A function $f : \mathbb{R} \to \mathbb{C}$ is called almost periodic if it is a superposition of a finite number of frequencies, i.e. $f$ has the form
$f(t) = \sum_{k=1}^{n} a_k e^{i w_k t}, \quad \text{where } a_k \in \mathbb{C}, \ w_k \in \mathbb{R}, \ n \in \mathbb{N}.$
Note that the frequencies $w_k$ are allowed to take arbitrary real values. Denote the space of almost periodic functions by $X_0$, and equip it with the inner product
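$\langle f, g \rangle = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(t) \overline{g(t)} \, dt, \quad \text{so that} \quad \|f\| = \Big( \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |f(t)|^2 \, dt \Big)^{1/2}.$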
$\langle f, g \rangle = \sum_{k=1}^{n} a_k \overline{b_k} \quad \text{where} \quad f(t) = \sum_{k=1}^{n} a_k e^{i w_k t}, \quad g(t) = \sum_{k=1}^{n} b_k e^{i w_k t}$
(and where the frequencies $w_k$ are all different). The completion $X$ of the inner product space $X_0$ is called the space of almost periodic functions.
Prove that $X$ is a non-separable Hilbert space by showing that the system of functions
$\{ e^{itw}, \ w \in \mathbb{R} \}$
is an orthonormal system in $X$.
CHAPTER 2
$f(ax + by) = a f(x) + b f(y)$ for all $x, y \in E$, $a, b \in \mathbb{C}$.
F pg q
g ptq d
1
0
g ptqwptq dt
$F(g) = g(t_0)$
is clearly a linear functional on $C[0,1]$. It is called the point evaluation functional at $t_0$.
Physicists view the point evaluation functional as a special case of integration with weight:
(2.2) $g(t_0) = \int_0^1 g(t) \, \delta(t - t_0) \, dt$
The weight here is given by Dirac delta function $\delta(t)$, which is zero for all values $t$ except $\delta(0) = \infty$, and such that $\int_0^1 \delta(t - t_0) \, dt = 1$. Dirac delta function does not exist
xk yk
xx, yy,
x px1 , . . . , xn q,
k 1
for some y
py1 , . . . , yn q P Cn .
Example 2.1.5. More generally, we will soon show that every bounded linear functional on a Hilbert space $X$ has the form
$f(x) = \langle x, y \rangle$
for some $y \in X$. For now, we note that $f$ defined this way is indeed a linear functional.
x in X
$|f(x)| \le C \|x\|$ for all $x \in X$.
x. Then
$|f(x_n) - f(x)| = |f(x_n - x)| \le C \|x_n - x\| \to 0.$
Thus f is continuous.
Vice versa, assume that $f$ is not bounded. Then we can find a sequence $(x_n)$ of nonzero vectors in $X$ such that
$|f(x_n)| \ge n \|x_n\|, \quad n = 1, 2, \ldots$
Dividing both sides by $n \|x_n\|$ we obtain
$\Big| f \Big( \frac{x_n}{n \|x_n\|} \Big) \Big| \ge 1, \quad n = 1, 2, \ldots$
PX
$|f(x)| \le \|f\| \, \|x\|$ for all $x \in X$, $f \in X^*$.
Also, $\|f\|$ is the smallest number in this inequality that makes it valid for all $x \in X$.
Exercise 2.1.12. Compute the norms of the integration and the point
evaluation functionals considered in Examples 2.1.2 and 2.1.3.
2.1.3. Hyperplanes as level sets of linear functionals. General functions,
and in particular linear functionals f , on a linear vector space E may be visualized
by describing their level sets
$\{ x \in X : f(x) = c \}$
Proof. (i) Follows from a linear version of the fundamental theorem on homomorphisms, Exercise 1.1.24. Indeed, the injectivization $\tilde{f} : E / \ker f \to \mathbb{C}$ of $f$ establishes a linear bijection (isomorphism) between $E / \ker f$ and the range $\mathbb{C}$ of $f$. Thus $\dim(E / \ker f) = \dim(\mathbb{C}) = 1$, so $\ker f$ is a hyperplane in $E$.
(ii) Since $\ker f = \ker g =: H$, the injectivizations $\tilde{f}, \tilde{g} : E/H \to \mathbb{C}$ are linear functionals on the one-dimensional space $E/H$. A moment's thought yields that such linear functionals must be equal up to some constant factor $a$, i.e. $\tilde{f} = a \tilde{g}$. Then $\tilde{f}[x] = a \tilde{g}[x]$ for all $x \in E$. On the other hand, by construction of injectivization, $f(x) = \tilde{f}[x]$ and $g(x) = \tilde{g}[x]$. Therefore $f(x) = a g(x)$ as required.
(iii) Since $\dim(E/H) = 1$, we have
$E/H = \{ a[x_0] : a \in \mathbb{C} \}$
for some $x_0 \in E$. Let $x \in E$ be arbitrary; then $[x] = a[x_0]$ for some $a = a(x) \in \mathbb{C}$, which implies $x = a x_0 + h$ for some $h \in H$. Let us define $f$ on $E$ by $f(x) = a$. Then $f$ is a linear functional (check!), and clearly $\ker f = H$.
Proposition 2.1.14. Let $f$ be a bounded linear functional, i.e. $f \in X^*$. Then $\ker f$ is closed.
Proof. ker f is the pre-image of the closed set t0u under the continuous map
f , so it must be closed.
Remark 2.1.15. Using injectivization of f , one can show that the converse also
holds. So, a linear functional f is bounded if and only if ker f is closed. It follows
that the kernel of a linear functional is either closed or dense in X. (Why?)
Lec.14: 10/08
P X, the function
f pxq xx, y y,
xPX
Since $f$ is bounded, $\ker f$ is closed (see Proposition 2.1.14). Therefore, $X$ can be represented by the orthogonal decomposition
$X = \ker f \oplus \mathrm{Span}(y_0)$ for some $y_0 \in (\ker f)^\perp$.
x P X.
ty0 uK ker f.
Remark 2.2.2. In a concise form, the statement of Riesz representation theorem can be expressed as
$X^* = X.$
Although $X^*$ and $X$ are formally different spaces, they can be canonically identified as in Riesz representation theorem.
Let us rewrite Riesz representation theorem for the Hilbert space $L_2$.
Corollary 2.2.3 ($L_2^* = L_2$). Consider the space $L_2 = L_2(\Omega, \Sigma, \mu)$.
(i) For every weight function $g \in L_2$, integration with weight
$G(f) := \int f g \, d\mu, \quad f \in L_2$
implies
pAq 0
g d
f d
is a bounded linear functional on the space L2 pq, and therefore also on the space
L2 p q. By Riesz representation theorem, there exists h P L2 p q such that
(2.4)
f d
f h d p
q
f h d
f h d
for all f
P L2 p
q.
(2.5)
f h d
f p1 hq d for all f
P L2 p
q.
We claim that
(2.6)
0h1
-a.e.
0h1
q-a.e.
By Monotone Convergence Theorem, one can show that (2.5) holds for arbitrary
q-measurable functions f such that f 0 p q-a.e. (Indeed, consider the
truncation fn ptq : minpf ptq, nq and let n 8.) The convention is that if one side
of (2.5) is infinite then the other is infinite, too.
Now, given a measurable set A, we choose f so that f h 1A . In other words,
we consider
1A
f :
h
1h
d.
h
Lec.15: 10/11
2.2.3. The dual of $L_p$. A version of the representation theorem for $L_2$, Corollary 2.2.3, holds in fact for all $L_p$ spaces. In short, it states that $L_p^* = L_{p'}$ where $p$ and $p'$ are conjugate exponents as in Hölder's inequality, i.e.
$\frac{1}{p} + \frac{1}{p'} = 1, \quad 1 < p, p' < \infty,$
and $p' = \infty$ if $p = 1$. The rigorous statement is the following:
f g d,
P Lp
xk yk ,
x P `p
k 1
8 is an exercise.
(i) By H
olders inequality, we have
|Gpxq|
p, p1 8;
the case p
1,
xk yk }x}p }y }p1 .
|Gpxq|
xk yk }x}p }y }p1 .
Indeed, one can check that this holds for $x = (x_k)$ defined as
$x_k = e^{-i \operatorname{Arg}(y_k)} |y_k|^{p'-1}.$
For this $x$, it follows that $\|G\| \ge \|y\|_{p'}$, so part (i) of the theorem is proved.
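The equality case of Hölder's inequality used here is easy to test numerically on a finite vector. The sketch below is not from the original notes. It assumes the pairing $G(x) = \sum_k x_k y_k$ (without complex conjugation) and the extremal vector $x_k = e^{-i \operatorname{Arg}(y_k)} |y_k|^{p'-1}$; the sample vector $y$ and the exponent $p = 3$ are arbitrary choices.

```python
import cmath

def p_norm(v, p):
    """The l_p norm of a finite vector of complex numbers."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

p = 3.0
p_conj = p / (p - 1.0)  # conjugate exponent: 1/p + 1/p' = 1

y = [1 + 2j, -0.5j, 3.0, -2 + 1j]

# Extremal vector: the phase cancels Arg(y_k), the modulus is |y_k|^(p'-1).
x = [cmath.exp(-1j * cmath.phase(c)) * abs(c) ** (p_conj - 1) for c in y]

G_x = sum(a * b for a, b in zip(x, y))         # equals sum |y_k|^p'
bound = p_norm(x, p) * p_norm(y, p_conj)        # right side of Hoelder

print(G_x, bound)
```

Since $(p'-1)p = p'$, both printed quantities equal $\sum_k |y_k|^{p'}$ up to rounding, so the functional attains $\|y\|_{p'}$ on the normalized vector $x/\|x\|_p$.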
¹ The result can be extended (by decomposition) to $\sigma$-finite measures.
8
Gpxq G
xk ek
xk Gpek q,
x pxk q P `p .
k 1
k 1
yk ek
P `p1 , consider
1
ei Argpyk q |yk |p 1 ek ,
n 1, 2, . . .
k 1
k 1
}G}}xpnq }p |Gpxpnq q|
xk yk }xpnq }p }y pnq }p1 .
k 1
}G}.
k 1
}G}.
By part (i), we
f d,
P C pK q
| |p q
| |
ak f ptk q
k 1
where ak are fixed scalars, and t1 , . . . , tn are fixed distinct points in r0, 1s.
Deduce that the linear span of point evaluation functionals $\delta_{t_1}, \ldots, \delta_{t_n}$ in $C[0,1]^*$ is isometric to $\ell_1^n$.
2.3. Hahn-Banach theorem
Lec.16: 10/13
f0 |X ,
0
P X0 .
Constructing extensions is a nontrivial problem because of the continuity (=boundedness) requirement for f .
Exercise 2.3.1. Show that if one does not require continuity of f , one
can construct f using a Hamel basis.
2.3.1. Extension by continuity. Before we state Hahn-Banach theorem, let
us address the simpler problem of extending a continuous linear functional from a
dense subspace to the whole space.
Proposition 2.3.2 (Extension by continuity). Let $X_0$ be a dense subspace of a normed space $X$. Then every functional $f_0 \in X_0^*$ admits a unique extension $f \in X^*$. Moreover, $\|f\| = \|f_0\|$.
Proof. Let $x \in X$ be arbitrary. By density, we can find a sequence $(x_n) \subset X_0$ such that
$x_n \to x.$
Then $(f_0(x_n))$ is a Cauchy sequence, since
$|f_0(x_n) - f_0(x_m)| \le \|f_0\| \, \|x_n - x_m\| \to 0$ as $n, m \to \infty$.
$x$ and $y_n \to y$ we have
$f(ax + by) = \lim_n f_0(a x_n + b y_n) = a \lim_n f_0(x_n) + b \lim_n f_0(y_n) = a f(x) + b f(y).$
Finally, $f$ is a bounded linear functional. Indeed, for $x_n \to x$ we have
$|f(x)| = \lim_n |f_0(x_n)| \le \|f_0\| \lim_n \|x_n\| = \|f_0\| \, \|x\|.$
This shows that $f \in X^*$ and $\|f\| \le \|f_0\|$. Note that the reverse inequality $\|f\| \ge \|f_0\|$
}f }
1
0
|f ptq| dt
(Riemann integral).
1
0
f ptq dt
f pxq af pz q
x0 ,
a P R, x0
P X0 .
f px0 q af pz q
f0 px0 q.
f pz q : C.
Without loss of generality we can assume that $\|f_0\| = 1$ (by rescaling). We are looking for an extension $f$ such that $\|f\| = 1$, which means that
$|f(x)| \le \|x\|, \quad x \in X.$
$f(x) \le \|x\|, \quad x \in X.$
f0 px0 q }az
aC
x0 },
a P R, x0
P X0 .
The desired extension f will be constructed if we are able to show that this inequality has solutions in C.
For a 0, inequality (2.8) trivially holds by the assumption that }f0 } 1. For
a 0, inequality (2.8) can be written as
x
x0
0
,
f0
a
a
z
a P R , x0
P X0 .
the inequality
sup
f0
b R , x 1 X1
x
1
z
x
x0
0
.
f0
a
a
Collecting the similar terms we see that this in turn is equivalent to the inequality
f0
x
x1
z
b
x0
a
x1
z,
a, b P R ; x0 , x1
P X0 .
But this inequality is true; it follows from the fact that }f0 } 1, i.e.
f0
x
x1 x0
a
b
x1
,
b
pY, gq :
X0
Y :
Y , g pxq : g pxq if x P Y .
Lec.17: 10/15
2.3.3. Supporting functionals. Hahn-Banach theorem has a variety of consequences, both analytic and geometric. One of the basic tools guaranteed by Hahn-Banach theorem is the existence of a supporting functional $f \in X^*$ for every vector $x \in X$.
Proposition 2.3.7 (Supporting functional). Let $X$ be a normed space. For every $x \in X$ there exists $f \in X^*$ such that
$\|f\| = 1, \quad f(x) = \|x\|.$
$\in X^*$ is defined as
$\|f\| = \sup_{x \neq 0} \frac{|f(x)|}{\|x\|}.$
Generally it is not true that every functional f attains its norm on some vector x,
i.e. that the supremum above can be replaced by the maximum.
Exercise 2.3.9. Construct a bounded linear functional on C r0, 1s which
does not attain its norm.
However, every vector $x$ does attain its norm on some functional $f \in X^*$, namely the supporting functional. This immediately follows from Proposition 2.3.7:
Corollary 2.3.10. For every vector $x$ in a normed space $X$, one has
$\|x\| = \max_{f \in X^*, \ f \neq 0} \frac{|f(x)|}{\|f\|}.$
Hahn-Banach theorem implies that there are enough bounded linear functionals
Corollary 2.3.11 ($X^*$ separates the points of $X$). For every two vectors $x_1 \neq x_2$ in a normed space $X$, there exists a functional $f \in X^*$ such that $f(x_1) \neq f(x_2)$.
Proof. The supporting functional $f \in X^*$ of the vector $x = x_1 - x_2$ must satisfy $f(x_1 - x_2) = \|x_1 - x_2\| \neq 0$, as required.
2.3.4. Second dual space. Let $X$ be a normed space as usual. The functionals $f$ are designed to act on vectors $x \in X$ via $f : x \mapsto f(x)$. Vice versa, we can say that vectors $x \in X$ act on functionals $f \in X^*$ via
(2.9) $x : f \mapsto f(x), \quad f \in X^*.$
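The inequality
$|f(x)| \le \|x\| \, \|f\|$
shows that this functional is bounded, so
$x \in X^{**}$
and the norm of $x$ as a functional is $\|x\|_{X^{**}} \le \|x\|$. Considering the supporting functional $f \in X^*$ of $x$ we see that actually
$\|x\|_{X^{**}} = \|x\|.$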
We demonstrated that there exists a canonical embedding of $X$ into $X^{**}$. We summarize this as follows.
Theorem 2.3.12 (Second dual space). Let $X$ be a normed space. Then $X$ can be considered as a linear subspace of $X^{**}$. For this, a vector $x \in X$ is considered as a bounded linear functional on $X^*$ via the action
$x : f \mapsto f(x), \quad f \in X^*.$
`1 while `1 `8 , so
`8 .
P X
X,
2.3.5. Hahn-Banach theorem for sublinear functions. A quick inspection of the proof of Hahn-Banach theorem in Section 2.3.2 reveals that we have not used all the norm axioms there. We used just these two: positive homogeneity and triangle inequality. Precisely,
(i) $\|ax\| = a \|x\|$ for all $x \in X$, $a \ge 0$;
(ii) $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in X$.
Definition 2.3.17 (Sublinear function). Let $X$ be a linear space. Functions $\| \cdot \| : X \to [0, \infty)$ that satisfy (i) and (ii) above are called sublinear functions.
for all x P X0 .
for all x P X.
tK
t 0
ttx : x P K u.
Proposition 2.3.22 (Minkowski functional). Let $K$ be an absorbing convex subset of a linear vector space $X$ such that $0 \in K$. Then Minkowski functional $\| \cdot \|_K$ is a sublinear functional on $X$.
Conversely, let $\| \cdot \|$ be a sublinear functional on a linear vector space $X$. Then the sub-level set
$K = \{ x \in X : \|x\| \le 1 \}$
2.3.6. Separation of convex sets. Hahn-Banach theorem has some remarkable geometric implications, which are grouped together under the name of separation theorems. Under some mild topological requirements, these results guarantee
that two convex sets A, B can always be separated by a hyperplane. As we know
from Section 2.1.3, the hyperplanes correspond to the level sets of linear functionals
$f$. Therefore, we expect that a separation theorem for $A$, $B$ would give us a linear functional $f$ and a number $C$ such that
$f(a) \le C \le f(b), \quad a \in A, \ b \in B.$
In this case, the sets $A$ and $B$ get separated by the hyperplane $\{ x : f(x) = C \}$.
Let us start from the simpler case when one of the two sets is a point.
Theorem 2.3.24 (Separating a point from a convex set). Let $K$ be an open convex subset of a normed space $X$, and consider a point $x_0 \notin K$. Then there exists a functional $f \in X^*$, $f \neq 0$, such that
$f(x) \le f(x_0), \quad x \in K.$
Spanpx0 q
f0 ptxq t}x},
t P R.
Then f0 is dominated by
x P X.
To finish the proof, we need to check that $f$ is bounded and that it separates $x_0$ from $K$ as required. The boundedness follows from the inequality
$f(x) \le \|x\|_K \le \frac{1}{r} \|x\|, \quad x \in X,$
so $f \in X^*$. To check the separation, consider $x \in K$. Since $x_0 \notin K$, we have
f pxq }x}K
f pbq,
a P A, b P B.
(ii) If both $A$ and $B$ are open, then the stronger inequality holds:
$f(a) < C < f(b), \quad a \in A, \ b \in B.$
$K = A - B := \{ a - b : a \in A, \ b \in B \}.$
The set $K$ is open and convex. (Check!) Since $A$ and $B$ are disjoint, $0 \notin K$. Using Theorem 2.3.24, we obtain a functional $f \in X^*$, $f \neq 0$ such that
$f(a - b) \le f(0) = 0, \quad a \in A, \ b \in B.$
f pbq,
a P A, b P B.
a A
b B
Proof. Let
r
a A r {3
x P K.
Indeed, for open sets this follows from Theorem 2.3.25, while for closed sets this
follows from Corollary 2.3.27.
2.3.7. Convex sets are intersections of half-spaces.
Corollary 2.3.30. Every closed convex subset K of a normed space X is the
intersection of all (closed) half-spaces that contain K.
Recall that a half-space is what lies on one side of a hyperplane; therefore half-spaces have the form
$\{ x \in X : f(x) \le a \}$
for some $f \in X^*$, $a \in \mathbb{R}$. See the picture illustrating Corollary 2.3.30.
x K
PX
: f pxq
au contains K
for all a P A, b P B.
$\|Tx\| \le \|T\| \, \|x\|, \quad x \in X.$
Lec.19: 10/22
Z
Z
$\|ST\| \le \|T\| \, \|S\|.$
A version of Proposition 2.1.7 for linear functionals holds for linear operators,
and with a similar proof:
Proposition 2.4.4. Continuity and boundedness of linear operators are equivalent.
Exercise 2.4.5. Prove that a linear operator $T : X \to Y$ is bounded if and only if it maps sequences that converge to zero to bounded sequences.
2.4.2. Space of operators. Let $X$ and $Y$ be normed spaces. The space of all bounded linear operators $T : X \to Y$ equipped with the operator norm is denoted $L(X, Y)$.
As an example, the dual space is a space of operators that map to scalars, i.e.
$X^* = L(X, \mathbb{R}).$
$\|T_n - T_m\| \to 0, \quad n, m \to \infty.$
Applying this to an arbitrary $x \in X$ and noting that
$\|T_n x - T_m x\| \le \|T_n - T_m\| \, \|x\| \to 0, \quad n, m \to \infty$
we see that $(T_n x)$ is a Cauchy sequence in $Y$. By the completeness of $Y$ it converges. Define the map $T$ as
$Tx := \lim_n T_n x.$
We claim that $T$ is the limit of $T_n$ in $L(X, Y)$, which would complete the proof.
It is easy to check that $T : X \to Y$ is a linear operator. (Check!)
To show that $T$ is bounded, we choose an arbitrary $x \in X$ and use continuity of the norm:
$\|Tx\| = \lim_n \|T_n x\| \le \sup_n \|T_n\| \, \|x\|.$
Since a Cauchy sequence is always bounded (why?), $\sup_n \|T_n\| < \infty$. It follows that $T$ is a bounded linear operator, i.e. $T \in L(X, Y)$.
It remains to show that $T_n \to T$ in $L(X, Y)$, i.e. in the operator norm. Since $(T_n)$ is Cauchy, for every $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that
$\|T_n - T_m\| \le \varepsilon$ for $n, m \ge N$.
Applying this to an arbitrary $x \in X$ we obtain
$\|T_n x - T_m x\| \le \varepsilon \|x\|$ for $n, m \ge N$.
Letting $m \to \infty$, we get
$\|T_n x - Tx\| \le \varepsilon \|x\|$ for $n \ge N$.
Since $x \in X$ is arbitrary,
$\|T_n - T\| \le \varepsilon$ for $n \ge N$.
This means that $T_n \to T$ in $L(X, Y)$ as required.
$t_{ij} = \langle T e_j, e_i \rangle, \quad i, j = 1, \ldots, n$
and where $(e_i)$ denotes the canonical basis of $\ell_2^n$. This way, the $i$-th coordinate of the vector $Tx$ can be computed as
(2.10) $(Tx)_i = \sum_{j=1}^{n} t_{ij} x_j.$
$\|T\| \le \|T\|_{HS}$
$\|T\|_{HS} = \Big( \sum_{i,j=1}^{n} |t_{ij}|^2 \Big)^{1/2}$
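$\|Tx\|_2^2 = \sum_{i=1}^{n} |(Tx)_i|^2 = \sum_{i=1}^{n} \Big| \sum_{j=1}^{n} t_{ij} x_j \Big|^2 \le \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} |t_{ij}|^2 \Big) \Big( \sum_{j=1}^{n} |x_j|^2 \Big) = \|T\|_{HS}^2 \, \|x\|_2^2.$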
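The bound $\|T\| \le \|T\|_{HS}$ can be probed numerically: for any vector $x$, the ratio $\|Tx\|_2 / \|x\|_2$ should never exceed the Hilbert-Schmidt norm. The following sketch is not part of the original notes; it uses a random real $4 \times 4$ matrix and random test vectors, all of which are assumptions made for illustration.

```python
import math
import random

def apply(T, x):
    """Matrix-vector product: (Tx)_i = sum_j t_ij x_j."""
    return [sum(t * xj for t, xj in zip(row, x)) for row in T]

def norm2(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def hs_norm(T):
    """Hilbert-Schmidt norm: square root of the sum of |t_ij|^2."""
    return math.sqrt(sum(abs(t) ** 2 for row in T for t in row))

random.seed(0)
n = 4
T = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

# Each ratio ||Tx|| / ||x|| is a lower bound for the operator norm ||T||.
ratios = []
for _ in range(1000):
    x = [random.gauss(0, 1) for _ in range(n)]
    ratios.append(norm2(apply(T, x)) / norm2(x))

print(max(ratios), hs_norm(T))
```

The largest sampled ratio only estimates $\|T\|$ from below, so the experiment illustrates the inequality rather than computing the operator norm.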
2.4.4. Hilbert-Schmidt integral operators. A similar construction in function spaces $L_2$ leads to the notion of Hilbert-Schmidt integral operators. To this end, consider a function $k(t, s) \in L_2([0,1]^2)$ which we call the kernel. Define a linear operator $T : L_2[0,1] \to L_2[0,1]$ as
(2.11) $(Tf)(t) = \int_0^1 k(t, s) f(s) \, ds.$
We can view this definition as a continuous version of (2.10), where the kernel $k(t, s)$ can be considered as a continuous version of a matrix. The operator $T$ defined this way is called a Hilbert-Schmidt integral operator with kernel $k(t, s)$.
Proposition 2.4.8. A Hilbert-Schmidt integral operator $T : L_2[0,1] \to L_2[0,1]$ with kernel $k(t, s) \in L_2([0,1]^2)$ is bounded. Specifically,
$\|T\| \le \|k\|_2.$
$\|Tf\|_2^2 = \int_0^1 \Big| \int_0^1 k(t, s) f(s) \, ds \Big|^2 dt \le \int_0^1 \Big( \int_0^1 |k(t, s)|^2 \, ds \Big) \Big( \int_0^1 |f(s)|^2 \, ds \Big) dt = \|k\|_2^2 \, \|f\|_2^2.$
Given a kernel k pt, sq and the left hand side g ptq, the problem is to find the function
f psq.
Fredholm equations can be written as
$Tf = g$
where $T$ is the corresponding Hilbert-Schmidt operator. Therefore, Fredholm equations are linear equations. They can be thought of as continuous versions of matrix linear equations $Ax = b$, where $A$ is an $m \times n$ matrix, $b \in \mathbb{R}^m$ and $x \in \mathbb{R}^n$.
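The analogy with $Ax = b$ becomes concrete after discretization. The sketch below is not from the original notes. It replaces the integral $\int_0^1 k(t, s) f(s) \, ds$ by a midpoint rule on $n$ nodes, so that the operator $T$ turns into the matrix $A_{ij} = k(t_i, s_j)/n$. The kernel $k(t, s) = ts$, the function $f(s) = s$ and all names are assumptions chosen so that the exact answer $(Tf)(t) = t/3$ is available for comparison.

```python
def discretize_kernel(k, n):
    """Midpoint-rule matrix A with A[i][j] = k(t_i, s_j) / n,
    where t_i = s_i = (i + 0.5) / n are the midpoints of n equal cells."""
    nodes = [(i + 0.5) / n for i in range(n)]
    A = [[k(t, s) / n for s in nodes] for t in nodes]
    return nodes, A

def kernel(t, s):
    # Sample kernel (an assumption for the illustration).
    return t * s

n = 200
nodes, A = discretize_kernel(kernel, n)

f_vals = [s for s in nodes]  # samples of f(s) = s
g_vals = [sum(a * fj for a, fj in zip(row, f_vals)) for row in A]  # A f

# Exact value: (Tf)(t) = t * int_0^1 s^2 ds = t / 3.
max_err = max(abs(g - t / 3) for g, t in zip(g_vals, nodes))
print(max_err)
```

This only shows the forward map $f \mapsto Tf$. Recovering $f$ from $g$ means solving the linear system $A f = g$, which for integral equations of this type is typically badly conditioned, so the sketch stops short of inverting $A$.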
Remark 2.4.11. The particular measure space r0, 1s does not play any role in
the discussion above, and can be replaced with an arbitrary measure space.
$(Tf)(t) = \int_0^t f(s) \, ds.$
r, s is chosen only for convenience; a similar result holds for an arbitrary
Lec.20: 10/25
Differentiating yields
$D e_k = (ik) e_k, \quad k \in \mathbb{Z}.$
Hence
$\|D e_k\|_2 = |k| \, \|e_k\|_2 = |k|, \quad k \in \mathbb{Z}.$
so
$\|Px\| \le \|x\|, \quad x \in X.$
We have shown:
Proposition 2.4.13. The orthogonal projection P in a Hilbert space X onto a
closed subspace X0 is a bounded linear operator.
Exercise 2.4.14. Show that $\|P\| = 1$.
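As an example, consider the $n$-th partial sum of the Fourier series of a function $f \in L_2[-\pi, \pi]$:
$S_n f = \sum_{k=-n}^{n} \langle f, e_k \rangle e_k$
$(S_n f)(t) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(t - s) f(s) \, ds = \frac{1}{2\pi} (D_n * f)(t)$
where
(2.14) $D_n(\theta) = \sum_{k=-n}^{n} e^{ik\theta} = \frac{\sin\big( (n + \frac{1}{2}) \theta \big)}{\sin\big( \frac{1}{2} \theta \big)}.$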
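The closed form of the kernel $D_n$ in (2.14) can be verified numerically by comparing the exponential sum with the ratio of sines at a few points. This is a small sketch that is not part of the original notes; the value $n = 7$ and the sample points are arbitrary, and the closed form is only evaluated away from multiples of $2\pi$, where the denominator vanishes.

```python
import cmath
import math

def dirichlet_sum(n, theta):
    """D_n(theta) as the sum of e^{ik theta} over k = -n, ..., n."""
    return sum(cmath.exp(1j * k * theta) for k in range(-n, n + 1))

def dirichlet_closed(n, theta):
    """Closed form sin((n + 1/2) theta) / sin(theta / 2), theta not in 2 pi Z."""
    return math.sin((n + 0.5) * theta) / math.sin(0.5 * theta)

n = 7
thetas = [0.1, 1.0, 2.5, -3.0]
errors = [abs(dirichlet_sum(n, th) - dirichlet_closed(n, th)) for th in thetas]
print(errors)
```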
}T x} }x}
for all x P X.
T0 |X .
0
As we know from Section 2.3, every bounded linear functional can be extended
from either dense or closed subspace to the whole space. For dense subspaces, we
can extend by continuity, while for closed subspaces the extension is guaranteed by
Hahn-Banach theorem.
For general linear operators, extension by continuity holds with the same proof
as in Proposition 2.3.2:
Proposition 2.4.20 (Extension by continuity). Let $X_0$ be a dense subspace of a normed space $X$, and $Y$ be a Banach space. Then every operator $T_0 \in L(X_0, Y)$ admits a unique extension $T \in L(X, Y)$. Moreover, $\|T\| = \|T_0\|$.
Unfortunately, extension from a closed subspace is not always possible, and Hahn-Banach theorem does not generalize to bounded linear operators. There is a simple geometric description of the situations when such extensions are possible. To state it we need a general notion of projections in normed spaces (not necessarily orthogonal).
Definition 2.4.21 (Projection). Let $X_0$ be a closed subspace of a normed space $X$. An operator $P \in L(X, X)$ is called a projection in $X$ onto $X_0$ if
(i) $P(X) \subseteq X_0$;
(ii) $Px = x$ for all $x \in X_0$, i.e. $P|_{X_0} = I_{X_0}$.
Example 2.4.22. Any orthogonal projection in a Hilbert space is clearly a projection in this sense. However, even in a Hilbert space there are plenty of non-orthogonal projections. (Construct one in a two-dimensional space.)
The following observation characterizes the subspaces from which extensions of
linear operators are possible.
Proposition 2.4.23 (Extensions of operators and projections). Let $X_0$ be a closed subspace of a normed space $X$. Then the following are equivalent:
(i) There exists a projection in $X$ onto $X_0$. In this case we say that $X_0$ is a complemented subspace of $X$;
(ii) For every normed space $Y$, every operator $T_0 \in L(X_0, Y)$ admits an extension $T \in L(X, Y)$.
Proof. Assume P is a projection in X onto X0 . Then for every operator
T0
(2.15)
x, y
P Cn .
Now we would like to extend this to a general definition of the adjoint $T^*$ for a linear operator $T : X \to Y$ acting between normed spaces $X$ and $Y$.
$T^* \in L(Y^*, X^*)$ is defined by
$(T^* f)(x) := f(Tx), \quad f \in Y^*, \ x \in X.$
(2.16) $\langle f, x \rangle := f(x), \quad f \in X^*, \ x \in X.$
Notice that if $X$ is a Hilbert space, this notation agrees with the inner product by Riesz representation theorem (up to complex conjugation). In general, $\langle f, x \rangle$ does not define an inner product since the arguments are taken from different spaces.
Then the definition of the adjoint reads as
$\langle T^* f, x \rangle = \langle f, Tx \rangle, \quad f \in Y^*, \ x \in X.$
One point has not been justified in Definition 2.4.25: why $T^*$ is a bounded linear operator. We shall prove this now:
Proof. Denoting as usual SX the unit sphere of X, and using notation (2.16),
we have
as required.
$\in L(X, Y)$. Then
$(\mathrm{Im}\, T)^\perp = \ker T^*.$
Proof. Let $f \in Y^*$. Then $f \in \ker T^*$ means that $T^* f = 0$, which is equivalent to $\langle T^* f, x \rangle = \langle f, Tx \rangle = 0$ for all $x \in X$, which means that $f \in (\mathrm{Im}\, T)^\perp$.
Corollary 2.4.33. Let $H$ be a Hilbert space, and $T \in L(H, H)$. Then the orthogonal decomposition holds:
$H = \overline{\mathrm{Im}\, T} \oplus \ker T^*.$
Proof. By Proposition 2.4.32, we have $(\overline{\mathrm{Im}\, T})^\perp = (\mathrm{Im}\, T)^\perp = \ker T^*$. (Why does the first identity hold?) By Proposition 1.5.7, the proof is complete.
P LpX, Y q.
Prove that
ker T
Deduce that
pIm T qK .
pker T qK Im T .
Im T .
2.4.10. Application: ergodic theory. Ergodic theorems allow one to compute space averages as time averages. Let us first state and prove a preliminary form of von Neumann's ergodic theorem; its interpretation will follow.
Theorem 2.4.35 (von Neumann ergodic theorem). Let $U$ be a unitary operator on a Hilbert space $H$. Let $P$ denote the orthogonal projection onto the invariant subspace $\{ x \in H : Ux = x \}$. Then, for all $x \in H$, we have
$\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} U^n x = Px.$
Proof. It suffices to prove the result for $x \in \ker P$ and for $x \in \mathrm{Im}\, P$, because then the result will follow for all $x \in H$ by the orthogonal decomposition $H = \ker P \oplus \mathrm{Im}\, P$. (Check!)
For $x \in \mathrm{Im}\, P$ the result is trivial because in this case $U^n x = Ux = x$ and $Px = x$. So let $x \in \ker P$. We will first find a convenient representation of $\ker P$. By definition, $\mathrm{Im}\, P = \ker(I - U) = \ker(I - U^*)$ because for unitary operators, $Ux = x$ if and only if $U^* x = x$. (Check!) Therefore, using the duality between kernels and images, Corollary 2.4.33, we have
ker P
as N
8.
(2.17)
|tn N :
pAq
2
as N
8.
for all measurable subsets $A \subseteq \Omega$.⁴ A one-to-one, measure preserving transformation $T$ is ergodic if the only functions $f \in L_2(\Omega, \Sigma, \mu)$ which satisfy $f(T\omega) = f(\omega)$ for almost all $\omega \in \Omega$ are the constant functions.
Exercise 2.4.38. Show that $T$ is ergodic if and only if for all measurable subsets $A \subseteq \Omega$, $T^{-1}(A) = A$ implies $\mu(A) = 0$ or $\mu(A) = 1$.
Theorem 2.4.39. Consider a measure-preserving, ergodic transformation $T$ on a probability space $(\Omega, \Sigma, \mu)$. For every $f \in L_2(\Omega, \Sigma, \mu)$ one has
(2.18) $\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} f(T^n \omega) = \int f \, d\mu$
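The identity (2.18), "time average equals space average", can be observed in a simple experiment. The sketch below is not from the original notes. It takes $\Omega = [0, 1)$ with Lebesgue measure and the rotation $T\omega = \omega + \alpha \pmod 1$ with irrational $\alpha$, which is a standard example of a measure-preserving ergodic transformation, and the function $f(\omega) = \cos^2(2\pi\omega)$, whose space average is $1/2$. The choice of $\alpha$, $f$, the starting point and all names are assumptions made for the illustration.

```python
import math

ALPHA = math.sqrt(2) - 1  # irrational rotation angle (assumed example)

def T(omega):
    """Rotation of the circle [0, 1) by ALPHA."""
    return (omega + ALPHA) % 1.0

def f(omega):
    """Observable with space average int_0^1 f = 1/2."""
    return math.cos(2 * math.pi * omega) ** 2

def time_average(g, omega, N):
    """(1/N) * sum_{n=0}^{N-1} g(T^n omega)."""
    total = 0.0
    for _ in range(N):
        total += g(omega)
        omega = T(omega)
    return total / N

space_average = 0.5
for N in (10, 1000, 100000):
    print(N, time_average(f, 0.123, N), space_average)
```

For this particular continuous $f$ the averages converge at every starting point, which is more than the theorem claims in general; the experiment is meant only to make the statement tangible.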
$f \circ T$, i.e.
$(Uf)(\omega) = f(T\omega), \quad \omega \in \Omega.$
p q t P :
4Here T 1 A
sup
Pr s
t 0,1
sup
Pr s
s 0,1
|kpt, sq| ds : M1 8,
|kpt, sq| dt : M2 8.
Exercise 2.4.42. [Multiplication operator] Consider a multiplier function k ptq P C r0, 1s, and define a linear operator T : C r0, 1s C r0, 1s by
B px, b1 y1
a2 x2 , y q a1 B px1 , y q
b2 y2 q b1 B px, y1 q
a2 B px2 , y q,
b2 B px, y2 q.
P LpH, H q.
r s r s
P r 8s
8
x, y
PH
for all x, y
P H.
P LpH, H q
CHAPTER 3
In this chapter we shall study the three theorems that, together with Hahn-Banach theorem, form the main principles of functional analysis. Those are the open mapping theorem, the uniform boundedness principle, and the closed graph theorem.
3.1. Open mapping theorem
3.1.1. Statement and proof. This result was proved by S. Banach.
Theorem 3.1.1 (Open mapping theorem). Let $X$, $Y$ be Banach spaces. Then every surjective linear operator $T \in L(X, Y)$ is an open map, i.e. $T$ maps open sets in $X$ to open sets in $Y$.
The proof of the open mapping theorem relies on Baire category theorem, which states that every complete metric space $M$ is a set of second category, i.e. $M$ cannot be represented as a countable union of nowhere dense sets. Recall that a subset $A \subseteq M$ is a nowhere dense set if there is no neighborhood in $M$ on which $A$ is dense. Equivalently, $A$ is nowhere dense if the interior of the closure of $A$ is empty.
The open mapping theorem states that for every open set $U \subseteq X$, every $y \in TU$ is an interior point of $TU$. We claim that it suffices to show this for $U$ being the unit ball $B_X$ and for $y = 0$:
Claim 3.1.2. To prove the open mapping theorem, it suffices to find $\varepsilon > 0$ such that
(3.1) $T B_X \supseteq \varepsilon B_Y.$
PU
BX .
T px
BX q y
T BX
so y is an interior point of T U .
BY ,
We will now prove the Claim. In view of application of Baire category theorem, we represent
$X = \bigcup_{n \in \mathbb{N}} n B_X.$
Therefore
$Y = TX = \bigcup_{n \in \mathbb{N}} n T B_X.$
y
BY .
BY .
As we see, we have almost proved the Claim, except for the closure. Unfortunately, in general $\overline{K} \supseteq D$ does not imply $K \supseteq D$ even for convex and symmetric sets $K$, $D$ in a Banach space. (Give a counterexample!) However, this is true for perfectly convex sets, defined as follows.
Definition 3.1.3 (Perfectly convex set). A set $K$ in a Banach space $Y$ is called perfectly convex if for every sequence $(x_k)_{k=1}^\infty$ in $K$ and all numbers $\lambda_k \ge 0$ such that $\sum_{k=1}^\infty \lambda_k = 1$, one has $\sum_{k=1}^\infty \lambda_k x_k \in K$.
Convex sets satisfy this property only for finite sequences $(x_k)$. (Why?) Therefore, every perfectly convex set is convex, but not vice versa. (Give an example.)
Lemma 3.1.4 (Perfectly convex sets). Let $K$ be a perfectly convex set in a Banach space $Y$. If $\overline{K} \supseteq \varepsilon B_Y$ for some $\varepsilon > 0$, then $K \supseteq \frac{\varepsilon}{2} B_Y$.
Proof. Assume B : BY
assumption clearly implies that
1
2B
K.
The
1
B,
2
for the right side is the -neighborhood of K in Y . Iterating this inclusion gives
B
K
K
1
1
1
K
B K
K
2
2
2
1
1
1
K
K
B K
2
4
2
Therefore1
1
K
2
By perfect convexity (check!), we have
B
1
1
B K
2
2
This proves the lemma.
1
K
4
1
K
4
1
K
8
1
B
4
1
K
2
1
K
8
1
K
16
1
K
4
1
B
8
K.
Exercise 3.1.5. Verify the steps of the above proof where we used
Minkowski sums and series of sets.
¹ All sums and series of sets are in Minkowski sense. The sum of sets is defined as $\sum_k A_k = \{ \sum_k a_k : a_k \in A_k \}$. The same for infinite sums (series), where we insist on the convergence of $\sum_k a_k$.
Now we are ready to complete the proof of the open mapping theorem. By Lemma 3.1.4, it suffices to show that $K = T B_X$ is perfectly convex. This is easy to check. Indeed,
consider any sequence pT xk q
k such that k k 1. Then
(3.2)
provided that the series
T
k T xk
T BX
with xk
P BX , and numbers
k xk ,
k xk
k
k }xk }
1.
(3.3)
X { ker T
is an isomorphism.
for all x P E.
Exercise 3.1.9. Show that two norms on a linear vector space E are
equivalent if and only if they generate the same topology on E.
Proposition 3.1.10 (Domination and equivalence of norms). Consider two norms $\| \cdot \|$ and $||| \cdot |||$ on a linear vector space $E$ such that $E$ is complete with respect to both norms. Suppose that one norm dominates the other, i.e. one can find $C$ such that
$|||x||| \le C \|x\|$ for all $x \in E$.
Then the two norms are equivalent.
Proof. The claim follows from inverse mapping theorem applied to the identity map (3.4).
Proposition 3.1.10 indicates that it is difficult to construct different good norms on the same space. Either the space will be incomplete or the norms need to be incomparable. This is a simple way to prove incompleteness of various spaces.
As an example, consider the norms $\| \cdot \|_1$ and $\| \cdot \|_\infty$ on $C[0,1]$. On the one hand, $\| \cdot \|_1 \le \| \cdot \|_\infty$. On the other hand, the norms are not equivalent: one can easily construct functions with $\|f\|_1 = 1$ and $\|f\|_\infty$ arbitrarily large. (Do this.) By Proposition 3.1.10, $C[0,1]$ must be incomplete with respect to one of these norms. Since it is complete with its natural norm $\| \cdot \|_\infty$, it follows that $C[0,1]$ is incomplete with respect to $\| \cdot \|_1$.
This argument is flexible and applies to a whole range of norms. It implies that there is essentially only one natural norm on $C[0,1]$, namely the sup-norm $\| \cdot \|_\infty$.
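One possible family witnessing the non-equivalence of $\| \cdot \|_1$ and $\| \cdot \|_\infty$ consists of tall, narrow tent functions. The sketch below is not from the original notes and is only one of many valid constructions. It uses $f_n(t) = \max(0, 2n(1 - nt))$, which has integral $1$ and maximum $2n$, and estimates both norms on a grid; the grid sizes and function names are assumptions.

```python
def spike(n):
    """Tent function of height 2n supported on [0, 1/n]; its integral is 1."""
    def f(t):
        return max(0.0, 2.0 * n * (1.0 - n * t))
    return f

def l1_norm(f, m=10000):
    """Midpoint-rule approximation of int_0^1 |f(t)| dt."""
    h = 1.0 / m
    return sum(abs(f((j + 0.5) * h)) for j in range(m)) * h

def sup_norm(f, m=10000):
    """Maximum of |f| over a uniform grid on [0, 1]."""
    return max(abs(f(j / m)) for j in range(m + 1))

rows = [(n, l1_norm(spike(n)), sup_norm(spike(n))) for n in (1, 10, 100)]
for n, one_norm, infinity_norm in rows:
    print(n, one_norm, infinity_norm)
```

The first norm stays at $1$ while the second grows like $2n$, so no constant $C$ can satisfy $\|f\|_\infty \le C \|f\|_1$ for all continuous $f$.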
Exercise 3.1.11. [Direct sums of normed spaces] Let $X$ and $Y$ be normed spaces and $p \in [1, \infty]$. Define the direct sum $X \oplus_p Y$ as the Cartesian product $X \times Y$ equipped with the norm
}T x} c}x}
for all x P X.
where C
}T }. It follows that T
for all x P X
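$\|Tx\| = \Big\| \sum_{k=1}^{n} x_k u_k \Big\| \le \sum_{k=1}^{n} |x_k| \, \|u_k\| \le \Big( \sum_{k=1}^{n} |x_k|^2 \Big)^{1/2} \Big( \sum_{k=1}^{n} \|u_k\|^2 \Big)^{1/2} = M \|x\|,$
where $M = \big( \sum_{k=1}^{n} \|u_k\|^2 \big)^{1/2}$.
Moreover, $T$ is bijective because $(u_k)$ is a basis of $X$. (Why?) By inverse mapping theorem, $T$ is an isomorphism.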
is an isomorphismu.
?n.
It follows that for every two $n$-dimensional normed spaces $X$ and $Y$, one has
$d(X, Y) \le n.$
(Why?) E. Gluskin [4] proved in 1981 that this upper bound is asymptotically sharp, i.e. there exists an absolute constant $c > 0$ such that, for every $n \in \mathbb{N}$ one can construct $n$-dimensional normed spaces $X_n$ and $Y_n$ with
$d(X_n, Y_n) \ge cn.$
$x_n \to x \in X$ implies $T x_n \to T x$.
P r 8s
$x_n \to x \in X$ and $T x_n \to y \in Y$ imply $y = T x$.
It is clear from these two lines that continuity always implies the closed graph
property:
Proposition 3.2.2. Let $T : X \to Y$ be a linear operator between normed spaces $X$ and $Y$. If $T$ is bounded then $\Gamma(T)$ is closed.
The opposite statement is nontrivial and requires completeness of both spaces $X$ and $Y$:
Theorem 3.2.3 (Closed graph theorem). Let $T : X \to Y$ be a linear operator between Banach spaces $X$ and $Y$. If $\Gamma(T)$ is closed then $T$ is bounded.
Proof. The direct sum $X \oplus_1 Y$ is a Banach space (Exercise 1.3.13). The graph $\Gamma(T)$ is a closed linear subspace of $X \oplus_1 Y$, hence $\Gamma(T)$ is a Banach space itself. Consider the linear operator
$u : \Gamma(T) \to X, \quad u(x, Tx) := x.$
Then $u$ is a bounded, surjective and injective linear operator between two Banach spaces. (Check!) By the open mapping theorem, $u^{-1}$ is bounded. This means that there exists a number $M$ such that
3.2.2. Interpretation and an example. Recalling the interpretation of continuity (3.5) and closed graph property (3.6), we can make advantage of the closed
graph property the extra assumption that T xn converges in Y . So, to check the
continuity of a linear operator T using the definition (3.5), one can always assume
for free that T xn converges. Checking continuity no longer requires proving that
the limit exists; it reduces to checking the consistency of the limits of xn and T xn .
As an example, consider the simplest differential operator
\[ T = \frac{d}{dt}, \qquad T : C^1[0,1] \to C[0,1], \]
where $C^1[0,1]$ is considered as a subspace of $C[0,1]$, i.e. with respect to the sup-norm.
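A small numerical sketch may help to see why this operator is interesting in the sup-norm. It uses the monomials $f_n(t) = t^n$, which are an illustrative choice and not taken from the text: one has $\|f_n\|_\infty = 1$ while $\|T f_n\|_\infty = n$, so $T$ cannot be bounded in this norm. (This does not contradict the closed graph theorem, because $C^1[0,1]$ is not complete in the sup-norm.) The function names and the grid-based approximation of the sup-norm are assumptions of this sketch.

```python
def sup_norm(g, points=1001):
    """Approximate the sup-norm of g on [0, 1] by sampling a uniform grid."""
    return max(abs(g(j / (points - 1))) for j in range(points))


def monomial(n):
    """Return f_n(t) = t^n."""
    return lambda t: t ** n


def monomial_derivative(n):
    """Return (T f_n)(t) = n * t^(n-1), the exact derivative of t^n."""
    return lambda t: n * t ** (n - 1)


def amplification(n):
    """Ratio ||T f_n|| / ||f_n|| in the (sampled) sup-norm."""
    return sup_norm(monomial_derivative(n)) / sup_norm(monomial(n))


if __name__ == "__main__":
    # The ratio grows linearly in n, so no constant M bounds ||T f|| <= M ||f||.
    for n in (1, 5, 25, 125):
        print(n, amplification(n))
```

Since the grid contains the endpoint $t = 1$, where both $t^n$ and $n t^{n-1}$ attain their maxima, the sampled norms agree with the true sup-norms in this particular example.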
Lemma 3.2.4. The differential operator T has closed graph.
\[ \big( \lim_n f_n \big)' = \lim_n f_n' \]
provided
Let $T$ be a linear operator on a Hilbert space $H$ such that
\[ \langle x, Ty \rangle = \langle Tx, y \rangle \quad \text{for all } x, y \in H. \tag{3.7} \]
Then $T$ is bounded.
Proof. By the closed graph theorem, it suffices to check that the graph of $T$ is closed. To this end we choose convergent sequences $x_n \to x$, $T x_n \to y$ in $H$. We would like to show that $y = T x$. It suffices to show that
\[ \langle z, y \rangle = \langle z, Tx \rangle \quad \text{for all } z \in H. \]
(Why?) This follows by using continuity of the inner product and (3.7) twice:
\[ \langle z, y \rangle = \lim_n \langle z, T x_n \rangle = \lim_n \langle T z, x_n \rangle = \langle T z, x \rangle = \langle z, T x \rangle. \]
The proof is complete.
i dtd
with domain
Dom T
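\[ \sup_{T \in \mathcal{T}} \|Tx\| < \infty \quad \text{for every } x \in X. \tag{3.8} \]
Then the family $\mathcal{T}$ is uniformly bounded, i.e. $\sup_{T \in \mathcal{T}} \|T\| < \infty$.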
Note that the reverse direction is trivially true: uniform boundedness implies pointwise boundedness. (Why?)
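\[ X = \bigcup_{n \in \mathbb{N}} X_n, \quad \text{where} \quad X_n = \{ x \in X : M(x) \le n \} \quad \text{and} \quad M(x) := \sup_{T \in \mathcal{T}} \|Tx\|. \]
The Baire category theorem implies that one of the sets $X_n$ is not a nowhere dense subset of $X$. Since $X_n$ is closed (why?), $X_n$ has nonempty interior. Summarizing, we have shown that there exist $n \in \mathbb{N}$, $x_0 \in X$ and $\varepsilon > 0$ such that
\[ X_n \supseteq x_0 + \varepsilon B_X. \]
By symmetry of $X_n$, we also have $X_n \supseteq -x_0 + \varepsilon B_X$. Hence by convexity of $X_n$, we have
\[ X_n \supseteq \varepsilon B_X. \]
In other words,
\[ \|x\| \le \varepsilon \quad \text{implies} \quad \sup_{T \in \mathcal{T}} \|Tx\| \le n. \]
It follows that for every $x \in X$,
\[ \sup_{T \in \mathcal{T}} \|Tx\| \le \frac{n}{\varepsilon} \|x\|, \quad \text{hence} \quad \sup_{T \in \mathcal{T}} \|T\| \le \frac{n}{\varepsilon}, \]
as required.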
Exercise 3.3.2. Check that the sub-level sets Xn in the proof above
are closed, convex and symmetric. (All these properties were used in
the argument).
Remark 3.3.3 (Principle of condensation of singularities). Banach and Steinhaus called their Theorem 3.3.1 the principle of condensation of singularities for the following reason. Suppose a family $\mathcal{T} \subseteq L(X, Y)$ is not uniformly bounded. This means that the set of vectors $\{Tx : x \in B_X, \ T \in \mathcal{T}\}$ is unbounded. Theorem 3.3.1 states that $\mathcal{T}$ is not even pointwise bounded, so there exists one vector $x \in X$ with unbounded trajectory $\{Tx : T \in \mathcal{T}\}$. One can say that the unboundedness of the family $\mathcal{T}$ is condensed in a single singularity vector $x$.
Remark 3.3.4 (Completeness). In the proof of Theorem 3.3.1, the completeness of only X was used. So the result still holds if X is a Banach space and Y is
a normed space.
3.3.2. Weak and strong boundedness. The principle of uniform boundedness can be used to check whether a given set in a Banach space is bounded.
Corollary 3.3.5 (Weak and strong boundedness). Let $A$ be a subset of a Banach space $X$. Assume that $A$ is weakly bounded, i.e.
\[ \sup_{x \in A} |f(x)| < \infty \quad \text{for every } f \in X^*. \]
Then $A$ is (strongly) bounded, i.e. $\sup_{x \in A} \|x\| < \infty$.
Here again the reverse statement is trivially true: (strong) boundedness implies weak boundedness.
Proof. We embed $X$ into $X^{**}$ using the canonical embedding that we studied in Theorem 2.3.12. So we consider vectors $x \in A$ as bounded linear functionals on $X^*$ acting as $x(f) := f(x)$, $f \in X^*$. Rewriting the weak boundedness assumption as $\sup_{x \in A} |x(f)| < \infty$ for $f \in X^*$, we may understand this assumption as pointwise boundedness of the family $A \subseteq X^{**} = L(X^*, \mathbb{R})$. The principle of uniform boundedness implies that $\sup_{x \in A} \|x\|_X = \sup_{x \in A} \|x\|_{X^{**}} < \infty$, as required.
Remark 3.3.6. Using Corollary 3.3.5, one can weaken the assumption (3.8) in the principle of uniform boundedness to the following one:
\[ \sup_{T \in \mathcal{T}} |f(Tx)| < \infty \quad \text{for every } x \in X, \ f \in Y^*. \]
(Why?)
3.3.3. Application to convergence of Fourier series. A basic and classical question in Fourier analysis is: when does the Fourier series of a function $f$ on an interval converge to $f$?
Hilbert space technique provides a complete answer to this question in the space $L_2$. As we know from Theorem 1.6.20, the Fourier series of every function $f$ in $L_2[-\pi, \pi]$ converges to $f$ in the $L_2$-norm.
In function spaces other than $L_2$, the answer to this problem is often nontrivial and even negative. Unfortunately, such is the situation in the space of continuous functions $C[-\pi, \pi]$. There exist continuous functions $f$ whose Fourier series do not converge in $C[-\pi, \pi]$ (i.e. uniformly). This follows from a somewhat stronger result, which in turn is a consequence of the principle of uniform boundedness:
Theorem 3.3.7 (Divergent Fourier series). There exists a function $f \in C[-\pi, \pi]$ whose partial Fourier sums
\[ (S_n f)(t) = \sum_{k=-n}^{n} \hat{f}(k) e^{ikt} \]
do not converge at the point $t = 0$.
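The partial Fourier sums can be written through the Dirichlet kernel $D_n$:
\[ (S_n f)(t) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(t - s) f(s) \, ds, \quad \text{where} \quad D_n(\theta) = \frac{\sin\big( (n + \frac{1}{2}) \theta \big)}{\sin \frac{1}{2} \theta}. \]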
We are interested in the behavior of $(S_n f)(0)$. These are obviously linear functionals on $C[-\pi, \pi]$, which we denote
\[ \varphi_n(f) := (S_n f)(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(s) f(s) \, ds. \]
(We used that $D_n$ is an even function.)
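\[ \|\varphi_n\| = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(s)| \, ds. \]
(Why?)
On the other hand, evaluating these integrals by hand one can see that
\[ \|\varphi_n\| \to \infty \quad \text{as } n \to \infty. \tag{3.9} \]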
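The growth of these norms can also be observed numerically. The following is a minimal sketch, assuming the standard Dirichlet kernel $D_n(\theta) = \sin((n + \frac{1}{2})\theta) / \sin(\frac{1}{2}\theta)$; the function names and the midpoint quadrature rule are choices made for this illustration only. It approximates $\frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(s)| \, ds$ for a few values of $n$.

```python
import math


def dirichlet_kernel(n, theta):
    """D_n(theta) = sin((n + 1/2) theta) / sin(theta / 2), with D_n(0) = 2n + 1."""
    s = math.sin(theta / 2.0)
    if abs(s) < 1e-12:
        return 2.0 * n + 1.0
    return math.sin((n + 0.5) * theta) / s


def functional_norm(n, steps=20000):
    """Midpoint-rule approximation of (1 / (2 pi)) * integral of |D_n| over [-pi, pi]."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = -math.pi + (j + 0.5) * h
        total += abs(dirichlet_kernel(n, theta))
    return total * h / (2.0 * math.pi)


if __name__ == "__main__":
    # The printed values increase slowly (logarithmically) with n.
    for n in (0, 1, 10, 100):
        print(n, functional_norm(n))
```

The quadrature is only an approximation, so the printed numbers should be read as indicative of the slow, logarithmic growth rather than as exact values.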
Therefore, $(\varphi_n)_{n \in \mathbb{Z}}$ is not a uniformly bounded family of
3.3.4. Schauder bases. The notion of Hamel basis, which we studied in Section 1.1.4, has a serious drawback. In all infinite-dimensional Banach spaces, Hamel bases are uncountable, see Exercise 3.3.22. This makes it difficult to use Hamel bases in practice. There exists an alternative notion of basis, which is more tailored to the needs of analysis:
Definition 3.3.10 (Schauder basis). A sequence $(x_k)_{k=1}^\infty$ in a Banach space $X$ is called a Schauder basis of $X$ if every vector $x \in X$ can be uniquely expressed as a convergent series
\[ x = \sum_{k=1}^\infty a_k x_k. \tag{3.10} \]
\[ x = \sum_{k=1}^\infty a_k x_k. \]
Example 3.3.15 (Basis of the space of continuous functions). In $C[0,1]$, the natural candidates fail to be Schauder bases. The Fourier basis is not a Schauder basis; otherwise the Fourier series of every continuous function would converge in $C[0,1]$ (why?), which would contradict Theorem 3.3.7.
The sequence of monomials $1, t, t^2, \ldots$ is not a Schauder basis of $C[0,1]$ either. (Prove this!)
The best known Schauder basis of $C[0,1]$ is the so-called Schauder system of wavelets. Its mother wavelet $\varphi(t)$ is obtained by integration of the Haar mother wavelet $h$, i.e.
\[ \varphi(t) = \int_0^t h(s) \, ds = \begin{cases} t, & t \in [0, 1/2) \\ 1 - t, & t \in [1/2, 1). \end{cases} \]
Then we consider the translates and dilates of the mother wavelet:
\[ \varphi_{kl}(t) = \varphi(2^k t - l), \quad k = 0, 1, 2, \ldots, \quad l = 0, 1, 2, \ldots, 2^k - 1. \]
Together with the constant function $1$ and the function $t$ (which is needed because every $\varphi_{kl}$ vanishes at both endpoints $0$ and $1$), the system of functions $\varphi_{kl}(t)$ is called the Schauder system, see the picture. It forms a Schauder basis of $C[0,1]$. (Check!)
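The expansion in the Schauder system can be sketched numerically. The code below assumes that the system consists of the functions $1$, $t$ and the tents $\varphi_{kl}$ (the function $t$ is included so that $f(1) - f(0)$ can be captured), and it uses the observation that the partial sum through level $k$ is the piecewise linear interpolant of $f$ at the dyadic points. The function names and the coefficient formula are part of this illustration, not of the text.

```python
def phi(t):
    """Mother function: t on [0, 1/2), 1 - t on [1/2, 1), zero elsewhere."""
    if 0.0 <= t < 0.5:
        return t
    if 0.5 <= t < 1.0:
        return 1.0 - t
    return 0.0


def phi_kl(k, l, t):
    """Dilated and translated tent phi(2^k t - l), supported on [l / 2^k, (l + 1) / 2^k]."""
    return phi(2 ** k * t - l)


def schauder_coefficients(f, levels):
    """Coefficients of the tents for levels k = 0, ..., levels - 1.

    Each tent has height 1/2 at its midpoint, so the coefficient is twice the
    gap between f(mid) and the chord through the two interval endpoints.
    """
    coeffs = {}
    for k in range(levels):
        for l in range(2 ** k):
            left = l / 2 ** k
            right = (l + 1) / 2 ** k
            mid = (left + right) / 2
            coeffs[(k, l)] = 2.0 * (f(mid) - 0.5 * (f(left) + f(right)))
    return coeffs


def partial_sum(f, levels, t):
    """Partial sum of the expansion: the 1 and t terms plus all tents below `levels`."""
    value = f(0.0) + (f(1.0) - f(0.0)) * t
    for (k, l), c in schauder_coefficients(f, levels).items():
        value += c * phi_kl(k, l, t)
    return value


if __name__ == "__main__":
    f = lambda t: t * t
    for levels in (1, 3, 5):
        err = max(abs(partial_sum(f, levels, j / 256) - f(j / 256)) for j in range(257))
        print(levels, err)
```

For a smooth function such as $f(t) = t^2$ the sup-norm error shrinks as more levels are added, which is consistent with convergence of the expansion in $C[0,1]$.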
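\[ \Big\| \sum_{k=1}^{n} a_k x_k \Big\| \le M \|x\|, \quad n = 1, 2, \ldots \]
\[ E := \Big\{ a = (a_k)_{k=1}^\infty : \ \sum_{k=1}^\infty a_k x_k \text{ converges in } X \Big\}, \qquad \|a\|_E := \sup_{n \in \mathbb{N}} \Big\| \sum_{k=1}^{n} a_k x_k \Big\|. \tag{3.12} \]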
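\[ T a = \sum_{k=1}^\infty a_k x_k. \]
\[ \|T a\| \le \|a\|_E. \]
Since $(x_k)$ is a Schauder basis, $T$ is surjective and injective. By the inverse mapping theorem, $T$ is an isomorphism. Therefore one can find a number $M$ such that
\[ \|a\|_E \le M \|T a\|, \quad a \in E. \]
\[ \sup_{n \in \mathbb{N}} \Big\| \sum_{k=1}^{n} a_k x_k \Big\| \le M \|x\|, \quad x \in X, \quad x = \sum_{k=1}^\infty a_k x_k. \]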
n N
8.
Also, the coefficients $a_k = a_k(x)$ of the basis expansion (3.10) are obviously linear functionals on $X$. They are called biorthogonal functionals of the basis $(x_k)$ and denoted $x_k^*$, i.e.
\[ x_k^*(x) = a_k. \]
With this notation, the basis expansion of $x \in X$ looks as
\[ x = \sum_{k=1}^\infty x_k^*(x) x_k. \]
This resembles the Fourier series with respect to orthogonal bases in a Hilbert space, except now we discuss this in general Banach spaces.
Corollary 3.3.19 (Biorthogonal functionals). The biorthogonal functionals
k N
p q
\[ \|x_k^*(x) x_k\| = \|S_k(x) - S_{k-1}(x)\| \le \|S_k(x)\| + \|S_{k-1}(x)\| \le 2M \|x\| \]
where $M$ is the basis constant. On the other hand, $\|x_k^*(x) x_k\| = |x_k^*(x)| \, \|x_k\|$. This clearly completes the proof.
distpx, Y q
The next useful result states that pointwise convergence of operators implies uniform convergence on compacta. We say that a sequence of operators $T_n \in L(X, Y)$ between normed spaces $X$ and $Y$ converges pointwise to some $T \in L(X, Y)$ if
\[ T_n x \to T x \quad \text{for all } x \in X. \]
}Tn x T x} n 0
where n
for all x P A,
}Tn } M
for all n.
pq
}x Sn x} n 0
for all x P A,
k n
where n
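Corollary 3.4.7 (Compactness in $\ell_p$). A subset $A \subseteq \ell_p$, $p \in [1, \infty)$, is precompact if and only if $A$ is bounded and has uniformly decaying tails, i.e.
\[ \sum_{k > n} |a_k|^p \le \varepsilon_n \quad \text{for all } a = (a_k) \in A, \quad \text{where } \varepsilon_n \to 0. \]
Proof. The claim follows immediately by applying Theorem 3.4.6 for the canonical basis of $\ell_p$.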
1
for all k
k
P Nu.
|ak | bk
for all k
P N.
|s t|
|f psq f ptq|
implies
for all f
P A.
| |
implies
0
|f pt
q f ptq|
for all f
P A.
}xn x} 0.
(3.15)
for every f
x.
P X .
xxk , xy 0
for every x P X.
k 1
Even though weak convergence is generally strictly weaker than strong convergence, there are several useful ties between weak and strong properties. Weak convergence clearly implies weak boundedness, which in turn implies strong boundedness by a consequence of the principle of uniform boundedness (Corollary 3.3.5):
Proposition 3.5.3. Weakly convergent sequences in Banach spaces are bounded.
Moreover, we have a good control of the weak limit, given in the next two
results.
w
Proposition 3.5.4. If xn
as required.
p q
6 Recall that conv A is the smallest convex set containing A, see Exercise 1.2.24.
Proof. Suppose $x \notin K := \overline{\mathrm{conv}}(x_k)$. Using a separation theorem for closed convex sets, we can separate the closed convex set $K$ from the point $\{x\}$. Namely, there exists a functional $f \in X^*$ such that
\[ \sup_{y \in K} f(y) < f(x). \]
Since $x_k$
3.5.2. Criteria of weak convergence. Some known criteria of weak convergence in classical normed spaces rely on the following tool.
Lemma 3.5.6 (Testing weak convergence on a dense set). Let X be a normed
w
space and A X be a dense set. Then xk
required.
for every i P N.
Proof. Necessity. If $x_k \xrightarrow{w} x$ then by applying coordinate functionals $e_i^* \in X^*$ (i.e. those acting as $e_i^*(x) = x(i)$) we see that $x_k(i) \to x(i)$ as required.
Sufficiency. We are given that $(x_k)$ is bounded and that $f(x_k) \to f(x)$ for every coordinate functional $f = e_i^*$. By linearity, we get $f(x_k) \to f(x)$ for every $f \in \mathrm{Span}(e_i^*)_{i=1}^\infty$.
On the other hand, the representation theorems (Corollary 2.2.6 and Exercise 2.2.7) state that $X^* = \ell_1$ if $X = c_0$ and $X^* = \ell_{p'}$ if $X = \ell_p$. The functionals $e_i^* \in X^*$ get identified with the coordinate vectors $(0, \ldots, 0, 1, 0, \ldots)$, which shows that $\mathrm{Span}(e_i^*)_{i=1}^\infty$ is dense in $X^*$. (Why?)
The proof is finished by applying Lemma 3.5.6 to $A = \mathrm{Span}(e_i^*)_{i=1}^\infty$.
w
Exercise 3.5.9. State and prove a similar criterion of weak convergence in spaces with Schauder basis.
A similar criterion of weak convergence holds in spaces of continuous functions.
Theorem 3.5.10 (Weak convergence in C pK q). Let K be a compact topological
w
space. Then xk
for every t P K.
\[ \int_K x_n \, d\mu \to \int_K x \, d\mu \tag{3.16} \]
for every Borel regular signed measure $\mu$. On the other hand, our assumptions are that the sequence of functions $x_n(t)$ is uniformly bounded and that it converges to $x(t)$ pointwise. The Lebesgue dominated convergence theorem implies (3.16).
A similar criterion of weak convergence holds in $L_p$ spaces. However, it does not make sense to consider the values of functions $x \in L_p$ at individual points. Instead, we shall consider integrals of $x(t)$ over short intervals.
Theorem 3.5.11 (Weak convergence in $L_p$). Let $p \in (1, \infty)$. A sequence $x_k \xrightarrow{w} x$ in $L_p[0,1]$ if and only if the sequence $(x_k)$ is bounded in $L_p$ and
\[ \int_a^b x_k(t) \, dt \to \int_a^b x(t) \, dt \quad \text{for every interval } [a, b] \subseteq [0, 1]. \]
Proof. One notices that the set of characteristic functions $1_{[a,b]}(t)$ for $[a, b] \subseteq [0, 1]$ spans the set of step functions, which is dense in $(L_p)^* = L_{p'}$. (Why?) The argument is finished similarly to Theorem 3.5.7.
Remark 3.5.12. The same criterion holds for $L_p(\mathbb{R})$. (Why?)
3.5.3. Weak topology. Now we broaden the picture and study the weak topology on $X$ which defines weak convergence. This way, in addition to weak convergence, we will be able to study other weak properties, such as weak boundedness, weak compactness and so on.
Definition 3.5.16 (Weak topology). The weak topology on a normed space $X$ is defined as the weakest topology in which all maps $f \in X^*$ (i.e. $f : X \to \mathbb{R}$) are continuous.
Equivalently, the base of the weak topology is given by the cylinders, which are the sets of the form
\[ \{ x \in X : |f_k(x - x_0)| < \varepsilon, \ k = 1, \ldots, N \} \]
where $x_0 \in X$, $f_k \in X^*$, $\varepsilon > 0$, and $N \in \mathbb{N}$. So, these cylinders form a local base of
tx P X : f pxq au
} }8 8
Remark 3.5.22. Convexity assumption is critical in Proposition 3.5.21. Otherwise the result would claim that the weak and strong topologies are equivalent,
which is false.
3.6. Weak* topology. Banach-Alaoglu theorem
On $X^*$, there are two natural weaker topologies. The weak topology that we already considered makes all functionals in $X^{**}$ continuous functions on $X^*$. The other topology, called the weak* topology, is only concerned with continuity of functionals that come from $X \subseteq X^{**}$.
3.6.1. Weak* convergence.
Definition 3.6.1 (Weak* convergence). Let $X$ be a normed space. A sequence of functionals $(f_k)$ in $X^*$ weak* converges to a functional $f \in X^*$ if
\[ f_k(x) \to f(x) \quad \text{for every } x \in X. \]
The weak* convergence is denoted $f_k \xrightarrow{w^*} f$.
\[ \int f \, d\mu_n \to \int f \, d\mu \quad \text{for every } f \in C(\mathbb{R}). \]
Assume that the measures $\mu_n$ and $\mu$ are compactly supported, say on an interval $[a, b]$. By the representation theorem for $(C[a,b])^*$, Theorem 2.2.8, this convergence is nothing different from
\[ \mu_n \xrightarrow{w^*} \mu \quad \text{in } (C[a,b])^*. \]
Summarizing, the weak convergence of measures in probability theory is actually the weak* convergence of measures acting as linear functionals on $C[a,b]$.
Example 3.6.3 (Dirac delta function). Recall that we understand the Dirac delta function $\delta(t)$ as the point evaluation functional at zero, see Example 2.1.3. Equivalently, the Dirac delta function may be identified with the probability measure on $\mathbb{R}$ with the only atom at the origin. Therefore the Dirac delta function is the weak* limit of the uniform measures on $[-\frac{1}{n}, \frac{1}{n}]$ as $n \to \infty$.
This gives a natural way to approximate the Dirac delta function $\delta(t)$ (which does not exist as a function on $\mathbb{R}$) by genuine functions $\delta_n(t)$, which are the probability density functions of the uniform measures on $[-\frac{1}{n}, \frac{1}{n}]$, see the picture.
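This approximation can be tested numerically on a single continuous function. The sketch below integrates a test function against the uniform probability measure on $[-\frac{1}{n}, \frac{1}{n}]$ (density $n/2$ on that interval) and compares the result with the point evaluation at zero. The function name, the midpoint rule and the choice of the cosine as a test function are assumptions made for this illustration.

```python
import math


def uniform_average(f, n, steps=2000):
    """Integral of f against the uniform probability measure on [-1/n, 1/n].

    Uses the midpoint rule; the density of the measure is n / 2 on the interval.
    """
    a = -1.0 / n
    h = (2.0 / n) / steps
    total = sum(f(a + (j + 0.5) * h) for j in range(steps))
    return total * h * (n / 2.0)


if __name__ == "__main__":
    # The averages approach f(0) = cos(0) = 1 as n grows.
    for n in (1, 10, 100, 1000):
        print(n, uniform_average(math.cos, n))
```

Convergence for every continuous test function, not just this one, is what the weak* convergence statement asserts; the code only samples one instance of it.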
\[ \{ f \in X^* : |(f - f_0)(x_k)| < \varepsilon, \ k = 1, \ldots, N \} \]
where $f_0 \in X^*$, $x_k \in X$, $\varepsilon > 0$, and $N \in \mathbb{N}$. So, these cylinders form a local base of the weak* topology at $f_0$.
P X is compact
Proof of the Banach-Alaoglu theorem. We shall embed $B_{X^*}$ into the product space of intervals
\[ K := \prod_{x \in X} [-\|x\|, \|x\|]. \]
f :X
\[ B_{X^*} = \bigcap_{x, y \in X, \ a, b \in \mathbb{R}} B_{x,y,a,b}, \quad \text{where} \quad B_{x,y,a,b} = \{ f \in K : f(ax + by) - a f(x) - b f(y) = 0 \}. \]
Each set $B_{x,y,a,b}$ is the preimage of the closed set $\{0\}$ under the map $f \mapsto f(ax + by) - a f(x) - b f(y)$ which, as we know, is continuous in the product topology.$^9$ Therefore all sets $B_{x,y,a,b}$ are weak* closed, and so is their intersection $B_{X^*}$. This completes the proof.
xpf q : f pxq,
P K.
\[ \|x\|_{C(K)} = \max_{f \in K} |f(x)| = \max_{f \in B_{X^*}} |f(x)| = \|x\|_X, \]
where the last identity uses a consequence of the Hahn-Banach theorem, Corollary 2.3.10.
Exercise 3.6.7. [Universality of $\ell_\infty$] Show that $\ell_\infty$ is a universal space for all separable Banach spaces. In other words, show that every separable Banach space $X$ isometrically embeds into $\ell_\infty$.
Hint: Consider a dense subset $(x_k)_{k=1}^\infty$ of $S_X$, choose supporting functionals $f_k \in S_{X^*}$ of $x_k$, and define the embedding $X \to \ell_\infty$ by $x \mapsto (f_k(x))_{k=1}^\infty$.
9 Recall that the point evaluation maps are continuous in the product topology.
10A little disclaimer is that the compact topological space K may depend on X; otherwise
CHAPTER 4
pT f qptq
1
0
|t1 t2 |
implies
8.
(We can do this by continuity of the kernel $k(t, s)$.) Now, for every $f \in B_{C[0,1]}$, we obtain by the triangle inequality that
\[ |(Tf)(t_1) - (Tf)(t_2)| \le \int_0^1 |k(t_1, s) - k(t_2, s)| \, |f(s)| \, ds \le \varepsilon \]
as $|f(s)| \le 1$ for all $s$. This shows that the set $K$ is equicontinuous, and therefore precompact.
Exercise 4.1.5. Show that Volterra operator (2.12) is compact on
C r0, 1s, even though its kernel is discontinuous. See Exercise 4.1.19 for a
more general result.
4.1.2. Basic properties of compact operators.
Proposition 4.1.6 (Properties of $K(X, Y)$). (i) The set of compact operators $K(X, Y)$ is a closed linear subspace of $L(X, Y)$.
(ii) $K(X, Y)$ is an operator ideal. This means that if $T \in K(X, Y)$ then the compositions $ST$ and $TS$ are both compact for every bounded linear operator $S$.
Proof. (i) Linearity follows from the observation that the Minkowski sum of two precompact sets is precompact (see exercise below).
Closedness. Consider a sequence $T_n \in K(X, Y)$ such that $T_n \to T$ in $L(X, Y)$; we want to prove that $T \in K(X, Y)$. Let $\varepsilon > 0$ and choose $n \in \mathbb{N}$ such that $\|T_n - T\| \le \varepsilon$. This means that
Since $\varepsilon$ is arbitrary,
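\[ \sum_{k=1}^\infty \|T x_k\|^2 < \infty. \]
The quantity
\[ \|T\|_{HS} := \Big( \sum_{k=1}^\infty \|T x_k\|^2 \Big)^{1/2} \]
\[ \sum_k \|T x_k\|^2 = \sum_{k,j} |\langle T x_k, x_j \rangle|^2 = \sum_{k,j} |\langle x_k, T^* x_j \rangle|^2 = \sum_j \|T^* x_j\|^2. \]
\[ \sum_j \|T^* x_j\|^2 = \sum_{j,k} |\langle x'_k, T^* x_j \rangle|^2 = \sum_k \|T x'_k\|^2. \]
\[ \|T^*\|_{HS} = \|T\|_{HS}. \]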
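The independence of the quantity $\sum_k \|T x_k\|^2$ from the choice of the orthonormal basis can be checked on a toy example. The sketch below works in $\mathbb{R}^2$ with a $2 \times 2$ matrix and with orthonormal bases obtained by rotating the standard basis; the helper names and the specific matrix are assumptions of this illustration.

```python
import math


def apply(T, x):
    """Matrix-vector product T x for a 2x2 matrix given as nested lists."""
    return [T[0][0] * x[0] + T[0][1] * x[1],
            T[1][0] * x[0] + T[1][1] * x[1]]


def norm_sq(v):
    """Squared Euclidean norm of a vector."""
    return sum(c * c for c in v)


def rotated_basis(theta):
    """Orthonormal basis of R^2 obtained by rotating the standard basis by theta."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]


def hs_norm(T, basis):
    """Square root of the sum of ||T x_k||^2 over the given orthonormal basis."""
    return math.sqrt(sum(norm_sq(apply(T, x)) for x in basis))


if __name__ == "__main__":
    T = [[1.0, 2.0], [3.0, 4.0]]
    # The value does not change with the rotation angle (up to rounding).
    for theta in (0.0, 0.3, 1.1):
        print(theta, hs_norm(T, rotated_basis(theta)))
```

In finite dimensions this quantity is the familiar Frobenius norm of the matrix, i.e. the square root of the sum of the squared entries.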
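\[ \|T x\| = \Big\| \sum_k a_k T x_k \Big\| \le \sum_k |a_k| \, \|T x_k\| \le \Big( \sum_k |a_k|^2 \Big)^{1/2} \Big( \sum_k \|T x_k\|^2 \Big)^{1/2} = \|x\| \, \|T\|_{HS}. \]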
}
T xk 22
k
1
1
k
|xKt , xk y|2 dt
|pT xk qptq|
dt
1
k
|xKt , xk y|2 dt
P K pX, Y q
Proof. Given f
(4.2)
x BX
x BX
y K
(making some selection of $f$; it does not matter in which way). Then identity (4.2) implies that
\[ \|T^* f\|_{X^*} = \|f|_K\|_{C(K)} \quad \text{for every } f \in Y^*, \]
which shows that $U$ is an isometric (thus homeomorphic) embedding.
Now, $U(G)$ is (uniformly) bounded in $C(K)$ as
\[ \|T^* f\|_{X^*} \le \|T\| \, \|f\|_{Y^*} \le \|T\| \quad \text{for every } f \in B_{Y^*}. \]
Moreover, $U(G)$ is equicontinuous. Indeed, for every $f \in B_{Y^*}$ and for $y_1, y_2 \in K$ we have
f
Remark 4.1.18 (For future). Consider proving the reverse direction in Schauder's theorem. Also consider proving that compact operators map weakly Cauchy sequences to strongly convergent ones.
Fredholm theory studies operators of the form "identity plus compact". They are conveniently put in the form $I - T$ where $I$ is the identity operator on some Banach space $X$ and $T \in K(X, X)$.
Fredholm theory is motivated by two applications. One is for solving linear equations $\lambda x - T x = b$, and in particular integral equations ($T$ being an integral operator). Another related application is in spectral theory, where the spectrum of $T$ consists of numbers $\lambda$ for which the operator $\lambda I - T$ is not invertible. We will discuss both applications in detail later.
4.2.1. Closed image.
I
T
P K pX, X q.
Then operator
but Axk
Arxk s 0.
Proof. Necessity. Assume that $A$ is injective but not surjective. Consider the subspaces of $X$
\[ Y_n := \mathrm{Im}(A^n), \quad n = 0, 1, \ldots \]
Then
\[ Y_0 \supset Y_1 \supset Y_2 \supset \cdots \]
is a chain of proper inclusions. Indeed, the first inclusion $X \supset \mathrm{Im}(A)$ is proper by assumption; the claim follows by induction. (Check this!)
Furthermore, $Y_n$ are closed subspaces of $X$. Indeed, by Newton's binomial expansion we see that $A^n = (I - T)^n$ has the form $A^n = I - T_1$ for some compact operator $T_1$, so the claim follows from Theorem 4.2.1.
By the Hahn-Banach theorem (see Exercise 2.3.33) we can find functionals
\[ f_n \in Y_n^*, \quad \|f_n\| = 1, \quad \text{such that} \quad f_n \in Y_{n+1}^\perp. \]
So
dn,m
sup
x BYn
|xT fn T fm , xy|
sup
x BYn
T q pfn fm q
fm fn }.
sup
x BYn
|xfn , xy| 1
which we proved in Proposition 2.4.32 and Exercise 2.4.34. So, assume that $A = I - T$ is surjective. Then $A^* = I - T^*$ is injective by (4.3). Since $T^*$ is compact by Schauder's theorem, the first part of the proof gives that $A^*$ is surjective. This implies that $A$ is injective by (4.3). The proof is complete.
Remark 4.2.3 (Compactness is essential). Fredholm alternative does not hold for non-compact operators in general. For example, the right shift operator in $\ell_2$ is injective but not surjective; the left shift operator in $\ell_2$ is surjective but not injective.
The name "Fredholm alternative" is explained by the following application to solving linear equations of the form
\[ \lambda x - T x = b \]
where $T \in K(X, X)$, $\lambda \in \mathbb{C}$, $b \in X$. One is interested in existence and uniqueness of the solution. Theorem 4.2.2 states that exactly one of the following statements holds for every $\lambda \ne 0$:
either the homogeneous equation $\lambda x - T x = 0$ has a nontrivial solution,
or the inhomogeneous equation $\lambda x - T x = b$ has a solution for every $b$; this solution is automatically unique.
1
0
1
0
codim Im A codim Im A .
Studying linear operators through their spectral properties is a powerful approach in analysis and mathematical physics. Recall from linear algebra that the spectrum of a linear operator $T$ acting on $\mathbb{C}^n$ consists of the eigenvalues of $T$, which are the numbers $\lambda \in \mathbb{C}$ such that $T x = \lambda x$ for some nonzero vector $x \in \mathbb{C}^n$; such $x$ are called the eigenvectors of $T$. Eigenvalues always exist by the fundamental theorem of algebra, as they are the roots of the characteristic equation $\det(T - \lambda I) = 0$. There are at most $n$ eigenvalues of $T$, or one can say exactly $n$ counting multiplicities. Eigenvectors corresponding to different eigenvalues are linearly independent.$^2$
4.3.1. Examples and definition of spectrum. In infinite-dimensional normed spaces, the spectrum is a richer concept than in finite-dimensional spaces. Let us illustrate the difference on two examples.
Example 4.3.1 (Uncountable number of eigenvalues). Consider the differential operator
\[ T = \frac{d}{dt} \]
acting, for example, on $C^1(\mathbb{C})$. To compute the spectrum of $T$, we solve the ordinary differential equation $u' = \lambda u$. The solution has the form
\[ u(t) = C e^{\lambda t}. \]
Therefore, every $\lambda \in \mathbb{C}$ is an eigenvalue of $T$.
\[ (Tf)(t) = t f(t). \]
0. Therefore, T
has no eigenvalues.
1
than the multiplicity of that root. This happens, for example, for the Jordan block T
.
0
An orthonormal basis of eigenvectors exists if and only if $T$ is normal, i.e. $T^* T = T T^*$.
3 In the future, we will often say "invertible" instead of "invertible as a bounded linear operator".
4.3.2. Classification of spectrum. For operators $T$ acting on a finite dimensional space, the spectrum consists of eigenvalues of $T$. In infinite dimensions, this is not true, as there are various reasons why $T - \lambda I$ may be non-invertible. These reasons are listed in the following definition:
Definition 4.3.4 (Classification of spectrum). Let $X$ be a normed space and $T \in L(X, X)$.
(i) The point spectrum $\sigma_p(T)$ is the set of all eigenvalues of $T$, i.e. the numbers $\lambda \in \mathbb{C}$ satisfying
\[ \ker(T - \lambda I) \ne 0. \]
(iii)
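\[ T\big( (x_k)_{k=1}^\infty \big) = (\lambda_k x_k)_{k=1}^\infty. \]
As $(T - \lambda I) x = ((\lambda_k - \lambda) x_k)_{k=1}^\infty$, we have $(T - \lambda I)^{-1} y = \big( \frac{y_k}{\lambda_k - \lambda} \big)_{k=1}^\infty$. It follows that $(T - \lambda I)^{-1}$ is a bounded operator if and only if $\lambda$ is not in the closure of $\{\lambda_k\}_{k=1}^\infty$, which is $\{\lambda_k\}_{k=1}^\infty \cup \{0\}$.
All $\lambda_k$ are clearly the eigenvalues of $T$ as $T e_k = \lambda_k e_k$ for the canonical basis $(e_k)$ of $\ell_2$. $0$ is not an eigenvalue since $T$ is injective (as all $\lambda_k \ne 0$). So $0$ is either in the continuous or in the residual spectrum. Now, $\mathrm{Im}\, T$ is dense in $\ell_2$ (why?), so $0$ is in the continuous spectrum. Our conclusion is:
\[ \sigma_p(T) = \{\lambda_k\}_{k=1}^\infty, \quad \sigma_c(T) = \{0\}, \quad \sigma_r(T) = \emptyset. \]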
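\[ (Tf)(t) = t f(t). \]
As $(T - \lambda I) f(t) = (t - \lambda) f(t)$, we have
\[ (T - \lambda I)^{-1} y(t) = (t - \lambda)^{-1} y(t). \tag{4.4} \]
If $\lambda \notin [0, 1]$ then the function $(t - \lambda)^{-1}$ is bounded, thus $(T - \lambda I)^{-1}$ is a bounded operator. Therefore such $\lambda$ are regular points. Conversely, if $\lambda \in [0, 1]$ then $(t - \lambda)^{-1} \notin L_2[0, 1]$ because of the non-integrable singularity at $t = \lambda$. Hence $T - \lambda I$ is not invertible (take $y(t) = 1$). Hence all such $\lambda$ are spectrum points. Therefore, $\sigma(T) = [0, 1]$.
As we noticed in Example 4.3.2, $T$ has no eigenvalues. It follows from (4.4) that $\mathrm{Im}(T - \lambda I)$ is dense in $L_2[0, 1]$. (Check!) Our conclusion is:
\[ \sigma_p(T) = \emptyset, \quad \sigma_c(T) = [0, 1], \quad \sigma_r(T) = \emptyset. \]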
Remark 4.3.7. If the Dirac delta function $\delta(t)$ were a genuine function in $L_2$, then its translates $\delta_\lambda(t) := \delta(t - \lambda)$ would be the eigenvectors of the multiplication operator on $L_2$:
\[ T \delta_\lambda = \lambda \delta_\lambda, \]
and $\delta_\lambda$ would be the eigenfunctions of $T$. The situation would be similar to the discrete multiplication operator from Example 4.3.5.
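Example 4.3.8 (Shift operator). Consider the right and left shift operators on $\ell_2$, acting on a vector $x = (x_1, x_2, \ldots)$ as
\[ R(x) = (0, x_1, x_2, \ldots), \qquad L(x) = (x_2, x_3, \ldots). \]
Their spectra are
\[ \sigma_p(R) = \emptyset, \quad \sigma_c(R) = \{\lambda \in \mathbb{C} : |\lambda| = 1\}, \quad \sigma_r(R) = \{\lambda \in \mathbb{C} : |\lambda| < 1\}; \]
\[ \sigma_p(L) = \{\lambda \in \mathbb{C} : |\lambda| < 1\}, \quad \sigma_c(L) = \{\lambda \in \mathbb{C} : |\lambda| = 1\}, \quad \sigma_r(L) = \emptyset. \]
Exercise 4.3.9. Prove the claims about the spectra of shift operators made in Example 4.3.8.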
4.4. Properties of spectrum. Spectrum of compact operators.
Throughout this section, $X$ denotes a Banach space and $T \in L(X, X)$.
4.4.1. Resolvent operator. Spectrum is bounded. Studying the spectrum of $T$ is convenient via the so-called resolvent operator:
Definition 4.4.1 (Resolvent operator). To each regular point $\lambda \in \rho(T)$ we associate the operator
\[ R(\lambda) = (T - \lambda I)^{-1}. \]
$R(\lambda)$ is called the resolvent operator of $T$. So the resolvent is a function $R : \rho(T) \to L(X, X)$.
The resolvent operator can be computed in terms of series expansion involving $T$. This technique is based on the following simple lemma:
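Lemma 4.4.2 (Von Neumann). Consider an operator $S \in L(X, X)$ such that $\|S\| < 1$. Then $I - S$ is invertible, and its inverse can be expressed as a convergent series in $L(X, X)$:
\[ (I - S)^{-1} = \sum_{k=0}^\infty S^k, \qquad \|(I - S)^{-1}\| \le \frac{1}{1 - \|S\|}. \]
Proof. The series $\sum_{k=0}^\infty S^k$ converges absolutely because $\|S^k\| \le \|S\|^k$ while $\sum_{k=0}^\infty \|S\|^k = \frac{1}{1 - \|S\|}$. Furthermore,
\[ (I - S) \sum_{k=0}^\infty S^k = \sum_{k=0}^\infty S^k (I - S) = I, \]
since both products telescope.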
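The lemma can be illustrated on a small matrix. The sketch below assumes the space $\mathbb{R}^2$ equipped with the max-norm, for which the operator norm of a matrix is its largest absolute row sum; it sums a truncated Neumann series and compares it with the inverse of $I - S$. The matrix $S$, the number of terms and all helper names are choices made for this illustration.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(A, B):
    """Entrywise sum of two square matrices."""
    n = len(A)
    return [[A[i][j] + B[i][j] for j in range(n)] for i in range(n)]


def identity(n):
    """The n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def op_norm_inf(A):
    """Operator norm induced by the max-norm on R^n: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def neumann_sum(S, terms):
    """Partial sum I + S + ... + S^(terms - 1) of the Neumann series."""
    n = len(S)
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = mat_mul(power, S)
        total = mat_add(total, power)
    return total


if __name__ == "__main__":
    S = [[0.2, 0.3], [0.1, 0.4]]          # operator norm 0.5 < 1
    N = neumann_sum(S, 80)
    I_minus_S = [[0.8, -0.3], [-0.1, 0.6]]
    print(mat_mul(I_minus_S, N))          # close to the identity matrix
    print(op_norm_inf(N), 1.0 / (1.0 - op_norm_inf(S)))
```

The truncation error after $m$ terms is at most $\|S\|^m / (1 - \|S\|)$, which is why a moderate number of terms already reproduces the inverse to machine precision here.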
pI 1 T q
k1 T k ,
k 1
\[ \|R(\lambda)\| \le \frac{1}{|\lambda| - \|T\|}. \]
\[ R(\mu) = \big[ I - (\mu - \lambda) R(\lambda) \big]^{-1} R(\lambda). \tag{4.5} \]
Corollary 4.4.6. The regular set $\rho(T)$ is an open set. Equivalently, the spectrum $\sigma(T)$ is a closed set.
, the right hand side of (4.5) defines a bounded linear operator. One can check that in this case identity (4.5) holds (do this!) and therefore $\mu \in \rho(T)$.
4.4.3. Resolvent is an analytic function. Spectrum is nonempty. The proof of Corollary 4.4.6 gives us a bit more information about the resolvent than we have noticed. Let us go back to identity (4.5) and write the series expansion of the inverse of $I - (\mu - \lambda) R(\lambda)$ according to von Neumann's lemma. We immediately obtain:
Corollary 4.4.7 (Resolvent expansion). The resolvent $R(\lambda)$ is an analytic operator-valued function on its domain $\rho(T)$. Specifically, $R(\mu)$ can be expressed as a convergent power series in a small neighborhood of any point $\lambda \in \rho(T)$:
\[ R(\mu) = \sum_{k=1}^\infty (\mu - \lambda)^{k-1} R(\lambda)^k. \tag{4.6} \]
Remark 4.4.8. It follows that for every functional $f \in L(X, X)^*$, the function $f(R(\lambda))$ is a usual (i.e. complex-valued) analytic function on $\rho(T)$.
Theorem 4.4.9. The spectrum $\sigma(T)$ is a nonempty set.
Proof. We shall deduce this result from Liouville's theorem in complex analysis.$^4$ To this end, assume that $\sigma(T) = \emptyset$, hence $\rho(T) = \mathbb{C}$ and the resolvent $R(\lambda)$ is an entire function (i.e. analytic on the whole complex plane).
Claim. $R(\lambda)$ is also a bounded function on $\mathbb{C}$ with $R(\lambda) \to 0$ as $\lambda \to \infty$.
Indeed, by Proposition 4.4.4, $R(\lambda)$ is bounded in the region $|\lambda| \ge 2\|T\|$ and vanishes at infinity. Since $R(\lambda)$ is a continuous function by Corollary 4.4.7, $R(\lambda)$ is also bounded in the disc $|\lambda| \le 2\|T\|$.
Claim. By Liouville's theorem, $R(\lambda) = 0$ everywhere.
Indeed, we fix a functional $f \in L(X, X)^*$ and apply the usual Liouville's theorem for the bounded entire function $f(R(\lambda))$. It follows that $f(R(\lambda))$ is constant, and since it must vanish at infinity it is zero everywhere. The claim follows.
The last claim contradicts the fact that $R(\lambda)$ is an invertible operator.
Summarizing our findings, we can state that the spectrum of every bounded linear operator is a nonempty compact subset of $\mathbb{C}$.
P LpX, X q is defined
rpT q max || : P pT q .
P LpX, X q acting
inf
}T n }1{n .
n
Exercise 4.4.12. Clearly $r(T) \le \|T^n\|^{1/n} \le \|T\|$, so Gelfand's formula is an improvement upon Proposition 4.4.3. Give an example where $r(T) < \|T\|$.
n
C
is constant everywhere.
5 This is a partial case of the spectral mapping theorem we will study later.
6 Specifically, we shall use the following theorem of complex analysis. Consider a Laurent series
\[ f(z) = \sum_{k=-\infty}^{\infty} a_k (z - z_0)^k. \]
(4.7)
k1 f pT k q
k 1
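\[ \sup_n |\lambda|^{-n-1} \|T^n\| =: K < \infty. \]
Taking the $n$-th root and rearranging the terms, we obtain $\|T^n\|^{1/n} \le K^{1/n} |\lambda|^{1 + 1/n}$ for all $n$. It follows that $\limsup_n \|T^n\|^{1/n} \le |\lambda|$. Since this happens for all $\lambda$ such that $|\lambda| > r(T)$, we have proved that
\[ \limsup_n \|T^n\|^{1/n} \le r(T). \]
So, putting this together with the upper bound, we have proved that
\[ r(T) \le \inf_n \|T^n\|^{1/n} \le \limsup_n \|T^n\|^{1/n} \le r(T). \]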
and diverges outside the closure of $A$. Moreover, there exists at least one point on the inner boundary $\{z \in \mathbb{C} : |z - z_0| = r\}$ of $A$ and at least one point on the outer boundary $\{z \in \mathbb{C} : |z - z_0| = R\}$ of $A$ such that $f(z)$ cannot be analytically continued to those points.
Proof. Clearly, the second and third claims of the theorem follow from the first one (why?). So, assume the contrary, that there exist $\varepsilon > 0$ and an infinite sequence of linearly independent vectors $(x_k)_{k=1}^\infty$ such that
\[ T x_k = \lambda_k x_k, \quad \text{where } |\lambda_k| \ge \varepsilon. \]
Consider the subspaces $E_n = \mathrm{Span}(x_k)_{k=1}^n$; then $E_1 \subset E_2 \subset \cdots$ is a sequence of
pnq
a k xk
apnnq xn
un1 ,
where un1
P En1 .
k 1
Then
n apnnq xn
T un1 P En1 .
Now we are ready to estimate }T yn T ym } for n m. Since T ym P Em En1 ,
T yn
vn1 ,
where vn1
we obtain
The proof
\[ \|U x\| = \|x\| \quad \text{for all } x \in H. \]
(i) operators on $\mathbb{C}^n$ and $\mathbb{R}^n$ given by $n \times n$ unitary complex matrices and orthogonal real matrices; in particular rotations, symmetries, and permutations of coordinates in $\mathbb{C}^n$ and $\mathbb{R}^n$;
(ii) right shift $R$ on $\ell_2$ (but not left; why?);
(iii) an isometry between any pair of separable Hilbert spaces established in Theorem 1.6.30.
Remark 4.4.19. A unitary operator $U$ preserves all pairwise distances, i.e.
\[ \|U x - U y\| = \|x - y\|. \]
Moreover, by the polarization identity 1.4.19, $U$ also preserves the inner products:
\[ \langle U x, U y \rangle = \langle x, y \rangle \quad \text{for all } x, y \in H. \]
4.4.7. Additional exercises. In the following two exercises, one can work over $\mathbb{R}$. Similar results hold over $\mathbb{C}$. The only difference is that for Hilbert spaces, one has to take complex conjugation in appropriate places (which ones?), see Remark 2.4.26.
Exercise 4.4.23. [Spectrum of adjoint I] Let $T \in L(X, X)$. Prove that $\sigma(T^*) = \overline{\sigma(T)}$. Here the bar stands for complex conjugation rather than for closure.
Exercise 4.4.24. [Spectrum of adjoint II] Let T P LpX, X q
(i) Prove that if P p pT q and R p pT q then P r pT q. (Hint: use
the duality relations from Proposition 2.4.32 and Exercise 2.4.34
for the operator T I.)
(ii) Prove that
r pT q p pT q r pT q Y p pT q.
Deduce that if X is reflexive, then r pT q p pT q. Deduce that
self-adjoint bounded linear operators in Hilbert space do not have
residual spectrum.
Exercise 4.4.25. [General multiplication operator on L2 ] Consider a
general multiplication operator T acting on L2 r0, 1s as
where g
CHAPTER 5
Throughout this chapter, H will denote a Hilbert space, and we will study
bounded self-adjoint operators T on H.
5.1. Spectrum of self-adjoint operators
5.1.1. Definition and examples. Let $T$ be a bounded linear operator on a Hilbert space, i.e. $T \in L(H, H)$. Recall from Section 2.4.9 that the adjoint operator $T^* \in L(H, H)$ is defined by $\langle T^* x, y \rangle = \langle x, T y \rangle$ for $x, y \in H$.
Definition 5.1.1. An operator $T \in L(H, H)$ is called self-adjoint if $T^* = T$, i.e.
\[ \langle T x, y \rangle = \langle x, T y \rangle, \quad x, y \in H. \]
iS
y xx, T xy xT x, xy.
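\[ \langle T x, y \rangle = \frac{1}{4} \big[ f(x + y) - f(x - y) + i f(x + iy) - i f(x - iy) \big] \]
\[ \|T\| = \sup_{x \in S_H} |\langle T x, x \rangle|. \]
\[ \|T\| = \sup_{x \in S_H} \|T x\| = \sup_{x, y \in S_H} |\langle T x, y \rangle| \ge \sup_{x \in S_H} |\langle T x, x \rangle| =: M. \]
It remains to show that the inequality here is actually the identity. To this end, we note that
\[ \sup_{x, y \in S_H} |\langle T x, y \rangle| = \sup_{x, y \in S_H} \mathrm{Re} \langle T x, y \rangle \]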
P LpH, H q be a self-adjoint
Applying this result for the operator $T - \lambda I$, we immediately obtain
Corollary 5.1.8 (Criterion of spectrum points). Let $T \in L(H, H)$ be a self-adjoint operator. Then $\lambda \in \sigma(T)$ if and only if the operator $T - \lambda I$ is not bounded below.
x SH
sup xT x, xy.
x SH
P pT q.
(i) Let $\lambda \in \mathbb{C} \setminus [m, M]$; since the interval is closed we have
\[ d := \mathrm{dist}(\lambda, [m, M]) > 0. \]
Pp q
P LpH, H q
such
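Proof. If $T x_1 = \lambda_1 x_1$ and $T x_2 = \lambda_2 x_2$ then
\[ \lambda_1 \langle x_1, x_2 \rangle = \langle T x_1, x_2 \rangle = \langle x_1, T x_2 \rangle = \lambda_2 \langle x_1, x_2 \rangle. \]
(In the last identity we used that $\lambda_2$ is always real, so there is no conjugation.) It follows that if $\lambda_1 \ne \lambda_2$ then $\langle x_1, x_2 \rangle = 0$ as claimed.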
Definition 5.2.2 (Invariant subspace). A subspace $E$ of $H$ is called an invariant subspace of $T$ if $T(E) \subseteq E$.
Example 5.2.3. Every eigenspace of $T$ is invariant. More generally, the linear span of any subset of eigenvectors of $T$ is an invariant subspace.
One of the most well known open problems in functional analysis is the invariant subspace problem. It asks whether every operator $T \in L(H, H)$ has a proper invariant subspace (i.e. different from $\{0\}$ and $H$).
Proposition 5.2.4. Let $T \in L(H, H)$ be self-adjoint. If $E \subseteq H$ is an invariant subspace of $T$ then $E^\perp$ is also an invariant subspace of $T$.
PE
5.2.3. Diagonalization. Spectral Theorem 5.2.5 allows us to always represent compact self-adjoint operators $T \in L(H, H)$ in a diagonal form, similarly to the one for Hermitian matrices.
Let $(\varphi_k)$ be an orthonormal basis of eigenvectors of $T$. Then $T \varphi_k = \lambda_k \varphi_k$ where $\lambda_k$ are the eigenvalues. We can identify the space $H$ with $\ell_2$ by identifying $(\varphi_k)$ with the canonical basis $(e_k)$ of $\ell_2$ (recall Section 1.6.7). With this identification, $T$ becomes a multiplication operator acting on $\ell_2$ as $T e_k = \lambda_k e_k$; equivalently
\[ T\big( (x_k)_{k=1}^\infty \big) = (\lambda_k x_k)_{k=1}^\infty. \]
We see that $T$ now has a quite simple form, which we studied in Example 4.3.5.
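In finite dimensions this diagonal form can be verified directly. The following sketch treats a real symmetric $2 \times 2$ matrix $\begin{pmatrix} a & b \\ b & d \end{pmatrix}$, computes its eigenvalues and an orthonormal basis of eigenvectors in closed form, and then applies the operator through the formula $x \mapsto \sum_k \lambda_k \langle x, \varphi_k \rangle \varphi_k$. The function names and the restriction to the $2 \times 2$ real case are assumptions of this illustration.

```python
import math


def eigen_symmetric_2x2(a, b, d):
    """Eigenvalues and orthonormal eigenvectors of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    eigenvalues = [mean + radius, mean - radius]
    if b == 0.0:
        # Already diagonal: the standard basis vectors are eigenvectors.
        vectors = [[1.0, 0.0], [0.0, 1.0]] if a >= d else [[0.0, 1.0], [1.0, 0.0]]
    else:
        vectors = []
        for lam in eigenvalues:
            v = [b, lam - a]  # solves (a - lam) * v0 + b * v1 = 0
            length = math.hypot(v[0], v[1])
            vectors.append([v[0] / length, v[1] / length])
    return eigenvalues, vectors


def apply_diagonal_form(eigenvalues, vectors, x):
    """Compute the sum over k of lambda_k <x, phi_k> phi_k."""
    result = [0.0, 0.0]
    for lam, phi in zip(eigenvalues, vectors):
        coeff = x[0] * phi[0] + x[1] * phi[1]
        result[0] += lam * coeff * phi[0]
        result[1] += lam * coeff * phi[1]
    return result


if __name__ == "__main__":
    a, b, d = 2.0, 1.0, 2.0
    vals, vecs = eigen_symmetric_2x2(a, b, d)
    x = [0.3, -1.7]
    direct = [a * x[0] + b * x[1], b * x[0] + d * x[1]]
    print(direct, apply_diagonal_form(vals, vecs, x))
```

The two printed vectors agree up to rounding, which is the finite-dimensional content of the identification of $T$ with a multiplication operator.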
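In the literature, one comes across various forms of the spectral Theorem 5.2.5. We mention two of them. Let as before $(\varphi_k)$ denote an orthonormal basis of eigenvectors of $T$ with corresponding eigenvalues $\lambda_k$. Orthogonal basis expansion gives
\[ x = \sum_k \langle x, \varphi_k \rangle \varphi_k, \quad x \in H. \]
Applying $T$ to both sides and using $T \varphi_k = \lambda_k \varphi_k$, we obtain that
\[ T x = \sum_n \lambda_n \langle x, \varphi_n \rangle \varphi_n, \quad x \in H. \]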
k k b k
k Pk
n 1
n n ptqn psq.
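Proof. Consider the integral operator $(Tf)(t) = \int_0^1 k(t, s) f(s) \, ds$ on $L_2[0, 1]$. Let $(\varphi_n)$ be an orthonormal basis of its eigenvectors. Then the functions
\[ \varphi_{nm}(t, s) := \varphi_n(t) \varphi_m(s), \quad n, m = 1, 2, \ldots \]
form an orthonormal basis of $L_2([0, 1]^2)$. (Check!)
Let us write the basis expansion of our function $k$ in $L_2([0, 1]^2)$:
\[ k = \sum_{n,m} \langle k, \varphi_{nm} \rangle \varphi_{nm}. \]
\[ \langle k, \varphi_{nm} \rangle = \int_0^1 \int_0^1 k(t, s) \varphi_n(t) \varphi_m(s) \, ds \, dt = \int_0^1 (T \varphi_m)(t) \varphi_n(t) \, dt = \langle T \varphi_m, \varphi_n \rangle = \lambda_m \langle \varphi_m, \varphi_n \rangle = \begin{cases} \lambda_m, & n = m \\ 0, & n \ne m. \end{cases} \]
Therefore
\[ k = \sum_n \langle k, \varphi_{nn} \rangle \varphi_{nn} = \sum_n \lambda_n \varphi_n(t) \varphi_n(s) \]
as claimed.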
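$$T = \sum_k \sigma_k \, \psi_k \otimes \phi_k, \quad \text{that is,} \quad Tx = \sum_k \sigma_k \langle x, \phi_k\rangle \psi_k.$$
The numbers $\sigma_k$ are called singular values of $T$ and the vectors $\psi_k$ and $\phi_k$ are called left (resp. right) singular vectors of $T$.
(Hint: Choose $(\phi_k)$ to be an orthonormal basis of eigenvectors of $T^*T$. Write the basis expansion of $x \in H$ and apply $T$ to both sides.)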
5.3. Positive operators. Continuous functional calculus
Lec. 38: 12/8
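Definition 5.3.1 (Positive operator). A self-adjoint operator $T \in L(H,H)$ is called positive if $\langle Tx, x\rangle \ge 0$ for all $x \in H$. In this case we write $T \ge 0$.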
Positive operators are generalizations of non-negative numbers (which correspond to operators on the one-dimensional space $\mathbb{C}^1$).
Example 5.3.2. Examples of positive operators include:
(i) $T^2$ for every self-adjoint $T \in L(H,H)$, as $\langle T^2x, x\rangle = \langle Tx, Tx\rangle \ge 0$;
(ii) Hermitian matrices with non-negative eigenvalues;
(iii) More generally, compact self-adjoint operators on $H$ with non-negative eigenvalues. (Why?)
Definition 5.3.3 (Partial order). For self-adjoint operators $S, T \in L(H,H)$, we shall say that $S \le T$ if $T - S \ge 0$.
Theorem 5.3.4 (Spectrum interval). Let $T \in L(H,H)$ be a self-adjoint operator. Let $m$ be the largest and $M$ the smallest numbers such that
$$mI \le T \le MI.$$
Then $\sigma(T) \subseteq [m, M]$ and $m, M \in \sigma(T)$.
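To see the spectrum interval concretely, here is a small numerical sketch for a $2 \times 2$ real symmetric matrix. In this setting $m$ and $M$ are the smallest and largest eigenvalues, and $mI \le T \le MI$ says that $m \le \langle Tx, x\rangle \le M$ for every unit vector $x$. The matrix entries, the random sampling and the helper names are assumptions made for illustration.

```python
import math
import random

def sym_eigs_2x2(a, b, d):
    """Eigenvalues (smallest, largest) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def quadratic_form(a, b, d, x):
    """<Tx, x> for T = [[a, b], [b, d]] and a real vector x."""
    return a * x[0] ** 2 + 2.0 * b * x[0] * x[1] + d * x[1] ** 2

# Hypothetical example matrix T = [[2, 1], [1, 2]].
a, b, d = 2.0, 1.0, 2.0
m, M = sym_eigs_2x2(a, b, d)

# Check m <= <Tx, x> <= M on randomly chosen unit vectors.
random.seed(0)
within_bounds = True
for _ in range(1000):
    theta = random.uniform(0.0, 2.0 * math.pi)
    x = [math.cos(theta), math.sin(theta)]
    q = quadratic_form(a, b, d, x)
    within_bounds = within_bounds and (m - 1e-12 <= q <= M + 1e-12)

print(m, M, within_bounds)
```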
Then T
0 if
5.3.2. Polynomials of an operator. We start to develop a functional calculus for self-adjoint operators $T \in L(H,H)$. We begin by defining polynomials of $T$, then we extend the definition to continuous functions of $T$ by approximation. Working with polynomials is straightforward, and the results of this subsection remain valid for every bounded linear operator $T$ on a general Banach space $X$.
For a polynomial $p(t) = a_0 + a_1 t + \cdots + a_n t^n$, the operator $p(T)$ is
$$p(T) = a_0 I + a_1 T + \cdots + a_n T^n.$$
For polynomials $f$, $g$ and scalars $a$, $b$ we have
$$(af + bg)(T) = a f(T) + b g(T), \qquad (fg)(T) = f(T)g(T), \qquad \overline{f}(T) = f(T)^*.$$
These properties state in other words that for a fixed $T \in L(H,H)$, the map $p \mapsto p(T)$ is a $*$-algebra homomorphism from $P[t]$ into $L(H,H)$.
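The multiplicative property $(fg)(T) = f(T)g(T)$ can be checked numerically on a small matrix. The sketch below evaluates polynomials of a matrix by Horner's rule and compares both sides. The matrix `T`, the polynomials `f` and `g`, and the helper names are hypothetical choices for this illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, T):
    """Evaluate p(T) = a0*I + a1*T + ... + an*T^n by Horner's rule.

    coeffs lists a0, a1, ..., an (lowest degree first).
    """
    n = len(T)
    result = [[0.0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = matmul(result, T)
        for i in range(n):
            result[i][i] += a
    return result

def poly_mul(p, q):
    """Coefficients of the product polynomial p*q."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Hypothetical data: a symmetric matrix and two real polynomials.
T = [[2.0, 1.0], [1.0, 2.0]]
f = [1.0, -3.0, 1.0]   # f(t) = 1 - 3t + t^2
g = [0.0, 2.0, 1.0]    # g(t) = 2t + t^2

product_then_apply = poly_of_matrix(poly_mul(f, g), T)   # (fg)(T)
apply_then_product = matmul(poly_of_matrix(f, T),
                            poly_of_matrix(g, T))        # f(T) g(T)
print(product_then_apply)
print(apply_then_product)
```

Because all entries here are small integers stored as floats, the two results agree exactly rather than only up to rounding.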
The following example may serve us as a test case for many future results.
3 In linear algebra, positive operators are called positive semidefinite.
(Check!) This example can be generalized for all compact self-adjoint operators T
on a general Hilbert space H. (Do this!)
5.3.3. Spectral mapping theorem for polynomials.
Lemma 5.3.8 (Invertibility). Let $p(t)$ be a polynomial and $T \in L(H,H)$. Then the operator $p(T)$ is invertible if and only if $p(t) \ne 0$ for all $t \in \sigma(T)$.
Theorem 5.3.9 (Spectral mapping theorem). Let $p(t)$ be a polynomial and $T \in L(H,H)$. Then$^4$
$$\sigma(p(T)) = p(\sigma(T)).$$
Proof. For every complex number $\lambda$, we have $\lambda \in \sigma(p(T))$ if and only if the operator $p(T) - \lambda I = (p - \lambda)(T)$ is not invertible. By the invertibility Lemma 5.3.8, this is equivalent to the condition that $(p - \lambda)(t) = 0$ for some $t \in \sigma(T)$, which means that $p(t) = \lambda$ for some $t \in \sigma(T)$. The latter is equivalent to $\lambda \in p(\sigma(T))$.
Using the spectral mapping theorem, one can in particular easily compute the
norms of operator polynomials:
Corollary 5.3.10 (Operator norm of polynomials). Let $p(t)$ be a polynomial and $T \in L(H,H)$ be a self-adjoint operator. Then
$$\|p(T)\| = \max_{t \in \sigma(T)} |p(t)|.$$
This result generalizes the identity $r(T) = \|T\|$ for the spectral radius of self-adjoint operators $T$ proved in Corollary 5.1.11.
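Proof. Let us apply Corollary 5.1.11 for the operator $p(T)$. Then the spectral mapping theorem yields
$$\|p(T)\| = \max_{\lambda \in \sigma(p(T))} |\lambda| = \max_{t \in \sigma(T)} |p(t)|.$$
4 $p(\sigma(T)) := \{p(t) : t \in \sigma(T)\}$.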
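The spectral mapping theorem and the norm identity of Corollary 5.3.10 can be checked on a small example. The sketch below uses a $2 \times 2$ real symmetric matrix and the polynomial $p(t) = t^2 - 3t$, both chosen arbitrarily for illustration. The operator norm is estimated by sampling unit vectors rather than read off from eigenvalues, so that the comparison is not circular. All names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sym_eigs_2x2(A):
    """Eigenvalues (ascending) of a real symmetric 2x2 matrix."""
    a, b, d = A[0][0], A[0][1], A[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean - radius, mean + radius]

def operator_norm_2x2(A, samples=3600):
    """Approximate sup |Ax| over unit vectors x by sampling a half circle."""
    best = 0.0
    for i in range(samples):
        theta = math.pi * i / samples
        c, s = math.cos(theta), math.sin(theta)
        y0 = A[0][0] * c + A[0][1] * s
        y1 = A[1][0] * c + A[1][1] * s
        best = max(best, math.hypot(y0, y1))
    return best

T = [[2.0, 1.0], [1.0, 2.0]]

def p(t):
    return t * t - 3.0 * t

# p(T) = T^2 - 3T
T2 = matmul(T, T)
pT = [[T2[i][j] - 3.0 * T[i][j] for j in range(2)] for i in range(2)]

spectrum_T = sym_eigs_2x2(T)                             # sigma(T)
spectrum_pT = sym_eigs_2x2(pT)                           # sigma(p(T))
image_of_spectrum = sorted(p(t) for t in spectrum_T)     # p(sigma(T))
norm_pT = operator_norm_2x2(pT)                          # sampled ||p(T)||
max_abs_p = max(abs(p(t)) for t in spectrum_T)           # max |p| on sigma(T)
print(spectrum_pT, image_of_spectrum, norm_pT, max_abs_p)
```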
5.3.4. Continuous functions of an operator. Let $T \in L(H,H)$ be a self-adjoint operator, and $f(t)$ be a continuous function on $\sigma(T)$. We would like to define $f(T) \in L(H,H)$. To this end, we use the Weierstrass approximation theorem,$^5$ and we find polynomials $p_n(t)$ such that
$$p_n(t) \to f(t) \quad \text{uniformly on } \sigma(T). \qquad (5.4)$$
as $n, m \to \infty$.
(ii) Since the operators $p_n(T)$ are self-adjoint, and the self-adjoint operators form a closed subset of $L(H,H)$ (Exercise 5.1.4), $f(T)$ is also self-adjoint. Furthermore, repeating the estimate in part (i), one sees that for any other approximating sequence of polynomials $q_n(t)$ one has $\|p_n(T) - q_n(T)\| \to 0$ as $n \to \infty$. It follows that the limit $f(T)$ must be the same whether one chooses $p_n(T)$ or $q_n(T)$ as an approximating sequence.
By passing to the limit in the corresponding properties for polynomials, one sees that for two continuous functions $f, g \in C(\sigma(T))$ and scalars $a$, $b$ we have
$$(af + bg)(T) = a f(T) + b g(T), \qquad (fg)(T) = f(T)g(T), \qquad \overline{f}(T) = f(T)^*.$$
(Check!) These properties state in other words that for a fixed $T \in L(H,H)$, the map $f \mapsto f(T)$ is a $*$-algebra homomorphism from $C(\sigma(T))$ into $L(H,H)$.
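The approximation scheme can be watched in action on a matrix. In the sketch below, $f(t) = e^t$ and the approximating polynomials $p_n$ are its Taylor polynomials, which converge uniformly on the (finite) spectrum. The limit is compared with the finite-dimensional formula $f(T) = \sum_k f(\lambda_k)\, \phi_k \phi_k^{\mathsf T}$, which we assume here as the matrix form of the functional calculus. The matrix, the choice of $f$ and all names are illustrative assumptions.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical symmetric matrix with hand-computed eigenpairs.
T = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(1.0, [s, -s]), (3.0, [s, s])]

def function_of_T(f):
    """Assumed matrix formula f(T) = sum_k f(lambda_k) phi_k phi_k^T."""
    return [[sum(f(lam) * phi[i] * phi[j] for lam, phi in eigenpairs)
             for j in range(2)] for i in range(2)]

def taylor_exp_of_T(n):
    """p_n(T) for the Taylor polynomial p_n(t) = sum_{k<=n} t^k / k!."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, n + 1):
        term = [[v / k for v in row] for row in matmul(term, T)]  # T^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def max_entry_diff(A, B):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

target = function_of_T(math.exp)
errors = [max_entry_diff(taylor_exp_of_T(n), target) for n in (2, 5, 10, 30)]
print(errors)
```

The errors shrink as the degree grows, mirroring the convergence $p_n(T) \to f(T)$ in operator norm.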
Example 5.3.12. Consider an invertible self-adjoint operator $T \in L(H,H)$; then $0 \notin \sigma(T)$ (for instance, if $T$ is also positive then $\sigma(T) \subseteq [m, M]$ with $m > 0$). Consider the function $f(t) = 1/t$, which is continuous on $\sigma(T)$. Then $f(T) = T^{-1}$. (Check!) In other words, we have the remarkable identity of reciprocal and inverse:
$$\frac{1}{T} = T^{-1}.$$
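A small worked instance may help. For a matrix we assume the finite-dimensional form of the functional calculus, $f(T) = \sum_k f(\lambda_k) P_k$ with $P_k$ the orthogonal projections onto the eigenspaces; the matrix below is a hypothetical example.

```latex
\[
T = \begin{pmatrix} 2 & 1\\ 1 & 2 \end{pmatrix}, \qquad
\sigma(T) = \{1, 3\}, \qquad
P_1 = \tfrac12 \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}, \qquad
P_3 = \tfrac12 \begin{pmatrix} 1 & 1\\ 1 & 1 \end{pmatrix}.
\]
\[
f(T) = f(1) P_1 + f(3) P_3 = P_1 + \tfrac13 P_3
     = \tfrac13 \begin{pmatrix} 2 & -1\\ -1 & 2 \end{pmatrix},
\qquad
T f(T) = \tfrac13 \begin{pmatrix} 3 & 0\\ 0 & 3 \end{pmatrix} = I .
\]
```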
Lec. 39: 12/10
pq
s p q
Let $T \in L(H,H)$ be a self-adjoint operator and $f, g \in C(\sigma(T))$. Then $f(t) \le g(t)$ for all $t \in \sigma(T)$ implies
$$f(T) \le g(T).$$
Exercise 5.3.17. [Further properties of continuous functions of an operator] Let $T, S \in L(H,H)$ be self-adjoint operators and $f, g \in C(\sigma(T))$. Prove that:
(i) $\overline{f}(T) = f(T)^*$;
(ii) $\|f(T)\| = \sup_{t \in \sigma(T)} |f(t)|$ (this generalizes Corollary 5.3.10, and follows by the same argument);
(iii) If operators $S$ and $T$ commute then operators $f(S)$ and $g(T)$ commute. (Check this for polynomials and pass to a limit.)
5.3.6. Square root of an operator. Consider a positive self-adjoint operator $T \in L(H,H)$. Then $\sigma(T) \subseteq [0, \infty)$. The function $f(t) = \sqrt{t}$ is continuous on $[0, \infty)$, so we can define
$$\sqrt{T} := f(T).$$
Since $(\sqrt{t})^2 = t$, the algebra homomorphism property implies that $(\sqrt{T})^2 = T$. Since $\sqrt{t} \ge 0$, Corollary 5.3.15 implies that $\sqrt{T}$ is a positive self-adjoint operator.
Summarizing, we have proved the following (except uniqueness):
Proposition 5.3.18 (Square root of an operator). For every positive self-adjoint operator $T \in L(H,H)$, there exists a unique positive self-adjoint operator $\sqrt{T} \in L(H,H)$ such that
$$(\sqrt{T})^2 = T.$$
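For a positive definite matrix the square root can be computed explicitly. The sketch below builds $\sqrt{T}$ from the eigen-decomposition, using the assumed finite-dimensional formula $f(T) = \sum_k f(\lambda_k)\, \phi_k \phi_k^{\mathsf T}$ with $f(t) = \sqrt{t}$, and then checks that it squares back to $T$ and is positive. The matrix and helper names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical positive definite matrix with hand-computed eigenpairs.
T = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(1.0, [s, -s]), (3.0, [s, s])]

def function_of_T(f):
    """Assumed matrix formula f(T) = sum_k f(lambda_k) phi_k phi_k^T."""
    return [[sum(f(lam) * phi[i] * phi[j] for lam, phi in eigenpairs)
             for j in range(2)] for i in range(2)]

sqrt_T = function_of_T(math.sqrt)
squared = matmul(sqrt_T, sqrt_T)   # should reproduce T

# A symmetric 2x2 matrix is positive when its trace and determinant
# are both non-negative.
trace = sqrt_T[0][0] + sqrt_T[1][1]
det = sqrt_T[0][0] * sqrt_T[1][1] - sqrt_T[0][1] * sqrt_T[1][0]
print(sqrt_T, squared, trace, det)
```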
T.
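$$\langle STx, x\rangle = \langle \sqrt{S}\sqrt{S}\sqrt{T}\sqrt{T}x, x\rangle = \langle \sqrt{T}\sqrt{S}\sqrt{S}\sqrt{T}x, x\rangle = \langle \sqrt{S}\sqrt{T}x, \sqrt{S}\sqrt{T}x\rangle \ge 0$$
as required.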
$$|T| := \sqrt{T^*T}, \qquad T \in L(H,H),$$
which we call the modulus of $T$. This generalizes the concept of modulus of complex numbers,
$$|z| = \sqrt{\bar{z}z}, \qquad z \in \mathbb{C}.$$
Lec. 40: 12/13
$x \in H$.
$$T = U|T|.$$
The uniqueness of $U$ follows easily: $Tx = U|T|x$ means that $U$ takes $|T|x$ to $Tx$, thus $U$ is uniquely determined on $\mathrm{Im}(|T|)$.
Theorem 5.3.23 generalizes the polar decomposition in the complex plane. The latter states that every $z \in \mathbb{C}$ can be represented as
$$z = e^{i \operatorname{Arg}(z)} |z|.$$
Here $e^{i \operatorname{Arg}(z)}$ is a unit scalar (generalized by $U$), and $|z|$ is the modulus of $z$ (generalized by $|T|$).
Remark 5.3.24. In general, $U$ cannot be extended to a bijective linear isometry on the whole space $H$. Indeed, if $T$ is the right shift on $\ell_2$ then $|T| = I$, so the polar decomposition yields $U = T$. Although $U = T$ is defined on the whole $\ell_2$, its image is not even dense in $\ell_2$, so $U$ is not bijective on $\ell_2$.
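The computation behind this remark can be written out as follows. This is a sketch using the standard formulas for the right shift and its adjoint, the left shift; $e_1$ denotes the first canonical basis vector of $\ell_2$.

```latex
\[
T(x_1, x_2, \ldots) = (0, x_1, x_2, \ldots), \qquad
T^*(x_1, x_2, \ldots) = (x_2, x_3, \ldots),
\]
\[
T^*T = I \;\Longrightarrow\; |T| = \sqrt{T^*T} = I, \qquad U = T,
\]
\[
\operatorname{Im}(T) = \{x \in \ell_2 : x_1 = 0\} \perp e_1 .
\]
```

Since the nonzero vector $e_1$ is orthogonal to the image, the image is a proper closed subspace and cannot be dense.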
However, for invertible operators T , it is possible to extend U to a bijective
isometry on the whole space:
Theorem 5.3.25 (Polar decomposition for invertible operators). For every invertible operator $T \in L(H,H)$, there exists a unique unitary operator $U \in L(H,H)$ such that
$$T = U|T|.$$
Proof. Since $T$ is invertible, $T^*$ is invertible, so $T^*T$ is invertible, and finally $|T| = \sqrt{T^*T}$ is invertible. (Why? Use Example 4.4.26.) Therefore $\mathrm{Im}(T) = \mathrm{Im}(|T|) = H$, and the claim follows from Theorem 5.3.23.
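For an invertible real $2 \times 2$ matrix the polar decomposition can be computed by hand. The sketch below forms $|T| = \sqrt{T^{\mathsf T} T}$ using a closed form for the square root of a symmetric positive definite $2 \times 2$ matrix (a consequence of the Cayley-Hamilton theorem, used here as an assumption of the illustration), then sets $U = T|T|^{-1}$ and checks that $U$ is orthogonal. The matrix `T` and all helper names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    """Transpose (the adjoint of a real matrix)."""
    return [[A[j][i] for j in range(2)] for i in range(2)]

def sqrt_spd_2x2(A):
    """Square root of a symmetric positive definite 2x2 matrix.

    Closed form: sqrt(A) = (A + sqrt(det A) I) / sqrt(tr A + 2 sqrt(det A)).
    """
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    r = math.sqrt(det)
    denom = math.sqrt(A[0][0] + A[1][1] + 2.0 * r)
    return [[(A[0][0] + r) / denom, A[0][1] / denom],
            [A[1][0] / denom, (A[1][1] + r) / denom]]

def inverse_2x2(A):
    """Inverse of an invertible 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

# Hypothetical invertible, non-symmetric matrix.
T = [[1.0, 2.0], [0.0, 1.0]]

mod_T = sqrt_spd_2x2(matmul(transpose(T), T))   # |T| = sqrt(T^* T)
U = matmul(T, inverse_2x2(mod_T))               # U = T |T|^{-1}
UtU = matmul(transpose(U), U)                   # should be the identity
recomposed = matmul(U, mod_T)                   # should reproduce T
print(U, mod_T)
```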
Remark 5.3.26. This second form of the polar decomposition holds also for all normal operators (those with $T^*T = TT^*$), compact + scalar operators, and generally for all operators for which $\dim \ker(T) = \dim \ker(T^*)$.
5.4. Borel functional calculus. Spectral theorem for self-adjoint operators
In this section, we extend functional calculus to bounded Borel functions of
operators. This is done primarily to define the spectral projections, which are
indicator functions of an operator. Once we have spectral projections, we formulate
the spectral theorem for general (not necessarily compact) self-adjoint operators.
As usual, $T \in L(H,H)$ will denote a fixed self-adjoint operator on a Hilbert space $H$. Let us fix an interval $[m, M]$ which contains the spectrum $\sigma(T)$, e.g. the spectrum interval.
5.4.1. Borel functional calculus. We consider the linear space of bounded Borel complex-valued functions on $[m, M]$; denote this space $B[m, M]$. We would like to define $f(T)$ for $f \in B[m, M]$, so that this extends the definition of $f(T)$ for continuous functions $f \in C[m, M]$. The difficulty is that Borel functions are only pointwise (but not uniform) limits of continuous functions.
Theorem 5.4.1 (Borel functional calculus). Let $T \in L(H,H)$ be a self-adjoint operator on a Hilbert space $H$. For each $f \in B[m, M]$ one can define a self-adjoint operator $f(T) \in L(H,H)$ such that
P H, define a
P C rm, M s.
$$\langle f(T)x, y\rangle = \int_m^M f(\lambda)\, d\mu_{x,y}(\lambda), \qquad f \in C[m, M].$$
$$(Bf)(x, y) := \int_m^M f(\lambda)\, d\mu_{x,y}(\lambda), \qquad f \in B[m, M].$$
(Since $\mu_{x,y}$ is a Borel measure, the integral is defined.) One quickly checks using the definition of $\mu_{x,y}$ that $(Bf)(x, y)$ is linear in $x$ and conjugate linear in $y$. So $Bf$ is what is called a sesquilinear form. Moreover, this form is bounded:
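$$\langle f_n(T)x, y\rangle = \int_m^M f_n(\lambda)\, d\mu_{x,y}(\lambda) \to \int_m^M f(\lambda)\, d\mu_{x,y}(\lambda) = \langle f(T)x, y\rangle.$$
$$\langle f(T)x, y\rangle = \int_m^M f(\lambda)\, d\mu_{x,y}(\lambda)$$
valid for every bounded Borel function $f \in B[m, M]$. In particular, the bilinear form of $T$ can be reproduced using spectral measures as
$$\langle Tx, y\rangle = \int_m^M \lambda\, d\mu_{x,y}(\lambda). \qquad (5.6)$$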
Exercise 5.4.3. Compute the spectral measures for the diagonal matrix $T = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ acting as an operator on $\mathbb{C}^n$.
Exercise 5.4.4. Let $T$ be a multiplication operator in $L_2[0,1]$ by a function $g \in L_\infty[0,1]$. Show that for $f \in B[0,1]$, the operator $f(T)$ is the multiplication operator in $L_2[0,1]$ by the function $f(g(t))$.
5.4.3. Spectral projections. Let $E$ be a Borel subset of $[m, M]$, and consider the indicator function
$$\mathbf{1}_E(t) = \begin{cases} 1, & t \in E,\\ 0, & t \notin E. \end{cases}$$
E2
then PE1
PE
and ImpE1 q
PE n ,
n 1
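$$\langle P(E)x, x\rangle = \langle P_E x, x\rangle = \int_m^M \mathbf{1}_E(\lambda)\, d\mu_{x,x}(\lambda) = \mu_{x,x}(E).$$
For this reason, the projection-valued measure $P$ itself, rather than $\mu_{x,x}$, is often called the spectral measure associated with the operator $T$.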
5.4.4. Spectral theorem for self-adjoint operators.
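Theorem 5.4.7 (Spectral theorem). Let $T$ be a self-adjoint operator on a Hilbert space $H$. Then
$$T = \int_{-\infty}^{\infty} \lambda\, dP,$$
in the sense that
$$\langle Tx, x\rangle = \int_{-\infty}^{\infty} \lambda\, \langle dP\, x, x\rangle.$$
As we noted, $\langle dP\, x, x\rangle$ is just the spectral measure $\mu_{x,x}$, so the last integral is the usual Lebesgue integral.
Proof. With this remark, Theorem 5.4.7 is a reformulation of a special case of (5.6):
$$\langle Tx, x\rangle = \int_m^M \lambda\, d\mu_{x,x}(\lambda).$$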
Theorem 5.4.7 should be compared to the spectral Theorem 5.3 for compact self-adjoint operators $T$. According to this theorem, $T$ can be decomposed into the sum
$$T = \sum_k \lambda_k P_k$$
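In finite dimensions both decompositions can be checked side by side. The sketch below builds the eigenprojections $P_k$ of a $2 \times 2$ symmetric matrix, verifies $T = \sum_k \lambda_k P_k$, and checks that $\langle Tx, x\rangle = \sum_k \lambda_k \langle P_k x, x\rangle$, where the weights $\langle P_k x, x\rangle$ play the role of the spectral measure $\mu_{x,x}$ at the point $\lambda_k$. The matrix, the vector `x` and all names are assumptions made for illustration.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(A, x):
    """Multiply a matrix by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    """Real inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))

def outer(u, v):
    """Rank-one matrix u v^T."""
    return [[ui * vj for vj in v] for ui in u]

# Hypothetical symmetric matrix with hand-computed eigenpairs.
T = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(1.0, [s, -s]), (3.0, [s, s])]

# Orthogonal projections onto the eigenspaces.
projections = [(lam, outer(phi, phi)) for lam, phi in eigenpairs]

# T = sum_k lambda_k P_k
recomposed = [[sum(lam * P[i][j] for lam, P in projections)
               for j in range(2)] for i in range(2)]

x = [0.3, -1.7]
# Weights <P_k x, x>: the mass that mu_{x,x} puts on lambda_k.
weights = [(lam, dot(matvec(P, x), x)) for lam, P in projections]

lhs = dot(matvec(T, x), x)                 # <Tx, x>
rhs = sum(lam * w for lam, w in weights)   # integral of lambda against mu_{x,x}
total_mass = sum(w for _, w in weights)    # equals |x|^2
print(lhs, rhs, total_mass)
```

The total mass of the weights equals $\|x\|^2$, reflecting that the projections sum to the identity.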
Index
Essentially bounded, 5
Extension
of bounded linear functionals, 47
of bounded linear operators, 61
Absorbing set, 52
Adjoint operator, 63
Angle between vectors, 21
Annihilator, 64
approximate, 112
Fourier
basis, 29, 80
coefficients, 31
series, 28, 31
Fourier basis, 80
Frame, 36
Fredholm
alternative, 99
integral equation, 59
theory, 98
Banach space, 15
Banach-Alaoglu theorem, 91
Banach-Mazur distance, 73
Basis
constant, 81
Hamel, 2
orthonormal, 32
Schauder, 79
Bessels inequality, 31
Biorthogonal functionals, 82
Cauchy-Schwarz inequality, 20
Closed graph theorem, 73
Codimension, 4
Compact
operator, 94
set, 83
compact operators, 106
Completion, 18
Convex
combination, 14
hull, 14
Convexity, 9
Image, 5
closed, 72, 98
Injectivization, 6
Inner product, 19
Integral operator, 58, 94, 97, 100
Inverse mapping theorem, 70
Isometry, 61
Isomorphic embedding, 72
Isomorphism, 61, 70
Kernel, 5
Embedding, 6
Equicontinuity, 85
Ergodic theory, 64
Ergodic transformation, 66
Legendre polynomials, 34
Linear functional, 39
bounded, 40
Linear operator, 5
bounded, 56
norm, 56
Measure-preserving transformation, 66
Minkowski functional, 14, 52
Minkowski inequality, 11
Modulus of an operator, 121
Net, 83
Norm, 7
equivalent, 71
Normed space, 7
Open map, 68
Open mapping theorem, 68
Operator norm, 56
Orthogonal complement, 26
orthogonal decomposition, 28
orthogonal projection, 28
Orthogonal series, 30
Orthogonal system, 29
Orthogonality principle, 26
Parallelogram law, 24
Parseval's identity, 32
Perfectly convex set, 69
Point evaluation functional, 39
Pointwise
bounded family of operators, 76
convergence of operators, 84
Polar decomposition, 122
Polarization identity, 24, 111
Positive operator, 117
Precompact set, 83
Principle of uniform boundedness, 76
Projection, 62
onto convex sets, 28
orthogonal, 26, 28, 60
Projection-valued measure, 125
Pythagorean inequality, 20
Quotient space, 4, 6, 12, 15
Rademacher system, 37
Radon-Nikodym theorem, 43
Reflexive spaces, 51
Regular point, 101
Resolvent
identity, 104
operator, 103
set, 101
Riesz representation theorem, 42
Schauder basis, 79
Schauder system, 80
Schur property, 89
Self-adjoint operator, 110
Semi-norm, 14
Separable space, 35