INRIA
I Introduction
These lectures are devoted to fast numerical algorithms and their relation to a number of
important results of Harmonic Analysis. The representation of wide classes of operators
in wavelet bases, for example, Calderón-Zygmund or pseudo-differential operators, may
be viewed as a method for their “compression”, i.e., conversion to a sparse form. The
sparsity of these representations is a consequence of the localization of wavelets in both
space and wave number domains. In addition, the multiresolution structure of the wavelet
expansions brings about an efficient organization of transformations on a given scale and
interactions between different neighbouring scales. Such an organization of both linear
and non-linear transformations has been a powerful tool in Harmonic Analysis (usually
referred to as Littlewood-Paley and Calderón-Zygmund theories, see e.g. [33]) and now
appears to be an equally powerful tool in Numerical Analysis.
In applications, for example in image processing [12] and in seismics [22], the mul-
tiresolution methods were developed in a search for a substitute for the signal processing
algorithms based on the Fourier transform. A technique of subband coding with the ex-
act quadrature mirror filters (QMF) was introduced in [37]. It is clear that the stumbling
block on the road to both simpler analysis and fast algorithms was a limited
variety of the orthonormal bases of functional spaces. In fact, there were (with some
qualifications) only two major choices, the Fourier basis and the Haar basis. These two
bases are almost the antipodes in terms of their space–wave number (or time-frequency)
localization. Therefore, it is a remarkable discovery that besides the Fourier and Haar
bases, there is an infinite number of various orthonormal bases with a controllable lo-
calization in the space–wave number domain. The efforts in mathematics and various
applied fields culminated in the development of orthonormal bases of wavelets [39], [30],
and the notion of Multiresolution Analysis [31], [27]. There are many new constructions
of orthonormal bases with a controllable localization in the space–wave number domain,
notably [19], [16], [18], [17].
In Numerical Analysis many ingredients of Calderón-Zygmund theory were used
in the Fast Multipole algorithm for computing potential interactions [35], [24], [13]. The
1
Program in Applied Mathematics, University of Colorado at Boulder, Boulder, CO 80309-0526; Yale
University, P.O.Box 2155 Yale Station, New Haven, CT 06520
Fast Multipole algorithm requires order N operations to compute all the sums
\[ p_j = \sum_{i \ne j} \frac{q_i q_j}{|x_i - x_j|}, \quad \text{where } x_i \in \mathbb{R}^3, \quad i, j = 1, \dots, N. \qquad (1.1) \]
These methods may be viewed as devices for reducing a partial differential equation to a
sparse linear system for the cost of an inherently high condition number of the resulting
matrices. If instead of finite difference or finite element representations, we use the rep-
resentation of the derivatives in wavelet bases, then a simple modification by a diagonal
preconditioner produces a well-conditioned system. This, in turn, “automatically” yields
fast solvers for PDEs and removes one of the original reasons for constructing such
methods as multigrid.
Another novel and important element of computing in the wavelet bases is that the
“compressed” operators in standard and non-standard forms may be multiplied rapidly.
The product of two operators in the standard form requires C(−log ε)N (or maybe
C(−log ε)N log2 N) operations, where ε is the desired accuracy. In Section VII, we describe
a new algorithm for the multiplication of operators in the non-standard form in
order (−log ε)N operations. Since the performance of many algorithms requiring multiplication
of dense matrices has been limited by O(N^3) operations, these fast algorithms
address a critical numerical issue. Among the algorithms requiring multiplication of
matrices is an iterative algorithm for constructing the generalized inverse [36], [4], [38],
the scaling and squaring method for computing the exponential of an operator (see, for
example, [40]), and similar algorithms for sine and cosine of an operator, to mention
a few. By replacing the ordinary matrix multiplication in these algorithms by the fast
multiplication in the wavelet bases, the number of operations is reduced to, essentially,
order N. For example, if both the operator and its generalized inverse
admit sparse representations in the wavelet basis, then the iterative algorithm [36] for
computing the generalized inverse requires only O(N log κ) operations, where κ is the
condition number of the matrix. Various numerical examples and applications may be
found in [6], [2] and [8].
As a final remark, we note that computing in the wavelet bases has the following
three distinct attributes:
1. The operators and functions are represented in an orthonormal basis
2. The basis functions have vanishing moments leading to the sparsity of representa-
tions
3. The algorithms are recursive due to the multiresolution properties of the basis
One of the goals of these notes is to explain why (1-3) are of importance for fast
numerical algorithms. The algorithms of these notes are useful practical tools and at the
same time, directly related to several important results of Harmonic Analysis. I hope
that the explicit interrelation between pure and numerical analysis in the algorithms of
these notes will be perceived as an accomplishment of the program of developing fast,
wavelet based algorithms for Numerical Analysis. This program was initiated jointly with
R. R. Coifman and V. Rokhlin at Yale University, and consequently, this work would not
have been possible without the collaboration with them.
II Preliminary Remarks
such that
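1. $\bigcap_{j \in \mathbb{Z}} V_j = \{0\}$ and $\bigcup_{j \in \mathbb{Z}} V_j$ is dense in $L^2(\mathbb{R}^d)$
2. For any $f \in L^2(\mathbb{R}^d)$ and any $j \in \mathbb{Z}$, $f(x) \in V_j$ if and only if $f(2x) \in V_{j-1}$
3. For any $f \in L^2(\mathbb{R}^d)$ and any $k \in \mathbb{Z}^d$, $f(x) \in V_0$ if and only if $f(x - k) \in V_0$
4. There exists a scaling function $\varphi \in V_0$ such that $\{\varphi(x - k)\}_{k \in \mathbb{Z}^d}$ is a Riesz basis of $V_0$.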
Since all constructions in these lecture notes only use orthonormal bases, we will require
the basis of Condition 4 to be an orthonormal rather than just a Riesz basis.
Vj−1 = Vj ⊕ Wj , (2.2)
\[ L^2(\mathbb{R}^d) = \bigoplus_{j \in \mathbb{Z}} W_j. \qquad (2.3) \]
If there is the coarsest scale n, then the chain of the subspaces (2.1) is replaced
by
Vn ⊂ . . . ⊂ V2 ⊂ V1 ⊂ V0 ⊂ V−1 ⊂ V−2 ⊂ . . . , (2.4)
and
\[ L^2(\mathbb{R}^d) = V_n \oplus \bigoplus_{j \le n} W_j. \qquad (2.5) \]
If there is a finite number of scales, then without loss of generality, we set j = 0
to be the finest scale. Instead of (2.4), we then have
Vn ⊂ . . . ⊂ V 2 ⊂ V 1 ⊂ V 0 , V0 ⊂ L2 (Rd ). (2.6)
In this case $\varphi(x) = \chi(x)$, where $\chi(x)$ is the characteristic function of the interval (0, 1).
For each $j$, $\chi_{j,k}(x) = 2^{-j/2} \chi(2^{-j} x - k)$, $k \in \mathbb{Z}$, is the basis of $V_j$ and
$h_{j,k}(x) = 2^{-j/2} h(2^{-j} x - k)$, $k \in \mathbb{Z}$, is the basis of $W_j$.
The decomposition of a function into the Haar basis is an order N procedure.
Given $N = 2^n$ “samples” of a function, which may for simplicity be thought of as values
of scaled averages of $f$ on intervals of length $2^{-n}$,
\[ s^0_k = 2^{n/2} \int_{2^{-n} k}^{2^{-n}(k+1)} f(x)\,dx, \qquad (2.8) \]
to the standard form. The second basis is defined by the set of three kinds of basis
functions supported on squares: $h_{j,k}(x) h_{j,k'}(y)$, $h_{j,k}(x) \chi_{j,k'}(y)$, and $\chi_{j,k}(x) h_{j,k'}(y)$, where
$\chi(x)$ is the characteristic function of the interval (0, 1) and $\chi_{j,k}(x) = 2^{-j/2} \chi(2^{-j} x - k)$.
Representing an operator in this basis leads to the non-standard form (the terminology
will become clear later).
By considering an integral operator
\[ T(f)(x) = \int K(x, y) f(y)\,dy, \qquad (2.11) \]
and expanding its kernel in a two-dimensional Haar basis, we find that for Calderón-
Zygmund and pseudo-differential operators the decay of entries as a function of the
distance from the diagonal is faster in these representations than that in the original
kernel. These classes of operators are given by integral or distributional kernels that
are smooth away from the diagonal. For example, kernels K(x, y) of Calderón-Zygmund
operators satisfy the estimates
\[ |K(x, y)| \le \frac{1}{|x - y|}, \]
\[ |\partial_x^M K(x, y)| + |\partial_y^M K(x, y)| \le \frac{C_M}{|x - y|^{1+M}}, \qquad (2.12) \]
we have (2.12)
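\[ |\beta^j_{kk'}| \le \left| \int\!\!\int \left[ K(x, y) - K(x^0_{j,k}, y) \right] h_{j,k}(x)\, \chi_{j,k'}(y)\,dx\,dy \right| \le \frac{C}{|k - k'|^2}, \qquad (2.15) \]
where $x^0_{j,k} = 2^j (k + \tfrac{1}{2})$ denotes the center of the support of $h_{j,k}$. Thus, entries $\beta^j_{kk'}$ decay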
reason the gain in the decay is insufficient to make computing in the Haar basis practical.
To have a faster decay, it is necessary to use basis functions with several vanishing
moments. The vanishing moments are responsible for attaining practical algorithms, i.e.
controlling the constants in the complexity estimates of the fast algorithms.
We outline here the properties of compactly supported wavelets and refer for details to
[19], [20] and [33].
There are two immediate consequences of Definition II.1 with Condition 4’. First,
the function ϕ may be expressed as a linear combination of the basis functions of V−1 .
Since the functions $\{\varphi_{j,k}(x) = 2^{-j/2} \varphi(2^{-j} x - k)\}_{k \in \mathbb{Z}}$ form an orthonormal basis of $V_j$,
we have
\[ \varphi(x) = \sqrt{2} \sum_{k=0}^{L-1} h_k\, \varphi(2x - k), \qquad (2.17) \]
2
We note that the notion of multiresolution analysis is more recent than the constructions of [39],
[30] and, of course, [25].
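Second, the orthogonality of $\{\varphi(x - k)\}_{k \in \mathbb{Z}}$ implies that
\[ \delta_{k0} = \int_{-\infty}^{+\infty} \varphi(x - k)\, \varphi(x)\,dx = \int_{-\infty}^{+\infty} |\hat{\varphi}(\xi)|^2 e^{-ik\xi}\,d\xi, \qquad (2.21) \]
and therefore,
\[ \delta_{k0} = \int_0^{2\pi} \sum_{l \in \mathbb{Z}} |\hat{\varphi}(\xi + 2\pi l)|^2 e^{-ik\xi}\,d\xi, \qquad (2.22) \]
and
\[ \sum_{l \in \mathbb{Z}} |\hat{\varphi}(\xi + 2\pi l)|^2 = 1. \qquad (2.23) \]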
By taking the sum in (2.24) separately over odd and even indices, we have
l∈Z l∈Z
Using the 2π-periodicity of the function m0 and (2.23), we obtain, after replacing ξ/2 by
ξ, a necessary condition
|m0 (ξ)|2 + |m0 (ξ + π)|2 = 1. (2.26)
Thus, the coefficients hk in (2.20) are such that the 2π-periodic function m0 satisfies
equation (2.26).
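As a quick numerical illustration of condition (2.26), the sketch below samples $|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2$ on a grid, using the definition of $m_0$ as $2^{-1/2} \sum_k h_k e^{ik\xi}$. The four-coefficient Daubechies filter `H_D4` and the helper names `m0` and `qmf_defect` are assumptions introduced here for illustration; they do not appear in these notes.

```python
import numpy as np

# Assumed example filter: the Daubechies filter with L = 4 coefficients.
SQRT3 = np.sqrt(3.0)
H_D4 = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))


def m0(xi, h):
    """Evaluate m0(xi) = 2^{-1/2} sum_k h_k exp(i k xi) on an array of xi."""
    k = np.arange(len(h))
    return np.exp(1j * np.outer(xi, k)) @ np.asarray(h, dtype=float) / np.sqrt(2.0)


def qmf_defect(h, samples=257):
    """Largest deviation of |m0(xi)|^2 + |m0(xi + pi)|^2 from 1 on a grid."""
    xi = np.linspace(0.0, 2.0 * np.pi, samples)
    total = np.abs(m0(xi, h)) ** 2 + np.abs(m0(xi + np.pi, h)) ** 2
    return float(np.max(np.abs(total - 1.0)))
```

A filter that satisfies (2.26) gives a defect at the level of rounding error, while an arbitrary unit-norm filter generally does not.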
Equation (2.26) defines a pair of the quadrature mirror filters (QMF). On denot-
ing
m1 (ξ) = e−iξ m0 (ξ + π), (2.27)
and defining the function ψ,
\[ \psi(x) = \sqrt{2} \sum_{k} g_k\, \varphi(2x - k), \qquad (2.28) \]
where
\[ g_k = (-1)^k h_{L-k-1}, \quad k = 0, \dots, L - 1, \qquad (2.29) \]
or, the Fourier transform of ψ,
where
\[ m_1(\xi) = 2^{-1/2} \sum_{k=0}^{L-1} g_k e^{ik\xi}, \qquad (2.31) \]
it is not difficult to show (see e.g., [33], [19], [20]), that on each fixed scale $j \in \mathbb{Z}$, the
wavelets $\{\psi_{j,k}(x) = 2^{-j/2} \psi(2^{-j} x - k)\}_{k \in \mathbb{Z}}$ form an orthonormal basis of $W_j$.
The following lemma (I. Daubechies [19]) characterizes trigonometric polynomial
solutions of (2.26) which correspond to the orthonormal bases of compactly supported
wavelets with vanishing moments.
Lemma II.1 Any trigonometric polynomial solution m0 (ξ) of (2.26) is of the form
\[ m_0(\xi) = \left[ \tfrac{1}{2} (1 + e^{i\xi}) \right]^M Q(e^{i\xi}) \qquad (2.32) \]
and
\[ \sup_{0 \le y \le 1} \left[ P(y) + y^M R(\tfrac{1}{2} - y) \right] < 2^{2(M-1)}. \qquad (2.36) \]
The proof of this Lemma contains an algorithm for generating the coefficients
of the quadrature mirror filters H and G. The number L of the filter coefficients in
(2.20) and (2.31) is related to the number of vanishing moments M . For the wavelets
constructed in [19], L = 2M . If additional conditions are imposed (see [7] for an example),
then the relation might be different, but L is always even.
The decomposition of a function into the wavelet basis is an order N procedure.
Given the coefficients $s^0_k$, $k = 0, 1, \dots, N$ as “samples” of the function $f$, the coefficients
$s^j_k$ and $d^j_k$ on scales $j \ge 1$ are computed at a cost proportional to N via
\[ s^j_k = \sum_{n=0}^{L-1} h_n\, s^{j-1}_{n+2k+1}, \qquad (2.37) \]
and
\[ d^j_k = \sum_{n=0}^{L-1} g_n\, s^{j-1}_{n+2k+1}, \qquad (2.38) \]
where $s^j_k$ and $d^j_k$ may be viewed as periodic sequences with the period $2^{n-j}$. Computing
via (2.37) and (2.38) is illustrated by the pyramid scheme
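The recursion (2.37)-(2.38) can be sketched in a few lines. The code below is a minimal illustration, not the implementation used in these notes: it assumes periodic sequences, a power-of-two number of samples, and zero-based indexing (so the index $n + 2k + 1$ of the text becomes `n + 2k`). The filter `H_D4` and the function names are assumptions introduced for the example.

```python
import numpy as np

# Assumed example filter: the Daubechies filter with L = 4 coefficients.
SQRT3 = np.sqrt(3.0)
H_D4 = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))


def qmf_pair(h):
    """Return (h, g) with g_k = (-1)^k h_{L-k-1}, as in (2.29)."""
    h = np.asarray(h, dtype=float)
    L = len(h)
    g = np.array([(-1) ** k * h[L - k - 1] for k in range(L)])
    return h, g


def pyramid(s0, h):
    """Periodic pyramid scheme: returns the coarsest averages and the list
    of difference vectors d^1, d^2, ... (finest scale first)."""
    h, g = qmf_pair(h)
    L = len(h)
    s = np.asarray(s0, dtype=float)
    details = []
    while len(s) > 1:
        half = len(s) // 2
        # Row k holds the periodic indices n + 2k, n = 0..L-1.
        idx = (np.arange(L)[None, :] + 2 * np.arange(half)[:, None]) % len(s)
        d = s[idx] @ g          # (2.38)
        s = s[idx] @ h          # (2.37)
        details.append(d)
    return s, details
```

Each level touches every current coefficient L times and the vector length halves per level, so the total work is bounded by a constant times LN, which is the order N cost stated above.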
The properties (i) and (ii) imply that there are $M^2$ polynomial coefficients that determine
the functions $f_1, \dots, f_M$, while the properties (iii) and (iv) provide $M^2$ constraints. It
turns out that the equations uncouple to give M nonsingular linear systems which may
be solved to obtain the coefficients, thus yielding the functions $f_1, \dots, f_M$ up to a
sign.
The functions $f_1, \dots, f_M$ may be obtained constructively as follows. Let us start
with
\[ f^1_m(x) = \begin{cases} x^{m-1}, & x \in (0, 1), \\ -x^{m-1}, & x \in (-1, 0), \\ 0, & \text{otherwise}, \end{cases} \qquad (2.42) \]
and note that the 2M functions $1, x, \dots, x^{M-1}, f^1_1, f^1_2, \dots, f^1_M$ are linearly independent.
Then, by the Gram-Schmidt process, we orthogonalize $f^1_m$ with respect to $1, x, \dots, x^{M-1}$,
to obtain $f^2_m$, for $m = 1, \dots, M$. This orthogonality is preserved by the remaining
orthogonalizations, which only produce linear combinations of the $f^2_m$. First, if at least
one of $f^2_m$ is not orthogonal to $x^M$, we reorder the functions, so that $\langle f^2_1, x^M \rangle \ne 0$.
We then define $f^3_m = f^2_m - a_m \cdot f^2_1$, where $a_m$ is chosen so that $\langle f^3_m, x^M \rangle = 0$ for
$m = 2, \dots, M$, achieving the desired orthogonality to $x^M$. Similarly, we continue to
orthogonalize with respect to $x^{M+1}, \dots, x^{2M-2}$ to obtain $f^2_1, f^3_2, f^4_3, \dots, f^{M+1}_M$, such that
$\langle f^{m+1}_m, x^i \rangle = 0$ for $i \le m + M - 2$. Finally, we perform the Gram-Schmidt
orthogonalization on $f^{M+1}_M, f^M_{M-1}, \dots, f^2_1$, in that order, and normalize to obtain $f_M, f_{M-1}, \dots, f_1$.
By construction, the functions $\{f_m\}_{m=1}^{M}$ satisfy properties (i)-(iv). On denoting
\[ h_m(x) = 2^{1/2} f_m(2x - 1), \quad m = 1, \dots, M, \qquad (2.43) \]
we define the space $W^M_j$, $j = 0, -1, -2, \dots$, as a linear span of functions
where $V^M_j$ is given in (2.45), and the space $W^{M,2}_j$ as the orthogonal complement of $V^{M,2}_j$
in $V^{M,2}_{j-1}$,
\[ V^{M,2}_{j-1} = V^{M,2}_j \oplus W^{M,2}_j. \qquad (2.50) \]
The space $W^{M,2}_0$ is spanned by the orthonormal basis
in terms of the filter coefficients $\{h_k\}_{k=1}^{L}$, may be found using a formula for $\hat{\varphi}$,
\[ \hat{\varphi}(\xi) = (2\pi)^{-1/2} \prod_{j=1}^{\infty} m_0(2^{-j} \xi), \qquad (2.53) \]
where
\[ m_0(\xi) = 2^{-1/2} \sum_{k=0}^{L-1} h_k e^{ik\xi}. \qquad (2.54) \]
The moments $\mathcal{M}^m_\infty$ are obtained numerically (within the desired accuracy) by
recursively generating a sequence of vectors, $\{\mathcal{M}^m_r\}_{m=0}^{M-1}$ for $r = 1, 2, \dots$,
\[ \mathcal{M}^m_{r+1} = \sum_{j=0}^{m} \binom{m}{j} 2^{-jr} \mathcal{M}^{m-j}_r \mathcal{M}^j_1, \qquad (2.55) \]
starting with
\[ \mathcal{M}^m_1 = 2^{-m-\frac{1}{2}} \sum_{k=0}^{L-1} h_k k^m, \quad m = 0, \dots, M - 1. \qquad (2.56) \]
Each vector $\{\mathcal{M}^m_r\}_{m=0}^{M-1}$ represents M moments of the product in (2.53) with r terms,
and this iteration converges rapidly. Notice that we never computed the function ϕ
itself.
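The iteration (2.55)-(2.56) is short enough to sketch directly. In the code below the filter `H_D4` (Daubechies, L = 4), the function name `scaling_moments`, and the fixed iteration count are assumptions made for the example. The moments are those of the product in (2.53) without the factor $(2\pi)^{-1/2}$, so the zeroth moment comes out as 1.

```python
from math import comb

import numpy as np

# Assumed example filter: the Daubechies filter with L = 4 coefficients.
SQRT3 = np.sqrt(3.0)
H_D4 = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))


def scaling_moments(h, M, iterations=60):
    """Approximate the first M moments of the scaling function from the
    filter coefficients alone, following (2.55)-(2.56)."""
    h = np.asarray(h, dtype=float)
    k = np.arange(len(h), dtype=float)
    # Starting vector (2.56): moments of the one-term product.
    m1 = np.array([2.0 ** (-m - 0.5) * np.sum(h * k ** m) for m in range(M)])
    mr = m1.copy()
    for r in range(1, iterations):
        # Recursion (2.55): moments of the product with r + 1 terms.
        mr = np.array([
            sum(comb(m, j) * 2.0 ** (-j * r) * mr[m - j] * m1[j]
                for j in range(m + 1))
            for m in range(M)
        ])
    return mr
```

The correction added at step r carries a factor $2^{-jr}$, so the contributions shrink geometrically; sixty iterations is far more than double precision needs and is used here only to keep the sketch simple.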
III The non-standard and standard forms
Pj : L2 (R) → Vj , (3.2)
as
\[ (P_j f)(x) = \sum_{k} \langle f, \varphi_{j,k} \rangle\, \varphi_{j,k}(x), \qquad (3.3) \]
where
Qj = Pj−1 − Pj (3.5)
is the projection operator on the subspace Wj . If there is the coarsest scale n, then
instead of (3.4) we have
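\[ T = \sum_{j=-\infty}^{n} (Q_j T Q_j + Q_j T P_j + P_j T Q_j) + P_n T P_n, \qquad (3.6) \]
\[ A_j : W_j \to W_j, \qquad (3.9) \]
\[ B_j : V_j \to W_j, \qquad (3.10) \]
\[ \Gamma_j : W_j \to V_j, \qquad (3.11) \]
where the operators $\{A_j, B_j, \Gamma_j\}_{j \in \mathbb{Z}}$ are defined as $A_j = Q_j T Q_j$, $B_j = Q_j T P_j$ and
$\Gamma_j = P_j T Q_j$.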
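The operators $\{A_j, B_j, \Gamma_j\}_{j \in \mathbb{Z}}$ admit a recursive definition via the relation
\[ T_j = \begin{pmatrix} A_{j+1} & B_{j+1} \\ \Gamma_{j+1} & T_{j+1} \end{pmatrix}, \qquad (3.12) \]
where operators $T_j = P_j T P_j$,
\[ T_j : V_j \to V_j, \qquad (3.13) \]
and the operator represented by the 2 × 2 matrix in (3.12) is a mapping
\[ \begin{pmatrix} A_{j+1} & B_{j+1} \\ \Gamma_{j+1} & T_{j+1} \end{pmatrix} : W_{j+1} \oplus V_{j+1} \to W_{j+1} \oplus V_{j+1}. \qquad (3.14) \]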
If there is a coarsest scale n, then
1. The operator Aj describes the interaction on the scale j only, since the subspace
Wj in (3.9) is an element of the direct sum in (2.3).
2. The operators $B_j$, $\Gamma_j$ in (3.10) and (3.11) describe the interaction between the scale
j and all coarser scales. Indeed, the subspace $V_j$ contains all the subspaces $V_{j'}$
with $j' > j$ (see (2.1)).
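[Figure: organization of the non-standard form. The blocks $A_j$, $B_j$, $\Gamma_j$ (j = 1, 2, 3) and $T_3$ are applied to the vector $(d^1, s^1, d^2, s^2, d^3, s^3)$ to produce $(\hat{d}^1, \hat{s}^1, \hat{d}^2, \hat{s}^2, \hat{d}^3, \hat{s}^3)$.]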
Figure 2: An example of a matrix in the non-standard form (see Example 1)
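The operator $T_j$ is represented by the matrix $s^j$,
\[ s^j_{k,k'} = \int\!\!\int K(x, y)\, \varphi_{j,k}(x)\, \varphi_{j,k'}(y)\,dx\,dy. \qquad (3.19) \]
\[ \beta^j_{i,l} = \sum_{k,m=0}^{L-1} g_k h_m\, s^{j-1}_{k+2i+1,\,m+2l+1}, \qquad (3.21) \]
\[ \gamma^j_{i,l} = \sum_{k,m=0}^{L-1} h_k g_m\, s^{j-1}_{k+2i+1,\,m+2l+1}, \qquad (3.22) \]
\[ s^j_{i,l} = \sum_{k,m=0}^{L-1} h_k h_m\, s^{j-1}_{k+2i+1,\,m+2l+1}, \qquad (3.23) \]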
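One step of the recursion (3.21)-(3.23) can be written with two sparse "filter matrices" H and G, so that $\beta^j = G s^{j-1} H^T$, $\gamma^j = H s^{j-1} G^T$ and $s^j = H s^{j-1} H^T$. The sketch below is illustrative only: it assumes periodic indexing with zero-based offsets, builds H and G densely for readability (so it does not achieve the fast operation count), and forms $\alpha^j = G s^{j-1} G^T$ by analogy, since the formula for $\alpha^j$ is not shown in this excerpt. The filter `H_D4` and all function names are assumptions introduced here.

```python
import numpy as np

# Assumed example filter: the Daubechies filter with L = 4 coefficients.
SQRT3 = np.sqrt(3.0)
H_D4 = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))


def filter_matrices(h, n):
    """Dense periodic matrices H, G of shape (n/2, n) whose rows hold the
    filters h and g shifted by two positions per row."""
    h = np.asarray(h, dtype=float)
    L = len(h)
    g = np.array([(-1) ** k * h[L - k - 1] for k in range(L)])  # (2.29)
    H = np.zeros((n // 2, n))
    G = np.zeros((n // 2, n))
    for i in range(n // 2):
        for k in range(L):
            col = (k + 2 * i) % n
            H[i, col] += h[k]
            G[i, col] += g[k]
    return H, G


def nonstandard_step(s_prev, h):
    """Split s^{j-1} into the blocks alpha^j, beta^j, gamma^j and s^j."""
    n = s_prev.shape[0]
    H, G = filter_matrices(h, n)
    alpha = G @ s_prev @ G.T   # assumed analogue of (3.21)-(3.23)
    beta = G @ s_prev @ H.T    # (3.21)
    gamma = H @ s_prev @ G.T   # (3.22)
    s_next = H @ s_prev @ H.T  # (3.23)
    return alpha, beta, gamma, s_next
```

Applying `nonstandard_step` repeatedly to the returned `s_next` yields the blocks on successive scales. Because the stacked matrix formed by G and H is orthogonal, the four blocks together carry the same Frobenius norm as the input.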
Remark. It follows from (3.7) that after applying the non-standard form to a vector,
we arrive at the representation
\[ (T_0 u_0)(x) = \sum_{j=1}^{n} \sum_{k \in \mathbb{Z}} \hat{d}^j_k\, \psi^j_k(x) + \sum_{j=1}^{n} \sum_{k \in \mathbb{Z}} \hat{s}^j_k\, \varphi^j_k(x), \qquad (3.24) \]
where $u_0 = P_0 u$ (see 1). We note that in order to reconstruct a vector from the first and
the second sums in (3.24), we may apply the same algorithm. The reconstruction from
the first sum is just a reconstruction from the wavelet expansion and is accomplished by
using the quadrature mirror filters H and G. The reconstruction from the second sum
may be accomplished by using the same algorithm but with the pair of filters H and H.
Both results are then added. The number of operations required for this computation is
proportional to N.
III.2 The Standard Form
The standard form is obtained by representing
\[ V_j = \bigoplus_{j' > j} W_{j'}, \qquad (3.25) \]
and considering for each scale j the operators $\{B^{j'}_j, \Gamma^{j'}_j\}_{j' > j}$,
\[ B^{j'}_j : W_{j'} \to W_j, \qquad (3.26) \]
\[ \Gamma^{j'}_j : W_j \to W_{j'}. \qquad (3.27) \]
If there is the coarsest scale n, then instead of (3.25), we have
\[ V_j = V_n \oplus \bigoplus_{j'=j+1}^{n} W_{j'}. \qquad (3.28) \]
In this case, the operators $\{B^{j'}_j, \Gamma^{j'}_j\}$ for $j' = j + 1, \dots, n$ are the same as in (3.26) and
(3.27) and, in addition, for each scale j, there are operators $\{B^{n+1}_j\}$ and $\{\Gamma^{n+1}_j\}$,
\[ B^{n+1}_j : V_n \to W_j, \qquad (3.29) \]
\[ \Gamma^{n+1}_j : W_j \to V_n. \qquad (3.30) \]
(In this notation, $\Gamma^{n+1}_n = \Gamma_n$ and $B^{n+1}_n = B_n$.) If the number of scales is finite and $V_0$ is
finite dimensional, then the standard form is a representation of $T_0 = P_0 T P_0$ as
\[ T_0 = \{A_j, \{B^{j'}_j\}_{j'=j+1}^{n}, \{\Gamma^{j'}_j\}_{j'=j+1}^{n}, B^{n+1}_j, \Gamma^{n+1}_j, T_n\}_{j=1,\dots,n}. \qquad (3.31) \]
The operators (3.31) are organized as blocks of the matrix (see Figure 3 and Figure 4).
If the operator T is a Calderón-Zygmund or a pseudo-differential operator then,
for a fixed accuracy, all the operators in (3.31), except Tn , are banded. As a result, the
standard form has several “finger” bands which correspond to the interaction between
different scales. For a large class of operators (pseudo-differential, for example), the
interaction between different scales characterized by the size of the coefficients of “finger”
bands decays as the distance $j' - j$ between the scales increases. Therefore, if the scales
$j$ and $j'$ are well separated, then for a given accuracy, the operators $B^{j'}_j$, $\Gamma^{j'}_j$ can be
neglected.
There are two ways of computing the standard form of a matrix. The first consists of
applying the one-dimensional transform (see (2.37) and (2.38)) to each column (row) of
the matrix and then to each row (column) of the result. Alternatively, one can compute
the non-standard form and then apply the one-dimensional transform to each row of all
operators $B_j$ and each column of all operators $\Gamma_j$. We refer to [7] for the details.
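[Figure: organization of the standard form. The blocks $A_j$, $B^{j'}_j$, $\Gamma^{j'}_j$ (j = 1, 2, 3) and $T_3$ are applied to the vector $(d^1, d^2, d^3, s^3)$ to produce $(\hat{d}^1, \hat{d}^2, \hat{d}^3, \hat{s}^3)$.]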
IV The compression of operators
The compression of operators or, in other words, the construction of their sparse repre-
sentations in orthonormal bases, directly affects the speed of computational algorithms.
While the compression of data (of images, for example) achieved by methods other than
finding a sparse representation in some basis may be adequate for some applications, the
compression of operators calls for a representation in a basis in order to effectively com-
pute in the sparse form. The standard and non-standard forms of operators in the wavelet
bases may be viewed as compression schemes for a wide class of operators. We restrict
our attention to two specific classes frequently encountered in analysis and applications,
Calderón-Zygmund and pseudo-differential operators.
The non-standard form of an operator T with a kernel K(x, y) is obtained by
evaluating the expressions
\[ \alpha_{II'} = \int\!\!\int K(x, y)\, \psi_I(x)\, \psi_{I'}(y)\,dx\,dy, \qquad (4.1) \]
\[ \beta_{II'} = \int\!\!\int K(x, y)\, \psi_I(x)\, \varphi_{I'}(y)\,dx\,dy, \qquad (4.2) \]
and
\[ \gamma_{II'} = \int\!\!\int K(x, y)\, \varphi_I(x)\, \psi_{I'}(y)\,dx\,dy, \qquad (4.3) \]
where the coefficients are labeled by the intervals $I = I^j_k$ and $I' = I^j_{k'}$ denoting the
supports of $\varphi^j_k$ and $\varphi^j_{k'}$.
If K is smooth on the square $I \times I'$, then expanding K into a Taylor series around
the center of the square and combining (2.16) with (4.1) - (4.3) and remembering that
the functions $\psi_I$ and $\psi_{I'}$ are supported on the intervals $I$ and $I'$, we obtain the estimate
\[ |\alpha_{II'}| + |\beta_{II'}| + |\gamma_{II'}| \le C |I|^{M+1} \sup_{(x,y) \in I \times I'} \left( |\partial_x^M K(x, y)| + |\partial_y^M K(x, y)| \right). \qquad (4.4) \]
Obviously, the right-hand side of (4.4) is small whenever either |I| or the derivatives
involved are small, and we use this fact to “compress” matrices of integral operators
by converting them to the non-standard form, and discarding the coefficients that are
smaller than a chosen threshold.
For most numerical applications, the following Proposition IV.1 is quite adequate,
as long as the singularity of K is integrable across each row and each column.
Proposition IV.1 If the wavelet basis has M vanishing moments, then for any kernel
K satisfying the conditions
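\[ |K(x, y)| \le \frac{1}{|x - y|}, \qquad (4.5) \]
\[ |\partial_x^M K(x, y)| + |\partial_y^M K(x, y)| \le \frac{C_0}{|x - y|^{1+M}} \qquad (4.6) \]
the matrices $\alpha^j$, $\beta^j$, $\gamma^j$ (3.16) - (3.18) of the non-standard form satisfy the estimate
\[ |\alpha^j_{i,l}| + |\beta^j_{i,l}| + |\gamma^j_{i,l}| \le \frac{C_M}{1 + |i - l|^{M+1}}, \qquad (4.7) \]
for all $|i - l| \ge 2M$.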
Proposition IV.2 If the wavelet basis has M vanishing moments, then for any pseudo-
differential operator with symbol σ of T and σ ∗ of T ∗ satisfying the standard conditions
\[ |\alpha^j_{i,l}| + |\beta^j_{i,l}| + |\gamma^j_{i,l}| \le \frac{2^{\lambda j} C_M}{(1 + |i - l|)^{M+1}}, \qquad (4.11) \]
If we approximate the operator $T_0^N$ by the operator $T_0^{N,B}$ obtained from $T_0^N$ by
setting to zero all coefficients of matrices $\alpha^j_{i,l}$, $\beta^j_{i,l}$, and $\gamma^j_{i,l}$ outside of bands of width
$B \ge 2M$ around their diagonals, then it is easy to see that
\[ \|T_0^{N,B} - T_0^N\| \le \frac{C}{B^M} \log_2 N, \qquad (4.12) \]
where C is a constant determined by the kernel K. In most numerical applications, the
accuracy of calculations is fixed, and the parameters of the algorithm (in our case,
the band width B and order M ) have to be chosen in such a manner that the desired
precision of calculations is achieved. If M is fixed, then B has to be such that
\[ \|T_0^{N,B} - T_0^N\| \le \frac{C}{B^M} \log_2 N \le \varepsilon, \qquad (4.13) \]
or, equivalently,
\[ B \ge \left( \frac{C}{\varepsilon} \log_2 N \right)^{1/M}. \qquad (4.14) \]
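To make the dependence in (4.14) concrete, the following small sketch evaluates the smallest integer band width that satisfies (4.14) together with the requirement $B \ge 2M$. The function name `band_width` and the sample values of C, ε, N and M are hypothetical; in practice the constant C depends on the kernel and is not known in advance.

```python
import math


def band_width(C, eps, N, M):
    """Smallest integer B with B >= ((C / eps) * log2 N)^(1/M) and B >= 2M."""
    bound = ((C / eps) * math.log2(N)) ** (1.0 / M)
    return max(2 * M, math.ceil(bound))
```

For fixed accuracy the bound grows only like $(\log_2 N)^{1/M}$, which is why the band width is nearly independent of N when M is moderately large.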
The estimate (4.12) is sufficient for practical purposes. It is possible, however, to
obtain
\[ \|T_0^{N,B} - T_0^N\| \le \frac{C}{B^M} \qquad (4.15) \]
instead of (4.12). In order to obtain this tighter estimate and avoid the factor $\log_2 N$, we
use some of the ideas arising in the proof of the “T(1)” theorem of David and Journé.
The estimates (4.5), (4.6) are not sufficient to conclude that $\alpha^j_{i,l}$, $\beta^j_{i,l}$, $\gamma^j_{i,l}$ are
bounded for $|i - l| \le 2M$ (for example, consider $K(x, y) = \frac{1}{|x - y|}$). Therefore, we need to
assume that T defines a bounded operator on $L^2$ or, a substantially weaker condition,
\[ \left| \int_{I \times I} K(x, y)\,dx\,dy \right| \le C |I|, \qquad (4.16) \]
for all dyadic intervals I. This is the so-called “weak cancellation condition” (see [32]).
Under this condition and the conditions (4.5), (4.6) Proposition IV.1 may be extended
so that the estimate (4.7) holds for all integer i, l (see [32]).
Let us evaluate an operator T in the non-standard form on a function f ,
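\[ T(f)(x) = \sum_{I,I'} \psi_I(x)\, \alpha_{II'}\, d_{I'} + \sum_{I,I'} \psi_I(x)\, \beta_{II'}\, s_{I'} + \sum_{I,I'} \varphi_I(x)\, \gamma_{II'}\, d_{I'} \qquad (4.17) \]
\[ T = A + L_\beta + L^*_\gamma, \qquad (4.18) \]
\[ A(f)(x) = \sum_{I,I'} \psi_I(x)\, \alpha_{I,I'}\, d_{I'} + \sum_{I,I'} \psi_I(x)\, \beta_{II'} (s_{I'} - s_I) + \sum_{I,I'} (\varphi_I(x) - \varphi_{I'}(x))\, \gamma_{II'}\, d_{I'} \qquad (4.19) \]
\[ L_\beta(f)(x) = \sum_{I} \psi_I(x)\, s_I\, \beta_I, \qquad (4.20) \]
\[ L^*_\gamma(f)(x) = \sum_{I'} \varphi_{I'}(x)\, d_{I'}\, \gamma_{I'} \qquad (4.21) \]
where
\[ \beta_I = \sum_{I'} \beta_{II'} = \frac{1}{|I|^{1/2}} \int\!\!\int \psi_I(x) K(x, y)\,dx\,dy = \frac{1}{|I|^{1/2}} \langle \psi_I, \beta(x) \rangle \qquad (4.22) \]
\[ \gamma_{I'} = \sum_{I} \gamma_{II'} = \frac{1}{|I'|^{1/2}} \int\!\!\int \psi_{I'}(y) K(x, y)\,dx\,dy = \frac{1}{|I'|^{1/2}} \langle \psi_{I'}, \gamma(y) \rangle \qquad (4.23) \]
and
\[ \beta(x) = T(1)(x), \qquad (4.24) \]
\[ \gamma(y) = T^*(1)(y) \qquad (4.25) \]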
It is a remarkable fact that by analysing the functions (4.24) and (4.25) (and, therefore,
the operators Lβ and L∗γ ), it is possible to decide if a Calderón-Zygmund operator is
bounded.
Theorem IV.1 (G. David, J.L. Journé) Suppose that the operator (3.1) satisfies the
conditions (4.5), (4.6), and (4.16). Then a necessary and sufficient condition for T to
be bounded on L2 is that β(x) in (4.24) and γ(y) in (4.25) belong to dyadic B.M.O., i.e.
satisfy condition
\[ \sup_J \frac{1}{|J|} \int_J |\beta(x) - m_J(\beta)|^2\,dx \le C, \qquad (4.26) \]
Splitting the operator T into the sum of three terms (4.18) and estimating them
separately leads to the estimate (4.15). We note that the functions T(1) and T*(1) are
easily computed in the process of constructing the non-standard form and may be
used to provide a useful estimate of the norm of the operator.
V The differential operators in wavelet bases.
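and
\[ r^j_{il} = \int_{-\infty}^{\infty} 2^{-j} \varphi(2^{-j} x - i)\, \varphi'(2^{-j} x - l)\, 2^{-j}\,dx = 2^{-j} r_{i-l}, \qquad (5.4) \]
where
\[ \alpha_l = \int_{-\infty}^{+\infty} \psi(x - l)\, \frac{d}{dx} \psi(x)\,dx, \qquad (5.5) \]
\[ \beta_l = \int_{-\infty}^{+\infty} \psi(x - l)\, \frac{d}{dx} \varphi(x)\,dx, \qquad (5.6) \]
\[ \gamma_l = \int_{-\infty}^{+\infty} \varphi(x - l)\, \frac{d}{dx} \psi(x)\,dx, \qquad (5.7) \]
and
\[ r_l = \int_{-\infty}^{+\infty} \varphi(x - l)\, \frac{d}{dx} \varphi(x)\,dx. \qquad (5.8) \]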
Moreover, using (2.17) and (2.28) we have
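\[ \alpha_i = 2 \sum_{k=0}^{L-1} \sum_{k'=0}^{L-1} g_k g_{k'}\, r_{2i+k-k'}, \qquad (5.9) \]
\[ \beta_i = 2 \sum_{k=0}^{L-1} \sum_{k'=0}^{L-1} g_k h_{k'}\, r_{2i+k-k'}, \qquad (5.10) \]
and
\[ \gamma_i = 2 \sum_{k=0}^{L-1} \sum_{k'=0}^{L-1} h_k g_{k'}\, r_{2i+k-k'}, \qquad (5.11) \]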
and, therefore, the representation of d/dx is completely determined by rl in (5.8) or in
other words, by the representation of d/dx on the subspace V0 .
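Rewriting (5.8) in terms of $\hat{\varphi}(\xi)$, where
\[ \hat{\varphi}(\xi) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \varphi(x)\, e^{ix\xi}\,dx, \qquad (5.12) \]
we obtain
\[ r_l = \int_{-\infty}^{+\infty} (-i\xi)\, |\hat{\varphi}(\xi)|^2 e^{-il\xi}\,d\xi. \qquad (5.13) \]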
Proposition V.1 1. If the integrals in (5.8) or (5.13) exist, then the coefficients $r_l$,
$l \in \mathbb{Z}$ in (5.8) satisfy the following system of linear algebraic equations
\[ r_l = 2 \left[ r_{2l} + \frac{1}{2} \sum_{k=1}^{L/2} a_{2k-1} (r_{2l-2k+1} + r_{2l+2k-1}) \right], \qquad (5.14) \]
and
\[ \sum_{l} l\, r_l = -1, \qquad (5.15) \]
where
\[ a_{2k-1} = 2 \sum_{i=0}^{L-2k} h_i h_{i+2k-1}, \quad k = 1, \dots, L/2. \qquad (5.16) \]
2. If M ≥ 2, then equations (5.14) and (5.15) have a unique solution with a finite
number of non-zero $r_l$, namely, $r_l \ne 0$ for $-L + 2 \le l \le L - 2$ and
\[ r_l = -r_{-l}, \qquad (5.17) \]
Remark 1. If M = 1, then equations (5.14) and (5.15) have a unique solution but
the integrals in (5.8) or (5.13) may not be absolutely convergent. For the Haar basis
($h_1 = h_2 = 2^{-1/2}$), $a_1 = 1$ and $r_1 = -1/2$, and we obtain the simplest finite difference
operator (1/2, 0, −1/2). In this case the function ϕ is not continuous and
\[ \hat{\varphi}(\xi) = \frac{1}{\sqrt{2\pi}}\, \frac{\sin \frac{1}{2}\xi}{\frac{1}{2}\xi}\, e^{i \frac{1}{2} \xi}. \]
where $a_n$ are the autocorrelation coefficients of the filter $H = \{h_k\}_{k=0}^{L-1}$,
\[ a_n = 2 \sum_{i=0}^{L-1-n} h_i h_{i+n}, \quad n = 1, \dots, L - 1. \qquad (5.19) \]
It is easy to see that the autocorrelation coefficients an with even indices are zero,
This may be verified by using (2.20) to compute $|m_0(\xi)|^2$ and $|m_0(\xi + \pi)|^2$,
\[ |m_0(\xi)|^2 = \frac{1}{2} + \frac{1}{2} \sum_{n=1}^{L-1} a_n \cos n\xi, \qquad (5.21) \]
\[ |m_0(\xi + \pi)|^2 = \frac{1}{2} - \frac{1}{2} \sum_{k=1}^{L/2} a_{2k-1} \cos(2k - 1)\xi + \frac{1}{2} \sum_{k=1}^{L/2-1} a_{2k} \cos 2k\xi, \qquad (5.22) \]
where $a_n$ are given in (5.19). Combining (5.21) and (5.22) to satisfy (2.26), we obtain
\[ \sum_{k=1}^{L/2-1} a_{2k} \cos 2k\xi = 0, \qquad (5.23) \]
since
\[ \left( \frac{1}{i} \partial_\xi \right)^m |m_0(\xi)|^2 \Big|_{\xi=0} = 0, \quad \text{for } 1 \le m \le 2M - 1, \qquad (5.25) \]
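Proof.
Using (2.17) for both $\varphi(x - l)$ and $\frac{d}{dx}\varphi(x)$ in (5.8) we obtain
\[ r_i = 2 \sum_{k=0}^{L-1} \sum_{l=0}^{L-1} h_k h_l \int_{-\infty}^{+\infty} \varphi(2x - 2i - k)\, \varphi'(2x - l)\, 2\,dx \qquad (5.26) \]
and hence,
\[ r_i = 2 \sum_{k=0}^{L-1} \sum_{l=0}^{L-1} h_k h_l\, r_{2i+k-l}. \qquad (5.27) \]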
Substituting l = k − m, we rewrite (5.27) as
\[ r_i = 2 \sum_{k=0}^{L-1} \sum_{m=k}^{k-L+1} h_k h_{k-m}\, r_{2i+m}. \qquad (5.28) \]
Changing the order of summation in (5.28) and using the fact that $\sum_{k=0}^{L-1} h_k^2 = 1$, we
arrive at
\[ r_l = 2 r_{2l} + \sum_{n=1}^{L-1} a_n (r_{2l-n} + r_{2l+n}), \quad l \in \mathbb{Z}, \qquad (5.29) \]
where $a_n$ are given in (5.19). Using (5.20) we obtain (5.14) from (5.29).
In order to obtain (5.15) we use the following relation
\[ \sum_{l=-\infty}^{+\infty} l^m \varphi(x - l) = x^m + \sum_{l=1}^{m} (-1)^l \binom{m}{l} M^\varphi_l\, x^{m-l}, \qquad (5.30) \]
where
\[ M^\varphi_l = \int_{-\infty}^{+\infty} \varphi(x)\, x^l\,dx, \quad l = 1, \dots, m, \qquad (5.31) \]
are the moments of the function ϕ(x). Relation (5.30) follows simply on taking Fourier
transforms and using Leibniz’ rule. Using (5.8) and (5.30) with m = 1 we obtain (5.15).
If M ≥ 2, then
\[ |\hat{\varphi}(\xi)|^2 |\xi| \le C (1 + |\xi|)^{-1-\epsilon}, \qquad (5.32) \]
where $\epsilon > 0$, and hence, the integral in (5.13) is absolutely convergent. This assertion
follows from Lemma 3.2 of [19], where it is shown that
\[ |\hat{\varphi}(\xi)| \le C (1 + |\xi|)^{-M + \log_2 B}, \qquad (5.33) \]
where
\[ B = \sup_{\xi \in \mathbb{R}} |Q(e^{i\xi})|. \]
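where
\[ \hat{r}(\xi) = \sum_{l} r_l\, e^{il\xi}, \qquad (5.36) \]
\[ \hat{r}_{even}(\xi/2) = \sum_{l} r_{2l}\, e^{il\xi}, \qquad (5.37) \]
and
\[ \hat{r}_{odd}(\xi/2) = \sum_{l} r_{2l+1}\, e^{i(2l+1)\xi/2}. \qquad (5.38) \]
Noticing that
\[ 2\, \hat{r}_{even}(\xi/2) = \hat{r}(\xi/2) + \hat{r}(\xi/2 + \pi) \qquad (5.39) \]
and
\[ 2\, \hat{r}_{odd}(\xi/2) = \hat{r}(\xi/2) - \hat{r}(\xi/2 + \pi), \qquad (5.40) \]
and using (5.18), we obtain from (5.35)
\[ \hat{r}(\xi) = \left[ \hat{r}(\xi/2) + \hat{r}(\xi/2 + \pi) + \left( 2 |m_0(\xi/2)|^2 - 1 \right) \left( \hat{r}(\xi/2) - \hat{r}(\xi/2 + \pi) \right) \right]. \qquad (5.41) \]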
where
\[ f_{j,k-l} = 2^{-j/2} \int_{-\infty}^{+\infty} f(x)\, \varphi(2^{-j} x - k + l)\,dx. \qquad (5.44) \]
Rewriting (5.44) as
\[ f_{j,k-l} = 2^{-j/2} \int_{-\infty}^{+\infty} f(x - 2^j l)\, \varphi(2^{-j} x - k)\,dx, \qquad (5.45) \]
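where $\tilde{x} = \tilde{x}(x, x - 2^j l)$ and $|\tilde{x} - x| \le 2^j l$. Substituting (5.46) in (5.43) and using (5.34)
and (5.15), we obtain
\[ (T_j f)(x) = \sum_{k \in \mathbb{Z}} \left( \int_{-\infty}^{+\infty} f'(x)\, \varphi_{j,k}(x)\,dx \right) \varphi_{j,k}(x) + 2^j \sum_{k \in \mathbb{Z}} \left( \frac{1}{2} \sum_{l} r_l\, l^2 \int_{-\infty}^{+\infty} f''(\tilde{x})\, \varphi_{j,k}(x)\,dx \right) \varphi_{j,k}(x). \qquad (5.47) \]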
It is clear that as j → −∞, operators Tj and d/dx coincide on smooth functions. Using
standard arguments it is easy to prove that T−∞ = d/dx and hence, the solution to (5.14)
and (5.15) is unique. The relation (5.17) follows now from (5.13).
Remark 2. We note that expressions (5.9) and (5.10) for $\alpha_l$ and $\beta_l$ ($\gamma_l = -\beta_{-l}$) may
be simplified by changing the order of summation in (5.9) and (5.10) and introducing
the correlation coefficients $2 \sum_{i=0}^{L-1-n} g_i h_{i+n}$, $2 \sum_{i=0}^{L-1-n} h_i g_{i+n}$ and $2 \sum_{i=0}^{L-1-n} g_i g_{i+n}$. The
expression for $\alpha_l$ is especially simple, $\alpha_l = 4 r_{2l} - r_l$.
Examples. For the examples we will use Daubechies’ wavelets constructed in [19]. First,
let us compute the coefficients a2k−1 , k = 1, . . . , M , where M is the number of vanishing
moments and L = 2M . Using relation (4.22) of [19],
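\[ |m_0(\xi)|^2 = 1 - \frac{(2M - 1)!}{[(M - 1)!]^2\, 2^{2M-1}} \int_0^{\xi} \sin^{2M-1} \xi\,d\xi, \qquad (5.48) \]
we find, by computing $\int_0^{\xi} \sin^{2M-1} \xi\,d\xi$, that
\[ |m_0(\xi)|^2 = \frac{1}{2} + \frac{1}{2} C_M \sum_{m=1}^{M} \frac{(-1)^{m-1} \cos(2m - 1)\xi}{(M - m)!\, (M + m - 1)!\, (2m - 1)}, \qquad (5.49) \]
where
\[ C_M = \left[ \frac{(2M - 1)!}{(M - 1)!\, 4^{M-1}} \right]^2. \qquad (5.50) \]
Thus, by comparing (5.49) and (5.18), we have
\[ a_{2m-1} = \frac{(-1)^{m-1} C_M}{(M - m)!\, (M + m - 1)!\, (2m - 1)}, \quad \text{where } m = 1, \dots, M. \qquad (5.51) \]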
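Formula (5.51) with $C_M$ from (5.50) can be evaluated in exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the function name `daubechies_autocorrelation` is an assumption introduced for this example.

```python
from fractions import Fraction
from math import factorial


def daubechies_autocorrelation(M):
    """Return {2m-1: a_{2m-1}} for m = 1..M as exact fractions, per (5.50)-(5.51)."""
    c_m = Fraction(factorial(2 * M - 1), factorial(M - 1) * 4 ** (M - 1)) ** 2
    return {
        2 * m - 1: (-1) ** (m - 1) * c_m
        / (factorial(M - m) * factorial(M + m - 1) * (2 * m - 1))
        for m in range(1, M + 1)
    }
```

For M = 2 this gives $a_1 = 9/8$ and $a_3 = -1/8$. Setting ξ = 0 in (5.49) and using $|m_0(0)|^2 = 1$ shows that the coefficients sum to 1, which is a convenient check.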
We note that by virtue of being solutions of a linear system with rational co-
efficients (a2m−1 in (5.51) are rational by construction), the coefficients rl are rational
numbers. The coefficients rl are the same for all Daubechies’ wavelets with a fixed number
of vanishing moments M , while there are several wavelet bases for a given M depending
on the choice of the roots of polynomials in the construction described in [19].
Solving equations of Proposition 1, we present the results for Daubechies’ wavelets
with M = 2, 3, 4, 5, 6.
1. M = 2
\[ a_1 = \frac{9}{8}, \quad a_3 = -\frac{1}{8}, \]
and
\[ r_1 = -\frac{2}{3}, \quad r_2 = \frac{1}{12}. \]
The coefficients (−1/12, 2/3, 0, −2/3, 1/12) of this example can be found in many books
on numerical analysis as a choice of coefficients for numerical differentiation.
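The linear system (5.14)-(5.15) is small and can be solved directly. The sketch below is one possible arrangement, not the procedure used to produce the tables in these notes: it forms the autocorrelation coefficients (5.19) from a filter, writes one equation (5.14) for each $l$ with $-L + 2 \le l \le L - 2$, appends the normalization (5.15), and solves in the least-squares sense with NumPy. The filter `H_D4` (Daubechies, M = 2) and the function name are assumptions made for the example, and only odd-indexed $a_n$ are used, which presumes the even-indexed ones vanish.

```python
import numpy as np

# Assumed example filter: the Daubechies filter with L = 4 coefficients (M = 2).
SQRT3 = np.sqrt(3.0)
H_D4 = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))


def derivative_coefficients(h):
    """Solve (5.14)-(5.15) for r_l, -L+2 <= l <= L-2; returns {l: r_l}."""
    h = np.asarray(h, dtype=float)
    L = len(h)
    # Autocorrelation coefficients (5.19).
    a = {n: 2.0 * float(np.dot(h[:L - n], h[n:])) for n in range(1, L)}
    ls = list(range(-(L - 2), L - 1))
    idx = {l: i for i, l in enumerate(ls)}
    rows, rhs = [], []
    for l in ls:
        # r_l - 2 r_{2l} - sum over odd n of a_n (r_{2l-n} + r_{2l+n}) = 0.
        row = np.zeros(len(ls))
        row[idx[l]] += 1.0
        if 2 * l in idx:
            row[idx[2 * l]] -= 2.0
        for n in range(1, L, 2):
            for m in (2 * l - n, 2 * l + n):
                if m in idx:
                    row[idx[m]] -= a[n]
        rows.append(row)
        rhs.append(0.0)
    # Normalization (5.15): sum_l l r_l = -1.
    rows.append(np.array(ls, dtype=float))
    rhs.append(-1.0)
    r, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return dict(zip(ls, r))
```

For the assumed filter the solution reproduces, up to rounding, the values $r_1 = -2/3$ and $r_2 = 1/12$ of the M = 2 example, together with the antisymmetry (5.17).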
2. M = 3
\[ a_1 = 75/64, \quad a_3 = -25/128, \quad a_5 = 3/128, \]
and
\[ r_1 = -272/365, \quad r_2 = 53/365, \quad r_3 = -16/1095, \quad r_4 = -1/2920. \]
3. M = 4
\[ a_1 = 1225/1024, \quad a_3 = -245/1024, \quad a_5 = 49/1024, \quad a_7 = -5/1024, \]
and
\[ r_1 = -39296/49553, \quad r_2 = 76113/396424, \quad r_3 = -1664/49553, \]
\[ r_4 = 2645/1189272, \quad r_5 = 128/743295, \quad r_6 = -1/1189272. \]
4. M = 5
\[ a_1 = 19845/16384, \quad a_3 = -2205/8192, \quad a_5 = 567/8192, \quad a_7 = -405/32768, \quad a_9 = 35/32768, \]
and
\[ r_1 = -957310976/1159104017, \quad r_2 = 265226398/1159104017, \quad r_3 = -735232/13780629, \]
\[ r_4 = 17297069/2318208034, \quad r_5 = -1386496/5795520085, \quad r_6 = -563818/10431936153, \]
\[ r_7 = -2048/8113728119, \quad r_8 = -5/18545664272. \]
5. M = 6
\[ a_1 = 160083/131072, \quad a_3 = -38115/131072, \quad a_5 = 22869/262144, \]
\[ a_7 = -5445/262144, \quad a_9 = 847/262144, \quad a_{11} = -63/262144, \]
and (with the signs as in Table 1)
\[ r_1 = -3986930636128256/4689752620280145, \quad r_2 = 4850197389074509/18759010481120580, \quad r_3 = -1019185340268544/14069257860840435, \]
\[ r_4 = 136429697045009/9379505240560290, \quad r_5 = -7449960660992/4689752620280145, \quad r_6 = 483632604097/112554062886723480, \]
\[ r_7 = 78962327552/6565653668392203, \quad r_8 = 31567002859/75036041924482320, \quad r_9 = -2719744/937950524056029, \]
\[ r_{10} = 1743/2501201397482744. \]
Table 1: The coefficients $\{r_l\}_{l=1}^{L-2}$ for Daubechies’ wavelets, where L = 2M and M = 5, . . . , 9.

M = 5
  l    r_l
  1   -0.82590601185015
  2    0.22882018706694
  3   -5.3352571932672E-02
  4    7.4613963657755E-03
  5   -2.3923582002393E-04
  6   -5.4047301644748E-05
  7   -2.5241171135682E-07
  8   -2.6960479423517E-10

M = 6
  l    r_l
  1   -0.85013666155592
  2    0.25855294414146
  3   -7.2440589997659E-02
  4    1.4545511041994E-02
  5   -1.5885615434757E-03
  6    4.2968915709948E-06
  7    1.2026575195723E-05
  8    4.2069120451167E-07
  9   -2.8996668057051E-09
 10    6.9686511520083E-13

M = 7
  l    r_l
  1   -0.86874391452377
  2    0.28296509452594
  3   -9.0189066217795E-02
  4    2.2687411014648E-02
  5   -3.8814546576295E-03
  6    3.3734404776409E-04
  7    4.2363946800701E-06
  8   -1.6501679210868E-06
  9   -2.1871130331900E-07
 10    4.1830548203747E-10
 11   -1.2035273999989E-11
 12   -6.6283900594600E-16

M = 8
  l    r_l
  1   -0.88344604609097
  2    0.30325935147672
  3   -0.10636406828947
  4    3.1290147839488E-02
  5   -6.9583791164537E-03
  6    1.0315302133757E-03
  7   -7.6677069083796E-05
  8   -2.4519921109537E-07
  9   -3.9938104563894E-08
 10    7.2079482385949E-08
 11    9.6971849256415E-10
 12    7.2522069166503E-13
 13   -1.2400785360984E-14
 14    1.5854647516841E-19

M = 9
  l    r_l
  1   -0.89531640583699
  2    0.32031206224855
  3   -0.12095364936000
  4    3.9952721886694E-02
  5   -1.0616930669821E-02
  6    2.1034028106558E-03
  7   -2.7812077649932E-04
  8    1.9620437763642E-05
  9   -4.8782468879634E-07
 10    1.0361220591478E-07
 11   -1.5966864798639E-08
 12   -8.1374108294110E-10
 13   -5.4025197533630E-13
 14   -4.7814005916812E-14
 15   -1.6187880013009E-18
 16   -4.8507474310747E-24
Proposition V.2 1. If the integrals in (5.52) or (5.53) exist, then the coefficients $r_l^{(n)}$, $l \in \mathbf{Z}$, satisfy the following system of linear algebraic equations
$$ r_l^{(n)} = 2^n \left[ r_{2l}^{(n)} + \frac{1}{2} \sum_{k=1}^{L/2} a_{2k-1} \left( r_{2l-2k+1}^{(n)} + r_{2l+2k-1}^{(n)} \right) \right], \tag{5.54} $$
and
$$ \sum_l l^n r_l^{(n)} = (-1)^n n!, \tag{5.55} $$
and
$$ \sum_l r_l^{(n)} = 0, \tag{5.58} $$
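The system (5.54)-(5.55) can be set up and solved numerically. The following is a minimal sketch, assuming that $r_l^{(n)}$ vanishes for $|l| > L-2$ and using a dense least-squares solve; the function name `derivative_coefficients` and the use of NumPy are choices made here for illustration, not part of the original text.

```python
import math
import numpy as np

def derivative_coefficients(a_odd, n=1):
    """Solve (5.54)-(5.55) for r_l^{(n)}, l = -(L-2), ..., L-2.

    a_odd = [a_1, a_3, ..., a_{L-1}]; returns a dict {l: r_l}.
    Coefficients with |l| > L-2 are assumed to be zero.
    """
    L = 2 * len(a_odd)
    ls = list(range(-(L - 2), L - 1))
    pos = {l: i for i, l in enumerate(ls)}
    rows, rhs = [], []
    for l in ls:
        # row encodes r_l - 2^n [ r_{2l} + (1/2) sum_k a_{2k-1}(...) ] = 0
        row = np.zeros(len(ls))
        row[pos[l]] += 1.0
        terms = [(2 * l, 1.0)]
        for k, a in enumerate(a_odd, start=1):
            terms.append((2 * l - 2 * k + 1, 0.5 * a))
            terms.append((2 * l + 2 * k - 1, 0.5 * a))
        for m, c in terms:
            if m in pos:
                row[pos[m]] -= (2.0 ** n) * c
        rows.append(row)
        rhs.append(0.0)
    # normalization (5.55): sum_l l^n r_l = (-1)^n n!
    rows.append(np.array([float(l) ** n for l in ls]))
    rhs.append((-1.0) ** n * math.factorial(n))
    sol, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return dict(zip(ls, sol))

if __name__ == "__main__":
    r = derivative_coefficients([9 / 8, -1 / 8], n=1)   # M = 2
    print({l: round(float(v), 12) for l, v in r.items()})
```

For M = 2 and n = 1 this reproduces $r_1 = -2/3$ and $r_2 = 1/12$; the same call with the coefficients $a_{2k-1}$ of a larger M can be compared against the fractions listed for that M.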
We note that among the wavelets with L = 4, the wavelets with two vanishing
moments M = 2 do not have the best Hölder exponent (see [21]), but the representation
of the third derivative exists only if the number of vanishing moments M = 2.
The equations for computing the coefficients $r_l^{(n)}$ may be viewed as an eigenvalue problem. Let us derive the equation corresponding to (5.42) for $d^n/dx^n$ directly from (5.53). We rewrite (5.53) as
$$ r_l^{(n)} = \int_0^{2\pi} \sum_{k \in \mathbf{Z}} |\hat{\phi}(\xi + 2\pi k)|^2 \, i^n (\xi + 2\pi k)^n \, e^{-il\xi} \, d\xi. \tag{5.61} $$
Therefore,
$$ \hat{r}(\xi) = \sum_{k \in \mathbf{Z}} |\hat{\phi}(\xi + 2\pi k)|^2 \, i^n (\xi + 2\pi k)^n, \tag{5.62} $$
where
$$ \hat{r}(\xi) = \sum_l r_l^{(n)} e^{il\xi}. \tag{5.63} $$
$$ (M_0 f)(\xi) = |m_0(\xi/2)|^2 f(\xi/2) + |m_0(\xi/2 + \pi)|^2 f(\xi/2 + \pi). \tag{5.66} $$
We rewrite (5.65) as
$$ M_0 \hat{r} = 2^{-n} \hat{r}, \tag{5.67} $$
so that $\hat{r}$ is an eigenvector of the operator $M_0$ corresponding to the eigenvalue $2^{-n}$. Thus, finding the representation of the derivatives in the wavelet basis is equivalent to finding trigonometric polynomial solutions of (5.67) and vice versa. (The operator $M_0$ is also introduced in [14] and [26], where the problem (5.67) with the eigenvalue 1 is considered.)
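In coefficient space the eigenvalue problem can be inspected with a small dense matrix. The sketch below assumes that $M_0$ acts on the coefficients as $r_l \mapsto r_{2l} + \frac{1}{2}\sum_k a_{2k-1}(r_{2l-2k+1} + r_{2l+2k-1})$ restricted to $|l| \le L-2$; the function name `transfer_matrix` is hypothetical.

```python
import numpy as np

def transfer_matrix(a_odd):
    """Matrix of r -> r_{2l} + (1/2) sum_k a_{2k-1} (r_{2l-2k+1} + r_{2l+2k-1}),
    restricted to the indices l = -(L-2), ..., L-2 (an assumption of this sketch)."""
    L = 2 * len(a_odd)
    ls = list(range(-(L - 2), L - 1))
    pos = {l: i for i, l in enumerate(ls)}
    T = np.zeros((len(ls), len(ls)))
    for l in ls:
        terms = [(2 * l, 1.0)]
        for k, a in enumerate(a_odd, start=1):
            terms.append((2 * l - 2 * k + 1, 0.5 * a))
            terms.append((2 * l + 2 * k - 1, 0.5 * a))
        for m, c in terms:
            if m in pos:
                T[pos[l], pos[m]] += c
    return T

if __name__ == "__main__":
    eigs = np.linalg.eigvals(transfer_matrix([9 / 8, -1 / 8]))   # M = 2
    print(np.sort_complex(eigs))
```

For M = 2 the spectrum of this 5 x 5 matrix contains $2^{-1}$ and $2^{-3}$, which is consistent with the existence of the first and third derivative representations noted above.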
in wavelet bases, the numerical evidence illustrating this fact is of interest, since it rep-
resents one of the advantages of computing in the wavelet bases.
For operators with a homogeneous symbol the bound on the condition number
depends on the particular choice of the wavelet basis (by the condition number we un-
derstand the ratio of the largest singular value to the smallest singular value above the
threshold of accuracy). After applying such a preconditioner, the condition number κp of
the operator is uniformly bounded with respect to the size of the matrix. We recall that
the condition number controls the rate of convergence of a number of iterative algorithms; for example, the number of iterations of the conjugate gradient method is $O(\sqrt{\kappa_p})$.
We present here two tables, 2 and 3, illustrating such preconditioning applied to
the standard form of the second derivative. In the following examples the standard form
of the periodized second derivative $D_2$ of size $N \times N$, where $N = 2^n$, is preconditioned by the diagonal matrix $P$,
$$ D_2^p = P D_2 P, $$
where $P_{il} = \delta_{il} 2^j$, $1 \le j \le n$, and where $j$ is chosen depending on $i, l$ so that $N - N/2^{j-1} + 1 \le i, l \le N - N/2^j$, and $P_{NN} = 2^n$.
In the tables we compare the original condition number $\kappa$ of $D_2$ and $\kappa_p$ of $D_2^p$.
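The index rule for $P$ is easy to misread, so a small sketch may help. It builds the diagonal of $P$ for $N = 2^n$ with the 1-based indices of the text and applies $P D_2 P$ to a matrix given as nested lists; the function names are hypothetical and no wavelet transform is performed here.

```python
def preconditioner_diagonal(n):
    """Diagonal entries P_ii for N = 2**n, following the index ranges in the text."""
    N = 2 ** n
    diag = [0] * N
    for j in range(1, n + 1):
        lo = N - N // 2 ** (j - 1) + 1      # first index (1-based) on scale j
        hi = N - N // 2 ** j                # last index (1-based) on scale j
        for i in range(lo, hi + 1):
            diag[i - 1] = 2 ** j
    diag[N - 1] = 2 ** n                    # P_NN = 2^n
    return diag

def precondition(D):
    """Return P D P for a square matrix D given as a list of rows."""
    N = len(D)
    p = preconditioner_diagonal(N.bit_length() - 1)
    return [[p[i] * D[i][k] * p[k] for k in range(N)] for i in range(N)]

if __name__ == "__main__":
    print(preconditioner_diagonal(3))
```

For $n = 3$ the diagonal is constant on each scale: four entries equal to 2, two equal to 4, and the last two equal to 8.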
  N     κ              κ_p
  64    0.14545E+04    0.10792E+02
Table 2: Condition numbers of the matrix of the periodized second derivative (with and without preconditioning) in the basis of Daubechies’ wavelets with three vanishing moments, M = 3.

  N     κ              κ_p
  64    0.10472E+04    0.43542E+01
Table 3: Condition numbers of the matrix of the periodized second derivative (with and without preconditioning) in the basis of Daubechies’ wavelets with six vanishing moments, M = 6.
VI The convolution operators in wavelet bases
In this Section we consider the computation of the non-standard form of convolution
operators. For convolution operators the quadrature formulas for representing the kernel
on V0 are of the simplest form due to the fact that the moments of the autocorrelation
of the scaling function ϕ vanish. Moreover, by combining the asymptotics of the wavelet
coefficients of the operator with the system of linear algebraic equations (similar to those
in Section V) we arrive at an effective method for computing these coefficients [5]. This
method is especially simple if the symbol of the operator is homogeneous of some degree.
Let us assume that the matrix $t_{i-l}^{(j-1)}$ ($i, l \in \mathbf{Z}$) represents the operator $P_{j-1} T P_{j-1}$ on the subspace $V_{j-1}$. To compute the representation of $P_j T P_j$, we have the following formula (3.26) of [7]:
$$ t_l^{(j)} = \sum_{k=0}^{L-1} \sum_{m=0}^{L-1} h_k h_m \, t_{2l+k-m}^{(j-1)}. \tag{6.1} $$
It easily reduces to
$$ t_l^{(j)} = t_{2l}^{(j-1)} + \frac{1}{2} \sum_{k=1}^{L/2} a_{2k-1} \left( t_{2l-2k+1}^{(j-1)} + t_{2l+2k-1}^{(j-1)} \right). \tag{6.2} $$
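The reduction of the double sum (6.1) to the autocorrelation form (6.2) can be verified numerically. The sketch below uses the standard four-tap Daubechies filter (not listed in this section) and assumes the autocorrelation coefficients are defined by $a_n = 2\sum_i h_i h_{i+n}$, with the sum in (6.2) taken over $k = 1, \dots, L/2$ as in (5.54). All function names are illustrative.

```python
import numpy as np

SQ3 = np.sqrt(3.0)
# Four-tap Daubechies filter (M = 2, L = 4), normalized so that sum(h**2) = 1
H = np.array([1 + SQ3, 3 + SQ3, 3 - SQ3, 1 - SQ3]) / (4.0 * np.sqrt(2.0))

def autocorrelation(h):
    """Assumed definition: a_n = 2 * sum_i h_i h_{i+n} for n = 1, ..., L-1."""
    L = len(h)
    return {n: 2.0 * float(np.dot(h[:L - n], h[n:])) for n in range(1, L)}

def coarsen_double_sum(t, l, h):
    """Right-hand side of (6.1); t is a callable m -> t_m on the finer scale."""
    L = len(h)
    return sum(h[k] * h[m] * t(2 * l + k - m) for k in range(L) for m in range(L))

def coarsen_autocorrelation(t, l, a):
    """Right-hand side of (6.2), with k running over 1, ..., L/2."""
    L = len(a) + 1
    total = t(2 * l)
    for k in range(1, L // 2 + 1):
        total += 0.5 * a[2 * k - 1] * (t(2 * l - 2 * k + 1) + t(2 * l + 2 * k - 1))
    return total

if __name__ == "__main__":
    a = autocorrelation(H)
    print(a)                                  # a_1, a_2, a_3
    t = lambda m: 1.0 / (1.0 + m * m)         # arbitrary test sequence
    print(coarsen_double_sum(t, 1, H), coarsen_autocorrelation(t, 1, a))
```

With this filter the computed values agree with $a_1 = 9/8$, $a_3 = -1/8$ given for M = 2, the even coefficient $a_2$ vanishes, and the two forms coincide for any test sequence.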
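and
$$ M_\Phi^m = \int_{-\infty}^{+\infty} y^m \Phi(y) \, dy = 0, \quad \text{for } 1 \le m \le 2M - 1. \tag{6.7} $$
Clearly, we have
$$ M_\Phi^m = \left. \left( \frac{1}{i} \partial_\xi \right)^m |\hat{\phi}(\xi)|^2 \right|_{\xi = 0}. \tag{6.8} $$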
Using (6.8) and the identity ϕ̂(ξ) = ϕ̂(ξ/2)m0 (ξ/2) (see [19]), it follows from (5.25) that
(6.7) holds.
Since the moments of the function Φ vanish, (6.7), equation (6.4) leads to a one-
point quadrature formula for computing the representation of convolution operators on
the finest scale for all compactly supported wavelets. This formula is obtained in exactly
the same manner as for the special choice of the wavelet basis described in [7] (eqns.
3.8-3.12), where the shifted moments of the function ϕ vanish; we refer to this paper for
the details.
Here we introduce a different approach, which consists in solving the system of
linear algebraic equations (6.2) subject to asymptotic conditions. This method is espe-
cially simple if the symbol of the operator is homogeneous of some degree since in this
case the operator is completely defined by its representation on V0 .
Let us consider two examples of such operators, the Hilbert transform and the
operator of fractional differentiation (or anti-differentiation).
which, in turn, completely define all other coefficients of the non-standard form. Namely, $H = \{A_j, B_j, \Gamma_j\}_{j \in \mathbf{Z}}$, $A_j = A_0$, $B_j = B_0$, and $\Gamma_j = \Gamma_0$, where the matrix elements $\alpha_{i-l}$, $\beta_{i-l}$, and $\gamma_{i-l}$ of $A_0$, $B_0$, and $\Gamma_0$ are computed from the coefficients $r_l$,
$$ \alpha_i = \sum_{k=0}^{L-1} \sum_{k'=0}^{L-1} g_k g_{k'} \, r_{2i+k-k'}, \tag{6.11} $$
$$ \beta_i = \sum_{k=0}^{L-1} \sum_{k'=0}^{L-1} g_k h_{k'} \, r_{2i+k-k'}, \tag{6.12} $$
and
$$ \gamma_i = \sum_{k=0}^{L-1} \sum_{k'=0}^{L-1} h_k g_{k'} \, r_{2i+k-k'}. \tag{6.13} $$
  M      l    r_l
  M = 6  1   -0.588303698
         2   -0.077576414
         3   -0.128743695
         4   -0.075063628
         5   -0.064168018
         6   -0.053041366
         7   -0.045470650
         8   -0.039788641
         9   -0.035367761
         10  -0.031830988
         11  -0.028937262
         12  -0.026525823
         13  -0.024485376
         14  -0.022736420
         15  -0.021220659
         16  -0.019894368
where the coefficients a2k−1 are given in (5.19). Using (6.4), (6.6) and (6.7) we obtain
the asymptotics of rl for large l,
$$ r_l = -\frac{1}{\pi l} + O\!\left( \frac{1}{l^{2M}} \right). \tag{6.15} $$
By rewriting (6.10) in terms of $\hat{\phi}(\xi)$,
$$ r_l = -2 \int_0^{\infty} |\hat{\phi}(\xi)|^2 \sin(l\xi) \, d\xi, \tag{6.16} $$
we obtain $r_l = -r_{-l}$ and set $r_0 = 0$. We note that the coefficient $r_0$ cannot be determined from equations (6.14) and (6.15).
Solving (6.14) with the asymptotic condition (6.15), we compute the coefficients $r_l$, $l \ne 0$, with any prescribed accuracy.
Example.
We compute the coefficients $r_l$ of the Hilbert transform for Daubechies’ wavelets with six vanishing moments with accuracy $10^{-7}$. The coefficients for $l > 16$ are obtained using asymptotics (6.15). (We note that $r_{-l} = -r_l$ and $r_0 = 0$.)
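The statement that the asymptotics (6.15) take over beyond $l = 16$ can be checked against the tabulated values. The sketch below stores the sixteen values listed for M = 6, extends them by the leading term $-1/(\pi l)$ and the antisymmetry $r_{-l} = -r_l$, and measures how close the tail of the table already is to the asymptotic term. The function names are illustrative.

```python
import math

# Tabulated r_l, l = 1..16, for the Hilbert transform (M = 6)
TABLE = {1: -0.588303698, 2: -0.077576414, 3: -0.128743695, 4: -0.075063628,
         5: -0.064168018, 6: -0.053041366, 7: -0.045470650, 8: -0.039788641,
         9: -0.035367761, 10: -0.031830988, 11: -0.028937262, 12: -0.026525823,
         13: -0.024485376, 14: -0.022736420, 15: -0.021220659, 16: -0.019894368}

def hilbert_asymptotic(l):
    """Leading term of (6.15): r_l ~ -1/(pi*l) for l > 0."""
    return -1.0 / (math.pi * l)

def hilbert_coefficient(l):
    """r_l for any integer l: table for 1 <= l <= 16, asymptotics beyond, r_0 = 0."""
    if l == 0:
        return 0.0
    if l < 0:
        return -hilbert_coefficient(-l)       # r_{-l} = -r_l
    return TABLE[l] if l in TABLE else hilbert_asymptotic(l)

def tail_deviation(l_min=9):
    """Largest gap between tabulated values and -1/(pi*l) for l_min <= l <= 16."""
    return max(abs(TABLE[l] - hilbert_asymptotic(l)) for l in range(l_min, 17))

if __name__ == "__main__":
    print(tail_deviation())
```

For $l \ge 9$ the tabulated values differ from $-1/(\pi l)$ by less than the stated accuracy $10^{-7}$, so switching to the asymptotic formula at $l > 16$ is consistent with the table.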
VI.2 The fractional derivatives
We use the following definition of fractional derivatives
$$ (\partial_x^\alpha f)(x) = \int_{-\infty}^{+\infty} \frac{(x - y)_+^{-\alpha - 1}}{\Gamma(-\alpha)} f(y) \, dy, \tag{6.17} $$
$$ \beta_i = 2^\alpha \sum_{k=0}^{L-1} \sum_{k'=0}^{L-1} g_k h_{k'} \, r_{2i+k-k'}, \tag{6.20} $$
and
$$ \gamma_i = 2^\alpha \sum_{k=0}^{L-1} \sum_{k'=0}^{L-1} h_k g_{k'} \, r_{2i+k-k'}. \tag{6.21} $$
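It is easy to verify that the coefficients $r_l$ satisfy the following system of linear algebraic equations
$$ r_l = 2^\alpha \left[ r_{2l} + \frac{1}{2} \sum_{k=1}^{L/2} a_{2k-1} \left( r_{2l-2k+1} + r_{2l+2k-1} \right) \right], \tag{6.22} $$
where the coefficients $a_{2k-1}$ are given in (5.19). Using (6.4), (6.6) and (6.7) we obtain the asymptotics of $r_l$ for large $l$,
$$ r_l = \frac{1}{\Gamma(-\alpha)} \frac{1}{l^{1+\alpha}} + O\!\left( \frac{1}{l^{1+\alpha+2M}} \right) \quad \text{for } l > 0, \tag{6.23} $$
$$ r_l = 0 \quad \text{for } l < 0. \tag{6.24} $$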
Example.
  M      l    r_l
  M = 6  -7  -2.82831017E-06
         -6  -1.68623867E-06
         -5   4.45847796E-04
         -4  -4.34633415E-03
         -3   2.28821728E-02
         -2  -8.49883759E-02
         -1   0.27799963
          0   0.84681966
          1  -0.69847577
          2   2.36400139E-02
          3  -8.97463780E-02
          4  -2.77955293E-02
          5  -2.61324170E-02
          6  -1.91718816E-02
          7  -1.52272841E-02
          8  -1.24667403E-02
          9  -1.04479500E-02
         10  -8.92061945E-03
         11  -7.73225246E-03
         12  -6.78614593E-03
         13  -6.01838599E-03
         14  -5.38521459E-03

Table 5: The coefficients $\{r_l\}_l$, $l = -7, \dots, 14$, of the fractional derivative $\alpha = 0.5$ for Daubechies’ wavelet with six vanishing moments.
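As a consistency check of (6.23) against Table 5, the leading term $l^{-1-\alpha}/\Gamma(-\alpha)$ with $\alpha = 0.5$ can be evaluated with the standard library gamma function. The sketch below compares it with the last five tabulated values; the function names are illustrative.

```python
import math

ALPHA = 0.5
# Last rows of Table 5 (alpha = 0.5, M = 6)
TABLE5_TAIL = {10: -8.92061945e-03, 11: -7.73225246e-03, 12: -6.78614593e-03,
               13: -6.01838599e-03, 14: -5.38521459e-03}

def fractional_asymptotic(l, alpha=ALPHA):
    """Leading term of (6.23) for l > 0: l**(-1-alpha) / Gamma(-alpha)."""
    return l ** (-1.0 - alpha) / math.gamma(-alpha)

def tail_deviation():
    """Largest gap between the tabulated tail and the leading asymptotic term."""
    return max(abs(r - fractional_asymptotic(l)) for l, r in TABLE5_TAIL.items())

if __name__ == "__main__":
    for l, r in TABLE5_TAIL.items():
        print(l, r, fractional_asymptotic(l))
```

Since $\Gamma(-1/2) = -2\sqrt{\pi}$ is negative, the leading term is negative for $l > 0$, in agreement with the sign of the tabulated tail.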
VII Multiplication of operators in wavelet bases
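and, therefore,
$$ \| \tilde{T} \hat{T} - (\tilde{T}_\epsilon \hat{T}_\epsilon)_\epsilon \| \le \epsilon + \epsilon (1 + \epsilon) + \epsilon (1 + \epsilon)^2. \tag{7.6} $$
The right hand side of (7.6) is dominated by $3\epsilon$. For example, if we compute $T^4$ then we might lose one significant digit.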
Finally, we rewrite (7.10) as a sum of two terms,
where
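$$ F = \sum_{j=1}^{n} \left[ (\hat{A}_j \tilde{A}_j + \hat{B}_j \tilde{\Gamma}_j) + (\hat{B}_j \tilde{T}_j + \hat{A}_j \tilde{B}_j) + (\hat{T}_j \tilde{\Gamma}_j + \hat{\Gamma}_j \tilde{A}_j) \right] \tag{7.12} $$
and
$$ R = \hat{T}_n \tilde{T}_n + \sum_{j=1}^{n} P_j \hat{\Gamma}_j \tilde{B}_j P_j. \tag{7.13} $$
The operators in the sum (7.12) act on the following subspaces,
where $j = 1, \dots, n$, and
$$ \hat{T}_n \tilde{T}_n : V_n \to V_n. \tag{7.18} $$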
We now describe a fast O(N ) algorithm for the multiplication of the non-standard
forms of Calderón-Zygmund and pseudo-differential operators.
2. We compute the non-standard form of all the operators in (7.17). First, we observe
that Γ̂j B̃j , j = 1, . . . , n are banded. Starting from the finest scale j = 1, we
expand Γ̂j B̃j to obtain Āj+1 , B̄j+1 , Γ̄j+1 and T̄j+1 . We then add T̄j+1 to Γ̂j+1 B̃j+1
and expand the sum of the two, etc. As a result, we obtain Āj , B̄j , Γ̄j , j = 2, . . . , n
and T̄n . Since at all scales we are expanding a banded matrix, and the number
of operations is halved each time we go to the sparser scale, the total number of
operations at this step is proportional to N .
The resulting operators Āj , B̄j , Γ̄j , j = 2, . . . , n, and T̄n are acting on the sub-
spaces,
Āj : Wj → Wj , (7.19)
B̄j : Vj → Wj , (7.20)
Γ̄j : Wj → Vj , (7.21)
and
T̄n : Vn → Vn , (7.22)
3. At this step we add the corresponding operators computed at step 1 and step 2, to
obtain the non-standard form of the operator T = T̂ T̃ ,
If the product of two operators satisfies the estimates of Section IV, then the
operators {Aj , Bj , Γj }j∈Z of the non-standard form of T = T̂ T̃ are banded (for a given
precision). Examining (7.23)-(7.25), we find that in (7.24) and (7.25) there is only one
term that may potentially be dense. However, if the multiplicands and the product are
banded then all terms are banded, and we conclude that B̂j T̃j and T̂j Γ̃j must be banded.
Thus, we need only a banded version of the operators T̂j and T̃j . The banded versions
of T̂j and T̃j are computed in the process of constructing the non-standard form and,
therefore, we only need to store the results of these computations.
There is an alternative (direct) argument to show that B̂j T̃j and T̂j Γ̃j are banded.
It requires a proof that the first several moments of rows of B̂j and columns of Γ̃j are
negligible, which may be found in [29] for pseudo-differential operators.
VIII Fast iterative algorithms in wavelet bases
The fast multiplication algorithms of Section VII give a second life to a number of iterative
algorithms.
with
$$ X_0 = \alpha A^*, \tag{8.2} $$
where $A^*$ is the adjoint matrix and $\alpha$ is chosen so that the largest eigenvalue of $\alpha A^* A$ is less than two. Then the sequence $X_k$ converges to the generalized inverse $A^\dagger$.
When this result is combined with the fast multiplication algorithms of Section VII, we obtain an algorithm for constructing the generalized inverse in the standard form in at most $O(N \log^2 N \log R)$ operations and in the non-standard form in $O(N \log R)$ operations, where $R$ is the condition number of the matrix. By the condition number we understand the ratio of the largest singular value to the smallest singular value above the threshold of accuracy.
The details of this algorithm in the context of computing in wavelet bases will
appear in [6]. Throughout the iteration (8.1)-(8.2), it is necessary to maintain the “fin-
ger” band structure of the standard form or the banded structure of the submatrices
of the non-standard form of the matrices Xk . Hence, the standard and non-standard
forms of both the operator and its generalized inverse must admit such structures. Since
pseudo-differential operators of order zero form an algebra, the operators of this class
satisfy this condition. Also, since any pseudo-differential operator may be multiplied by
a compressible operator so that the product is a pseudo-differential operator of order
zero, it is easy to see that the iteration (8.1)-(8.2) is applicable to all pseudo-differential
operators.
Example.
The following table contains timings and accuracy comparison of the construction
of the generalized inverse via the singular value decomposition (SVD), which is an $O(N^3)$
procedure, and via the iteration (8.1)-(8.2) in the wavelet basis using the Fast Wavelet Transform (FWT). The computations were performed on a Sun Sparc workstation, and we used a routine from LINPACK for computing the singular value decomposition. For tests we used the following full rank matrix
$$ A_{ij} = \begin{cases} \dfrac{1}{i - j}, & i \ne j, \\ 1, & i = j, \end{cases} $$
where $i, j = 1, \dots, N$. The accuracy threshold was set to $10^{-4}$, i.e., entries of $X_k$ below $10^{-4}$ were systematically removed after each iteration.
with
X0 = αA∗ A, (8.4)
where A∗ is the adjoint matrix and α is chosen so that the largest eigenvalue of αA∗ A is
less than two.
Then I −Xk converges to Pnull . This can be shown either directly or by combining
an invariant representation for Pnull = I − A∗ (AA∗ )† A with the iteration (8.1)-(8.2) to
compute the generalized inverse (AA∗ )† . The fast multiplication algorithm makes the
iteration (8.3)-(8.4) fast for a wide class of operators (with the same complexity as the
algorithm for the generalized inverse). The important difference is, however, that (8.3)-
(8.4) does not require compressibility of the inverse operator but only of the powers of
the operator.
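A dense illustration of this projector iteration follows. It assumes that (8.3), which is not reproduced here, is the iteration $X_{k+1} = 2X_k - X_k^2$, so that $I - X_{k+1} = (I - X_k)^2$; the scaling $\alpha = 1/\|A\|_F^2$ and the stopping rule are choices made for this sketch. The loop stops as soon as $X_k$ is numerically idempotent, because rounding errors in the null-space components are not damped by further steps.

```python
import numpy as np

def null_space_projector(A, tol=1e-10, max_iter=60):
    """Approximate P_null = I - lim X_k with X0 = alpha A^* A, X <- 2X - X^2 (assumed form)."""
    alpha = 1.0 / np.linalg.norm(A, 'fro') ** 2   # eigenvalues of X0 then lie in [0, 1]
    X = alpha * (A.conj().T @ A)
    for _ in range(max_iter):
        if np.linalg.norm(X @ X - X) < tol:       # X is (numerically) a projector
            break
        X = 2.0 * X - X @ X
    return np.eye(A.shape[1]) - X

if __name__ == "__main__":
    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [1.0, 0.0, 1.0]])              # rank 2, null vector (1, 1, -1)
    print(null_space_projector(A))
```

For this rank-two example the result should be the orthogonal projector onto the span of (1, 1, -1).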
VIII.4 Fast algorithms for computing the exponential, sine and
cosine of a matrix
The exponential of a matrix (or an operator), as well as sine and cosine functions are
among the first to be considered in any calculus of operators. As in the case of the
generalized inverse, we use previously known algorithms (see e.g. [40]), which acquire completely different complexity estimates when used in conjunction with the wavelet representations.
An algorithm for computing the exponential of a matrix is based on the identity
$$ \exp(A) = \left[ \exp(2^{-L} A) \right]^{2^L}. \tag{8.10} $$
First, $\exp(2^{-L} A)$ is computed by, for example, using the Taylor series. The number $L$ is chosen so that the largest singular value of $2^{-L} A$ is less than one. At the second stage of the algorithm the matrix $\exp(2^{-L} A)$ is squared $L$ times to obtain the result.
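The two stages of (8.10) can be sketched with dense matrices as follows. This is only an illustration of the scaling-and-squaring idea; the number of Taylor terms and the function name are choices made here, and the wavelet-based version would replace the dense products by the fast multiplication of Section VII.

```python
import numpy as np

def expm_scaling_squaring(A, n_taylor=18):
    """exp(A) via Taylor series of exp(2^-L A) followed by L squarings, as in (8.10)."""
    A = np.asarray(A, dtype=float)
    norm = np.linalg.norm(A, 2)               # largest singular value
    L = 0
    while norm / 2 ** L >= 1.0:               # choose L so that ||2^-L A|| < 1
        L += 1
    B = A / 2 ** L
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, n_taylor + 1):          # Taylor series of exp(B)
        term = term @ B / k
        E = E + term
    for _ in range(L):                        # square L times
        E = E @ E
    return E

if __name__ == "__main__":
    A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # generator of a plane rotation
    print(expm_scaling_squaring(A))
```

For the rotation generator in the usage example the result is the rotation by one radian, with entries cos(1) and ±sin(1).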
Similarly, sine and cosine of a matrix may be computed using the elementary
double-angle formulas. On denoting
$$ Y_l = \cos(2^{l-L} A), \tag{8.11} $$
$$ X_l = \sin(2^{l-L} A), \tag{8.12} $$
we have for $l = 0, \dots, L - 1$
where $I$ is the identity. Again, we choose $L$ so that the largest singular value of $2^{-L} A$ is less than one, compute the sine and cosine of $2^{-L} A$ using the Taylor series, and then use (8.13) and (8.14).
Ordinarily such algorithms require at least $O(N^3)$ operations, since a number of multiplications of dense matrices has to be performed [40]. The fast algorithm for the multiplication of matrices in the standard form reduces the complexity to no more than $O(N \log^2 N)$ operations, and the fast algorithm for the multiplication of matrices in the non-standard form to $O(N)$ operations (see Section VII).
To achieve such performance, it is necessary to maintain the “finger” band struc-
ture of the standard form or the banded structure of the submatrices of the non-standard
form throughout the iteration. Whether this is possible depends on the particular operator and, usually, may be verified analytically.
Unlike the algorithm for the generalized inverse, the algorithms of this Remark
are not self-correcting. Thus, it is necessary to maintain sufficient accuracy initially so
as to obtain the desired accuracy after all the multiplications have been performed.
IX Computing F (u) in the wavelet bases
In this Section we describe a fast, adaptive algorithm for computing F (u), where F is an
infinitely differentiable function and u is represented in a wavelet basis. An important
example is F (u) = u2 . Our analytic results generalize theorems of J. M. Bony [10], [11],
[9], [15] on the propagation of singularities of solutions of non-linear equations. Our
numerical approach, however, is novel. We expect a wide range of applications of this
algorithm.
or
$$ u_0^2 = 2 \sum_{j=1}^{n} (P_j u)(Q_j u) + \sum_{j=1}^{n} (Q_j u)(Q_j u) + u_n^2. \tag{9.4} $$
$$ u^2 = \sum_{j \in \mathbf{Z}} (2 P_j u + Q_j u)(Q_j u), \tag{9.5} $$
which is essentially the para-product of J.M. Bony. In what follows, we will consider
each term of (9.5) as a bilinear mapping
and show that, for a given precision, very few coefficients are needed to describe mapping
(9.6) for all j.
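The identity (9.4) is a telescoping sum, since $P_{j-1} = P_j + Q_j$ gives $(P_{j-1}u)^2 - (P_j u)^2 = (2P_j u + Q_j u)(Q_j u)$. The sketch below checks it pointwise for the Haar projections, where $P_j u$ is the average of $u$ over blocks of $2^j$ samples; the sampling on a grid of $2^n$ points and the function names are assumptions of this illustration.

```python
import numpy as np

def project(u, j):
    """Haar projection P_j u: replace u by its mean over blocks of length 2**j."""
    block = 2 ** j
    means = u.reshape(-1, block).mean(axis=1)
    return np.repeat(means, block)

def square_by_scales(u, n):
    """Right-hand side of (9.4): 2 sum (P_j u)(Q_j u) + sum (Q_j u)^2 + (P_n u)^2."""
    total = project(u, n) ** 2
    for j in range(1, n + 1):
        Pj = project(u, j)
        Qj = project(u, j - 1) - Pj           # Q_j = P_{j-1} - P_j
        total = total + 2.0 * Pj * Qj + Qj * Qj
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.standard_normal(16)
    print(np.max(np.abs(square_by_scales(u, 4) - u ** 2)))
```

The identity holds for any number of scales $n$ up to $\log_2$ of the vector length, not only for the full decomposition.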
Before proceeding further, let us consider an example of (9.4) in the Haar basis.
We have the following explicit relations,
On denoting
we rewrite (9.9) as
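$$ u_0^2(x) = \sum_{j=1}^{n} \sum_{k \in \mathbf{Z}} \hat{d}_k^j h_k^j(x) + \sum_{j=1}^{n} \sum_{k \in \mathbf{Z}} \hat{s}_k^j \chi_k^j(x) + \sum_{k \in \mathbf{Z}} \hat{s}_k^n \chi_k^n(x). \tag{9.11} $$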
We note that if the coefficient djk is zero then there is no need to keep the corresponding
average sjk . In other words, we need to keep averages only near the singularities, i.e.,
where the wavelet coefficients djk ( or products sjk djk ) are significant for a given accuracy.
Finally, to compute the coefficients of the wavelet expansion of the function $u_0^2$, we need to expand the second sum in (9.11) into the wavelet basis. Starting from the scale $j = 1$, we compute the differences and averages $\bar{d}_k^{j+1}$ and $\bar{s}_k^{j+1}$ from $\hat{s}_k^j$. We then add $\bar{s}_k^{j+1}$ to $\hat{s}_k^{j+1}$ before expanding it further. As a result, we compute $\bar{d}_k^j$, $j = 2, \dots, n$ (we set $\bar{d}_k^1 = 0$), and $\bar{s}_k^n$, and obtain
$$ u_0^2(x) = \sum_{j=1}^{n} \sum_{k \in \mathbf{Z}} (\hat{d}_k^j + \bar{d}_k^j) \, h_k^j(x) + \sum_{k \in \mathbf{Z}} (\hat{s}_k^n + \bar{s}_k^n) \, \chi_k^n(x). \tag{9.12} $$
It is clear that the number of operations for computing the Haar expansion of $u_0^2$ is proportional to the number of significant coefficients $d_k^j$ in the wavelet expansion of $u_0$. If the original function is represented by a vector of length $N$, then, in the worst case, the number of operations is proportional to $N$. If the original function is smooth with a finite number of singularities, then the number of operations is proportional to $\log_2 N$.
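The Haar version of the algorithm can be sketched in a few lines. The code below is a non-adaptive illustration (it processes all coefficients rather than only the significant ones) and assumes the orthonormal normalization $\chi_k^j(x) = 2^{-j/2}\chi(2^{-j}x - k)$, for which $(h_k^j)^2 = 2^{-j/2}\chi_k^j$ and $\chi_k^j h_k^j = 2^{-j/2} h_k^j$; the function names are hypothetical.

```python
import numpy as np

def haar_step(s):
    """Averages and differences of neighbouring pairs (orthonormal Haar step)."""
    even, odd = s[0::2], s[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_transform(s0):
    """Details d^1, ..., d^n and coarsest averages s^n of a vector of length 2**n."""
    s = np.asarray(s0, dtype=float)
    details = []
    while len(s) > 1:
        s, d = haar_step(s)
        details.append(d)
    return details, s

def haar_square(s0):
    """Haar coefficients of u0**2 computed scale by scale from those of u0."""
    s = np.asarray(s0, dtype=float)
    n = len(s).bit_length() - 1
    carry = np.zeros(len(s))                  # averages still to be expanded
    details = []
    for j in range(1, n + 1):
        s, d = haar_step(s)                   # s^j, d^j of u0
        s_bar, d_bar = haar_step(carry)       # expand averages carried from scale j-1
        w = 2.0 ** (-j / 2.0)
        details.append(2.0 * w * s * d + d_bar)   # d-hat + d-bar
        carry = w * d * d + s_bar                 # s-hat + s-bar
    return details, carry + 2.0 ** (-n / 2.0) * s * s

if __name__ == "__main__":
    u = np.array([1.0, -2.0, 0.5, 3.0, 0.0, 4.0, -1.5, 2.5])
    d_fast, s_fast = haar_square(u)
    d_ref, s_ref = haar_transform(u ** 2)
    print(max(np.max(np.abs(a - b)) for a, b in zip(d_fast, d_ref)))
```

In an adaptive implementation the products would be formed only where $d_k^j$ is above the threshold, which is what makes the operation count proportional to the number of significant coefficients.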
We now return to the general case and derive an algorithm to expand (9.4) into
the wavelet basis. For each scale j, and for each term in (9.4), we consider the bilinear
mappings,
$$ M_{VW}^j : V_j \times W_j \to L^2(\mathbf{R}) = V_j \bigoplus_{j' \le j} W_{j'}, \tag{9.13} $$
$$ M_{WW}^j : W_j \times W_j \to L^2(\mathbf{R}) = V_j \bigoplus_{j' \le j} W_{j'}, \tag{9.14} $$
and
$$ M_{VV}^n : V_n \times V_n \to L^2(\mathbf{R}) = V_n \bigoplus_{j' \le n} W_{j'}, \tag{9.15} $$
which map a product of two functions into its expansion in the wavelet basis. The choice of the representation of $L^2(\mathbf{R})$, which depends on the scale $j$, turns out to be important for the speed of the algorithm (see footnote 3).
The mappings (9.13)-(9.15) are tabulated by computing the coefficients
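$$ M_{VWW}^{j,j'}(k, k', l) = \int_{-\infty}^{+\infty} \varphi_k^j(x) \, \psi_{k'}^j(x) \, \psi_l^{j'}(x) \, dx, \tag{9.16} $$
$$ M_{WWW}^{j,j'}(k, k', l) = \int_{-\infty}^{+\infty} \psi_k^j(x) \, \psi_{k'}^j(x) \, \psi_l^{j'}(x) \, dx, \tag{9.17} $$
$$ M_{VVW}^{j,j'}(k, k', l) = \int_{-\infty}^{+\infty} \varphi_k^j(x) \, \varphi_{k'}^j(x) \, \psi_l^{j'}(x) \, dx, \tag{9.18} $$
and
$$ M_{VWV}^{j}(k, k', l) = \int_{-\infty}^{+\infty} \varphi_k^j(x) \, \psi_{k'}^j(x) \, \varphi_l^j(x) \, dx, \tag{9.19} $$
$$ M_{WWV}^{j}(k, k', l) = \int_{-\infty}^{+\infty} \psi_k^j(x) \, \psi_{k'}^j(x) \, \varphi_l^j(x) \, dx, \tag{9.20} $$
$$ M_{VVV}^{j}(k, k', l) = \int_{-\infty}^{+\infty} \varphi_k^j(x) \, \varphi_{k'}^j(x) \, \varphi_l^j(x) \, dx, \tag{9.21} $$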
where $j' \le j$. It is clear that the coefficients are identically zero for $|k - k'| > k_0$, where $k_0$ depends on the overlap of the supports of the basis functions.

Footnote 3: There might be fewer significant coefficients if we write, for example, $L^2(\mathbf{R}) = V_{j-j_0} \bigoplus_{j' \le j-j_0} W_{j'}$ in (9.13)-(9.15) with some $j_0 \ge 1$.

The number of coefficients which need to be stored may be reduced further by observing that, for
example,
$$ M_{WWW}^{j,j'}(k, k', l) = 2^{-j'/2} \int_{-\infty}^{+\infty} \psi_0^{j-j'}(x) \, \psi_{k-k'}^{j-j'}(x) \, \psi_{2^{j-j'}k-l}^{0}(x) \, dx, \tag{9.22} $$
so that
$$ M_{WWW}^{j,j'}(k, k', l) = 2^{-j'/2} \, \tilde{M}_{WWW}^{j-j'}(k - k', 2^{j-j'}k - l). \tag{9.23} $$
However, the most significant reduction in the number of coefficients is a consequence of the fact that the coefficients in (9.16)-(9.18) decay as the distance $r = j - j'$ between the scales increases. For example, rewriting (9.23) as
$$ \tilde{M}_{WWW}^{r}(k - k', 2^r k - l) = 2^{-r} \int_{-\infty}^{+\infty} \psi(2^{-r}x) \, \psi(2^{-r}x - k + k') \, \psi(x - 2^r k + l) \, dx, \tag{9.24} $$
and remembering that the regularity of the product $\psi(2^{-r}x) \, \psi(2^{-r}x - k + k')$ increases linearly with the number of vanishing moments of the function $\psi$, we obtain
$$ |\tilde{M}_{WWW}^{r}(k - k', 2^r k - l)| \le C \, 2^{-r\lambda M} \tag{9.25} $$
with some $\lambda$ (see [19], [21]). It might be possible to estimate $\lambda$ in (9.25), but since the coefficients in (9.16)-(9.18) are computed numerically, the rate of decay and the number of significant coefficients may be measured directly. To summarize, for a given precision, only the coefficients with labels $r = j - j'$ such that $0 \le r \le j_0$, where $j_0$ is a small constant, need to be stored.
Evaluating the mappings (9.13)-(9.15), we arrive at the following representation of $u_0^2$,
$$ u_0^2(x) = \sum_{j=1}^{n} \sum_{k \in \mathbf{Z}} \hat{d}_k^j \psi_k^j(x) + \sum_{j=1}^{n} \sum_{k \in \mathbf{Z}} \hat{s}_k^j \varphi_k^j(x), \tag{9.26} $$
where $\hat{d}_k^j$, $\hat{s}_k^j$ are obtained by adding contributions (on the appropriate scales) of the results of evaluating (9.13)-(9.15) for all scales $j$ in (9.4).
The next stage of the algorithm is the expansion of the second term in (9.26) into
the wavelet basis and is similar to that for the Haar basis.
Let us evaluate the complexity of computing $u_0^2$ in the worst case. We assume that the original function $u_0$ is represented by a vector of length $N = 2^n$ and that its expansion into the wavelet basis also requires order $N$ significant entries of the vector $d_k^j$. In the process of evaluating the mappings (9.13)-(9.15), each coefficient $d_k^j$ ($j$ and $k$ are fixed) will combine with the coefficients $d_{k'}^j$ (or $s_{k'}^j$), where $|k - k'| \le k_0$, to produce a non-zero contribution on scales $j'$, $0 \le j - j' \le j_0$. Since $j_0$ and $k_0$ are (small) constants and are independent of the size $N$, the total number of operations is proportional to the number of significant coefficients $d_k^j$, which is $O(N)$ in this case.
If the number of significant coefficients $d_k^j$ is proportional to the number of scales, $\log_2 N$, so will be the number of operations required to evaluate the mappings (9.13)-(9.15). It is also clear that it is necessary to store only those averages $s_{k'}^j$ which combine with the significant coefficients $d_k^j$ to produce a non-zero contribution. Therefore, it is sufficient to store only those $s_{k'}^j$ for which there exists a coefficient $d_k^j$ such that $|k - k'| \le k_0$ and the product $s_{k'}^j d_k^j$ is above the threshold of accuracy. This means that we need to store averages only in the neighborhood of singularities.
The number of operations for expanding the second term in (9.26) into the wavelet basis is proportional to the number of significant entries, and the estimate is completely similar to that for the Haar basis.
Remark. The algorithm for evaluating $F(u) = u^2$ in the wavelet basis allows us to evaluate the product of two functions, since $uv = \frac{1}{4}[(u + v)^2 - (u - v)^2]$.
Expanding the function $F$ in the Taylor series at the point $(x + y)/2$, we have
$$ F(x) - F(y) = \sum_{k=1}^{\infty} \frac{1}{(2k-1)! \, 2^{2k-2}} \, F^{(2k-1)}\!\left( \frac{x + y}{2} \right) (x - y)^{2k-1}. \tag{9.28} $$
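The expansion (9.28) contains only odd derivatives at the midpoint, and its coefficients can be checked numerically. The sketch below evaluates partial sums of (9.28) for a function whose derivatives are supplied by the caller; the function name and the test functions are choices made for this illustration.

```python
import math

def midpoint_series(deriv_at_mid, x, y, n_terms):
    """Partial sum of (9.28); deriv_at_mid(p) must return F^(p)((x + y)/2)."""
    return sum(deriv_at_mid(2 * k - 1) * (x - y) ** (2 * k - 1)
               / (math.factorial(2 * k - 1) * 2 ** (2 * k - 2))
               for k in range(1, n_terms + 1))

if __name__ == "__main__":
    x, y = 1.3, 0.7
    mid = 0.5 * (x + y)
    # F = exp: every derivative at the midpoint equals exp(mid)
    approx = midpoint_series(lambda p: math.exp(mid), x, y, 8)
    print(approx, math.exp(x) - math.exp(y))
    # F(u) = u**2: the first term alone is exact
    one_term = midpoint_series(lambda p: 2.0 * mid if p == 1 else 0.0, x, y, 1)
    print(one_term, x * x - y * y)
```

The second term of the series carries the factor $1/(3! \, 2^2) = 1/24$, and for $F(u) = u^2$ the first term alone already reproduces $x^2 - y^2$.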
Let us write out several terms of this formula explicitly. If we keep only the first term, we obtain
$$ F(u_0) - F(u_n) \approx \sum_{j=1}^{n} F^{(1)}\!\left( P_j u + \tfrac{1}{2} Q_j u \right) (Q_j u). \tag{9.30} $$
If $F(u) = u^2$, then formula (9.30) is exact and we obtain (9.3). If we keep two terms, we obtain
$$ F(u_0) - F(u_n) \approx \sum_{j=1}^{n} \left[ F^{(1)}\!\left( P_j u + \tfrac{1}{2} Q_j u \right) (Q_j u) + \frac{1}{24} F^{(3)}\!\left( P_j u + \tfrac{1}{2} Q_j u \right) (Q_j u)^3 \right]. \tag{9.31} $$
We note that there is no term with the second derivative of $F$ in (9.31) (there are no even derivatives in (9.29)). Using (9.30) and considering the remainder of the series in (9.29) as an error term, we obtain the results of J. M. Bony. The error term, however, will be slightly smoother than in Bony’s results, since the remainder has a factor $(Q_j u)^3$ instead of $(Q_j u)^2$. We may also keep more terms to make the remainder arbitrarily smooth.
It is not clear at this point, if there is an advantage in computing F (u) via (9.29),
or a repeated application of the algorithm for F (u) = u2 is sufficient to compute various
functions F . However, there are several analytic advantages in considering (9.29). In
particular, by using theorems characterizing the functional spaces, for example the Hölder
spaces C s (R), in terms of their wavelet coefficients (see [33]), the series (9.29) may be
truncated for given precision.
Finally, we remark that the adaptive numerical algorithms based on the expansion
in (9.29) are novel and we expect to develop new methods for solving P.D.E.’s using these
algorithms.
References
[1] B. Alpert. Sparse representation of smooth linear operators. Ph.D. thesis, Yale
University, 1990.
[2] B. Alpert, G. Beylkin, R. R. Coifman, and V. Rokhlin. Wavelets for the fast solu-
tion of second-kind integral equations. Technical report, Department of Computer
Science, Yale University, New Haven, CT, 1990.
[3] B. Alpert and V. Rokhlin. A fast algorithm for the evaluation of Legendre expansions.
SIAM J. on Sci. Stat. Comput., 12(1):158–179, 1991. Yale University Technical
Report, YALEU/DCS/RR-671 (1989).
[6] G. Beylkin, R. R. Coifman, and V. Rokhlin. Fast wavelet transforms and numerical
algorithms II. in progress.
[7] G. Beylkin, R. R. Coifman, and V. Rokhlin. Fast wavelet transforms and numerical
algorithms I. Comm. Pure and Appl. Math., 44:141–183, 1991. Yale University
Technical Report YALEU/DCS/RR-696, August 1989.
[9] J. M. Bony. Interaction des singularités pour les équations aux dérivées partielles
non-linéaires. Sém. e.d.p., 1979/80, 22, 1981/82, 2 et 1983/84, 10, Centre de
Mathématique, Ecole Polytechnique, 91128-Palaiseau, France.
[10] J. M. Bony. Calcul symbolique et propagation des singularités pour les équations
aux dérivées partielles non-linéaires. Ann. Scient. E.N.S., 14:209–246, 1981.
[11] J. M. Bony. Propagation et interaction des singularités pour les solutions des
équations aux dérivées partielles non-linéaires. In Proceedings of the International
Congress of Mathematicians, Warszawa, pages 1133–1147, 1983.
[12] P. J. Burt and E. H. Adelson. The Laplacian pyramid as a compact image code.
IEEE Trans. Communications, 31(4):532–540, Apr. 1983.
[13] J. Carrier, L. Greengard, and V. Rokhlin. A fast adaptive multipole algorithm for
particle simulations. SIAM Journal of Scientific and Statistical Computing, 9(4),
1988. Yale University Technical Report, YALEU/DCS/RR-496 (1986).
[16] R. R. Coifman and Y. Meyer. Nouvelles bases orthonormées de L²(R) ayant la structure
du système de Walsh. 1989. preprint.
[17] R. R. Coifman and Y. Meyer. Nouvelles bases orthogonales. C.R. Acad. Sci., Paris,
1990.
[21] I. Daubechies and J. Lagarias. Two scale difference equations: I-II. SIAM J. Math.
Anal., 1990.
[23] L. Greengard. Potential flow in channels. SIAM J. Sci. Stat. Comput., 11(4):603–
620, 1990.
[24] L. Greengard and V. Rokhlin. A fast algorithm for particle simulations. J. Comp.
Phys., 73(1):325–348, 1987.
[28] S. Mallat. Review of multifrequency channel decomposition of images and wavelet
models. Technical Report 412, Courant Institute of Mathematical Sciences, New
York University, 1988.
[31] Y. Meyer. Ondelettes et fonctions splines. Technical report, séminaire edp, Ecole
Polytechnique, Paris, France, 1986.
[32] Y. Meyer. Wavelets and operators. In E. Berkson, N. T. Peck, and J. Uhl, editors,
Analysis at Urbana. London Math. Society, Lecture Notes Series 137, 1989. v.1.
[34] S. T. O’Donnell and V. Rokhlin. A fast algorithm for the numerical evaluation of con-
formal mappings. SIAM J. Sci. Stat. Comput., 10(3):475–487, 1989. Yale University
Technical Report, YALEU/DCS/RR-554 (1987).
[36] G. Schulz. Iterative Berechnung der reziproken Matrix. Z. Angew. Math. Mech.,
13:57–59, 1933.
[40] R. C. Ward. Numerical computation of the matrix exponential with accuracy esti-
mates. SIAM. J. Numer. Anal., 14(4):600–610, 1977.