Wavelet transforms versus Fourier transforms
Gilbert Strang
Figure 1. Scaling function φ(x), wavelet W (x), and the next level of
detail.
Already you see the two essential operations: translation and dilation. The step
from W (2x) to W (2x − 1) is translation. The step from W (x) to W (2x) is dilation.
Starting from a single function, the graphs are shifted and compressed. The next
level contains W (4x), W (4x − 1), W (4x − 2), W (4x − 3). Each is supported on an
interval of length ¼. In the end we have Haar’s infinite family of functions:
$$y = W_4\,b \quad\text{is}\quad \begin{bmatrix} 9\\ 1\\ 2\\ 0 \end{bmatrix} = \begin{bmatrix} 1&1&1&0\\ 1&1&-1&0\\ 1&-1&0&1\\ 1&-1&0&-1 \end{bmatrix} \begin{bmatrix} 3\\ 2\\ 4\\ 1 \end{bmatrix}.$$
† Rademacher was first to propose an orthogonal family of ±1 functions; it was not complete.
After Walsh constructed a complete set, Rademacher’s Part II was regrettably unpublished and
seems to be lost (but Schur saw it).
$$W_4^{-1} = \frac{1}{4}\begin{bmatrix} 1&1&1&1\\ 1&1&-1&-1\\ 2&-2&0&0\\ 0&0&2&-2 \end{bmatrix} \quad\text{and}\quad F_4^{-1} = \frac{1}{4}\begin{bmatrix} 1&1&1&1\\ 1&(-i)&(-i)^2&(-i)^3\\ 1&(-i)^2&(-i)^4&(-i)^6\\ 1&(-i)^3&(-i)^6&(-i)^9 \end{bmatrix}.$$
The essential point is that the inverse matrices have the same form as the originals.
If we can transform quickly, we can invert quickly — between coefficients and
function values. The Fourier coefficients come from values at n points. The Haar
coefficients come from values on n subintervals.
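As a concrete check, here is a minimal sketch (Python with NumPy assumed; not from the original paper) that the displayed W4 and its inverse really do pass between function values y and Haar coefficients b:

```python
import numpy as np

# The Haar matrix W4 and its inverse, as displayed above
W4 = np.array([[1,  1,  1,  0],
               [1,  1, -1,  0],
               [1, -1,  0,  1],
               [1, -1,  0, -1]])
W4_inv = np.array([[1,  1,  1,  1],
                   [1,  1, -1, -1],
                   [2, -2,  0,  0],
                   [0,  0,  2, -2]]) / 4

b = np.array([3, 2, 4, 1])
y = W4 @ b                  # function values y = W4 b = (9, 1, 2, 0)
print(W4_inv @ y)           # recovers the coefficients b = (3, 2, 4, 1)
```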
For the FFT, each step requires ½n multiplications (as shown below). For the Fast
Wavelet Transform, the cost of each successive step is cut in half. It is a beautiful
“pyramid scheme” created by Burt and Adelson and Mallat and others. The total
cost has a factor 1 + ½ + ¼ + · · · that stays below 2. This is why the final outcome
for the FWT is O(n) without the logarithm ℓ.
The matrix factorizations are so simple, especially for n = 4, that it seems
worthwhile to display them. The FFT has two copies of the half-size transform F2
in the middle:
$$(1)\qquad F_4 = \begin{bmatrix} 1&&1&\\ &1&&i\\ 1&&-1&\\ &1&&-i \end{bmatrix} \begin{bmatrix} 1&1&&\\ 1&i^2&&\\ &&1&1\\ &&1&i^2 \end{bmatrix} \begin{bmatrix} 1&&&\\ &&1&\\ &1&&\\ &&&1 \end{bmatrix}.$$
The permutation on the right puts the even a’s (a0 and a2 ) ahead of the odd a’s
(a1 and a3 ). Then come separate half-size transforms on the evens and odds. The
matrix at the left combines these two half-size outputs in a way that produces the
correct full-size answer. By multiplying those three matrices we recover F4 .
The factorization of W4 is a little different:
$$(2)\qquad W_4 = \begin{bmatrix} 1&1&&\\ 1&-1&&\\ &&1&1\\ &&1&-1 \end{bmatrix} \begin{bmatrix} 1&&&\\ &&1&\\ &1&&\\ &&&1 \end{bmatrix} \begin{bmatrix} 1&1&&\\ 1&-1&&\\ &&1&\\ &&&1 \end{bmatrix}.$$
At the next level of detail (for W8 ), the same 2 by 2 matrix appears four times in
the left factor. The permutation matrix puts columns 0, 2, 4, 6 of that factor ahead
of 1, 3, 5, 7. The third factor has W4 in one corner and I4 in the other corner (just
as W4 above ends with W2 and I2 — this factorization is the matrix form of the
pyramid algorithm). It is the identity matrices I4 and I2 that save multiplications.
Altogether W2 appears 4 times at the left of W8 , then 2 times at the left of W4 , and
then once at the right. The multiplication count from these n − 1 small matrices is
O(n) — the Holy Grail of complexity theory.
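To see the pyramid in code: a minimal sketch of the Haar fast wavelet transform (Python with NumPy assumed). Each pass averages and differences adjacent pairs and recurses on the averages, so the work halves at every level and the total stays below 2n operations:

```python
import numpy as np

def haar_fwt(y):
    """Haar pyramid: average and difference pairs, recurse on averages."""
    y = np.asarray(y, dtype=float)
    output = []
    while len(y) > 1:
        details = (y[0::2] - y[1::2]) / 2   # wavelet coefficients, this level
        y = (y[0::2] + y[1::2]) / 2         # coarse version, half the length
        output.append(details)
    output.append(y)                         # overall average (the V0 part)
    return output

# For y = (9, 1, 2, 0) this returns details (4, 1), then (2), then the
# average (3) -- the coefficients b = (3, 2, 4, 1) of the earlier example.
print(haar_fwt([9, 1, 2, 0]))
```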
Walsh would have another copy of the 2 by 2 matrix in the last corner, instead
of I2 . Now the product has orthogonal columns with all entries ±1 — the Walsh
basis. Allowing W2 or I2 , W4 or I4 , W8 or I8 , . . . in the third factors, the matrix
products exhibit a whole family of orthogonal bases. This is a wavelet packet, with
great flexibility. Then a “best basis” algorithm aims for a choice that concentrates
most of f into a few basis vectors. That is the goal — to compress information.
The same principle of factorization applies for any power of 2, say n = 1024. For
Fourier, the entries of F are powers of ω = e2πi/1024 . The row and column indices
go from 0 to 1023 instead of 1 to 1024. The zeroth row and column are filled with
ω^0 = 1. The entry in row j, column k of F is ω^{jk}. This is the term e^{ikx} evaluated
at x = 2πj/1024. The multiplication F1024 a computes the series Σ ak ω^{jk} for j = 0
to 1023.
The key to the matrix factorization is just this. Squaring the 1024th root of
unity gives the 512th root: (ω^2)^512 = 1. This was the reason behind the middle
factor in (1), where i is the fourth root and i^2 is the square root. It is the essential
link between F1024 and F512 . The first stage of the FFT is the great factorization

$$(3)\qquad F_{1024} = \begin{bmatrix} I_{512} & D_{512}\\ I_{512} & -D_{512} \end{bmatrix} \begin{bmatrix} F_{512} & \\ & F_{512} \end{bmatrix} \begin{bmatrix} \text{even-odd}\\ \text{shuffle} \end{bmatrix}.$$

I512 is the identity matrix. D512 is the diagonal matrix with entries (1, ω, . . . , ω^511),
requiring about 512 multiplications. The two copies of F512 in the middle give a
matrix only half full compared to F1024 — here is the crucial saving. The shuffle
separates the incoming vector a into (a0 , a2 , . . . , a1022 ) with even indices and the
odd part (a1 , a3 , . . . , a1023 ).
Equation (3) is an imitation of equation (1), eight levels higher. Both are easily
verified. Computational linear algebra has become a world of matrix factorizations,
and this one is outstanding.
You have anticipated what comes next. Each F512 is reduced in the same way
to two half-size transforms F = F256 . The work is cut in half again, except for an
additional 512 multiplications from the diagonal matrices D = D256 :
$$(4)\qquad \begin{bmatrix} F_{512} & \\ & F_{512} \end{bmatrix} = \begin{bmatrix} I & D & & \\ I & -D & & \\ & & I & D \\ & & I & -D \end{bmatrix} \begin{bmatrix} F & & & \\ & F & & \\ & & F & \\ & & & F \end{bmatrix} \begin{bmatrix} \text{even-odd gives 0 and 2 mod 4}\\ \text{even-odd gives 1 and 3 mod 4} \end{bmatrix}.$$
For n = 1024 there are ℓ = 10 levels, and each level has ½n = 512 multiplications
from the first factor — to reassemble the half-size outputs from the level below.
Those D’s yield the final count ½nℓ.
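The recursion is short enough to write out. A sketch in Python (NumPy assumed; note the paper’s convention ω = e^{2πi/n}, the conjugate of NumPy’s):

```python
import numpy as np

def fft(a):
    """Radix-2 FFT following the factorization (3)-(4); len(a) a power of 2."""
    n = len(a)
    if n == 1:
        return np.asarray(a, dtype=complex)
    even = fft(a[0::2])                              # F_{n/2} on the even part
    odd = fft(a[1::2])                               # F_{n/2} on the odd part
    D = np.exp(2j * np.pi * np.arange(n // 2) / n)   # diag(1, w, ..., w^{n/2-1})
    return np.concatenate([even + D * odd,           # the [I  D] rows
                           even - D * odd])          # the [I -D] rows

a = np.random.rand(1024)
# numpy's fft uses the opposite sign of omega, so compare conjugates
assert np.allclose(fft(a), np.conj(np.fft.fft(a)))
```

The n/2 multiplications D * odd at each of the log2 n levels give exactly the ½nℓ count.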
In practice, ℓ = log2 n is controlled by splitting the signal into smaller blocks.
With n = 8, the scale length of the transform is closer to the scale length of
most images. This is the short time Fourier transform, which is the transform
of a “windowed” function wf . The multiplier w is the characteristic function of
the window. (Smoothing is necessary! Otherwise this blocking of the image can
be visually unacceptable. The ridges of fingerprints are broken up very badly,
and windowing was unsuccessful in tests by the FBI.) In other applications the
implementation may favor the FFT — theoretical complexity is rarely the whole
story.
A more gradual exposition of the Fourier matrix and the FFT is in the mono-
graphs [3, 4] and the textbooks [5, 6] — and in many other sources [see 7]. (The
lower level text [8] is intended more for reference than for teaching. On the other
hand, this is just a matrix–vector multiplication!) FFT codes are freely available
on netlib, and generally each machine has its own special software.
For higher-order wavelets, the FWT still involves many copies of a single small
matrix. The entries of this matrix are coefficients ck from the “dilation equation”.
We move from fast algorithms to a quite different part of mathematics — with the
goal of constructing new orthogonal bases. The basis functions are unusual, for a
good reason.
Approximation by piecewise constant functions is similar to the rectangle rule for integration, or Euler’s method for a differential
equation, or forward differences ∆y/∆x as estimates of dy/dx. Each is a simple and
natural first approach, but inadequate in the end. Through all of scientific comput-
ing runs this common theme: Increase the accuracy at least to second order. What
this means is: Get the linear term right.
For integration, we move to the trapezoidal rule and midpoint rule. For deriva-
tives, second-order accuracy comes with centered differences. The whole point of
Newton’s method for solving f (x) = 0 is to follow the tangent line. All these are
exact when f is linear. For wavelets to be accurate, W (x) and φ(x) need the same
improvement. Every ax + b must be a linear combination of translates.
Piecewise polynomials (splines and finite elements) are often based on the “hat”
function — the integral of Haar’s W (x). But this piecewise linear function does not
produce orthogonal wavelets with a local basis. The requirement of orthogonality to
dilations conflicts strongly with the demand for compact support — so much so that
it was originally doubted whether one function could satisfy both requirements and
still produce ax + b. It was the achievement of Ingrid Daubechies [9] to construct
such a function.
We now outline the construction of wavelets. The reader will understand that
we only touch on parts of the theory and on selected applications. An excellent
account of the history is in [10]. Meyer and Lemarié describe the earliest wavelets
(including Gabor’s). Then comes the beautiful pattern of multiresolution analysis
uncovered by Mallat — which is hidden by the simplicity of the Haar basis. Mallat’s
analysis found expression in the Daubechies wavelets.
Begin on the interval [0, 1]. The space V0 spanned by φ(x) is orthogonal to the
space W0 spanned by W (x). Their sum V1 = V0 ⊕ W0 consists of all piecewise
constant functions on half-intervals. A different basis for V1 is φ(2x) = ½(φ(x) +
W (x)) and φ(2x − 1) = ½(φ(x) − W (x)). Notice especially that V0 ⊂ V1 . The
function φ(x) is a combination of φ(2x) and φ(2x−1). This is the dilation equation,
for Haar’s example.
Now extend that pattern to the spaces Vj and Wj of dimension 2^j .
The next space V2 is spanned by φ(4x), φ(4x − 1), φ(4x − 2), φ(4x − 3). It contains
all piecewise constant functions on quarter-intervals. That space was also spanned
by the four functions φ(x), W (x), W (2x), W (2x − 1) at the start of this paper.
Therefore, V2 decomposes into V1 and W1 just as V1 decomposes into V0 and W0 :
(5) V2 = V1 ⊕ W1 = V0 ⊕ W0 ⊕ W1 .
At every level, the wavelet space Wj is the “difference” between Vj+1 and Vj :
(6) Vj+1 = Vj ⊕ Wj = V0 ⊕ W0 ⊕ · · · ⊕ Wj .
The translates of wavelets on the right are also translates of scaling functions on
the left. For the construction of wavelets, this offers a totally different approach.
Instead of creating W (x) and the spaces Wj , we can create φ(x) and the spaces
Vj . It is a choice between the terms Wj of an infinite series or their partial sums
Vj . Historically the constructions began with W (x). Today the constructions begin
with φ(x). It has proved easier to work with sums than differences.
A first step is to change from [0, 1] to the whole line R. The translation index k
is unrestricted. The subspaces Vj and Wj are infinite-dimensional (L2 closures of
translates). One basis for L2 (R) consists of φ(x − k) and Wjk (x) = W (2^j x − k)
with j ≥ 0, k ∈ Z. Another basis contains all Wjk with j, k ∈ Z. Then the dilation
index j is also unrestricted — for j = −1 the functions φ(2^−1 x − k) are constant
on intervals of length 2. The decomposition into Vj ⊕ Wj continues to hold! The
sequence of closed subspaces Vj has the following basic properties for −∞ < j < ∞:
$$V_j \subset V_{j+1} \quad\text{and}\quad \bigcap V_j = \{0\} \quad\text{and}\quad \bigcup V_j \ \text{is dense in}\ L^2(\mathbf R);$$
$$f(x)\ \text{is in}\ V_j\ \text{if and only if}\ f(2x)\ \text{is in}\ V_{j+1};$$
$$V_0\ \text{has an orthogonal basis of translates}\ \phi(x-k),\ k \in \mathbf Z.$$
These properties yield a “multiresolution analysis” — the pattern that other wavelets
will follow. Vj will be spanned by φ(2^j x − k). Wj will be its orthogonal comple-
ment in Vj+1 . Mallat proved, under mild hypotheses, that Wj is also spanned by
translates [11]; these are the wavelets.
Dilation is built into multiresolution analysis by the property that f (x) ∈ Vj ⇔
f (2x) ∈ Vj+1 . This applies in particular to φ(x). It must be a combination of
translates of φ(2x). That is the hidden pattern, which has become central to this
subject. We have reached the dilation equation.
$$(7)\qquad \phi(x) = \sum_{k=0}^{N} c_k\, \phi(2x-k).$$
The coefficients for Haar are c0 = c1 = 1. The box function φ is the sum of two
half-width boxes. That is equation (7). Then W is a combination of the same
translates (because W0 ⊂ V1 ). The coefficients for W = φ(2x) − φ(2x − 1) are 1
and −1. It is absolutely remarkable that W uses the same coefficients as φ, but in
reverse order and with alternating signs:
$$(8)\qquad W(x) = \sum_{k=1-N}^{1} (-1)^k\, c_{1-k}\, \phi(2x-k).$$
This construction makes W orthogonal to φ and its translates. (For those trans-
lates to be orthogonal to each other, see below.) The key is that every vector
c0 , c1 , c2 , c3 is automatically orthogonal to c3 , −c2 , c1 , −c0 and all even translates
like 0, 0, c3 , −c2 .
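In detail, the inner products cancel in pairs:

$$(c_0, c_1, c_2, c_3)\cdot(c_3, -c_2, c_1, -c_0) = c_0c_3 - c_1c_2 + c_2c_1 - c_3c_0 = 0,$$
$$(c_0, c_1, c_2, c_3)\cdot(0, 0, c_3, -c_2) = c_2c_3 - c_3c_2 = 0.$$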
When N is odd, c_{1−k} can be replaced in (8) by c_{N−k} . This shift by N − 1 is
even. Then the sum goes from 0 to N and W (x) looks especially attractive.
Everything hinges on the c’s. They dominate all that follows. They determine
(and are determined by) φ, they determine W , and they go into the matrix fac-
torization (2). In the applications, convolution with φ is an averaging operator —
it produces smooth functions (and a blurred picture). Convolution with W is a
differencing operator, which picks out details.
The convolution of the box with itself is the piecewise linear hat function —
equal to 1 at x = 1 and supported on the interval [0, 2]. It satisfies the dilation
equation with c0 = ½, c1 = 1, c2 = ½. But there is a condition on the c’s in
order that the wavelet basis W (2^j x − k) shall be orthogonal. The three coefficients
½, 1, ½ do not satisfy that condition. Daubechies found the unique c0 , c1 , c2 , c3 (four
coefficients are necessary) to give orthogonality plus second-order approximation.
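For concreteness (the explicit values are standard, though this numerical aside is not in the surrounding text), those coefficients are

$$c_0, c_1, c_2, c_3 \;=\; \frac{1+\sqrt3}{4},\ \frac{3+\sqrt3}{4},\ \frac{3-\sqrt3}{4},\ \frac{1-\sqrt3}{4}.$$

They satisfy Σ ck = 2 and Σ ceven = Σ codd = 1, the orthogonality condition on the double shift,

$$c_0c_2 + c_1c_3 = \frac{(1+\sqrt3)(3-\sqrt3) + (3+\sqrt3)(1-\sqrt3)}{16} = \frac{2\sqrt3 - 2\sqrt3}{16} = 0,$$

and the second-order condition Σ (−1)^k k ck = 0.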
Then the question becomes: How to solve the dilation equation?
Note added in proof. A new construction has just appeared that uses two scal-
ing functions φi and wavelets Wi . Their translates are still orthogonal [38]. The
combination φ1 (x) + φ1 (x − 1) + φ2 (x) is the hat function, so second-order accuracy
is achieved. The remarkable property is that these are “short functions”: φ1 is
supported on [0, 1] and φ2 on [0, 2]. They satisfy a matrix dilation equation.
These short wavelets open new possibilities for application, since the greatest
difficulties are always at boundaries. The success of the finite element method
is largely based on the very local character of its basis functions. Splines have
longer support (and more smoothness), wavelets have even longer support (and
orthogonality). The translates of a long basis function overrun the boundary.
There are two principal methods to solve dilation equations. One is by Fourier
transform, the other is by matrix products. Both give φ as a limit, not as an explicit
function. We never discover the exact value φ(√2). It is amazing to compute with
a function we do not know — but the applications only require the c’s. When
complicated functions come from a simple rule, we know from increasing experience
what to do: Stay with the simple rule.
Solution of the dilation equation by Fourier transform. Without the “2”
we would have an ordinary difference equation — entirely familiar. The presence
of two scales, x and 2x, is the problem. A warning comes from Weierstrass and de
Rham and Takagi — their nowhere differentiable functions are all built on multiple
scales like Σ a^n cos(b^n x). The Fourier transform easily handles translation by k in
equation (7), but 2x in physical space becomes ξ/2 in frequency space:
$$(9)\qquad \hat\phi(\xi) = \frac12 \sum c_k\, e^{ik\xi/2}\, \hat\phi\!\left(\frac{\xi}{2}\right) = P\!\left(\frac{\xi}{2}\right) \hat\phi\!\left(\frac{\xi}{2}\right).$$
The “symbol” is P(ξ) = ½ Σ ck e^{ikξ}. With ξ = 0 in (9) we find P(0) = 1 or
Σ ck = 2 — the first requirement on the c’s. This allows us to look for a solution
normalized by φ̂(0) = ∫ φ(x) dx = 1. It does not ensure that we find a φ that is
continuous or even in L1 . What we do find is an infinite product, by recursion from
ξ/2 to ξ/4 and onward:
$$\hat\phi(\xi) = P\!\left(\tfrac{\xi}{2}\right)\hat\phi\!\left(\tfrac{\xi}{2}\right) = P\!\left(\tfrac{\xi}{2}\right)P\!\left(\tfrac{\xi}{4}\right)\hat\phi\!\left(\tfrac{\xi}{4}\right) = \cdots = \prod_{j=1}^{\infty} P\!\left(\tfrac{\xi}{2^j}\right).$$
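As a sketch of this recursion in code (Python with NumPy; the four Daubechies coefficients quoted earlier, and an arbitrary truncation depth):

```python
import numpy as np

s3 = np.sqrt(3.0)
c = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / 4.0   # sum = 2, so P(0) = 1

def P(xi):
    """The symbol P(xi) = (1/2) sum_k c_k e^{ik xi}."""
    return 0.5 * sum(ck * np.exp(1j * k * xi) for k, ck in enumerate(c))

def phi_hat(xi, levels=30):
    """Truncation of the infinite product prod_{j>=1} P(xi / 2^j)."""
    result = 1.0 + 0j
    for j in range(1, levels + 1):
        result *= P(xi / 2 ** j)
    return result

print(phi_hat(0.0))     # = 1, the normalization integral of phi
```

The truncation is harmless because P(ξ/2^j) → P(0) = 1.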
Solution by matrix products [12, 13]. When φ is known at the integers, the
dilation equation gives φ at half-integers such as x = 3/2. Since 2x − k is an integer,
we just evaluate Σ ck φ(2x − k). Then the equation gives φ at quarter-integers as
combinations of φ at half-integers. The combinations are built into the entries of
two matrices A and B, and the recursion is taking their products.
To start we need φ at the integers. With N = 3, for example, set x = 1 and
x = 2 in the dilation equation:

$$\phi(1) = c_1\,\phi(1) + c_0\,\phi(2), \qquad \phi(2) = c_3\,\phi(1) + c_2\,\phi(2).$$

Thus (φ(1), φ(2)) is the eigenvector of a 2 by 2 matrix for the eigenvalue λ = 1. Collect the values on consecutive intervals into v(x) = (φ(x), φ(x + 1), φ(x + 2)) for 0 ≤ x ≤ 1.
The equation turns out to be v(x) = Av(2x) for 0 ≤ x ≤ ½ and v(x) = Bv(2x − 1)
for ½ ≤ x ≤ 1. By recursion this yields v at any dyadic point — whose binary
expansion is finite. Each 0 or 1 in the expansion decides between A and B. For
example, x = ¼ = .01 gives v(¼) = Av(½) = ABv(0).
Important: The matrix B has entries c_{2i−j} . So does A, when the indexing starts
with i = j = 0. The dilation equation itself is φ = Cφ, with an operator C of this
new kind. Without the 2 it would be a Toeplitz operator, constant along each diag-
onal, but now every other row is removed. Engineers call it “convolution followed
by decimation”. (The word downsampling is also used — possibly a euphemism
for decimation.) Note that the derivative of the dilation equation is φ′ = 2Cφ′ .
Successive derivatives introduce powers of 2. The eigenvalues of these operators C
are 1, ½, ¼, . . . , until φ^(n) is not defined in the space at hand. The sum condition
Σ ceven = Σ codd = 1 is always imposed — it assures in Condition A1 below that
we have first-order approximation at least.
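Here is a minimal sketch of the whole procedure (Python with NumPy; the Daubechies coefficients again, and ten levels as an arbitrary stopping point). Step 1 finds φ at the integers from the eigenvector for λ = 1; step 2 applies the dilation equation to pass from each dyadic grid to the next:

```python
import numpy as np

s3 = np.sqrt(3.0)
c = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / 4.0

# Step 1: (phi(1), phi(2)) is the eigenvector of [[c1, c0], [c3, c2]]
# for eigenvalue 1; phi vanishes at the endpoints 0 and 3.
M = np.array([[c[1], c[0]], [c[3], c[2]]])
eigvals, eigvecs = np.linalg.eig(M)
v = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
phi = np.array([0.0, v[0], v[1], 0.0]) / v.sum()   # phi at x = 0, 1, 2, 3

# Step 2: refine.  phi on the grid of spacing h = 2**-level comes from
# phi(x) = sum_k c_k phi(2x - k), where 2x - k lies on the grid of 2h.
for level in range(1, 11):
    new = np.zeros(3 * 2 ** level + 1)
    for i in range(len(new)):
        for k in range(4):
            j = i - k * 2 ** (level - 1)     # old-grid index of 2x - k
            if 0 <= j < len(phi):
                new[i] += c[k] * phi[j]
    phi = new
# phi now samples the scaling function at spacing 2**-10 on [0, 3]
```

The values at dyadic points are exact once the integer values are; each level only reuses the dilation equation.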
When x is not a dyadic point p/2n , the recursion in (11) does not termi-
nate. The binary expansion x = .0100101 . . . corresponds to an infinite product
ABAABAB . . . . The convergence of such a product is by no means assured. It is
a major problem to find a direct test on the c’s that is equivalent to convergence —
for matrix products in every order. We briefly describe what is known for arbitrary
A and B.
For a single matrix A, the growth of the powers A^n is governed by the spectral
radius ρ(A) = max |λi |. Any norm of A^n is roughly the nth power of this largest
eigenvalue. Taking nth roots makes this precise:

$$\rho(A) = \lim_{n\to\infty} \|A^n\|^{1/n}.$$

The joint spectral radius of two matrices is defined in the same way [14], maximizing over all products Π of n factors A or B:

$$\rho(A, B) = \lim_{n\to\infty}\ \max_{\Pi}\ \|\Pi\|^{1/n}.$$
The difficulty is not to define ρ(A, B) but to compute it. For symmetric or normal or
commuting or upper triangular matrices it is the larger of ρ(A) and ρ(B). Otherwise
eigenvalues of products are not controlled by products of eigenvalues. An example
with zero eigenvalues, ρ(A) = 0 = ρ(B), is
$$A = \begin{bmatrix} 0&2\\ 0&0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0&0\\ 2&0 \end{bmatrix}, \qquad AB = \begin{bmatrix} 4&0\\ 0&0 \end{bmatrix}.$$
A beautiful theorem of Berger and Wang [15] asserts that these eigenvalues of
products yield the same limit (now a supremum) that was approached by norms:

$$\rho(A, B) = \limsup_{n\to\infty}\ \max_{\Pi}\ \rho(\Pi)^{1/n}.$$
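A brute-force sketch (Python with NumPy; exponentially expensive and purely illustrative) bounds ρ(A, B) for the example above by trying all products of n factors:

```python
import numpy as np
from itertools import product

A = np.array([[0.0, 2.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [2.0, 0.0]])

def jsr_estimate(mats, n):
    """max over all products of n factors of ||product||^(1/n)."""
    best = 0.0
    for factors in product(mats, repeat=n):
        prod = factors[0]
        for F in factors[1:]:
            prod = prod @ F
        best = max(best, np.linalg.norm(prod, 2) ** (1.0 / n))
    return best

for n in (1, 2, 4, 8):
    print(n, jsr_estimate([A, B], n))   # stays at rho(A, B) = 2
```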
It is ρ(A′ , B ′ ) that decides the size of φ(x)−φ(y). Continuity follows from ρ < 1 [16].
Then φ and W belong to C^α for all α less than −log2 ρ. (When α > 1, derivatives
of integer order [α] have Hölder exponent α − [α].) In Sobolev spaces H^s , Eirola
and Villemoes [17, 18] showed how an ordinary spectral radius — computable —
gives the exact regularity s.
$$(16)\qquad \begin{bmatrix} L\\ H \end{bmatrix} = \frac{1}{\sqrt2}\begin{bmatrix} c_0 & c_1 & c_2 & c_3 & & & \\ & & c_0 & c_1 & c_2 & c_3 & \\ & & & & & \ddots & \\ c_3 & -c_2 & c_1 & -c_0 & & & \\ & & c_3 & -c_2 & c_1 & -c_0 & \\ & & & & & \ddots & \end{bmatrix}.$$
This matrix enters each step of the wavelet transform, from vector y to wavelet
coefficients b. The pyramid algorithm executes that transform by recursion with
rescaling. We display two steps for a general wavelet and then specifically for Haar
on [0, 1]:
$$(17)\qquad \begin{bmatrix} L\\ H\\ & I \end{bmatrix} \begin{bmatrix} L\\ H \end{bmatrix} \quad\text{is}\quad \frac{1}{\sqrt2}\begin{bmatrix} 1&1&&\\ 1&-1&&\\ &&\sqrt2&\\ &&&\sqrt2 \end{bmatrix} \frac{1}{\sqrt2}\begin{bmatrix} 1&1&&\\ &&1&1\\ 1&-1&&\\ &&1&-1 \end{bmatrix}.$$
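In code, one step of (16) is a pair of convolutions followed by downsampling — a sketch in Python with NumPy, with periodic wrap-around at the ends assumed as the boundary convention:

```python
import numpy as np

def fwt_step(y, c):
    """One step of (16): lowpass output Ly and highpass output Hy.

    The lowpass row (c0, c1, c2, c3) shifts by two each time; the
    highpass row (c3, -c2, c1, -c0) uses the reversed coefficients
    with alternating signs, as in equation (8).
    """
    c = np.asarray(c, dtype=float)
    d = c[::-1] * (-1.0) ** np.arange(len(c))    # (c3, -c2, c1, -c0)
    n = len(y)
    low = np.zeros(n // 2)
    high = np.zeros(n // 2)
    for i in range(n // 2):
        for k in range(len(c)):
            low[i]  += c[k] * y[(2 * i + k) % n]
            high[i] += d[k] * y[(2 * i + k) % n]
    return low / np.sqrt(2), high / np.sqrt(2)

# With the Haar coefficients c = (1, 1) this is exactly the averaging and
# differencing of (17); recursing on the lowpass part gives the pyramid.
```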
Figure 3 shows how those terms |P(ξ)|^2 and |P(ξ + π)|^2 are mirror functions that
add to 1. It also shows how four coefficients give a flatter response — with higher
accuracy at ξ = 0. Then |P|^2 has a fourth-order zero at ξ = π.
Acknowledgment
I thank Peter Heller for a long conversation about the MPEG contest and its
rules.
Additional note. After completing this paper I learned, with pleasure and amaze-
ment, that a thesis which I had promised to supervise (“formally”, in the most
informal sense of that word) was to contain the filter design for MIT’s entry in the
HDTV competition. The Ph.D. candidate is Peter Monta. The competition is still
ahead (in 1992). Whether won or lost, I am sure the degree will be granted! These
paragraphs briefly indicate how the standards for High Definition Television aim
to yield a very sharp picture.
The key is high resolution, which requires a higher bit-rate of transmission. For
the MPEG contest in Japan — to compress videos onto CD’s and computers —
the rate was 1 megabit/second. For the HDTV contest that number is closer to 24.
Both compression ratios are about 100 to 1. (The better picture has more pixels.)
The audio signal gets ½ megabit/sec for its four stereo channels; closed captions use
less. In contrast, conventional television has no compression at all — in principle,
you see everything. The color standard was set in 1953, and the black and white
standard about 1941.
The FCC will judge between an AT&T/Zenith entry, two MIT/General Instru-
ments entries, and a partly European entry from Philips and others. These finalists
are all digital, an advance which surprised the New York Times. Monta proposed
a filter that uses seven coefficients or “taps” for low-pass and four for high-pass.
Thus the filters are not mirror images as in wavelets, or brick walls either. Two-
dimensional images come from tensor products of one-dimensional filters. Their
exact coefficients will not be set until the last minute, possibly for secrecy — and
cosine transforms may still be chosen in the end.
The red-green-blue components are converted by a 3 by 3 orthogonal matrix
to better coordinates. Linear algebra enters, literally the spectral theorem. The
luminance axis from the leading eigenvector gives the brightness.
A critical step is motion estimation, to give a quick and close prediction of
successive images. A motion vector is estimated for each region in the image [34].
The system transmits only the difference between predicted and actual images —
the “motion compensated residual”. When that has too much energy, the motion
estimator is disabled and the most recent image is sent. This will be the case when
there is a scene change. Note that coding decisions are based on the energy in
different bands (the size of Fourier coefficients). The L1 norm is probably better.
Other features may be used in 2001.
It is very impressive to see an HDTV image. The final verdict has just been
promised for the spring of 1993. Wavelets will not be in that standard, but they have
no shortage of potential applications [24, 35–37]. A recent one is the LANDSAT
8 satellite, which will locate a grid on the earth with pixel width of 2 yards. The
compression algorithm that does win will use good mathematics.
References
1. A. Haar, Zur Theorie der orthogonalen Funktionensysteme, Math. Ann. 69 (1910), 331–
371.
2. J. L. Walsh, A closed set of normal orthogonal functions, Amer. J. Math. 45 (1923), 5–24.
3. R. E. Blahut, Fast algorithms for digital signal processing, Addison-Wesley, New York,
1984.
4. C. Van Loan, Computational frameworks for the fast Fourier transform, SIAM, Philadel-
phia, PA, 1992.
5. G. Strang, Introduction to applied mathematics, Wellesley-Cambridge Press, Wellesley,
MA, 1986.
6. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical recipes,
Cambridge Univ. Press, Cambridge, 2nd ed., 1993.
7. P. Duhamel and M. Vetterli, Fast Fourier transforms: a tutorial review, Signal Processing
19 (1990), 259–299.
8. G. Strang, Introduction to linear algebra, Wellesley–Cambridge Press, Wellesley, MA, 1993.
9. I. Daubechies, Orthogonal bases of compactly supported wavelets, Comm. Pure Appl. Math.
41 (1988), 909–996.
10. P. G. Lemarié (ed.), Les ondelettes en 1989, Lecture Notes in Math., vol. 1438, Springer-
Verlag, New York, 1990.
11. S. Mallat, Multiresolution approximations and wavelet orthogonal bases of L2 (R), Trans.
Amer. Math. Soc. 315 (1989), 69–88; A theory for multiresolution approximation: the
wavelet representation, IEEE Trans. PAMI 11 (1989), 674–693.
12. I. Daubechies and J. Lagarias, Two–scale difference equations: I. Existence and global
regularity of solutions, SIAM J. Math. Anal. 22 (1991), 1388–1410; II. Local regularity,
infinite products of matrices and fractals, SIAM J. Math. Anal. 23 (1992), 1031–1079.
13. I. Daubechies and J. Lagarias, Sets of matrices all infinite products of which converge, Linear
Algebra Appl. 161 (1992), 227–263.
14. G.-C. Rota and G. Strang, A note on the joint spectral radius, Kon. Nederl. Akad. Wet.
Proc. A 63 (1960), 379–381.
15. M. Berger and Y. Wang, Bounded semigroups of matrices, Linear Algebra Appl. 166
(1992), 21–28.
16. D. Colella and C. Heil, Characterizations of scaling functions, I. Continuous solutions,
SIAM J. Matrix Anal. Appl. 15 (1994) (to appear).
17. T. Eirola, Sobolev characterization of solutions of dilation equations, SIAM J. Math. Anal.
23 (1992), 1015–1030.
18. L. F. Villemoes, Energy moments in time and frequency for two-scale difference equation
solutions and wavelets, SIAM J. Math. Anal. 23 (1992), 1519–1543.
19. G. Strang, Wavelets and dilation equations: a brief introduction, SIAM Review 31 (1989),
614–627.
20. O. Rioul and M. Vetterli, Wavelets and signal processing, IEEE Signal Processing Mag. 8
(1991), 14–38.
21. M. Vetterli and C. Herley, Wavelets and filter banks: theory and design, IEEE Trans.
Acoust. Speech Signal Process. 40 (1992), 2207–2232.
22. P. P. Vaidyanathan, Multirate digital filters, filterbanks, polyphase networks, and appli-
cations: a tutorial, Proc. IEEE 78 (1990), 56–93; Multirate systems and filter banks,
Prentice-Hall, Englewood Cliffs, NJ, 1993.
23. P. Heller, Regular M -band wavelets, SIAM J. Matrix Anal. Appl. (to appear).
24. I. Daubechies, Ten lectures on wavelets, SIAM, Philadelphia, PA, 1992.
25. F. Schipp, W. R. Wade, and P. Simon, Walsh series, Akadémiai Kiadó and Adam Hilger,
Budapest and Bristol, 1990.
26. Y. Meyer, Ondelettes et opérateurs, Hermann, Paris, 1990; Wavelets, translation to be
published by Cambridge Univ. Press.
27. R. DeVore and B. J. Lucier, Wavelets, Acta Numerica 1 (1991), 1–56.
28. N. S. Jayant and P. Noll, Digital coding of waveforms, Prentice–Hall, Englewood Cliffs,
NJ, 1984.
29. T. Hopper and F. Preston, Compression of grey-scale fingerprint images, Data Compres-
sion Conference, IEEE Computer Society Press, New York, 1992.
30. M. V. Wickerhauser, High-resolution still picture compression, preprint.
31. M. V. Wickerhauser and R. R. Coifman, Entropy based methods for best basis selection,
IEEE Trans. Inform. Theory 38 (1992), 713–718.
32. R. DeVore, B. Jawerth, and B. J. Lucier, Image compression through wavelet transform
coding, IEEE Trans. Inform. Theory 38 (1992), 719–746.
33. J. N. Bradley and C. Brislawn, Compression of fingerprint data using the wavelet vector
quantization image compression algorithm, Los Alamos Report 92–1507, 1992.
34. J. Lim, Two-dimensional signal and image processing, Prentice-Hall, Englewood Cliffs,
NJ, 1990.
35. G. Beylkin, R. R. Coifman, and V. Rokhlin, Fast wavelet transforms and numerical algo-
rithms, Comm. Pure Appl. Math. 44 (1991), 141–183.
36. C. K. Chui, An introduction to wavelets, Academic Press, New York, 1992.
37. M. B. Ruskai et al., Wavelets and their applications, Jones and Bartlett, Boston, 1992.
38. J. S. Geronimo, D. P. Hardin, and P. R. Massopust, Fractal functions and wavelet expan-
sions based on several scaling functions (to appear).