2.4 Solving Systems of Linear Equations
L and U Matrices
Lower triangular matrix:

$$[L] = \begin{bmatrix}
l_{11} & 0      & 0      & 0      \\
l_{21} & l_{22} & 0      & 0      \\
l_{31} & l_{32} & l_{33} & 0      \\
l_{41} & l_{42} & l_{43} & l_{44}
\end{bmatrix}$$

Upper triangular matrix:

$$[U] = \begin{bmatrix}
u_{11} & u_{12} & u_{13} & u_{14} \\
0      & u_{22} & u_{23} & u_{24} \\
0      & 0      & u_{33} & u_{34} \\
0      & 0      & 0      & u_{44}
\end{bmatrix}$$
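The triangular structure is exactly what makes the solve phases cheap: each unknown depends only on those already computed. As a minimal illustrative sketch (not from the slides; the random test system at the end is made up), forward and back substitution can be written directly from the entries above:

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L y = b for lower triangular L, one row at a time."""
    n = L.shape[0]
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    """Solve U x = y for upper triangular U, from the last row up."""
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# Quick check on an arbitrary lower triangular system.
rng = np.random.default_rng(0)
L = np.tril(rng.standard_normal((4, 4))) + 4 * np.eye(4)
b = rng.standard_normal(4)
print(np.allclose(L @ forward_substitution(L, b), b))
```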
Complexity of LU Decomposition
To solve Ax = b:
  decompose A into LU                          -- cost ~2n^3/3 flops
  solve Ly = b for y by forward substitution   -- cost n^2 flops
  solve Ux = y for x by back substitution      -- cost n^2 flops
Slower alternative:
  compute A^-1, then multiply x = A^-1 b
  this costs about 3 times as much as the LU route
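As a rough sketch of the two routes just listed (the matrix size and random test data are made up for illustration), SciPy's lu_factor/lu_solve follow the factor-then-substitute path, while forming the explicit inverse does the extra work:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 500                                   # illustrative size
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# Factor once (~2n^3/3 flops), then two triangular solves (~2n^2 flops).
lu, piv = lu_factor(A)
x_lu = lu_solve((lu, piv), b)

# Slower alternative: form the explicit inverse, then multiply.
x_inv = np.linalg.inv(A) @ b

print(np.allclose(x_lu, x_inv))
```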
Cholesky LU Factorization
If [A] is symmetric and positive definite, it is convenient to use the Cholesky decomposition:
[A] = [L][L]^T = [U]^T[U]
No pivoting or scaling is needed if [A] is symmetric and positive definite (all eigenvalues are positive).
If [A] is not positive definite, the procedure may encounter the square root of a negative number.
Complexity is about half that of LU (roughly n^3/3 flops), thanks to the symmetry exploitation.
Cholesky LU Factorization
[A] = [U]^T[U]

Recurrence relations:

$$u_{ii} = \sqrt{a_{ii} - \sum_{k=1}^{i-1} u_{ki}^{2}}$$

$$u_{ij} = \frac{a_{ij} - \sum_{k=1}^{i-1} u_{ki}\,u_{kj}}{u_{ii}}
\qquad \text{for } j = i+1, \ldots, n$$
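A minimal sketch that implements the recurrence above directly (illustrative only; the small test matrix is made up, and production codes use library routines):

```python
import numpy as np

def cholesky_upper(A):
    """Compute U such that A = U^T U, following the recurrence above."""
    n = A.shape[0]
    U = np.zeros_like(A, dtype=float)
    for i in range(n):
        s = A[i, i] - U[:i, i] @ U[:i, i]
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        U[i, i] = np.sqrt(s)
        for j in range(i + 1, n):
            U[i, j] = (A[i, j] - U[:i, i] @ U[:i, j]) / U[i, i]
    return U

# Check against NumPy's (lower triangular) Cholesky factor.
A = np.array([[4.0, 2.0], [2.0, 3.0]])
U = cholesky_upper(A)
print(np.allclose(U.T @ U, A), np.allclose(U, np.linalg.cholesky(A).T))
```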
Pivoting in LU Decomposition
Pivoting is still needed in LU decomposition (why? to avoid dividing by zero or very small pivots, which destroys accuracy).
But row exchanges mess up the order of the rows already stored in [L].
What to do?
Track the exchanges in a permutation matrix [P] and pivot [L] as well:
initialize [P] as the identity matrix and, whenever rows of [A] are pivoted, swap the corresponding rows of [P] and of [L].
The result is a factorization of the row-permuted matrix, [P][A] = [L][U] (see the sketch below).
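A small sketch of partial pivoting in action (the tiny-pivot matrix is made up for illustration); note SciPy's convention returns the factors so that A = P @ L @ U:

```python
import numpy as np
from scipy.linalg import lu

# A tiny leading pivot forces a row exchange.
A = np.array([[1e-10, 1.0],
              [1.0,   2.0]])

P, L, U = lu(A)            # SciPy's convention: A = P @ L @ U
print(P)                   # the permutation matrix records the swap
print(np.allclose(P @ L @ U, A))
```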
LDL^T factorization

Write the symmetric matrix in 2x2 block form and eliminate the first block row:

$$A = \begin{bmatrix} E & c^{T} \\ c & B \end{bmatrix}
    = \begin{bmatrix} I & 0 \\ cE^{-1} & I \end{bmatrix}
      \begin{bmatrix} E & c^{T} \\ 0 & B - cE^{-1}c^{T} \end{bmatrix}$$

Splitting the trailing factor symmetrically gives

$$A = \begin{bmatrix} I & 0 \\ cE^{-1} & I \end{bmatrix}
      \begin{bmatrix} E & 0 \\ 0 & B - cE^{-1}c^{T} \end{bmatrix}
      \begin{bmatrix} I & E^{-1}c^{T} \\ 0 & I \end{bmatrix}$$

where

$$L = \begin{bmatrix} I & 0 \\ cE^{-1} & I \end{bmatrix}
\qquad \text{and} \qquad
L^{T} = \begin{bmatrix} I & (cE^{-1})^{T} \\ 0 & I \end{bmatrix}
      = \begin{bmatrix} I & E^{-1}c^{T} \\ 0 & I \end{bmatrix}$$

(the last equality uses E = E^T). Recursing on the Schur complement B - cE^{-1}c^{T} yields the full A = LDL^T factorization with D block diagonal.
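An element-wise, unpivoted LDL^T sketch (illustrative only; the test matrix is made up, and real implementations pivot to handle indefinite matrices safely):

```python
import numpy as np

def ldl_factor(A):
    """Unpivoted LDL^T of a symmetric matrix: A = L @ diag(d) @ L.T."""
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        d[j] = A[j, j] - (L[j, :j] ** 2) @ d[:j]
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - (L[i, :j] * L[j, :j]) @ d[:j]) / d[j]
    return L, d

A = np.array([[4.0, 2.0], [2.0, 3.0]])
L, d = ldl_factor(A)
print(np.allclose(L @ np.diag(d) @ L.T, A))
```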
Cholesky factorization and LDL^T factorization
Sparse Linear Algebra
SUMMARY SECTION 2
The heaviest components of numerical software are numerical differentiation (automatic differentiation / divided differences) and linear algebra.
Factorization is always preferable to direct (Gaussian) elimination: the factors can be reused for every new right-hand side.
Keeping track of sparsity in linear algebra can enormously improve performance (see the sketch below).
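A hedged illustration of the sparsity point (the tridiagonal example and its size are made up): storing and factoring only the nonzero entries keeps both memory and work roughly linear in n rather than quadratic or cubic.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

n = 10_000
# 1-D Poisson-type tridiagonal matrix stored in a sparse format:
# only ~3n nonzeros are kept instead of n^2 entries.
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)
x = spsolve(A, b)          # sparse direct solve
print(x[:3])
```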
Problem definition
$$\min_{x} f(x), \qquad f : \mathbb{R}^{n} \to \mathbb{R}$$
- continuously differentiable
- gradient is available
- Hessian is unavailable
- $\nabla^{2} f(x^{*}) \succeq 0$
DEMO
Algorithm: Newton.
Note: not only does the algorithm fail to converge, the function values go to infinity.
So the iteration should have recognized the problem early and done something different, which motivates the descent principle below.
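The slides' demo itself is not reproduced here; as an illustrative stand-in (the function f(x) = sqrt(1 + x^2) and the starting point are my own choices, not the original example), a pure Newton iteration started at |x0| > 1 diverges while the function values blow up:

```python
import numpy as np

# Stand-in demo: minimize f(x) = sqrt(1 + x^2) with pure Newton steps.
f = lambda x: np.sqrt(1.0 + x**2)
g = lambda x: x / np.sqrt(1.0 + x**2)       # gradient
h = lambda x: (1.0 + x**2) ** -1.5          # second derivative

x = 1.5                                     # |x0| > 1, so the iteration diverges
for k in range(6):
    x = x - g(x) / h(x)                     # Newton step (here x_{k+1} = -x_k^3)
    print(k, x, f(x))                       # iterates and function values grow
```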
Descent Principle
Descent principle: carry out a one-dimensional search along a line in which the function decreases.

$$g(\alpha) = f(x_k + \alpha p_k)$$

Accept a step length $\alpha > 0$ that achieves

$$f(x_k + \alpha p_k) < f(x_k)$$

A suitable search direction is, for example, $p_k = -B_k^{-1} \nabla f(x_k)$ with $B_k \succ 0$, which guarantees descent.
Line Search-Armijo
We cannot accept just any decrease: with arbitrarily small decreases the iteration may never converge (this is the danger of spurious convergence to a non-stationary point).
IDEA: accept only decreases proportional to the square of the gradient. Then the iteration must converge, since the process can stall only where the gradient is 0.
Example: the Armijo rule. It uses the concept of backtracking (a sketch follows below).

Accept a step $\alpha$ when

$$g(\alpha) \le g(0) + c_1\,\alpha\,g'(0), \qquad c_1 \in (0, 1/2)$$

where $g(0) + \alpha\,g'(0)$ is the tangent line at $\alpha = 0$; backtracking shrinks $\alpha$ by a fixed factor in $(0,1)$ until the condition holds.
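A minimal backtracking Armijo sketch (the defaults alpha0, c1, rho and the quadratic test problem are illustrative choices, not values from the slides):

```python
import numpy as np

def armijo_backtracking(f, grad, x, p, alpha0=1.0, c1=1e-4, rho=0.5):
    """Shrink alpha until f(x + alpha p) <= f(x) + c1 * alpha * grad(x)^T p.

    Assumes p is a descent direction (grad(x)^T p < 0), so the loop terminates.
    """
    fx, gTp = f(x), grad(x) @ p
    alpha = alpha0
    while f(x + alpha * p) > fx + c1 * alpha * gTp:
        alpha *= rho                      # backtrack
    return alpha

# Tiny usage example on a quadratic with a steepest-descent direction.
f = lambda x: 0.5 * x @ x
grad = lambda x: x
x = np.array([3.0, -4.0])
p = -grad(x)
print(armijo_backtracking(f, grad, x, p))
```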
Some Theory
Global convergence: follows from the sufficient-decrease (Armijo) condition above.
Fast convergence: near the solution the full Newton step is accepted by the line search.
Extensions
Line Search Refinements:
Use interpolation
Wolfe and Goldstein rules