MKL 2017 Developer Reference Fortran PDF
Developer Reference
Revision: 082
MKL 2017
Legal Information
Contents
Legal Information.............................................................................. 33
Introducing the Intel Math Kernel Library.........................................39
Getting Help and Support................................................................... 41
What's New........................................................................................ 43
Notational Conventions...................................................................... 45
Intel Math Kernel Library Developer Reference
?hpr2.....................................................................................99
?sbmv.................................................................................. 101
?spmv.................................................................................. 103
?spr..................................................................................... 105
?spr2................................................................................... 107
?symv.................................................................................. 109
?syr..................................................................................... 111
?syr2................................................................................... 112
?tbmv.................................................................................. 114
?tbsv................................................................................... 117
?tpmv.................................................................................. 119
?tpsv................................................................................... 121
?trmv...................................................................................123
?trsv.................................................................................... 125
BLAS Level 3 Routines.................................................................... 127
?gemm.................................................................................127
?hemm.................................................................................131
?herk................................................................................... 133
?her2k................................................................................. 136
?symm................................................................................. 138
?syrk................................................................................... 141
?syr2k..................................................................................144
?trmm..................................................................................147
?trsm................................................................................... 149
Sparse BLAS Level 1 Routines.................................................................. 152
Vector Arguments.......................................................................... 152
Naming Conventions for Sparse BLAS Routines.................................. 152
Routines and Data Types.................................................................152
BLAS Level 1 Routines That Can Work With Sparse Vectors.................. 153
?axpyi.......................................................................................... 153
?doti............................................................................................ 155
?dotci........................................................................................... 156
?dotui...........................................................................................157
?gthr............................................................................................158
?gthrz.......................................................................................... 159
?roti.............................................................................................161
?sctr............................................................................................ 162
Sparse BLAS Level 2 and Level 3 Routines................................................. 163
Naming Conventions in Sparse BLAS Level 2 and Level 3.....................164
Sparse Matrix Storage Formats for Sparse BLAS Routines.................... 165
Routines and Supported Operations..................................................165
Interface Consideration...................................................................166
Sparse BLAS Level 2 and Level 3 Routines.........................................171
mkl_?csrgemv.......................................................................174
mkl_?bsrgemv.......................................................................176
mkl_?coogemv...................................................................... 178
mkl_?diagemv.......................................................................181
mkl_?csrsymv....................................................................... 183
mkl_?bsrsymv.......................................................................186
mkl_?coosymv...................................................................... 188
mkl_?diasymv....................................................................... 190
mkl_?csrtrsv......................................................................... 193
mkl_?bsrtrsv.........................................................................195
mkl_?cootrsv........................................................................ 198
mkl_?diatrsv......................................................................... 201
mkl_cspblas_?csrgemv........................................................... 203
mkl_cspblas_?bsrgemv...........................................................206
mkl_cspblas_?coogemv.......................................................... 208
mkl_cspblas_?csrsymv........................................................... 210
mkl_cspblas_?bsrsymv........................................................... 213
mkl_cspblas_?coosymv.......................................................... 215
mkl_cspblas_?csrtrsv............................................................. 217
mkl_cspblas_?bsrtrsv............................................................. 220
mkl_cspblas_?cootrsv............................................................ 223
mkl_?csrmv.......................................................................... 226
mkl_?bsrmv.......................................................................... 229
mkl_?cscmv.......................................................................... 233
mkl_?coomv......................................................................... 236
mkl_?csrsv........................................................................... 239
mkl_?bsrsv........................................................................... 243
mkl_?cscsv........................................................................... 246
mkl_?coosv...........................................................................250
mkl_?csrmm......................................................................... 253
mkl_?bsrmm.........................................................................257
mkl_?cscmm.........................................................................261
mkl_?coomm........................................................................ 265
mkl_?csrsm.......................................................................... 268
mkl_?cscsm.......................................................................... 272
mkl_?coosm..........................................................................276
mkl_?bsrsm.......................................................................... 279
mkl_?diamv.......................................................................... 282
mkl_?skymv......................................................................... 286
mkl_?diasv........................................................................... 289
mkl_?skysv...........................................................................292
mkl_?diamm......................................................................... 295
mkl_?skymm........................................................................ 299
mkl_?diasm.......................................................................... 302
mkl_?skysm..........................................................................306
mkl_?dnscsr..........................................................................309
mkl_?csrcoo..........................................................................312
mkl_?csrbsr.......................................................................... 315
mkl_?csrcsc.......................................................................... 318
mkl_?csrdia.......................................................................... 321
mkl_?csrsky..........................................................................324
mkl_?csradd......................................................................... 327
mkl_?csrmultcsr.................................................................... 331
mkl_?csrmultd...................................................................... 335
Inspector-executor Sparse BLAS Routines..................................................338
Naming conventions in Inspector-executor Sparse BLAS Routines......... 338
?gtts2.........................................................................................1382
?isnan........................................................................................ 1383
?laisnan...................................................................................... 1384
?labrd.........................................................................................1384
?lacn2........................................................................................ 1387
?lacon.........................................................................................1388
?lacpy.........................................................................................1390
?ladiv......................................................................................... 1391
?lae2.......................................................................................... 1392
?laebz.........................................................................................1393
?laed0........................................................................................ 1397
?laed1........................................................................................ 1399
?laed2........................................................................................ 1401
?laed3........................................................................................ 1403
?laed4........................................................................................ 1405
?laed5........................................................................................ 1406
?laed6........................................................................................ 1407
?laed7........................................................................................ 1408
?laed8........................................................................................ 1412
?laed9........................................................................................ 1415
?laeda........................................................................................ 1416
?laein......................................................................................... 1418
?laev2........................................................................................ 1420
?laexc.........................................................................................1422
?lag2.......................................................................................... 1423
?lags2........................................................................................ 1425
?lagtf..........................................................................................1428
?lagtm........................................................................................ 1430
?lagts......................................................................................... 1431
?lagv2........................................................................................ 1433
?lahqr.........................................................................................1435
?lahrd.........................................................................................1437
?lahr2.........................................................................................1439
?laic1......................................................................................... 1442
?lakf2......................................................................................... 1444
?laln2......................................................................................... 1445
?lals0......................................................................................... 1448
?lalsa..........................................................................................1451
?lalsd......................................................................................... 1454
?lamrg........................................................................................1456
?laneg........................................................................................ 1457
?langb........................................................................................ 1458
?lange........................................................................................ 1460
?langt.........................................................................................1461
?lanhs........................................................................................ 1462
?lansb........................................................................................ 1463
?lanhb........................................................................................ 1465
?lansp........................................................................................ 1466
?lanhp........................................................................................ 1468
?lanst/?lanht............................................................................... 1469
?lansy.........................................................................................1470
?lanhe........................................................................................ 1471
?lantb.........................................................................................1473
?lantp.........................................................................................1474
?lantr......................................................................................... 1476
?lanv2........................................................................................ 1478
?lapll.......................................................................................... 1479
?lapmr........................................................................................1480
?lapmt........................................................................................ 1481
?lapy2........................................................................................ 1482
?lapy3........................................................................................ 1483
?laqgb........................................................................................ 1483
?laqge........................................................................................ 1485
?laqhb........................................................................................ 1487
?laqp2........................................................................................ 1488
?laqps........................................................................................ 1490
?laqr0.........................................................................................1492
?laqr1.........................................................................................1495
?laqr2.........................................................................................1496
?laqr3.........................................................................................1500
?laqr4.........................................................................................1503
?laqr5.........................................................................................1506
?laqsb........................................................................................ 1509
?laqsp........................................................................................ 1511
?laqsy.........................................................................................1512
?laqtr......................................................................................... 1514
?lar1v.........................................................................................1516
?lar2v.........................................................................................1519
?laran.........................................................................................1520
?larf........................................................................................... 1521
?larfb......................................................................................... 1522
?larfg......................................................................................... 1526
?larfgp........................................................................................1527
?larft.......................................................................................... 1529
?larfx..........................................................................................1531
?large.........................................................................................1533
?largv.........................................................................................1534
?larnd.........................................................................................1535
?larnv.........................................................................................1536
?laror......................................................................................... 1537
?larot......................................................................................... 1540
?larra......................................................................................... 1543
?larrb......................................................................................... 1545
?larrc..........................................................................................1547
?larrd......................................................................................... 1548
?larre......................................................................................... 1552
?larrf.......................................................................................... 1555
?larrj.......................................................................................... 1557
?larrk......................................................................................... 1559
?larrr..........................................................................................1560
?larrv......................................................................................... 1561
?lartg......................................................................................... 1565
?lartgp........................................................................................1566
?lartgs........................................................................................ 1568
?lartv......................................................................................... 1569
?laruv.........................................................................................1570
?larz...........................................................................................1571
?larzb......................................................................................... 1573
?larzt..........................................................................................1575
?las2.......................................................................................... 1578
?lascl..........................................................................................1579
?lasd0........................................................................................ 1580
?lasd1........................................................................................ 1582
?lasd2........................................................................................ 1584
?lasd3........................................................................................ 1587
?lasd4........................................................................................ 1590
?lasd5........................................................................................ 1592
?lasd6........................................................................................ 1593
?lasd7........................................................................................ 1597
?lasd8........................................................................................ 1600
?lasd9........................................................................................ 1602
?lasda.........................................................................................1604
?lasdq........................................................................................ 1607
?lasdt......................................................................................... 1610
?laset......................................................................................... 1610
?lasq1........................................................................................ 1612
?lasq2........................................................................................ 1613
?lasq3........................................................................................ 1614
?lasq4........................................................................................ 1616
?lasq5........................................................................................ 1617
?lasq6........................................................................................ 1619
?lasr...........................................................................................1620
?lasrt..........................................................................................1623
?lassq.........................................................................................1623
?lasv2.........................................................................................1625
?laswp........................................................................................ 1626
?lasy2.........................................................................................1627
?lasyf......................................................................................... 1629
?lasyf_rook................................................................................. 1631
?lahef......................................................................................... 1633
?lahef_rook................................................................................. 1635
?latbs......................................................................................... 1637
?latm1........................................................................................ 1639
?latm2........................................................................................ 1641
?latm3........................................................................................ 1644
?latm5........................................................................................ 1647
?latm6........................................................................................ 1651
?latme........................................................................................ 1654
?latmr........................................................................................ 1658
?latdf..........................................................................................1666
?latps......................................................................................... 1667
?latrd......................................................................................... 1670
?latrs..........................................................................................1672
?latrz..........................................................................................1676
?lauu2........................................................................................ 1678
?lauum....................................................................................... 1679
?orbdb1/?unbdb1......................................................................... 1680
?orbdb2/?unbdb2......................................................................... 1683
?orbdb3/?unbdb3......................................................................... 1686
?orbdb4/?unbdb4......................................................................... 1689
?orbdb5/?unbdb5......................................................................... 1693
?orbdb6/?unbdb6......................................................................... 1695
?org2l/?ung2l.............................................................................. 1698
?org2r/?ung2r............................................................................. 1699
?orgl2/?ungl2.............................................................................. 1700
?orgr2/?ungr2............................................................................. 1702
?orm2l/?unm2l............................................................................ 1703
?orm2r/?unm2r........................................................................... 1705
?orml2/?unml2............................................................................ 1707
?ormr2/?unmr2........................................................................... 1709
?ormr3/?unmr3........................................................................... 1711
?pbtf2.........................................................................................1713
?potf2.........................................................................................1715
?ptts2.........................................................................................1716
?rscl........................................................................................... 1717
?syswapr.................................................................................... 1718
?heswapr.................................................................................... 1720
?sygs2/?hegs2.............................................................................1723
?sytd2/?hetd2............................................................................. 1724
?sytf2......................................................................................... 1726
?sytf2_rook................................................................................. 1728
?hetf2.........................................................................................1729
?hetf2_rook.................................................................................1731
?tgex2........................................................................................ 1732
?tgsy2........................................................................................ 1735
?trti2.......................................................................................... 1738
clag2z.........................................................................................1739
dlag2s........................................................................................ 1740
slag2d........................................................................................ 1741
zlag2c.........................................................................................1742
?larfp......................................................................................... 1743
ila?lc.......................................................................................... 1744
ila?lr...........................................................................................1745
?gsvj0........................................................................................ 1746
?gsvj1........................................................................................ 1749
?sfrk...........................................................................................1752
?hfrk.......................................................................................... 1754
?tfsm..........................................................................................1755
?lansf......................................................................................... 1757
?lanhf......................................................................................... 1759
?tfttp..........................................................................................1760
?tfttr.......................................................................................... 1761
?tpqrt2....................................................................................... 1763
?tprfb......................................................................................... 1765
?tpttf..........................................................................................1768
?tpttr..........................................................................................1770
?trttf.......................................................................................... 1771
?trttp..........................................................................................1772
?pstf2.........................................................................................1773
dlat2s......................................................................................... 1775
zlat2c......................................................................................... 1776
?lacp2........................................................................................ 1777
?la_gbamv.................................................................................. 1778
?la_gbrcond................................................................................ 1780
?la_gbrcond_c............................................................................. 1782
?la_gbrcond_x............................................................................. 1784
?la_gbrfsx_extended.................................................................... 1785
?la_gbrpvgrw...............................................................................1792
?la_geamv.................................................................................. 1793
?la_gercond.................................................................................1795
?la_gercond_c............................................................................. 1796
?la_gercond_x............................................................................. 1798
?la_gerfsx_extended.....................................................................1799
?la_heamv.................................................................................. 1805
?la_hercond_c............................................................................. 1807
?la_hercond_x............................................................................. 1808
?la_herfsx_extended.................................................................... 1810
?la_herpvgrw...............................................................................1816
?la_lin_berr................................................................................. 1817
?la_porcond................................................................................ 1818
?la_porcond_c............................................................................. 1819
?la_porcond_x............................................................................. 1821
?la_porfsx_extended.................................................................... 1822
?la_porpvgrw...............................................................................1829
?laqhe........................................................................................ 1830
?laqhp........................................................................................ 1831
?larcm........................................................................................ 1833
?la_gerpvgrw...............................................................................1834
?larscl2.......................................................................................1835
?lascl2........................................................................................ 1836
?la_syamv...................................................................................1837
?la_syrcond.................................................................................1839
?la_syrcond_c..............................................................................1840
?la_syrcond_x............................................................................. 1842
?la_syrfsx_extended.....................................................................1843
?la_syrpvgrw............................................................................... 1850
?la_wwaddw................................................................................1851
mkl_?tppack................................................................................1852
mkl_?tpunpack............................................................................ 1854
LAPACK Utility Functions and Routines.....................................................1856
ilaver..........................................................................................1857
ilaenv......................................................................................... 1857
iparmq........................................................................................1859
ieeeck.........................................................................................1861
?labad........................................................................................ 1862
?lamch....................................................................................... 1863
?lamc1....................................................................................... 1864
?lamc2....................................................................................... 1864
?lamc3....................................................................................... 1865
?lamc4....................................................................................... 1866
?lamc5....................................................................................... 1867
chla_transtype............................................................................. 1867
iladiag........................................................................................ 1868
ilaprec........................................................................................ 1869
ilatrans....................................................................................... 1869
ilauplo........................................................................................ 1870
xerbla_array................................................................................1871
LAPACK Test Functions and Routines....................................................... 1871
?latms........................................................................................ 1872
p?gerfs............................................................................... 1922
p?porfs............................................................................... 1926
p?trrfs................................................................................ 1929
Matrix Inversion: ScaLAPACK Computational Routines....................... 1932
p?getri............................................................................... 1933
p?potri............................................................................... 1935
p?trtri.................................................................................1936
Matrix Equilibration: ScaLAPACK Computational Routines...................1938
p?geequ............................................................................. 1938
p?poequ............................................................................. 1940
Orthogonal Factorizations: ScaLAPACK Computational Routines.......... 1942
p?geqrf...............................................................................1942
p?geqpf.............................................................................. 1945
p?orgqr.............................................................................. 1948
p?ungqr..............................................................................1949
p?ormqr............................................................................. 1951
p?unmqr.............................................................................1954
p?gelqf............................................................................... 1957
p?orglq...............................................................................1959
p?unglq.............................................................................. 1961
p?ormlq..............................................................................1963
p?unmlq............................................................................. 1966
p?geqlf............................................................................... 1969
p?orgql...............................................................................1971
p?ungql.............................................................................. 1973
p?ormql..............................................................................1975
p?unmql............................................................................. 1978
p?gerqf...............................................................................1981
p?orgrq.............................................................................. 1983
p?ungrq..............................................................................1985
p?ormr3............................................................................. 1987
p?unmr3.............................................................................1990
p?ormrq............................................................................. 1994
p?unmrq.............................................................................1997
p?tzrzf................................................................................2000
p?ormrz..............................................................................2002
p?unmrz............................................................................. 2005
p?ggqrf...............................................................................2008
p?ggrqf...............................................................................2012
Symmetric Eigenvalue Problems: ScaLAPACK Computational Routines. 2017
p?syngst.............................................................................2017
p?syntrd............................................................................. 2020
p?sytrd...............................................................................2024
p?ormtr.............................................................................. 2027
p?hengst............................................................................ 2030
p?hentrd.............................................................................2033
p?hetrd.............................................................................. 2037
p?unmtr............................................................................. 2040
p?stebz.............................................................................. 2044
p?stedc.............................................................................. 2047
p?stein............................................................................... 2049
Nonsymmetric Eigenvalue Problems: ScaLAPACK Computational Routines............ 2053
p?gehrd..............................................................................2053
p?ormhr............................................................................. 2056
p?unmhr.............................................................................2059
p?lahqr...............................................................................2063
p?hseqr.............................................................................. 2065
p?trevc............................................................................... 2067
Singular Value Decomposition: ScaLAPACK Driver Routines................ 2070
p?gebrd.............................................................................. 2071
p?ormbr............................................................................. 2074
p?unmbr.............................................................................2079
Generalized Symmetric-Definite Eigenvalue Problems: ScaLAPACK Computational Routines............ 2083
p?sygst...............................................................................2083
p?hegst.............................................................................. 2085
ScaLAPACK Driver Routines....................................................................2087
p?gesv........................................................................................2087
p?gesvx...................................................................................... 2089
p?gbsv........................................................................................2094
p?dbsv........................................................................................2097
p?dtsv........................................................................................ 2099
p?posv........................................................................................2102
p?posvx...................................................................................... 2104
p?pbsv........................................................................................2109
p?ptsv........................................................................................ 2112
p?gels.........................................................................................2114
p?syev........................................................................................2118
p?syevd...................................................................................... 2121
p?syevr.......................................................................................2123
p?syevx...................................................................................... 2128
p?heev....................................................................................... 2134
p?heevd......................................................................................2137
p?heevr...................................................................................... 2140
p?heevx......................................................................................2145
p?gesvd...................................................................................... 2152
p?sygvx...................................................................................... 2156
p?hegvx......................................................................................2163
ScaLAPACK Auxiliary Routines................................................................ 2171
b?laapp.......................................................................................2177
b?laexc....................................................................................... 2178
b?trexc....................................................................................... 2180
p?lacgv....................................................................................... 2182
p?max1...................................................................................... 2183
pilaver........................................................................................ 2184
pmpcol....................................................................................... 2184
pmpim2...................................................................................... 2186
?combamax1............................................................................... 2187
p?sum1...................................................................................... 2188
p?dbtrsv..................................................................................... 2188
p?dttrsv...................................................................................... 2191
p?gebal.......................................................................................2195
p?gebd2......................................................................................2196
p?gehd2..................................................................................... 2200
p?gelq2...................................................................................... 2202
p?geql2...................................................................................... 2205
p?geqr2...................................................................................... 2207
p?gerq2...................................................................................... 2209
p?getf2....................................................................................... 2211
p?labrd....................................................................................... 2213
p?lacon.......................................................................................2217
p?laconsb....................................................................................2218
p?lacp2.......................................................................................2219
p?lacp3.......................................................................................2221
p?lacpy....................................................................................... 2223
p?laevswp................................................................................... 2224
p?lahrd....................................................................................... 2226
p?laiect.......................................................................................2228
p?lamve......................................................................................2229
p?lange.......................................................................................2231
p?lanhs.......................................................................................2233
p?lansy, p?lanhe.......................................................................... 2235
p?lantr........................................................................................2237
p?lapiv........................................................................................2239
p?lapv2.......................................................................................2241
p?laqge.......................................................................................2243
p?laqr0....................................................................................... 2245
p?laqr1....................................................................................... 2248
p?laqr2....................................................................................... 2251
p?laqr3....................................................................................... 2254
p?laqr4....................................................................................... 2257
p?laqr5....................................................................................... 2259
p?laqsy....................................................................................... 2262
p?lared1d....................................................................................2264
p?lared2d....................................................................................2265
p?larf......................................................................................... 2266
p?larfb........................................................................................2269
p?larfc........................................................................................ 2273
p?larfg........................................................................................2276
p?larft........................................................................................ 2277
p?larz......................................................................................... 2280
p?larzb....................................................................................... 2283
p?larzc........................................................................................2287
p?larzt........................................................................................ 2289
p?lascl........................................................................................ 2292
p?lase2.......................................................................................2294
p?laset....................................................................................... 2296
p?lasmsub...................................................................................2297
p?lasrt........................................................................................ 2299
p?lassq....................................................................................... 2301
p?laswp...................................................................................... 2302
p?latra........................................................................................2304
p?latrd........................................................................................2305
p?latrs........................................................................................ 2309
p?latrz........................................................................................ 2311
p?lauu2...................................................................................... 2313
p?lauum..................................................................................... 2315
p?lawil........................................................................................ 2316
p?org2l/p?ung2l........................................................................... 2317
p?org2r/p?ung2r.......................................................................... 2320
p?orgl2/p?ungl2........................................................................... 2322
p?orgr2/p?ungr2.......................................................................... 2324
p?orm2l/p?unm2l......................................................................... 2326
p?orm2r/p?unm2r........................................................................ 2330
p?orml2/p?unml2......................................................................... 2333
p?ormr2/p?unmr2........................................................................ 2337
p?pbtrsv..................................................................................... 2340
p?pttrsv...................................................................................... 2344
p?potf2....................................................................................... 2347
p?rot.......................................................................................... 2349
p?rscl......................................................................................... 2351
p?sygs2/p?hegs2......................................................................... 2352
p?sytd2/p?hetd2.......................................................................... 2355
p?trord....................................................................................... 2358
p?trsen....................................................................................... 2362
p?trti2........................................................................................ 2367
?lahqr2....................................................................................... 2368
?lamsh....................................................................................... 2370
?lapst......................................................................................... 2372
?laqr6.........................................................................................2372
?lar1va....................................................................................... 2376
?laref..........................................................................................2378
?larrb2........................................................................................2381
?larrd2........................................................................................2384
?larre2........................................................................................2388
?larre2a...................................................................................... 2392
?larrf2........................................................................................ 2396
?larrv2........................................................................................2399
?lasorte...................................................................................... 2404
?lasrt2........................................................................................ 2405
?stegr2....................................................................................... 2406
?stegr2a..................................................................................... 2410
?stegr2b..................................................................................... 2414
?stein2....................................................................................... 2418
?dbtf2.........................................................................................2420
?dbtrf......................................................................................... 2421
?dttrf..........................................................................................2423
?dttrsv........................................................................................2424
?pttrsv........................................................................................2425
?steqr2....................................................................................... 2427
?trmvt........................................................................................ 2428
pilaenv....................................................................................... 2431
pilaenvx......................................................................................2432
pjlaenv....................................................................................... 2434
Additional ScaLAPACK Routines...................................................... 2435
ScaLAPACK Utility Functions and Routines................................................2436
p?labad.......................................................................................2436
p?lachkieee................................................................................. 2437
p?lamch......................................................................................2438
p?lasnbt......................................................................................2439
ScaLAPACK Redistribution/Copy Routines.................................................2440
p?gemr2d................................................................................... 2440
p?trmr2d.................................................................................... 2442
v?Sinh................................................................................2651
v?Tanh............................................................................... 2653
v?Acosh..............................................................................2655
v?Asinh.............................................................................. 2657
v?Atanh..............................................................................2659
Special Functions......................................................................... 2661
v?Erf.................................................................................. 2661
v?Erfc.................................................................................2663
v?CdfNorm..........................................................................2665
v?ErfInv..............................................................................2667
v?ErfcInv............................................................................ 2669
v?CdfNormInv..................................................................... 2671
v?LGamma..........................................................................2672
v?TGamma......................................................................... 2674
v?ExpInt1........................................................................... 2675
Rounding Functions...................................................................... 2676
v?Floor............................................................................... 2676
v?Ceil.................................................................................2677
v?Trunc.............................................................................. 2678
v?Round............................................................................. 2680
v?NearbyInt........................................................................ 2681
v?Rint................................................................................ 2682
v?Modf............................................................................... 2683
v?Frac................................................................................ 2684
VM Pack/Unpack Functions.................................................................... 2685
v?Pack........................................................................................2686
v?Unpack.................................................................................... 2688
VM Service Functions............................................................................ 2689
vmlSetMode................................................................................ 2690
vmlgetmode................................................................................ 2691
vmlSetErrStatus...........................................................................2692
vmlgeterrstatus........................................................................... 2693
vmlclearerrstatus......................................................................... 2693
vmlSetErrorCallBack..................................................................... 2694
vmlGetErrorCallBack.....................................................................2696
vmlClearErrorCallBack.................................................................. 2696
vslNewStreamEx..................................................................2712
vsliNewAbstractStream......................................................... 2713
vsldNewAbstractStream........................................................ 2715
vslsNewAbstractStream........................................................ 2717
vslDeleteStream.................................................................. 2718
vslCopyStream.................................................................... 2719
vslCopyStreamState............................................................. 2720
vslSaveStreamF...................................................................2720
vslLoadStreamF................................................................... 2721
vslSaveStreamM.................................................................. 2723
vslLoadStreamM.................................................................. 2724
vslGetStreamSize.................................................................2725
vslLeapfrogStream............................................................... 2725
vslSkipAheadStream............................................................ 2727
vslGetStreamStateBrng........................................................ 2729
vslGetNumRegBrngs.............................................................2729
Distribution Generators
Continuous Distributions
Discrete Distributions
Advanced Service Routines
Advanced Service Routine Data Types
vslGetBrngProperties
Convolution and Correlation
Convolution and Correlation Naming Conventions
Convolution and Correlation Data Types
Convolution and Correlation Parameters
Convolution and Correlation Task Status and Error Reporting
Convolution and Correlation Task Constructors
vslConvNewTask/vslCorrNewTask
vslConvNewTask1D/vslCorrNewTask1D
vslConvNewTaskX/vslCorrNewTaskX
vslConvNewTaskX1D/vslCorrNewTaskX1D
Convolution and Correlation Task Editors
vslConvSetMode/vslCorrSetMode
vslConvSetInternalPrecision/vslCorrSetInternalPrecision
vslConvSetStart/vslCorrSetStart
vslConvSetDecimation/vslCorrSetDecimation
Task Execution Routines
vslConvExec/vslCorrExec
vslConvExec1D/vslCorrExec1D
vslConvExecX/vslCorrExecX
vslConvExecX1D/vslCorrExecX1D
Convolution and Correlation Task Destructors
vslConvDeleteTask/vslCorrDeleteTask
Convolution and Correlation Task Copiers
vslConvCopyTask/vslCorrCopyTask
Convolution and Correlation Usage Examples
Convolution and Correlation Mathematical Notation and Definitions
Convolution and Correlation Data Allocation
Summary Statistics
DftiCommitDescriptor
DftiFreeDescriptor
DftiCopyDescriptor
FFT Descriptor Configuration Functions
DftiSetValue
DftiGetValue
FFT Computation Functions
DftiComputeForward
DftiComputeBackward
Configuring and Computing an FFT in Fortran
Status Checking Functions
DftiErrorClass
DftiErrorMessage
Cluster FFT Functions
Computing Cluster FFT
Distributing Data among Processes
Cluster FFT Interface
Cluster FFT Descriptor Manipulation Functions
DftiCreateDescriptorDM
DftiCommitDescriptorDM
DftiFreeDescriptorDM
Cluster FFT Computation Functions
DftiComputeForwardDM
DftiComputeBackwardDM
Cluster FFT Descriptor Configuration Functions
DftiSetValueDM
DftiGetValueDM
Error Codes
p?ahemv
p?her
p?her2
p?symv
p?asymv
p?syr
p?syr2
p?trmv
p?atrmv
p?trsv
PBLAS Level 3 Routines
p?geadd
p?tradd
p?gemm
p?hemm
p?herk
p?her2k
p?symm
p?syrk
p?syr2k
p?tran
p?tranu
p?tranc
p?trmm
p?trsm
?_commit_sph_p/?_commit_sph_np
?_sph_p/?_sph_np
free_sph_p/free_sph_np
Common Parameters for the Poisson Solver
ipar
dpar and spar
Caveat on Parameter Modifications
Parameters That Define Boundary Conditions
Poisson Solver Implementation Details
Calling PDE Support Routines from Fortran
pxerbla
Handling Fatal Errors
mkl_set_exit_handler
Character Equality Testing
lsame
lsamen
Timing
second/dsecnd
mkl_get_cpu_clocks
mkl_get_cpu_frequency
mkl_get_max_cpu_frequency
mkl_get_clocks_frequency
Memory Management
mkl_free_buffers
mkl_thread_free_buffers
mkl_disable_fast_mm
mkl_mem_stat
mkl_peak_mem_usage
mkl_malloc
mkl_calloc
mkl_realloc
mkl_free
mkl_set_memory_limit
Usage Examples for the Memory Functions
Single Dynamic Library Control
mkl_set_interface_layer
mkl_set_threading_layer
mkl_set_xerbla
mkl_set_progress
mkl_set_pardiso_pivot
Intel Many Integrated Core Architecture Support
mkl_mic_enable
mkl_mic_disable
mkl_mic_get_device_count
mkl_mic_set_workdivision
mkl_mic_get_workdivision
mkl_mic_set_max_memory
mkl_mic_free_memory
mkl_mic_register_memory
mkl_mic_set_device_num_threads
mkl_mic_set_resource_limit
mkl_mic_get_resource_limit
mkl_mic_set_offload_report
mkl_mic_set_flags
mkl_mic_get_flags
mkl_mic_get_status
mkl_mic_clear_status
mkl_mic_get_meminfo
mkl_mic_get_cpuinfo
Conditional Numerical Reproducibility Control
mkl_cbwr_set
mkl_cbwr_get
mkl_cbwr_get_auto_branch
Named Constants for CNR Control
Reproducibility Conditions
Usage Examples for CNR Support Functions
Miscellaneous
mkl_progress
mkl_enable_instructions
mkl_set_env_mode
mkl_verbose
mkl_set_mpi
mkl_finalize
Bibliography
Glossary
Legal Information
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this
document.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of
merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from
course of performance, course of dealing, or usage in trade.
This document contains information on products, services and/or processes in development. All information
provided here is subject to change without notice. Contact your Intel representative to obtain the latest
forecast, schedule, specifications and roadmaps.
The products and services described may contain defects or errors which may cause deviations from
published specifications. Current characterized errata are available on request.
Software and workloads used in performance tests may have been optimized for performance only on Intel
microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific
computer systems, components, software, operations and functions. Any change to any of those factors may
cause the results to vary. You should consult other information and performance tests to assist you in fully
evaluating your contemplated purchases, including the performance of that product when combined with
other products.
Cilk, Intel, the Intel logo, Intel Atom, Intel Core, Intel Inside, Intel NetBurst, Intel SpeedStep, Intel vPro,
Intel Xeon Phi, Intel XScale, Itanium, MMX, Pentium, Thunderbolt, Ultrabook, VTune and Xeon are
trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation
in the United States and/or other countries.
Java is a registered trademark of Oracle and/or its affiliates.
Intel MKL supports LAPACK 3.5 set of computational, driver, auxiliary and utility routines under the
following license:
Copyright 1992-2011 The University of Tennessee and The University of Tennessee Research Foundation. All rights
reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the
following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following
disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
disclaimer listed in this license in the documentation and/or other materials provided with the distribution.
Neither the name of the copyright holders nor the names of its contributors may be used to endorse or promote
products derived from this software without specific prior written permission.
The copyright holders provide no reassurances that the source code provided does not infringe any patent, copyright,
or any other intellectual property rights of third parties. The copyright holders disclaim any liability to any recipient for
claims brought against recipient by any third party for infringement of that parties intellectual property rights.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The original versions of LAPACK from which that part of Intel MKL was derived can be obtained from
http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S.
Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D.
Sorensen.
The original versions of the Basic Linear Algebra Subprograms (BLAS) from which the respective part of
Intel MKL was derived can be obtained from http://www.netlib.org/blas/index.html.
XBLAS is distributed under the following copyright:
Copyright 2008-2009 The University of California Berkeley. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided
that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the
following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the
following disclaimer listed in this license in the documentation and/or other materials provided with the
distribution.
- Neither the name of the copyright holders nor the names of its contributors may be used to endorse or
promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The original versions of the Basic Linear Algebra Communication Subprograms (BLACS) from which the
respective part of Intel MKL was derived can be obtained from http://www.netlib.org/blacs/index.html.
The authors of BLACS are Jack Dongarra and R. Clint Whaley.
The original versions of Scalable LAPACK (ScaLAPACK) from which the respective part of Intel MKL was
derived can be obtained from http://www.netlib.org/scalapack/index.html. The authors of ScaLAPACK are
L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G.
Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley.
The original versions of the Parallel Basic Linear Algebra Subprograms (PBLAS) routines from which the
respective part of Intel MKL was derived can be obtained from
http://www.netlib.org/scalapack/html/pblas_qref.html.
PARDISO (PARallel DIrect SOlver)* in Intel MKL was originally developed by the Department of Computer
Science at the University of Basel (http://www.unibas.ch). It can be obtained at
http://www.pardiso-project.org.
The Extended Eigensolver functionality is based on the Feast solver package and is distributed under the
following license:
The above copyright notice and this permission notice shall be included in all copies or substantial portions
of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT
OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
HPL Copyright Notice and Licensing Terms
Redistribution and use in source and binary forms, with or without modification, are permitted provided
that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the
following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions, and the
following disclaimer in the documentation and/or other materials provided with the distribution.
3. All advertising materials mentioning features or use of this software must display the following
acknowledgement: This product includes software developed at the University of Tennessee, Knoxville,
Innovative Computing Laboratories.
4. The name of the University, the name of the Laboratory, or the names of its contributors may not be
used to endorse or promote products derived from this software without specific written permission.
Copyright 2017, Intel Corporation. All rights reserved.
Introducing the Intel Math Kernel Library
The Intel Math Kernel Library (Intel MKL) improves performance with math routines for software
applications that solve large computational problems. Intel MKL provides BLAS and LAPACK linear algebra
routines, fast Fourier transforms, vectorized math functions, random number generation functions, and other
functionality.
NOTE
It is your responsibility when using Intel MKL to ensure that input data has the required format and
does not contain invalid characters. These can cause unexpected behavior of the library.
The library requires subroutine and function parameters to be valid before being passed. While some
Intel MKL routines do limited checking of parameter errors, your application should check for NULL
pointers, for example.
Intel MKL is optimized for the latest Intel processors, including processors with multiple cores (see the Intel
MKL Release Notes for the full list of supported processors). Intel MKL also performs well on non-Intel
processors.
For more details about functionality provided by Intel MKL, see the Function Domains section.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and
SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-
dependent optimizations in this product are intended for use with Intel microprocessors. Certain
optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to
the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
Getting Help and Support
Intel provides a support website that contains a rich repository of self-help information, including getting
started tips, known product issues, product errata, license information, user forums, and more. Visit the Intel
MKL support website at http://www.intel.com/software/products/support/.
What's New
This Developer Reference documents the Intel Math Kernel Library (Intel MKL) 2017 Update 2 release for the
Fortran interface.
NOTE
This publication, the Intel Math Kernel Library Developer Reference, was previously known as the Intel
Math Kernel Library Reference Manual.
The following function domains were updated with new functions, enhancements to the existing functionality,
or improvements to the existing documentation:
New LAPACK functions provide factorization particularly for tall and skinny matrices. See ?gelq, ?geqr,
?gemlq, ?gemqr, and ?getsls.
Support functions have been added to specify and return the number of partitions along the leading
dimension of the output matrix for parallel ?gemm functions. For more details, see mkl_set_num_stripes
and mkl_get_num_stripes.
Support of ScaLAPACK by the progress routine has been implemented. For more details, see
mkl_progress and mkl_set_progress.
The manual has also been updated to reflect other enhancements to the product, and minor improvements
and error corrections have been made.
Notational Conventions
This manual uses the following terms to refer to operating systems:
Windows* OS
    This term refers to information that is valid on all supported Windows* operating systems.
Linux* OS
    This term refers to information that is valid on all supported Linux* operating systems.
macOS*
    This term refers to information that is valid on Intel-based systems running the macOS* operating system.
?swap
    Refers to all four data types of the vector-vector ?swap routine: sswap, dswap, cswap, and zswap.
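As a rough illustration of this shorthand, the sketch below calls each of the four routines that ?swap stands for. It assumes the program is linked against Intel MKL (or another BLAS library) through the default interface with 32-bit integers; the program name and variable names are illustrative only.

```fortran
program swap_prefix_sketch
  implicit none
  integer, parameter :: n = 3
  real                 :: sx(n), sy(n)   ! s: single precision real
  double precision     :: dx(n), dy(n)   ! d: double precision real
  complex              :: cx(n), cy(n)   ! c: single precision complex
  complex(kind(1.0d0)) :: zx(n), zy(n)   ! z: double precision complex
  external :: sswap, dswap, cswap, zswap

  sx = 1.0;              sy = 2.0
  dx = 1.0d0;            dy = 2.0d0
  cx = (1.0, 0.0);       cy = (2.0, 0.0)
  zx = (1.0d0, 0.0d0);   zy = (2.0d0, 0.0d0)

  ! The argument list is the same for every data type: n, x, incx, y, incy
  call sswap(n, sx, 1, sy, 1)
  call dswap(n, dx, 1, dy, 1)
  call cswap(n, cx, 1, cy, 1)
  call zswap(n, zx, 1, zy, 1)

  ! After the calls each x array holds the former contents of its y array
  print *, sx(1), dx(1), real(cx(1)), real(zx(1))
end program swap_prefix_sketch
```

Only the leading character of the routine name and the declared type of the vectors change between the four calls.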
Font Conventions
The following font conventions are used:
UPPERCASE COURIER
    Data type used in the description of input and output parameters for Fortran interface. For example, CHARACTER*1.
lowercase courier italic
    Variables in arguments and parameters description. For example, incx.
Function Domains 1
The Intel Math Kernel Library includes Fortran routines and functions optimized for Intel processor-based
computers running operating systems that support multiprocessing. In addition to the Fortran interface, Intel
MKL includes a C-language interface for the Discrete Fourier transform functions, as well as for the Vector
Mathematics and Vector Statistics functions. For hardware and software requirements to use Intel MKL, see
Intel MKL Release Notes.
BLAS Routines
The BLAS routines and functions are divided into the following groups according to the operations they
perform:
BLAS Level 1 Routines perform operations of both addition and reduction on vectors of data. Typical
operations include scaling and dot products.
BLAS Level 2 Routines perform matrix-vector operations, such as matrix-vector multiplication, rank-1 and
rank-2 matrix updates, and solution of triangular systems.
BLAS Level 3 Routines perform matrix-matrix operations, such as matrix-matrix multiplication, rank-k
update, and solution of triangular systems.
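The three levels can be sketched with one routine from each group. This is a minimal example rather than a recommended usage pattern: it assumes double precision data, linking against Intel MKL (or another BLAS library) through the default interface with 32-bit integers, and small hard-coded inputs chosen only so the results are easy to check by hand.

```fortran
program blas_levels_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  real(dp) :: x(n), y(n), a(n,n), b(n,n), c(n,n)
  real(dp) :: d
  real(dp), external :: ddot
  external :: dgemv, dgemm

  x = [1.0_dp, 2.0_dp]
  y = [3.0_dp, 4.0_dp]
  ! Column-major storage: a = [1 2; 3 4], b = identity
  a = reshape([1.0_dp, 3.0_dp, 2.0_dp, 4.0_dp], [n, n])
  b = reshape([1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp], [n, n])
  c = 0.0_dp

  ! Level 1 (vector-vector): dot product d = x . y = 1*3 + 2*4 = 11
  d = ddot(n, x, 1, y, 1)

  ! Level 2 (matrix-vector): y := 1.0*A*x + 0.0*y = [5, 11]
  call dgemv('N', n, n, 1.0_dp, a, n, x, 1, 0.0_dp, y, 1)

  ! Level 3 (matrix-matrix): C := 1.0*A*B + 0.0*C, which equals A here
  call dgemm('N', 'N', n, n, n, 1.0_dp, a, n, b, n, 0.0_dp, c, n)

  print *, 'ddot :', d
  print *, 'dgemv:', y
  print *, 'dgemm:', c
end program blas_levels_sketch
```

The leading dimension arguments (lda, ldb, ldc) are all n because the arrays are declared with exactly n rows.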
Starting from release 8.0, Intel MKL also supports the Fortran 95 interface to the BLAS routines.
Starting from release 10.1, a number of BLAS-like Extensions are added to enable the user to perform
certain data manipulation, including matrix in-place and out-of-place transposition operations combined with
simple matrix arithmetic operations.
LAPACK Routines
The Intel Math Kernel Library fully supports the LAPACK 3.6 set of computational, driver, auxiliary and utility
routines.
The original versions of LAPACK from which that part of Intel MKL was derived can be obtained from
http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S. Blackford, J.
Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen.
The LAPACK routines can be divided into the following groups according to the operations they perform:
Routines for solving systems of linear equations, factoring and inverting matrices, and estimating
condition numbers (see LAPACK Routines: Linear Equations).
Routines for solving least squares problems, eigenvalue and singular value problems, and Sylvester's
equations (see LAPACK Routines: Least Squares and Eigenvalue Problems).
Auxiliary and utility routines used to perform certain subtasks, common low-level computation or related
tasks (see LAPACK Auxiliary Routines and LAPACK Utility Functions and Routines).
Starting from release 8.0, Intel MKL also supports the Fortran 95 interface to LAPACK computational and
driver routines. This interface provides an opportunity for simplified calls of LAPACK routines with fewer
required arguments.
ScaLAPACK Routines
The ScaLAPACK package (provided only for Intel 64 and Intel Many Integrated Core architectures, see
Chapter 6) runs on distributed-memory architectures and includes routines for solving systems of linear
equations, solving linear least squares problems, eigenvalue and singular value problems, as well as
performing a number of related computational tasks.
The original versions of ScaLAPACK from which that part of Intel MKL was derived can be obtained from
http://www.netlib.org/scalapack/index.html. The authors of ScaLAPACK are L. Blackford, J. Choi, A. Cleary, E.
D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and
R. Whaley.
The Intel MKL version of ScaLAPACK is optimized for Intel processors and uses the MPICH version of MPI as well
as Intel MPI.
PBLAS Routines
The PBLAS routines perform operations with distributed vectors and matrices.
PBLAS Level 1 Routines perform operations of both addition and reduction on vectors of data. Typical
operations include scaling and dot products.
PBLAS Level 2 Routines perform distributed matrix-vector operations, such as matrix-vector multiplication,
rank-1 and rank-2 matrix updates, and solution of triangular systems.
PBLAS Level 3 Routines perform distributed matrix-matrix operations, such as matrix-matrix
multiplication, rank-k update, and solution of triangular systems.
Intel MKL provides the PBLAS routines with an interface similar to the one used in the Netlib PBLAS (part of
the ScaLAPACK package, see http://www.netlib.org/scalapack/html/pblas_qref.html).
from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms [Bai00]) or
other Davidson-Jacobi techniques [Sleijpen96]. The Feast algorithm is inspired by the density-matrix
representation and contour integration technique in quantum mechanics.
It is free from orthogonalization procedures. Its main computational tasks consist of solving very few inner
independent linear systems with multiple right-hand sides and one reduced eigenvalue problem orders of
magnitude smaller than the original one. The Feast algorithm combines simplicity and efficiency and offers
many important capabilities for achieving high performance, robustness, accuracy, and scalability on parallel
architectures. This algorithm is expected to significantly augment numerical performance in large-scale
modern applications.
Some of the characteristics of the Feast algorithm [Polizzi09] are:
VM Functions
The Vector Mathematics functions (see Chapter 9) include a set of highly optimized implementations of
certain computationally expensive core mathematical functions (power, trigonometric, exponential,
hyperbolic, etc.) that operate on vectors of real and complex numbers.
Application programs that might significantly improve performance with VM include nonlinear programming
software, computation of integrals, and many others. VM provides interfaces for both Fortran and C.
Statistical Functions
Vector Statistics (VS) contains three sets of functions (see Chapter 10) providing:
Pseudorandom, quasi-random, and non-deterministic random number generator subroutines
implementing basic continuous and discrete distributions. To provide best performance, the VS
subroutines use calls to highly optimized Basic Random Number Generators (BRNGs) and a set of vector
mathematical functions.
A wide variety of convolution and correlation operations.
Initial statistical analysis of raw single and double precision multi-dimensional datasets.
The Trigonometric Transform routines may be helpful to users who implement their own solvers similar to the
Intel MKL Poisson Solver. The users can improve performance of their solvers by using fast sine, cosine, and
staggered cosine transforms implemented in the Trigonometric Transform interface.
The Poisson Solver is designed for fast solving of simple Helmholtz, Poisson, and Laplace problems. The
Trigonometric Transform interface, which underlies the solver, is based on the Intel MKL FFT interface (refer
to Chapter 11), optimized for Intel processors.
Support Functions
The Intel MKL support functions (see Chapter 15) are used to support the operation of the Intel MKL
software and provide basic information on the library and library operation, such as the current library
version, timing, setting and measuring of CPU frequency, error handling, and memory allocation.
Starting from release 10.0, the Intel MKL support functions provide additional threading control.
Starting from release 10.1, Intel MKL selectively supports a Progress Routine feature to track progress of a
lengthy computation and/or interrupt the computation using a callback function mechanism. The user
application can define a function called mkl_progress that is regularly called from the Intel MKL routine
supporting the progress routine feature. See the Progress Routines section in Chapter 15 for reference. Refer
to a specific LAPACK or DSS/PARDISO function description to see whether the function supports this feature
or not.
BLACS Routines
The Intel Math Kernel Library implements routines from the BLACS (Basic Linear Algebra Communication
Subprograms) package (see Chapter 16) that are used to support a linear algebra oriented message passing
interface that may be implemented efficiently and uniformly across a large range of distributed memory
platforms.
The original versions of BLACS from which that part of Intel MKL was derived can be obtained from
http://www.netlib.org/blacs/index.html. The authors of BLACS are Jack Dongarra and R. Clint Whaley.
spline construction
interpolation including computation of derivatives and integration
search
The algorithms operate on single and double precision vector-valued functions defined at the points of the given partition.
You can use Data Fitting algorithms in applications that are based on data approximation. See Data Fitting
Functions for more information.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and
SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-
dependent optimizations in this product are intended for use with Intel microprocessors. Certain
optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to
the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
Performance Enhancements
The Intel Math Kernel Library has been optimized by exploiting both processor and system features and
capabilities. Special care has been given to those routines that most profit from cache-management
techniques. These especially include matrix-matrix operation routines such as dgemm().
In addition, code optimization techniques have been applied to minimize dependencies of scheduling integer
and floating-point units on the results within the processor.
The major optimization techniques used throughout the library include:
Parallelism
In addition to the performance enhancements discussed above, Intel MKL offers performance gains through
parallelism provided by the symmetric multiprocessing performance (SMP) feature. You can obtain
improvements from SMP in the following ways:
One way is based on user-managed threads in the program and further distribution of the operations over
the threads based on data decomposition, domain decomposition, control decomposition, or some other
parallelizing technique. Each thread can use any of the Intel MKL functions (except for the deprecated
?lacon LAPACK routine) because the library has been designed to be thread-safe.
Another method is to use the FFT and BLAS level 3 routines. They have been parallelized and require no
alterations of your application to gain the performance enhancements of multiprocessing. Performance
using multiple processors on the level 3 BLAS shows excellent scaling. Since the threads are called and
managed within the library, the application does not need to be recompiled thread-safe.
Yet another method is to use tuned LAPACK routines. Currently these include the single- and
double-precision flavors of routines for QR factorization of general matrices, triangular factorization of general
and symmetric positive-definite matrices, solving systems of equations with such matrices, as well as
solving symmetric eigenvalue problems.
For instructions on setting the number of available processors for the BLAS level 3 and LAPACK routines, see
the Intel MKL Developer Guide.
BLAS and Sparse BLAS Routines 2
This chapter describes the Intel Math Kernel Library implementation of the BLAS and Sparse BLAS routines,
and BLAS-like extensions. The routine descriptions are arranged in several sections:
BLAS Level 1 Routines (vector-vector operations)
BLAS Level 2 Routines (matrix-vector operations)
BLAS Level 3 Routines (matrix-matrix operations)
Sparse BLAS Level 1 Routines (vector-vector operations)
Sparse BLAS Level 2 and Level 3 Routines (matrix-vector and matrix-matrix operations)
BLAS-like Extensions
Each section presents the routine and function group descriptions in alphabetical order by routine or function
group name; for example, the ?asum group, the ?axpy group. The question mark in the group name
corresponds to different character codes indicating the data type (s, d, c, and z or their combination); see
Routine Naming Conventions.
When BLAS or Sparse BLAS routines encounter an error, they call the error reporting routine xerbla.
In BLAS Level 1 groups i?amax and i?amin, an "i" is placed before the data-type indicator and corresponds
to the index of an element in the vector. These groups are placed at the end of the BLAS Level 1 section.
BLAS Routines
Some routines and functions can have combined character codes, such as sc or dz.
For example, the function scasum uses a complex input array and returns a real value.
The <name> field, in BLAS level 1, indicates the operation type. For example, the BLAS level 1 routines
?dot, ?rot, ?swap compute a vector dot product, vector rotation, and vector swap, respectively.
In BLAS level 2 and 3, <name> reflects the matrix argument type:
ge general matrix
sy symmetric matrix
he Hermitian matrix
tr triangular matrix
The <mod> field, if present, provides additional details of the operation. BLAS level 1 names can have the
following characters in the <mod> field:
c conjugated vector
u unconjugated vector
BLAS level 2 names can have the following characters in the <mod> field:
mv matrix-vector product
BLAS level 3 names can have the following characters in the <mod> field:
mm matrix-matrix product
scasum <sc> <asum>: sum of magnitudes of vector elements, single precision real
output and single precision complex input
sgemv <s> <ge> <mv>: matrix-vector product, general matrix, single precision
ztrmm <z> <tr> <mm>: matrix-matrix product, triangular matrix, double-precision
complex.
Sparse BLAS level 1 naming conventions are similar to those of BLAS level 1. For more information, see
Naming Conventions.
NOTE
For BLAS, Intel MKL offers two types of Fortran 95 interfaces:
Using mkl_blas.fi, only through the include 'mkl.fi' statement. Such interfaces allow you to
make use of the original BLAS routines with all their arguments.
Using blas.f90, which includes improved interfaces. This file is used to generate the module files
blas95.mod and f95_precision.mod. The module files are used to process the Fortran use clauses
referencing the BLAS interface: use blas95 and use f95_precision. See also the section "Fortran 95
interfaces and wrappers to LAPACK and BLAS" of the Intel MKL Developer Guide for details.
The names of parameters used in Fortran 95 interface are typically the same as those used for the
respective generic (FORTRAN 77) interface. In rare cases formal argument names may be different.
Some input parameters such as array dimensions are not required in Fortran 95 and are skipped from the
calling sequence. Array dimensions are reconstructed from the user data that must exactly follow the
required array shape.
A parameter can be skipped if its value is completely defined by the presence or absence of another
parameter in the calling sequence, and the restored value is the only meaningful value for the skipped
parameter.
Parameters specifying the increment values incx and incy are skipped. In most cases their values are
equal to 1. In Fortran 95 an increment with different value can be directly established in the
corresponding parameter.
Some generic parameters are declared as optional in Fortran 95 interface and may or may not be present
in the calling sequence. A parameter can be declared optional if it satisfies one of the following conditions:
1. It can take only a few possible values. The default value of such a parameter typically is the first value in
the list; all exceptions to this rule are explicitly stated in the routine description.
2. It has a natural default value.
Optional parameters are given in square brackets in Fortran 95 call syntax.
The particular rules used for reconstructing the values of omitted optional parameters are specific for each
routine and are detailed in the respective "Fortran 95 Notes" subsection at the end of routine specification
section. If this subsection is omitted, the Fortran 95 interface for the given routine does not differ from the
corresponding FORTRAN 77 interface.
Note that this interface is not implemented in the current version of Sparse BLAS Level 2 and Level 3
routines.
Full storage: a matrix A is stored in a two-dimensional array a, with the matrix element Aij stored in the
array element a(i,j).
Packed storage: symmetric, Hermitian, or triangular matrices are stored more compactly, with
the upper or lower triangle of the matrix packed by columns in a one-dimensional array.
Band storage: a band matrix is stored compactly in a two-dimensional array: columns of the matrix are
stored in the corresponding columns of the array, and diagonals of the matrix are stored in rows of the
array.
For more information on matrix storage schemes, see Matrix Arguments in Appendix B.
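As a rough illustration of the packed scheme, the Python sketch below packs the upper or lower triangle of a small square matrix by columns into a one-dimensional list. The function name pack_triangle and the 0-based list-of-rows layout are assumptions made for this example only; they are not part of Intel MKL.

```python
def pack_triangle(a, uplo="U"):
    """Pack one triangle of the square matrix a (list of rows) by columns."""
    n = len(a)
    ap = []
    for j in range(n):                      # walk the columns left to right
        if uplo.upper() == "U":
            rows = range(0, j + 1)          # rows 0..j of column j (upper)
        else:
            rows = range(j, n)              # rows j..n-1 of column j (lower)
        for i in rows:
            ap.append(a[i][j])
    return ap


a = [[11, 12, 13],
     [12, 22, 23],
     [13, 23, 33]]
upper = pack_triangle(a, "U")   # [11, 12, 22, 13, 23, 33]
lower = pack_triangle(a, "L")   # [11, 12, 13, 22, 23, 33]
```

Either way the packed array holds n*(n+1)/2 elements. With 1-based indexing, the upper-triangle element A(i,j), i <= j, lands at position i + j*(j-1)/2 of the packed array.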
?asum
Computes the sum of magnitudes of the vector
elements.
Syntax
res = sasum(n, x, incx)
res = scasum(n, x, incx)
res = dasum(n, x, incx)
res = dzasum(n, x, incx)
res = asum(x)
Include Files
mkl.fi, blas.f90
Description
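The ?asum routine computes the sum of the magnitudes of elements of a real vector, or the sum of
magnitudes of the real and imaginary parts of elements of a complex vector:
res = |Re(x(1))| + |Im(x(1))| + |Re(x(2))| + |Im(x(2))| + ... + |Re(x(n))| + |Im(x(n))|,
where x is a vector with n elements.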
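The operation described above can be sketched in plain Python. This is a reference for the semantics only, not the library code; the function name asum mirrors the generic interface, and the sketch assumes a positive increment incx.

```python
def asum(n, x, incx=1):
    """Sum of |x_i| for real x, or |Re x_i| + |Im x_i| for complex x."""
    total = 0.0
    for i in range(n):
        v = x[i * incx]                     # i-th vector element, 0-based
        if isinstance(v, complex):
            total += abs(v.real) + abs(v.imag)
        else:
            total += abs(v)
    return total


real_result = asum(3, [1.0, -2.0, 3.0])               # 6.0
complex_result = asum(2, [3 - 4j, -1 + 2j])           # 10.0
strided_result = asum(2, [1.0, 100.0, -2.0], incx=2)  # 3.0
```

Note that for complex data the result is not the sum of moduli: the element 3 - 4j contributes 7, not 5.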
Input Parameters
Output Parameters
?axpy
Computes a vector-scalar product and adds the result
to a vector.
Syntax
call saxpy(n, a, x, incx, y, incy)
call daxpy(n, a, x, incx, y, incy)
call caxpy(n, a, x, incx, y, incy)
call zaxpy(n, a, x, incx, y, incy)
call axpy(x, y [,a])
Include Files
mkl.fi, blas.f90
Description
y := a*x + y
where:
a is a scalar
x and y are vectors each with a number of elements that equals n.
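A minimal Python sketch of this update, including the role of the increments, is given below. It is an illustration of the semantics rather than the MKL implementation. The handling of negative increments (traversing the array backwards from its far end) follows the reference Netlib BLAS convention and is stated here as an assumption about the intended behavior.

```python
def axpy(n, a, x, incx, y, incy):
    """Update y in place: y := a*x + y over n strided elements."""
    # For a negative increment, start from the far end of the array,
    # as the reference BLAS does, so the vector is traversed backwards.
    ix = 0 if incx >= 0 else (n - 1) * (-incx)
    iy = 0 if incy >= 0 else (n - 1) * (-incy)
    for _ in range(n):
        y[iy] += a * x[ix]
        ix += incx
        iy += incy
    return y


x = [1.0, 2.0, 3.0]
y = [10.0, 0.0, 20.0, 0.0, 30.0]   # size 1 + (n-1)*abs(incy) = 5
axpy(3, 2.0, x, 1, y, 2)           # y becomes [12.0, 0.0, 24.0, 0.0, 36.0]
```

The elements of y that the stride skips are left untouched, which is why the array must have at least 1 + (n-1)*abs(incy) entries.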
Input Parameters
Array, size at least (1 + (n-1)*abs(incy)).
Output Parameters
?copy
Copies a vector to another vector.
Syntax
call scopy(n, x, incx, y, incy)
call dcopy(n, x, incx, y, incy)
call ccopy(n, x, incx, y, incy)
call zcopy(n, x, incx, y, incy)
call copy(x, y)
Include Files
mkl.fi, blas.f90
Description
y = x,
Input Parameters
Output Parameters
?dot
Computes a vector-vector dot product.
Syntax
res = sdot(n, x, incx, y, incy)
res = ddot(n, x, incx, y, incy)
res = dot(x, y)
Include Files
mkl.fi, blas.f90
Description
Input Parameters
Output Parameters
?sdot
Computes a vector-vector dot product with double
precision.
Syntax
res = sdsdot(n, sb, sx, incx, sy, incy)
res = dsdot(n, sx, incx, sy, incy)
res = sdot(sx, sy)
res = sdot(sx, sy, sb)
Include Files
mkl.fi, blas.f90
Description
The ?sdot routines compute the inner product of two vectors with double precision. Both routines use double
precision accumulation of the intermediate results, but the sdsdot routine outputs the final result in single
precision, whereas the dsdot routine outputs the double precision result. The function sdsdot also adds
scalar value sb to the inner product.
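The effect of double precision accumulation can be sketched in Python, whose float type is double precision. The helper to_single (an assumption of this example, built on the standard struct module) rounds a value to single precision. The function naive_single_dot is included only for contrast and does not correspond to any routine in this section. Increments are omitted for brevity.

```python
import struct


def to_single(v):
    """Round a Python float (double precision) to the nearest single."""
    return struct.unpack("f", struct.pack("f", v))[0]


def dsdot(sx, sy):
    """Single precision inputs, double accumulation, double result."""
    acc = 0.0
    for a, b in zip(sx, sy):
        acc += to_single(a) * to_single(b)  # product and sum kept in double
    return acc


def sdsdot(sb, sx, sy):
    """Same accumulation plus sb, with the result rounded to single."""
    return to_single(to_single(sb) + dsdot(sx, sy))


def naive_single_dot(sx, sy):
    """For contrast only: every intermediate is rounded to single."""
    acc = 0.0
    for a, b in zip(sx, sy):
        acc = to_single(acc + to_single(to_single(a) * to_single(b)))
    return acc


sx = [1.0e8, 1.0, -1.0e8]
sy = [1.0, 1.0, 1.0]
```

For these inputs dsdot(sx, sy) gives 1.0 and sdsdot(0.5, sx, sy) gives 1.5, whereas the all-single accumulation loses the small term and returns 0.0.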
Input Parameters
n INTEGER. Specifies the number of elements in the input vectors sx and sy.
sb REAL. Single precision scalar to be added to inner product (for the function
sdsdot only).
sx, sy REAL.
Arrays, size at least (1+(n -1)*abs(incx)) and (1+(n-1)*abs(incy)),
respectively. Contain the input single precision vectors.
Output Parameters
NOTE
Note that the scalar parameter sb is declared as a required parameter in the Fortran 95 interface for the
function sdot to distinguish between function flavors that output the final result in different precision.
?dotc
Computes a dot product of a conjugated vector with
another vector.
Syntax
res = cdotc(n, x, incx, y, incy)
res = zdotc(n, x, incx, y, incy)
res = dotc(x, y)
Include Files
mkl.fi, blas.f90
Description
Input Parameters
Output Parameters
?dotu
Computes a vector-vector dot product.
Syntax
res = cdotu(n, x, incx, y, incy)
Include Files
mkl.fi, blas.f90
Description
Input Parameters
Output Parameters
?nrm2
Computes the Euclidean norm of a vector.
Syntax
res = snrm2(n, x, incx)
res = dnrm2(n, x, incx)
res = scnrm2(n, x, incx)
res = dznrm2(n, x, incx)
res = nrm2(x)
Include Files
mkl.fi, blas.f90
Description
res = ||x||,
where:
x is a vector,
res is a value containing the Euclidean norm of the elements of x.
Input Parameters
Output Parameters
?rot
Performs rotation of points in the plane.
Syntax
call srot(n, x, incx, y, incy, c, s)
call drot(n, x, incx, y, incy, c, s)
call csrot(n, x, incx, y, incy, c, s)
call zdrot(n, x, incx, y, incy, c, s)
call rot(x, y, c, s)
Include Files
mkl.fi, blas.f90
Description
Given two complex vectors x and y, each vector element of these vectors is replaced as follows:
xi = c*xi + s*yi
yi = c*yi - s*xi
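The two assignments are applied simultaneously: the new yi is computed from the old xi. A minimal Python sketch of this, with unit increments assumed and the generic name rot used for illustration only:

```python
def rot(x, y, c, s):
    """Apply the plane rotation to the point pairs (x[i], y[i]) in place."""
    for i in range(len(x)):
        xi, yi = x[i], y[i]                 # keep the old values of both
        x[i] = c * xi + s * yi
        y[i] = c * yi - s * xi              # uses the old x[i], not the new one
    return x, y


x = [1.0, 0.0]
y = [0.0, 1.0]
rot(x, y, 0.0, 1.0)                         # c = cos(90 deg), s = sin(90 deg)
# x is now [0.0, 1.0] and y is now [-1.0, 0.0]
```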
Input Parameters
A scalar.
Output Parameters
?rotg
Computes the parameters for a Givens rotation.
Syntax
call srotg(a, b, c, s)
call drotg(a, b, c, s)
call crotg(a, b, c, s)
call zrotg(a, b, c, s)
call rotg(a, b, c, s)
Include Files
mkl.fi, blas.f90
Description
Given the Cartesian coordinates (a, b) of a point, these routines return the parameters c, s, r, and z
associated with the Givens rotation. The parameters c and s define a unitary matrix such that:
The parameter z is defined such that if |a| > |b|, z is s; otherwise, if c is not 0, z is 1/c; otherwise z is 1.
For a more accurate version, see the LAPACK routine ?lartg.
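For the real flavors, the computation of r, z, c, and s can be sketched in Python as follows. The sketch follows the classic reference BLAS formulation (scaling by |a| + |b| and giving r the sign of the larger input), which is an assumption about the details rather than a statement about the MKL implementation. The return order (r, z, c, s) is a choice made for this example.

```python
import math


def rotg(a, b):
    """Return (r, z, c, s) for the real Givens rotation of the point (a, b)."""
    roe = a if abs(a) > abs(b) else b       # r takes the sign of the larger
    scale = abs(a) + abs(b)
    if scale == 0.0:
        return 0.0, 0.0, 1.0, 0.0           # nothing to rotate
    r = scale * math.sqrt((a / scale) ** 2 + (b / scale) ** 2)
    r = math.copysign(r, roe)
    c, s = a / r, b / r
    if abs(a) > abs(b):
        z = s
    elif c != 0.0:
        z = 1.0 / c
    else:
        z = 1.0
    return r, z, c, s


r, z, c, s = rotg(3.0, 4.0)
```

In the real case the resulting c and s satisfy c*a + s*b = r and -s*a + c*b = 0, up to rounding.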
Input Parameters
Output Parameters
?rotm
Performs modified Givens rotation of points in the
plane.
Syntax
call srotm(n, x, incx, y, incy, param)
call drotm(n, x, incx, y, incy, param)
call rotm(x, y, param)
Include Files
mkl.fi, blas.f90
Description
Given two vectors x and y, each vector element of these vectors is replaced as follows:
\[
\begin{bmatrix} x_i \\ y_i \end{bmatrix} = H \begin{bmatrix} x_i \\ y_i \end{bmatrix}
\]
for i=1 to n, where H is a modified Givens transformation matrix whose values are stored in the param(2)
through param(5) array. See discussion on the param argument.
Input Parameters
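flag = 0.0: \( H = \begin{bmatrix} 1.0 & h_{12} \\ h_{21} & 1.0 \end{bmatrix} \)
flag = 1.0: \( H = \begin{bmatrix} h_{11} & 1.0 \\ -1.0 & h_{22} \end{bmatrix} \)
flag = -2.0: \( H = \begin{bmatrix} 1.0 & 0.0 \\ 0.0 & 1.0 \end{bmatrix} \)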
In the last three cases, the matrix entries of 1.0, -1.0, and 0.0 are assumed
based on the value of flag and are not required to be set in the param
vector.
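How flag selects the implied entries of H can be sketched in Python. The layout assumed here, param(1) = flag followed by h11, h21, h12, h22 in param(2) through param(5), is the reference BLAS convention; the function is an illustration with unit increments, not the library routine.

```python
def rotm(x, y, param):
    """Apply the modified Givens matrix H described by param, in place."""
    flag, h11, h21, h12, h22 = param        # param(1) .. param(5)
    if flag == -2.0:
        return x, y                         # H is the identity
    if flag == 0.0:
        h11, h22 = 1.0, 1.0                 # implied diagonal
    elif flag == 1.0:
        h12, h21 = 1.0, -1.0                # implied off-diagonal
    # flag == -1.0: all four entries are taken from param as given
    for i in range(len(x)):
        xi, yi = x[i], y[i]
        x[i] = h11 * xi + h12 * yi
        y[i] = h21 * xi + h22 * yi
    return x, y


x = [1.0, 2.0]
y = [3.0, 4.0]
# The two 9.0 entries are ignored because flag = 0.0 implies h11 = h22 = 1.0.
rotm(x, y, [0.0, 9.0, 0.5, 2.0, 9.0])
```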
Output Parameters
?rotmg
Computes the parameters for a modified Givens
rotation.
Syntax
call srotmg(d1, d2, x1, y1, param)
call drotmg(d1, d2, x1, y1, param)
call rotmg(d1, d2, x1, y1, param)
Include Files
mkl.fi, blas.f90
Description
Given Cartesian coordinates (x1, y1) of an input vector, these routines compute the components of a
modified Givens transformation matrix H that zeros the y-component of the resulting vector:
\[
\begin{bmatrix} x_1 \\ 0 \end{bmatrix} = H \begin{bmatrix} x_1 \sqrt{d_1} \\ y_1 \sqrt{d_2} \end{bmatrix}
\]
Input Parameters
y1 REAL for srotmg
DOUBLE PRECISION for drotmg
Provides the y-coordinate of the input vector.
Output Parameters
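flag = 0.0: \( H = \begin{bmatrix} 1.0 & h_{12} \\ h_{21} & 1.0 \end{bmatrix} \)
flag = 1.0: \( H = \begin{bmatrix} h_{11} & 1.0 \\ -1.0 & h_{22} \end{bmatrix} \)
flag = -2.0: \( H = \begin{bmatrix} 1.0 & 0.0 \\ 0.0 & 1.0 \end{bmatrix} \)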
In the last three cases, the matrix entries of 1.0, -1.0, and 0.0 are assumed
based on the value of flag and are not required to be set in the param
vector.
?scal
Computes the product of a vector by a scalar.
Syntax
call sscal(n, a, x, incx)
call dscal(n, a, x, incx)
Include Files
mkl.fi, blas.f90
Description
x = a*x
where:
a is a scalar, x is an n-element vector.
Input Parameters
Output Parameters
x Updated vector x.
?swap
Swaps a vector with another vector.
Syntax
call sswap(n, x, incx, y, incy)
call dswap(n, x, incx, y, incy)
call cswap(n, x, incx, y, incy)
call zswap(n, x, incx, y, incy)
call swap(x, y)
Include Files
mkl.fi, blas.f90
Description
Given two vectors x and y, the ?swap routines return vectors y and x swapped, each replacing the other.
Input Parameters
Output Parameters
i?amax
Finds the index of the element with maximum
absolute value.
Syntax
index = isamax(n, x, incx)
index = idamax(n, x, incx)
index = icamax(n, x, incx)
index = izamax(n, x, incx)
index = iamax(x)
Include Files
mkl.fi, blas.f90
Description
Given a vector x, the i?amax functions return the position of the vector element x(i) that has the largest
absolute value for real flavors, or the largest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
If more than one vector element is found with the same largest absolute value, the index of the first one
encountered is returned.
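The selection rule, including the tie-breaking and the complex magnitude |Re| + |Im|, can be sketched in Python. The function returns a 1-based position to match the Fortran convention; returning 0 for an empty vector follows the reference BLAS and is an assumption of this example.

```python
def iamax(x):
    """Return the 1-based position of the element of largest magnitude."""
    def magnitude(v):
        if isinstance(v, complex):
            return abs(v.real) + abs(v.imag)
        return abs(v)

    if not x:
        return 0                            # assumption: 0 for an empty vector
    best = 1
    for i in range(2, len(x) + 1):
        if magnitude(x[i - 1]) > magnitude(x[best - 1]):   # strict: first wins
            best = i
    return best


first_of_ties = iamax([1.0, -7.0, 7.0, 2.0])       # 2
complex_case = iamax([3 + 4j, 5 + 0j, 0 - 6j])     # 1
```

In the complex example the magnitudes compared are 7, 5, and 6, so position 1 is returned even though the element 0 - 6j has the largest modulus.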
Input Parameters
Output Parameters
index INTEGER. Contains the position of the vector element that has the largest
absolute value, that is, x(index) has the largest absolute value.
x Holds the vector with the number of elements n.
i?amin
Finds the index of the element with the smallest
absolute value.
Syntax
index = isamin(n, x, incx)
index = idamin(n, x, incx)
index = icamin(n, x, incx)
index = izamin(n, x, incx)
index = iamin(x)
Include Files
mkl.fi, blas.f90
Description
Given a vector x, the i?amin functions return the position of the vector element x(i) that has the smallest
absolute value for real flavors, or the smallest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
If more than one vector element is found with the same smallest absolute value, the index of the first one
encountered is returned.
Input Parameters
Output Parameters
index INTEGER. Indicates the position of the vector element that has the smallest
absolute value, that is, x(index) has the smallest absolute value.
?cabs1
Computes absolute value of complex number.
Syntax
res = scabs1(z)
res = dcabs1(z)
res = cabs1(z)
Include Files
mkl.fi, blas.f90
Description
The ?cabs1 is an auxiliary routine for a few BLAS Level 1 routines. This routine performs an operation
defined as
res=|Re(z)|+|Im(z)|,
where z is a scalar, and res is a value containing the absolute value of a complex number z.
Input Parameters
Output Parameters
Routine Groups Data Types Description
?gbmv
Computes a matrix-vector product using a general
band matrix
Syntax
call sgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy)
call dgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy)
call cgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy)
call zgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy)
call gbmv(a, x, y [,kl] [,m] [,alpha] [,beta] [,trans])
Include Files
mkl.fi, blas.f90
Description
y := alpha*A*x + beta*y,
or
y := alpha*A'*x + beta*y,
or
y := alpha*conjg(A')*x + beta*y,
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-n band matrix, with kl sub-diagonals and ku super-diagonals.
Input Parameters
diagonal starting at position 1 in row (ku + 2), and so on. Elements in the
array a that do not correspond to elements in the band matrix (such as the
top left ku by ku triangle) are not referenced.
The following program segment transfers a band matrix from conventional
full matrix storage (matrix) to band storage (a):
      do 20, j = 1, n
         k = ku + 1 - j
         do 10, i = max(1, j-ku), min(m, j+kl)
            a(k+i, j) = matrix(i,j)
   10    continue
   20 continue
incx INTEGER. Specifies the increment for the elements of x. incx must not be
zero.
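The band-storage transfer loop shown above for the array a can also be sketched in Python and tried on a small tridiagonal matrix. The function name full_to_band is hypothetical, and None is used here only to mark the array slots that the routine does not reference.

```python
def full_to_band(matrix, m, n, kl, ku):
    """Copy the band of an m-by-n matrix into a (kl+ku+1)-by-n array."""
    a = [[None] * n for _ in range(kl + ku + 1)]   # None marks unused slots
    for j in range(1, n + 1):                      # 1-based, as in the loop
        k = ku + 1 - j
        for i in range(max(1, j - ku), min(m, j + kl) + 1):
            a[k + i - 1][j - 1] = matrix[i - 1][j - 1]
    return a


matrix = [[11, 12,  0,  0],
          [21, 22, 23,  0],
          [ 0, 32, 33, 34],
          [ 0,  0, 43, 44]]
band = full_to_band(matrix, 4, 4, 1, 1)
# band == [[None, 12, 23, 34],
#          [11,   22, 33, 44],
#          [21,   32, 43, None]]
```

Each diagonal of the matrix becomes one row of the band array: the super-diagonal in row 1, the main diagonal in row ku + 1 = 2, and the sub-diagonal in row 3.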
Output Parameters
y Updated vector y.
a Holds the array a of size (kl+ku+1, n). Contains a banded m-by-n matrix with
kl lower diagonals and ku upper diagonals.
x Holds the vector with the number of elements rx, where rx = n if trans =
'N', rx = m otherwise.
y Holds the vector with the number of elements ry, where ry = m if trans =
'N', ry = n otherwise.
kl If omitted, assumed kl = ku, that is, the number of lower diagonals equals
the number of the upper diagonals.
?gemv
Computes a matrix-vector product using a general
matrix
Syntax
call sgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call dgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call cgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call zgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call scgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call dzgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call gemv(a, x, y [,alpha][,beta] [,trans])
Include Files
mkl.fi, blas.f90
Description
y := alpha*A*x + beta*y,
or
y := alpha*A'*x + beta*y,
or
y := alpha*conjg(A')*x + beta*y,
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-n matrix.
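The three forms of the operation can be sketched in Python as a reference for the semantics. The trans values 'N', 'T', and 'C' are the standard BLAS codes for A, A', and conjg(A'); the function below returns a new list instead of overwriting y, which is a simplification made for this example.

```python
def gemv(trans, alpha, a, x, beta, y):
    """Return alpha*op(A)*x + beta*y for a dense matrix a (list of rows)."""
    m, n = len(a), len(a[0])
    t = trans.upper()
    if t == "N":
        rows = a                                            # op(A) = A
    elif t == "T":
        rows = [[a[i][j] for i in range(m)] for j in range(n)]
    elif t == "C":
        rows = [[complex(a[i][j]).conjugate() for i in range(m)]
                for j in range(n)]
    else:
        raise ValueError("trans must be 'N', 'T' or 'C'")
    return [alpha * sum(r * xk for r, xk in zip(row, x)) + beta * yi
            for row, yi in zip(rows, y)]


a = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]                                            # m = 3, n = 2
y_n = gemv("N", 1.0, a, [1.0, 1.0], 0.0, [0.0, 0.0, 0.0])   # [3.0, 7.0, 11.0]
y_t = gemv("T", 2.0, a, [1.0, 0.0, 1.0], 1.0, [1.0, 1.0])   # [13.0, 17.0]
```

Note how the roles of the vector lengths swap: x has n elements and y has m elements for 'N', and the reverse otherwise.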
Input Parameters
Before entry, the leading m-by-n part of the array a must contain the
matrix of coefficients.
Output Parameters
y Updated vector y.
?ger
Performs a rank-1 update of a general matrix.
Syntax
call sger(m, n, alpha, x, incx, y, incy, a, lda)
call dger(m, n, alpha, x, incx, y, incy, a, lda)
call ger(a, x, y [,alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*y' + A,
where:
alpha is a scalar,
x is an m-element vector,
y is an n-element vector,
A is an m-by-n general matrix.
Input Parameters
Before entry, the leading m-by-n part of the array a must contain the
matrix of coefficients.
Output Parameters
?gerc
Performs a rank-1 update (conjugated) of a general
matrix.
Syntax
call cgerc(m, n, alpha, x, incx, y, incy, a, lda)
call zgerc(m, n, alpha, x, incx, y, incy, a, lda)
call gerc(a, x, y [,alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*conjg(y') + A,
where:
alpha is a scalar,
x is an m-element vector,
y is an n-element vector,
A is an m-by-n matrix.
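A short Python sketch of the conjugated rank-1 update, intended only as a reference for the arithmetic (unit increments, matrix held as a list of rows, names chosen for illustration):

```python
def gerc(alpha, x, y, a):
    """Update a in place: A := alpha*x*conjg(y') + A."""
    for i in range(len(x)):
        for j in range(len(y)):
            a[i][j] += alpha * x[i] * y[j].conjugate()
    return a


a = [[0j, 0j],
     [0j, 0j]]
gerc(1.0, [1 + 1j, 2 + 0j], [1j, 1 + 0j], a)
# a is now [[1-1j, 1+1j], [-2j, 2+0j]]
```

Dropping the call to conjugate() gives the unconjugated update, for which the first element in this example would be (1+1j)*(1j) = -1+1j instead of 1-1j.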
Input Parameters
Before entry, the leading m-by-n part of the array a must contain the
matrix of coefficients.
Output Parameters
?geru
Performs a rank-1 update (unconjugated) of a general
matrix.
Syntax
call cgeru(m, n, alpha, x, incx, y, incy, a, lda)
call zgeru(m, n, alpha, x, incx, y, incy, a, lda)
call geru(a, x, y [,alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*y' + A,
where:
alpha is a scalar,
x is an m-element vector,
y is an n-element vector,
A is an m-by-n matrix.
Input Parameters
x COMPLEX for cgeru
DOUBLE COMPLEX for zgeru
Array, size at least (1 + (m - 1)*abs(incx)). Before entry, the
incremented array x must contain the m-element vector x.
Before entry, the leading m-by-n part of the array a must contain the
matrix of coefficients.
Output Parameters
?hbmv
Computes a matrix-vector product using a Hermitian
band matrix.
Syntax
call chbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call zhbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call hbmv(a, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
y := alpha*A*x + beta*y,
where:
alpha and beta are scalars,
x and y are n-element vectors,
A is an n-by-n Hermitian band matrix, with k super-diagonals.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
Hermitian band matrix A is used:
If uplo = 'U' or 'u', then the upper triangular part of the matrix A is
used.
If uplo = 'L' or 'l', then the lower triangular part of the matrix A is used.
Before entry with uplo = 'U' or 'u', the leading (k + 1) by n part of the
array a must contain the upper triangular band part of the Hermitian
matrix. The matrix must be supplied column-by-column, with the leading
diagonal of the matrix in row (k + 1) of the array, the first super-diagonal
starting at position 2 in row k, and so on. The top left k by k triangle of the
array a is not referenced.
The following program segment transfers the upper triangular part of a
Hermitian band matrix from conventional full matrix storage (matrix) to
band storage (a):
      do 20, j = 1, n
         m = k + 1 - j
         do 10, i = max( 1, j - k ), j
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
Before entry with uplo = 'L' or 'l', the leading (k + 1) by n part of the
array a must contain the lower triangular band part of the Hermitian matrix,
supplied column-by-column, with the leading diagonal of the matrix in row
1 of the array, the first sub-diagonal starting at position 1 in row 2, and so
on. The bottom right k by k triangle of the array a is not referenced.
The following program segment transfers the lower triangular part of a
Hermitian band matrix from conventional full matrix storage (matrix) to
band storage (a):
      do 20, j = 1, n
         m = 1 - j
         do 10, i = j, min( n, j + k )
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.
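The two transfer loops above differ only in the row offset and the row range. A Python sketch that covers both cases is shown below; the function name hermitian_to_band is hypothetical, and None marks the slots of the band array that are not referenced.

```python
def hermitian_to_band(matrix, n, k, uplo="U"):
    """Copy one triangle of an n-by-n Hermitian band matrix to band storage."""
    a = [[None] * n for _ in range(k + 1)]         # None marks unused slots
    for j in range(1, n + 1):                      # 1-based, as in the loops
        if uplo.upper() == "U":
            m = k + 1 - j
            rows = range(max(1, j - k), j + 1)
        else:
            m = 1 - j
            rows = range(j, min(n, j + k) + 1)
        for i in rows:
            a[m + i - 1][j - 1] = matrix[i - 1][j - 1]
    return a


matrix = [[1,      2 + 1j, 0],
          [2 - 1j, 3,      4j],
          [0,      -4j,    5]]
band_u = hermitian_to_band(matrix, 3, 1, "U")   # [[None, 2+1j, 4j], [1, 3, 5]]
band_l = hermitian_to_band(matrix, 3, 1, "L")   # [[1, 3, 5], [2-1j, -4j, None]]
```

With uplo = 'U' the diagonal sits in row k + 1 and the unused slot is at the top left; with uplo = 'L' the diagonal sits in row 1 and the unused slot is at the bottom right.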
Output Parameters
?hemv
Computes a matrix-vector product using a Hermitian
matrix.
Syntax
call chemv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call zhemv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call hemv(a, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
y := alpha*A*x + beta*y,
where:
alpha and beta are scalars,
x and y are n-element vectors,
A is an n-by-n Hermitian matrix.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array a must contain the upper triangular part of the Hermitian
matrix and the strictly lower triangular part of a is not referenced. Before
entry with uplo = 'L' or 'l', the leading n-by-n lower triangular part of
the array a must contain the lower triangular part of the Hermitian matrix
and the strictly upper triangular part of a is not referenced.
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.
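As an illustration of the uplo convention, the sketch below calls zhemv with only the upper triangle of a initialized. It assumes the program is linked against Intel MKL (or another BLAS) through the FORTRAN 77 interface with default 32-bit integers; the program name and data values are hypothetical.

```fortran
program hemv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  complex(dp) :: a(n, n), x(n), y(n), alpha, beta
  external :: zhemv

  alpha = (1.0_dp, 0.0_dp)
  beta  = (0.0_dp, 0.0_dp)

  ! Upper triangle of the Hermitian matrix [[2, i], [-i, 3]].
  a(1, 1) = (2.0_dp, 0.0_dp)
  a(1, 2) = (0.0_dp, 1.0_dp)
  a(2, 2) = (3.0_dp, 0.0_dp)
  ! Sentinel in the strictly lower triangle; with uplo = 'U' it is not read.
  a(2, 1) = (99.0_dp, 99.0_dp)

  x(1) = (1.0_dp, 0.0_dp)
  x(2) = (0.0_dp, 1.0_dp)
  y = (0.0_dp, 0.0_dp)

  ! y := alpha*A*x + beta*y
  call zhemv('U', n, alpha, a, n, x, 1, beta, y, 1)
  print *, y
end program hemv_sketch
```

Because the routine reconstructs the lower triangle from the conjugate of the upper one, the sentinel value should have no effect on y.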
Output Parameters
?her
Performs a rank-1 update of a Hermitian matrix.
Syntax
call cher(uplo, n, alpha, x, incx, a, lda)
call zher(uplo, n, alpha, x, incx, a, lda)
call her(a, x [,uplo] [, alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*conjg(x') + A,
where:
alpha is a real scalar,
x is an n-element vector,
A is an n-by-n Hermitian matrix.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.
DOUBLE PRECISION for zher
Specifies the scalar alpha.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array a must contain the upper triangular part of the Hermitian
matrix and the strictly lower triangular part of a is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array a must contain the lower triangular part of the Hermitian
matrix and the strictly upper triangular part of a is not referenced.
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.
Output Parameters
a With uplo = 'U' or 'u', the upper triangular part of the array a is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array a is
overwritten by the lower triangular part of the updated matrix.
The imaginary parts of the diagonal elements are set to zero.
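A short sketch of the rank-1 update follows. It assumes linking against Intel MKL (or another BLAS) through the FORTRAN 77 interface with default 32-bit integers; the program name and data are hypothetical. Note that alpha is declared real, as the description requires.

```fortran
program her_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  complex(dp) :: a(n, n), x(n)
  real(dp) :: alpha
  external :: zher

  alpha = 1.0_dp            ! real scalar, not complex
  a = (0.0_dp, 0.0_dp)
  x(1) = (1.0_dp, 0.0_dp)
  x(2) = (0.0_dp, 1.0_dp)

  ! A := alpha*x*conjg(x') + A, updating only the lower triangle.
  call zher('L', n, alpha, x, 1, a, n)

  ! a(1,2) lies in the strictly upper triangle and is left untouched.
  print *, 'a(1,1) =', a(1, 1)
  print *, 'a(2,1) =', a(2, 1)
  print *, 'a(2,2) =', a(2, 2)
end program her_sketch
```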
?her2
Performs a rank-2 update of a Hermitian matrix.
Syntax
call cher2(uplo, n, alpha, x, incx, y, incy, a, lda)
call zher2(uplo, n, alpha, x, incx, y, incy, a, lda)
call her2(a, x, y [,uplo][,alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*conjg(y') + conjg(alpha)*y*conjg(x') + A,
where:
alpha is a scalar,
x and y are n-element vectors,
A is an n-by-n Hermitian matrix.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.
incy INTEGER. Specifies the increment for the elements of y.
The value of incy must not be zero.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array a must contain the upper triangular part of the Hermitian
matrix and the strictly lower triangular part of a is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array a must contain the lower triangular part of the Hermitian
matrix and the strictly upper triangular part of a is not referenced.
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.
Output Parameters
a With uplo = 'U' or 'u', the upper triangular part of the array a is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array a is
overwritten by the lower triangular part of the updated matrix.
The imaginary parts of the diagonal elements are set to zero.
?hpmv
Computes a matrix-vector product using a Hermitian
packed matrix.
Syntax
call chpmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
call zhpmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
Include Files
mkl.fi, blas.f90
Description
y := alpha*A*x + beta*y,
where:
alpha and beta are scalars,
x and y are n-element vectors,
A is an n-by-n Hermitian matrix, supplied in packed form.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix A is supplied in the packed array ap.
If uplo = 'U' or 'u', then the upper triangular part of the matrix A is
supplied in the packed array ap.
If uplo = 'L' or 'l', then the lower triangular part of the matrix A is
supplied in the packed array ap.
Before entry with uplo = 'U' or 'u', the array ap must contain the upper
triangular part of the Hermitian matrix packed sequentially,
column-by-column, so that ap(1) contains A(1,1), ap(2) and ap(3) contain
A(1,2) and A(2,2) respectively, and so on. Before entry with uplo = 'L' or
'l', the array ap must contain the lower triangular part of the Hermitian
matrix packed sequentially, column-by-column, so that ap(1) contains A(1,1),
ap(2) and ap(3) contain A(2,1) and A(3,1) respectively, and so on.
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.
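The packed layout can be summarized by an index formula for each value of uplo. The sketch below prints where each element of a 3-by-3 triangle lands in ap. The formulas are derived from the column-by-column description above rather than quoted from the reference; the program name is hypothetical and no MKL routine is called.

```fortran
program packed_index_sketch
  implicit none
  integer, parameter :: n = 3
  integer :: i, j

  ! uplo = 'U': column j holds j entries, so A(i,j) with i <= j
  ! is stored at ap(i + j*(j-1)/2).
  do j = 1, n
     do i = 1, j
        print '(a,i0,a,i0,a,i0,a)', 'U: A(', i, ',', j, ') -> ap(', &
              i + j*(j - 1)/2, ')'
     end do
  end do

  ! uplo = 'L': column j holds n-j+1 entries, so A(i,j) with i >= j
  ! is stored at ap(i + (j-1)*(2*n-j)/2).
  do j = 1, n
     do i = j, n
        print '(a,i0,a,i0,a,i0,a)', 'L: A(', i, ',', j, ') -> ap(', &
              i + (j - 1)*(2*n - j)/2, ')'
     end do
  end do
end program packed_index_sketch
```

In both cases ap needs n*(n+1)/2 elements, which is 6 for n = 3.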
incx INTEGER. Specifies the increment for the elements of x.
The value of incx must not be zero.
Output Parameters
?hpr
Performs a rank-1 update of a Hermitian packed
matrix.
Syntax
call chpr(uplo, n, alpha, x, incx, ap)
call zhpr(uplo, n, alpha, x, incx, ap)
call hpr(ap, x [,uplo] [, alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*conjg(x') + A,
where:
alpha is a real scalar,
x is an n-element vector,
A is an n-by-n Hermitian matrix, supplied in packed form.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix A is supplied in the packed array ap.
If uplo = 'U' or 'u', the upper triangular part of the matrix A is supplied
in the packed array ap.
If uplo = 'L' or 'l', the lower triangular part of the matrix A is supplied
in the packed array ap.
incx INTEGER. Specifies the increment for the elements of x. incx must not be
zero.
Before entry with uplo = 'U' or 'u', the array ap must contain the upper
triangular part of the Hermitian matrix packed sequentially,
column-by-column, so that ap(1) contains A(1,1), ap(2) and ap(3) contain
A(1,2) and A(2,2) respectively, and so on.
Before entry with uplo = 'L' or 'l', the array ap must contain the lower
triangular part of the Hermitian matrix packed sequentially,
column-by-column, so that ap(1) contains A(1,1), ap(2) and ap(3) contain
A(2,1) and A(3,1) respectively, and so on.
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.
Output Parameters
ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.
The imaginary parts of the diagonal elements are set to zero.
?hpr2
Performs a rank-2 update of a Hermitian packed
matrix.
Syntax
call chpr2(uplo, n, alpha, x, incx, y, incy, ap)
call zhpr2(uplo, n, alpha, x, incx, y, incy, ap)
call hpr2(ap, x, y [,uplo][,alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*conjg(y') + conjg(alpha)*y*conjg(x') + A,
where:
alpha is a scalar,
x and y are n-element vectors,
A is an n-by-n Hermitian matrix, supplied in packed form.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix A is supplied in the packed array ap.
If uplo = 'U' or 'u', then the upper triangular part of the matrix A is
supplied in the packed array ap.
If uplo = 'L' or 'l', then the lower triangular part of the matrix A is
supplied in the packed array ap.
Before entry with uplo = 'U' or 'u', the array ap must contain the upper
triangular part of the Hermitian matrix packed sequentially,
column-by-column, so that ap(1) contains A(1,1), ap(2) and ap(3) contain
A(1,2) and A(2,2) respectively, and so on.
Before entry with uplo = 'L' or 'l', the array ap must contain the lower
triangular part of the Hermitian matrix packed sequentially,
column-by-column, so that ap(1) contains A(1,1), ap(2) and ap(3) contain
A(2,1) and A(3,1) respectively, and so on.
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.
Output Parameters
ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.
The imaginary parts of the diagonal elements are set to zero.
?sbmv
Computes a matrix-vector product using a symmetric
band matrix.
Syntax
call ssbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call dsbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call sbmv(a, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
y := alpha*A*x + beta*y,
where:
alpha and beta are scalars,
x and y are n-element vectors,
A is an n-by-n symmetric band matrix, with k super-diagonals.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
band matrix A is used:
      do 20, j = 1, n
         m = k + 1 - j
         do 10, i = max( 1, j - k ), j
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
Before entry with uplo = 'L' or 'l', the leading (k + 1) by n part of the
array a must contain the lower triangular band part of the symmetric
matrix, supplied column-by-column, with the leading diagonal of the matrix
in row 1 of the array, the first sub-diagonal starting at position 1 in row 2,
and so on. The bottom right k by k triangle of the array a is not referenced.
The following program segment transfers the lower triangular part of a
symmetric band matrix from conventional full matrix storage (matrix) to
band storage (a):
      do 20, j = 1, n
         m = 1 - j
         do 10, i = j, min( n, j + k )
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
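To show the lower band layout in use, the sketch below multiplies a symmetric tridiagonal matrix by a vector with dsbmv. It assumes linking against Intel MKL (or another BLAS) through the FORTRAN 77 interface with default 32-bit integers; the program name, the array name ab, and the data are hypothetical.

```fortran
program sbmv_sketch
  implicit none
  integer, parameter :: n = 4, k = 1, lda = k + 1
  double precision :: ab(lda, n), x(n), y(n)
  external :: dsbmv

  ! Lower band storage of the tridiagonal matrix with 2 on the diagonal
  ! and -1 on the sub-diagonal: row 1 is the diagonal, row 2 the sub-diagonal.
  ab(1, :) = 2.0d0
  ab(2, :) = -1.0d0   ! ab(2, n) sits in the unreferenced bottom right corner

  x = 1.0d0
  y = 0.0d0

  ! y := alpha*A*x + beta*y with alpha = 1 and beta = 0
  call dsbmv('L', n, k, 1.0d0, ab, lda, x, 1, 0.0d0, y, 1)
  print *, y
end program sbmv_sketch
```

With x set to all ones, each component of y is the corresponding row sum of the full symmetric matrix.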
Array, size at least (1 + (n - 1)*abs(incx)). Before entry, the
incremented array x must contain the vector x.
Output Parameters
?spmv
Computes a matrix-vector product using a symmetric
packed matrix.
Syntax
call sspmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
call dspmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
call spmv(ap, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
y := alpha*A*x + beta*y,
where:
alpha and beta are scalars,
x and y are n-element vectors,
A is an n-by-n symmetric matrix, supplied in packed form.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix A is supplied in the packed array ap.
If uplo = 'U' or 'u', then the upper triangular part of the matrix A is
supplied in the packed array ap.
If uplo = 'L' or 'l', then the lower triangular part of the matrix A is
supplied in the packed array ap.
Before entry with uplo = 'U' or 'u', the array ap must contain the upper
triangular part of the symmetric matrix packed sequentially, column-by-
column, so that ap(1) contains a(1,1), ap(2) and ap(3) contain a(1,2)
and a(2, 2) respectively, and so on. Before entry with uplo = 'L' or
'l', the array ap must contain the lower triangular part of the symmetric
matrix packed sequentially, column-by-column, so that ap(1) contains
a(1,1), ap(2) and ap(3) contain a(2,1) and a(3,1) respectively, and so
on.
beta REAL for sspmv
DOUBLE PRECISION for dspmv
Specifies the scalar beta.
When beta is supplied as zero, then y need not be set on input.
Output Parameters
?spr
Performs a rank-1 update of a symmetric packed
matrix.
Syntax
call sspr(uplo, n, alpha, x, incx, ap)
call dspr(uplo, n, alpha, x, incx, ap)
call spr(ap, x [,uplo] [, alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*x' + A,
where:
alpha is a real scalar,
x is an n-element vector,
A is an n-by-n symmetric matrix, supplied in packed form.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix A is supplied in the packed array ap.
If uplo = 'U' or 'u', then the upper triangular part of the matrix A is
supplied in the packed array ap.
If uplo = 'L' or 'l', then the lower triangular part of the matrix A is
supplied in the packed array ap.
Output Parameters
ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.
?spr2
Performs a rank-2 update of a symmetric packed
matrix.
Syntax
call sspr2(uplo, n, alpha, x, incx, y, incy, ap)
call dspr2(uplo, n, alpha, x, incx, y, incy, ap)
call spr2(ap, x, y [,uplo][,alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*y' + alpha*y*x' + A,
where:
alpha is a scalar,
x and y are n-element vectors,
A is an n-by-n symmetric matrix, supplied in packed form.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix A is supplied in the packed array ap.
If uplo = 'U' or 'u', then the upper triangular part of the matrix A is
supplied in the packed array ap.
If uplo = 'L' or 'l', then the lower triangular part of the matrix A is
supplied in the packed array ap.
incy INTEGER. Specifies the increment for the elements of y. The value of incy
must not be zero.
Output Parameters
ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.
x Holds the vector with the number of elements n.
?symv
Computes a matrix-vector product for a symmetric
matrix.
Syntax
call ssymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call dsymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call symv(a, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
y := alpha*A*x + beta*y,
where:
alpha and beta are scalars,
x and y are n-element vectors,
A is an n-by-n symmetric matrix.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array a must contain the upper triangular part of the symmetric
matrix A and the strictly lower triangular part of a is not referenced. Before
entry with uplo = 'L' or 'l', the leading n-by-n lower triangular part of
the array a must contain the lower triangular part of the symmetric matrix
A and the strictly upper triangular part of a is not referenced.
Output Parameters
beta The default value is 0.
?syr
Performs a rank-1 update of a symmetric matrix.
Syntax
call ssyr(uplo, n, alpha, x, incx, a, lda)
call dsyr(uplo, n, alpha, x, incx, a, lda)
call syr(a, x [,uplo] [, alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*x' + A ,
where:
alpha is a real scalar,
x is an n-element vector,
A is an n-by-n symmetric matrix.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array a must contain the upper triangular part of the symmetric
matrix A and the strictly lower triangular part of a is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array a must contain the lower triangular part of the symmetric
matrix A and the strictly upper triangular part of a is not referenced.
Output Parameters
a With uplo = 'U' or 'u', the upper triangular part of the array a is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array a is
overwritten by the lower triangular part of the updated matrix.
?syr2
Performs a rank-2 update of a symmetric matrix.
Syntax
call ssyr2(uplo, n, alpha, x, incx, y, incy, a, lda)
call dsyr2(uplo, n, alpha, x, incx, y, incy, a, lda)
call syr2(a, x, y [,uplo][,alpha])
Include Files
mkl.fi, blas.f90
Description
A := alpha*x*y' + alpha*y*x' + A,
where:
alpha is a scalar,
x and y are n-element vectors,
A is an n-by-n symmetric matrix.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.
incy INTEGER. Specifies the increment for the elements of y. The value of incy
must not be zero.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array a must contain the upper triangular part of the symmetric
matrix and the strictly lower triangular part of a is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array a must contain the lower triangular part of the symmetric
matrix and the strictly upper triangular part of a is not referenced.
Output Parameters
a With uplo = 'U' or 'u', the upper triangular part of the array a is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array a is
overwritten by the lower triangular part of the updated matrix.
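The following sketch applies the rank-2 update to a zero matrix with unit vectors, so that the result is easy to check by hand. It assumes linking against Intel MKL (or another BLAS) through the FORTRAN 77 interface with default 32-bit integers; the program name and data are hypothetical.

```fortran
program syr2_sketch
  implicit none
  integer, parameter :: n = 2
  double precision :: a(n, n), x(n), y(n)
  external :: dsyr2

  a = 0.0d0
  x(1) = 1.0d0
  x(2) = 0.0d0
  y(1) = 0.0d0
  y(2) = 1.0d0

  ! A := alpha*x*y' + alpha*y*x' + A with alpha = 1, upper triangle only.
  call dsyr2('U', n, 1.0d0, x, 1, y, 1, a, n)

  ! x*y' + y*x' has zeros on the diagonal and ones off the diagonal;
  ! a(2,1) is in the strictly lower triangle and stays at its input value.
  print *, 'a(1,1) =', a(1, 1)
  print *, 'a(1,2) =', a(1, 2)
  print *, 'a(2,1) =', a(2, 1)
  print *, 'a(2,2) =', a(2, 2)
end program syr2_sketch
```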
?tbmv
Computes a matrix-vector product using a triangular
band matrix.
Syntax
call stbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call dtbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call ctbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call ztbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call tbmv(a, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
Input Parameters
If uplo = 'U' or 'u', then the matrix is upper triangular.
If uplo = 'L' or 'l', then the matrix is lower triangular.
k INTEGER. On entry with uplo = 'U' or 'u', k specifies the number of
super-diagonals of the matrix A. On entry with uplo = 'L' or 'l', k
specifies the number of sub-diagonals of the matrix A.
The value of k must satisfy 0 ≤ k.
Before entry with uplo = 'U' or 'u', the leading (k + 1) by n part of the
array a must contain the upper triangular band part of the matrix of
coefficients, supplied column-by-column, with the leading diagonal of the
matrix in row (k + 1) of the array, the first super-diagonal starting at
position 2 in row k, and so on. The top left k by k triangle of the array a is
not referenced. The following program segment transfers an upper
triangular band matrix from conventional full matrix storage (matrix) to
band storage (a):
      do 20, j = 1, n
         m = k + 1 - j
         do 10, i = max( 1, j - k ), j
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
Before entry with uplo = 'L' or 'l', the leading (k + 1) by n part of the
array a must contain the lower triangular band part of the matrix of
coefficients, supplied column-by-column, with the leading diagonal of the
matrix in row 1 of the array, the first sub-diagonal starting at position 1 in
row 2, and so on. The bottom right k by k triangle of the array a is not
referenced.
The following program segment transfers a lower triangular band matrix
from conventional full matrix storage (matrix) to band storage (a):
      do 20, j = 1, n
         m = 1 - j
         do 10, i = j, min( n, j + k )
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
Note that when diag = 'U' or 'u', the elements of the array a
corresponding to the diagonal elements of the matrix are not referenced,
but are assumed to be unity.
Output Parameters
?tbsv
Solves a system of linear equations whose coefficients
are in a triangular band matrix.
Syntax
call stbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call dtbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call ctbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call ztbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call tbsv(a, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
Input Parameters
Before entry with uplo = 'U' or 'u', the leading (k + 1) by n part of the
array a must contain the upper triangular band part of the matrix of
coefficients, supplied column-by-column, with the leading diagonal of the
matrix in row (k + 1) of the array, the first super-diagonal starting at
position 2 in row k, and so on. The top left k by k triangle of the array a is
not referenced.
The following program segment transfers an upper triangular band matrix
from conventional full matrix storage (matrix) to band storage (a):
      do 20, j = 1, n
         m = k + 1 - j
         do 10, i = max( 1, j - k ), j
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
Before entry with uplo = 'L' or 'l', the leading (k + 1) by n part of the
array a must contain the lower triangular band part of the matrix of
coefficients, supplied column-by-column, with the leading diagonal of the
matrix in row 1 of the array, the first sub-diagonal starting at position 1 in
row 2, and so on. The bottom right k by k triangle of the array a is not
referenced.
The following program segment transfers a lower triangular band matrix
from conventional full matrix storage (matrix) to band storage (a):
      do 20, j = 1, n
         m = 1 - j
         do 10, i = j, min( n, j + k )
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
When diag = 'U' or 'u', the elements of the array a corresponding to the
diagonal elements of the matrix are not referenced, but are assumed to be
unity.
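A small triangular band solve can illustrate the layout. The sketch below solves a lower bidiagonal system with dtbsv; it assumes linking against Intel MKL (or another BLAS) through the FORTRAN 77 interface with default 32-bit integers, and the program name, the array name ab, and the data are hypothetical.

```fortran
program tbsv_sketch
  implicit none
  integer, parameter :: n = 3, k = 1, lda = k + 1
  double precision :: ab(lda, n), x(n)
  external :: dtbsv

  ! Lower band storage of a bidiagonal matrix with 2 on the diagonal
  ! (row 1) and 1 on the sub-diagonal (row 2).
  ab(1, :) = 2.0d0
  ab(2, :) = 1.0d0   ! ab(2, n) is in the unreferenced corner

  ! On entry x holds the right-hand side b; on exit it holds the solution.
  x(1) = 2.0d0
  x(2) = 5.0d0
  x(3) = 8.0d0

  ! Solve A*x = b with uplo = 'L', trans = 'N', diag = 'N'.
  call dtbsv('L', 'N', 'N', n, k, ab, lda, x, 1)
  print *, x
end program tbsv_sketch
```

Passing diag = 'U' instead would make the routine ignore row 1 of ab and treat the diagonal as all ones.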
incx INTEGER. Specifies the increment for the elements of x.
The value of incx must not be zero.
Output Parameters
?tpmv
Computes a matrix-vector product using a triangular
packed matrix.
Syntax
call stpmv(uplo, trans, diag, n, ap, x, incx)
call dtpmv(uplo, trans, diag, n, ap, x, incx)
call ctpmv(uplo, trans, diag, n, ap, x, incx)
call ztpmv(uplo, trans, diag, n, ap, x, incx)
call tpmv(ap, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
Input Parameters
Before entry with uplo = 'U' or 'u', the array ap must contain the upper
triangular matrix packed sequentially, column-by-column, so that ap(1)
contains a(1,1), ap(2) and ap(3) contain a(1,2) and a(2,2)
respectively, and so on. Before entry with uplo = 'L' or 'l', the array ap
must contain the lower triangular matrix packed sequentially,
column-by-column, so that ap(1) contains a(1,1), ap(2) and ap(3) contain a(2,1)
and a(3,1) respectively, and so on. When diag = 'U' or 'u', the
diagonal elements of a are not referenced, but are assumed to be unity.
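The packed triangular layout can be exercised with a 2-by-2 case. The sketch below assumes linking against Intel MKL (or another BLAS) through the FORTRAN 77 interface with default 32-bit integers; the program name and data are hypothetical.

```fortran
program tpmv_sketch
  implicit none
  integer, parameter :: n = 2
  double precision :: ap(n*(n + 1)/2), x(n)
  external :: dtpmv

  ! Upper triangular matrix [[1, 2], [0, 3]] packed column-by-column:
  ! ap(1) = a(1,1), ap(2) = a(1,2), ap(3) = a(2,2).
  ap(1) = 1.0d0
  ap(2) = 2.0d0
  ap(3) = 3.0d0

  x = 1.0d0

  ! x := A*x with uplo = 'U', trans = 'N', diag = 'N'.
  call dtpmv('U', 'N', 'N', n, ap, x, 1)
  print *, x
end program tpmv_sketch
```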
Output Parameters
BLAS 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see BLAS 95
Interface Conventions.
Specific details for the routine tpmv interface are the following:
?tpsv
Solves a system of linear equations whose coefficients
are in a triangular packed matrix.
Syntax
call stpsv(uplo, trans, diag, n, ap, x, incx)
call dtpsv(uplo, trans, diag, n, ap, x, incx)
call ctpsv(uplo, trans, diag, n, ap, x, incx)
call ztpsv(uplo, trans, diag, n, ap, x, incx)
call tpsv(ap, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
Input Parameters
Before entry with uplo = 'U' or 'u', the array ap must contain the upper
triangular part of the triangular matrix packed sequentially, column-by-
column, so that ap(1) contains a(1, 1), ap(2) and ap(3) contain a(1,
2) and a(2, 2) respectively, and so on.
Before entry with uplo = 'L' or 'l', the array ap must contain the lower
triangular part of the triangular matrix packed sequentially, column-by-
column, so that ap(1) contains a(1, 1), ap(2) and ap(3) contain a(2,
1) and a(3, 1) respectively, and so on.
When diag = 'U' or 'u', the diagonal elements of a are not referenced,
but are assumed to be unity.
Output Parameters
Specific details for the routine tpsv interface are the following:
?trmv
Computes a matrix-vector product using a triangular
matrix.
Syntax
call strmv(uplo, trans, diag, n, a, lda, x, incx)
call dtrmv(uplo, trans, diag, n, a, lda, x, incx)
call ctrmv(uplo, trans, diag, n, a, lda, x, incx)
call ztrmv(uplo, trans, diag, n, a, lda, x, incx)
call trmv(a, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
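The ?trmv routines perform one of the following matrix-vector operations defined as
x := A*x, or x := A'*x, or x := conjg(A')*x,
where:
x is an n-element vector,
A is an n-by-n unit, or non-unit, upper or lower triangular matrix.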
Input Parameters
Output Parameters
The default value is 'N'.
?trsv
Solves a system of linear equations whose coefficients
are in a triangular matrix.
Syntax
call strsv(uplo, trans, diag, n, a, lda, x, incx)
call dtrsv(uplo, trans, diag, n, a, lda, x, incx)
call ctrsv(uplo, trans, diag, n, a, lda, x, incx)
call ztrsv(uplo, trans, diag, n, a, lda, x, incx)
call trsv(a, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
Input Parameters
Output Parameters
BLAS Level 3 Routines
BLAS Level 3 routines perform matrix-matrix operations. Table BLAS Level 3 Routine Groups and Their Data
Types lists the BLAS Level 3 routine groups and the data types associated with them.
BLAS Level 3 Routine Groups and Their Data Types
Routine Group Data Types Description
The BLAS functions are blocked where possible to restructure the code in a way that increases the
localization of data reference, enhances cache memory use, and reduces the dependency on the memory
bus.
The code is distributed across the processors to maximize parallelism.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and
SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-
dependent optimizations in this product are intended for use with Intel microprocessors. Certain
optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to
the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
?gemm
Computes a matrix-matrix product with general
matrices.
Syntax
call sgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call cgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call scgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dzgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call gemm(a, b, c [,transa][,transb] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?gemm routines compute a scalar-matrix-matrix product and add the result to a scalar-matrix product,
with general matrices. The operation is defined as
C := alpha*op(A)*op(B) + beta*C,
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalars,
A, B and C are matrices:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.
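The dimension rules above can be checked with a small call. The sketch below computes C := A*B for constant matrices using dgemm; it assumes linking against Intel MKL (or another BLAS) through the FORTRAN 77 interface with default 32-bit integers, and the program name and data are hypothetical.

```fortran
program gemm_sketch
  implicit none
  integer, parameter :: m = 2, n = 2, k = 3
  double precision :: a(m, k), b(k, n), c(m, n)
  external :: dgemm

  a = 1.0d0   ! op(A) = A is m-by-k
  b = 2.0d0   ! op(B) = B is k-by-n
  c = 0.0d0   ! C is m-by-n

  ! C := alpha*op(A)*op(B) + beta*C with alpha = 1, beta = 0.
  ! The leading dimensions are the declared row counts: m, k, and m.
  call dgemm('N', 'N', m, n, k, 1.0d0, a, m, b, k, 0.0d0, c, m)

  ! Each entry of C is a sum of k products of 1 and 2.
  print *, c
end program gemm_sketch
```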
See also
Input Parameters
m INTEGER. Specifies the number of rows of the matrix op(A) and of the
matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix op(B) and the
number of columns of the matrix C.
The value of n must be at least zero.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B).
Output Parameters
Example
For examples of routine usage, see the code in the Intel MKL installation directory:
sgemm: examples\blas\source\sgemmx.f
dgemm: examples\blas\source\dgemmx.f
cgemm: examples\blas\source\cgemmx.f
zgemm: examples\blas\source\zgemmx.f
mb = k if transb = 'N',
mb = n otherwise.
?hemm
Computes a matrix-matrix product where one input
matrix is Hermitian.
Syntax
call chemm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call zhemm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call hemm(a, b, c [,side][,uplo] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?hemm routines compute a scalar-matrix-matrix product using a Hermitian matrix A and a general matrix
B and add the result to a scalar-matrix product using a general matrix C. The operation is defined as
C := alpha*A*B + beta*C
or
C := alpha*B*A + beta*C,
where:
alpha and beta are scalars,
A is a Hermitian matrix,
B and C are m-by-n matrices.
Input Parameters
side CHARACTER*1. Specifies whether the Hermitian matrix A appears on the left
or right in the operation as follows:
if side = 'L' or 'l', then C := alpha*A*B + beta*C;
if side = 'R' or 'r', then C := alpha*B*A + beta*C.
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
Hermitian matrix A is used:
If uplo = 'U' or 'u', then the upper triangular part of the Hermitian
matrix A is used.
If uplo = 'L' or 'l', then the lower triangular part of the Hermitian
matrix A is used.
beta COMPLEX for chemm
DOUBLE COMPLEX for zhemm
Specifies the scalar beta.
When beta is supplied as zero, then c need not be set on input.
Output Parameters
?herk
Performs a Hermitian rank-k update.
Syntax
call cherk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call zherk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call herk(a, c [,uplo] [, trans] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?herk routines perform a rank-k matrix-matrix operation using a general matrix A and a Hermitian
matrix C. The operation is defined as
C := alpha*A*A^H + beta*C,
or
C := alpha*A^H*A + beta*C,
where:
alpha and beta are real scalars,
C is an n-by-n Hermitian matrix,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.
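The first form of the update can be illustrated with a single-column A. The sketch below assumes linking against Intel MKL (or another BLAS) through the FORTRAN 77 interface with default 32-bit integers; the program name and data are hypothetical. Both alpha and beta are declared real, as stated above.

```fortran
program herk_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2, k = 1
  complex(dp) :: a(n, k), c(n, n)
  real(dp) :: alpha, beta
  external :: zherk

  ! A is n-by-k because trans = 'N' selects C := alpha*A*A^H + beta*C.
  a(1, 1) = (1.0_dp, 0.0_dp)
  a(2, 1) = (0.0_dp, 1.0_dp)
  c = (0.0_dp, 0.0_dp)
  alpha = 1.0_dp
  beta  = 0.0_dp

  ! Only the upper triangle of c is updated; lda = n for trans = 'N'.
  call zherk('U', 'N', n, k, alpha, a, n, beta, c, n)

  print *, 'c(1,1) =', c(1, 1)
  print *, 'c(1,2) =', c(1, 2)
  print *, 'c(2,2) =', c(2, 2)
end program herk_sketch
```

With trans = 'C' the same call would expect a to be k-by-n and lda to be at least max(1, k).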
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. When trans= 'N' or 'n', then lda must be at least max(1,
n), otherwise lda must be at least max(1, k).
Output Parameters
c With uplo = 'U' or 'u', the upper triangular part of the array c is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array c is
overwritten by the lower triangular part of the updated matrix.
The imaginary parts of the diagonal elements are set to zero.
?her2k
Performs a Hermitian rank-2k update.
Syntax
call cher2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zher2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call her2k(a, b, c [,uplo][,trans] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?her2k routines perform a rank-2k matrix-matrix operation using general matrices A and B and a
Hermitian matrix C. The operation is defined as
C := alpha*A*B^H + conjg(alpha)*B*A^H + beta*C,
or
C := alpha*A^H*B + conjg(alpha)*B^H*A + beta*C,
where:
alpha is a complex scalar and beta is a real scalar,
C is an n-by-n Hermitian matrix,
A and B are n-by-k matrices in the first case and k-by-n matrices in the second case.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
k INTEGER. With trans= 'N' or 'n', k specifies the number of columns of the
matrices A and B, and with trans= 'C' or 'c', k specifies the number of rows of
the matrices A and B.
The value of k must be at least zero.
Output Parameters
c With uplo = 'U' or 'u', the upper triangular part of the array c is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array c is
overwritten by the lower triangular part of the updated matrix.
The imaginary parts of the diagonal elements are set to zero.
?symm
Computes a matrix-matrix product where one input
matrix is symmetric.
Syntax
call ssymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call dsymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call csymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call zsymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call symm(a, b, c [,side][,uplo] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?symm routines compute a scalar-matrix-matrix product with one symmetric matrix and add the result to
a scalar-matrix product. The operation is defined as
C := alpha*A*B + beta*C,
or
C := alpha*B*A + beta*C,
where:
alpha and beta are scalars,
A is a symmetric matrix,
B and C are m-by-n matrices.
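As a rough illustration of the first form, the sketch below computes C := alpha*A*B + beta*C while reading only the upper triangle of A. It is a plain-Python model, not the library routine: the name symm_left_upper is hypothetical and the code covers only side = 'L', uplo = 'U' with matrices held as lists of rows.

```python
def symm_left_upper(alpha, a, b, beta, c):
    """Reference model of C := alpha*A*B + beta*C (side='L', uplo='U')."""
    m, n = len(b), len(b[0])

    def sym(i, j):
        # Read only the upper triangle of a; mirror it when i > j.
        return a[i][j] if i <= j else a[j][i]

    for i in range(m):
        for j in range(n):
            s = sum(sym(i, p) * b[p][j] for p in range(m))
            c[i][j] = alpha * s + beta * c[i][j]
    return c


a = [[2.0, 1.0],
     [-7.0, 3.0]]  # the -7.0 below the diagonal is never read
b = [[1.0, 0.0],
     [0.0, 1.0]]
c = [[1.0, 1.0],
     [1.0, 1.0]]
symm_left_upper(1.0, a, b, 2.0, c)
# c == [[4.0, 3.0], [3.0, 5.0]]
```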
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
symmetric matrix A is used:
if uplo = 'U' or 'u', then the upper triangular part is used;
if uplo = 'L' or 'l', then the lower triangular part is used.
Before entry with side = 'L' or 'l', the m-by-m part of the array a must
contain the symmetric matrix, such that when uplo = 'U' or 'u', the
leading m-by-m upper triangular part of the array a must contain the upper
triangular part of the symmetric matrix and the strictly lower triangular part
of a is not referenced, and when uplo = 'L' or 'l', the leading m-by-m
lower triangular part of the array a must contain the lower triangular part of
the symmetric matrix and the strictly upper triangular part of a is not
referenced.
Before entry with side = 'R' or 'r', the n-by-n part of the array a must
contain the symmetric matrix, such that when uplo = 'U' or 'u', the
leading n-by-n upper triangular part of the array a
must contain the upper triangular part of the symmetric matrix and the
strictly lower triangular part of a is not referenced, and when uplo = 'L'
or 'l', the leading n-by-n lower triangular part of the array a must contain
the lower triangular part of the symmetric matrix and the strictly upper
triangular part of a is not referenced.
Output Parameters
?syrk
Performs a symmetric rank-k update.
Syntax
call ssyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call dsyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call csyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call zsyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call syrk(a, c [,uplo] [, trans] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?syrk routines perform a rank-k matrix-matrix operation for a symmetric matrix C using a general
matrix A. The operation is defined as
C := alpha*A*A' + beta*C,
or
C := alpha*A'*A + beta*C,
where:
alpha and beta are scalars,
C is an n-by-n symmetric matrix,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
DOUBLE PRECISION for dsyrk
COMPLEX for csyrk
DOUBLE COMPLEX for zsyrk
Specifies the scalar beta.
Output Parameters
c With uplo = 'U' or 'u', the upper triangular part of the array c is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array c is
overwritten by the lower triangular part of the updated matrix.
?syr2k
Performs a symmetric rank-2k update.
Syntax
call ssyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dsyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call csyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zsyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call syr2k(a, b, c [,uplo][,trans] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?syr2k routines perform a rank-2k matrix-matrix operation for a symmetric matrix C using general
matrices A and B. The operation is defined as
C := alpha*A*B' + alpha*B*A' + beta*C,
or
C := alpha*A'*B + alpha*B'*A + beta*C,
where:
alpha and beta are scalars,
C is an n-by-n symmetric matrix,
A and B are n-by-k matrices in the first case, and k-by-n matrices in the second case.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
k INTEGER. On entry with trans= 'N' or 'n', k specifies the number of
columns of the matrices A and B, and on entry with trans= 'T' or 't' or
'C' or 'c', k specifies the number of rows of the matrices A and B. The
value of k must be at least zero.
Output Parameters
c With uplo = 'U' or 'u', the upper triangular part of the array c is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array c is
overwritten by the lower triangular part of the updated matrix.
beta The default value is 0.
?trmm
Computes a matrix-matrix product where one input
matrix is triangular.
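Syntax
call strmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call dtrmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ctrmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ztrmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call trmm(a, b [,side] [, uplo] [,transa][,diag] [,alpha])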
Include Files
mkl.fi, blas.f90
Description
The ?trmm routines compute a scalar-matrix-matrix product with one triangular matrix. The operation is
defined as
B := alpha*op(A)*B
or
B := alpha*B*op(A)
where:
alpha is a scalar,
B is an m-by-n matrix,
A is a unit, or non-unit, upper or lower triangular matrix,
op(A) is one of op(A) = A, or op(A) = A', or op(A) = conjg(A').
Input Parameters
ldb INTEGER. Specifies the leading dimension of b as declared in the calling
(sub)program. ldb must be at least max(1, m).
Output Parameters
?trsm
Solves a triangular matrix equation.
Syntax
call strsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call dtrsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ctrsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ztrsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call trsm(a, b [,side] [, uplo] [,transa][,diag] [,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?trsm routines solve one of the following matrix equations:
op(A)*X = alpha*B,
or
X*op(A) = alpha*B,
where:
alpha is a scalar,
X and B are m-by-n matrices,
A is a unit, or non-unit, upper or lower triangular matrix,
op(A) is one of op(A) = A, or op(A) = A', or op(A) = conjg(A').
The matrix B is overwritten by the solution matrix X.
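For a lower triangular A applied from the left with op(A) = A, the solve reduces to forward substitution on each column of B. The sketch below is a plain-Python model of that single case; the name trsm_left_lower and its unit_diag flag are hypothetical stand-ins for side = 'L', uplo = 'L', transa = 'N' and the diag argument.

```python
def trsm_left_lower(alpha, a, b, unit_diag=False):
    """Solve A*X = alpha*B for lower triangular A; b is overwritten by X."""
    m, n = len(b), len(b[0])
    for j in range(n):
        for i in range(m):
            s = alpha * b[i][j]
            for p in range(i):
                s -= a[i][p] * b[p][j]  # b[p][j] already holds X(p,j)
            # With a unit diagonal the entries a[i][i] are not referenced.
            b[i][j] = s if unit_diag else s / a[i][i]
    return b


a = [[2.0, 0.0],
     [1.0, 4.0]]
b = [[2.0],
     [9.0]]
trsm_left_lower(1.0, a, b)
# b == [[1.0], [2.0]]
```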
Input Parameters
DOUBLE PRECISION for dtrsm
COMPLEX for ctrsm
DOUBLE COMPLEX for ztrsm
Array, size (lda, k), where k is m when side = 'L' or 'l' and is n
when side = 'R' or 'r'. Before entry with uplo = 'U' or 'u', the
leading k-by-k upper triangular part of the array a must contain the upper
triangular matrix and the strictly lower triangular part of a is not
referenced.
Before entry with uplo = 'L' or 'l', the leading k-by-k lower triangular part of the array a
must contain the lower triangular matrix and the strictly upper triangular
part of a is not referenced.
When diag = 'U' or 'u', the diagonal elements of a are not referenced
either, but are assumed to be unity.
Output Parameters
Vector Arguments
Compressed sparse vectors. Let a be a vector stored in an array, and assume that the only non-zero
elements of a are the following:
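a(k1), a(k2), a(k3) . . . a(knz),
where nz is the total number of non-zero elements in a. In Sparse BLAS, this vector can be represented in
compressed form by two arrays, x (values) and indx (indices). Each array has nz elements:
x(1)=a(k1), x(2)=a(k2), . . . x(nz)=a(knz),
indx(1)=k1, indx(2)=k2, . . . indx(nz)=knz.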
Thus, a sparse vector is fully determined by the triple (nz, x, indx). If you pass a negative or zero value of nz
to Sparse BLAS, the subroutines do not modify any arrays or variables.
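A small sketch may help make the triple concrete. The helper below is hypothetical (it is not a Sparse BLAS routine); it builds (nz, x, indx) from a full vector held in a Python list and keeps the one-based indices used in this reference.

```python
def compress(a):
    """Return (nz, x, indx) for a full vector a, with one-based indices."""
    indx = [k + 1 for k, v in enumerate(a) if v != 0]
    x = [a[k - 1] for k in indx]
    return len(indx), x, indx


nz, x, indx = compress([0.0, 5.0, 0.0, 0.0, -2.0, 7.0])
# nz == 3, x == [5.0, -2.0, 7.0], indx == [2, 5, 6]
```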
Full-storage vectors. Sparse BLAS routines can also use a vector argument fully stored in a single array (a
full-storage vector). If y is a full-storage vector, its elements must be stored contiguously: the first element
in y(1), the second in y(2), and so on. This corresponds to an increment incy = 1 in BLAS Level 1. No
increment value for full-storage vectors is passed as an argument to Sparse BLAS routines or functions.
Sparse BLAS Routines and Their Data Types
Routine/Function    Data Types    Description
i?amax index of the element with the largest absolute value for real flavors, or the
largest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
i?amin index of the element with the smallest absolute value for real flavors, or the
smallest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
The result i returned by i?amax and i?amin should be interpreted as an index in the compressed-form array, so
that the largest (smallest) value is x(i); the corresponding index in the full-storage array is indx(i).
You can also call ?rotg to compute the parameters of a Givens rotation and then pass these parameters to the
Sparse BLAS routine ?roti.
?axpyi
Adds a scalar multiple of a compressed sparse vector to
a full-storage vector.
Syntax
call saxpyi(nz, a, x, indx, y)
call daxpyi(nz, a, x, indx, y)
call caxpyi(nz, a, x, indx, y)
call zaxpyi(nz, a, x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?axpyi routines perform a vector-vector operation defined as
y := a*x + y
where:
a is a scalar,
x is a sparse vector stored in compressed form,
y is a vector in full storage form.
The ?axpyi routines reference or modify only the elements of y whose indices are listed in the array indx.
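The element-wise effect can be sketched as follows. This is a plain-Python model rather than the library routine; it assumes one-based values in indx, as elsewhere in this section, and mirrors the argument order of the FORTRAN 77 call.

```python
def axpyi(nz, a, x, indx, y):
    """y(indx(i)) := a*x(i) + y(indx(i)) for i = 1..nz; indx is one-based."""
    for i in range(nz):  # nz <= 0 leaves y unchanged
        y[indx[i] - 1] += a * x[i]
    return y


y = [1.0, 1.0, 1.0, 1.0]
axpyi(2, 2.0, [10.0, 20.0], [1, 4], y)
# y == [21.0, 1.0, 1.0, 41.0]
```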
Input Parameters
Output Parameters
BLAS 95 Interface Notes
Routines in the Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see BLAS 95
Interface Conventions.
Specific details for the routine axpyi interface are the following:
?doti
Computes the dot product of a compressed sparse real
vector and a full-storage real vector.
Syntax
res = sdoti(nz, x, indx, y )
res = ddoti(nz, x, indx, y )
res = doti(x, indx, y)
Include Files
mkl.fi, blas.f90
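Description
The ?doti routines return the dot product of x and y defined as
res = x(1)*y(indx(1)) + x(2)*y(indx(2)) +...+ x(nz)*y(indx(nz)),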
where the triple (nz, x, indx) defines a sparse real vector stored in compressed form, and y is a real vector in
full storage form. The functions reference only the elements of y whose indices are listed in the array indx.
The values in indx must be distinct.
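A minimal sketch of the same sum in plain Python, assuming one-based entries in indx; the function mirrors the FORTRAN 77 argument order but is only a model of the arithmetic.

```python
def doti(nz, x, indx, y):
    """Sum of x(i)*y(indx(i)) for i = 1..nz; indx is one-based."""
    return sum(x[i] * y[indx[i] - 1] for i in range(nz))


res = doti(3, [1.0, 2.0, 3.0], [1, 3, 5], [10.0, 0.0, 20.0, 0.0, 30.0])
# res == 140.0, that is 1*10 + 2*20 + 3*30
```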
Input Parameters
Output Parameters
?dotci
Computes the conjugated dot product of a
compressed sparse complex vector with a full-storage
complex vector.
Syntax
res = cdotci(nz, x, indx, y )
res = zdotci(nz, x, indx, y )
res = dotci(x, indx, y)
Include Files
mkl.fi, blas.f90
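Description
The ?dotci routines return the dot product of x and y defined as
conjg(x(1))*y(indx(1)) + ... + conjg(x(nz))*y(indx(nz)),
where the triple (nz, x, indx) defines a sparse complex vector stored in compressed form, and y is a complex
vector in full storage form. The functions reference only the elements of y whose indices are listed in the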
array indx. The values in indx must be distinct.
Input Parameters
indx INTEGER. Specifies the indices for the elements of x.
Array, size at least nz.
Output Parameters
?dotui
Computes the dot product of a compressed sparse
complex vector and a full-storage complex vector.
Syntax
res = cdotui(nz, x, indx, y )
res = zdotui(nz, x, indx, y )
res = dotui(x, indx, y)
Include Files
mkl.fi, blas.f90
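Description
The ?dotui routines return the dot product of x and y defined as
res = x(1)*y(indx(1)) + x(2)*y(indx(2)) +...+ x(nz)*y(indx(nz)),
where the triple (nz, x, indx) defines a sparse complex vector stored in compressed form, and y is a complex
vector in full storage form. The functions reference only the elements of y whose indices are listed in the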
array indx. The values in indx must be distinct.
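The only difference between the unconjugated product (?dotui) and the conjugated one (?dotci) is whether the elements of x are conjugated before multiplying. The sketch below models both in plain Python with one-based indx; it is an illustration of the arithmetic, not the library code.

```python
def dotui(nz, x, indx, y):
    """Unconjugated sum of x(i)*y(indx(i)); indx is one-based."""
    return sum(x[i] * y[indx[i] - 1] for i in range(nz))


def dotci(nz, x, indx, y):
    """Conjugated sum of conjg(x(i))*y(indx(i)); indx is one-based."""
    return sum(x[i].conjugate() * y[indx[i] - 1] for i in range(nz))


x = [1 + 2j]
indx = [2]
y = [0j, 3 + 1j]
u = dotui(1, x, indx, y)  # (1+2j)*(3+1j) == 1+7j
c = dotci(1, x, indx, y)  # (1-2j)*(3+1j) == 5-5j
```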
Input Parameters
Output Parameters
?gthr
Gathers a full-storage sparse vector's elements into
compressed form.
Syntax
call sgthr(nz, y, x, indx )
call dgthr(nz, y, x, indx )
call cgthr(nz, y, x, indx )
call zgthr(nz, y, x, indx )
call gthr(x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?gthr routines gather the specified elements of a full-storage sparse vector y into compressed form (nz,
x, indx). The routines reference only the elements of y whose indices are listed in the array indx:
x(i) = y(indx(i)), for i=1,2,... ,nz.
Input Parameters
Output Parameters
?gthrz
Gathers a sparse vector's elements into compressed
form, replacing them by zeros.
Syntax
call sgthrz(nz, y, x, indx )
call dgthrz(nz, y, x, indx )
call cgthrz(nz, y, x, indx )
call zgthrz(nz, y, x, indx )
call gthrz(x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?gthrz routines gather the elements with indices specified by the array indx from a full-storage vector y
into compressed form (nz, x, indx) and overwrite the gathered elements of y by zeros. Other elements of y
are not referenced or modified (see also ?gthr).
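A short plain-Python model of gather-and-zero, assuming one-based indx. Unlike the FORTRAN 77 call, which fills the output array x in place, this hypothetical helper returns x as a new list.

```python
def gthrz(nz, y, indx):
    """Gather y(indx(i)) into x and overwrite the gathered elements by zero."""
    x = []
    for i in range(nz):
        x.append(y[indx[i] - 1])
        y[indx[i] - 1] = 0.0
    return x


y = [4.0, 0.0, 6.0, 8.0]
x = gthrz(2, y, [1, 4])
# x == [4.0, 8.0] and y == [0.0, 0.0, 6.0, 0.0]
```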
Input Parameters
Output Parameters
x Holds the vector with the number of elements nz.
?roti
Applies a Givens rotation to sparse vectors, one of which
is in compressed form.
Syntax
call sroti(nz, x, indx, y, c, s)
call droti(nz, x, indx, y, c, s)
call roti(x, indx, y, c, s)
Include Files
mkl.fi, blas.f90
Description
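The ?roti routines apply the Givens rotation to elements of two real vectors, x (in compressed form nz, x,
indx) and y (in full storage form):
x(i) = c*x(i) + s*y(indx(i)),
y(indx(i)) = c*y(indx(i)) - s*x(i), for i=1,2,... ,nz,
where both right-hand sides use the values of x(i) and y(indx(i)) before the update.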
The routines reference only the elements of y whose indices are listed in the array indx. The values in indx
must be distinct.
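The sketch below models the rotation in plain Python. It assumes the usual Sparse BLAS convention x(i) := c*x(i) + s*y(indx(i)) and y(indx(i)) := c*y(indx(i)) - s*x(i), with both updates computed from the old values, and one-based entries in indx.

```python
def roti(nz, x, indx, y, c, s):
    """Apply a Givens rotation to x (compressed) and y (full storage)."""
    for i in range(nz):
        j = indx[i] - 1
        xi, yj = x[i], y[j]  # save the old values; both updates use them
        x[i] = c * xi + s * yj
        y[j] = c * yj - s * xi


x = [3.0]
y = [0.0, 4.0]
roti(1, x, [2], y, 0.6, 0.8)
# x[0] is approximately 5.0 and y[1] is approximately 0.0;
# y[0] is not referenced.
```

With c = 0.6 and s = 0.8 the pair (3, 4) is rotated onto the first coordinate, which is the typical use after computing c and s with ?rotg.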
Input Parameters
Output Parameters
?sctr
Converts compressed sparse vectors into full storage
form.
Syntax
call ssctr(nz, x, indx, y )
call dsctr(nz, x, indx, y )
call csctr(nz, x, indx, y )
call zsctr(nz, x, indx, y )
call sctr(x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?sctr routines scatter the elements of the compressed sparse vector (nz, x, indx) to a full-storage
vector y. The routines modify only the elements of y whose indices are listed in the array indx:
y(indx(i)) = x(i), for i=1,2,... ,nz.
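Scatter is the inverse of gather on the listed positions. The round trip below is a plain-Python sketch with one-based indx; both helper functions are models of the documented assignments, not the library routines.

```python
def gthr(nz, y, indx):
    """x(i) = y(indx(i)) for i = 1..nz; returns x as a new list."""
    return [y[indx[i] - 1] for i in range(nz)]


def sctr(nz, x, indx, y):
    """y(indx(i)) = x(i) for i = 1..nz; other elements of y are untouched."""
    for i in range(nz):
        y[indx[i] - 1] = x[i]
    return y


full = [0.0, 7.0, 0.0, 0.0, 9.0]
indx = [2, 5]
x = gthr(2, full, indx)                # [7.0, 9.0]
restored = sctr(2, x, indx, [0.0] * 5)
# restored == full
```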
Input Parameters
COMPLEX for csctr
DOUBLE COMPLEX for zsctr
Array, size at least nz.
Contains the vector to be converted to full-storage form.
Output Parameters
The routines with simplified interfaces have eight-character base names in accordance with the templates:
mkl_<character > <data> <mtype> <operation>( )
The <data> field indicates the sparse matrix storage format (see section Sparse Matrix Storage Formats):
Sparse Matrix Storage Formats for Sparse BLAS Routines
The current version of Intel MKL Sparse BLAS Level 2 and Level 3 routines supports the following point entry
[Duff86] storage formats for sparse matrices:
computing the vector product between a sparse matrix and a dense vector:
y := alpha*op(A)*x + beta*y
solving a single triangular system:
y := alpha*inv(op(A))*x
computing the vector product between a sparse matrix and a dense vector (for general and symmetric
matrices):
y := op(A)*x
Matrix type is indicated by the field <mtype> in the routine name (see section Naming Conventions in Sparse
BLAS Level 2 and Level 3).
NOTE
The routines with simplified interfaces support only four sparse matrix storage formats, specifically:
CSR format in the 3-array variation accepted in the direct sparse solvers and in the CXML;
diagonal format accepted in the CXML;
coordinate format;
BSR format in the 3-array variation.
Note that routines with both typical (conventional) and simplified interfaces use the same computational
kernels that work with certain internal data structures.
The Intel MKL Sparse BLAS Level 2 and Level 3 routines do not support in-place operations.
A complete list of the routines is given in Sparse BLAS Level 2 and Level 3 Routines.
Interface Consideration
The detailed descriptions of the one-based and zero-based variants of the sparse data storage formats are
given in the "Sparse Matrix Storage Formats" in Appendix A.
Most parameters of the routines are identical for both one-based and zero-based indexing, but some of them
have certain differences. The following table lists all these differences.
Parameter  Indexing    Description
pntrb      One-based   Array of length m. This array contains row indices, such that
                       pntrb(i) - pntrb(1) + 1 is the first index of row i in the arrays
                       val and indx.
           Zero-based  Array of length m. This array contains row indices, such that
                       pntrb(i) - pntrb(0) is the first index of row i in the arrays
                       val and indx.
pntre      One-based   Array of length m. This array contains row indices, such that
                       pntre(i) - pntrb(1) is the last index of row i in the arrays
                       val and indx.
           Zero-based  Array of length m. This array contains row indices, such that
                       pntre(i) - pntrb(0) - 1 is the last index of row i in the arrays
                       val and indx.
ia         One-based   Array of length m + 1, containing indices of elements in the array a,
                       such that ia(i) is the index in the array a of the first non-zero
                       element from the row i. The value of the last element ia(m + 1) is
                       equal to the number of non-zeros plus one.
           Zero-based  Array of length m + 1, containing indices of elements in the array a,
                       such that ia(i) is the index in the array a of the first non-zero
                       element from the row i. The value of the last element ia(m) is
                       equal to the number of non-zeros.
ldb        One-based   Specifies the leading dimension of b as declared in the calling
                       (sub)program.
           Zero-based  Specifies the second dimension of b as declared in the calling
                       (sub)program.
ldc        One-based   Specifies the leading dimension of c as declared in the calling
                       (sub)program.
           Zero-based  Specifies the second dimension of c as declared in the calling
                       (sub)program.
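The one-based rules for pntrb and pntre can be checked with a short plain-Python sketch. The helper name row_range_one_based is hypothetical. The last two lines additionally assume that the rows are stored contiguously, in which case the 3-array ia can be formed from pntrb followed by the final entry of pntre.

```python
def row_range_one_based(pntrb, pntre, i):
    """First and last one-based positions of row i in val and indx."""
    base = pntrb[0]  # pntrb(1) in Fortran notation
    first = pntrb[i - 1] - base + 1
    last = pntre[i - 1] - base
    return first, last


# A 3-row matrix whose rows hold 2, 0 and 1 stored entries.
pntrb = [1, 3, 3]
pntre = [3, 3, 4]
ranges = [row_range_one_based(pntrb, pntre, i) for i in (1, 2, 3)]
# ranges == [(1, 2), (3, 2), (3, 3)]; the empty row 2 has last < first

# Assumption: contiguous rows, so ia is pntrb plus the last pntre entry.
ia = pntrb + [pntre[-1]]
# ia == [1, 3, 3, 4]; ia(m + 1) equals the number of non-zeros plus one
```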
The analogous NIST* Sparse BLAS (NSB) library routines have the following interfaces:
xyyymm(transa, m, n, k, alpha, descra, arg(A), b, ldb, beta, c, ldc, work, lwork), for
matrix-matrix product;
xyyysm(transa, m, n, unitd, dv, alpha, descra, arg(A), b, ldb, beta, c, ldc, work,
lwork), for triangular solvers with multiple right-hand sides.
Some similar arguments are used in both libraries. The argument transa indicates what operation is
performed and is slightly different in the NSB library (see Table Parameter transa). The arguments m and k
are the number of rows and columns in the matrix A, respectively, and n is the number of columns in the matrix C.
The arguments alpha and beta are the scalars alpha and beta, respectively (beta is not used in the Intel MKL
triangular solvers). The arguments b and c are rectangular arrays with the leading dimensions ldb and ldc,
respectively. arg(A) denotes the list of arguments that describe the sparse representation of A.
Parameter transa
        MKL interface   NSB interface   Operation
value   N or n          0               op(A) = A
        T or t          1               op(A) = A^T
        C or c          2               op(A) = A^T or op(A) = A^H
Parameter matdescra
The parameter matdescra describes the relevant characteristic of the matrix A. This manual describes
matdescra as an array of six elements in line with the NIST* implementation. However, only the first four
elements of the array are used in the current versions of the Intel MKL Sparse BLAS routines. Elements
matdescra(5) and matdescra(6) are reserved for future use. Note that whether matdescra is described in
your application as an array of length 6 or 4 is of no importance because the array is declared as a pointer in
the Intel MKL routines. To learn more about declaration of the matdescra array, see the Sparse BLAS
examples located in the Intel MKL installation directory: examples/spblasf/ for Fortran. The table below
lists elements of the parameter matdescra, their Fortran values, and their meanings. The parameter
matdescra corresponds to the argument descra from NSB library.
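Possible Values of the Parameter matdescra (descra)
              MKL interface,       MKL interface,        NSB interface   Matrix characteristic
              one-based indexing   zero-based indexing
1st element   matdescra(1)         matdescra(0)          descra(1)       matrix structure
value         G                    G                     0               general
              S                    S                     1               symmetric (A = A^T)
              H                    H                     2               Hermitian (A = A^H)
              T                    T                     3               triangular
              A                    A                     4               skew(anti)-symmetric (A = -A^T)
              D                    D                     5               diagonal
2nd element   matdescra(2)         matdescra(1)          descra(2)       upper/lower triangular indicator
value         L                    L                     1               lower
              U                    U                     2               upper
3rd element   matdescra(3)         matdescra(2)          descra(3)       main diagonal type
value         N                    N                     0               non-unit
              U                    U                     1               unit
4th element   matdescra(4)         matdescra(3)          descra(4)       type of indexing
value         F                                          1               one-based indexing
                                   C                     0               zero-based indexing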
In some cases possible element values of the parameter matdescra depend on the values of other elements.
The Table "Possible Combinations of Element Values of the Parameter matdescra" lists all possible
combinations of element values for both multiplication routines and triangular solvers.
Possible Combinations of Element Values of the Parameter matdescra
For a matrix in the skyline format with the main diagonal declared to be unit, diagonal elements must be
stored in the sparse representation even if they are zero. In all other formats, diagonal elements can be
stored (if needed) in the sparse representation if they are not zero.
A = L + D + U
where L is the strict lower triangle of A, U is the strict upper triangle of A, D is the main diagonal.
Table "Output Matrices for Multiplication Routines" shows the correspondence between the output matrices and
the values of the parameter matdescra for the sparse matrix A for multiplication routines.
Output Matrices for Multiplication Routines
matdescra(1)  matdescra(2)  matdescra(3)  Output Matrix
S or H        L             N             alpha*op(L+D+L')*x + beta*y
                                          alpha*op(L+D+L')*B + beta*C
S or H        L             U             alpha*op(L+I+L')*x + beta*y
                                          alpha*op(L+I+L')*B + beta*C
S or H        U             N             alpha*op(U'+D+U)*x + beta*y
                                          alpha*op(U'+D+U)*B + beta*C
S or H        U             U             alpha*op(U'+I+U)*x + beta*y
                                          alpha*op(U'+I+U)*B + beta*C
T             L             U             alpha*op(L+I)*x + beta*y
                                          alpha*op(L+I)*B + beta*C
T             L             N             alpha*op(L+D)*x + beta*y
                                          alpha*op(L+D)*B + beta*C
T             U             U             alpha*op(U+I)*x + beta*y
                                          alpha*op(U+I)*B + beta*C
T             U             N             alpha*op(U+D)*x + beta*y
                                          alpha*op(U+D)*B + beta*C
Table "Output Matrices for Triangular Solvers" shows the correspondence between the output matrices and
the values of the parameter matdescra for the sparse matrix A for triangular solvers.
Output Matrices for Triangular Solvers
matdescra(1)  matdescra(2)  matdescra(3)  Output Matrix
T             L             N             alpha*inv(op(L))*x
                                          alpha*inv(op(L))*B
T             L             U             alpha*inv(op(L))*x
                                          alpha*inv(op(L))*B
T             U             N             alpha*inv(op(U))*x
                                          alpha*inv(op(U))*B
T             U             U             alpha*inv(op(U))*x
                                          alpha*inv(op(U))*B
D             ignored       N             alpha*inv(D)*x
                                          alpha*inv(D)*B
D             ignored       U             alpha*x
                                          alpha*B
Sparse BLAS Level 2 and Level 3 Routines
Table Sparse BLAS Level 2 and Level 3 Routines lists the sparse BLAS Level 2 and Level 3 routines
described in more detail later in this section.
Sparse BLAS Level 2 and Level 3 Routines
Routine/Function Description
Matrix converters
mkl_?csradd Computes the sum of two sparse matrices stored in the CSR
format (3-array variation) with one-based indexing.
mkl_?csrgemv
Computes matrix-vector product of a sparse general
matrix stored in the CSR format (3-array variation)
with one-based indexing.
Syntax
call mkl_scsrgemv(transa, m, a, ia, ja, x, y)
call mkl_dcsrgemv(transa, m, a, ia, ja, x, y)
call mkl_ccsrgemv(transa, m, a, ia, ja, x, y)
call mkl_zcsrgemv(transa, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?csrgemv routine performs a matrix-vector operation defined as
y := A*x
or
y := A^T*x,
where:
x and y are vectors,
A is an m-by-m sparse square matrix in the CSR format (3-array variation), A^T is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
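The following plain-Python sketch models the product for a matrix in the 3-array CSR variation with one-based ia and ja. It is a reference model only, not the MKL kernel; it treats any transa other than 'N' or 'n' as the transposed product and returns y as a new list instead of filling an output argument.

```python
def csrgemv(transa, m, a, ia, ja, x):
    """Return y = A*x (transa 'N') or y = A^T*x for one-based CSR arrays."""
    y = [0.0] * m
    for i in range(m):
        # Row i occupies one-based positions ia(i) .. ia(i+1)-1 of a and ja.
        for p in range(ia[i] - 1, ia[i + 1] - 1):
            j = ja[p] - 1
            if transa in ('N', 'n'):
                y[i] += a[p] * x[j]
            else:
                y[j] += a[p] * x[i]
    return y


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
a = [1.0, 2.0, 3.0, 4.0, 5.0]
ia = [1, 3, 4, 6]
ja = [1, 3, 2, 1, 3]
y_n = csrgemv('N', 3, a, ia, ja, [1.0, 1.0, 1.0])  # [3.0, 3.0, 9.0]
y_t = csrgemv('T', 3, a, ia, ja, [1.0, 1.0, 1.0])  # [5.0, 3.0, 7.0]
```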
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
COMPLEX for mkl_ccsrgemv.
DOUBLE COMPLEX for mkl_zcsrgemv.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
mkl_?bsrgemv
Computes matrix-vector product of a sparse general
matrix stored in the BSR format (3-array variation)
with one-based indexing.
Syntax
call mkl_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?bsrgemv routine performs a matrix-vector operation defined as
y := A*x
or
y := A^T*x,
where:
x and y are vectors,
A is an m-by-m block sparse square matrix in the BSR format (3-array variation), A^T is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
ja INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
mkl_?coogemv
Computes matrix-vector product of a sparse general
matrix stored in the coordinate format with one-based
indexing.
Syntax
call mkl_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_?coogemv routine performs a matrix-vector operation defined as
y := A*x
or
y := A^T*x,
where:
x and y are vectors,
A is an m-by-m sparse square matrix in the coordinate format, A^T is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
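In the coordinate format each stored entry carries its own row and column index, so the product is a single loop over the nnz entries. The sketch below is a plain-Python model with one-based rowind and colind; it returns y as a new list and treats any transa other than 'N' or 'n' as the transposed product.

```python
def coogemv(transa, m, val, rowind, colind, nnz, x):
    """Return y = A*x (transa 'N') or y = A^T*x for one-based COO arrays."""
    y = [0.0] * m
    for p in range(nnz):
        i, j = rowind[p] - 1, colind[p] - 1
        if transa in ('N', 'n'):
            y[i] += val[p] * x[j]
        else:
            y[j] += val[p] * x[i]
    return y


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
val = [1.0, 2.0, 3.0, 4.0, 5.0]
rowind = [1, 1, 2, 3, 3]
colind = [1, 3, 2, 1, 3]
y_n = coogemv('N', 3, val, rowind, colind, 5, [1.0, 2.0, 3.0])  # [7.0, 6.0, 19.0]
y_t = coogemv('T', 3, val, rowind, colind, 5, [1.0, 2.0, 3.0])  # [13.0, 6.0, 17.0]
```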
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each non-
zero element of the matrix A. Refer to columns array description in
Coordinate Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
SUBROUTINE mkl_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
COMPLEX val(*), x(*), y(*)
mkl_?diagemv
Computes matrix-vector product of a sparse general
matrix stored in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
call mkl_ddiagemv(transa, m, val, lval, idiag, ndiag, x, y)
call mkl_cdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
call mkl_zdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
Include Files
mkl.fi
Description
The mkl_?diagemv routine performs a matrix-vector operation defined as
y := A*x
or
y := A^T*x,
where:
x and y are vectors,
A is an m-by-m sparse square matrix in the diagonal storage format, A^T is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
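The sketch below models the non-transposed product for the diagonal format in plain Python. It rests on an assumption about the layout that is not restated in this section: entry i of stored diagonal k holds A(i, i + idiag(k)), so every stored element keeps its original row number and unused positions are padding. Here val is a list of diagonals (one Python list per column of the Fortran array val(lval, ndiag)), and the function name diagemv_n is hypothetical.

```python
def diagemv_n(m, val, idiag, x):
    """Return y = A*x for a matrix held as padded diagonals."""
    y = [0.0] * m
    for k, d in enumerate(idiag):
        for i in range(m):
            j = i + d  # column of the element stored at val[k][i]
            if 0 <= j < m:  # positions outside the matrix are padding
                y[i] += val[k][i] * x[j]
    return y


# A = [[1, 2, 0],
#      [3, 4, 5],
#      [0, 6, 7]]
idiag = [-1, 0, 1]
val = [[0.0, 3.0, 6.0],   # subdiagonal, padded at the top
       [1.0, 4.0, 7.0],   # main diagonal
       [2.0, 5.0, 0.0]]   # superdiagonal, padded at the bottom
y = diagemv_n(3, val, idiag, [1.0, 1.0, 1.0])
# y == [3.0, 12.0, 13.0]
```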
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
idiag INTEGER. Array of length ndiag, contains the distances between the main
diagonal and each non-zero diagonal in the matrix A.
Refer to distance array description in Diagonal Storage Scheme for more
details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER m, lval, ndiag
INTEGER idiag(*)
REAL val(lval,*), x(*), y(*)
mkl_?csrsymv
Computes matrix-vector product of a sparse
symmetrical matrix stored in the CSR format (3-array
variation) with one-based indexing.
Syntax
call mkl_scsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_dcsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_ccsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_zcsrsymv(uplo, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?csrsymv routine performs a matrix-vector operation defined as
y := A*x
where:
x and y are vectors,
A is an upper or lower triangle of the symmetrical sparse matrix in the CSR format (3-array variation).
NOTE
This routine supports only one-based indexing of the input arrays.
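Because only one triangle is supplied, each off-diagonal entry contributes to two elements of y. The plain-Python sketch below models the uplo = 'U' case. It assumes that the arrays hold only entries on or above the diagonal, uses one-based ia and ja, and returns y as a new list; the name csrsymv_upper is hypothetical.

```python
def csrsymv_upper(m, a, ia, ja, x):
    """Return y = A*x where the CSR arrays hold the upper triangle of A."""
    y = [0.0] * m
    for i in range(m):
        for p in range(ia[i] - 1, ia[i + 1] - 1):
            j = ja[p] - 1
            y[i] += a[p] * x[j]
            if j != i:
                y[j] += a[p] * x[i]  # mirrored entry A(j,i) = A(i,j)
    return y


# A = [[2, 1, 0],
#      [1, 3, 4],
#      [0, 4, 5]], stored as its upper triangle
a = [2.0, 1.0, 3.0, 4.0, 5.0]
ia = [1, 3, 5, 6]
ja = [1, 2, 2, 3, 3]
y = csrsymv_upper(3, a, ia, ja, [1.0, 2.0, 3.0])
# y == [4.0, 19.0, 23.0]
```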
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
DOUBLE COMPLEX for mkl_zcsrsymv.
Array, size is m.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
mkl_?bsrsymv
Computes matrix-vector product of a sparse
symmetrical matrix stored in the BSR format (3-array
variation) with one-based indexing.
Syntax
call mkl_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
y := A*x
where:
x and y are vectors,
A is an upper or lower triangle of the symmetrical sparse matrix in the BSR format (3-array variation).
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
Array containing elements of non-zero blocks of the matrix A. Its length is
equal to the number of non-zero blocks in the matrix A multiplied by lb*lb.
Refer to values array description in BSR Format for more details.
ja INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
mkl_?coosymv
Computes matrix-vector product of a sparse
symmetrical matrix stored in the coordinate format
with one-based indexing.
Syntax
call mkl_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_ccoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
y := A*x
where:
x and y are vectors,
A is an upper or lower triangle of the symmetrical sparse matrix in the coordinate format.
NOTE
This routine supports only one-based indexing of the input arrays.
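The following sketch shows, under stated assumptions, how a lower triangle held in coordinate format produces the full symmetric product. The matrix, the program name, and the loop are hypothetical illustrations rather than the library's code; the commented call follows the Syntax above and assumes Intel MKL is linked.

```fortran
program coosymv_sketch
  implicit none
  integer, parameter :: m = 3, nnz = 5
  ! Lower triangle of the symmetric matrix (illustrative data)
  !   | 4 1 0 |
  !   | 1 3 2 |
  !   | 0 2 5 |
  ! as (row, column, value) triplets with one-based indices.
  integer :: rowind(nnz) = (/ 1, 2, 2, 3, 3 /)
  integer :: colind(nnz) = (/ 1, 1, 2, 2, 3 /)
  double precision :: val(nnz) = (/ 4d0, 1d0, 3d0, 2d0, 5d0 /)
  double precision :: x(m) = (/ 1d0, 2d0, 3d0 /)
  double precision :: y(m)
  integer :: i, j, n

  ! Reference loop: each off-diagonal triplet contributes to two rows of y.
  y = 0d0
  do n = 1, nnz
    i = rowind(n)
    j = colind(n)
    y(i) = y(i) + val(n) * x(j)
    if (i /= j) y(j) = y(j) + val(n) * x(i)
  end do

  ! With MKL linked, the same product would be requested as:
  !   call mkl_dcoosymv('L', m, val, rowind, colind, nnz, x, y)
  print *, y
  if (any(abs(y - (/ 6d0, 13d0, 19d0 /)) > 1d-12)) error stop 'unexpected result'
end program coosymv_sketch
```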
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
rowind INTEGER. Array of length nnz containing the row indices for each non-zero
element of the matrix A.
Refer to the rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz containing the column indices for each
non-zero element of the matrix A. Refer to the columns array description in
Coordinate Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
mkl_?diasymv
Computes matrix-vector product of a sparse
symmetrical matrix stored in the diagonal format with
one-based indexing.
Syntax
call mkl_sdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
call mkl_ddiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
call mkl_cdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
call mkl_zdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
Include Files
mkl.fi
Description
y := A*x
where:
x and y are vectors,
A is an upper or lower triangle of the symmetrical sparse matrix.
NOTE
This routine supports only one-based indexing of the input arrays.
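A minimal sketch of the same product for diagonal storage follows. It assumes the layout described in Diagonal Storage Scheme, in which val(i,d) holds the element in row i and column i + idiag(d) and unused positions are padded; the matrix, the program name, and the loop are illustrative assumptions, not the library's implementation.

```fortran
program diasymv_sketch
  implicit none
  integer, parameter :: m = 3, lval = 3, ndiag = 2
  ! Lower triangle of the symmetric matrix (illustrative data)
  !   | 4 1 0 |
  !   | 1 3 2 |
  !   | 0 2 5 |
  ! stored as the sub-diagonal (distance -1) and the main diagonal (distance 0).
  integer :: idiag(ndiag) = (/ -1, 0 /)
  double precision :: val(lval, ndiag)
  double precision :: x(m) = (/ 1d0, 2d0, 3d0 /)
  double precision :: y(m)
  integer :: i, j, d

  ! Column 1: padding, a(2,1), a(3,2).  Column 2: a(1,1), a(2,2), a(3,3).
  val = reshape((/ 0d0, 1d0, 2d0,  4d0, 3d0, 5d0 /), (/ lval, ndiag /))

  y = 0d0
  do d = 1, ndiag
    do i = 1, m
      j = i + idiag(d)
      if (j < 1 .or. j > m) cycle          ! padded position
      y(i) = y(i) + val(i, d) * x(j)
      if (j /= i) y(j) = y(j) + val(i, d) * x(i)
    end do
  end do

  ! With MKL linked, the same product would be requested as:
  !   call mkl_ddiasymv('L', m, val, lval, idiag, ndiag, x, y)
  print *, y
  if (any(abs(y - (/ 6d0, 13d0, 19d0 /)) > 1d-12)) error stop 'unexpected result'
end program diasymv_sketch
```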
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
idiag INTEGER. Array of length ndiag containing the distances between the main
diagonal and each non-zero diagonal in the matrix A.
Refer to the distance array description in Diagonal Storage Scheme for more
details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo
INTEGER m, lval, ndiag
INTEGER idiag(*)
REAL val(lval,*), x(*), y(*)
mkl_?csrtrsv
Triangular solvers with simplified interface for a sparse
matrix in the CSR format (3-array variation) with one-
based indexing.
Syntax
call mkl_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?csrtrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the CSR format (3-array variation):
A*y = x
or
AT*y = x,
where:
x and y are vectors,
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
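To make the solve concrete, the sketch below performs forward substitution for A*y = x with a lower triangular, non-unit-diagonal matrix in one-based CSR. The matrix, the program name, and the loop are assumptions for illustration only. The flag values 'N' for transa (no transpose) and 'N' for diag (non-unit) in the commented call are likewise assumptions, since their descriptions do not appear in this section.

```fortran
program csrtrsv_sketch
  implicit none
  integer, parameter :: m = 3
  ! Lower triangular matrix (illustrative data)
  !   | 2 0 0 |
  !   | 1 3 0 |
  !   | 0 4 5 |
  ! in 3-array CSR, one-based, columns increasing within each row.
  double precision :: a(5) = (/ 2d0, 1d0, 3d0, 4d0, 5d0 /)
  integer :: ja(5) = (/ 1, 1, 2, 2, 3 /)
  integer :: ia(m+1) = (/ 1, 2, 4, 6 /)
  double precision :: x(m) = (/ 2d0, 7d0, 23d0 /)   ! right-hand side
  double precision :: y(m), s, dg
  integer :: i, j, p

  ! Forward substitution: y(i) = (x(i) - sum_{j<i} a(i,j)*y(j)) / a(i,i)
  do i = 1, m
    s = x(i)
    dg = 1d0
    do p = ia(i), ia(i+1) - 1
      j = ja(p)
      if (j < i) then
        s = s - a(p) * y(j)
      else if (j == i) then
        dg = a(p)
      end if
    end do
    y(i) = s / dg
  end do

  ! With MKL linked, the same solve would be requested as (flags assumed):
  !   call mkl_dcsrtrsv('L', 'N', 'N', m, a, ia, ja, x, y)
  print *, y
  if (any(abs(y - (/ 1d0, 2d0, 3d0 /)) > 1d-12)) error stop 'unexpected result'
end program csrtrsv_sketch
```

Note that x is the input right-hand side and y receives the solution, which is the reverse of the roles the names might suggest.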
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
NOTE
Column indices must be sorted in increasing order for each row.
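The two notes above (sorted column indices, no omitted diagonal element for a non-unit solve) can be checked before calling the solver. The following is a hypothetical pre-check written for this document; the program name and sample arrays are assumptions, and the library itself is not stated to perform or require such a check.

```fortran
program check_csr_triangular
  implicit none
  integer, parameter :: m = 3
  ! One-based CSR structure of a 3-by-3 lower triangular matrix (sample data).
  integer :: ia(m+1) = (/ 1, 2, 4, 6 /)
  integer :: ja(5) = (/ 1, 1, 2, 2, 3 /)
  integer :: i, p
  logical :: sorted, has_diag, ok

  ok = .true.
  do i = 1, m
    sorted = .true.
    has_diag = .false.
    do p = ia(i), ia(i+1) - 1
      if (ja(p) == i) has_diag = .true.
      if (p > ia(i)) then
        ! Columns must strictly increase from left to right within a row.
        if (ja(p) <= ja(p-1)) sorted = .false.
      end if
    end do
    if (.not. sorted) print *, 'row', i, ': column indices not increasing'
    if (.not. has_diag) print *, 'row', i, ': diagonal entry missing'
    ok = ok .and. sorted .and. has_diag
  end do

  if (.not. ok) error stop 'CSR structure check failed'
  print *, 'structure looks acceptable for a non-unit solve'
end program check_csr_triangular
```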
Output Parameters
Array, size at least m.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
mkl_?bsrtrsv
Triangular solver with simplified interface for a sparse
matrix stored in the BSR format (3-array variation)
with one-based indexing.
Syntax
call mkl_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?bsrtrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the BSR format (3-array variation):
A*y = x
or
AT*y = x,
where:
x and y are vectors,
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
Array containing elements of non-zero blocks of the matrix A. Its length is
equal to the number of non-zero blocks in the matrix A multiplied by lb*lb.
Refer to values array description in BSR Format for more details.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
ja INTEGER.
Array containing the column indices for each non-zero block in the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
mkl_?cootrsv
Triangular solvers with simplified interface for a sparse
matrix in the coordinate format with one-based
indexing.
Syntax
call mkl_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_?cootrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the coordinate format:
A*y = x
or
AT*y = x,
where:
x and y are vectors,
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
rowind INTEGER. Array of length nnz containing the row indices for each non-zero
element of the matrix A.
Refer to the rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz containing the column indices for each
non-zero element of the matrix A. Refer to the columns array description in
Coordinate Format for more details.
Array, size is m.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
mkl_?diatrsv
Triangular solvers with simplified interface for a sparse
matrix in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
call mkl_ddiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
call mkl_cdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
call mkl_zdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
Include Files
mkl.fi
Description
The mkl_?diatrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the diagonal format:
A*y = x
or
AT*y = x,
where:
x and y are vectors,
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
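For the diagonal format, the solve can be sketched as back substitution over the stored diagonals. The example assumes that val(i,d) holds the element in row i and column i + idiag(d), as in Diagonal Storage Scheme; the upper triangular matrix, the program name, and the flag values in the commented call are assumptions made for illustration.

```fortran
program diatrsv_sketch
  implicit none
  integer, parameter :: m = 3, lval = 3, ndiag = 2
  ! Upper triangular matrix (illustrative data)
  !   | 2 1 0 |
  !   | 0 3 4 |
  !   | 0 0 5 |
  ! stored as the main diagonal (distance 0) and the super-diagonal (distance 1).
  integer :: idiag(ndiag) = (/ 0, 1 /)          ! sorted in increasing order
  double precision :: val(lval, ndiag)
  double precision :: x(m) = (/ 4d0, 18d0, 15d0 /)   ! right-hand side
  double precision :: y(m), s, dg
  integer :: i, j, d

  ! Column 1: a(1,1), a(2,2), a(3,3).  Column 2: a(1,2), a(2,3), padding.
  val = reshape((/ 2d0, 3d0, 5d0,  1d0, 4d0, 0d0 /), (/ lval, ndiag /))

  ! Back substitution: rows are processed from last to first.
  do i = m, 1, -1
    s = x(i)
    dg = 1d0
    do d = 1, ndiag
      j = i + idiag(d)
      if (idiag(d) == 0) then
        dg = val(i, d)
      else if (j >= 1 .and. j <= m) then
        s = s - val(i, d) * y(j)
      end if
    end do
    y(i) = s / dg
  end do

  ! With MKL linked, the same solve would be requested as (flags assumed):
  !   call mkl_ddiatrsv('U', 'N', 'N', m, val, lval, idiag, ndiag, x, y)
  print *, y
  if (any(abs(y - (/ 1d0, 2d0, 3d0 /)) > 1d-12)) error stop 'unexpected result'
end program diatrsv_sketch
```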
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
idiag INTEGER. Array of length ndiag containing the distances between the main
diagonal and each non-zero diagonal in the matrix A.
NOTE
All elements of this array must be sorted in increasing order.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lval, ndiag
INTEGER idiag(*)
REAL val(lval,*), x(*), y(*)
mkl_cspblas_?csrgemv
Computes matrix-vector product of a sparse general
matrix stored in the CSR format (3-array variation)
with zero-based indexing.
Syntax
call mkl_cspblas_scsrgemv(transa, m, a, ia, ja, x, y)
call mkl_cspblas_dcsrgemv(transa, m, a, ia, ja, x, y)
call mkl_cspblas_ccsrgemv(transa, m, a, ia, ja, x, y)
call mkl_cspblas_zcsrgemv(transa, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
y := A*x
or
y := AT*x,
where:
x and y are vectors,
A is an m-by-m sparse square matrix in the CSR format (3-array variation) with zero-based indexing, AT is
the transpose of A.
NOTE
This routine supports only zero-based indexing of the input arrays.
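Because Fortran arrays are one-based by default, zero-based index values need an offset when they are used as subscripts. The sketch below illustrates this for both y := A*x and y := AT*x; the matrix, the program name, and the loop are assumptions for this example and do not reproduce the library's code. The transa values 'N' and 'T' in the comments are assumed, since the transa description does not appear in this section.

```fortran
program cspblas_csrgemv_sketch
  implicit none
  integer, parameter :: m = 3
  ! General matrix (illustrative data)
  !   | 1 0 2 |
  !   | 0 3 0 |
  !   | 4 0 5 |
  ! in 3-array CSR whose index VALUES start at zero.
  double precision :: a(5) = (/ 1d0, 2d0, 3d0, 4d0, 5d0 /)
  integer :: ja(5) = (/ 0, 2, 1, 0, 2 /)
  integer :: ia(m+1) = (/ 0, 2, 3, 5 /)
  double precision :: x(m) = (/ 1d0, 2d0, 3d0 /)
  double precision :: y(m), yt(m)
  integer :: i, j, p

  y = 0d0
  yt = 0d0
  do i = 1, m
    ! Row i occupies Fortran positions ia(i)+1 .. ia(i+1) of a and ja.
    do p = ia(i) + 1, ia(i+1)
      j = ja(p) + 1                     ! zero-based column -> Fortran subscript
      y(i)  = y(i)  + a(p) * x(j)       ! y  := A*x
      yt(j) = yt(j) + a(p) * x(i)       ! yt := AT*x
    end do
  end do

  ! With MKL linked, the same products would be requested as (flags assumed):
  !   call mkl_cspblas_dcsrgemv('N', m, a, ia, ja, x, y)
  !   call mkl_cspblas_dcsrgemv('T', m, a, ia, ja, x, yt)
  print *, y
  print *, yt
  if (any(abs(y  - (/ 7d0, 6d0, 19d0 /)) > 1d-12)) error stop 'unexpected A*x'
  if (any(abs(yt - (/ 13d0, 6d0, 17d0 /)) > 1d-12)) error stop 'unexpected AT*x'
end program cspblas_csrgemv_sketch
```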
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
DOUBLE PRECISION for mkl_cspblas_dcsrgemv.
COMPLEX for mkl_cspblas_ccsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zcsrgemv.
Array, size is m.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
mkl_cspblas_?bsrgemv
Computes matrix-vector product of a sparse general
matrix stored in the BSR format (3-array variation)
with zero-based indexing.
Syntax
call mkl_cspblas_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cspblas_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cspblas_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cspblas_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
y := A*x
or
y := AT*x,
where:
x and y are vectors,
A is an m-by-m block sparse square matrix in the BSR format (3-array variation) with zero-based indexing,
AT is the transpose of A.
NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
COMPLEX for mkl_cspblas_cbsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zbsrgemv.
Array containing elements of non-zero blocks of the matrix A. Its length is
equal to the number of non-zero blocks in the matrix A multiplied by lb*lb.
Refer to values array description in BSR Format for more details.
ja INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
mkl_cspblas_?coogemv
Computes matrix-vector product of a sparse general
matrix stored in the coordinate format with zero-
based indexing.
Syntax
call mkl_cspblas_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
y := A*x
or
y := AT*x,
where:
x and y are vectors,
A is an m-by-m sparse square matrix in the coordinate format with zero-based indexing, AT is the transpose
of A.
NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
rowind INTEGER. Array of length nnz containing the row indices for each non-zero
element of the matrix A.
Refer to the rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz containing the column indices for each
non-zero element of the matrix A. Refer to the columns array description in
Coordinate Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
mkl_cspblas_?csrsymv
Computes matrix-vector product of a sparse
symmetrical matrix stored in the CSR format (3-array
variation) with zero-based indexing.
Syntax
call mkl_cspblas_scsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_cspblas_dcsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_cspblas_ccsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_cspblas_zcsrsymv(uplo, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
y := A*x
where:
x and y are vectors,
A is an upper or lower triangle of the symmetrical sparse matrix in the CSR format (3-array variation) with
zero-based indexing.
NOTE
This routine supports only zero-based indexing of the input arrays.
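Data prepared for the one-based routines can be adapted to the zero-based ones by shifting the index arrays. The following sketch assumes a one-based CSR structure already exists; the array names ia0 and ja0 and the program name are hypothetical, and the values array a needs no change.

```fortran
program one_to_zero_based
  implicit none
  integer, parameter :: m = 3, nnz = 5
  ! One-based CSR structure of an upper triangle (sample data).
  integer :: ia(m+1) = (/ 1, 3, 5, 6 /)
  integer :: ja(nnz) = (/ 1, 2, 2, 3, 3 /)
  integer :: ia0(m+1), ja0(nnz)

  ! Shift every row pointer and every column index down by one.
  ia0 = ia - 1
  ja0 = ja - 1

  ! ia0 and ja0 could now be passed to a zero-based routine, for example:
  !   call mkl_cspblas_dcsrsymv('U', m, a, ia0, ja0, x, y)
  print *, ia0
  print *, ja0
  if (any(ia0 /= (/ 0, 2, 4, 5 /))) error stop 'unexpected row pointers'
  if (any(ja0 /= (/ 0, 1, 1, 2, 2 /))) error stop 'unexpected column indices'
end program one_to_zero_based
```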
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_zcsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)
mkl_cspblas_?bsrsymv
Computes matrix-vector product of a sparse
symmetrical matrix stored in the BSR format (3-array
variation) with zero-based indexing.
Syntax
call mkl_cspblas_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cspblas_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cspblas_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cspblas_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
y := A*x
where:
x and y are vectors,
A is an upper or lower triangle of the symmetrical sparse matrix in the BSR format (3-array variation) with
zero-based indexing.
NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
ja INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
mkl_cspblas_?coosymv
Computes matrix-vector product of a sparse
symmetrical matrix stored in the coordinate format
with zero-based indexing.
Syntax
call mkl_cspblas_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_ccoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
y := A*x
where:
x and y are vectors,
A is an upper or lower triangle of the symmetrical sparse matrix in the coordinate format with zero-based
indexing.
NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
rowind INTEGER. Array of length nnz containing the row indices for each non-zero
element of the matrix A.
Refer to the rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz containing the column indices for each
non-zero element of the matrix A. Refer to the columns array description in
Coordinate Format for more details.
Output Parameters
DOUBLE COMPLEX for mkl_cspblas_zcoosymv.
Array, size at least m.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
mkl_cspblas_?csrtrsv
Triangular solvers with simplified interface for a sparse
matrix in the CSR format (3-array variation) with
zero-based indexing.
Syntax
call mkl_cspblas_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_cspblas_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_cspblas_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_cspblas_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?csrtrsv routine solves a system of linear equations with matrix-vector operations for a
sparse matrix stored in the CSR format (3-array variation) with zero-based indexing:
A*y = x
or
AT*y = x,
where:
x and y are vectors,
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
NOTE
Column indices must be sorted in increasing order for each row.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
mkl_cspblas_?bsrtrsv
Triangular solver with simplified interface for a sparse
matrix stored in the BSR format (3-array variation)
with zero-based indexing.
Syntax
call mkl_cspblas_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cspblas_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cspblas_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cspblas_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?bsrtrsv routine solves a system of linear equations with matrix-vector operations for a
sparse matrix stored in the BSR format (3-array variation) with zero-based indexing:
A*y = x
or
AT*y = x,
where:
x and y are vectors,
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
ja INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
mkl_cspblas_?cootrsv
Triangular solvers with simplified interface for a sparse
matrix in the coordinate format with zero-based
indexing.
Syntax
call mkl_cspblas_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?cootrsv routine solves a system of linear equations with matrix-vector operations for a
sparse matrix stored in the coordinate format with zero-based indexing:
A*y = x
or
AT*y = x,
where:
x and y are vectors,
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
rowind INTEGER. Array of length nnz containing the row indices for each non-zero
element of the matrix A.
Refer to the rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz containing the column indices for each
non-zero element of the matrix A. Refer to the columns array description in
Coordinate Format for more details.
Array, size is m.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
mkl_?csrmv
Computes matrix-vector product of a sparse matrix
stored in the CSR format.
Syntax
call mkl_scsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_dcsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_ccsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_zcsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
Include Files
mkl.fi
Description
y := alpha*A*x + beta*y
or
y := alpha*AT*x + beta*y,
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-k sparse matrix in the CSR format, AT is the transpose of A.
NOTE
This routine supports a CSR format both with one-based indexing and zero-based indexing.
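The update y := alpha*A*x + beta*y can be sketched for a rectangular matrix as shown below. The example assumes one-based indexing and assumes that pntrb(i) marks the first stored element of row i while pntre(i) points one past the last; the matrix, the program name, and the matdescra settings in the comments ('G' in element 1 for a general matrix, 'F' in element 4 for one-based indexing) are assumptions for illustration, not definitions taken from this section.

```fortran
program csrmv_sketch
  implicit none
  integer, parameter :: m = 2, k = 3
  ! 2-by-3 matrix (illustrative data)
  !   | 1 0 2 |
  !   | 0 3 0 |
  double precision :: val(3) = (/ 1d0, 2d0, 3d0 /)
  integer :: indx(3) = (/ 1, 3, 2 /)
  integer :: pntrb(m) = (/ 1, 3 /)     ! start of each row (assumed convention)
  integer :: pntre(m) = (/ 3, 4 /)     ! one past the end of each row
  double precision :: x(k) = (/ 1d0, 1d0, 1d0 /)
  double precision :: y(m) = (/ 10d0, 20d0 /)
  double precision :: alpha = 2d0, beta = 0.5d0, s
  integer :: i, p

  ! Reference loop for the non-transposed case.
  do i = 1, m
    s = 0d0
    do p = pntrb(i), pntre(i) - 1
      s = s + val(p) * x(indx(p))
    end do
    y(i) = alpha * s + beta * y(i)
  end do

  ! With MKL linked, the same update would be requested as (settings assumed):
  !   character matdescra(6)
  !   matdescra(1) = 'G'
  !   matdescra(4) = 'F'
  !   call mkl_dcsrmv('N', m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
  print *, y
  if (any(abs(y - (/ 11d0, 16d0 /)) > 1d-12)) error stop 'unexpected result'
end program csrmv_sketch
```

Here x has k elements and y has m elements for the non-transposed product; y is both an input (scaled by beta) and the output.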
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
Specifies the scalar alpha.
matdescra CHARACTER. Array of six elements that specifies properties of the matrix used
for the operation. Only the first four array elements are used; their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
indx INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the val array.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmv(transa, m, k, alpha, matdescra, val, indx,
pntrb, pntre, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_ccsrmv(transa, m, k, alpha, matdescra, val, indx,
pntrb, pntre, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k
INTEGER indx(*), pntrb(m), pntre(m)
COMPLEX alpha, beta
COMPLEX val(*), x(*), y(*)
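The operation y := alpha*A*x + beta*y on the four-array CSR representation can be sketched in plain Fortran as follows. This is an illustrative reference loop, not the library's implementation; the program name, the 3-by-3 test matrix, and the one-based row range pntrb(i)-pntrb(1)+1 .. pntre(i)-pntrb(1) are assumptions made for the example.

```fortran
program csr_mv_sketch
  implicit none
  integer, parameter :: m = 3, k = 3
  ! A = [1 0 2; 0 3 0; 4 0 5] stored row by row (one-based indexing)
  real    :: val(5)   = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)
  integer :: indx(5)  = (/ 1, 3, 2, 1, 3 /)
  integer :: pntrb(m) = (/ 1, 3, 4 /)
  integer :: pntre(m) = (/ 3, 4, 6 /)
  real    :: x(k) = (/ 1.0, 1.0, 1.0 /)
  real    :: y(m) = (/ 0.0, 0.0, 0.0 /)
  real    :: alpha = 1.0, beta = 0.0, s
  integer :: i, j

  do i = 1, m
    s = 0.0
    ! Entries of row i (assumed one-based layout)
    do j = pntrb(i) - pntrb(1) + 1, pntre(i) - pntrb(1)
      s = s + val(j) * x(indx(j))
    end do
    y(i) = alpha * s + beta * y(i)
  end do

  ! With MKL linked, the same product would be requested roughly as
  !   call mkl_scsrmv('N', m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
  ! where matdescra(1) = 'G' and matdescra(4) = 'F' are assumed here to select
  ! a general matrix with one-based indexing (see the matdescra table).
  print *, y   ! expected values: 3.0, 3.0, 9.0
end program csr_mv_sketch
```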
mkl_?bsrmv
Computes the matrix-vector product of a sparse matrix
stored in the BSR format.
Syntax
call mkl_sbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_dbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_cbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_zbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
Include Files
mkl.fi
Description
y := alpha*A*x + beta*y
or
y := alpha*A^T*x + beta*y,
where:
NOTE
This routine supports a BSR format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
indx INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks in the matrix A.
Refer to columns array description in BSR Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrmv(transa, m, k, lb, alpha, matdescra, val, indx,
pntrb, pntre, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k, lb
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha, beta
REAL val(*), x(*), y(*)
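A block-row product in the BSR layout might be sketched as below. This is a plain Fortran reference loop rather than the library's code. It assumes one-based indexing, m and k counted in blocks of size lb, and column-major storage of the elements inside each block of val; the within-block ordering is an assumption of this sketch and should be checked against the BSR Format description. All names and the test data are hypothetical.

```fortran
program bsr_mv_sketch
  implicit none
  integer, parameter :: m = 2, k = 2, lb = 2, nblk = 3
  ! Blocks (1,1) = [1 2; 3 4], (1,2) = [1 0; 0 1], (2,2) = [5 6; 7 8],
  ! each stored column by column (assumed layout).
  real    :: val(lb*lb*nblk) = (/ 1.0, 3.0, 2.0, 4.0,  &
                                  1.0, 0.0, 0.0, 1.0,  &
                                  5.0, 7.0, 6.0, 8.0 /)
  integer :: indx(nblk) = (/ 1, 2, 2 /)
  integer :: pntrb(m) = (/ 1, 3 /), pntre(m) = (/ 3, 4 /)
  real    :: x(k*lb) = (/ 1.0, 1.0, 1.0, 1.0 /)
  real    :: y(m*lb) = (/ 0.0, 0.0, 0.0, 0.0 /)
  real    :: acc(m*lb)
  real    :: alpha = 1.0, beta = 0.0
  integer :: i, p, r, c, row, col

  acc = 0.0
  do i = 1, m                                  ! block rows
    do p = pntrb(i) - pntrb(1) + 1, pntre(i) - pntrb(1)
      do c = 1, lb                             ! column inside the block
        col = (indx(p) - 1)*lb + c
        do r = 1, lb                           ! row inside the block
          row = (i - 1)*lb + r
          acc(row) = acc(row) + val((p - 1)*lb*lb + (c - 1)*lb + r) * x(col)
        end do
      end do
    end do
  end do
  y = alpha*acc + beta*y

  print *, y   ! expected values: 4.0, 8.0, 11.0, 15.0
end program bsr_mv_sketch
```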
mkl_?cscmv
Computes matrix-vector product for a sparse matrix in
the CSC format.
Syntax
call mkl_scscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_dcscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_ccscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_zcscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
Include Files
mkl.fi
Description
y := alpha*A*x + beta*y
or
y := alpha*A^T*x + beta*y,
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-k sparse matrix in compressed sparse column (CSC) format, and A^T is the transpose of A.
NOTE
This routine supports CSC format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
indx INTEGER. Array containing the row indices for each non-zero element of the
matrix A.
Its length is equal to length of the val array.
For zero-based indexing this array contains column indices, such that
pntrb(i) - pntrb(0) is the first index of column i in the arrays val and
indx.
Refer to pointerb array description in CSC Format for more details.
Array, size at least k if transa = 'N' or 'n' and at least m otherwise. On
entry, the array x must contain the vector x.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscmv(transa, m, k, alpha, matdescra, val, indx,
pntrb, pntre, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k
INTEGER indx(*), pntrb(k), pntre(k)
REAL alpha, beta
REAL val(*), x(*), y(*)
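For the CSC layout the same product is naturally computed column by column, scattering each column's contribution into y. The following is a hedged reference sketch in plain Fortran, not the library's implementation; the program name and test data are hypothetical, and one-based indexing with column range pntrb(j)-pntrb(1)+1 .. pntre(j)-pntrb(1) is assumed.

```fortran
program csc_mv_sketch
  implicit none
  integer, parameter :: m = 3, k = 3
  ! A = [1 0 2; 0 3 0; 4 0 5] stored column by column (one-based indexing)
  real    :: val(5)  = (/ 1.0, 4.0, 3.0, 2.0, 5.0 /)
  integer :: indx(5) = (/ 1, 3, 2, 1, 3 /)        ! row indices
  integer :: pntrb(k) = (/ 1, 3, 4 /), pntre(k) = (/ 3, 4, 6 /)
  real    :: x(k) = (/ 1.0, 2.0, 3.0 /)
  real    :: y(m) = (/ 0.0, 0.0, 0.0 /)
  real    :: alpha = 1.0, beta = 0.0
  integer :: j, p

  y = beta*y
  do j = 1, k
    ! Column j contributes alpha*x(j)*A(:,j) to y
    do p = pntrb(j) - pntrb(1) + 1, pntre(j) - pntrb(1)
      y(indx(p)) = y(indx(p)) + alpha*val(p)*x(j)
    end do
  end do

  print *, y   ! expected values: 7.0, 6.0, 19.0
end program csc_mv_sketch
```

Note that the pointer arrays have one entry per column here, which is why they are dimensioned by k rather than m in this sketch.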
mkl_?coomv
Computes the matrix-vector product for a sparse matrix
in the coordinate format.
Syntax
call mkl_scoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
call mkl_dcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
call mkl_ccoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
call mkl_zcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
Include Files
mkl.fi
Description
y := alpha*A*x + beta*y
or
y := alpha*A^T*x + beta*y,
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-k sparse matrix in the coordinate format, and A^T is the transpose of A.
NOTE
This routine supports a coordinate format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each
non-zero element of the matrix A.
Refer to columns array description in Coordinate Format for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k, nnz
INTEGER rowind(*), colind(*)
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_ccoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k, nnz
INTEGER rowind(*), colind(*)
COMPLEX alpha, beta
COMPLEX val(*), x(*), y(*)
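In the coordinate format each stored entry carries its own row and column index, so both the plain and the transposed product reduce to a single pass over the nnz entries. The sketch below is a plain Fortran illustration of that arithmetic under the assumption of one-based indices; it is not the library's code, and the names y and yt as well as the test data are hypothetical.

```fortran
program coo_mv_sketch
  implicit none
  integer, parameter :: m = 3, k = 3, nnz = 5
  ! A = [1 0 2; 0 3 0; 4 0 5] as (row, column, value) triples
  real    :: val(nnz)    = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)
  integer :: rowind(nnz) = (/ 1, 1, 2, 3, 3 /)
  integer :: colind(nnz) = (/ 1, 3, 2, 1, 3 /)
  real    :: x(3) = (/ 1.0, 2.0, 3.0 /)
  real    :: y(m) = 0.0      ! receives alpha*A*x   + beta*y
  real    :: yt(k) = 0.0     ! receives alpha*A^T*x + beta*yt
  real    :: alpha = 1.0, beta = 0.0
  integer :: p

  y  = beta*y
  yt = beta*yt
  do p = 1, nnz
    ! transa = 'N': entry (i,j) multiplies x(j) and lands in y(i)
    y(rowind(p))  = y(rowind(p))  + alpha*val(p)*x(colind(p))
    ! transa = 'T': the roles of the two index arrays are swapped
    yt(colind(p)) = yt(colind(p)) + alpha*val(p)*x(rowind(p))
  end do

  print *, y    ! expected values: 7.0, 6.0, 19.0
  print *, yt   ! expected values: 13.0, 6.0, 17.0
end program coo_mv_sketch
```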
mkl_?csrsv
Solves a system of linear equations for a sparse
matrix in the CSR format.
Syntax
call mkl_scsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_dcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_ccsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_zcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
Include Files
mkl.fi
Description
The mkl_?csrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the CSR format:
y := alpha*inv(A)*x
or
y := alpha*inv(A^T)*x,
where:
alpha is a scalar, x and y are vectors, A is a sparse upper or lower triangular matrix with unit or non-unit main
diagonal, and A^T is the transpose of A.
NOTE
This routine supports a CSR format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
indx INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to length of the val array.
Refer to columns array description in CSR Format for more details.
NOTE
Column indices must be sorted in increasing order for each row.
On entry, the array x must contain the vector x. The elements are accessed
with unit increment.
On entry, the array y must contain the vector y. The elements are accessed
with unit increment.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha
REAL val(*)
REAL x(*), y(*)
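The two notes above (sorted column indices, and no omitted diagonal entry for a non-unit matrix) can be seen at work in a forward-substitution sketch for a lower triangular CSR matrix. This is a plain Fortran reference loop for y := alpha*inv(A)*x under the assumption of one-based indexing, not the library's algorithm; the program name and the test system are hypothetical.

```fortran
program csr_sv_sketch
  implicit none
  integer, parameter :: m = 3
  ! A = [2 0 0; 1 4 0; 0 3 5], lower triangular, non-unit diagonal
  real    :: val(5)  = (/ 2.0, 1.0, 4.0, 3.0, 5.0 /)
  integer :: indx(5) = (/ 1, 1, 2, 2, 3 /)
  integer :: pntrb(m) = (/ 1, 2, 4 /), pntre(m) = (/ 2, 4, 6 /)
  real    :: x(m) = (/ 2.0, 9.0, 21.0 /)   ! right-hand side
  real    :: y(m)
  real    :: alpha = 1.0, s, d
  integer :: i, j

  do i = 1, m
    s = x(i)
    d = 0.0
    do j = pntrb(i) - pntrb(1) + 1, pntre(i) - pntrb(1)
      if (indx(j) < i) then
        s = s - val(j)*y(indx(j))      ! already-solved unknowns
      else if (indx(j) == i) then
        d = val(j)                     ! stored diagonal entry
      end if
    end do
    if (d == 0.0) stop 'missing or zero diagonal entry'
    y(i) = s/d
  end do
  y = alpha*y                          ! scale once the solve is complete

  ! A corresponding MKL call would pass matdescra(1:4) = 'T', 'L', 'N', 'F'
  ! (triangular, lower, non-unit, one-based); these values are assumed from
  ! the matdescra table and are not verified in this sketch.
  print *, y   ! expected values: 1.0, 2.0, 3.0
end program csr_sv_sketch
```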
mkl_?bsrsv
Solves a system of linear equations for a sparse
matrix in the BSR format.
Syntax
call mkl_sbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_dbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_cbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_zbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
Include Files
mkl.fi
Description
The mkl_?bsrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the BSR format:
y := alpha*inv(A)*x
or
y := alpha*inv(A^T)*x,
where:
alpha is a scalar, x and y are vectors, A is a sparse upper or lower triangular matrix with unit or non-unit main
diagonal, and A^T is the transpose of A.
NOTE
This routine supports a BSR format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
Refer to the values array description in BSR Format for more details.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
indx INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks in the matrix A.
Refer to the columns array description in BSR Format for more details.
DOUBLE COMPLEX for mkl_zbsrsv.
Array, size at least (m*lb).
On entry, the array x must contain the vector x. The elements are accessed
with unit increment.
On entry, the array y must contain the vector y. The elements are accessed
with unit increment.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, lb
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha
REAL val(*)
REAL x(*), y(*)
mkl_?cscsv
Solves a system of linear equations for a sparse
matrix in the CSC format.
Syntax
call mkl_scscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_dcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_ccscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_zcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
Include Files
mkl.fi
Description
The mkl_?cscsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the CSC format:
y := alpha*inv(A)*x
or
y := alpha*inv(A^T)*x,
where:
alpha is a scalar, x and y are vectors, A is a sparse upper or lower triangular matrix with unit or non-unit main
diagonal, and A^T is the transpose of A.
NOTE
This routine supports a CSC format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
NOTE
The non-zero elements of the given column of the matrix must be stored
in the same order as they appear in the column (from top to bottom).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
indx INTEGER. Array containing the row indices for each non-zero element of the
matrix A.
Its length is equal to length of the val array.
NOTE
Row indices must be sorted in increasing order for each column.
For zero-based indexing this array contains column indices, such that
pntrb(i) - pntrb(0) is the first index of column i in the arrays val and
indx.
Refer to pointerb array description in CSC Format for more details.
On entry, the array x must contain the vector x. The elements are accessed
with unit increment.
On entry, the array y must contain the vector y. The elements are accessed
with unit increment.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha
REAL val(*)
REAL x(*), y(*)
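With column-oriented storage, a lower triangular solve can be organized so that each solved unknown is immediately eliminated from the remaining equations. The sketch below illustrates this in plain Fortran for y := alpha*inv(A)*x. It is not the library's algorithm. It assumes one-based indexing and relies on the row indices of each column being sorted, so that the diagonal entry is the first stored entry of its column. All names and data are hypothetical.

```fortran
program csc_sv_sketch
  implicit none
  integer, parameter :: m = 3
  ! A = [2 0 0; 1 4 0; 0 3 5] stored column by column
  real    :: val(5)  = (/ 2.0, 1.0, 4.0, 3.0, 5.0 /)
  integer :: indx(5) = (/ 1, 2, 2, 3, 3 /)        ! row indices
  integer :: pntrb(m) = (/ 1, 3, 5 /), pntre(m) = (/ 3, 5, 6 /)
  real    :: x(m) = (/ 2.0, 9.0, 21.0 /)
  real    :: y(m)
  real    :: alpha = 1.0
  integer :: j, p, first, last

  y = x
  do j = 1, m
    first = pntrb(j) - pntrb(1) + 1
    last  = pntre(j) - pntrb(1)
    ! Sorted row indices place the diagonal entry first in a lower
    ! triangular column; stop if that assumption does not hold.
    if (indx(first) /= j) stop 'diagonal entry must be stored first'
    y(j) = y(j)/val(first)
    do p = first + 1, last
      y(indx(p)) = y(indx(p)) - val(p)*y(j)   ! eliminate y(j) below
    end do
  end do
  y = alpha*y

  print *, y   ! expected values: 1.0, 2.0, 3.0
end program csc_sv_sketch
```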
mkl_?coosv
Solves a system of linear equations for a sparse
matrix in the coordinate format.
Syntax
call mkl_scoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
call mkl_dcoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
call mkl_ccoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
call mkl_zcoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_?coosv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the coordinate format:
y := alpha*inv(A)*x
or
y := alpha*inv(A^T)*x,
where:
alpha is a scalar, x and y are vectors, A is a sparse upper or lower triangular matrix with unit or non-unit main
diagonal, and A^T is the transpose of A.
NOTE
This routine supports a coordinate format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each
non-zero element of the matrix A.
Refer to columns array description in Coordinate Format for more details.
On entry, the array x must contain the vector x. The elements are accessed
with unit increment.
On entry, the array y must contain the vector y. The elements are accessed
with unit increment.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL alpha
REAL val(*)
REAL x(*), y(*)
mkl_?csrmm
Computes the matrix-matrix product of a sparse matrix
stored in the CSR format.
Syntax
call mkl_scsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
call mkl_dcsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
call mkl_ccsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
call mkl_zcsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
Include Files
mkl.fi
Description
C := alpha*A*B + beta*C
or
C := alpha*A^T*B + beta*C
or
C := alpha*A^H*B + beta*C,
where:
alpha and beta are scalars,
B and C are dense matrices, A is an m-by-k sparse matrix in compressed sparse row (CSR) format, A^T is the
transpose of A, and A^H is the conjugate transpose of A.
NOTE
This routine supports a CSR format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
indx INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to length of the val array.
Refer to pointerE array description in CSR Format for more details.
On entry with transa='N' or 'n', the leading k-by-n part of the array b
must contain the matrix B, otherwise the leading m-by-n part of the array b
must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b for one-based indexing, and
the second dimension of b for zero-based indexing, as declared in the
calling (sub)program.
ldc INTEGER. Specifies the leading dimension of c for one-based indexing, and
the second dimension of c for zero-based indexing, as declared in the
calling (sub)program.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmm(transa, m, n, k, alpha, matdescra, val, indx,
pntrb, pntre, b, ldb, beta, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, k, ldb, ldc
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha, beta
REAL val(*), b(ldb,*), c(ldc,*)
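The role of ldb and ldc for one-based indexing can be illustrated with a reference loop for C := alpha*A*B + beta*C, where B and C are ordinary column-major Fortran arrays. This is a plain Fortran sketch of the arithmetic, not the library's implementation; the program name, the test matrices, and the one-based row range formula are assumptions of the example.

```fortran
program csr_mm_sketch
  implicit none
  integer, parameter :: m = 3, n = 2, k = 3, ldb = 3, ldc = 3
  ! A = [1 0 2; 0 3 0; 4 0 5] in one-based CSR
  real    :: val(5)  = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)
  integer :: indx(5) = (/ 1, 3, 2, 1, 3 /)
  integer :: pntrb(m) = (/ 1, 3, 4 /), pntre(m) = (/ 3, 4, 6 /)
  real    :: b(ldb, n), c(ldc, n)
  real    :: alpha = 1.0, beta = 0.0, s
  integer :: i, j, p

  ! B = [1 2; 3 4; 5 6], filled column by column
  b = reshape((/ 1.0, 3.0, 5.0, 2.0, 4.0, 6.0 /), (/ ldb, n /))
  c = 0.0

  do j = 1, n                 ! columns of B and C
    do i = 1, m               ! rows of A
      s = 0.0
      do p = pntrb(i) - pntrb(1) + 1, pntre(i) - pntrb(1)
        s = s + val(p)*b(indx(p), j)
      end do
      c(i, j) = alpha*s + beta*c(i, j)
    end do
  end do

  ! Expected rows of C: (11, 14), (9, 12), (29, 38)
  do i = 1, m
    print *, c(i, 1:n)
  end do
end program csr_mm_sketch
```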
mkl_?bsrmm
Computes the matrix-matrix product of a sparse matrix
stored in the BSR format.
Syntax
call mkl_sbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
call mkl_dbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
call mkl_cbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
call mkl_zbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
Include Files
mkl.fi
Description
C := alpha*A*B + beta*C
or
C := alpha*A^T*B + beta*C
or
C := alpha*A^H*B + beta*C,
where:
alpha and beta are scalars,
B and C are dense matrices, A is an m-by-k sparse matrix in block sparse row (BSR) format, A^T is the
transpose of A, and A^H is the conjugate transpose of A.
NOTE
This routine supports a BSR format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
indx INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks in the matrix A. Refer
to the columns array description in BSR Format for more details.
For zero-based indexing this array contains row indices, such that
pntre(I) - pntrb(0) - 1 is the last index of block row I in the array
indx.
Refer to pointerE array description in BSR Format for more details.
On entry with transa='N' or 'n', the leading k-by-n block part of the
array b must contain the matrix B, otherwise the leading m-by-n block part
of the array b must contain the matrix B.
ldb INTEGER. Specifies the leading dimension (in blocks) of b as declared in the
calling (sub)program.
ldc INTEGER. Specifies the leading dimension (in blocks) of c as declared in the
calling (sub)program.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrmm(transa, m, n, k, lb, alpha, matdescra, val,
indx, pntrb, pntre, b, ldb, beta, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, k, lb, ldb, ldc
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha, beta
REAL val(*), b(ldb,*), c(ldc,*)
mkl_?cscmm
Computes matrix-matrix product of a sparse matrix
stored in the CSC format.
Syntax
call mkl_scscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
call mkl_dcscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
call mkl_ccscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
call mkl_zcscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, beta, c, ldc)
Include Files
mkl.fi
Description
C := alpha*A*B + beta*C
or
C := alpha*A^T*B + beta*C
or
C := alpha*A^H*B + beta*C,
where:
alpha and beta are scalars,
B and C are dense matrices, A is an m-by-k sparse matrix in compressed sparse column (CSC) format, A^T is
the transpose of A, and A^H is the conjugate transpose of A.
NOTE
This routine supports CSC format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
indx INTEGER. Array containing the row indices for each non-zero element of the
matrix A.
Its length is equal to length of the val array.
For zero-based indexing this array contains column indices, such that
pntrb(i) - pntrb(0) is the first index of column i in the arrays val and
indx.
Refer to pointerb array description in CSC Format for more details.
For zero-based indexing this array contains column indices, such that
pntre(i) - pntrb(0) - 1 is the last index of column i in the arrays val
and indx.
On entry with transa = 'N' or 'n', the leading k-by-n part of the array b
must contain the matrix B, otherwise the leading m-by-n part of the array b
must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b for one-based indexing, and
the second dimension of b for zero-based indexing, as declared in the
calling (sub)program.
ldc INTEGER. Specifies the leading dimension of c for one-based indexing, and
the second dimension of c for zero-based indexing, as declared in the
calling (sub)program.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscmm(transa, m, n, k, alpha, matdescra, val, indx,
pntrb, pntre, b, ldb, beta, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, k, ldb, ldc
INTEGER indx(*), pntrb(k), pntre(k)
REAL alpha, beta
REAL val(*), b(ldb,*), c(ldc,*)
mkl_?coomm
Computes matrix-matrix product of a sparse matrix
stored in the coordinate format.
Syntax
call mkl_scoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz, b, ldb, beta, c, ldc)
call mkl_dcoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz, b, ldb, beta, c, ldc)
call mkl_ccoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz, b, ldb, beta, c, ldc)
call mkl_zcoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz, b, ldb, beta, c, ldc)
Include Files
mkl.fi
Description
C := alpha*A*B + beta*C
or
C := alpha*A^T*B + beta*C
or
C := alpha*A^H*B + beta*C,
where:
alpha and beta are scalars,
B and C are dense matrices, A is an m-by-k sparse matrix in the coordinate format, A^T is the transpose of A,
and A^H is the conjugate transpose of A.
NOTE
This routine supports a coordinate format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each
non-zero element of the matrix A.
Refer to columns array description in Coordinate Format for more details.
On entry with transa = 'N' or 'n', the leading k-by-n part of the array b
must contain the matrix B, otherwise the leading m-by-n part of the array b
must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b for one-based indexing, and
the second dimension of b for zero-based indexing, as declared in the
calling (sub)program.
ldc INTEGER. Specifies the leading dimension of c for one-based indexing, and
the second dimension of c for zero-based indexing, as declared in the
calling (sub)program.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoomm(transa, m, n, k, alpha, matdescra, val,
rowind, colind, nnz, b, ldb, beta, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, k, ldb, ldc, nnz
INTEGER rowind(*), colind(*)
REAL alpha, beta
REAL val(*), b(ldb,*), c(ldc,*)
mkl_?csrsm
Solves a system of linear matrix equations for a
sparse matrix in the CSR format.
Syntax
call mkl_scsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
call mkl_dcsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
call mkl_ccsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
call mkl_zcsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
Include Files
mkl.fi
Description
The mkl_?csrsm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the CSR format:
C := alpha*inv(A)*B
or
C := alpha*inv(A^T)*B,
where:
alpha is a scalar, B and C are dense matrices, A is a sparse upper or lower triangular matrix with unit or
non-unit main diagonal, and A^T is the transpose of A.
NOTE
This routine supports a CSR format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
indx INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to length of the val array.
NOTE
Column indices must be sorted in increasing order for each row.
ldb INTEGER. Specifies the leading dimension of b for one-based indexing, and
the second dimension of b for zero-based indexing, as declared in the
calling (sub)program.
ldc INTEGER. Specifies the leading dimension of c for one-based indexing, and
the second dimension of c for zero-based indexing, as declared in the
calling (sub)program.
Output Parameters
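c REAL for mkl_scsrsm.
DOUBLE PRECISION for mkl_dcsrsm.
COMPLEX for mkl_ccsrsm.
DOUBLE COMPLEX for mkl_zcsrsm.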
Array, size ldc by n for one-based indexing, and (m, ldc) for zero-based
indexing.
The leading m-by-n part of the array c contains the output matrix C.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsm(transa, m, n, alpha, matdescra, val, indx,
pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, ldb, ldc
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha
REAL val(*), b(ldb,*), c(ldc,*)
mkl_?cscsm
Solves a system of linear matrix equations for a
sparse matrix in the CSC format.
Syntax
call mkl_scscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
call mkl_dcscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
call mkl_ccscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
call mkl_zcscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
Include Files
mkl.fi
Description
The mkl_?cscsm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the CSC format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
alpha is scalar, B and C are dense matrices, A is a sparse upper or lower triangular matrix with unit or non-
unit main diagonal, AT is the transpose of A.
NOTE
This routine supports a CSC format both with one-based indexing and zero-based indexing.
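To illustrate the operation only, the following sketch computes C := alpha*inv(A)*B for a lower triangular A with non-unit diagonal, stored in the CSC format with one-based indexing (the transa = 'N' case). It is a plain Fortran reference loop, not the MKL implementation. The names csc_sketch and csc_lower_solve are hypothetical, and the sketch assumes pntrb(1) = 1, so that column j occupies positions pntrb(j) through pntre(j) - 1 of val and indx, with the diagonal element stored first in each column.

```fortran
module csc_sketch
  implicit none
contains
  ! Reference forward substitution for C := alpha*inv(A)*B, where A is
  ! m-by-m lower triangular with non-unit diagonal in CSC storage
  ! (one-based). Assumption: column j occupies val(pntrb(j):pntre(j)-1).
  subroutine csc_lower_solve(m, n, alpha, val, indx, pntrb, pntre, b, ldb, c, ldc)
    integer, intent(in) :: m, n, ldb, ldc
    integer, intent(in) :: indx(*), pntrb(m), pntre(m)
    double precision, intent(in) :: alpha, val(*), b(ldb,*)
    double precision, intent(out) :: c(ldc,*)
    integer :: j, k, p
    do k = 1, n
      c(1:m,k) = alpha*b(1:m,k)
      do j = 1, m
        ! Row indices are sorted, so the diagonal entry comes first.
        c(j,k) = c(j,k)/val(pntrb(j))
        ! Eliminate the solved unknown from the rows below it.
        do p = pntrb(j) + 1, pntre(j) - 1
          c(indx(p),k) = c(indx(p),k) - val(p)*c(j,k)
        end do
      end do
    end do
  end subroutine csc_lower_solve
end module csc_sketch
```

For the matrix with rows (2, 0, 0), (1, 3, 0), (0, 4, 5), the arrays are val = (2, 1, 3, 4, 5), indx = (1, 2, 2, 3, 3), pntrb = (1, 3, 5), and pntre = (3, 5, 6). With alpha = 1 and a single right-hand side (2, 7, 23), the sketch returns (1, 2, 3).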
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
NOTE
The non-zero elements of a given column of the matrix must be stored
in the same order as they appear in the column (from top to bottom).
indx INTEGER. Array containing the row indices for each non-zero element of the
matrix A. Its length is equal to length of the val array.
NOTE
Row indices must be sorted in increasing order for each column.
pntrb INTEGER. Array of length m.
For zero-based indexing this array contains column indices, such that
pntrb(I) - pntrb(0) is the first index of column I in the arrays val and
indx.
Refer to pointerb array description in CSC Format for more details.
ldb INTEGER. Specifies the leading dimension of b for one-based indexing, and
the second dimension of b for zero-based indexing, as declared in the
calling (sub)program.
ldc INTEGER. Specifies the leading dimension of c for one-based indexing, and
the second dimension of c for zero-based indexing, as declared in the
calling (sub)program.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscsm(transa, m, n, alpha, matdescra, val, indx,
pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, ldb, ldc
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha
REAL val(*), b(ldb,*), c(ldc,*)
mkl_?coosm
Solves a system of linear matrix equations for a
sparse matrix in the coordinate format.
Syntax
call mkl_scoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
call mkl_dcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
call mkl_ccoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
call mkl_zcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
Include Files
mkl.fi
Description
The mkl_?coosm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the coordinate format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
alpha is scalar, B and C are dense matrices, A is a sparse upper or lower triangular matrix with unit or non-
unit main diagonal, AT is the transpose of A.
NOTE
This routine supports a coordinate format both with one-based indexing and zero-based indexing.
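As a rough illustration of the operation, the sketch below computes C := alpha*inv(A)*B for a lower triangular A with non-unit diagonal in the coordinate format with one-based indexing (the transa = 'N' case). It is a naive reference loop written for clarity, not the MKL implementation, and the names coo_sketch and coo_lower_solve are hypothetical. Because coordinate entries may appear in any order, the sketch scans all nnz entries for each row.

```fortran
module coo_sketch
  implicit none
contains
  ! Reference forward substitution for C := alpha*inv(A)*B, where A is
  ! m-by-m lower triangular with non-unit diagonal in coordinate format
  ! (one-based). Each row is located by scanning all nnz entries, so the
  ! cost is O(m*nnz) per column of B.
  subroutine coo_lower_solve(m, n, alpha, val, rowind, colind, nnz, b, ldb, c, ldc)
    integer, intent(in) :: m, n, nnz, ldb, ldc
    integer, intent(in) :: rowind(nnz), colind(nnz)
    double precision, intent(in) :: alpha, val(nnz), b(ldb,*)
    double precision, intent(out) :: c(ldc,*)
    double precision :: s, d
    integer :: i, k, p
    do k = 1, n
      do i = 1, m
        s = alpha*b(i,k)
        d = 0d0
        do p = 1, nnz
          if (rowind(p) /= i) cycle
          if (colind(p) == i) then
            d = val(p)                     ! diagonal entry A(i,i)
          else if (colind(p) < i) then
            s = s - val(p)*c(colind(p),k)  ! already-solved unknowns
          end if
        end do
        ! A stored diagonal entry is required in the non-unit case.
        c(i,k) = s/d
      end do
    end do
  end subroutine coo_lower_solve
end module coo_sketch
```

The division by d shows why a non-unit solve needs every diagonal element to be present in the storage: a missing diagonal entry would leave d equal to zero.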
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
transa CHARACTER*1. Specifies the system of linear equations.
If transa = 'N' or 'n', then the matrix-matrix product is computed as
C := alpha*inv(A)*B
If transa = 'T' or 't' or 'C' or 'c', then the matrix-matrix product is
computed as C := alpha*inv(AT)*B.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each non-
zero element of the matrix A.
Refer to columns array description in Coordinate Format for more details.
b Array, size ldb by n for one-based indexing, and (m, ldb) for zero-based
indexing.
Before entry the leading m-by-n part of the array b must contain the matrix
B.
ldb INTEGER. Specifies the leading dimension of b for one-based indexing, and
the second dimension of b for zero-based indexing, as declared in the
calling (sub)program.
ldc INTEGER. Specifies the leading dimension of c for one-based indexing, and
the second dimension of c for zero-based indexing, as declared in the
calling (sub)program.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, ldb, ldc, nnz
INTEGER rowind(*), colind(*)
REAL alpha
REAL val(*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_dcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, ldb, ldc, nnz
INTEGER rowind(*), colind(*)
DOUBLE PRECISION alpha
DOUBLE PRECISION val(*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_ccoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, ldb, ldc, nnz
INTEGER rowind(*), colind(*)
COMPLEX alpha
COMPLEX val(*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_zcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, ldb, ldc, nnz
INTEGER rowind(*), colind(*)
DOUBLE COMPLEX alpha
DOUBLE COMPLEX val(*), b(ldb,*), c(ldc,*)
mkl_?bsrsm
Solves a system of linear matrix equations for a
sparse matrix in the BSR format.
Syntax
call mkl_sbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
call mkl_dbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
call mkl_cbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
call mkl_zbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
Include Files
mkl.fi
Description
The mkl_?bsrsm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the BSR format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
alpha is scalar, B and C are dense matrices, A is a sparse upper or lower triangular matrix with unit or non-
unit main diagonal, AT is the transpose of A.
NOTE
This routine supports a BSR format both with one-based indexing and zero-based indexing.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
indx INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the number of non-zero blocks in the matrix A.
Refer to the columns array description in BSR Format for more details.
ldb INTEGER. Specifies the leading dimension (in blocks) of b as declared in the
calling (sub)program.
ldc INTEGER. Specifies the leading dimension (in blocks) of c as declared in the
calling (sub)program.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, lb, ldb, ldc
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha
REAL val(*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_dbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, lb, ldb, ldc
INTEGER indx(*), pntrb(m), pntre(m)
DOUBLE PRECISION alpha
DOUBLE PRECISION val(*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_cbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, lb, ldb, ldc
INTEGER indx(*), pntrb(m), pntre(m)
COMPLEX alpha
COMPLEX val(*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_zbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, lb, ldb, ldc
INTEGER indx(*), pntrb(m), pntre(m)
DOUBLE COMPLEX alpha
DOUBLE COMPLEX val(*), b(ldb,*), c(ldc,*)
mkl_?diamv
Computes matrix-vector product for a sparse matrix
in the diagonal format with one-based indexing.
Syntax
call mkl_sdiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x, beta, y)
call mkl_ddiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x, beta, y)
call mkl_cdiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x, beta, y)
call mkl_zdiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x, beta, y)
Include Files
mkl.fi
Description
The mkl_?diamv routine performs a matrix-vector operation defined as
y := alpha*A*x + beta*y
or
y := alpha*AT*x + beta*y,
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-k sparse matrix stored in the diagonal format, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
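The following sketch shows, for illustration only, how y := alpha*A*x + beta*y can be evaluated for a general matrix in the diagonal format (the transa = 'N' case). It is a plain Fortran reference loop, not the MKL implementation, and the names dia_sketch and dia_matvec are hypothetical. It assumes the layout in which val(i,j) holds the element in row i on the diagonal at distance idiag(j) from the main diagonal, that is, A(i, i+idiag(j)), with padding where that column falls outside the matrix.

```fortran
module dia_sketch
  implicit none
contains
  ! Reference loop for y := alpha*A*x + beta*y with an m-by-k matrix A
  ! in diagonal storage. Assumption: val(i,j) holds A(i, i+idiag(j)).
  subroutine dia_matvec(m, k, alpha, val, lval, idiag, ndiag, x, beta, y)
    integer, intent(in) :: m, k, lval, ndiag
    integer, intent(in) :: idiag(ndiag)
    double precision, intent(in) :: alpha, beta, val(lval,*), x(*)
    double precision, intent(inout) :: y(*)
    integer :: i, j, col
    y(1:m) = beta*y(1:m)
    do j = 1, ndiag
      do i = 1, m
        col = i + idiag(j)
        ! Skip padding positions that fall outside the matrix.
        if (col >= 1 .and. col <= k) y(i) = y(i) + alpha*val(i,j)*x(col)
      end do
    end do
  end subroutine dia_matvec
end module dia_sketch
```

For the matrix with rows (1, 2, 0), (0, 3, 4), (5, 0, 6), the non-zero diagonals have distances idiag = (-2, 0, 1), and the columns of val are (*, *, 5), (1, 3, 6), and (2, 4, *), where * marks padding.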
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
idiag INTEGER. Array of length ndiag, contains the distances between main
diagonal and each non-zero diagonal in the matrix A.
Refer to distance array description in Diagonal Storage Scheme for more
details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiamv(transa, m, k, alpha, matdescra, val, lval, idiag,
ndiag, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k, lval, ndiag
INTEGER idiag(*)
REAL alpha, beta
REAL val(lval,*), x(*), y(*)
mkl_?skymv
Computes matrix-vector product for a sparse matrix
in the skyline storage format with one-based indexing.
Syntax
call mkl_sskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
call mkl_dskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
call mkl_cskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
call mkl_zskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
Include Files
mkl.fi
Description
The mkl_?skymv routine performs a matrix-vector operation defined as
y := alpha*A*x + beta*y
or
y := alpha*AT*x + beta*y,
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-k sparse matrix stored using the skyline storage scheme, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
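As an illustration of the storage scheme, the sketch below evaluates y := alpha*A*x + beta*y for a lower triangular A held in skyline storage (the transa = 'N' case). It is a plain Fortran reference loop, not the MKL implementation, and the names sky_sketch and sky_lower_matvec are hypothetical. It assumes that pntr has m + 1 elements with pntr(1) = 1, and that row i occupies val(pntr(i)) through val(pntr(i+1) - 1), running from the first non-zero element of the row up to and including the diagonal element.

```fortran
module sky_sketch
  implicit none
contains
  ! Reference loop for y := alpha*A*x + beta*y, where A is m-by-m lower
  ! triangular in skyline storage. Assumption: row i occupies
  ! val(pntr(i):pntr(i+1)-1) and ends with its diagonal element.
  subroutine sky_lower_matvec(m, alpha, val, pntr, x, beta, y)
    integer, intent(in) :: m
    integer, intent(in) :: pntr(m+1)
    double precision, intent(in) :: alpha, beta, val(*), x(*)
    double precision, intent(inout) :: y(*)
    integer :: i, p, col
    do i = 1, m
      y(i) = beta*y(i)
      ! The last stored element of row i is in column i, so the first
      ! stored element is in column i - (row length) + 1.
      col = i - (pntr(i+1) - pntr(i)) + 1
      do p = pntr(i), pntr(i+1) - 1
        y(i) = y(i) + alpha*val(p)*x(col)
        col = col + 1
      end do
    end do
  end subroutine sky_lower_matvec
end module sky_sketch
```

For the matrix with rows (1, 0, 0), (0, 2, 0), (3, 0, 4), this layout gives val = (1, 2, 3, 0, 4) and pntr = (1, 2, 3, 6). The zero between 3 and 4 is stored explicitly because it lies inside the profile of the third row.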
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
transa CHARACTER*1. Specifies the operation.
If transa = 'N' or 'n', then y := alpha*A*x + beta*y
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
NOTE
General matrices (matdescra(1)='G') are not supported.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k
INTEGER pntr(*)
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_cskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k
INTEGER pntr(*)
COMPLEX alpha, beta
COMPLEX val(*), x(*), y(*)
mkl_?diasv
Solves a system of linear equations for a sparse
matrix in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
call mkl_ddiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
call mkl_cdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
call mkl_zdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
Include Files
mkl.fi
Description
The mkl_?diasv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the diagonal format:
y := alpha*inv(A)*x
or
y := alpha*inv(AT)* x,
where:
alpha is scalar, x and y are vectors, A is a sparse upper or lower triangular matrix with unit or non-unit main
diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
idiag INTEGER. Array of length ndiag, contains the distances between main
diagonal and each non-zero diagonal in the matrix A.
NOTE
All elements of this array must be sorted in increasing order.
ndiag INTEGER. Specifies the number of non-zero diagonals of the matrix A.
On entry, the array x must contain the vector x. The elements are accessed
with unit increment.
On entry, the array y must contain the vector y. The elements are accessed
with unit increment.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, lval, ndiag
INTEGER idiag(*)
REAL alpha
REAL val(lval,*), x(*), y(*)
mkl_?skysv
Solves a system of linear equations for a sparse
matrix in the skyline format with one-based indexing.
Syntax
call mkl_sskysv(transa, m, alpha, matdescra, val, pntr, x, y)
call mkl_dskysv(transa, m, alpha, matdescra, val, pntr, x, y)
call mkl_cskysv(transa, m, alpha, matdescra, val, pntr, x, y)
call mkl_zskysv(transa, m, alpha, matdescra, val, pntr, x, y)
Include Files
mkl.fi
Description
The mkl_?skysv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the skyline storage format:
y := alpha*inv(A)*x
or
y := alpha*inv(AT)*x,
where:
alpha is scalar, x and y are vectors, A is a sparse upper or lower triangular matrix with unit or non-unit main
diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
NOTE
General matrices (matdescra(1)='G') are not supported.
pntr INTEGER. Array containing the indices that specify the positions in val of the first
element in each row (column) of the matrix A. Refer to pointers array
description in Skyline Storage Scheme for more details.
On entry, the array x must contain the vector x. The elements are accessed
with unit increment.
On entry, the array y must contain the vector y. The elements are accessed
with unit increment.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskysv(transa, m, alpha, matdescra, val, pntr, x, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m
INTEGER pntr(*)
REAL alpha
REAL val(*), x(*), y(*)
SUBROUTINE mkl_cskysv(transa, m, alpha, matdescra, val, pntr, x, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m
INTEGER pntr(*)
COMPLEX alpha
COMPLEX val(*), x(*), y(*)
mkl_?diamm
Computes matrix-matrix product of a sparse matrix
stored in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag, b, ldb,
beta, c, ldc)
call mkl_ddiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag, b, ldb,
beta, c, ldc)
call mkl_cdiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag, b, ldb,
beta, c, ldc)
call mkl_zdiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag, b, ldb,
beta, c, ldc)
Include Files
mkl.fi
Description
The mkl_?diamm routine performs a matrix-matrix operation defined as
C := alpha*A*B + beta*C
or
C := alpha*AT*B + beta*C,
or
C := alpha*AH*B + beta*C,
where:
alpha and beta are scalars,
B and C are dense matrices, A is an m-by-k sparse matrix in the diagonal format, AT is the transpose of A,
and AH is the conjugate transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
idiag INTEGER. Array of length ndiag, contains the distances between main
diagonal and each non-zero diagonal in the matrix A.
Refer to distance array description in Diagonal Storage Scheme for more
details.
On entry with transa = 'N' or 'n', the leading k-by-n part of the array b
must contain the matrix B, otherwise the leading m-by-n part of the array b
must contain the matrix B.
On entry with transa = 'N' or 'n', the leading m-by-n part of the array c
must contain the matrix C, otherwise the leading k-by-n part of the array c
must contain the matrix C.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiamm(transa, m, n, k, alpha, matdescra, val, lval,
idiag, ndiag, b, ldb, beta, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, k, ldb, ldc, lval, ndiag
INTEGER idiag(*)
REAL alpha, beta
REAL val(lval,*), b(ldb,*), c(ldc,*)
mkl_?skymm
Computes matrix-matrix product of a sparse matrix
stored using the skyline storage scheme with one-
based indexing.
Syntax
call mkl_sskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta, c, ldc)
call mkl_dskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta, c, ldc)
call mkl_cskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta, c, ldc)
call mkl_zskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta, c, ldc)
Include Files
mkl.fi
Description
The mkl_?skymm routine performs a matrix-matrix operation defined as
C := alpha*A*B + beta*C
or
C := alpha*AT*B + beta*C,
or
C := alpha*AH*B + beta*C,
where:
alpha and beta are scalars,
B and C are dense matrices, A is an m-by-k sparse matrix in the skyline storage format, AT is the transpose
of A, and AH is the conjugate transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
NOTE
General matrices (matdescra(1)='G') are not supported.
On entry with transa = 'N' or 'n', the leading k-by-n part of the array b
must contain the matrix B, otherwise the leading m-by-n part of the array b
must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling
(sub)program.
On entry with transa = 'N' or 'n', the leading m-by-n part of the array c
must contain the matrix C, otherwise the leading k-by-n part of the array c
must contain the matrix C.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskymm(transa, m, n, k, alpha, matdescra, val, pntr, b,
ldb, beta, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, k, ldb, ldc
INTEGER pntr(*)
REAL alpha, beta
REAL val(*), b(ldb,*), c(ldc,*)
mkl_?diasm
Solves a system of linear matrix equations for a
sparse matrix in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b, ldb, c,
ldc)
call mkl_ddiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b, ldb, c,
ldc)
call mkl_cdiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b, ldb, c,
ldc)
call mkl_zdiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b, ldb, c,
ldc)
Include Files
mkl.fi
Description
The mkl_?diasm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the diagonal format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
alpha is scalar, B and C are dense matrices, A is a sparse upper or lower triangular matrix with unit or non-
unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
idiag INTEGER. Array of length ndiag, contains the distances between main
diagonal and each non-zero diagonal in the matrix A.
NOTE
All elements of this array must be sorted in increasing order.
On entry the leading m-by-n part of the array b must contain the matrix B.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiasm(transa, m, n, alpha, matdescra, val, lval, idiag,
ndiag, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, ldb, ldc, lval, ndiag
INTEGER idiag(*)
REAL alpha
REAL val(lval,*), b(ldb,*), c(ldc,*)
mkl_?skysm
Solves a system of linear matrix equations for a
sparse matrix stored using the skyline storage scheme
with one-based indexing.
Syntax
call mkl_sskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
call mkl_dskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
call mkl_cskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
call mkl_zskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
Include Files
mkl.fi
Description
The mkl_?skysm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the skyline storage format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
alpha is scalar, B and C are dense matrices, A is a sparse upper or lower triangular matrix with unit or non-
unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
Specifies the scalar alpha.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
NOTE
General matrices (matdescra(1)='G') are not supported.
On entry the leading m-by-n part of the array b must contain the matrix B.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_zskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, n, ldb, ldc
INTEGER pntr(*)
DOUBLE COMPLEX alpha
DOUBLE COMPLEX val(*), b(ldb,*), c(ldc,*)
mkl_?dnscsr
Converts a sparse matrix in uncompressed
representation to the CSR format and vice versa.
Syntax
call mkl_sdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
call mkl_ddnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
call mkl_cdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
call mkl_zdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
Include Files
mkl.fi
Description
This routine converts a sparse matrix A between two representations: a rectangular array (dense
representation) and the compressed sparse row (CSR) format (3-array variation).
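The dense-to-CSR direction can be sketched as follows. This is a plain Fortran reference loop for illustration, not the MKL implementation. The names dnscsr_sketch and dense_to_csr are hypothetical, the job array is replaced by a single nzmax argument, and adns is declared as a two-dimensional array adns(lda,*) for readability. One-based indexing and conversion of the whole matrix are assumed.

```fortran
module dnscsr_sketch
  implicit none
contains
  ! Reference conversion of a dense m-by-n matrix to 3-array CSR with
  ! one-based indexing. On success ia(m+1) equals the number of stored
  ! non-zeros plus one. info = i if row i did not fit into nzmax entries.
  subroutine dense_to_csr(m, n, adns, lda, nzmax, acsr, ja, ia, info)
    integer, intent(in) :: m, n, lda, nzmax
    double precision, intent(in) :: adns(lda,*)
    double precision, intent(out) :: acsr(*)
    integer, intent(out) :: ja(*), ia(m+1), info
    integer :: i, j, nz
    info = 0
    nz = 0
    do i = 1, m
      ia(i) = nz + 1              ! position of the first entry of row i
      do j = 1, n
        if (adns(i,j) /= 0d0) then
          if (nz == nzmax) then
            info = i              ! out of space while processing row i
            return
          end if
          nz = nz + 1
          acsr(nz) = adns(i,j)
          ja(nz) = j
        end if
      end do
    end do
    ia(m+1) = nz + 1
  end subroutine dense_to_csr
end module dnscsr_sketch
```

For the 2-by-3 matrix with rows (1, 0, 2) and (0, 0, 3), the sketch produces acsr = (1, 2, 3), ja = (1, 3, 3), and ia = (1, 3, 4). Scanning each row from left to right yields column indices in increasing order within the row.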
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
job INTEGER
Array, contains the following conversion parameters:
if job(3)=1, one-based indexing for the matrix in CSR format is
used.
job(4): Portion of matrix.
If job(4)=0, adns is a lower triangular part of matrix A;
If job(4)=1, adns is an upper triangular part of matrix A;
If job(4)=2, adns is a whole matrix A.
job(5)=nzmax: maximum number of the non-zero elements allowed if
job(1)=0.
job(6): job indicator for conversion to CSR format.
If job(6)=0, only array ia is generated for the output storage.
If job(6)>0, arrays acsr, ia, ja are generated for the output
storage.
adns (input/output)
REAL for mkl_sdnscsr.
DOUBLE PRECISION for mkl_ddnscsr.
COMPLEX for mkl_cdnscsr.
DOUBLE COMPLEX for mkl_zdnscsr.
If the conversion type is from uncompressed to CSR, on input adns
contains an uncompressed (dense) representation of matrix A.
lda INTEGER. Specifies the leading dimension of adns as declared in the calling
(sub)program.
For zero-based indexing of A, lda must be at least max(1, n).
acsr (input/output)
REAL for mkl_sdnscsr.
DOUBLE PRECISION for mkl_ddnscsr.
COMPLEX for mkl_cdnscsr.
DOUBLE COMPLEX for mkl_zdnscsr.
If conversion type is from CSR to uncompressed, on input acsr contains
the non-zero elements of the matrix A. Its length is equal to the number of
non-zero elements in the matrix A. Refer to values array description in
Sparse Matrix Storage Formats for more details.
ia (input/output) INTEGER. Array of length m + 1.
The value of ia(m + 1) is equal to the number of non-zeros plus one. Refer
to rowIndex array description in Sparse Matrix Storage Formats for more
details.
Output Parameters
acsr, ja, ia If conversion type is from uncompressed to CSR, on output acsr, ja, and
ia contain the compressed sparse row (CSR) format (3-array variation) of
matrix A (see Sparse Matrix Storage Formats for a description of the
storage format).
info INTEGER. Integer info indicator only for restoring the matrix A from the CSR
format.
If info=0, the execution is successful.
If info=i, the routine is interrupted processing the i-th row because there
is no space in the arrays acsr and ja according to the value nzmax.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
INTEGER job(8)
INTEGER m, n, lda, info
INTEGER ja(*), ia(m+1)
REAL adns(*), acsr(*)
mkl_?csrcoo
Converts a sparse matrix in the CSR format to the
coordinate format and vice versa.
Syntax
call mkl_scsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
call mkl_dcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
call mkl_ccsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
call mkl_zcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
Include Files
mkl.fi
Description
This routine converts a sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to coordinate format and vice versa.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
job INTEGER
Array, contains the following conversion parameters:
job(1)
If job(1)=0, the matrix in the CSR format is converted to the coordinate
format;
if job(1)=1, the matrix in the coordinate format is converted to the CSR
format.
if job(1)=2, the matrix in the coordinate format is converted to the CSR
format, and the column indices in CSR representation are sorted in the
increasing order within each row.
job(2)
If job(2)=0, zero-based indexing for the matrix in CSR format is used;
job(3)
If job(3)=0, zero-based indexing for the matrix in coordinate format is
used;
if job(3)=1, one-based indexing for the matrix in coordinate format is
used.
job(5)
job(5)=nzmax - maximum number of the non-zero elements allowed if
job(1)=0.
job(6) - job indicator.
For conversion to the coordinate format:
If job(6)=1, only array rowind is filled in for the output storage.
If job(6)=2, arrays rowind, colind are filled in for the output storage.
If job(6)=3, all arrays rowind, colind, acoo are filled in for the output
storage.
For conversion to the CSR format:
If job(6)=0, all arrays acsr, ja, ia are filled in for the output storage.
If job(6)=2, it is assumed that the routine has already been called
with job(6)=1 and that the user has allocated the required space for
storing the output arrays acsr and ja.
nnz INTEGER. Specifies the number of non-zero elements of the matrix A for
job(1)≠0.
Refer to nnz description in Coordinate Format for more details.
acsr (input/output)
REAL for mkl_scsrcoo.
DOUBLE PRECISION for mkl_dcsrcoo.
COMPLEX for mkl_ccsrcoo.
DOUBLE COMPLEX for mkl_zcsrcoo.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
ja (input/output) INTEGER. Array containing the column indices for each non-
zero element of the matrix A.
Its length is equal to the length of the array acsr. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
acoo (input/output)
REAL for mkl_scsrcoo.
DOUBLE PRECISION for mkl_dcsrcoo.
COMPLEX for mkl_ccsrcoo.
DOUBLE COMPLEX for mkl_zcsrcoo.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
rowind (input/output) INTEGER. Array of length nnz, contains the row indices for
each non-zero element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind (input/output) INTEGER. Array of length nnz, contains the column indices for
each non-zero element of the matrix A. Refer to columns array description
in Coordinate Format for more details.
Output Parameters
nnz Returns the number of converted elements of the matrix A for job(1)=0.
info INTEGER. Integer info indicator only for converting the matrix A from the
CSR format.
If info=0, the execution is successful.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
INTEGER job(8)
INTEGER n, nnz, info
INTEGER ja(*), ia(n+1), rowind(*), colind(*)
REAL acsr(*), acoo(*)
SUBROUTINE mkl_dcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
INTEGER job(8)
INTEGER n, nnz, info
INTEGER ja(*), ia(n+1), rowind(*), colind(*)
DOUBLE PRECISION acsr(*), acoo(*)
SUBROUTINE mkl_ccsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
INTEGER job(8)
INTEGER n, nnz, info
INTEGER ja(*), ia(n+1), rowind(*), colind(*)
COMPLEX acsr(*), acoo(*)
SUBROUTINE mkl_zcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
INTEGER job(8)
INTEGER n, nnz, info
INTEGER ja(*), ia(n+1), rowind(*), colind(*)
DOUBLE COMPLEX acsr(*), acoo(*)
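Example (illustrative sketch). The following plain Fortran program does not call MKL. It shows, for a small matrix with one-based indexing assumed on both sides, the arrays that a CSR-to-coordinate conversion with job(1)=0 and job(6)=3 is expected to fill. The program name and the sample data are hypothetical.

```fortran
program csr_to_coo_sketch
  implicit none
  integer, parameter :: n = 3, nnz = 5
  ! One-based CSR (3-array variation) of
  !   | 1 0 2 |
  !   | 0 3 0 |
  !   | 4 0 5 |
  double precision :: acsr(nnz) = [1d0, 2d0, 3d0, 4d0, 5d0]
  integer :: ja(nnz) = [1, 3, 2, 1, 3]
  integer :: ia(n+1) = [1, 3, 4, 6]
  double precision :: acoo(nnz)
  integer :: rowind(nnz), colind(nnz)
  integer :: i, k

  ! Row i owns entries ia(i) .. ia(i+1)-1; expand that range into
  ! one explicit row index per stored entry.
  do i = 1, n
    do k = ia(i), ia(i+1) - 1
      rowind(k) = i
      colind(k) = ja(k)
      acoo(k)   = acsr(k)
    end do
  end do

  ! rowind = 1 1 2 3 3, colind = 1 3 2 1 3, acoo = 1 2 3 4 5
  do k = 1, nnz
    print '(2i4, f6.1)', rowind(k), colind(k), acoo(k)
  end do
end program csr_to_coo_sketch
```

With MKL linked, the corresponding request would go through the call shown in the Syntax section, mkl_dcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info).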
mkl_?csrbsr
Converts a square sparse matrix in the CSR format to
the BSR format and vice versa.
Syntax
call mkl_scsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
call mkl_dcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
call mkl_ccsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
call mkl_zcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
Include Files
mkl.fi
Description
This routine converts a square sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to the block sparse row (BSR) format and vice versa.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
job INTEGER
Array, contains the following conversion parameters:
job(1)
If job(1)=0, the matrix in the CSR format is converted to the BSR format;
if job(1)=1, the matrix in the BSR format is converted to the CSR format.
job(2)
If job(2)=0, zero-based indexing for the matrix in CSR format is used;
job(3)
If job(3)=0, zero-based indexing for the matrix in the BSR format is used;
if job(3)=1, one-based indexing for the matrix in the BSR format is used.
job(4) is only used for conversion to CSR format. By default, the converter
saves the blocks without checking whether an element is zero or not. If
job(4)=1, then the converter only saves non-zero elements in blocks.
job(6) - job indicator.
For conversion to the BSR format:
If job(6)=0, only arrays jab, iab are generated for the output storage.
If job(6)>0, all output arrays absr, jab, and iab are filled in for the
output storage.
If job(6)=-1, iab(1) returns the number of non-zero blocks.
m INTEGER. Actual row dimension of the matrix A for conversion to the BSR
format; block row dimension of the matrix A for conversion to the CSR format.
ldabsr INTEGER. Leading dimension of the array absr as declared in the calling
program. ldabsr must be greater than or equal to mblk*mblk.
acsr (input/output)
REAL for mkl_scsrbsr.
DOUBLE PRECISION for mkl_dcsrbsr.
COMPLEX for mkl_ccsrbsr.
DOUBLE COMPLEX for mkl_zcsrbsr.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
absr (input/output)
REAL for mkl_scsrbsr.
DOUBLE PRECISION for mkl_dcsrbsr.
COMPLEX for mkl_ccsrbsr.
DOUBLE COMPLEX for mkl_zcsrbsr.
Array containing elements of non-zero blocks of the matrix A. Its length is
equal to the number of non-zero blocks in the matrix A multiplied by
mblk*mblk. Refer to values array description in BSR Format for more
details.
jab (input/output) INTEGER. Array containing the column indices for each
non-zero block of the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
Output Parameters
info INTEGER. Integer info indicator only for converting the matrix A from the
CSR format.
If info=0, the execution is successful.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
INTEGER job(8)
INTEGER m, mblk, ldabsr, info
INTEGER ja(*), ia(m+1), jab(*), iab(*)
REAL acsr(*), absr(ldabsr,*)
SUBROUTINE mkl_dcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
INTEGER job(8)
INTEGER m, mblk, ldabsr, info
INTEGER ja(*), ia(m+1), jab(*), iab(*)
DOUBLE PRECISION acsr(*), absr(ldabsr,*)
SUBROUTINE mkl_ccsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
INTEGER job(8)
INTEGER m, mblk, ldabsr, info
INTEGER ja(*), ia(m+1), jab(*), iab(*)
COMPLEX acsr(*), absr(ldabsr,*)
SUBROUTINE mkl_zcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
INTEGER job(8)
INTEGER m, mblk, ldabsr, info
INTEGER ja(*), ia(m+1), jab(*), iab(*)
DOUBLE COMPLEX acsr(*), absr(ldabsr,*)
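Example (illustrative sketch). The program below does not call MKL. Assuming one-based indexing and a matrix order m that is a multiple of mblk, it computes only the block pattern (jab and iab) of a CSR matrix, which is the part of the BSR structure described for job(6)=0. The program name, the helper array used, and the sample matrix are hypothetical.

```fortran
program csr_block_pattern_sketch
  implicit none
  integer, parameter :: m = 4, mblk = 2, mb = m / mblk
  ! One-based CSR column indices and row pointers of
  !   | 1 0 0 0 |
  !   | 0 2 0 0 |
  !   | 0 0 3 4 |
  !   | 5 0 0 6 |
  integer :: ja(6) = [1, 2, 3, 4, 1, 4]
  integer :: ia(m+1) = [1, 2, 3, 5, 7]
  integer :: jab(mb*mb), iab(mb+1)
  logical :: used(mb)
  integer :: ib, i, k, jb, nblk

  nblk = 0
  iab(1) = 1
  do ib = 1, mb
    used = .false.
    ! Scan the mblk scalar rows that make up block row ib and mark
    ! every block column that holds at least one stored entry.
    do i = (ib - 1)*mblk + 1, ib*mblk
      do k = ia(i), ia(i+1) - 1
        used((ja(k) - 1)/mblk + 1) = .true.
      end do
    end do
    ! Emit the occupied block columns in increasing order.
    do jb = 1, mb
      if (used(jb)) then
        nblk = nblk + 1
        jab(nblk) = jb
      end if
    end do
    iab(ib+1) = nblk + 1
  end do

  print '(a, i0)',  'non-zero blocks = ', nblk   ! 3
  print '(a, 3i3)', 'jab =', jab(1:nblk)         ! 1 1 2
  print '(a, 3i3)', 'iab =', iab                 ! 1 2 4
end program csr_block_pattern_sketch
```

The block count printed first is the quantity that the description above associates with job(6)=-1; it tells the caller how many columns of absr(ldabsr,*) to allocate before requesting the block values.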
mkl_?csrcsc
Converts a square sparse matrix in the CSR format to
the CSC format and vice versa.
Syntax
call mkl_scsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
call mkl_dcsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
call mkl_ccsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
call mkl_zcsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
Include Files
mkl.fi
Description
This routine converts a square sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to the compressed sparse column (CSC) format and vice versa.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
job INTEGER
Array, contains the following conversion parameters:
job(1)
If job(1)=0, the matrix in the CSR format is converted to the CSC format;
if job(1)=1, the matrix in the CSC format is converted to the CSR format.
job(2)
If job(2)=0, zero-based indexing for the matrix in CSR format is used;
job(3)
If job(3)=0, zero-based indexing for the matrix in the CSC format is used;
if job(3)=1, one-based indexing for the matrix in the CSC format is used.
If job(6)≠0, all output arrays acsc, ja1, and ia1 are filled in for the
output storage.
For conversion to the CSR format:
If job(6)=0, only arrays ja, ia are filled in for the output storage.
If job(6)≠0, all output arrays acsr, ja, and ia are filled in for the output
storage.
acsr (input/output)
REAL for mkl_scsrcsc.
DOUBLE PRECISION for mkl_dcsrcsc.
COMPLEX for mkl_ccsrcsc.
DOUBLE COMPLEX for mkl_zcsrcsc.
Array containing non-zero elements of the square matrix A. Its length is
equal to the number of non-zero elements in the matrix A. Refer to values
array description in Sparse Matrix Storage Formats for more details.
acsc (input/output)
REAL for mkl_scsrcsc.
DOUBLE PRECISION for mkl_dcsrcsc.
COMPLEX for mkl_ccsrcsc.
DOUBLE COMPLEX for mkl_zcsrcsc.
Array containing non-zero elements of the square matrix A. Its length is
equal to the number of non-zero elements in the matrix A. Refer to values
array description in Sparse Matrix Storage Formats for more details.
ja1 (input/output) INTEGER. Array containing the row indices for each non-zero
element of the matrix A.
Its length is equal to the length of the array acsc. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
INTEGER job(8)
INTEGER m, info
INTEGER ja(*), ia(m+1), ja1(*), ia1(m+1)
REAL acsr(*), acsc(*)
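Example (illustrative sketch). The following plain Fortran program does not call MKL. Assuming one-based indexing, it converts a small square CSR matrix to CSC with a counting pass followed by a scatter pass, which is one common way to obtain the arrays acsc, ja1, and ia1. The program name, the work array next, and the sample data are hypothetical.

```fortran
program csr_to_csc_sketch
  implicit none
  integer, parameter :: m = 3, nnz = 5
  ! One-based CSR of
  !   | 1 0 2 |
  !   | 0 3 0 |
  !   | 4 0 5 |
  double precision :: acsr(nnz) = [1d0, 2d0, 3d0, 4d0, 5d0]
  integer :: ja(nnz) = [1, 3, 2, 1, 3]
  integer :: ia(m+1) = [1, 3, 4, 6]
  double precision :: acsc(nnz)
  integer :: ja1(nnz), ia1(m+1), next(m)
  integer :: i, j, k, p

  ! Count the entries of each column, then prefix-sum the counts
  ! into one-based column pointers.
  ia1 = 0
  do k = 1, nnz
    ia1(ja(k) + 1) = ia1(ja(k) + 1) + 1
  end do
  ia1(1) = 1
  do j = 1, m
    ia1(j+1) = ia1(j+1) + ia1(j)
  end do

  ! Scatter each CSR entry into the next free slot of its column.
  next = ia1(1:m)
  do i = 1, m
    do k = ia(i), ia(i+1) - 1
      j = ja(k)
      p = next(j)
      acsc(p) = acsr(k)
      ja1(p)  = i
      next(j) = p + 1
    end do
  end do

  print '(a, 5f5.1)', 'acsc =', acsc   ! 1 4 3 2 5
  print '(a, 5i3)',   'ja1  =', ja1    ! 1 3 2 1 3
  print '(a, 4i3)',   'ia1  =', ia1    ! 1 3 4 6
end program csr_to_csc_sketch
```

Because rows are visited in increasing order, the row indices within each column come out sorted without a separate sorting step.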
mkl_?csrdia
Converts a sparse matrix in the CSR format to the
diagonal format and vice versa.
Syntax
call mkl_scsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
call mkl_dcsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
call mkl_ccsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
call mkl_zcsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
Include Files
mkl.fi
Description
This routine converts a sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to the diagonal format and vice versa.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
job INTEGER
Array, contains the following conversion parameters:
job(1)
If job(1)=0, the matrix in the CSR format is converted to the diagonal
format;
if job(1)=1, the matrix in the diagonal format is converted to the CSR
format.
job(2)
If job(2)=0, zero-based indexing for the matrix in CSR format is used;
job(3)
If job(3)=0, zero-based indexing for the matrix in the diagonal format is
used;
if job(3)=1, one-based indexing for the matrix in the diagonal format is
used.
job(6) - job indicator.
For conversion to the diagonal format:
If job(6)=0, diagonals are not selected internally, and acsr_rem, ja_rem,
ia_rem are not filled in for the output storage.
If job(6)≠0, the entries of the array adia are not checked for zero values.
acsr (input/output)
REAL for mkl_scsrdia.
DOUBLE PRECISION for mkl_dcsrdia.
COMPLEX for mkl_ccsrdia.
DOUBLE COMPLEX for mkl_zcsrdia.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
adia (input/output)
REAL for mkl_scsrdia.
DOUBLE PRECISION for mkl_dcsrdia.
COMPLEX for mkl_ccsrdia.
DOUBLE COMPLEX for mkl_zcsrdia.
Array of size (ndiag x idiag) containing diagonals of the matrix A.
The key point of the storage is that each element in the array adia retains
the row number of the original matrix. To achieve this, diagonals in the
lower triangular part of the matrix are padded from the top, and those in
the upper triangular part are padded from the bottom.
ndiag INTEGER.
Specifies the leading dimension of the array adia as declared in the calling
(sub)program, must be at least max(1, m).
distance INTEGER.
Array of length idiag, containing the distances between the main diagonal
and each non-zero diagonal to be extracted. The distance is positive if the
diagonal is above the main diagonal, and negative if the diagonal is below
the main diagonal. The main diagonal has a distance equal to zero.
idiag INTEGER.
Number of diagonals to be extracted. For conversion to the diagonal format,
this parameter may be modified on return.
acsr_rem, ja_rem, ia_rem Remainder of the matrix in the CSR format if it is needed for conversion to
the diagonal format.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
INTEGER job(8)
INTEGER m, info, ndiag, idiag
INTEGER ja(*), ia(m+1), distance(*), ja_rem(*), ia_rem(*)
REAL acsr(*), adia(*), acsr_rem(*)
SUBROUTINE mkl_dcsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
INTEGER job(8)
INTEGER m, info, ndiag, idiag
INTEGER ja(*), ia(m+1), distance(*), ja_rem(*), ia_rem(*)
DOUBLE PRECISION acsr(*), adia(*), acsr_rem(*)
SUBROUTINE mkl_ccsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
INTEGER job(8)
INTEGER m, info, ndiag, idiag
INTEGER ja(*), ia(m+1), distance(*), ja_rem(*), ia_rem(*)
COMPLEX acsr(*), adia(*), acsr_rem(*)
SUBROUTINE mkl_zcsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
INTEGER job(8)
INTEGER m, info, ndiag, idiag
INTEGER ja(*), ia(m+1), distance(*), ja_rem(*), ia_rem(*)
DOUBLE COMPLEX acsr(*), adia(*), acsr_rem(*)
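Example (illustrative sketch). The program below does not call MKL. Assuming one-based indexing and a caller-supplied list of distances, it fills an adia array of size ndiag x idiag so that each element keeps the row number of the original matrix, as described for adia and distance above. The program name and the sample data are hypothetical, and adia is declared as a two-dimensional array here for readability.

```fortran
program csr_to_dia_sketch
  implicit none
  integer, parameter :: m = 3, nnz = 5, ndiag = 3, idiag = 3
  ! One-based CSR of
  !   | 1 0 2 |
  !   | 0 3 0 |
  !   | 4 0 5 |
  double precision :: acsr(nnz) = [1d0, 2d0, 3d0, 4d0, 5d0]
  integer :: ja(nnz) = [1, 3, 2, 1, 3]
  integer :: ia(m+1) = [1, 3, 4, 6]
  ! Negative distance: below the main diagonal; positive: above it.
  integer :: distance(idiag) = [-2, 0, 2]
  double precision :: adia(ndiag, idiag)
  integer :: i, k, d

  adia = 0d0
  do i = 1, m
    do k = ia(i), ia(i+1) - 1
      do d = 1, idiag
        ! Entry (i, ja(k)) lies on the diagonal at offset ja(k) - i and
        ! is stored in row i of the matching column of adia.
        if (ja(k) - i == distance(d)) adia(i, d) = acsr(k)
      end do
    end do
  end do

  ! Expected rows of adia:
  !   0 1 2
  !   0 3 0
  !   4 5 0
  do i = 1, m
    print '(3f6.1)', adia(i, :)
  end do
end program csr_to_dia_sketch
```

The first column of adia (distance -2) is padded with zeros from the top and the last column (distance 2) from the bottom, which matches the padding rule stated for adia.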
mkl_?csrsky
Converts a sparse matrix in CSR format to the skyline
format and vice versa.
Syntax
call mkl_scsrsky(job, m, acsr, ja, ia, asky, pointers, info)
call mkl_dcsrsky(job, m, acsr, ja, ia, asky, pointers, info)
call mkl_ccsrsky(job, m, acsr, ja, ia, asky, pointers, info)
call mkl_zcsrsky(job, m, acsr, ja, ia, asky, pointers, info)
Include Files
mkl.fi
Description
This routine converts a sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to the skyline format and vice versa.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
job INTEGER
Array, contains the following conversion parameters:
job(1)
If job(1)=0, the matrix in the CSR format is converted to the skyline
format;
if job(1)=1, the matrix in the skyline format is converted to the CSR
format.
job(2)
If job(2)=0, zero-based indexing for the matrix in CSR format is used;
job(3)
If job(3)=0, zero-based indexing for the matrix in the skyline format is
used;
if job(3)=1, one-based indexing for the matrix in the skyline format is
used.
job(4)
For conversion to the skyline format:
If job(4)=0, the upper part of the matrix A in the CSR format is converted.
If job(4)=1, the lower part of the matrix A in the CSR format is converted.
If job(6)=1, all output arrays asky and pointers are filled in for the
output storage.
acsr (input/output)
REAL for mkl_scsrsky.
DOUBLE PRECISION for mkl_dcsrsky.
COMPLEX for mkl_ccsrsky.
DOUBLE COMPLEX for mkl_zcsrsky.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
asky (input/output)
REAL for mkl_scsrsky.
pointers (input/output) INTEGER.
Array with dimension (m+1), where m is the number of rows for the lower
triangle (columns for the upper triangle). pointers(i) - pointers(1) + 1
gives the index in the array asky of the first non-zero element in row
(column) i. The value of pointers(m + 1) is set to nnz + pointers(1),
where nnz is the number of elements in the array asky. Refer to pointers
array description in Skyline Storage Format for more details.
Output Parameters
info INTEGER. Integer info indicator only for converting the matrix A from the
CSR format.
If info=0, the execution is successful.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsky(job, m, acsr, ja, ia, asky, pointers, info)
INTEGER job(8)
INTEGER m, info
INTEGER ja(*), ia(m+1), pointers(m+1)
REAL acsr(*), asky(*)
SUBROUTINE mkl_ccsrsky(job, m, acsr, ja, ia, asky, pointers, info)
INTEGER job(8)
INTEGER m, info
INTEGER ja(*), ia(m+1), pointers(m+1)
COMPLEX acsr(*), asky(*)
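Example (illustrative sketch). The following plain Fortran program does not call MKL. It builds asky and pointers for the lower part of a small one-based CSR matrix (the case job(4)=1), under the assumption that each row is stored from its first non-zero element up to and including the diagonal element, as the Skyline Storage Format section is understood to describe. The program name and the sample data are hypothetical.

```fortran
program csr_to_skyline_sketch
  implicit none
  integer, parameter :: m = 3, nnz = 5
  ! One-based CSR of
  !   | 1 0 2 |
  !   | 0 3 0 |
  !   | 4 0 5 |
  double precision :: acsr(nnz) = [1d0, 2d0, 3d0, 4d0, 5d0]
  integer :: ja(nnz) = [1, 3, 2, 1, 3]
  integer :: ia(m+1) = [1, 3, 4, 6]
  double precision :: asky(m*(m+1)/2)
  integer :: pointers(m+1)
  integer :: i, k, jmin

  asky = 0d0
  pointers(1) = 1
  do i = 1, m
    ! Leftmost stored column of row i inside the lower triangle.
    jmin = i
    do k = ia(i), ia(i+1) - 1
      if (ja(k) <= i) jmin = min(jmin, ja(k))
    end do
    ! Row i occupies columns jmin..i; zeros inside that range are kept.
    pointers(i+1) = pointers(i) + (i - jmin + 1)
    do k = ia(i), ia(i+1) - 1
      if (ja(k) <= i) asky(pointers(i) + ja(k) - jmin) = acsr(k)
    end do
  end do

  print '(a, 4i3)',   'pointers =', pointers                    ! 1 2 3 6
  print '(a, 5f5.1)', 'asky     =', asky(1:pointers(m+1) - 1)   ! 1 3 4 0 5
end program csr_to_skyline_sketch
```

Here pointers(m+1) equals 6, which is the number of elements in asky (5) plus pointers(1), in agreement with the description of pointers above.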
mkl_?csradd
Computes the sum of two matrices stored in the CSR
format (3-array variation) with one-based indexing.
Syntax
call mkl_scsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_dcsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_ccsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_zcsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic,
nzmax, info)
Include Files
mkl.fi
Description
C := A+beta*op(B)
where:
A, B, C are the sparse matrices in the CSR format (3-array variation).
op(B) is one of op(B) = B, or op(B) = B^T, or op(B) = B^H
beta is a scalar.
The routine works correctly if and only if the column indices in sparse matrix representations of matrices A
and B are arranged in the increasing order for each row. If not, use the parameter sort (see below) to
reorder column indices and the corresponding elements of the input matrices.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
request INTEGER.
If request=0, the routine performs addition. The memory for the output
arrays ic, jc, c must be allocated beforehand.
sort INTEGER. Specifies the type of reordering. If this parameter is not set
(default), the routine does not perform reordering.
If sort=1, the routine arranges the column indices ja for each row in the
increasing order and reorders the corresponding values of the matrix A in
the array a.
If sort=2, the routine arranges the column indices jb for each row in the
increasing order and reorders the corresponding values of the matrix B in
the array b.
If sort=3, the routine performs reordering for both input matrices A and B.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A. For each row the column indices must be arranged in the
increasing order.
The length of this array is equal to the length of the array a. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
jb INTEGER. Array containing the column indices for each non-zero element of
the matrix B. For each row the column indices must be arranged in the
increasing order.
The length of this array is equal to the length of the array b. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
Output Parameters
jc INTEGER. Array containing the column indices for each non-zero element of
the matrix C.
The length of this array is equal to the length of the array c. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
info INTEGER.
If info=0, the execution is successful.
If info=I>0, the routine stops calculation in the I-th row of the matrix C
because the number of elements in C exceeds nzmax.
If info=-1, the routine calculates only the size of the arrays c and jc and
returns this value plus 1 as the last element of the array ic.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic, nzmax,
info)
CHARACTER trans
INTEGER request, sort, m, n, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
REAL a(*), b(*), c(*), beta
SUBROUTINE mkl_dcsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic, nzmax,
info)
CHARACTER trans
INTEGER request, sort, m, n, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
DOUBLE PRECISION a(*), b(*), c(*), beta
SUBROUTINE mkl_ccsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic, nzmax,
info)
CHARACTER trans
INTEGER request, sort, m, n, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
COMPLEX a(*), b(*), c(*), beta
SUBROUTINE mkl_zcsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic, nzmax,
info)
CHARACTER trans
INTEGER request, sort, m, n, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
DOUBLE COMPLEX a(*), b(*), c(*), beta
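Example (illustrative sketch). The program below does not call MKL. It forms C := A + beta*B for the case op(B) = B by merging the sorted column indices of each row, which shows why the column indices of A and B must be in increasing order, and how a positive info value can arise when nzmax is exceeded. One-based indexing is used throughout. The program name, the logical flags, and the sample data are hypothetical.

```fortran
program csr_add_sketch
  implicit none
  integer, parameter :: m = 3, nzmax = 8
  double precision, parameter :: beta = 2d0
  ! A = | 1 0 2 |      B = |  0 10 0 |
  !     | 0 3 0 |          |  0 20 0 |
  !     | 4 0 5 |          | 30  0 0 |
  double precision :: a(5) = [1d0, 2d0, 3d0, 4d0, 5d0]
  integer :: ja(5) = [1, 3, 2, 1, 3], ia(m+1) = [1, 3, 4, 6]
  double precision :: b(3) = [10d0, 20d0, 30d0]
  integer :: jb(3) = [2, 2, 1], ib(m+1) = [1, 2, 3, 4]
  double precision :: c(nzmax)
  integer :: jc(nzmax), ic(m+1)
  integer :: i, ka, kb, kc, info
  logical :: usea, useb

  info = 0
  kc = 0
  ic(1) = 1
  rows: do i = 1, m
    ka = ia(i)
    kb = ib(i)
    do while (ka < ia(i+1) .or. kb < ib(i+1))
      if (kc == nzmax) then
        info = i                  ! C would exceed nzmax in row i
        exit rows
      end if
      ! Take from whichever row still has entries; when both do,
      ! take the smaller column index (or both if they coincide).
      usea = ka < ia(i+1)
      useb = kb < ib(i+1)
      if (usea .and. useb) then
        usea = ja(ka) <= jb(kb)
        useb = jb(kb) <= ja(ka)
      end if
      kc = kc + 1
      c(kc) = 0d0
      if (usea) then
        jc(kc) = ja(ka)
        c(kc) = c(kc) + a(ka)
        ka = ka + 1
      end if
      if (useb) then
        jc(kc) = jb(kb)
        c(kc) = c(kc) + beta*b(kb)
        kb = kb + 1
      end if
    end do
    ic(i+1) = kc + 1
  end do rows

  if (info == 0) then
    print '(a, 6f6.1)', 'c  =', c(1:kc)    ! 1 20 2 43 64 5
    print '(a, 6i3)',   'jc =', jc(1:kc)   ! 1 2 3 2 1 3
    print '(a, 4i3)',   'ic =', ic         ! 1 4 5 7
  else
    print '(a, i0)', 'stopped in row ', info
  end if
end program csr_add_sketch
```

If a row of A or B were not sorted, the merge would emit duplicate or misordered column indices, which is the situation the sort parameter is meant to prevent.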
mkl_?csrmultcsr
Computes product of two sparse matrices stored in
the CSR format (3-array variation) with one-based
indexing.
Syntax
call mkl_scsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_dcsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_ccsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_zcsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
Include Files
mkl.fi
Description
C := op(A)*B
where:
A, B, C are the sparse matrices in the CSR format (3-array variation);
op(A) is one of op(A) = A, or op(A) = A^T, or op(A) = A^H.
You can use the parameter sort to enable or disable reordering of non-zero entries in the input and output
sparse matrices. Reordering rearranges the non-zero entries of a compressed sparse row matrix
so that the column indices are sorted in increasing order within each row.
The following table shows correspondence between the value of the parameter sort and the type of
reordering performed by this routine for each sparse matrix involved:
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
request INTEGER.
If request=0, the routine performs multiplication. The memory for the
output arrays ic, jc, c must be allocated beforehand.
If request=2, the routine assumes that it has been called previously with
request=1 and that the output arrays jc and c have been allocated in the
calling program with a length of at least ic(m+1) - 1.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A. For each row the column indices must be arranged in the
increasing order.
The length of this array is equal to the length of the array a. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
jb INTEGER. Array containing the column indices for each non-zero element of
the matrix B. For each row the column indices must be arranged in the
increasing order.
The length of this array is equal to the length of the array b. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
Output Parameters
jc INTEGER. Array containing the column indices for each non-zero element of
the matrix C.
The length of this array is equal to the length of the array c. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
info INTEGER.
If info=0, the execution is successful.
If info=I>0, the routine stops calculation in the I-th row of the matrix C
because the number of elements in C exceeds nzmax.
If info=-1, the routine calculates only the size of the arrays c and jc and
returns this value plus 1 as the last element of the array ic.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
CHARACTER*1 trans
INTEGER request, sort, m, n, k, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
REAL a(*), b(*), c(*)
SUBROUTINE mkl_dcsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
CHARACTER*1 trans
INTEGER request, sort, m, n, k, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
DOUBLE PRECISION a(*), b(*), c(*)
SUBROUTINE mkl_ccsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
CHARACTER*1 trans
INTEGER request, sort, m, n, k, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
COMPLEX a(*), b(*), c(*)
SUBROUTINE mkl_zcsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
CHARACTER*1 trans
INTEGER request, sort, m, n, k, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
DOUBLE COMPLEX a(*), b(*), c(*)
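Example (illustrative sketch). The program below does not call MKL and is not MKL's implementation. It mimics the two-stage pattern described for the request parameter, for the case op(A) = A with one-based indexing: a first pass only counts the non-zeros of each row of C to build ic, the caller then allocates jc and c with length ic(m+1) - 1, and a second pass fills them. The program name, the work arrays mark and acc, and the sample data are hypothetical.

```fortran
program csr_multiply_two_pass_sketch
  implicit none
  integer, parameter :: m = 3, ncolc = 3
  ! A = | 1 0 2 |      B = |  0 10 0 |
  !     | 0 3 0 |          |  0 20 0 |
  !     | 4 0 5 |          | 30  0 0 |
  double precision :: a(5) = [1d0, 2d0, 3d0, 4d0, 5d0]
  integer :: ja(5) = [1, 3, 2, 1, 3], ia(m+1) = [1, 3, 4, 6]
  double precision :: b(3) = [10d0, 20d0, 30d0]
  integer :: jb(3) = [2, 2, 1], ib(4) = [1, 2, 3, 4]
  double precision, allocatable :: c(:)
  integer, allocatable :: jc(:)
  double precision :: acc(ncolc)
  integer :: ic(m+1), mark(ncolc)
  integer :: i, j, p, q, cnt

  ! Pass 1 (the role of request=1): count non-zeros per row of C.
  mark = 0
  ic(1) = 1
  do i = 1, m
    cnt = 0
    do p = ia(i), ia(i+1) - 1
      do q = ib(ja(p)), ib(ja(p) + 1) - 1
        if (mark(jb(q)) /= i) then
          mark(jb(q)) = i
          cnt = cnt + 1
        end if
      end do
    end do
    ic(i+1) = ic(i) + cnt
  end do

  ! The exact output size is now known.
  allocate(c(ic(m+1) - 1), jc(ic(m+1) - 1))

  ! Pass 2 (the role of request=2): accumulate and store each row.
  mark = 0
  do i = 1, m
    do p = ia(i), ia(i+1) - 1
      do q = ib(ja(p)), ib(ja(p) + 1) - 1
        j = jb(q)
        if (mark(j) /= i) then
          mark(j) = i
          acc(j) = 0d0
        end if
        acc(j) = acc(j) + a(p)*b(q)
      end do
    end do
    cnt = ic(i)
    do j = 1, ncolc            ! emit columns in increasing order
      if (mark(j) == i) then
        jc(cnt) = j
        c(cnt) = acc(j)
        cnt = cnt + 1
      end if
    end do
  end do

  print '(a, 4i3)',   'ic =', ic   ! 1 3 4 6
  print '(a, 5i3)',   'jc =', jc   ! 1 2 2 1 2
  print '(a, 5f7.1)', 'c  =', c    ! 60 10 60 150 40
end program csr_multiply_two_pass_sketch
```

Splitting the work this way avoids guessing nzmax: the first pass is cheap because it touches only index arrays, and the allocation that follows is exact.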
mkl_?csrmultd
Computes product of two sparse matrices stored in
the CSR format (3-array variation) with one-based
indexing. The result is stored in the dense matrix.
Syntax
call mkl_scsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
call mkl_dcsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
call mkl_ccsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
call mkl_zcsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
Include Files
mkl.fi
Description
C := op(A)*B
where:
A, B are the sparse matrices in the CSR format (3-array variation), C is a dense matrix;
op(A) is one of op(A) = A, or op(A) = A^T, or op(A) = A^H.
The routine works correctly if and only if the column indices in sparse matrix representations of matrices A
and B are arranged in the increasing order for each row.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A. For each row the column indices must be arranged in the
increasing order.
The length of this array is equal to the length of the array a. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
jb INTEGER. Array containing the column indices for each non-zero element of
the matrix B. For each row the column indices must be arranged in the
increasing order.
The length of this array is equal to the length of the array b. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
Output Parameters
ldc INTEGER. Specifies the leading dimension of the dense matrix C as declared
in the calling (sub)program. Must be at least max(m, 1) when trans =
'N' or 'n', or max(1, n) otherwise.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmultd( trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
CHARACTER*1 trans
INTEGER m, n, k, ldc
INTEGER ja(*), jb(*), ia(*), ib(*)
REAL a(*), b(*), c(ldc, *)
The data type is included in the name only if the function accepts dense matrix or scalar floating point
parameters.
The <operation> field indicates the type of operation:
optimize analyze the matrix using hints and store optimization information in matrix
handle
spmm/spmmd compute sparse matrix by sparse matrix product and store the result as a
sparse/dense matrix
The format is included in the function name only if the function parameters include an explicit sparse matrix
in one of the conventional sparse matrix formats.
Computational routines operate on a matrix handle that stores a matrix in CSR or BSR formats. Other
formats should be converted to CSR or BSR format before calling any computational routines. For more
information see Sparse Matrix Storage Formats.
mkl_sparse_?_create_csr
Creates a handle for a CSR format matrix.
Syntax
stat = mkl_sparse_s_create_csr (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_d_create_csr (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_c_create_csr (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_z_create_csr (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_create_csr routine creates a handle for an m-by-k matrix A in CSR format.
Input Parameters
indexing sparse_index_base_t.
Indicates how input arrays are indexed.
rows C_INT.
Number of rows of matrix A.
cols C_INT.
Number of columns of matrix A.
rows_start C_INT.
Array of length at least m. This array contains row indices, such that
rows_start(i) - rows_start(1) is the first index of row i in the arrays
values and col_indx.
Refer to pointerb array description in CSR Format for more details.
rows_end C_INT.
Array of at least length m. This array contains row indices, such that
rows_end(i) - rows_start(1) - 1 is the last index of row i in the arrays
values and col_indx.
Refer to pointerE array description in CSR Format for more details.
col_indx C_INT.
For one-based indexing, array containing the column indices plus one for
each non-zero element of the matrix A. For zero-based indexing, array
containing the column indices for each non-zero element of the matrix A.
Its length is at least rows_end(rows) - rows_start(1).
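The relationship between the four CSR arrays can be illustrated with a minimal sketch in plain Python. It is a model of the storage layout only, not a call into Intel MKL; the function name dense_to_csr4 and the base argument are hypothetical and introduced here for illustration.

```python
def dense_to_csr4(dense, base=1):
    """Build rows_start, rows_end, col_indx, values for a dense matrix.

    dense is a list of rows. base=1 models one-based indexing and
    base=0 models zero-based indexing (illustrative argument, not MKL API).
    """
    rows_start, rows_end, col_indx, values = [], [], [], []
    for row in dense:
        # Position of the first stored entry of this row.
        rows_start.append(len(values) + base)
        for j, v in enumerate(row):
            if v != 0:
                col_indx.append(j + base)
                values.append(v)
        # One past the position of the last stored entry of this row.
        rows_end.append(len(values) + base)
    return rows_start, rows_end, col_indx, values


A = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 0.0],
     [4.0, 0.0, 5.0]]
rows_start, rows_end, col_indx, values = dense_to_csr4(A, base=1)
```

In this model the number of stored entries equals the last element of rows_end minus the first element of rows_start, which is the minimum length required of col_indx and values.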
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data for subsequent Inspector-executor Sparse
BLAS operations.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_create_csc
Creates a handle for a CSC format matrix.
Syntax
stat = mkl_sparse_s_create_csc (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_d_create_csc (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_c_create_csc (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_z_create_csc (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_create_csc routine creates a handle for an m-by-k matrix A in CSC format.
Input Parameters
indexing sparse_index_base_t.
Indicates how input arrays are indexed.
rows C_INT.
Number of rows of the matrix A.
cols C_INT.
Number of columns of the matrix A.
rows_start C_INT.
Array of length at least k (the number of columns). For CSC format, this
array contains column pointers, such that rows_start(i) - rows_start(1) is
the first index of column i in the arrays values and col_indx.
Refer to pointerB array description in CSC Format for more details.
rows_end C_INT.
Array of length at least k (the number of columns). For CSC format, this
array contains column pointers, such that rows_end(i) - rows_start(1) - 1 is
the last index of column i in the arrays values and col_indx.
Refer to pointerE array description in CSC Format for more details.
col_indx C_INT.
For one-based indexing, array containing the row indices plus one for
each non-zero element of the matrix A. For zero-based indexing, array
containing the row indices for each non-zero element of the matrix A.
Its length is at least rows_end(cols) - rows_start(1).
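CSC storage compresses along columns rather than rows, so the two pointer arrays have one entry per column and the index array holds row positions. A minimal plain-Python sketch of that layout follows. It is a model only, not Intel MKL code; the names dense_to_csc4, ptr_start, ptr_end, and idx are hypothetical.

```python
def dense_to_csc4(dense, base=1):
    """Build the four CSC arrays for a dense matrix given as a list of rows.

    base=1 models one-based indexing, base=0 zero-based indexing.
    All names here are illustrative and not part of Intel MKL.
    """
    n_rows, n_cols = len(dense), len(dense[0])
    ptr_start, ptr_end, idx, values = [], [], [], []
    for j in range(n_cols):
        # Position of the first stored entry of column j.
        ptr_start.append(len(values) + base)
        for i in range(n_rows):
            if dense[i][j] != 0:
                idx.append(i + base)          # row index of the entry
                values.append(dense[i][j])
        # One past the position of the last stored entry of column j.
        ptr_end.append(len(values) + base)
    return ptr_start, ptr_end, idx, values


A = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 4.0]]          # 2 rows, 3 columns
ptr_start, ptr_end, idx, values = dense_to_csc4(A, base=1)
```

For this 2-by-3 matrix the pointer arrays have three entries, one per column, while the index array holds row numbers in the range 1 to 2.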
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_create_coo
Creates a handle for a matrix in COO format.
Syntax
stat = mkl_sparse_s_create_coo (A, indexing, rows, cols, nnz, row_indx, col_indx,
values)
stat = mkl_sparse_d_create_coo (A, indexing, rows, cols, nnz, row_indx, col_indx,
values)
stat = mkl_sparse_c_create_coo (A, indexing, rows, cols, nnz, row_indx, col_indx,
values)
stat = mkl_sparse_z_create_coo (A, indexing, rows, cols, nnz, row_indx, col_indx,
values)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_create_coo routine creates a handle for an m-by-k matrix A in COO format.
Input Parameters
indexing sparse_index_base_t.
Indicates how input arrays are indexed.
rows C_INT.
Number of rows of matrix A.
cols C_INT.
Number of columns of matrix A.
nnz C_INT.
Specifies the number of non-zero elements of the matrix A.
Refer to nnz description in Coordinate Format for more details.
row_indx C_INT.
Array of length nnz, containing the row indices for each non-zero element
of matrix A.
Refer to rows array description in Coordinate Format for more details.
col_indx C_INT.
Array of length nnz, containing the column indices for each non-zero
element of matrix A.
Refer to columns array description in Coordinate Format for more details.
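The coordinate format stores one (row, column, value) triple per non-zero element. A minimal plain-Python sketch of how nnz, row_indx, col_indx, and values relate is shown here. It models the layout only and does not call Intel MKL; the function name dense_to_coo and the base argument are hypothetical.

```python
def dense_to_coo(dense, base=1):
    """Return (nnz, row_indx, col_indx, values) for a dense list-of-rows matrix.

    base=1 models one-based indexing, base=0 zero-based indexing
    (illustrative argument, not MKL API).
    """
    row_indx, col_indx, values = [], [], []
    for i, row in enumerate(dense):
        for j, v in enumerate(row):
            if v != 0:
                row_indx.append(i + base)
                col_indx.append(j + base)
                values.append(v)
    return len(values), row_indx, col_indx, values


A = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 4.0]]
nnz, row_indx, col_indx, values = dense_to_coo(A, base=1)
```

All three arrays have length nnz, and entry p of each array describes the same non-zero element.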
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_create_bsr
Creates a handle for a matrix in BSR format.
Syntax
stat = mkl_sparse_s_create_bsr (A, indexing, block_layout, rows, cols, block_size,
rows_start, rows_end, col_indx, values)
stat = mkl_sparse_d_create_bsr (A, indexing, block_layout, rows, cols, block_size,
rows_start, rows_end, col_indx, values)
stat = mkl_sparse_c_create_bsr (A, indexing, block_layout, rows, cols, block_size,
rows_start, rows_end, col_indx, values)
stat = mkl_sparse_z_create_bsr (A, indexing, block_layout, rows, cols, block_size,
rows_start, rows_end, col_indx, values)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_create_bsr routine creates a handle for an m-by-k matrix A in BSR format.
Input Parameters
indexing sparse_index_base_t.
Indicates how input arrays are indexed.
block_layout sparse_layout_t.
Specifies layout of blocks:
NOTE
If you specify SPARSE_INDEX_BASE_ZERO for indexing, you must use
SPARSE_LAYOUT_ROW_MAJOR for block_layout. Similarly, if you
specify SPARSE_INDEX_BASE_ONE for indexing, you must use
SPARSE_LAYOUT_COLUMN_MAJOR for block_layout. Otherwise
mkl_sparse_?_create_bsr returns stat =
SPARSE_STATUS_NOT_SUPPORTED.
rows C_INT.
Number of block rows of matrix A.
cols C_INT.
Number of block columns of matrix A.
block_size C_INT.
Size of blocks in matrix A.
rows_start C_INT.
Array of length m. This array contains row indices, such that
rows_start(i) - rows_start(1) is the first index of block row i in the
arrays values and col_indx.
rows_end C_INT.
Array of length m. This array contains row indices, such that rows_end(i)
- rows_start(1) - 1 is the last index of block row i in the arrays values
and col_indx.
col_indx C_INT.
For one-based indexing, array containing the column indices plus one for
each non-zero block of the matrix A. For zero-based indexing, array
containing the column indices for each non-zero block of the matrix A. Its
length is rows_end(rows) - rows_start(1).
Refer to the values array description in BSR Format for more details.
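The BSR arrays describe block rows and block columns rather than individual elements. The following plain-Python sketch models that layout under the pairing stated in the NOTE above: column-major blocks for one-based indexing and row-major blocks for zero-based indexing. It is not Intel MKL code; the function name dense_to_bsr and the base argument are hypothetical, and the matrix dimensions are assumed to be multiples of block_size.

```python
def dense_to_bsr(dense, block_size, base=1):
    """Return (block_rows, block_cols, rows_start, rows_end, col_indx, values).

    base=1: one-based indexing with column-major blocks.
    base=0: zero-based indexing with row-major blocks.
    Names are illustrative and not part of Intel MKL.
    """
    b = block_size
    block_rows = len(dense) // b
    block_cols = len(dense[0]) // b
    rows_start, rows_end, col_indx, values = [], [], [], []
    for bi in range(block_rows):
        rows_start.append(len(col_indx) + base)
        for bj in range(block_cols):
            block = [[dense[bi * b + r][bj * b + c] for c in range(b)]
                     for r in range(b)]
            if any(v != 0 for blk_row in block for v in blk_row):
                col_indx.append(bj + base)
                if base == 1:
                    # Column-major: walk each block column top to bottom.
                    values.extend(block[r][c] for c in range(b) for r in range(b))
                else:
                    # Row-major: walk each block row left to right.
                    values.extend(block[r][c] for r in range(b) for c in range(b))
        rows_end.append(len(col_indx) + base)
    return block_rows, block_cols, rows_start, rows_end, col_indx, values


A = [[1.0, 2.0, 0.0, 0.0],
     [3.0, 4.0, 0.0, 0.0],
     [0.0, 0.0, 5.0, 6.0],
     [0.0, 0.0, 7.0, 8.0]]
one_based = dense_to_bsr(A, block_size=2, base=1)
zero_based = dense_to_bsr(A, block_size=2, base=0)
```

Here rows and cols count blocks (2 and 2), col_indx has one entry per stored block, and values holds block_size*block_size elements for each stored block.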
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_copy
Creates a copy of a matrix handle.
Syntax
stat = mkl_sparse_copy (source, descr, dest)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_copy routine creates a copy of a matrix handle.
Input Parameters
source SPARSE_MATRIX_T.
Specifies handle containing internal data.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_destroy
Frees memory allocated for matrix handle.
Syntax
stat = mkl_sparse_destroy (A)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_destroy routine frees memory allocated for matrix handle.
NOTE
You must free the memory allocated for matrices once you have finished using them. The mkl_sparse_destroy
routine provides a utility to do so.
Input Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
SPARSE_STATUS_INTERNAL_ERROR    An error in algorithm implementation occurred.
SPARSE_STATUS_NOT_SUPPORTED     The requested operation is not supported.
mkl_sparse_convert_csr
Converts internal matrix representation to CSR
format.
Syntax
stat = mkl_sparse_convert_csr (source, operation, dest)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_convert_csr routine converts internal matrix representation to CSR format.
Input Parameters
source SPARSE_MATRIX_T.
Handle containing internal data.
operation C_INT.
Specifies operation op() on input matrix.
Output Parameters
dest SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_convert_bsr
Converts internal matrix representation to BSR format
or changes BSR block size.
Syntax
stat = mkl_sparse_convert_bsr (source, block_size, block_layout, operation, dest)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_convert_bsr routine converts internal matrix representation to BSR format or changes
BSR block size.
Input Parameters
source SPARSE_MATRIX_T.
Handle containing internal data.
block_size C_INT.
Size of the block in the output structure.
block_layout sparse_layout_t.
Specifies layout of blocks:
operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE    Conjugate transpose, op(A) = A^H.
Output Parameters
dest SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_export_csr
Exports CSR matrix from internal representation.
Syntax
stat = mkl_sparse_s_export_csr (source, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_d_export_csr (source, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_c_export_csr (source, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_z_export_csr (source, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
Include Files
mkl_spblas.f90
Description
If the matrix specified by the source handle is in CSR format, the mkl_sparse_?_export_csr routine
exports an m-by-k matrix A in CSR format from the internal representation. The routine returns
pointers to the internal representation and does not allocate additional memory.
NOTE
Since the exported data is a copy of an internally stored structure, any changes made to it have no
effect on subsequent Inspector-executor Sparse BLAS operations.
If the matrix is not already in CSR format, the routine returns SPARSE_STATUS_INVALID_VALUE.
Input Parameters
source SPARSE_MATRIX_T.
Handle containing internal data.
Output Parameters
indexing sparse_index_base_t.
Indicates how the output arrays are indexed.
rows C_INT.
Number of rows of the matrix source.
cols C_INT.
Number of columns of the matrix source.
rows_start C_INT.
Pointer to array of length m. This array contains row indices, such that
rows_start(i) - rows_start(1) is the first index of row i in the arrays
values and col_indx.
Refer to pointerB array description in CSR Format for more details.
rows_end C_INT.
Pointer to array of length m. This array contains row indices, such that
rows_end(i) - rows_start(1) - 1 is the last index of row i in the arrays
values and col_indx.
Refer to pointerE array description in CSR Format for more details.
col_indx C_INT.
For one-based indexing, pointer to array containing the column indices plus
one for each non-zero element of the matrix source. For zero-based
indexing, pointer to array containing the column indices for each non-zero
element of the matrix source. Its length is rows_end(rows) -
rows_start(1).
C_FLOAT_COMPLEX for mkl_sparse_c_export_csr
C_DOUBLE_COMPLEX for mkl_sparse_z_export_csr
Pointer to array containing non-zero elements of the matrix A. Its length is
equal to length of the col_indx array.
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_export_bsr
Exports BSR matrix from internal representation.
Syntax
stat = mkl_sparse_s_export_bsr (source, indexing, block_layout, rows, cols,
block_size, rows_start, rows_end, col_indx, values)
stat = mkl_sparse_d_export_bsr (source, indexing, block_layout, rows, cols,
block_size, rows_start, rows_end, col_indx, values)
stat = mkl_sparse_c_export_bsr (source, indexing, block_layout, rows, cols,
block_size, rows_start, rows_end, col_indx, values)
stat = mkl_sparse_z_export_bsr (source, indexing, block_layout, rows, cols,
block_size, rows_start, rows_end, col_indx, values)
Include Files
mkl_spblas.f90
Description
If the matrix specified by the source handle is in BSR format, the mkl_sparse_?_export_bsr routine
exports an m-by-k matrix A in BSR format from the internal representation. The routine returns pointers to
the internal representation and does not allocate additional memory.
NOTE
Since the exported data is a copy of an internally stored structure, any changes made to it have no
effect on subsequent Inspector-executor Sparse BLAS operations.
If the matrix is not already in BSR format, the routine returns SPARSE_STATUS_INVALID_VALUE.
Input Parameters
source SPARSE_MATRIX_T.
Handle containing internal data.
Output Parameters
indexing sparse_index_base_t.
Indicates how the output arrays are indexed.
block_layout sparse_layout_t.
Specifies layout of blocks:
rows C_INT.
Number of block rows of the matrix source.
cols C_INT.
Number of block columns of the matrix source.
block_size C_INT.
Size of the block in matrix source.
rows_start C_INT.
Pointer to array of length m. This array contains row indices, such that
rows_start(i) - rows_start(1) is the first index of block row i in the
arrays values and col_indx.
rows_end C_INT.
Pointer to array of length m. This array contains row indices, such that
rows_end(i) - rows_start(1) - 1 is the last index of block row i in the
arrays values and col_indx.
col_indx C_INT.
For one-based indexing, pointer to array containing the column indices plus
one for each non-zero block of the matrix source. For zero-based indexing,
pointer to array containing the column indices for each non-zero block of
the matrix source. Its length is rows_end(rows) - rows_start(1).
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_set_value
Changes a single value of matrix in internal
representation.
Syntax
stat = mkl_sparse_s_set_value (A, row, col, value)
Include Files
mkl_spblas.f90
Description
Use the mkl_sparse_?_set_value routine to change a single value of a matrix in internal Inspector-
executor Sparse BLAS format.
Input Parameters
A SPARSE_MATRIX_T.
Specifies handle containing internal data.
row C_INT.
Indicates row of matrix in which to set value.
col C_INT.
Indicates column of matrix in which to set value.
Output Parameters
stat INTEGER.
Value indicating whether the operation was successful or not, and why:
Inspector-executor Sparse BLAS Analysis Routines
Analysis Routines and Their Data Types
Routine or Function Group    Description
mkl_sparse_set_sv_hint       Provides estimate of number and type of upcoming triangular system solver operations.
mkl_sparse_set_sm_hint       Provides estimate of number and type of upcoming triangular matrix solve with multiple right hand sides operations.
mkl_sparse_optimize          Analyzes matrix structure and performs optimizations using the hints provided in the handle.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and
SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-
dependent optimizations in this product are intended for use with Intel microprocessors. Certain
optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to
the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
mkl_sparse_set_mv_hint
Provides estimate of number and type of upcoming
matrix-vector operations.
Syntax
stat = mkl_sparse_set_mv_hint (A, operation, descr, expected_calls)
Include Files
mkl_spblas.f90
Description
Use the mkl_sparse_set_mv_hint routine to provide the Inspector-executor Sparse BLAS API an estimate
of the number of upcoming matrix-vector multiplication operations for performance optimization, and specify
whether or not to perform an operation on the matrix.
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
SPARSE_FILL_MODE_UPPER    The upper triangular matrix part is processed.
diag - Specifies diagonal type for non-general matrices:
expected_calls C_INT.
Number of expected calls to execution routine.
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_set_sv_hint
Provides estimate of number and type of upcoming
triangular system solver operations.
Syntax
stat = mkl_sparse_set_sv_hint (A, operation, descr, expected_calls)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_set_sv_hint routine provides an estimate of the number of upcoming triangular system solver
operations and type of these operations for performance optimization.
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
SPARSE_FILL_MODE_LOWER    The lower triangular matrix part is processed.
SPARSE_FILL_MODE_UPPER    The upper triangular matrix part is processed.
diag - Specifies diagonal type for non-general matrices:
expected_calls C_INT.
Number of expected calls to execution routine.
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_set_mm_hint
Provides estimate of number and type of upcoming
matrix-matrix multiplication operations.
Syntax
stat = mkl_sparse_set_mm_hint (A, operation, descr, layout, dense_matrix_size,
expected_calls)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_set_mm_hint routine provides an estimate of the number of upcoming matrix-matrix
multiplication operations and type of these operations for performance optimization purposes.
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
SPARSE_MATRIX_TYPE_BLOCK_DIAGONAL    The matrix is block-diagonal (only diagonal blocks are processed). Applies to BSR format only.
layout C_INT.
Specifies layout of elements:
dense_matrix_size C_INT.
Number of columns in dense matrix.
expected_calls C_INT.
Number of expected calls to execution routine.
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_set_sm_hint
Provides estimate of number and type of upcoming
triangular matrix solve with multiple right hand sides
operations.
Syntax
stat = mkl_sparse_set_sm_hint (A, operation, descr, layout, dense_matrix_size,
expected_calls)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_set_sm_hint routine provides an estimate of the number of upcoming triangular matrix
solve with multiple right hand sides operations and type of these operations for performance optimization
purposes.
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
layout C_INT.
Specifies layout of elements:
dense_matrix_size C_INT.
Number of right-hand sides.
expected_calls C_INT.
Number of expected calls to execution routine.
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_set_memory_hint
Provides memory requirements for performance
optimization purposes.
Syntax
stat = mkl_sparse_set_memory_hint (A, policy)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_set_memory_hint routine allocates additional memory for further performance
optimization purposes.
Input Parameters
policy C_INT.
Specify memory utilization policy for optimization routine using these types:
SPARSE_MEMORY_AGGRESSIVE    Default. Routine can allocate memory up to the size of matrix A for converting into the appropriate sparse format.
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_optimize
Analyzes matrix structure and performs optimizations
using the hints provided in the handle.
Syntax
stat = mkl_sparse_optimize (A)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_optimize routine analyzes matrix structure and performs optimizations using the hints
provided in the handle. Generally, specifying a higher number of expected operations allows for more
aggressive and time-consuming optimizations.
Input Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
Inspector-executor Sparse BLAS Execution Routines
Execution Routines and Their Data Types
Routine or Function Group    Data Types    Description
mkl_sparse_?_mv
Computes a sparse matrix-vector product.
Syntax
stat = mkl_sparse_s_mv (operation, alpha, A, descr, x, beta, y)
stat = mkl_sparse_d_mv (operation, alpha, A, descr, x, beta, y)
stat = mkl_sparse_c_mv (operation, alpha, A, descr, x, beta, y)
stat = mkl_sparse_z_mv (operation, alpha, A, descr, x, beta, y)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_mv routine computes a sparse matrix-vector product defined as
y := alpha*op(A)*x + beta*y
where:
alpha and beta are scalars, x and y are vectors, and A is a matrix handle of a matrix with m rows and k
columns.
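The operation y := alpha*op(A)*x + beta*y can be modeled on the four CSR arrays as in the following plain-Python sketch. It is a reference model for checking results by hand, not Intel MKL code and not a statement about how the library computes the product. The function name csr_mv, the boolean transpose flag (standing in for the operation parameter), and the base argument are hypothetical.

```python
def csr_mv(transpose, alpha, rows, cols, rows_start, rows_end,
           col_indx, values, x, beta, y, base=1):
    """Return alpha*op(A)*x + beta*y for A given by CSR arrays.

    transpose=False models op(A) = A, transpose=True models op(A) = A^T.
    All names are illustrative and not part of Intel MKL.
    """
    out_len = cols if transpose else rows
    result = [beta * y[i] for i in range(out_len)]
    first = rows_start[0]
    for i in range(rows):
        # Offsets of row i inside col_indx and values.
        for p in range(rows_start[i] - first, rows_end[i] - first):
            j = col_indx[p] - base
            if transpose:
                result[j] += alpha * values[p] * x[i]
            else:
                result[i] += alpha * values[p] * x[j]
    return result


# A = [[1, 0, 2],
#      [0, 3, 4]]  stored with one-based indexing (m = 2, k = 3)
rows_start, rows_end = [1, 3], [3, 5]
col_indx, values = [1, 3, 2, 3], [1.0, 2.0, 3.0, 4.0]

y_plain = csr_mv(False, 2.0, 2, 3, rows_start, rows_end, col_indx, values,
                 [1.0, 2.0, 3.0], 0.5, [2.0, 4.0])
y_trans = csr_mv(True, 1.0, 2, 3, rows_start, rows_end, col_indx, values,
                 [1.0, 2.0], 0.0, [0.0, 0.0, 0.0])
```

Note that x has k elements and y has m elements when op(A) = A, and the lengths swap when op(A) = A^T.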
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
A SPARSE_MATRIX_T.
Handle containing sparse matrix in internal data structure.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
SPARSE_DIAG_NON_UNIT    Diagonal elements might not be equal to one.
SPARSE_DIAG_UNIT        Diagonal elements are equal to one.
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_trsv
Solves a system of linear equations for a triangular
sparse matrix.
Syntax
stat = mkl_sparse_s_trsv (operation, alpha, A, descr, x, y)
stat = mkl_sparse_d_trsv (operation, alpha, A, descr, x, y)
stat = mkl_sparse_c_trsv (operation, alpha, A, descr, x, y)
stat = mkl_sparse_z_trsv (operation, alpha, A, descr, x, y)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_trsv routine solves a system of linear equations for a matrix:
op(A)*y = alpha * x
where A is a triangular sparse matrix, alpha is a scalar, and x and y are vectors.
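For a lower triangular matrix with a non-unit diagonal and op(A) = A, the system op(A)*y = alpha*x can be solved by forward substitution. The plain-Python sketch below models that one case on CSR arrays. It is an illustration of the mathematics only, not Intel MKL code; the function name csr_lower_trsv and the base argument are hypothetical, and upper triangular, unit-diagonal, and transposed cases are not covered.

```python
def csr_lower_trsv(alpha, rows, rows_start, rows_end, col_indx, values, x, base=1):
    """Solve A*y = alpha*x for lower triangular A stored in CSR arrays.

    Assumes every row stores its diagonal entry and the diagonal is non-zero.
    Names are illustrative and not part of Intel MKL.
    """
    first = rows_start[0]
    y = [0.0] * rows
    for i in range(rows):
        acc = alpha * x[i]
        diag = None
        for p in range(rows_start[i] - first, rows_end[i] - first):
            j = col_indx[p] - base
            if j < i:
                acc -= values[p] * y[j]   # subtract already solved unknowns
            elif j == i:
                diag = values[p]
        if diag is None or diag == 0:
            raise ZeroDivisionError("missing or zero diagonal in row %d" % i)
        y[i] = acc / diag
    return y


# A = [[2, 0, 0],
#      [1, 4, 0],
#      [0, 3, 5]]  stored with one-based indexing
rows_start, rows_end = [1, 2, 4], [2, 4, 6]
col_indx = [1, 1, 2, 2, 3]
values = [2.0, 1.0, 4.0, 3.0, 5.0]

y = csr_lower_trsv(2.0, 3, rows_start, rows_end, col_indx, values,
                   [1.0, 4.5, 10.5])
```

Multiplying A by the returned y reproduces alpha*x, which is a convenient check when validating library output on small cases.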
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
A SPARSE_MATRIX_T.
Handle containing sparse matrix in internal data structure.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_mm
Computes the product of a sparse matrix and a dense
matrix.
Syntax
stat = mkl_sparse_s_mm (operation, alpha, A, descr, layout, x, columns, ldx, beta, y,
ldy)
stat = mkl_sparse_d_mm (operation, alpha, A, descr, layout, x, columns, ldx, beta, y,
ldy)
stat = mkl_sparse_c_mm (operation, alpha, A, descr, layout, x, columns, ldx, beta, y,
ldy)
stat = mkl_sparse_z_mm (operation, alpha, A, descr, layout, x, columns, ldx, beta, y,
ldy)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_mm routine performs a matrix-matrix operation:
y := alpha*op(A)*x + beta*y
where alpha and beta are scalars, A is a sparse matrix, and x and y are dense matrices.
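The role of the dense-matrix layout and leading dimensions in y := alpha*op(A)*x + beta*y can be modeled as follows. This plain-Python sketch covers op(A) = A only and treats x and y as flat arrays addressed through a leading dimension. It is not Intel MKL code; the names dense_offset and csr_mm, the column_major flag (standing in for the layout parameter), and the base argument are hypothetical.

```python
def dense_offset(i, j, ld, column_major):
    """Zero-based offset of element (i, j) in a flat dense array."""
    return i + j * ld if column_major else i * ld + j


def csr_mm(alpha, rows, rows_start, rows_end, col_indx, values,
           x, columns, ldx, beta, y, ldy, column_major=True, base=1):
    """Return alpha*A*x + beta*y for CSR A and flat dense x, y.

    Only op(A) = A is modeled. Names are illustrative, not Intel MKL API.
    """
    first = rows_start[0]
    out = list(y)
    for i in range(rows):
        for c in range(columns):
            acc = 0.0
            for p in range(rows_start[i] - first, rows_end[i] - first):
                j = col_indx[p] - base
                acc += values[p] * x[dense_offset(j, c, ldx, column_major)]
            k = dense_offset(i, c, ldy, column_major)
            out[k] = alpha * acc + beta * y[k]
    return out


# A = [[1, 0, 2],
#      [0, 3, 4]]  (one-based CSR);  X = [[1, 4], [2, 5], [3, 6]]
rows_start, rows_end = [1, 3], [3, 5]
col_indx, values = [1, 3, 2, 3], [1.0, 2.0, 3.0, 4.0]

y_col = csr_mm(1.0, 2, rows_start, rows_end, col_indx, values,
               [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3, 0.0, [0.0] * 4, 2,
               column_major=True)
y_row = csr_mm(1.0, 2, rows_start, rows_end, col_indx, values,
               [1.0, 4.0, 2.0, 5.0, 3.0, 6.0], 2, 2, 0.0, [0.0] * 4, 2,
               column_major=False)
```

Both calls compute the same 2-by-2 product; only the order of elements in the flat x and y arrays differs between the two layouts.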
The mkl_sparse_?_mm and mkl_sparse_?_trsm routines support these configurations:
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
A SPARSE_MATRIX_T.
Handle containing sparse matrix in internal data structure.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
layout C_INT.
Describes the storage scheme for the dense matrix:
                              layout = SPARSE_LAYOUT_COLUMN_MAJOR    layout = SPARSE_LAYOUT_ROW_MAJOR
rows (number of rows in x)    ldx                                    If op(A) = A, number of columns in A.
                                                                     If op(A) = A^T, number of rows in A.
columns C_INT.
Number of columns of matrix y.
ldx C_INT.
Specifies the leading dimension of matrix x.
layout = SPARSE_LAYOUT_COLUMN_MAJOR    layout = SPARSE_LAYOUT_ROW_MAJOR
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_trsm
Solves a system of linear equations with multiple right
hand sides for a triangular sparse matrix.
Syntax
stat = mkl_sparse_s_trsm (operation, alpha, A, descr, layout, x, columns, ldx, y, ldy)
stat = mkl_sparse_d_trsm (operation, alpha, A, descr, layout, x, columns, ldx, y, ldy)
stat = mkl_sparse_c_trsm (operation, alpha, A, descr, layout, x, columns, ldx, y, ldy)
stat = mkl_sparse_z_trsm (operation, alpha, A, descr, layout, x, columns, ldx, y, ldy)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_trsm routine solves a system of linear equations with multiple right hand sides for a
triangular sparse matrix:
y := alpha*inv(op(A))*x
where:
alpha is a scalar, x and y are dense matrices, and A is a sparse matrix.
The mkl_sparse_?_mm and mkl_sparse_?_trsm routines support these configurations:
BSR: general non-transposed matrix multiplication only
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
A SPARSE_MATRIX_T.
Handle containing sparse matrix in internal data structure.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
layout C_INT.
Describes the storage scheme for the dense matrix:
layout = SPARSE_LAYOUT_COLUMN_MAJOR    layout = SPARSE_LAYOUT_ROW_MAJOR
columns C_INT.
Number of columns in matrix y.
ldx C_INT.
Specifies the leading dimension of matrix x.
C_DOUBLE for mkl_sparse_d_trsm
C_FLOAT_COMPLEX for mkl_sparse_c_trsm
C_DOUBLE_COMPLEX for mkl_sparse_z_trsm
Array of size at least rows*cols, where
layout = SPARSE_LAYOUT_COLUMN_MAJOR    layout = SPARSE_LAYOUT_ROW_MAJOR
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_add
Computes sum of two sparse matrices.
Syntax
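stat = mkl_sparse_s_add (operation, A, alpha, B, C)
stat = mkl_sparse_d_add (operation, A, alpha, B, C)
stat = mkl_sparse_c_add (operation, A, alpha, B, C)
stat = mkl_sparse_z_add (operation, A, alpha, B, C)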
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_add routine performs a matrix-matrix operation:
C := alpha*op(A) + B
Input Parameters
A SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
operation C_INT.
Specifies operation op() on input matrix.
B SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
Output Parameters
C SPARSE_MATRIX_T.
Handle containing the resulting sparse matrix in internal data structure.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
SPARSE_STATUS_NOT_INITIALIZED    The routine encountered an empty handle or matrix array.
mkl_sparse_spmm
Computes the product of two sparse matrices and
stores the result as a sparse matrix.
Syntax
stat = mkl_sparse_spmm (operation, A, B, C)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_spmm routine performs a matrix-matrix operation:
C := op(A) *B
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
A SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
B SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
Output Parameters
C SPARSE_MATRIX_T.
Handle containing the resulting sparse matrix in internal data structure.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
mkl_sparse_?_spmmd
Computes the product of two sparse matrices and
stores the result as a dense matrix.
Syntax
stat = mkl_sparse_s_spmmd (operation, A, B, layout, C, ldc)
stat = mkl_sparse_d_spmmd (operation, A, B, layout, C, ldc)
stat = mkl_sparse_c_spmmd (operation, A, B, layout, C, ldc)
stat = mkl_sparse_z_spmmd (operation, A, B, layout, C, ldc)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_spmmd routine performs a matrix-matrix operation:
C := op(A)*B
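The following minimal sketch in standard Fortran illustrates the result of C := op(A)*B for op(A) = A, with two sparse inputs and a dense, column-major output. It does not call Intel MKL. The CSR arrays (a_ptr, a_col, a_val, b_ptr, b_col, b_val) and the sample matrices are assumptions made for this example and say nothing about the internal data structure behind the matrix handles.

```fortran
program sparse_spmmd_sketch
  implicit none
  integer, parameter :: m = 2, k = 3, n = 2
  ! Hypothetical m-by-k matrix A = [1 0 2; 0 3 0] in CSR form (one-based).
  integer, parameter :: a_ptr(m + 1) = [1, 3, 4]
  integer, parameter :: a_col(3) = [1, 3, 2]
  real, parameter :: a_val(3) = [1.0, 2.0, 3.0]
  ! Hypothetical k-by-n matrix B = [4 0; 0 5; 6 0] in CSR form (one-based).
  integer, parameter :: b_ptr(k + 1) = [1, 2, 3, 4]
  integer, parameter :: b_col(3) = [1, 2, 1]
  real, parameter :: b_val(3) = [4.0, 5.0, 6.0]
  real :: c(m, n)
  integer :: i, p, q, row_b

  c = 0.0
  do i = 1, m
    ! Each stored entry A(i, row_b) scales row row_b of B into row i of C.
    do p = a_ptr(i), a_ptr(i + 1) - 1
      row_b = a_col(p)
      do q = b_ptr(row_b), b_ptr(row_b + 1) - 1
        c(i, b_col(q)) = c(i, b_col(q)) + a_val(p) * b_val(q)
      end do
    end do
  end do

  ! A*B = [16 0; 0 15], stored column by column.
  if (any(abs(c - reshape([16.0, 0.0, 0.0, 15.0], [m, n])) > 1.0e-6)) error stop "unexpected result"
  print '(2f8.2)', transpose(c)   ! one row of C per output line
end program sparse_spmmd_sketch
```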
Input Parameters
operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE          Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE              Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE    Conjugate transpose, op(A) = A^H.
A SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
B SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
layout C_INT.
Describes the storage scheme for the dense matrix:
ldC C_INT.
Leading dimension of matrix C.
Output Parameters
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
BLAS-like Extensions
Intel MKL provides C and Fortran routines to extend the functionality of the BLAS routines. These include
routines to compute vector products, matrix-vector products, and matrix-matrix products.
Intel MKL also provides routines to perform certain data manipulation, including matrix in-place and out-of-
place transposition operations combined with simple matrix arithmetic operations. Transposition operations
are Copy As Is, Conjugate transpose, Transpose, and Conjugate. Each routine adds the possibility of scaling
during the transposition operation by giving some alpha and/or beta parameters. Each routine supports
both row-major orderings and column-major orderings.
Table BLAS-like Extensions lists these routines.
The <?> symbol in the routine short names is a precision prefix that indicates the data type:
s REAL
d DOUBLE PRECISION
c COMPLEX
z DOUBLE COMPLEX
BLAS-like Extensions
Routine     Data Types     Description
?axpby      s, d, c, z     Scales two vectors, adds them to one another, and stores the result in the vector (routines).
?axpby
Scales two vectors, adds them to one another and
stores result in the vector.
Syntax
call saxpby(n, a, x, incx, b, y, incy)
call daxpby(n, a, x, incx, b, y, incy)
call caxpby(n, a, x, incx, b, y, incy)
call zaxpby(n, a, x, incx, b, y, incy)
call axpby(x, y [,a] [,b])
Include Files
mkl.fi, blas.f90
Description
y := a*x + b*y
where:
a and b are scalars
x and y are vectors each with n elements.
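The update y := a*x + b*y can be sketched as a plain loop. The example below is standard Fortran and does not call Intel MKL. The subroutine name ref_axpby is hypothetical, its argument order simply mirrors the FORTRAN 77 syntax shown above, and it assumes positive increments only.

```fortran
module axpby_reference
  implicit none
contains
  ! Hypothetical reference loop for y := a*x + b*y (positive increments assumed).
  subroutine ref_axpby(n, a, x, incx, b, y, incy)
    integer, intent(in) :: n, incx, incy
    real, intent(in) :: a, b
    real, intent(in) :: x(*)
    real, intent(inout) :: y(*)
    integer :: i, ix, iy
    ix = 1
    iy = 1
    do i = 1, n
      y(iy) = a * x(ix) + b * y(iy)
      ix = ix + incx
      iy = iy + incy
    end do
  end subroutine ref_axpby
end module axpby_reference

program axpby_sketch
  use axpby_reference
  implicit none
  real :: x(4), y(8)

  x = [1.0, 2.0, 3.0, 4.0]
  y = 1.0
  ! With incy = 2, only y(1), y(3), y(5), and y(7) are updated.
  call ref_axpby(4, 2.0, x, 1, -1.0, y, 2)

  if (any(abs(y - [1.0, 1.0, 3.0, 1.0, 5.0, 1.0, 7.0, 1.0]) > 1.0e-6)) error stop "unexpected result"
  print '(8f6.1)', y
end program axpby_sketch
```

The strided call shows why the array y must hold at least 1 + (n-1)*abs(incy) elements when incy differs from 1.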
Input Parameters
Output Parameters
Example
For examples of routine usage, see the code in the Intel MKL installation directory:
saxpby: examples\blas\source\saxpbyx.f
daxpby: examples\blas\source\daxpbyx.f
caxpby: examples\blas\source\caxpbyx.f
zaxpby: examples\blas\source\zaxpbyx.f
y Holds the array of size n.
?gem2vu
Computes two matrix-vector products using a general
matrix (real data)
Syntax
call sgem2vu(m, n, alpha, a, lda, x1, incx1, x2, incx2, beta, y1, incy1, y2, incy2)
call dgem2vu(m, n, alpha, a, lda, x1, incx1, x2, incx2, beta, y1, incy1, y2, incy2)
call gem2vu(a, x1, x2, y1, y2 [,alpha][,beta] )
Include Files
mkl.fi, blas.f90
Description
y1 := alpha*A*x1 + beta*y1,
and
y2 := alpha*A'*x2 + beta*y2,
where:
alpha and beta are scalars,
x1, x2, y1, and y2 are vectors,
A is an m-by-n matrix.
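The two products can be checked against a small reference computation. The sketch below is standard Fortran using the matmul and transpose intrinsics rather than Intel MKL; the sample values are assumptions chosen for this example. It also makes the vector lengths visible: x1 and y2 have n elements, x2 and y1 have m elements.

```fortran
program gem2vu_sketch
  implicit none
  integer, parameter :: m = 2, n = 3
  real :: a(m, n), x1(n), x2(m), y1(m), y2(n)
  real :: alpha, beta

  alpha = 1.0
  beta = 2.0
  a = reshape([1.0, 4.0, 2.0, 5.0, 3.0, 6.0], [m, n])   ! A = [1 2 3; 4 5 6]
  x1 = 1.0
  x2 = [1.0, 2.0]
  y1 = 1.0
  y2 = 1.0

  ! Both updates read the same matrix A.
  y1 = alpha * matmul(a, x1) + beta * y1
  y2 = alpha * matmul(transpose(a), x2) + beta * y2

  if (any(abs(y1 - [8.0, 17.0]) > 1.0e-6)) error stop "unexpected y1"
  if (any(abs(y2 - [11.0, 14.0, 17.0]) > 1.0e-6)) error stop "unexpected y2"
  print '(a, 2f8.2)', 'y1 =', y1
  print '(a, 3f8.2)', 'y2 =', y2
end program gem2vu_sketch
```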
Input Parameters
Output Parameters
BLAS 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see BLAS 95
Interface Conventions.
Specific details for the routine gem2vu interface are the following:
x1 Holds the vector with the number of elements rx1 where rx1 = n.
x2 Holds the vector with the number of elements rx2 where rx2 = m.
y1 Holds the vector with the number of elements ry1 where ry1 = m.
y2 Holds the vector with the number of elements ry2 where ry2 = n.
?gem2vc
Computes two matrix-vector products using a general
matrix (complex data)
Syntax
call cgem2vc(m, n, alpha, a, lda, x1, incx1, x2, incx2, beta, y1, incy1, y2, incy2)
call zgem2vc(m, n, alpha, a, lda, x1, incx1, x2, incx2, beta, y1, incy1, y2, incy2)
call gem2vc(a, x1, x2, y1, y2 [,alpha][,beta] )
Include Files
mkl.fi, blas.f90
Description
y1 := alpha*A*x1 + beta*y1,
and
y2 := alpha*conjg(A')*x2 + beta*y2,
where:
alpha and beta are scalars,
x1, x2, y1, and y2 are vectors,
A is an m-by-n matrix.
Input Parameters
Array, size at least (1+(n-1)*abs(incy2)). Before entry with non-zero
beta, the incremented array y2 must contain the vector y2.
Output Parameters
x1 Holds the vector with the number of elements rx1 where rx1 = n.
x2 Holds the vector with the number of elements rx2 where rx2 = m.
y1 Holds the vector with the number of elements ry1 where ry1 = m.
y2 Holds the vector with the number of elements ry2 where ry2 = n.
?gemmt
Computes a matrix-matrix product with general
matrices but updates only the upper or lower
triangular part of the result matrix.
Syntax
call sgemmt (uplo, transa, transb, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dgemmt (uplo, transa, transb, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call cgemmt (uplo, transa, transb, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zgemmt (uplo, transa, transb, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call gemmt (a, b, c[, uplo] [, transa] [, transb] [, alpha] [, beta])
Include Files
mkl.fi, blas.f90
Description
The ?gemmt routines compute a scalar-matrix-matrix product with general matrices and add the result to the
upper or lower part of a scalar-matrix product. These routines are similar to the ?gemm routines, but they
only access and update a triangular part of the square result matrix (see Application Notes below).
The operation is defined as
C := alpha*op(A)*op(B) + beta*C,
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalars,
A, B and C are matrices:
op(A) is an n-by-k matrix,
op(B) is a k-by-n matrix,
C is an n-by-n upper or lower triangular matrix.
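A minimal reference loop may help show what "updates only the upper or lower triangular part" means in practice. The sketch below is standard Fortran, does not call Intel MKL, and covers only the upper triangular case with op(A) = A and op(B) = B. The sample values and the sentinel value -1.0 are assumptions for this example.

```fortran
program gemmt_sketch
  implicit none
  integer, parameter :: n = 3, k = 2
  real :: a(n, k), b(k, n), c(n, n), expected(n, n)
  real :: alpha, beta
  integer :: i, j

  alpha = 1.0
  beta = 1.0
  a = 1.0
  b = reshape([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [k, n])   ! columns [1 2], [3 4], [5 6]
  c = -1.0   ! sentinel, so that untouched entries remain visible

  ! Only entries with i <= j (the upper triangle) are read and written.
  do j = 1, n
    do i = 1, j
      c(i, j) = alpha * dot_product(a(i, :), b(:, j)) + beta * c(i, j)
    end do
  end do

  ! The strictly lower triangle still holds the sentinel value.
  expected = reshape([2.0, -1.0, -1.0, 6.0, 6.0, -1.0, 10.0, 10.0, 10.0], [n, n])
  if (any(abs(c - expected) > 1.0e-6)) error stop "unexpected result"
  print '(3f8.2)', transpose(c)   ! one row of C per output line
end program gemmt_sketch
```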
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used. If uplo = 'U' or 'u', then the upper triangular part of the
array c is used. If uplo = 'L' or 'l', then the lower triangular part of the
array c is used.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B). The value of k must be at least zero.
DOUBLE PRECISION for dgemmt
COMPLEX for cgemmt
DOUBLE COMPLEX for zgemmt
Array, size lda by ka, where ka is k when transa = 'N' or 'n', and is n
otherwise. Before entry with transa = 'N' or 'n', the leading n-by-k part
of the array a must contain the matrix A, otherwise the leading k-by-n part
of the array a must contain the matrix A.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array c must contain the upper triangular part of the matrix C
and the strictly lower triangular part of c is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array c must contain the lower triangular part of the matrix C
and the strictly upper triangular part of c is not referenced.
Output Parameters
c When uplo = 'U' or 'u', the upper triangular part of the array c is
overwritten by the upper triangular part of the updated matrix.
When uplo = 'L' or 'l', the lower triangular part of the array c is
overwritten by the lower triangular part of the updated matrix.
Application Notes
These routines only access and update the upper or lower triangular part of the result matrix. This can be
useful when the result is known to be symmetric; for example, when computing a product of the form C :=
alpha*B*S*B^T + beta*C, where S and C are symmetric matrices and B is a general matrix. In this case,
first compute A := B*S (which can be done using the corresponding ?symm routine), then compute C :=
alpha*A*B^T + beta*C using the ?gemmt routine.
?gemm3m
Computes a scalar-matrix-matrix product using matrix
multiplications and adds the result to a scalar-matrix
product.
Syntax
call cgemm3m(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zgemm3m(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call gemm3m(a, b, c [,transa][,transb] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?gemm3m routines perform a matrix-matrix operation with general complex matrices. These routines are
similar to the ?gemm routines, but they use fewer matrix multiplication operations (see Application Notes
below).
The operation is defined as
C := alpha*op(A)*op(B) + beta*C,
where:
op(x) is one of op(x) = x, or op(x) = x', or op(x) = conjg(x'),
alpha and beta are scalars,
A, B and C are matrices:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.
Input Parameters
m INTEGER. Specifies the number of rows of the matrix op(A) and of the
matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix op(B) and the
number of columns of the matrix C.
The value of n must be at least zero.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B).
c COMPLEX for cgemm3m
DOUBLE COMPLEX for zgemm3m
Array, size ldc by n. Before entry, the leading m-by-n part of the array c
must contain the matrix C, except when beta is equal to zero, in which
case c need not be set on entry.
Output Parameters
Application Notes
These routines perform a complex matrix multiplication by forming the real and imaginary parts of the input
matrices. This uses three real matrix multiplications and five real matrix additions instead of the conventional
four real matrix multiplications and two real matrix additions. The use of three real matrix multiplications
reduces the time spent in matrix operations by 25%, resulting in significant savings in compute time for
large matrices.
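If the errors in the floating point calculations satisfy the following conditions:
$fl(x \mathbin{\mathrm{op}} y) = (x \mathbin{\mathrm{op}} y)(1+\delta),\ |\delta| \le u,\ \mathrm{op} = \times, /$
$fl(x \pm y) = x(1+\alpha) \pm y(1+\beta),\ |\alpha|, |\beta| \le u$
then for an n-by-n matrix $\hat{C} = fl(C_1 + iC_2) = fl((A_1 + iA_2)(B_1 + iB_2)) = \hat{C}_1 + i\hat{C}_2$, the following bounds are
satisfied:
$\|\hat{C}_1 - C_1\| \le 2(n+1)\,u\,\|A\|_\infty \|B\|_\infty + O(u^2)$,
$\|\hat{C}_2 - C_2\| \le 4(n+4)\,u\,\|A\|_\infty \|B\|_\infty + O(u^2)$,
where $\|A\|_\infty = \max(\|A_1\|_\infty, \|A_2\|_\infty)$, and $\|B\|_\infty = \max(\|B_1\|_\infty, \|B_2\|_\infty)$.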
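One common way to arrange three real matrix multiplications and five real matrix additions is sketched below in standard Fortran. It illustrates the counting argument only; it is not taken from the library and makes no claim about how the routines are implemented. The sample matrices and all variable names are assumptions for this example.

```fortran
program gemm3m_sketch
  implicit none
  integer, parameter :: n = 2
  complex :: a(n, n), b(n, n), c_3m(n, n)
  real :: a1(n, n), a2(n, n), b1(n, n), b2(n, n)
  real :: t1(n, n), t2(n, n), t3(n, n)

  a = reshape([cmplx(1.0, 2.0), cmplx(5.0, 6.0), cmplx(3.0, 4.0), cmplx(7.0, 8.0)], [n, n])
  b = reshape([cmplx(1.0, -1.0), cmplx(0.0, 2.0), cmplx(2.0, 0.0), cmplx(-1.0, 1.0)], [n, n])

  ! Split the inputs into real and imaginary parts: A = A1 + i*A2, B = B1 + i*B2.
  a1 = real(a)
  a2 = aimag(a)
  b1 = real(b)
  b2 = aimag(b)

  ! Three real matrix multiplications (two of the five additions occur here).
  t1 = matmul(a1, b1)
  t2 = matmul(a2, b2)
  t3 = matmul(a1 + a2, b1 + b2)

  ! Three more real additions: C1 = T1 - T2 and C2 = T3 - T1 - T2.
  c_3m = cmplx(t1 - t2, t3 - t1 - t2)

  ! Compare with the conventional complex product.
  if (maxval(abs(c_3m - matmul(a, b))) > 1.0e-5) error stop "3M result differs"
  print *, c_3m
end program gemm3m_sketch
```

The imaginary part is formed by subtracting two products from a third, which is where the larger constant in the second error bound comes from.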
?gemm_batch
Computes scalar-matrix-matrix products and adds the
results to scalar matrix products for groups of general
matrices.
Syntax
call sgemm_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call dgemm_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call cgemm_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call zgemm_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call sgemm_batch(a_array, b_array, c_array, m_array, n_array, k_array, group_size
[,transa_array][,transb_array] [,alpha_array][,beta_array])
Include Files
mkl.fi, blas.f90
Description
The ?gemm_batch routines perform a series of matrix-matrix operations with general matrices. They are
similar to the ?gemm routine counterparts, but the ?gemm_batch routines perform matrix-matrix operations
with groups of matrices, processing a number of groups at once. The groups contain matrices with the same
parameters.
The operation is defined as
idx = 1
for i = 1..group_count
    alpha and beta in alpha_array(i) and beta_array(i)
    for j = 1..group_size(i)
        A, B, and C matrix in a_array(idx), b_array(idx), and c_array(idx)
        C := alpha*op(A)*op(B) + beta*C,
        idx = idx + 1
    end for
end for
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalar elements of alpha_array and beta_array,
A, B and C are matrices such that for m, n, and k which are elements of m_array, n_array, and k_array:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.
See also gemm for a detailed description of multiplication for general matrices and ?gemm3m_batch, BLAS-
like extension routines for similar matrix-matrix operations.
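The group and index bookkeeping in the definition above can be sketched in standard Fortran. The example does not call Intel MKL and covers op(X) = X only. The derived type matrix, the sample sizes, and the fill values are assumptions for this illustration; the actual routines take arrays of pointers rather than a derived type.

```fortran
program gemm_batch_sketch
  implicit none
  ! Hypothetical container standing in for one entry of the pointer arrays.
  type :: matrix
    real, allocatable :: v(:, :)
  end type matrix

  integer, parameter :: group_count = 2
  integer, parameter :: total_batch_count = 3          ! sum of group_size
  integer, parameter :: group_size(group_count) = [2, 1]
  integer, parameter :: m_array(group_count) = [2, 1]
  integer, parameter :: n_array(group_count) = [2, 3]
  integer, parameter :: k_array(group_count) = [2, 2]
  real, parameter :: alpha_array(group_count) = [1.0, 2.0]
  real, parameter :: beta_array(group_count) = [0.0, 1.0]
  type(matrix) :: a(total_batch_count), b(total_batch_count), c(total_batch_count)
  integer :: i, j, idx

  ! Set up the inputs: every matrix in group i shares the sizes of group i.
  idx = 1
  do i = 1, group_count
    do j = 1, group_size(i)
      allocate(a(idx)%v(m_array(i), k_array(i)))
      allocate(b(idx)%v(k_array(i), n_array(i)))
      allocate(c(idx)%v(m_array(i), n_array(i)))
      a(idx)%v = real(idx)
      b(idx)%v = 1.0
      c(idx)%v = 1.0
      idx = idx + 1
    end do
  end do

  ! The batched operation: idx runs over all matrices, i over the groups.
  idx = 1
  do i = 1, group_count
    do j = 1, group_size(i)
      c(idx)%v = alpha_array(i) * matmul(a(idx)%v, b(idx)%v) + beta_array(i) * c(idx)%v
      idx = idx + 1
    end do
  end do

  if (any(abs(c(1)%v - 2.0) > 1.0e-6)) error stop "unexpected result in matrix 1"
  if (any(abs(c(2)%v - 4.0) > 1.0e-6)) error stop "unexpected result in matrix 2"
  if (any(abs(c(3)%v - 13.0) > 1.0e-6)) error stop "unexpected result in matrix 3"
  print *, 'matrices processed:', idx - 1
end program gemm_batch_sketch
```

Scalars and dimensions are indexed by group (i), whereas the matrices themselves are indexed by the running counter idx, which is why the matrix arrays have total_batch_count elements and the parameter arrays have group_count elements.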
NOTE
Error checking is not performed for Intel MKL Windows* single dynamic libraries for the ?gemm_batch
routines.
Input Parameters
b_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store B
matrices.
ldb_array INTEGER.
Array of size group_count. For the group i, ldbi = ldb_array(i)
specifies the leading dimension of the array storing matrix B as declared in
the calling (sub)program.
When transbi = 'N' or 'n', then ldbi must be at least max(1, ki),
otherwise ldbi must be at least max(1, ni).
DOUBLE PRECISION for dgemm_batch
COMPLEX for cgemm_batch
DOUBLE COMPLEX for zgemm_batch
Array of size group_count. For the group i, beta_array(i) specifies the
scalar betai.
When betai is equal to zero, then C matrices in group i need not be set on
input.
c_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store C
matrices.
ldc_array INTEGER.
Array of size group_count. For the group i, ldci = ldc_array(i)
specifies the leading dimension of all arrays storing matrix C in group i as
declared in the calling (sub)program.
ldci must be at least max(1, mi).
group_count INTEGER.
Specifies the number of groups. Must be at least 0.
group_size INTEGER.
Array of size group_count. The element group_size(i) specifies the
number of matrices in group i. Each element in group_size must be at
least 0.
Output Parameters
kb = k otherwise,
mb = k if transb_array = 'N',
mb = n otherwise.
m_array Array indicating number of rows of matrices op(A) and C for each group.
n_array Array indicating number of columns of matrices op(B) and C for each
group.
k_array Array indicating number of columns of matrices op(A) and number of rows
of matrices op(B) for each group.
group_size Array indicating number of matrices for each group. Each element in
group_size must be at least 0.
transa_array Array with each element set to one of 'N', 'C', or 'T'.
transb_array Array with each element set to one of 'N', 'C', or 'T'.
?gemm3m_batch
Computes scalar-matrix-matrix products and adds the
results to scalar matrix products for groups of general
matrices.
Syntax
call cgemm3m_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call zgemm3m_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call cgemm3m_batch(a_array, b_array, c_array, m_array, n_array, k_array, group_size
[,transa_array][,transb_array] [,alpha_array][,beta_array])
Include Files
mkl.fi, blas.f90
Description
The ?gemm3m_batch routines perform a series of matrix-matrix operations with general matrices. They are
similar to the ?gemm3m routine counterparts, but the ?gemm3m_batch routines perform matrix-matrix
operations with groups of matrices, processing a number of groups at once. The groups contain matrices with
the same parameters. The ?gemm3m_batch routines use fewer matrix multiplications than the ?gemm_batch
routines, as described in the Application Notes.
The operation is defined as
idx = 1
for i = 1..group_count
    alpha and beta in alpha_array(i) and beta_array(i)
    for j = 1..group_size(i)
        A, B, and C matrix in a_array(idx), b_array(idx), and c_array(idx)
        C := alpha*op(A)*op(B) + beta*C,
        idx = idx + 1
    end for
end for
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalar elements of alpha_array and beta_array,
A, B and C are matrices such that for m, n, and k which are elements of m_array, n_array, and k_array:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.
See also gemm for a detailed description of multiplication for general matrices and gemm_batch, BLAS-like
extension routines for similar matrix-matrix operations.
NOTE
Error checking is not performed for Intel MKL Windows* single dynamic libraries for the
?gemm3m_batch routines.
Input Parameters
b_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store B
matrices.
ldb_array INTEGER.
Array of size group_count. For the group i, ldbi = ldb_array(i)
specifies the leading dimension of the array storing matrix B as declared in
the calling (sub)program.
When transbi = 'N' or 'n', then ldbi must be at least max(1, ki),
otherwise ldbi must be at least max(1, ni).
When betai is equal to zero, then C matrices in group i need not be set on
input.
c_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store C
matrices.
ldc_array INTEGER.
Array of size group_count. For the group i, ldci = ldc_array(i)
specifies the leading dimension of all arrays storing matrix C in group i as
declared in the calling (sub)program.
ldci must be at least max(1, mi).
group_count INTEGER.
Specifies the number of groups. Must be at least 0.
group_size INTEGER.
Array of size group_count. The element group_size(i) specifies the
number of matrices in group i. Each element in group_size must be at
least 0.
Output Parameters
m_array Array indicating number of rows of matrices op(A) and C for each group.
n_array Array indicating number of columns of matrices op(B) and C for each
group.
k_array Array indicating number of columns of matrices op(A) and number of rows
of matrices op(B) for each group.
group_size Array indicating number of matrices for each group. Each element in
group_size must be at least 0.
transa_array Array with each element set to one of 'N', 'C', or 'T'.
transb_array Array with each element set to one of 'N', 'C', or 'T'.
Application Notes
These routines perform a complex matrix multiplication by forming the real and imaginary parts of the input
matrices. This uses three real matrix multiplications and five real matrix additions instead of the conventional
four real matrix multiplications and two real matrix additions. The use of three real matrix multiplications
reduces the time spent in matrix operations by 25%, resulting in significant savings in compute time for
large matrices.
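If the errors in the floating point calculations satisfy the following conditions:
$fl(x \mathbin{\mathrm{op}} y) = (x \mathbin{\mathrm{op}} y)(1+\delta),\ |\delta| \le u,\ \mathrm{op} = \times, /$
$fl(x \pm y) = x(1+\alpha) \pm y(1+\beta),\ |\alpha|, |\beta| \le u$
then for an n-by-n matrix $\hat{C} = fl(C_1 + iC_2) = fl((A_1 + iA_2)(B_1 + iB_2)) = \hat{C}_1 + i\hat{C}_2$, the following bounds are
satisfied:
$\|\hat{C}_1 - C_1\| \le 2(n+1)\,u\,\|A\|_\infty \|B\|_\infty + O(u^2)$,
$\|\hat{C}_2 - C_2\| \le 4(n+4)\,u\,\|A\|_\infty \|B\|_\infty + O(u^2)$,
where $\|A\|_\infty = \max(\|A_1\|_\infty, \|A_2\|_\infty)$, and $\|B\|_\infty = \max(\|B_1\|_\infty, \|B_2\|_\infty)$.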
mkl_?imatcopy
Performs scaling and in-place transposition/copying of
matrices.
Syntax
call mkl_simatcopy(ordering, trans, rows, cols, alpha, ab, lda, ldb)
call mkl_dimatcopy(ordering, trans, rows, cols, alpha, ab, lda, ldb)
call mkl_cimatcopy(ordering, trans, rows, cols, alpha, ab, lda, ldb)
call mkl_zimatcopy(ordering, trans, rows, cols, alpha, ab, lda, ldb)
Include Files
mkl.fi
Description
The mkl_?imatcopy routine performs scaling and in-place transposition/copying of matrices. A transposition
operation can be a normal matrix copy, a transposition, a conjugate transposition, or just a conjugation. The
operation is defined as follows:
AB := alpha*op(AB).
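To make the roles of lda and ldb concrete, the sketch below reproduces the result of AB := alpha*op(AB) for one case only: column-major ordering with a plain transpose. It is standard Fortran and does not call Intel MKL. It goes through a temporary copy for clarity, so it shows the expected contents of ab after the operation rather than an in-place algorithm. The sample sizes and values are assumptions for this example.

```fortran
program imatcopy_sketch
  implicit none
  integer, parameter :: rows = 2, cols = 3
  integer, parameter :: lda = 2   ! source leading dimension, at least rows
  integer, parameter :: ldb = 3   ! destination leading dimension, at least cols
  real :: ab(6), tmp(rows, cols), alpha
  integer :: i, j

  alpha = 2.0
  ! Source: the 2-by-3 matrix [1 3 5; 2 4 6], stored column by column.
  ab = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

  ! Copy the source out first so that later writes cannot clobber unread elements.
  do j = 1, cols
    do i = 1, rows
      tmp(i, j) = ab(i + (j - 1) * lda)
    end do
  end do

  ! Write alpha times the transpose back: the result is cols-by-rows with stride ldb.
  do j = 1, rows
    do i = 1, cols
      ab(i + (j - 1) * ldb) = alpha * tmp(j, i)
    end do
  end do

  ! Result: the 3-by-2 matrix [2 4; 6 8; 10 12], stored column by column.
  if (any(abs(ab - [2.0, 6.0, 10.0, 4.0, 8.0, 12.0]) > 1.0e-6)) error stop "unexpected result"
  print '(6f7.1)', ab
end program imatcopy_sketch
```

Because the source and destination share storage, the array in this sketch needs room for both lda*cols and ldb*rows elements.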
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
NOTE
Different arrays must not overlap.
Input Parameters
If the data is real, then trans = 'R' is the same as trans = 'N', and
trans = 'C' is the same as trans = 'T'.
rows INTEGER. The number of rows in matrix AB before the transpose operation.
lda INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the source matrix; measured in the number of elements.
This parameter must be at least rows if ordering = 'C' or 'c', and
max(1,cols) otherwise.
ldb INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the destination matrix; measured in the number of elements.
To determine the minimum value of ldb on output, consider the following
guideline:
If ordering = 'C' or 'c', then
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_simatcopy ( ordering, trans, rows, cols, alpha, ab, lda, ldb )
    CHARACTER*1 ordering, trans
    INTEGER rows, cols, lda, ldb
    REAL ab(*), alpha
SUBROUTINE mkl_dimatcopy ( ordering, trans, rows, cols, alpha, ab, lda, ldb )
    CHARACTER*1 ordering, trans
    INTEGER rows, cols, lda, ldb
    DOUBLE PRECISION ab(*), alpha
SUBROUTINE mkl_cimatcopy ( ordering, trans, rows, cols, alpha, ab, lda, ldb )
    CHARACTER*1 ordering, trans
    INTEGER rows, cols, lda, ldb
    COMPLEX ab(*), alpha
SUBROUTINE mkl_zimatcopy ( ordering, trans, rows, cols, alpha, ab, lda, ldb )
    CHARACTER*1 ordering, trans
    INTEGER rows, cols, lda, ldb
    DOUBLE COMPLEX ab(*), alpha
mkl_?omatcopy
Performs scaling and out-of-place transposition/copying
of matrices.
Syntax
call mkl_somatcopy(ordering, trans, rows, cols, alpha, a, lda, b, ldb)
call mkl_domatcopy(ordering, trans, rows, cols, alpha, a, lda, b, ldb)
call mkl_comatcopy(ordering, trans, rows, cols, alpha, a, lda, b, ldb)
call mkl_zomatcopy(ordering, trans, rows, cols, alpha, a, lda, b, ldb)
Include Files
mkl.fi
Description
NOTE
Different arrays must not overlap.
Input Parameters
If the data is real, then trans = 'R' is the same as trans = 'N', and
trans = 'C' is the same as trans = 'T'.
If trans = 'T' or 't' or 'C' or 'c', ldb must be at least equal to
cols.
If trans = 'N' or 'n' or 'R' or 'r', ldb must be at least equal to
rows.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_somatcopy ( ordering, trans, rows, cols, alpha, a, lda, b, ldb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, ldb
REAL alpha, b(ldb,*), a(lda,*)
mkl_?omatcopy2
Performs two-strided scaling and out-of-place
transposition/copying of matrices.
Syntax
call mkl_somatcopy2(ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb,
strideb)
call mkl_domatcopy2(ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb,
strideb)
call mkl_comatcopy2(ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb,
strideb)
call mkl_zomatcopy2(ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb,
strideb)
Include Files
mkl.fi
Description
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
NOTE
Different arrays must not overlap.
Input Parameters
If the data is real, then trans = 'R' is the same as trans = 'N', and
trans = 'C' is the same as trans = 'T'.
This parameter scales the input matrix by alpha.
lda INTEGER.
If ordering = 'R' or 'r', lda represents the number of elements in array
a between adjacent rows of matrix A; lda must be at least equal to the
number of columns of matrix A.
If ordering = 'C' or 'c', lda represents the number of elements in array
a between adjacent columns of matrix A; lda must be at least 1 and not
more than the number of columns in matrix A.
stridea INTEGER.
If ordering = 'R' or 'r', stridea represents the number of elements in
array a between adjacent columns of matrix A. stridea must be at least 1
and not more than the number of columns in matrix A.
If ordering = 'C' or 'c', stridea represents the number of elements in
array a between adjacent rows of matrix A. stridea must be at least equal
to the number of columns in matrix A.
ldb INTEGER.
If ordering = 'R' or 'r', ldb represents the number of elements in array
b between adjacent rows of matrix B.
If trans = 'T' or 't' or 'C' or 'c', ldb must be at least 1 and not
more than rows/strideb.
If trans = 'N' or 'n' or 'R' or 'r', ldb must be at least 1 and not
more than cols/strideb.
strideb INTEGER.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_somatcopy2 ( ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb, strideb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, stridea, ldb, strideb
REAL alpha, b(*), a(*)
SUBROUTINE mkl_domatcopy2 ( ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb, strideb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, stridea, ldb, strideb
DOUBLE PRECISION alpha, b(*), a(*)
SUBROUTINE mkl_comatcopy2 ( ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb, strideb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, stridea, ldb, strideb
COMPLEX alpha, b(*), a(*)
SUBROUTINE mkl_zomatcopy2 ( ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb, strideb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, stridea, ldb, strideb
DOUBLE COMPLEX alpha, b(*), a(*)
mkl_?omatadd
Scales and sums two matrices, in addition to
performing out-of-place transposition operations.
Syntax
call mkl_somatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc)
call mkl_domatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc)
call mkl_comatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc)
call mkl_zomatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc)
Include Files
mkl.fi
Description
The mkl_?omatadd routine scales and adds two matrices, as well as performing out-of-place transposition
operations. A transposition operation can be no operation, a transposition, a conjugate transposition, or a
conjugation (without transposition). The following out-of-place memory movement is done:
C := alpha*op(A) + beta*op(B)
where the op(A) and op(B) operations are transpose, conjugate-transpose, conjugate (no transpose), or no
transpose, depending on the values of transa and transb. If no transposition of the source matrices is
required, m is the number of rows and n is the number of columns in the source matrices A and B. In this
case, the output matrix C is m-by-n.
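A small reference computation can illustrate C := alpha*op(A) + beta*op(B) when one operand is transposed. The sketch below is standard Fortran and does not call Intel MKL. It assumes column-major storage, transa = 'N' and transb = 'T', and it assumes that B is then stored as an n-by-m matrix so that op(B) is m-by-n. The sample values are chosen for this example only.

```fortran
program omatadd_sketch
  implicit none
  integer, parameter :: m = 2, n = 3
  real :: a(m, n), b(n, m), c(m, n), expected(m, n)
  real :: alpha, beta

  alpha = 1.0
  beta = 10.0
  a = reshape([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [m, n])   ! A = [1 3 5; 2 4 6]
  b = reshape([1.0, 1.0, 1.0, 2.0, 2.0, 2.0], [n, m])   ! B^T = [1 1 1; 2 2 2]

  ! op(A) = A and op(B) = B^T; the result is written to a separate array C.
  c = alpha * a + beta * transpose(b)

  ! C = [11 13 15; 22 24 26], stored column by column.
  expected = reshape([11.0, 22.0, 13.0, 24.0, 15.0, 26.0], [m, n])
  if (any(abs(c - expected) > 1.0e-6)) error stop "unexpected result"
  print '(3f8.2)', transpose(c)   ! one row of C per output line
end program omatadd_sketch
```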
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
NOTE
Note that different arrays must not overlap.
Input Parameters
lda INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the source matrix A; measured in the number of elements.
For ordering = 'C' or 'c': when transa = 'N', 'n', 'R', or 'r', lda
must be at least max(1,m); otherwise lda must be at least max(1,n).
For ordering = 'R' or 'r': when transa = 'N', 'n', 'R', or 'r', lda
must be at least max(1,n); otherwise lda must be at least max(1,m).
ldb INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the source matrix B; measured in the number of elements.
For ordering = 'C' or 'c': when transb = 'N', 'n', 'R', or 'r', ldb
must be at least max(1,m); otherwise ldb must be at least max(1,n).
For ordering = 'R' or 'r': when transb = 'N', 'n', 'R', or 'r', ldb
must be at least max(1,n); otherwise ldb must be at least max(1,m).
ldc INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the destination matrix C; measured in the number of elements.
If ordering = 'C' or 'c', then ldc must be at least max(1, m),
otherwise ldc must be at least max(1, n).
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_somatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc )
CHARACTER*1 ordering, transa, transb
INTEGER m, n, lda, ldb, ldc
REAL alpha, beta
REAL a(lda,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_domatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc )
CHARACTER*1 ordering, transa, transb
INTEGER m, n, lda, ldb, ldc
DOUBLE PRECISION alpha, beta
DOUBLE PRECISION a(lda,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_comatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc )
CHARACTER*1 ordering, transa, transb
INTEGER m, n, lda, ldb, ldc
COMPLEX alpha, beta
COMPLEX a(lda,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_zomatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc )
CHARACTER*1 ordering, transa, transb
INTEGER m, n, lda, ldb, ldc
DOUBLE COMPLEX alpha, beta
DOUBLE COMPLEX a(lda,*), b(ldb,*), c(ldc,*)
?gemm_alloc
Allocates storage for a packed matrix.
Syntax
dest = sgemm_alloc (identifier, m, n, k)
dest = dgemm_alloc (identifier, m, n, k)
Include Files
mkl.fi
Description
The ?gemm_alloc routine is one of a set of related routines that enable use of an internal packed storage.
Call the ?gemm_alloc routine first to allocate storage for a packed matrix structure to be used in subsequent
calls, ultimately to compute
C := alpha*op(A)*op(B) + beta*C,
where:
Input Parameters
m INTEGER. Specifies the number of rows of matrix op(A) and of the matrix C.
The value of m must be at least zero.
n INTEGER. Specifies the number of columns of matrix op(B) and the number
of columns of matrix C. The value of n must be at least zero.
k INTEGER. Specifies the number of columns of matrix op(A) and the number
of rows of matrix op(B). The value of k must be at least zero.
Output Parameters
dest POINTER.
Pointer to allocated storage.
See Also
?gemm_pack Performs scaling and packing of the matrix into the previously allocated buffer.
?gemm_compute Computes a matrix-matrix product with general matrices where one or both
input matrices are stored in a packed data structure and adds the result to a scalar-matrix
product.
?gemm_free Frees the storage previously allocated for the packed matrix.
?gemm for a detailed description of general matrix multiplication.
?gemm_pack
Performs scaling and packing of the matrix into the
previously allocated buffer.
Syntax
call sgemm_pack (identifier, trans, m, n, k, alpha, src, ld, dest)
call dgemm_pack (identifier, trans, m, n, k, alpha, src, ld, dest)
Include Files
mkl.fi
Description
The ?gemm_pack routine is one of a set of related routines that enable use of an internal packed storage.
Call ?gemm_pack after successfully calling ?gemm_alloc. The ?gemm_pack routine scales the identified matrix by
alpha and packs it into the buffer allocated previously with ?gemm_alloc.
The ?gemm_pack routine performs this operation:
NOTE
For best performance, use the same number of threads for packing and for computing.
If packing for both A and B matrices, you must use the same number of threads for packing A as for
packing B.
Input Parameters
m INTEGER. Specifies the number of rows of the matrix op(A) and of the
matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix op(B) and the
number of columns of the matrix C. The value of n must be at least zero.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B). The value of k must be at least zero.
dest POINTER.
Scaled and packed internal storage buffer.
Output Parameters
See Also
?gemm_alloc Allocates storage for a packed matrix.
?gemm_compute Computes a matrix-matrix product with general matrices where one or both
input matrices are stored in a packed data structure and adds the result to a scalar-matrix
product.
?gemm_free Frees the storage previously allocated for the packed matrix.
?gemm for a detailed description of general matrix multiplication.
?gemm_compute
Computes a matrix-matrix product with general
matrices where one or both input matrices are stored
in a packed data structure and adds the result to a
scalar-matrix product.
Syntax
call sgemm_compute (transa, transb, m, n, k, a, lda, b, ldb, beta, C, ldc)
call dgemm_compute (transa, transb, m, n, k, a, lda, b, ldb, beta, C, ldc)
Include Files
mkl.fi
Description
The ?gemm_compute routine is one of a set of related routines that enable use of an internal packed storage.
After calling ?gemm_pack, call ?gemm_compute to compute
C := op(A)*op(B) + beta*C,
where:
NOTE
For best performance, use the same number of threads for packing and for computing.
If packing for both A and B matrices, you must use the same number of threads for packing A as for
packing B.
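The division of work between packing and computing can be modeled in a few lines of Python. This is only an illustrative sketch: the helper names pack_scaled and compute_with_packed are hypothetical, no transposition is applied, the "packed" matrix is just a scaled copy, and nothing here calls Intel MKL. It shows that alpha is applied once at pack time, which is consistent with ?gemm_compute taking beta but no alpha.

```python
def matmul(a, b):
    """Plain triple-loop product of two matrices stored as lists of rows."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def pack_scaled(alpha, src):
    """Model of the packing step: keep alpha*src for later reuse."""
    return [[alpha * v for v in row] for row in src]


def compute_with_packed(packed_a, b, beta, c):
    """Model of the compute step: C := packed_a*B + beta*C (no alpha here)."""
    prod = matmul(packed_a, b)
    return [[prod[i][j] + beta * c[i][j] for j in range(len(c[0]))]
            for i in range(len(c))]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[1.0, 0.0], [0.0, 1.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
packed = pack_scaled(2.0, a)           # pack once
result = compute_with_packed(packed, b, 3.0, c)
print(result)
```

Packing pays off when the same packed matrix is reused across several compute calls, since the scaling and data rearrangement are done only once.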
Input Parameters
If transa = 'P' or 'p' the matrix in array a is packed and lda is ignored.
If transb = 'P' or 'p' the matrix in array b is packed and ldb is ignored.
m INTEGER. Specifies the number of rows of the matrix op(A) and of the
matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix op(B) and the
number of columns of the matrix C. The value of n must be at least zero.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B). The value of k must be at least zero.
If transa = 'T', 't', 'C', or 'c', lda must be at least max (1, k).
transb = 'N' or 'n' | transb = 'T', 't', 'C', or 'c' | transb = 'P' or 'p'
If transb = 'T', 't', 'C', or 'c', ldb must be at least max (1, n).
Output Parameters
See Also
?gemm_alloc Allocates storage for a packed matrix.
?gemm_pack Performs scaling and packing of the matrix into the previously allocated buffer.
?gemm_free Frees the storage previously allocated for the packed matrix.
?gemm for a detailed description of general matrix multiplication.
?gemm_free
Frees the storage previously allocated for the packed
matrix.
Syntax
call sgemm_free (dest)
call dgemm_free (dest)
Include Files
mkl.fi
Description
The ?gemm_free routine is one of a set of related routines that enable use of an internal packed storage. Call
the ?gemm_free routine last to release storage for the packed matrix structure allocated with ?gemm_alloc.
Input Parameters
dest POINTER.
Previously allocated storage.
Output Parameters
See Also
?gemm_alloc Allocates storage for a packed matrix.
?gemm_pack Performs scaling and packing of the matrix into the previously allocated buffer.
?gemm_compute Computes a matrix-matrix product with general matrices where one or both
input matrices are stored in a packed data structure and adds the result to a scalar-matrix
product.
?gemm for a detailed description of general matrix multiplication.
LAPACK Routines 3
This chapter describes the Intel Math Kernel Library implementation of routines from the LAPACK package
that are used for solving systems of linear equations, linear least squares problems, eigenvalue and singular
value problems, and performing a number of related computational tasks. The library includes LAPACK
routines for both real and complex data. Routines are supported for systems of equations with the following
types of matrices:
general
banded
symmetric or Hermitian positive-definite (full, packed, and rectangular full packed (RFP) storage)
symmetric or Hermitian positive-definite banded
symmetric or Hermitian indefinite (both full and packed storage)
symmetric or Hermitian indefinite banded
triangular (full, packed, and RFP storage)
triangular banded
tridiagonal
diagonally dominant tridiagonal.
NOTE
Different arrays used as parameters to Intel MKL LAPACK routines must not overlap.
WARNING
LAPACK routines assume that input matrices do not contain IEEE 754 special values such as INF or
NaN values. Using these special values may cause LAPACK to return unexpected results or become
unstable.
Intel MKL supports the Fortran 95 interface, which uses simplified routine calls with shorter argument lists, in
addition to the FORTRAN 77 interface to LAPACK computational and driver routines. The syntax section of the
routine description gives the calling sequence for the Fortran 95 interface, where available, immediately after
the FORTRAN 77 calls.
The names of the Fortran 95 interfaces to the LAPACK computational and driver routines are the same as the FORTRAN 77
names but without the first letter that indicates the data type. For example, the name of the routine that
performs a triangular factorization of general real matrices in Fortran 95 is getrf. Different data types are
handled through the definition of a specific internal parameter that refers to a module block with named
constants for single and double precision.
NOTE
For LAPACK, Intel MKL offers two types of the Fortran 95 interfaces:
using mkl_lapack.fi only through the include 'mkl_lapack.fi' statement. Such interfaces
allow you to make use of the original LAPACK routines with all their arguments
using lapack.f90 that includes improved interfaces. This file is used to generate the module files
lapack95.mod and f95_precision.mod. See also the section "Fortran 95 interfaces and wrappers
to LAPACK and BLAS" of the Intel MKL Developer Guide for details. The module files are used to
process the FORTRAN use clauses referencing the LAPACK interface: use lapack95 and use
f95_precision.
NOTE
Internally, workspace arrays are allocated by the Fortran 95 interface wrapper, and are of optimal size
for the best performance of the routine.
An argument can also be skipped if its value is completely defined by the presence or absence of another
argument in the calling sequence, and the restored value is the only meaningful value for the skipped
argument.
Some generic arguments are declared as optional in the Fortran 95 interface and may or may not be
present in the calling sequence. An argument can be declared optional if it meets one of the following
conditions:
If an argument value is completely defined by the presence or absence of another argument in the
calling sequence, it can be declared optional. The difference from the skipped argument in this case is
that the optional argument can have some meaningful values that are distinct from the value
reconstructed by default. For example, if some argument (like jobz) can take only two values and one
of these values directly implies the use of another argument, then the value of jobz can be uniquely
reconstructed from the actual presence or absence of this second argument, and jobz can be omitted.
If an input argument can take only a few possible values, it can be declared as optional. The default
value of such argument is typically set as the first value in the list and all exceptions to this rule are
explicitly stated in the routine description.
If an input argument has a natural default value, it can be declared as optional. The default value of
such optional argument is set to its natural default value.
Argument info is declared as optional in the Fortran 95 interface. If it is present in the calling sequence,
the value assigned to info is interpreted as follows:
If this value is more than -1000, its meaning is the same as in the FORTRAN 77 routine.
If this value is equal to -1000, it means that there is not enough work memory.
If this value is equal to -1001, incompatible arguments are present in the calling sequence.
If this value is equal to -i, the ith parameter (counting parameters in the FORTRAN 77 interface, not
the Fortran 95 interface) had an illegal value.
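The rules for interpreting info can be summarized as a small decision function. The sketch below is written in Python purely for illustration; the function name describe_info is hypothetical, and the treatment of info = 0 as success and of positive values as routine-specific follows the usual LAPACK convention rather than anything stated in this list.

```python
def describe_info(info):
    """Classify an info value returned through the Fortran 95 interface."""
    if info == -1000:
        return "not enough work memory"
    if info == -1001:
        return "incompatible arguments in the calling sequence"
    if info < -1001:
        return "unrecognized value"
    if info < 0:
        # The position is counted in the FORTRAN 77 argument list,
        # not in the shorter Fortran 95 one.
        return f"parameter {-info} had an illegal value"
    if info == 0:
        return "successful exit"
    return "routine-specific condition (see the routine description)"


for value in (0, -3, -1000, -1001, 2):
    print(value, "->", describe_info(value))
```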
Optional arguments are given in square brackets in the Fortran 95 call syntax.
The "Fortran 95 Notes" subsection at the end of the topic describing each routine details concrete rules for
reconstructing the values of the omitted optional parameters.
Intel MKL Fortran 95 Interfaces for LAPACK Routines vs. Netlib Implementation
The following list presents general differences between the Intel MKL LAPACK95 implementation and its Netlib
analog:
The Intel MKL Fortran 95 interfaces are provided for pure procedures.
Names of interfaces do not contain the LA_ prefix.
An optional array argument always has the target attribute.
Functionality of the Intel MKL LAPACK95 wrapper is close to the FORTRAN 77 original implementation in
the getrf, gbtrf, and potrf interfaces.
If jobz argument value specifies presence or absence of z argument, then z is always declared as
optional and jobz is restored depending on whether z is present or not.
To avoid double error checking, processing of the info argument is limited to checking the allocated
memory and detecting incompatible optional arguments.
If an argument that is present in the list of arguments completely defines another argument, the latter is
always declared as optional.
You can transform an application that uses the Netlib LAPACK interfaces so that it works with the Intel MKL
interfaces, provided that:
a. The application is correct, that is, unambiguous, compiler-independent, and contains no errors.
b. Each routine name denotes only one specific routine. If any routine name in the application coincides
with a name of the original Netlib routine (for example, after removing the LA_ prefix) but denotes a
routine different from the Netlib original routine, this name should be modified through context name
replacement.
You should transform your application in the following cases:
When using the Netlib routines that differ from the Intel MKL routines only by the LA_ prefix or in the
array attribute target. The only transformation required in this case is context name replacement.
When using Netlib routines that differ from the Intel MKL routines by the LA_ prefix, the target array
attribute, and the names of formal arguments. In the case of positional passing of arguments, no
additional transformation except context name replacement is required. In the case of the keywords
passing of arguments, in addition to the context name replacement the names of mismatching keywords
should also be modified.
When using the Netlib routines that differ from the respective Intel MKL routines by the LA_ prefix, the
target array attribute, sequence of the arguments, arguments missing in Intel MKL but present in Netlib
and, vice versa, present in Intel MKL but missing in Netlib. For the Netlib routines described in this bullet
and the preceding one, remove the differences in the sequence and range of the arguments as part of the
transformation.
When using the getrf, gbtrf, and potrf interfaces, that is, new functionality implemented in Intel MKL
but unavailable in the Netlib source. To override the differences, build the desired functionality explicitly
with the Intel MKL means or create a new subroutine with the new functionality, using specific MKL
interfaces corresponding to LAPACK 77 routines. You can call the LAPACK 77 routines directly but using
the new Intel MKL interfaces is preferable. Note that if the transformed application calls getrf, gbtrf or
potrf without controlling arguments rcond and norm, just context name replacement is enough in
modifying the calls into the Intel MKL interfaces, as described in the first bullet above. The Netlib
functionality is preserved in such cases.
When using the Netlib auxiliary routines. In this case, call a corresponding subroutine directly, using the
Intel MKL LAPACK 77 interfaces.
Transform your application as follows:
Full storage: an m-by-n matrix A is stored in a two-dimensional array a, with the matrix element a_ij (i =
1..m, j = 1..n) stored in the array element a(i,j).
Packed storage scheme allows you to store symmetric, Hermitian, or triangular matrices more
compactly: the upper or lower triangle of the matrix is packed by columns in a one-dimensional array.
Band storage: an m-by-n band matrix with kl sub-diagonals and ku superdiagonals is stored compactly in
a two-dimensional array ab with kl+ku+1 rows and n columns. Columns of the matrix are stored in the
corresponding columns of the array, and diagonals of the matrix are stored in rows of the array.
Rectangular Full Packed (RFP) storage: the upper or lower triangle of the matrix is packed combining
the full and packed storage schemes. This combination uses half the memory of full storage, as packed
storage does, while maintaining efficiency by using Level 3 BLAS/LAPACK kernels, as full storage does.
Generally in LAPACK routines, arrays that hold matrices in packed storage have names ending in p; arrays
with matrices in band storage have names ending in b; arrays with matrices in the RFP storage have names
ending in fp.
For more information on matrix storage schemes, see "Matrix Arguments" in Appendix B.
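To make "packed by columns" concrete, here is a minimal Python sketch for the upper triangle. It assumes the conventional LAPACK mapping, in which element (i, j) with 1 <= i <= j lands at position i + (j-1)*j/2 of the one-dimensional array; the helper names pack_upper and packed_upper_index are hypothetical.

```python
def pack_upper(a):
    """Pack the upper triangle of a square matrix by columns."""
    n = len(a)
    ap = []
    for j in range(n):              # column by column
        for i in range(j + 1):      # rows 0..j of that column
            ap.append(a[i][j])
    return ap


def packed_upper_index(i, j):
    """1-based position in ap of element (i, j), with 1 <= i <= j."""
    if not 1 <= i <= j:
        raise ValueError("requires 1 <= i <= j")
    return i + (j - 1) * j // 2


a = [[1, 2, 4],
     [2, 3, 5],
     [4, 5, 6]]
ap = pack_upper(a)
print(ap)                                  # n*(n+1)/2 = 6 elements
print(ap[packed_upper_index(2, 3) - 1])    # element a(2,3) in 1-based terms
```

Only n*(n+1)/2 elements are stored instead of n*n, which is the saving the packed scheme is designed for.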
|A| the matrix with elements |a_ij| (absolute values of a_ij).
$\sigma_i$ Singular values of the matrix A. They are equal to square roots of the
eigenvalues of $A^H A$. (For more information, see Singular Value
Decomposition).
Error Analysis
In practice, most computations are performed with rounding errors. Besides, you often need to solve a
system Ax = b, where the data (the elements of A and b) are not known exactly. Therefore, it is important
to understand how the data errors and rounding errors can affect the solution x.
Data perturbations. If x is the exact solution of Ax = b, and $x + \delta x$ is the exact solution of a perturbed
problem $(A + \delta A)(x + \delta x) = (b + \delta b)$, then this estimate, given up to linear terms of perturbations,
holds:
$$\frac{\|\delta x\|}{\|x\|} \le \kappa(A)\left(\frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|}\right)$$
In other words, relative errors in A or b may be amplified in the solution vector x by a factor
$\kappa(A) = \|A\|\,\|A^{-1}\|$ called the condition number of A.
Rounding errors have the same effect as relative perturbations $c(n)\varepsilon$ in the original data. Here $\varepsilon$ is the
machine precision, defined as the smallest positive number x such that 1 + x > 1; and c(n) is a modest
function of the matrix order n. The corresponding solution error is
$\|\delta x\|/\|x\| \le c(n)\,\kappa(A)\,\varepsilon$. (The value of c(n) is seldom greater than 10n.)
NOTE
Machine precision depends on the data type used.
Thus, if your matrix A is ill-conditioned (that is, its condition number $\kappa(A)$ is very large), then the error in
the solution x can also be large; you might even encounter a complete loss of precision. LAPACK provides
routines that allow you to estimate $\kappa(A)$ (see Routines for Estimating the Condition Number) and also give
you a more precise estimate for the actual solution error (see Refining the Solution and Estimating Its Error).
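A small numerical experiment shows how an ill-conditioned matrix amplifies data errors. The Python sketch below is an illustration only: it uses a hand-picked 2-by-2 matrix, the infinity norm, and an explicit inverse, none of which reflects how the LAPACK condition estimators work internally. All helper names are hypothetical.

```python
def inf_norm(mat):
    """Infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in mat)


def inverse_2x2(mat):
    (a, b), (c, d) = mat
    det = a * d - b * c
    if det == 0.0:
        raise ZeroDivisionError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def condition_number(mat):
    """Condition number ||A|| * ||A^-1|| in the infinity norm."""
    return inf_norm(mat) * inf_norm(inverse_2x2(mat))


def solve_2x2(mat, rhs):
    inv = inverse_2x2(mat)
    return [sum(inv[i][j] * rhs[j] for j in range(2)) for i in range(2)]


def rel_change(new, old):
    """Relative change of a vector, measured in the infinity norm."""
    return max(abs(p - q) for p, q in zip(new, old)) / max(abs(q) for q in old)


A = [[1.0, 1.0], [1.0, 1.0001]]       # nearly singular
b = [2.0, 2.0001]                     # exact solution is close to [1, 1]
b_perturbed = [2.0, 2.0002]           # tiny change in the data

x = solve_2x2(A, b)
x_perturbed = solve_2x2(A, b_perturbed)
kappa = condition_number(A)
rel_b = rel_change(b_perturbed, b)
rel_x = rel_change(x_perturbed, x)
print("condition number:", kappa)
print("relative change in b:", rel_b)
print("relative change in x:", rel_x)
```

A relative change of roughly 5e-5 in b moves the solution by about 100 percent, which stays within the bound given by the condition number times the relative data error.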
To solve a particular problem, you can call two or more computational routines or call a corresponding driver
routine that combines several tasks in one call. For example, to solve a system of linear equations with a
general matrix, call ?getrf (LU factorization) and then ?getrs (computing the solution). Then, call ?gerfs
to refine the solution and get the error bounds. Alternatively, use the driver routine ?gesvx that performs all
these tasks in one call.
Matrix type, storage scheme | Factorize matrix | Equilibrate matrix | Solve system | Condition number | Estimate error | Invert matrix
In the table above, ? denotes s (single precision) or d (double precision) for the FORTRAN 77 interface.
Computational Routines for Systems of Equations with Complex Matrices
Matrix type, storage scheme | Factorize matrix | Equilibrate matrix | Solve system | Condition number | Estimate error | Invert matrix
In the table above, ? stands for c (single precision complex) or z (double precision complex) for the FORTRAN 77
interface.
LU factorization
Cholesky factorization of real symmetric positive-definite matrices
Cholesky factorization of real symmetric positive-definite matrices with pivoting
Cholesky factorization of Hermitian positive-definite matrices
Cholesky factorization of Hermitian positive-definite matrices with pivoting
Bunch-Kaufman factorization of real and complex symmetric matrices
Bunch-Kaufman factorization of Hermitian matrices.
?getrf
Computes the LU factorization of a general m-by-n
matrix.
Syntax
call sgetrf( m, n, a, lda, ipiv, info )
call dgetrf( m, n, a, lda, ipiv, info )
call cgetrf( m, n, a, lda, ipiv, info )
call zgetrf( m, n, a, lda, ipiv, info )
call getrf( a [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the LU factorization of a general m-by-n matrix A as
A = P*L*U,
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m >
n) and U is upper triangular (upper trapezoidal if m < n). The routine uses partial pivoting, with row
interchanges.
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
Output Parameters
ipiv INTEGER.
Array, size at least max(1,min(m, n)). The pivot indices; for 1 ≤ i ≤
min(m, n), row i was interchanged with row ipiv(i).
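The meaning of ipiv is easiest to see on a small example. The following Python sketch is a textbook unblocked LU factorization with partial pivoting, not the blocked algorithm Intel MKL uses; the function names are hypothetical. It stores L and U in one array and records 1-based pivot indices in the same sense as described for ipiv.

```python
def lu_partial_pivoting(a):
    """Factor a copy of `a` (a list of rows) with partial pivoting.

    Returns (lu, ipiv, info). The strict lower triangle of lu holds L
    (its unit diagonal is implied), the upper triangle holds U. ipiv is
    1-based: row i was interchanged with row ipiv[i-1]. info > 0 reports
    the first exactly zero diagonal element of U.
    """
    lu = [list(map(float, row)) for row in a]
    m, n = len(lu), len(lu[0])
    ipiv, info = [], 0
    for k in range(min(m, n)):
        # Pick the entry of largest magnitude in column k, rows k..m-1.
        p = max(range(k, m), key=lambda r: abs(lu[r][k]))
        ipiv.append(p + 1)
        if lu[p][k] == 0.0:
            info = info or k + 1
            continue
        lu[k], lu[p] = lu[p], lu[k]
        for r in range(k + 1, m):
            lu[r][k] /= lu[k][k]
            for c in range(k + 1, n):
                lu[r][c] -= lu[r][k] * lu[k][c]
    return lu, ipiv, info


def rebuild(lu, ipiv):
    """Multiply L*U and undo the row interchanges, giving P*L*U."""
    m, n = len(lu), len(lu[0])
    r = min(m, n)
    prod = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            for k in range(min(i, j, r - 1) + 1):
                l_ik = 1.0 if i == k else lu[i][k]
                prod[i][j] += l_ik * lu[k][j]
    for k in reversed(range(len(ipiv))):
        p = ipiv[k] - 1
        prod[k], prod[p] = prod[p], prod[k]
    return prod


a = [[1.0, 2.0], [3.0, 4.0]]
lu, ipiv, info = lu_partial_pivoting(a)
print(ipiv, info)
print(rebuild(lu, ipiv))
```

Note that ipiv is a sequence of interchanges to apply in order, not a permutation vector, so the same index may appear more than once.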
Application Notes
The computed L and U are the exact factors of a perturbed matrix A + E, where
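$|E| \le c(\min(m,n))\,\varepsilon\,P|L||U|$
Approximate number of floating-point operations for real flavors:
$(2/3)n^3$ if m = n,
$(1/3)n^2(3m-n)$ if m > n,
$(1/3)m^2(3n-m)$ if m < n.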
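For quick cost estimates, the three cases can be folded into one helper. The Python function below is a hypothetical convenience, not an Intel MKL routine; it simply evaluates the expressions listed here with exact rational arithmetic.

```python
from fractions import Fraction


def getrf_flops(m, n):
    """Approximate operation count for an m-by-n LU factorization."""
    if m == n:
        return Fraction(2, 3) * n**3
    if m > n:
        return Fraction(1, 3) * n**2 * (3 * m - n)
    return Fraction(1, 3) * m**2 * (3 * n - m)


for shape in ((1000, 1000), (2000, 1000), (1000, 2000)):
    print(shape, float(getrf_flops(*shape)))
```

The two rectangular expressions both reduce to (2/3)n^3 when m = n, so the three cases agree at the boundary.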
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
mkl_?getrfnpi
Performs LU factorization (complete or incomplete) of
a general matrix without pivoting.
Syntax
call mkl_sgetrfnpi (m, n, nfact, a, lda, info)
call mkl_dgetrfnpi (m, n, nfact, a, lda, info )
call mkl_cgetrfnpi (m, n, nfact, a, lda, info )
call mkl_zgetrfnpi (m, n, nfact, a, lda, info )
call mkl_getrfnpi ( a [, nfact] [, info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the LU factorization of a general m-by-n matrix A without using pivoting. It supports
incomplete factorization. The factorization has the form:
A = L*U,
where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n) and U is upper triangular
(upper trapezoidal if m < n).
Incomplete factorization has the form:
$A = L*U + \tilde{A}$,
where L is lower trapezoidal with unit diagonal elements, U is upper trapezoidal, and $\tilde{A}$ is the unfactored
part of matrix A. See the application notes section for further details.
NOTE
Use ?getrf if it is possible that the matrix is not diagonally dominant.
Input Parameters
The data types are given for the Fortran interface.
nfact INTEGER. The number of rows and columns to factor; 0 ≤ nfact ≤ min(m,
n). Note that if nfact < min(m, n), incomplete factorization is performed.
Output Parameters
Application Notes
The computed L and U are the exact factors of a perturbed matrix A + E, with
$(2/3)n^3$ if m = n = nfact
$(1/3)m^2(3n-m)$ if m = nfact < n
When incomplete factorization is specified, the first nfact rows and columns are factored, with the update of
the remaining rows and columns of A as follows:
where
The result is
L1 is a lower triangular square matrix of order nfact with unit diagonal and U1 is an upper triangular square
matrix of order nfact. L1 and U1 result from LU factorization of matrix A11: A11 = L1U1.
On exit, elements of the upper triangle U1 are stored in place of the upper triangle of block A11 in array a;
elements of the lower triangle L1 are stored in the lower triangle of block A11 in array a (unit diagonal
elements are not stored). Elements of L2 replace elements of A21; U2 replaces elements of A12 and
$\tilde{A}_{22}$, the unfactored part, replaces elements of A22.
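The in-place layout described in these notes can be sketched in plain Python. The sketch assumes the usual block relations implied by an LU factorization with an unfactored remainder: A11 = L1*U1, A12 = L1*U2, A21 = L2*U1, and a remaining block equal to A22 - L2*U2. The function name getrfnpi_sketch is hypothetical, and raising an exception on a zero pivot is a choice made for this illustration, not a description of how mkl_?getrfnpi reports errors.

```python
def getrfnpi_sketch(a, nfact):
    """LU without pivoting on the first nfact rows and columns.

    Works on a copy of `a` (a list of rows) and returns it. After the
    call, the leading nfact-by-nfact block holds L1 (below the diagonal)
    and U1, the block to its right holds U2, the block below it holds
    L2, and the trailing block holds the updated, unfactored part.
    """
    lu = [list(map(float, row)) for row in a]
    m, n = len(lu), len(lu[0])
    if not 0 <= nfact <= min(m, n):
        raise ValueError("nfact must satisfy 0 <= nfact <= min(m, n)")
    for k in range(nfact):
        if lu[k][k] == 0.0:
            raise ZeroDivisionError(f"zero pivot at step {k + 1}")
        for r in range(k + 1, m):
            lu[r][k] /= lu[k][k]                  # multiplier, part of L
            for c in range(k + 1, n):
                lu[r][c] -= lu[r][k] * lu[k][c]   # update the remainder
    return lu


matrix = [[4.0, 2.0, 2.0],
          [2.0, 5.0, 3.0],
          [2.0, 3.0, 6.0]]
print(getrfnpi_sketch(matrix, 1))   # one step: trailing 2-by-2 block is unfactored
print(getrfnpi_sketch(matrix, 3))   # complete factorization
```

Calling the sketch again on the trailing block with the remaining steps produces the same result as a single complete factorization, which is what makes the incomplete form useful as a building block.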
?getrf2
Computes LU factorization using partial pivoting with
row interchanges.
Syntax
call sgetrf2 (m, n, a, lda, ipiv, info )
call dgetrf2 (m, n, a, lda, ipiv, info )
call cgetrf2 (m, n, a, lda, ipiv, info )
call zgetrf2 (m, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
?getrf2 computes an LU factorization of a general m-by-n matrix A using partial pivoting with row
interchanges.
The factorization has the form
A=P*L*U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m >
n), and U is upper triangular (upper trapezoidal if m < n).
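This is the recursive version of the algorithm. It divides the matrix into four submatrices:
$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$
The subroutine calls itself to factor $\begin{pmatrix} A_{11} \\ A_{21} \end{pmatrix}$,
does the swaps on $\begin{pmatrix} A_{12} \\ A_{22} \end{pmatrix}$, solves A12, updates A22, then calls itself to factor A22 and does the swaps on A21.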
Input Parameters
n INTEGER. The number of columns of the matrix A. n >= 0.
lda INTEGER. The leading dimension of the array a. lda >= max(1,m).
Output Parameters
The pivot indices; for 1 <= i <= min(m,n), row i of the matrix was
interchanged with row ipiv(i).
?gbtrf
Computes the LU factorization of a general m-by-n
band matrix.
Syntax
call sgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call dgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call cgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call zgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call gbtrf( ab [,kl] [,m] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine forms the LU factorization of a general m-by-n band matrix A with kl non-zero subdiagonals and
ku non-zero superdiagonals, that is,
A = P*L*U,
where P is a permutation matrix; L is lower triangular with unit diagonal elements and at most kl non-zero
elements in each column; U is an upper triangular band matrix with kl + ku superdiagonals. The routine uses
partial pivoting, with row interchanges (which creates the additional kl superdiagonals in U).
NOTE
This routine supports the Progress Routine feature. See Progress Function section for details.
Input Parameters
ldab INTEGER. The leading dimension of the array ab. (ldab ≥ 2*kl + ku + 1)
Output Parameters
ipiv INTEGER.
Array, size at least max(1,min(m, n)). The pivot indices; for 1 ≤ i ≤
min(m, n), row i was interchanged with row ipiv(i).
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see LAPACK
95 Interface Conventions.
Specific details for the routine gbtrf interface are as follows:
ku Restored as ku = lda-2*kl-1.
m If omitted, assumed m = n.
Application Notes
The computed L and U are the exact factors of a perturbed matrix A + E, where
Elements marked * are not used; elements marked + need not be set on entry, but are required by the
routine to store elements of U because of fill-in resulting from the row interchanges.
After calling this routine with m = n, you can call the following routines:
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?gttrf
Computes the LU factorization of a tridiagonal matrix.
Syntax
call sgttrf( n, dl, d, du, du2, ipiv, info )
call dgttrf( n, dl, d, du, du2, ipiv, info )
call cgttrf( n, dl, d, du, du2, ipiv, info )
call zgttrf( n, dl, d, du, du2, ipiv, info )
call gttrf( dl, d, du, du2 [, ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the LU factorization of a real or complex tridiagonal matrix A using elimination with
partial pivoting and row interchanges.
The factorization has the form
A = L*U,
where L is a product of permutation and unit lower bidiagonal matrices and U is upper triangular with
nonzeroes in only the main diagonal and first two superdiagonals.
Input Parameters
Output Parameters
dl Overwritten by the (n-1) multipliers that define the matrix L from the
LU factorization of A. The matrix L has unit diagonal elements, and the
(n-1) elements of dl form the subdiagonal. All other elements of L
are zero.
du Overwritten by the (n-1) elements of the first superdiagonal of U.
ipiv INTEGER.
Array, dimension (n). The pivot indices: for 1 ≤ i ≤ n, row i was
interchanged with row ipiv(i). ipiv(i) is always i or i+1; ipiv(i) = i
indicates a row interchange was not required.
Application Notes
?dttrfb
Computes the factorization of a diagonally dominant
tridiagonal matrix.
Syntax
call sdttrfb( n, dl, d, du, info )
call ddttrfb( n, dl, d, du, info )
call cdttrfb( n, dl, d, du, info )
Include Files
mkl.fi, lapack.f90
Description
The ?dttrfb routine computes the factorization of a real or complex tridiagonal matrix A with the BABE
(Burning At Both Ends) algorithm without pivoting. The factorization has the form
A = L1*U*L2
where
L1 and L2 are unit lower bidiagonal with k and n - k - 1 subdiagonal elements, respectively, where k =
n/2, and
U is an upper bidiagonal matrix with nonzeroes in only the main diagonal and first superdiagonal.
Input Parameters
Output Parameters
Application Notes
A diagonally dominant tridiagonal system is defined such that $|d_i| > |dl_{i-1}| + |du_i|$ for any i:
1 < i < n, and $|d_1| > |du_1|$, $|d_n| > |dl_{n-1}|$
The underlying BABE algorithm is designed for diagonally dominant systems. Such systems do not have the
numerical stability issues of general systems, which require elimination with partial pivoting (see ?gttrf).
Solving diagonally dominant systems is much faster than solving general systems.
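Before choosing ?dttrfb over ?gttrf, it can be worth checking the dominance condition on the three diagonals. The Python sketch below is one way to do that; the function name is hypothetical, the lists are 0-based, and a missing neighbor in the first or last row is treated as zero, which reproduces the two boundary conditions in the definition.

```python
def is_diagonally_dominant(dl, d, du):
    """Check strict diagonal dominance of a tridiagonal matrix.

    d has n entries (main diagonal); dl and du have n-1 entries each
    (subdiagonal and superdiagonal).
    """
    n = len(d)
    if len(dl) != n - 1 or len(du) != n - 1:
        raise ValueError("dl and du must each have len(d) - 1 entries")
    for i in range(n):
        below = abs(dl[i - 1]) if i > 0 else 0.0      # element left of d[i]
        above = abs(du[i]) if i < n - 1 else 0.0      # element right of d[i]
        if not abs(d[i]) > below + above:
            return False
    return True


print(is_diagonally_dominant([1.0, 1.0], [4.0, 4.0, 4.0], [1.0, 1.0]))
print(is_diagonally_dominant([3.0, 1.0], [4.0, 4.0, 4.0], [1.0, 1.0]))
```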
NOTE
The current implementation of BABE has a potential accuracy issue on very small or large data
close to the underflow or overflow threshold respectively. Scale the matrix before applying the
solver in the case of such input data.
Applying the ?dttrfb factorization to non-diagonally dominant systems may lead to an accuracy
loss or to false detection of singularity, because no pivoting is performed.
?potrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite matrix.
Syntax
call spotrf( uplo, n, a, lda, info )
call dpotrf( uplo, n, a, lda, info )
call cpotrf( uplo, n, a, lda, info )
call zpotrf( uplo, n, a, lda, info )
call potrf( a [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine forms the Cholesky factorization of a symmetric positive-definite or, for complex data, Hermitian
positive-definite matrix A:
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and the strictly upper triangular part of the matrix is not
referenced.
Output Parameters
Application Notes
If uplo = 'U', the computed factor U is the exact factor of a perturbed matrix A + E, where
The total number of floating-point operations is approximately $(1/3)n^3$ for real flavors or $(4/3)n^3$ for
complex flavors.
After calling this routine, you can call the following routines:
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?potrf2
Computes Cholesky factorization using a recursive
algorithm.
Syntax
call spotrf2(uplo, n, a, lda, info)
call dpotrf2(uplo, n, a, lda, info)
call cpotrf2(uplo, n, a, lda, info)
call zpotrf2(uplo, n, a, lda, info)
Include Files
mkl.fi
Description
?potrf2 computes the Cholesky factorization of a real or complex symmetric positive definite matrix A using
the recursive algorithm.
The factorization has the form
for real flavors:
A = U^T * U, if uplo = 'U', or
A = L * L^T, if uplo = 'L'.
The subroutine calls itself to factor A11, updates and scales A21 or A12, updates A22, and then calls itself to factor
A22.
Input Parameters
If uplo = 'L', the leading n-by-n lower triangular part of a contains the
lower triangular part of the matrix A, and the strictly upper triangular part
of a is not referenced.
lda ≥ max(1,n).
Output Parameters
> 0: if info = i, the leading minor of order i is not positive definite, and the
factorization could not be completed.
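The meaning of a positive info value can be demonstrated with a short Python sketch of the lower-triangular real case. Note that this uses the plain column-by-column Cholesky algorithm rather than the recursive scheme of ?potrf2, and the function name cholesky_lower is hypothetical; only the way info is reported mirrors the description here.

```python
import math


def cholesky_lower(a):
    """Return (l, info) with A = L*L^T for a real symmetric matrix.

    Only the lower triangle of `a` is read. info = 0 on success;
    info = i > 0 means the leading minor of order i is not positive
    definite and the factorization stopped at that step.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            return l, j + 1
        l[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            dot = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - dot) / l[j][j]
    return l, 0


print(cholesky_lower([[4.0, 2.0], [2.0, 5.0]]))   # positive definite
print(cholesky_lower([[1.0, 2.0], [2.0, 1.0]]))   # fails at order 2
```

In the second call the 1-by-1 leading minor is positive but the full 2-by-2 determinant is negative, so the sketch stops with info = 2.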
?pstrf
Computes the Cholesky factorization with complete
pivoting of a real symmetric (complex Hermitian)
positive semidefinite matrix.
Syntax
call spstrf( uplo, n, a, lda, piv, rank, tol, work, info )
call dpstrf( uplo, n, a, lda, piv, rank, tol, work, info )
call cpstrf( uplo, n, a, lda, piv, rank, tol, work, info )
call zpstrf( uplo, n, a, lda, piv, rank, tol, work, info )
Include Files
mkl.fi, lapack.f90
Description
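The routine computes the Cholesky factorization with complete pivoting of a real symmetric (complex
Hermitian) positive semidefinite matrix. The form of the factorization is:
P^T * A * P = U^T * U, if uplo = 'U' for real flavors,
P^T * A * P = U^H * U, if uplo = 'U' for complex flavors,
P^T * A * P = L * L^T, if uplo = 'L' for real flavors,
P^T * A * P = L * L^H, if uplo = 'L' for complex flavors,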
where P is a permutation matrix stored as vector piv, and U and L are upper and lower triangular matrices,
respectively.
This algorithm does not attempt to check that A is positive semidefinite. This version of the algorithm calls
level 3 BLAS.
Input Parameters
Output Parameters
piv INTEGER.
Array, size at least max(1, n). The array piv is such that the nonzero
entries of P are P(piv(k), k) = 1 (1 ≤ k ≤ n).
rank INTEGER.
The rank of A, given by the number of steps the algorithm completed.
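A minimal calling sketch for the double-precision flavor follows. It assumes linking against Intel MKL (or another LAPACK implementation). The input parameter descriptions are not reproduced in this section, so the workspace size of 2*n and the convention that a negative tol selects a default tolerance are assumptions taken from the standard LAPACK interface. The rank-1 test matrix and the variable name nrank are illustrative only.

```fortran
program pstrf_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), v(n), work(2*n), tol
  integer  :: piv(n), nrank, info, i, j
  external :: dpstrf

  ! Rank-1 positive semidefinite test matrix A = v*v^T.
  v = [1.0_dp, 2.0_dp, 3.0_dp]
  do j = 1, n
     do i = 1, n
        a(i,j) = v(i)*v(j)
     end do
  end do

  ! Assumption: a negative tol asks the routine to use its default tolerance.
  tol = -1.0_dp
  call dpstrf('L', n, a, n, piv, nrank, tol, work, info)

  ! In standard LAPACK, info > 0 reports a rank-deficient (or not positive
  ! semidefinite) matrix; the computed rank is then returned in nrank.
  print '(a,i0)', 'info = ', info
  print '(a,i0)', 'rank = ', nrank
  print '(a,3(1x,i0))', 'piv  =', piv
end program pstrf_example
```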
See Also
Matrix Storage Schemes for LAPACK Routines
?pftrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite matrix using the
Rectangular Full Packed (RFP) format.
Syntax
call spftrf( transr, uplo, n, a, info )
call dpftrf( transr, uplo, n, a, info )
call cpftrf( transr, uplo, n, a, info )
call zpftrf( transr, uplo, n, a, info )
Include Files
mkl.fi, lapack.f90
Description
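The routine forms the Cholesky factorization of a symmetric positive-definite or, for complex data, a
Hermitian positive-definite matrix A:
A = U^T*U for real data, A = U^H*U for complex data, if uplo = 'U';
A = L*L^T for real data, A = L*L^H for complex data, if uplo = 'L',
where L is a lower triangular matrix and U is upper triangular.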
The matrix A is in the Rectangular Full Packed (RFP) format. For the description of the RFP format, see Matrix
Storage Schemes.
This is the block version of the algorithm, calling Level 3 BLAS.
Input Parameters
transr CHARACTER*1. Must be 'N', 'T' (for real data) or 'C' (for complex
data).
If transr = 'N', the Normal transr of RFP A is stored.
Output Parameters
See Also
Matrix Storage Schemes for LAPACK Routines
?pptrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite matrix using packed
storage.
Syntax
call spptrf( uplo, n, ap, info )
Include Files
mkl.fi, lapack.f90
Description
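The routine forms the Cholesky factorization of a symmetric positive-definite or, for complex data, Hermitian
positive-definite packed matrix A:
A = U^T*U for real data, A = U^H*U for complex data, if uplo = 'U';
A = L*L^T for real data, A = L*L^H for complex data, if uplo = 'L',
where L is a lower triangular matrix and U is upper triangular.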
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
Output Parameters
If info = i, the leading minor of order i (and therefore the matrix A
itself) is not positive-definite, and the factorization could not be
completed. This may indicate an error in forming the matrix A.
Application Notes
If uplo = 'U', the computed factor U is the exact factor of a perturbed matrix A + E, where
The total number of floating-point operations is approximately (1/3)n^3 for real flavors and (4/3)n^3 for
complex flavors.
After calling this routine, you can call the following routines:
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?pbtrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite band matrix.
Syntax
call spbtrf( uplo, n, kd, ab, ldab, info )
call dpbtrf( uplo, n, kd, ab, ldab, info )
call cpbtrf( uplo, n, kd, ab, ldab, info )
call zpbtrf( uplo, n, kd, ab, ldab, info )
call pbtrf( ab [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
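The routine forms the Cholesky factorization of a symmetric positive-definite or, for complex data, Hermitian
positive-definite band matrix A:
A = U^T*U for real data, A = U^H*U for complex data, if uplo = 'U';
A = L*L^T for real data, A = L*L^H for complex data, if uplo = 'L',
where L is a lower triangular matrix and U is upper triangular.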
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
Output Parameters
If info = i, the leading minor of order i (and therefore the matrix A
itself) is not positive-definite, and the factorization could not be
completed. This may indicate an error in forming the matrix A.
Application Notes
If uplo = 'U', the computed factor U is the exact factor of a perturbed matrix A + E, where
The total number of floating-point operations for real flavors is approximately n(kd+1)^2. The number of
operations for complex flavors is 4 times greater. All these estimates assume that kd is much less than n.
After calling this routine, you can call the following routines:
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?pttrf
Computes the factorization of a symmetric (Hermitian)
positive-definite tridiagonal matrix.
Syntax
call spttrf( n, d, e, info )
call dpttrf( n, d, e, info )
call cpttrf( n, d, e, info )
call zpttrf( n, d, e, info )
call pttrf( d, e [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine forms the factorization of a symmetric positive-definite or, for complex data, Hermitian
positive-definite tridiagonal matrix A:
A = L*D*L^T for real flavors, or
A = L*D*L^H for complex flavors,
where D is diagonal and L is unit lower bidiagonal. The factorization may also be regarded as having the form
A = U^T*D*U for real flavors, or A = U^H*D*U for complex flavors, where U is unit upper bidiagonal.
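The L*D*L^T form for a real tridiagonal matrix can be illustrated with a short, library-free sketch. It shows the standard textbook recurrence for the diagonal of D and the subdiagonal of L, and is not a statement of how the library implements the routine. The arrays d and e (diagonal and subdiagonal of A, overwritten by D and L) follow the routine's argument names; the test matrix is an assumption chosen for illustration.

```fortran
program pttrf_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: d(n), e(n-1), ei
  integer  :: i, info

  ! Test matrix A = tridiag(-1, 2, -1), which is symmetric positive-definite.
  d = 2.0_dp
  e = -1.0_dp

  info = 0
  do i = 1, n - 1
     if (d(i) <= 0.0_dp) then
        info = i                   ! leading minor of order i is not positive-definite
        exit
     end if
     ei     = e(i)
     e(i)   = ei / d(i)            ! subdiagonal entry of the unit bidiagonal L
     d(i+1) = d(i+1) - e(i)*ei     ! next diagonal entry of D
  end do
  if (info == 0 .and. d(n) <= 0.0_dp) info = n

  print '(a,i0)', 'info = ', info
  print '(a,4f10.5)', 'D diagonal:    ', d
  print '(a,3f10.5)', 'L subdiagonal: ', e
end program pttrf_sketch
```

For this matrix the recurrence gives D = (2, 3/2, 4/3, 5/4) and a subdiagonal of L equal to (-1/2, -2/3, -3/4).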
Input Parameters
Output Parameters
?sytrf
Computes the Bunch-Kaufman factorization of a
symmetric matrix.
Syntax
call ssytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call dsytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call csytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call zsytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call sytrf( a [, uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the factorization of a real/complex symmetric matrix A using the Bunch-Kaufman
diagonal pivoting method. The form of the factorization is:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T,
where A is the input matrix, U and L are products of permutation and triangular matrices with unit diagonal
(upper triangular for U and lower triangular for L), and D is a symmetric block-diagonal matrix with 1-by-1
and 2-by-2 diagonal blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of
D.
NOTE
This routine supports the Progress Routine feature. See the Progress Routine section for details.
Input Parameters
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and A is factored as L*D*L^T.
Array, size (lda,*). The array a contains either the upper or the
lower triangular part of the matrix A (see uplo). The second dimension
of a must be at least max(1, n).
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D. If ipiv(i) = k > 0, then d_ii is a 1-by-1
block, and the i-th row and column of A were interchanged with the
k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
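The ipiv encoding can be decoded with a short scan. The sketch below handles the uplo = 'L' case only and needs no library. The sample ipiv values are hypothetical, chosen to contain one 2-by-2 block, and are not the output of an actual factorization.

```fortran
program decode_ipiv
  implicit none
  integer, parameter :: n = 5
  ! Hypothetical pivot array of the kind returned for uplo = 'L'.
  integer :: ipiv(n) = [1, -4, -4, 4, 5]
  integer :: i

  i = 1
  do while (i <= n)
     if (ipiv(i) > 0) then
        ! 1-by-1 block: row/column i was interchanged with row/column ipiv(i).
        print '(a,i0,a,i0)', '1-by-1 block at ', i, &
              ', interchanged with ', ipiv(i)
        i = i + 1
     else
        ! 2-by-2 block: the encoding requires ipiv(i) = ipiv(i+1) = -m < 0.
        if (i == n) stop 'malformed ipiv: unpaired negative entry'
        if (ipiv(i+1) /= ipiv(i)) stop 'malformed ipiv: mismatched pair'
        print '(a,i0,a,i0,a,i0,a,i0)', '2-by-2 block at ', i, ' and ', i + 1, &
              '; row/column ', i + 1, ' interchanged with ', -ipiv(i)
        i = i + 2
     end if
  end do
end program decode_ipiv
```

For uplo = 'U' the same idea applies with the scan running from n down to 1 and the pair being (i, i-1).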
a holds the matrix A of size (n, n)
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are unsure how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size, that is, no less than the minimal value
described, the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace size in the
first element of the array work. This operation is called a workspace query.
Note that if you set lwork to a value that is less than the minimal required value and is not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended workspace.
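The workspace query can be sketched as follows for the double-precision flavor. The sketch assumes linking against Intel MKL (or another LAPACK implementation); the test matrix and the variable name wquery are illustrative only.

```fortran
program sytrf_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n,n), wquery(1)
  real(dp), allocatable :: work(:)
  integer  :: ipiv(n), lwork, info, i, j
  external :: dsytrf

  ! Symmetric indefinite test matrix with a(i,j) = |i - j|.
  do j = 1, n
     do i = 1, n
        a(i,j) = real(abs(i - j), dp)
     end do
  end do

  ! Step 1: workspace query. With lwork = -1 the routine only reports the
  ! recommended workspace size in the first element of the work array.
  lwork = -1
  call dsytrf('L', n, a, n, ipiv, wquery, lwork, info)
  if (info /= 0) stop 'workspace query failed'

  ! Step 2: allocate the recommended workspace and factor the matrix.
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))
  call dsytrf('L', n, a, n, ipiv, work, lwork, info)

  print '(a,i0)', 'lwork used = ', lwork
  print '(a,i0)', 'info       = ', info
  print '(a,4(1x,i0))', 'ipiv       =', ipiv
end program sytrf_workspace_query
```

Querying first avoids guessing a block size, at the cost of one extra call that does no factorization work.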
The 2-by-2 unit diagonal blocks and the unit diagonal elements of U and L are not stored. The remaining
elements of U and L are stored in the corresponding columns of the array a, but additional row interchanges
are required to recover U or L explicitly (which is seldom necessary).
If ipiv(i) = i for all i =1...n, then all off-diagonal elements of U (L) are stored explicitly in the
corresponding elements of the array a.
If uplo = 'U', the computed factors U and D are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(n)ε P|U||D||U^T|P^T
c(n) is a modest linear function of n, and ε is the machine precision. A similar estimate holds for the
computed L and D when uplo = 'L'.
The total number of floating-point operations is approximately (1/3)n^3 for real flavors or (4/3)n^3 for
complex flavors.
After calling this routine, you can call the following routines:
If s = 2, the upper triangle of D(k) overwrites A(k-1,k-1), A(k-1,k) and A(k,k), and v overwrites
A(1:k-2,k-1:k).
If s = 2, the lower triangle of D(k) overwrites A(k,k), A(k+1,k), and A(k+1,k+1), and v overwrites
A(k+2:n,k:k+1).
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?sytrf_rook
Computes the bounded Bunch-Kaufman factorization
of a symmetric matrix.
Syntax
call ssytrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call dsytrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call csytrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call zsytrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call sytrf_rook( a [, uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the factorization of a real/complex symmetric matrix A using the bounded Bunch-
Kaufman ("rook") diagonal pivoting method. The form of the factorization is:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T,
where A is the input matrix, U and L are products of permutation and triangular matrices with unit diagonal
(upper triangular for U and lower triangular for L), and D is a symmetric block-diagonal matrix with 1-by-1
and 2-by-2 diagonal blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of
D.
Input Parameters
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and A is factored as L*D*L^T.
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were
interchanged and D(k,k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) < 0 and ipiv(k - 1) < 0, then rows
and columns k and -ipiv(k) were interchanged, rows and columns
k - 1 and -ipiv(k - 1) were interchanged, and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) < 0 and ipiv(k + 1) < 0, then rows
and columns k and -ipiv(k) were interchanged, rows and columns
k + 1 and -ipiv(k + 1) were interchanged, and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
Application Notes
The total number of floating-point operations is approximately (1/3)n^3 for real flavors or (4/3)n^3 for
complex flavors.
After calling this routine, you can call the following routines:
?sytri_rook to compute the inverse of A.
If s = 2, the upper triangle of D(k) overwrites A(k-1,k-1), A(k-1,k) and A(k,k), and v overwrites
A(1:k-2,k-1:k).
If s = 2, the lower triangle of D(k) overwrites A(k,k), A(k+1,k), and A(k+1,k+1), and v overwrites
A(k+2:n,k:k+1).
See Also
Matrix Storage Schemes for LAPACK Routines
?hetrf
Computes the Bunch-Kaufman factorization of a
complex Hermitian matrix.
Syntax
call chetrf( uplo, n, a, lda, ipiv, work, lwork, info )
call zhetrf( uplo, n, a, lda, ipiv, work, lwork, info )
call hetrf( a [, uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the factorization of a complex Hermitian matrix A using the Bunch-Kaufman diagonal
pivoting method:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H,
where A is the input matrix, U and L are products of permutation and triangular matrices with unit diagonal
(upper triangular for U and lower triangular for L), and D is a Hermitian block-diagonal matrix with 1-by-1
and 2-by-2 diagonal blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of
D.
NOTE
This routine supports the Progress Routine feature. See the Progress Routine section for details.
Input Parameters
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and A is factored as L*D*L^H.
The array a contains the upper or the lower triangular part of the
matrix A (see uplo). The second dimension of a must be at least
max(1, n).
work(*) is a workspace array of dimension at least max(1, lwork).
lwork INTEGER. The size of the work array (lwork ≥ n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D. If ipiv(i) = k > 0, then d_ii is a 1-by-1
block, and the i-th row and column of A were interchanged with the
k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
Application Notes
This routine is suitable for Hermitian matrices that are not known to be positive-definite. If A is in fact
positive-definite, the routine does not perform interchanges, and no 2-by-2 diagonal blocks occur in D.
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are unsure how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size, that is, no less than the minimal value
described, the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace size in the
first element of the array work. This operation is called a workspace query.
Note that if you set lwork to a value that is less than the minimal required value and is not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended workspace.
The 2-by-2 unit diagonal blocks and the unit diagonal elements of U and L are not stored. The remaining
elements of U and L are stored in the corresponding columns of the array a, but additional row interchanges
are required to recover U or L explicitly (which is seldom necessary).
If ipiv(i) = i for all i = 1...n, then all off-diagonal elements of U (L) are stored explicitly in the corresponding
elements of the array a.
If uplo = 'U', the computed factors U and D are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(n)ε P|U||D||U^T|P^T
c(n) is a modest linear function of n, and ε is the machine precision.
After calling this routine, you can call the following routines:
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?hetrf_rook
Computes the bounded Bunch-Kaufman factorization
of a complex Hermitian matrix.
Syntax
call chetrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call zhetrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call hetrf_rook( a [, uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the factorization of a complex Hermitian matrix A using the bounded Bunch-Kaufman
diagonal pivoting method:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H,
where A is the input matrix, U (or L) is a product of permutation and unit upper (or lower) triangular
matrices, and D is a Hermitian block-diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks.
This is the blocked version of the algorithm, calling Level 3 BLAS.
Input Parameters
The array a contains the upper or the lower triangular part of the
matrix A (see uplo).
If uplo = 'U', the leading n-by-n upper triangular part of a contains
the upper triangular part of the matrix A, and the strictly lower
triangular part of a is not referenced. If uplo = 'L', the leading n-by-n
lower triangular part of a contains the lower triangular part of the
matrix A, and the strictly upper triangular part of a is not referenced.
Output Parameters
The block diagonal matrix D and the multipliers used to obtain the
factor U or L (see Application Notes for further details).
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D.
If uplo = 'U':
If ipiv(k) > 0, then rows and columns k and ipiv(k) were
interchanged and D(k,k) is a 1-by-1 diagonal block.
If ipiv(k) < 0 and ipiv(k - 1) < 0, then rows and columns k and
-ipiv(k) were interchanged, rows and columns k - 1 and
-ipiv(k - 1) were interchanged, and D(k-1:k, k-1:k) is a 2-by-2 diagonal
block.
If uplo = 'L':
Application Notes
i.e., U is a product of terms P(k)*U(k), where k decreases from n to 1 in steps of 1 or 2, and D is a block
diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks D(k). P(k) is a permutation matrix as defined by
ipiv(k), and U(k) is a unit upper triangular matrix, such that if the diagonal block D(k) is of order s (s = 1
or 2), then
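\[
U(k) =
\begin{pmatrix}
I & v & 0 \\
0 & I & 0 \\
0 & 0 & I
\end{pmatrix},
\]
where the row and column blocks have sizes k-s, s, and n-k.
For uplo = 'L', the corresponding unit lower triangular matrix L(k) has the form
\[
L(k) =
\begin{pmatrix}
I & 0 & 0 \\
0 & I & 0 \\
0 & v & I
\end{pmatrix},
\]
where the row and column blocks have sizes k-1, s, and n-k-s+1.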
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?sptrf
Computes the Bunch-Kaufman factorization of a
symmetric matrix using packed storage.
Syntax
call ssptrf( uplo, n, ap, ipiv, info )
call dsptrf( uplo, n, ap, ipiv, info )
call csptrf( uplo, n, ap, ipiv, info )
call zsptrf( uplo, n, ap, ipiv, info )
call sptrf( ap [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the factorization of a real/complex symmetric matrix A stored in the packed format
using the Bunch-Kaufman diagonal pivoting method. The form of the factorization is:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T,
where U and L are products of permutation and triangular matrices with unit diagonal (upper triangular for U
and lower triangular for L), and D is a symmetric block-diagonal matrix with 1-by-1 and 2-by-2 diagonal
blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of D.
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
If uplo = 'L', the array ap stores the lower triangular part of the
matrix A, and A is factored as L*D*L^T.
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D. If ipiv(i) = k > 0, then d_ii is a 1-by-1
block, and the i-th row and column of A were interchanged with the
k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
If info = -i, the i-th parameter had an illegal value.
Application Notes
The 2-by-2 unit diagonal blocks and the unit diagonal elements of U and L are not stored. The remaining
elements of U and L overwrite elements of the corresponding columns of the array ap, but additional row
interchanges are required to recover U or L explicitly (which is seldom necessary).
If ipiv(i) = i for all i = 1...n, then all off-diagonal elements of U (L) are stored explicitly in packed form.
If uplo = 'U', the computed factors U and D are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(n)ε P|U||D||U^T|P^T
c(n) is a modest linear function of n, and ε is the machine precision. A similar estimate holds for the
computed L and D when uplo = 'L'.
The total number of floating-point operations is approximately (1/3)n^3 for real flavors or (4/3)n^3 for
complex flavors.
After calling this routine, you can call the following routines:
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?hptrf
Computes the Bunch-Kaufman factorization of a
complex Hermitian matrix using packed storage.
Syntax
call chptrf( uplo, n, ap, ipiv, info )
call zhptrf( uplo, n, ap, ipiv, info )
call hptrf( ap [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the factorization of a complex Hermitian packed matrix A using the Bunch-Kaufman
diagonal pivoting method:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H,
where A is the input matrix, U and L are products of permutation and triangular matrices with unit diagonal
(upper triangular for U and lower triangular for L), and D is a Hermitian block-diagonal matrix with 1-by-1
and 2-by-2 diagonal blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of
D.
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
If uplo = 'L', the array ap stores the lower triangular part of the
matrix A, and A is factored as L*D*L^H.
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D. If ipiv(i) = k > 0, then d_ii is a 1-by-1
block, and the i-th row and column of A were interchanged with the
k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
Application Notes
The 2-by-2 unit diagonal blocks and the unit diagonal elements of U and L are not stored. The remaining
elements of U and L are stored in the array ap, but additional row interchanges are required to recover U or L
explicitly (which is seldom necessary).
If ipiv(i) = i for all i = 1...n, then all off-diagonal elements of U (L) are stored explicitly in the
corresponding elements of the array ap.
If uplo = 'U', the computed factors U and D are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(n)ε P|U||D||U^T|P^T
c(n) is a modest linear function of n, and ε is the machine precision.
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
mkl_?spffrt2, mkl_?spffrtx
Computes the partial LDL^T factorization of a
symmetric matrix using packed storage.
Syntax
call mkl_sspffrt2( ap, n, ncolm, work, work2 )
call mkl_dspffrt2( ap, n, ncolm, work, work2 )
call mkl_cspffrt2( ap, n, ncolm, work, work2 )
call mkl_zspffrt2( ap, n, ncolm, work, work2 )
call mkl_sspffrtx( ap, n, ncolm, work, work2 )
call mkl_dspffrtx( ap, n, ncolm, work, work2 )
call mkl_cspffrtx( ap, n, ncolm, work, work2 )
call mkl_zspffrtx( ap, n, ncolm, work, work2 )
Include Files
mkl.fi
Description
The routine computes the partial factorization A = L*D*L^T, where L is a lower triangular matrix and D is a
diagonal matrix.
CAUTION
The routine assumes that the matrix A is factorizable. The routine does not perform pivoting and does
not handle diagonal elements which are zero, which cause the routine to produce incorrect results
without any indication.
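Consider the matrix
\[
A = \begin{pmatrix} a & b^T \\ b & C \end{pmatrix},
\]
where a is the element in the first row and first column of A, b is a column
vector of size n - 1 containing the elements from the second through n-th column of A, C is the lower-right
square submatrix of A, and I is the identity matrix.
The mkl_?spffrt2 routine performs ncolm successive factorizations of the form
\[
A = \begin{pmatrix} a & b^T \\ b & C \end{pmatrix}
  = \begin{pmatrix} a & 0 \\ b & I \end{pmatrix}
    \begin{pmatrix} a^{-1} & 0 \\ 0 & C - b a^{-1} b^T \end{pmatrix}
    \begin{pmatrix} a & b^T \\ 0 & I \end{pmatrix}.
\]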
The approximate number of floating point operations performed by real flavors of these routines is
(1/6)*ncolm*(2*ncolm^2 - 6*ncolm*n + 3*ncolm + 6*n^2 - 6*n + 7).
The approximate number of floating point operations performed by complex flavors of these routines is
(1/3)*ncolm*(4*ncolm^2 - 12*ncolm*n + 9*ncolm + 12*n^2 - 18*n + 8).
Input Parameters
COMPLEX for mkl_cspffrt2 and mkl_cspffrtx
DOUBLE COMPLEX for mkl_zspffrt2 and mkl_zspffrtx.
Array, size at least max(1, n(n+1)/2). The array ap contains the lower
triangular part of the matrix A in packed storage (see Matrix Storage
Schemes for uplo = 'L').
Output Parameters
NOTE
Specifying ncolm = n results in complete factorization A = L*D*L^T.
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?getrs
Solves a system of linear equations with an LU-
factored square coefficient matrix, with multiple right-
hand sides.
Syntax
call sgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call dgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call cgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call zgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call getrs( a, ipiv, b [, trans] [,info] )
Include Files
mkl.fi, lapack.f90
Description
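The routine solves for X the following systems of linear equations:
A*X = B if trans='N',
A^T*X = B if trans='T',
A^H*X = B if trans='C' (for complex matrices only).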
Before calling this routine, you must call ?getrf to compute the LU factorization of A.
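The factor-then-solve sequence can be sketched as follows for the double-precision flavor, reusing one LU factorization for both trans = 'N' and trans = 'T'. The sketch assumes linking against Intel MKL (or another LAPACK implementation); the test matrix, the known solution x, and the variable names are illustrative only.

```fortran
program getrs_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n,n), lu(n,n), x(n), b(n,nrhs), bt(n,nrhs)
  integer  :: ipiv(n), info
  external :: dgetrf, dgetrs

  ! Nonsingular test matrix, stored column by column.
  a = reshape([2.0_dp, 1.0_dp, 1.0_dp, &
               1.0_dp, 3.0_dp, 2.0_dp, &
               1.0_dp, 0.0_dp, 0.0_dp], [n, n])

  ! Build right-hand sides from a known solution x.
  x = [1.0_dp, 2.0_dp, 3.0_dp]
  b(:,1)  = matmul(a, x)               ! for A*X = B
  bt(:,1) = matmul(transpose(a), x)    ! for A^T*X = B

  ! Factor once; lu and ipiv are then passed to the solver.
  lu = a
  call dgetrf(n, n, lu, n, ipiv, info)
  if (info /= 0) stop 'dgetrf failed'

  ! Solve both systems; each right-hand side is overwritten by its solution.
  call dgetrs('N', n, nrhs, lu, n, ipiv, b, n, info)
  if (info /= 0) stop 'dgetrs (N) failed'
  call dgetrs('T', n, nrhs, lu, n, ipiv, bt, n, info)
  if (info /= 0) stop 'dgetrs (T) failed'

  print '(a,es10.2)', 'max error, trans = N: ', maxval(abs(b(:,1)  - x))
  print '(a,es10.2)', 'max error, trans = T: ', maxval(abs(bt(:,1) - x))
end program getrs_example
```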
Input Parameters
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
The second dimension of a must be at least max(1,n) and the second
dimension of b at least max(1,nrhs).
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?getrf.
Output Parameters
info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
|E| ≤ c(n)ε P|L||U|
Note that cond(A,x) can be much smaller than κ∞(A); the condition number of A^T and A^H might or might
not be equal to κ∞(A).
The approximate number of floating-point operations for one right-hand side vector b is 2n^2 for real flavors
and 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?gbtrs
Solves a system of linear equations with an LU-
factored band coefficient matrix, with multiple right-
hand sides.
Syntax
call sgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call dgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call cgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call zgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call gbtrs( ab, b, ipiv [, kl] [, trans] [, info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the following systems of linear equations:
A*X = B if trans='N',
A^T*X = B if trans='T',
A^H*X = B if trans='C' (for complex matrices only).
Here A is an LU-factored general band matrix of order n with kl nonzero subdiagonals and ku nonzero
superdiagonals. Before calling this routine, call ?gbtrf to compute the LU factorization of A.
Input Parameters
ldab INTEGER. The leading dimension of the array ab; ldab ≥ 2*kl + ku + 1.
ipiv INTEGER. Array, size at least max(1, n). The ipiv array, as returned
by ?gbtrf.
Output Parameters
ku Restored as ku = lda - 2*kl - 1.
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
Note that cond(A,x) can be much smaller than κ∞(A); the condition number of A^T and A^H might or might
not be equal to κ∞(A).
The approximate number of floating-point operations for one right-hand side vector is 2n(ku + 2kl) for real
flavors. The number of operations for complex flavors is 4 times greater. All these estimates assume that kl
and ku are much less than min(m,n).
To estimate the condition number κ∞(A), call ?gbcon.
See Also
Matrix Storage Schemes for LAPACK Routines
?gttrs
Solves a system of linear equations with a tridiagonal
coefficient matrix using the LU factorization computed
by ?gttrf.
Syntax
call sgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call dgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call cgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call zgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call gttrs( dl, d, du, du2, b, ipiv [, trans] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the following systems of linear equations with multiple right hand sides:
A*X = B if trans='N',
A^T*X = B if trans='T',
A^H*X = B if trans='C' (for complex matrices only).
Before calling this routine, you must call ?gttrf to compute the LU factorization of A.
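A minimal sketch of the ?gttrf and ?gttrs sequence for the double-precision flavor follows. It assumes linking against Intel MKL (or another LAPACK implementation) and the standard LAPACK argument list for dgttrf; the tridiagonal test matrix and the known solution x are illustrative only.

```fortran
program gttrs_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5, nrhs = 1
  real(dp) :: dl(n-1), d(n), du(n-1), du2(n-2), x(n), b(n,nrhs)
  integer  :: ipiv(n), info, i
  external :: dgttrf, dgttrs

  ! Tridiagonal test matrix: subdiagonal dl, diagonal d, superdiagonal du.
  dl = 1.0_dp
  d  = 4.0_dp
  du = 2.0_dp

  ! Known solution x(i) = i; form b = A*x before the factorization
  ! overwrites dl, d, and du.
  do i = 1, n
     x(i) = real(i, dp)
  end do
  do i = 1, n
     b(i,1) = d(i)*x(i)
     if (i > 1) b(i,1) = b(i,1) + dl(i-1)*x(i-1)
     if (i < n) b(i,1) = b(i,1) + du(i)*x(i+1)
  end do

  ! LU factorization; du2 receives the second superdiagonal of U.
  call dgttrf(n, dl, d, du, du2, ipiv, info)
  if (info /= 0) stop 'dgttrf failed'

  ! Solve A*X = B; b is overwritten by the solution.
  call dgttrs('N', n, nrhs, dl, d, du, du2, ipiv, b, n, info)
  if (info /= 0) stop 'dgttrs failed'

  print '(a,es10.2)', 'max error: ', maxval(abs(b(:,1) - x))
end program gttrs_example
```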
Input Parameters
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns in B; nrhs ≥ 0.
The array du contains the (n - 1) elements of the first superdiagonal
of U.
The array du2 contains the (n - 2) elements of the second
superdiagonal of U.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
ipiv INTEGER. Array, size (n). The ipiv array, as returned by ?gttrf.
Output Parameters
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
|E| ≤ c(n)ε P|L||U|
Note that cond(A,x) can be much smaller than κ∞(A); the condition number of A^T and A^H might or might
not be equal to κ∞(A).
The approximate number of floating-point operations for one right-hand side vector b is 7n (including n
divisions) for real flavors and 34n (including 2n divisions) for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?dttrsb
Solves a system of linear equations with a diagonally
dominant tridiagonal coefficient matrix using the LU
factorization computed by ?dttrfb.
Syntax
call sdttrsb( trans, n, nrhs, dl, d, du, b, ldb, info )
call ddttrsb( trans, n, nrhs, dl, d, du, b, ldb, info )
call cdttrsb( trans, n, nrhs, dl, d, du, b, ldb, info )
call zdttrsb( trans, n, nrhs, dl, d, du, b, ldb, info )
call dttrsb( dl, d, du, b [, trans] [, info] )
Include Files
mkl.fi, lapack.f90
Description
The ?dttrsb routine solves the following systems of linear equations with multiple right hand sides for X:
A*X = B if trans='N',
A^T*X = B if trans='T',
A^H*X = B if trans='C' (for complex matrices only).
Input Parameters
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns in B; nrhs ≥ 0.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
Output Parameters
?potrs
Solves a system of linear equations with a Cholesky-
factored symmetric (Hermitian) positive-definite
coefficient matrix.
Syntax
call spotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call dpotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call cpotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call zpotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call potrs( a, b [,uplo] [, info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a symmetric positive-definite or, for
complex data, Hermitian positive-definite matrix A, given the Cholesky factorization of A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'
where L is a lower triangular matrix and U is upper triangular. The system is solved with multiple right-hand
sides stored in the columns of the matrix B.
Before calling this routine, you must call ?potrf to compute the Cholesky factorization of A.
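The factor-then-solve sequence described above can be sketched as follows. This is a minimal illustration, not part of the reference: it assumes the double-precision flavor (dpotrf, dpotrs), the FORTRAN 77 interface, and a program linked against Intel MKL or another LAPACK implementation. The program name and the matrix values are hypothetical.

```fortran
program potrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n,n), b(n,nrhs)
  integer :: info

  ! Symmetric positive-definite matrix (illustrative values, column-major).
  a = reshape([4.0_dp, 1.0_dp, 0.0_dp, &
               1.0_dp, 3.0_dp, 1.0_dp, &
               0.0_dp, 1.0_dp, 2.0_dp], [n, n])
  ! Right-hand side chosen so that the exact solution is x = (1, 1, 1).
  b(:,1) = [5.0_dp, 5.0_dp, 3.0_dp]

  ! Step 1: Cholesky factorization A = U^T*U; a is overwritten by U.
  call dpotrf('U', n, a, n, info)
  if (info /= 0) error stop 'dpotrf failed'

  ! Step 2: solve with the factor; b is overwritten by the solution X.
  call dpotrs('U', n, nrhs, a, n, b, n, info)
  if (info /= 0) error stop 'dpotrs failed'

  print '(3f12.8)', b(:,1)
  if (maxval(abs(b(:,1) - 1.0_dp)) > 1.0e-10_dp) error stop 'unexpected solution'
end program potrs_sketch
```

The same value of uplo must be passed to both calls, because ?potrs reads only the triangle that ?potrf wrote.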
Input Parameters
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
Output Parameters
uplo Must be 'U' or 'L'. The default value is 'U'.
Application Notes
If uplo = 'U', the computed solution for each right-hand side b is the exact solution of a perturbed system
of equations (A + E)x = b, where
Note that cond(A,x) can be much smaller than κ_∞(A). The approximate number of floating-point operations
for one right-hand side vector b is 2n^2 for real flavors and 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?pftrs
Solves a system of linear equations with a Cholesky-
factored symmetric (Hermitian) positive-definite
coefficient matrix using the Rectangular Full Packed
(RFP) format.
Syntax
call spftrs( transr, uplo, n, nrhs, a, b, ldb, info )
call dpftrs( transr, uplo, n, nrhs, a, b, ldb, info )
call cpftrs( transr, uplo, n, nrhs, a, b, ldb, info )
call zpftrs( transr, uplo, n, nrhs, a, b, ldb, info )
Include Files
mkl.fi, lapack.f90
Description
The routine solves a system of linear equations A*X = B with a symmetric positive-definite or, for complex
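data, Hermitian positive-definite matrix A using the Cholesky factorization of A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'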
Before calling ?pftrs, you must call ?pftrf to compute the Cholesky factorization of A. L stands for a lower
triangular matrix and U for an upper triangular matrix.
The matrix A is in the Rectangular Full Packed (RFP) format. For the description of the RFP format, see Matrix
Storage Schemes.
Input Parameters
transr CHARACTER*1. Must be 'N', 'T' (for real data) or 'C' (for complex
data).
If transr = 'N', the untransposed factor of A is stored in RFP format.
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns of the matrix B; nrhs ≥ 0.
Output Parameters
See Also
Matrix Storage Schemes for LAPACK Routines
?pptrs
Solves a system of linear equations with a packed
Cholesky-factored symmetric (Hermitian) positive-
definite coefficient matrix.
Syntax
call spptrs( uplo, n, nrhs, ap, b, ldb, info )
call dpptrs( uplo, n, nrhs, ap, b, ldb, info )
call cpptrs( uplo, n, nrhs, ap, b, ldb, info )
call zpptrs( uplo, n, nrhs, ap, b, ldb, info )
call pptrs( ap, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a packed symmetric positive-definite or,
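for complex data, Hermitian positive-definite matrix A, given the Cholesky factorization of A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'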
where L is a lower triangular matrix and U is upper triangular. The system is solved with multiple right-hand
sides stored in the columns of the matrix B.
Before calling this routine, you must call ?pptrf to compute the Cholesky factorization of A.
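A minimal sketch of the packed workflow follows, assuming the double-precision flavor (dpptrf, dpptrs) and uplo = 'U'. With upper packed storage, the upper triangle is stored column by column, so element A(i,j) with i ≤ j sits at ap(i + (j-1)*j/2). The program name and the data are hypothetical; link against Intel MKL or another LAPACK implementation.

```fortran
program pptrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: ap(n*(n+1)/2), b(n,nrhs)
  integer :: info

  ! Upper triangle of the symmetric positive-definite matrix
  !   [ 4 1 0 ]
  !   [ 1 3 1 ]
  !   [ 0 1 2 ]
  ! packed by columns: (a11), (a12, a22), (a13, a23, a33).
  ap = [4.0_dp, 1.0_dp, 3.0_dp, 0.0_dp, 1.0_dp, 2.0_dp]
  ! Right-hand side chosen so that the exact solution is x = (1, 1, 1).
  b(:,1) = [5.0_dp, 5.0_dp, 3.0_dp]

  ! Factor in packed storage; ap is overwritten by the packed factor U.
  call dpptrf('U', n, ap, info)
  if (info /= 0) error stop 'dpptrf failed'

  ! Solve; b is overwritten by the solution X.
  call dpptrs('U', n, nrhs, ap, b, n, info)
  if (info /= 0) error stop 'dpptrs failed'

  print '(3f12.8)', b(:,1)
  if (maxval(abs(b(:,1) - 1.0_dp)) > 1.0e-10_dp) error stop 'unexpected solution'
end program pptrs_sketch
```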
Input Parameters
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1, nrhs).
Output Parameters
Application Notes
If uplo = 'U', the computed solution for each right-hand side b is the exact solution of a perturbed system
of equations (A + E)x = b, where
If x0 is the true solution, the computed solution x satisfies this error bound:
The approximate number of floating-point operations for one right-hand side vector b is 2n^2 for real flavors
and 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?pbtrs
Solves a system of linear equations with a Cholesky-
factored symmetric (Hermitian) positive-definite band
coefficient matrix.
Syntax
call spbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call dpbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call cpbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call zpbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call pbtrs( ab, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
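The routine solves for X a system of linear equations A*X = B with a symmetric positive-definite (for real
data) or Hermitian positive-definite (for complex data) band matrix A, given the Cholesky factorization of A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'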
where L is a lower triangular matrix and U is upper triangular. The system is solved with multiple right-hand
sides stored in the columns of the matrix B.
Before calling this routine, you must call ?pbtrf to compute the Cholesky factorization of A in the band
storage form.
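The band workflow can be sketched as follows, assuming the double-precision flavor (dpbtrf, dpbtrs), uplo = 'U', and kd = 1. In upper band storage, element A(i,j) is held in ab(kd+1+i-j, j) for max(1, j-kd) ≤ i ≤ j, so the diagonal occupies row kd+1 of ab. The program name and the data are hypothetical; link against Intel MKL or another LAPACK implementation.

```fortran
program pbtrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, kd = 1, nrhs = 1, ldab = kd + 1
  real(dp) :: ab(ldab,n), b(n,nrhs)
  integer :: info

  ! Band matrix with 2 on the diagonal and -1 on the first superdiagonal
  ! (and, by symmetry, on the first subdiagonal).
  ab = 0.0_dp
  ab(kd+1, :)  = 2.0_dp     ! diagonal A(j,j)
  ab(kd, 2:n)  = -1.0_dp    ! superdiagonal A(j-1,j); ab(kd,1) is not referenced
  ! Right-hand side chosen so that the exact solution is x = (1, 1, 1, 1).
  b(:,1) = [1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp]

  ! Cholesky factorization in band storage.
  call dpbtrf('U', n, kd, ab, ldab, info)
  if (info /= 0) error stop 'dpbtrf failed'

  ! Solve; b is overwritten by the solution X.
  call dpbtrs('U', n, kd, nrhs, ab, ldab, b, n, info)
  if (info /= 0) error stop 'dpbtrs failed'

  print '(4f12.8)', b(:,1)
  if (maxval(abs(b(:,1) - 1.0_dp)) > 1.0e-10_dp) error stop 'unexpected solution'
end program pbtrs_sketch
```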
Input Parameters
ldab INTEGER. The leading dimension of the array ab; ldab ≥ kd + 1.
Output Parameters
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
The approximate number of floating-point operations for one right-hand side vector is 4n*kd for real flavors
and 16n*kd for complex flavors.
To estimate the condition number κ_∞(A), call ?pbcon.
To refine the solution and estimate the error, call ?pbrfs.
See Also
Matrix Storage Schemes for LAPACK Routines
?pttrs
Solves a system of linear equations with a symmetric
(Hermitian) positive-definite tridiagonal coefficient
matrix using the factorization computed by ?pttrf.
Syntax
call spttrs( n, nrhs, d, e, b, ldb, info )
call dpttrs( n, nrhs, d, e, b, ldb, info )
call cpttrs( uplo, n, nrhs, d, e, b, ldb, info )
call zpttrs( uplo, n, nrhs, d, e, b, ldb, info )
call pttrs( d, e, b [,info] )
call pttrs( d, e, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X a system of linear equations A*X = B with a symmetric (Hermitian) positive-definite
tridiagonal matrix A. Before calling this routine, call ?pttrf to compute the L*D*L^T or U^T*D*U factorization
of A for real data and the L*D*L^H or U^H*D*U factorization of A for complex data.
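A minimal sketch of this sequence for real data follows, assuming the double-precision flavor (dpttrf, dpttrs), which takes no uplo argument. The array d holds the n diagonal elements and e the n-1 off-diagonal elements. The program name and the data are hypothetical; link against Intel MKL or another LAPACK implementation.

```fortran
program pttrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, nrhs = 1
  real(dp) :: d(n), e(n-1), b(n,nrhs)
  integer :: info

  ! Symmetric positive-definite tridiagonal matrix:
  ! 2 on the diagonal, -1 on the sub- and superdiagonal.
  d = 2.0_dp
  e = -1.0_dp
  ! Right-hand side chosen so that the exact solution is x = (1, 1, 1, 1).
  b(:,1) = [1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp]

  ! Factorization A = L*D*L^T; d and e are overwritten by D and by the
  ! off-diagonal elements of the unit bidiagonal factor.
  call dpttrf(n, d, e, info)
  if (info /= 0) error stop 'dpttrf failed'

  ! Solve; b is overwritten by the solution X.
  call dpttrs(n, nrhs, d, e, b, n, info)
  if (info /= 0) error stop 'dpttrs failed'

  print '(4f12.8)', b(:,1)
  if (maxval(abs(b(:,1) - 1.0_dp)) > 1.0e-10_dp) error stop 'unexpected solution'
end program pttrs_sketch
```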
Input Parameters
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns of the matrix B; nrhs ≥ 0.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
Output Parameters
uplo Used in complex flavors only. Must be 'U' or 'L'. The default value is
'U'.
See Also
Matrix Storage Schemes for LAPACK Routines
?sytrs
Solves a system of linear equations with a UDU^T- or
LDL^T-factored symmetric coefficient matrix.
Syntax
call ssytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call dsytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call csytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zsytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call sytrs( a, b, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a symmetric matrix A, given the Bunch-
Kaufman factorization of A:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T,
where U and L are upper and lower triangular matrices with unit diagonal and D is a symmetric block-
diagonal matrix. The system is solved with multiple right-hand sides stored in the columns of the matrix B.
You must supply to this routine the factor U (or L) and the array ipiv returned by the factorization routine
?sytrf.
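The following sketch shows one way to chain the factorization and the solve for a symmetric indefinite matrix. It assumes the double-precision flavor (dsytrf, dsytrs) and uses the standard LAPACK workspace query (lwork = -1) to size work for dsytrf. The program name and the data are hypothetical; link against Intel MKL or another LAPACK implementation.

```fortran
program sytrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n,n), b(n,nrhs), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: ipiv(n), lwork, info

  ! Symmetric indefinite matrix (zero diagonal, determinant 12).
  a = reshape([0.0_dp, 1.0_dp, 2.0_dp, &
               1.0_dp, 0.0_dp, 3.0_dp, &
               2.0_dp, 3.0_dp, 0.0_dp], [n, n])
  ! Right-hand side chosen so that the exact solution is x = (1, 1, 1).
  b(:,1) = [3.0_dp, 4.0_dp, 5.0_dp]

  ! Workspace query: with lwork = -1, dsytrf only returns the optimal
  ! workspace size in wquery(1).
  call dsytrf('L', n, a, n, ipiv, wquery, -1, info)
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))

  ! Bunch-Kaufman factorization A = L*D*L^T; ipiv records the interchanges
  ! and the block structure of D.
  call dsytrf('L', n, a, n, ipiv, work, lwork, info)
  if (info /= 0) error stop 'dsytrf failed'

  ! Solve with the same a, ipiv, and uplo; b is overwritten by X.
  call dsytrs('L', n, nrhs, a, n, ipiv, b, n, info)
  if (info /= 0) error stop 'dsytrs failed'

  print '(3f12.8)', b(:,1)
  if (maxval(abs(b(:,1) - 1.0_dp)) > 1.0e-10_dp) error stop 'unexpected solution'
end program sytrs_sketch
```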
Input Parameters
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = L*D*L^T.
ipiv INTEGER. Array, size at least max(1, n). The ipiv array, as returned
by ?sytrf.
The array b contains the matrix B whose columns are the right-hand
sides for the system of equations. The second dimension of b must be
at least max(1,nrhs).
Output Parameters
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
The total number of floating-point operations for one right-hand side vector is approximately 2n^2 for real
flavors or 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?sytrs_rook
Solves a system of linear equations with a UDU- or
LDL-factored symmetric coefficient matrix.
Syntax
call ssytrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call dsytrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call csytrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zsytrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call sytrs_rook( a, b, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves a system of linear equations A*X = B with a symmetric matrix A, using the factorization
A = U*D*U^T or A = L*D*L^T computed by ?sytrf_rook.
Input Parameters
ipiv INTEGER. Array, size at least max(1, n). The ipiv array, as returned
by ?sytrf_rook.
The array a contains the block diagonal matrix D and the multipliers
used to obtain U or L as computed by ?sytrf_rook (see uplo).
The array b contains the matrix B whose columns are the right-hand
sides for the system of equations.
Output Parameters
Application Notes
The total number of floating-point operations for one right-hand side vector is approximately 2n^2 for real
flavors or 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?hetrs
Solves a system of linear equations with a UDU^H- or
LDL^H-factored Hermitian coefficient matrix.
Syntax
call chetrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zhetrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call hetrs( a, b, ipiv [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a Hermitian matrix A, given the Bunch-
Kaufman factorization of A:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H,
where U and L are upper and lower triangular matrices with unit diagonal and D is a Hermitian block-
diagonal matrix. The system is solved with multiple right-hand sides stored in the columns of the matrix B.
You must supply to this routine the factor U (or L) and the array ipiv returned by the factorization routine
?hetrf.
Input Parameters
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = L*D*L^H.
nrhs INTEGER. The number of right-hand sides; nrhs ≥ 0.
ipiv INTEGER.
Array, size at least max(1, n).
Output Parameters
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
The total number of floating-point operations for one right-hand side vector is approximately 8n^2.
See Also
Matrix Storage Schemes for LAPACK Routines
?hetrs_rook
Solves a system of linear equations with a UDU- or
LDL-factored Hermitian coefficient matrix.
Syntax
call chetrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zhetrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call hetrs_rook( a, b, ipiv [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves a system of linear equations A*X = B with a complex Hermitian matrix A using the
factorization A = U*D*U^H or A = L*D*L^H computed by ?hetrf_rook.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n).
Arrays: a(lda,n), b(ldb,nrhs).
The array a contains the block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?hetrf_rook (see
uplo).
The array b contains the matrix B whose columns are the right-hand
sides for the system of equations.
Output Parameters
?sytrs2
Solves a system of linear equations with a UDU- or
LDL-factored symmetric coefficient matrix.
Syntax
call ssytrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call dsytrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call csytrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call zsytrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call sytrs2( a,b,ipiv[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves a system of linear equations A*X = B with a symmetric matrix A using the factorization of
A:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T
where
U and L are upper and lower triangular matrices with unit diagonal
D is a symmetric block-diagonal matrix.
The factorization is computed by ?sytrf.
Input Parameters
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = L*D*L^T.
The array a contains the block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?sytrf.
ipiv INTEGER. Array of size n. The ipiv array contains details of the
interchanges and the block structure of D as determined by ?sytrf.
Output Parameters
uplo Indicates how the input matrix A has been factored. Must be 'U' or
'L'.
See Also
?sytrf
Matrix Storage Schemes for LAPACK Routines
?hetrs2
Solves a system of linear equations with a UDU- or
LDL-factored Hermitian coefficient matrix.
Syntax
call chetrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call zhetrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call hetrs2( a, b, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves a system of linear equations A*X = B with a complex Hermitian matrix A using the
factorization of A:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H
where
U and L are upper and lower triangular matrices with unit diagonal
D is a Hermitian block-diagonal matrix.
The factorization is computed by ?hetrf.
Input Parameters
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = L*D*L^H.
The array a contains the block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?hetrf.
ipiv INTEGER. Array of size n. The ipiv array contains details of the
interchanges and the block structure of D as determined by ?hetrf.
Output Parameters
b Holds the matrix B of size (n, nrhs).
See Also
?hetrf
Matrix Storage Schemes for LAPACK Routines
?sptrs
Solves a system of linear equations with a UDU- or
LDL-factored symmetric coefficient matrix using
packed storage.
Syntax
call ssptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call dsptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call csptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zsptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call sptrs( ap, b, ipiv [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a symmetric matrix A, given the Bunch-
Kaufman factorization of A:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T,
where U and L are upper and lower packed triangular matrices with unit diagonal and D is a symmetric
block-diagonal matrix. The system is solved with multiple right-hand sides stored in the columns of the
matrix B. You must supply the factor U (or L) and the array ipiv returned by the factorization routine ?sptrf.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?sptrf.
Output Parameters
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
The total number of floating-point operations for one right-hand side vector is approximately 2n^2 for real
flavors or 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?hptrs
Solves a system of linear equations with a UDU- or
LDL-factored Hermitian coefficient matrix using
packed storage.
Syntax
call chptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zhptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call hptrs( ap, b, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a Hermitian matrix A, given the Bunch-
Kaufman factorization of A:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H,
where U and L are upper and lower packed triangular matrices with unit diagonal and D is a Hermitian
block-diagonal matrix. The system is solved with multiple right-hand sides stored in the columns of the
matrix B.
You must supply to this routine the arrays ap (containing U or L) and ipiv in the form returned by the
factorization routine ?hptrf.
Input Parameters
ipiv INTEGER. Array, size at least max(1, n). The ipiv array, as returned
by ?hptrf.
Output Parameters
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
c(n) is a modest linear function of n, and ε is the machine precision.
If x0 is the true solution, the computed solution x satisfies this error bound:
The total number of floating-point operations for one right-hand side vector is approximately 8n^2 for complex
flavors.
To estimate the condition number κ_∞(A), call ?hpcon.
See Also
Matrix Storage Schemes for LAPACK Routines
?trtrs
Solves a system of linear equations with a triangular
coefficient matrix, with multiple right-hand sides.
Syntax
call strtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call dtrtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call ctrtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call ztrtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call trtrs( a, b [,uplo] [, trans] [,diag] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the following systems of linear equations with a triangular matrix A, with multiple
right-hand sides stored in B:
A*X = B if trans='N',
A^T*X = B if trans='T',
Input Parameters
Output Parameters
Specific details for the routine trtrs interface are as follows:
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
c(n) is a modest linear function of n, and ε is the machine precision. If x0 is the true solution, the computed
solution x satisfies this error bound:
Note that cond(A,x) can be much smaller than κ_∞(A); the condition number of A^T and A^H might or might
not be equal to κ_∞(A).
The approximate number of floating-point operations for one right-hand side vector b is n^2 for real flavors
and 4n^2 for complex flavors.
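A triangular solve needs no prior factorization, as the following minimal sketch illustrates. It assumes the double-precision flavor (dtrtrs) with an upper triangular, non-unit-diagonal matrix and trans = 'N'. The program name and the data are hypothetical; link against Intel MKL or another LAPACK implementation.

```fortran
program trtrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n,n), b(n,nrhs)
  integer :: info

  ! Upper triangular matrix (column-major):
  !   [ 2 1 1 ]
  !   [ 0 3 1 ]
  !   [ 0 0 4 ]
  ! The strictly lower triangle is not referenced when uplo = 'U'.
  a = reshape([2.0_dp, 0.0_dp, 0.0_dp, &
               1.0_dp, 3.0_dp, 0.0_dp, &
               1.0_dp, 1.0_dp, 4.0_dp], [n, n])
  ! Right-hand side chosen so that the exact solution is x = (1, 1, 1).
  b(:,1) = [4.0_dp, 4.0_dp, 4.0_dp]

  ! Solve A*X = B by back substitution; b is overwritten by X.
  call dtrtrs('U', 'N', 'N', n, nrhs, a, n, b, n, info)
  ! A positive info reports a zero diagonal element, that is, a singular A.
  if (info /= 0) error stop 'dtrtrs failed'

  print '(3f12.8)', b(:,1)
  if (maxval(abs(b(:,1) - 1.0_dp)) > 1.0e-10_dp) error stop 'unexpected solution'
end program trtrs_sketch
```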
See Also
Matrix Storage Schemes for LAPACK Routines
?tptrs
Solves a system of linear equations with a packed
triangular coefficient matrix, with multiple right-hand
sides.
Syntax
call stptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call dtptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call ctptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call ztptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call tptrs( ap, b [,uplo] [, trans] [,diag] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the following systems of linear equations with a packed triangular matrix A, with
multiple right-hand sides stored in B:
A*X = B if trans='N',
A^T*X = B if trans='T',
Input Parameters
The array b(ldb,*) contains the matrix B whose columns are the
right-hand sides for the system of equations.
The second dimension of b must be at least max(1, nrhs).
Output Parameters
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
Note that cond(A,x) can be much smaller than κ_∞(A); the condition number of A^T and A^H might or might
not be equal to κ_∞(A).
The approximate number of floating-point operations for one right-hand side vector b is n^2 for real flavors
and 4n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?tbtrs
Solves a system of linear equations with a band
triangular coefficient matrix, with multiple right-hand
sides.
Syntax
call stbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call dtbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call ctbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call ztbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call tbtrs( ab, b [,uplo] [, trans] [,diag] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the following systems of linear equations with a band triangular matrix A, with
multiple right-hand sides stored in B:
A*X = B if trans='N',
A^T*X = B if trans='T',
Input Parameters
kd INTEGER. The number of superdiagonals or subdiagonals in the matrix
A; kd ≥ 0.
Output Parameters
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations (A
+ E)x = b, where
|E| ≤ c(n)ε|A|
c(n) is a modest linear function of n, and ε is the machine precision. If x0 is the true solution, the computed
solution x satisfies this error bound:
Note that cond(A,x) can be much smaller than κ_∞(A); the condition number of A^T and A^H might or might
not be equal to κ_∞(A).
The approximate number of floating-point operations for one right-hand side vector b is 2n*kd for real
flavors and 8n*kd for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?gecon
Estimates the reciprocal of the condition number of a
general matrix in the 1-norm or the infinity-norm.
Syntax
call sgecon( norm, n, a, lda, anorm, rcond, work, iwork, info )
call dgecon( norm, n, a, lda, anorm, rcond, work, iwork, info )
call cgecon( norm, n, a, lda, anorm, rcond, work, rwork, info )
call zgecon( norm, n, a, lda, anorm, rcond, work, rwork, info )
call gecon( a, anorm, rcond [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a general matrix A in the 1-norm or infinity-
norm:
κ_1(A) = ||A||_1 ||A^-1||_1 = κ_∞(A^T) = κ_∞(A^H)
κ_∞(A) = ||A||_∞ ||A^-1||_∞ = κ_1(A^T) = κ_1(A^H).
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
Input Parameters
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b or A^H*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires
approximately 2*n^2 floating-point operations for real flavors and 8*n^2 for complex flavors.
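A typical call sequence for the condition estimate can be sketched as follows. It assumes the double-precision flavor and the 1-norm: dlange computes ||A||_1 on the original matrix, dgetrf computes the LU factorization, and dgecon then estimates rcond from the factors. For real flavors work must have at least 4*n elements and iwork at least n. The program name and the data are hypothetical; link against Intel MKL or another LAPACK implementation.

```fortran
program gecon_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), work(4*n), anorm, rcond
  integer :: ipiv(n), iwork(n), info
  real(dp), external :: dlange

  ! General nonsingular matrix (illustrative values, column-major).
  a = reshape([4.0_dp, 2.0_dp, 0.0_dp, &
               1.0_dp, 3.0_dp, 1.0_dp, &
               0.0_dp, 1.0_dp, 2.0_dp], [n, n])

  ! The norm must be taken before a is overwritten by its LU factors.
  anorm = dlange('1', n, n, a, n, work)

  ! LU factorization with partial pivoting.
  call dgetrf(n, n, a, n, ipiv, info)
  if (info /= 0) error stop 'dgetrf failed'

  ! Estimate the reciprocal condition number in the 1-norm.
  call dgecon('1', n, a, n, anorm, rcond, work, iwork, info)
  if (info /= 0) error stop 'dgecon failed'

  print '(a, es12.4)', 'rcond = ', rcond
  if (rcond <= 0.0_dp) error stop 'matrix is numerically singular'
end program gecon_sketch
```

A value of rcond close to the machine precision suggests that solutions of A*x = b computed from these factors may carry few correct digits.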
See Also
Matrix Storage Schemes for LAPACK Routines
?gbcon
Estimates the reciprocal of the condition number of a
band matrix in the 1-norm or the infinity-norm.
Syntax
call sgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, iwork, info )
call dgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, iwork, info )
call cgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, rwork, info )
call zgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, rwork, info )
call gbcon( ab, ipiv, anorm, rcond [,kl] [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a general band matrix A in the 1-norm or
infinity-norm:
κ_1(A) = ||A||_1 ||A^-1||_1 = κ_∞(A^T) = κ_∞(A^H)
κ_∞(A) = ||A||_∞ ||A^-1||_∞ = κ_1(A^T) = κ_1(A^H).
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
Input Parameters
ldab INTEGER. The leading dimension of the array ab. (ldab ≥ 2*kl + ku + 1).
ipiv INTEGER. Array, size at least max(1, n). The ipiv array, as returned
by ?gbtrf.
Output Parameters
ku Restored as ku = lda-2*kl-1.
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b or AH*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires
approximately 2n(ku + 2kl) floating-point operations for real flavors and 8n(ku + 2kl) for complex
flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?gtcon
Estimates the reciprocal of the condition number of a
tridiagonal matrix.
Syntax
call sgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, iwork, info )
call dgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, iwork, info )
call cgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, info )
call zgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, info )
call gtcon( dl, d, du, du2, ipiv, anorm, rcond [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a real or complex tridiagonal matrix A in the
1-norm or infinity-norm:
κ_1(A) = ||A||_1 ||A^-1||_1
κ_∞(A) = ||A||_∞ ||A^-1||_∞
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
Input Parameters
ipiv INTEGER.
Array, size (n). The array of pivot indices, as returned by ?gttrf.
iwork INTEGER. Workspace array, size (n). Used for real flavors only.
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately 2n^2
floating-point operations for real flavors and 8n^2 for complex flavors.
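For a tridiagonal matrix the same pattern applies with the tridiagonal routines. The sketch below assumes the double-precision flavor and the 1-norm: dlangt computes ||A||_1 from the original diagonals, dgttrf computes the LU factorization (filling du2 and ipiv), and dgtcon estimates rcond. For real flavors work must have at least 2*n elements and iwork at least n. The program name and the data are hypothetical; link against Intel MKL or another LAPACK implementation.

```fortran
program gtcon_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: dl(n-1), d(n), du(n-1), du2(n-2), work(2*n), anorm, rcond
  integer :: ipiv(n), iwork(n), info
  real(dp), external :: dlangt

  ! Tridiagonal matrix: 2 on the diagonal, -1 on the sub- and superdiagonal.
  dl = -1.0_dp
  d  =  2.0_dp
  du = -1.0_dp

  ! The norm must be taken before dgttrf overwrites dl, d, and du.
  anorm = dlangt('1', n, dl, d, du)

  ! LU factorization; du2 receives the second superdiagonal of U.
  call dgttrf(n, dl, d, du, du2, ipiv, info)
  if (info /= 0) error stop 'dgttrf failed'

  ! Estimate the reciprocal condition number in the 1-norm.
  call dgtcon('1', n, dl, d, du, du2, ipiv, anorm, rcond, work, iwork, info)
  if (info /= 0) error stop 'dgtcon failed'

  print '(a, es12.4)', 'rcond = ', rcond
  if (rcond <= 0.0_dp) error stop 'matrix is numerically singular'
end program gtcon_sketch
```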
?pocon
Estimates the reciprocal of the condition number of a
symmetric (Hermitian) positive-definite matrix.
Syntax
call spocon( uplo, n, a, lda, anorm, rcond, work, iwork, info )
call dpocon( uplo, n, a, lda, anorm, rcond, work, iwork, info )
call cpocon( uplo, n, a, lda, anorm, rcond, work, rwork, info )
call zpocon( uplo, n, a, lda, anorm, rcond, work, rwork, info )
call pocon( a, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a symmetric (Hermitian) positive-definite
matrix A:
κ_1(A) = ||A||_1 ||A^-1||_1 (since A is symmetric or Hermitian, κ_∞(A) = κ_1(A)).
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
Input Parameters
The array work is a workspace for the routine. The dimension of work
must be at least max(1, 3*n) for real flavors and max(1, 2*n) for
complex flavors.
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately 2n^2
floating-point operations for real flavors and 8n^2 for complex flavors.
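The sequence for a positive-definite matrix can be sketched as follows, assuming the double-precision flavor: dlansy computes ||A||_1 of the original symmetric matrix, dpotrf computes the Cholesky factorization, and dpocon estimates rcond from the factor. The workspace sizes follow the requirement stated above for real flavors (work of at least 3*n elements) plus an integer array iwork of at least n elements. The program name and the data are hypothetical; link against Intel MKL or another LAPACK implementation.

```fortran
program pocon_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), work(3*n), anorm, rcond
  integer :: iwork(n), info
  real(dp), external :: dlansy

  ! Symmetric positive-definite matrix (illustrative values, column-major).
  a = reshape([4.0_dp, 1.0_dp, 0.0_dp, &
               1.0_dp, 3.0_dp, 1.0_dp, &
               0.0_dp, 1.0_dp, 2.0_dp], [n, n])

  ! The norm must be taken before a is overwritten by its Cholesky factor.
  anorm = dlansy('1', 'U', n, a, n, work)

  ! Cholesky factorization A = U^T*U.
  call dpotrf('U', n, a, n, info)
  if (info /= 0) error stop 'dpotrf failed'

  ! Estimate the reciprocal condition number; uplo must match dpotrf.
  call dpocon('U', n, a, n, anorm, rcond, work, iwork, info)
  if (info /= 0) error stop 'dpocon failed'

  print '(a, es12.4)', 'rcond = ', rcond
  if (rcond <= 0.0_dp) error stop 'matrix is numerically singular'
end program pocon_sketch
```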
See Also
Matrix Storage Schemes for LAPACK Routines
?ppcon
Estimates the reciprocal of the condition number of a
packed symmetric (Hermitian) positive-definite
matrix.
Syntax
call sppcon( uplo, n, ap, anorm, rcond, work, iwork, info )
call dppcon( uplo, n, ap, anorm, rcond, work, iwork, info )
call cppcon( uplo, n, ap, anorm, rcond, work, rwork, info )
call zppcon( uplo, n, ap, anorm, rcond, work, rwork, info )
call ppcon( ap, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a packed symmetric (Hermitian) positive-
definite matrix A:
κ_1(A) = ||A||_1 ||A^-1||_1 (since A is symmetric or Hermitian, κ_∞(A) = κ_1(A)).
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
Input Parameters
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately 2n^2
floating-point operations for real flavors and 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?pbcon
Estimates the reciprocal of the condition number of a
symmetric (Hermitian) positive-definite band matrix.
Syntax
call spbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, iwork, info )
call dpbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, iwork, info )
call cpbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, rwork, info )
call zpbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, rwork, info )
call pbcon( ab, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a symmetric (Hermitian) positive-definite
band matrix A:
κ_1(A) = ||A||_1 ||A^-1||_1 (since A is symmetric or Hermitian, κ_∞(A) = κ_1(A)).
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
Input Parameters
ldab INTEGER. The leading dimension of the array ab. (ldab ≥ kd + 1).
The array work is a workspace for the routine. The dimension of work
must be at least max(1, 3*n) for real flavors and max(1, 2*n) for
complex flavors.
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately
4*n(kd + 1) floating-point operations for real flavors and 16*n(kd + 1) for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?ptcon
Estimates the reciprocal of the condition number of a
symmetric (Hermitian) positive-definite tridiagonal
matrix.
Syntax
call sptcon( n, d, e, anorm, rcond, work, info )
call dptcon( n, d, e, anorm, rcond, work, info )
call cptcon( n, d, e, anorm, rcond, work, info )
call zptcon( n, d, e, anorm, rcond, work, info )
call ptcon( d, e, anorm, rcond [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the reciprocal of the condition number (in the 1-norm) of a real symmetric or complex
Hermitian positive-definite tridiagonal matrix using the factorization A = L*D*L^T for real flavors and A =
L*D*L^H for complex flavors or A = U^T*D*U for real flavors and A = U^H*D*U for complex flavors computed
by ?pttrf:
Input Parameters
The array d contains the n diagonal elements of the diagonal matrix D
from the factorization of A, as computed by ?pttrf;
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately
4*n(kd + 1) floating-point operations for real flavors and 16*n(kd + 1) for complex flavors.
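A minimal sketch of the calling sequence for a tridiagonal matrix follows: the 1-norm is computed from the original d and e, the matrix is factored by ?pttrf, and the factors are passed to ?ptcon. The program name ptcon_sketch and the test matrix are illustrative assumptions, and the program must be linked against a LAPACK implementation.

```fortran
program ptcon_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: d(n), e(n-1), work(n), anorm, rcond, colsum
  integer :: info, j
  external :: dpttrf, dptcon

  ! Illustrative matrix (an assumption): 2 on the diagonal, -1 beside it.
  d = 2.0_dp
  e = -1.0_dp

  ! 1-norm of A (largest absolute column sum); computed before dpttrf
  ! overwrites d and e with the factors.
  anorm = 0.0_dp
  do j = 1, n
    colsum = abs(d(j))
    if (j > 1) colsum = colsum + abs(e(j - 1))
    if (j < n) colsum = colsum + abs(e(j))
    anorm = max(anorm, colsum)
  end do

  ! Factor the tridiagonal matrix in place.
  call dpttrf(n, d, e, info)
  if (info /= 0) stop 'dpttrf failed'

  ! Reciprocal condition number in the 1-norm.
  call dptcon(n, d, e, anorm, rcond, work, info)
  if (info /= 0) stop 'dptcon failed'

  print *, 'rcond =', rcond
end program ptcon_sketch
```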
?sycon
Estimates the reciprocal of the condition number of a
symmetric matrix.
Syntax
call ssycon( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call dsycon( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call csycon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zsycon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call sycon( a, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a symmetric matrix A:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1$ (since A is symmetric, $\kappa_\infty(A) = \kappa_1(A)$).
An estimate is obtained for $\|A^{-1}\|$, and the reciprocal of the condition number is computed as rcond =
1 / ($\|A\| \|A^{-1}\|$).
Before calling this routine:
Input Parameters
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization $A = L D L^T$.
lda INTEGER. The leading dimension of a; lda ≥ max(1, n).
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately $2n^2$
floating-point operations for real flavors and $8n^2$ for complex flavors.
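For a symmetric indefinite matrix the sequence is the same in outline: compute the 1-norm of A, factor with ?sytrf, then call ?sycon with the factor and pivot indices. The following is a minimal sketch under stated assumptions: the program name sycon_sketch, the 3-by-3 matrix, and the workspace length lwork are illustrative choices, and the program must be linked against a LAPACK implementation.

```fortran
program sycon_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, lwork = 64*n
  real(dp) :: a(n, n), fwork(lwork), cwork(2*n), anorm, rcond
  integer :: ipiv(n), iwork(n), info
  external :: dsytrf, dsycon

  ! Illustrative symmetric indefinite matrix (an assumption), column-major.
  a = reshape([ 2.0_dp,  1.0_dp, 0.0_dp, &
                1.0_dp, -3.0_dp, 1.0_dp, &
                0.0_dp,  1.0_dp, 4.0_dp ], [n, n])

  ! 1-norm of A: largest absolute column sum, taken before factorization.
  anorm = maxval(sum(abs(a), dim=1))

  ! Diagonal-pivoting factorization, in place in the lower triangle.
  call dsytrf('L', n, a, n, ipiv, fwork, lwork, info)
  if (info /= 0) stop 'dsytrf failed'

  ! Estimate the reciprocal condition number from the factorization.
  call dsycon('L', n, a, n, ipiv, anorm, rcond, cwork, iwork, info)
  if (info /= 0) stop 'dsycon failed'

  print *, 'rcond =', rcond
end program sycon_sketch
```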
See Also
Matrix Storage Schemes for LAPACK Routines
?sycon_rook
Estimates the reciprocal of the condition number of a
symmetric matrix.
Syntax
call ssycon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call dsycon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call csycon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zsycon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call sycon_rook( a, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a symmetric matrix A:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1$ (since A is symmetric, $\kappa_\infty(A) = \kappa_1(A)$).
Before calling this routine:
Input Parameters
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization $A = L D L^T$.
iwork INTEGER. Workspace array, size at least max(1, n).
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately $2n^2$
floating-point operations for real flavors and $8n^2$ for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?hecon
Estimates the reciprocal of the condition number of a
Hermitian matrix.
Syntax
call checon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zhecon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call hecon( a, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a Hermitian matrix A:
Input Parameters
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization $A = L D L^H$.
The array work is a workspace for the routine. The dimension of work
must be at least max(1, 2*n).
Output Parameters
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see LAPACK
95 Interface Conventions.
Specific details for the routine hecon interface are as follows:
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 5 and never more than 11. Each solution requires approximately $8n^2$
floating-point operations.
See Also
Matrix Storage Schemes for LAPACK Routines
?hecon_rook
Estimates the reciprocal of the condition number of a
Hermitian matrix using factorization obtained with one
of the bounded diagonal pivoting methods (max 2
interchanges).
Syntax
call checon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zhecon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call hecon_rook( a, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a Hermitian matrix A using the factorization
$A = U D U^H$ or $A = L D L^H$ computed by ?hetrf_rook.
An estimate is obtained for norm($A^{-1}$), and the reciprocal of the condition number is computed as rcond =
1/(anorm*norm($A^{-1}$)).
Input Parameters
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization $A = L D L^H$.
Output Parameters
See Also
Matrix Storage Schemes for LAPACK Routines
?spcon
Estimates the reciprocal of the condition number of a
packed symmetric matrix.
Syntax
call sspcon( uplo, n, ap, ipiv, anorm, rcond, work, iwork, info )
call dspcon( uplo, n, ap, ipiv, anorm, rcond, work, iwork, info )
call cspcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call zspcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call spcon( ap, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a packed symmetric matrix A:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1$ (since A is symmetric, $\kappa_\infty(A) = \kappa_1(A)$).
An estimate is obtained for $\|A^{-1}\|$, and the reciprocal of the condition number is computed as rcond =
1 / ($\|A\| \|A^{-1}\|$).
Before calling this routine:
Input Parameters
If uplo = 'L', the array ap stores the packed lower triangular factor
L of the factorization $A = L D L^T$.
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately $2n^2$
floating-point operations for real flavors and $8n^2$ for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?hpcon
Estimates the reciprocal of the condition number of a
packed Hermitian matrix.
Syntax
call chpcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call zhpcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call hpcon( ap, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a Hermitian matrix A:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1$ (since A is Hermitian, $\kappa_\infty(A) = \kappa_1(A)$).
An estimate is obtained for $\|A^{-1}\|$, and the reciprocal of the condition number is computed as rcond =
1 / ($\|A\| \|A^{-1}\|$).
Before calling this routine:
Input Parameters
If uplo = 'L', the array ap stores the packed lower triangular factor
L of the factorization $A = L D L^H$.
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv, as returned by ?hptrf.
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 5 and never more than 11. Each solution requires approximately $8n^2$
floating-point operations.
See Also
Matrix Storage Schemes for LAPACK Routines
?trcon
Estimates the reciprocal of the condition number of a
triangular matrix.
Syntax
call strcon( norm, uplo, diag, n, a, lda, rcond, work, iwork, info )
call dtrcon( norm, uplo, diag, n, a, lda, rcond, work, iwork, info )
call ctrcon( norm, uplo, diag, n, a, lda, rcond, work, rwork, info )
call ztrcon( norm, uplo, diag, n, a, lda, rcond, work, rwork, info )
call trcon( a, rcond [,uplo] [,diag] [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a triangular matrix A in either the 1-norm or
infinity-norm:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1 = \kappa_\infty(A^T) = \kappa_\infty(A^H)$
$\kappa_\infty(A) = \|A\|_\infty \|A^{-1}\|_\infty = \kappa_1(A^T) = \kappa_1(A^H)$.
Input Parameters
Indicates whether A is upper or lower triangular:
If uplo = 'U', the array a stores the upper triangle of A, other array
elements are not referenced.
If uplo = 'L', the array a stores the lower triangle of A, other array
elements are not referenced.
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately $n^2$
floating-point operations for real flavors and $4n^2$ operations for complex flavors.
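Because a triangular matrix needs no prior factorization, ?trcon can be called directly on A. The following minimal sketch requests both norms for the same matrix. The program name trcon_sketch and the 3-by-3 upper triangular matrix are illustrative assumptions, and the program must be linked against a LAPACK implementation.

```fortran
program trcon_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n, n), work(3*n), rcond_one, rcond_inf
  integer :: iwork(n), info
  external :: dtrcon

  ! Illustrative upper triangular matrix (an assumption). Entries below
  ! the diagonal are not referenced when uplo = 'U'.
  a = 0.0_dp
  a(1, 1) = 4.0_dp; a(1, 2) = 1.0_dp; a(1, 3) = 2.0_dp
  a(2, 2) = 3.0_dp; a(2, 3) = 1.0_dp
  a(3, 3) = 2.0_dp

  ! Reciprocal condition number in the 1-norm; diag = 'N' means the
  ! diagonal is taken from a rather than assumed to be all ones.
  call dtrcon('1', 'U', 'N', n, a, n, rcond_one, work, iwork, info)
  if (info /= 0) stop 'dtrcon (1-norm) failed'

  ! Reciprocal condition number in the infinity-norm.
  call dtrcon('I', 'U', 'N', n, a, n, rcond_inf, work, iwork, info)
  if (info /= 0) stop 'dtrcon (infinity-norm) failed'

  print *, 'rcond (1-norm)        =', rcond_one
  print *, 'rcond (infinity-norm) =', rcond_inf
end program trcon_sketch
```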
See Also
Matrix Storage Schemes for LAPACK Routines
?tpcon
Estimates the reciprocal of the condition number of a
packed triangular matrix.
Syntax
call stpcon( norm, uplo, diag, n, ap, rcond, work, iwork, info )
call dtpcon( norm, uplo, diag, n, ap, rcond, work, iwork, info )
call ctpcon( norm, uplo, diag, n, ap, rcond, work, rwork, info )
call ztpcon( norm, uplo, diag, n, ap, rcond, work, rwork, info )
call tpcon( ap, rcond [,uplo] [,diag] [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a packed triangular matrix A in either the
1-norm or infinity-norm:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1 = \kappa_\infty(A^T) = \kappa_\infty(A^H)$
$\kappa_\infty(A) = \|A\|_\infty \|A^{-1}\|_\infty = \kappa_1(A^T) = \kappa_1(A^H)$.
Input Parameters
If norm = 'I', then the routine estimates the condition number of
matrix A in infinity-norm.
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately $n^2$
floating-point operations for real flavors and $4n^2$ operations for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?tbcon
Estimates the reciprocal of the condition number of a
triangular band matrix.
Syntax
call stbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, iwork, info )
call dtbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, iwork, info )
call ctbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, rwork, info )
call ztbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, rwork, info )
call tbcon( ab, rcond [,uplo] [,diag] [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a triangular band matrix A in either the
1-norm or infinity-norm:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1 = \kappa_\infty(A^T) = \kappa_\infty(A^H)$
$\kappa_\infty(A) = \|A\|_\infty \|A^{-1}\|_\infty = \kappa_1(A^T) = \kappa_1(A^H)$.
Input Parameters
If norm = 'I', then the routine estimates the condition number of
matrix A in infinity-norm.
ldab INTEGER. The leading dimension of the array ab (ldab ≥ kd + 1).
Output Parameters
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately
2*n(kd + 1) floating-point operations for real flavors and 8*n(kd + 1) operations for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
Refining the Solution and Estimating Its Error: LAPACK Computational Routines
This section describes the LAPACK routines for refining the computed solution of a system of linear equations
and estimating the solution error. You can call these routines after factorizing the matrix of the system of
equations and computing the solution (see Routines for Matrix Factorization and Routines for Solving
Systems of Linear Equations).
?gerfs
Refines the solution of a system of linear equations
with a general coefficient matrix and estimates its
error.
Syntax
call sgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call dgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call cgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call zgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call gerfs( a, af, ipiv, b, x [,trans] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations $A X = B$, $A^T X = B$,
or $A^H X = B$ with a general matrix A, with multiple right-hand sides. For each computed solution vector
x, the routine computes the component-wise backward error $\beta$. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A) x = (b + \delta b)$.
Finally, the routine estimates the component-wise forward error in the computed solution
$\|x - x_e\|_\infty / \|x\|_\infty$ (here $x_e$ is the exact solution).
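The component-wise backward error described above can be evaluated directly from the residual. The sketch below uses the standard expression max over i of |b - A*x|(i) / (|A|*|x| + |b|)(i). It is an illustration in plain Fortran, not the library's implementation: the 2-by-2 data and the names berr_sketch and beta are assumptions, and the safeguards that a library routine applies when a denominator is tiny are omitted.

```fortran
program berr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  real(dp) :: a(n, n), b(n), x(n), r(n), denom(n), beta

  ! Illustrative system (an assumption), column-major: A = [2 1; 1 3].
  a = reshape([2.0_dp, 1.0_dp, 1.0_dp, 3.0_dp], [n, n])
  b = [3.0_dp, 4.0_dp]

  ! The exact solution is (1, 1); perturb it to mimic a computed solution.
  x = [1.0_dp, 1.0_dp]
  x(1) = x(1) + 1.0e-8_dp

  ! Residual r = b - A*x.
  r = b - matmul(a, x)

  ! Component-wise scale |A|*|x| + |b| (nonzero for this data).
  denom = matmul(abs(a), abs(x)) + abs(b)

  ! Smallest relative perturbation of A and b for which x is exact.
  beta = maxval(abs(r) / denom)

  print *, 'component-wise backward error =', beta
end program berr_sketch
```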
Before calling this routine:
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n).
iwork INTEGER.
Workspace array, size at least max(1, n).
Output Parameters
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of $4n^2$ floating-point
operations (for real flavors) or $16n^2$ operations (for complex flavors). In addition, each step of iterative
refinement involves $6n^2$ operations (for real flavors) or $24n^2$ operations (for complex flavors); the number of
iterations may range from 1 to 5. Estimating the forward error involves solving a number of systems of linear
equations A*x = b with the same coefficient matrix A and different right-hand sides b; the number is usually
4 or 5 and never more than 11. Each solution requires approximately $2n^2$ floating-point operations for real
flavors or $8n^2$ for complex flavors.
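A typical calling sequence keeps the original matrix, factors a copy with ?getrf, solves with ?getrs, and then refines with ?gerfs. The following is a minimal sketch under stated assumptions: the program name gerfs_sketch and the 3-by-3 system are illustrative, and the program must be linked against a LAPACK implementation.

```fortran
program gerfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n, n), af(n, n), b(n, nrhs), x(n, nrhs)
  real(dp) :: ferr(nrhs), berr(nrhs), work(3*n)
  integer :: ipiv(n), iwork(n), info
  external :: dgetrf, dgetrs, dgerfs

  ! Illustrative nonsingular system (an assumption), column-major.
  a = reshape([ 4.0_dp, 1.0_dp, 2.0_dp, &
                1.0_dp, 3.0_dp, 0.0_dp, &
                2.0_dp, 1.0_dp, 5.0_dp ], [n, n])
  b(:, 1) = [1.0_dp, 2.0_dp, 3.0_dp]

  ! Factor a copy: the refinement step needs both A and its LU factors.
  af = a
  call dgetrf(n, n, af, n, ipiv, info)
  if (info /= 0) stop 'dgetrf failed'

  ! Initial solution; dgetrs overwrites its right-hand side argument.
  x = b
  call dgetrs('N', n, nrhs, af, n, ipiv, x, n, info)
  if (info /= 0) stop 'dgetrs failed'

  ! Refine x in place and return forward and backward error estimates.
  call dgerfs('N', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              ferr, berr, work, iwork, info)
  if (info /= 0) stop 'dgerfs failed'

  print *, 'x    =', x(:, 1)
  print *, 'ferr =', ferr(1)
  print *, 'berr =', berr(1)
end program gerfs_sketch
```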
See Also
Matrix Storage Schemes for LAPACK Routines
?gerfsx
Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
general coefficient matrix A and provides error bounds
and backward error estimates.
Syntax
call sgerfsx( trans, equed, n, nrhs, a, lda, af, ldaf, ipiv, r, c, b, ldb, x, ldx,
rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork,
info )
call dgerfsx( trans, equed, n, nrhs, a, lda, af, ldaf, ipiv, r, c, b, ldb, x, ldx,
rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork,
info )
call cgerfsx( trans, equed, n, nrhs, a, lda, af, ldaf, ipiv, r, c, b, ldb, x, ldx,
rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork,
info )
call zgerfsx( trans, equed, n, nrhs, a, lda, af, ldaf, ipiv, r, c, b, ldb, x, ldx,
rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork,
info )
Include Files
mkl.fi, lapack.f90
Description
The routine improves the computed solution to a system of linear equations and provides error bounds and
backward error estimates for the solution. In addition to a normwise error bound, the code provides a
maximum componentwise error bound, if possible. See comments for err_bnds_norm and err_bnds_comp
for details of the error bounds.
The original system of linear equations may have been equilibrated before calling this routine, as described
by the parameters equed, r, and c below. In this case, the solution and error bounds returned are for the
original unequilibrated system.
Input Parameters
If trans = 'C', the system has the form $A^H X = B$ (Conjugate transpose
for complex flavors, Transpose for real flavors).
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is, A has
been replaced by diag(r)*A*diag(c). The right-hand side B has been
changed accordingly.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
The array af contains the factored form of the matrix A, that is, the factors
L and U from the factorization A = P*L*U as computed by ?getrf.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work (size *) is a workspace array. The dimension of work must be at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex flavors.
ipiv INTEGER.
Array, size at least max(1, n). Contains the pivot indices as computed by
?getrf; for 1 ≤ i ≤ n, row i of the matrix was interchanged with row
ipiv(i).
equed = 'R' or 'B', A is multiplied on the left by diag(r); if equed = 'N'
or 'C', r is not accessed.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right hand side and
each type (normwise or componentwise). See err_bnds_norm and
err_bnds_comp descriptions in the Output Parameters section below.
Default 10.0
iwork INTEGER. Workspace array, size at least max(1, n); used in real flavors
only.
Output Parameters
Array of size nrhs by n_err_bnds. For each right-hand side, contains
information about various error bounds and condition numbers
corresponding to the normwise relative error, which is defined as follows:
Normwise relative error in the i-th solution vector
Output parameter only if the input contains erroneous values, namely, in
params(1), params(2), params(3). In such a case, the corresponding
elements of params are filled with default values on output.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0 or err_bnds_comp(j,1) = 0.0). See the definition of
err_bnds_norm and err_bnds_comp for err = 1. To get information about
all of the right-hand sides, check err_bnds_norm or err_bnds_comp.
See Also
Matrix Storage Schemes for LAPACK Routines
?gbrfs
Refines the solution of a system of linear equations
with a general band coefficient matrix and estimates
its error.
Syntax
call sgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x, ldx, ferr,
berr, work, iwork, info )
call dgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x, ldx, ferr,
berr, work, iwork, info )
call cgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x, ldx, ferr,
berr, work, rwork, info )
call zgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x, ldx, ferr,
berr, work, rwork, info )
call gbrfs( ab, afb, ipiv, b, x [,kl] [,trans] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations $A X = B$, $A^T X = B$,
or $A^H X = B$ with a band matrix A, with multiple right-hand sides. For each computed solution vector x,
the routine computes the component-wise backward error $\beta$. This error is the smallest relative perturbation
in elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A) x = (b + \delta b)$.
Finally, the routine estimates the component-wise forward error in the computed solution
$\|x - x_e\|_\infty / \|x\|_\infty$ (here $x_e$ is the exact solution).
Before calling this routine:
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?gbtrf.
Output Parameters
ku Restored as ku = lda-kl-1.
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 4n(kl + ku) floating-point
operations (for real flavors) or 16n(kl + ku) operations (for complex flavors). In addition, each step of
iterative refinement involves 2n(4kl + 3ku) operations (for real flavors) or 8n(4kl + 3ku) operations (for
complex flavors); the number of iterations may range from 1 to 5. Estimating the forward error involves
solving a number of systems of linear equations A*x = b; the number is usually 4 or 5 and never more than
11. Each solution requires approximately $2n^2$ floating-point operations for real flavors or $8n^2$ for complex
flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?gbrfsx
Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
banded coefficient matrix A and provides error bounds
and backward error estimates.
Syntax
call sgbrfsx( trans, equed, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, r, c, b, ldb,
x, ldx, rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
iwork, info )
call dgbrfsx( trans, equed, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, r, c, b, ldb,
x, ldx, rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
iwork, info )
call cgbrfsx( trans, equed, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, r, c, b, ldb,
x, ldx, rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
call zgbrfsx( trans, equed, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, r, c, b, ldb,
x, ldx, rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine improves the computed solution to a system of linear equations and provides error bounds and
backward error estimates for the solution. In addition to a normwise error bound, the code provides a
maximum componentwise error bound, if possible. See comments for err_bnds_norm and err_bnds_comp
for details of the error bounds.
The original system of linear equations may have been equilibrated before calling this routine, as described
by the parameters equed, r, and c below. In this case, the solution and error bounds returned are for the
original unequilibrated system.
Input Parameters
If trans = 'C', the system has the form $A^H X = B$ (Conjugate transpose
for complex flavors, Transpose for real flavors).
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is, A has
been replaced by diag(r)*A*diag(c). The right-hand side B has been
changed accordingly.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
ldafb INTEGER. The leading dimension of the array afb; ldafb ≥ 2*kl+ku+1.
ipiv INTEGER.
Array, size at least max(1, n). Contains the pivot indices as computed by
?gbtrf; for 1 ≤ i ≤ n, row i of the matrix was interchanged with row
ipiv(i).
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right-hand side and
each type (normwise or componentwise). See err_bnds_norm and
err_bnds_comp descriptions in the Output Parameters section below.
params(1) : Whether to perform iterative refinement or not. Default: 1.0
(for single precision flavors), 1.0D+0 (for double precision flavors).
Default 10.0
iwork INTEGER. Workspace array, size at least max(1, n); used in real flavors
only.
Output Parameters
"guaranteed". These reciprocal condition
numbers for some appropriately scaled matrix Z
are
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0 or err_bnds_comp(j,1) = 0.0). See the definition of err_bnds_norm
and err_bnds_comp for err = 1. To get information about all of the right-
hand sides, check err_bnds_norm or err_bnds_comp.
See Also
Matrix Storage Schemes for LAPACK Routines
?gtrfs
Refines the solution of a system of linear equations
with a tridiagonal coefficient matrix and estimates its
error.
Syntax
call sgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x, ldx,
ferr, berr, work, iwork, info )
call dgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x, ldx,
ferr, berr, work, iwork, info )
call cgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x, ldx,
ferr, berr, work, rwork, info )
call zgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x, ldx,
ferr, berr, work, rwork, info )
call gtrfs( dl, d, du, dlf, df, duf, du2, ipiv, b, x [,trans] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations $A X = B$, $A^T X = B$,
or $A^H X = B$ with a tridiagonal matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error $\beta$. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A) x = (b + \delta b)$.
Finally, the routine estimates the component-wise forward error in the computed solution
$\|x - x_e\|_\infty / \|x\|_\infty$ (here $x_e$ is the exact solution).
Before calling this routine:
Input Parameters
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns of the matrix B; nrhs ≥ 0.
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?gttrf.
iwork INTEGER. Workspace array, size (n). Used for real flavors only.
Output Parameters
duf Holds the vector of length (n-1).
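For a tridiagonal system the FORTRAN 77 interface takes both the original diagonals (dl, d, du) and their factored counterparts (dlf, df, duf, du2) from ?gttrf. The following is a minimal sketch under stated assumptions: the program name gtrfs_sketch and the 4-by-4 diagonally dominant system are illustrative, and the program must be linked against a LAPACK implementation.

```fortran
program gtrfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, nrhs = 1
  real(dp) :: dl(n-1), d(n), du(n-1)
  real(dp) :: dlf(n-1), df(n), duf(n-1), du2(n-2)
  real(dp) :: b(n, nrhs), x(n, nrhs), ferr(nrhs), berr(nrhs), work(3*n)
  integer :: ipiv(n), iwork(n), info
  external :: dgttrf, dgttrs, dgtrfs

  ! Illustrative diagonally dominant tridiagonal matrix (an assumption).
  dl = 1.0_dp
  d  = 4.0_dp
  du = 2.0_dp
  b(:, 1) = [1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp]

  ! Factor copies of the diagonals; the originals are needed for refinement.
  dlf = dl
  df  = d
  duf = du
  call dgttrf(n, dlf, df, duf, du2, ipiv, info)
  if (info /= 0) stop 'dgttrf failed'

  ! Initial solution; dgttrs overwrites its right-hand side argument.
  x = b
  call dgttrs('N', n, nrhs, dlf, df, duf, du2, ipiv, x, n, info)
  if (info /= 0) stop 'dgttrs failed'

  ! Refine x in place and return forward and backward error estimates.
  call dgtrfs('N', n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, &
              b, n, x, n, ferr, berr, work, iwork, info)
  if (info /= 0) stop 'dgtrfs failed'

  print *, 'x    =', x(:, 1)
  print *, 'ferr =', ferr(1)
  print *, 'berr =', berr(1)
end program gtrfs_sketch
```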
See Also
Matrix Storage Schemes for LAPACK Routines
?porfs
Refines the solution of a system of linear equations
with a symmetric (Hermitian) positive-definite
coefficient matrix and estimates its error.
Syntax
call sporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call dporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call cporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call zporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call porfs( a, af, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations $A X = B$ with a
symmetric (Hermitian) positive definite matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error $\beta$. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A) x = (b + \delta b)$.
Finally, the routine estimates the component-wise forward error in the computed solution
$\|x - x_e\|_\infty / \|x\|_\infty$ (here $x_e$ is the exact solution).
Before calling this routine:
Input Parameters
Output Parameters
info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of $4n^2$ floating-point
operations (for real flavors) or $16n^2$ operations (for complex flavors). In addition, each step of iterative
refinement involves $6n^2$ operations (for real flavors) or $24n^2$ operations (for complex flavors); the number of
iterations may range from 1 to 5. Estimating the forward error involves solving a number of systems of linear
equations A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires
approximately $2n^2$ floating-point operations for real flavors or $8n^2$ for complex flavors.
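For a symmetric positive-definite system the sequence is Cholesky factorization with ?potrf, solution with ?potrs, then refinement with ?porfs. The following is a minimal sketch under stated assumptions: the program name porfs_sketch and the 3-by-3 system are illustrative, and the program must be linked against a LAPACK implementation.

```fortran
program porfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n, n), af(n, n), b(n, nrhs), x(n, nrhs)
  real(dp) :: ferr(nrhs), berr(nrhs), work(3*n)
  integer :: iwork(n), info
  external :: dpotrf, dpotrs, dporfs

  ! Illustrative symmetric positive-definite matrix (an assumption).
  a = reshape([ 4.0_dp, 1.0_dp, 0.0_dp, &
                1.0_dp, 3.0_dp, 1.0_dp, &
                0.0_dp, 1.0_dp, 2.0_dp ], [n, n])
  b(:, 1) = [1.0_dp, 2.0_dp, 3.0_dp]

  ! Cholesky-factor a copy: refinement needs both A and its factor.
  af = a
  call dpotrf('U', n, af, n, info)
  if (info /= 0) stop 'dpotrf failed'

  ! Initial solution; dpotrs overwrites its right-hand side argument.
  x = b
  call dpotrs('U', n, nrhs, af, n, x, n, info)
  if (info /= 0) stop 'dpotrs failed'

  ! Refine x in place and return forward and backward error estimates.
  call dporfs('U', n, nrhs, a, n, af, n, b, n, x, n, &
              ferr, berr, work, iwork, info)
  if (info /= 0) stop 'dporfs failed'

  print *, 'x    =', x(:, 1)
  print *, 'ferr =', ferr(1)
  print *, 'berr =', berr(1)
end program porfs_sketch
```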
See Also
Matrix Storage Schemes for LAPACK Routines
?porfsx
Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
symmetric/Hermitian positive-definite coefficient
matrix A and provides error bounds and backward
error estimates.
Syntax
call sporfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, s, b, ldb, x, ldx, rcond, berr,
n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork, info )
call dporfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, s, b, ldb, x, ldx, rcond, berr,
n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork, info )
call cporfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, s, b, ldb, x, ldx, rcond, berr,
n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
call zporfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, s, b, ldb, x, ldx, rcond, berr,
n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine improves the computed solution to a system of linear equations and provides error bounds and
backward error estimates for the solution. In addition to a normwise error bound, the code provides a
maximum componentwise error bound, if possible. See comments for err_bnds_norm and err_bnds_comp
for details of the error bounds.
The original system of linear equations may have been equilibrated before calling this routine, as described
by the parameters equed and s below. In this case, the solution and error bounds returned are for the
original unequilibrated system.
Input Parameters
If equed = 'Y', both row and column equilibration was done, that is, A has
been replaced by diag(s)*A*diag(s). The right-hand side B has been
changed accordingly.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work(*) is a workspace array. The dimension of work must be at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex flavors.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right-hand side and
each type (normwise or componentwise). See the err_bnds_norm and
err_bnds_comp descriptions in the Output Parameters section below.
Default 10.0
iwork INTEGER. Workspace array, size at least max(1, n); used in real flavors
only.
Output Parameters
DOUBLE PRECISION for double precision flavors.
Array, size at least max(1, nrhs). Contains the componentwise relative
backward error for each solution vector x(j), that is, the smallest relative
change in any element of A or B that makes x(j) an exact solution.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0); otherwise, it is the first with either a normwise or a componentwise
error bound that is not guaranteed (the smallest j such that
err_bnds_norm(j,1) = 0.0 or err_bnds_comp(j,1) = 0.0). See the
definitions of err_bnds_norm and err_bnds_comp for err = 1. To get
information about all of the right-hand sides, check err_bnds_norm or
err_bnds_comp.
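The info conventions described here can be decoded mechanically. The following is a minimal sketch, not part of the library: the program name, the helper report, and the demo order norder = 4 are all hypothetical, and positive values of info up to n are deliberately left to the full info description.

```fortran
program rfsx_info_sketch
  implicit none
  integer, parameter :: norder = 4   ! assumed matrix order n for the demo

  call report(0, norder)             ! successful exit
  call report(-3, norder)            ! illegal third parameter
  call report(norder + 2, norder)    ! info = n + j with j = 2

contains

  ! Hypothetical helper: prints a message for one info value.
  subroutine report(info, n)
    integer, intent(in) :: info, n
    if (info == 0) then
      print '(a)', 'info = 0: execution successful'
    else if (info < 0) then
      print '(a,i0,a)', 'parameter ', -info, ' had an illegal value'
    else if (info > n) then
      ! info = n + j: j is the first right-hand side without a guarantee
      print '(a,i0,a)', 'right-hand side ', info - n, &
            ' is the first whose solution is not guaranteed'
    else
      print '(a,i0,a)', 'info = ', info, ': see the full description of info'
    end if
  end subroutine report

end program rfsx_info_sketch
```

Because only the first unguaranteed right-hand side is reported, a caller that needs the status of every column would still inspect err_bnds_norm or err_bnds_comp.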
See Also
Matrix Storage Schemes for LAPACK Routines
?pprfs
Refines the solution of a system of linear equations
with a symmetric (Hermitian) positive-definite
coefficient matrix stored in a packed format and
estimates its error.
Syntax
call spprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, iwork, info )
call dpprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, iwork, info )
call cpprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, rwork, info )
call zpprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, rwork, info )
call pprfs( ap, afp, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
symmetric (Hermitian) positive definite matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error β. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
Finally, the routine estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞
where xe is the exact solution.
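A typical use of dpprfs follows a packed Cholesky factorization and solve. The sketch below assumes the usual LAPACK sequence dpptrf, dpptrs, dpprfs and a small made-up 3-by-3 system. The matrix values are illustrative only, and the routines are declared external rather than through mkl.fi or lapack.f90.

```fortran
program pprfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  ! Upper triangle of A packed by columns: a11, a12, a22, a13, a23, a33
  real(dp) :: ap(n*(n+1)/2), afp(n*(n+1)/2)
  real(dp) :: b(n,nrhs), x(n,nrhs)
  real(dp) :: ferr(nrhs), berr(nrhs), work(3*n)
  integer  :: iwork(n), info
  external :: dpptrf, dpptrs, dpprfs

  ap = [4.0_dp, 1.0_dp, 3.0_dp, 0.0_dp, 1.0_dp, 2.0_dp]
  b(:,1) = [5.0_dp, 5.0_dp, 3.0_dp]        ! equals A*(1,1,1)^T

  afp = ap                                 ! keep the original A in ap
  call dpptrf('U', n, afp, info)           ! Cholesky factorization
  if (info /= 0) error stop 'dpptrf failed'

  x = b                                    ! keep the original B in b
  call dpptrs('U', n, nrhs, afp, x, n, info)
  if (info /= 0) error stop 'dpptrs failed'

  call dpprfs('U', n, nrhs, ap, afp, b, n, x, n, ferr, berr, &
              work, iwork, info)
  if (info /= 0) error stop 'dpprfs failed'

  print '(a,3f12.8)', 'x    =', x(:,1)
  print '(a,es10.3)', 'ferr =', ferr(1)
  print '(a,es10.3)', 'berr =', berr(1)
end program pprfs_sketch
```

Note that the refinement routine needs both the original matrix (ap) and its factor (afp), as well as the original right-hand side b, so the factorization and solve work on copies.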
Input Parameters
iwork INTEGER. Workspace array, size at least max(1, n).
Output Parameters
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 4n^2 floating-point
operations (for real flavors) or 16n^2 operations (for complex flavors). In addition, each step of iterative
refinement involves 6n^2 operations (for real flavors) or 24n^2 operations (for complex flavors); the number of
iterations may range from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
of systems is usually 4 or 5 and never more than 11. Each solution requires approximately 2n^2 floating-point
operations for real flavors or 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?pbrfs
Refines the solution of a system of linear equations
with a band symmetric (Hermitian) positive-definite
coefficient matrix and estimates its error.
Syntax
call spbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call dpbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call cpbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call zpbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call pbrfs( ab, afb, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
symmetric (Hermitian) positive definite band matrix A, with multiple right-hand sides. For each computed
solution vector x, the routine computes the component-wise backward error β. This error is the smallest
relative perturbation in elements of A and b such that x is the exact solution of the perturbed system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
Finally, the routine estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞ (here xe is the exact solution).
Before calling this routine:
call the factorization routine ?pbtrf
call the solver routine ?pbtrs.
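For a band matrix, the factorize, solve, refine sequence can be sketched as follows. This is an illustrative example only: the 3-by-3 matrix with kd = 1 is made up, the upper-band layout ab(kd+1+i-j, j) = a(i,j) is the standard LAPACK band storage scheme, and the routines are declared external rather than through mkl.fi or lapack.f90.

```fortran
program pbrfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, kd = 1, nrhs = 1, ldab = kd + 1
  real(dp) :: ab(ldab,n), afb(ldab,n)
  real(dp) :: b(n,nrhs), x(n,nrhs)
  real(dp) :: ferr(nrhs), berr(nrhs), work(3*n)
  integer  :: iwork(n), info
  external :: dpbtrf, dpbtrs, dpbrfs

  ! Upper band storage: row kd+1 holds the diagonal, row kd the superdiagonal
  ab(1,:) = [0.0_dp, 1.0_dp, 1.0_dp]       ! ab(1,1) is not referenced
  ab(2,:) = [4.0_dp, 3.0_dp, 2.0_dp]
  b(:,1)  = [5.0_dp, 5.0_dp, 3.0_dp]       ! equals A*(1,1,1)^T

  afb = ab                                 ! keep the original A in ab
  call dpbtrf('U', n, kd, afb, ldab, info)
  if (info /= 0) error stop 'dpbtrf failed'

  x = b                                    ! keep the original B in b
  call dpbtrs('U', n, kd, nrhs, afb, ldab, x, n, info)
  if (info /= 0) error stop 'dpbtrs failed'

  call dpbrfs('U', n, kd, nrhs, ab, ldab, afb, ldab, b, n, x, n, &
              ferr, berr, work, iwork, info)
  if (info /= 0) error stop 'dpbrfs failed'

  print '(a,3f12.8)', 'x    =', x(:,1)
  print '(a,es10.3)', 'ferr =', ferr(1)
  print '(a,es10.3)', 'berr =', berr(1)
end program pbrfs_sketch
```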
Input Parameters
COMPLEX for cpbrfs
DOUBLE COMPLEX for zpbrfs.
Arrays:
ab(ldab,*) contains the original band matrix A, as supplied to ?pbtrf.
afb(ldafb,*) contains the factored band matrix A, as returned by ?pbtrf.
b(ldb,*) contains the right-hand side matrix B.
x(ldx,*) contains the solution matrix X.
work(*) is a workspace array.
The second dimension of ab and afb must be at least max(1, n); the
second dimension of b and x must be at least max(1, nrhs); the
dimension of work must be at least max(1, 3*n) for real flavors and
max(1, 2*n) for complex flavors.
Output Parameters
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 8n*kd floating-point
operations (for real flavors) or 32n*kd operations (for complex flavors). In addition, each step of iterative
refinement involves 12n*kd operations (for real flavors) or 48n*kd operations (for complex flavors); the
number of iterations may range from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
is usually 4 or 5 and never more than 11. Each solution requires approximately 4n*kd floating-point
operations for real flavors or 16n*kd for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?ptrfs
Refines the solution of a system of linear equations
with a symmetric (Hermitian) positive-definite
tridiagonal coefficient matrix and estimates its error.
Syntax
call sptrfs( n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, info )
call dptrfs( n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, info )
call cptrfs( uplo, n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call zptrfs( uplo, n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call ptrfs( d, df, e, ef, b, x [,ferr] [,berr] [,info] )
call ptrfs( d, df, e, ef, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
symmetric (Hermitian) positive definite tridiagonal matrix A, with multiple right-hand sides. For each
computed solution vector x, the routine computes the component-wise backward error β. This error is the
smallest relative perturbation in elements of A and b such that x is the exact solution of the perturbed
system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
Finally, the routine estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞ (here xe is the exact solution).
Before calling this routine:
call the factorization routine ?pttrf
call the solver routine ?pttrs.
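The calling order listed above can be sketched for the real double-precision flavor as follows. The tridiagonal system is a made-up 3-by-3 example, and the routines are declared external rather than through mkl.fi or lapack.f90. Real flavors take no uplo argument.

```fortran
program ptrfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: d(n), e(n-1), df(n), ef(n-1)
  real(dp) :: b(n,nrhs), x(n,nrhs)
  real(dp) :: ferr(nrhs), berr(nrhs), work(2*n)
  integer  :: info
  external :: dpttrf, dpttrs, dptrfs

  d = [4.0_dp, 3.0_dp, 2.0_dp]             ! diagonal of A
  e = [1.0_dp, 1.0_dp]                     ! off-diagonal of A
  b(:,1) = [5.0_dp, 5.0_dp, 3.0_dp]        ! equals A*(1,1,1)^T

  df = d                                   ! keep the original A in d, e
  ef = e
  call dpttrf(n, df, ef, info)             ! factorization
  if (info /= 0) error stop 'dpttrf failed'

  x = b                                    ! keep the original B in b
  call dpttrs(n, nrhs, df, ef, x, n, info) ! initial solution
  if (info /= 0) error stop 'dpttrs failed'

  call dptrfs(n, nrhs, d, e, df, ef, b, n, x, n, ferr, berr, work, info)
  if (info /= 0) error stop 'dptrfs failed'

  print '(a,3f12.8)', 'x    =', x(:,1)
  print '(a,es10.3)', 'ferr =', ferr(1)
  print '(a,es10.3)', 'berr =', berr(1)
end program ptrfs_sketch
```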
Input Parameters
uplo CHARACTER*1. Used for complex flavors only. Must be 'U' or 'L'.
Specifies whether the superdiagonal or the subdiagonal of the
tridiagonal matrix A is stored and how A is factored:
If uplo = 'U', the array e stores the superdiagonal of A, and A is
factored as U^H*D*U.
d, df, rwork REAL for single precision flavors DOUBLE PRECISION for double
precision flavors
Arrays: d(n), df(n), rwork(n).
The array rwork is a workspace array used for complex flavors only.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
uplo Used in complex flavors only. Must be 'U' or 'L'. The default value is
'U'.
See Also
Matrix Storage Schemes for LAPACK Routines
?syrfs
Refines the solution of a system of linear equations
with a symmetric coefficient matrix and estimates its
error.
Syntax
call ssyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call dsyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call csyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call zsyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call syrfs( a, af, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
symmetric full-storage matrix A, with multiple right-hand sides. For each computed solution vector x, the
routine computes the component-wise backward error β. This error is the smallest relative perturbation in
elements of A and b such that x is the exact solution of the perturbed system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
Finally, the routine estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞ (here xe is the exact solution).
Before calling this routine:
call the factorization routine ?sytrf
call the solver routine ?sytrs.
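For a symmetric indefinite matrix in full storage, the sequence dsytrf, dsytrs, dsyrfs can be sketched as follows. The 3-by-3 matrix is a made-up example, the factorization workspace size lwork = 64*n is an arbitrary generous choice rather than a recommended value, and the routines are declared external rather than through mkl.fi or lapack.f90.

```fortran
program syrfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  integer, parameter :: lwork = 64*n       ! assumed generous workspace
  real(dp) :: a(n,n), af(n,n), b(n,nrhs), x(n,nrhs)
  real(dp) :: ferr(nrhs), berr(nrhs), fwork(lwork), work(3*n)
  integer  :: ipiv(n), iwork(n), info
  external :: dsytrf, dsytrs, dsyrfs

  ! Symmetric indefinite A (column-major; only the upper triangle is used)
  a = reshape([1.0_dp,  2.0_dp, 0.0_dp, &
               2.0_dp, -1.0_dp, 1.0_dp, &
               0.0_dp,  1.0_dp, 3.0_dp], [n, n])
  b(:,1) = [3.0_dp, 2.0_dp, 4.0_dp]        ! equals A*(1,1,1)^T

  af = a                                   ! keep the original A in a
  call dsytrf('U', n, af, n, ipiv, fwork, lwork, info)
  if (info /= 0) error stop 'dsytrf failed'

  x = b                                    ! keep the original B in b
  call dsytrs('U', n, nrhs, af, n, ipiv, x, n, info)
  if (info /= 0) error stop 'dsytrs failed'

  call dsyrfs('U', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              ferr, berr, work, iwork, info)
  if (info /= 0) error stop 'dsyrfs failed'

  print '(a,3f12.8)', 'x    =', x(:,1)
  print '(a,es10.3)', 'ferr =', ferr(1)
  print '(a,es10.3)', 'berr =', berr(1)
end program syrfs_sketch
```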
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?sytrf.
Output Parameters
ipiv Holds the vector of length n.
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 4n^2 floating-point
operations (for real flavors) or 16n^2 operations (for complex flavors). In addition, each step of iterative
refinement involves 6n^2 operations (for real flavors) or 24n^2 operations (for complex flavors); the number of
iterations may range from 1 to 5. Estimating the forward error involves solving a number of systems of linear
equations A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires
approximately 2n^2 floating-point operations for real flavors or 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?syrfsx
Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
symmetric indefinite coefficient matrix A and provides
error bounds and backward error estimates.
Syntax
call ssyrfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork, info )
call dsyrfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork, info )
call csyrfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
call zsyrfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine improves the computed solution to a system of linear equations when the coefficient matrix is
symmetric indefinite, and provides error bounds and backward error estimates for the solution. In addition to
a normwise error bound, the code provides a maximum componentwise error bound, if possible. See
comments for err_bnds_norm and err_bnds_comp for details of the error bounds.
The original system of linear equations may have been equilibrated before calling this routine, as described
by the parameters equed and s below. In this case, the solution and error bounds returned are for the
original unequilibrated system.
Input Parameters
If equed = 'Y', both row and column equilibration was done, that is, A has
been replaced by diag(s)*A*diag(s). The right-hand side B has been
changed accordingly.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work(*) is a workspace array. The dimension of work must be at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex flavors.
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges and the
block structure of D as determined by ?sytrf.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right-hand side and
each type (normwise or componentwise). See the err_bnds_norm and
err_bnds_comp descriptions in the Output Parameters section below.
Default 10.0
iwork INTEGER. Workspace array, size at least max(1, n); used in real flavors
only.
Output Parameters
Array, size at least max(1, nrhs). Contains the componentwise relative
backward error for each solution vector x(j), that is, the smallest relative
change in any element of A or B that makes x(j) an exact solution.
params REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Output parameter only if the input contains erroneous values, namely, in
params(1), params(2), params(3). In such a case, the corresponding
elements of params are filled with default values on output.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0); otherwise, it is the first with either a normwise or a componentwise
error bound that is not guaranteed (the smallest j such that
err_bnds_norm(j,1) = 0.0 or err_bnds_comp(j,1) = 0.0). See the
definitions of err_bnds_norm and err_bnds_comp for err = 1. To get
information about all of the right-hand sides, check err_bnds_norm or
err_bnds_comp.
See Also
Matrix Storage Schemes for LAPACK Routines
?herfs
Refines the solution of a system of linear equations
with a complex Hermitian coefficient matrix and
estimates its error.
Syntax
call cherfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call zherfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call herfs( a, af, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
complex Hermitian full-storage matrix A, with multiple right-hand sides. For each computed solution vector x,
the routine computes the component-wise backward error β. This error is the smallest relative perturbation
in elements of A and b such that x is the exact solution of the perturbed system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
Finally, the routine estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞ (here xe is the exact solution).
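For the complex Hermitian case, a minimal sketch with zherfs is shown below. It assumes the usual sequence zhetrf, zhetrs, zherfs. The 3-by-3 matrix is made up, the factorization workspace size lwork = 64*n is an arbitrary generous choice, and the routines are declared external rather than through mkl.fi or lapack.f90.

```fortran
program herfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  integer, parameter :: lwork = 64*n       ! assumed generous workspace
  complex(dp) :: a(n,n), af(n,n), b(n,nrhs), x(n,nrhs)
  complex(dp) :: fwork(lwork), work(2*n)
  real(dp)    :: ferr(nrhs), berr(nrhs), rwork(n)
  integer     :: ipiv(n), info
  external    :: zhetrf, zhetrs, zherfs

  ! Only the upper triangle of the Hermitian matrix is referenced (uplo='U')
  a = (0.0_dp, 0.0_dp)
  a(1,1) = (2.0_dp, 0.0_dp);  a(1,2) = (1.0_dp, -1.0_dp)
  a(2,2) = (3.0_dp, 0.0_dp);  a(2,3) = (0.0_dp,  1.0_dp)
  a(3,3) = (1.0_dp, 0.0_dp)

  ! Right-hand side equal to A*(1,1,1)^T
  b(1,1) = (3.0_dp, -1.0_dp)
  b(2,1) = (4.0_dp,  2.0_dp)
  b(3,1) = (1.0_dp, -1.0_dp)

  af = a                                   ! keep the original A in a
  call zhetrf('U', n, af, n, ipiv, fwork, lwork, info)
  if (info /= 0) error stop 'zhetrf failed'

  x = b                                    ! keep the original B in b
  call zhetrs('U', n, nrhs, af, n, ipiv, x, n, info)
  if (info /= 0) error stop 'zhetrs failed'

  call zherfs('U', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              ferr, berr, work, rwork, info)
  if (info /= 0) error stop 'zherfs failed'

  print '(a,es10.3)', 'ferr =', ferr(1)
  print '(a,es10.3)', 'berr =', berr(1)
end program herfs_sketch
```

Complex flavors take a real workspace rwork in place of the integer workspace iwork used by the real flavors.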
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?hetrf.
Output Parameters
DOUBLE PRECISION for zherfs.
Arrays, size at least max(1, nrhs). Contain the component-wise
forward and backward errors, respectively, for each solution vector.
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 16n^2 operations. In
addition, each step of iterative refinement involves 24n^2 operations; the number of iterations may range
from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
is usually 4 or 5 and never more than 11. Each solution requires approximately 8n^2 floating-point operations.
See Also
Matrix Storage Schemes for LAPACK Routines
?herfsx
Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
Hermitian indefinite coefficient matrix A and provides
error bounds and backward error estimates.
Syntax
call cherfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
call zherfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine improves the computed solution to a system of linear equations when the coefficient matrix is
Hermitian indefinite, and provides error bounds and backward error estimates for the solution. In addition to
a normwise error bound, the code provides a maximum componentwise error bound, if possible. See
comments for err_bnds_norm and err_bnds_comp for details of the error bounds.
The original system of linear equations may have been equilibrated before calling this routine, as described
by the parameters equed and s below. In this case, the solution and error bounds returned are for the
original unequilibrated system.
Input Parameters
If equed = 'Y', both row and column equilibration was done, that is, A has
been replaced by diag(s)*A*diag(s). The right-hand side B has been
changed accordingly.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
The array af contains the block diagonal matrix D and the multipliers used
to obtain the factor U or L from the factorization A = U*D*U^H or A =
L*D*L^H as computed by chetrf for cherfsx or zhetrf for zherfsx.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work(*) is a workspace array. The dimension of work must be at least
max(1,2*n).
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges and the
block structure of D as determined by chetrf for cherfsx or zhetrf for
zherfsx.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right-hand side and
each type (normwise or componentwise). See the err_bnds_norm and
err_bnds_comp descriptions in the Output Parameters section below.
Default 10
Output Parameters
Array, size at least max(1, nrhs). Contains the componentwise relative
backward error for each solution vector x(j), that is, the smallest relative
change in any element of A or B that makes x(j) an exact solution.
DOUBLE PRECISION for double precision flavors.
Output parameter only if the input contains erroneous values, namely, in
params(1), params(2), params(3). In such a case, the corresponding
elements of params are filled with default values on output.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0); otherwise, it is the first with either a normwise or a componentwise
error bound that is not guaranteed (the smallest j such that
err_bnds_norm(j,1) = 0.0 or err_bnds_comp(j,1) = 0.0). See the
definitions of err_bnds_norm and err_bnds_comp for err = 1. To get
information about all of the right-hand sides, check err_bnds_norm or
err_bnds_comp.
See Also
Matrix Storage Schemes for LAPACK Routines
?sprfs
Refines the solution of a system of linear equations
with a packed symmetric coefficient matrix and
estimates the solution error.
Syntax
call ssprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call dsprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call csprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call zsprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call sprfs( ap, afp, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
packed symmetric matrix A, with multiple right-hand sides. For each computed solution vector x, the routine
computes the component-wise backward error β. This error is the smallest relative perturbation in elements
of A and b such that x is the exact solution of the perturbed system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
Finally, the routine estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞ (here xe is the exact solution).
Before calling this routine:
call the factorization routine ?sptrf
call the solver routine ?sptrs.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?sptrf.
Output Parameters
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 4n^2 floating-point
operations (for real flavors) or 16n^2 operations (for complex flavors). In addition, each step of iterative
refinement involves 6n^2 operations (for real flavors) or 24n^2 operations (for complex flavors); the number of
iterations may range from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
of systems is usually 4 or 5 and never more than 11. Each solution requires approximately 2n^2 floating-point
operations for real flavors or 8n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?hprfs
Refines the solution of a system of linear equations
with a packed complex Hermitian coefficient matrix
and estimates the solution error.
Syntax
call chprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call zhprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call hprfs( ap, afp, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
packed complex Hermitian matrix A, with multiple right-hand sides. For each computed solution vector x, the
routine computes the component-wise backward error β. This error is the smallest relative perturbation in
elements of A and b such that x is the exact solution of the perturbed system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
Finally, the routine estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞ (here xe is the exact solution).
Before calling this routine:
call the factorization routine ?hptrf
call the solver routine ?hptrs.
Input Parameters
ldb INTEGER. The leading dimension of b; ldb ≥ max(1, n).
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?hptrf.
Output Parameters
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 16n^2 operations. In
addition, each step of iterative refinement involves 24n^2 operations; the number of iterations may range
from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
is usually 4 or 5 and never more than 11. Each solution requires approximately 8n^2 floating-point operations.
See Also
Matrix Storage Schemes for LAPACK Routines
?trrfs
Estimates the error in the solution of a system of
linear equations with a triangular coefficient matrix.
Syntax
call strrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call dtrrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call ctrrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call ztrrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call trrfs( a, b, x [,uplo] [,trans] [,diag] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the errors in the solution to a system of linear equations A*X = B or A^T*X = B or
A^H*X = B with a triangular matrix A, with multiple right-hand sides. For each computed solution vector x, the
routine computes the component-wise backward error β. This error is the smallest relative perturbation in
elements of A and b such that x is the exact solution of the perturbed system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
The routine also estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞ (here xe is the exact solution).
Before calling this routine, call the solver routine ?trtrs.
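The solve-then-estimate order described above can be sketched for an upper triangular, non-unit, non-transposed system as follows. The 3-by-3 matrix is a made-up example, and the routines are declared external rather than through mkl.fi or lapack.f90.

```fortran
program trrfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n,n), b(n,nrhs), x(n,nrhs)
  real(dp) :: ferr(nrhs), berr(nrhs), work(3*n)
  integer  :: iwork(n), info
  external :: dtrtrs, dtrrfs

  ! Upper triangular A, stored column by column
  a = reshape([2.0_dp, 0.0_dp, 0.0_dp, &
               1.0_dp, 3.0_dp, 0.0_dp, &
               0.0_dp, 1.0_dp, 4.0_dp], [n, n])
  b(:,1) = [3.0_dp, 4.0_dp, 4.0_dp]        ! equals A*(1,1,1)^T

  x = b                                    ! dtrtrs overwrites its right-hand side
  call dtrtrs('U', 'N', 'N', n, nrhs, a, n, x, n, info)
  if (info /= 0) error stop 'dtrtrs failed'

  ! No refinement is performed here; only the error estimates are computed
  call dtrrfs('U', 'N', 'N', n, nrhs, a, n, b, n, x, n, &
              ferr, berr, work, iwork, info)
  if (info /= 0) error stop 'dtrrfs failed'

  print '(a,3f12.8)', 'x    =', x(:,1)
  print '(a,es10.3)', 'ferr =', ferr(1)
  print '(a,es10.3)', 'berr =', berr(1)
end program trrfs_sketch
```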
Input Parameters
If trans = 'N', the system has the form A*X = B.
Output Parameters
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
A call to this routine involves, for each right-hand side, solving a number of systems of linear equations A*x
= b; the number of systems is usually 4 or 5 and never more than 11. Each solution requires approximately
n^2 floating-point operations for real flavors or 4n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?tprfs
Estimates the error in the solution of a system of
linear equations with a packed triangular coefficient
matrix.
Syntax
call stprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call dtprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call ctprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call ztprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call tprfs( ap, b, x [,uplo] [,trans] [,diag] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the errors in the solution to a system of linear equations A*X = B or A^T*X = B or
A^H*X = B with a packed triangular matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error β. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
The routine also estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞ (here xe is the exact solution).
Before calling this routine, call the solver routine ?tptrs.
Input Parameters
Output Parameters
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
A call to this routine involves, for each right-hand side, solving a number of systems of linear equations A*x
= b; the number of systems is usually 4 or 5 and never more than 11. Each solution requires approximately
n^2 floating-point operations for real flavors or 4n^2 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?tbrfs
Estimates the error in the solution of a system of
linear equations with a triangular band coefficient
matrix.
Syntax
call stbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call dtbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call ctbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call ztbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call tbrfs( ab, b, x [,uplo] [,trans] [,diag] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the errors in the solution to a system of linear equations A*X = B or A^T*X = B or
A^H*X = B with a triangular band matrix A, with multiple right-hand sides. For each computed solution vector
x, the routine computes the component-wise backward error β. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system:
|δa_ij| ≤ β|a_ij|, |δb_i| ≤ β|b_i| such that (A + δA)x = (b + δb).
The routine also estimates the component-wise forward error in the computed solution
||x - xe||_∞/||x||_∞ (here xe is the exact solution).
Before calling this routine, call the solver routine ?tbtrs.
Input Parameters
ldab INTEGER. The leading dimension of the array ab; ldab ≥ kd + 1.
Output Parameters
info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
A call to this routine involves, for each right-hand side, solving a number of systems of linear equations A*x
= b; the number of systems is usually 4 or 5 and never more than 11. Each solution requires approximately
2n*kd floating-point operations for real flavors or 8n*kd operations for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?getri
Computes the inverse of an LU-factored general
matrix.
Syntax
call sgetri( n, a, lda, ipiv, work, lwork, info )
call dgetri( n, a, lda, ipiv, work, lwork, info )
call cgetri( n, a, lda, ipiv, work, lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a general matrix A. Before calling this routine, call ?getrf to
factorize A.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n).
Output Parameters
If info = i, the i-th diagonal element of the factor U is zero, U is
singular, and the inversion could not be completed.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, either use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork (one that is no less than the minimal value
described), the routine completes the task, though possibly not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace size in the
first element of the array work. This operation is called a workspace query.
Note that if you set lwork to a value that is less than the minimal required value and is not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended workspace.
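The workspace query described above might look like the following in practice. This is a hedged sketch, assuming the double precision FORTRAN 77 interface; the 3-by-3 matrix and the names query, a0, and resid are illustrative and not part of the routine's interface.

```fortran
program getri_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n, n), a0(n, n), resid(n, n), query(1)
  real(dp), allocatable :: work(:)
  integer :: ipiv(n), info, lwork, i
  external :: dgetrf, dgetri

  ! Illustrative nonsingular test matrix (made-up data).
  a0 = reshape([4.0_dp, 2.0_dp, 1.0_dp, &
                2.0_dp, 5.0_dp, 3.0_dp, &
                1.0_dp, 3.0_dp, 6.0_dp], [n, n])
  a = a0

  ! Step 1: LU-factorize A, as ?getri requires.
  call dgetrf(n, n, a, n, ipiv, info)
  if (info /= 0) error stop 'dgetrf failed'

  ! Step 2: workspace query. With lwork = -1 the routine only reports
  ! the recommended workspace size in query(1) and returns.
  call dgetri(n, a, n, ipiv, query, -1, info)
  if (info /= 0) error stop 'workspace query failed'
  lwork = max(1, int(query(1)))
  allocate(work(lwork))

  ! Step 3: compute the inverse in place using the recommended workspace.
  call dgetri(n, a, n, ipiv, work, lwork, info)
  if (info > 0) error stop 'U is singular; inverse not computed'
  if (info < 0) error stop 'illegal argument passed to dgetri'

  ! Check: inv(A)*A should be close to the identity matrix.
  resid = matmul(a, a0)
  do i = 1, n
    resid(i, i) = resid(i, i) - 1.0_dp
  end do
  print '(a, i0)',     'recommended lwork  = ', lwork
  print '(a, es10.3)', 'max |inv(A)*A - I| = ', maxval(abs(resid))
end program getri_workspace_query
```

Querying first avoids guessing the block size, at the cost of one extra (cheap) call.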
The computed inverse X satisfies the following error bound:
$|XA - I| \le c(n)\,\varepsilon\,|X|\,P\,|L|\,|U|$,
where c(n) is a modest linear function of n; ε is the machine precision; I denotes the identity matrix; P, L,
and U are the factors of the matrix factorization A = P*L*U.
The total number of floating-point operations is approximately (4/3)n^3 for real flavors and (16/3)n^3 for
complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?potri
Computes the inverse of a symmetric (Hermitian)
positive-definite matrix using the Cholesky
factorization.
Syntax
call spotri( uplo, n, a, lda, info )
call dpotri( uplo, n, a, lda, info )
call cpotri( uplo, n, a, lda, info )
call zpotri( uplo, n, a, lda, info )
call potri( a [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric positive definite or, for complex flavors, Hermitian
positive-definite matrix A. Before calling this routine, call ?potrf to factorize A.
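A minimal sketch of the factorize-then-invert sequence follows, assuming the double precision FORTRAN 77 interface and a made-up 3-by-3 symmetric positive-definite matrix. It also assumes the usual LAPACK behavior that only the triangle selected by uplo is overwritten with the inverse, so the other triangle is filled in by symmetry before the result is checked.

```fortran
program potri_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n, n), a0(n, n), resid(n, n)
  integer :: info, i, j
  external :: dpotrf, dpotri

  ! Illustrative symmetric positive-definite matrix (made-up data).
  a0 = reshape([4.0_dp, 1.0_dp, 1.0_dp, &
                1.0_dp, 3.0_dp, 0.0_dp, &
                1.0_dp, 0.0_dp, 2.0_dp], [n, n])
  a = a0

  ! Step 1: Cholesky factorization of the upper triangle.
  call dpotrf('U', n, a, n, info)
  if (info /= 0) error stop 'dpotrf failed: matrix is not positive definite'

  ! Step 2: inverse from the Cholesky factor, returned in the upper triangle.
  call dpotri('U', n, a, n, info)
  if (info /= 0) error stop 'dpotri failed'

  ! Mirror the upper triangle to obtain the full symmetric inverse.
  do j = 1, n
    do i = j + 1, n
      a(i, j) = a(j, i)
    end do
  end do

  ! Check: inv(A)*A should be close to the identity matrix.
  resid = matmul(a, a0)
  do i = 1, n
    resid(i, i) = resid(i, i) - 1.0_dp
  end do
  print '(a, es10.3)', 'max |inv(A)*A - I| = ', maxval(abs(resid))
end program potri_example
```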
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed inverse X satisfies the following error bounds:
$\|XA - I\|_2 \le c(n)\,\varepsilon\,\kappa_2(A)$, $\|AX - I\|_2 \le c(n)\,\varepsilon\,\kappa_2(A)$,
where c(n) is a modest linear function of n, and ε is the machine precision; I denotes the identity matrix.
The 2-norm $\|A\|_2$ of a matrix A is defined by $\|A\|_2 = \max_{x \cdot x = 1} (Ax \cdot Ax)^{1/2}$, and the condition number
$\kappa_2(A)$ is defined by $\kappa_2(A) = \|A\|_2 \, \|A^{-1}\|_2$.
The total number of floating-point operations is approximately (2/3)n^3 for real flavors and (8/3)n^3 for
complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?pftri
Computes the inverse of a symmetric (Hermitian)
positive-definite matrix in RFP format using the
Cholesky factorization.
Syntax
call spftri( transr, uplo, n, a, info )
call dpftri( transr, uplo, n, a, info )
call cpftri( transr, uplo, n, a, info )
call zpftri( transr, uplo, n, a, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric positive definite or, for complex data, Hermitian
positive-definite matrix A using the Cholesky factorization A = U^T*U or A = L*L^T (A = U^H*U or A = L*L^H
for complex data).
The matrix A is in the Rectangular Full Packed (RFP) format. For the description of the RFP format, see Matrix
Storage Schemes.
Input Parameters
transr CHARACTER*1. Must be 'N', 'T' (for real data) or 'C' (for complex
data).
If transr = 'N', the normal (untransposed) RFP form of U (if uplo = 'U') or L (if
uplo = 'L') is stored.
If transr = 'T', the transposed RFP form of U (if uplo = 'U') or L
(if uplo = 'L') is stored.
Output Parameters
See Also
Matrix Storage Schemes for LAPACK Routines
?pptri
Computes the inverse of a packed symmetric
(Hermitian) positive-definite matrix using Cholesky
factorization.
Syntax
call spptri( uplo, n, ap, info )
call dpptri( uplo, n, ap, info )
call cpptri( uplo, n, ap, info )
call zpptri( uplo, n, ap, info )
call pptri( ap [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric positive definite or, for complex flavors, Hermitian
positive-definite matrix A in packed form. Before calling this routine, call ?pptrf to factorize A.
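The sketch below shows one way to pack the upper triangle of a symmetric positive-definite matrix, factorize it with ?pptrf, and invert it with ?pptri. It assumes the double precision FORTRAN 77 interface and the standard LAPACK packed layout for uplo = 'U' (columns of the upper triangle stored one after another); the matrix and all variable names are illustrative.

```fortran
program pptri_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a0(n, n), x(n, n), resid(n, n), ap(n*(n+1)/2)
  integer :: info, i, j, k
  external :: dpptrf, dpptri

  ! Illustrative symmetric positive-definite matrix (made-up data).
  a0 = reshape([4.0_dp, 1.0_dp, 1.0_dp, &
                1.0_dp, 3.0_dp, 0.0_dp, &
                1.0_dp, 0.0_dp, 2.0_dp], [n, n])

  ! Pack the upper triangle column by column: a(i,j) -> ap(i + (j-1)*j/2).
  k = 0
  do j = 1, n
    do i = 1, j
      k = k + 1
      ap(k) = a0(i, j)
    end do
  end do

  ! Factorize, then invert, both in packed storage.
  call dpptrf('U', n, ap, info)
  if (info /= 0) error stop 'dpptrf failed: matrix is not positive definite'
  call dpptri('U', n, ap, info)
  if (info /= 0) error stop 'dpptri failed'

  ! Unpack the inverse into a full symmetric matrix for checking.
  k = 0
  do j = 1, n
    do i = 1, j
      k = k + 1
      x(i, j) = ap(k)
      x(j, i) = ap(k)
    end do
  end do

  resid = matmul(x, a0)
  do i = 1, n
    resid(i, i) = resid(i, i) - 1.0_dp
  end do
  print '(a, es10.3)', 'max |inv(A)*A - I| = ', maxval(abs(resid))
end program pptri_example
```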
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed inverse X satisfies the following error bounds:
$\|XA - I\|_2 \le c(n)\,\varepsilon\,\kappa_2(A)$, $\|AX - I\|_2 \le c(n)\,\varepsilon\,\kappa_2(A)$,
where c(n) is a modest linear function of n, and ε is the machine precision; I denotes the identity matrix.
The 2-norm $\|A\|_2$ of a matrix A is defined by $\|A\|_2 = \max_{x \cdot x = 1} (Ax \cdot Ax)^{1/2}$, and the condition number
$\kappa_2(A)$ is defined by $\kappa_2(A) = \|A\|_2 \, \|A^{-1}\|_2$.
The total number of floating-point operations is approximately (2/3)n^3 for real flavors and (8/3)n^3 for
complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?sytri
Computes the inverse of a symmetric matrix using
U*D*U^T or L*D*L^T Bunch-Kaufman factorization.
Syntax
call ssytri( uplo, n, a, lda, ipiv, work, info )
call dsytri( uplo, n, a, lda, ipiv, work, info )
call csytri( uplo, n, a, lda, ipiv, work, info )
call zsytri( uplo, n, a, lda, ipiv, work, info )
call sytri( a, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric matrix A. Before calling this routine, call ?sytrf to
factorize A.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, the i-th diagonal element of D is zero, D is singular, and the
inversion could not be completed.
Application Notes
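The computed inverse X satisfies the following error bounds:
$|D U^T P^T X P U - I| \le c(n)\,\varepsilon\,(|D||U^T|P^T|X|P|U| + |D||D^{-1}|)$
for uplo = 'U', and
$|D L^T P^T X P L - I| \le c(n)\,\varepsilon\,(|D||L^T|P^T|X|P|L| + |D||D^{-1}|)$
for uplo = 'L'. Here c(n) is a modest linear function of n, and ε is the machine precision; I denotes the
identity matrix.
The total number of floating-point operations is approximately (2/3)n^3 for real flavors and (8/3)n^3 for
complex flavors.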
See Also
Matrix Storage Schemes for LAPACK Routines
?sytri_rook
Computes the inverse of a symmetric matrix using
U*D*U^T or L*D*L^T bounded Bunch-Kaufman
factorization.
Syntax
call ssytri_rook( uplo, n, a, lda, ipiv, work, info )
call dsytri_rook( uplo, n, a, lda, ipiv, work, info )
call csytri_rook( uplo, n, a, lda, ipiv, work, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric matrix A. Before calling this routine, call
?sytrf_rook to factorize A.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, the i-th diagonal element of D is zero, D is singular, and the
inversion could not be completed.
Application Notes
The total number of floating-point operations is approximately (2/3)n^3 for real flavors and (8/3)n^3 for
complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?hetri
Computes the inverse of a complex Hermitian matrix
using U*D*U^H or L*D*L^H Bunch-Kaufman
factorization.
Syntax
call chetri( uplo, n, a, lda, ipiv, work, info )
call zhetri( uplo, n, a, lda, ipiv, work, info )
call hetri( a, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a complex Hermitian matrix A. Before calling this routine, call
?hetrf to factorize A.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?hetrf.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
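The computed inverse X satisfies the following error bounds:
$|D U^H P^T X P U - I| \le c(n)\,\varepsilon\,(|D||U^H|P^T|X|P|U| + |D||D^{-1}|)$
for uplo = 'U', and
$|D L^H P^T X P L - I| \le c(n)\,\varepsilon\,(|D||L^H|P^T|X|P|L| + |D||D^{-1}|)$
for uplo = 'L'. Here c(n) is a modest linear function of n, and ε is the machine precision; I denotes the
identity matrix.
The total number of floating-point operations is approximately (8/3)n^3 for complex flavors.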
See Also
Matrix Storage Schemes for LAPACK Routines
?hetri_rook
Computes the inverse of a complex Hermitian matrix
using U*D*U^H or L*D*L^H bounded Bunch-Kaufman
factorization.
Syntax
call chetri_rook( uplo, n, a, lda, ipiv, work, info )
call zhetri_rook( uplo, n, a, lda, ipiv, work, info )
call hetri_rook( a, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a complex Hermitian matrix A. Before calling this routine, call
?hetrf_rook to factorize A.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by
?hetrf_rook.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The total number of floating-point operations is approximately (8/3)n^3 for complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?sytri2
Computes the inverse of a symmetric indefinite matrix
through setting the leading dimension of the
workspace and calling ?sytri2x.
Syntax
call ssytri2( uplo, n, a, lda, ipiv, work, lwork, info )
call dsytri2( uplo, n, a, lda, ipiv, work, lwork, info )
call csytri2( uplo, n, a, lda, ipiv, work, lwork, info )
call zsytri2( uplo, n, a, lda, ipiv, work, lwork, info )
call sytri2( a,ipiv[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric indefinite matrix A using the factorization A =
U*D*U^T or A = L*D*L^T computed by ?sytrf.
The ?sytri2 routine sets the leading dimension of the workspace and then calls ?sytri2x, which actually
computes the inverse.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n).
Output Parameters
If uplo = 'U', the upper triangular part of the inverse is formed and
the part of A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and
the part of A above the diagonal is not referenced.
info INTEGER.
If info = 0, the execution is successful.
uplo Indicates how the matrix A has been factored. Must be 'U' or 'L'.
See Also
?sytrf
?sytri2x
Matrix Storage Schemes for LAPACK Routines
?hetri2
Computes the inverse of a Hermitian indefinite matrix
through setting the leading dimension of the
workspace and calling ?hetri2x.
Syntax
call chetri2( uplo, n, a, lda, ipiv, work, lwork, info )
call zhetri2( uplo, n, a, lda, ipiv, work, lwork, info )
call hetri2( a,ipiv[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a Hermitian indefinite matrix A using the factorization A =
U*D*U^H or A = L*D*L^H computed by ?hetrf.
The ?hetri2 routine sets the leading dimension of the workspace and then calls ?hetri2x, which actually
computes the inverse.
Input Parameters
DOUBLE COMPLEX for zhetri2
Array a (size lda by *) contains the block diagonal matrix D and the
multipliers used to obtain the factor U or L as returned by ?hetrf.
ipiv INTEGER.
Array, size at least max(1, n).
Output Parameters
If uplo = 'U', the upper triangular part of the inverse is formed and
the part of A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and
the part of A above the diagonal is not referenced.
info INTEGER.
If info = 0, the execution is successful.
uplo Indicates how the input matrix A has been factored. Must be 'U' or
'L'.
See Also
?hetrf
?hetri2x
Matrix Storage Schemes for LAPACK Routines
?sytri2x
Computes the inverse of a symmetric indefinite matrix
after ?sytri2 sets the leading dimension of the
workspace.
Syntax
call ssytri2x( uplo, n, a, lda, ipiv, work, nb, info )
call dsytri2x( uplo, n, a, lda, ipiv, work, nb, info )
call csytri2x( uplo, n, a, lda, ipiv, work, nb, info )
call zsytri2x( uplo, n, a, lda, ipiv, work, nb, info )
call sytri2x( a,ipiv,nb[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric indefinite matrix A using the factorization A =
U*D*U^T or A = L*D*L^T computed by ?sytrf.
The ?sytri2x routine actually computes the inverse; the ?sytri2 routine sets the leading dimension of the
workspace before calling ?sytri2x.
Input Parameters
work is a workspace array of dimension (n+nb+1)*(nb+3)
where
nb is the block size as set by ?sytrf.
ipiv INTEGER.
Array, size at least max(1, n).
Output Parameters
If uplo = 'U', the upper triangular part of the inverse is formed and
the part of A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and
the part of A above the diagonal is not referenced.
info INTEGER.
If info = 0, the execution is successful.
uplo Indicates how the input matrix A has been factored. Must be 'U' or
'L'.
See Also
?sytrf
?sytri2
Matrix Storage Schemes for LAPACK Routines
?hetri2x
Computes the inverse of a Hermitian indefinite matrix
after ?hetri2 sets the leading dimension of the
workspace.
Syntax
call chetri2x( uplo, n, a, lda, ipiv, work, nb, info )
call zhetri2x( uplo, n, a, lda, ipiv, work, nb, info )
call hetri2x( a,ipiv,nb[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a Hermitian indefinite matrix A using the factorization A =
U*D*U^H or A = L*D*L^H computed by ?hetrf.
The ?hetri2x routine actually computes the inverse; the ?hetri2 routine sets the leading dimension of the
workspace before calling ?hetri2x.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n).
Output Parameters
If uplo = 'U', the upper triangular part of the inverse is formed and
the part of A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and
the part of A above the diagonal is not referenced.
info INTEGER.
If info = 0, the execution is successful.
uplo Indicates how the input matrix A has been factored. Must be 'U' or
'L'.
See Also
?hetrf
?hetri2
Matrix Storage Schemes for LAPACK Routines
?sptri
Computes the inverse of a symmetric matrix using
U*D*U^T or L*D*L^T Bunch-Kaufman factorization of
matrix in packed storage.
Syntax
call ssptri( uplo, n, ap, ipiv, work, info )
call dsptri( uplo, n, ap, ipiv, work, info )
call csptri( uplo, n, ap, ipiv, work, info )
call zsptri( uplo, n, ap, ipiv, work, info )
call sptri( ap, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a packed symmetric matrix A. Before calling this routine, call
?sptrf to factorize A.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?sptrf.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
ap Holds the array A of size (n*(n+1)/2).
Application Notes
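The computed inverse X satisfies the following error bounds:
$|D U^T P^T X P U - I| \le c(n)\,\varepsilon\,(|D||U^T|P^T|X|P|U| + |D||D^{-1}|)$
for uplo = 'U', and
$|D L^T P^T X P L - I| \le c(n)\,\varepsilon\,(|D||L^T|P^T|X|P|L| + |D||D^{-1}|)$
for uplo = 'L'. Here c(n) is a modest linear function of n, and ε is the machine precision; I denotes the
identity matrix.
The total number of floating-point operations is approximately (2/3)n^3 for real flavors and (8/3)n^3 for
complex flavors.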
See Also
Matrix Storage Schemes for LAPACK Routines
?hptri
Computes the inverse of a complex Hermitian matrix
using U*D*U^H or L*D*L^H Bunch-Kaufman factorization
of matrix in packed storage.
Syntax
call chptri( uplo, n, ap, ipiv, work, info )
call zhptri( uplo, n, ap, ipiv, work, info )
call hptri( ap, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a complex Hermitian matrix A using packed storage. Before
calling this routine, call ?hptrf to factorize A.
Input Parameters
ipiv INTEGER.
Array, size at least max(1, n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
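The computed inverse X satisfies the following error bounds:
$|D U^H P^T X P U - I| \le c(n)\,\varepsilon\,(|D||U^H|P^T|X|P|U| + |D||D^{-1}|)$
for uplo = 'U', and
$|D L^H P^T X P L - I| \le c(n)\,\varepsilon\,(|D||L^H|P^T|X|P|L| + |D||D^{-1}|)$
for uplo = 'L'. Here c(n) is a modest linear function of n, and ε is the machine precision; I denotes the
identity matrix.
The total number of floating-point operations is approximately (8/3)n^3.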
See Also
Matrix Storage Schemes for LAPACK Routines
?trtri
Computes the inverse of a triangular matrix.
Syntax
call strtri( uplo, diag, n, a, lda, info )
call dtrtri( uplo, diag, n, a, lda, info )
call ctrtri( uplo, diag, n, a, lda, info )
call ztrtri( uplo, diag, n, a, lda, info )
call trtri( a [,uplo] [,diag] [,info] )
Include Files
mkl.fi, lapack.f90
Description
Input Parameters
Output Parameters
info INTEGER.
Application Notes
The computed inverse X satisfies the following error bounds:
$|XA - I| \le c(n)\,\varepsilon\,|X||A|$, $|X - A^{-1}| \le c(n)\,\varepsilon\,|A^{-1}||A||X|$,
where c(n) is a modest linear function of n; ε is the machine precision; I denotes the identity matrix.
The total number of floating-point operations is approximately (1/3)n^3 for real flavors and (4/3)n^3 for
complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?tftri
Computes the inverse of a triangular matrix stored in
the Rectangular Full Packed (RFP) format.
Syntax
call stftri( transr, uplo, diag, n, a, info )
call dtftri( transr, uplo, diag, n, a, info )
call ctftri( transr, uplo, diag, n, a, info )
call ztftri( transr, uplo, diag, n, a, info )
Include Files
mkl.fi, lapack.f90
Description
Computes the inverse of a triangular matrix A stored in the Rectangular Full Packed (RFP) format. For the
description of the RFP format, see Matrix Storage Schemes.
This is the block version of the algorithm, calling Level 3 BLAS.
Input Parameters
transr CHARACTER*1. Must be 'N', 'T' (for real data) or 'C' (for complex
data).
If transr = 'N', the normal (untransposed) RFP form of A is stored.
Output Parameters
See Also
Matrix Storage Schemes for LAPACK Routines
?tptri
Computes the inverse of a triangular matrix using
packed storage.
Syntax
call stptri( uplo, diag, n, ap, info )
Include Files
mkl.fi, lapack.f90
Description
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Specific details for the routine tptri interface are as follows:
Application Notes
The computed inverse X satisfies the following error bounds:
$|XA - I| \le c(n)\,\varepsilon\,|X||A|$, $|X - A^{-1}| \le c(n)\,\varepsilon\,|A^{-1}||A||X|$,
where c(n) is a modest linear function of n; ε is the machine precision; I denotes the identity matrix.
The total number of floating-point operations is approximately (1/3)n^3 for real flavors and (4/3)n^3 for
complex flavors.
See Also
Matrix Storage Schemes for LAPACK Routines
?geequ
Computes row and column scaling factors intended to
equilibrate a general matrix and reduce its condition
number.
Syntax
call sgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call dgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call cgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call zgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call geequ( a, r, c [,rowcnd] [,colcnd] [,amax] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate an m-by-n matrix A and reduce its
condition number. The output array r returns the row scale factors and the array c the column scale factors.
These factors are chosen to try to make the largest element in each row and column of the matrix B with
elements $b_{ij} = r(i)\,a_{ij}\,c(j)$ have absolute value 1.
See ?laqge auxiliary function that uses scaling factors computed by ?geequ.
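As a hedged illustration of the scaling described above, the sketch below calls dgeequ on a badly scaled 3-by-2 matrix and then forms B = diag(r)*A*diag(c) by hand. The matrix values and the variable names are made up for the example; in production code the scaling would normally be applied by ?laqge, which is not called here.

```fortran
program geequ_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 3, n = 2
  real(dp) :: a(m, n), b(m, n), r(m), c(n), rowcnd, colcnd, amax
  integer  :: info, i, j
  external :: dgeequ

  ! Badly scaled illustrative matrix (made-up data), stored column by column.
  a = reshape([1.0e6_dp, 2.0_dp, 3.0e-3_dp, &
               4.0e3_dp, 5.0_dp, 6.0e-6_dp], [m, n])

  ! Compute row scale factors r and column scale factors c.
  call dgeequ(m, n, a, m, r, c, rowcnd, colcnd, amax, info)
  if (info /= 0) error stop 'dgeequ: zero row or column, or illegal argument'

  ! Form the equilibrated matrix B with b(i,j) = r(i)*a(i,j)*c(j).
  do j = 1, n
    do i = 1, m
      b(i, j) = r(i) * a(i, j) * c(j)
    end do
  end do

  print '(a, es10.3)', 'rowcnd       = ', rowcnd
  print '(a, es10.3)', 'colcnd       = ', colcnd
  print '(a, es10.3)', 'amax         = ', amax
  print '(a, es10.3)', 'max |b(i,j)| = ', maxval(abs(b))

  ! No scaled element should exceed 1 by more than rounding error.
  if (maxval(abs(b)) > 1.0_dp + 1.0e-12_dp) error stop 'scaled entry exceeds 1'
end program geequ_example
```

For this input rowcnd is far below 0.1, which is the situation where scaling by r is considered worthwhile.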
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
i ≤ m, the i-th row of A is exactly zero;
i > m, the (i-m)-th column of A is exactly zero.
Application Notes
All the components of r and c are restricted to be between SMLNUM = smallest safe number and BIGNUM=
largest safe number. Use of these scaling factors is not guaranteed to reduce the condition number of A but
works well in practice.
SMLNUM and BIGNUM are parameters representing machine precision. You can use the ?lamch routines to
compute them. For example, in single precision, SMLNUM = slamch('S') and BIGNUM = 1/SMLNUM.
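A small sketch of the single precision computation follows, assuming that slamch('S') returns the safe minimum as in reference LAPACK; the program and variable names are illustrative.

```fortran
program safe_range
  implicit none
  real :: smlnum, bignum
  real, external :: slamch

  ! Safe minimum: the smallest number whose reciprocal does not overflow.
  smlnum = slamch('S')
  ! The largest safe number is taken as its reciprocal.
  bignum = 1.0 / smlnum

  print '(a, es12.4)', 'SMLNUM = ', smlnum
  print '(a, es12.4)', 'BIGNUM = ', bignum
end program safe_range
```

For double precision, the same pattern applies with dlamch and double precision variables.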
If rowcnd ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by r.
If amax is very close to SMLNUM or very close to BIGNUM, the matrix A should be scaled.
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
?geequb
Computes row and column scaling factors restricted to
a power of radix to equilibrate a general matrix and
reduce its condition number.
Syntax
call sgeequb( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call dgeequb( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call cgeequb( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call zgeequb( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate an m-by-n general matrix A and
reduce its condition number. The output array r returns the row scale factors and the array c the column
scale factors. These factors are chosen to try to make the largest element in each row and column of the
matrix B with elements $b_{ij} = r(i)\,a_{ij}\,c(j)$ have an absolute value of at most the radix.
r(i) and c(j) are restricted to be a power of the radix between SMLNUM = smallest safe number and
BIGNUM = largest safe number. Use of these scaling factors is not guaranteed to reduce the condition number
of A but works well in practice.
SMLNUM and BIGNUM are parameters representing machine precision. You can use the ?lamch routines to
compute them. For example, in single precision, SMLNUM = slamch('S') and BIGNUM = 1/SMLNUM.
This routine differs from ?geequ by restricting the scaling factors to a power of the radix. Except for over-
and underflow, scaling by these factors introduces no additional rounding errors. However, the scaled entries'
magnitudes are no longer equal to approximately 1 but lie between sqrt(radix) and 1/sqrt(radix).
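The claim that power-of-radix scaling adds no rounding error can be checked with a few lines of standard Fortran. The sketch below is only an illustration of the idea and does not call ?geequb: it rounds an arbitrary scale factor down to a power of the radix using the exponent intrinsic (the rounding direction the library uses is an assumption here) and verifies that scaling and unscaling reproduces the original value exactly.

```fortran
program radix_power_scaling
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x, s, s_pow, base

  x = 0.1_dp                 ! arbitrary matrix entry (made-up value)
  s = 1.0_dp / 3.0_dp        ! arbitrary "ideal" scale factor (made-up value)
  base = real(radix(s), dp)  ! floating-point radix, 2 on IEEE hardware

  ! exponent(s) returns e such that s = f * base**e with 1/base <= f < 1,
  ! hence base**(e-1) <= s < base**e.
  s_pow = base ** (exponent(s) - 1)

  print '(a, es23.16)', 's      = ', s
  print '(a, es23.16)', 's_pow  = ', s_pow

  ! The power of the radix must not exceed the original factor.
  if (s_pow > s) error stop 's_pow exceeds s'

  ! Multiplying and dividing by a power of the radix only changes the
  ! exponent, so (barring overflow/underflow) the round trip is exact.
  if ((x * s_pow) / s_pow /= x) error stop 'round trip was not exact'
  print '(a)', 'scaling by a power of the radix was exact'
end program radix_power_scaling
```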
Input Parameters
Output Parameters
If info = 0, or info>m, the array r contains the row scale factors for
the matrix A.
If info = 0, the array c contains the column scale factors for the
matrix A.
If info = 0 or info > m, rowcnd contains the ratio of the smallest
r(i) to the largest r(i). If rowcnd ≥ 0.1, and amax is neither too
large nor too small, it is not worth scaling by r.
info INTEGER.
If info = 0, the execution is successful.
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
?gbequ
Computes row and column scaling factors intended to
equilibrate a banded matrix and reduce its condition
number.
Syntax
call sgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call dgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call cgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call zgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call gbequ( ab, r, c [,kl] [,rowcnd] [,colcnd] [,amax] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate an m-by-n band matrix A and reduce
its condition number. The output array r returns the row scale factors and the array c the column scale
factors. These factors are chosen to try to make the largest element in each row and column of the matrix B
with elements $b_{ij} = r(i)\,a_{ij}\,c(j)$ have absolute value 1.
See ?laqgb auxiliary function that uses scaling factors computed by ?gbequ.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info = i and
ku Restored as ku = lda-kl-1.
Application Notes
All the components of r and c are restricted to be between SMLNUM = smallest safe number and BIGNUM=
largest safe number. Use of these scaling factors is not guaranteed to reduce the condition number of A but
works well in practice.
SMLNUM and BIGNUM are parameters representing machine precision. You can use the ?lamch routines to
compute them. For example, in single precision, SMLNUM = slamch('S') and BIGNUM = 1/SMLNUM.
If rowcnd ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by r.
If amax is very close to SMLNUM or very close to BIGNUM, the matrix A should be scaled.
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
?gbequb
Computes row and column scaling factors restricted to
a power of radix to equilibrate a banded matrix and
reduce its condition number.
Syntax
call sgbequb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call dgbequb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call cgbequb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call zgbequb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate an m-by-n banded matrix A and
reduce its condition number. The output array r returns the row scale factors and the array c the column
scale factors. These factors are chosen to try to make the largest element in each row and column of the
matrix B with elements $b_{ij} = r(i)\,a_{ij}\,c(j)$ have an absolute value of at most the radix.
r(i) and c(j) are restricted to be a power of the radix between SMLNUM = smallest safe number and
BIGNUM = largest safe number. Use of these scaling factors is not guaranteed to reduce the condition
number of A but works well in practice.
SMLNUM and BIGNUM are parameters representing machine precision. You can use the ?lamch routines to
compute them. For example, in single precision, SMLNUM = slamch('S') and BIGNUM = 1/SMLNUM.
This routine differs from ?gbequ by restricting the scaling factors to a power of the radix. Except for over-
and underflow, scaling by these factors introduces no additional rounding errors. However, the scaled entries'
magnitudes are no longer equal to approximately 1 but lie between sqrt(radix) and 1/sqrt(radix).
Input Parameters
Output Parameters
If info = 0, or info>m, the array r contains the row scale factors for
the matrix A.
If info = 0, the array c contains the column scale factors for the
matrix A.
info INTEGER.
If info = 0, the execution is successful.
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
?poequ
Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive definite
matrix and reduce its condition number.
Syntax
call spoequ( n, a, lda, s, scond, amax, info )
call dpoequ( n, a, lda, s, scond, amax, info )
call cpoequ( n, a, lda, s, scond, amax, info )
call zpoequ( n, a, lda, s, scond, amax, info )
call poequ( a, s [,scond] [,amax] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric (Hermitian) positive-
definite matrix A and reduce its condition number (with respect to the two-norm). The output array s returns
scale factors such that s(i) contains $1/\sqrt{a_{ii}}$.
These factors are chosen so that the scaled matrix B with elements $b_{ij} = s(i)\,a_{ij}\,s(j)$ has diagonal
elements equal to 1.
This choice of s puts the condition number of B within a factor n of the smallest possible condition number
over all possible diagonal scalings.
See ?laqsy auxiliary function that uses scaling factors computed by ?poequ.
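The following sketch, assuming the double precision FORTRAN 77 interface and a made-up 2-by-2 positive-definite matrix, calls dpoequ and checks that the scaled matrix has a unit diagonal. The scaling is applied by hand here for illustration; ?laqsy is not called.

```fortran
program poequ_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  real(dp) :: a(n, n), b(n, n), s(n), scond, amax
  integer  :: info, i, j
  external :: dpoequ

  ! Illustrative symmetric positive-definite matrix (made-up data).
  a = reshape([4.0_dp, 1.0_dp, &
               1.0_dp, 9.0_dp], [n, n])

  ! Compute the scale factors s, their ratio scond, and amax.
  call dpoequ(n, a, n, s, scond, amax, info)
  if (info /= 0) error stop 'dpoequ: nonpositive diagonal or illegal argument'

  ! Form the scaled matrix B with b(i,j) = s(i)*a(i,j)*s(j).
  do j = 1, n
    do i = 1, n
      b(i, j) = s(i) * a(i, j) * s(j)
    end do
  end do

  print '(a, 2f10.6)', 's     = ', s
  print '(a, f10.6)',  'scond = ', scond
  print '(a, f10.6)',  'amax  = ', amax

  ! The diagonal of B should equal 1 up to rounding error.
  do i = 1, n
    if (abs(b(i, i) - 1.0_dp) > 1.0e-12_dp) error stop 'diagonal of B is not 1'
  end do
  print '(a)', 'diagonal of the scaled matrix is 1'
end program poequ_example
```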
Input Parameters
Output Parameters
DOUBLE PRECISION for double precision flavors.
Absolute value of the largest element of the matrix A.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
If scond ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by s.
If amax is very close to SMLNUM or very close to BIGNUM, the matrix A should be scaled.
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
?poequb
Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive definite
matrix and reduce its condition number.
Syntax
call spoequb( n, a, lda, s, scond, amax, info )
call dpoequb( n, a, lda, s, scond, amax, info )
call cpoequb( n, a, lda, s, scond, amax, info )
call zpoequb( n, a, lda, s, scond, amax, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric (Hermitian) positive-
definite matrix A and reduce its condition number (with respect to the two-norm).
These factors are chosen so that the scaled matrix B with elements $b_{ij} = s(i)\,a_{ij}\,s(j)$ has diagonal
elements equal to 1. s(i) is a power of two nearest to, but not exceeding $1/\sqrt{a_{ii}}$.
This choice of s puts the condition number of B within a factor n of the smallest possible condition number
over all possible diagonal scalings.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
?ppequ
Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive definite
matrix in packed storage and reduce its condition
number.
Syntax
call sppequ( uplo, n, ap, s, scond, amax, info )
call dppequ( uplo, n, ap, s, scond, amax, info )
call cppequ( uplo, n, ap, s, scond, amax, info )
call zppequ( uplo, n, ap, s, scond, amax, info )
call ppequ( ap, s [,scond] [,amax] [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric (Hermitian) positive
definite matrix A in packed storage and reduce its condition number (with respect to the two-norm). The
output array s returns scale factors such that s(i) contains $1/\sqrt{a_{ii}}$.
These factors are chosen so that the scaled matrix B with elements $b_{ij} = s(i)\,a_{ij}\,s(j)$ has diagonal
elements equal to 1.
This choice of s puts the condition number of B within a factor n of the smallest possible condition number
over all possible diagonal scalings.
See ?laqsp auxiliary function that uses scaling factors computed by ?ppequ.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
If scond ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by s.
If amax is very close to SMLNUM or very close to BIGNUM, the matrix A should be scaled.
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
?pbequ
Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive-definite
band matrix and reduce its condition number.
Syntax
call spbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call dpbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call cpbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call zpbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call pbequ( ab, s [,scond] [,amax] [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric (Hermitian) positive
definite band matrix A and reduce its condition number (with respect to the two-norm). The output array s
returns scale factors such that s(i) contains 1/sqrt(aii).
These factors are chosen so that the scaled matrix B with elements bij=s(i)*aij*s(j) has diagonal
elements equal to 1. This choice of s puts the condition number of B within a factor n of the smallest possible
condition number over all possible diagonal scalings.
See the ?laqsb auxiliary function, which uses the scaling factors computed by ?pbequ.
Input Parameters
ldab INTEGER. The leading dimension of the array ab; ldab ≥ kd + 1.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
If scond ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by s.
If amax is very close to SMLNUM or very close to BIGNUM, the matrix A should be scaled.
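A band-storage call might look like the sketch below. The program name and the sample tridiagonal matrix (kd = 1) are assumptions made for illustration, and the upper band layout ab(kd+1+i-j, j) = aij is the usual LAPACK convention described in Matrix Storage Schemes. The program is expected to be linked against Intel MKL or another LAPACK implementation.

```fortran
program pbequ_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, kd = 1, ldab = kd + 1
  real(dp) :: ab(ldab, n), s(n), scond, amax
  integer :: info
  external :: dpbequ

  ! Upper band storage: element a(i,j) is held in ab(kd+1+i-j, j).
  ! Row 1 holds the superdiagonal (ab(1,1) is not referenced),
  ! row 2 holds the main diagonal. Sample data only.
  ab(1, :) = [0.0_dp, 1.0_dp, 2.0_dp]
  ab(2, :) = [4.0_dp, 9.0_dp, 16.0_dp]

  call dpbequ('U', n, kd, ab, ldab, s, scond, amax, info)

  if (info /= 0) then
    print *, 'dpbequ returned info =', info
    stop 1
  end if

  print *, 's     =', s
  print *, 'scond =', scond
  print *, 'amax  =', amax
end program pbequ_sketch
```

The scale factors depend only on the diagonal, so the result should match what ?ppequ returns for a matrix with the same diagonal.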
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
?syequb
Computes row and column scaling factors intended to
equilibrate a symmetric indefinite matrix and reduce
its condition number.
Syntax
call ssyequb( uplo, n, a, lda, s, scond, amax, work, info )
call dsyequb( uplo, n, a, lda, s, scond, amax, work, info )
call csyequb( uplo, n, a, lda, s, scond, amax, work, info )
call zsyequb( uplo, n, a, lda, s, scond, amax, work, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric indefinite matrix A and
reduce its condition number (with respect to the two-norm).
The array s contains the scale factors, s(i) = 1/sqrt(A(i,i)). These factors are chosen so that the
scaled matrix B with elements b(i,j)=s(i)*a(i,j)*s(j) has ones on the diagonal.
This choice of s puts the condition number of B within a factor n of the smallest possible condition number
over all possible diagonal scalings.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
?heequb
Computes row and column scaling factors intended to
equilibrate a Hermitian indefinite matrix and reduce its
condition number.
Syntax
call cheequb( uplo, n, a, lda, s, scond, amax, work, info )
call zheequb( uplo, n, a, lda, s, scond, amax, work, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a Hermitian indefinite matrix A and
reduce its condition number (with respect to the two-norm).
The array s contains the scale factors, s(i) = 1/sqrt(A(i,i)). These factors are chosen so that the
scaled matrix B with elements b(i,j)=s(i)*a(i,j)*s(j) has ones on the diagonal.
This choice of s puts the condition number of B within a factor n of the smallest possible condition number
over all possible diagonal scalings.
Input Parameters
Output Parameters
info INTEGER.
See Also
Error Analysis
Matrix Storage Schemes for LAPACK Routines
In this table ? stands for s (single precision real), d (double precision real), c (single precision complex), or z
(double precision complex). In the description of ?gesv and ?posv routines, the ? sign stands for combined
character codes ds and zc for the mixed precision subroutines.
?gesv
Computes the solution to the system of linear
equations with a square coefficient matrix A and
multiple right-hand sides.
Syntax
call sgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call dgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call cgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call zgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call dsgesv( n, nrhs, a, lda, ipiv, b, ldb, x, ldx, work, swork, iter, info )
call zcgesv( n, nrhs, a, lda, ipiv, b, ldb, x, ldx, work, swork, rwork, iter, info )
call gesv( a, b [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B, where A is an n-by-n matrix, the columns
of matrix B are individual right-hand sides, and the columns of X are the corresponding solutions.
The LU decomposition with partial pivoting and row interchanges is used to factor A as A = P*L*U, where P
is a permutation matrix, L is unit lower triangular, and U is upper triangular. The factored form of A is then
used to solve the system of equations A*X = B.
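A minimal call of the double precision flavor could look like this. The program name and the sample system are assumptions; the right-hand side was built as A*(1, 2, 3), so the printed solution should be close to 1, 2, 3. On exit a holds the L and U factors and b holds the solution. The program is expected to be linked against Intel MKL or another LAPACK implementation.

```fortran
program gesv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1, lda = n, ldb = n
  real(dp) :: a(lda, n), b(ldb, nrhs)
  integer :: ipiv(n), info
  external :: dgesv

  ! Sample matrix, given column by column (Fortran column-major order):
  !     | 2 1 1 |
  ! A = | 1 3 2 |
  !     | 1 0 0 |
  a = reshape([2.0_dp, 1.0_dp, 1.0_dp, &
               1.0_dp, 3.0_dp, 0.0_dp, &
               1.0_dp, 2.0_dp, 0.0_dp], [lda, n])
  b(:, 1) = [7.0_dp, 13.0_dp, 1.0_dp]   ! equals A*(1,2,3)

  call dgesv(n, nrhs, a, lda, ipiv, b, ldb, info)

  if (info < 0) then
    print *, 'argument', -info, 'had an illegal value'
    stop 1
  else if (info > 0) then
    print *, 'U is exactly singular at diagonal element', info
    stop 1
  end if

  print *, 'solution x =', b(:, 1)
  print *, 'pivot indices =', ipiv
end program gesv_sketch
```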
The dsgesv and zcgesv are mixed precision iterative refinement subroutines for exploiting fast single
precision hardware. They first attempt to factorize the matrix in single precision (dsgesv) or single complex
precision (zcgesv) and use this factorization within an iterative refinement procedure to produce a solution
with double precision (dsgesv) / double complex precision (zcgesv) normwise backward error quality (see
below). If the approach fails, the method switches to a double precision or double complex precision
factorization respectively and computes the solution.
Iterative refinement is not a winning strategy if the ratio of single precision performance to
double precision performance is too small. A reasonable strategy should take the number of right-hand sides
and the size of the matrix into account. This might be done with a call to ilaenv in the future. At present,
iterative refinement is always attempted.
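The iterative refinement process is stopped if iter > itermax or, for all right-hand sides,
rnrm < sqrt(n)*xnrm*anrm*eps*bwdmax
where
iter is the number of the current iteration in the iterative refinement process,
rnrm is the infinity-norm of the residual,
xnrm is the infinity-norm of the solution,
anrm is the infinity-operator-norm of the matrix A,
eps is the machine epsilon returned by dlamch('Epsilon').
The values itermax and bwdmax are fixed to 30 and 1.0d+00 respectively.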
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the
matrix A; n ≥ 0.
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns of the matrix B; nrhs ≥ 0.
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, n).
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the array x; ldx ≥ max(1, n).
Output Parameters
If iterative refinement has been successfully used (info = 0 and
iter ≥ 0), then A is unchanged.
If double precision factorization has been used (info = 0 and iter <
0), then the array A contains the factors L and U from the
factorization A = P*L*U; the unit diagonal elements of L are not
stored.
ipiv INTEGER.
Array, size at least max(1, n). The pivot indices that define the
permutation matrix P; row i of the matrix was interchanged with row
ipiv(i). Corresponds to the single precision factorization (if info = 0
and iter ≥ 0) or the double precision factorization (if info = 0 and
iter < 0).
iter INTEGER.
If iter < 0: iterative refinement has failed, double precision
factorization has been performed
NOTE
Fortran 95 Interface is so far not available for the mixed precision subroutines dsgesv/zcgesv.
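Since only the FORTRAN 77 interface is available for the mixed precision routines, a call to dsgesv might be sketched as below. The program name and sample system are assumptions, and the workspace sizes used here (work of size n by nrhs, swork of size n*(n+nrhs)) follow the LAPACK documentation for dsgesv rather than this section. The solution is returned in x, while b is left unchanged.

```fortran
program dsgesv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), sp = kind(1.0)
  integer, parameter :: n = 3, nrhs = 1
  integer, parameter :: lda = n, ldb = n, ldx = n
  real(dp) :: a(lda, n), b(ldb, nrhs), x(ldx, nrhs), work(n, nrhs)
  real(sp) :: swork(n*(n + nrhs))
  integer :: ipiv(n), iter, info
  external :: dsgesv

  ! Same hypothetical system as for dgesv: b equals A*(1,2,3).
  a = reshape([2.0_dp, 1.0_dp, 1.0_dp, &
               1.0_dp, 3.0_dp, 0.0_dp, &
               1.0_dp, 2.0_dp, 0.0_dp], [lda, n])
  b(:, 1) = [7.0_dp, 13.0_dp, 1.0_dp]

  call dsgesv(n, nrhs, a, lda, ipiv, b, ldb, x, ldx, work, swork, iter, info)

  if (info /= 0) then
    print *, 'dsgesv returned info =', info
    stop 1
  end if

  if (iter >= 0) then
    ! Single precision factorization plus refinement was sufficient.
    print *, 'iterative refinement succeeded, iter =', iter
  else
    ! The routine fell back to a double precision factorization.
    print *, 'refinement not used or failed, iter =', iter
  end if
  print *, 'solution x =', x(:, 1)
end program dsgesv_sketch
```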
See Also
ilaenv
?lamch
?getrf
Matrix Storage Schemes for LAPACK Routines
?gesvx
Computes the solution to the system of linear
equations with a square coefficient matrix A and
multiple right-hand sides, and provides error bounds
on the solution.
Syntax
call sgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, ferr, berr, work, iwork, info )
call dgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, ferr, berr, work, iwork, info )
call cgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, ferr, berr, work, rwork, info )
call zgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, ferr, berr, work, rwork, info )
call gesvx( a, b, x [,af] [,ipiv] [,fact] [,trans] [,equed] [,r] [,c] [,ferr] [,berr]
[,rcond] [,rpvgrw] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, where A is an n-by-n matrix, the columns of matrix B are individual right-hand sides, and the
columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?gesvx performs the following steps:
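1. If fact = 'E', real scaling factors r and c are computed to equilibrate the system:
trans = 'N': diag(r)*A*diag(c)*inv(diag(c))*X = diag(r)*B
trans = 'T': (diag(r)*A*diag(c))^T*inv(diag(r))*X = diag(c)*B
trans = 'C': (diag(r)*A*diag(c))^H*inv(diag(r))*X = diag(c)*B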
Whether or not the system will be equilibrated depends on the scaling of the matrix A, but if
equilibration is used, A is overwritten by diag(r)*A*diag(c) and B by diag(r)*B (if trans='N') or
diag(c)*B (if trans = 'T' or 'C').
2. If fact = 'N' or 'E', the LU decomposition is used to factor the matrix A (after equilibration if fact
= 'E') as A = P*L*U, where P is a permutation matrix, L is a unit lower triangular matrix, and U is
upper triangular.
3. If some U(i,i) = 0, so that U is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n + 1 is returned as a warning, but the
routine still goes on to solve for X and compute error bounds as described below.
4. The system of equations is solved for X using the factored form of A.
5. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
backward error estimates for it.
6. If equilibration was used, the matrix X is premultiplied by diag(c) (if trans = 'N') or diag(r) (if
trans = 'T' or 'C') so that it solves the original system before equilibration.
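The steps above can be exercised with a call such as the following sketch. The program name and the sample system are assumptions; fact = 'E' is chosen so that the routine decides on equilibration itself and reports its decision in equed. Workspace sizes (work of size 4*n, iwork of size n) follow the LAPACK documentation for the real flavors. The program is expected to be linked against Intel MKL or another LAPACK implementation.

```fortran
program gesvx_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  integer, parameter :: lda = n, ldaf = n, ldb = n, ldx = n
  real(dp) :: a(lda, n), af(ldaf, n), b(ldb, nrhs), x(ldx, nrhs)
  real(dp) :: r(n), c(n), ferr(nrhs), berr(nrhs), work(4*n), rcond
  integer :: ipiv(n), iwork(n), info
  character(len=1) :: equed
  external :: dgesvx

  ! Hypothetical system; b equals A*(1,2,3).
  a = reshape([2.0_dp, 1.0_dp, 1.0_dp, &
               1.0_dp, 3.0_dp, 0.0_dp, &
               1.0_dp, 2.0_dp, 0.0_dp], [lda, n])
  b(:, 1) = [7.0_dp, 13.0_dp, 1.0_dp]

  call dgesvx('E', 'N', n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, &
              b, ldb, x, ldx, rcond, ferr, berr, work, iwork, info)

  if (info < 0) then
    print *, 'argument', -info, 'had an illegal value'
    stop 1
  else if (info > 0 .and. info <= n) then
    print *, 'U is exactly singular at diagonal element', info
    stop 1
  else if (info == n + 1) then
    ! Step 3: a warning only; x and the error bounds are still computed.
    print *, 'warning: rcond is below machine precision'
  end if

  print *, 'equed =', equed
  print *, 'x     =', x(:, 1)
  print *, 'rcond =', rcond
  print *, 'ferr  =', ferr, ' berr =', berr
  print *, 'reciprocal pivot growth =', work(1)
end program gesvx_sketch
```

Note that a and b may be overwritten by their equilibrated forms when equed is not 'N', so copies should be kept if the original data is needed afterwards.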
Input Parameters
If trans = 'C', the system has the form A^H*X = B (transpose for
real flavors, conjugate transpose for complex flavors).
nrhs INTEGER. The number of right-hand sides; the number of columns of
the matrices B and X; nrhs ≥ 0.
The array a (size lda by *) contains the matrix A. If fact = 'F' and
equed is not 'N', then A must have been equilibrated by the scaling
factors in r and/or c. The second dimension of a must be at least
max(1,n).
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if
fact = 'F'. It contains the pivot indices from the factorization A =
P*L*U as computed by ?getrf; row i of the matrix was interchanged
with row ipiv(i).
If equed = 'N', no equilibration was done (always true if fact =
'N').
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is,
A has been replaced by diag(r)*A*diag(c).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1,
n).
iwork INTEGER. Workspace array, size at least max(1, n); used in real
flavors only.
Output Parameters
ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = P*L*U of the
original matrix A (if fact = 'N') or of the equilibrated matrix A (if
fact = 'E').
work, rwork On exit, work(1) for real flavors, or rwork(1) for complex flavors
(the Fortran interface) contains the reciprocal pivot growth factor
norm(A)/norm(U). The "max absolute element" norm is used. If
work(1) for real flavors, or rwork(1) for complex flavors is much
less than 1, then the stability of the LU factorization of the
(equilibrated) matrix A could be poor. This also means that the
solution x, condition estimator rcond, and forward error bound ferr
could be unreliable. If factorization fails with 0 < info ≤ n, then
work(1) for real flavors, or rwork(1) for complex flavors contains
the reciprocal pivot growth factor for the leading info columns of A.
r Holds the vector of length n. Default value for each element is r(i) =
1.0_WP.
c Holds the vector of length n. Default value for each element is c(i) =
1.0_WP.
fact Must be 'N', 'E', or 'F'. The default value is 'N'. If fact = 'F',
then both arguments af and ipiv must be present; otherwise, an error
is returned.
equed Must be 'N', 'B', 'C', or 'R'. The default value is 'N'.
rpvgrw Real value that contains the reciprocal pivot growth factor norm(A)/
norm(U).
See Also
Matrix Storage Schemes for LAPACK Routines
?gesvxx
Uses extra precise iterative refinement to compute the
solution to the system of linear equations with a
square coefficient matrix A and multiple right-hand
sides.
Syntax
call sgesvxx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params,
work, iwork, info )
call dgesvxx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params,
work, iwork, info )
call cgesvxx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params,
work, rwork, info )
call zgesvxx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params,
work, rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, where A is an n-by-n matrix, the columns of the matrix B are individual right-hand sides, and the
columns of X are the corresponding solutions.
Both normwise and maximum componentwise error bounds are also provided on request. The routine returns
a solution with a small guaranteed error (O(eps), where eps is the working machine precision) unless the
matrix is very ill-conditioned, in which case a warning is returned. Relevant condition numbers are also
calculated and returned.
The routine accepts user-provided factorizations and equilibration factors; see definitions of the fact and
equed options. Solving with refinement and using a factorization from a previous call of the routine also
produces a solution with O(eps) errors or warnings but that may not be true for general user-provided
factorizations and equilibration factors if they differ from what the routine would itself produce.
The routine ?gesvxx performs the following steps:
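1. If fact = 'E', scaling factors r and c are computed to equilibrate the system:
trans = 'N': diag(r)*A*diag(c)*inv(diag(c))*X = diag(r)*B
trans = 'T': (diag(r)*A*diag(c))^T*inv(diag(r))*X = diag(c)*B
trans = 'C': (diag(r)*A*diag(c))^H*inv(diag(r))*X = diag(c)*B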
Input Parameters
If trans = 'C', the system has the form A^H*X = B (conjugate transpose;
equivalent to the transpose for real flavors).
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
The array a contains the matrix A. If fact = 'F' and equed is not 'N',
then A must have been equilibrated by the scaling factors in r and/or c.
The second dimension of a must be at least max(1,n).
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work(*) is a workspace array. The dimension of work must be at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex flavors.
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if fact
= 'F'. It contains the pivot indices from the factorization A = P*L*U as
computed by ?getrf; row i of the matrix was interchanged with row
ipiv(i).
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is, A has
been replaced by diag(r)*A*diag(c).
Arrays: r (size n), c (size n). The array r contains the row scale factors
for A, and the array c contains the column scale factors for A. These arrays
are input arguments if fact = 'F' only; otherwise they are output
arguments.
If equed = 'R' or 'B', A is multiplied on the left by diag(r); if equed =
'N' or 'C', r is not accessed.
If fact = 'F' and equed = 'R' or 'B', each element of r must be
positive.
If equed = 'C' or 'B', A is multiplied on the right by diag(c); if equed =
'N' or 'R', c is not accessed.
If fact = 'F' and equed = 'C' or 'B', each element of c must be
positive.
Each element of r or c should be a power of the radix to ensure a reliable
solution and error estimates. Scaling by powers of the radix does not cause
rounding errors unless the result underflows or overflows. Rounding errors
during scaling lead to refining with a matrix that is not equivalent to the
input matrix, producing error estimates that may not be reliable.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right-hand side and
each type (normwise or componentwise). See err_bnds_norm and
err_bnds_comp descriptions in the Output Parameters section below.
Default 10.0
iwork INTEGER. Workspace array, size at least max(1, n); used in real flavors
only.
Output Parameters
a Array a is not modified on exit if fact = 'F' or 'N', or if fact = 'E' and
equed = 'N'.
If equed ≠ 'N', A is scaled on exit as follows:
overwritten by trans = 'T' or 'C' and equed = 'C' or 'B';
The first index in err_bnds_comp(i,:) corresponds to the i-th right-hand
side.
The second index in err_bnds_comp(:,err) contains the following three
fields:
ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = P*L*U of the original
matrix A (if fact = 'N') or of the equilibrated matrix A (if fact = 'E').
params If an entry is less than 0.0, that entry is filled with the default value used
for that parameter; otherwise, the entry is not modified.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0). By default (params(3) = 1.0), the j-th right-hand side is the first with
either a normwise or componentwise error bound that is not guaranteed (the
smallest j such that either err_bnds_norm(j,1) = 0.0 or err_bnds_comp(j,1) =
0.0). See the definition of err_bnds_norm and err_bnds_comp for err = 1. To
get information about all of the right-hand sides, check err_bnds_norm or
err_bnds_comp.
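The interpretation of info and of the error-bound arrays can be sketched as follows. The program name and the sample system are assumptions. Passing nparams = 0 so that the params array is not referenced and defaults are used, and reading the three fields of err_bnds_norm as a trust flag, an error bound, and a reciprocal condition number, both follow the LAPACK documentation for dgesvxx rather than this section. The sketch assumes a library that provides the extra precise routines, such as Intel MKL.

```fortran
program gesvxx_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1, n_err_bnds = 3, nparams = 0
  integer, parameter :: lda = n, ldaf = n, ldb = n, ldx = n
  real(dp) :: a(lda, n), af(ldaf, n), b(ldb, nrhs), x(ldx, nrhs)
  real(dp) :: r(n), c(n), berr(nrhs), work(4*n), params(1)
  real(dp) :: err_bnds_norm(nrhs, n_err_bnds), err_bnds_comp(nrhs, n_err_bnds)
  real(dp) :: rcond, rpvgrw
  integer :: ipiv(n), iwork(n), info, j
  character(len=1) :: equed
  external :: dgesvxx

  ! Hypothetical system; b equals A*(1,2,3).
  a = reshape([2.0_dp, 1.0_dp, 1.0_dp, &
               1.0_dp, 3.0_dp, 0.0_dp, &
               1.0_dp, 2.0_dp, 0.0_dp], [lda, n])
  b(:, 1) = [7.0_dp, 13.0_dp, 1.0_dp]

  call dgesvxx('E', 'N', n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, &
               b, ldb, x, ldx, rcond, rpvgrw, berr, n_err_bnds, &
               err_bnds_norm, err_bnds_comp, nparams, params, &
               work, iwork, info)

  if (info < 0) then
    print *, 'argument', -info, 'had an illegal value'
    stop 1
  else if (info > 0 .and. info <= n) then
    print *, 'U is exactly singular at diagonal element', info
    stop 1
  else if (info > n) then
    ! info = n+j: only the first unreliable right-hand side is reported.
    print *, 'solution not guaranteed for right-hand side', info - n
  end if

  ! Inspect every right-hand side, not only the one reported through info.
  do j = 1, nrhs
    print *, 'rhs', j, ' trust flag       =', err_bnds_norm(j, 1)
    print *, 'rhs', j, ' normwise bound   =', err_bnds_norm(j, 2)
    print *, 'rhs', j, ' recip. cond. no. =', err_bnds_norm(j, 3)
  end do
  print *, 'x =', x(:, 1)
end program gesvxx_sketch
```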
See Also
Matrix Storage Schemes for LAPACK Routines
?gbsv
Computes the solution to the system of linear
equations with a band coefficient matrix A and
multiple right-hand sides.
Syntax
call sgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call dgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call cgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call zgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call gbsv( ab, b [,kl] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n band
matrix with kl subdiagonals and ku superdiagonals, the columns of matrix B are individual right-hand sides,
and the columns of X are the corresponding solutions.
The LU decomposition with partial pivoting and row interchanges is used to factor A as A = L*U, where L is a
product of permutation and unit lower triangular matrices with kl subdiagonals, and U is upper triangular
with kl+ku superdiagonals. The factored form of A is then used to solve the system of equations A*X = B.
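The band layout expected by ?gbsv is the main practical hurdle, so the sketch below packs a small dense matrix into band storage before the call. The program name, the tridiagonal sample matrix, and the helper array afull are assumptions. The mapping ab(kl+ku+1+i-j, j) = aij with ldab = 2*kl+ku+1 is the convention described in Matrix Storage Schemes; the first kl rows of ab are left free for fill-in produced by the row interchanges.

```fortran
program gbsv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, kl = 1, ku = 1, nrhs = 1
  integer, parameter :: ldab = 2*kl + ku + 1, ldb = n
  real(dp) :: afull(n, n), ab(ldab, n), b(ldb, nrhs), xtrue(n)
  integer :: ipiv(n), info, i, j
  external :: dgbsv

  ! Illustrative tridiagonal matrix: 4 on the diagonal, -1 beside it.
  afull = 0.0_dp
  do i = 1, n
    afull(i, i) = 4.0_dp
    if (i > 1) afull(i, i-1) = -1.0_dp
    if (i < n) afull(i, i+1) = -1.0_dp
  end do

  ! Pack the band of afull into ab; rows 1..kl stay available for fill-in.
  ab = 0.0_dp
  do j = 1, n
    do i = max(1, j - ku), min(n, j + kl)
      ab(kl + ku + 1 + i - j, j) = afull(i, j)
    end do
  end do

  ! Build a right-hand side with a known solution of all ones.
  xtrue = 1.0_dp
  b(:, 1) = matmul(afull, xtrue)

  call dgbsv(n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info)

  if (info /= 0) then
    print *, 'dgbsv returned info =', info
    stop 1
  end if

  ! On exit b holds the solution.
  print *, 'x =', b(:, 1)
  print *, 'max |x - xtrue| =', maxval(abs(b(:, 1) - xtrue))
end program gbsv_sketch
```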
Input Parameters
ab, b REAL for sgbsv
DOUBLE PRECISION for dgbsv
COMPLEX for cgbsv
DOUBLE COMPLEX for zgbsv.
Arrays: ab(size ldab by *), b(size ldb by *).
The array ab contains the matrix A in band storage (see Matrix
Storage Schemes). The second dimension of ab must be at least
max(1, n).
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ 2*kl + ku + 1.
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). The pivot indices: row i was
interchanged with row ipiv(i).
ku Restored as ku = lda-2*kl-1.
See Also
Matrix Storage Schemes for LAPACK Routines
?gbsvx
Computes the solution to the real or complex system
of linear equations with a band coefficient matrix A
and multiple right-hand sides, and provides error
bounds on the solution.
Syntax
call sgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, iwork, info )
call dgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, iwork, info )
call cgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, rwork, info )
call zgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, rwork, info )
call gbsvx( ab, b, x [,kl] [,afb] [,ipiv] [,fact] [,trans] [,equed] [,r] [,c] [,ferr]
[,berr] [,rcond] [,rpvgrw] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, A^T*X = B, or A^H*X = B, where A is a band matrix of order n with kl subdiagonals and ku
superdiagonals, the columns of matrix B are individual right-hand sides, and the columns of X are the
corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?gbsvx performs the following steps:
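1. If fact = 'E', real scaling factors r and c are computed to equilibrate the system:
trans = 'N': diag(r)*A*diag(c)*inv(diag(c))*X = diag(r)*B
trans = 'T': (diag(r)*A*diag(c))^T*inv(diag(r))*X = diag(c)*B
trans = 'C': (diag(r)*A*diag(c))^H*inv(diag(r))*X = diag(c)*B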
6. If equilibration was used, the matrix X is premultiplied by diag(c) (if trans = 'N') or diag(r) (if
trans = 'T' or 'C') so that it solves the original system before equilibration.
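A call to the expert band driver might be sketched as below. The program name, the sample matrix, and the helper array afull are assumptions. Unlike ?gbsv, the input array ab here needs only kl+ku+1 rows, with aij stored in ab(ku+1+i-j, j), while the factor array afb needs 2*kl+ku+1 rows; the workspace sizes (work of size 3*n, iwork of size n) follow the LAPACK documentation for the real flavors.

```fortran
program gbsvx_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, kl = 1, ku = 1, nrhs = 1
  integer, parameter :: ldab = kl + ku + 1, ldafb = 2*kl + ku + 1
  integer, parameter :: ldb = n, ldx = n
  real(dp) :: afull(n, n), ab(ldab, n), afb(ldafb, n)
  real(dp) :: b(ldb, nrhs), x(ldx, nrhs), xtrue(n)
  real(dp) :: r(n), c(n), ferr(nrhs), berr(nrhs), work(3*n), rcond
  integer :: ipiv(n), iwork(n), info, i, j
  character(len=1) :: equed
  external :: dgbsvx

  ! Illustrative tridiagonal matrix: 4 on the diagonal, -1 beside it.
  afull = 0.0_dp
  do i = 1, n
    afull(i, i) = 4.0_dp
    if (i > 1) afull(i, i-1) = -1.0_dp
    if (i < n) afull(i, i+1) = -1.0_dp
  end do

  ! Band storage without the extra kl rows used by dgbsv.
  ab = 0.0_dp
  do j = 1, n
    do i = max(1, j - ku), min(n, j + kl)
      ab(ku + 1 + i - j, j) = afull(i, j)
    end do
  end do

  xtrue = 1.0_dp
  b(:, 1) = matmul(afull, xtrue)

  ! fact = 'N': factor A without equilibration; afb and ipiv are outputs.
  call dgbsvx('N', 'N', n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, &
              equed, r, c, b, ldb, x, ldx, rcond, ferr, berr, &
              work, iwork, info)

  if (info < 0 .or. (info > 0 .and. info <= n)) then
    print *, 'dgbsvx failed, info =', info
    stop 1
  else if (info == n + 1) then
    print *, 'warning: rcond is below machine precision'
  end if

  print *, 'x     =', x(:, 1)
  print *, 'rcond =', rcond
  print *, 'ferr  =', ferr, ' berr =', berr
end program gbsvx_sketch
```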
Input Parameters
If trans = 'C', the system has the form A^H*X = B (transpose for
real flavors, conjugate transpose for complex flavors).
nrhs INTEGER. The number of right-hand sides; the number of columns of
the matrices B and X; nrhs ≥ 0.
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if
fact = 'F'. It contains the pivot indices from the factorization A =
P*L*U as computed by ?gbtrf; row i of the matrix was interchanged
with row ipiv(i).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is,
A has been replaced by diag(r)*A*diag(c).
The array r contains the row scale factors for A, and the array c
contains the column scale factors for A. These arrays are input
arguments if fact = 'F' only; otherwise they are output arguments.
If fact = 'F' and equed = 'C' or 'B', each element of c must be
positive.
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1,
n).
iwork INTEGER. Workspace array, size at least max(1, n); used in real
flavors only.
Output Parameters
afb If fact = 'N' or 'E', then afb is an output argument and on exit
returns details of the LU factorization of the original matrix A (if fact
= 'N') or of the equilibrated matrix A (if fact = 'E'). See the
description of ab for the form of the equilibrated matrix.
ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = L*U of the
original matrix A (if fact = 'N') or of the equilibrated matrix A (if
fact = 'E').
work, rwork On exit, work(1) for real flavors, or rwork(1) for complex flavors,
contains the reciprocal pivot growth factor norm(A)/norm(U). The
"max absolute element" norm is used. If work(1) for real flavors, or
rwork(1) for complex flavors is much less than 1, then the stability of
the LU factorization of the (equilibrated) matrix A could be poor. This
also means that the solution x, condition estimator rcond, and forward
error bound ferr could be unreliable. If factorization fails with 0 <
info ≤ n, then work(1) for real flavors, or rwork(1) for complex
flavors contains the reciprocal pivot growth factor for the leading info
columns of A.
r Holds the vector of length n. Default value for each element is r(i) =
1.0_WP.
c Holds the vector of length n. Default value for each element is c(i) =
1.0_WP.
equed Must be 'N', 'B', 'C', or 'R'. The default value is 'N'.
fact Must be 'N', 'E', or 'F'. The default value is 'N'. If fact = 'F',
then both arguments afb and ipiv must be present; otherwise, an error
is returned.
rpvgrw Real value that contains the reciprocal pivot growth factor norm(A)/
norm(U).
ku Restored as ku = lda-kl-1.
See Also
Matrix Storage Schemes for LAPACK Routines
?gbsvxx
Uses extra precise iterative refinement to compute the
solution to the system of linear equations with a
banded coefficient matrix A and multiple right-hand
sides.
Syntax
call sgbsvxx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c,
b, ldb, x, ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp,
nparams, params, work, iwork, info )
call dgbsvxx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c,
b, ldb, x, ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp,
nparams, params, work, iwork, info )
call cgbsvxx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c,
b, ldb, x, ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp,
nparams, params, work, rwork, info )
call zgbsvxx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c,
b, ldb, x, ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp,
nparams, params, work, rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, A^T*X = B, or A^H*X = B, where A is an n-by-n banded matrix, the columns of the matrix B are
individual right-hand sides, and the columns of X are the corresponding solutions.
Both normwise and maximum componentwise error bounds are also provided on request. The routine returns
a solution with a small guaranteed error (O(eps), where eps is the working machine precision) unless the
matrix is very ill-conditioned, in which case a warning is returned. Relevant condition numbers are also
calculated and returned.
The routine accepts user-provided factorizations and equilibration factors; see definitions of the fact and
equed options. Solving with refinement and using a factorization from a previous call of the routine also
produces a solution with O(eps) errors or warnings but that may not be true for general user-provided
factorizations and equilibration factors if they differ from what the routine would itself produce.
The routine ?gbsvxx performs the following steps:
1. If fact = 'E', scaling factors r and c are computed to equilibrate the system:
trans = 'N': diag(r)*A*diag(c)*inv(diag(c))*X = diag(r)*B
trans = 'T': (diag(r)*A*diag(c))^T*inv(diag(r))*X = diag(c)*B
trans = 'C': (diag(r)*A*diag(c))^H*inv(diag(r))*X = diag(c)*B
Whether or not the system will be equilibrated depends on the scaling of the matrix A, but if
equilibration is used, A is overwritten by diag(r)*A*diag(c) and B by diag(r)*B (if trans='N') or
diag(c)*B (if trans = 'T' or 'C').
2. If fact = 'N' or 'E', the LU decomposition is used to factor the matrix A (after equilibration if fact
= 'E') as A = P*L*U, where P is a permutation matrix, L is a unit lower triangular matrix, and U is
upper triangular.
3. If some U(i,i) = 0, so that U is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A (see the rcond parameter).
If the reciprocal of the condition number is less than machine precision, the routine still goes on to
solve for X and compute error bounds.
4. The system of equations is solved for X using the factored form of A.
5. By default, unless params(1) is set to zero, the routine applies iterative refinement to improve the
computed solution matrix and calculate error bounds. Refinement calculates the residual to at least
twice the working precision.
6. If equilibration was used, the matrix X is premultiplied by diag(c) (if trans = 'N') or diag(r) (if
trans = 'T' or 'C') so that it solves the original system before equilibration.
Input Parameters
If trans = 'C', the system has the form A^H*X = B (conjugate transpose;
equivalent to the transpose for real flavors).
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
The array afb is an input argument if fact = 'F'. It contains the factored
form of the banded matrix A, that is, the factors L and U from the
factorization A = P*L*U as computed by ?gbtrf. U is stored as an upper
triangular banded matrix with kl + ku superdiagonals in rows 1 to kl + ku
+ 1. The multipliers used during the factorization are stored in rows kl + ku
+ 2 to 2*kl + ku + 1. If equed is not 'N', then afb is the factored form of
the equilibrated matrix A.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work(*) is a workspace array. The dimension of work must be at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex flavors.
ldafb INTEGER. The leading dimension of the array afb; ldafb ≥ 2*kl+ku+1.
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if fact
= 'F'. It contains the pivot indices from the factorization A = P*L*U as
computed by ?gbtrf; row i of the matrix was interchanged with row
ipiv(i).
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is, A has
been replaced by diag(r)*A*diag(c).
Arrays: r (size n), c (size n). The array r contains the row scale factors for
A, and the array c contains the column scale factors for A. These arrays are
input arguments if fact = 'F' only; otherwise they are output arguments.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right-hand side and
each type (normwise or componentwise). See err_bnds_norm and
err_bnds_comp descriptions in the Output Parameters section below.
Default 10.0
iwork INTEGER. Workspace array, size at least max(1, n); used in real flavors
only.
Output Parameters
afb If fact = 'N' or 'E', then afb is an output argument and on exit returns
the factors L and U from the factorization A = P*L*U of the original matrix A
(if fact = 'N') or of the equilibrated matrix A (if fact = 'E').
r, c These arrays are output arguments if fact ≠ 'F'. Each element of these
arrays is a power of the radix. See the description of r, c in Input
Arguments section.
The second index in err_bnds_comp(:,err) contains the following three
fields:
ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = P*L*U of the original
matrix A (if fact = 'N') or of the equilibrated matrix A (if fact = 'E').
params If an entry is less than 0.0, that entry is filled with the default value used
for that parameter, otherwise the entry is not modified.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0 or err_bnds_comp(j,1) = 0.0). See the definition of err_bnds_norm
and err_bnds_comp for err = 1. To get information about all of the right-
hand sides, check err_bnds_norm or err_bnds_comp.
See Also
Matrix Storage Schemes for LAPACK Routines
?gtsv
Computes the solution to the system of linear
equations with a tridiagonal coefficient matrix A and
multiple right-hand sides.
Syntax
call sgtsv( n, nrhs, dl, d, du, b, ldb, info )
call dgtsv( n, nrhs, dl, d, du, b, ldb, info )
call cgtsv( n, nrhs, dl, d, du, b, ldb, info )
call zgtsv( n, nrhs, dl, d, du, b, ldb, info )
call gtsv( dl, d, du, b [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B, where A is an n-by-n tridiagonal matrix, the
columns of matrix B are individual right-hand sides, and the columns of X are the corresponding solutions.
The routine uses Gaussian elimination with partial pivoting.
Note that the equation A^T*X = B may be solved by interchanging the order of the arguments du and dl.
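As an illustration of the call sequence above, the following sketch solves a small tridiagonal system with dgtsv. The matrix entries, the right-hand side, and the program name are illustrative assumptions, and the program is assumed to be linked against Intel MKL or another LAPACK implementation.

```fortran
program gtsv_sketch
  implicit none
  integer, parameter :: n = 4, nrhs = 1
  double precision :: dl(n-1), d(n), du(n-1), b(n, nrhs)
  integer :: info
  external :: dgtsv

  ! Illustrative data: 4 on the diagonal, 1 on both off-diagonals.
  dl = 1.0d0
  d  = 4.0d0
  du = 1.0d0

  ! Right-hand side chosen so that the exact solution is (1, 2, 3, 4).
  b(:, 1) = [6.0d0, 12.0d0, 18.0d0, 19.0d0]

  ! On exit, b holds the solution X; dl, d, and du are overwritten.
  call dgtsv(n, nrhs, dl, d, du, b, n, info)

  if (info /= 0) then
    print *, 'dgtsv returned info = ', info
  else
    print '(4f10.6)', b(:, 1)
  end if
end program gtsv_sketch
```

To solve the transposed system with the same data, pass du in the dl position and dl in the du position, as noted above.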
Input Parameters
d REAL for sgtsv
DOUBLE PRECISION for dgtsv
COMPLEX for cgtsv
DOUBLE COMPLEX for zgtsv.
The array d (size n) contains the diagonal elements of A.
Output Parameters
If info = i, U(i, i) is exactly zero, and the solution has not been
computed. The factorization has not been completed unless i = n.
See Also
Matrix Storage Schemes for LAPACK Routines
?gtsvx
Computes the solution to the real or complex system
of linear equations with a tridiagonal coefficient matrix
A and multiple right-hand sides, and provides error
bounds on the solution.
Syntax
call sgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
ldx, rcond, ferr, berr, work, iwork, info )
call dgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
ldx, rcond, ferr, berr, work, iwork, info )
call cgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
ldx, rcond, ferr, berr, work, rwork, info )
call zgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
ldx, rcond, ferr, berr, work, rwork, info )
call gtsvx( dl, d, du, b, x [,dlf] [,df] [,duf] [,du2] [,ipiv] [,fact] [,trans] [,ferr]
[,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, A^T*X = B, or A^H*X = B, where A is a tridiagonal matrix of order n, the columns of matrix B are
individual right-hand sides, and the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?gtsvx performs the following steps:
1. If fact = 'N', the LU decomposition is used to factor the matrix A as A = L*U, where L is a product
of permutation and unit lower bidiagonal matrices and U is an upper triangular matrix with nonzeroes in
only the main diagonal and first two superdiagonals.
2. If some U(i,i) = 0, so that U is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n + 1 is returned as a warning, but the
routine still goes on to solve for X and compute error bounds as described below.
3. The system of equations is solved for X using the factored form of A.
4. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
backward error estimates for it.
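The steps above can be exercised with a call such as the one in the following sketch, which uses dgtsvx with fact = 'N' so that the routine factors the matrix itself. The data and the program name are illustrative assumptions, the workspace sizes (3*n for work, n for iwork) follow the standard LAPACK interface for the real flavors, and linking against Intel MKL or another LAPACK implementation is assumed.

```fortran
program gtsvx_sketch
  implicit none
  integer, parameter :: n = 4, nrhs = 1
  double precision :: dl(n-1), d(n), du(n-1)
  double precision :: dlf(n-1), df(n), duf(n-1), du2(n-2)
  double precision :: b(n, nrhs), x(n, nrhs)
  double precision :: rcond, ferr(nrhs), berr(nrhs), work(3*n)
  integer :: ipiv(n), iwork(n), info
  external :: dgtsvx

  ! Illustrative data; the exact solution is (1, 2, 3, 4).
  dl = 1.0d0
  d  = 4.0d0
  du = 1.0d0
  b(:, 1) = [6.0d0, 12.0d0, 18.0d0, 19.0d0]

  ! fact = 'N': factor A inside the call; trans = 'N': solve A*X = B.
  call dgtsvx('N', 'N', n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, &
              b, n, x, n, rcond, ferr, berr, work, iwork, info)

  if (info == 0 .or. info == n + 1) then
    ! info = n + 1 is only a warning: the solution is still computed.
    if (info == n + 1) print *, 'warning: rcond is below machine precision'
    print '(a, 4f10.6)', 'x     =', x(:, 1)
    print '(a, es10.3)', 'rcond =', rcond
    print '(a, es10.3)', 'ferr  =', ferr(1)
    print '(a, es10.3)', 'berr  =', berr(1)
  else
    print *, 'dgtsvx returned info = ', info
  end if
end program gtsvx_sketch
```

Unlike ?gtsv, this call leaves dl, d, du, and b unchanged and returns the solution in the separate array x.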
Input Parameters
nrhs INTEGER. The number of right-hand sides, that is, the number of columns of
the matrices B and X; nrhs ≥ 0.
ipiv INTEGER.
Array, size at least max(1, n). If fact = 'F', then ipiv is an input
argument and on entry contains the pivot indices, as returned by
?gttrf.
iwork INTEGER. Workspace array, size (n). Used for real flavors only.
Output Parameters
dlf If fact = 'N', then dlf is an output argument and on exit contains
the (n-1) multipliers that define the matrix L from the LU
factorization of A.
duf If fact = 'N', then duf is an output argument and on exit contains
the (n-1) elements of the first superdiagonal of U.
du2 If fact = 'N', then du2 is an output argument and on exit contains
the (n-2) elements of the second superdiagonal of U.
An estimate of the reciprocal condition number of the matrix A. If
rcond is less than the machine precision (in particular, if rcond = 0),
the matrix is singular to working precision. This condition is indicated
by a return code of info > 0.
fact Must be 'N' or 'F'. The default value is 'N'. If fact = 'F', then
the arguments dlf, df, duf, du2, and ipiv must be present; otherwise,
an error is returned.
See Also
Matrix Storage Schemes for LAPACK Routines
?dtsvb
Computes the solution to the system of linear
equations with a diagonally dominant tridiagonal
coefficient matrix A and multiple right-hand sides.
Syntax
call sdtsvb( n, nrhs, dl, d, du, b, ldb, info )
call ddtsvb( n, nrhs, dl, d, du, b, ldb, info )
call cdtsvb( n, nrhs, dl, d, du, b, ldb, info )
call zdtsvb( n, nrhs, dl, d, du, b, ldb, info )
call dtsvb( dl, d, du, b [, info])
Include Files
mkl.fi, lapack.f90
Description
The ?dtsvb routine solves a system of linear equations A*X = B for X, where A is an n-by-n diagonally
dominant tridiagonal matrix, the columns of matrix B are individual right-hand sides, and the columns of X
are the corresponding solutions. The routine uses the BABE (Burning At Both Ends) algorithm.
Note that the equation A^T*X = B may be solved by interchanging the order of the arguments du and dl.
Input Parameters
dl, d, du, b REAL for sdtsvb
DOUBLE PRECISION for ddtsvb
COMPLEX for cdtsvb
DOUBLE COMPLEX for zdtsvb.
Arrays: dl (size n - 1), d (size n), du (size n - 1), b(size ldb,*).
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
Output Parameters
If info = i, u(i,i) is exactly zero, and the solution has not been
computed. The factorization has not been completed unless i = n.
Application Notes
A diagonally dominant tridiagonal system is one in which |d(i)| > |dl(i-1)| + |du(i)| for every row i, where a
term is dropped if the corresponding off-diagonal element does not exist (rows 1 and n).
The underlying BABE algorithm is designed for diagonally dominant systems. For such systems, elimination
without pivoting is numerically stable, so ?dtsvb does not need the partial pivoting used by the general
tridiagonal solver (see ?gtsv). As a result, ?dtsvb solves diagonally dominant systems much faster than the
general solver does.
NOTE
The current implementation of BABE may lose accuracy when the input data are very small or very
large, that is, close to the underflow or overflow threshold, respectively. Scale the matrix before
applying the solver to such data.
Applying ?dtsvb to systems that are not diagonally dominant may lead to a loss of accuracy or to
false detection of singularity, because the routine does not pivot.
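Because ?dtsvb does not verify diagonal dominance itself, a caller may want to test the input first. The following is a minimal sketch of such a test; the helper is_diag_dominant is hypothetical (it is not an MKL routine), and it treats the missing off-diagonal neighbors in the first and last rows as zero.

```fortran
program dominance_sketch
  implicit none
  double precision :: dl(3), d(4), du(3)

  dl = 1.0d0
  d  = 4.0d0
  du = 1.0d0
  ! Every row satisfies 4 > 1 + 1 (or 4 > 1 at the ends).
  print *, is_diag_dominant(dl, d, du)

  ! Row 2 now violates the condition: 1.5 is not greater than 1 + 1.
  d(2) = 1.5d0
  print *, is_diag_dominant(dl, d, du)

contains

  ! Hypothetical helper: .true. if |d(i)| > |dl(i-1)| + |du(i)| for all rows,
  ! with nonexistent neighbors (row 1 and row n) contributing zero.
  logical function is_diag_dominant(dl, d, du) result(ok)
    double precision, intent(in) :: dl(:), d(:), du(:)
    double precision :: off
    integer :: i, n

    n = size(d)
    ok = .true.
    do i = 1, n
      off = 0.0d0
      if (i > 1) off = off + abs(dl(i-1))
      if (i < n) off = off + abs(du(i))
      if (abs(d(i)) <= off) then
        ok = .false.
        return
      end if
    end do
  end function is_diag_dominant

end program dominance_sketch
```

If the test fails, the general solver ?gtsv, which uses partial pivoting, is the safer choice.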
?posv
Computes the solution to the system of linear
equations with a symmetric or Hermitian positive-
definite coefficient matrix A and multiple right-hand
sides.
Syntax
call sposv( uplo, n, nrhs, a, lda, b, ldb, info )
call dposv( uplo, n, nrhs, a, lda, b, ldb, info )
call cposv( uplo, n, nrhs, a, lda, b, ldb, info )
call zposv( uplo, n, nrhs, a, lda, b, ldb, info )
call dsposv( uplo, n, nrhs, a, lda, b, ldb, x, ldx, work, swork, iter, info )
call zcposv( uplo, n, nrhs, a, lda, b, ldb, x, ldx, work, swork, rwork, iter, info )
call posv( a, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n
symmetric/Hermitian positive-definite matrix, the columns of matrix B are individual right-hand sides, and
the columns of X are the corresponding solutions.
The Cholesky decomposition is used to factor A as
A = U^T*U (real flavors) and A = U^H*U (complex flavors), if uplo = 'U'
or A = L*L^T (real flavors) and A = L*L^H (complex flavors), if uplo = 'L',
where U is an upper triangular matrix and L is a lower triangular matrix. The factored form of A is then used
to solve the system of equations A*X = B.
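A minimal sketch of a dposv call follows. The 2-by-2 matrix, the right-hand side, and the program name are illustrative assumptions, and linking against Intel MKL or another LAPACK implementation is assumed.

```fortran
program posv_sketch
  implicit none
  integer, parameter :: n = 2, nrhs = 1
  double precision :: a(n, n), b(n, nrhs)
  integer :: info
  external :: dposv

  ! Illustrative symmetric positive-definite matrix.
  ! With uplo = 'U' only the upper triangle is referenced.
  a(1, :) = [4.0d0, 2.0d0]
  a(2, :) = [2.0d0, 3.0d0]

  ! Right-hand side chosen so that the exact solution is (1, 2).
  b(:, 1) = [8.0d0, 8.0d0]

  ! On exit, b holds the solution and the upper triangle of a holds U.
  call dposv('U', n, nrhs, a, n, b, n, info)

  if (info == 0) then
    print '(2f10.6)', b(:, 1)
  else if (info > 0) then
    print *, 'leading minor of order', info, 'is not positive-definite'
  else
    print *, 'illegal value in argument', -info
  end if
end program posv_sketch
```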
The dsposv and zcposv routines are mixed precision iterative refinement subroutines that exploit fast single
precision hardware. They first attempt to factorize the matrix in single precision (dsposv) or single complex
precision (zcposv) and use this factorization within an iterative refinement procedure to produce a solution
with double precision (dsposv) / double complex precision (zcposv) normwise backward error quality (see
below). If the approach fails, the method switches to a double precision or double complex precision
factorization respectively and computes the solution.
Iterative refinement does not pay off if the ratio of single precision (or single complex) performance to
double precision (or double complex) performance is too small. A reasonable strategy should also
take the number of right-hand sides and the size of the matrix into account. This might be done with a call to
ilaenv in the future. At present, iterative refinement is always attempted.
The iterative refinement process is stopped if
iter > itermax
or for all the right-hand sides:
rnmr < sqrt(n)*xnrm*anrm*eps*bwdmax,
where
iter is the number of the current iteration in the iterative refinement process
rnmr is the infinity-norm of the residual
xnrm is the infinity-norm of the solution
anrm is the infinity-operator-norm of the matrix A
eps is the machine epsilon returned by dlamch('Epsilon').
The values itermax and bwdmax are fixed to 30 and 1.0d+00 respectively.
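The stopping rule can be written out as a small predicate. The sketch below is an illustration only, not the library's internal code: the function name refinement_done is hypothetical, it tests a single right-hand side, and it uses the Fortran intrinsic epsilon() as a stand-in for dlamch('Epsilon'), which may differ from it by a factor of two.

```fortran
program refinement_stop_sketch
  implicit none
  integer, parameter :: itermax = 30
  double precision, parameter :: bwdmax = 1.0d0

  ! Residual still too large after 1 iteration: keep refining.
  print *, refinement_done(1, 100, 1.0d-14, 1.0d0, 1.0d0)
  ! Residual below the threshold: stop.
  print *, refinement_done(1, 100, 1.0d-16, 1.0d0, 1.0d0)
  ! Iteration limit exceeded: stop regardless of the residual.
  print *, refinement_done(31, 100, 1.0d-14, 1.0d0, 1.0d0)

contains

  ! Hypothetical helper for one right-hand side:
  ! stop if iter > itermax or rnmr < sqrt(n)*xnrm*anrm*eps*bwdmax.
  logical function refinement_done(iter, n, rnmr, xnrm, anrm) result(done)
    integer, intent(in) :: iter, n
    double precision, intent(in) :: rnmr, xnrm, anrm
    double precision :: eps

    ! Assumption: epsilon() approximates the value of dlamch('Epsilon').
    eps = epsilon(1.0d0)
    done = (iter > itermax) .or. &
           (rnmr < sqrt(dble(n)) * xnrm * anrm * eps * bwdmax)
  end function refinement_done

end program refinement_stop_sketch
```

In the routine itself the residual test must hold for every right-hand side before refinement stops.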
Input Parameters
Note that in the case of zcposv the imaginary parts of the diagonal
elements need not be set and are assumed to be zero.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
ldx INTEGER. The leading dimension of the array x; ldx ≥ max(1, n).
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). The pivot indices that define the
permutation matrix P; row i of the matrix was interchanged with row
ipiv(i). Corresponds to the single precision factorization (if info = 0
and iter ≥ 0) or the double precision factorization (if info = 0 and
iter < 0).
iter INTEGER.
If iter < 0: iterative refinement has failed, double precision
factorization has been performed
a Holds the matrix A of size (n,n).
See Also
Matrix Storage Schemes for LAPACK Routines
?posvx
Uses the Cholesky factorization to compute the
solution to the system of linear equations with a
symmetric or Hermitian positive-definite coefficient
matrix A, and provides error bounds on the solution.
Syntax
call sposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, iwork, info )
call dposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, iwork, info )
call cposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, rwork, info )
call zposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, rwork, info )
call posvx( a, b, x [,uplo] [,af] [,fact] [,equed] [,s] [,ferr] [,berr] [,rcond]
[,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the Cholesky factorization A = U^T*U (real flavors) / A = U^H*U (complex flavors) or A = L*L^T (real
flavors) / A = L*L^H (complex flavors) to compute the solution to a real or complex system of linear equations
A*X = B, where A is an n-by-n real symmetric/Hermitian positive definite matrix, the columns of matrix B are
individual right-hand sides, and the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?posvx performs the following steps:
1. If fact = 'E', real scaling factors s are computed to equilibrate the system:
diag(s)*A*diag(s)*inv(diag(s))*X = diag(s)*B.
Whether or not the system will be equilibrated depends on the scaling of the matrix A, but if
equilibration is used, A is overwritten by diag(s)*A*diag(s) and B by diag(s)*B.
2. If fact = 'N' or 'E', the Cholesky decomposition is used to factor the matrix A (after equilibration if
fact = 'E') as
A = U^T*U (real), A = U^H*U (complex), if uplo = 'U',
or A = L*L^T (real), A = L*L^H (complex), if uplo = 'L',
3. If the leading i-by-i principal minor is not positive-definite, then the routine returns with info = i.
Otherwise, the factored form of A is used to estimate the condition number of the matrix A. If the
reciprocal of the condition number is less than machine precision, info = n + 1 is returned as a
warning, but the routine still goes on to solve for X and compute error bounds as described below.
4. The system of equations is solved for X using the factored form of A.
5. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
backward error estimates for it.
6. If equilibration was used, the matrix X is premultiplied by diag(s) so that it solves the original system
before equilibration.
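The effect of the equilibration in step 1 can be seen on a small example. The sketch below forms diag(s)*A*diag(s) for an illustrative 2-by-2 matrix. The choice s(i) = 1/sqrt(a(i,i)) is an assumption modeled on what the companion routine ?poequ computes; the driver decides on its own whether scaling is actually applied.

```fortran
program equilibrate_sketch
  implicit none
  integer, parameter :: n = 2
  double precision :: a(n, n), as(n, n), s(n)
  integer :: i, j

  ! Illustrative badly scaled symmetric positive-definite matrix.
  a(1, :) = [4.0d0,   2.0d0]
  a(2, :) = [2.0d0, 100.0d0]

  ! Assumed scaling: s(i) = 1/sqrt(a(i,i)) gives the scaled matrix
  ! a unit diagonal.
  do i = 1, n
    s(i) = 1.0d0 / sqrt(a(i, i))
  end do

  ! Form diag(s)*A*diag(s) element by element.
  do j = 1, n
    do i = 1, n
      as(i, j) = s(i) * a(i, j) * s(j)
    end do
  end do

  do i = 1, n
    print '(2f10.6)', as(i, :)
  end do
end program equilibrate_sketch
```

Solving the scaled system yields inv(diag(s))*X, which is why step 6 premultiplies the result by diag(s) to recover X for the original system.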
Input Parameters
The array af is an input argument if fact = 'F'. It contains the
triangular factor U or L from the Cholesky factorization of A in the
same storage format as A. If equed is not 'N', then af is the factored
form of the equilibrated matrix diag(s)*A*diag(s). The second
dimension of af must be at least max(1,n).
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1, nrhs).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
iwork INTEGER. Workspace array, size at least max(1, n); used in real
flavors only.
Output Parameters
Array, size at least max(1, nrhs). Contains the component-wise
relative backward error for each solution vector xj, that is, the
smallest relative change in any element of A or B that makes xj an
exact solution.
If info = i, and i ≤ n, the leading minor of order i (and therefore the
matrix A itself) is not positive-definite, so the factorization could not
be completed, and the solution and error bounds could not be
computed; rcond = 0 is returned.
If info = i, and i = n + 1, then U is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to
working precision. Nevertheless, the solution and error bounds are
computed because there are a number of situations where the
computed solution can be more accurate than the value of rcond
would suggest.
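A caller typically branches on these info ranges after the call. The following sketch shows one way to do so; the helper describe_info and its messages are hypothetical, and the negative-info branch follows the general LAPACK convention that info = -i flags an illegal i-th argument.

```fortran
program info_sketch
  implicit none
  integer, parameter :: n = 5

  print *, trim(describe_info(0, n))
  print *, trim(describe_info(3, n))
  print *, trim(describe_info(n + 1, n))
  print *, trim(describe_info(-2, n))

contains

  ! Hypothetical helper: map the info value returned by ?posvx for a
  ! matrix of order n to a short description.
  function describe_info(info, n) result(msg)
    integer, intent(in) :: info, n
    character(len=64) :: msg

    if (info == 0) then
      msg = 'success'
    else if (info < 0) then
      msg = 'illegal argument'
    else if (info <= n) then
      msg = 'leading minor not positive-definite; no solution computed'
    else
      msg = 'rcond below machine precision; solution computed (warning)'
    end if
  end function describe_info

end program info_sketch
```

Only the info = n + 1 case leaves a usable solution and error bounds in the output arrays.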
s Holds the vector of length n. Default value for each element is s(i) =
1.0_WP.
fact Must be 'N', 'E', or 'F'. The default value is 'N'. If fact = 'F',
then af must be present; otherwise, an error is returned.
See Also
Matrix Storage Schemes for LAPACK Routines
?posvxx
Uses extra precise iterative refinement to compute the
solution to the system of linear equations with a
symmetric or Hermitian positive-definite coefficient
matrix A applying the Cholesky factorization.
Syntax
call sposvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork,
info )
call dposvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork,
info )
call cposvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork,
info )
call zposvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork,
info )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the Cholesky factorization A = U^T*U (real flavors) / A = U^H*U (complex flavors) or A = L*L^T (real
flavors) / A = L*L^H (complex flavors) to compute the solution to a real or complex system of linear equations
A*X = B, where A is an n-by-n real symmetric/Hermitian positive definite matrix, the columns of matrix B
are individual right-hand sides, and the columns of X are the corresponding solutions.
Both normwise and maximum componentwise error bounds are also provided on request. The routine returns
a solution with a small guaranteed error (O(eps), where eps is the working machine precision) unless the
matrix is very ill-conditioned, in which case a warning is returned. Relevant condition numbers are also
calculated and returned.
The routine accepts user-provided factorizations and equilibration factors; see definitions of the fact and
equed options. Solving with refinement and using a factorization from a previous call of the routine also
produces a solution with O(eps) errors or warnings but that may not be true for general user-provided
factorizations and equilibration factors if they differ from what the routine would itself produce.
The routine ?posvxx performs the following steps:
3. If the leading i-by-i principal minor is not positive-definite, the routine returns with info = i.
Otherwise, the factored form of A is used to estimate the condition number of the matrix A (see the
rcond parameter). If the reciprocal of the condition number is less than machine precision, the routine
still goes on to solve for X and compute error bounds.
4. The system of equations is solved for X using the factored form of A.
5. By default, unless params(1) is set to zero, the routine applies iterative refinement to get a small error
and error bounds. Refinement calculates the residual to at least twice the working precision.
6. If equilibration was used, the matrix X is premultiplied by diag(s) so that it solves the original system
before equilibration.
Input Parameters
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
The array a contains the matrix A as specified by uplo. If fact = 'F' and
equed = 'Y', then A must have been equilibrated by the scaling factors in
s, and a must contain the equilibrated matrix diag(s)*A*diag(s). The
second dimension of a must be at least max(1,n).
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work(*) is a workspace array. The dimension of work must be at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex flavors.
ldaf INTEGER. The leading dimension of the array af; ldaf ≥ max(1,n).
If equed = 'Y', both row and column equilibration was done, that is, A has
been replaced by diag(s)*A*diag(s).
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right hand side and
each type (normwise or componentwise). See err_bnds_norm and
err_bnds_comp descriptions in the Output Arguments section below.
=0.0 No refinement is performed and no error bounds
are computed.
Default 10.0
iwork INTEGER. Workspace array, size at least max(1, n); used in real flavors
only.
Output Parameters
a Array a is not modified on exit if fact = 'F' or 'N', or if fact = 'E' and
equed = 'N'.
If fact = 'E' and equed = 'Y', A is overwritten by diag(s)*A*diag(s).
The array is indexed by the type of error information as described below.
There are currently up to three pieces of information returned.
The first index in err_bnds_norm(i,:) corresponds to the i-th right-hand
side.
The second index in err_bnds_norm(:,err) contains the following three
fields:
params If an entry is less than 0.0, that entry is filled with the default value used
for that parameter, otherwise the entry is not modified.
If 0 < info ≤ n: U(info,info) is exactly zero. The factorization has been
completed, but the factor U is exactly singular, so the solution and error
bounds could not be computed; rcond = 0 is returned.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0 or err_bnds_comp(j,1) = 0.0). See the definition of err_bnds_norm
and err_bnds_comp for err = 1. To get information about all of the right-
hand sides, check err_bnds_norm or err_bnds_comp.
See Also
Matrix Storage Schemes for LAPACK Routines
?ppsv
Computes the solution to the system of linear
equations with a symmetric (Hermitian) positive
definite packed coefficient matrix A and multiple right-
hand sides.
Syntax
call sppsv( uplo, n, nrhs, ap, b, ldb, info )
call dppsv( uplo, n, nrhs, ap, b, ldb, info )
call cppsv( uplo, n, nrhs, ap, b, ldb, info )
call zppsv( uplo, n, nrhs, ap, b, ldb, info )
call ppsv( ap, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n real
symmetric/Hermitian positive-definite matrix stored in packed format, the columns of matrix B are individual
right-hand sides, and the columns of X are the corresponding solutions.
The Cholesky decomposition is used to factor A as
A = U^T*U (real flavors) and A = U^H*U (complex flavors), if uplo = 'U'
or A = L*L^T (real flavors) and A = L*L^H (complex flavors), if uplo = 'L',
where U is an upper triangular matrix and L is a lower triangular matrix. The factored form of A is then used
to solve the system of equations A*X = B.
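The following sketch packs the upper triangle of a small matrix column by column and passes it to dppsv. The matrix, the right-hand side, and the program name are illustrative assumptions; the index mapping a(i,j) -> ap(i + j*(j-1)/2) for i ≤ j is the standard LAPACK upper packed layout, and linking against Intel MKL or another LAPACK implementation is assumed.

```fortran
program ppsv_sketch
  implicit none
  integer, parameter :: n = 3, nrhs = 1
  double precision :: a(n, n), ap(n*(n+1)/2), b(n, nrhs)
  integer :: i, j, info
  external :: dppsv

  ! Illustrative symmetric positive-definite matrix.
  a(1, :) = [4.0d0, 1.0d0, 0.0d0]
  a(2, :) = [1.0d0, 4.0d0, 1.0d0]
  a(3, :) = [0.0d0, 1.0d0, 4.0d0]

  ! Pack the upper triangle by columns: a(i,j) -> ap(i + j*(j-1)/2), i <= j.
  do j = 1, n
    do i = 1, j
      ap(i + j*(j-1)/2) = a(i, j)
    end do
  end do

  ! Right-hand side chosen so that the exact solution is (1, 2, 3).
  b(:, 1) = [6.0d0, 12.0d0, 14.0d0]

  ! On exit, b holds the solution and ap holds the packed Cholesky factor.
  call dppsv('U', n, nrhs, ap, b, n, info)

  if (info == 0) then
    print '(3f10.6)', b(:, 1)
  else
    print *, 'dppsv returned info = ', info
  end if
end program ppsv_sketch
```

Packed storage needs n*(n+1)/2 elements instead of n*n, which is the reason to prefer ?ppsv over ?posv when memory is tight.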
Input Parameters
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
Output Parameters
See Also
Matrix Storage Schemes for LAPACK Routines
?ppsvx
Uses the Cholesky factorization to compute the
solution to the system of linear equations with a
symmetric (Hermitian) positive definite packed
coefficient matrix A, and provides error bounds on the
solution.
Syntax
call sppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond, ferr,
berr, work, iwork, info )
call dppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond, ferr,
berr, work, iwork, info )
call cppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond, ferr,
berr, work, rwork, info )
call zppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond, ferr,
berr, work, rwork, info )
call ppsvx( ap, b, x [,uplo] [,af] [,fact] [,equed] [,s] [,ferr] [,berr] [,rcond]
[,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the Cholesky factorization A = U^T*U (real flavors) / A = U^H*U (complex flavors) or A = L*L^T (real
flavors) / A = L*L^H (complex flavors) to compute the solution to a real or complex system of linear equations
A*X = B, where A is an n-by-n symmetric or Hermitian positive-definite matrix stored in packed format, the
columns of matrix B are individual right-hand sides, and the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?ppsvx performs the following steps:
1. If fact = 'E', real scaling factors s are computed to equilibrate the system:
diag(s)*A*diag(s)*inv(diag(s))*X = diag(s)*B.
Whether or not the system will be equilibrated depends on the scaling of the matrix A, but if
equilibration is used, A is overwritten by diag(s)*A*diag(s) and B by diag(s)*B.
2. If fact = 'N' or 'E', the Cholesky decomposition is used to factor the matrix A (after equilibration if
fact = 'E') as
A = U^T*U (real), A = U^H*U (complex), if uplo = 'U',
or A = L*L^T (real), A = L*L^H (complex), if uplo = 'L',
Input Parameters
The array afp is an input argument if fact = 'F' and contains the
triangular factor U or L from the Cholesky factorization of A in the
same storage format as A. If equed is not 'N', then afp is the
factored form of the equilibrated matrix A.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
work(*) is a workspace array.
The dimension of arrays ap and afp must be at least max(1, n*(n+1)/2);
the second dimension of b must be at least max(1,nrhs);
the dimension of work must be at least max(1, 3*n) for real flavors
and max(1, 2*n) for complex flavors.
equed CHARACTER*1. Must be 'N' or 'Y'.
equed is an input argument if fact = 'F'. It specifies the form of
equilibration that was done:
if equed = 'N', no equilibration was done (always true if fact =
'N');
if equed = 'Y', equilibration was done, that is, A has been replaced
by diag(s)*A*diag(s).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
iwork INTEGER. Workspace array, size at least max(1, n); used in real
flavors only.
Output Parameters
afp If fact = 'N' or 'E', then afp is an output argument and on exit
returns the triangular factor U or L from the Cholesky factorization
A = U^T*U or A = L*L^T (real routines), A = U^H*U or A = L*L^H (complex
routines) of the original matrix A (if fact = 'N'), or of the
equilibrated matrix A (if fact = 'E'). See the description of ap for
the form of the equilibrated matrix.
If info = i, and i ≤ n, the leading minor of order i (and therefore the
matrix A itself) is not positive-definite, so the factorization could not
be completed, and the solution and error bounds could not be
computed; rcond = 0 is returned.
If info = i, and i = n + 1, then U is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to
working precision. Nevertheless, the solution and error bounds are
computed because there are a number of situations where the
computed solution can be more accurate than the value of rcond
would suggest.
s Holds the vector of length n. Default value for each element is s(i) =
1.0_WP.
fact Must be 'N', 'E', or 'F'. The default value is 'N'. If fact = 'F',
then af must be present; otherwise, an error is returned.
See Also
Matrix Storage Schemes for LAPACK Routines
?pbsv
Computes the solution to the system of linear
equations with a symmetric or Hermitian positive-
definite band coefficient matrix A and multiple right-
hand sides.
Syntax
call spbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call dpbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call cpbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call zpbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call pbsv( ab, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n
symmetric/Hermitian positive definite band matrix, the columns of matrix B are individual right-hand sides,
and the columns of X are the corresponding solutions.
The Cholesky decomposition is used to factor A as
A = U^T*U (real flavors) and A = U^H*U (complex flavors), if uplo = 'U'
or A = L*L^T (real flavors) and A = L*L^H (complex flavors), if uplo = 'L',
where U is an upper triangular band matrix and L is a lower triangular band matrix, with the same number of
superdiagonals or subdiagonals as A. The factored form of A is then used to solve the system of equations
A*X = B.
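The sketch below stores a small symmetric positive-definite tridiagonal matrix (kd = 1) in upper band storage and solves it with dpbsv. The data and the program name are illustrative assumptions; the mapping a(i,j) -> ab(kd+1+i-j, j) for max(1, j-kd) ≤ i ≤ j is the standard LAPACK upper band layout, and linking against Intel MKL or another LAPACK implementation is assumed.

```fortran
program pbsv_sketch
  implicit none
  integer, parameter :: n = 3, kd = 1, nrhs = 1, ldab = kd + 1
  double precision :: a(n, n), ab(ldab, n), b(n, nrhs)
  integer :: i, j, info
  external :: dpbsv

  ! Illustrative symmetric positive-definite band matrix, one superdiagonal.
  a(1, :) = [4.0d0, 1.0d0, 0.0d0]
  a(2, :) = [1.0d0, 4.0d0, 1.0d0]
  a(3, :) = [0.0d0, 1.0d0, 4.0d0]

  ! Upper band storage: a(i,j) -> ab(kd+1+i-j, j), max(1, j-kd) <= i <= j.
  ! Entries of ab outside the band (here ab(1,1)) are not referenced.
  ab = 0.0d0
  do j = 1, n
    do i = max(1, j - kd), j
      ab(kd + 1 + i - j, j) = a(i, j)
    end do
  end do

  ! Right-hand side chosen so that the exact solution is (1, 2, 3).
  b(:, 1) = [6.0d0, 12.0d0, 14.0d0]

  ! On exit, b holds the solution and ab holds the band Cholesky factor.
  call dpbsv('U', n, kd, nrhs, ab, ldab, b, n, info)

  if (info == 0) then
    print '(3f10.6)', b(:, 1)
  else
    print *, 'dpbsv returned info = ', info
  end if
end program pbsv_sketch
```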
Input Parameters
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ kd + 1.
Output Parameters
See Also
Matrix Storage Schemes for LAPACK Routines
?pbsvx
Uses the Cholesky factorization to compute the
solution to the system of linear equations with a
symmetric (Hermitian) positive-definite band
coefficient matrix A, and provides error bounds on the
solution.
Syntax
call spbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, iwork, info )
call dpbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, iwork, info )
call cpbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, rwork, info )
call zpbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, rwork, info )
call pbsvx( ab, b, x [,uplo] [,afb] [,fact] [,equed] [,s] [,ferr] [,berr] [,rcond]
[,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the Cholesky factorization A = U^T*U (real flavors) / A = U^H*U (complex flavors) or A = L*L^T (real
flavors) / A = L*L^H (complex flavors) to compute the solution to a real or complex system of linear equations
A*X = B, where A is an n-by-n symmetric or Hermitian positive definite band matrix, the columns of matrix B
are individual right-hand sides, and the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?pbsvx performs the following steps:
1. If fact = 'E', real scaling factors s are computed to equilibrate the system:
diag(s)*A*diag(s)*inv(diag(s))*X = diag(s)*B.
Whether or not the system will be equilibrated depends on the scaling of the matrix A, but if
equilibration is used, A is overwritten by diag(s)*A*diag(s) and B by diag(s)*B.
2. If fact = 'N' or 'E', the Cholesky decomposition is used to factor the matrix A (after equilibration if
fact = 'E') as
A = U^T*U (real), A = U^H*U (complex), if uplo = 'U',
or A = L*L^T (real), A = L*L^H (complex), if uplo = 'L',
where U is an upper triangular band matrix and L is a lower triangular band matrix.
3. If the leading i-by-i principal minor is not positive definite, then the routine returns with info = i.
Otherwise, the factored form of A is used to estimate the condition number of the matrix A. If the
reciprocal of the condition number is less than machine precision, info = n+1 is returned as a
warning, but the routine still goes on to solve for X and compute error bounds as described below.
4. The system of equations is solved for X using the factored form of A.
5. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
backward error estimates for it.
6. If equilibration was used, the matrix X is premultiplied by diag(s) so that it solves the original system
before equilibration.
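The steps above can be sketched with the double precision flavor. This is a minimal illustration, assuming the program is linked against Intel MKL or another LAPACK implementation. The 4-by-4 tridiagonal matrix, the right-hand side, and the program name are illustrative values that do not come from this reference. The upper band storage layout used for ab follows the standard LAPACK convention (ab(kd+1+i-j, j) = A(i,j) for uplo = 'U').

```fortran
program pbsvx_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, kd = 1, nrhs = 1
  integer, parameter :: ldab = kd + 1
  real(dp) :: ab(ldab,n), afb(ldab,n), s(n)
  real(dp) :: b(n,nrhs), x(n,nrhs)
  real(dp) :: rcond, ferr(nrhs), berr(nrhs), work(3*n)
  integer  :: iwork(n), info
  character(len=1) :: equed
  external :: dpbsvx

  ! Illustrative matrix A = tridiag(-1, 2, -1) in upper band storage:
  ! ab(kd+1,j) = A(j,j) and ab(kd,j) = A(j-1,j); ab(1,1) is not referenced.
  ab(1,:) = -1.0_dp
  ab(2,:) =  2.0_dp
  b(:,1)  = [1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp]   ! equals A*(1,1,1,1)^T

  ! fact = 'E': equilibrate if necessary, factor, solve, and refine.
  call dpbsvx('E', 'U', n, kd, nrhs, ab, ldab, afb, ldab, equed, s, &
              b, n, x, n, rcond, ferr, berr, work, iwork, info)

  if (info < 0) then
     print *, 'argument number', -info, 'had an illegal value'
  else if (info > 0 .and. info <= n) then
     print *, 'leading minor of order', info, 'is not positive definite'
  else
     if (info == n + 1) print *, 'warning: rcond is below machine precision'
     print *, 'equed = ', equed, '  rcond =', rcond
     print *, 'x     =', x(:,1)
     print *, 'ferr  =', ferr(1), '  berr =', berr(1)
  end if
end program pbsvx_sketch
```

With these inputs the computed x should be close to (1, 1, 1, 1). If equed = 'Y' is returned, ab and b have been overwritten by their equilibrated forms, so they should not be reused as the original data.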
Input Parameters
n INTEGER. The order of matrix A; n ≥ 0.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1, nrhs).
Array, size (n). The array s contains the scale factors for A. This array
is an input argument if fact = 'F' only; otherwise it is an output
argument.
If equed = 'N', s is not accessed.
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
iwork INTEGER. Workspace array, size at least max(1, n); used in real
flavors only.
Output Parameters
afb If fact = 'N' or 'E', then afb is an output argument and on exit
returns the triangular factor U or L from the Cholesky factorization
A = U^T*U or A = L*L^T (real routines), A = U^H*U or A = L*L^H (complex
routines) of the original matrix A (if fact = 'N'), or of the
equilibrated matrix A (if fact = 'E'). See the description of ab for
the form of the equilibrated matrix.
An estimate of the reciprocal condition number of the matrix A after
equilibration (if done). If rcond is less than the machine precision (in
particular, if rcond = 0), the matrix is singular to working precision.
This condition is indicated by a return code of info > 0.
If info = i, and i ≤ n, the leading minor of order i (and therefore the
matrix A itself) is not positive definite, so the factorization could not
be completed, and the solution and error bounds could not be
computed; rcond = 0 is returned. If info = i, and i = n + 1, then U
is nonsingular, but rcond is less than machine precision, meaning that
the matrix is singular to working precision. Nevertheless, the solution
and error bounds are computed because there are a number of
situations where the computed solution can be more accurate than the
value of rcond would suggest.
s Holds the vector with the number of elements n. Default value for
each element is s(i) = 1.0_WP.
fact Must be 'N', 'E', or 'F'. The default value is 'N'. If fact = 'F',
then afb must be present; otherwise, an error is returned.
See Also
Matrix Storage Schemes for LAPACK Routines
?ptsv
Computes the solution to the system of linear
equations with a symmetric or Hermitian positive
definite tridiagonal coefficient matrix A and multiple
right-hand sides.
Syntax
call sptsv( n, nrhs, d, e, b, ldb, info )
call dptsv( n, nrhs, d, e, b, ldb, info )
call cptsv( n, nrhs, d, e, b, ldb, info )
call zptsv( n, nrhs, d, e, b, ldb, info )
call ptsv( d, e, b [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n
symmetric/Hermitian positive-definite tridiagonal matrix, the columns of matrix B are individual right-hand
sides, and the columns of X are the corresponding solutions.
A is factored as A = L*D*L^T (real flavors) or A = L*D*L^H (complex flavors), and the factored form of A is
then used to solve the system of equations A*X = B.
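A minimal sketch of a call to the double precision flavor follows, assuming the program is linked against Intel MKL or another LAPACK implementation. The matrix, the right-hand side, and the program name are illustrative values chosen here, not taken from this reference.

```fortran
program ptsv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, nrhs = 1
  real(dp) :: d(n), e(n-1), b(n,nrhs)
  integer  :: info
  external :: dptsv

  ! Illustrative matrix A = tridiag(-1, 2, -1).
  d = 2.0_dp                                   ! diagonal elements of A
  e = -1.0_dp                                  ! subdiagonal elements of A
  b(:,1) = [1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp]    ! equals A*(1,1,1,1)^T

  ! On exit d and e hold the factors D and L, and b holds the solution X.
  call dptsv(n, nrhs, d, e, b, n, info)

  if (info /= 0) then
     print *, 'dptsv did not compute a solution, info =', info
     stop 1
  end if
  print *, 'x =', b(:,1)
end program ptsv_sketch
```

Because d, e, and b are overwritten, keep copies if the original matrix or right-hand sides are needed afterwards. With these inputs the computed x should be close to (1, 1, 1, 1).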
Input Parameters
DOUBLE PRECISION for double precision flavors.
Array, dimension at least max(1, n). Contains the diagonal elements
of the tridiagonal matrix A.
Output Parameters
See Also
Matrix Storage Schemes for LAPACK Routines
?ptsvx
Uses factorization to compute the solution to the
system of linear equations with a symmetric
(Hermitian) positive definite tridiagonal coefficient
matrix A, and provides error bounds on the solution.
Syntax
call sptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr, work,
info )
call dptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr, work,
info )
call cptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr, work,
rwork, info )
call zptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr, work,
rwork, info )
call ptsvx( d, e, b, x [,df] [,ef] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the Cholesky factorization A = L*D*L^T (real)/A = L*D*L^H (complex) to compute the
solution to a real or complex system of linear equations A*X = B, where A is an n-by-n symmetric or
Hermitian positive definite tridiagonal matrix, the columns of matrix B are individual right-hand sides, and
the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?ptsvx performs the following steps:
1. If fact = 'N', the matrix A is factored as A = L*D*L^T (real flavors)/A = L*D*L^H (complex flavors),
where L is a unit lower bidiagonal matrix and D is diagonal. The factorization can also be regarded as
having the form A = U^T*D*U (real flavors)/A = U^H*D*U (complex flavors).
2. If the leading i-by-i principal minor is not positive-definite, then the routine returns with info = i.
Otherwise, the factored form of A is used to estimate the condition number of the matrix A. If the
reciprocal of the condition number is less than machine precision, info = n+1 is returned as a
warning, but the routine still goes on to solve for X and compute error bounds as described below.
3. The system of equations is solved for X using the factored form of A.
4. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
backward error estimates for it.
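The steps above can be sketched with the double precision flavor, with the info cases handled in the order the steps describe. This is a minimal illustration, assuming linkage against Intel MKL or another LAPACK implementation. The matrix, the right-hand side, and the program name are illustrative values that do not come from this reference.

```fortran
program ptsvx_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, nrhs = 1
  real(dp) :: d(n), e(n-1), df(n), ef(n-1)
  real(dp) :: b(n,nrhs), x(n,nrhs)
  real(dp) :: rcond, ferr(nrhs), berr(nrhs), work(2*n)
  integer  :: info
  external :: dptsvx

  ! Illustrative matrix A = tridiag(-1, 2, -1).
  d = 2.0_dp
  e = -1.0_dp
  b(:,1) = [1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp]    ! equals A*(1,1,1,1)^T

  ! fact = 'N': d and e are left unchanged; the factors are returned in df, ef.
  call dptsvx('N', n, nrhs, d, e, df, ef, b, n, x, n, &
              rcond, ferr, berr, work, info)

  if (info < 0) then
     print *, 'argument number', -info, 'had an illegal value'
  else if (info > 0 .and. info <= n) then
     print *, 'leading minor of order', info, 'is not positive definite'
  else
     if (info == n + 1) print *, 'warning: rcond is below machine precision'
     print *, 'x     =', x(:,1)
     print *, 'rcond =', rcond
     print *, 'ferr  =', ferr(1), '  berr =', berr(1)
  end if
end program ptsvx_sketch
```

Unlike ?ptsv, the expert driver leaves d, e, and b intact and returns the solution in a separate array x.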
Input Parameters
n INTEGER. The order of matrix A; n ≥ 0.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
The array work is a workspace array. The dimension of work must be
at least 2*n for real flavors, and at least n for complex flavors.
Output Parameters
df, ef These arrays are output arguments if fact = 'N'. See the
description of df, ef in the Input Parameters section.
If info = i, and i ≤ n, the leading minor of order i (and therefore the
matrix A itself) is not positive-definite, so the factorization could not
be completed, and the solution and error bounds could not be
computed; rcond = 0 is returned.
If info = i, and i = n + 1, then U is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to
working precision. Nevertheless, the solution and error bounds are
computed because there are a number of situations where the
computed solution can be more accurate than the value of rcond
would suggest.
b Holds the matrix B of size (n,nrhs).
fact Must be 'N' or 'F'. The default value is 'N'. If fact = 'F', then
both arguments df and ef must be present; otherwise, an error is
returned.
See Also
Matrix Storage Schemes for LAPACK Routines
?sysv
Computes the solution to the system of linear
equations with a real or complex symmetric coefficient
matrix A and multiple right-hand sides.
Syntax
call ssysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call dsysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call csysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zsysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call sysv( a, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n
symmetric matrix, the columns of matrix B are individual right-hand sides, and the columns of X are the
corresponding solutions.
The diagonal pivoting method is used to factor A as A = U*D*U^T or A = L*D*L^T, where U (or L) is a product
of permutation and unit upper (lower) triangular matrices, and D is symmetric and block diagonal with
1-by-1 and 2-by-2 diagonal blocks.
The factored form of A is then used to solve the system of equations A*X = B.
Input Parameters
The array a contains the upper or the lower triangular part of the
symmetric matrix A (see uplo). The second dimension of a must be at
least max(1, n).
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D, as determined by ?sytrf.
If ipiv(i) = k > 0, then d(i,i) is a 1-by-1 diagonal block, and the i-th
row and column of A were interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
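The encoding of ipiv described above can be read back with a single upward pass, because the two entries of a 2-by-2 block are both negative for either value of uplo. The following self-contained sketch prints the block structure of D. The pivot vector and the program name are hypothetical, chosen only to show one 2-by-2 block; they are not output from a real factorization.

```fortran
program ipiv_blocks
  implicit none
  integer, parameter :: n = 5
  integer :: ipiv(n), i

  ! Hypothetical pivot vector of the kind returned for uplo = 'L':
  ! rows/columns 2 and 3 form a 2-by-2 block.
  ipiv = [1, -4, -4, 4, 5]

  i = 1
  do while (i <= n)
     if (ipiv(i) > 0) then
        ! Positive entry: 1-by-1 block, row/column i swapped with ipiv(i).
        print '(a,i0,a,i0)', '1-by-1 block at ', i, ', interchanged with ', ipiv(i)
        i = i + 1
     else if (i < n) then
        ! Negative pair: 2-by-2 block occupying rows/columns i and i+1.
        print '(a,i0,a,i0)', '2-by-2 block at ', i, ' and ', i + 1
        i = i + 2
     else
        print '(a)', 'malformed ipiv: unpaired negative entry'
        exit
     end if
  end do
end program ipiv_blocks
```

For this vector the sketch reports a 1-by-1 block at 1, a 2-by-2 block at 2 and 3, and 1-by-1 blocks at 4 and 5.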
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for subsequent
runs.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork, that is, a value no less than the minimal value
described, the routine completes the task, though possibly not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
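The workspace query described above can be sketched as two calls: one with lwork = -1 to obtain the recommended size, and one with an array of that size to solve the system. This is a minimal illustration using the double precision flavor, assuming linkage against Intel MKL or another LAPACK implementation. The 3-by-3 matrix, the right-hand side, and the names wquery and sysv_query are illustrative and do not come from this reference.

```fortran
program sysv_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n,n), b(n,nrhs), wquery(1)
  real(dp), allocatable :: work(:)
  integer  :: ipiv(n), lwork, info
  external :: dsysv

  ! Illustrative symmetric indefinite matrix (zero diagonal, nonsingular).
  a = reshape([0.0_dp, 1.0_dp, 2.0_dp, &
               1.0_dp, 0.0_dp, 3.0_dp, &
               2.0_dp, 3.0_dp, 0.0_dp], [n, n])
  b(:,1) = [3.0_dp, 4.0_dp, 5.0_dp]            ! equals A*(1,1,1)^T

  ! Workspace query: lwork = -1 returns the recommended size in wquery(1)
  ! without factoring a or touching b.
  call dsysv('U', n, nrhs, a, n, ipiv, b, n, wquery, -1, info)
  if (info /= 0) stop 1
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))

  ! Actual solve with the recommended workspace; b is overwritten by X.
  call dsysv('U', n, nrhs, a, n, ipiv, b, n, work, lwork, info)
  if (info /= 0) then
     print *, 'dsysv did not compute a solution, info =', info
     stop 1
  end if
  print *, 'lwork used =', lwork
  print *, 'x =', b(:,1)
end program sysv_query
```

With these inputs the computed x should be close to (1, 1, 1). The value returned by the query depends on the library build, so it is not hard-coded here.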
See Also
Matrix Storage Schemes for LAPACK Routines
?sysv_rook
Computes the solution to the system of linear
equations with a real or complex symmetric coefficient
matrix A and multiple right-hand sides.
Syntax
call ssysv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call dsysv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call csysv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zsysv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n
symmetric matrix, the columns of matrix B are individual right-hand sides, and the columns of X are the
corresponding solutions.
The diagonal pivoting method is used to factor A as A = U*D*U^T or A = L*D*L^T, where U (or L) is a product
of permutation and unit upper (lower) triangular matrices, and D is symmetric and block diagonal with
1-by-1 and 2-by-2 diagonal blocks.
The routine calls ?sytrf_rook to compute the factorization of the real or complex symmetric matrix A using the
bounded Bunch-Kaufman ("rook") diagonal pivoting method.
The factored form of A is then used to solve the system of equations A*X = B.
Input Parameters
The array a contains the upper or the lower triangular part of the
symmetric matrix A (see uplo). The second dimension of a must be at
least max(1, n).
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla. See Application Notes below for details and for the
suggested value of lwork.
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D, as determined by ?sytrf_rook.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were
interchanged and D(k,k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) < 0 and ipiv(k-1) < 0, then rows
and columns k and -ipiv(k) were interchanged, rows and columns
k-1 and -ipiv(k-1) were interchanged, and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) < 0 and ipiv(k+1) < 0, then rows
and columns k and -ipiv(k) were interchanged, rows and columns
k+1 and -ipiv(k+1) were interchanged, and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
See Also
Matrix Storage Schemes for LAPACK Routines
?sysvx
Uses the diagonal pivoting factorization to compute
the solution to the system of linear equations with a
real or complex symmetric coefficient matrix A, and
provides error bounds on the solution.
Syntax
call ssysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, iwork, info )
call dsysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, iwork, info )
call csysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, rwork, info )
call zsysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, rwork, info )
call sysvx( a, b, x [,uplo] [,af] [,ipiv] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a real or complex system of
linear equations A*X = B, where A is an n-by-n symmetric matrix, the columns of matrix B are individual
right-hand sides, and the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?sysvx performs the following steps:
1. If fact = 'N', the diagonal pivoting method is used to factor the matrix A. The form of the
factorization is A = U*D*U^T or A = L*D*L^T, where U (or L) is a product of permutation and unit upper
(lower) triangular matrices, and D is symmetric and block diagonal with 1-by-1 and 2-by-2 diagonal
blocks.
2. If some d(i,i) = 0, so that D is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n+1 is returned as a warning, but the routine
still goes on to solve for X and compute error bounds as described below.
3. The system of equations is solved for X using the factored form of A.
4. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
backward error estimates for it.
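The steps above, together with the option of reusing a factorization, can be sketched as follows: a first call with fact = 'N' factors A into af and ipiv, and a second call with fact = 'F' passes those arrays back as inputs for a new right-hand side. This is a minimal illustration using the double precision flavor, assuming linkage against Intel MKL or another LAPACK implementation. The matrix, the right-hand sides, the fixed lwork = 64*n, and the helper name report are illustrative choices, not taken from this reference.

```fortran
program sysvx_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1, lwork = 64*n
  real(dp) :: a(n,n), af(n,n), b(n,nrhs), x(n,nrhs)
  real(dp) :: rcond, ferr(nrhs), berr(nrhs), work(lwork)
  integer  :: ipiv(n), iwork(n), info
  external :: dsysvx

  ! Illustrative symmetric indefinite matrix (zero diagonal, nonsingular).
  a = reshape([0.0_dp, 1.0_dp, 2.0_dp, &
               1.0_dp, 0.0_dp, 3.0_dp, &
               2.0_dp, 3.0_dp, 0.0_dp], [n, n])

  ! First solve, fact = 'N': A is copied to af and factored there.
  b(:,1) = [3.0_dp, 4.0_dp, 5.0_dp]            ! equals A*(1,1,1)^T
  call dsysvx('N', 'U', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              rcond, ferr, berr, work, lwork, iwork, info)
  call report('fact = N')

  ! Second solve, fact = 'F': af and ipiv from the first call are inputs.
  b(:,1) = [1.0_dp, 1.0_dp, 5.0_dp]            ! equals A*(1,1,0)^T
  call dsysvx('F', 'U', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              rcond, ferr, berr, work, lwork, iwork, info)
  call report('fact = F')

contains

  subroutine report(label)
    character(len=*), intent(in) :: label
    ! info < 0: illegal argument; 0 < info <= n: D is exactly singular.
    if (info < 0 .or. (info > 0 .and. info <= n)) then
       print *, label, ': no solution computed, info =', info
       stop 1
    end if
    if (info == n + 1) print *, label, ': warning, rcond below machine precision'
    print *, label, ': x =', x(:,1), '  rcond =', rcond
  end subroutine report

end program sysvx_sketch
```

The first solution should be close to (1, 1, 1) and the second close to (1, 1, 0). The array a must still be passed in the second call because iterative refinement uses the original matrix.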
Input Parameters
If fact = 'N', the matrix A will be copied to af and factored.
The array a contains the upper or the lower triangular part of the
symmetric matrix A (see uplo). The second dimension of a must be at
least max(1,n).
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if
fact = 'F'. It contains details of the interchanges and the block
structure of D, as determined by ?sytrf.
If ipiv(i) = k > 0, then d(i,i) is a 1-by-1 diagonal block, and the i-th
row and column of A were interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a
2-by-2 block in rows/columns i and i-1, and the (i-1)-th row and column
of A were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a
2-by-2 block in rows/columns i and i+1, and the (i+1)-th row and column
of A were interchanged with the m-th row and column.
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
iwork INTEGER. Workspace array, size at least max(1, n); used in real
flavors only.
Output Parameters
berr REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size at least max(1, nrhs). Contains the component-wise
relative backward error for each solution vector x(j), that is, the
smallest relative change in any element of A or B that makes x(j) an
exact solution.
If info = i, and i ≤ n, then d(i,i) is exactly zero. The factorization has
been completed, but the block diagonal matrix D is exactly singular,
so the solution and error bounds could not be computed; rcond = 0 is
returned.
If info = i, and i = n + 1, then D is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to
working precision. Nevertheless, the solution and error bounds are
computed because there are a number of situations where the
computed solution can be more accurate than the value of rcond
would suggest.
fact Must be 'N' or 'F'. The default value is 'N'. If fact = 'F', then
both arguments af and ipiv must be present; otherwise, an error is
returned.
Application Notes
The value of lwork must be at least max(1,m*n), where for real flavors m = 3 and for complex flavors m =
2. For better performance, try using lwork = max(1, m*n, n*blocksize), where blocksize is the optimal
block size for ?sytrf.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork, that is, a value no less than the minimal value
described, the routine completes the task, though possibly not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
See Also
Matrix Storage Schemes for LAPACK Routines
?sysvxx
Uses extra precise iterative refinement to compute the
solution to the system of linear equations with a
symmetric indefinite coefficient matrix A applying the
diagonal pivoting factorization.
Syntax
call ssysvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
iwork, info )
call dsysvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
iwork, info )
call csysvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
call zsysvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a real or complex system of
linear equations A*X = B, where A is an n-by-n symmetric indefinite matrix, the columns of matrix B
are individual right-hand sides, and the columns of X are the corresponding solutions.
Both normwise and maximum componentwise error bounds are also provided on request. The routine returns
a solution with a small guaranteed error (O(eps), where eps is the working machine precision) unless the
matrix is very ill-conditioned, in which case a warning is returned. Relevant condition numbers are also
calculated and returned.
The routine accepts user-provided factorizations and equilibration factors; see definitions of the fact and
equed options. Solving with refinement and using a factorization from a previous call of the routine also
produces a solution with O(eps) errors or warnings but that may not be true for general user-provided
factorizations and equilibration factors if they differ from what the routine would itself produce.
The routine ?sysvxx performs the following steps:
where U or L is a product of permutation and unit upper (lower) triangular matrices, and D is
symmetric and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
3. If some D(i,i)=0, so that D is exactly singular, the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A (see the rcond parameter).
If the reciprocal of the condition number is less than machine precision, the routine still goes on to
solve for X and compute error bounds.
4. The system of equations is solved for X using the factored form of A.
5. By default, unless params(1) is set to zero, the routine applies iterative refinement to get a small error
and error bounds. Refinement calculates the residual to at least twice the working precision.
6. If equilibration was used, the matrix X is premultiplied by diag(s) so that it solves the original system
before equilibration.
Input Parameters
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work(*) is a workspace array. The dimension of work must be at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex flavors.
ldaf INTEGER. The leading dimension of the array af; ldaf ≥ max(1,n).
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if fact
= 'F'. It contains details of the interchanges and the block structure of D
as determined by ?sytrf. If ipiv(k) > 0, rows and columns k and
ipiv(k) were interchanged and D(k,k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, rows and columns k-1
and -ipiv(k) were interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, rows and columns k+1
and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
If equed = 'Y', both row and column equilibration was done, that is, A has
been replaced by diag(s)*A*diag(s).
Array, size (n). The array s contains the scale factors for A. If equed =
'Y', A is multiplied on the left and right by diag(s).
This array is an input argument if fact = 'F' only; otherwise it is an
output argument.
If fact = 'F' and equed = 'Y', each element of s must be positive.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right-hand side and
each type (normwise or componentwise). See err_bnds_norm and
err_bnds_comp descriptions in the Output Parameters section below.
Default 10.0
iwork INTEGER. Workspace array, size at least max(1, n); used in real flavors
only.
Output Parameters
err=3 Reciprocal condition number. Estimated
componentwise reciprocal condition number.
Compared with the threshold
sqrt(n)*slamch('Epsilon') for single precision flavors
and sqrt(n)*dlamch('Epsilon') for double precision
flavors to determine if the error estimate is
"guaranteed". These reciprocal condition
numbers for some appropriately scaled matrix Z
are:
ipiv If fact = 'N', ipiv is an output argument and on exit contains details of
the interchanges and the block structure of D, as determined by ?sytrf.
params If an entry is less than 0.0, that entry is filled with the default value used
for that parameter, otherwise the entry is not modified.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0 or err_bnds_comp(j,1) = 0.0). See the definition of err_bnds_norm
and err_bnds_comp for err = 1. To get information about all of the
right-hand sides, check err_bnds_norm or err_bnds_comp.
See Also
Matrix Storage Schemes for LAPACK Routines
?hesv
Computes the solution to the system of linear
equations with a Hermitian matrix A and multiple
right-hand sides.
Syntax
call chesv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zhesv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call hesv( a, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the complex system of linear equations A*X = B, where A is an n-by-n Hermitian
matrix, the columns of matrix B are individual right-hand sides, and the columns of X are the corresponding
solutions.
The diagonal pivoting method is used to factor A as A = U*D*U^H or A = L*D*L^H, where U (or L) is a product
of permutation and unit upper (lower) triangular matrices, and D is Hermitian and block diagonal with 1-by-1
and 2-by-2 diagonal blocks.
The factored form of A is then used to solve the system of equations A*X = B.
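A minimal sketch of a call to the double complex flavor follows, assuming linkage against Intel MKL or another LAPACK implementation. The 2-by-2 Hermitian matrix, the right-hand side, the fixed lwork = 64*n, and the program name are illustrative choices, not taken from this reference. Only the lower triangle of a is set, since uplo = 'L' is passed.

```fortran
program hesv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2, nrhs = 1, lwork = 64*n
  complex(dp) :: a(n,n), b(n,nrhs), work(lwork)
  integer     :: ipiv(n), info
  external    :: zhesv

  ! Illustrative Hermitian matrix A = [ 2, 1-i ; 1+i, 3 ], lower triangle only.
  a = (0.0_dp, 0.0_dp)
  a(1,1) = (2.0_dp, 0.0_dp)
  a(2,1) = (1.0_dp, 1.0_dp)
  a(2,2) = (3.0_dp, 0.0_dp)

  ! Right-hand side equal to A*(1, i)^T.
  b(:,1) = [ (3.0_dp, 1.0_dp), (1.0_dp, 4.0_dp) ]

  ! On exit a holds the factors, ipiv the pivot details, and b the solution X.
  call zhesv('L', n, nrhs, a, n, ipiv, b, n, work, lwork, info)

  if (info /= 0) then
     print *, 'zhesv did not compute a solution, info =', info
     stop 1
  end if
  print *, 'x =', b(:,1)
end program hesv_sketch
```

With these inputs the computed x should be close to (1, i), that is, (1.0, 0.0) and (0.0, 1.0) when printed as complex pairs.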
Input Parameters
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and A is factored as L*D*L^H.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla. See Application Notes below for details and for the
suggested value of lwork.
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D, as determined by ?hetrf.
If ipiv(i) = k > 0, then d(i,i) is a 1-by-1 diagonal block, and the i-th
row and column of A were interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork, that is, a value no less than the minimal value
described, the routine completes the task, though possibly not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
See Also
Matrix Storage Schemes for LAPACK Routines
?hesv_rook
Computes the solution to the system of linear
equations for Hermitian matrices using the bounded
Bunch-Kaufman diagonal pivoting method.
Syntax
call chesv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zhesv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call hesv_rook( a, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the complex system of linear equations A*X = B, where A is an n-by-n Hermitian
matrix, and X and B are n-by-nrhs matrices.
The diagonal pivoting method is used to factor A as A = U*D*U^H if uplo = 'U', or A = L*D*L^H if uplo = 'L',
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, and D is Hermitian and
block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
The routine calls ?hetrf_rook to compute the factorization of the complex Hermitian matrix A using the bounded
Bunch-Kaufman ("rook") diagonal pivoting method.
The factored form of A is then used to solve the system of equations A*X = B by calling ?hetrs_rook,
which uses BLAS level 2 routines.
Input Parameters
If uplo = 'U', the array a stores the upper triangular part of the
matrix A.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A.
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D, as determined by ?hetrf_rook.
If uplo = 'U':
See Also
Matrix Storage Schemes for LAPACK Routines
?hesvx
Uses the diagonal pivoting factorization to compute
the solution to the complex system of linear equations
with a Hermitian coefficient matrix A, and provides
error bounds on the solution.
Syntax
call chesvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, rwork, info )
call zhesvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, rwork, info )
call hesvx( a, b, x [,uplo] [,af] [,ipiv] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a complex system of linear
equations A*X = B, where A is an n-by-n Hermitian matrix, the columns of matrix B are individual right-hand
sides, and the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?hesvx performs the following steps:
1. If fact = 'N', the diagonal pivoting method is used to factor the matrix A. The form of the
factorization is A = U*D*U^H or A = L*D*L^H, where U (or L) is a product of permutation and unit upper
(lower) triangular matrices, and D is Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal
blocks.
2. If some d(i,i) = 0, so that D is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n+1 is returned as a warning, but the routine
still goes on to solve for X and compute error bounds as described below.
3. The system of equations is solved for X using the factored form of A.
4. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
backward error estimates for it.
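The steps above can be exercised with a short driver program. The following is a minimal sketch, assuming the program is linked against Intel MKL or another LAPACK implementation that provides zhesvx; the 2-by-2 matrix, the right-hand sides, and the fixed workspace size of 64 are illustrative choices, not values prescribed by the library. The first call uses fact = 'N'; the second reuses af and ipiv with fact = 'F' for a new right-hand side.

```fortran
program demo_zhesvx
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2, nrhs = 1, lwork = 64   ! lwork >= 2*n (illustrative)
  complex(dp) :: a(n,n), af(n,n), b(n,nrhs), x(n,nrhs), work(lwork)
  real(dp)    :: rcond, ferr(nrhs), berr(nrhs), rwork(n)
  integer     :: ipiv(n), info
  external    :: zhesvx

  ! Upper triangle of the Hermitian matrix A = [2, 1-i; 1+i, 3].
  ! With uplo = 'U' the strictly lower triangle is not referenced.
  a = (0.0_dp, 0.0_dp)
  a(1,1) = (2.0_dp,  0.0_dp)
  a(1,2) = (1.0_dp, -1.0_dp)
  a(2,2) = (3.0_dp,  0.0_dp)

  ! Right-hand side b = A*(1, i)^T, so the exact solution is x = (1, i)^T.
  b(1,1) = (3.0_dp, 1.0_dp)
  b(2,1) = (1.0_dp, 4.0_dp)

  ! fact = 'N': factor A, estimate rcond, solve, and refine.
  call zhesvx('N', 'U', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              rcond, ferr, berr, work, lwork, rwork, info)
  if (info /= 0 .and. info /= n + 1) stop 'zhesvx failed'
  print *, 'x     =', x(:,1)
  print *, 'rcond =', rcond, ' ferr =', ferr(1), ' berr =', berr(1)

  ! fact = 'F': reuse af and ipiv for a new right-hand side b = A*(1, 0)^T.
  b(1,1) = (2.0_dp, 0.0_dp)
  b(2,1) = (1.0_dp, 1.0_dp)
  call zhesvx('F', 'U', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              rcond, ferr, berr, work, lwork, rwork, info)
  if (info /= 0 .and. info /= n + 1) stop 'zhesvx (fact = F) failed'
  print *, 'x     =', x(:,1)
end program demo_zhesvx
```

Reusing the factorization with fact = 'F' skips step 1, which is the expensive part when many right-hand sides arrive over time.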
Input Parameters
If uplo = 'L', the array a stores the lower triangular part of the
Hermitian matrix A; A is factored as L*D*L^H.
The array a contains the upper or the lower triangular part of the
Hermitian matrix A (see uplo). The second dimension of a must be at
least max(1,n).
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if
fact = 'F'. It contains details of the interchanges and the block
structure of D, as determined by ?hetrf.
If ipiv(i) = k > 0, then d(i,i) is a 1-by-1 diagonal block, and the i-th
row and column of A were interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
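The ipiv encoding described above can be decoded with a single forward scan, because each 2-by-2 block occupies two consecutive entries holding the same negative value. The sketch below is illustrative only: the module and procedure names (ipiv_decode, count_blocks, describe_pivots) are hypothetical helpers, not part of the library, and the sample pivot vector is made up rather than taken from a real factorization.

```fortran
module ipiv_decode
  implicit none
contains
  ! Count the 1-by-1 and 2-by-2 diagonal blocks of D described by ipiv.
  pure subroutine count_blocks(n, ipiv, n1, n2)
    integer, intent(in)  :: n, ipiv(n)
    integer, intent(out) :: n1, n2
    integer :: i
    n1 = 0
    n2 = 0
    i = 1
    do while (i <= n)
      if (ipiv(i) > 0) then
        n1 = n1 + 1          ! 1-by-1 block at position i
        i = i + 1
      else
        n2 = n2 + 1          ! 2-by-2 block at positions i and i+1
        i = i + 2
      end if
    end do
  end subroutine count_blocks

  ! Print the block structure and the interchanges.
  subroutine describe_pivots(uplo, n, ipiv)
    character(len=1), intent(in) :: uplo
    integer, intent(in) :: n, ipiv(n)
    integer :: i, moved
    i = 1
    do while (i <= n)
      if (ipiv(i) > 0) then
        print '(a,i0,a,i0)', '1-by-1 block at ', i, &
              ', interchanged with ', ipiv(i)
        i = i + 1
      else
        ! For uplo = 'U' the first row/column of the pair was moved,
        ! for uplo = 'L' the second one.
        if (uplo == 'U' .or. uplo == 'u') then
          moved = i
        else
          moved = i + 1
        end if
        print '(a,i0,a,i0,a,i0,a,i0)', '2-by-2 block at ', i, ':', i + 1, &
              ', row/column ', moved, ' interchanged with ', -ipiv(i)
        i = i + 2
      end if
    end do
  end subroutine describe_pivots
end module ipiv_decode

program demo_ipiv
  use ipiv_decode
  implicit none
  integer :: ipiv(4), n1, n2
  ipiv = [1, -3, -3, 4]      ! hypothetical pivot vector for uplo = 'L'
  call count_blocks(4, ipiv, n1, n2)
  print '(a,i0,a,i0)', '1-by-1 blocks: ', n1, ', 2-by-2 blocks: ', n2
  call describe_pivots('L', 4, ipiv)
end program demo_ipiv
```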
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1,
n).
Output Parameters
Array, size ldx by *.
If info = 0 or info = n+1, the array x contains the solution matrix
X to the system of equations. The second dimension of x must be at
least max(1,nrhs).
af, ipiv These arrays are output arguments if fact = 'N'. See the
description of af, ipiv in Input Arguments section.
If info = i, and i ≤ n, then d(i,i) is exactly zero. The factorization has
been completed, but the block diagonal matrix D is exactly singular,
so the solution and error bounds could not be computed; rcond = 0 is
returned.
If info = i, and i = n + 1, then D is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to
working precision. Nevertheless, the solution and error bounds are
computed because there are a number of situations where the
computed solution can be more accurate than the value of rcond
would suggest.
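Calling code usually needs to distinguish these info outcomes before using x, ferr, and berr. The following sketch shows one way to do so; the module svx_info, its constants, and both functions are hypothetical names introduced for illustration. The negative branch assumes the usual LAPACK convention that info = -i flags an illegal i-th argument.

```fortran
module svx_info
  implicit none
  integer, parameter :: INFO_OK = 0, INFO_BAD_ARGUMENT = 1, &
                        INFO_SINGULAR = 2, INFO_ILL_CONDITIONED = 3, &
                        INFO_UNEXPECTED = 4
contains
  ! Map the info value of an expert driver such as ?hesvx to a category.
  pure integer function classify_info(info, n) result(code)
    integer, intent(in) :: info, n
    if (info == 0) then
      code = INFO_OK
    else if (info < 0) then
      code = INFO_BAD_ARGUMENT       ! assumed convention: argument -info is illegal
    else if (info <= n) then
      code = INFO_SINGULAR           ! d(info,info) is exactly zero; no solution
    else if (info == n + 1) then
      code = INFO_ILL_CONDITIONED    ! warning only; x and bounds were computed
    else
      code = INFO_UNEXPECTED
    end if
  end function classify_info

  ! The solution array x is meaningful only for info = 0 or info = n+1.
  pure logical function solution_available(info, n) result(ok)
    integer, intent(in) :: info, n
    ok = (info == 0 .or. info == n + 1)
  end function solution_available
end module svx_info

program demo_info
  use svx_info
  implicit none
  integer, parameter :: n = 5
  integer :: samples(4), k
  samples = [0, -3, 2, n + 1]        ! illustrative info values
  do k = 1, size(samples)
    print '(a,i0,a,i0,a,l1)', 'info = ', samples(k), ' -> category ', &
          classify_info(samples(k), n), ', solution available: ', &
          solution_available(samples(k), n)
  end do
end program demo_info
```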
fact Must be 'N' or 'F'. The default value is 'N'. If fact = 'F', then
both arguments af and ipiv must be present; otherwise, an error is
returned.
Application Notes
The value of lwork must be at least 2*n. For better performance, try using lwork = n*blocksize, where
blocksize is the optimal block size for ?hetrf.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size (no less than the minimal value
described), the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
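A workspace query followed by the actual call might look like the sketch below. It assumes linkage against Intel MKL or another LAPACK that provides zhesvx; the 3-by-3 test matrix is hypothetical data chosen to be diagonally dominant, and taking the maximum with 2*n is a defensive choice based on the minimum stated above.

```fortran
program demo_hesvx_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  complex(dp) :: a(n,n), af(n,n), b(n,nrhs), x(n,nrhs), wq(1)
  complex(dp), allocatable :: work(:)
  real(dp)    :: rcond, ferr(nrhs), berr(nrhs), rwork(n)
  integer     :: ipiv(n), info, lwork, i, j
  external    :: zhesvx

  ! Hypothetical test data: upper triangle of a diagonally dominant
  ! Hermitian matrix, and a right-hand side of ones.
  a = (0.0_dp, 0.0_dp)
  do j = 1, n
    a(j,j) = cmplx(4.0_dp, 0.0_dp, kind=dp)
    do i = 1, j - 1
      a(i,j) = cmplx(1.0_dp, real(j - i, dp), kind=dp)
    end do
  end do
  b = (1.0_dp, 0.0_dp)

  ! Step 1: workspace query. With lwork = -1 the routine only writes the
  ! recommended size to the first element of the work argument.
  call zhesvx('N', 'U', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              rcond, ferr, berr, wq, -1, rwork, info)
  if (info /= 0) stop 'workspace query failed'
  lwork = max(2*n, int(real(wq(1))))
  allocate(work(lwork))

  ! Step 2: the actual computation with the recommended workspace.
  call zhesvx('N', 'U', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              rcond, ferr, berr, work, lwork, rwork, info)
  print *, 'lwork =', lwork, ' info =', info
  if (info == 0 .or. info == n + 1) print *, 'x =', x(:,1)
end program demo_hesvx_query
```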
See Also
Matrix Storage Schemes for LAPACK Routines
?hesvxx
Uses extra precise iterative refinement to compute the
solution to the system of linear equations with a
Hermitian indefinite coefficient matrix A applying the
diagonal pivoting factorization.
Syntax
call chesvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
call zhesvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a complex/double complex
system of linear equations A*X = B, where A is an n-by-n Hermitian matrix, the columns of matrix B are
individual right-hand sides, and the columns of X are the corresponding solutions.
Both normwise and maximum componentwise error bounds are also provided on request. The routine returns
a solution with a small guaranteed error (O(eps), where eps is the working machine precision) unless the
matrix is very ill-conditioned, in which case a warning is returned. Relevant condition numbers are also
calculated and returned.
The routine accepts user-provided factorizations and equilibration factors; see definitions of the fact and
equed options. Solving with refinement and using a factorization from a previous call of the routine also
produces a solution with O(eps) errors or warnings but that may not be true for general user-provided
factorizations and equilibration factors if they differ from what the routine would itself produce.
The routine ?hesvxx performs the following steps:
where U or L is a product of permutation and unit upper (lower) triangular matrices, and D is
Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
3. If some D(i,i)=0, so that D is exactly singular, the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A (see the rcond parameter).
If the reciprocal of the condition number is less than machine precision, the routine still goes on to
solve for X and compute error bounds.
4. The system of equations is solved for X using the factored form of A.
5. By default, unless params(1) is set to zero, the routine applies iterative refinement to get a small error
and error bounds. Refinement calculates the residual to at least twice the working precision.
6. If equilibration was used, the matrix X is premultiplied by diag(s) so that it solves the original system
before equilibration.
Input Parameters
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work(*) is a workspace array. The dimension of work must be at least
max(1,5*n).
ldaf INTEGER. The leading dimension of the array af; ldaf ≥ max(1,n).
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if fact
= 'F'. It contains details of the interchanges and the block structure of D
as determined by ?hetrf.
If ipiv(k) > 0, rows and columns k and ipiv(k) were interchanged and
D(k,k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, rows and columns k-1
and -ipiv(k) were interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, rows and columns k+1
and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
if equed = 'Y', both row and column equilibration was done, that is, A has
been replaced by diag(s)*A*diag(s).
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right hand side and
each type (normwise or componentwise). See err_bnds_norm and
err_bnds_comp descriptions in the Output Arguments section below.
Default 10
Output Parameters
DOUBLE PRECISION for zhesvxx.
Reciprocal scaled condition number. An estimate of the reciprocal Skeel
condition number of the matrix A after equilibration (if done). If rcond is
less than the machine precision, in particular, if rcond = 0, the matrix is
singular to working precision. Note that the error may still be small even if
this number is very small and the matrix appears ill-conditioned.
err=2 "Guaranteed" error bound. The estimated
forward error, almost certainly within a factor of
10 of the true error so long as the next entry is
greater than the threshold sqrt(n)*slamch(ε)
for chesvxx and sqrt(n)*dlamch(ε) for
zhesvxx. This error bound should only be
trusted if the previous boolean is true.
ipiv If fact = 'N', ipiv is an output argument and on exit contains details of
the interchanges and the block structure of D, as determined by chetrf for
single precision flavors and zhetrf for double precision flavors.
params If an entry is less than 0.0, that entry is filled with the default value used
for that parameter, otherwise the entry is not modified.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0 or err_bnds_comp(j,1) = 0.0). See the definition of err_bnds_norm
and err_bnds_comp for err = 1. To get information about all of the right-hand
sides, check err_bnds_norm or err_bnds_comp.
See Also
Matrix Storage Schemes for LAPACK Routines
?spsv
Computes the solution to the system of linear
equations with a real or complex symmetric coefficient
matrix A stored in packed format, and multiple right-
hand sides.
Syntax
call sspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call dspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call cspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call spsv( ap, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n
symmetric matrix stored in packed format, the columns of matrix B are individual right-hand sides, and the
columns of X are the corresponding solutions.
The diagonal pivoting method is used to factor A as A = U*D*U^T or A = L*D*L^T, where U (or L) is a product
of permutation and unit upper (lower) triangular matrices, and D is symmetric and block diagonal with
1-by-1 and 2-by-2 diagonal blocks.
The factored form of A is then used to solve the system of equations A*X = B.
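A minimal call of the simple driver is sketched below, assuming linkage against Intel MKL or another LAPACK that provides dspsv. The 3-by-3 matrix and right-hand side are illustrative data; the upper triangle is stored column by column in ap, and b is overwritten by the solution on exit, which is the standard behavior of the simple drivers.

```fortran
program demo_dspsv
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: ap(n*(n+1)/2), b(n,nrhs)
  integer  :: ipiv(n), info
  external :: dspsv

  ! Symmetric matrix A = [4 1 2; 1 3 0; 2 0 5], upper triangle packed
  ! by columns: a11, a12, a22, a13, a23, a33.
  ap = [4.0_dp, 1.0_dp, 3.0_dp, 2.0_dp, 0.0_dp, 5.0_dp]

  ! b = A*(1, 1, 1)^T, so the exact solution is x = (1, 1, 1)^T.
  b(:,1) = [7.0_dp, 4.0_dp, 7.0_dp]

  call dspsv('U', n, nrhs, ap, ipiv, b, n, info)
  if (info /= 0) then
    print *, 'dspsv returned info =', info
  else
    print *, 'x    =', b(:,1)   ! b now holds the solution
    print *, 'ipiv =', ipiv
  end if
end program demo_dspsv
```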
Input Parameters
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D, as determined by ?sptrf.
If ipiv(i) = k > 0, then d(i,i) is a 1-by-1 block, and the i-th row and
column of A were interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
See Also
Matrix Storage Schemes for LAPACK Routines
?spsvx
Uses the diagonal pivoting factorization to compute
the solution to the system of linear equations with a
real or complex symmetric coefficient matrix A stored
in packed format, and provides error bounds on the
solution.
Syntax
call sspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, iwork, info )
call dspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, iwork, info )
call cspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
call zspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
call spsvx( ap, b, x [,uplo] [,afp] [,ipiv] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a real or complex system of
linear equations A*X = B, where A is an n-by-n symmetric matrix stored in packed format, the columns of
matrix B are individual right-hand sides, and the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?spsvx performs the following steps:
1. If fact = 'N', the diagonal pivoting method is used to factor the matrix A. The form of the
factorization is A = U*D*U^T or A = L*D*L^T, where U (or L) is a product of permutation and unit upper
(lower) triangular matrices, and D is symmetric and block diagonal with 1-by-1 and 2-by-2 diagonal
blocks.
2. If some d(i,i) = 0, so that D is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n+1 is returned as a warning, but the routine
still goes on to solve for X and compute error bounds as described below.
3. The system of equations is solved for X using the factored form of A.
4. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
backward error estimates for it.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored and
how A is factored:
If uplo = 'U', the array ap stores the upper triangular part of the
symmetric matrix A, and A is factored as U*D*U^T.
If uplo = 'L', the array ap stores the lower triangular part of the
symmetric matrix A, and A is factored as L*D*L^T.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
work(*) is a workspace array.
The dimension of arrays ap and afp must be at least max(1, n*(n+1)/2);
the second dimension of b must be at least max(1,nrhs);
the dimension of work must be at least max(1,3*n) for real flavors
and max(1,2*n) for complex flavors.
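The n*(n+1)/2 size follows from storing one triangle of A column by column. The sketch below shows the index mapping commonly used for LAPACK packed storage; the module packed_storage and its procedures are hypothetical helpers written for illustration, and the mapping itself should be checked against the section Matrix Storage Schemes for LAPACK Routines.

```fortran
module packed_storage
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Position of A(i,j) inside the packed array ap.
  ! uplo = 'U' requires i <= j; uplo = 'L' requires i >= j.
  pure integer function packed_index(uplo, n, i, j) result(k)
    character(len=1), intent(in) :: uplo
    integer, intent(in) :: n, i, j
    if (uplo == 'U' .or. uplo == 'u') then
      k = i + (j - 1)*j/2
    else
      k = i + (j - 1)*(2*n - j)/2
    end if
  end function packed_index

  ! Copy one triangle of the full matrix a into the packed array ap.
  pure subroutine pack_triangle(uplo, n, a, ap)
    character(len=1), intent(in) :: uplo
    integer, intent(in)   :: n
    real(dp), intent(in)  :: a(n,n)
    real(dp), intent(out) :: ap(n*(n+1)/2)
    integer :: i, j
    do j = 1, n
      if (uplo == 'U' .or. uplo == 'u') then
        do i = 1, j
          ap(packed_index(uplo, n, i, j)) = a(i,j)
        end do
      else
        do i = j, n
          ap(packed_index(uplo, n, i, j)) = a(i,j)
        end do
      end if
    end do
  end subroutine pack_triangle
end module packed_storage

program demo_packed
  use packed_storage
  implicit none
  integer, parameter :: n = 3
  real(dp) :: a(n,n), ap(n*(n+1)/2)
  ! Illustrative symmetric matrix, given column by column.
  a = reshape([4.0_dp, 1.0_dp, 2.0_dp, &
               1.0_dp, 3.0_dp, 0.0_dp, &
               2.0_dp, 0.0_dp, 5.0_dp], [n, n])
  call pack_triangle('U', n, a, ap)
  print *, 'upper packed:', ap
  call pack_triangle('L', n, a, ap)
  print *, 'lower packed:', ap
end program demo_packed
```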
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if
fact = 'F'. It contains details of the interchanges and the block
structure of D, as determined by ?sptrf.
If ipiv(i) = k > 0, then d(i,i) is a 1-by-1 block, and the i-th row and
column of A were interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1,
n).
iwork INTEGER. Workspace array, size at least max(1, n); used in real
flavors only.
Output Parameters
afp, ipiv These arrays are output arguments if fact = 'N'. See the
description of afp, ipiv in Input Arguments section.
If info = i, and i ≤ n, then d(i,i) is exactly zero. The factorization has
been completed, but the block diagonal matrix D is exactly singular,
so the solution and error bounds could not be computed; rcond = 0 is
returned.
If info = i, and i = n + 1, then D is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to
working precision. Nevertheless, the solution and error bounds are
computed because there are a number of situations where the
computed solution can be more accurate than the value of rcond
would suggest.
fact Must be 'N' or 'F'. The default value is 'N'. If fact = 'F', then
both arguments afp and ipiv must be present; otherwise, an error is
returned.
See Also
Matrix Storage Schemes for LAPACK Routines
?hpsv
Computes the solution to the system of linear
equations with a Hermitian coefficient matrix A stored
in packed format, and multiple right-hand sides.
Syntax
call chpsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zhpsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call hpsv( ap, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B, where A is an n-by-n Hermitian matrix
stored in packed format, the columns of matrix B are individual right-hand sides, and the columns of X are
the corresponding solutions.
The diagonal pivoting method is used to factor A as A = U*D*U^H or A = L*D*L^H, where U (or L) is a product
of permutation and unit upper (lower) triangular matrices, and D is Hermitian and block diagonal with 1-by-1
and 2-by-2 diagonal blocks.
The factored form of A is then used to solve the system of equations A*X = B.
Input Parameters
Output Parameters
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D, as determined by ?hptrf.
If ipiv(i) = k > 0, then d(i,i) is a 1-by-1 block, and the i-th row and
column of A were interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
See Also
Matrix Storage Schemes for LAPACK Routines
?hpsvx
Uses the diagonal pivoting factorization to compute
the solution to the system of linear equations with a
Hermitian coefficient matrix A stored in packed
format, and provides error bounds on the solution.
Syntax
call chpsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
call zhpsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
call hpsvx( ap, b, x [,uplo] [,afp] [,ipiv] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a complex system of linear
equations A*X = B, where A is an n-by-n Hermitian matrix stored in packed format, the columns of matrix B
are individual right-hand sides, and the columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?hpsvx performs the following steps:
1. If fact = 'N', the diagonal pivoting method is used to factor the matrix A. The form of the
factorization is A = U*D*U^H or A = L*D*L^H, where U (or L) is a product of permutation and unit upper
(lower) triangular matrices, and D is Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal
blocks.
2. If some d(i,i) = 0, so that D is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n+1 is returned as a warning, but the routine
still goes on to solve for X and compute error bounds as described below.
3. The system of equations is solved for X using the factored form of A.
4. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
backward error estimates for it.
Input Parameters
If uplo = 'L', the array ap stores the lower triangular part of the
Hermitian matrix A, and A is factored as L*D*L^H.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
work(*) is a workspace array.
The dimension of arrays ap and afp must be at least max(1, n*(n+1)/2);
the second dimension of b must be at least max(1,nrhs);
the dimension of work must be at least max(1,2*n).
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if
fact = 'F'. It contains details of the interchanges and the block
structure of D, as determined by ?hptrf.
If ipiv(i) = k > 0, then d(i,i) is a 1-by-1 block, and the i-th row and
column of A were interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and the (i-1)-th row and column of A
were interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and the (i+1)-th row and column of A
were interchanged with the m-th row and column.
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1,
n).
Output Parameters
afp, ipiv These arrays are output arguments if fact = 'N'. See the
description of afp, ipiv in Input Arguments section.
If info = i, and i ≤ n, then d(i,i) is exactly zero. The factorization has
been completed, but the block diagonal matrix D is exactly singular,
so the solution and error bounds could not be computed; rcond = 0 is
returned.
If info = i, and i = n + 1, then D is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to
working precision. Nevertheless, the solution and error bounds are
computed because there are a number of situations where the
computed solution can be more accurate than the value of rcond
would suggest.
fact Must be 'N' or 'F'. The default value is 'N'. If fact = 'F', then
both arguments afp and ipiv must be present; otherwise, an error is
returned.
See Also
Matrix Storage Schemes for LAPACK Routines
LAPACK Least Squares and Eigenvalue Problem Routines
This section includes descriptions of LAPACK computational routines and driver routines for solving linear
least squares problems, eigenvalue and singular value problems, and performing a number of related
computational tasks. For a full reference on LAPACK routines and related information see [LUG].
Least Squares Problems. A typical least squares problem is as follows: given a matrix A and a vector b,
find the vector x that minimizes the sum of squares $\sum_i ((Ax)_i - b_i)^2$ or, equivalently, find the vector x that
minimizes the 2-norm $\|Ax - b\|_2$.
In the most usual case, A is an m-by-n matrix with m ≥ n and rank(A) = n. This problem is also referred to
as finding the least squares solution to an overdetermined system of linear equations (here we have more
equations than unknowns). To solve this problem, you can use the QR factorization of the matrix A (see QR
Factorization).
If m < n and rank(A) = m, there exist an infinite number of solutions x which exactly satisfy Ax = b, and
thus minimize the norm $\|Ax - b\|_2$. In this case it is often useful to find the unique solution that
minimizes $\|x\|_2$. This problem is referred to as finding the minimum-norm solution to an
underdetermined system of linear equations (here we have more unknowns than equations). To solve this
problem, you can use the LQ factorization of the matrix A (see LQ Factorization).
In the general case you may have a rank-deficient least squares problem, with rank(A) < min(m, n): find
the minimum-norm least squares solution that minimizes both $\|x\|_2$ and $\|Ax - b\|_2$. In this case (or
when the rank of A is in doubt) you can use the QR factorization with pivoting or singular value
decomposition (see Singular Value Decomposition).
Eigenvalue Problems. The eigenvalue problems (from German eigen "own") are stated as follows: given a
matrix A, find the eigenvalues $\lambda$ and the corresponding eigenvectors z that satisfy the equation
$Az = \lambda z$ (right eigenvectors z)
or the equation
$z^H A = \lambda z^H$ (left eigenvectors z).
If A is a real symmetric or complex Hermitian matrix, the above two equations are equivalent, and the
problem is called a symmetric eigenvalue problem. Routines for solving this type of problems are described
in the section Symmetric Eigenvalue Problems.
Routines for solving eigenvalue problems with nonsymmetric or non-Hermitian matrices are described in the
section Nonsymmetric Eigenvalue Problems.
The library also includes routines that handle generalized symmetric-definite eigenvalue problems: find
the eigenvalues $\lambda$ and the corresponding eigenvectors z that satisfy one of the following equations:
$Az = \lambda Bz$, $ABz = \lambda z$, or $BAz = \lambda z$,
where A is symmetric or Hermitian, and B is symmetric positive-definite or Hermitian positive-definite.
Routines for reducing these problems to standard symmetric eigenvalue problems are described in the
section Generalized Symmetric-Definite Eigenvalue Problems.
To solve a particular problem, you usually call several computational routines. Sometimes you need to
combine the routines of this chapter with other LAPACK routines described in "LAPACK Routines: Linear
Equations" as well as with BLAS routines described in "BLAS and Sparse BLAS Routines".
For example, to solve a set of least squares problems minimizing $\|Ax - b\|_2$ for all columns b of a given
matrix B (where A and B are real matrices), you can call ?geqrf to form the factorization A = QR, then call
?ormqr to compute $C = Q^H B$ and finally call the BLAS routine ?trsm to solve for X the system of equations RX
= C.
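The three-call sequence can be sketched as follows for double precision data, assuming linkage against Intel MKL or another library that provides dgeqrf, dormqr, and dtrsm. The line-fitting data are illustrative: the four points lie exactly on y = 1 + 2t, so the computed coefficients should be close to 1 and 2. For real matrices, applying Q^H means applying Q^T, hence trans = 'T'.

```fortran
program demo_qr_least_squares
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 2, nrhs = 1
  real(dp) :: a(m,n), b(m,nrhs), tau(n), wq(1)
  real(dp), allocatable :: work(:)
  integer  :: lwork, info
  external :: dgeqrf, dormqr, dtrsm

  ! Fit y = c1 + c2*t: column 1 of A is all ones, column 2 holds t.
  a(:,1) = 1.0_dp
  a(:,2) = [0.0_dp, 1.0_dp, 2.0_dp, 3.0_dp]
  b(:,1) = [1.0_dp, 3.0_dp, 5.0_dp, 7.0_dp]

  ! Workspace queries; use the larger recommendation for both routines.
  call dgeqrf(m, n, a, m, tau, wq, -1, info)
  lwork = max(1, int(wq(1)))
  call dormqr('L', 'T', m, nrhs, n, a, m, tau, b, m, wq, -1, info)
  lwork = max(lwork, int(wq(1)))
  allocate(work(lwork))

  ! 1. A = Q*R; R overwrites the upper triangle of a.
  call dgeqrf(m, n, a, m, tau, work, lwork, info)
  if (info /= 0) stop 'dgeqrf failed'

  ! 2. C = Q^T * B, computed in place in b.
  call dormqr('L', 'T', m, nrhs, n, a, m, tau, b, m, work, lwork, info)
  if (info /= 0) stop 'dormqr failed'

  ! 3. Solve R*X = C using the leading n rows of b.
  call dtrsm('L', 'U', 'N', 'N', n, nrhs, 1.0_dp, a, m, b, m)

  print *, 'coefficients c1, c2 =', b(1:n,1)
end program demo_qr_least_squares
```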
Another way is to call an appropriate driver routine that performs several tasks in one call. For example, to
solve the least squares problem the driver routine ?gels can be used.
Orthogonal Factorizations
Singular Value Decomposition
Symmetric Eigenvalue Problems
Generalized Symmetric-Definite Eigenvalue Problems
Nonsymmetric Eigenvalue Problems
Generalized Nonsymmetric Eigenvalue Problems
Generalized Singular Value Decomposition
See also the respective driver routines.
where R is an n-by-n upper triangular matrix with real diagonal elements, and Q is an m-by-m orthogonal (or
unitary) matrix.
You can use the QR factorization for solving the following least squares problem: minimize $\|Ax - b\|_2$
where A is a full-rank m-by-n matrix (m ≥ n). After factoring the matrix, compute the solution x by solving
$Rx = (Q_1)^T b$.
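The reason this triangular system yields the minimizer can be sketched in one line. The derivation assumes real A with m ≥ n, the partition $Q = (Q_1 \; Q_2)$ with $Q_1$ holding the first n columns of Q, and the factorization written as $A = Q \begin{pmatrix} R \\ 0 \end{pmatrix}$. Because multiplying by the orthogonal matrix $Q^T$ does not change the 2-norm,

$$
\|Ax - b\|_2^2 = \|Q^T (Ax - b)\|_2^2
= \left\| \begin{pmatrix} R \\ 0 \end{pmatrix} x - \begin{pmatrix} Q_1^T b \\ Q_2^T b \end{pmatrix} \right\|_2^2
= \|Rx - Q_1^T b\|_2^2 + \|Q_2^T b\|_2^2 .
$$

The second term does not depend on x, so the minimum is attained when the first term vanishes, and $\|Q_2^T b\|_2$ is the norm of the residual at the solution.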
If m < n, the QR factorization is given by
$A = QR = Q (R_1 \; R_2)$
where R is trapezoidal, R1 is upper triangular and R2 is rectangular.
Q is represented as a product of min(m, n) elementary reflectors. Routines are provided to work with Q in
this representation.
LQ Factorization
The LQ factorization of an m-by-n matrix A is as follows. If m ≤ n,
where L is an m-by-m lower triangular matrix with real diagonal elements, and Q is an n-by-n orthogonal (or
unitary) matrix.
If m > n, the LQ factorization is
where L1 is an n-by-n lower triangular matrix, L2 is rectangular, and Q is an n-by-n orthogonal (or unitary)
matrix.
You can use the LQ factorization to find the minimum-norm solution of an underdetermined system of linear
equations Ax = b where A is an m-by-n matrix of rank m (m < n). After factoring the matrix, compute the
solution vector x as follows: solve Ly = b for y, and then compute $x = (Q_1)^H y$.
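For reference, the two cases of the LQ factorization described above can be written out as follows. This is a sketch in the usual LAPACK convention, assuming that $Q_1$ denotes the first m rows of Q when m ≤ n:

$$
A = (L \;\; 0)\, Q = (L \;\; 0) \begin{pmatrix} Q_1 \\ Q_2 \end{pmatrix} = L Q_1 \quad (m \le n),
\qquad
A = \begin{pmatrix} L_1 \\ L_2 \end{pmatrix} Q \quad (m > n).
$$

The two-step recipe for the minimum-norm solution then follows by substituting $x = Q^H w$ with $w = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$:

$$
Ax = (L \;\; 0)\, Q Q^H w = L w_1 = b
\;\Rightarrow\; w_1 = y,
\qquad
\|x\|_2^2 = \|w\|_2^2 = \|y\|_2^2 + \|w_2\|_2^2 .
$$

The norm is smallest for $w_2 = 0$, which gives $x = Q^H \begin{pmatrix} y \\ 0 \end{pmatrix} = Q_1^H y$.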
Table "Computational Routines for Orthogonal Factorization" lists LAPACK routines that perform orthogonal
factorization of matrices.
Computational Routines for Orthogonal Factorization
Matrix type, factorization | Factorize without pivoting | Factorize with pivoting | Generate matrix Q | Apply matrix Q
?geqrf
Computes the QR factorization of a general m-by-n
matrix.
Syntax
call sgeqrf(m, n, a, lda, tau, work, lwork, info)
call dgeqrf(m, n, a, lda, tau, work, lwork, info)
call cgeqrf(m, n, a, lda, tau, work, lwork, info)
call zgeqrf(m, n, a, lda, tau, work, lwork, info)
call geqrf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the QR factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors. Routines are provided to work with Q in this representation.
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
Output Parameters
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for subsequent runs.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size (no less than the minimal value
described), the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed factorization is the exact factorization of a matrix A + E, where
$\|E\|_2 = O(\varepsilon)\|A\|_2$.
The approximate number of floating-point operations for real flavors is
$(4/3)n^3$ if m = n,
$(2/3)n^2(3m-n)$ if m > n,
$(2/3)m^2(3n-m)$ if m < n.
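These estimates are convenient for converting a measured run time into an approximate floating-point rate. The helper below is a hypothetical function written for illustration (geqrf_flops is not a library routine); it evaluates the three cases above for real flavors, converting to double precision first so that large dimensions do not overflow integer arithmetic.

```fortran
module geqrf_cost
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Approximate operation count of ?geqrf for real flavors.
  pure function geqrf_flops(m, n) result(flops)
    integer, intent(in) :: m, n
    real(dp) :: flops, rm, rn
    rm = real(m, dp)
    rn = real(n, dp)
    if (m == n) then
      flops = 4.0_dp*rn**3/3.0_dp
    else if (m > n) then
      flops = 2.0_dp*rn**2*(3.0_dp*rm - rn)/3.0_dp
    else
      flops = 2.0_dp*rm**2*(3.0_dp*rn - rm)/3.0_dp
    end if
  end function geqrf_flops
end module geqrf_cost

program demo_flops
  use geqrf_cost
  implicit none
  print *, 'square 1000 x 1000 :', geqrf_flops(1000, 1000)
  print *, 'tall  10000 x 100  :', geqrf_flops(10000, 100)
  print *, 'wide    100 x 10000:', geqrf_flops(100, 10000)
end program demo_flops
```

Dividing the returned value by the elapsed time in seconds gives an approximate rate in floating-point operations per second.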
(The columns of the computed X are the least squares solution vectors x.)
To compute the elements of Q explicitly, call
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?geqr
Computes a QR factorization of a general matrix, with
best performance for tall and skinny matrices.
call sgeqr(m, n, a, lda, t, tsize, work, lwork, info)
call dgeqr(m, n, a, lda, t, tsize, work, lwork, info)
call cgeqr(m, n, a, lda, t, tsize, work, lwork, info)
call zgeqr(m, n, a, lda, t, tsize, work, lwork, info)
Description
The ?geqr routine computes a QR factorization of an m-by-n matrix A. If the matrix is tall and skinny (m is
substantially larger than n), a highly scalable algorithm is used to avoid communication overhead.
NOTE
The internal format of the elementary reflectors generated by ?geqr is only compatible with the
?gemqr routine and not any other QR routines.
Input Parameters
tsize INTEGER. If tsize ≥ 5, the size of the array t. If tsize = -1 or tsize = -2,
then the routine performs a workspace query. The routine calculates the
sizes required for the t and work arrays and returns these values as the
first entries of the t and work arrays, without issuing any error message
related to t or work by xerbla.
If tsize = -1, the routine calculates the optimal size of t for optimum
performance and returns this value in t(1).
If tsize = -2, the routine calculates the minimum size required for t and
returns this value in t(1).
lwork INTEGER. The size of the array work. If lwork = -1 or lwork = -2, then the
routine performs a workspace query. The routine only calculates the sizes of
the t and work arrays and returns these values as the first entries of the t
and work arrays, without issuing any error message related to t or work by
xerbla.
If lwork = -1, the routine calculates the optimal size of work for optimum
performance and returns this value in work(1).
If lwork = -2, the routine calculates the minimum size required for work
and returns this value in work(1).
Output Parameters
t If info = 0, t(1) returns the optimal value for tsize. You can specify that
it return the minimum required value for tsize instead - see the tsize
description for details. The remaining entries of t contain part of the data
structure used to represent Q. To apply or construct Q, you need to retain a
and t and pass them to other routines.
work If info = 0, work(1) contains the optimal value for lwork. You can specify
that it return the minimum required value for lwork instead - see the
lwork description for details.
info INTEGER.
info = 0 indicates a successful exit.
info < 0: if info = -i, the i-th argument had an illegal value.
See Also
?gemqr Multiplies a matrix C by a real orthogonal or complex unitary matrix Q, as computed by
?geqr, with best performance for tall and skinny matrices.
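The tsize/lwork query and the pairing of ?geqr with ?gemqr might be used as in the sketch below. It assumes the double-precision routines dgeqr and dgemqr with the argument orders given in their Syntax sections; the program name, sizes, and test data are hypothetical. The round trip C := QT*C followed by C := Q*C serves only as a self-check.

```fortran
program geqr_gemqr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 200, n = 4, nrhs = 1
  real(dp) :: a(m, n), b(m, nrhs), c(m, nrhs)
  real(dp) :: tquery(5), wquery(1)
  real(dp), allocatable :: t(:), work(:)
  integer :: tsize, lwork, info, i, j
  external :: dgeqr, dgemqr

  ! Arbitrary tall-and-skinny test matrix and right-hand side.
  do j = 1, n
    do i = 1, m
      a(i, j) = cos(real(i * j, dp))
    end do
  end do
  do i = 1, m
    b(i, 1) = real(i, dp) / real(m, dp)
  end do
  c = b

  ! Query: tsize = -1 and lwork = -1 request the optimal sizes, which are
  ! returned in tquery(1) and wquery(1). Nothing is factorized here.
  call dgeqr(m, n, a, m, tquery, -1, wquery, -1, info)
  if (info /= 0) error stop 'dgeqr query failed'
  tsize = max(5, int(tquery(1)))
  lwork = max(1, int(wquery(1)))
  allocate(t(tsize), work(lwork))

  ! Factorize A = Q*R. Keep both a and t: together they represent Q.
  call dgeqr(m, n, a, m, t, tsize, work, lwork, info)
  if (info /= 0) error stop 'dgeqr failed'

  ! Size the dgemqr workspace with its own queries (one per trans value).
  call dgemqr('L', 'T', m, nrhs, n, a, m, t, tsize, c, m, wquery, -1, info)
  if (info /= 0) error stop 'dgemqr query failed'
  lwork = max(1, int(wquery(1)))
  call dgemqr('L', 'N', m, nrhs, n, a, m, t, tsize, c, m, wquery, -1, info)
  if (info /= 0) error stop 'dgemqr query failed'
  lwork = max(lwork, int(wquery(1)))
  deallocate(work)
  allocate(work(lwork))

  ! C := QT*C, then C := Q*C; the result should reproduce b.
  call dgemqr('L', 'T', m, nrhs, n, a, m, t, tsize, c, m, work, lwork, info)
  if (info /= 0) error stop 'dgemqr (T) failed'
  call dgemqr('L', 'N', m, nrhs, n, a, m, t, tsize, c, m, work, lwork, info)
  if (info /= 0) error stop 'dgemqr (N) failed'

  if (maxval(abs(c - b)) > 1.0e-10_dp) error stop 'round trip failed'
  print '(a)', 'round trip Q*(QT*b) reproduced b'
end program geqr_gemqr_sketch
```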
?geqrfp
Computes the QR factorization of a general m-by-n
matrix with non-negative diagonal elements.
Syntax
call sgeqrfp(m, n, a, lda, tau, work, lwork, info)
call dgeqrfp(m, n, a, lda, tau, work, lwork, info)
call cgeqrfp(m, n, a, lda, tau, work, lwork, info)
call zgeqrfp(m, n, a, lda, tau, work, lwork, info)
Include Files
mkl.fi
Description
The routine forms the QR factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed. The diagonal entries of R are real and nonnegative.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors. Routines are provided to work with Q in this representation.
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
lwork INTEGER. The size of the work array (lwork ≥ n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork value (one no less than the minimal value
described), the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed factorization is the exact factorization of a matrix A + E, where
||E||_2 = O(ε)||A||_2 and ε is the machine precision.
The approximate number of floating-point operations for real flavors is
(4/3)n^3 if m = n,
(2/3)n^2(3m-n) if m > n,
(2/3)m^2(3n-m) if m < n.
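As a worked illustration of the operation counts above, consider a hypothetical tall matrix with m = 1000 and n = 100 (sizes chosen only for the arithmetic). The m > n case applies:

```latex
\frac{2}{3}\,n^{2}\,(3m - n)
  = \frac{2}{3}\cdot 100^{2}\cdot(3\cdot 1000 - 100)
  = \frac{2}{3}\cdot 10^{4}\cdot 2900
  \approx 1.93\times 10^{7}\ \text{floating-point operations.}
```

For a square matrix with m = n = 100, the first case gives (4/3)·100^3 ≈ 1.33 × 10^6 operations.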
(The columns of the computed X are the least squares solution vectors x.)
To compute the elements of Q explicitly, call orgqr (for real matrices) or ungqr (for complex matrices).
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?geqrt
Computes a blocked QR factorization of a general real
or complex matrix using the compact WY
representation of Q.
Syntax
call sgeqrt(m, n, nb, a, lda, t, ldt, work, info)
call dgeqrt(m, n, nb, a, lda, t, ldt, work, info)
call cgeqrt(m, n, nb, a, lda, t, ldt, work, info)
call zgeqrt(m, n, nb, a, lda, t, ldt, work, info)
call geqrt(a, t, nb[, info])
Include Files
mkl.fi, lapack.f90
Description
The strictly lower triangular matrix V contains the elementary reflectors H(i) in the ith column below the
diagonal. For example, if m=5 and n=3, the matrix V is
    V = (  1          )
        ( v1   1      )
        ( v1  v2   1  )
        ( v1  v2  v3  )
        ( v1  v2  v3  )
where vi represents one of the vectors that define H(i). The vectors are returned in the lower triangular part
of array a.
NOTE
The 1s along the diagonal of V are not stored in a.
Let k = min(m,n). The number of blocks is b = ceiling(k/nb), where each block is of order nb except for
the last block, which is of order ib = k - (b-1)*nb. For each of the b blocks, an upper triangular block
reflector factor is computed: t1, t2, ..., tb. The nb-by-nb (and ib-by-ib for the last block) ts are stored
in the nb-by-n array t as t = (t1 t2 ... tb).
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info < 0 and info = -i, the ith argument had an illegal value.
?gemqrt
Multiplies a general matrix by the orthogonal/unitary
matrix Q of the QR factorization formed by ?geqrt.
Syntax
call sgemqrt(side, trans, m, n, k, nb, v, ldv, t, ldt, c, ldc, work, info)
call dgemqrt(side, trans, m, n, k, nb, v, ldv, t, ldt, c, ldc, work, info)
call cgemqrt(side, trans, m, n, k, nb, v, ldv, t, ldt, c, ldc, work, info)
call zgemqrt(side, trans, m, n, k, nb, v, ldv, t, ldt, c, ldc, work, info)
call gemqrt( v, t, c, k, nb[, trans][, side][, info])
Include Files
mkl.fi, lapack.f90
Description
The ?gemqrt routine overwrites the general real or complex m-by-n matrix C with Q*C, QT*C, or QH*C (if
side = 'L'), or with C*Q, C*QT, or C*QH (if side = 'R'), as selected by trans,
where Q is a real orthogonal (complex unitary) matrix defined as the product of k elementary reflectors
Q = H(1) H(2)... H(k) = I - V*T*VT for real flavors, and
Q = H(1) H(2)... H(k) = I - V*T*VH for complex flavors,
generated using the compact WY representation as returned by geqrt. Q is of order m if side = 'L' and of
order n if side = 'R'.
Input Parameters
side CHARACTER
='L': apply Q, QT, or QH from the left.
='R': apply Q, QT, or QH from the right.
trans CHARACTER
='N', no transpose, apply Q.
='T', transpose, apply QT.
='C', conjugate transpose, apply QH.
nb INTEGER.
The block size used for the storage of t, k ≥ nb ≥ 1. This must be the same
value of nb used to generate t in geqrt.
The ith column must contain the vector which defines the elementary
reflector H(i), for i = 1,2,...,k, as returned by geqrt in the first k columns of
its array argument a.
ldt INTEGER. The leading dimension of the array t. ldt must be at least nb.
ldc INTEGER. The leading dimension of the array c. ldc must be at least
max(1, m).
Output Parameters
info INTEGER.
= 0: the execution is successful.
< 0: if info = -i, the ith argument had an illegal value.
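The relationship between ?geqrt and ?gemqrt (same nb, same t, reflectors taken from a) can be sketched as below. The example assumes the double-precision routines dgeqrt and dgemqrt with the argument orders listed in their Syntax sections, and workspace sizes of nb*n for dgeqrt and nb*(number of columns of C) for dgemqrt with side = 'L', which are assumptions taken from the reference LAPACK conventions. The program name, sizes, and data are hypothetical.

```fortran
program geqrt_gemqrt_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 8, n = 5, nb = 2, nrhs = 3
  integer, parameter :: k = min(m, n)
  real(dp) :: a(m, n), t(nb, k), c(m, nrhs), c0(m, nrhs)
  real(dp) :: work_qr(nb * n), work_mq(nb * nrhs)
  integer :: info, i, j
  external :: dgeqrt, dgemqrt

  ! Arbitrary test data for A and C.
  do j = 1, n
    do i = 1, m
      a(i, j) = cos(real(i * j, dp))
    end do
  end do
  do j = 1, nrhs
    do i = 1, m
      c0(i, j) = real(i + j, dp)
    end do
  end do
  c = c0

  ! Blocked QR: reflectors are stored below the diagonal of a, and the
  ! triangular block factors t1, t2, ..., tb are stored side by side in t.
  call dgeqrt(m, n, nb, a, m, t, nb, work_qr, info)
  if (info /= 0) error stop 'dgeqrt failed'

  ! Apply QT and then Q from the left, using the same nb and t.
  call dgemqrt('L', 'T', m, nrhs, k, nb, a, m, t, nb, c, m, work_mq, info)
  if (info /= 0) error stop 'dgemqrt (T) failed'
  call dgemqrt('L', 'N', m, nrhs, k, nb, a, m, t, nb, c, m, work_mq, info)
  if (info /= 0) error stop 'dgemqrt (N) failed'

  ! Q*(QT*C) should reproduce the original C up to rounding.
  if (maxval(abs(c - c0)) > 1.0e-10_dp) error stop 'round trip failed'
  print '(a)', 'Q*(QT*C) reproduced C'
end program geqrt_gemqrt_sketch
```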
?geqpf
Computes the QR factorization of a general m-by-n
matrix with pivoting.
Syntax
call sgeqpf(m, n, a, lda, jpvt, tau, work, info)
call dgeqpf(m, n, a, lda, jpvt, tau, work, info)
call cgeqpf(m, n, a, lda, jpvt, tau, work, rwork, info)
call zgeqpf(m, n, a, lda, jpvt, tau, work, rwork, info)
call geqpf(a, jpvt [,tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine is deprecated and has been replaced by routine geqp3.
The routine ?geqpf forms the QR factorization of a general m-by-n matrix A with column pivoting: A*P =
Q*R (see Orthogonal Factorizations). Here P denotes an n-by-n permutation matrix.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors. Routines are provided to work with Q in this representation.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed factorization is the exact factorization of a matrix A + E, where
||E||_2 = O(ε)||A||_2 and ε is the machine precision.
The approximate number of floating-point operations for real flavors is
(4/3)n^3 if m = n,
(2/3)n^2(3m-n) if m > n,
(2/3)m^2(3n-m) if m < n.
(The columns of the computed X are the permuted least squares solution vectors x; the output array jpvt
specifies the permutation order.)
To compute the elements of Q explicitly, call orgqr (for real matrices) or ungqr (for complex matrices).
?geqp3
Computes the QR factorization of a general m-by-n
matrix with column pivoting using level 3 BLAS.
Syntax
call sgeqp3(m, n, a, lda, jpvt, tau, work, lwork, info)
call dgeqp3(m, n, a, lda, jpvt, tau, work, lwork, info)
call cgeqp3(m, n, a, lda, jpvt, tau, work, lwork, rwork, info)
call zgeqp3(m, n, a, lda, jpvt, tau, work, lwork, rwork, info)
call geqp3(a, jpvt [,tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the QR factorization of a general m-by-n matrix A with column pivoting: A*P = Q*R (see
Orthogonal Factorizations) using Level 3 BLAS. Here P denotes an n-by-n permutation matrix. Use this
routine instead of geqpf for better performance.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors. Routines are provided to work with Q in this representation.
Input Parameters
lwork INTEGER. The size of the work array; must be at least max(1, 3*n+1) for
real flavors, and at least max(1, n+1) for complex flavors.
jpvt INTEGER.
Array, size at least max(1, n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
To solve a set of least squares problems minimizing ||A*x - b||_2 for all columns b of a given matrix B, you
can call the following:
?geqp3 (this routine) to factorize A*P = Q*R;
(The columns of the computed X are the permuted least squares solution vectors x; the output array jpvt
specifies the permutation order.)
To compute the elements of Q explicitly, call orgqr (for real matrices) or ungqr (for complex matrices).
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork value (one no less than the minimal value
described), the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
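The least squares procedure outlined in these Application Notes might look like the following sketch for real double-precision data. It assumes the steps dgeqp3 (factorize A*P = Q*R), dormqr (form C = QT*B), and the BLAS routine dtrsm (solve R*Y = C), followed by undoing the column permutation with jpvt; the choice of dormqr and dtrsm for the later steps is an assumption, since those steps are not listed in the text above. The program name, sizes, and data are hypothetical.

```fortran
program geqp3_least_squares_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 5, n = 3, nrhs = 1
  real(dp) :: a(m, n), a0(m, n), b(m, nrhs), x(n), xtrue(n)
  real(dp) :: tau(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: jpvt(n), lwork, info, i, j
  external :: dgeqp3, dormqr, dtrsm

  ! Well-conditioned full-rank test matrix and a consistent right-hand side.
  do j = 1, n
    do i = 1, m
      a0(i, j) = 0.1_dp / real(i + j, dp)
    end do
    a0(j, j) = a0(j, j) + 1.0_dp
  end do
  xtrue = [1.0_dp, -2.0_dp, 3.0_dp]
  b(:, 1) = matmul(a0, xtrue)
  a = a0

  ! jpvt(j) = 0 marks column j as free to be pivoted.
  jpvt = 0
  call dgeqp3(m, n, a, m, jpvt, tau, wquery, -1, info)
  if (info /= 0) error stop 'dgeqp3 query failed'
  lwork = max(3 * n + 1, int(wquery(1)))
  allocate(work(lwork))
  call dgeqp3(m, n, a, m, jpvt, tau, work, lwork, info)
  if (info /= 0) error stop 'dgeqp3 failed'

  ! C = QT*B, overwriting b. Grow the workspace if dormqr asks for more.
  call dormqr('L', 'T', m, nrhs, n, a, m, tau, b, m, wquery, -1, info)
  if (info /= 0) error stop 'dormqr query failed'
  if (int(wquery(1)) > lwork) then
    deallocate(work)
    lwork = int(wquery(1))
    allocate(work(lwork))
  end if
  call dormqr('L', 'T', m, nrhs, n, a, m, tau, b, m, work, lwork, info)
  if (info /= 0) error stop 'dormqr failed'

  ! Solve R*Y = C(1:n,:) with the upper triangle of a; Y overwrites b(1:n,:).
  call dtrsm('L', 'U', 'N', 'N', n, nrhs, 1.0_dp, a, m, b, m)

  ! Undo the column permutation: x = P*y, that is, x(jpvt(j)) = y(j).
  do j = 1, n
    x(jpvt(j)) = b(j, 1)
  end do

  if (maxval(abs(x - xtrue)) > 1.0e-10_dp) error stop 'unexpected solution'
  print '(a, 3f10.6)', 'x = ', x
end program geqp3_least_squares_sketch
```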
?orgqr
Generates the real orthogonal matrix Q of the QR
factorization formed by ?geqrf.
Syntax
call sorgqr(m, n, k, a, lda, tau, work, lwork, info)
call dorgqr(m, n, k, a, lda, tau, work, lwork, info)
call orgqr(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the m-by-m orthogonal matrix Q of the QR factorization formed by
the routines geqrf or geqpf. Use this routine after a call to sgeqrf/dgeqrf or sgeqpf/dgeqpf.
Usually Q is determined from the QR factorization of an m-by-p matrix A with m ≥ p. To compute the whole
matrix Q, use:
To compute the leading p columns of Q (which form an orthonormal basis in the space spanned by the
columns of A):
To compute the leading k columns of Qk (which form an orthonormal basis in the space spanned by leading k
columns of the matrix A):
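One of the cases described above, computing the leading p columns of Q, can be sketched as follows. The example assumes dgeqrf followed by dorgqr called with (m, p, p, ...); that argument choice is an assumption based on the meaning of the m, n, and k arguments, and the program name, sizes, and data are hypothetical. Computing the whole m-by-m matrix Q would instead need an array with m columns and a call with (m, m, p, ...).

```fortran
program orgqr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 6, p = 3
  real(dp) :: a(m, p), qtq(p, p), tau(p), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info, i, j
  external :: dgeqrf, dorgqr

  ! Arbitrary m-by-p test matrix with m >= p.
  do j = 1, p
    do i = 1, m
      a(i, j) = cos(real(i * j, dp))
    end do
  end do

  ! QR factorization; reflectors are left in a and tau.
  call dgeqrf(m, p, a, m, tau, wquery, -1, info)
  if (info /= 0) error stop 'dgeqrf query failed'
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))
  call dgeqrf(m, p, a, m, tau, work, lwork, info)
  if (info /= 0) error stop 'dgeqrf failed'

  ! Leading p columns of Q, overwriting a (assumed call pattern: n = k = p).
  call dorgqr(m, p, p, a, m, tau, wquery, -1, info)
  if (info /= 0) error stop 'dorgqr query failed'
  if (int(wquery(1)) > lwork) then
    deallocate(work)
    lwork = int(wquery(1))
    allocate(work(lwork))
  end if
  call dorgqr(m, p, p, a, m, tau, work, lwork, info)
  if (info /= 0) error stop 'dorgqr failed'

  ! The columns should be orthonormal: QT*Q - I should be near zero.
  qtq = matmul(transpose(a), a)
  do i = 1, p
    qtq(i, i) = qtq(i, i) - 1.0_dp
  end do
  if (maxval(abs(qtq)) > 1.0e-12_dp) error stop 'columns not orthonormal'
  print '(a)', 'leading p columns of Q are orthonormal'
end program orgqr_sketch
```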
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or restorable arguments, see LAPACK 95
Interface Conventions.
Specific details for the routine orgqr interface are the following:
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork value (one no less than the minimal value
described), the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed Q differs from an exactly orthogonal matrix by a matrix E such that
||E||_2 = O(ε)*||A||_2, where ε is the machine precision.
The total number of floating-point operations is approximately 4*m*n*k - 2*(m + n)*k^2 + (4/3)*k^3.
?ormqr
Multiplies a real matrix by the orthogonal matrix Q of
the QR factorization formed by ?geqrf or ?geqpf.
Syntax
call sormqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormqr(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real matrix C by Q or QT, where Q is the orthogonal matrix Q of the QR factorization
formed by the routines geqrf or geqpf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, QT*C,
C*Q, or C*QT (overwriting the result on C).
Input Parameters
Output Parameters
c Overwritten by the product Q*C, QT*C, C*Q, or C*QT (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork value (one no less than the minimal value
described), the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The complex counterpart of this routine is unmqr.
?gemqr
Multiplies a matrix C by a real orthogonal or complex
unitary matrix Q, as computed by ?geqr, with best
performance for tall and skinny matrices.
Syntax
call sgemqr(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call dgemqr(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call cgemqr(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call zgemqr(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
Description
The ?gemqr routine multiplies an m-by-n matrix C by Op(Q), where matrix Q is the factor from the QR
factorization of matrix A formed by ?geqr, and
Op(Q) = Q, or
Op(Q) = QT, or
Op(Q) = QH.
NOTE
You must use ?geqr for QR factorization before calling ?gemqr. ?gemqr is not compatible with QR
factorization routines other than ?geqr.
Input Parameters
side CHARACTER*1.
If side = 'L': apply Op(Q) from the left.
trans CHARACTER*1.
If trans = 'N': No transpose, Op(Q) = Q.
if side = 'R', n ≥ k ≥ 0.
DOUBLE PRECISION for dgemqr
COMPLEX for cgemqr
COMPLEX*16 for zgemqr
Array, size (lda,k).
Output Parameters
info INTEGER.
info = 0 indicates a successful exit.
info < 0: if info = -i, the i-th argument had an illegal value.
?ungqr
Generates the complex unitary matrix Q of the QR
factorization formed by ?geqrf.
Syntax
call cungqr(m, n, k, a, lda, tau, work, lwork, info)
call zungqr(m, n, k, a, lda, tau, work, lwork, info)
call ungqr(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the m-by-m unitary matrix Q of the QR factorization formed by the
routines geqrf or geqpf. Use this routine after a call to cgeqrf/zgeqrf or cgeqpf/zgeqpf.
Usually Q is determined from the QR factorization of an m-by-p matrix A with m ≥ p. To compute the whole
matrix Q, use:
To compute the leading p columns of Q (which form an orthonormal basis in the space spanned by the
columns of A):
To compute the matrix Qk of the QR factorization of the leading k columns of the matrix A:
To compute the leading k columns of Qk (which form an orthonormal basis in the space spanned by the
leading k columns of the matrix A):
Input Parameters
The size of tau must be at least max(1, k).
work is a workspace array, its dimension max(1, lwork).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed Q differs from an exactly unitary matrix by a matrix E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
The total number of floating-point operations is approximately 16*m*n*k - 8*(m + n)*k^2 + (16/3)*k^3.
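For complex data, forming the leading columns of the unitary factor might look like the sketch below. It assumes the double-complex routines zgeqrf and zungqr with the argument orders shown in their Syntax sections, and the (m, p, p, ...) call pattern for the leading p columns is an assumption based on the argument meanings. The program name, sizes, and data are hypothetical.

```fortran
program ungqr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 6, p = 3
  complex(dp) :: a(m, p), qhq(p, p), tau(p), wquery(1)
  complex(dp), allocatable :: work(:)
  integer :: lwork, info, i, j
  external :: zgeqrf, zungqr

  ! Arbitrary complex m-by-p test matrix with m >= p.
  do j = 1, p
    do i = 1, m
      a(i, j) = cmplx(cos(real(i * j, dp)), sin(real(i + j, dp)), dp)
    end do
  end do

  ! QR factorization; the workspace size is returned as a complex value.
  call zgeqrf(m, p, a, m, tau, wquery, -1, info)
  if (info /= 0) error stop 'zgeqrf query failed'
  lwork = max(1, int(real(wquery(1))))
  allocate(work(lwork))
  call zgeqrf(m, p, a, m, tau, work, lwork, info)
  if (info /= 0) error stop 'zgeqrf failed'

  ! Leading p columns of the unitary matrix Q, overwriting a.
  call zungqr(m, p, p, a, m, tau, wquery, -1, info)
  if (info /= 0) error stop 'zungqr query failed'
  if (int(real(wquery(1))) > lwork) then
    deallocate(work)
    lwork = int(real(wquery(1)))
    allocate(work(lwork))
  end if
  call zungqr(m, p, p, a, m, tau, work, lwork, info)
  if (info /= 0) error stop 'zungqr failed'

  ! QH*Q - I should be near zero for orthonormal columns.
  qhq = matmul(conjg(transpose(a)), a)
  do i = 1, p
    qhq(i, i) = qhq(i, i) - (1.0_dp, 0.0_dp)
  end do
  if (maxval(abs(qhq)) > 1.0e-12_dp) error stop 'columns not orthonormal'
  print '(a)', 'QH*Q is the identity up to rounding'
end program ungqr_sketch
```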
?unmqr
Multiplies a complex matrix by the unitary matrix Q of
the QR factorization formed by ?geqrf.
Syntax
call cunmqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmqr(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a rectangular complex matrix C by Q or QH, where Q is the unitary matrix Q of the QR
factorization formed by the routines geqrf or geqpf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, QH*C,
C*Q, or C*QH (overwriting the result on C).
Input Parameters
The size of tau must be at least max(1, k).
c(ldc,*) contains the m-by-n matrix C.
Output Parameters
c Overwritten by the product Q*C, QH*C, C*Q, or C*QH (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if side
= 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The real counterpart of this routine is ormqr.
?gelqf
Computes the LQ factorization of a general m-by-n
matrix.
Syntax
call sgelqf(m, n, a, lda, tau, work, lwork, info)
call dgelqf(m, n, a, lda, tau, work, lwork, info)
call cgelqf(m, n, a, lda, tau, work, lwork, info)
call zgelqf(m, n, a, lda, tau, work, lwork, info)
call gelqf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the LQ factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors. Routines are provided to work with Q in this representation.
NOTE
This routine supports the Progress Routine feature. See Progress Function section for details.
Input Parameters
n INTEGER. The number of columns in A (n ≥ 0).
lwork INTEGER. The size of the work array; at least max(1, m).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
tau Contains scalars that define elementary reflectors for the matrix Q (see
Orthogonal Factorizations).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = m*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork value (one no less than the minimal value
described), the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed factorization is the exact factorization of a matrix A + E, where
||E||_2 = O(ε)||A||_2 and ε is the machine precision.
The approximate number of floating-point operations for real flavors is
(4/3)n^3 if m = n,
(2/3)n^2(3m-n) if m > n,
(2/3)m^2(3n-m) if m < n.
(The columns of the computed X are the minimum-norm solution vectors x. Here A is an m-by-n matrix with
m < n; Q1 denotes the first m columns of Q).
To compute the elements of Q explicitly, call orglq (for real matrices) or unglq (for complex matrices).
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?gelq
Computes an LQ factorization of a general matrix.
Syntax
call sgelq(m, n, a, lda, t, tsize, work, lwork, info)
call dgelq(m, n, a, lda, t, tsize, work, lwork, info)
call cgelq(m, n, a, lda, t, tsize, work, lwork, info)
call zgelq(m, n, a, lda, t, tsize, work, lwork, info)
Description
The ?gelq routine computes an LQ factorization of an m-by-n matrix A. If the matrix is short and wide (n is
substantially larger than m), a highly scalable algorithm is used to avoid communication overhead.
NOTE
The internal format of the elementary reflectors generated by ?gelq is only compatible with the
?gemlq routine and not with any other LQ routines.
NOTE
An optimized version of ?gelq is not available.
Input Parameters
tsize INTEGER. If tsize ≥ 5, the size of the array t. If tsize = -1 or tsize = -2,
then the routine performs a workspace query. The routine calculates the
sizes required for the t and work arrays and returns these values as the
first entries of the t and work arrays, without issuing any error message
related to t or work by xerbla.
If tsize = -1, the routine calculates the optimal size of t for optimum
performance and returns this value in t(1).
If tsize = -2, the routine calculates the minimum size required for t and
returns this value in t(1).
lwork INTEGER. The size of the array work. If lwork = -1 or lwork = -2, then the
routine performs a workspace query. The routine only calculates the sizes of
the t and work arrays and returns these values as the first entries of the t
and work arrays, without issuing any error message related to t or work by
xerbla.
If lwork = -1, the routine calculates the optimal size of work for optimum
performance and returns this value in work(1).
If lwork = -2, the routine calculates the minimum size required for work
and returns this value in work(1).
Output Parameters
t If info = 0, t(1) returns the optimal value for tsize. You can specify that
it return the minimum required value for tsize instead - see the tsize
description for details. The remaining entries of t contain part of the data
structure used to represent Q. To apply or construct Q, you need to retain a
and t and pass them to other routines.
work If info = 0, work(1) contains the optimal value for lwork. You can specify
that it return the minimum required value for lwork instead - see the
lwork description for details.
info INTEGER.
info = 0 indicates a successful exit.
info < 0: if info = -i, the i-th argument had an illegal value.
See Also
?gemlq Multiplies a matrix C by a real orthogonal or complex unitary matrix Q, as computed by
?gelq.
?orglq
Generates the real orthogonal matrix Q of the LQ
factorization formed by ?gelqf.
Syntax
call sorglq(m, n, k, a, lda, tau, work, lwork, info)
call dorglq(m, n, k, a, lda, tau, work, lwork, info)
call orglq(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the n-by-n orthogonal matrix Q of the LQ factorization formed by the
routine gelqf. Use this routine after a call to sgelqf/dgelqf.
Usually Q is determined from the LQ factorization of a p-by-n matrix A with n ≥ p. To compute the whole
matrix Q, use:
To compute the leading p rows of Q, which form an orthonormal basis in the space spanned by the rows of A,
use:
To compute the leading k rows of Qk, which form an orthonormal basis in the space spanned by the leading k
rows of A, use:
Input Parameters
lwork INTEGER. The size of the work array; at least max(1, m).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = m*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork value (one no less than the minimal value
described), the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed Q differs from an exactly orthogonal matrix by a matrix E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
The total number of floating-point operations is approximately 4*m*n*k - 2*(m + n)*k^2 + (4/3)*k^3.
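A minimal sketch of the LQ workflow follows, assuming dgelqf to factorize a p-by-n matrix (p ≤ n) and dorglq called with (p, n, p, ...) to form the leading p rows of Q; that call pattern is an assumption based on the argument meanings. The program name, sizes, and data are hypothetical. The check reconstructs A as L*Q and verifies that the rows of Q are orthonormal.

```fortran
program gelqf_orglq_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: p = 3, n = 6
  real(dp) :: a(p, n), a0(p, n), l(p, p), qqt(p, p), tau(p), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info, i, j
  external :: dgelqf, dorglq

  ! Arbitrary short-and-wide test matrix.
  do j = 1, n
    do i = 1, p
      a0(i, j) = 1.0_dp / real(i + j, dp)
    end do
  end do
  do i = 1, p
    a0(i, i) = a0(i, i) + 1.0_dp
  end do
  a = a0

  ! LQ factorization: L ends up in the lower triangle of a(1:p, 1:p).
  call dgelqf(p, n, a, p, tau, wquery, -1, info)
  if (info /= 0) error stop 'dgelqf query failed'
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))
  call dgelqf(p, n, a, p, tau, work, lwork, info)
  if (info /= 0) error stop 'dgelqf failed'

  ! Save L before a is overwritten with the rows of Q.
  l = 0.0_dp
  do j = 1, p
    do i = j, p
      l(i, j) = a(i, j)
    end do
  end do

  ! Leading p rows of Q, overwriting a (assumed call pattern: m = k = p).
  call dorglq(p, n, p, a, p, tau, wquery, -1, info)
  if (info /= 0) error stop 'dorglq query failed'
  if (int(wquery(1)) > lwork) then
    deallocate(work)
    lwork = int(wquery(1))
    allocate(work(lwork))
  end if
  call dorglq(p, n, p, a, p, tau, work, lwork, info)
  if (info /= 0) error stop 'dorglq failed'

  ! Checks: L*Q reproduces A, and the rows of Q are orthonormal.
  if (maxval(abs(matmul(l, a) - a0)) > 1.0e-12_dp) error stop 'L*Q /= A'
  qqt = matmul(a, transpose(a))
  do i = 1, p
    qqt(i, i) = qqt(i, i) - 1.0_dp
  end do
  if (maxval(abs(qqt)) > 1.0e-12_dp) error stop 'rows not orthonormal'
  print '(a)', 'L*Q reproduced A; rows of Q are orthonormal'
end program gelqf_orglq_sketch
```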
?ormlq
Multiplies a real matrix by the orthogonal matrix Q of
the LQ factorization formed by ?gelqf.
Syntax
call sormlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormlq(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real m-by-n matrix C by Q or QT, where Q is the orthogonal matrix Q of the LQ
factorization formed by the routine gelqf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, QT*C,
C*Q, or C*QT (overwriting the result on C).
Input Parameters
Output Parameters
c Overwritten by the product Q*C, QT*C, C*Q, or C*QT (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
trans Must be 'N' or 'T'. The default value is 'N'.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork value (one no less than the minimal value
described), the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The complex counterpart of this routine is unmlq.
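The workspace query described in the Application Notes can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the FORTRAN 77 interfaces dgelqf and dormlq are provided by the linked library, and the module and subroutine names (lq_apply_demo, apply_qt_left) are hypothetical.

```fortran
module lq_apply_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Factor a = L*Q with dgelqf, then overwrite c with Q^T*c using dormlq.
  ! Each routine is first called with lwork = -1 (workspace query) and then
  ! called again with a work array of the size returned in the first element.
  subroutine apply_qt_left(a, c, info)
    real(dp), intent(inout) :: a(:,:)  ! p-by-m input; reflectors on exit
    real(dp), intent(inout) :: c(:,:)  ! m-by-n input; Q^T*c on exit
    integer,  intent(out)   :: info
    real(dp), allocatable   :: tau(:), work(:)
    real(dp) :: query(1)
    integer  :: p, m, n, k, lwork
    external :: dgelqf, dormlq

    p = size(a, 1)
    m = size(a, 2)
    n = size(c, 2)
    k = min(p, m)                      ! number of elementary reflectors
    if (size(c, 1) /= m .or. k < 1) then
      info = -1                        ! sketch-level check, not a LAPACK code
      return
    end if
    allocate(tau(k))

    ! Workspace query for the factorization: only query(1) is set.
    call dgelqf(p, m, a, p, tau, query, -1, info)
    if (info /= 0) return
    lwork = max(1, int(query(1)))
    allocate(work(lwork))
    call dgelqf(p, m, a, p, tau, work, lwork, info)
    if (info /= 0) return
    deallocate(work)

    ! Workspace query for the multiplication, then the actual call.
    call dormlq('L', 'T', m, n, k, a, p, tau, c, m, query, -1, info)
    if (info /= 0) return
    lwork = max(1, int(query(1)))
    allocate(work(lwork))
    call dormlq('L', 'T', m, n, k, a, p, tau, c, m, work, lwork, info)
  end subroutine apply_qt_left
end module lq_apply_demo
```

A caller would `use lq_apply_demo` and link against Intel MKL or another LAPACK implementation. Querying separately for each routine avoids guessing a blocksize.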
?gemlq
Multiplies a matrix C by a real orthogonal or complex
unitary matrix Q, as computed by ?gelq.
Syntax
call sgemlq(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call dgemlq(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call cgemlq(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call zgemlq(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
Description
The ?gemlq routine multiplies an m-by-n matrix C by Op(Q), where matrix Q is the factor from the LQ
factorization of matrix A formed by ?gelq, and
Op(Q) = Q, or
Op(Q) = Q^T, or
Op(Q) = Q^H.
NOTE
You must use ?gelq for LQ factorization before calling ?gemlq. ?gemlq is not compatible with LQ
factorization routines other than ?gelq.
NOTE
An optimized version of ?gemlq is not available.
Input Parameters
side CHARACTER*1.
If side = 'L': apply Op(Q) from the left.
trans CHARACTER*1.
If trans = 'N': No transpose, Op(Q) = Q.
if side = 'R', n ≥ k ≥ 0.
lwork INTEGER. The size of the array work.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array and returns this value as
work(1); no error message related to lwork is issued by xerbla.
Output Parameters
info INTEGER.
info = 0 indicates a successful exit.
info < 0: if info = -i, the i-th argument had an illegal value.
?unglq
Generates the complex unitary matrix Q of the LQ
factorization formed by ?gelqf.
Syntax
call cunglq(m, n, k, a, lda, tau, work, lwork, info)
call zunglq(m, n, k, a, lda, tau, work, lwork, info)
call unglq(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the n-by-n unitary matrix Q of the LQ factorization formed by the
routine gelqf. Use this routine after a call to cgelqf/zgelqf.
Usually Q is determined from the LQ factorization of a p-by-n matrix A with n ≥ p. To compute the whole
matrix Q, use:
call ?unglq(n, n, p, a, lda, tau, work, lwork, info)
To compute the leading p rows of Q, which form an orthonormal basis in the space spanned by the rows of A,
use:
call ?unglq(p, n, p, a, lda, tau, work, lwork, info)
To compute the leading k rows of Q^k, which form an orthonormal basis in the space spanned by the leading k
rows of A, use:
call ?unglq(k, n, k, a, lda, tau, work, lwork, info)
Input Parameters
lwork INTEGER. The size of the work array; at least max(1, m).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = m*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed Q differs from an exactly unitary matrix by a matrix E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
The total number of floating-point operations is approximately 16*m*n*k - 8*(m + n)*k^2 + (16/3)*k^3.
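As a small sketch of how the operation count above might be evaluated, the following function computes 16*m*n*k - 8*(m + n)*k^2 + (16/3)*k^3 in double precision. The module and function names (unglq_flops_demo, unglq_flops) are hypothetical and are not part of the library.

```fortran
module unglq_flops_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Approximate floating-point operation count for ?unglq:
  !   16*m*n*k - 8*(m + n)*k^2 + (16/3)*k^3
  ! The arguments are converted to double precision first so that large
  ! dimensions do not overflow default integer arithmetic.
  pure function unglq_flops(m, n, k) result(flops)
    integer, intent(in) :: m, n, k
    real(dp) :: flops, rm, rn, rk
    rm = real(m, dp)
    rn = real(n, dp)
    rk = real(k, dp)
    flops = 16.0_dp*rm*rn*rk - 8.0_dp*(rm + rn)*rk**2 + (16.0_dp/3.0_dp)*rk**3
  end function unglq_flops
end module unglq_flops_demo
```

For m = n = k the expression reduces to (16/3)*n^3, so unglq_flops(3, 3, 3) evaluates to 144.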
?unmlq
Multiplies a complex matrix by the unitary matrix Q of
the LQ factorization formed by ?gelqf.
Syntax
call cunmlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmlq(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex m-by-n matrix C by Q or Q^H, where Q is the unitary matrix of the LQ
factorization formed by the routine gelqf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^H*C,
C*Q, or C*Q^H (overwriting C with the result).
Input Parameters
Output Parameters
c Overwritten by the product Q*C, Q^H*C, C*Q, or C*Q^H (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or restorable arguments, see LAPACK 95
Interface Conventions.
Specific details for the routine unmlq interface are the following:
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The real counterpart of this routine is ormlq.
?geqlf
Computes the QL factorization of a general m-by-n
matrix.
Syntax
call sgeqlf(m, n, a, lda, tau, work, lwork, info)
call dgeqlf(m, n, a, lda, tau, work, lwork, info)
call cgeqlf(m, n, a, lda, tau, work, lwork, info)
call zgeqlf(m, n, a, lda, tau, work, lwork, info)
call geqlf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the QL factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors. Routines are provided to work with Q in this representation.
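For orientation, the factorization and the reflector representation can be written out as below. This is a sketch following the convention of the reference LAPACK description of ?geqlf, shown for the case m ≥ n; the symbols τ_i and v_i (the scalar factor and vector of the i-th reflector) are notation introduced here, and for real flavors the conjugate transpose reduces to the ordinary transpose.

```latex
A = Q \begin{pmatrix} 0 \\ L \end{pmatrix} \quad (m \ge n), \qquad
Q = H(k) \cdots H(2)\, H(1), \quad k = \min(m, n), \qquad
H(i) = I - \tau_i\, v_i\, v_i^{H}
```

Here L is an n-by-n lower triangular matrix.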
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
lwork INTEGER. The size of the work array; at least max(1, n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork (no less than the minimal value
described), the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
Related routines include:
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?orgql
Generates the real matrix Q of the QL factorization
formed by ?geqlf.
Syntax
call sorgql(m, n, k, a, lda, tau, work, lwork, info)
call dorgql(m, n, k, a, lda, tau, work, lwork, info)
call orgql(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates an m-by-n real matrix Q with orthonormal columns, which is defined as the last n
columns of a product of k elementary reflectors H(i) of order m: Q = H(k)*...*H(2)*H(1) as returned
by the routine geqlf. Use this routine after a call to sgeqlf/dgeqlf.
Input Parameters
lwork INTEGER. The size of the work array; at least max(1, n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork (no less than the minimal value
described), the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The complex counterpart of this routine is ungql.
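The usual sequence of a call to dgeqlf followed by a call to dorgql might look like the sketch below, which forms Q explicitly for an m-by-n matrix with m ≥ n and uses a workspace query before each call. It assumes the FORTRAN 77 interfaces dgeqlf and dorgql from the linked library; the module and subroutine names (ql_form_q_demo, form_q_from_ql) are hypothetical.

```fortran
module ql_form_q_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Overwrite the m-by-n matrix a (m >= n) with the factor Q of its QL
  ! factorization: dgeqlf computes the reflectors, dorgql expands them
  ! into an m-by-n matrix with orthonormal columns.
  subroutine form_q_from_ql(a, info)
    real(dp), intent(inout) :: a(:,:)
    integer,  intent(out)   :: info
    real(dp), allocatable   :: tau(:), work(:)
    real(dp) :: query(1)
    integer  :: m, n, lwork
    external :: dgeqlf, dorgql

    m = size(a, 1)
    n = size(a, 2)
    if (n < 1 .or. m < n) then
      info = -1                  ! sketch-level check: only m >= n >= 1
      return
    end if
    allocate(tau(n))

    ! Workspace query, then the factorization itself.
    call dgeqlf(m, n, a, m, tau, query, -1, info)
    if (info /= 0) return
    lwork = max(1, int(query(1)))
    allocate(work(lwork))
    call dgeqlf(m, n, a, m, tau, work, lwork, info)
    if (info /= 0) return
    deallocate(work)

    ! Workspace query, then generation of Q from all k = n reflectors.
    call dorgql(m, n, n, a, m, tau, query, -1, info)
    if (info /= 0) return
    lwork = max(1, int(query(1)))
    allocate(work(lwork))
    call dorgql(m, n, n, a, m, tau, work, lwork, info)
  end subroutine form_q_from_ql
end module ql_form_q_demo
```

Because dorgql overwrites the reflectors stored in a, copy the triangular factor L out of a beforehand if it is still needed.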
?ungql
Generates the complex matrix Q of the QL
factorization formed by ?geqlf.
Syntax
call cungql(m, n, k, a, lda, tau, work, lwork, info)
call zungql(m, n, k, a, lda, tau, work, lwork, info)
call ungql(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates an m-by-n complex matrix Q with orthonormal columns, which is defined as the last n
columns of a product of k elementary reflectors H(i) of order m: Q = H(k)*...*H(2)*H(1) as returned
by the routine geqlf. Use this routine after a call to cgeqlf/zgeqlf.
Input Parameters
lwork INTEGER. The size of the work array; at least max(1, n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
tau Holds the vector of length (k).
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The real counterpart of this routine is orgql.
?ormql
Multiplies a real matrix by the orthogonal matrix Q of
the QL factorization formed by ?geqlf.
Syntax
call sormql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormql(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real m-by-n matrix C by Q or Q^T, where Q is the orthogonal matrix of the QL
factorization formed by the routine geqlf.
Depending on the parameters side and trans, the routine ormql can form one of the matrix products Q*C,
Q^T*C, C*Q, or C*Q^T (overwriting C with the result).
Input Parameters
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork (no less than the minimal value
described), the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The complex counterpart of this routine is unmql.
?unmql
Multiplies a complex matrix by the unitary matrix Q of
the QL factorization formed by ?geqlf.
Syntax
call cunmql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmql(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex m-by-n matrix C by Q or Q^H, where Q is the unitary matrix of the QL
factorization formed by the routine geqlf.
Depending on the parameters side and trans, the routine unmql can form one of the matrix products Q*C,
Q^H*C, C*Q, or C*Q^H (overwriting C with the result).
Input Parameters
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
c Overwritten by the product Q*C, Q^H*C, C*Q, or C*Q^H (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
?gerqf
Computes the RQ factorization of a general m-by-n
matrix.
Syntax
call sgerqf(m, n, a, lda, tau, work, lwork, info)
call dgerqf(m, n, a, lda, tau, work, lwork, info)
call cgerqf(m, n, a, lda, tau, work, lwork, info)
call zgerqf(m, n, a, lda, tau, work, lwork, info)
call gerqf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the RQ factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors. Routines are provided to work with Q in this representation.
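For orientation, the factorization and the product of reflectors can be written out as below. This is a sketch following the convention of the reference LAPACK description of ?gerqf, shown for the case m ≤ n, with R an m-by-m upper triangular matrix.

```latex
A = \begin{pmatrix} 0 & R \end{pmatrix} Q \quad (m \le n), \qquad k = \min(m, n), \\
Q = H(1)\, H(2) \cdots H(k) \ \text{(real flavors)}, \qquad
Q = H(1)^{H}\, H(2)^{H} \cdots H(k)^{H} \ \text{(complex flavors)}
```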
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = m*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork (no less than the minimal value
described), the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
Related routines include:
See Also
mkl_progress
Matrix Storage Schemes for LAPACK Routines
?orgrq
Generates the real matrix Q of the RQ factorization
formed by ?gerqf.
Syntax
call sorgrq(m, n, k, a, lda, tau, work, lwork, info)
call dorgrq(m, n, k, a, lda, tau, work, lwork, info)
call orgrq(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates an m-by-n real matrix with orthonormal rows, which is defined as the last m rows of a
product of k elementary reflectors H(i) of order n: Q = H(1)*H(2)*...*H(k) as returned by the routine
gerqf. Use this routine after a call to sgerqf/dgerqf.
Input Parameters
On entry, the (m - k + i)-th row of a must contain the vector which defines
the elementary reflector H(i), for i = 1,2,...,k, as returned by sgerqf/
dgerqf in the last k rows of its array argument a;
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by sgerqf/dgerqf;
lwork INTEGER. The size of the work array; at least max(1, m).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = m*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork (no less than the minimal value
described), the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The complex counterpart of this routine is ungrq.
?ungrq
Generates the complex matrix Q of the RQ
factorization formed by ?gerqf.
Syntax
call cungrq(m, n, k, a, lda, tau, work, lwork, info)
call zungrq(m, n, k, a, lda, tau, work, lwork, info)
call ungrq(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates an m-by-n complex matrix with orthonormal rows, which is defined as the last m rows
of a product of k elementary reflectors H(i) of order n: Q = H(1)^H*H(2)^H*...*H(k)^H as returned by the
routine gerqf. Use this routine after a call to cgerqf/zgerqf.
Input Parameters
lwork INTEGER. The size of the work array; at least max(1, m).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = m*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The real counterpart of this routine is orgrq.
?ormrq
Multiplies a real matrix by the orthogonal matrix Q of
the RQ factorization formed by ?gerqf.
Syntax
call sormrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormrq(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real m-by-n matrix C by Q or Q^T, where Q is the real orthogonal matrix defined as a
product of k elementary reflectors H_i: Q = H_1*H_2*...*H_k as returned by the RQ factorization routine gerqf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^T*C,
C*Q, or C*Q^T (overwriting C with the result).
Input Parameters
tau(i) must contain the scalar factor of the elementary reflector H_i, as
returned by sgerqf/dgerqf.
lwork INTEGER. The size of the work array. Constraints:
lwork ≥ max(1, n) if side = 'L';
lwork ≥ max(1, m) if side = 'R'.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork (no less than the minimal value
described), the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The complex counterpart of this routine is unmrq.
?unmrq
Multiplies a complex matrix by the unitary matrix Q of
the RQ factorization formed by ?gerqf.
Syntax
call cunmrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmrq(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex m-by-n matrix C by Q or Q^H, where Q is the complex unitary matrix defined
as a product of k elementary reflectors H(i) of order n: Q = H(1)^H*H(2)^H*...*H(k)^H as returned by the
RQ factorization routine gerqf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^H*C,
C*Q, or C*Q^H (overwriting C with the result).
Input Parameters
On entry, the ith row of a must contain the vector which defines the
elementary reflector H(i), for i = 1,2,...,k, as returned by cgerqf/zgerqf in
the last k rows of its array argument a.
The second dimension of a must be at least max(1, m) if side = 'L', and
at least max(1, n) if side = 'R'.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by cgerqf/zgerqf.
Output Parameters
c Overwritten by the product Q*C, Q^H*C, C*Q, or C*Q^H (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The real counterpart of this routine is ormrq.
?tzrzf
Reduces the upper trapezoidal matrix A to upper
triangular form.
Syntax
call stzrzf(m, n, a, lda, tau, work, lwork, info)
call dtzrzf(m, n, a, lda, tau, work, lwork, info)
call ctzrzf(m, n, a, lda, tau, work, lwork, info)
call ztzrzf(m, n, a, lda, tau, work, lwork, info)
call tzrzf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces the m-by-n (m ≤ n) real/complex upper trapezoidal matrix A to upper triangular form by
means of orthogonal/unitary transformations. The upper trapezoidal matrix
A = [A1 A2] = [A(1:m, 1:m), A(1:m, m+1:n)] is factored as
A = [R 0]*Z,
where Z is an n-by-n orthogonal/unitary matrix, R is an m-by-m upper triangular matrix, and 0 is the
m-by-(n-m) zero matrix.
See larz, which applies an elementary reflector returned by ?tzrzf to a general matrix.
Input Parameters
The leading m-by-n upper trapezoidal part of the array a contains the
matrix A to be factorized.
The second dimension of a must be at least max(1, n).
work is a workspace array, its dimension max(1, lwork).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The factorization is obtained by Householder's method. The k-th transformation matrix, Z(k), which is used
to introduce zeros into the (m - k + 1)-th row of A, is given in the form
tau is a scalar and z(k) is an l-element vector. tau and z(k) are chosen to annihilate the elements of the k-th
row of A2.
The scalar tau is returned in the k-th element of tau and the vector u(k) in the k-th row of A, such that the
elements of z(k) are stored in a(k, m+1), ..., a(k, n).
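The form of Z(k) referred to in these notes can be written as below. This is a sketch following the convention of the reference LAPACK description of ?tzrzf, not a verbatim formula from this manual; for real flavors the conjugate transpose reduces to the ordinary transpose.

```latex
Z(k) = \begin{pmatrix} I & 0 \\ 0 & T(k) \end{pmatrix}, \qquad
T(k) = I - \tau\, u(k)\, u(k)^{H}, \qquad
u(k) = \begin{pmatrix} 1 \\ 0 \\ z(k) \end{pmatrix}, \qquad
Z = Z(1)\, Z(2) \cdots Z(m)
```

Here z(k) has l = n - m elements, so only the leading 1 and the trailing entries of u(k) are nonzero.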
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
Related routines include:
?ormrz
Multiplies a real matrix by the orthogonal matrix
defined from the factorization formed by ?tzrzf.
Syntax
call sormrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call dormrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call ormrz(a, tau, c, l [, side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The ?ormrz routine multiplies a real m-by-n matrix C by Q or QT, where Q is the real orthogonal matrix
defined as a product of k elementary reflectors H(i) of order n: Q = H(1)* H(2)*...*H(k) as returned by
the factorization routine tzrzf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, QT*C,
C*Q, or C*QT (overwriting the result over C).
The matrix Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
l INTEGER.
The number of columns of the matrix A containing the meaningful part of
the Householder reflectors. Constraints:
0 ≤ l ≤ m, if side = 'L';
0 ≤ l ≤ n, if side = 'R'.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by stzrzf/dtzrzf.
Output Parameters
c Overwritten by the product Q*C, QT*C, C*Q, or C*QT (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork that is no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The complex counterpart of this routine is unmrz.
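The workspace query described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the program name, the test data, and the variable names query and resid are assumptions made for the example. It factors an upper trapezoidal A with dtzrzf, then forms A*QT with dormrz (side = 'R', trans = 'T', k = m, l = n - m). Because A = (R 0)*Z, the trailing n - m columns of the product should be close to zero.

```fortran
program ormrz_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 3, n = 5           ! A is m-by-n with m <= n
  real(dp) :: a(m,n), c(m,n), tau(m), query(1), resid
  real(dp), allocatable :: work(:)
  integer :: i, j, lwork, info

  ! Upper trapezoidal test data (arbitrary values, assumed for this sketch).
  a = 0.0_dp
  do j = 1, n
    do i = 1, min(j, m)
      a(i,j) = real(i + 2*j, dp)
    end do
  end do
  c = a                                        ! keep a copy of the original A

  ! Workspace query (lwork = -1), then the factorization A = (R 0)*Z.
  call dtzrzf(m, n, a, m, tau, query, -1, info)
  lwork = max(1, int(query(1)))
  allocate(work(lwork))
  call dtzrzf(m, n, a, m, tau, work, lwork, info)
  if (info /= 0) stop 'dtzrzf failed'
  deallocate(work)

  ! Workspace query for dormrz, then overwrite c with c*Q**T.
  call dormrz('R', 'T', m, n, m, n - m, a, m, tau, c, m, query, -1, info)
  lwork = max(1, int(query(1)))
  allocate(work(lwork))
  call dormrz('R', 'T', m, n, m, n - m, a, m, tau, c, m, work, lwork, info)
  if (info /= 0) stop 'dormrz failed'

  ! The trailing n-m columns of A*Z**T should vanish up to rounding.
  resid = maxval(abs(c(:, m+1:n)))
  print '(a,es10.2)', 'max |trailing block| = ', resid
end program ormrz_workspace_query
```

Querying each routine separately keeps the allocation no larger than what that routine recommends; a single buffer sized to the larger of the two values would work as well.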
?unmrz
Multiplies a complex matrix by the unitary matrix
defined from the factorization formed by ?tzrzf.
Syntax
call cunmrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call zunmrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call unmrz(a, tau, c, l [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex m-by-n matrix C by Q or QH, where Q is the unitary matrix defined as a
product of k elementary reflectors H(i):
Input Parameters
l INTEGER.
The number of columns of the matrix A containing the meaningful part of
the Householder reflectors. Constraints:
0 ≤ l ≤ m, if side = 'L';
0 ≤ l ≤ n, if side = 'R'.
On entry, the ith row of a must contain the vector which defines the
elementary reflector H(i), for i = 1,2,...,k, as returned by ctzrzf/ztzrzf
in the last k rows of its array argument a.
The second dimension of a must be at least max(1, m) if side = 'L', and
at least max(1, n) if side = 'R'.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ctzrzf/ztzrzf.
The second dimension of c must be at least max(1, n).
work is a workspace array, its dimension max(1, lwork).
Output Parameters
c Overwritten by the product Q*C, QH*C, C*Q, or C*QH (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended
workspace, and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The real counterpart of this routine is ormrz.
?ggqrf
Computes the generalized QR factorization of two
matrices.
Syntax
call sggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call dggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call cggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call zggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call ggqrf(a, b [,taua] [,taub] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the generalized QR factorization of an n-by-m matrix A and an n-by-p matrix B as A =
Q*R, B = Q*T*Z, where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix,
and R and T assume one of the forms:
or
Input Parameters
lwork INTEGER. The size of the work array; must be at least max(1, n, m, p).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
if n ≤ p, the upper triangle of the subarray b(1:n, p-n+1:p) contains the
n-by-n upper triangular matrix T;
if n > p, the elements on and above the (n-p)th subdiagonal contain the
n-by-p upper trapezoidal matrix T; the remaining elements, with the array
taub, represent the orthogonal/unitary matrix Z as a product of elementary
reflectors.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(1)H(2)...H(k), where k = min(n,m).
Each H(i) has the form
H(i) = I - taua*v*vT for real flavors, or
H(i) = I - taua*v*vH for complex flavors,
where taua is a real/complex scalar, and v is a real/complex vector with v(j) = 0 for 1 ≤ j ≤ i - 1, v(i) = 1.
The matrix Z is represented as a product of elementary reflectors
Z = H(1)H(2)...H(k), where k = min(n,p).
Each H(i) has the form
H(i) = I - taub*v*vT for real flavors, or
H(i) = I - taub*v*vH for complex flavors,
where taub is a real/complex scalar, and v is a real/complex vector with v(p-k+i) = 1 and v(j) = 0 for
p-k+i+1 ≤ j ≤ p.
On exit, v(j) for 1 ≤ j ≤ p-k+i-1 is stored in b(n-k+i, 1:p-k+i-1) and taub is stored in taub(i).
For better performance, try using lwork ≥ max(n, m, p)*max(nb1, nb2, nb3), where nb1 is the optimal
blocksize for the QR factorization of an n-by-m matrix, nb2 is the optimal blocksize for the RQ factorization of
an n-by-p matrix, and nb3 is the optimal blocksize for a call of ormqr/unmqr.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork that is no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
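As an illustration of the two-call pattern for this routine, the sketch below first queries the recommended workspace and then computes the factorization. The program name, the matrix sizes, and the test data are assumptions chosen only for the example.

```fortran
program ggqrf_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, m = 3, p = 5    ! A is n-by-m, B is n-by-p
  real(dp) :: a(n,m), b(n,p), taua(min(n,m)), taub(min(n,p)), query(1)
  real(dp), allocatable :: work(:)
  integer :: i, j, lwork, info

  ! Arbitrary test data (assumed for this sketch).
  do j = 1, m
    do i = 1, n
      a(i,j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do
  do j = 1, p
    do i = 1, n
      b(i,j) = real(i - j, dp)
    end do
  end do

  ! First call: lwork = -1 requests only the recommended workspace size.
  call dggqrf(n, m, p, a, n, taua, b, n, taub, query, -1, info)
  if (info /= 0) stop 'workspace query failed'

  ! Never go below the documented minimum max(1, n, m, p).
  lwork = max(1, n, m, p, int(query(1)))
  allocate(work(lwork))

  ! Second call: the actual factorization A = Q*R, B = Q*T*Z.
  call dggqrf(n, m, p, a, n, taua, b, n, taub, work, lwork, info)
  print '(a,i0,a,i0)', 'info = ', info, ', lwork used = ', lwork
end program ggqrf_workspace_query
```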
?ggrqf
Computes the generalized RQ factorization of two
matrices.
Syntax
call sggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call dggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call cggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call zggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call ggrqf(a, b [,taua] [,taub] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the generalized RQ factorization of an m-by-n matrix A and a p-by-n matrix B as A =
R*Q, B = Z*T*Q, where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix,
and R and T assume one of the forms:
or
or
Input Parameters
ldb INTEGER. The leading dimension of b; at least max(1, p).
lwork INTEGER. The size of the work array; must be at least max(1, n, m, p).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(1)H(2)...H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - taua*v*vT for real flavors, or
H(i) = I - taua*v*vH for complex flavors,
where taua is a real/complex scalar, and v is a real/complex vector with v(n-k+i) = 1, v(n-k+i+1:n) = 0.
On exit, v(1:n-k+i-1) is stored in a(m-k+i, 1:n-k+i-1) and taua is stored in taua(i).
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork= -1.
If you choose the first option and supply any admissible lwork that is no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork= -1, the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
?tpqrt
Computes a blocked QR factorization of a real or
complex "triangular-pentagonal" matrix, which is
composed of a triangular block and a pentagonal
block, using the compact WY representation for Q.
Syntax
call stpqrt(m, n, l, nb, a, lda, b, ldb, t, ldt, work, info)
call dtpqrt(m, n, l, nb, a, lda, b, ldb, t, ldt, work, info)
call ctpqrt(m, n, l, nb, a, lda, b, ldb, t, ldt, work, info)
call ztpqrt(m, n, l, nb, a, lda, b, ldb, t, ldt, work, info)
call tpqrt(a, b, t, nb[, info])
Include Files
mkl.fi, lapack.f90
Description
where A is an n-by-n upper triangular matrix, and B is an m-by-n pentagonal matrix consisting of an (m-l)-
by-n rectangular matrix B1 on top of an l-by-n upper trapezoidal matrix B2:
The upper trapezoidal matrix B2 consists of the first l rows of an n-by-n upper triangular matrix, where
0 ≤ l ≤ min(m,n). If l=0, B is an m-by-n rectangular matrix. If m=l=n, B is upper triangular. The elementary
reflectors H(i) are stored in the ith column below the diagonal in the (n+m)-by-n input matrix C. The
structure of vectors defining the elementary reflectors is illustrated by:
The elements of the unit matrix I are not stored. Thus, V contains all of the necessary information, and is
returned in array b.
NOTE
Note that V has the same form as B:
Input Parameters
nb INTEGER. The block size to use in the blocked QR factorization (n ≥ nb ≥ 1).
b size (ldb, n), the pentagonal m-by-n matrix B. The first (m-l) rows
contain the rectangular B1 matrix, and the next l rows contain the upper
trapezoidal B2 matrix.
work size (nb, n) is a workspace array.
Output Parameters
a The elements on and above the diagonal of the array contain the upper
triangular matrix R.
info INTEGER.
If info = 0, the execution is successful.
If info < 0 and info = -i, the ith argument had an illegal value.
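A call to this routine might look like the sketch below, which takes l = 0 so that B is a plain m-by-n rectangular block stacked under the n-by-n upper triangular A. The program name, the sizes, the test data, and the norm check are assumptions made for the example. The check relies on the fact that an orthogonal transformation preserves the 2-norm of each column, so column j of R should have the same norm as column j of the stacked input.

```fortran
program tpqrt_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 3, l = 0, nb = 2   ! n >= nb >= 1
  real(dp) :: a(n,n), b(m,n), t(nb,n), work(nb*n)
  real(dp) :: norm_before(n), norm_after(n), resid
  integer :: i, j, info

  ! Arbitrary test data (assumed for this sketch): upper triangular A,
  ! rectangular B.
  a = 0.0_dp
  do j = 1, n
    do i = 1, j
      a(i,j) = real(i + j, dp)
    end do
    do i = 1, m
      b(i,j) = 1.0_dp / real(i + j, dp)
    end do
    ! 2-norm of column j of the stacked matrix (A on top of B).
    norm_before(j) = sqrt(sum(a(1:j,j)**2) + sum(b(:,j)**2))
  end do

  ! On exit the upper triangle of a holds R, b holds V, t holds the
  ! block reflector factors.
  call dtpqrt(m, n, l, nb, a, n, b, m, t, nb, work, info)
  if (info /= 0) stop 'dtpqrt failed'

  ! Column norms of R should match those of the stacked input.
  do j = 1, n
    norm_after(j) = sqrt(sum(a(1:j,j)**2))
  end do
  resid = maxval(abs(norm_after - norm_before))
  print '(a,es10.2)', 'max column-norm difference = ', resid
end program tpqrt_demo
```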
?tpmqrt
Applies a real or complex orthogonal matrix obtained
from a "triangular-pentagonal" complex block reflector
to a general real or complex matrix, which consists of
two blocks.
Syntax
call stpmqrt(side, trans, m, n, k, l, nb, v, ldv, t, ldt, a, lda, b, ldb, work, info)
call dtpmqrt(side, trans, m, n, k, l, nb, v, ldv, t, ldt, a, lda, b, ldb, work, info)
call ctpmqrt(side, trans, m, n, k, l, nb, v, ldv, t, ldt, a, lda, b, ldb, work, info)
call ztpmqrt(side, trans, m, n, k, l, nb, v, ldv, t, ldt, a, lda, b, ldb, work, info)
call tpmqrt( v, t, a, b, k, nb[, trans][, side][, info])
Include Files
mkl.fi, lapack.f90
Description
The columns of the pentagonal matrix V contain the elementary reflectors H(1), H(2), ..., H(k); V is
composed of a rectangular block V1 and a trapezoidal block V2:
The size of the trapezoidal block V2 is determined by the parameter l, where 0 ≤ l ≤ k. V2 is upper trapezoidal,
consisting of the first l rows of a k-by-k upper triangular matrix.
If side = 'L':
If side = 'R':
Input Parameters
side CHARACTER*1
='L': apply Q, QT, or QH from the left.
='R': apply Q, QT, or QH from the right.
trans CHARACTER*1
='N', no transpose, apply Q.
='T', transpose, apply QT.
='C', conjugate transpose, apply QH.
nb INTEGER.
The block size used for the storage of t, k ≥ nb ≥ 1. This must be the same
value of nb used to generate t in tpqrt.
The ith column must contain the vector which defines the elementary
reflector H(i), for i = 1,2,...,k, as returned by tpqrt in array argument b.
DOUBLE PRECISION for dtpmqrt
COMPLEX for ctpmqrt
COMPLEX*16 for ztpmqrt.
Array, size (ldt, k).
ldt INTEGER. The leading dimension of the array t. ldt must be at least nb.
ldb INTEGER. The leading dimension of the array b. ldb must be at least
max(1,m).
Output Parameters
Decision Tree: Singular Value Decomposition
Figure "Decision Tree: Singular Value Decomposition" presents a decision tree that helps you choose the right
sequence of routines for SVD, depending on whether you need singular values only or singular vectors as
well, whether A is real or complex, and so on.
You can use the SVD to find a minimum-norm solution to a (possibly) rank-deficient least squares problem of
minimizing ||Ax - b||2. The effective rank k of the matrix A can be determined as the number of singular
values which exceed a suitable threshold. The minimum-norm solution is
x = Vk*(Σk)^(-1)*c
where Σk is the leading k-by-k submatrix of Σ, the matrix Vk consists of the first k columns of V = P*V1, and
the vector c consists of the first k elements of UH*b = U1H*QH*b.
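Written out term by term, the minimum-norm solution above is a sum over the k retained singular triplets. The symbols σ_i, u_i, and v_i (the i-th singular value and the i-th columns of U and V) are notation assumed for this sketch rather than names used in the text.

```latex
x \;=\; V_k \,\Sigma_k^{-1}\, c
  \;=\; \sum_{i=1}^{k} \frac{u_i^{H} b}{\sigma_i}\, v_i ,
\qquad c_i = u_i^{H} b , \quad i = 1, \dots, k .
```

Each term divides by σ_i, which is why singular values below the chosen threshold are excluded from the sum.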
?gebrd
Reduces a general matrix to bidiagonal form.
Syntax
call sgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call dgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call cgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call zgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call gebrd(a [, d] [,e] [,tauq] [,taup] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a general m-by-n matrix A to a bidiagonal matrix B by an orthogonal (unitary)
transformation.
where B1 is an n-by-n upper bidiagonal matrix, Q and P are orthogonal or, for a complex A, unitary matrices;
Q1 consists of the first n columns of Q.
If m < n, the reduction is given by
Input Parameters
lwork INTEGER.
The dimension of work; at least max(1, m, n).
See Application Notes for the suggested value of lwork.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = (m + n)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork that is no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed matrices Q, B, and P satisfy Q*B*PH = A + E, where ||E||2 = c(n)*ε*||A||2, c(n) is a
modestly increasing function of n, and ε is the machine precision.
The approximate number of floating-point operations for real flavors is
(4/3)*n^2*(3*m - n) for m ≥ n,
(4/3)*m^2*(3*n - m) for m < n.
The number of operations for complex flavors is four times greater.
If n is much less than m, it can be more efficient to first form the QR factorization of A by calling geqrf and
then reduce the factor R to bidiagonal form. This requires approximately 2*n^2*(m + n) floating-point
operations.
If m is much less than n, it can be more efficient to first form the LQ factorization of A by calling gelqf and
then reduce the factor L to bidiagonal form. This requires approximately 2*m^2*(m + n) floating-point
operations.
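The two operation counts for m ≥ n can be compared directly. Setting (4/3)*n^2*(3*m - n) > 2*n^2*(m + n) and simplifying gives m > (5/3)*n, so by these estimates alone the QR-first route needs fewer operations once m exceeds roughly 1.67*n. The sketch below tabulates both counts; the program and function names are assumptions made for the example, and the counts are the approximate real-flavor figures quoted above, not measured timings.

```fortran
program gebrd_flops
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer :: m, n

  n = 100
  print '(a)', '     m     n        direct      QR first'
  do m = 100, 400, 100
    print '(2i6,2es14.4)', m, n, direct_flops(m, n), qr_first_flops(m, n)
  end do

contains

  ! Approximate flop count of a direct call to ?gebrd (real flavors).
  pure function direct_flops(m, n) result(f)
    integer, intent(in) :: m, n
    real(dp) :: f, rm, rn
    rm = real(m, dp)
    rn = real(n, dp)
    if (m >= n) then
      f = 4.0_dp/3.0_dp * rn**2 * (3.0_dp*rm - rn)
    else
      f = 4.0_dp/3.0_dp * rm**2 * (3.0_dp*rn - rm)
    end if
  end function direct_flops

  ! Approximate flop count of QR factorization followed by reduction
  ! of R to bidiagonal form; intended for m >= n.
  pure function qr_first_flops(m, n) result(f)
    integer, intent(in) :: m, n
    real(dp) :: f
    f = 2.0_dp * real(n, dp)**2 * (real(m, dp) + real(n, dp))
  end function qr_first_flops

end program gebrd_flops
```

Operation counts are only a first-order guide; the actual crossover on a given machine also depends on block sizes and memory traffic.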
?gbbrd
Reduces a general band matrix to bidiagonal form.
Syntax
call sgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc, work,
info)
call dgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc, work,
info)
call cgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc, work,
rwork, info)
call zgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc, work,
rwork, info)
call gbbrd(ab [, c] [,d] [,e] [,q] [,pt] [,kl] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces an m-by-n band matrix A to upper bidiagonal matrix B: A = Q*B*PH. Here the matrices
Q and P are orthogonal (for real A) or unitary (for complex A). They are determined as products of Givens
rotation matrices, and may be formed explicitly by the routine if required. The routine can also update a
matrix C as follows: C = QH*C.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or restorable arguments, see LAPACK 95
Interface Conventions.
Specific details for the routine gbbrd interface are the following:
m If omitted, assumed m = n.
ku Restored as ku = lda-kl-1.
Application Notes
The computed matrices Q, B, and P satisfy Q*B*PH = A + E, where ||E||2 = c(n)*ε*||A||2, c(n) is a
modestly increasing function of n, and ε is the machine precision.
If m = n, the total number of floating-point operations for real flavors is approximately the sum of:
?orgbr
Generates the real orthogonal matrix Q or PT
determined by ?gebrd.
Syntax
call sorgbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call dorgbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call orgbr(a, tau [,vect] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the orthogonal matrices Q and PT formed by the routine
?gebrd. Use this routine after a call to sgebrd/dgebrd. All valid combinations of arguments are described in
Input parameters. In most cases you need the following:
To compute the whole m-by-m matrix Q:
Input Parameters
m, n INTEGER. The number of rows (m) and columns (n) in the matrix Q or PT to
be returned (m ≥ 0, n ≥ 0).
Array, size min (m,k) if vect = 'Q', min (n,k) if vect = 'P'.
Scalar factor of the elementary reflector H(i) or G(i), which determines Q
and PT as returned by gebrd in the array tauq or taup.
lwork INTEGER. Dimension of the array work. See Application Notes for the
suggested value of lwork.
If lwork = -1 then the routine performs a workspace query and calculates
the optimal size of the work array, returns this value as the first entry of the
work array, and no error message related to lwork is issued by xerbla.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = min(m,n)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork that is no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed matrix Q differs from an exactly orthogonal matrix by a matrix E such that ||E||2 = O(ε).
The approximate numbers of floating-point operations for the cases listed in Description are as follows:
To form the whole of Q:
(4/3)*n*(3*m^2 - 3*m*n + n^2) if m > n;
(4/3)*m^3 if m ≤ n.
To form the n leading columns of Q when m > n:
?ormbr
Multiplies an arbitrary real matrix by the real
orthogonal matrix Q or PT determined by ?gebrd.
Syntax
call sormbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormbr(a, tau, c [,vect] [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
Given an arbitrary real matrix C, this routine forms one of the matrix products Q*C, QT*C, C*Q, C*QT, P*C,
PT*C, C*P, C*PT, where Q and P are orthogonal matrices computed by a call to gebrd. The routine overwrites
the product on C.
Input Parameters
In the descriptions below, r denotes the order of Q or PT:
If side = 'L', r = m; if side = 'R', r = n.
If side = 'L', multipliers are applied to C from the left.
Constraints: m ≥ 0, n ≥ 0, k ≥ 0.
Its second dimension must be at least max(1, min(r,k)) for vect = 'Q', or
max(1, r) for vect = 'P'.
Output Parameters
c Overwritten by the product Q*C, QT*C, C*Q, C*QT, P*C, PT*C, C*P, or C*PT,
as specified by vect, side, and trans.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using
lwork = n*blocksize for side = 'L', or
lwork = m*blocksize for side = 'R',
where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance of the
blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork that is no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed product differs from the exact product by a matrix E such that ||E||2 = O(ε)*||C||2.
?ungbr
Generates the complex unitary matrix Q or PH
determined by ?gebrd.
Syntax
call cungbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call zungbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call ungbr(a, tau [,vect] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the unitary matrices Q and PH formed by the routine
?gebrd. Use this routine after a call to cgebrd/zgebrd. All valid combinations of arguments are described in
Input Parameters; in most cases you need the following:
To compute the whole m-by-m matrix Q, use:
Input Parameters
Constraints: m ≥ 0, n ≥ 0, k ≥ 0.
The dimension of tau must be at least max(1, min(m, k)) for vect = 'Q',
or max(1, min(n, k)) for vect = 'P'.
Output Parameters
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for subsequent runs.
info INTEGER.
If info = 0, the execution is successful.
k = n, if vect = 'Q'.
Application Notes
For better performance, try using lwork = min(m,n)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended
workspace, and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed matrix Q differs from an exactly unitary matrix by a matrix E such that ||E||2 = O(ε).
(8/3)*n^2*(3*m - n).
To compute the whole matrix PH:
(16/3)*n^3 if m ≥ n;
(16/3)*m*(3*n^2 - 3*m*n + m^2) if m < n.
To form the m leading columns of PH when m < n:
?unmbr
Multiplies an arbitrary complex matrix by the unitary
matrix Q or P determined by ?gebrd.
Syntax
call cunmbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmbr(a, tau, c [,vect] [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
Given an arbitrary complex matrix C, this routine forms one of the matrix products Q*C, QH*C, C*Q, C*QH,
P*C, PH*C, C*P, or C*PH, where Q and P are unitary matrices computed by a call to ?gebrd. The routine
overwrites C with the product.
Input Parameters
In the descriptions below, r denotes the order of Q or PH:
If side = 'L', r = m; if side = 'R', r = n.
Constraints: m ≥ 0, n ≥ 0, k ≥ 0.
DOUBLE COMPLEX for zunmbr.
Arrays:
a(lda,*) is the array a as returned by ?gebrd.
Its second dimension must be at least max(1, min(r,k)) for vect = 'Q', or
max(1, r) for vect = 'P'.
Output Parameters
c Overwritten by the product Q*C, QH*C, C*Q, C*QH, P*C, PH*C, C*P, or
C*PH, as specified by vect, side, and trans.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, use
lwork = n*blocksize for side = 'L', or
lwork = m*blocksize for side = 'R',
where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance of the
blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended
workspace, and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed product differs from the exact product by a matrix E such that ||E||2 = O(ε)*||C||2.
The real counterpart of this routine is ormbr.
?bdsqr
Computes the singular value decomposition of a
general matrix that has been reduced to bidiagonal
form.
Syntax
call sbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work, info)
call dbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work, info)
call cbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, rwork, info)
call zbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, rwork, info)
call rbdsqr(d, e [,vt] [,u] [,c] [,uplo] [,info])
call bdsqr(d, e [,vt] [,u] [,c] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the singular values and, optionally, the right and/or left singular vectors from the
Singular Value Decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal matrix B using the implicit
zero-shift QR algorithm. The SVD of B has the form B = Q*S*PH where S is the diagonal matrix of singular
values, Q is an orthogonal matrix of left singular vectors, and P is an orthogonal matrix of right singular
vectors. If left singular vectors are requested, this subroutine actually returns U *Q instead of Q, and, if right
singular vectors are requested, this subroutine returns PH *VT instead of PH, for given real/complex input
matrices U and VT. When U and VT are the orthogonal/unitary matrices that reduce a general matrix A to
bidiagonal form: A = U*B*VT, as computed by ?gebrd, then
A = (U*Q)*S*(PH*VT)
is the SVD of A. Optionally, the subroutine may also compute QH *C for a given real/complex input matrix C.
See also lasq1, lasq2, lasq3, lasq4, lasq5, lasq6 used by this routine.
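The sequence described above (reduce to bidiagonal form, generate Q and PT, then call ?bdsqr) can be sketched for a real matrix with m ≥ n as follows. This is an illustration under stated assumptions, not a reference implementation: the program name, the test matrix, and the helper arrays a0, pt, cdummy, and bwork are invented for the example, and the real workspace for dbdsqr is sized at 4*n. On exit from dbdsqr, a holds U*Q, pt holds PT*VT, and d holds the singular values, so their product should reproduce the original matrix.

```fortran
program svd_via_bidiagonal
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 5, n = 3               ! m >= n is assumed
  real(dp) :: a(m,n), a0(m,n), pt(n,n), d(n), e(n-1), tauq(n), taup(n)
  real(dp) :: cdummy(1,1), bwork(4*n), query(1), resid
  real(dp), allocatable :: work(:)
  integer :: i, j, lwork, info

  ! Arbitrary test matrix (assumed for this sketch).
  do j = 1, n
    do i = 1, m
      a(i,j) = 1.0_dp / real(i + j - 1, dp)
      if (i == j) a(i,j) = a(i,j) + 1.0_dp
    end do
  end do
  a0 = a

  ! Workspace queries; keep the largest recommendation.
  call dgebrd(m, n, a, m, d, e, tauq, taup, query, -1, info)
  lwork = max(1, m, n, int(query(1)))
  call dorgbr('Q', m, n, n, a, m, tauq, query, -1, info)
  lwork = max(lwork, int(query(1)))
  call dorgbr('P', n, n, m, pt, n, taup, query, -1, info)
  lwork = max(lwork, int(query(1)))
  allocate(work(lwork))

  ! Step 1: reduce A to upper bidiagonal form, A = Q*B*P**T.
  call dgebrd(m, n, a, m, d, e, tauq, taup, work, lwork, info)
  if (info /= 0) stop 'dgebrd failed'

  ! Step 2: the reflectors defining P**T sit in the leading n rows of a;
  ! copy them before a is overwritten by the n leading columns of Q.
  pt = a(1:n, 1:n)
  call dorgbr('Q', m, n, n, a, m, tauq, work, lwork, info)
  if (info /= 0) stop 'dorgbr (Q) failed'
  call dorgbr('P', n, n, m, pt, n, taup, work, lwork, info)
  if (info /= 0) stop 'dorgbr (P) failed'

  ! Step 3: SVD of B, accumulating the singular vectors into a and pt.
  call dbdsqr('U', n, n, m, 0, d, e, pt, n, a, m, cdummy, 1, bwork, info)
  if (info /= 0) stop 'dbdsqr did not converge'

  ! Check: (U*Q) * diag(d) * (P**T*V**T) should reproduce the input.
  resid = maxval(abs(matmul(a * spread(d, 1, m), pt) - a0))
  print '(a,3f12.6)', 'singular values: ', d
  print '(a,es10.2)', 'max |U*S*VT - A| = ', resid
end program svd_via_bidiagonal
```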
Input Parameters
ncvt INTEGER. The number of columns of the matrix VT, that is, the number of
right singular vectors (ncvt ≥ 0).
nru INTEGER. The number of rows in U, that is, the number of left singular
vectors (nru ≥ 0).
ncc INTEGER. The number of columns in the matrix C used for computing the
product QH*C (ncc ≥ 0). Set ncc = 0 if no matrix C is supplied.
c(ldc,*) contains the n-by-ncc matrix C for computing the product QH*C.
The second dimension of c must be at least max(1, ncc). The array is not
referenced if ncc = 0.
ldu ≥ max(1, nru).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info > 0,
In all other cases when ncvt, nru, or ncc > 0, the algorithm did not
converge; d and e contain the elements of a bidiagonal matrix that is
orthogonally similar to the input matrix B; if info = i, i elements of e
have not converged to zero.
ncvt If argument vt is present, then ncvt is equal to the number of columns in matrix
VT; otherwise, ncvt is set to zero.
nru If argument u is present, then nru is equal to the number of rows in matrix U;
otherwise, nru is set to zero.
ncc If argument c is present, then ncc is equal to the number of columns in matrix C;
otherwise, ncc is set to zero.
Note that two variants of the Fortran 95 interface for the bdsqr routine are needed because the choice
between real and complex cases becomes ambiguous when vt, u, and c are omitted. Thus, the name rbdsqr is used in
real cases (single or double precision), and the name bdsqr is used in complex cases (single or double
precision).
Application Notes
Each singular value and singular vector is computed to high relative accuracy. However, the reduction to
bidiagonal form (prior to calling the routine) may decrease the relative accuracy in the small singular values
of the original matrix if its singular values vary widely in magnitude.
If σ_i is an exact singular value of B, and s_i is the corresponding computed value, then
|s_i - σ_i| ≤ p(m,n)*ε*σ_i
where p(m, n) is a modestly increasing function of m and n, and ε is the machine precision.
If only singular values are computed, they are computed more accurately than when some singular vectors
are also computed (that is, the function p(m, n) is smaller).
If u_i is the corresponding exact left singular vector of B, and w_i is the corresponding computed left singular
vector, then the angle θ(u_i, w_i) between them is bounded as follows:
θ(u_i, w_i) ≤ p(m,n)*ε / min_{i≠j}(|σ_i - σ_j|/|σ_i + σ_j|).
Here min_{i≠j}(|σ_i - σ_j|/|σ_i + σ_j|) is the relative gap between σ_i and the other singular values. A similar
error bound holds for the right singular vectors.
The total number of real floating-point operations is roughly proportional to n^2 if only the singular values are
computed. About 6n^2*nru additional operations (12n^2*nru for complex flavors) are required to compute the
left singular vectors and about 6n^2*ncvt operations (12n^2*ncvt for complex flavors) to compute the right
singular vectors.
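The values-only case described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the matrix entries are arbitrary, the program name is hypothetical, and work is sized 4*n on the assumption that this is sufficient for the real flavors (check the work parameter description for the exact requirement). With ncvt = nru = ncc = 0 the arrays vt, u, and c are not referenced, so one-element dummies are passed.

```fortran
program bdsqr_values_only
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: d(n), e(n-1), work(4*n)      ! work size 4*n is an assumption
  real(dp) :: vt(1,1), u(1,1), c(1,1)      ! dummies, not referenced here
  integer :: info
  external :: dbdsqr

  ! Upper bidiagonal B: diagonal in d, superdiagonal in e (illustrative values).
  d = [4.0_dp, 3.0_dp, 2.0_dp]
  e = [1.0_dp, 1.0_dp]

  ! ncvt = nru = ncc = 0: compute singular values only.
  call dbdsqr('U', n, 0, 0, 0, d, e, vt, 1, u, 1, c, 1, work, info)

  if (info /= 0) then
    print *, 'dbdsqr did not succeed, info =', info
  else
    print *, 'singular values of B:', d     ! d is overwritten on exit
  end if
end program bdsqr_values_only
```

The program must be linked against a LAPACK implementation such as Intel MKL.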
?bdsdc
Computes the singular value decomposition of a real
bidiagonal matrix using a divide and conquer method.
Syntax
call sbdsdc(uplo, compq, n, d, e, u, ldu, vt, ldvt, q, iq, work, iwork, info)
call dbdsdc(uplo, compq, n, d, e, u, ldu, vt, ldvt, q, iq, work, iwork, info)
call bdsdc(d, e [,u] [,vt] [,q] [,iq] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the Singular Value Decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal
matrix B: B = U*Σ*V^T, using a divide and conquer method, where Σ is a diagonal matrix with non-negative
diagonal elements (the singular values of B), and U and V are orthogonal matrices of left and right singular
vectors, respectively. ?bdsdc can be used to compute all singular values, and optionally, singular vectors or
singular vectors in compact form.
This routine uses ?lasd0, ?lasd1, ?lasd2, ?lasd3, ?lasd4, ?lasd5, ?lasd6, ?lasd7, ?lasd8, ?lasd9,
?lasda, ?lasdq, ?lasdt.
Input Parameters
ldvt INTEGER. The leading dimension of the output array vt; ldvt ≥ 1.
If singular vectors are desired, then ldvt ≥ max(1, n).
Output Parameters
e On exit, e is overwritten.
If compq = 'I', then on exit u contains the left singular vectors of the
bidiagonal matrix B, unless info ≠ 0 (see info). For other values of compq, u
is not referenced.
The second dimension of u must be at least max(1,n).
If compq = 'I', then on exit vt^T contains the right singular vectors of the
bidiagonal matrix B, unless info ≠ 0 (see info). For other values of compq,
vt is not referenced. The second dimension of vt must be at least max(1,n).
If compq = 'P', then on exit, if info = 0, q and iq contain the left and
right singular vectors in a compact form. Specifically, q contains all the
REAL (for sbdsdc) or DOUBLE PRECISION (for dbdsdc) data for singular
vectors. For other values of compq, q is not referenced.
iq INTEGER.
Array: iq(*).
If compq = 'P', then on exit, if info = 0, q and iq contain the left and
right singular vectors in a compact form. Specifically, iq contains all the
INTEGER data for singular vectors. For other values of compq, iq is not
referenced.
info INTEGER.
If info = 0, the execution is successful.
compq Restored based on the presence of arguments u, vt, q, and iq as follows:
compq = 'N', if none of u, vt, q, and iq are present,
compq = 'I', if both u and vt are present. Arguments u and vt must either be
both present or both omitted,
compq = 'P', if both q and iq are present. Arguments q and iq must either be
both present or both omitted.
Note that there will be an error condition if all of u, vt, q, and iq arguments are
present simultaneously.
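As an illustration of the compq = 'I' case with the FORTRAN 77 interface, the following sketch computes singular values and both sets of singular vectors of a small bidiagonal matrix. The workspace sizes (3*n^2 + 4*n for work and 8*n for iwork) are assumptions taken from the reference LAPACK documentation of ?bdsdc, the matrix values are arbitrary, and the program name is hypothetical. The arrays q and iq are not referenced for compq = 'I', so one-element dummies are passed.

```fortran
program bdsdc_vectors_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, ldu = n, ldvt = n
  real(dp) :: d(n), e(n-1), u(ldu,n), vt(ldvt,n)
  real(dp) :: q(1), work(3*n*n + 4*n)      ! sizes assumed from reference LAPACK
  integer :: iq(1), iwork(8*n), info
  external :: dbdsdc

  ! Upper bidiagonal B: diagonal in d, superdiagonal in e (illustrative values).
  d = [4.0_dp, 3.0_dp, 2.0_dp]
  e = [1.0_dp, 1.0_dp]

  ! compq = 'I': return left singular vectors in u and right ones in vt.
  call dbdsdc('U', 'I', n, d, e, u, ldu, vt, ldvt, q, iq, work, iwork, info)

  if (info /= 0) then
    print *, 'dbdsdc did not succeed, info =', info
  else
    print *, 'singular values:', d
    print *, 'first left singular vector:', u(:,1)
  end if
end program bdsdc_vectors_sketch
```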
See Also
?lasd0
?lasd1
?lasd2
?lasd3
?lasd4
?lasd5
?lasd6
?lasd7
?lasd8
?lasd9
?lasda
?lasdq
?lasdt
?sytrd
Reduces a real symmetric matrix to tridiagonal form.
Syntax
call ssytrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call dsytrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call sytrd(a, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a real symmetric matrix A to symmetric tridiagonal form T by an orthogonal similarity
transformation: A = Q*T*Q^T. The orthogonal matrix Q is not formed explicitly but is represented as a
product of n-1 elementary reflectors. Routines are provided for working with Q in this representation (see
Application Notes below).
Input Parameters
a(lda,*) is an array containing either upper or lower triangular part of the
matrix A, as specified by uplo. If uplo = 'U', the leading n-by-n upper
triangular part of a contains the upper triangular part of the matrix A, and
the strictly lower triangular part of A is not referenced. If uplo = 'L', the
leading n-by-n lower triangular part of a contains the lower triangular part
of the matrix A, and the strictly upper triangular part of A is not referenced.
The second dimension of a must be at least max(1, n).
Output Parameters
a On exit,
if uplo = 'U', the diagonal and first superdiagonal of A are overwritten by
the corresponding elements of the tridiagonal matrix T, and the elements
above the first superdiagonal, with the array tau, represent the orthogonal
matrix Q as a product of elementary reflectors;
if uplo = 'L', the diagonal and first subdiagonal of A are overwritten by
the corresponding elements of the tridiagonal matrix T, and the elements
below the first subdiagonal, with the array tau, represent the orthogonal
matrix Q as a product of elementary reflectors.
work(1) If info=0, on exit work(1) contains the minimum value of lwork required
for optimum performance. Use this lwork for subsequent runs.
info INTEGER.
If info = 0, the execution is successful.
Note that diagonal (d) and off-diagonal (e) elements of the matrix T are omitted because they are kept in the
matrix A on exit.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed matrix T is exactly similar to a matrix A+E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision.
The approximate number of floating-point operations is (4/3)n^3.
After calling this routine, you can call ?orgtr to form the matrix Q explicitly, or ?ormtr to multiply a real matrix by Q.
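The workspace query and the follow-up generation of Q can be sketched as follows. This is a minimal example under stated assumptions: the test matrix and the program name are illustrative only, and a single work array sized for the larger of the two queried values is shared between dsytrd and dorgtr.

```fortran
program sytrd_orgtr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, lda = n
  real(dp) :: a(lda,n), d(n), e(n-1), tau(n-1), wq1(1), wq2(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info, i, j
  external :: dsytrd, dorgtr

  ! Illustrative symmetric matrix; only the upper triangle is used for uplo = 'U'.
  do j = 1, n
    do i = 1, n
      a(i,j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do

  ! Workspace queries: lwork = -1 returns the recommended size in work(1).
  call dsytrd('U', n, a, lda, d, e, tau, wq1, -1, info)
  if (info /= 0) stop 'dsytrd workspace query failed'
  call dorgtr('U', n, a, lda, tau, wq2, -1, info)
  if (info /= 0) stop 'dorgtr workspace query failed'
  lwork = max(1, int(wq1(1)), int(wq2(1)))
  allocate(work(lwork))

  ! Reduce A to tridiagonal T: d receives the diagonal, e the off-diagonal.
  call dsytrd('U', n, a, lda, d, e, tau, work, lwork, info)
  if (info /= 0) stop 'dsytrd failed'

  ! Overwrite a with the explicit n-by-n orthogonal matrix Q.
  call dorgtr('U', n, a, lda, tau, work, lwork, info)
  if (info /= 0) stop 'dorgtr failed'

  print *, 'diagonal of T:    ', d
  print *, 'off-diagonal of T:', e
  print *, 'first column of Q:', a(:,1)
end program sytrd_orgtr_sketch
```

Sharing one work array is a design convenience for the sketch; separate arrays sized from each query work equally well.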
?syrdb
Reduces a real symmetric matrix to tridiagonal form
with Successive Bandwidth Reduction approach.
Syntax
call ssyrdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
call dsyrdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
Include Files
mkl.fi
Description
The routine reduces a real symmetric matrix A to symmetric tridiagonal form T by an orthogonal similarity
transformation: A = Q*T*Q^T and optionally multiplies matrix Z by Q, or simply forms Q.
This routine reduces a full symmetric matrix A to the banded symmetric matrix B, and then to the tridiagonal
symmetric matrix T with a Successive Bandwidth Reduction approach following the work of C. Bischof (see, for
instance, [Bischof00]). ?syrdb is functionally close to the ?sytrd routine, but the tridiagonal form may differ
from that obtained by ?sytrd. Unlike ?sytrd, the orthogonal matrix Q cannot be restored from the details
of matrix A on exit.
Input Parameters
ldz INTEGER. The leading dimension of z; at least max(1, n). Not referenced if
jobz = 'N'
Output Parameters
If jobz = 'N' or 'U', then overwritten by the banded matrix B and details
of the orthogonal matrix QB to reduce A to B as specified by uplo.
z On exit,
if jobz = 'U', then the matrix z is overwritten by Z*Q.
work(1) If info=0, on exit work(1) contains the minimum value of lwork required
for optimum performance. Use this lwork for subsequent runs.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*(3*kd+3).
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
For better performance, try using kd equal to 40 if n ≤ 2000 and 64 otherwise.
Try using ?syrdb instead of ?sytrd on large matrices when only eigenvalues are needed and no eigenvectors are
required, especially in a multi-threaded environment. ?syrdb becomes faster beginning approximately with n =
1000, and much faster on larger matrices, with better scalability than ?sytrd.
Avoid using ?syrdb to compute eigenvectors: because of the two-step reduction, the number of
operations needed to apply orthogonal transformations to Z is doubled compared to the traditional one-step
reduction. In that case it is better to apply ?sytrd and ?ormtr/?orgtr to obtain tridiagonal form along with
the orthogonal transformation matrix Q.
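The eigenvalues-only usage recommended above might look like the following sketch. It relies on the dsyrdb argument list given in the Syntax section and is specific to Intel MKL, so it must be linked against that library. The matrix size, the test matrix, and the program name are illustrative assumptions; the kd choice assumes the threshold in the note above reads n ≤ 2000, and z is passed with jobz = 'N' on the understanding that it is not referenced in that case.

```fortran
program syrdb_values_only_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2500, lda = n, ldz = n
  real(dp), allocatable :: a(:,:), d(:), e(:), tau(:), z(:,:), work(:)
  real(dp) :: wquery(1)
  integer :: kd, lwork, info, i, j
  external :: dsyrdb

  ! Bandwidth heuristic from the notes: 40 for n <= 2000 (assumed), 64 otherwise.
  kd = merge(40, 64, n <= 2000)

  allocate(a(lda,n), d(n), e(n-1), tau(n-1), z(ldz,1))
  do j = 1, n
    do i = 1, n
      a(i,j) = 1.0_dp / real(i + j - 1, dp)   ! illustrative symmetric matrix
    end do
  end do

  ! Workspace query (lwork = -1), falling back to the documented n*(3*kd+3).
  wquery(1) = 0.0_dp
  call dsyrdb('N', 'U', n, kd, a, lda, d, e, tau, z, ldz, wquery, -1, info)
  if (info /= 0) stop 'dsyrdb workspace query failed'
  lwork = max(int(wquery(1)), n*(3*kd + 3))
  allocate(work(lwork))

  ! jobz = 'N': reduce to tridiagonal form without accumulating Q.
  call dsyrdb('N', 'U', n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
  if (info /= 0) stop 'dsyrdb failed'

  print *, 'kd used:', kd
  print *, 'first diagonal entries of T:', d(1:3)
end program syrdb_values_only_sketch
```

The arrays d and e can then be passed to a tridiagonal eigenvalue routine of your choice.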
?herdb
Reduces a complex Hermitian matrix to tridiagonal
form with Successive Bandwidth Reduction approach.
Syntax
call cherdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
call zherdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
Include Files
mkl.fi
Description
The routine reduces a complex Hermitian matrix A to symmetric tridiagonal form T by a unitary similarity
transformation: A = Q*T*Q^H and optionally multiplies matrix Z by Q, or simply forms Q.
This routine reduces a full Hermitian matrix A to the banded Hermitian matrix B, and then to the tridiagonal
symmetric matrix T with a Successive Bandwidth Reduction approach following the work of C. Bischof (see, for
instance, [Bischof00]). ?herdb is functionally close to the ?hetrd routine, but the tridiagonal form may differ
from that obtained by ?hetrd. Unlike ?hetrd, the unitary matrix Q cannot be restored from the details
of matrix A on exit.
Input Parameters
ldz INTEGER. The leading dimension of z; at least max(1, n). Not referenced if
jobz = 'N'
Output Parameters
If jobz = 'N' or 'U', then overwritten by the banded matrix B and details
of the unitary matrix QB to reduce A to B as specified by uplo.
z On exit,
if jobz = 'U', then the matrix z is overwritten by Z*Q .
work(1) If info=0, on exit work(1) contains the minimum value of lwork required
for optimum performance. Use this lwork for subsequent runs.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
Application Notes
For better performance, try using lwork = n*(3*kd+3).
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
For better performance, try using kd equal to 40 if n ≤ 2000 and 64 otherwise.
Try using ?herdb instead of ?hetrd on large matrices when only eigenvalues are needed and no eigenvectors are
required, especially in a multi-threaded environment. ?herdb becomes faster beginning approximately with n =
1000, and much faster on larger matrices, with better scalability than ?hetrd.
Avoid using ?herdb to compute eigenvectors: because of the two-step reduction, the number of
operations needed to apply unitary transformations to Z is doubled compared to the traditional one-step
reduction. In that case it is better to apply ?hetrd and ?unmtr/?ungtr to obtain tridiagonal form along with
the unitary transformation matrix Q.
?orgtr
Generates the real orthogonal matrix Q determined
by ?sytrd.
Syntax
call sorgtr(uplo, n, a, lda, tau, work, lwork, info)
call dorgtr(uplo, n, a, lda, tau, work, lwork, info)
call orgtr(a, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the n-by-n orthogonal matrix Q formed by ?sytrd when reducing a real
symmetric matrix A to tridiagonal form: A = Q*T*Q^T. Use this routine after a call to ?sytrd.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = (n-1)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any admissible lwork size, that is, one no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed matrix Q differs from an exactly orthogonal matrix by a matrix E such that ||E||_2 = O(ε),
where ε is the machine precision.
The approximate number of floating-point operations is (4/3)n^3.
The complex counterpart of this routine is ungtr.
?ormtr
Multiplies a real matrix by the real orthogonal matrix
Q determined by ?sytrd.
Syntax
call sormtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call dormtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call ormtr(a, tau, c [,side] [,uplo] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real matrix C by Q or Q^T, where Q is the orthogonal matrix Q formed by ?sytrd when
reducing a real symmetric matrix A to tridiagonal form: A = Q*T*Q^T. Use this routine after a call to ?sytrd.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^T*C,
C*Q, or C*Q^T (overwriting the result on C).
Input Parameters
In the descriptions below, r denotes the order of Q:
If side = 'L', r = m; if side = 'R', r = n.
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
r = n if side = 'R'.
Application Notes
For better performance, try using lwork = n*blocksize for side = 'L', or lwork = m*blocksize for
side = 'R', where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any admissible lwork size, that is, one no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed product differs from the exact product by a matrix E such that ||E||_2 = O(ε)*||C||_2, where
ε is the machine precision.
The total number of floating-point operations is approximately 2*m^2*n, if side = 'L', or 2*n^2*m, if side =
'R'.
The complex counterpart of this routine is unmtr.
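A small sketch of the side = 'L', trans = 'N' case follows. It uses a fixed workspace of n*blocksize elements, with blocksize = 64 assumed from the upper end of the range quoted above, rather than a workspace query. Starting from C = I means that c holds Q itself on exit, which allows a simple orthogonality check. The test matrix and the program name are illustrative only.

```fortran
program ormtr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, lda = n, ldc = n
  integer, parameter :: blocksize = 64           ! assumed, within the quoted 16 to 64
  integer, parameter :: lwork = n*blocksize
  real(dp) :: a(lda,n), c(ldc,n), d(n), e(n-1), tau(n-1), work(lwork)
  real(dp) :: resid(n,n)
  integer :: info, i, j
  external :: dsytrd, dormtr

  ! Illustrative symmetric matrix and C = identity.
  c = 0.0_dp
  do j = 1, n
    c(j,j) = 1.0_dp
    do i = 1, n
      a(i,j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do

  ! Reduce A; reflectors defining Q are left in a and tau.
  call dsytrd('L', n, a, lda, d, e, tau, work, lwork, info)
  if (info /= 0) stop 'dsytrd failed'

  ! C := Q*C. The same uplo as in the dsytrd call must be used.
  call dormtr('L', 'L', 'N', n, n, a, lda, tau, c, ldc, work, lwork, info)
  if (info /= 0) stop 'dormtr failed'

  ! Since C was the identity, c now holds Q; measure departure from orthogonality.
  resid = matmul(transpose(c), c)
  do i = 1, n
    resid(i,i) = resid(i,i) - 1.0_dp
  end do
  print *, 'max |Q^T*Q - I| =', maxval(abs(resid))
end program ormtr_sketch
```

The printed residual is expected to be on the order of the machine precision, in line with the error bound stated above.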
?hetrd
Reduces a complex Hermitian matrix to tridiagonal
form.
Syntax
call chetrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call zhetrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call hetrd(a, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a complex Hermitian matrix A to symmetric tridiagonal form T by a unitary similarity
transformation: A = Q*T*Q^H. The unitary matrix Q is not formed explicitly but is represented as a product of
n-1 elementary reflectors. Routines are provided to work with Q in this representation. (They are described
later in this section.)
Input Parameters
Output Parameters
a On exit,
if uplo = 'U', the diagonal and first superdiagonal of A are overwritten by
the corresponding elements of the tridiagonal matrix T, and the elements
above the first superdiagonal, with the array tau, represent the unitary
matrix Q as a product of elementary reflectors;
if uplo = 'L', the diagonal and first subdiagonal of A are overwritten by
the corresponding elements of the tridiagonal matrix T, and the elements
below the first subdiagonal, with the array tau, represent the unitary
matrix Q as a product of elementary reflectors.
tau COMPLEX for chetrd
DOUBLE COMPLEX for zhetrd.
Array, size at least max(1, n-1). Stores (n-1) scalars that define elementary
reflectors in decomposition of the unitary matrix Q in a product of n-1
elementary reflectors.
info INTEGER.
If info = 0, the execution is successful.
Note that diagonal (d) and off-diagonal (e) elements of the matrix T are omitted because they are kept in the
matrix A on exit.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any admissible lwork size, that is, one no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed matrix T is exactly similar to a matrix A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision.
The approximate number of floating-point operations is (16/3)n^3.
?ungtr
Generates the complex unitary matrix Q determined
by ?hetrd.
Syntax
call cungtr(uplo, n, a, lda, tau, work, lwork, info)
call zungtr(uplo, n, a, lda, tau, work, lwork, info)
call ungtr(a, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the n-by-n unitary matrix Q formed by ?hetrd when reducing a complex
Hermitian matrix A to tridiagonal form: A = Q*T*Q^H. Use this routine after a call to ?hetrd.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = (n-1)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed matrix Q differs from an exactly unitary matrix by a matrix E such that ||E||_2 = O(ε), where
ε is the machine precision.
The approximate number of floating-point operations is (16/3)n^3.
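For the complex flavors, the main practical difference from the real case is the mix of types: a, tau, and work are complex, while d and e (the tridiagonal matrix T) are real. The following sketch shows zhetrd followed by zungtr with workspace queries. The Hermitian test matrix and the program name are illustrative assumptions.

```fortran
program hetrd_ungtr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, lda = n
  complex(dp) :: a(lda,n), tau(n-1), wq1(1), wq2(1)
  complex(dp), allocatable :: work(:)
  real(dp) :: d(n), e(n-1)                 ! T is real although A is complex
  integer :: lwork, info, i, j
  external :: zhetrd, zungtr

  ! Lower triangle of an illustrative Hermitian matrix (real diagonal).
  a = (0.0_dp, 0.0_dp)
  do j = 1, n
    a(j,j) = cmplx(real(j + 1, dp), 0.0_dp, dp)
    do i = j + 1, n
      a(i,j) = cmplx(1.0_dp / real(i + j, dp), real(i - j, dp), dp)
    end do
  end do

  ! Workspace queries; the recommended size is returned in the real part of work(1).
  call zhetrd('L', n, a, lda, d, e, tau, wq1, -1, info)
  if (info /= 0) stop 'zhetrd workspace query failed'
  call zungtr('L', n, a, lda, tau, wq2, -1, info)
  if (info /= 0) stop 'zungtr workspace query failed'
  lwork = max(1, int(real(wq1(1))), int(real(wq2(1))))
  allocate(work(lwork))

  call zhetrd('L', n, a, lda, d, e, tau, work, lwork, info)
  if (info /= 0) stop 'zhetrd failed'

  ! Overwrite a with the explicit n-by-n unitary matrix Q.
  call zungtr('L', n, a, lda, tau, work, lwork, info)
  if (info /= 0) stop 'zungtr failed'

  print *, 'diagonal of T:    ', d
  print *, 'off-diagonal of T:', e
end program hetrd_ungtr_sketch
```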
?unmtr
Multiplies a complex matrix by the complex unitary
matrix Q determined by ?hetrd.
Syntax
call cunmtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call zunmtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call unmtr(a, tau, c [,side] [,uplo] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex matrix C by Q or Q^H, where Q is the unitary matrix Q formed by ?hetrd when
reducing a complex Hermitian matrix A to tridiagonal form: A = Q*T*Q^H. Use this routine after a call to
?hetrd.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^H*C,
C*Q, or C*Q^H (overwriting the result on C).
Input Parameters
In the descriptions below, r denotes the order of Q:
If side = 'L', r = m; if side = 'R', r = n.
Output Parameters
c Overwritten by the product Q*C, Q^H*C, C*Q, or C*Q^H (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize (for side = 'L') or lwork = m*blocksize (for
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed product differs from the exact product by a matrix E such that ||E||_2 = O(ε)*||C||_2, where
ε is the machine precision.
The total number of floating-point operations is approximately 8*m^2*n if side = 'L' or 8*n^2*m if side =
'R'.
?orm22/?unm22
Multiplies a general matrix by an orthogonal/unitary
matrix with a 2x2 structure.
Syntax
call sorm22 (side, trans, m, n, n1, n2, q, ldq, c, ldc, work, lwork, info )
call dorm22 (side, trans, m, n, n1, n2, q, ldq, c, ldc, work, lwork, info )
call cunm22(side, trans, m, n, n1, n2, q, ldq, c, ldc, work, lwork, info)
call zunm22(side, trans, m, n, n1, n2, q, ldq, c, ldc, work, lwork, info)
Include Files
mkl.fi
Description
?orm22/?unm22 overwrites the general real/complex m-by-n matrix C with
              side = 'L'    side = 'R'
trans = 'N'   Q * C         C * Q
trans = 'T'   Q^T * C       C * Q^T
trans = 'C'   Q^H * C       C * Q^H
where Q is a real orthogonal/complex unitary matrix of order nq, with nq = m if side = 'L' and nq = n if side
= 'R'.
The orthogonal/unitary matrix Q possesses a 2-by-2 block structure:
Q = [ Q11 Q12 ]
    [ Q21 Q22 ]
where Q12 is an n1-by-n1 lower triangular matrix and Q21 is an n2-by-n2 upper triangular matrix.
Input Parameters
n ≥ 0.
ldc ≥ max(1,m).
Output Parameters
Q*C,
QT*C,
QH * C,
C*QT,
C * QH, or
C*Q.
?sptrd
Reduces a real symmetric matrix to tridiagonal form
using packed storage.
Syntax
call ssptrd(uplo, n, ap, d, e, tau, info)
call dsptrd(uplo, n, ap, d, e, tau, info)
call sptrd(ap, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a packed real symmetric matrix A to symmetric tridiagonal form T by an orthogonal
similarity transformation: A = Q*T*Q^T. The orthogonal matrix Q is not formed explicitly but is represented as
a product of n-1 elementary reflectors. Routines are provided for working with Q in this representation. See
Application Notes below for details.
Input Parameters
Array, size at least max(1, n(n+1)/2). Contains either upper or lower
triangle of A (as specified by uplo) in the packed form described in Matrix
Storage Schemes.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Note that diagonal (d) and off-diagonal (e) elements of the matrix T are omitted because they are kept in the
matrix A on exit.
Application Notes
The matrix Q is represented as a product of n-1 elementary reflectors, as follows:
On exit, tau is stored in tau(i), and v(1:i-1) is stored in AP, overwriting A(1:i-1, i+1).
If uplo = 'L', Q = H(1)H(2) ... H(n-1)
On exit, tau is stored in tau(i), and v(i+2:n) is stored in AP, overwriting A(i+2:n, i).
The computed matrix T is exactly similar to a matrix A+E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision. The approximate number of floating-point
operations is (4/3)n^3.
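The packed-storage workflow can be sketched as follows, assuming the usual column-wise packing for uplo = 'U', in which element A(i,j) with i ≤ j is stored in ap(i + (j-1)*j/2). The full matrix afull, its values, and the program name are illustrative only; dopgtr is then used to form Q explicitly, with a workspace of n-1 elements.

```fortran
program sptrd_opgtr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, ldq = n
  real(dp) :: afull(n,n), ap(n*(n+1)/2), d(n), e(n-1), tau(n-1)
  real(dp) :: q(ldq,n), work(n-1)
  integer :: info, i, j
  external :: dsptrd, dopgtr

  ! Illustrative symmetric matrix in full storage.
  do j = 1, n
    do i = 1, n
      afull(i,j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do

  ! Pack the upper triangle column by column: A(i,j), i <= j, goes to ap(i + (j-1)*j/2).
  do j = 1, n
    do i = 1, j
      ap(i + (j-1)*j/2) = afull(i,j)
    end do
  end do

  ! Reduce the packed matrix to tridiagonal form; no lwork argument is needed.
  call dsptrd('U', n, ap, d, e, tau, info)
  if (info /= 0) stop 'dsptrd failed'

  ! Form the orthogonal matrix Q explicitly in q (use the same uplo).
  call dopgtr('U', n, ap, tau, q, ldq, work, info)
  if (info /= 0) stop 'dopgtr failed'

  print *, 'diagonal of T:    ', d
  print *, 'off-diagonal of T:', e
  print *, 'first column of Q:', q(:,1)
end program sptrd_opgtr_sketch
```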
?opgtr
Generates the real orthogonal matrix Q determined
by ?sptrd.
Syntax
call sopgtr(uplo, n, ap, tau, q, ldq, work, info)
call dopgtr(uplo, n, ap, tau, q, ldq, work, info)
call opgtr(ap, tau, q [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the n-by-n orthogonal matrix Q formed by ?sptrd when reducing a packed real
symmetric matrix A to tridiagonal form: A = Q*T*Q^T. Use this routine after a call to ?sptrd.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'. Use the same uplo as supplied to
?sptrd.
ldq INTEGER. The leading dimension of the output array q; at least max(1, n).
Workspace array, size at least max(1, n-1).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed matrix Q differs from an exactly orthogonal matrix by a matrix E such that ||E||_2 = O(ε),
where ε is the machine precision.
The approximate number of floating-point operations is (4/3)n^3.
?opmtr
Multiplies a real matrix by the real orthogonal matrix
Q determined by ?sptrd.
Syntax
call sopmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call dopmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call opmtr(ap, tau, c [,side] [,uplo] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real matrix C by Q or Q^T, where Q is the orthogonal matrix Q formed by ?sptrd when
reducing a packed real symmetric matrix A to tridiagonal form: A = Q*T*Q^T. Use this routine after a call to
?sptrd.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^T*C,
C*Q, or C*Q^T (overwriting the result on C).
Input Parameters
In the descriptions below, r denotes the order of Q:
If side = 'L', r = m; if side = 'R', r = n.
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
Application Notes
The computed product differs from the exact product by a matrix E such that ||E||_2 = O(ε)*||C||_2, where
ε is the machine precision.
The total number of floating-point operations is approximately 2*m^2*n if side = 'L', or 2*n^2*m if side =
'R'.
The complex counterpart of this routine is upmtr.
?hptrd
Reduces a complex Hermitian matrix to tridiagonal
form using packed storage.
Syntax
call chptrd(uplo, n, ap, d, e, tau, info)
call zhptrd(uplo, n, ap, d, e, tau, info)
call hptrd(ap, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a packed complex Hermitian matrix A to symmetric tridiagonal form T by a unitary
similarity transformation: A = Q*T*Q^H. The unitary matrix Q is not formed explicitly but is represented as a
product of n-1 elementary reflectors. Routines are provided for working with Q in this representation (see
Application Notes below).
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
uplo Must be 'U' or 'L'. The default value is 'U'.
Note that diagonal (d) and off-diagonal (e) elements of the matrix T are omitted because they are kept in the
matrix A on exit.
Application Notes
The computed matrix T is exactly similar to a matrix A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision.
The approximate number of floating-point operations is (16/3)*n^3.
?upgtr
Generates the complex unitary matrix Q determined
by ?hptrd.
Syntax
call cupgtr(uplo, n, ap, tau, q, ldq, work, info)
call zupgtr(uplo, n, ap, tau, q, ldq, work, info)
call upgtr(ap, tau, q [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the n-by-n unitary matrix Q formed by ?hptrd when reducing a packed
complex Hermitian matrix A to tridiagonal form: A = Q*T*Q^H. Use this routine after a call to ?hptrd.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'. Use the same uplo as supplied to
?hptrd.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed matrix Q differs from an exactly unitary matrix by a matrix E such that ||E||_2 = O(ε),
where ε is the machine precision.
The approximate number of floating-point operations is (16/3)*n^3.
?upmtr
Multiplies a complex matrix by the unitary matrix Q
determined by ?hptrd.
Syntax
call cupmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call zupmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call upmtr(ap, tau, c [,side] [,uplo] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex matrix C by Q or Q^H, where Q is the unitary matrix formed by ?hptrd when
reducing a packed complex Hermitian matrix A to tridiagonal form: A = Q*T*Q^H. Use this routine after a call
to ?hptrd.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^H*C,
C*Q, or C*Q^H (overwriting the result on C).
Input Parameters
In the descriptions below, r denotes the order of Q:
If side = 'L', r = m; if side = 'R', r = n.
Output Parameters
c Overwritten by the product Q*C, Q^H*C, C*Q, or C*Q^H (as specified by side
and trans).
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed product differs from the exact product by a matrix E such that ||E||_2 = O(ε)*||C||_2, where
ε is the machine precision.
The total number of floating-point operations is approximately 8*m^2*n if side = 'L' or 8*n^2*m if side =
'R'.
The real counterpart of this routine is opmtr.
?sbtrd
Reduces a real symmetric band matrix to tridiagonal
form.
Syntax
call ssbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call dsbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call sbtrd(ab[, q] [,vect] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a real symmetric band matrix A to symmetric tridiagonal form T by an orthogonal
similarity transformation: A = Q*T*Q^T. The orthogonal matrix Q is determined as a product of Givens
rotations.
If required, the routine can also form the matrix Q explicitly.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If present, vect must be equal to 'V' or 'U' and the argument q must also be
present. Note that there will be an error condition if vect is present and q
omitted.
Note that diagonal (d) and off-diagonal (e) elements of the matrix T are omitted because they are kept in the
matrix A on exit.
Application Notes
The computed matrix T is exactly similar to a matrix A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision. The computed matrix Q differs from an
exactly orthogonal matrix by a matrix E such that ||E||_2 = O(ε).
The total number of floating-point operations is approximately 6*n^2*kd if vect = 'N', with 3*n^3*(kd-1)/kd
additional operations if vect = 'V'.
?hbtrd
Reduces a complex Hermitian band matrix to
tridiagonal form.
Syntax
call chbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call zhbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call hbtrd(ab [, q] [,vect] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a complex Hermitian band matrix A to symmetric tridiagonal form T by a unitary
similarity transformation: A = Q*T*Q^H. The unitary matrix Q is determined as a product of Givens rotations.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or restorable arguments, see LAPACK 95
Interface Conventions.
Specific details for the routine hbtrd interface are the following:
If present, vect must be equal to 'V' or 'U' and the argument q must also be
present. Note that there will be an error condition if vect is present and q
omitted.
Note that diagonal (d) and off-diagonal (e) elements of the matrix T are omitted because they are kept in the
matrix A on exit.
Application Notes
The computed matrix T is exactly similar to a matrix A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision. The computed matrix Q differs from an
exactly unitary matrix by a matrix E such that ||E||_2 = O(ε).
The total number of floating-point operations is approximately 20*n^2*kd if vect = 'N', with
10*n^3*(kd-1)/kd additional operations if vect = 'V'.
The real counterpart of this routine is sbtrd.
?sterf
Computes all eigenvalues of a real symmetric
tridiagonal matrix using QR algorithm.
Syntax
call ssterf(n, d, e, info)
call dsterf(n, d, e, info)
call sterf(d, e [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues of a real symmetric tridiagonal matrix T (which can be obtained by
reducing a symmetric or Hermitian matrix to tridiagonal form). The routine uses a square-root-free variant of
the QR algorithm.
If you need not only the eigenvalues but also the eigenvectors, call steqr.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info = i, the algorithm failed to find all the eigenvalues after 30n
iterations:
i off-diagonal elements have not converged to zero. On exit, d and e
contain, respectively, the diagonal and off-diagonal elements of a
tridiagonal matrix orthogonally similar to T.
If info = -i, the i-th parameter had an illegal value.
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix T + E such that ||E||_2 = O(ε)*||T||_2,
where ε is the machine precision.
If λ_i is an exact eigenvalue, and μ_i is the corresponding computed value, then
|μ_i - λ_i| ≤ c(n)*ε*||T||_2
where c(n) is a modestly increasing function of n.
The total number of floating-point operations depends on how rapidly the algorithm converges. Typically, it is
about 14*n^2.
?steqr
Computes all eigenvalues and eigenvectors of a
symmetric or Hermitian matrix reduced to tridiagonal
form (QR algorithm).
Syntax
call ssteqr(compz, n, d, e, z, ldz, work, info)
call dsteqr(compz, n, d, e, z, ldz, work, info)
call csteqr(compz, n, d, e, z, ldz, work, info)
call zsteqr(compz, n, d, e, z, ldz, work, info)
call rsteqr(d, e [,z] [,compz] [,info])
call steqr(d, e [,z] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and (optionally) all the eigenvectors of a real symmetric tridiagonal
matrix T. In other words, the routine can compute the spectral factorization: T = Z*Λ*Z^T. Here Λ is a
diagonal matrix whose diagonal elements are the eigenvalues λ_i; Z is an orthogonal matrix whose columns
are eigenvectors. Thus,
T*z_i = λ_i*z_i for i = 1, 2, ..., n.
The routine normalizes the eigenvectors so that ||z_i||_2 = 1.
You can also use the routine for computing the eigenvalues and eigenvectors of an arbitrary real symmetric
(or complex Hermitian) matrix A reduced to tridiagonal form T: A = Q*T*Q^H. In this case, the spectral
factorization is as follows: A = Q*T*Q^H = (Q*Z)*Λ*(Q*Z)^H. Before calling ?steqr, you must reduce A to
tridiagonal form and generate the explicit matrix Q by calling the following routines:
If you need eigenvalues only, it's more efficient to call sterf. If T is positive-definite, pteqr can compute small
eigenvalues more accurately than ?steqr.
To solve the problem by a single call, use one of the divide and conquer routines stevd, syevd, spevd, or
sbevd for real symmetric matrices or heevd, hpevd, or hbevd for complex Hermitian matrices.
Input Parameters
If compz = 'V', z must contain the orthogonal matrix used in the reduction
to tridiagonal form.
The second dimension of z must be:
at least 1 if compz = 'N';
Output Parameters
z If info = 0, contains the n-by-n matrix the columns of which are
orthonormal eigenvectors (the i-th column corresponds to the i-th
eigenvalue).
info INTEGER.
If info = 0, the execution is successful.
If info = i, the algorithm failed to find all the eigenvalues after 30n
iterations: i off-diagonal elements have not converged to zero. On exit, d
and e contain, respectively, the diagonal and off-diagonal elements of a
tridiagonal matrix orthogonally similar to T.
If info = -i, the i-th parameter had an illegal value.
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix T + E such that ||E||_2 = O(ε)*||T||_2,
where ε is the machine precision.
If λ_i is an exact eigenvalue, and μ_i is the corresponding computed value, then
|μ_i - λ_i| ≤ c(n)*ε*||T||_2
where c(n) is a modestly increasing function of n.
If z_i is the corresponding exact eigenvector, and w_i is the corresponding computed vector, then the angle
θ(z_i, w_i) between them is bounded as follows:
θ(z_i, w_i) ≤ c(n)*ε*||T||_2 / min_{i≠j} |λ_i - λ_j|.
The total number of floating-point operations depends on how rapidly the algorithm converges. Typically, it is
about
24*n^2 if compz = 'N';
7*n^3 (for complex flavors, 14*n^3) if compz = 'V' or 'I'.
?stemr
Computes selected eigenvalues and eigenvectors of a
real symmetric tridiagonal matrix.
Syntax
call sstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz, tryrac,
work, lwork, iwork, liwork, info)
call dstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz, tryrac,
work, lwork, iwork, liwork, info)
call cstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz, tryrac,
work, lwork, iwork, liwork, info)
call zstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz, tryrac,
work, lwork, iwork, liwork, info)
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal
matrix T. Any such unreduced matrix has a well-defined set of pairwise different real eigenvalues, and the
corresponding real eigenvectors are pairwise orthogonal.
The spectrum may be computed either completely or partially by specifying either an interval (vl,vu] or a
range of indices il:iu for the desired eigenvalues.
Depending on the number of desired eigenvalues, these are computed either by bisection or the dqds
algorithm. Numerically orthogonal eigenvectors are computed by the use of various suitable L*D*L^T
factorizations near clusters of close eigenvalues (referred to as RRRs, Relatively Robust Representations). An
informal sketch of the algorithm follows.
For each unreduced block (submatrix) of T,
a. Compute T - sigma*I = L*D*L^T, so that L and D define all the wanted eigenvalues to high relative
accuracy. This means that small relative changes in the entries of L and D cause only small relative
changes in the eigenvalues and eigenvectors. The standard (unfactored) representation of the
tridiagonal matrix T does not have this property in general.
b. Compute the eigenvalues to suitable accuracy. If the eigenvectors are desired, the algorithm attains full
accuracy of the computed eigenvalues only right before the corresponding vectors have to be
computed, see steps c and d.
c. For each cluster of close eigenvalues, select a new shift close to the cluster, find a new factorization,
and refine the shifted eigenvalues to suitable accuracy.
d. For each eigenvalue with a large enough relative separation compute the corresponding eigenvector by
forming a rank revealing twisted factorization. Go back to step c for any clusters that remain.
Input Parameters
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0.
ldz ≥ 1 otherwise.
If nzc = -1, then a workspace query is assumed; the routine calculates the
number of columns of the array z that are needed to hold the eigenvectors.
This value is returned as the first entry of the array z, and no error
message related to nzc is issued by the routine xerbla.
tryrac LOGICAL.
If tryrac = .TRUE., the code should check whether
the tridiagonal matrix defines its eigenvalues to high relative accuracy. If
so, the code uses relative-accuracy preserving algorithms that might be (a
bit) slower depending on the matrix. If the matrix does not define its
eigenvalues to high relative accuracy, the code can use possibly faster
algorithms.
If tryrac = .FALSE., the code is not required to guarantee
relatively accurate eigenvalues and can use the fastest possible techniques.
lwork INTEGER.
The dimension of the array work,
lwork ≥ max(1, 18*n).
If lwork=-1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
iwork INTEGER.
Workspace array, size (liwork).
liwork INTEGER.
The dimension of the array iwork.
liwork ≥ max(1, 10*n) if the eigenvectors are desired, and liwork ≥ max(1,
8*n) if only the eigenvalues are to be computed.
If liwork=-1, then a workspace query is assumed; the routine only
calculates the optimal size of the iwork array, returns this value as the first
entry of the iwork array, and no error message related to liwork is issued by
xerbla.
Output Parameters
m INTEGER.
The total number of eigenvalues found, 0 ≤ m ≤ n.
If jobz = 'V', and info = 0, then the first m columns of z contain the
orthonormal eigenvectors of the matrix T corresponding to the selected
eigenvalues, with the i-th column of z holding the eigenvector associated
with w(i).
Note: you must ensure that at least max(1,m) columns are supplied in the
array z ; if range = 'V', the exact value of m is not known in advance and
can be computed with a workspace query by setting nzc=-1, see
description of the parameter nzc.
isuppz INTEGER.
Array, size (2*max(1, m)).
tryrac On exit, a .TRUE. value of tryrac is reset to .FALSE. if the matrix does not define its
eigenvalues to high relative accuracy.
work(1) On exit, if info = 0, then work(1) returns the optimal (and minimal) size
of lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the optimal size of liwork.
info INTEGER.
If info = 0, the execution is successful.
?stedc
Computes all eigenvalues and eigenvectors of a
symmetric tridiagonal matrix using the divide and
conquer method.
Syntax
call sstedc(compz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call dstedc(compz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call cstedc(compz, n, d, e, z, ldz, work, lwork, rwork, lrwork, iwork, liwork, info)
call zstedc(compz, n, d, e, z, ldz, work, lwork, rwork, lrwork, iwork, liwork, info)
call rstedc(d, e [,z] [,compz] [,info])
call stedc(d, e [,z] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and (optionally) all the eigenvectors of a symmetric tridiagonal
matrix using the divide and conquer method. The eigenvectors of a full or band real symmetric or complex
Hermitian matrix can also be found if sytrd/hetrd or sptrd/hptrd or sbtrd/hbtrd has been used to reduce this
matrix to tridiagonal form.
See also laed0, laed1, laed2, laed3, laed4, laed5, laed6, laed7, laed8, laed9, and laeda used by this function.
Input Parameters
DOUBLE PRECISION for dstedc
COMPLEX for cstedc
DOUBLE COMPLEX for zstedc.
Arrays: z(ldz, *), work(*).
If compz = 'V', then, on entry, z must contain the orthogonal/unitary
matrix used to reduce the original matrix to tridiagonal form.
The second dimension of z must be at least max(1, n).
work is a workspace array, its dimension max(1, lwork).
lrwork INTEGER. The dimension of the array rwork (used for complex flavors only).
If compz = 'N', or n ≤ 1, lrwork must be at least 1.
Note that for compz = 'V' or 'I', and if n is less than or equal to the
minimum divide size, usually 25, then lrwork need only be max(1,
2*(n-1)).
Note that for compz = 'V' or 'I', and if n is less than or equal to the
minimum divide size, usually 25, then liwork need only be 1.
Output Parameters
rwork(1) On exit, if info = 0, then rwork(1) returns the optimal lrwork (for complex
flavors only).
info INTEGER.
If info = 0, the execution is successful.
Specific details for the routine stedc interface are the following:
If present, compz must be equal to 'I' or 'V' and the argument z must also be
present. Note that there will be an error condition if compz is present and z
omitted.
Note that two variants of the Fortran 95 interface for the stedc routine are needed because the choice
between the real and complex cases is ambiguous when z and work are omitted. Thus, the name rstedc is used in
real cases (single or double precision), and the name stedc is used in complex cases (single or double
precision).
Application Notes
The required size of workspace arrays must be as follows.
For sstedc/dstedc:
If compz = 'V' and n > 1, then lwork must be at least 1 + 3*n + 2*n*log2(n) + 4*n^2, where log2(n) is the
smallest integer k such that 2^k ≥ n.
For cstedc/zstedc:
If compz = 'V' and n > 1, lrwork must be at least 1 + 3*n + 2*n*log2(n) + 4*n^2, where log2(n) is the smallest
integer k such that 2^k ≥ n.
The required value of liwork for complex flavors is the same as for real flavors.
If lwork (or liwork or lrwork, if supplied) is equal to -1, then the routine returns immediately and provides the
recommended workspace in the first element of the corresponding array (work, iwork, rwork). This operation
is called a workspace query.
Note that if lwork (liwork, lrwork) is less than the minimal required value and is not equal to -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
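The minimum size quoted above for compz = 'V' can be evaluated before allocating the work array (or rwork for the complex flavors). The following Python sketch is illustrative only: the helper names ceil_log2 and stedc_min_size_v are hypothetical and not part of the library, and the sketch covers only the compz = 'V', n > 1 case stated in these notes. A workspace query (lwork = -1) remains the way to obtain the recommended size.

```python
def ceil_log2(n):
    """Smallest integer k such that 2**k >= n (for n >= 1)."""
    k = 0
    while (1 << k) < n:
        k += 1
    return k


def stedc_min_size_v(n):
    """Minimum lwork (sstedc/dstedc) or lrwork (cstedc/zstedc) for compz = 'V', n > 1.

    Hypothetical helper: evaluates 1 + 3*n + 2*n*log2(n) + 4*n^2 with
    log2(n) taken as the smallest integer k such that 2^k >= n.
    """
    if n <= 1:
        raise ValueError("the formula applies only for n > 1")
    return 1 + 3 * n + 2 * n * ceil_log2(n) + 4 * n * n


# Example: n = 8 gives 1 + 24 + 48 + 256
print(stedc_min_size_v(8))
```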
?stegr
Computes selected eigenvalues and eigenvectors of a
real symmetric tridiagonal matrix.
Syntax
call sstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call dstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call cstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call zstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call rstegr(d, e, w [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
call stegr(d, e, w [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal
matrix T.
The spectrum may be computed either completely or partially by specifying either an interval (vl,vu] or a
range of indices il:iu for the desired eigenvalues.
?stegr is a compatibility wrapper around the improved stemr routine. See its description for further details.
Note that the abstol parameter no longer provides any benefit and hence is no longer used.
See also the auxiliary routines lasq2, lasq5, and lasq6, used by this routine.
Input Parameters
d(*) contains the diagonal elements of T.
The dimension of d must be at least max(1, n).
e(*) contains the subdiagonal elements of T in elements 1 to n-1; e(n)
need not be set on input, but it is used as a workspace.
The dimension of e must be at least max(1, n).
work(lwork) is a workspace array.
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0.
lwork INTEGER.
The dimension of the array work,
lwork ≥ max(1, 18*n) if jobz = 'V', and
lwork ≥ max(1, 12*n) if jobz = 'N'.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla. See Application Notes below for details.
iwork INTEGER.
Workspace array, size (liwork).
liwork INTEGER.
The dimension of the array iwork, liwork ≥ max(1, 10*n) if the
eigenvectors are desired, and liwork ≥ max(1, 8*n) if only the
eigenvalues are to be computed.
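The lwork and liwork lower bounds listed for ?stegr depend only on n and jobz, so they can be tabulated ahead of a call. The Python sketch below is a hypothetical helper (stegr_min_workspace is not a library routine) that simply restates those bounds; it does not replace a workspace query with lwork = -1 or liwork = -1.

```python
def stegr_min_workspace(n, jobz):
    """Return (lwork, liwork) minimum sizes as listed for ?stegr.

    jobz = 'V': eigenvectors desired; jobz = 'N': eigenvalues only.
    """
    if jobz not in ("N", "V"):
        raise ValueError("jobz must be 'N' or 'V'")
    want_vectors = jobz == "V"
    lwork = max(1, (18 if want_vectors else 12) * n)
    liwork = max(1, (10 if want_vectors else 8) * n)
    return lwork, liwork


print(stegr_min_workspace(10, "V"))
print(stegr_min_workspace(10, "N"))
```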
Output Parameters
Note: if range = 'V', the exact value of m is not known in advance and an
upper bound must be used. Assuming m = n (that is, supplying n columns) is always safe.
isuppz INTEGER.
Array, size at least (2*max(1, m)).
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
vl Default value for this argument is vl = - HUGE (vl) where HUGE(a) means the
largest machine number of the same precision as argument a.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
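The rule for restoring range from the optional arguments can be written as a small decision function. This Python sketch is only a model of the rule as stated above: the function name restore_range is hypothetical, and None stands in for an omitted optional argument.

```python
def restore_range(vl=None, vu=None, il=None, iu=None):
    """Model of how range is restored from the presence of vl, vu, il, iu."""
    by_value = vl is not None or vu is not None
    by_index = il is not None or iu is not None
    if by_value and by_index:
        # Mixing interval bounds with index bounds is an error condition.
        raise ValueError("vl/vu and il/iu must not be present together")
    if by_value:
        return "V"
    if by_index:
        return "I"
    return "A"


print(restore_range())            # no optional bounds: all eigenvalues
print(restore_range(vl=0.0))      # interval selection
print(restore_range(il=1, iu=3))  # index selection
```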
Note that two variants of the Fortran 95 interface for the stegr routine are needed because the choice
between the real and complex cases is ambiguous when z is omitted. Thus, the name rstegr is used in real cases
(single or double precision), and the name stegr is used in complex cases (single or double precision).
Application Notes
?stegr works only on machines which follow IEEE-754 floating-point standard in their handling of infinities
and NaNs. Normal execution of ?stegr may create NaNs and infinities and hence may abort due to a floating
point exception in environments which do not conform to the IEEE-754 standard.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run, or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any admissible size, that is, no less than the minimal value described, then the
routine completes the task, though probably not as fast as with the recommended workspace, and provides the
recommended workspace in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), then the routine returns immediately and provides the recommended
workspace in the first element of the corresponding array (work, iwork). This operation is called a workspace
query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, then the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
?pteqr
Computes all eigenvalues and (optionally) all
eigenvectors of a real symmetric positive-definite
tridiagonal matrix.
Syntax
call spteqr(compz, n, d, e, z, ldz, work, info)
call dpteqr(compz, n, d, e, z, ldz, work, info)
call cpteqr(compz, n, d, e, z, ldz, work, info)
call zpteqr(compz, n, d, e, z, ldz, work, info)
call rpteqr(d, e [,z] [,compz] [,info])
call pteqr(d, e [,z] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and (optionally) all the eigenvectors of a real symmetric
positive-definite tridiagonal matrix T. In other words, the routine can compute the spectral factorization:
T = Z*Λ*Z^T.
Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λ_i; Z is an orthogonal matrix whose
columns are eigenvectors. Thus,
T*z_i = λ_i*z_i for i = 1, 2, ..., n.
(The routine normalizes the eigenvectors so that ||z_i||_2 = 1.)
You can also use the routine for computing the eigenvalues and eigenvectors of real symmetric (or complex
Hermitian) positive-definite matrices A reduced to tridiagonal form T: A = Q*T*Q^H. In this case, the spectral
factorization is as follows: A = Q*T*Q^H = (Q*Z)*Λ*(Q*Z)^H. Before calling ?pteqr, you must reduce A to
tridiagonal form and generate the explicit matrix Q by calling the following routines:
for real matrices: for complex matrices:
The routine first factorizes T as L*D*L^H where L is a unit lower bidiagonal matrix, and D is a diagonal matrix.
Then it forms the bidiagonal matrix B = L*D^(1/2) and calls ?bdsqr to compute the singular values of B, which
are the square roots of the eigenvalues of T.
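The factorization step can be illustrated for the real case with a short Python sketch. It is a textbook recurrence written for illustration, not the library's implementation: the function names are hypothetical, the input is assumed to be a real symmetric positive-definite tridiagonal matrix given by its diagonal d and off-diagonal e, and the singular value computation performed by ?bdsqr is not reproduced. Since B*B^T = L*D*L^T = T, the squared singular values of B are the eigenvalues of T.

```python
import math


def ldlt_tridiagonal(d, e):
    """Factor T = L*D*L^T for a real symmetric positive-definite tridiagonal T.

    d: diagonal of T (length n); e: off-diagonal of T (length n-1).
    Returns (pivots, mult): the diagonal of D and the subdiagonal of the
    unit lower bidiagonal matrix L.
    """
    pivots = [d[0]]
    mult = []
    for i in range(len(e)):
        if pivots[i] <= 0.0:
            raise ValueError("matrix is not positive-definite")
        l = e[i] / pivots[i]
        mult.append(l)
        pivots.append(d[i + 1] - l * e[i])
    if pivots[-1] <= 0.0:
        raise ValueError("matrix is not positive-definite")
    return pivots, mult


def bidiagonal_factor(d, e):
    """Return (diag, sub) of the lower bidiagonal matrix B = L*D^(1/2)."""
    pivots, mult = ldlt_tridiagonal(d, e)
    root = [math.sqrt(p) for p in pivots]
    sub = [mult[i] * root[i] for i in range(len(mult))]
    return root, sub


# T = [[4, 2], [2, 5]]: B has diagonal (2, 2) and subdiagonal (1)
diag, sub = bidiagonal_factor([4.0, 5.0], [2.0])
print(diag, sub)
```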
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Note that two variants of the Fortran 95 interface for the pteqr routine are needed because the choice
between the real and complex cases is ambiguous when z is omitted. Thus, the name rpteqr is used in real cases
(single or double precision), and the name pteqr is used in complex cases (single or double precision).
Application Notes
If λ_i is an exact eigenvalue, and μ_i is the corresponding computed value, then
|μ_i - λ_i| ≤ c(n)*ε*K*λ_i
where c(n) is a modestly increasing function of n, ε is the machine precision, and K = ||D*T*D||_2 *
||(D*T*D)^(-1)||_2, D is diagonal with d_ii = t_ii^(-1/2).
If z_i is the corresponding exact eigenvector, and w_i is the corresponding computed vector, then the angle
θ(z_i, w_i) between them is bounded as follows:
θ(z_i, w_i) ≤ c(n)*ε*K / min_{i≠j}(|λ_i - λ_j|/|λ_i + λ_j|).
Here min_{i≠j}(|λ_i - λ_j|/|λ_i + λ_j|) is the relative gap between λ_i and the other eigenvalues.
The total number of floating-point operations depends on how rapidly the algorithm converges.
Typically, it is about
30*n^2 if compz = 'N';
6*n^3 (for complex flavors, 12*n^3) if compz = 'V' or 'I'.
?stebz
Computes selected eigenvalues of a real symmetric
tridiagonal matrix by bisection.
Syntax
call sstebz (range, order, n, vl, vu, il, iu, abstol, d, e, m, nsplit, w, iblock,
isplit, work, iwork, info)
call dstebz (range, order, n, vl, vu, il, iu, abstol, d, e, m, nsplit, w, iblock,
isplit, work, iwork, info)
call stebz(d, e, m, nsplit, w, iblock, isplit [, order] [,vl] [,vu] [,il] [,iu]
[,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes some (or all) of the eigenvalues of a real symmetric tridiagonal matrix T by bisection.
The routine searches for zero or negligible off-diagonal elements to see if T splits into block-diagonal form T
= diag(T1, T2, ...). Then it performs bisection on each of the blocks Ti and returns the block index of
each computed eigenvalue, so that a subsequent call to stein can also take advantage of the block structure.
See also laebz.
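The idea of locating eigenvalues of a symmetric tridiagonal matrix by bisection can be sketched in a few lines of Python. This is a textbook Sturm-count bisection given for illustration under stated assumptions, not the algorithm as implemented in ?stebz: it does not split T into blocks, it ignores abstol, and the function names and the tiny stand-in value used to avoid division by zero are hypothetical choices.

```python
def count_below(d, e, x):
    """Number of eigenvalues of the symmetric tridiagonal matrix (d, e) below x.

    d: diagonal (length n); e: off-diagonal (length n-1).
    Counts negative pivots of the factorization of T - x*I.
    """
    count = 0
    q = 1.0
    for i in range(len(d)):
        q = d[i] - x - (e[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = -1e-300  # assumption: tiny nonzero stand-in for a zero pivot
        if q < 0.0:
            count += 1
    return count


def kth_eigenvalue(d, e, k, iterations=100):
    """k-th smallest eigenvalue (k = 1..n) by bisection on Gershgorin bounds."""
    n = len(d)
    radius = [(abs(e[i - 1]) if i > 0 else 0.0) + (abs(e[i]) if i < n - 1 else 0.0)
              for i in range(n)]
    lo = min(d[i] - radius[i] for i in range(n))
    hi = max(d[i] + radius[i] for i in range(n))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_below(d, e, mid) >= k:
            hi = mid  # at least k eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# T = [[2, -1], [-1, 2]] has eigenvalues 1 and 3
print(kth_eigenvalue([2.0, 2.0], [-1.0], 1))
print(kth_eigenvalue([2.0, 2.0], [-1.0], 2))
```

Each bisection step halves the interval, so the cost per eigenvalue is proportional to n times the number of steps.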
Input Parameters
Output Parameters
w REAL for sstebz
DOUBLE PRECISION for dstebz.
Array, size at least max(1, n). The computed eigenvalues, stored in w(1) to
w(m).
info INTEGER.
If info = 0, the execution is successful.
If info = 3:
vl Default value for this argument is vl = - HUGE (vl) where HUGE(a) means the
largest machine number of the same precision as argument a.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present.
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
The eigenvalues of T are computed to high relative accuracy which means that if they vary widely in
magnitude, then any small eigenvalues will be computed more accurately than, for example, with the
standard QR method. However, the reduction to tridiagonal form (prior to calling the routine) may exclude
the possibility of obtaining high relative accuracy in the small eigenvalues of the original matrix if its
eigenvalues vary widely in magnitude.
?stein
Computes the eigenvectors corresponding to specified
eigenvalues of a real symmetric tridiagonal matrix.
Syntax
call sstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call dstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call cstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call zstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call stein(d, e, w, iblock, isplit, z [,ifailv] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the eigenvectors of a real symmetric tridiagonal matrix T corresponding to specified
eigenvalues, by inverse iteration. It is designed to be used in particular after the specified eigenvalues have
been computed by ?stebz with order = 'B', but may also be used when the eigenvalues have been
computed by other routines.
If you use this routine after ?stebz, it can take advantage of the block structure by performing inverse
iteration on each block Ti separately, which is more efficient than using the whole matrix T.
If T has been formed by reduction of a full symmetric or Hermitian matrix A to tridiagonal form, you can
transform eigenvectors of T to eigenvectors of A by calling ?ormtr or ?opmtr (for real flavors) or by calling
?unmtr or ?upmtr (for complex flavors).
Input Parameters
If you did not call ?stebz with order = 'B', set all elements of iblock to
1, and isplit(1) to n.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ max(1, n).
iwork INTEGER.
Workspace array, size at least max(1, n).
Output Parameters
ifailv INTEGER.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
Each computed eigenvector z_i is an exact eigenvector of a matrix T + E_i, where ||E_i||_2 = O(ε)*||T||_2.
However, a set of eigenvectors computed by this routine may not be orthogonal to so high a degree of
accuracy as those computed by ?steqr.
?disna
Computes the reciprocal condition numbers for the
eigenvectors of a symmetric/Hermitian matrix or for
the left or right singular vectors of a general matrix.
Syntax
call sdisna(job, m, n, d, sep, info)
call ddisna(job, m, n, d, sep, info)
call disna(d, sep [,job] [,minmn] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the reciprocal condition numbers for the eigenvectors of a real symmetric or complex
Hermitian matrix or for the left or right singular vectors of a general m-by-n matrix.
The reciprocal condition number is the 'gap' between the corresponding eigenvalue or singular value and the
nearest other one.
The bound on the error, measured by angle in radians, in the i-th computed vector is given by
?lamch('E')*(anorm/sep(i))
where anorm = ||A||2 = max( |d(j)| ). sep(i) is not allowed to be smaller than ?lamch('E')*anorm, in
order to limit the size of the error bound.
?disna may also be used to compute error bounds for eigenvectors of the generalized symmetric definite
eigenproblem.
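As an illustration of the error bound above, the following sketch calls ddisna for a symmetric eigenproblem and evaluates ?lamch('E')*(anorm/sep(i)) for each eigenvector. The eigenvalues in d are hypothetical sample data, the variable names errbd and eps are not part of the library, and the program assumes linking against MKL or another LAPACK implementation.

```fortran
program disna_error_bounds
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3              ! hypothetical matrix order
  real(dp) :: d(n), sep(n), errbd(n)
  real(dp) :: anorm, eps
  integer  :: i, info
  real(dp), external :: dlamch
  external :: ddisna

  ! Hypothetical eigenvalues of a symmetric matrix, in increasing order.
  d = [1.0_dp, 2.0_dp, 4.0_dp]

  ! job = 'E': reciprocal condition numbers for eigenvectors.
  ! For job = 'E' the order is passed as m; n is ignored.
  call ddisna('E', n, n, d, sep, info)
  if (info /= 0) stop 'ddisna failed'

  eps   = dlamch('E')          ! relative machine precision
  anorm = maxval(abs(d))       ! ||A||2 for a symmetric matrix

  ! Approximate bound on the angle (in radians) between the computed
  ! and the exact i-th eigenvector.
  do i = 1, n
    errbd(i) = eps*(anorm/sep(i))
    print '(a,i2,2(a,es12.4))', 'i =', i, '  sep =', sep(i), '  bound =', errbd(i)
  end do
end program disna_error_bounds
```

A small sep(i), meaning an eigenvalue close to its nearest neighbour, gives a large bound, which is why tightly clustered eigenvalues have poorly determined eigenvectors.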
Input Parameters
job CHARACTER*1. Must be 'E','L', or 'R'. Specifies for which problem the
reciprocal condition numbers should be computed:
job = 'E': for the eigenvectors of a symmetric/Hermitian matrix;
n INTEGER.
If job = 'L' or 'R', the number of columns of the matrix (n ≥ 0). Ignored
if job = 'E'.
This array must contain the eigenvalues (if job = 'E') or singular values
(if job = 'L' or 'R') of the matrix, in either increasing or decreasing
order.
If singular values, they must be non-negative.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
minmn Indicates which of the values m or n is smaller. Must be either 'M' or 'N', the
default is 'M'.
If job = 'E', this argument is superfluous. If job = 'L' or 'R', this argument
is used by the routine.
?sygst
Reduces a real symmetric-definite generalized
eigenvalue problem to the standard form.
Syntax
call ssygst(itype, uplo, n, a, lda, b, ldb, info)
call dsygst(itype, uplo, n, a, lda, b, ldb, info)
call sygst(a, b [,itype] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
Input Parameters
If uplo = 'L', the array a stores the lower triangle of A; you must supply
B in the factored form B = L*L^T.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
Forming the reduced matrix C is a stable procedure. However, it involves implicit multiplication by inv(B) (if
itype = 1) or B (if itype = 2 or 3). When the routine is used as a step in the computation of eigenvalues
and eigenvectors of the original problem, there may be a significant loss of accuracy if B is ill-conditioned
with respect to inversion.
The approximate number of floating-point operations is n^3.
?hegst
Reduces a complex Hermitian positive-definite
generalized eigenvalue problem to the standard form.
Syntax
call chegst(itype, uplo, n, a, lda, b, ldb, info)
call zhegst(itype, uplo, n, a, lda, b, ldb, info)
call hegst(a, b [,itype] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a complex Hermitian positive-definite generalized eigenvalue problem to standard form.
If itype = 3, the generalized eigenproblem is B*A*x = λ*x.
Before calling this routine, you must call ?potrf to compute the Cholesky factorization: B = U^H*U or B =
L*L^H.
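The calling order described above (?potrf on B, then ?hegst on A) can be sketched as follows. The 2-by-2 matrices are hypothetical sample data; itype = 1 is taken to mean A*x = λ*B*x, as in the standard LAPACK convention; and the final zheev call, with workspace sizes from the standard LAPACK documentation, is only one possible way to solve the resulting standard problem. Linking against MKL or another LAPACK implementation is assumed.

```fortran
program hegst_after_potrf
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2              ! hypothetical problem size
  complex(dp) :: a(n,n), b(n,n), work(2*n)
  real(dp)    :: w(n), rwork(3*n)
  integer     :: info
  external    :: zpotrf, zhegst, zheev

  ! Hypothetical Hermitian A and Hermitian positive-definite B
  ! (column-major order; only the upper triangles are referenced).
  a = reshape([ (2.0_dp, 0.0_dp), (1.0_dp,  1.0_dp), &
                (1.0_dp,-1.0_dp), (3.0_dp,  0.0_dp) ], [n, n])
  b = reshape([ (4.0_dp, 0.0_dp), (0.0_dp, -1.0_dp), &
                (0.0_dp, 1.0_dp), (2.0_dp,  0.0_dp) ], [n, n])

  ! Step 1: Cholesky factorization B = U^H*U, overwriting b with U.
  call zpotrf('U', n, b, n, info)
  if (info /= 0) stop 'zpotrf failed: B is not positive-definite'

  ! Step 2: overwrite a with the matrix C of the standard problem.
  call zhegst(1, 'U', n, a, n, b, n, info)
  if (info /= 0) stop 'zhegst failed'

  ! Step 3 (one option): eigenvalues of C are those of the original problem.
  call zheev('N', 'U', n, a, n, w, work, 2*n, rwork, info)
  if (info /= 0) stop 'zheev failed'

  print '(a,2f12.6)', 'generalized eigenvalues:', w
end program hegst_after_potrf
```

Note that uplo must be the same in the ?potrf and ?hegst calls, because ?hegst interprets b as the factor produced for that triangle.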
Input Parameters
If uplo = 'L', the array a stores the lower triangle of A; you must supply
B in the factored form B = L*L^H.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
Forming the reduced matrix C is a stable procedure. However, it involves implicit multiplication by inv(B) (if
itype = 1) or B (if itype = 2 or 3). When the routine is used as a step in the computation of eigenvalues
and eigenvectors of the original problem, there may be a significant loss of accuracy if B is ill-conditioned
with respect to inversion.
The approximate number of floating-point operations is n^3.
?spgst
Reduces a real symmetric-definite generalized
eigenvalue problem to the standard form using packed
storage.
Syntax
call sspgst(itype, uplo, n, ap, bp, info)
call dspgst(itype, uplo, n, ap, bp, info)
call spgst(ap, bp [,itype] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a real symmetric-definite generalized eigenproblem to the standard form C*y = λ*y, using
packed matrix storage. Here A is a real symmetric matrix, and B is a real symmetric positive-definite matrix.
Before calling this routine, call ?pptrf to compute the Cholesky factorization: B = U^T*U or B = L*L^T.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
Forming the reduced matrix C is a stable procedure. However, it involves implicit multiplication by inv(B) (if
itype = 1) or B (if itype = 2 or 3). When the routine is used as a step in the computation of eigenvalues
and eigenvectors of the original problem, there may be a significant loss of accuracy if B is ill-conditioned
with respect to inversion.
The approximate number of floating-point operations is n^3.
?hpgst
Reduces a generalized eigenvalue problem with a
Hermitian matrix to a standard eigenvalue problem
using packed storage.
Syntax
call chpgst(itype, uplo, n, ap, bp, info)
call zhpgst(itype, uplo, n, ap, bp, info)
call hpgst(ap, bp [,itype] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
Input Parameters
If itype = 2, the generalized eigenproblem is A*B*z = lambda*z
If uplo = 'L', ap stores the packed lower triangle of A; you must supply B
in the factored form B = L*L^H.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
Forming the reduced matrix C is a stable procedure. However, it involves implicit multiplication by inv(B) (if
itype = 1) or B (if itype = 2 or 3). When the routine is used as a step in the computation of eigenvalues
and eigenvectors of the original problem, there may be a significant loss of accuracy if B is ill-conditioned
with respect to inversion.
The approximate number of floating-point operations is n^3.
?sbgst
Reduces a real symmetric-definite generalized
eigenproblem for banded matrices to the standard
form using the factorization performed by ?pbstf.
Syntax
call ssbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, info)
call dsbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, info)
call sbgst(ab, bb [,x] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
To reduce the real symmetric-definite generalized eigenproblem A*z = λ*B*z to the standard form C*y = λ*y,
where A, B and C are banded, this routine must be preceded by a call to ?pbstf, which computes the split
Cholesky factorization of the positive-definite matrix B: B = S^T*S. The split Cholesky factorization, compared
with the ordinary Cholesky factorization, allows the work to be approximately halved.
This routine overwrites A with C = X^T*A*X, where X = inv(S)*Q and Q is an orthogonal matrix chosen
(implicitly) to preserve the bandwidth of A. The routine also has an option to allow the accumulation of X,
and then, if z is an eigenvector of C, X*z is an eigenvector of the original system.
Input Parameters
DOUBLE PRECISION for dsbgst
ab(ldab,*) is an array containing either upper or lower triangular part of the
symmetric matrix A (as specified by uplo) in band storage format.
The second dimension of the array ab must be at least max(1, n).
bb(ldbb,*) is an array containing the banded split Cholesky factor of B as
specified by uplo, n and kb and returned by ?pbstf.
The second dimension of the array bb must be at least max(1, n).
work(*) is a workspace array, dimension at least max(1, 2*n)
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1.
ldbb INTEGER. The leading dimension of the array bb; must be at least kb+1.
ldx The leading dimension of the output array x. Constraints: if vect = 'N',
then ldx ≥ 1;
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
Forming the reduced matrix C involves implicit multiplication by inv(B). When the routine is used as a step
in the computation of eigenvalues and eigenvectors of the original problem, there may be a significant loss of
accuracy if B is ill-conditioned with respect to inversion.
If ka and kb are much less than n then the total number of floating-point operations is approximately
6n^2*kb, when vect = 'N'. An additional (3/2)n^3*(kb/ka) operations are required when vect = 'V'.
?hbgst
Reduces a complex Hermitian positive-definite
generalized eigenproblem for banded matrices to the
standard form using the factorization performed by
?pbstf.
Syntax
call chbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, rwork, info)
call zhbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, rwork, info)
call hbgst(ab, bb [,x] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
To reduce the complex Hermitian positive-definite generalized eigenproblem A*z = λ*B*z to the standard
form C*y = λ*y, where A, B and C are banded, this routine must be preceded by a call to ?pbstf, which
computes the split Cholesky factorization of the positive-definite matrix B: B = S^H*S. The split Cholesky
factorization, compared with the ordinary Cholesky factorization, allows the work to be approximately halved.
This routine overwrites A with C = X^H*A*X, where X = inv(S)*Q, and Q is a unitary matrix chosen
(implicitly) to preserve the bandwidth of A. The routine also has an option to allow the accumulation of X,
and then, if z is an eigenvector of C, X*z is an eigenvector of the original system.
Input Parameters
ka INTEGER. The number of super- or sub-diagonals in A
(ka ≥ 0).
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1.
ldbb INTEGER. The leading dimension of the array bb; must be at least kb+1.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
vect Restored based on the presence of the argument x as follows: vect = 'V', if x
is present, vect = 'N', if x is omitted.
Application Notes
Forming the reduced matrix C involves implicit multiplication by inv(B). When the routine is used as a step
in the computation of eigenvalues and eigenvectors of the original problem, there may be a significant loss of
accuracy if B is ill-conditioned with respect to inversion. The total number of floating-point operations is
approximately 20n^2*kb, when vect = 'N'. An additional 5n^3*(kb/ka) operations are required when vect =
'V'. All these estimates assume that both ka and kb are much less than n.
?pbstf
Computes a split Cholesky factorization of a real
symmetric or complex Hermitian positive-definite
banded matrix used in ?sbgst/?hbgst.
Syntax
call spbstf(uplo, n, kb, bb, ldbb, info)
call dpbstf(uplo, n, kb, bb, ldbb, info)
call cpbstf(uplo, n, kb, bb, ldbb, info)
call zpbstf(uplo, n, kb, bb, ldbb, info)
call pbstf(bb [, uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes a split Cholesky factorization of a real symmetric or complex Hermitian
positive-definite band matrix B. It is to be used in conjunction with ?sbgst/?hbgst.
The factorization has the form B = S^T*S (or B = S^H*S for complex flavors), where S is a band matrix of the
same bandwidth as B and the following structure: S is upper triangular in the first (n+kb)/2 rows and lower
triangular in the remaining rows.
Input Parameters
If uplo = 'U', bb stores the upper triangular part of B.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed factor S is the exact factor of a perturbed matrix B + E, where
The total number of floating-point operations for real flavors is approximately n(kb+1)^2. The number of
operations for complex flavors is 4 times greater. All these estimates assume that kb is much less than n.
After calling this routine, you can call ?sbgst/?hbgst to solve the generalized eigenproblem A*z = λ*B*z, where A
and B are banded and B is positive-definite.
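The pairing of ?pbstf with ?sbgst can be sketched for a small tridiagonal case as follows. The matrices, the program name, and the choice ka = kb = 1 are hypothetical; the band layout used (diagonal in row ka+1, superdiagonal in row ka for uplo = 'U') follows the standard LAPACK band storage scheme. Linking against MKL or another LAPACK implementation is assumed.

```fortran
program sbgst_after_pbstf
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, ka = 1, kb = 1   ! hypothetical sizes
  integer, parameter :: ldab = ka + 1, ldbb = kb + 1
  real(dp) :: ab(ldab,n), bb(ldbb,n)
  real(dp) :: x(1,1)                            ! not referenced when vect = 'N'
  real(dp) :: work(2*n)
  integer  :: info
  external :: dpbstf, dsbgst

  ! Upper band storage: row 2 holds the diagonal, row 1 the superdiagonal
  ! (the first entry of row 1 is unused).
  ab(2,:) =  2.0_dp
  ab(1,:) = -1.0_dp
  bb(2,:) =  4.0_dp      ! diagonally dominant, hence positive-definite
  bb(1,:) =  1.0_dp

  ! Step 1: split Cholesky factorization B = S^T*S, overwriting bb with S.
  call dpbstf('U', n, kb, bb, ldbb, info)
  if (info /= 0) stop 'dpbstf failed: B is not positive-definite'

  ! Step 2: overwrite ab with the band matrix C of the standard problem
  ! C*y = lambda*y; vect = 'N' means X is not accumulated.
  call dsbgst('N', 'U', n, ka, kb, ab, ldab, bb, ldbb, x, 1, work, info)
  if (info /= 0) stop 'dsbgst failed'

  print '(a,4f12.6)', 'diagonal of C:     ', ab(2,:)
  print '(a,3f12.6)', 'superdiagonal of C:', ab(1,2:n)
end program sbgst_after_pbstf
```

With vect = 'V', x would instead need to be an n-by-n array with ldx at least max(1, n), so that eigenvectors of C can be mapped back to the original system.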
The number of eigenvectors may be less than the matrix order (but is not less than the number of
distinct eigenvalues of A).
Eigenvalues may be complex even for a real matrix A.
If a real nonsymmetric matrix has a complex eigenvalue a+bi corresponding to an eigenvector z, then
a-bi is also an eigenvalue. The eigenvalue a-bi corresponds to the eigenvector whose elements are
complex conjugates of the elements of z.
To solve a nonsymmetric eigenvalue problem with LAPACK, you usually need to reduce the matrix to the
upper Hessenberg form and then solve the eigenvalue problem with the Hessenberg matrix obtained. Table
"Computational Routines for Solving Nonsymmetric Eigenvalue Problems" lists LAPACK routines to reduce the
matrix to the upper Hessenberg form by an orthogonal (or unitary) similarity transformation A = Q*H*Q^H as
well as routines to solve eigenvalue problems with Hessenberg matrices, forming the Schur factorization of
such matrices and computing the corresponding condition numbers. The corresponding routine names in the
Fortran 95 interface are without the first symbol.
The decision tree in Figure "Decision Tree: Real Nonsymmetric Eigenvalue Problems" helps you choose the
right routine or sequence of routines for an eigenvalue problem with a real nonsymmetric matrix. If you need
to solve an eigenvalue problem with a complex non-Hermitian matrix, use the decision tree shown in Figure
"Decision Tree: Complex Non-Hermitian Eigenvalue Problems".
Computational Routines for Solving Nonsymmetric Eigenvalue Problems
Operation performed Routines for real matrices Routines for complex matrices
?gehrd
Reduces a general matrix to upper Hessenberg form.
Syntax
call sgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call dgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call cgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call zgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call gehrd(a [, tau] [,ilo] [,ihi] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a general matrix A to upper Hessenberg form H by an orthogonal or unitary similarity
transformation A = Q*H*Q^H. Here H has real subdiagonal elements.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of elementary
reflectors. Routines are provided to work with Q in this representation.
Input Parameters
ilo, ihi INTEGER. If A is an output by ?gebal, then ilo and ihi must contain the
values returned by that routine. Otherwise ilo = 1 and ihi = n. (If n >
0, then 1 ≤ ilo ≤ ihi ≤ n; if n = 0, ilo = 1 and ihi = 0.)
lwork INTEGER. The size of the work array; at least max(1, n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
a The elements on and above the subdiagonal contain the upper Hessenberg
matrix H. The subdiagonal elements of H are real. The elements below the
subdiagonal, with the array tau, represent the orthogonal matrix Q as a
product of n elementary reflectors.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any admissible lwork size, that is, one no less than the minimal value
described, the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
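The workspace query described above can be sketched as follows: a first call with lwork = -1 returns the recommended size in the first element of the work array, and a second call performs the reduction. The test matrix and the names query and work are hypothetical, and linking against MKL or another LAPACK implementation is assumed.

```fortran
program gehrd_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4              ! hypothetical matrix order
  real(dp) :: a(n,n), tau(n-1), query(1)
  real(dp), allocatable :: work(:)
  integer  :: i, j, lwork, info
  external :: dgehrd

  ! Hypothetical nonsymmetric test matrix.
  do j = 1, n
    do i = 1, n
      a(i,j) = 1.0_dp/real(i + j - 1, dp) + real(i - j, dp)
    end do
  end do

  ! Workspace query: lwork = -1, nothing is computed, and the
  ! recommended size is returned in query(1).
  call dgehrd(n, 1, n, a, n, tau, query, -1, info)
  if (info /= 0) stop 'workspace query failed'
  lwork = max(1, int(query(1)))
  allocate(work(lwork))

  ! Actual reduction with the recommended workspace. The matrix was not
  ! balanced, so ilo = 1 and ihi = n.
  call dgehrd(n, 1, n, a, n, tau, work, lwork, info)
  if (info /= 0) stop 'dgehrd failed'

  print '(a,i0)', 'recommended lwork: ', lwork
  print '(a,3f12.6)', 'subdiagonal of H:  ', (a(i+1,i), i = 1, n-1)
end program gehrd_workspace_query
```

The same two-call pattern applies to the other routines in this section that take work and lwork arguments.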
The computed Hessenberg matrix H is exactly similar to a nearby matrix A + E, where ||E||2 < c(n)*ε*||A||2,
c(n) is a modestly increasing function of n, and ε is the machine precision.
The approximate number of floating-point operations for real flavors is (2/3)*(ihi - ilo)^2*(2ihi + 2ilo
+ 3n); for complex flavors it is 4 times greater.
?orghr
Generates the real orthogonal matrix Q determined
by ?gehrd.
Syntax
call sorghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call dorghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call orghr(a, tau [,ilo] [,ihi] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the orthogonal matrix Q that has been determined by a preceding call to
sgehrd/dgehrd. (The routine ?gehrd reduces a real general matrix A to upper Hessenberg form H by an
orthogonal similarity transformation, A = Q*H*Q^T, and represents the matrix Q as a product of ihi-ilo
elementary reflectors. Here ilo and ihi are values determined by sgebal/dgebal when balancing the
matrix; if the matrix has not been balanced, ilo = 1 and ihi = n.)
Input Parameters
ilo, ihi INTEGER. These must be the same parameters ilo and ihi, respectively, as
supplied to ?gehrd. (If n > 0, then 1 ≤ ilo ≤ ihi ≤ n; if n = 0, ilo = 1 and
ihi = 0.)
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork =(ihi-ilo)*blocksize where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any admissible lwork size, that is, one no less than the minimal value
described, the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed matrix Q differs from the exact result by a matrix E such that ||E||2 = O(ε), where ε is the
machine precision.
The approximate number of floating-point operations is (4/3)(ihi-ilo)^3.
?ormhr
Multiplies an arbitrary real matrix C by the real
orthogonal matrix Q determined by ?gehrd.
Syntax
call sormhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork, info)
call dormhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork, info)
call ormhr(a, tau, c [,ilo] [,ihi] [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a matrix C by the orthogonal matrix Q that has been determined by a preceding call to
sgehrd/dgehrd. (The routine ?gehrd reduces a real general matrix A to upper Hessenberg form H by an
orthogonal similarity transformation, A = Q*H*Q^T, and represents the matrix Q as a product of ihi-ilo
elementary reflectors. Here ilo and ihi are values determined by sgebal/dgebal when balancing the
matrix; if the matrix has not been balanced, ilo = 1 and ihi = n.)
With ?ormhr, you can form one of the matrix products Q*C, Q^T*C, C*Q, or C*Q^T, overwriting the result on C
(which may be any real rectangular matrix).
A common application of ?ormhr is to transform a matrix V of eigenvectors of H to the matrix Q*V of
eigenvectors of A.
Input Parameters
ilo, ihi INTEGER. These must be the same parameters ilo and ihi, respectively, as
supplied to ?gehrd.
The dimension of tau must be at least max (1, m-1) if side = 'L' and at
least max (1, n-1) if side = 'R'.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
ilo Default value for this argument is ilo = 1.
Application Notes
For better performance, lwork should be at least n*blocksize if side = 'L' and at least m*blocksize if side
= 'R', where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance
of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any admissible lwork size, that is, one no less than the minimal value
described, the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The computed matrix Q differs from the exact result by a matrix E such that ||E||2 = O(ε)*||C||2, where
ε is the machine precision.
The approximate number of floating-point operations is
2n(ihi-ilo)^2 if side = 'L';
2m(ihi-ilo)^2 if side = 'R'.
The complex counterpart of this routine is unmhr.
?unghr
Generates the complex unitary matrix Q determined
by ?gehrd.
Syntax
call cunghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call zunghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call unghr(a, tau [,ilo] [,ihi] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine is intended to be used following a call to cgehrd/zgehrd, which reduces a complex matrix A to
upper Hessenberg form H by a unitary similarity transformation: A = Q*H*Q^H. ?gehrd represents the matrix
Q as a product of ihi-ilo elementary reflectors. Here ilo and ihi are values determined by cgebal/zgebal
when balancing the matrix; if the matrix has not been balanced, ilo = 1 and ihi = n.
Use the routine unghr to generate Q explicitly as a square matrix. The matrix Q has the structure:
Input Parameters
ilo, ihi INTEGER. These must be the same parameters ilo and ihi, respectively, as
supplied to ?gehrd. (If n > 0, then 1 ≤ ilo ≤ ihi ≤ n. If n = 0, then ilo =
1 and ihi = 0.)
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or restorable arguments, see LAPACK 95
Interface Conventions.
Specific details for the routine unghr interface are the following:
Application Notes
For better performance, try using lwork = (ihi-ilo)*blocksize, where blocksize is a machine-
dependent value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed matrix Q differs from the exact result by a matrix E such that ||E||2 = O(ε), where ε is the
machine precision.
The approximate number of real floating-point operations is (16/3)(ihi-ilo)^3.
?unmhr
Multiplies an arbitrary complex matrix C by the
complex unitary matrix Q determined by ?gehrd.
Syntax
call cunmhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork, info)
call zunmhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork, info)
call unmhr(a, tau, c [,ilo] [,ihi] [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a matrix C by the unitary matrix Q that has been determined by a preceding call to
cgehrd/zgehrd. (The routine ?gehrd reduces a complex general matrix A to upper Hessenberg form H by a
unitary similarity transformation, A = Q*H*Q^H, and represents the matrix Q as a product of ihi-ilo
elementary reflectors. Here ilo and ihi are values determined by cgebal/zgebal when balancing the
matrix; if the matrix has not been balanced, ilo = 1 and ihi = n.)
With ?unmhr, you can form one of the matrix products Q*C, Q^H*C, C*Q, or C*Q^H, overwriting the result on C
(which may be any complex rectangular matrix). A common application of this routine is to transform a
matrix V of eigenvectors of H to the matrix Q*V of eigenvectors of A.
Input Parameters
ilo, ihi INTEGER. These must be the same parameters ilo and ihi, respectively, as
supplied to ?gehrd.
lda INTEGER. The leading dimension of a; at least max(1, m) if side = 'L'
and at least max (1, n) if side = 'R'.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, lwork should be at least n*blocksize if side = 'L' and at least m*blocksize if side
= 'R', where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance
of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed matrix Q differs from the exact result by a matrix E such that ||E||2 = O(ε)*||C||2, where
ε is the machine precision.
The approximate number of floating-point operations is
8n(ihi-ilo)^2 if side = 'L';
8m(ihi-ilo)^2 if side = 'R'.
The real counterpart of this routine is ormhr.
?gebal
Balances a general matrix to improve the accuracy of
computed eigenvalues and eigenvectors.
Syntax
call sgebal(job, n, a, lda, ilo, ihi, scale, info)
call dgebal(job, n, a, lda, ilo, ihi, scale, info)
call cgebal(job, n, a, lda, ilo, ihi, scale, info)
call zgebal(job, n, a, lda, ilo, ihi, scale, info)
call gebal(a [, scale] [,ilo] [,ihi] [,job] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine balances a matrix A by performing either or both of the following two similarity transformations:
(1) The routine first attempts to permute A to block upper triangular form:
               [ A'11  A'12  A'13 ]
A' = P*A*P^T = [  0    A'22  A'23 ]
               [  0     0    A'33 ]
where P is a permutation matrix, and A'11 and A'33 are upper triangular. The diagonal elements of A'11 and
A'33 are eigenvalues of A. The rest of the eigenvalues of A are the eigenvalues of the central diagonal block
A'22, in rows and columns ilo to ihi. Subsequent operations to compute the eigenvalues of A (or its Schur
factorization) need only be applied to these rows and columns; this can save a significant amount of work if
ilo > 1 and ihi < n.
If no suitable permutation exists (as is often the case), the routine sets ilo = 1 and ihi = n, and A'22 is
the whole of A.
(2) The routine applies a diagonal similarity transformation to A', to make the rows and columns of A'22 as
close in norm as possible:
A'' = D*A'*inv(D), where D is a diagonal scaling matrix.
This scaling can reduce the norm of the matrix (that is, ||A''22|| < ||A'22||), and hence reduce the
effect of rounding errors on the accuracy of computed eigenvalues and eigenvectors.
Input Parameters
Output Parameters
ilo, ihi INTEGER. The values ilo and ihi such that on exit a(i,j) is zero if i > j and
1 ≤ j < ilo or ihi < j ≤ n.
If job = 'N' or 'S', then ilo = 1 and ihi = n.
info INTEGER.
If info = 0, the execution is successful.
job Must be 'B', 'S', 'P', or 'N'. The default value is 'B'.
Application Notes
The errors are negligible, compared with those in subsequent computations.
If the matrix A is balanced by this routine, then any eigenvectors computed subsequently are eigenvectors of
the matrix A'' and hence you must call gebak to transform them back to eigenvectors of A.
If the Schur vectors of A are required, do not call this routine with job = 'S' or 'B', because then the
balancing transformation is not orthogonal (not unitary for complex flavors).
If you call this routine with job = 'P', then any Schur vectors computed subsequently are Schur vectors of
the matrix A'', and you need to call gebak (with side = 'R') to transform them back to Schur vectors of A.
?gebak
Transforms eigenvectors of a balanced matrix to those
of the original nonsymmetric matrix.
Syntax
call sgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call dgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call cgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call zgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call gebak(v, scale [,ilo] [,ihi] [,job] [,side] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine is intended to be used after a matrix A has been balanced by a call to ?gebal, and eigenvectors
of the balanced matrix A''22 have subsequently been computed. For a description of balancing, see gebal. The
balanced matrix A'' is obtained as A'' = D*P*A*P^T*inv(D), where P is a permutation matrix and D is a
diagonal scaling matrix. This routine transforms the eigenvectors as follows:
if x is a right eigenvector of A'', then P^T*inv(D)*x is a right eigenvector of A; if y is a left eigenvector of A'',
then P^T*D*y is a left eigenvector of A.
Input Parameters
job CHARACTER*1. Must be 'N' or 'P' or 'S' or 'B'. The same parameter job
as supplied to ?gebal.
ilo, ihi INTEGER. The values ilo and ihi, as returned by ?gebal. (If n > 0, then
1 ≤ ilo ≤ ihi ≤ n; if n = 0, then ilo = 1 and ihi = 0.)
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
job Must be 'B', 'S', 'P', or 'N'. The default value is 'B'.
Application Notes
The errors in this routine are negligible.
The number of floating-point operations is approximately proportional to m*n.
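The back-transformation can be made visible by applying it to the identity matrix. The following is a minimal sketch, not a definitive implementation: it assumes the FORTRAN 77 routines dgebal and dgebak are available through implicit interfaces (linked from Intel MKL or another LAPACK), and the test matrix and the names ab, tm, and dscale are chosen only for illustration. With v set to the identity, dgebak returns the matrix that maps right eigenvectors of the balanced matrix A'' to those of A (P^T*inv(D) in the notation of the Description), so A*tm and tm*A'' should agree.

```fortran
program gebak_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), ab(n,n), tm(n,n), dscale(n)
  integer :: ilo, ihi, info, i

  ! Badly scaled test matrix, listed column by column (illustrative values).
  a = reshape([1.0_dp,   1.0e-4_dp, 0.0_dp,    &
               1.0e4_dp, 2.0_dp,    1.0e-4_dp, &
               0.0_dp,   1.0e4_dp,  3.0_dp], [n, n])

  ! Balance a copy of A: permute and scale (job = 'B').
  ab = a
  call dgebal('B', n, ab, n, ilo, ihi, dscale, info)
  if (info /= 0) error stop 'dgebal failed'

  ! Apply the back-transformation for right eigenvectors to the identity.
  tm = 0.0_dp
  do i = 1, n
    tm(i,i) = 1.0_dp
  end do
  call dgebak('B', 'R', n, ilo, ihi, dscale, n, tm, n, info)
  if (info /= 0) error stop 'dgebak failed'

  ! If x is an eigenvector of A'', then tm*x is an eigenvector of A,
  ! which is equivalent to A*tm = tm*A''.
  print *, 'ilo, ihi =', ilo, ihi
  print *, 'max |A*tm - tm*Abal| =', &
           maxval(abs(matmul(a, tm) - matmul(tm, ab)))
end program gebak_sketch
```

In practice v holds the m computed eigenvectors instead of the identity; the same job, ilo, ihi, and scale values returned by ?gebal must be passed unchanged.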
?hseqr
Computes all eigenvalues and (optionally) the Schur
factorization of a matrix reduced to Hessenberg form.
Syntax
call shseqr(job, compz, n, ilo, ihi, h, ldh, wr, wi, z, ldz, work, lwork, info)
call dhseqr(job, compz, n, ilo, ihi, h, ldh, wr, wi, z, ldz, work, lwork, info)
call chseqr(job, compz, n, ilo, ihi, h, ldh, w, z, ldz, work, lwork, info)
call zhseqr(job, compz, n, ilo, ihi, h, ldh, w, z, ldz, work, lwork, info)
call hseqr(h, wr, wi [,ilo] [,ihi] [,z] [,job] [,compz] [,info])
call hseqr(h, w [,ilo] [,ihi] [,z] [,job] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally the Schur factorization, of an upper Hessenberg
matrix H: H = Z*T*Z^H, where T is an upper triangular (or, for real flavors, quasi-triangular) matrix (the
Schur form of H), and Z is the unitary or orthogonal matrix whose columns are the Schur vectors z_i.
You can also use this routine to compute the Schur factorization of a general matrix A which has been
reduced to upper Hessenberg form H:
A = Q*H*Q^H, where Q is unitary (orthogonal for real flavors);
A = (Q*Z)*T*(Q*Z)^H.
In this case, after reducing A to Hessenberg form by ?gehrd, call ?orghr (?unghr for complex flavors) to form Q
explicitly and then pass Q to ?hseqr with compz = 'V'.
You can also call ?gebal to balance the original matrix before reducing it to Hessenberg form by ?gehrd, so
that the Hessenberg matrix H has the block upper triangular structure [H11 H12 H13; 0 H22 H23; 0 0 H33],
where H11 and H33 are upper triangular and H22 occupies rows and columns ilo to ihi.
Input Parameters
ilo, ihi INTEGER. If A has been balanced by ?gebal, then ilo and ihi must contain
the values returned by ?gebal. Otherwise, ilo must be set to 1 and ihi to n.
Output Parameters
wr, wi Contain the real and imaginary parts, respectively, of the computed
eigenvalues, unless info > 0. Complex conjugate pairs of eigenvalues
appear consecutively with the eigenvalue having positive imaginary part
first. The eigenvalues are stored in the same order as on the diagonal of the
Schur form T (if computed).
info INTEGER.
If info = 0, the execution is successful.
If info > 0, and compz = 'V', then on exit (final value of Z) = (initial
value of Z)*U, where U is the unitary matrix (regardless of the value of
job).
If info > 0, and compz = 'I', then on exit (final value of Z) = U, where
U is the unitary matrix (regardless of the value of job).
If info > 0, and compz = 'N', then Z is not accessed.
If present, compz must be equal to 'I' or 'V' and the argument z must also be
present. Note that there will be an error condition if compz is present and z
omitted.
Application Notes
The computed Schur factorization is the exact factorization of a nearby matrix H + E, where ||E||2 < O(ε)*||H||2,
and ε is the machine precision.
If λ_i is an exact eigenvalue, and μ_i is the corresponding computed value, then |λ_i - μ_i| ≤ c(n)*ε*||H||2/s_i,
where c(n) is a modestly increasing function of n, and s_i is the reciprocal condition number of λ_i. The
condition numbers s_i may be computed by calling ?trsna.
The total number of floating-point operations depends on how rapidly the algorithm converges; typical
numbers are as follows.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size (no less than the minimal value
described), the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
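The workspace query and the reduction chain described for this routine can be sketched as follows. This is an illustrative example under stated assumptions rather than a reference implementation: it calls the FORTRAN 77 routines dgehrd, dorghr, and dhseqr through implicit interfaces, skips balancing (so ilo = 1 and ihi = n), uses an arbitrary test matrix, and uses a generous fixed workspace for the first two calls. The names query and resid are hypothetical.

```fortran
program hseqr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n,n), h(n,n), z(n,n), wr(n), wi(n), tau(n-1), query(1)
  real(dp), allocatable :: work(:)
  real(dp) :: resid
  integer :: lwork, info, i, j

  ! Illustrative nonsymmetric test matrix.
  do j = 1, n
    do i = 1, n
      a(i,j) = 1.0_dp / real(i + 2*j, dp)
    end do
    a(j,j) = a(j,j) + 1.0_dp
  end do
  h = a

  ! Step 1: reduce A to Hessenberg form, A = Q*H*Q^T, and form Q explicitly.
  lwork = 64*n
  allocate(work(lwork))
  call dgehrd(n, 1, n, h, n, tau, work, lwork, info)
  if (info /= 0) error stop 'dgehrd failed'
  z = h
  call dorghr(n, 1, n, z, n, tau, work, lwork, info)
  if (info /= 0) error stop 'dorghr failed'
  deallocate(work)
  do j = 1, n - 2
    h(j+2:n, j) = 0.0_dp   ! drop reflector data stored below the subdiagonal
  end do

  ! Step 2: workspace query (lwork = -1); only query(1) is set.
  call dhseqr('S', 'V', n, 1, n, h, n, wr, wi, z, n, query, -1, info)
  if (info /= 0) error stop 'dhseqr workspace query failed'
  lwork = max(1, int(query(1)))
  allocate(work(lwork))

  ! Step 3: Schur factorization; on exit h holds T and z holds Q*Z.
  call dhseqr('S', 'V', n, 1, n, h, n, wr, wi, z, n, work, lwork, info)
  if (info /= 0) error stop 'dhseqr failed'

  resid = maxval(abs(a - matmul(z, matmul(h, transpose(z)))))
  print *, 'recommended lwork     =', lwork
  print *, 'max |A - (QZ)T(QZ)^T| =', resid
end program hseqr_sketch
```

If the matrix was balanced with ?gebal first, the ilo and ihi values it returned would replace the literal 1 and n in the calls above.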
?hsein
Computes selected eigenvectors of an upper
Hessenberg matrix that correspond to specified
eigenvalues.
Syntax
call shsein(side, eigsrc, initv, select, n, h, ldh, wr, wi, vl, ldvl, vr, ldvr, mm,
m, work, ifaill, ifailr, info)
call dhsein(side, eigsrc, initv, select, n, h, ldh, wr, wi, vl, ldvl, vr, ldvr, mm,
m, work, ifaill, ifailr, info)
call chsein(side, eigsrc, initv, select, n, h, ldh, w, vl, ldvl, vr, ldvr, mm, m,
work, rwork, ifaill, ifailr, info)
call zhsein(side, eigsrc, initv, select, n, h, ldh, w, vl, ldvl, vr, ldvr, mm, m,
work, rwork, ifaill, ifailr, info)
call hsein(h, wr, wi, select [, vl] [,vr] [,ifaill] [,ifailr] [,initv] [,eigsrc] [,m]
[,info])
call hsein(h, w, select [,vl] [,vr] [,ifaill] [,ifailr] [,initv] [,eigsrc] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes left and/or right eigenvectors of an upper Hessenberg matrix H, corresponding to
selected eigenvalues.
The right eigenvector x and the left eigenvector y, corresponding to an eigenvalue λ, are defined by: H*x =
λ*x and y^H*H = λ*y^H (or H^H*y = conj(λ)*y). Here conj(λ) denotes the complex conjugate of λ.
The eigenvectors are computed by inverse iteration. They are scaled so that, for a real eigenvector x,
max|x_i| = 1, and for a complex eigenvector, max(|Re x_i| + |Im x_i|) = 1.
If H has been formed by reduction of a general matrix A to upper Hessenberg form, then eigenvectors of H
may be transformed to eigenvectors of A by ormhr or unmhr.
Input Parameters
select LOGICAL.
Array, size at least max (1, n). Specifies which eigenvectors are to be
computed.
For real flavors:
To obtain the real eigenvector corresponding to the real eigenvalue wr(j),
set select(j) to .TRUE.
vr(ldvr,*)
If initv = 'V' and side = 'R' or 'B', then vr must contain starting
vectors for inverse iteration for the right eigenvectors. Each starting vector
must be stored in the same column or columns as will be used to store the
corresponding eigenvector.
If initv = 'N', then vr need not be set.
ldh INTEGER. The leading dimension of h; at least max(1, n).
Output Parameters
vl, vr If side = 'L' or 'B', vl contains the computed left eigenvectors (as
specified by select).
If side = 'R' or 'B', vr contains the computed right eigenvectors (as
specified by select).
The eigenvectors treated column-wise form a rectangular n-by-mm matrix.
info INTEGER.
If info = 0, the execution is successful.
h Holds the matrix H of size (n,n).
ifaill Holds the vector of length (mm). Note that there will be an error condition if ifaill
is present and vl is omitted.
ifailr Holds the vector of length (mm). Note that there will be an error condition if ifailr
is present and vr is omitted.
Application Notes
Each computed right eigenvector x_i is the exact eigenvector of a nearby matrix A + E_i, such that ||E_i|| <
O(ε)*||A||. Hence the residual is small:
||A*x_i - λ_i*x_i|| = O(ε)*||A||.
However, eigenvectors corresponding to close or coincident eigenvalues may not accurately span the relevant
subspaces.
Similar remarks apply to computed left eigenvectors.
?trevc
Computes selected eigenvectors of an upper (quasi-)
triangular matrix computed by ?hseqr.
Syntax
call strevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work, info)
call dtrevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work, info)
call ctrevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work, rwork,
info)
call ztrevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work, rwork,
info)
call trevc(t [, howmny] [,select] [,vl] [,vr] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes some or all of the right and/or left eigenvectors of an upper triangular matrix T (or, for
real flavors, an upper quasi-triangular matrix T). Matrices of this type are produced by the Schur
factorization of a general matrix: A = Q*T*Q^H, as computed by hseqr.
The right eigenvector x and the left eigenvector y of T corresponding to an eigenvalue w are defined by:
T*x = w*x, y^H*T = w*y^H, where y^H denotes the conjugate transpose of y.
The eigenvalues are not input to this routine, but are read directly from the diagonal blocks of T.
This routine returns the matrices X and/or Y of right and left eigenvectors of T, or the products Q*X and/or
Q*Y, where Q is an input matrix.
If Q is the orthogonal/unitary factor that reduces a matrix A to Schur form T, then Q*X and Q*Y are the
matrices of right and left eigenvectors of A.
Input Parameters
select LOGICAL.
Array, size at least max (1, n).
If howmny = 'S', select specifies which eigenvectors are to be computed.
n INTEGER. The order of the matrix T (n ≥ 0).
vr(ldvr,*)
If howmny = 'B' and side = 'R' or 'B', then vr must contain an n-by-n
matrix Q (usually the matrix of Schur vectors returned by ?hseqr).
Output Parameters
vl, vr If side = 'L' or 'B', vl contains the computed left eigenvectors (as
specified by howmny and select).
If side = 'R' or 'B', vr contains the computed right eigenvectors (as
specified by howmny and select).
The eigenvectors treated column-wise form a rectangular n-by-mm matrix.
m INTEGER.
For complex flavors: the number of selected eigenvectors.
If howmny = 'A' or 'B', m is set to n.
info INTEGER.
If info = 0, the execution is successful.
t Holds the matrix T of size (n,n).
side If omitted, this argument is restored based on the presence of arguments vl and
vr as follows:
side = 'B', if both vl and vr are present,
side = 'L', if vr is omitted,
side = 'R', if vl is omitted.
Note that there will be an error condition if both vl and vr are omitted.
howmny If omitted, this argument is restored based on the presence of argument select
as follows:
howmny = 'S', if select is present,
howmny = 'A', if select is omitted.
If present, howmny must be 'A' or 'B' and the argument select must be omitted.
Note that there will be an error condition if both select and howmny are present.
Application Notes
If x_i is an exact right eigenvector and y_i is the corresponding computed eigenvector, then the angle θ(y_i,
x_i) between them is bounded as follows: θ(y_i, x_i) ≤ (c(n)*ε*||T||2)/sep_i, where sep_i is the reciprocal
condition number of x_i. The condition number sep_i may be computed by calling ?trsna.
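As a small illustration of the definitions T*x = w*x used above, the sketch below computes all right eigenvectors of an upper triangular matrix and checks the residual against the diagonal entries, which serve as the eigenvalues. It is a hedged example: it assumes the FORTRAN 77 routine dtrevc is reachable through an implicit interface, and the matrix values and the names sel and resid are illustrative only.

```fortran
program trevc_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: t(n,n), vl(1,1), vr(n,n), work(3*n), resid
  logical :: sel(n)
  integer :: m, info, j

  ! Upper triangular T = [1 2 3; 0 4 5; 0 0 6], listed column by column.
  t = reshape([1.0_dp, 0.0_dp, 0.0_dp, &
               2.0_dp, 4.0_dp, 0.0_dp, &
               3.0_dp, 5.0_dp, 6.0_dp], [n, n])
  sel = .true.   ! not referenced when howmny = 'A'

  ! side = 'R': right eigenvectors only, so vl is a dummy with ldvl = 1.
  ! howmny = 'A': all eigenvectors of T itself (no back-multiplication by Q).
  call dtrevc('R', 'A', sel, n, t, n, vl, 1, vr, n, n, m, work, info)
  if (info /= 0) error stop 'dtrevc failed'

  ! Column j of vr corresponds to the eigenvalue t(j,j).
  resid = 0.0_dp
  do j = 1, m
    resid = max(resid, maxval(abs(matmul(t, vr(:,j)) - t(j,j)*vr(:,j))))
  end do
  print *, 'm =', m
  print *, 'max |T*x - w*x| =', resid
end program trevc_sketch
```

Passing howmny = 'B' with vr preloaded with the Schur vectors from ?hseqr would return eigenvectors of the original matrix instead.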
?trsna
Estimates condition numbers for specified eigenvalues
and right eigenvectors of an upper (quasi-) triangular
matrix.
Syntax
call strsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm, m, work,
ldwork, iwork, info)
call dtrsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm, m, work,
ldwork, iwork, info)
call ctrsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm, m, work,
ldwork, rwork, info)
call ztrsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm, m, work,
ldwork, rwork, info)
call trsna(t [, s] [,sep] [,vl] [,vr] [,select] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine estimates condition numbers for specified eigenvalues and/or right eigenvectors of an upper
triangular matrix T (or, for real flavors, upper quasi-triangular matrix T in canonical Schur form). These are
the same as the condition numbers of the eigenvalues and right eigenvectors of an original matrix A =
Z*T*Z^H (with unitary or, for real flavors, orthogonal Z), from which T may have been derived.
The routine computes the reciprocal of the condition number of an eigenvalue λ_i as
s_i = |v^T*u|/(||u||_E*||v||_E) for real flavors and s_i = |v^H*u|/(||u||_E*||v||_E) for complex flavors,
where u and v are the right and left eigenvectors of T, respectively, corresponding to λ_i.
This reciprocal condition number always lies between zero (ill-conditioned) and one (well-conditioned).
An approximate error estimate for a computed eigenvalue λ_i is then given by ε*||T||/s_i, where ε is the
machine precision.
To estimate the reciprocal of the condition number of the right eigenvector corresponding to λ_i, the routine
first calls ?trexc to reorder the diagonal elements of matrix T so that λ_i is in the leading position:
T = Q*[λ_i C^H; 0 T22]*Q^H.
The reciprocal condition number of the eigenvector is then estimated as sep_i, the smallest singular value of
the matrix T22 - λ_i*I.
An approximate error estimate for a computed right eigenvector u corresponding to λ_i is then given by
ε*||T||/sep_i.
Input Parameters
If job = 'V', then condition numbers for eigenvectors only are computed.
select LOGICAL.
Array, size at least max (1, n) if howmny = 'S' and at least 1 otherwise.
to select condition numbers for the eigenpair corresponding to a complex
conjugate pair of eigenvalues λ_j and λ_(j+1), select(j) and/or select(j + 1)
must be set to .TRUE.
vr(ldvr,*)
If job = 'E' or 'B', then vr must contain the right eigenvectors of T (or of
any matrix Q*T*Q^H with Q unitary or orthogonal) corresponding to the
eigenpairs specified by howmny and select. The eigenvectors must be
stored in consecutive columns of vr, as returned by trevc or hsein.
The second dimension of vr must be at least max(1, mm) if job = 'E' or
'B' and at least 1 if job = 'V'.
The array vr is not referenced if job = 'V'.
mm INTEGER. The number of elements in the arrays s and sep, and the number
of columns in vl and vr (if used). Must be at least m (the precise number
required).
If howmny = 'A', mm = n;
Output Parameters
For real flavors: for a complex eigenvector, two consecutive elements of sep
are set to the same value; if the eigenvalues cannot be reordered to
compute sep(j), then sep(j) is set to zero; this can only occur when the true
value would be very small anyway. The array sep is not referenced if job =
'E'.
m INTEGER.
For complex flavors: the number of selected eigenpairs.
If howmny = 'A', m is set to n.
For real flavors: the number of elements of s and/or sep actually used to
store the estimated condition numbers.
If howmny = 'A', m is set to n.
info INTEGER.
If info = 0, the execution is successful.
Note that the arguments s, vl, and vr must either be all present or all omitted.
Otherwise, an error condition is observed.
Application Notes
The computed values sep_i may overestimate the true value, but seldom by a factor of more than 3.
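A possible calling sequence for obtaining both s and sep is sketched below: eigenvectors are first computed with dtrevc and then passed to dtrsna. This is an assumption-laden illustration, not the library's prescribed usage: it relies on implicit interfaces to the FORTRAN 77 routines, uses a small upper triangular test matrix, and the names sel, work3, and the array sizes follow the standard LAPACK argument descriptions for the real double-precision flavor.

```fortran
program trsna_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: t(n,n), vl(n,n), vr(n,n), s(n), sep(n)
  real(dp) :: work3(3*n), work(n, n+6)
  logical :: sel(n)
  integer :: iwork(2*(n-1)), m, info

  ! Upper triangular T = [1 2 3; 0 4 5; 0 0 6], listed column by column.
  t = reshape([1.0_dp, 0.0_dp, 0.0_dp, &
               2.0_dp, 4.0_dp, 0.0_dp, &
               3.0_dp, 5.0_dp, 6.0_dp], [n, n])
  sel = .true.   ! not referenced when howmny = 'A'

  ! Left and right eigenvectors of T, stored in consecutive columns.
  call dtrevc('B', 'A', sel, n, t, n, vl, n, vr, n, n, m, work3, info)
  if (info /= 0) error stop 'dtrevc failed'

  ! job = 'B': condition numbers for eigenvalues (s) and eigenvectors (sep).
  call dtrsna('B', 'A', sel, n, t, n, vl, n, vr, n, s, sep, n, m, &
              work, n, iwork, info)
  if (info /= 0) error stop 'dtrsna failed'

  ! Each s(j) should lie between zero and one.
  print *, 's   =', s(1:m)
  print *, 'sep =', sep(1:m)
end program trsna_sketch
```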
?trexc
Reorders the Schur factorization of a general matrix.
Syntax
call strexc(compq, n, t, ldt, q, ldq, ifst, ilst, work, info)
call dtrexc(compq, n, t, ldt, q, ldq, ifst, ilst, work, info)
call ctrexc(compq, n, t, ldt, q, ldq, ifst, ilst, info)
call ztrexc(compq, n, t, ldt, q, ldq, ifst, ilst, info)
call trexc(t, ifst, ilst [,q] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reorders the Schur factorization of a general matrix A = Q*T*Q^H, so that the diagonal element or
block of T with row index ifst is moved to row ilst.
The reordered Schur form S is computed by a unitary (or, for real flavors, orthogonal) similarity
transformation: S = Z^H*T*Z. Optionally the updated matrix P of Schur vectors is computed as P = Q*Z,
giving A = P*S*P^H.
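The reordering described above can be tried on a small triangular matrix. The sketch below is illustrative only and assumes the FORTRAN 77 routine dtrexc is available through an implicit interface; the matrix values and the names t0 and resid are hypothetical. Starting from Q equal to the identity, the returned q holds Z, so q*S*q^T should reproduce the original T.

```fortran
program trexc_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: t0(n,n), t(n,n), q(n,n), work(n), resid
  integer :: ifst, ilst, info, i

  ! Upper triangular T = [1 2 3; 0 4 5; 0 0 6], listed column by column.
  t0 = reshape([1.0_dp, 0.0_dp, 0.0_dp, &
                2.0_dp, 4.0_dp, 0.0_dp, &
                3.0_dp, 5.0_dp, 6.0_dp], [n, n])
  t = t0

  ! Start from Q = I so that the updated q equals Z.
  q = 0.0_dp
  do i = 1, n
    q(i,i) = 1.0_dp
  end do

  ! Move the diagonal element in row 3 to row 1.
  ifst = 3
  ilst = 1
  call dtrexc('V', n, t, n, q, n, ifst, ilst, work, info)
  if (info /= 0) error stop 'dtrexc failed'

  ! The eigenvalue formerly in position 3 should now lead the diagonal.
  print *, 'diagonal of S =', (t(i,i), i = 1, n)

  resid = maxval(abs(matmul(q, matmul(t, transpose(q))) - t0))
  print *, 'max |Z*S*Z^T - T| =', resid
end program trexc_sketch
```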
Input Parameters
ldt INTEGER. The leading dimension of t; at least max(1, n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed matrix S is exactly similar to a matrix T+E, where ||E||2 = O(ε)*||T||2, and ε is the
machine precision.
Note that if a 2 by 2 diagonal block is involved in the re-ordering, its off-diagonal elements are in general
changed; the diagonal elements and the eigenvalues of the block are unchanged unless the block is
sufficiently ill-conditioned, in which case they may be noticeably altered. It is possible for a 2 by 2 block to
break into two 1 by 1 blocks, that is, for a pair of complex eigenvalues to become purely real.
The approximate number of floating-point operations is
?trsen
Reorders the Schur factorization of a matrix and
(optionally) computes the reciprocal condition
numbers for the selected cluster of eigenvalues and
respective invariant subspace.
Syntax
call strsen(job, compq, select, n, t, ldt, q, ldq, wr, wi, m, s, sep, work, lwork,
iwork, liwork, info)
call dtrsen(job, compq, select, n, t, ldt, q, ldq, wr, wi, m, s, sep, work, lwork,
iwork, liwork, info)
call ctrsen(job, compq, select, n, t, ldt, q, ldq, w, m, s, sep, work, lwork, info)
call ztrsen(job, compq, select, n, t, ldt, q, ldq, w, m, s, sep, work, lwork, info)
call trsen(t, select [,wr] [,wi] [,m] [,s] [,sep] [,q] [,info])
call trsen(t, select [,w] [,m] [,s] [,sep] [,q] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reorders the Schur factorization of a general matrix A = Q*T*Q^T (for real flavors) or A = Q*T*Q^H
(for complex flavors) so that a selected cluster of eigenvalues appears in the leading diagonal elements (or,
for real flavors, diagonal blocks) of the Schur form. The reordered Schur form R is computed by a unitary
(orthogonal) similarity transformation: R = Z^H*T*Z. Optionally the updated matrix P of Schur vectors is
computed as P = Q*Z, giving A = P*R*P^H.
Let R = [T11 T12; 0 T22],
where the selected eigenvalues are precisely the eigenvalues of the leading m-by-m submatrix T11. Let P be
correspondingly partitioned as (Q1 Q2), where Q1 consists of the first m columns of P. Then A*Q1 = Q1*T11,
and so the m columns of Q1 form an orthonormal basis for the invariant subspace corresponding to the
selected cluster of eigenvalues.
Optionally the routine also computes estimates of the reciprocal condition numbers of the average of the
cluster of eigenvalues and of the invariant subspace.
Input Parameters
If job = 'E', then only the condition number for the cluster of eigenvalues
is computed.
If job = 'V', then only the condition number for the invariant subspace is
computed.
If job = 'B', then condition numbers for both the cluster and the invariant
subspace are computed.
select LOGICAL.
Array, size at least max (1, n).
Specifies the eigenvalues in the selected cluster. To select an eigenvalue j,
select(j) must be .TRUE.
If job = 'N', then lwork ≥ 1 for complex flavors and lwork ≥ max(1,n)
for real flavors.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla. See Application Notes for details.
The actual amount of workspace required cannot exceed n^2/2 if job = 'V'
or 'B'.
liwork INTEGER.
The dimension of the array iwork.
If job = 'V' or 'B', liwork ≥ max(1,2m(n-m)).
Output Parameters
q If compq = 'V', q contains the updated matrix of Schur vectors; the first
m columns of Q form an orthonormal basis for the specified invariant
subspace.
m INTEGER.
For complex flavors: the dimension of the specified invariant subspaces,
which is the same as the number of selected eigenvalues (see select).
For real flavors: the dimension of the specified invariant subspace. The
value of m is obtained by counting 1 for each selected real eigenvalue and 2
for each selected complex conjugate pair of eigenvalues (see select).
Constraint: 0 ≤ m ≤ n.
For real flavors: if info = 1, then s is set to zero. s is not referenced if job
= 'N' or 'V'.
work(1) On exit, if info = 0, then work(1) returns the optimal size of lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the optimal size of liwork.
info INTEGER.
If info = 0, the execution is successful.
compq Restored based on the presence of the argument q as follows: compq = 'V', if q
is present, compq = 'N', if q is omitted.
Application Notes
The computed matrix R is exactly similar to a matrix T+E, where ||E||2 = O(ε)*||T||2, and ε is the
machine precision. The computed s cannot underestimate the true reciprocal condition number by more than
a factor of (min(m, n-m))^(1/2); sep may differ from the true value by (m*n-m^2)^(1/2). The angle between the
computed invariant subspace and the true subspace is O(ε)*||A||2/sep. Note that if a 2-by-2 diagonal
block is involved in the re-ordering, its off-diagonal elements are in general changed; the diagonal elements
and the eigenvalues of the block are unchanged unless the block is sufficiently ill-conditioned, in which case
they may be noticeably altered. It is possible for a 2-by-2 block to break into two 1-by-1 blocks, that is, for a
pair of complex eigenvalues to become purely real.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any admissible size (no less than the minimal value described), the
routine completes the task, though possibly more slowly than with the recommended workspace, and provides the
recommended workspace in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace
in the first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
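The workspace query followed by the actual reordering might look like the sketch below. It is a hedged example for the real double-precision flavor, assuming dtrsen can be called through an implicit interface; the test matrix, the choice of selected eigenvalue, and the names sel, wquery, and iquery are illustrative assumptions.

```fortran
program trsen_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: t(n,n), q(n,n), wr(n), wi(n), s, sep, wquery(1)
  real(dp), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: iquery(1), lwork, liwork, m, info, i
  logical :: sel(n)

  ! Upper triangular T = [1 2 3; 0 4 5; 0 0 6], listed column by column.
  t = reshape([1.0_dp, 0.0_dp, 0.0_dp, &
               2.0_dp, 4.0_dp, 0.0_dp, &
               3.0_dp, 5.0_dp, 6.0_dp], [n, n])
  q = 0.0_dp
  do i = 1, n
    q(i,i) = 1.0_dp
  end do

  ! Select the eigenvalue currently in position 3 as the cluster.
  sel = [.false., .false., .true.]

  ! Workspace query: lwork = -1 and liwork = -1.
  call dtrsen('B', 'V', sel, n, t, n, q, n, wr, wi, m, s, sep, &
              wquery, -1, iquery, -1, info)
  if (info /= 0) error stop 'dtrsen workspace query failed'
  lwork  = max(1, int(wquery(1)))
  liwork = max(1, iquery(1))
  allocate(work(lwork), iwork(liwork))

  ! Reorder and estimate both condition numbers (job = 'B').
  call dtrsen('B', 'V', sel, n, t, n, q, n, wr, wi, m, s, sep, &
              work, lwork, iwork, liwork, info)
  if (info /= 0) error stop 'dtrsen failed'

  print *, 'm =', m, ' leading eigenvalue =', wr(1)
  print *, 's =', s, ' sep =', sep
end program trsen_sketch
```

The first m columns of q then span the invariant subspace associated with the selected cluster.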
?trsyl
Solves Sylvester equation for real quasi-triangular or
complex triangular matrices.
Syntax
call strsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call dtrsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call ctrsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call ztrsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call trsyl(a, b, c, scale [, trana] [,tranb] [,isgn] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine solves the Sylvester matrix equation op(A)*X ± X*op(B) = scale*C, where op(A) = A or A^H, and the
matrices A and B are upper triangular (or, for real flavors, upper quasi-triangular in canonical Schur form);
scale ≤ 1 is a scale factor determined by the routine to avoid overflow in X; A is m-by-m, B is n-by-n, and C and X
are both m-by-n. The matrix X is obtained by a straightforward process of back substitution.
The equation has a unique solution if and only if α_i ± β_j ≠ 0 for all i and j, where {α_i} and {β_j} are the
eigenvalues of A and B, respectively, and the sign (+ or -) is the same as that used in the equation to be solved.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
Let X be the exact, Y the corresponding computed solution, and R the residual matrix: R = C - (A*Y ± Y*B).
Then the residual is always small:
||R||_F = O(ε)*(||A||_F + ||B||_F)*||Y||_F.
However, Y is not necessarily the exact solution of a slightly perturbed equation; in other words, the solution
is not backward stable.
For the forward error, the following bound holds:
||Y - X||_F ≤ ||R||_F/sep(A,B)
but this may be a considerable overestimate. See [Golub96] for a definition of sep(A, B).
The approximate number of floating-point operations for real flavors is m*n*(m + n). For complex flavors it
is 4 times greater.
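The residual statement above can be checked numerically on a tiny problem. The following is a minimal sketch under stated assumptions: it calls the FORTRAN 77 routine dtrsyl through an implicit interface, solves A*X + X*B = scale*C (isgn = 1, no transposes) for small upper triangular A and B whose eigenvalue sums are all nonzero, and uses illustrative values and the hypothetical names x, sfac, and resid.

```fortran
program trsyl_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 2, n = 2
  real(dp) :: a(m,m), b(n,n), c(m,n), x(m,n), sfac, resid
  integer :: info

  ! A = [1 2; 0 3], B = [4 1; 0 5], C = [1 3; 2 4], listed column by column.
  a = reshape([1.0_dp, 0.0_dp, 2.0_dp, 3.0_dp], [m, m])
  b = reshape([4.0_dp, 0.0_dp, 1.0_dp, 5.0_dp], [n, n])
  c = reshape([1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp], [m, n])

  ! The routine overwrites its c argument with the solution X.
  x = c
  call dtrsyl('N', 'N', 1, m, n, a, m, b, n, x, m, sfac, info)
  if (info /= 0) error stop 'dtrsyl reported an error or perturbed values'

  ! Residual of the scaled equation A*X + X*B = scale*C.
  resid = maxval(abs(matmul(a, x) + matmul(x, b) - sfac*c))
  print *, 'scale =', sfac
  print *, 'max |A*X + X*B - scale*C| =', resid
end program trsyl_sketch
```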
Generalized Nonsymmetric Eigenvalue Problems: LAPACK Computational Routines
This section describes LAPACK routines for solving generalized nonsymmetric eigenvalue problems,
reordering the generalized Schur factorization of a pair of matrices, as well as performing a number of
related computational tasks.
A generalized nonsymmetric eigenvalue problem is as follows: given a pair of nonsymmetric (or
non-Hermitian) n-by-n matrices A and B, find the generalized eigenvalues λ and the corresponding generalized
eigenvectors x and y that satisfy the equations
A*x = λ*B*x (right generalized eigenvectors x)
and
y^H*A = λ*y^H*B (left generalized eigenvectors y).
Table "Computational Routines for Solving Generalized Nonsymmetric Eigenvalue Problems" lists LAPACK
routines (FORTRAN 77 interface) used to solve the generalized nonsymmetric eigenvalue problems and the
generalized Sylvester equation. The corresponding routine names in the Fortran 95 interface are without the
first symbol.
Computational Routines for Solving Generalized Nonsymmetric Eigenvalue Problems
Routine name   Operation performed
gghrd          Reduces a pair of matrices to generalized upper Hessenberg form using orthogonal/unitary
               transformations.
hgeqz          Implements the QZ method for finding the generalized eigenvalues of the matrix pair (H,T).
tgevc          Computes some or all of the right and/or left generalized eigenvectors of a pair of upper
               triangular matrices.
tgexc          Reorders the generalized Schur decomposition of a pair of matrices (A,B) so that one
               diagonal block of (A,B) moves to another row index.
tgsen          Reorders the generalized Schur decomposition of a pair of matrices (A,B) so that a
               selected cluster of eigenvalues appears in the leading diagonal blocks of (A,B).
tgsyl          Solves the generalized Sylvester equation.
tgsna          Estimates reciprocal condition numbers for specified eigenvalues and/or eigenvectors of a
               pair of matrices in generalized real Schur canonical form.
?gghrd
Reduces a pair of matrices to generalized upper
Hessenberg form using orthogonal/unitary
transformations.
Syntax
call sgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call dgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call cgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call zgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call gghrd(a, b [,ilo] [,ihi] [,q] [,z] [,compq] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a pair of real/complex matrices (A,B) to generalized upper Hessenberg form using
orthogonal/unitary transformations, where A is a general matrix and B is upper triangular. The form of the
generalized eigenvalue problem is A*x = λ*B*x, and B is typically made upper triangular by computing its
QR factorization and moving the orthogonal matrix Q to the left side of the equation.
This routine simultaneously reduces A to a Hessenberg matrix H:
Q^H*A*Z = H
and transforms B to another upper triangular matrix T:
Q^H*B*Z = T
in order to reduce the problem to its standard form H*y = λ*T*y, where y = Z^H*x.
The orthogonal/unitary matrices Q and Z are determined as products of Givens rotations. They may either be
formed explicitly, or they may be postmultiplied into input matrices Q1 and Z1, so that
Q1*A*Z1^H = (Q1*Q)*H*(Z1*Z)^H
Q1*B*Z1^H = (Q1*Q)*T*(Z1*Z)^H
If Q1 is the orthogonal/unitary matrix from the QR factorization of B in the original equation A*x = λ*B*x,
then the routine ?gghrd reduces the original problem to generalized Hessenberg form.
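The preprocessing described above (QR factorization of B, moving Q1 to the left side, then reducing the pair) can be sketched as follows. This is an illustration under explicit assumptions, not the only possible calling sequence: it uses the FORTRAN 77 routines dgeqrf, dormqr, dorgqr, and dgghrd through implicit interfaces, skips balancing (ilo = 1, ihi = n), uses arbitrary test matrices, and takes a generous fixed workspace. The names a0 and b0 are hypothetical copies kept for the final check.

```fortran
program gghrd_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  integer, parameter :: lwork = 64*n
  real(dp) :: a0(n,n), b0(n,n), a(n,n), b(n,n), q(n,n), z(n,n)
  real(dp) :: tau(n), work(lwork)
  integer :: info, i, j

  ! Illustrative general matrices A and B.
  do j = 1, n
    do i = 1, n
      a0(i,j) = 1.0_dp / real(i + 2*j, dp)
      b0(i,j) = 1.0_dp / real(2*i + j, dp)
    end do
    b0(j,j) = b0(j,j) + 2.0_dp
  end do
  a = a0
  b = b0

  ! B = Q1*R, then A := Q1^T*A, then form Q1 explicitly in q.
  call dgeqrf(n, n, b, n, tau, work, lwork, info)
  if (info /= 0) error stop 'dgeqrf failed'
  call dormqr('L', 'T', n, n, n, b, n, tau, a, n, work, lwork, info)
  if (info /= 0) error stop 'dormqr failed'
  q = b
  call dorgqr(n, n, n, q, n, tau, work, lwork, info)
  if (info /= 0) error stop 'dorgqr failed'
  do j = 1, n - 1
    b(j+1:n, j) = 0.0_dp   ! keep only the upper triangular factor R
  end do

  ! compq = 'V': q enters as Q1 and returns Q1*Q; compz = 'I': z returns Z.
  call dgghrd('V', 'I', n, 1, n, a, n, b, n, q, n, z, n, info)
  if (info /= 0) error stop 'dgghrd failed'

  ! On exit a holds H and b holds T, so A0 = q*H*z^T and B0 = q*T*z^T.
  print *, 'max |A - Q*H*Z^T| =', &
           maxval(abs(a0 - matmul(q, matmul(a, transpose(z)))))
  print *, 'max |B - Q*T*Z^T| =', &
           maxval(abs(b0 - matmul(q, matmul(b, transpose(z)))))
end program gghrd_sketch
```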
Input Parameters
ilo, ihi INTEGER. ilo and ihi mark the rows and columns of A which are to be
reduced. It is assumed that A is already upper triangular in rows and
columns 1:ilo-1 and ihi+1:n. Values of ilo and ihi are normally set by a
previous call to ggbal; otherwise they should be set to 1 and n respectively.
Constraint:
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n;
a, b, q, z REAL for sgghrd
DOUBLE PRECISION for dgghrd
COMPLEX for cgghrd
DOUBLE COMPLEX for zgghrd.
Arrays:
a(lda,*) contains the n-by-n general matrix A.
The second dimension of a must be at least max(1, n).
b(ldb,*) contains the n-by-n upper triangular matrix B.
The second dimension of b must be at least max(1, n).
q(ldq,*)
If compq = 'N', then q is not referenced.
Output Parameters
a On exit, the upper triangle and the first subdiagonal of A are overwritten
with the upper Hessenberg matrix H, and the rest is set to zero.
info INTEGER.
If info = 0, the execution is successful.
If present, compq must be equal to 'I' or 'V' and the argument q must also be
present. Note that there will be an error condition if compq is present and q
omitted.
If present, compz must be equal to 'I' or 'V' and the argument z must also be
present. Note that there will be an error condition if compz is present and z
omitted.
?ggbal
Balances a pair of general real or complex matrices.
Syntax
call sggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call dggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call cggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call zggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call ggbal(a, b [,ilo] [,ihi] [,lscale] [,rscale] [,job] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine balances a pair of general real/complex matrices (A,B). This involves, first, permuting A and B by
similarity transformations to isolate eigenvalues in the first 1 to ilo-1 and last ihi+1 to n elements on the
diagonal; and second, applying a diagonal similarity transformation to rows and columns ilo to ihi to make the
rows and columns as close in norm as possible. Both steps are optional. Balancing may reduce the 1-norm of
the matrices, and improve the accuracy of the computed eigenvalues and/or eigenvectors in the generalized
eigenvalue problem A*x = λ*B*x.
Input Parameters
If job = 'N', then no operations are done; the routine simply sets ilo = 1, ihi = n,
lscale(i) = 1.0 and rscale(i) = 1.0 for
i = 1,..., n.
If job = 'P', then permute only.
Output Parameters
ilo, ihi INTEGER. ilo and ihi are set to integers such that on exit A(i,j) = 0 and B(i,j) =
0 if i > j and j = 1,...,ilo-1 or i = ihi+1,..., n.
info INTEGER.
If info = 0, the execution is successful.
job Must be 'B', 'S', 'P', or 'N'. The default value is 'B'.
?ggbak
Forms the right or left eigenvectors of a generalized
eigenvalue problem.
Syntax
call sggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call dggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call cggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call zggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call ggbak(v [, ilo] [,ihi] [,lscale] [,rscale] [,job] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the right or left eigenvectors of a real/complex generalized eigenvalue problem
A*x = λ*B*x
by backward transformation on the computed eigenvectors of the balanced pair of matrices output by ggbal.
Input Parameters
ilo, ihi INTEGER. The integers ilo and ihi determined by ?ggbal. Constraint:
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n;
The array rscale contains details of the permutations and/or scaling factors
applied to the right side of A and B, as returned by ?ggbal.
(m ≥ 0).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
job Must be 'B', 'S', 'P', or 'N'. The default value is 'B'.
side If omitted, this argument is restored based on the presence of arguments lscale
and rscale as follows:
side = 'L', if lscale is present and rscale omitted,
side = 'R', if lscale is omitted and rscale present.
Note that there will be an error condition if both lscale and rscale are present or
if they both are omitted.
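One way to see what ?ggbal and ?ggbak do together is to apply the back-transformation to identity matrices. The sketch below is a hedged illustration: it assumes the FORTRAN 77 routines dggbal and dggbak are reachable through implicit interfaces and that, as in the standard LAPACK implementation, balancing consists of row/column interchanges and diagonal scalings, so that the returned left and right transformation matrices (called lm and rm here, hypothetical names) satisfy lm^T*A*rm = balanced A and likewise for B. The matrix values are illustrative.

```fortran
program ggbak_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), b(n,n), ab(n,n), bb(n,n), lm(n,n), rm(n,n)
  real(dp) :: lscale(n), rscale(n), work(6*n)
  integer :: ilo, ihi, info, i

  ! Badly scaled pair (A,B), listed column by column (illustrative values).
  a = reshape([1.0_dp,   1.0e-3_dp, 2.0e-6_dp, &
               1.0e3_dp, 2.0_dp,    1.0e-3_dp, &
               4.0e6_dp, 1.0e3_dp,  3.0_dp], [n, n])
  b = reshape([2.0_dp,   0.0_dp,    0.0_dp, &
               1.0e3_dp, 1.0_dp,    0.0_dp, &
               1.0e6_dp, 2.0e3_dp,  5.0_dp], [n, n])
  ab = a
  bb = b

  ! Balance copies of A and B (permute and scale).
  call dggbal('B', n, ab, n, bb, n, ilo, ihi, lscale, rscale, work, info)
  if (info /= 0) error stop 'dggbal failed'

  ! Back-transform identity matrices to expose the left/right transformations.
  lm = 0.0_dp
  rm = 0.0_dp
  do i = 1, n
    lm(i,i) = 1.0_dp
    rm(i,i) = 1.0_dp
  end do
  call dggbak('B', 'L', n, ilo, ihi, lscale, rscale, n, lm, n, info)
  if (info /= 0) error stop 'dggbak (left) failed'
  call dggbak('B', 'R', n, ilo, ihi, lscale, rscale, n, rm, n, info)
  if (info /= 0) error stop 'dggbak (right) failed'

  print *, 'ilo, ihi =', ilo, ihi
  print *, 'max |L^T*A*R - Abal| =', &
           maxval(abs(matmul(transpose(lm), matmul(a, rm)) - ab))
  print *, 'max |L^T*B*R - Bbal| =', &
           maxval(abs(matmul(transpose(lm), matmul(b, rm)) - bb))
end program ggbak_sketch
```

In a real computation v would hold the eigenvectors of the balanced pair rather than the identity, with side selecting which of the two transformations is applied.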
?gghd3
Reduces a pair of matrices to generalized upper
Hessenberg form.
Syntax
call sgghd3 (compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, work, lwork,
info )
call dgghd3 (compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, work, lwork,
info )
call cgghd3 (compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, work, lwork,
info )
call zgghd3 (compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, work, lwork,
info )
Include Files
mkl.fi
Description
?gghd3 reduces a pair of real or complex matrices (A, B) to generalized upper Hessenberg form using
orthogonal/unitary transformations, where A is a general matrix and B is upper triangular. The form of the
generalized eigenvalue problem is
A*x = λ*B*x,
and B is typically made upper triangular by computing its QR factorization and moving the orthogonal/unitary
matrix Q to the left side of the equation.
This subroutine simultaneously reduces A to a Hessenberg matrix H:
Q^T*A*Z = H for real flavors
or
Q^H*A*Z = H for complex flavors
and transforms B to another upper triangular matrix T:
Q^T*B*Z = T for real flavors
or
Q^H*B*Z = T for complex flavors
in order to reduce the problem to its standard form
H*y = λ*T*y
where y = Z^T*x for real flavors
or
y = Z^H*x for complex flavors.
The orthogonal/unitary matrices Q and Z are determined as products of Givens rotations. They may either be
formed explicitly, or they may be postmultiplied into input matrices Q1 and Z1, so that
for real flavors:
Q1 * A * Z1^T = (Q1*Q) * H * (Z1*Z)^T
Q1 * B * Z1^T = (Q1*Q) * T * (Z1*Z)^T
for complex flavors:
Q1 * A * Z1^H = (Q1*Q) * H * (Z1*Z)^H
Q1 * B * Z1^H = (Q1*Q) * T * (Z1*Z)^H
If Q1 is the orthogonal/unitary matrix from the QR factorization of B in the original equation A*x = λ*B*x,
then ?gghd3 reduces the original problem to generalized Hessenberg form.
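The step from the original problem to the standard form can be written out as a short derivation. This is a sketch for the real flavors, assuming only that Q and Z are orthogonal; for complex flavors the transposes become conjugate transposes.

```latex
\begin{aligned}
A x &= \lambda B x \\
Q^T A \,(Z Z^T)\, x &= \lambda\, Q^T B \,(Z Z^T)\, x
  && \text{(multiply by } Q^T \text{, insert } Z Z^T = I\text{)} \\
(Q^T A Z)(Z^T x) &= \lambda\, (Q^T B Z)(Z^T x) \\
H y &= \lambda\, T y, \qquad y = Z^T x .
\end{aligned}
```

The eigenvalues are unchanged by the reduction, and an eigenvector x of the original pair is recovered from y as x = Z*y.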
This is a blocked variant of ?gghrd, using matrix-matrix multiplications for parts of the computation to
enhance performance.
Input Parameters
ilo, ihi INTEGER. ilo and ihi mark the rows and columns of a which are to be
reduced. It is assumed that a is already upper triangular in rows and
columns 1:ilo - 1 and ihi + 1:n. ilo and ihi are normally set by a
previous call to ?ggbal; otherwise they should be set to 1 and n,
respectively.
1 ≤ ilo ≤ ihi ≤ n, if n > 0; ilo = 1 and ihi = 0, if n = 0.
lda ≥ max(1,n).
ldb ≥ max(1,n).
q REAL for sgghd3
DOUBLE PRECISION for dgghd3
COMPLEX for cgghd3
DOUBLE COMPLEX for zgghd3
Array, size (ldq, n).
ldz INTEGER. The leading dimension of the array z. ldz ≥ n if compz = 'V' or 'I';
ldz ≥ 1 otherwise.
Output Parameters
a On exit, the upper triangle and the first subdiagonal of a are overwritten
with the upper Hessenberg matrix H, and the rest is set to zero.
b On exit, the upper triangular matrix T = Q^T*B*Z for real flavors or T = Q^H*B*Z
for complex flavors. The elements below the diagonal are set to zero.
Application Notes
This routine reduces A to Hessenberg form and maintains B in triangular form using a blocked variant of Moler and Stewart's
original algorithm, as described by Kagstrom, Kressner, Quintana-Orti, and Quintana-Orti (BIT 2008).
?hgeqz
Implements the QZ method for finding the generalized
eigenvalues of the matrix pair (H,T).
Syntax
call shgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alphar, alphai, beta, q,
ldq, z, ldz, work, lwork, info)
call dhgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alphar, alphai, beta, q,
ldq, z, ldz, work, lwork, info)
call chgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alpha, beta, q, ldq, z,
ldz, work, lwork, rwork, info)
call zhgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alpha, beta, q, ldq, z,
ldz, work, lwork, rwork, info)
call hgeqz(h, t [,ilo] [,ihi] [,alphar] [,alphai] [,beta] [,q] [,z] [,job] [,compq]
[,compz] [,info])
call hgeqz(h, t [,ilo] [,ihi] [,alpha] [,beta] [,q] [,z] [,job] [,compq] [, compz]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the eigenvalues of a real/complex matrix pair (H,T), where H is an upper Hessenberg
matrix and T is upper triangular, using the double-shift version (for real flavors) or single-shift version (for
complex flavors) of the QZ method. Matrix pairs of this type are produced by the reduction to generalized
upper Hessenberg form of a real/complex matrix pair (A,B):
A = Q1*H*Z1^H, B = Q1*T*Z1^H,
as computed by ?gghrd.
H = Q*S*Z^T, T = Q*P*Z^T,
where Q and Z are orthogonal matrices, P is an upper triangular matrix, and S is a quasi-triangular matrix
with 1-by-1 and 2-by-2 diagonal blocks. The 1-by-1 blocks correspond to real eigenvalues of the matrix pair
(H,T) and the 2-by-2 blocks correspond to complex conjugate pairs of eigenvalues.
Additionally, the 2-by-2 upper triangular diagonal blocks of P corresponding to 2-by-2 blocks of S are reduced
to positive diagonal form, that is, if S(j+1,j) is non-zero, then P(j+1,j) = P(j,j+1) = 0, P(j,j) > 0, and
P(j+1,j+1) > 0.
H = Q*S*Z^H, T = Q*P*Z^H,
where Q and Z are unitary matrices, and S and P are upper triangular.
For all function flavors:
Optionally, the orthogonal/unitary matrix Q from the generalized Schur factorization may be post-multiplied
by an input matrix Q1, and the orthogonal/unitary matrix Z may be post-multiplied by an input matrix Z1.
If Q1 and Z1 are the orthogonal/unitary matrices from ?gghrd that reduced the matrix pair (A,B) to
generalized upper Hessenberg form, then the output matrices Q1*Q and Z1*Z are the orthogonal/unitary
factors from the generalized Schur factorization of (A,B):
A = (Q1*Q)*S*(Z1*Z)^H, B = (Q1*Q)*P*(Z1*Z)^H.
To avoid overflow, eigenvalues of the matrix pair (H,T) (equivalently, of (A,B)) are computed as a pair of
values (alpha,beta). For chgeqz/zhgeqz, alpha and beta are complex, and for shgeqz/dhgeqz, alpha is
complex and beta real. If beta is nonzero, λ = alpha/beta is an eigenvalue of the generalized
nonsymmetric eigenvalue problem (GNEP)
A*x = λ*B*x
and if alpha is nonzero, μ = beta/alpha is an eigenvalue of the alternate form of the GNEP
μ*A*y = B*y.
Real eigenvalues (for real flavors) or the values of alpha and beta for the i-th eigenvalue (for complex
flavors) can be read directly from the generalized Schur form:
alpha = S(i,i), beta = P(i,i).
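To illustrate the (alpha, beta) convention for the real flavors, the following minimal sketch forms the eigenvalue alpha/beta only when beta is nonzero. The contents of alphar, alphai, and beta are hypothetical placeholder values, not the output of an actual ?hgeqz call, and the exact comparison with zero is an assumption; real code may prefer a tolerance.

```fortran
program eig_from_alpha_beta
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  ! Hypothetical values in the layout a real-flavor ?hgeqz call uses.
  real(dp) :: alphar(n) = [ 2.0_dp, 1.0_dp, 4.0_dp ]
  real(dp) :: alphai(n) = [ 0.0_dp, 0.0_dp, 0.0_dp ]
  real(dp) :: beta(n)   = [ 1.0_dp, 0.0_dp, 2.0_dp ]
  complex(dp) :: lambda
  integer :: j

  do j = 1, n
     if (beta(j) /= 0.0_dp) then
        ! Finite eigenvalue: lambda = (alphar + i*alphai) / beta
        lambda = cmplx(alphar(j), alphai(j), kind=dp) / beta(j)
        print '(a,i0,a,2es12.4)', 'lambda(', j, ') = ', lambda
     else
        ! beta = 0: the ratio is not formed, which avoids overflow
        print '(a,i0,a)', 'lambda(', j, ') is infinite (beta = 0)'
     end if
  end do
end program eig_from_alpha_beta
```

Keeping alpha and beta separate until the last moment is what lets the caller decide how to treat infinite or nearly infinite eigenvalues.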
Input Parameters
If compq = 'I', q is initialized to the unit matrix and the matrix of left
Schur vectors of (H,T) is returned;
If compq = 'V', q must contain an orthogonal/unitary matrix Q1 on entry
and the product Q1*Q is returned.
If compz = 'I', z is initialized to the unit matrix and the matrix of right
Schur vectors of (H,T) is returned;
If compz = 'V', z must contain an orthogonal/unitary matrix Z1 on entry
and the product Z1*Z is returned.
ilo, ihi INTEGER. ilo and ihi mark the rows and columns of H which are in
Hessenberg form. It is assumed that H is already upper triangular in rows
and columns 1:ilo-1 and ihi+1:n.
Constraint:
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n;
ldz INTEGER. The leading dimension of z;
If compz = 'N', then ldz ≥ 1.
Output Parameters
t If job = 'S', then, on exit, t contains the upper triangular matrix P from
the generalized Schur factorization.
For real flavors:
2-by-2 diagonal blocks of P corresponding to 2-by-2 blocks of S are reduced
to positive diagonal form, that is, if h(j+1,j) is non-zero, then
t(j+1,j) = t(j,j+1) = 0 and t(j,j) and t(j+1,j+1) will be positive.
If job = 'E', then on exit the diagonal blocks of t match those of P, but
the rest of t is unspecified.
For complex flavors:
if job = 'E', then on exit the diagonal of t matches that of P, but the rest
of t is unspecified.
alphai(j+1) = -alphai(j).
work(1) If info ≥ 0, on exit, work(1) contains the minimum value of lwork required
for optimum performance. Use this lwork for subsequent runs.
info INTEGER.
If info = 0, the execution is successful.
(H,T) is not in Schur form, but alphar(i), alphai(i) (for real flavors), alpha(i)
(for complex flavors), and beta(i), i=info+1,..., n should be correct.
(H,T) is not in Schur form, but alphar(i), alphai(i) (for real flavors), alpha(i)
(for complex flavors), and beta(i), i =info-n+1,..., n should be correct.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size, that is, no less than the minimal value
described, the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
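The workspace query described above might be used as in the following sketch for dhgeqz. The 3-by-3 matrices are hypothetical test data, default (LP64) integers are assumed, and the program must be linked against a LAPACK implementation such as Intel MKL.

```fortran
program hgeqz_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: h(n,n), t(n,n), q(1,1), z(1,1)
  real(dp) :: alphar(n), alphai(n), beta(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info, j
  external :: dhgeqz

  ! Hypothetical data, stored column by column:
  ! h is upper Hessenberg, t is upper triangular.
  h = reshape([ 4.0_dp, 1.0_dp, 0.0_dp, &
                2.0_dp, 3.0_dp, 1.0_dp, &
                1.0_dp, 2.0_dp, 5.0_dp ], [n, n])
  t = reshape([ 1.0_dp, 0.0_dp, 0.0_dp, &
                0.5_dp, 2.0_dp, 0.0_dp, &
                0.0_dp, 1.0_dp, 3.0_dp ], [n, n])

  ! Step 1: workspace query (lwork = -1); only wquery(1) is set.
  call dhgeqz('E', 'N', 'N', n, 1, n, h, n, t, n, alphar, alphai, beta, &
              q, 1, z, 1, wquery, -1, info)
  if (info /= 0) error stop 'workspace query failed'
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))

  ! Step 2: the actual computation with the recommended workspace.
  call dhgeqz('E', 'N', 'N', n, 1, n, h, n, t, n, alphar, alphai, beta, &
              q, 1, z, 1, work, lwork, info)
  print '(a,i0)', 'info = ', info
  if (info == 0) then
     do j = 1, n
        print '(a,i0,a,3es13.5)', 'j = ', j, ': alphar, alphai, beta = ', &
              alphar(j), alphai(j), beta(j)
     end do
  end if
end program hgeqz_workspace_query
```

With job = 'E' and compq = compz = 'N', the arrays q and z are not referenced, so one-element placeholders with ldq = ldz = 1 are sufficient.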
?tgevc
Computes some or all of the right and/or left
generalized eigenvectors of a pair of upper triangular
matrices.
Syntax
call stgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm, m, work,
info)
call dtgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm, m, work,
info)
call ctgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm, m, work,
rwork, info)
call ztgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm, m, work,
rwork, info)
call tgevc(s, p [,howmny] [,select] [,vl] [,vr] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes some or all of the right and/or left eigenvectors of a pair of real/complex matrices
(S,P), where S is quasi-triangular (for real flavors) or upper triangular (for complex flavors) and P is upper
triangular.
Matrix pairs of this type are produced by the generalized Schur factorization of a real/complex matrix pair
(A,B):
A = Q*S*Z^H, B = Q*P*Z^H
as computed by ?gghrd plus ?hgeqz.
The right eigenvector x and the left eigenvector y of (S,P) corresponding to an eigenvalue w are defined by:
S*x = w*P*x, y^H*S = w*y^H*P
The eigenvalues are not input to this routine, but are computed directly from the diagonal blocks or diagonal
elements of S and P.
This routine returns the matrices X and/or Y of right and left eigenvectors of (S,P), or the products Z*X
and/or Q*Y, where Z and Q are input matrices.
If Q and Z are the orthogonal/unitary factors from the generalized Schur factorization of a matrix pair (A,B),
then Z*X and Q*Y are the matrices of right and left eigenvectors of (A,B).
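The back-transformation claim can be checked directly. The following sketch assumes only the factorization A = Q*S*Z^H, B = Q*P*Z^H with unitary Q and Z, and the eigenvector definitions given above.

```latex
\begin{aligned}
A\,(Z x) &= Q S Z^H Z x = Q S x = w\, Q P x = w\, Q P Z^H (Z x) = w\, B\,(Z x), \\
(Q y)^H A &= y^H Q^H Q S Z^H = y^H S Z^H = w\, y^H P Z^H = w\, y^H Q^H Q P Z^H = w\, (Q y)^H B .
\end{aligned}
```

So Z*x is a right eigenvector and Q*y a left eigenvector of (A,B) for the same eigenvalue w.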
Input Parameters
select LOGICAL.
Array, size at least max (1, n).
If howmny = 'S', select specifies the eigenvectors to be computed.
If w(j) and w(j+1) are the real and imaginary parts of a complex
eigenvalue, the corresponding complex eigenvector is computed if either
select(j) or select(j+1) is .TRUE., and on exit select(j) is set
to .TRUE. and select(j+1) is set to .FALSE..
For complex flavors, P must have real diagonal elements. The second
dimension of p must be at least max(1, n).
If side = 'L' or 'B' and howmny = 'B', vl(ldvl,*) must contain an n-by-n
matrix Q (usually the orthogonal/unitary matrix Q of left Schur vectors
returned by ?hgeqz). The second dimension of vl must be at least
max(1, mm).
If side = 'R', vl is not referenced.
If side = 'R' or 'B' and howmny = 'B', vr(ldvr,*) must contain an n-by-n
matrix Z (usually the orthogonal/unitary matrix Z of right Schur vectors
returned by ?hgeqz). The second dimension of vr must be at least
max(1, mm).
If side = 'L', vr is not referenced.
rwork REAL for ctgevc DOUBLE PRECISION for ztgevc. Workspace array, size at
least max (1, 2*n). Used in complex flavors only.
Output Parameters
if howmny = 'B', the matrix Z*X;
info INTEGER.
If info = 0, the execution is successful.
howmny If omitted, this argument is restored based on the presence of argument select
as follows:
howmny = 'S', if select is present,
howmny = 'A', if select is omitted.
If present, howmny must be equal to 'A' or 'B' and the argument select must
be omitted.
Note that there will be an error condition if both howmny and select are present.
?tgexc
Reorders the generalized Schur decomposition of a
pair of matrices (A,B) so that one diagonal block of
(A,B) moves to another row index.
Syntax
call stgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, work, lwork,
info)
call dtgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, work, lwork,
info)
call ctgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, info)
call ztgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, info)
call tgexc(a, b [,ifst] [,ilst] [,z] [,q] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reorders the generalized real-Schur/Schur decomposition of a real/complex matrix pair (A,B)
using an orthogonal/unitary equivalence transformation
(A,B) = Q*(A,B)*Z^H,
so that the diagonal block of (A, B) with row index ifst is moved to row ilst. Matrix pair (A, B) must be in a
generalized real-Schur/Schur canonical form (as returned by gges), that is, A is block upper triangular with
1-by-1 and 2-by-2 diagonal blocks and B is upper triangular. Optionally, the matrices Q and Z of generalized
Schur vectors are updated.
Qin*Ain*Zin^T = Qout*Aout*Zout^T
Qin*Bin*Zin^T = Qout*Bout*Zout^T.
Input Parameters
n INTEGER. The order of the matrices A and B (n ≥ 0).
ifst, ilst INTEGER. Specify the reordering of the diagonal blocks of (A, B). The block
with row index ifst is moved to row ilst, by a sequence of swapping between
adjacent blocks. Constraint: 1 ≤ ifst, ilst ≤ n.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info = 1, the transformed matrix pair (A, B) would be too far from
generalized Schur form; the problem is ill-conditioned. (A, B) may have
been partially reordered, and ilst points to the first row of the current
position of the block being moved.
Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
?tgsen
Reorders the generalized Schur decomposition of a
pair of matrices (A,B) so that a selected cluster of
eigenvalues appears in the leading diagonal blocks of
(A,B).
Syntax
call stgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alphar, alphai, beta, q,
ldq, z, ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call dtgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alphar, alphai, beta, q,
ldq, z, ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call ctgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alpha, beta, q, ldq, z,
ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call ztgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alpha, beta, q, ldq, z,
ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call tgsen(a, b, select [,alphar] [,alphai] [,beta] [,ijob] [,q] [,z] [,pl] [,pr] [,dif]
[,m] [,info])
call tgsen(a, b, select [,alpha] [,beta] [,ijob] [,q] [,z] [,pl] [,pr] [, dif] [,m]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reorders the generalized real-Schur/Schur decomposition of a real/complex matrix pair (A, B) (in
terms of an orthogonal/unitary equivalence transformation Q^T*(A,B)*Z for real flavors or Q^H*(A,B)*Z for
complex flavors), so that a selected cluster of eigenvalues appears in the leading diagonal blocks of the pair
(A, B). The leading columns of Q and Z form orthonormal/unitary bases of the corresponding left and right
eigenspaces (deflating subspaces).
(A, B) must be in generalized real-Schur/Schur canonical form (as returned by gges), that is, A and B are
both upper triangular.
?tgsen also computes the generalized eigenvalues
ω(j) = (alphar(j) + alphai(j)*i)/beta(j) (for real flavors)
ω(j) = alpha(j)/beta(j) (for complex flavors)
of the reordered matrix pair (A, B).
Optionally, the routine computes the estimates of reciprocal condition numbers for eigenvalues and
eigenspaces. These are Difu[(A11, B11), (A22, B22)] and Difl[(A11, B11), (A22, B22)], that is, the
separation(s) between the matrix pairs (A11, B11) and (A22, B22) that correspond to the selected cluster and
the eigenvalues outside the cluster, respectively, and norms of "projections" onto left and right eigenspaces
with respect to the selected cluster in the (1,1)-block.
Input Parameters
ijob INTEGER. Specifies whether condition numbers are required for the cluster
of eigenvalues (pl and pr) or the deflating subspaces Difu and Difl.
If ijob = 4, compute pl, pr and dif (i.e., options 0, 1 and 2 above). This is
an economic version to get it all;
If ijob =5, compute pl, pr and dif (i.e., options 0, 1 and 3 above).
select LOGICAL.
Array, size at least max (1, n). Specifies the eigenvalues in the selected
cluster.
To select an eigenvalue j, select(j) must be .TRUE.. For real flavors: to
select a complex conjugate pair of eigenvalues j and j+1 (corresponding to a
2-by-2 diagonal block), select(j) and/or select(j+1) must be set
to .TRUE.; the complex conjugate pair j and j+1 must be either both
included in the cluster or both excluded.
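One way to make a selection array respect the pairing rule for real flavors is sketched below. It assumes that 2-by-2 diagonal blocks are recognized by a nonzero subdiagonal entry a(j+1,j); the matrix a, the array name sel, and the initial selection are hypothetical illustration data.

```fortran
program pair_consistent_select
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n,n)
  logical  :: sel(n) = [ .false., .true., .false., .true. ]
  integer  :: j

  ! Hypothetical quasi-triangular A: one 2-by-2 block in rows/columns 2:3.
  a = 0.0_dp
  a(3,2) = -1.0_dp

  j = 1
  do while (j <= n)
     if (j < n) then
        if (a(j+1,j) /= 0.0_dp) then
           ! Rows j and j+1 form a 2-by-2 block (complex conjugate pair):
           ! select both members if either one was requested.
           sel(j)   = sel(j) .or. sel(j+1)
           sel(j+1) = sel(j)
           j = j + 2
           cycle
        end if
     end if
     j = j + 1   ! 1-by-1 block (real eigenvalue): leave sel(j) unchanged
  end do

  ! sel is now (F, T, T, T): the pair (2,3) is included as a whole.
  print *, sel
end program pair_consistent_select
```

The resulting array can then be passed as the select argument, so that no conjugate pair is split across the cluster boundary.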
For real flavors: B is upper triangular, with (A, B) in generalized real Schur
canonical form.
For complex flavors: B is upper triangular, in generalized Schur canonical
form. The second dimension of b must be at least max(1, n).
q(ldq,*)
If wantq = .TRUE., then q is an n-by-n matrix;
Output Parameters
z If wantz = .TRUE., then, on exit, Z has been postmultiplied by the left
orthogonal transformation matrix that reorders (A, B). The leading m
columns of Z form orthonormal bases for the specified pair of left
eigenspaces (deflating subspaces).
m INTEGER.
The dimension of the specified pair of left and right eigen-spaces (deflating
subspaces); 0 ≤ m ≤ n.
dif REAL for single precision flavors; DOUBLE PRECISION for double precision
flavors.
Array, size (2).
If ijob ≥ 2, dif(1:2) store the estimates of Difu and Difl.
work(1) If ijob is not 0 and info = 0, on exit, work(1) contains the minimum
value of lwork required for optimum performance. Use this lwork for
subsequent runs.
iwork(1) If ijob is not 0 and info = 0, on exit, iwork(1) contains the minimum
value of liwork required for optimum performance. Use this liwork for
subsequent runs.
info INTEGER.
If info = 0, the execution is successful.
Specific details for the routine tgsen interface are the following:
Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any admissible size, that is, no less than the minimal value described, the
routine completes the task, though possibly more slowly than with the recommended workspace, and returns the
recommended workspace size in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace
in the first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
?tgsyl
Solves the generalized Sylvester equation.
Syntax
call stgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, scale,
dif, work, lwork, iwork, info)
call dtgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, scale,
dif, work, lwork, iwork, info)
call ctgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, scale,
dif, work, lwork, iwork, info)
call ztgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, scale,
dif, work, lwork, iwork, info)
call tgsyl(a, b, c, d, e, f [,ijob] [,trans] [,scale] [,dif] [,info])
Include Files
mkl.fi, lapack.f90
Description
Here Ik is the identity matrix of size k and X^T is the transpose/conjugate-transpose of X. kron(X, Y) is the
Kronecker product between the matrices X and Y.
If trans = 'T' (for real flavors), or trans = 'C' (for complex flavors), the routine ?tgsyl solves the
transposed/conjugate-transposed system Z^T*y = scale*b, which is equivalent to solving for R and L in
A^T*R + D^T*L = scale*C
R*B^T + L*E^T = scale*(-F)
This case (trans = 'T' for stgsyl/dtgsyl or trans = 'C' for ctgsyl/ztgsyl) is used to compute a
one-norm-based estimate of Dif[(A, D), (B, E)], the separation between the matrix pairs (A,D) and
(B,E), using ?lacon.
If ijob ≥ 1, ?tgsyl computes a Frobenius norm-based estimate of Dif[(A, D), (B,E)]. That is, the
reciprocal of a lower bound on the reciprocal of the smallest singular value of Z. This is a level 3 BLAS
algorithm.
Input Parameters
If trans = 'T', solve the 'transposed' system (for real flavors only).
If trans = 'C', solve the 'conjugate transposed' system (for complex
flavors only).
If ijob =3, only an estimate of Dif[(A, D), (B, E)] is computed (look ahead
strategy is used);
If ijob =4, only an estimate of Dif[(A, D), (B,E)] is computed (?gecon on
sub-systems is used). If trans = 'T' or 'C', ijob is not referenced.
m INTEGER. The order of the matrices A and D, and the row dimension of the
matrices C, F, R and L.
n INTEGER. The order of the matrices B and E, and the column dimension of
the matrices C, F, R and L.
ldc INTEGER. The leading dimension of c; at least max(1, m) .
lwork INTEGER.
The dimension of the array work. lwork ≥ 1.
iwork INTEGER. Workspace array, size at least (m+n+6) for real flavors, and at
least (m+n+2) for complex flavors.
Output Parameters
If ijob=3 or 4 and trans = 'N', c holds R, the solution achieved during the
computation of the Dif-estimate.
work(1) If info = 0, work(1) contains the minimum value of lwork required for
optimum performance. Use this lwork for subsequent runs.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
?tgsna
Estimates reciprocal condition numbers for specified
eigenvalues and/or eigenvectors of a pair of matrices
in generalized real Schur canonical form.
Syntax
call stgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s, dif, mm,
m, work, lwork, iwork, info)
call dtgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s, dif, mm,
m, work, lwork, iwork, info)
call ctgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s, dif, mm,
m, work, lwork, iwork, info)
call ztgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s, dif, mm,
m, work, lwork, iwork, info)
call tgsna(a, b [,s] [,dif] [,vl] [,vr] [,select] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The real flavors stgsna/dtgsna of this routine estimate reciprocal condition numbers for specified
eigenvalues and/or eigenvectors of a matrix pair (A, B) in generalized real Schur canonical form (or of any
matrix pair (Q*A*Z^T, Q*B*Z^T) with orthogonal matrices Q and Z).
(A, B) must be in generalized real Schur form (as returned by gges), that is, A is block upper triangular
with 1-by-1 and 2-by-2 diagonal blocks and B is upper triangular.
The complex flavors ctgsna/ztgsna estimate reciprocal condition numbers for specified eigenvalues and/or
eigenvectors of a matrix pair (A, B). (A, B) must be in generalized Schur canonical form, that is, A and B are
both upper triangular.
Input Parameters
If job = 'B', for both eigenvalues and eigenvectors (compute both s and
dif).
select LOGICAL.
Array, size at least max (1, n).
If howmny = 'S', select specifies the eigenpairs for which condition
numbers are required.
If howmny = 'A', select is not referenced.
(n ≥ 0).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla. See Application Notes for details.
iwork INTEGER. Workspace array, size at least (n+6) for real flavors, and at least
(n+2) for complex flavors.
If job = 'E', iwork is not referenced.
Output Parameters
m INTEGER. The number of elements in the arrays s and dif used to store the
specified condition numbers; for each selected eigenvalue one element is
used.
If howmny = 'A', m is set to n.
work(1)
If job is not 'E' and info = 0, on exit, work(1) contains the minimum
value of lwork required for optimum performance. Use this lwork for
subsequent runs.
info INTEGER.
If info = 0, the execution is successful.
howmny Restored based on the presence of the argument select as follows: howmny =
'S', if select is present, howmny = 'A', if select is omitted.
job Restored based on the presence of arguments s and dif as follows: job = 'B',
if both s and dif are present, job = 'E', if s is present and dif omitted, job =
'V', if s is omitted and dif present. Note that there will be an error condition if
both s and dif are omitted.
Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
V^H*B*Q = D2*(0 R),
where U, V, and Q are orthogonal/unitary matrices, R is a nonsingular upper triangular matrix, and D1, D2
are diagonal matrices of the structure detailed in the routines description section.
Table "Computational Routines for Generalized Singular Value Decomposition" lists LAPACK routines
(FORTRAN 77 interface) that perform generalized singular value decomposition of matrices. The
corresponding routine names in the Fortran 95 interface are without the first symbol.
Computational Routines for Generalized Singular Value Decomposition
Routine name Operation performed
You can use routines listed in the above table as well as the driver routine ggsvd to find the GSVD of a pair of
general rectangular matrices.
?ggsvp
Computes the preprocessing decomposition for the
generalized SVD (deprecated).
Syntax
call sggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, tau, work, info)
call dggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, tau, work, info)
call cggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, rwork, tau, work, info)
call zggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, rwork, tau, work, info)
call ggsvp(a, b, tola, tolb [, k] [,l] [,u] [,v] [,q] [,info])
Include Files
mkl.fi, lapack.f90
Description
This routine is deprecated; use ggsvp3.
where the k-by-k matrix A12 and l-by-l matrix B13 are nonsingular upper triangular; A23 is l-by-l upper
triangular if m-k-l ≥ 0, otherwise A23 is (m-k)-by-l upper trapezoidal. The sum k+l is equal to the effective
numerical rank of the (m+p)-by-n matrix (A^H,B^H)^H.
This decomposition is the preprocessing step for computing the Generalized Singular Value Decomposition
(GSVD), see subroutine ?tgsja.
Input Parameters
b(ldb,*) contains the p-by-n matrix B.
The second dimension of b must be at least max(1, n).
tau(*) is a workspace array.
The dimension of tau must be at least max(1, n).
work(*) is a workspace array.
The dimension of work must be at least max(1, 3n, m, p).
ldu INTEGER. The leading dimension of the output array u. ldu ≥ max(1, m) if
jobu = 'U'; ldu ≥ 1 otherwise.
ldv INTEGER. The leading dimension of the output array v. ldv ≥ max(1, p) if
jobv = 'V'; ldv ≥ 1 otherwise.
ldq INTEGER. The leading dimension of the output array q. ldq ≥ max(1, n) if
jobq = 'Q'; ldq ≥ 1 otherwise.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?ggsvp3
Performs preprocessing for a generalized SVD.
Syntax
call sggsvp3 (jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, tau, work, lwork, info )
call dggsvp3 (jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, tau, work, lwork, info )
call cggsvp3 (jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, rwork, tau, work, lwork, info )
call zggsvp3 (jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, rwork, tau, work, lwork, info )
Include Files
mkl.fi, mkl_lapack.fi
Description
?ggsvp3 computes orthogonal or unitary matrices U, V, and Q such that
for real flavors:
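$$U^T A Q = \begin{pmatrix} 0 & A_{12} & A_{13} \\ 0 & 0 & A_{23} \\ 0 & 0 & 0 \end{pmatrix} \quad \text{if } m-k-l \ge 0,$$
with block rows of heights k, l, m-k-l and block columns of widths n-k-l, k, l;
$$U^T A Q = \begin{pmatrix} 0 & A_{12} & A_{13} \\ 0 & 0 & A_{23} \end{pmatrix} \quad \text{if } m-k-l < 0,$$
with block rows of heights k, m-k and block columns of widths n-k-l, k, l;
$$V^T B Q = \begin{pmatrix} 0 & 0 & B_{13} \\ 0 & 0 & 0 \end{pmatrix},$$
with block rows of heights l, p-l and block columns of widths n-k-l, k, l;
for complex flavors:
$$U^H A Q = \begin{pmatrix} 0 & A_{12} & A_{13} \\ 0 & 0 & A_{23} \\ 0 & 0 & 0 \end{pmatrix} \quad \text{if } m-k-l \ge 0,$$
with block rows of heights k, l, m-k-l and block columns of widths n-k-l, k, l;
$$U^H A Q = \begin{pmatrix} 0 & A_{12} & A_{13} \\ 0 & 0 & A_{23} \end{pmatrix} \quad \text{if } m-k-l < 0,$$
with block rows of heights k, m-k and block columns of widths n-k-l, k, l;
$$V^H B Q = \begin{pmatrix} 0 & 0 & B_{13} \\ 0 & 0 & 0 \end{pmatrix},$$
with block rows of heights l, p-l and block columns of widths n-k-l, k, l;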
where the k-by-k matrix A12 and l-by-l matrix B13 are nonsingular upper triangular; A23 is l-by-l upper
triangular if m-k-l ≥ 0, otherwise A23 is (m-k)-by-l upper trapezoidal. k + l = the effective numerical rank of
the (m + p)-by-n matrix (A^T,B^T)^T for real flavors or (A^H,B^H)^H for complex flavors.
This decomposition is the preprocessing step for computing the Generalized Singular Value Decomposition
(GSVD), see ?ggsvd3.
Input Parameters
lda ≥ max(1,m).
ldb ≥ max(1,p).
DOUBLE PRECISION for zggsvp3
tola and tolb are the thresholds to determine the effective numerical rank
of matrix B and a subblock of A. Generally, they are set to
tola = max(m,n)*norm(a)*MACHEPS,
tolb = max(p,n)*norm(b)*MACHEPS.
The size of tola and tolb may affect the size of backward errors of the
decomposition.
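The usual choice of thresholds above can be sketched in a few lines. This is only an illustration: it assumes that norm() is the matrix 1-norm (largest absolute column sum) and that MACHEPS can be taken as the Fortran intrinsic epsilon() for the working precision. The helper one_norm and the matrix values are hypothetical and are not part of the library.

```fortran
program ggsvp3_tolerances
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 3, p = 2, n = 3
  real(dp) :: a(m, n), b(p, n), tola, tolb, macheps

  ! Illustrative data only (stored column by column).
  a = reshape([1.0_dp, 2.0_dp, 3.0_dp, &
               4.0_dp, 5.0_dp, 6.0_dp, &
               7.0_dp, 8.0_dp, 9.0_dp], [m, n])
  b = reshape([1.0_dp, 0.0_dp, &
               0.0_dp, 1.0_dp, &
               2.0_dp, 2.0_dp], [p, n])

  ! Assumption: MACHEPS is taken as the spacing of reals near 1.
  macheps = epsilon(1.0_dp)

  ! tola = max(m,n)*norm(a)*MACHEPS, tolb = max(p,n)*norm(b)*MACHEPS
  tola = real(max(m, n), dp) * one_norm(a) * macheps
  tolb = real(max(p, n), dp) * one_norm(b) * macheps

  print *, 'tola =', tola
  print *, 'tolb =', tolb

contains

  ! Hypothetical helper: matrix 1-norm (largest absolute column sum).
  pure function one_norm(x) result(r)
    real(dp), intent(in) :: x(:, :)
    real(dp) :: r
    r = maxval(sum(abs(x), dim=1))
  end function one_norm

end program ggsvp3_tolerances
```

The resulting tola and tolb would then be passed unchanged to dggsvp3; larger values treat more of the matrix as numerically zero.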
for dggsvp3
Output Parameters
Application Notes
The subroutine uses LAPACK subroutine ?geqp3 for the QR factorization with column pivoting to detect the
effective numerical rank of the A matrix. It may be replaced by a better rank determination strategy.
?ggsvp3 replaces the deprecated subroutine ?ggsvp.
?ggsvd3
Computes generalized SVD.
Syntax
call sggsvd3(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, lwork, iwork, info)
call dggsvd3(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, lwork, iwork, info)
call cggsvd3(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, lwork, rwork, iwork, info)
call zggsvd3(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, lwork, rwork, iwork, info)
Include Files
mkl.fi
Description
?ggsvd3 computes the generalized singular value decomposition (GSVD) of an m-by-n real or complex matrix
A and p-by-n real or complex matrix B:
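$$
U^H A Q = D_1 \begin{pmatrix} 0 & R \end{pmatrix}, \qquad
V^H B Q = D_2 \begin{pmatrix} 0 & R \end{pmatrix}
$$
(with U^T and V^T in place of U^H and V^H for real flavors), where D1, D2, and R have the following structures.
If m - k - l ≥ 0,
$$
D_1 =
\begin{array}{c|cc}
      & k & l \\ \hline
k     & I & 0 \\
l     & 0 & C \\
m-k-l & 0 & 0
\end{array}
$$
$$
D_2 =
\begin{array}{c|cc}
    & k & l \\ \hline
l   & 0 & S \\
p-l & 0 & 0
\end{array}
$$
$$
\begin{pmatrix} 0 & R \end{pmatrix} =
\begin{array}{c|ccc}
  & n-k-l & k & l \\ \hline
k & 0 & R_{11} & R_{12} \\
l & 0 & 0 & R_{22}
\end{array}
$$
where
C = diag( alpha(k+1), ... , alpha(k+l) ),
S = diag( beta(k+1), ... , beta(k+l) ),
C^2 + S^2 = I.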
If m - k - l < 0,
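$$
D_1 =
\begin{array}{c|ccc}
    & k & m-k & k+l-m \\ \hline
k   & I & 0 & 0 \\
m-k & 0 & C & 0
\end{array}
$$
$$
D_2 =
\begin{array}{c|ccc}
      & k & m-k & k+l-m \\ \hline
m-k   & 0 & S & 0 \\
k+l-m & 0 & 0 & I \\
p-l   & 0 & 0 & 0
\end{array}
$$
$$
\begin{pmatrix} 0 & R \end{pmatrix} =
\begin{array}{c|cccc}
      & n-k-l & k & m-k & k+l-m \\ \hline
k     & 0 & R_{11} & R_{12} & R_{13} \\
m-k   & 0 & 0 & R_{22} & R_{23} \\
k+l-m & 0 & 0 & 0 & R_{33}
\end{array}
$$
where
C = diag(alpha(k + 1), ... , alpha(m)),
S = diag(beta(k + 1), ... , beta(m)),
C^2 + S^2 = I.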
The routine computes C, S, R, and optionally the orthogonal/unitary transformation matrices U, V and Q.
In particular, if B is an n-by-n nonsingular matrix, then the GSVD of A and B implicitly gives the SVD of
A*inv(B):
A*inv(B) = U*(D1*inv(D2))*V^T for real flavors
or
A*inv(B) = U*(D1*inv(D2))*V^H for complex flavors.
If (A^T, B^T)^T for real flavors or (A^H, B^H)^H for complex flavors has orthonormal columns, then the GSVD of A and
B is also equal to the CS decomposition of A and B. Furthermore, the GSVD can be used to derive the
solution of the eigenvalue problem:
A^T*A*x = λ*B^T*B*x for real flavors
or
A^H*A*x = λ*B^H*B*x for complex flavors.
In some literature, the GSVD of A and B is presented in the form
U^T*A*X = ( 0 D1 ), V^T*B*X = ( 0 D2 ) for real (A, B)
or
U^H*A*X = ( 0 D1 ), V^H*B*X = ( 0 D2 ) for complex (A, B)
where U and V are orthogonal and X is nonsingular, D1 and D2 are "diagonal". The former GSVD form can be
converted to the latter form by taking the nonsingular matrix X as
$$
X = Q \begin{pmatrix} I & 0 \\ 0 & R^{-1} \end{pmatrix}
$$
Input Parameters
jobq CHARACTER*1. = 'Q': Orthogonal/unitary matrix Q is computed;
= 'N': Q is not computed.
lda ≥ max(1,m).
ldb ≥ max(1,p).
for dggsvd3
Output Parameters
On exit, alpha and beta contain the generalized singular value pairs of a
and b;
alpha(1: k) = 1,
beta(1: k) = 0,
and if m - k - l ≥ 0,
alpha(k + 1:k + l) = C,
beta(k + 1:k + l) = S,
or if m - k - l < 0,
iwork On exit, iwork stores the sorting information. More precisely, the following
loop uses iwork to sort alpha:
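A minimal sketch of such a loop, following the form given in the reference LAPACK description of ?ggsvd3: for i = k+1, ..., min(m, k+l), swap alpha(i) with alpha(iwork(i)), after which the swapped entries are in non-increasing order. The sizes and the contents of alpha and iwork below are made-up illustrative values, not output captured from the routine.

```fortran
program sort_alpha_with_iwork
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Hypothetical sizes and values, chosen only for illustration.
  integer, parameter :: m = 5, n = 4, k = 1, l = 3
  real(dp) :: alpha(n), tmp
  integer  :: iwork(n), i

  alpha = [1.0_dp, 0.3_dp, 0.9_dp, 0.6_dp]  ! alpha(1:k) = 1, then unsorted values
  iwork = [0, 3, 4, 4]                      ! iwork(1:k) is not used by the loop

  ! Swap alpha(i) with alpha(iwork(i)) for i = k+1, ..., min(m, k+l).
  do i = k + 1, min(m, k + l)
    tmp = alpha(i)
    alpha(i) = alpha(iwork(i))
    alpha(iwork(i)) = tmp
  end do

  ! alpha(k+1:k+l) now holds 0.9, 0.6, 0.3.
  print '(4f6.2)', alpha
end program sort_alpha_with_iwork
```

The same index swaps can be applied to beta if the pairs (alpha(i), beta(i)) need to stay aligned.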
Application Notes
?ggsvd3 replaces the deprecated subroutine ?ggsvd.
?tgsja
Computes the generalized SVD of two upper triangular
or trapezoidal matrices.
Syntax
call stgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha, beta,
u, ldu, v, ldv, q, ldq, work, ncycle, info)
call dtgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha, beta,
u, ldu, v, ldv, q, ldq, work, ncycle, info)
call ctgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha, beta,
u, ldu, v, ldv, q, ldq, work, ncycle, info)
call ztgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha, beta,
u, ldu, v, ldv, q, ldq, work, ncycle, info)
call tgsja(a, b, tola, tolb, k, l [,u] [,v] [,q] [,jobu] [,jobv] [,jobq] [,alpha]
[,beta] [,ncycle] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the generalized singular value decomposition (GSVD) of two real/complex upper
triangular (or trapezoidal) matrices A and B. On entry, it is assumed that matrices A and B have the following
forms, which may be obtained by the preprocessing subroutine ggsvp from a general m-by-n matrix A and p-
by-n matrix B:
where the k-by-k matrix A12 and l-by-l matrix B13 are nonsingular upper triangular; A23 is l-by-l upper
triangular if m-k-l ≥ 0, otherwise A23 is (m-k)-by-l upper trapezoidal.
On exit,
U^H*A*Q = D1*(0 R), V^H*B*Q = D2*(0 R),
where U, V and Q are orthogonal/unitary matrices, R is a nonsingular upper triangular matrix, and D1 and D2
are "diagonal" matrices, which are of the following structures:
If m-k-l ≥ 0,
where
C = diag(alpha(k+1),...,alpha(k+l))
S = diag(beta(k+1),...,beta(k+l))
C^2 + S^2 = I
R is stored in a(1:k+l, n-k-l+1:n ) on exit.
If m-k-l < 0,
where
C = diag(alpha(k+1),...,alpha(m)),
S = diag(beta(k+1),...,beta(m)),
C^2 + S^2 = I
Input Parameters
jobv CHARACTER*1. Must be 'V', 'I', or 'N'.
If jobv = 'V', v must contain an orthogonal/unitary matrix V1 on entry.
k, l INTEGER. Specify the subblocks in the input matrices A and B, whose GSVD
is computed.
Output Parameters
alpha(k+1:k+l) = diag(C),
beta(k+1:k+l) = diag(S),
or if m-k-l < 0,
alpha(k+l+1:n)= 0 and
beta(k+l+1:n) = 0.
If jobv = 'V', v contains the product V1*V.
info INTEGER.
If info = 0, the execution is successful.
Note that there will be an error condition if jobv is present and v omitted.
See Also
Cosine-Sine Decomposition: LAPACK Driver Routines
?bbcsd
Computes the CS decomposition of an orthogonal/
unitary matrix in bidiagonal-block form.
Syntax
call sbbcsd( jobu1, jobu2, jobv1t, jobv2t, trans, m, p, q, theta, phi, u1, ldu1, u2,
ldu2, v1t, ldv1t, v2t, ldv2t, b11d, b11e, b12d, b12e, b21d, b21e, b22d, b22e, work,
lwork, info )
call dbbcsd( jobu1, jobu2, jobv1t, jobv2t, trans, m, p, q, theta, phi, u1, ldu1, u2,
ldu2, v1t, ldv1t, v2t, ldv2t, b11d, b11e, b12d, b12e, b21d, b21e, b22d, b22e, work,
lwork, info )
call cbbcsd( jobu1, jobu2, jobv1t, jobv2t, trans, m, p, q, theta, phi, u1, ldu1, u2,
ldu2, v1t, ldv1t, v2t, ldv2t, b11d, b11e, b12d, b12e, b21d, b21e, b22d, b22e, rwork,
rlwork, info )
call zbbcsd( jobu1, jobu2, jobv1t, jobv2t, trans, m, p, q, theta, phi, u1, ldu1, u2,
ldu2, v1t, ldv1t, v2t, ldv2t, b11d, b11e, b12d, b12e, b21d, b21e, b22d, b22e, rwork,
rlwork, info )
call bbcsd( theta,phi,u1,u2,v1t,v2t[,b11d][,b11e][,b12d][,b12e][,b21d][,b21e][,b22d]
[,b22e][,jobu1][,jobu2][,jobv1t][,jobv2t][,trans][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine ?bbcsd computes the CS decomposition of an orthogonal or unitary matrix in
bidiagonal-block form:
or
respectively.
x is m-by-m with the top-left block p-by-q. Note that q must not be larger than p, m-p, or m-q. If q is not
the smallest index, x must be transposed and/or permuted in constant time using the trans option. See
?orcsd/?uncsd for details.
The bidiagonal matrices b11, b12, b21, and b22 are represented implicitly by angles theta(1:q) and
phi(1:q-1).
The orthogonal/unitary matrices u1, u2, v1t, and v2t are input/output. The input matrices are pre- or post-
multiplied by the appropriate singular vector matrices.
Input Parameters
jobv1t CHARACTER. If equals Y, then v1t is updated. Otherwise, v1t is not updated.
jobv2t CHARACTER. If equals Y, then v2t is updated. Otherwise, v2t is not updated.
trans CHARACTER
ldu1 INTEGER. The leading dimension of the array u1, ldu1 ≥ max(1, p).
ldu2 INTEGER. The leading dimension of the array u2, ldu2 ≥ max(1, m-p).
ldv1t INTEGER. The leading dimension of the array v1t, ldv1t ≥ max(1, q).
ldv2t INTEGER. The leading dimension of the array v2t, ldv2t ≥ max(1, m-q).
Output Parameters
b12d REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (q).
When ?bbcsd converges, b12d contains the negative sines of
theta(1), ..., theta(q). If ?bbcsd fails to converge, b12d contains the
diagonal of the partially reduced top right block.
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value
> 0: if ?bbcsd did not converge, info specifies the number of nonzero
entries in phi, and b11d, b11e, etc. contain the partially reduced matrix.
See Also
?orcsd/?uncsd
xerbla
?orbdb/?unbdb
Simultaneously bidiagonalizes the blocks of a
partitioned orthogonal/unitary matrix.
Syntax
call sorbdb( trans, signs, m, p, q, x11, ldx11, x12, ldx12, x21, ldx21, x22, ldx22,
theta, phi, taup1, taup2, tauq1, tauq2, work, lwork, info )
call dorbdb( trans, signs, m, p, q, x11, ldx11, x12, ldx12, x21, ldx21, x22, ldx22,
theta, phi, taup1, taup2, tauq1, tauq2, work, lwork, info )
call cunbdb( trans, signs, m, p, q, x11, ldx11, x12, ldx12, x21, ldx21, x22, ldx22,
theta, phi, taup1, taup2, tauq1, tauq2, work, lwork, info )
call zunbdb( trans, signs, m, p, q, x11, ldx11, x12, ldx12, x21, ldx21, x22, ldx22,
theta, phi, taup1, taup2, tauq1, tauq2, work, lwork, info )
call orbdb( x11,x12,x21,x22,theta,phi,taup1,taup2,tauq1,tauq2[,trans][,signs][,info] )
call unbdb( x11,x12,x21,x22,theta,phi,taup1,taup2,tauq1,tauq2[,trans][,signs][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routines ?orbdb/?unbdb simultaneously bidiagonalize the blocks of an m-by-m partitioned orthogonal
matrix X:
or unitary matrix:
x11 is p-by-q. q must not be larger than p, m-p, or m-q. Otherwise, x must be transposed and/or permuted
in constant time using the trans and signs options.
The orthogonal/unitary matrices p1, p2, q1, and q2 are p-by-p, (m-p)-by-(m-p), q-by-q, (m-q)-by-(m-q),
respectively. They are represented implicitly by Householder vectors.
The bidiagonal matrices b11, b12, b21, and b22 are q-by-q bidiagonal matrices represented implicitly by angles
theta(1), ..., theta(q) and phi(1), ..., phi(q-1). b11 and b12 are upper bidiagonal, while b21 and b22 are
lower bidiagonal. Every entry in each bidiagonal band is a product of a sine or cosine of theta with a sine or
cosine of phi. See [Sutton09] for details.
Input Parameters
trans CHARACTER
signs CHARACTER
ldx11 INTEGER. The leading dimension of the array X11. If trans = 'N', ldx11 ≥ p.
Otherwise, ldx11 ≥ q.
ldx12 INTEGER. The leading dimension of the array X12. If trans = 'N', ldx12 ≥ p.
Otherwise, ldx12 ≥ m-q.
ldx21 INTEGER. The leading dimension of the array X21. If trans = 'N', ldx21 ≥ m-p.
Otherwise, ldx21 ≥ q.
ldx22 INTEGER. The leading dimension of the array X22. If trans = 'N', ldx22 ≥ m-p.
Otherwise, ldx22 ≥ m-q.
Output Parameters
If trans = 'N', the columns of the upper triangle of x12 specify the first p reflectors for q2;
otherwise (trans = 'T'), the columns of the lower triangle of x12 specify the first p reflectors for q2.
phi Holds the vector of length q-1.
See Also
?orcsd/?uncsd
?orgqr
?ungqr
?orglq
?unglq
xerbla
gelsy Computes the minimum-norm solution to a linear least squares problem using a
complete orthogonal factorization of A.
gelss Computes the minimum-norm solution to a linear least squares problem using the
singular value decomposition of A.
gelsd Computes the minimum-norm solution to a linear least squares problem using the
singular value decomposition of A and a divide and conquer method.
?gels
Uses QR or LQ factorization to solve an overdetermined
or underdetermined linear system with full rank
matrix.
Syntax
call sgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call dgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call cgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call zgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call gels(a, b [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine solves overdetermined or underdetermined real/ complex linear systems involving an m-by-n
matrix A, or its transpose/ conjugate-transpose, using a QR or LQ factorization of A. It is assumed that A has
full rank.
The following options are provided:
1. If trans = 'N' and m ≥ n: find the least squares solution of an overdetermined system, that is, solve the
least squares problem
minimize ||b - A*x||_2
2. If trans = 'N' and m < n: find the minimum norm solution of an underdetermined system A*X = B.
3. If trans = 'T' or 'C' and m ≥ n: find the minimum norm solution of an underdetermined system A^H*X = B.
4. If trans = 'T' or 'C' and m < n: find the least squares solution of an overdetermined system, that is,
solve the least squares problem
minimize ||b - A^H*x||_2
Several right hand side vectors b and solution vectors x can be handled in a single call; they are formed by
the columns of the right hand side matrix B and the solution matrix X (when the coefficient matrix is A, B is
m-by-nrhs and X is n-by-nrhs; if the coefficient matrix is A^T or A^H, B is n-by-nrhs and X is m-by-nrhs).
Input Parameters
If trans = 'T', the linear system involves the transposed matrix A^T (for
real flavors only);
If trans = 'C', the linear system involves the conjugate-transposed
matrix A^H (for complex flavors only).
lwork INTEGER. The size of the work array; must be at least min (m, n)+max(1,
m, n, nrhs).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
if trans = 'N' and m ≥ n, rows 1 to n of b contain the least squares solution
vectors; the residual sum of squares for the solution in each column is
given by the sum of squares of modulus of elements n+1 to m in that
column;
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For better performance, try using lwork = min (m, n)+max(1, m, n, nrhs)*blocksize, where
blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance of the
blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any of admissible lwork sizes, which is no less than the minimal value
described, the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
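The workspace query described above can be sketched as follows for dgels. This is an illustrative program rather than a recommended template: the data (a straight-line fit through four points that lie exactly on y = 1 + 2t) and the variable names wquery and work are assumptions made for the example, and the program must be linked against a LAPACK implementation such as Intel MKL.

```fortran
program gels_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 2, nrhs = 1
  real(dp) :: a(m, n), b(m, nrhs), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info
  external :: dgels

  ! Overdetermined system: fit y = c0 + c1*t through (0,1), (1,3), (2,5), (3,7).
  ! The points lie exactly on y = 1 + 2*t, so the least squares solution is c0 = 1, c1 = 2.
  a(:, 1) = 1.0_dp
  a(:, 2) = [0.0_dp, 1.0_dp, 2.0_dp, 3.0_dp]
  b(:, 1) = [1.0_dp, 3.0_dp, 5.0_dp, 7.0_dp]

  ! Step 1: workspace query (lwork = -1). Only wquery(1) is written; a and b are untouched.
  call dgels('N', m, n, nrhs, a, m, b, m, wquery, -1, info)
  if (info /= 0) error stop 'workspace query failed'
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))

  ! Step 2: the actual solve with the recommended workspace size.
  call dgels('N', m, n, nrhs, a, m, b, m, work, lwork, info)
  if (info /= 0) error stop 'dgels failed'

  ! For trans = 'N' and m >= n, rows 1 to n of b hold the solution.
  print '(a, 2f12.6)', 'solution: ', b(1:n, 1)
end program gels_workspace_query
```

The same two-call pattern applies to the other drivers in this section that accept lwork = -1; only the argument list changes.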
?gelsy
Computes the minimum-norm solution to a linear least
squares problem using a complete orthogonal
factorization of A.
Syntax
call sgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, info)
call dgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, info)
call cgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, rwork, info)
call zgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, rwork, info)
call gelsy(a, b [,rank] [,jpvt] [,rcond] [,info])
Include Files
mkl.fi, lapack.f90
Description
The ?gelsy routine computes the minimum-norm solution to a real/complex linear least squares problem:
with R11 defined as the largest leading submatrix whose estimated condition number is less than 1/rcond.
The order of R11, rank, is the effective rank of A. Then, R22 is considered to be negligible, and R12 is
annihilated by orthogonal/unitary transformations from the right, arriving at the complete orthogonal
factorization:
The ?gelsy routine is identical to the original deprecated ?gelsx routine except for the following
differences:
The call to the subroutine ?geqpf has been substituted by the call to the subroutine ?geqp3, which is a
BLAS-3 version of the QR factorization with column pivoting.
The matrix B (the right hand side) is updated with BLAS-3.
The permutation of the matrix B (the right hand side) is faster and simpler.
Input Parameters
jpvt INTEGER.
Array, size at least max(1, n).
On entry, if jpvt(i) ≠ 0, the i-th column of A is permuted to the front of AP,
otherwise the i-th column of A is a free column.
rwork REAL for cgelsy DOUBLE PRECISION for zgelsy. Workspace array, size at
least max(1, 2n). Used in complex flavors only.
Output Parameters
jpvt On exit, if jpvt(i)= k, then the i-th column of AP was the k-th column of A.
rank INTEGER. The effective rank of A, that is, the order of the submatrix R11.
This is the same as the order of the submatrix T11 in the complete
orthogonal factorization of A.
info INTEGER.
If info = 0, the execution is successful.
jpvt Holds the vector of length n. Default value for this element is jpvt(i) = 0.
Application Notes
For real flavors:
The unblocked strategy requires that:
lwork ≥ max( mn+3n+1, 2*mn + nrhs ),
where mn = min( m, n ).
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any of admissible lwork sizes, which is no less than the minimal value
described, the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
?gelss
Computes the minimum-norm solution to a linear least
squares problem using the singular value
decomposition of A.
Syntax
call sgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, info)
call dgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, info)
call cgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork, info)
call zgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork, info)
call gelss(a, b [,rank] [,s] [,rcond] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the minimum norm solution to a real linear least squares problem:
minimize ||b - A*x||_2
using the singular value decomposition (SVD) of A. A is an m-by-n matrix which may be rank-deficient.
Several right hand side vectors b and solution vectors x can be handled in a single call; they are stored as
the columns of the m-by-nrhs right hand side matrix B and the n-by-nrhs solution matrix X. The effective
rank of A is determined by treating as zero those singular values which are less than rcond times the largest
singular value.
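The rank rule above can be illustrated with a short sketch. It assumes the singular values are already available in non-increasing order in an array s, as ?gelss returns them; the values of s and rcond below are made up for the example.

```fortran
program effective_rank_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: s(4), rcond
  integer  :: rank

  ! Hypothetical singular values in non-increasing order, and a hypothetical threshold.
  s = [10.0_dp, 2.0_dp, 1.0e-9_dp, 0.0_dp]
  rcond = 1.0e-6_dp

  ! Singular values not exceeding rcond*s(1) are treated as zero.
  rank = count(s > rcond * s(1))

  print *, 'effective rank =', rank   ! 2 for the values above
end program effective_rank_demo
```

Here rcond*s(1) = 1.0e-5, so only the first two singular values count toward the rank.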
Input Parameters
(nrhs ≥ 0).
Output Parameters
a On exit, the first min(m, n) rows of a are overwritten with the matrix of
right singular vectors of A, stored row-wise.
rank INTEGER. The effective rank of A, that is, the number of singular values
which are greater than rcond *s(1).
info INTEGER.
If info = 0, the execution is successful.
If info = i, then the algorithm for computing the SVD failed to converge; i
indicates the number of off-diagonal elements of an intermediate bidiagonal
form which did not converge to zero.
Application Notes
For real flavors:
lwork ≥ 3*min(m, n) + max( 2*min(m, n), max(m, n), nrhs)
For complex flavors:
lwork ≥ 2*min(m, n) + max(m, n, nrhs)
For good performance, lwork should generally be larger.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any of admissible lwork sizes, which is no less than the minimal value
described, the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
?gelsd
Computes the minimum-norm solution to a linear least
squares problem using the singular value
decomposition of A and a divide and conquer method.
Syntax
call sgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, iwork, info)
call dgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, iwork, info)
call cgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork, iwork,
info)
call zgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork, iwork,
info)
call gelsd(a, b [,rank] [,s] [,rcond] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the minimum-norm solution to a real linear least squares problem:
minimize ||b - A*x||_2
using the singular value decomposition (SVD) of A. A is an m-by-n matrix which may be rank-deficient.
Several right hand side vectors b and solution vectors x can be handled in a single call; they are stored as
the columns of the m-by-nrhs right hand side matrix B and the n-by-nrhs solution matrix X.
The problem is solved in three steps:
1. Reduce the coefficient matrix A to bidiagonal form with Householder transformations, reducing the
original problem into a "bidiagonal least squares problem" (BLS).
2. Solve the BLS using a divide and conquer approach.
3. Apply back all the Householder transformations to solve the original least squares problem.
The effective rank of A is determined by treating as zero those singular values which are less than rcond
times the largest singular value.
The routine uses auxiliary routines lals0 and lalsa.
Input Parameters
Arrays:
a(lda,*) contains the m-by-n matrix A.
The second dimension of a must be at least max(1, n).
b(ldb,*) contains the m-by-nrhs right hand side matrix B.
The second dimension of b must be at least max(1, nrhs).
work is a workspace array, its dimension max(1, lwork).
iwork INTEGER. Workspace array. See Application Notes for the suggested
dimension of iwork.
Output Parameters
rank INTEGER. The effective rank of A, that is, the number of singular values
which are greater than rcond *s(1).
rwork(1) If info = 0, on exit, rwork(1) returns the minimum size of the workspace
array rwork required for optimum performance.
iwork(1) If info = 0, on exit, iwork(1) returns the minimum size of the workspace
array iwork required for optimum performance.
info INTEGER.
If info = 0, the execution is successful.
If info = i, then the algorithm for computing the SVD failed to converge; i
indicates the number of off-diagonal elements of an intermediate bidiagonal
form that did not converge to zero.
Application Notes
The divide and conquer algorithm makes very mild assumptions about floating point arithmetic. It will work
on machines with a guard digit in add/subtract. It could conceivably fail on hexadecimal or decimal machines
without guard digits, but we know of none.
The exact minimum amount of workspace needed depends on m, n and nrhs. The size lwork of the
workspace array work must be as given below.
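For real flavors:
If m ≥ n,
lwork ≥ 12n + 2n*smlsiz + 8n*nlvl + n*nrhs + (smlsiz+1)^2;
If m < n,
lwork ≥ 12m + 2m*smlsiz + 8m*nlvl + m*nrhs + (smlsiz+1)^2.
For complex flavors:
If m ≥ n,
lwork ≥ 2n + n*nrhs;
If m < n,
lwork ≥ 2m + m*nrhs;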
where smlsiz is returned by ilaenv and is equal to the maximum size of the subproblems at the bottom of
the computation tree (usually about 25), and
nlvl = INT( log2( min( m, n )/(smlsiz+1)) ) + 1.
For good performance, lwork should generally be larger.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any of admissible lwork sizes, which is no less than the minimal value
described, the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The dimension of the workspace array iwork must be at least
3*min( m, n )*nlvl + 11*min( m, n ).
The dimension of the workspace array rwork (for complex flavors) must be at least max(1, lrwork).
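As a worked instance of the nlvl and iwork formulas above, the following sketch evaluates them for one problem size. It assumes smlsiz = 25 (the text only says "usually about 25"; a real program would obtain it from ilaenv), uses real division inside the logarithm, and guards nlvl with max(0, ...) as the reference LAPACK source does. The sizes m and n are arbitrary.

```fortran
program gelsd_iwork_size
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 1000, n = 200
  integer, parameter :: smlsiz = 25   ! assumed value, normally returned by ilaenv
  integer :: minmn, nlvl, liwork

  minmn = min(m, n)

  ! nlvl = INT( log2( min(m,n)/(smlsiz+1) ) ) + 1, never negative.
  nlvl = max(0, int(log(real(minmn, dp) / real(smlsiz + 1, dp)) / log(2.0_dp)) + 1)

  ! Minimum iwork dimension: 3*min(m,n)*nlvl + 11*min(m,n).
  liwork = max(1, 3 * minmn * nlvl + 11 * minmn)

  print *, 'nlvl   =', nlvl     ! 3 for these sizes
  print *, 'liwork =', liwork   ! 4000 for these sizes
end program gelsd_iwork_size
```

For these sizes min(m, n)/(smlsiz+1) = 200/26, whose base-2 logarithm is slightly below 3, so nlvl = 2 + 1 = 3 and the iwork dimension is 3*200*3 + 11*200 = 4000.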
?getsls
Uses QR or LQ factorization to solve an
overdetermined or underdetermined linear system
with full rank matrix, with best performance for tall
and skinny matrices.
call sgetsls(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call dgetsls(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call cgetsls(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call zgetsls(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
Description
The routine solves overdetermined or underdetermined real/ complex linear systems involving an m-by-n
matrix A, or its transpose/conjugate-transpose, using a ?geqr or ?gelq factorization of A. It is assumed that
A has full rank.
The following options are provided:
1. If trans = 'N' and m ≥ n: find the least squares solution of an overdetermined system, that is, solve the
least squares problem
minimize ||b - A*x||_2
2. If trans = 'N' and m < n: find the minimum norm solution of an underdetermined system A*X = B.
3. If trans = 'T' or 'C' and m ≥ n: find the minimum norm solution of an underdetermined system A^H*X = B.
4. If trans = 'T' or 'C' and m < n: find the least squares solution of an overdetermined system, that is,
solve the least squares problem
minimize ||b - A^H*x||_2
Several right hand side vectors b and solution vectors x can be handled in a single call; they are formed by
the columns of the right hand side matrix B and the solution matrix X (when the coefficient matrix is A, B is
m-by-nrhs and X is n-by-nrhs; if the coefficient matrix is A^T or A^H, B is n-by-nrhs and X is m-by-nrhs).
Input Parameters
If trans = 'T', the linear system involves the transposed matrix A^T (for
real flavors only);
If trans = 'C', the linear system involves the conjugate-transposed
matrix A^H (for complex flavors only).
lwork INTEGER. The size of the work array; must be at least min (m, n)+max(1,
m, n, nrhs).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
if trans = 'N' and m ≥ n, rows 1 to n of b contain the least squares solution
vectors; the residual sum of squares for the solution in each column is
given by the sum of squares of modulus of elements n+1 to m in that
column;
if trans = 'N' and m < n, rows 1 to n of b contain the minimum norm
solution vectors;
if trans = 'T' or 'C' and m ≥ n, rows 1 to m of b contain the minimum
norm solution vectors;
if trans = 'T' or 'C' and m < n, rows 1 to m of b contain the least
squares solution vectors; the residual sum of squares for the solution in
each column is given by the sum of squares of modulus of elements m+1 to
n in that column.
info INTEGER.
If info = 0, the execution is successful.
gglse Solves the linear equality-constrained least squares problem using a generalized RQ
factorization.
?gglse
Solves the linear equality-constrained least squares
problem using a generalized RQ factorization.
Syntax
call sgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call dgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call cgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call zgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call gglse(a, b, c, d, x [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine solves the linear equality-constrained least squares (LSE) problem:
minimize ||c - A*x||_2 subject to B*x = d
where A is an m-by-n matrix, B is a p-by-n matrix, c is a given m-vector, and d is a given p-vector. It is
assumed that p ≤ n ≤ m+p, and
rank(B) = p and rank((A; B)) = n, where (A; B) denotes A stacked on top of B.
These conditions ensure that the LSE problem has a unique solution, which is obtained using a generalized
RQ factorization of the matrices (B, A) given by
B = (0 R)*Q, A = Z*T*Q
Input Parameters
c(*), size at least max(1, m), contains the right hand side vector for the
least squares part of the LSE problem.
d(*), size at least max(1, p), contains the right hand side vector for the
constrained equation.
work is a workspace array, its dimension max(1, lwork).
Output Parameters
a The elements on and above the diagonal contain the min(m, n)-by-n upper
trapezoidal matrix T as returned by ?ggrqf.
b On exit, the upper right triangle of the subarray b(1:p, n-p+1:n) contains
the p-by-p upper triangular matrix R as returned by ?ggrqf.
d On exit, d is destroyed.
c On exit, the residual sum-of-squares for the solution is given by the sum of
squares of elements n-p+1 to m of vector c.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For optimum performance, use
lwork ≥ p + min(m, n) + max(m, n)*nb,
where nb is an upper bound for the optimal blocksizes for ?geqrf, ?gerqf, ?ormqr/?unmqr and
?ormrq/?unmrq.
You may set lwork to -1. The routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
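A small end-to-end sketch of an LSE solve with dgglse, including the workspace query, is given below. The problem data are invented for illustration: x = (1, 2) satisfies both A*x = c and B*x = d exactly, so it is the solution the routine is expected to return. The names wquery and work are assumptions of the example, and the program must be linked against a LAPACK implementation such as Intel MKL.

```fortran
program gglse_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 3, n = 2, p = 1   ! p <= n <= m+p holds
  real(dp) :: a(m, n), b(p, n), c(m), d(p), x(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info
  external :: dgglse

  ! minimize ||c - A*x||_2 subject to B*x = d, with illustrative data.
  a(:, 1) = [1.0_dp, 0.0_dp, 1.0_dp]
  a(:, 2) = [0.0_dp, 1.0_dp, 1.0_dp]
  c       = [1.0_dp, 2.0_dp, 3.0_dp]
  b(1, :) = [1.0_dp, 1.0_dp]
  d       = [3.0_dp]

  ! Workspace query: lwork = -1 returns the recommended size in wquery(1).
  call dgglse(m, n, p, a, m, b, p, c, d, x, wquery, -1, info)
  if (info /= 0) error stop 'workspace query failed'
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))

  ! Solve. On exit a, b, c and d are overwritten as described under Output Parameters.
  call dgglse(m, n, p, a, m, b, p, c, d, x, work, lwork, info)
  if (info /= 0) error stop 'dgglse failed'

  print '(a, 2f12.6)', 'x = ', x
end program gglse_example
```

Because d is destroyed and c is overwritten on exit, copies should be kept if the original right hand sides are needed afterwards.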
?ggglm
Solves a general Gauss-Markov linear model problem
using a generalized QR factorization.
Syntax
call sggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call dggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call cggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call zggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call ggglm(a, b, d, x, y [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine solves a general Gauss-Markov linear model (GLM) problem:
minimize_x ||y||2 subject to d = A*x + B*y
where A is an n-by-m matrix, B is an n-by-p matrix, and d is a given n-vector. It is assumed that m ≤ n ≤ m+p,
and rank(A) = m and rank(A B) = n.
Under these assumptions, the constrained equation is always consistent, and there is a unique solution x and
a minimal 2-norm solution y, which is obtained using a generalized QR factorization of the matrices (A, B)
given by
A = Q*[R; 0], B = Q*T*Z
In particular, if matrix B is square nonsingular, then the problem GLM is equivalent to the following weighted
linear least squares problem
minimize_x ||B^(-1)*(d - A*x)||2.
Input Parameters
lwork INTEGER. The size of the work array; lwork ≥ max(1, n+m+p).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
a On exit, the upper triangular part of the array a contains the m-by-m upper
triangular matrix R.
d On exit, d is destroyed.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For optimum performance, use
lwork ≥ m + min(n, p) + max(n, p)*nb,
where nb is an upper bound for the optimal blocksizes for ?geqrf, ?gerqf, ?ormqr/?unmqr and ?ormrq/?unmrq.
You may set lwork to -1. The routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
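The following sketch shows one way to call dggglm with a workspace query. It is an illustration under stated assumptions, not a reference implementation: the data are made up, B is chosen as the identity so that the problem reduces to the ordinary least squares problem minimize ||d - A*x||2, and a LAPACK implementation such as Intel MKL is assumed to be linked. The program checks that the returned x and y satisfy d = A*x + B*y for the original data.

```fortran
program ggglm_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, m = 2, p = 3
  real(dp) :: a(n, m), b(n, p), d(n), x(m), y(p), wquery(1)
  real(dp) :: a0(n, m), b0(n, p), d0(n)
  real(dp), allocatable :: work(:)
  integer :: i, lwork, info
  external :: dggglm

  ! Made-up data; B = identity gives an ordinary least squares problem.
  a = reshape([1.0_dp, 0.0_dp, 1.0_dp, &
               0.0_dp, 1.0_dp, 1.0_dp], [n, m])
  b = 0.0_dp
  do i = 1, n
    b(i, i) = 1.0_dp
  end do
  d = [1.0_dp, 2.0_dp, 4.0_dp]

  ! Keep copies: a, b, and d are overwritten by the routine.
  a0 = a
  b0 = b
  d0 = d

  ! Workspace query, then the actual solve.
  lwork = -1
  call dggglm(n, m, p, a, n, b, n, d, x, y, wquery, lwork, info)
  if (info /= 0) error stop 'dggglm workspace query failed'
  lwork = max(1, n + m + p, int(wquery(1)))
  allocate(work(lwork))

  call dggglm(n, m, p, a, n, b, n, d, x, y, work, lwork, info)
  if (info /= 0) error stop 'dggglm failed'

  ! The constrained equation d = A*x + B*y should hold to rounding error.
  if (maxval(abs(d0 - matmul(a0, x) - matmul(b0, y))) > 1.0e-12_dp) &
    error stop 'constraint violated'

  print '(a, 2f12.6)', 'x = ', x
  print '(a, 3f12.6)', 'y = ', y
end program ggglm_sketch
```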
Routine Name    Operation performed
syevd/heevd     Computes all eigenvalues and (optionally) all eigenvectors of a real symmetric /
                Hermitian matrix using divide and conquer algorithm.
spevd/hpevd     Uses divide and conquer algorithm to compute all eigenvalues and (optionally) all
                eigenvectors of a real symmetric / Hermitian matrix held in packed storage.
sbev/hbev       Computes all eigenvalues and, optionally, eigenvectors of a real symmetric /
                Hermitian band matrix.
sbevd/hbevd     Computes all eigenvalues and (optionally) all eigenvectors of a real symmetric /
                Hermitian band matrix using divide and conquer algorithm.
stevd           Computes all eigenvalues and (optionally) all eigenvectors of a real symmetric
                tridiagonal matrix using divide and conquer algorithm.
?syev
Computes all eigenvalues and, optionally,
eigenvectors of a real symmetric matrix.
Syntax
call ssyev(jobz, uplo, n, a, lda, w, work, lwork, info)
call dsyev(jobz, uplo, n, a, lda, w, work, lwork, info)
call syev(a, w [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric matrix A.
Note that for most cases of real symmetric eigenvalue problems, the default choice should be the syevr
function, as its underlying algorithm is faster and uses less workspace.
Input Parameters
lwork INTEGER.
The dimension of the array work.
Constraint: lwork ≥ max(1, 3n-1).
Output Parameters
(if uplo = 'L') or the upper triangle (if uplo = 'U') of A, including the
diagonal, is overwritten.
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For optimum performance, set lwork ≥ (nb+2)*n, where nb is the blocksize for ?sytrd returned by ilaenv.
If you are unsure how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and lwork is no less than the minimal value described, the routine completes
the task, though probably not as fast as with the recommended workspace, and returns the recommended
workspace size in the first element of the array work on exit. Use this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace size in the
first element of the array work. This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
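The workspace query described above can be sketched as follows for dsyev. This is a minimal illustration, assuming a LAPACK implementation such as Intel MKL is linked; the 3-by-3 matrix is made-up test data.

```fortran
program syev_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n, n), w(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info
  external :: dsyev

  ! Made-up symmetric matrix; with uplo = 'U' only the upper triangle is read.
  a = reshape([2.0_dp, 1.0_dp, 0.0_dp, &
               1.0_dp, 2.0_dp, 1.0_dp, &
               0.0_dp, 1.0_dp, 2.0_dp], [n, n])

  ! Step 1: workspace query. With lwork = -1 nothing is computed except
  ! the recommended workspace size, returned in wquery(1).
  lwork = -1
  call dsyev('V', 'U', n, a, n, w, wquery, lwork, info)
  if (info /= 0) error stop 'dsyev workspace query failed'

  ! Step 2: allocate at least the minimal size max(1, 3n-1).
  lwork = max(1, 3*n - 1, int(wquery(1)))
  allocate(work(lwork))

  ! Step 3: compute eigenvalues (in w) and eigenvectors (overwriting a).
  call dsyev('V', 'U', n, a, n, w, work, lwork, info)
  if (info /= 0) error stop 'dsyev failed'

  print '(a, 3f12.6)', 'eigenvalues: ', w
end program syev_sketch
```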
?heev
Computes all eigenvalues and, optionally,
eigenvectors of a Hermitian matrix.
Syntax
call cheev(jobz, uplo, n, a, lda, w, work, lwork, rwork, info)
call zheev(jobz, uplo, n, a, lda, w, work, lwork, rwork, info)
call heev(a, w [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A.
Note that for most cases of complex Hermitian eigenvalue problems, the default choice should be the heevr
function, as its underlying algorithm is faster and uses less workspace.
Input Parameters
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
lwork INTEGER.
The dimension of the array work.
Constraint: lwork ≥ max(1, 2n-1).
Output Parameters
(if uplo = 'L') or the upper triangle (if uplo = 'U') of A, including the
diagonal, is overwritten.
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
LAPACK 95 Interface Notes
Routines in the Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or restorable arguments, see LAPACK 95
Interface Conventions.
Specific details for the routine heev interface are the following:
Application Notes
For optimum performance use
lwork ≥ (nb+1)*n,
where nb is the blocksize for ?hetrd returned by ilaenv.
If you are unsure how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and lwork is no less than the minimal value described, the routine completes
the task, though probably not as fast as with the recommended workspace, and returns the recommended
workspace size in the first element of the array work on exit. Use this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace size in the
first element of the array work. This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
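For the complex case, a hedged sketch of the same query pattern with zheev is shown below. The 2-by-2 Hermitian matrix is made-up test data with eigenvalues 1 and 3. The rwork size max(1, 3n-2) is taken from the standard LAPACK description of ?heev rather than from the text above, and a LAPACK implementation such as Intel MKL is assumed to be linked.

```fortran
program heev_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  complex(dp) :: a(n, n), wquery(1)
  complex(dp), allocatable :: work(:)
  real(dp) :: w(n), rwork(max(1, 3*n - 2))   ! size per standard LAPACK docs
  integer :: lwork, info
  external :: zheev

  ! Made-up Hermitian matrix [2 -i; i 2]; its eigenvalues are 1 and 3.
  a(1, 1) = (2.0_dp, 0.0_dp)
  a(1, 2) = (0.0_dp, -1.0_dp)
  a(2, 1) = (0.0_dp, 1.0_dp)
  a(2, 2) = (2.0_dp, 0.0_dp)

  ! Workspace query: the recommended size is the real part of wquery(1).
  lwork = -1
  call zheev('V', 'U', n, a, n, w, wquery, lwork, rwork, info)
  if (info /= 0) error stop 'zheev workspace query failed'
  lwork = max(1, 2*n - 1, int(real(wquery(1))))
  allocate(work(lwork))

  ! Eigenvalues are returned in w; eigenvectors overwrite a.
  call zheev('V', 'U', n, a, n, w, work, lwork, rwork, info)
  if (info /= 0) error stop 'zheev failed'

  ! Sanity check against the hand-computed eigenvalues.
  if (abs(w(1) - 1.0_dp) > 1.0e-12_dp .or. abs(w(2) - 3.0_dp) > 1.0e-12_dp) &
    error stop 'unexpected eigenvalues'

  print '(a, 2f12.6)', 'eigenvalues: ', w
end program heev_sketch
```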
?syevd
Computes all eigenvalues and, optionally, all
eigenvectors of a real symmetric matrix using divide
and conquer algorithm.
Syntax
call ssyevd(jobz, uplo, n, a, lda, w, work, lwork, iwork, liwork, info)
call dsyevd(jobz, uplo, n, a, lda, w, work, lwork, iwork, liwork, info)
call syevd(a, w [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a real symmetric matrix A.
In other words, it can compute the spectral factorization of A as: A = Z*Λ*Z^T.
Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λ_i, and Z is the orthogonal matrix
whose columns are the eigenvectors z_i. Thus,
A*z_i = λ_i*z_i for i = 1, 2, ..., n.
If the eigenvectors are requested, then this routine uses a divide and conquer algorithm to compute
eigenvalues and eigenvectors. However, if only eigenvalues are required, then it uses the Pal-Walker-Kahan
variant of the QL or QR algorithm.
Note that for most cases of real symmetric eigenvalue problems, the default choice should be the syevr
function, as its underlying algorithm is faster and uses less workspace. ?syevd requires more workspace but
is faster in some cases, especially for large matrices.
Input Parameters
lwork INTEGER.
The dimension of the array work.
Constraints:
if n ≤ 1, then lwork ≥ 1;
iwork INTEGER.
Workspace array, its dimension max(1, liwork).
liwork INTEGER.
The dimension of the array iwork.
Constraints:
if n ≤ 1, then liwork ≥ 1;
Output Parameters
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if liwork > 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A+E such that ||E||2 = O(ε)*||A||2,
where ε is the machine precision.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run, or
set lwork = -1 (liwork = -1).
If lwork (or liwork) is no less than the minimal value described, then the routine completes the task, though
probably not as fast as with the recommended workspace, and returns the recommended workspace size in the
first element of the corresponding array (work, iwork) on exit. Use this value (work(1), iwork(1)) for
subsequent runs.
If lwork = -1 (liwork = -1), then the routine returns immediately and provides the recommended
workspace size in the first element of the corresponding array (work, iwork). This operation is called a
workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, then the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
The complex analogue of this routine is heevd.
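As a sketch of how both workspace sizes can be queried in a single call, the example below runs dsyevd on a made-up 4-by-4 Hilbert matrix and then checks the relation A*z_i = λ_i*z_i for each computed pair. It assumes a LAPACK implementation such as Intel MKL is linked; the tolerance is an arbitrary illustrative choice.

```fortran
program syevd_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n, n), a0(n, n), w(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: i, j, lwork, liwork, iquery(1), info
  external :: dsyevd

  ! Made-up symmetric test matrix (Hilbert matrix).
  do j = 1, n
    do i = 1, n
      a(i, j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do
  a0 = a   ! keep a copy; a is overwritten by the eigenvectors

  ! One query call returns both sizes: wquery(1) and iquery(1).
  lwork = -1
  liwork = -1
  call dsyevd('V', 'U', n, a, n, w, wquery, lwork, iquery, liwork, info)
  if (info /= 0) error stop 'dsyevd workspace query failed'
  lwork = max(1, int(wquery(1)))
  liwork = max(1, iquery(1))
  allocate(work(lwork), iwork(liwork))

  call dsyevd('V', 'U', n, a, n, w, work, lwork, iwork, liwork, info)
  if (info /= 0) error stop 'dsyevd failed'

  ! Verify A*z_i = lambda_i*z_i column by column.
  do i = 1, n
    if (maxval(abs(matmul(a0, a(:, i)) - w(i) * a(:, i))) > 1.0e-10_dp) &
      error stop 'eigenpair residual too large'
  end do

  print '(a, 4es14.6)', 'eigenvalues: ', w
end program syevd_sketch
```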
?heevd
Computes all eigenvalues and, optionally, all
eigenvectors of a complex Hermitian matrix using
divide and conquer algorithm.
Syntax
call cheevd(jobz, uplo, n, a, lda, w, work, lwork, rwork, lrwork, iwork, liwork, info)
call zheevd(jobz, uplo, n, a, lda, w, work, lwork, rwork, lrwork, iwork, liwork, info)
call heevd(a, w [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a complex Hermitian matrix
A. In other words, it can compute the spectral factorization of A as: A = Z*Λ*Z^H.
Here Λ is a real diagonal matrix whose diagonal elements are the eigenvalues λ_i, and Z is the (complex)
unitary matrix whose columns are the eigenvectors z_i. Thus,
A*z_i = λ_i*z_i for i = 1, 2, ..., n.
If the eigenvectors are requested, then this routine uses a divide and conquer algorithm to compute
eigenvalues and eigenvectors. However, if only eigenvalues are required, then it uses the Pal-Walker-Kahan
variant of the QL or QR algorithm.
Note that for most cases of complex Hermitian eigenvalue problems, the default choice should be the heevr
function, as its underlying algorithm is faster and uses less workspace. ?heevd requires more workspace but
is faster in some cases, especially for large matrices.
Input Parameters
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
lwork INTEGER.
The dimension of the array work. Constraints:
if n ≤ 1, then lwork ≥ 1;
lrwork INTEGER.
The dimension of the array rwork. Constraints:
if n ≤ 1, then lrwork ≥ 1;
liwork INTEGER.
The dimension of the array iwork. Constraints: if n ≤ 1, then liwork ≥ 1;
Output Parameters
a If jobz = 'V', then on exit this array is overwritten by the unitary matrix
Z which contains the eigenvectors of A.
work(1) On exit, if lwork > 0, then the real part of work(1) returns the required
minimal size of lwork.
rwork(1) On exit, if lrwork > 0, then rwork(1) returns the required minimal size of
lrwork.
iwork(1) On exit, if liwork > 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
If info = i, and jobz = 'N', then the algorithm failed to converge; i off-diagonal
elements of an intermediate tridiagonal form did not converge to zero;
if info = i, and jobz = 'V', then the algorithm failed to compute an
eigenvalue while working on the submatrix lying in rows and columns
info/(n+1) through mod(info, n+1).
If info = -i, the i-th parameter had an illegal value.
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A + E such that ||E||2 = O(ε)*||A||2,
where ε is the machine precision.
If you are unsure how much workspace to supply, use a generous value of lwork (liwork or lrwork) for the
first run or set lwork = -1 (liwork = -1, lrwork = -1).
If you choose the first option and lwork (liwork or lrwork) is no less than the minimal value described, the
routine completes the task, though probably not as fast as with the recommended workspace, and returns the
recommended workspace size in the first element of the corresponding array (work, iwork, rwork) on exit.
Use this value (work(1), iwork(1), rwork(1)) for subsequent runs.
If you set lwork = -1 (liwork = -1, lrwork = -1), the routine returns immediately and provides the
recommended workspace size in the first element of the corresponding array (work, iwork, rwork). This
operation is called a workspace query.
Note that if you set lwork (liwork, lrwork) to less than the minimal required value and not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
The real analogue of this routine is syevd. See also hpevd for matrices held in packed storage, and hbevd for
banded matrices.
?syevx
Computes selected eigenvalues and, optionally,
eigenvectors of a symmetric matrix.
Syntax
call ssyevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work,
lwork, iwork, ifail, info)
call dsyevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work,
lwork, iwork, ifail, info)
call syevx(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric matrix A.
Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices for
the desired eigenvalues.
Note that for most cases of real symmetric eigenvalue problems, the default choice should be the syevr
function, as its underlying algorithm is faster and uses less workspace. ?syevx is faster for a few selected
eigenvalues.
Input Parameters
If range = 'V', all eigenvalues in the half-open interval (vl, vu] will be
found.
If range = 'I', the eigenvalues with indices il through iu will be found.
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
il, iu INTEGER.
If range = 'I', the indices of the smallest and largest eigenvalues to be
returned.
Constraints: 1 ≤ il ≤ iu ≤ n, if n > 0;
il = 1 and iu = 0, if n = 0.
Not referenced if range = 'A' or 'V'.
lwork INTEGER.
The dimension of the array work.
If n ≤ 1, then lwork ≥ 1; otherwise lwork ≥ 8*n.
Output Parameters
a On exit, the lower triangle (if uplo = 'L') or the upper triangle (if uplo =
'U') of A, including the diagonal, is overwritten.
Note: you must ensure that at least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known in advance and
an upper bound must be used.
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, then ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
jobz Restored based on the presence of the argument z as follows: jobz = 'V', if z
is present; jobz = 'N', if z is omitted. Note that there will be an error condition
if ifail is present and z is omitted.
range Restored based on the presence of arguments vl, vu, il, iu as follows: range =
'V', if one of or both vl and vu are present; range = 'I', if one of or both il
and iu are present; range = 'A', if none of vl, vu, il, iu is present. Note that
there will be an error condition if one of or both vl and vu are present and at the
same time one of or both il and iu are present.
Application Notes
For optimum performance use lwork ≥ (nb+3)*n, where nb is the maximum of the blocksize for ?sytrd
and ?ormtr returned by ilaenv.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run or set lwork
= -1.
If lwork is no less than the minimal value described, then the routine completes the task, though probably
not as fast as with the recommended workspace, and returns the recommended workspace size in the first
element of the array work on exit. Use this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace size in the
first element of the array work. This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T|| is used as tolerance, where ||T|| is the 1-norm of the
tridiagonal matrix obtained by reducing A to tridiagonal form. Eigenvalues are computed most accurately
when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
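The abstol recommendation and the index-range selection can be sketched as follows for dsyevx. The tridiagonal test matrix is made up, the iwork size 5*n is taken from the standard LAPACK description of ?syevx rather than from the text above, and a LAPACK implementation such as Intel MKL is assumed to be linked.

```fortran
program syevx_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, il = 1, iu = 2
  real(dp) :: a(n, n), w(n), z(n, iu - il + 1), wquery(1)
  real(dp) :: vl, vu, abstol
  real(dp), allocatable :: work(:)
  integer :: iwork(5*n), ifail(n), m, lwork, info, i
  real(dp), external :: dlamch
  external :: dsyevx

  ! Made-up symmetric tridiagonal matrix; only the upper triangle is set.
  a = 0.0_dp
  do i = 1, n
    a(i, i) = 2.0_dp
    if (i < n) a(i, i + 1) = -1.0_dp
  end do

  ! Twice the underflow threshold, as recommended for best accuracy.
  abstol = 2.0_dp * dlamch('S')
  vl = 0.0_dp   ! vl and vu are not referenced when range = 'I'
  vu = 0.0_dp

  ! Workspace query.
  lwork = -1
  call dsyevx('V', 'I', 'U', n, a, n, vl, vu, il, iu, abstol, m, w, z, n, &
              wquery, lwork, iwork, ifail, info)
  if (info /= 0) error stop 'dsyevx workspace query failed'
  lwork = max(1, 8*n, int(wquery(1)))
  allocate(work(lwork))

  ! Compute the il-th through iu-th smallest eigenvalues and their vectors.
  call dsyevx('V', 'I', 'U', n, a, n, vl, vu, il, iu, abstol, m, w, z, n, &
              work, lwork, iwork, ifail, info)
  if (info > 0) then
    ! ifail lists the indices of eigenvectors that failed to converge.
    print '(a, *(i4))', 'failed eigenvectors: ', ifail(1:info)
    error stop 'dsyevx: some eigenvectors did not converge'
  else if (info < 0) then
    error stop 'dsyevx: illegal argument'
  end if

  print '(a, i0)', 'eigenvalues found: ', m
  print '(a, *(f12.6))', 'w = ', w(1:m)
end program syevx_sketch
```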
?heevx
Computes selected eigenvalues and, optionally,
eigenvectors of a Hermitian matrix.
Syntax
call cheevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work,
lwork, rwork, iwork, ifail, info)
call zheevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work,
lwork, rwork, iwork, ifail, info)
call heevx(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A.
Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices for
the desired eigenvalues.
Note that for most cases of complex Hermitian eigenvalue problems, the default choice should be the heevr
function, as its underlying algorithm is faster and uses less workspace. ?heevx is faster for a few selected
eigenvalues.
Input Parameters
If range = 'V', all eigenvalues in the half-open interval (vl, vu] will be
found.
If range = 'I', the eigenvalues with indices il through iu will be found.
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
il, iu INTEGER.
If range = 'I', the indices of the smallest and largest eigenvalues to be
returned. Constraints:
1 ≤ il ≤ iu ≤ n, if n > 0; il = 1 and iu = 0, if n = 0. Not referenced if range =
'A' or 'V'.
lwork INTEGER.
The dimension of the array work.
lwork ≥ 1 if n ≤ 1; otherwise at least 2*n.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
a On exit, the lower triangle (if uplo = 'L') or the upper triangle (if uplo =
'U') of A, including the diagonal, is overwritten.
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, then ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
jobz Restored based on the presence of the argument z as follows: jobz = 'V', if z
is present; jobz = 'N', if z is omitted. Note that there will be an error condition
if ifail is present and z is omitted.
range Restored based on the presence of arguments vl, vu, il, iu as follows: range =
'V', if one of or both vl and vu are present; range = 'I', if one of or both il
and iu are present; range = 'A', if none of vl, vu, il, iu is present. Note that
there will be an error condition if one of or both vl and vu are present and at the
same time one of or both il and iu are present.
Application Notes
For optimum performance use lwork ≥ (nb+1)*n, where nb is the maximum of the blocksize for ?hetrd
and ?unmtr returned by ilaenv.
If you are unsure how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and lwork is no less than the minimal value described, the routine completes
the task, though probably not as fast as with the recommended workspace, and returns the recommended
workspace size in the first element of the array work on exit. Use this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace size in the
first element of the array work. This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T|| will be used in its place, where ||T|| is the 1-norm of
the tridiagonal matrix obtained by reducing A to tridiagonal form. Eigenvalues will be computed most
accurately when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
?syevr
Computes selected eigenvalues and, optionally,
eigenvectors of a real symmetric matrix using the
Relatively Robust Representations.
Syntax
call ssyevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz,
isuppz, work, lwork, iwork, liwork, info)
call dsyevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz,
isuppz, work, lwork, iwork, liwork, info)
call syevr(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric matrix A.
Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices for
the desired eigenvalues.
The routine first reduces the matrix A to tridiagonal form T. Then, whenever possible, ?syevr calls stemr to
compute the eigenspectrum using Relatively Robust Representations. stemr computes eigenvalues by the
dqds algorithm, while orthogonal eigenvectors are computed from various "good" L*D*L^T representations
(also known as Relatively Robust Representations). Gram-Schmidt orthogonalization is avoided as far as
possible. More specifically, the various steps of the algorithm are as follows. For each unreduced block of
T:
a. Compute T - σ*I = L*D*L^T, so that L and D define all the wanted eigenvalues to high relative
accuracy. This means that small relative changes in the entries of D and L cause only small relative
changes in the eigenvalues and eigenvectors. The standard (unfactored) representation of the
tridiagonal matrix T does not have this property in general.
b. Compute the eigenvalues to suitable accuracy. If the eigenvectors are desired, the algorithm attains full
accuracy of the computed eigenvalues only right before the corresponding vectors have to be
computed, see Steps c) and d).
c. For each cluster of close eigenvalues, select a new shift close to the cluster, find a new factorization,
and refine the shifted eigenvalues to suitable accuracy.
d. For each eigenvalue with a large enough relative separation, compute the corresponding eigenvector by
forming a rank revealing twisted factorization. Go back to Step c) for any clusters that remain.
The desired accuracy of the output can be specified by the input parameter abstol.
The routine ?syevr calls stemr when the full spectrum is requested on machines that conform to the
IEEE-754 floating point standard. ?syevr calls stebz and stein on non-IEEE machines and when partial
spectrum requests are made.
Note that ?syevr is preferable for most cases of real symmetric eigenvalue problems as its underlying
algorithm is fast and uses less workspace.
Input Parameters
For range = 'V' or 'I' and iu-il < n-1, sstebz/dstebz and sstein/dstein
are called.
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
If range = 'V', the lower and upper bounds of the interval to be searched
for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint:
1 ≤ il ≤ iu ≤ n, if n > 0;
il = 1 and iu = 0, if n = 0.
If range = 'A' or 'V', il and iu are not referenced.
lwork INTEGER.
The dimension of the array work.
Constraint: lwork ≥ max(1, 26n).
liwork INTEGER.
The dimension of the array iwork, liwork ≥ max(1, 10n).
Output Parameters
a On exit, the lower triangle (if uplo = 'L') or the upper triangle (if uplo =
'U') of A, including the diagonal, is overwritten.
If jobz = 'N', then z is not referenced. Note that you must ensure that at
least max(1, m) columns are supplied in the array z; if range = 'V', the
exact value of m is not known in advance and an upper bound must be
used.
isuppz INTEGER.
Array, size at least 2*max(1, m).
The support of the eigenvectors in z, i.e., the indices indicating the nonzero
elements in z. The i-th eigenvector is nonzero only in elements
isuppz(2i-1) through isuppz(2i). Referenced only if eigenvectors
are needed (jobz = 'V') and all eigenvalues are needed, that is, range =
'A' or range = 'I' and il = 1 and iu = n.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
LAPACK 95 Interface Notes
Routines in the Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or restorable arguments, see LAPACK 95
Interface Conventions.
Specific details for the routine syevr interface are the following:
z Holds the matrix Z of size (n, n), where the values n and m are significant.
isuppz Holds the vector of length (2*m), where the values (2*m) are significant.
jobz Restored based on the presence of the argument z as follows: jobz = 'V', if z
is present; jobz = 'N', if z is omitted. Note that there will be an error condition
if isuppz is present and z is omitted.
range Restored based on the presence of arguments vl, vu, il, iu as follows: range =
'V', if one of or both vl and vu are present; range = 'I', if one of or both il
and iu are present; range = 'A', if none of vl, vu, il, iu is present. Note that
there will be an error condition if one of or both vl and vu are present and at the
same time one of or both il and iu are present.
Application Notes
For optimum performance use lwork ≥ (nb+6)*n, where nb is the maximum of the blocksize for ?sytrd
and ?ormtr returned by ilaenv.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run or
set lwork = -1 (liwork = -1).
If lwork (or liwork) is no less than the minimal value described, then the routine completes the task, though
probably not as fast as with the recommended workspace, and returns the recommended workspace size in the
first element of the corresponding array (work, iwork) on exit. Use this value (work(1), iwork(1)) for
subsequent runs.
If lwork = -1 (liwork = -1), then the routine returns immediately and provides the recommended
workspace size in the first element of the corresponding array (work, iwork). This operation is called a
workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, then the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
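A hedged sketch of a full-spectrum call to dsyevr, including the combined lwork/liwork query and the isuppz output, is given below. The test matrix is made up, abstol = 0 is used here to request the default tolerance (an assumption based on the standard LAPACK description), and a LAPACK implementation such as Intel MKL is assumed to be linked.

```fortran
program syevr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n, n), w(n), z(n, n), wquery(1), vl, vu, abstol
  real(dp), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: isuppz(2*n), iquery(1), il, iu, m, lwork, liwork, info, i
  external :: dsyevr

  ! Made-up symmetric tridiagonal matrix; only the upper triangle is set.
  a = 0.0_dp
  do i = 1, n
    a(i, i) = 2.0_dp
    if (i < n) a(i, i + 1) = -1.0_dp
  end do

  ! range = 'A': vl, vu, il, iu are not referenced, but must be passed.
  vl = 0.0_dp
  vu = 0.0_dp
  il = 1
  iu = n
  abstol = 0.0_dp   ! assumption: non-positive value selects the default

  ! Query both workspace sizes in one call.
  lwork = -1
  liwork = -1
  call dsyevr('V', 'A', 'U', n, a, n, vl, vu, il, iu, abstol, m, w, z, n, &
              isuppz, wquery, lwork, iquery, liwork, info)
  if (info /= 0) error stop 'dsyevr workspace query failed'
  lwork = max(1, 26*n, int(wquery(1)))
  liwork = max(1, 10*n, iquery(1))
  allocate(work(lwork), iwork(liwork))

  call dsyevr('V', 'A', 'U', n, a, n, vl, vu, il, iu, abstol, m, w, z, n, &
              isuppz, work, lwork, iwork, liwork, info)
  if (info /= 0) error stop 'dsyevr failed'

  print '(a, i0)', 'eigenvalues found: ', m
  print '(a, *(f12.6))', 'w = ', w(1:m)
  ! Eigenvector i is nonzero only in rows isuppz(2i-1) through isuppz(2i).
  do i = 1, m
    print '(a, i0, a, i0, a, i0)', 'support of z(:,', i, '): ', &
          isuppz(2*i - 1), ' to ', isuppz(2*i)
  end do
end program syevr_sketch
```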
?heevr
Computes selected eigenvalues and, optionally,
eigenvectors of a Hermitian matrix using the
Relatively Robust Representations.
Syntax
call cheevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz,
isuppz, work, lwork, rwork, lrwork, iwork, liwork, info)
call zheevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz,
isuppz, work, lwork, rwork, lrwork, iwork, liwork, info)
call heevr(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A.
Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices for
the desired eigenvalues.
The routine first reduces the matrix A to tridiagonal form T with a call to hetrd. Then, whenever possible, ?
heevr calls stegr to compute the eigenspectrum using Relatively Robust Representations. ?stegr computes
eigenvalues by the dqds algorithm, while orthogonal eigenvectors are computed from various "good" L*D*LT
representations (also known as Relatively Robust Representations). Gram-Schmidt orthogonalization is
avoided as far as possible. More specifically, the various steps of the algorithm are as follows. For each
unreduced block (submatrix) of T:
a. Compute T - σ*I = L*D*L^T, so that L and D define all the wanted eigenvalues to high relative
accuracy. This means that small relative changes in the entries of D and L cause only small relative
changes in the eigenvalues and eigenvectors. The standard (unfactored) representation of the
tridiagonal matrix T does not have this property in general.
b. Compute the eigenvalues to suitable accuracy. If the eigenvectors are desired, the algorithm attains full
accuracy of the computed eigenvalues only right before the corresponding vectors have to be
computed, see Steps c) and d).
c. For each cluster of close eigenvalues, select a new shift close to the cluster, find a new factorization,
and refine the shifted eigenvalues to suitable accuracy.
d. For each eigenvalue with a large enough relative separation, compute the corresponding eigenvector by
forming a rank revealing twisted factorization. Go back to Step c) for any clusters that remain.
The desired accuracy of the output can be specified by the input parameter abstol.
The routine ?heevr calls stemr when the full spectrum is requested on machines which conform to the
IEEE-754 floating point standard, or stebz and stein on non-IEEE machines and when partial spectrum
requests are made.
Note that the routine ?heevr is preferable for most cases of complex Hermitian eigenvalue problems as its
underlying algorithm is fast and uses less workspace.
Input Parameters
range CHARACTER*1. Must be 'A' or 'V' or 'I'.
If range = 'A', the routine computes all eigenvalues.
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il = 1 and iu = 0 if n = 0.
lwork INTEGER.
The dimension of the array work.
Constraint: lwork ≥ max(1, 2n).
lrwork INTEGER.
The dimension of the array rwork;
lrwork ≥ max(1, 24n).
If lrwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work, rwork and iwork arrays, returns
these values as the first entries of the work, rwork and iwork arrays, and no
error message related to lwork or lrwork or liwork is issued by xerbla.
liwork INTEGER.
The dimension of the array iwork;
liwork ≥ max(1, 10n).
If liwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work, rwork and iwork arrays, returns
these values as the first entries of the work, rwork and iwork arrays, and no
error message related to lwork or lrwork or liwork is issued by xerbla.
Output Parameters
a On exit, the lower triangle (if uplo = 'L') or the upper triangle (if uplo =
'U') of A, including the diagonal, is overwritten.
0 ≤ m ≤ n.
If range = 'A', m = n, if range = 'I', m = iu-il+1, and if range =
'V' the exact value of m is not known in advance.
Note: you must ensure that at least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known in advance and
an upper bound must be used.
isuppz INTEGER.
Array, size at least 2*max(1, m).
The support of the eigenvectors in z, i.e., the indices indicating the nonzero
elements in z. The i-th eigenvector is nonzero only in elements
isuppz(2i-1) through isuppz(2i). Referenced only if eigenvectors
are needed (jobz = 'V') and all eigenvalues are needed, that is, range =
'A' or range = 'I' and il = 1 and iu = n.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
rwork(1) On exit, if info = 0, then rwork(1) returns the required minimal size of
lrwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
isuppz Holds the vector of length (2*n), where the values (2*m) are significant.
jobz Restored based on the presence of the argument z as follows: jobz = 'V', if z
is present, jobz = 'N', if z is omitted. Note that there will be an error condition
if isuppz is present and z is omitted.
range Restored based on the presence of arguments vl, vu, il, iu as follows: range =
'V', if one of or both vl and vu are present, range = 'I', if one of or both il
and iu are present, range = 'A', if none of vl, vu, il, iu is present. Note that
there will be an error condition if one of or both vl and vu are present and at the
same time one of or both il and iu are present.
Application Notes
For optimum performance use lwork ≥ (nb+1)*n, where nb is the maximum of the blocksize for ?hetrd
and ?unmtr returned by ilaenv.
If you are in doubt how much workspace to supply, use a generous value of lwork (or lrwork, or liwork) for
the first run or set lwork = -1 (lrwork = -1, liwork = -1).
If you choose the first option and set lwork (or lrwork, liwork) to any admissible size, that is, no less than
the minimal value described, the routine completes the task, though probably not as fast as with the
recommended workspace, and returns the recommended workspace size in the first element of the
corresponding array (work, rwork, iwork) on exit. Use this value (work(1), rwork(1), iwork(1)) for
subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work, rwork, iwork). This operation is called a workspace query.
Note that if you set lwork (lrwork, liwork) to less than the minimal required value and not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
Normal execution of ?stemr may create NaNs and infinities and hence may abort due to a floating point
exception in environments which do not handle NaNs and infinities in the IEEE standard default manner.
Inderjit Dhillon: "A new O(n^2) algorithm for the symmetric tridiagonal eigenvalue/eigenvector problem",
Computer Science Division Technical Report No. UCB/CSD-97-971, UC Berkeley, May 1997.
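The workspace query described in the Application Notes for ?heevr can be sketched as follows. This is a minimal illustration, not a reference implementation: the program name, the 3-by-3 test matrix, and the variable names wquery, rquery, and iquery are assumptions made for the example, and it assumes default (32-bit) INTEGER arguments and a link against Intel MKL or another LAPACK library.

```fortran
program heevr_workspace_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  complex(dp) :: a(n,n), z(n,n), wquery(1)
  real(dp)    :: w(n), rquery(1), vl, vu, abstol
  integer     :: isuppz(2*n), iquery(1)
  integer     :: il, iu, m, info, lwork, lrwork, liwork
  complex(dp), allocatable :: work(:)
  real(dp),    allocatable :: rwork(:)
  integer,     allocatable :: iwork(:)
  external    :: zheevr

  ! Hypothetical Hermitian test matrix. With uplo = 'U' only the upper
  ! triangle is referenced, so the strictly lower part is left as zero.
  a = (0.0_dp, 0.0_dp)
  a(1,1) = (2.0_dp, 0.0_dp);  a(1,2) = (0.0_dp, -1.0_dp)
  a(2,2) = (2.0_dp, 0.0_dp);  a(2,3) = (0.0_dp, -1.0_dp)
  a(3,3) = (2.0_dp, 0.0_dp)

  vl = 0.0_dp;  vu = 0.0_dp     ! not referenced when range = 'I'
  il = 1;  iu = 2               ! request the two smallest eigenvalues
  abstol = 0.0_dp               ! see the abstol parameter description

  ! Step 1: workspace query. All three size arguments are -1, so the
  ! routine only reports sizes in the first element of each array.
  call zheevr('V', 'I', 'U', n, a, n, vl, vu, il, iu, abstol, m, w, z, n, &
              isuppz, wquery, -1, rquery, -1, iquery, -1, info)
  if (info /= 0) stop 'zheevr workspace query failed'

  lwork  = max(1, int(real(wquery(1), dp)))   ! work is complex: use the real part
  lrwork = max(1, int(rquery(1)))
  liwork = max(1, iquery(1))
  allocate(work(lwork), rwork(lrwork), iwork(liwork))

  ! Step 2: the actual computation with the reported sizes.
  call zheevr('V', 'I', 'U', n, a, n, vl, vu, il, iu, abstol, m, w, z, n, &
              isuppz, work, lwork, rwork, lrwork, iwork, liwork, info)
  if (info /= 0) stop 'zheevr failed'

  print '(a,i0)', 'number of eigenvalues found: m = ', m
  print '(a,2f12.6)', 'w(1:m) = ', w(1:m)
end program heevr_workspace_sketch
```

Because a is overwritten on exit, keep a copy of the matrix if it is needed after the call. When linking against an ILP64 interface, the INTEGER declarations would have to be 64-bit instead.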
?spev
Computes all eigenvalues and, optionally,
eigenvectors of a real symmetric matrix in packed
storage.
Syntax
call sspev(jobz, uplo, n, ap, w, z, ldz, work, info)
call dspev(jobz, uplo, n, ap, w, z, ldz, work, info)
call spev(ap, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and, optionally, eigenvectors of a real symmetric matrix A in
packed storage.
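As an illustration of a call with packed storage, the sketch below packs the upper triangle of a small symmetric matrix column by column and passes it to dspev. The test matrix, the program name, and the helper array a are assumptions made for the example; the index formula is the standard LAPACK packed-storage convention for uplo = 'U', and the workspace size 3*n is the one given in the standard LAPACK description of dspev.

```fortran
program spev_packed_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), ap(n*(n+1)/2), w(n), z(n,n), work(3*n)
  integer  :: i, j, info
  external :: dspev

  ! Hypothetical full symmetric matrix, used only to fill the packed array.
  a = reshape([ 2.0_dp, -1.0_dp,  0.0_dp, &
               -1.0_dp,  2.0_dp, -1.0_dp, &
                0.0_dp, -1.0_dp,  2.0_dp ], [n, n])

  ! Upper-triangle packing, column by column:
  ! a(i,j) is stored in ap(i + (j-1)*j/2) for 1 <= i <= j.
  do j = 1, n
    do i = 1, j
      ap(i + (j-1)*j/2) = a(i,j)
    end do
  end do

  ! jobz = 'V': eigenvalues are returned in w, eigenvectors in the columns of z.
  call dspev('V', 'U', n, ap, w, z, n, work, info)
  if (info /= 0) stop 'dspev failed'

  print '(a,3f12.6)', 'eigenvalues: ', w
  do i = 1, n
    print '(3f12.6)', z(i,:)
  end do
end program spev_packed_sketch
```

The packed array needs only n*(n+1)/2 elements, which is the main reason to prefer ?spev over the full-storage drivers when memory is tight.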
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
jobz Restored based on the presence of the argument z as follows: jobz = 'V', if z
is present, jobz = 'N', if z is omitted.
?hpev
Computes all eigenvalues and, optionally,
eigenvectors of a Hermitian matrix in packed storage.
Syntax
call chpev(jobz, uplo, n, ap, w, z, ldz, work, rwork, info)
call zhpev(jobz, uplo, n, ap, w, z, ldz, work, rwork, info)
call hpev(ap, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A in
packed storage.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?spevd
Uses divide and conquer algorithm to compute all
eigenvalues and (optionally) all eigenvectors of a real
symmetric matrix held in packed storage.
Syntax
call sspevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, iwork, liwork, info)
call dspevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, iwork, liwork, info)
call spevd(ap, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a real symmetric matrix A
(held in packed storage). In other words, it can compute the spectral factorization of A as:
A = Z*Λ*Z^T.
Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λ_i, and Z is the orthogonal matrix
whose columns are the eigenvectors z_i. Thus,
A*z_i = λ_i*z_i for i = 1, 2, ..., n.
If the eigenvectors are requested, then this routine uses a divide and conquer algorithm to compute
eigenvalues and eigenvectors. However, if only eigenvalues are required, then it uses the Pal-Walker-Kahan
variant of the QL or QR algorithm.
Input Parameters
lwork INTEGER.
The dimension of the array work.
Constraints:
if n ≤ 1, then lwork ≥ 1;
if jobz = 'V' and n > 1, then lwork ≥ n^2 + 6*n + 1.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the required sizes of the work and iwork arrays, returns these
values as the first entries of the work and iwork arrays, and no error
message related to lwork or liwork is issued by xerbla. See Application
Notes for details.
liwork INTEGER.
The dimension of the array iwork.
Constraints:
if n ≤ 1, then liwork ≥ 1;
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A+E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any admissible size, that is, a size no less than the minimal value described, the
routine completes the task, though probably not as fast as with the recommended workspace, and returns the
recommended workspace size in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace
in the first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The complex analogue of this routine is hpevd.
See also syevd for matrices held in full storage, and sbevd for banded matrices.
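The two-step use of lwork = -1 and liwork = -1 described above might look like the following for dspevd. This is a sketch under stated assumptions: the test matrix a(i,j) = min(i,j), the program name, and the names wquery and iquery are hypothetical, and default (32-bit) INTEGER arguments are assumed. The lower-triangle index formula is the standard LAPACK packed-storage convention for uplo = 'L'.

```fortran
program spevd_workspace_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: ap(n*(n+1)/2), w(n), z(n,n), wquery(1)
  integer  :: iquery(1), lwork, liwork, info, i, j
  real(dp), allocatable :: work(:)
  integer,  allocatable :: iwork(:)
  external :: dspevd

  ! Lower-triangle packing, column by column:
  ! a(i,j) is stored in ap(i + (j-1)*(2*n-j)/2) for j <= i <= n.
  ! Hypothetical test matrix a(i,j) = min(i,j), which equals j when j <= i.
  do j = 1, n
    do i = j, n
      ap(i + (j-1)*(2*n-j)/2) = real(j, dp)
    end do
  end do

  ! Step 1: workspace query. No computation is performed; the sizes are
  ! returned in wquery(1) and iquery(1).
  call dspevd('V', 'L', n, ap, w, z, n, wquery, -1, iquery, -1, info)
  if (info /= 0) stop 'dspevd workspace query failed'

  lwork  = max(1, int(wquery(1)))
  liwork = max(1, iquery(1))
  allocate(work(lwork), iwork(liwork))

  ! Step 2: the actual computation. ap is overwritten on exit.
  call dspevd('V', 'L', n, ap, w, z, n, work, lwork, iwork, liwork, info)
  if (info /= 0) stop 'dspevd failed'

  print '(a,i0,a,i0)', 'lwork = ', lwork, ', liwork = ', liwork
  print '(a,4f12.6)', 'eigenvalues: ', w
end program spevd_workspace_sketch
```

Querying first avoids hard-coding the size formulas, at the cost of one extra call that does no numerical work.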
?hpevd
Uses divide and conquer algorithm to compute all
eigenvalues and, optionally, all eigenvectors of a
complex Hermitian matrix held in packed storage.
Syntax
call chpevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, rwork, lrwork, iwork, liwork,
info)
call zhpevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, rwork, lrwork, iwork, liwork,
info)
call hpevd(ap, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a complex Hermitian matrix
A (held in packed storage). In other words, it can compute the spectral factorization of A as: A = Z*Λ*Z^H.
Here Λ is a real diagonal matrix whose diagonal elements are the eigenvalues λ_i, and Z is the (complex)
unitary matrix whose columns are the eigenvectors z_i. Thus,
A*z_i = λ_i*z_i for i = 1, 2, ..., n.
If the eigenvectors are requested, then this routine uses a divide and conquer algorithm to compute
eigenvalues and eigenvectors. However, if only eigenvalues are required, then it uses the Pal-Walker-Kahan
variant of the QL or QR algorithm.
Input Parameters
lwork INTEGER.
The dimension of the array work.
Constraints:
if n ≤ 1, then lwork ≥ 1;
lrwork INTEGER.
The dimension of the array rwork. Constraints:
if n ≤ 1, then lrwork ≥ 1;
liwork INTEGER.
The dimension of the array iwork.
Constraints:
if n ≤ 1, then liwork ≥ 1;
Output Parameters
If jobz = 'V', then this array is overwritten by the unitary matrix Z which
contains the eigenvectors of A.
If jobz = 'N', then z is not referenced.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
rwork(1) On exit, if info = 0, then rwork(1) returns the required minimal size of
lrwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A + E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
If you are in doubt how much workspace to supply, use a generous value of lwork (liwork or lrwork) for the
first run or set lwork = -1 (liwork = -1, lrwork = -1).
If you choose the first option and set lwork (liwork or lrwork) to any admissible size, that is, no less than
the minimal value described, the routine completes the task, though probably not as fast as with the
recommended workspace, and returns the recommended workspace size in the first element of the
corresponding array (work, iwork, rwork) on exit. Use this value (work(1), iwork(1), rwork(1)) for
subsequent runs.
If you set lwork = -1 (liwork = -1, lrwork = -1), the routine returns immediately and provides the
recommended workspace in the first element of the corresponding array (work, iwork, rwork). This operation
is called a workspace query.
Note that if you set lwork (liwork, lrwork) to less than the minimal required value and not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
The real analogue of this routine is spevd.
See also heevd for matrices held in full storage, and hbevd for banded matrices.
?spevx
Computes selected eigenvalues and, optionally,
eigenvectors of a real symmetric matrix in packed
storage.
Syntax
call sspevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work,
iwork, ifail, info)
call dspevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work,
iwork, ifail, info)
call spevx(ap, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric matrix A in
packed storage. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a
range of indices for the desired eigenvalues.
Input Parameters
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il = 1 and iu = 0 if n = 0.
Output Parameters
Note: you must ensure that at least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known in advance and
an upper bound must be used.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present.
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||_1 will be used in its place, where T is the tridiagonal
matrix obtained by reducing A to tridiagonal form. Eigenvalues will be computed most accurately when abstol
is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
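The recommendation to set abstol to twice the underflow threshold can be sketched as follows for dspevx. The test matrix, the program name, and the choice of range = 'I' with il = 1 and iu = 2 are assumptions made for the example; the workspace sizes 8*n for work and 5*n for iwork follow the standard LAPACK description of dspevx, and default (32-bit) INTEGER arguments are assumed.

```fortran
program spevx_abstol_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: ap(n*(n+1)/2), w(n), z(n,n), work(8*n)
  real(dp) :: vl, vu, abstol
  integer  :: iwork(5*n), ifail(n), il, iu, m, info, i, j
  real(dp), external :: dlamch
  external :: dspevx

  ! Hypothetical test matrix a(i,j) = min(i,j), packed by columns from the
  ! upper triangle: a(i,j) is stored in ap(i + (j-1)*j/2) for i <= j.
  do j = 1, n
    do i = 1, j
      ap(i + (j-1)*j/2) = real(i, dp)
    end do
  end do

  vl = 0.0_dp;  vu = 0.0_dp        ! not referenced when range = 'I'
  il = 1;  iu = 2                  ! the two smallest eigenvalues
  abstol = 2.0_dp * dlamch('S')    ! twice the underflow threshold

  call dspevx('V', 'I', 'U', n, ap, vl, vu, il, iu, abstol, m, w, z, n, &
              work, iwork, ifail, info)

  if (info < 0) then
    print '(a,i0)', 'illegal argument number ', -info
  else if (info > 0) then
    ! Nonzero entries of ifail identify eigenvectors that did not converge.
    print '(a,i0)', 'eigenvectors that failed to converge: ', info
    print '(a,4i4)', 'their indices: ', pack(ifail(1:m), ifail(1:m) /= 0)
  else
    print '(a,i0)', 'm = ', m
    print '(a,2f14.8)', 'w(1:m) = ', w(1:m)
  end if
end program spevx_abstol_sketch
```

For the single-precision routine sspevx the corresponding call would be slamch('S').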
?hpevx
Computes selected eigenvalues and, optionally,
eigenvectors of a Hermitian matrix in packed storage.
Syntax
call chpevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work,
rwork, iwork, ifail, info)
call zhpevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work,
rwork, iwork, ifail, info)
call hpevx(ap, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A in
packed storage. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a
range of indices for the desired eigenvalues.
Input Parameters
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il = 1 and iu = 0 if n = 0.
Output Parameters
If jobz = 'V', then if info = 0, the first m columns of z contain the
orthonormal eigenvectors of the matrix A corresponding to the selected
eigenvalues, with the i-th column of z holding the eigenvector associated
with w(i).
If an eigenvector fails to converge, then that column of z contains the latest
approximation to the eigenvector, and the index of the eigenvector is
returned in ifail.
If jobz = 'N', then z is not referenced.
Note: you must ensure that at least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known in advance and
an upper bound must be used.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present.
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||_1 will be used in its place, where T is the tridiagonal
matrix obtained by reducing A to tridiagonal form. Eigenvalues will be computed most accurately when abstol
is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
?sbev
Computes all eigenvalues and, optionally,
eigenvectors of a real symmetric band matrix.
Syntax
call ssbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, info)
call dsbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, info)
call sbev(ab, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric band matrix A.
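A call to dsbev for a matrix with one sub-diagonal might be set up as in the sketch below. The test matrix (2 on the diagonal, -1 on the sub-diagonal) and the program name are assumptions made for the example. The layout ab(1+i-j, j) = a(i,j) for uplo = 'L' is the standard LAPACK band-storage convention, and the workspace size max(1, 3*n-2) follows the standard LAPACK description of dsbev; neither is restated in this section.

```fortran
program sbev_band_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5, kd = 1, ldab = kd + 1
  real(dp) :: ab(ldab,n), w(n), z(n,n), work(max(1, 3*n-2))
  integer  :: info, i
  external :: dsbev

  ! Lower band storage (uplo = 'L'):
  ! ab(1+i-j, j) = a(i,j) for j <= i <= min(n, j+kd).
  ab(1,:) =  2.0_dp     ! main diagonal a(j,j)
  ab(2,:) = -1.0_dp     ! sub-diagonal a(j+1,j); ab(2,n) is not referenced

  ! jobz = 'V': eigenvalues in w, eigenvectors in the columns of z.
  ! ab is overwritten on exit.
  call dsbev('V', 'L', n, kd, ab, ldab, w, z, n, work, info)
  if (info /= 0) stop 'dsbev failed'

  print '(a)', 'eigenvalues in ascending order:'
  do i = 1, n
    print '(f14.8)', w(i)
  end do
end program sbev_band_sketch
```

Only (kd+1)*n elements are stored, so the band drivers scale to much larger n than the full-storage drivers when kd is small.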
Input Parameters
If uplo = 'L', ab stores the lower triangular part of A.
Output Parameters
z(ldz,*).
The second dimension of z must be at least max(1, n).
If jobz = 'V', then if info = 0, z contains the orthonormal eigenvectors
of the matrix A, with the i-th column of z holding the eigenvector associated
with w(i).
info INTEGER.
If info = 0, the execution is successful.
?hbev
Computes all eigenvalues and, optionally,
eigenvectors of a Hermitian band matrix.
Syntax
call chbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, rwork, info)
call zhbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, rwork, info)
call hbev(ab, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a complex Hermitian band matrix A.
Input Parameters
kd INTEGER. The number of super- or sub-diagonals in A
(kd ≥ 0).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?sbevd
Computes all eigenvalues and, optionally, all
eigenvectors of a real symmetric band matrix using
divide and conquer algorithm.
Syntax
call ssbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, iwork, liwork, info)
call dsbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, iwork, liwork, info)
call sbevd(ab, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a real symmetric band
matrix A. In other words, it can compute the spectral factorization of A as:
A = Z*Λ*Z^T
Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λ_i, and Z is the orthogonal matrix
whose columns are the eigenvectors z_i. Thus,
A*z_i = λ_i*z_i for i = 1, 2, ..., n.
If the eigenvectors are requested, then this routine uses a divide and conquer algorithm to compute
eigenvalues and eigenvectors. However, if only eigenvalues are required, then it uses the Pal-Walker-Kahan
variant of the QL or QR algorithm.
Input Parameters
lwork INTEGER.
The dimension of the array work.
Constraints:
if n ≤ 1, then lwork ≥ 1;
liwork INTEGER.
Output Parameters
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if liwork > 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
ab Holds the array A of size (kd+1,n).
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A+E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any admissible size, that is, a size no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the corresponding array (work, iwork) on
exit. Use this value (work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace
in the first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The complex analogue of this routine is hbevd.
See also syevd for matrices held in full storage, and spevd for matrices held in packed storage.
?hbevd
Computes all eigenvalues and, optionally, all
eigenvectors of a complex Hermitian band matrix
using divide and conquer algorithm.
Syntax
call chbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, rwork, lrwork, iwork,
liwork, info)
call zhbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, rwork, lrwork, iwork,
liwork, info)
call hbevd(ab, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a complex Hermitian band
matrix A. In other words, it can compute the spectral factorization of A as: A = Z*Λ*Z^H.
Here Λ is a real diagonal matrix whose diagonal elements are the eigenvalues λ_i, and Z is the (complex)
unitary matrix whose columns are the eigenvectors z_i. Thus,
A*z_i = λ_i*z_i for i = 1, 2, ..., n.
Input Parameters
lwork INTEGER.
The dimension of the array work.
Constraints:
if n ≤ 1, then lwork ≥ 1;
rwork REAL for chbevd
DOUBLE PRECISION for zhbevd
Workspace array, size at least lrwork.
lrwork INTEGER.
The dimension of the array rwork.
Constraints:
if n ≤ 1, then lrwork ≥ 1;
liwork INTEGER.
The dimension of the array iwork.
Constraints:
if jobz = 'N' or n ≤ 1, then liwork ≥ 1;
Output Parameters
If jobz = 'V', then this array is overwritten by the unitary matrix Z which
contains the eigenvectors of A. The i-th column of Z contains the
eigenvector which corresponds to the eigenvalue w(i).
work(1) On exit, if lwork > 0, then the real part of work(1) returns the required
minimal size of lwork.
rwork(1) On exit, if lrwork > 0, then rwork(1) returns the required minimal size of
lrwork.
iwork(1) On exit, if liwork > 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A + E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
If you are in doubt how much workspace to supply, use a generous value of lwork (liwork or lrwork) for the
first run or set lwork = -1 (liwork = -1, lrwork = -1).
If you choose the first option and set lwork (liwork or lrwork) to any admissible size, that is, no less than
the minimal value described, the routine completes the task, though probably not as fast as with the
recommended workspace, and returns the recommended workspace size in the first element of the
corresponding array (work, iwork, rwork) on exit. Use this value (work(1), iwork(1), rwork(1)) for
subsequent runs.
If you set lwork = -1 (liwork = -1, lrwork = -1), the routine returns immediately and provides the
recommended workspace in the first element of the corresponding array (work, iwork, rwork). This operation
is called a workspace query.
Note that if you set lwork (liwork, lrwork) to less than the minimal required value and not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
The real analogue of this routine is sbevd.
See also heevd for matrices held in full storage, and hpevd for matrices held in packed storage.
?sbevx
Computes selected eigenvalues and, optionally,
eigenvectors of a real symmetric band matrix.
Syntax
call ssbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol, m, w,
z, ldz, work, iwork, ifail, info)
call dsbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol, m, w,
z, ldz, work, iwork, ifail, info)
call sbevx(ab, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric band matrix A.
Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices for
the desired eigenvalues.
Input Parameters
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il = 1 and iu = 0 if n = 0.
ldq, ldz INTEGER. The leading dimensions of the output arrays q and z, respectively.
Constraints:
ldq ≥ 1, ldz ≥ 1;
If jobz = 'V', then ldq ≥ max(1, n) and ldz ≥ max(1, n).
iwork INTEGER. Workspace array, size at least max(1, 5n).
Output Parameters
Note: you must ensure that at least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known in advance and
an upper bound must be used.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present.
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||_1 is used as tolerance, where T is the tridiagonal matrix
obtained by reducing A to tridiagonal form. Eigenvalues will be computed most accurately when abstol is set
to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
?hbevx
Computes selected eigenvalues and, optionally,
eigenvectors of a Hermitian band matrix.
Syntax
call chbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol, m, w,
z, ldz, work, rwork, iwork, ifail, info)
call zhbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol, m, w,
z, ldz, work, rwork, iwork, ifail, info)
call hbevx(ab, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian band matrix
A. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices
for the desired eigenvalues.
Input Parameters
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il = 1 and iu = 0 if n = 0.
ldq, ldz INTEGER. The leading dimensions of the output arrays q and z, respectively.
Constraints:
ldq ≥ 1, ldz ≥ 1;
If jobz = 'V', then ldq ≥ max(1, n) and ldz ≥ max(1, n).
Output Parameters
If jobz = 'N', the array q is not referenced.
Note: you must ensure that at least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known in advance and
an upper bound must be used.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present.
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||1 will be used in its place, where T is the tridiagonal
matrix obtained by reducing A to tridiagonal form. Eigenvalues will be computed most accurately when abstol
is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
?stev
Computes all eigenvalues and, optionally,
eigenvectors of a real symmetric tridiagonal matrix.
Syntax
call sstev(jobz, n, d, e, z, ldz, work, info)
call dstev(jobz, n, d, e, z, ldz, work, info)
call stev(d, e [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal matrix A.
Input Parameters
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V' then ldz ≥ max(1, n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
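A minimal sketch of calling the FORTRAN 77 interface dstev follows. The test matrix, the program name, and the array sizes are assumptions made for illustration: the work array is sized max(1, 2*n-2) as in the reference LAPACK documentation, and e is declared with n elements to be safe even though only the first n-1 are set.

```fortran
program stev_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: d(n), e(n), z(n,n), work(max(1, 2*n-2))
  integer :: info
  external :: dstev

  ! Tridiagonal matrix with 2 on the diagonal and -1 on the off-diagonals.
  ! Its eigenvalues are known analytically: 2 - 2*cos(k*pi/(n+1)), k = 1..n.
  d = 2.0_dp
  e = 0.0_dp
  e(1:n-1) = -1.0_dp

  ! jobz = 'V' requests eigenvectors; they are returned in the columns of z.
  call dstev('V', n, d, e, z, n, work, info)
  if (info /= 0) then
     print *, 'dstev failed, info =', info
     stop 1
  end if

  ! On successful exit d holds the eigenvalues in ascending order.
  print '(a,4f12.6)', 'eigenvalues: ', d
end program stev_sketch
```

The program must be linked against a LAPACK implementation such as Intel MKL.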
?stevd
Computes all eigenvalues and, optionally, all
eigenvectors of a real symmetric tridiagonal matrix
using divide and conquer algorithm.
Syntax
call sstevd(jobz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call dstevd(jobz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call stevd(d, e [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a real symmetric tridiagonal
matrix T. In other words, the routine can compute the spectral factorization of T as: T = Z*Λ*ZT.
Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λi, and Z is the orthogonal matrix
whose columns are the eigenvectors zi. Thus,
T*zi = λi*zi for i = 1, 2, ..., n.
If the eigenvectors are requested, then this routine uses a divide and conquer algorithm to compute
eigenvalues and eigenvectors. However, if only eigenvalues are required, then it uses the Pal-Walker-Kahan
variant of the QL or QR algorithm.
There is no complex analogue of this routine.
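As a small worked instance of the spectral factorization described above, consider the following 2-by-2 case. The matrix is chosen here purely for illustration and does not come from the reference.

```latex
T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad
\Lambda = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}, \quad
Z = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix},
\qquad
Z \Lambda Z^{T}
= \frac{1}{2}
  \begin{pmatrix} 1 & 3 \\ -1 & 3 \end{pmatrix}
  \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} = T .
```

The columns of Z are the unit eigenvectors for the eigenvalues 1 and 3, listed in ascending order.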
Input Parameters
lwork INTEGER.
The dimension of the array work.
Constraints:
if jobz = 'N' or n ≤ 1, then lwork ≥ 1;
liwork INTEGER.
The dimension of the array iwork.
Constraints:
Output Parameters
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if liwork > 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n).
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix T+E such that ||E||2 = O(ε)*||T||2,
where ε is the machine precision.
If λi is an exact eigenvalue, and μi is the corresponding computed value, then
|μi - λi| ≤ c(n)*ε*||T||2
where c(n) is a modestly increasing function of n.
If zi is the corresponding exact eigenvector, and wi is the corresponding computed vector, then the angle
θ(zi, wi) between them is bounded as follows:
θ(zi, wi) ≤ c(n)*ε*||T||2 / min i≠j|λi - λj|.
Thus the accuracy of a computed eigenvector depends on the gap between its eigenvalue and all the other
eigenvalues.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run, or
set lwork = -1 (liwork = -1).
If lwork (or liwork) is set to any admissible size, that is, no less than the minimal value described, the
routine completes the task, though probably not as fast as with the recommended workspace, and returns the
recommended workspace size in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), then the routine returns immediately and provides the recommended
workspace in the first element of the corresponding array (work, iwork). This operation is called a workspace
query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, then the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
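The workspace query described above can be sketched as follows for dstevd. The test matrix, the program name, and the one-element query arrays wquery and iquery are illustrative assumptions. The array e is declared with n elements to be safe, although only the first n-1 are set.

```fortran
program stevd_query_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5
  real(dp) :: d(n), e(n), z(n,n), wquery(1)
  real(dp), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: iquery(1), lwork, liwork, info
  external :: dstevd

  d = 2.0_dp
  e = 0.0_dp
  e(1:n-1) = -1.0_dp

  ! Step 1: workspace query. With lwork = liwork = -1 the routine only
  ! reports the recommended sizes in wquery(1) and iquery(1).
  call dstevd('V', n, d, e, z, n, wquery, -1, iquery, -1, info)
  if (info /= 0) stop 1
  lwork  = int(wquery(1))
  liwork = iquery(1)
  allocate(work(lwork), iwork(liwork))

  ! Step 2: the actual computation with the recommended workspace.
  call dstevd('V', n, d, e, z, n, work, lwork, iwork, liwork, info)
  if (info /= 0) then
     print *, 'dstevd failed, info =', info
     stop 1
  end if

  print *, 'lwork =', lwork, ' liwork =', liwork
  print '(a,5f12.6)', 'eigenvalues: ', d
end program stevd_query_sketch
```

Querying first avoids hard-coding the size formulas, at the cost of one extra call that returns immediately.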
?stevx
Computes selected eigenvalues and eigenvectors of a
real symmetric tridiagonal matrix.
Syntax
call sstevx(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, work, iwork,
ifail, info)
call dstevx(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, work, iwork,
ifail, info)
call stevx(d, e, w [, z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal
matrix A. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of
indices for the desired eigenvalues.
Input Parameters
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0 if n = 0.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', then ldz ≥ max(1, n).
Output Parameters
z(ldz,*) .
The second dimension of z must be at least max(1, m).
If jobz = 'V', then if info = 0, the first m columns of z contain the
orthonormal eigenvectors of the matrix A corresponding to the selected
eigenvalues, with the i-th column of z holding the eigenvector associated
with w(i).
Note: you must ensure that at least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known in advance and
an upper bound must be used.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||A||1 is used instead. Eigenvalues are computed most accurately
when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, set abstol to
2*?lamch('S').
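The recommended abstol setting can be sketched as follows for dstevx, here requesting the two smallest eigenvalues by index. The test matrix and program name are illustrative assumptions, and the workspace sizes (5*n for both work and iwork) are taken from the reference LAPACK documentation rather than from the text above.

```fortran
program stevx_abstol_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 6
  real(dp) :: d(n), e(n), w(n), z(n,n), work(5*n)
  real(dp) :: abstol, vl, vu
  integer :: iwork(5*n), ifail(n), m, info
  real(dp), external :: dlamch
  external :: dstevx

  d = 2.0_dp
  e = 0.0_dp
  e(1:n-1) = -1.0_dp

  ! Twice the underflow threshold, as the application notes recommend.
  abstol = 2.0_dp * dlamch('S')

  ! vl and vu are not referenced when range = 'I'; set them anyway.
  vl = 0.0_dp
  vu = 0.0_dp

  ! range = 'I' with il = 1, iu = 2 selects the two smallest eigenvalues.
  call dstevx('V', 'I', n, d, e, vl, vu, 1, 2, abstol, m, w, z, n, &
              work, iwork, ifail, info)
  if (info > 0) then
     ! Some eigenvectors did not converge; ifail lists their indices.
     print *, 'not converged:', ifail(1:info)
  else if (info < 0) then
     print *, 'illegal argument number', -info
     stop 1
  end if

  print *, 'number of eigenvalues found, m =', m
  print '(a,2f12.6)', 'eigenvalues: ', w(1:m)
end program stevx_abstol_sketch
```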
?stevr
Computes selected eigenvalues and, optionally,
eigenvectors of a real symmetric tridiagonal matrix
using the Relatively Robust Representations.
Syntax
call sstevr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call dstevr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call stevr(d, e, w [, z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal
matrix T. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of
indices for the desired eigenvalues.
Whenever possible, the routine calls stemr to compute the eigenspectrum using Relatively Robust
Representations. stegr computes eigenvalues by the dqds algorithm, while orthogonal eigenvectors are
computed from various "good" L*D*LT representations (also known as Relatively Robust Representations).
Gram-Schmidt orthogonalization is avoided as far as possible. More specifically, the various steps of the
algorithm are as follows. For the i-th unreduced block of T:
The desired accuracy of the output can be specified by the input parameter abstol.
The routine ?stevr calls stemr when the full spectrum is requested on machines which conform to the
IEEE-754 floating point standard. ?stevr calls stebz and stein on non-IEEE machines and when partial
spectrum requests are made.
Input Parameters
For range = 'V' or 'I' and iu-il < n-1, sstebz/dstebz and sstein/dstein
are called.
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0 if n = 0.
lwork INTEGER.
The dimension of the array work. Constraint:
lwork ≥ max(1, 20*n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the required sizes of the work and iwork arrays, returns these
values as the first entries of the work and iwork arrays, and no error
message related to lwork or liwork is issued by xerbla. See Application
Notes for details.
iwork INTEGER.
Workspace array, its dimension max(1, liwork).
liwork INTEGER.
The dimension of the array iwork. Constraint:
liwork ≥ max(1, 10*n).
If liwork = -1, then a workspace query is assumed; the routine only
calculates the required sizes of the work and iwork arrays, returns these
values as the first entries of the work and iwork arrays, and no error
message related to lwork or liwork is issued by xerbla. See Application
Notes for details.
Output Parameters
Note: you must ensure that at least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known in advance and
an upper bound must be used.
isuppz INTEGER.
Array, size at least 2*max(1, m).
The support of the eigenvectors in z, i.e., the indices indicating the nonzero
elements in z. The i-th eigenvector is nonzero only in elements
isuppz(2i-1) through isuppz(2i).
Implemented only for range = 'A' or 'I' and iu-il = n-1.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
isuppz Holds the vector of length (2*n), where the values (2*m) are significant.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
Normal execution of the routine ?stegr may create NaNs and infinities and hence may abort due to a floating
point exception in environments which do not handle NaNs and infinities in the IEEE standard default manner.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run, or
set lwork = -1 (liwork = -1).
If lwork (or liwork) is set to any admissible size, that is, no less than the minimal value described, the
routine completes the task, though probably not as fast as with the recommended workspace, and returns the
recommended workspace size in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), then the routine returns immediately and provides the recommended
workspace in the first element of the corresponding array (work, iwork). This operation is called a workspace
query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, then the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
gees Computes the eigenvalues and Schur factorization of a general matrix, and orders
the factorization so that selected eigenvalues are at the top left of the Schur form.
geesx Computes the eigenvalues and Schur factorization of a general matrix, orders the
factorization and computes reciprocal condition numbers.
geev Computes the eigenvalues and left and right eigenvectors of a general matrix.
geevx Computes the eigenvalues and left and right eigenvectors of a general matrix, with
preliminary matrix balancing, and computes reciprocal condition numbers for the
eigenvalues and right eigenvectors.
?gees
Computes the eigenvalues and Schur factorization of a
general matrix, and orders the factorization so that
selected eigenvalues are at the top left of the Schur
form.
Syntax
call sgees(jobvs, sort, select, n, a, lda, sdim, wr, wi, vs, ldvs, work, lwork, bwork,
info)
call dgees(jobvs, sort, select, n, a, lda, sdim, wr, wi, vs, ldvs, work, lwork, bwork,
info)
call cgees(jobvs, sort, select, n, a, lda, sdim, w, vs, ldvs, work, lwork, rwork,
bwork, info)
call zgees(jobvs, sort, select, n, a, lda, sdim, w, vs, ldvs, work, lwork, rwork,
bwork, info)
call gees(a, wr, wi [,vs] [,select] [,sdim] [,info])
call gees(a, w [,vs] [,select] [,sdim] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues, the real Schur
form T, and, optionally, the matrix of Schur vectors Z. This gives the Schur factorization A = Z*T*ZH.
Optionally, it also orders the eigenvalues on the diagonal of the real-Schur/Schur form so that selected
eigenvalues are at the top left. The leading columns of Z then form an orthonormal basis for the invariant
subspace corresponding to the selected eigenvalues.
A real matrix is in real-Schur form if it is upper quasi-triangular with 1-by-1 and 2-by-2 blocks. 2-by-2 blocks
will be standardized in the form
Input Parameters
sort CHARACTER*1. Must be 'N' or 'S'. Specifies whether or not to order the
eigenvalues on the diagonal of the Schur form.
If sort = 'N', then eigenvalues are not ordered.
select must be declared EXTERNAL in the calling subroutine.
If sort = 'S', select is used to select eigenvalues to sort to the top left of
the Schur form.
If sort = 'N', select is not referenced.
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
ldvs INTEGER. The leading dimension of the output array vs. Constraints:
ldvs ≥ 1;
ldvs ≥ max(1, n) if jobvs = 'V'.
lwork INTEGER.
The dimension of the array work.
Constraint:
lwork ≥ max(1, 3n) for real flavors;
lwork ≥ max(1, 2n) for complex flavors.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
Workspace array, size at least max(1, n). Used in complex flavors only.
bwork LOGICAL. Workspace array, size at least max(1, n). Not referenced if sort
= 'N'.
Output Parameters
sdim INTEGER.
If sort = 'N', sdim = 0.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
If info = i, and
i ≤ n:
the QR algorithm failed to compute all the eigenvalues; elements 1:ilo-1
and i+1:n of wr and wi (for real flavors) or w (for complex flavors) contain
those eigenvalues which have converged; if jobvs = 'V', vs contains the
matrix which reduces A to its partially converged Schur form;
i = n+1:
the eigenvalues could not be reordered because some eigenvalues were too
close to separate (the problem is very ill-conditioned);
i = n+2:
after reordering, round-off changed values of some complex eigenvalues so
that leading eigenvalues in the Schur form no longer satisfy select
= .TRUE.. This could also be caused by underflow due to scaling.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size, that is, no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
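The use of select together with a workspace query can be sketched as follows for dgees. The selection function neg_real, the test matrix, and the program name are hypothetical and exist only for this illustration. The function takes the real and imaginary parts of an eigenvalue, as the reference LAPACK documentation specifies for real flavors.

```fortran
program gees_sort_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), wr(n), wi(n), vs(n,n), wquery(1)
  real(dp), allocatable :: work(:)
  logical :: bwork(n)
  integer :: sdim, lwork, info
  logical, external :: neg_real   ! hypothetical selection function, see below
  external :: dgees

  ! Upper triangular test matrix (column-major), eigenvalues 1, -3, -0.5.
  a = reshape([ 1.0_dp,  0.0_dp,  0.0_dp, &
                2.0_dp, -3.0_dp,  0.0_dp, &
                0.0_dp,  1.0_dp, -0.5_dp ], [n, n])

  ! Workspace query: lwork = -1 returns the recommended size in wquery(1).
  call dgees('V', 'S', neg_real, n, a, n, sdim, wr, wi, vs, n, &
             wquery, -1, bwork, info)
  if (info /= 0) stop 1
  lwork = int(wquery(1))
  allocate(work(lwork))

  ! sort = 'S': eigenvalues accepted by neg_real move to the top left.
  call dgees('V', 'S', neg_real, n, a, n, sdim, wr, wi, vs, n, &
             work, lwork, bwork, info)
  if (info /= 0) then
     print *, 'dgees returned info =', info
     stop 1
  end if

  ! sdim counts the selected eigenvalues; two of the three are negative here.
  print *, 'sdim =', sdim
  print '(a,3f10.4)', 'wr = ', wr
end program gees_sort_sketch

! Select eigenvalues with negative real part (wi is intentionally unused).
logical function neg_real(wr, wi)
  implicit none
  double precision, intent(in) :: wr, wi
  neg_real = (wr < 0.0d0)
end function neg_real
```

After the call, the first sdim columns of vs span the invariant subspace of the selected eigenvalues.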
?geesx
Computes the eigenvalues and Schur factorization of a
general matrix, orders the factorization and computes
reciprocal condition numbers.
Syntax
call sgeesx(jobvs, sort, select, sense, n, a, lda, sdim, wr, wi, vs, ldvs, rconde,
rcondv, work, lwork, iwork, liwork, bwork, info)
call dgeesx(jobvs, sort, select, sense, n, a, lda, sdim, wr, wi, vs, ldvs, rconde,
rcondv, work, lwork, iwork, liwork, bwork, info)
call cgeesx(jobvs, sort, select, sense, n, a, lda, sdim, w, vs, ldvs, rconde, rcondv,
work, lwork, rwork, bwork, info)
call zgeesx(jobvs, sort, select, sense, n, a, lda, sdim, w, vs, ldvs, rconde, rcondv,
work, lwork, rwork, bwork, info)
call geesx(a, wr, wi [,vs] [,select] [,sdim] [,rconde] [,rcondv] [,info])
call geesx(a, w [,vs] [,select] [,sdim] [,rconde] [,rcondv] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues, the real-Schur/
Schur form T, and, optionally, the matrix of Schur vectors Z. This gives the Schur factorization A = Z*T*ZH.
Optionally, it also orders the eigenvalues on the diagonal of the real-Schur/Schur form so that selected
eigenvalues are at the top left; computes a reciprocal condition number for the average of the selected
eigenvalues (rconde); and computes a reciprocal condition number for the right invariant subspace
corresponding to the selected eigenvalues (rcondv). The leading columns of Z form an orthonormal basis for
this invariant subspace.
For further explanation of the reciprocal condition numbers rconde and rcondv, see [LUG], Section 4.10
(where these quantities are called s and sep respectively).
A real matrix is in real-Schur form if it is upper quasi-triangular with 1-by-1 and 2-by-2 blocks. 2-by-2 blocks
will be standardized in the form
Input Parameters
sort CHARACTER*1. Must be 'N' or 'S'. Specifies whether or not to order the
eigenvalues on the diagonal of the Schur form.
If sort = 'N', then eigenvalues are not ordered.
If sort = 'S', select is used to select eigenvalues to sort to the top left of
the Schur form.
If sort = 'N', select is not referenced.
sense CHARACTER*1. Must be 'N', 'E', 'V', or 'B'. Determines which reciprocal
condition numbers are computed.
If sense = 'N', none are computed;
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
ldvs INTEGER. The leading dimension of the output array vs. Constraints:
ldvs ≥ 1;
ldvs ≥ max(1, n) if jobvs = 'V'.
lwork INTEGER.
The dimension of the array work. Constraint:
lwork ≥ max(1, 3n) for real flavors;
lwork ≥ max(1, 2n) for complex flavors.
Also, if sense = 'E', 'V', or 'B', then
iwork INTEGER.
Workspace array, size (liwork). Used in real flavors only. Not referenced if
sense = 'N' or 'E'.
liwork INTEGER.
The dimension of the array iwork. Used in real flavors only.
Constraint:
liwork ≥ 1;
if sense = 'V' or 'B', liwork ≥ sdim*(n-sdim).
bwork LOGICAL. Workspace array, size at least max(1, n). Not referenced if sort
= 'N'.
Output Parameters
sdim INTEGER.
If sort = 'N', sdim = 0.
Note that for real flavors complex conjugate pairs for which select is true for
either eigenvalue count as 2.
rconde, rcondv REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If sense = 'E' or 'B', rconde contains the reciprocal condition number for
the average of the selected eigenvalues.
If sense = 'N' or 'V', rconde is not referenced.
If sense = 'V' or 'B', rcondv contains the reciprocal condition number for
the selected right invariant subspace.
If sense = 'N' or 'E', rcondv is not referenced.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
If info = i, and
i ≤ n:
sense Restored based on the presence of arguments rconde and rcondv as follows:
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork (or liwork) for the first run
or set lwork = -1 (liwork = -1).
If you choose the first option and set lwork (or liwork) to any admissible size, that is, no less than the
minimal value described, the routine completes the task, though probably not as fast as with the recommended
workspace, and returns the recommended workspace size in the first element of the corresponding array (work,
iwork) on exit. Use this value (work(1), iwork(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if you set lwork (liwork) to less than the minimal required value and not -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
?geev
Computes the eigenvalues and left and right
eigenvectors of a general matrix.
Syntax
call sgeev(jobvl, jobvr, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, work, lwork, info)
call dgeev(jobvl, jobvr, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, work, lwork, info)
call cgeev(jobvl, jobvr, n, a, lda, w, vl, ldvl, vr, ldvr, work, lwork, rwork, info)
call zgeev(jobvl, jobvr, n, a, lda, w, vl, ldvl, vr, ldvr, work, lwork, rwork, info)
call geev(a, wr, wi [,vl] [,vr] [,info])
call geev(a, w [,vl] [,vr] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues and, optionally,
the left and/or right eigenvectors. The right eigenvector v of A satisfies
A*v = λ*v
where λ is its eigenvalue.
The left eigenvector u of A satisfies
uH*A = λ*uH
where uH denotes the conjugate transpose of u. The computed eigenvectors are normalized to have
Euclidean norm equal to 1 and largest component real.
Input Parameters
lda INTEGER. The leading dimension of the array a. Must be at least max(1,
n).
ldvl, ldvr INTEGER. The leading dimensions of the output arrays vl and vr,
respectively.
Constraints:
ldvl ≥ 1; ldvr ≥ 1.
If jobvl = 'V', ldvl ≥ max(1, n);
lwork INTEGER.
The dimension of the array work.
Output Parameters
Contain the real and imaginary parts, respectively, of the computed
eigenvalues. Complex conjugate pairs of eigenvalues appear consecutively
with the eigenvalue having positive imaginary part first.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), uj = vl(:,j) + i*vl(:,j+1) and uj+1 = vl(:,j) -
i*vl(:,j+1).
For complex flavors:
uj = vl(:,j), the j-th column of vl.
vr(ldvr,*); the second dimension of vr must be at least max(1, n).
If jobvr = 'N', vr is not referenced.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), vj = vr(:,j) + i*vr(:,j+1) and vj+1 = vr(:,j) -
i*vr(:,j+1).
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size, that is, no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine exits immediately
with an error and does not provide any information on the recommended workspace.
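For real flavors, the packed storage of complex conjugate eigenvector pairs described for vl and vr can be unpacked as sketched below. The test matrix, the program name, and the complex array v are illustrative assumptions. The workspace size 4*n is the minimum that the reference LAPACK documentation gives when eigenvectors are requested.

```fortran
program geev_pairs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  real(dp) :: a(n,n), wr(n), wi(n), vl(1,1), vr(n,n), work(4*n)
  complex(dp) :: v(n,n)
  integer :: j, info
  external :: dgeev

  ! Rotation by 90 degrees, A = [0 -1; 1 0] (column-major); eigenvalues +i, -i.
  a = reshape([0.0_dp, 1.0_dp, -1.0_dp, 0.0_dp], [n, n])

  ! jobvl = 'N': vl is not referenced; jobvr = 'V': right eigenvectors in vr.
  call dgeev('N', 'V', n, a, n, wr, wi, vl, 1, vr, n, work, 4*n, info)
  if (info /= 0) then
     print *, 'dgeev returned info =', info
     stop 1
  end if

  ! Unpack: a conjugate pair occupies columns j and j+1 of vr as
  ! real part and imaginary part; a real eigenvalue uses one column.
  j = 1
  do while (j <= n)
     if (wi(j) /= 0.0_dp .and. j < n) then
        v(:,j)   = cmplx(vr(:,j),  vr(:,j+1), kind=dp)
        v(:,j+1) = cmplx(vr(:,j), -vr(:,j+1), kind=dp)
        j = j + 2
     else
        v(:,j) = cmplx(vr(:,j), 0.0_dp, kind=dp)
        j = j + 1
     end if
  end do

  do j = 1, n
     print *, 'eigenvalue ', cmplx(wr(j), wi(j), kind=dp), ' vector ', v(:,j)
  end do
end program geev_pairs_sketch
```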
?geevx
Computes the eigenvalues and left and right
eigenvectors of a general matrix, with preliminary
matrix balancing, and computes reciprocal condition
numbers for the eigenvalues and right eigenvectors.
Syntax
call sgeevx(balanc, jobvl, jobvr, sense, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, ilo,
ihi, scale, abnrm, rconde, rcondv, work, lwork, iwork, info)
call dgeevx(balanc, jobvl, jobvr, sense, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, ilo,
ihi, scale, abnrm, rconde, rcondv, work, lwork, iwork, info)
call cgeevx(balanc, jobvl, jobvr, sense, n, a, lda, w, vl, ldvl, vr, ldvr, ilo, ihi,
scale, abnrm, rconde, rcondv, work, lwork, rwork, info)
call zgeevx(balanc, jobvl, jobvr, sense, n, a, lda, w, vl, ldvl, vr, ldvr, ilo, ihi,
scale, abnrm, rconde, rcondv, work, lwork, rwork, info)
call geevx(a, wr, wi [,vl] [,vr] [,balanc] [,ilo] [,ihi] [,scale] [,abnrm] [, rconde]
[,rcondv] [,info])
call geevx(a, w [,vl] [,vr] [,balanc] [,ilo] [,ihi] [,scale] [,abnrm] [,rconde] [,
rcondv] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues and, optionally,
the left and/or right eigenvectors.
Optionally also, it computes a balancing transformation to improve the conditioning of the eigenvalues and
eigenvectors (ilo, ihi, scale, and abnrm), reciprocal condition numbers for the eigenvalues (rconde), and
reciprocal condition numbers for the right eigenvectors (rcondv).
The right eigenvector v of A satisfies
A*v = λ*v
where λ is its eigenvalue.
The left eigenvector u of A satisfies
uH*A = λ*uH
where uH denotes the conjugate transpose of u. The computed eigenvectors are normalized to have Euclidean
norm equal to 1 and largest component real.
Balancing a matrix means permuting the rows and columns to make it more nearly upper triangular, and
applying a diagonal similarity transformation D*A*inv(D), where D is a diagonal matrix, to make its rows and
columns closer in norm and the condition numbers of its eigenvalues and eigenvectors smaller. The computed
reciprocal condition numbers correspond to the balanced matrix. Permuting rows and columns will not
change the condition numbers (in exact arithmetic) but diagonal scaling will. For further explanation of
balancing, see [LUG], Section 4.10.
Input Parameters
balanc CHARACTER*1. Must be 'N', 'P', 'S', or 'B'. Indicates how the input
matrix should be diagonally scaled and/or permuted to improve the
conditioning of its eigenvalues.
If balanc = 'N', do not diagonally scale or permute;
sense CHARACTER*1. Must be 'N', 'E', 'V', or 'B'. Determines which reciprocal
condition numbers are computed.
If sense = 'N', none are computed;
If sense is 'E' or 'B', both left and right eigenvectors must also be
computed (jobvl = 'V' and jobvr = 'V').
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
ldvl, ldvr INTEGER. The leading dimensions of the output arrays vl and vr,
respectively.
Constraints:
ldvl ≥ 1; ldvr ≥ 1.
If jobvl = 'V', ldvl ≥ max(1, n);
lwork INTEGER.
The dimension of the array work.
For real flavors:
If sense = 'N' or 'E', lwork ≥ max(1, 2n), and if jobvl = 'V' or
jobvr = 'V', lwork ≥ 3n;
If sense = 'V' or 'B', lwork ≥ n*(n+6).
iwork INTEGER.
Workspace array, size at least max(1, 2n-2). Used in real flavors only. Not
referenced if sense = 'N' or 'E'.
Output Parameters
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), uj = vl(:,j) + i*vl(:,j+1) and uj+1 = vl(:,j) -
i*vl(:,j+1).
For complex flavors:
uj = vl(:,j), the j-th column of vl.
vr(ldvr,*); the second dimension of vr must be at least max(1, n).
If jobvr = 'N', vr is not referenced.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), vj = vr(:,j) + i*vr(:,j+1) and vj+1 = vr(:,j) -
i*vr(:,j+1).
ilo, ihi INTEGER. ilo and ihi are integer values determined when A was balanced.
The balanced A(i,j) = 0 if i > j and j = 1,..., ilo-1 or
i = ihi+1,..., n.
If balanc = 'N' or 'S', ilo = 1 and ihi = n.
The one-norm of the balanced matrix (the maximum of the sum of absolute
values of elements of any column).
rconde, rcondv REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Arrays, size at least max(1, n) each.
rconde(j) is the reciprocal condition number of the j-th eigenvalue.
rcondv(j) is the reciprocal condition number of the j-th right eigenvector.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
balanc Must be 'N', 'B', 'P' or 'S'. The default value is 'N'.
sense Restored based on the presence of arguments rconde and rcondv as follows:
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size, that is, no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
?gesdd Computes the singular value decomposition of a general rectangular matrix using a
divide and conquer method.
?gejsv Computes the singular value decomposition of a real matrix using a preconditioned
Jacobi SVD method.
?gesvj Computes the singular value decomposition of a real matrix using Jacobi plane
rotations.
?gesvdx Computes the SVD and left and right singular vectors for a matrix.
?gesvd
Computes the singular value decomposition of a
general rectangular matrix.
Syntax
call sgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, info)
call dgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, info)
call cgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, info)
call zgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, info)
call gesvd(a, s [,u] [,vt] [,ww] [,job] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the singular value decomposition (SVD) of a real/complex m-by-n matrix A, optionally
computing the left and/or right singular vectors. The SVD is written as
A = U*Σ*VT for real routines
A = U*Σ*VH for complex routines
where Σ is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m
orthogonal/unitary matrix, and V is an n-by-n orthogonal/unitary matrix. The diagonal elements of Σ are the
singular values of A; they are real and non-negative, and are returned in descending order. The first
min(m, n) columns of U and V are the left and right singular vectors of A.
Note that the routine returns VT (for real flavors) or VH (for complex flavors), not V.
Input Parameters
jobu CHARACTER*1. Must be 'A', 'S', 'O', or 'N'. Specifies options for
computing all or part of the matrix U.
If jobu = 'A', all m columns of U are returned in the array u;
if jobu = 'S', the first min(m, n) columns of U (the left singular vectors)
are returned in the array u;
if jobu = 'O', the first min(m, n) columns of U (the left singular vectors)
are overwritten on the array a;
if jobu = 'N', no columns of U (no left singular vectors) are computed.
jobvt CHARACTER*1. Must be 'A', 'S', 'O', or 'N'. Specifies options for
computing all or part of the matrix VT/VH.
If jobvt = 'A', all n rows of VT/VH are returned in the array vt;
if jobvt = 'S', the first min(m,n) rows of VT/VH (the right singular
vectors) are returned in the array vt;
if jobvt = 'O', the first min(m,n) rows of VT/VH (the right singular
vectors) are overwritten on the array a;
if jobvt = 'N', no rows of VT/VH (no right singular vectors) are computed.
ldu, ldvt INTEGER. The leading dimensions of the output arrays u and vt,
respectively.
Constraints:
ldu ≥ 1; ldvt ≥ 1.
If jobu = 'A' or 'S', ldu ≥ m;
lwork INTEGER.
The dimension of the array work.
Constraints:
lwork ≥ 1
lwork ≥ max(3*min(m, n)+max(m, n), 5*min(m,n)) (for real flavors);
lwork ≥ 2*min(m, n)+max(m, n) (for complex flavors).
For good performance, lwork must generally be larger.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla. See Application Notes for details.
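As a rough illustration of the lwork lower bounds listed above for ?gesvd, the following sketch evaluates them for given m and n. The module and function names are hypothetical and are not part of Intel MKL; the max(1, ...) wrapper only folds in the separate requirement that lwork be at least 1.

```fortran
module gesvd_lwork_sketch
  implicit none
contains
  ! Lower bound on lwork for the real flavors (sgesvd, dgesvd).
  pure integer function gesvd_min_lwork_real(m, n) result(lw)
    integer, intent(in) :: m, n
    lw = max(1, 3*min(m, n) + max(m, n), 5*min(m, n))
  end function gesvd_min_lwork_real

  ! Lower bound on lwork for the complex flavors (cgesvd, zgesvd).
  pure integer function gesvd_min_lwork_complex(m, n) result(lw)
    integer, intent(in) :: m, n
    lw = max(1, 2*min(m, n) + max(m, n))
  end function gesvd_min_lwork_complex
end module gesvd_lwork_sketch
```

These values are only the admissible minimum; a workspace query (lwork = -1) is still the way to obtain the recommended size.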
Output Parameters
a On exit,
If jobu = 'O', a is overwritten with the first min(m,n) columns of U (the
left singular vectors stored columnwise);
If jobvt = 'O', a is overwritten with the first min(m, n) rows of VT/VH (the
right singular vectors stored rowwise);
If jobu ≠ 'O' and jobvt ≠ 'O', the contents of a are destroyed.
s REAL for single precision flavors; DOUBLE PRECISION for double precision
flavors.
Array, size at least max(1, min(m,n)). Contains the singular values of A
sorted so that s(i) ≥ s(i+1).
If jobvt = 'S', vt contains the first min(m, n) rows of VT/VH (the right
singular vectors stored row-wise).
If jobvt = 'N' or 'O', vt is not referenced.
work On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
For real flavors:
If info > 0, work(2:min(m,n)) contains the unconverged superdiagonal
elements of an upper bidiagonal matrix B whose diagonal is in s (not
necessarily sorted). B satisfies A=u*B*vt, so it has the same singular
values as A, and singular vectors related by u and vt.
info INTEGER.
If info = 0, the execution is successful.
vt If present and is a square n-by-n matrix, on exit contains the n-by-n orthogonal/
unitary matrix V'T/V'H.
Otherwise, if present, on exit contains the first min(m,n) rows of the matrix
V'T/V'H (right singular vectors stored row-wise).
job Must be either 'N', or 'U', or 'V'. The default value is 'N'.
Application Notes
If you are unsure how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork, that is, one no smaller than the minimal value
described, the routine completes the task, though possibly more slowly than with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace size in the
first element of the array work. This operation is called a workspace query.
Note that if you set lwork to a value less than the minimal required value and not equal to -1, the routine
returns immediately with an error exit and does not provide any information on the recommended workspace.
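The workspace query described in these notes can be sketched for dgesvd as follows. This is a minimal sketch rather than a definitive implementation: the module name gesvd_query_sketch and the wrapper singular_values are hypothetical, only singular values are requested (jobu = jobvt = 'N'), and the code assumes it is linked against Intel MKL or another LAPACK library that provides dgesvd.

```fortran
module gesvd_query_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Computes singular values only; a is destroyed on exit, as documented
  ! for jobu /= 'O' and jobvt /= 'O'.
  subroutine singular_values(a, s, info)
    real(dp), intent(inout) :: a(:,:)
    real(dp), intent(out)   :: s(:)       ! size at least min(m, n)
    integer,  intent(out)   :: info
    real(dp), allocatable   :: work(:)
    real(dp) :: query(1), udum(1,1), vtdum(1,1)
    integer  :: m, n, lda, lwork
    external :: dgesvd

    m = size(a, 1)
    n = size(a, 2)
    lda = max(1, m)

    ! Step 1: workspace query. With lwork = -1 the routine returns at once
    ! and stores the recommended workspace size in query(1).
    call dgesvd('N', 'N', m, n, a, lda, s, udum, 1, vtdum, 1, query, -1, info)
    if (info /= 0) return

    ! Step 2: allocate the recommended workspace and do the computation.
    lwork = max(1, int(query(1)))
    allocate(work(lwork))
    call dgesvd('N', 'N', m, n, a, lda, s, udum, 1, vtdum, 1, work, lwork, info)
  end subroutine singular_values
end module gesvd_query_sketch
```

Querying first costs one extra call but avoids hard-coding a workspace size that may be smaller than recommended for a given m and n.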
?gesdd
Computes the singular value decomposition of a
general rectangular matrix using a divide and conquer
method.
Syntax
call sgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, iwork, info)
call dgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, iwork, info)
call cgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, iwork, info)
call zgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, iwork, info)
call gesdd(a, s [,u] [,vt] [,jobz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the singular value decomposition (SVD) of a real/complex m-by-n matrix A, optionally
computing the left and/or right singular vectors.
If singular vectors are desired, it uses a divide-and-conquer algorithm. The SVD is written
A = U*Σ*VT for real routines,
A = U*Σ*VH for complex routines,
where Σ is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m
orthogonal/unitary matrix, and V is an n-by-n orthogonal/unitary matrix. The diagonal elements of Σ are the
singular values of A; they are real and non-negative, and are returned in descending order. The first
min(m, n) columns of U and V are the left and right singular vectors of A.
Note that the routine returns vt = VT (for real flavors) or vt = VH (for complex flavors), not V.
Input Parameters
if m ≥ n, the first n columns of U are overwritten in the array a and all rows
of VT or VH are returned in the array vt;
if m<n, all columns of U are returned in the array u and the first m rows of
VT or VH are overwritten in the array a;
if jobz = 'N', no columns of U or rows of VT or VH are computed.
Arrays:
a(lda,*) is an array containing the m-by-n matrix A.
The second dimension of a must be at least max(1, n).
work is a workspace array, its dimension max(1, lwork).
lda INTEGER. The leading dimension of the array a. Must be at least max(1, m).
ldu, ldvt INTEGER. The leading dimensions of the output arrays u and vt,
respectively.
Constraints:
ldu ≥ 1; ldvt ≥ 1.
If jobz = 'S' or 'A', or jobz = 'O' and m < n,
then ldu ≥ m;
then ldvt ≥ n;
lwork INTEGER.
The dimension of the array work; lwork ≥ 1.
Output Parameters
a On exit:
If jobz = 'O', then if m ≥ n, a is overwritten with the first n columns of U
(the left singular vectors, stored columnwise). If m < n, a is overwritten
with the first m rows of VT (the right singular vectors, stored rowwise);
If jobz ≠ 'O', the contents of a are destroyed.
s REAL for single precision flavors; DOUBLE PRECISION for double precision
flavors.
Array, size at least max(1, min(m,n)). Contains the singular values of A
sorted so that s(i) ≥ s(i+1).
work(1) On exit, if info = 0, then work(1) returns the optimal size of lwork.
info INTEGER.
If info = 0, the execution is successful.
job Must be 'N', 'A', 'S', or 'O'. The default value is 'N'.
Application Notes
The theoretical minimum value for lwork depends on the flavor of the routine.
If jobz = 'O', lwork = 3*(min(m, n))^2 + max(max(m, n), 5*(min(m, n))^2 + 4*min(m, n));
If jobz = 'S' or 'A', lwork = min(m, n)*(6 + 4*min(m, n)) + max(m, n);
The optimal value of lwork returned by a workspace query generally provides better performance than the
theoretical minimum value. The value of lwork returned by a workspace query is generally larger than the
theoretical minimum value, but for very small matrices it can be smaller. The absolute minimum value of
lwork is the minimum of the workspace query result and the theoretical minimum.
If you set lwork to a value less than the absolute minimum value and not equal to -1, the routine returns
immediately with an error exit and does not provide information on the recommended workspace size.
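The two theoretical-minimum formulas printed in these notes can be evaluated with a small helper such as the one below. This is only a sketch: the module and function names are hypothetical, the formulas are taken as printed here (this excerpt does not say which flavor they belong to, so treating them as the real-flavor values is an assumption), and any other jobz value returns -1 because no formula for it appears in this section.

```fortran
module gesdd_lwork_sketch
  implicit none
contains
  ! Theoretical minimum lwork for ?gesdd, for the jobz cases listed above.
  pure integer function gesdd_theoretical_lwork(jobz, m, n) result(lw)
    character(len=1), intent(in) :: jobz
    integer, intent(in) :: m, n
    integer :: mn, mx
    mn = min(m, n)
    mx = max(m, n)
    select case (jobz)
    case ('O', 'o')
      lw = 3*mn**2 + max(mx, 5*mn**2 + 4*mn)
    case ('S', 's', 'A', 'a')
      lw = mn*(6 + 4*mn) + mx
    case default
      lw = -1   ! no formula for this jobz is given in the text above
    end select
  end function gesdd_theoretical_lwork
end module gesdd_lwork_sketch
```

Since the absolute minimum is the smaller of this value and the workspace query result, a caller would typically run the query as well and compare the two.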
?gejsv
Computes the singular value decomposition using a
preconditioned Jacobi SVD method.
Syntax
call sgejsv(joba, jobu, jobv, jobr, jobt, jobp, m, n, a, lda, sva, u, ldu, v, ldv,
work, lwork, iwork, info)
call dgejsv(joba, jobu, jobv, jobr, jobt, jobp, m, n, a, lda, sva, u, ldu, v, ldv,
work, lwork, iwork, info)
call cgejsv (joba, jobu, jobv, jobr, jobt, jobp, m, n, a, lda, sva, u, ldu, v, ldv,
cwork, lwork, rwork, lrwork, iwork, info )
call zgejsv (joba, jobu, jobv, jobr, jobt, jobp, m, n, a, lda, sva, u, ldu, v, ldv,
cwork, lwork, rwork, lrwork, iwork, info )
Include Files
mkl.fi
Description
The routine computes the singular value decomposition (SVD) of a real/complex m-by-n matrix A, where m ≥ n.
The SVD is written as
A = U*Σ*VT, for real routines
A = U*Σ*VH, for complex routines
where Σ is an m-by-n matrix which is zero except for its n diagonal elements, U is an m-by-n (or m-by-m)
orthonormal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of Σ are the singular values
of A; the columns of U and V are the left and right singular vectors of A, respectively. The matrices U and V
are computed and stored in the arrays u and v, respectively. The diagonal of Σ is computed and stored in the
array sva.
The ?gejsv routine can sometimes compute tiny singular values and their singular vectors much more
accurately than other SVD routines.
The routine implements a preconditioned Jacobi SVD algorithm. It uses ?geqp3, ?geqrf, and ?gelqf as
preprocessors and preconditioners. Optionally, an additional row pivoting can be used as a preprocessor,
which in some cases results in much higher accuracy. An example is matrix A with the structure A = D1 * C
* D2, where D1, D2 are arbitrarily ill-conditioned diagonal matrices and C is a well-conditioned matrix. In that
case, complete pivoting in the first QR factorizations provides accuracy dependent on the condition number
of C, and independent of D1, D2. Such higher accuracy is not completely understood theoretically, but it
works well in practice.
If A can be written as A = B*D, with well-conditioned B and some diagonal D, then the high accuracy is
guaranteed, both theoretically and in software, independent of D. For more details see [Drmac08-1],
[Drmac08-2].
The computational range for the singular values can be the full range ( UNDERFLOW,OVERFLOW ), provided
that the machine arithmetic and the BLAS and LAPACK routines called by ?gejsv are implemented to work in
that range. If that is not the case, the restriction for safe computation with the singular values in the range
of normalized IEEE numbers is that the spectral condition number kappa(A)=sigma_max(A)/sigma_min(A)
does not overflow. This code (?gejsv) is best used in this restricted range, meaning that singular values of
magnitude below ||A||_2 / slamch('O') (for single precision) or ||A||_2 / dlamch('O') (for double
precision) are returned as zeros. See jobr for details on this.
This implementation is slower than the one described in [Drmac08-1], [Drmac08-2] due to replacement of
some non-LAPACK components, and because the choice of some tuning parameters in the iterative part
(?gesvj) is left to the implementer on a particular machine.
The rank revealing QR factorization (in this code: ?geqp3) should be implemented as in [Drmac08-3].
If m is much larger than n, the initial QRF with column pivoting can be preprocessed by the
QRF without pivoting. That well-known trick is not used in ?gejsv because in some cases heavy row
weighting can be treated with complete pivoting. The overhead in cases where m is much larger than n is then
only due to pivoting, but the benefits in accuracy have prevailed. You can incorporate this extra QRF step
easily and also improve data movement (matrix transpose, matrix copy, matrix transposed copy); this
implementation of ?gejsv uses only the simplest, naive data movement.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and
SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-
dependent optimizations in this product are intended for use with Intel microprocessors. Certain
optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to
the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
Input Parameters
If joba = 'R', the procedure is similar to the 'A' option. Rank revealing
property of the initial QR factorization is used to reveal (using triangular
factor) a gap sigma_{r+1} < epsilon * sigma_r, in which case the
numerical rank is declared to be r. The SVD is computed with absolute error
bounds, but more accurately than with 'A'.
If jobu = 'F', a full set of m left singular vectors is returned in the array u.
If jobv = 'J', n columns of V are returned in the array v but they are
computed as the product of Jacobi rotations. This option is allowed only if
jobu ≠ 'N'.
If jobv = 'W', v may be used as workspace of length n*n. See the
description of v.
If jobr = 'N', the function does not remove small columns of the scaled
matrix. This option assumes that BLAS and QR factorizations and triangular
solvers are implemented to work in that range. If the condition of A is
greater than big, use ?gesvj.
If jobr = 'R', restricted range for singular values of the scaled matrix A is
[sqrt(?lamch('S')), sqrt(big)], roughly as described above. This
option is recommended.
For computing the singular values in the full range [?lamch('S'),big],
use ?gesvj.
lda INTEGER. The leading dimension of the array a. Must be at least max(1, m).
lwork INTEGER.
For real flavors:
Length of work to confirm proper allocation of work space. lwork depends
on the task performed:
If only sigma is needed (jobu = 'N', jobv = 'N') and
... no scaled condition estimate is required, then lwork ≥ max(2*m+n,
4*n+1, 7). This is the minimal requirement. For optimal performance
(blocked code) the optimal value is lwork ≥ max(2*m+n, 3*n+(n+1)*nb,
7). Here nb is the optimal block size for ?geqp3/?geqrf.
In general, the optimal length lwork is computed as
lwork ≥ max(2*m+n, n+lwork(sgeqp3), n+lwork(sgeqrf),
n+n*n+lwork(spocon), 7) for sgejsv
lwork ≥ max(2*m+n, n+lwork(dgeqp3), n+lwork(dgeqrf),
n+n*n+lwork(dpocon), 7) for dgejsv
If sigma and the right singular vectors are needed (jobv = 'V'),
if jobv = 'V',
the minimal requirement is lwork ≥ max(2*m+n, 6*n+2*n*n)
if jobv = 'J',
the minimal requirement is lwork ≥ max(2*m+n, 4*n+n*n, 2*n+n*n+6)
For optimal performance, lwork should be additionally larger than
n+m*nb, where nb is the optimal block size for ?ormlq.
rwork REAL for cgejsv
DOUBLE PRECISION for zgejsv
rwork is an array, dimension is at least max(7, lrwork).
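For the simplest lwork case described above for the real flavors (only singular values needed, jobu = 'N' and jobv = 'N', with no scaled condition estimate), the stated bounds can be evaluated as in this sketch. The module and function names are hypothetical, and nb is assumed to be supplied by the caller as the optimal block size for ?geqp3/?geqrf; the other job combinations are not covered.

```fortran
module gejsv_lwork_sketch
  implicit none
contains
  ! Minimal lwork: singular values only, no scaled condition estimate.
  pure integer function gejsv_min_lwork_values_only(m, n) result(lw)
    integer, intent(in) :: m, n
    lw = max(2*m + n, 4*n + 1, 7)
  end function gejsv_min_lwork_values_only

  ! Blocked-code value from the same paragraph; nb is the block size
  ! for ?geqp3/?geqrf and must be provided by the caller.
  pure integer function gejsv_opt_lwork_values_only(m, n, nb) result(lw)
    integer, intent(in) :: m, n, nb
    lw = max(2*m + n, 3*n + (n + 1)*nb, 7)
  end function gejsv_opt_lwork_values_only
end module gejsv_lwork_sketch
```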
Output Parameters
sva On exit:
u On exit:
If jobu = 'U', contains the m-by-n matrix of the left singular vectors.
If jobu = 'F', contains the m-by-m matrix of the left singular vectors,
including an orthonormal basis of the orthogonal complement of the range
of A.
If jobu = 'W' and jobv = 'V', jobt = 'T', and m = n, then u is used
as workspace if the procedure replaces A with AT (for real flavors) or AH (for
complex flavors). In that case, v is computed in u as left singular vectors of
AT or AH and copied back to the v array. This 'W' option is just a reminder
to the caller that in this case u is reserved as workspace of length n*n.
v On exit:
If jobv = 'V' or 'J', contains the n-by-n matrix of the right singular
vectors.
If jobv = 'W' and jobu = 'U', jobt = 'T', and m = n, then v is used
as workspace if the procedure replaces A with AT (for real flavors) or AH (for
complex flavors). In that case, u is computed in v as right singular vectors
of AT or AH and copied back to the u array. This 'W' option is just a
reminder to the caller that in this case v is reserved as workspace of length
n*n.
If jobv = 'N', v is not referenced.
work On exit,
work(1) and work(2) determine the scaling factor scale = work(2)/work(1) such that
scale*sva(1:n) are the computed singular values of A. See the
description of sva().
If full SVD is needed, the following two condition numbers are useful for the
analysis of the algorithm. They are provided for a user who is familiar with
the details of the method.
work(4) = an estimate of the scaled condition number of the triangular
factor in the first QR factorization.
work(5) = an estimate of the scaled condition number of the triangular
factor in the second QR factorization.
The following two parameters are computed if jobt = 'T'. They are
provided for a user who is familiar with the details of the method.
work(6) = the entropy of A**t*A: this is the Shannon entropy of
diag(A**t*A) / Trace(A**t*A) taken as a point in the probability
simplex.
work(7) = the entropy of A*A**t.
rwork On exit,
rwork(1) determines the scaling factor scale = rwork(2) / rwork(1)
such that scale*sva(1:n) are the computed singular values of a. (See the
description of sva().)
info INTEGER.
If info = 0, the execution is successful.
If info > 0, the function did not converge in the maximal number of
sweeps. The computed values may be inaccurate.
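The scaling relation given for work (real flavors) and rwork (complex flavors), scale = work(2)/work(1), can be applied to sva as in the following sketch. The module and function names are hypothetical; the code assumes the first two elements of the workspace array are exactly as returned by ?gejsv, so that work(1) is nonzero.

```fortran
module gejsv_scale_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Returns scale*sva, where scale = work(2)/work(1).
  pure function scaled_singular_values(sva, work) result(sigma)
    real(dp), intent(in) :: sva(:)
    real(dp), intent(in) :: work(:)   ! at least 2 elements, work(1) /= 0
    real(dp) :: sigma(size(sva))
    sigma = (work(2) / work(1)) * sva
  end function scaled_singular_values
end module gejsv_scale_sketch
```

For the complex flavors the same helper could be called with rwork in place of work.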
See Also
?geqp3
?geqrf
?gelqf
?gesvj
?lamch
?pocon
?ormlq
?gesvj
Computes the singular value decomposition of a real
matrix using Jacobi plane rotations.
Syntax
call sgesvj(joba, jobu, jobv, m, n, a, lda, sva, mv, v, ldv, work, lwork, info)
call dgesvj(joba, jobu, jobv, m, n, a, lda, sva, mv, v, ldv, work, lwork, info)
call cgesvj(joba, jobu, jobv, m, n, a, lda, sva, mv, v, ldv, cwork, lwork, rwork,
lrwork, info )
call zgesvj(joba, jobu, jobv, m, n, a, lda, sva, mv, v, ldv, cwork, lwork, rwork,
lrwork, info )
Include Files
mkl.fi
Description
The routine computes the singular value decomposition (SVD) of a real or complex m-by-n matrix A, where
m ≥ n.
The SVD of A is written as
A = U*Σ*VT for real flavors, or
A = U*Σ*VH for complex flavors,
where Σ is an m-by-n diagonal matrix, U is an m-by-n orthonormal matrix, and V is an n-by-n orthogonal/
unitary matrix. The diagonal elements of Σ are the singular values of A; the columns of U and V are the left
and right singular vectors of A, respectively. The matrices U and V are computed and stored in the arrays u
and v, respectively. The diagonal of Σ is computed and stored in the array sva.
The ?gesvj routine can sometimes compute tiny singular values and their singular vectors much more
accurately than other SVD routines.
The n-by-n orthogonal matrix V is obtained as a product of Jacobi plane rotations. The rotations are
implemented as fast scaled rotations of Anda and Park [AndaPark94]. In the case of underflow of the Jacobi
angle, a modified Jacobi transformation of Drmac ([Drmac08-4]) is used. Pivot strategy uses column
interchanges of de Rijk ([deRijk98]). The relative accuracy of the computed singular values and the accuracy
of the computed singular vectors (in angle metric) is as guaranteed by the theory of Demmel and Veselic
[Demmel92]. The condition number that determines the accuracy in the full rank case is essentially
min_{D=diag} κ(A*D),
where κ(.) is the spectral condition number. The best performance of this Jacobi SVD procedure is achieved if
used in an accelerated version of Drmac and Veselic [Drmac08-1], [Drmac08-2].
The computational range for the nonzero singular values is the machine number interval
( UNDERFLOW,OVERFLOW ). In extreme cases, even denormalized singular values can be computed with the
corresponding gradual loss of accurate digits.
Input Parameters
If jobv = 'A', the Jacobi rotations are applied to the mv-by-n array v. In
other words, the right singular vector matrix V is not computed explicitly,
instead it is applied to an mv-by-n matrix initially stored in the first mv rows
of V.
If jobv = 'N', the matrix V is not computed and the array v is not
referenced.
lda INTEGER. The leading dimension of the array a. Must be at least max(1, m).
mv INTEGER.
If jobv = 'A', the product of Jacobi rotations in ?gesvj is applied to the
first mv rows of v. See the description of jobv. 0 ≤ mv ≤ ldv.
lwork INTEGER.
Length of work for real flavors or cwork for complex flavors, lwork ≥
max(6,m+n).
rwork is a workspace array, its dimension max(6,m+n).
If jobu = 'C', rwork(1) = CTOL, where CTOL defines the threshold for
convergence. The process stops if all columns of a are mutually orthogonal
up to CTOL*EPS, EPS=?lamch('E'). It is required that CTOL ≥ 1, that is, it
is not allowed to force the routine to obtain orthogonality below EPS.
Output Parameters
a On exit:
If jobu = 'U' or jobu = 'C':
If jobu = 'N':
if info = 0, note that the left singular vectors are 'for free' in the one-
sided Jacobi SVD algorithm. However, if only the singular values are
needed, the level of numerical orthogonality of u is not an issue and
iterations are stopped when the columns of the iterated matrix are
numerically orthogonal up to approximately m*EPS. Thus, on exit, a
contains the columns of u scaled with the corresponding singular values.
if info > 0, the procedure ?gesvj did not converge in the given
number of iterations (sweeps).
If info > 0, the procedure ?gesvj did not converge in the given number
of iterations (sweeps) and scale*sva(1:n) may not be accurate.
v On exit:
If jobv = 'V', contains the n-by-n matrix of the right singular vectors.
If jobv = 'A', then v contains the product of the computed right singular
vector matrix and the initial matrix in the array v.
work On exit,
work(1) = scale is the scaling factor such that scale*sva(1:n) are the
computed singular values of A. See the description of sva().
rwork On exit,
rwork(1) = scale is the scaling factor such that scale*sva(1:n) are the
computed singular values of A. See description of sva().
rwork(6) is the largest absolute value over all sines of the Jacobi rotation
angles in the last sweep. It can be useful for a post festum analysis.
info INTEGER.
If info = 0, the execution is successful.
If info > 0, the function did not converge in the maximal number (30) of
sweeps. The output may still be useful. See the description of work or
rwork.
See Also
?lamch
?ggsvd
Computes the generalized singular value
decomposition of a pair of general rectangular
matrices (deprecated).
Syntax
call sggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, iwork, info)
call dggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, iwork, info)
call cggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, rwork, iwork, info)
call zggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, rwork, iwork, info)
call ggsvd(a, b, alpha, beta [, k] [,l] [,u] [,v] [,q] [,iwork] [,info])
Include Files
mkl.fi, lapack.f90
Description
This routine is deprecated; use ggsvd3.
The routine computes the generalized singular value decomposition (GSVD) of an m-by-n real/complex
matrix A and p-by-n real/complex matrix B:
U'*A*Q = D1*(0 R), V'*B*Q = D2*(0 R),
where U, V and Q are orthogonal/unitary matrices and U', V' mean transpose/conjugate transpose of U and V
respectively.
Let k+l be the effective numerical rank of the matrix (A', B')'. Then R is a (k+l)-by-(k+l) nonsingular upper
triangular matrix, and D1 and D2 are m-by-(k+l) and p-by-(k+l) "diagonal" matrices of the following
structures, respectively:
If m-k-l ≥ 0,
where
C = diag(alpha(k+1),..., alpha(k+l))
S = diag(beta(k+1),...,beta(k+l))
C^2 + S^2 = I
R is stored in a(1:k+l, n-k-l+1:n ) on exit.
If m-k-l < 0,
where
C = diag(alpha(k+1),..., alpha(m)),
S = diag(beta(k+1),...,beta(m)),
C^2 + S^2 = I
Input Parameters
iwork INTEGER.
Workspace array, size at least max(1, n).
Output Parameters
k, l INTEGER. On exit, k and l specify the dimension of the subblocks. The sum
k+l is equal to the effective numerical rank of (A', B')'.
alpha(k+1:k+l) = C,
beta(k+1:k+l) = S,
or if m-k-l < 0,
alpha(k+1:m)= C, alpha(m+1:k+l)=0
beta(k+1:m) = S, beta(m+1:k+l) = 1
and
alpha(k+l+1:n) = 0
beta(k+l+1:n) = 0.
info INTEGER.
If info = 0, the execution is successful.
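The alpha and beta values described above satisfy alpha(i)^2 + beta(i)^2 = 1 for i = k+1, ..., k+l in both cases: C^2 + S^2 = I covers the entries holding C and S, and the pairs (0, 1) stored in positions m+1 to k+l when m-k-l < 0 satisfy it trivially. A quick sanity check on returned values might look like the sketch below; the module name, function name, and tolerance argument are hypothetical and not part of the library.

```fortran
module ggsvd_pairs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! True if alpha(i)**2 + beta(i)**2 is within tol of 1 for i = k+1 .. k+l.
  pure logical function pairs_are_normalized(alpha, beta, k, l, tol) result(ok)
    real(dp), intent(in) :: alpha(:), beta(:)
    integer,  intent(in) :: k, l
    real(dp), intent(in) :: tol
    integer :: i
    ok = .true.
    do i = k + 1, k + l
      if (abs(alpha(i)**2 + beta(i)**2 - 1.0_dp) > tol) ok = .false.
    end do
  end function pairs_are_normalized
end module ggsvd_pairs_sketch
```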
?gesvdx
Computes the SVD and left and right singular vectors
for a matrix.
Syntax
call sgesvdx(jobu, jobvt, range, m, n, a, lda, vl, vu, il, iu, ns, s, u, ldu, vt,
ldvt, work, lwork, iwork, info)
call dgesvdx(jobu, jobvt, range, m, n, a, lda, vl, vu, il, iu, ns, s, u, ldu, vt,
ldvt, work, lwork, iwork, info)
call cgesvdx(jobu, jobvt, range, m, n, a, lda, vl, vu, il, iu, ns, s, u, ldu, vt,
ldvt, work, lwork, rwork, iwork, info)
call zgesvdx(jobu, jobvt, range, m, n, a, lda, vl, vu, il, iu, ns, s, u, ldu, vt,
ldvt, work, lwork, rwork, iwork, info)
Include Files
mkl.fi
Description
?gesvdx computes the singular value decomposition (SVD) of a real or complex m-by-n matrix A, optionally
computing the left and right singular vectors. The SVD is written
A = U * Σ * transpose(V)
where Σ is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m matrix,
and V is an n-by-n matrix. The matrices U and V are orthogonal for real A, and unitary for complex A. The
diagonal elements of Σ are the singular values of A; they are real and non-negative, and are returned in
descending order. The first min(m,n) columns of U and V are the left and right singular vectors of A.
?gesvdx uses an eigenvalue problem for obtaining the SVD, which allows for the computation of a subset of
singular values and vectors. See ?bdsvdx for details.
Input Parameters
jobu CHARACTER*1. Specifies options for computing all or part of the matrix U:
= 'V': the first min(m,n) columns of U (the left singular vectors) or as
specified by range are returned in the array u;
jobvt CHARACTER*1. Specifies options for computing all or part of the matrix VT:
= 'V': the first min(m,n) rows of VT (the right singular vectors) or as
specified by range are returned in the array vt;
lda ≥ max(1,m).
il INTEGER.
iu INTEGER. If range='I', the indices (in ascending order) of the smallest and
largest singular values to be returned. 1 ≤ il ≤ iu ≤ min(m,n), if min(m,n) > 0.
Not referenced if range = 'A' or 'V'.
ldu INTEGER. The leading dimension of the array u. ldu ≥ 1; if jobu = 'V',
ldu ≥ m.
ldvt INTEGER. The leading dimension of the array vt. ldvt ≥ 1; if jobvt = 'V',
ldvt ≥ ns (see above).
lwork ≥ max(1, min(m, n)*(min(m, n) + 4)) for the paths (see comments
inside the code):
inside the code):
Output Parameters
Array, size (min(m,n))
NOTE
Make sure that ucol ≥ ns; if range = 'V', the exact value of ns is not
known in advance and an upper bound must be used.
If jobvt = 'V', vt contains the rows of VT (the right singular vectors, stored
rowwise) as specified by range; if jobvt = 'N', vt is not referenced.
NOTE
Make sure that ldvt ≥ ns; if range = 'V', the exact value of ns is not
known in advance and an upper bound must be used.
If info = 0, the first ns elements of iwork are zero. If info > 0, then
iwork contains the indices of the eigenvectors that failed to converge in
?bdsvdx/?stevx.
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
> 0: if info = i, then i eigenvectors failed to converge in ?bdsvdx/?stevx. If
info = n*2 + 1, an internal error occurred in ?bdsvdx.
?bdsvdx
Computes the SVD of a bidiagonal matrix.
Syntax
call sbdsvdx (uplo, jobz, range, n, d, e, vl, vu, il, iu, ns, s, z, ldz, work, iwork,
info )
call dbdsvdx (uplo, jobz, range, n, d, e, vl, vu, il, iu, ns, s, z, ldz, work, iwork,
info )
Include Files
mkl.fi
Description
?bdsvdx computes the singular value decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal
matrix B, B = U * S * VT, where S is a diagonal matrix with non-negative diagonal elements (the singular
values of B), and U and VT are orthogonal matrices of left and right singular vectors, respectively.
Given an upper bidiagonal B with diagonal d = [d1 d2 ... dn] and superdiagonal e = [e1 e2 ... en-1], ?bdsvdx
computes the singular value decomposition of B through the eigenvalues and eigenvectors of the n*2-by-n*2
tridiagonal matrix
$$\mathrm{TGK} = \begin{pmatrix} 0 & d_1 & & & \\ d_1 & 0 & e_1 & & \\ & e_1 & 0 & d_2 & \\ & & d_2 & \ddots & \ddots \\ & & & \ddots & \ddots \end{pmatrix}$$
If (s,u,v) is a singular triplet of B with ||u|| = ||v|| = 1, then (±s,q), ||q|| = 1, are eigenpairs of TGK, with
$$q = P \begin{pmatrix} u \\ \pm v \end{pmatrix} / \sqrt{2} = (v_1\ u_1\ v_2\ u_2\ \ldots\ v_n\ u_n) / \sqrt{2}, \qquad P = [\, e_{n+1}\ e_1\ e_{n+2}\ e_2\ \ldots \,].$$
Given a TGK matrix, one can either
1. compute -s, -v and change signs so that the singular values (and corresponding vectors) are already in
descending order (as in ?gesvd/?gesdd) or
2. compute s, v and reorder the values (and corresponding vectors).
?bdsvdx implements (1) by calling ?stevx (bisection plus inverse iteration, to be replaced with a version of
the Multiple Relative Robust Representation algorithm). See P. Willems and B. Lang, A framework for the
MR^3 algorithm: theory and implementation, SIAM J. Sci. Comput., 35:740-766, 2013.
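To make the structure of the TGK matrix concrete, the sketch below assembles it as a dense array from the diagonal d and superdiagonal e of an upper bidiagonal B. It is illustrative only: the module and function names are hypothetical, and ?bdsvdx itself works with the tridiagonal form rather than a dense matrix.

```fortran
module tgk_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Dense n*2-by-n*2 TGK matrix: zero main diagonal, and off-diagonal
  ! entries d(1), e(1), d(2), e(2), ..., e(n-1), d(n), placed symmetrically.
  pure function tgk_dense(d, e) result(tgk)
    real(dp), intent(in) :: d(:)        ! diagonal of B, length n
    real(dp), intent(in) :: e(:)        ! superdiagonal of B, length n-1
    real(dp) :: tgk(2*size(d), 2*size(d))
    integer :: i, n
    n = size(d)
    tgk = 0.0_dp
    do i = 1, n
      tgk(2*i-1, 2*i) = d(i)
      tgk(2*i, 2*i-1) = d(i)
      if (i < n) then
        tgk(2*i, 2*i+1) = e(i)
        tgk(2*i+1, 2*i) = e(i)
      end if
    end do
  end function tgk_dense
end module tgk_sketch
```

For n = 1 and d = [s] this gives the 2-by-2 matrix with zero diagonal and s off the diagonal, whose eigenvalues are +s and -s, matching the eigenpair statement above.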
Input Parameters
Array, size n.
il, iu INTEGER. If range='I', the indices (in ascending order) of the smallest and
largest singular values to be returned.
1 ≤ il ≤ iu ≤ min(m,n), if min(m,n) > 0.
Output Parameters
If jobz = 'V', then if info = 0 the first ns columns of z contain the singular
vectors of the matrix B corresponding to the selected singular values, with
U in rows 1 to n and V in rows n+1 to n*2, i.e.
$$z = \begin{pmatrix} U \\ V \end{pmatrix}$$
NOTE
Make sure that at least k = ns+1 columns are supplied in the array z;
if range = 'V', the exact value of ns is not known in advance and an
upper bound must be used.
If jobz = 'V', then if info = 0, the first ns elements of iwork are zero. If
info > 0, then iwork contains the indices of the eigenvectors that failed to
converge in ?stevx.
> 0:
if info = i, then i eigenvectors failed to converge in ?stevx. The indices of
the eigenvectors (as returned by ?stevx) are stored in the array iwork.
Table "Driver Routines for Cosine-Sine Decomposition (CSD)" lists LAPACK routines (FORTRAN 77 interface)
that perform CS decomposition of matrices. The corresponding routine names in the Fortran 95 interface are
without the first symbol.
Computational Routines for Cosine-Sine Decomposition (CSD)
Operation Real matrices Complex matrices
See Also
Cosine-Sine Decomposition: LAPACK Computational Routines
?orcsd/?uncsd
Computes the CS decomposition of a block-partitioned
orthogonal/unitary matrix.
Syntax
call sorcsd( jobu1, jobu2, jobv1t, jobv2t, trans, signs, m, p, q, x11, ldx11, x12,
ldx12, x21, ldx21, x22, ldx22, theta, u1, ldu1, u2, ldu2, v1t, ldv1t, v2t, ldv2t,
work, lwork, iwork, info )
call dorcsd( jobu1, jobu2, jobv1t, jobv2t, trans, signs, m, p, q, x11, ldx11, x12,
ldx12, x21, ldx21, x22, ldx22, theta, u1, ldu1, u2, ldu2, v1t, ldv1t, v2t, ldv2t,
work, lwork, iwork, info )
call cuncsd( jobu1, jobu2, jobv1t, jobv2t, trans, signs, m, p, q, x11, ldx11, x12,
ldx12, x21, ldx21, x22, ldx22, theta, u1, ldu1, u2, ldu2, v1t, ldv1t, v2t, ldv2t,
work, lwork, rwork, lrwork, iwork, info )
call zuncsd( jobu1, jobu2, jobv1t, jobv2t, trans, signs, m, p, q, x11, ldx11, x12,
ldx12, x21, ldx21, x22, ldx22, theta, u1, ldu1, u2, ldu2, v1t, ldv1t, v2t, ldv2t,
work, lwork, rwork, lrwork, iwork, info )
call orcsd( x11,x12,x21,x22,theta,u1,u2,v1t,v2t[,jobu1][,jobu2][,jobv1t][,jobv2t]
[,trans][,signs][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routines ?orcsd/?uncsd compute the CS decomposition of an m-by-m orthogonal (?orcsd) or unitary (?uncsd)
matrix X partitioned into a 2-by-2 block structure with blocks x11, x12, x21, and x22.
x11 is p-by-q. The orthogonal/unitary matrices u1, u2, v1, and v2 are p-by-p, (m-p)-by-(m-p), q-by-q, and
(m-q)-by-(m-q), respectively. C and S are r-by-r nonnegative diagonal matrices satisfying C^2 + S^2 = I, in which
r = min(p,m-p,q,m-q).
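For orientation, the factorization that the description refers to can be written out as follows. This is a sketch based on the reference LAPACK documentation for ?orcsd/?uncsd, not on the text above, so treat the exact placement of the identity and zero blocks as an assumption to check against that documentation. For the complex flavors the transpose becomes a conjugate transpose.

```latex
X =
\begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix}
=
\begin{pmatrix} U_1 & 0 \\ 0 & U_2 \end{pmatrix}
\left(\begin{array}{ccc|ccc}
  I & 0 & 0 & 0 &  0 &  0 \\
  0 & C & 0 & 0 & -S &  0 \\
  0 & 0 & 0 & 0 &  0 & -I \\ \hline
  0 & 0 & 0 & I &  0 &  0 \\
  0 & S & 0 & 0 &  C &  0 \\
  0 & 0 & I & 0 &  0 &  0
\end{array}\right)
\begin{pmatrix} V_1 & 0 \\ 0 & V_2 \end{pmatrix}^{T},
\qquad C^2 + S^2 = I .
```

The r diagonal entries of C and S are the cosines and sines of the angles returned in theta.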
Input Parameters
trans CHARACTER
signs CHARACTER
ldx11, ldx12, ldx21, ldx22 INTEGER. The leading dimensions of the parts of array X. ldx11 ≥ max(1,
p), ldx12 ≥ max(1, p), ldx21 ≥ max(1, m - p), ldx22 ≥ max(1, m - p).
ldu1 INTEGER. The leading dimension of the array u1. If jobu1 = 'Y', ldu1 ≥
max(1,p).
ldu2 INTEGER. The leading dimension of the array u2. If jobu2 = 'Y', ldu2 ≥
max(1,m-p).
ldv1t INTEGER. The leading dimension of the array v1t. If jobv1t = 'Y', ldv1t ≥
max(1,q).
ldv2t INTEGER. The leading dimension of the array v2t. If jobv2t = 'Y', ldv2t ≥
max(1,m-q).
Output Parameters
work On exit,
rwork On exit,
intermediate bidiagonal-block form remaining
after nonconvergence. info specifies the number
of nonzero phi's.
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value
> 0: ?orcsd/?uncsd did not converge. See the description of work above
for details.
See Also
?bbcsd
xerbla
?orcsd2by1/?uncsd2by1
Computes the CS decomposition of a block-partitioned
orthogonal/unitary matrix.
Syntax
call sorcsd2by1( jobu1, jobu2, jobv1t, m, p, q, x11, ldx11, x21, ldx21, theta, u1,
ldu1, u2, ldu2, v1t, ldv1t, work, lwork, iwork, info )
call dorcsd2by1( jobu1, jobu2, jobv1t, m, p, q, x11, ldx11, x21, ldx21, theta, u1,
ldu1, u2, ldu2, v1t, ldv1t, work, lwork, iwork, info )
call cuncsd2by1( jobu1, jobu2, jobv1t, m, p, q, x11, ldx11, x21, ldx21, theta, u1,
ldu1, u2, ldu2, v1t, ldv1t, work, lwork, rwork, lrwork, iwork, info )
call zuncsd2by1( jobu1, jobu2, jobv1t, m, p, q, x11, ldx11, x21, ldx21, theta, u1,
ldu1, u2, ldu2, v1t, ldv1t, work, lwork, rwork, lrwork, iwork, info )
call orcsd2by1( x11,x21,theta,u1,u2,v1t[,jobu1][,jobu2][,jobv1t][,info] )
call uncsd2by1( x11,x21,theta,u1,u2,v1t[,jobu1][,jobu2][,jobv1t][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routines ?orcsd2by1/?uncsd2by1 compute the CS decomposition of an m-by-q matrix X with
orthonormal columns that has been partitioned into a 2-by-1 block structure:
x11 is p-by-q. The orthogonal/unitary matrices u1, u2, and v1 are p-by-p, (m-p)-by-(m-p), and q-by-q,
respectively. C and S are r-by-r nonnegative diagonal matrices satisfying C^2 + S^2 = I, in which r
= min(p,m-p,q,m-q).
Input Parameters
jobv1t CHARACTER. If equal to 'Y', then v1t is computed. Otherwise, v1t is not
computed.
DOUBLE PRECISION for dorcsd2by1
COMPLEX for cuncsd2by1
DOUBLE COMPLEX for zuncsd2by1
Array, size (ldx11,q).
On entry, the part of the orthogonal matrix whose CSD is desired.
ldx11 INTEGER. The leading dimension of the array x11. ldx11 ≥ max(1,p).
ldx21 INTEGER. The leading dimension of the array x21. ldx21 ≥ max(1,m - p).
ldu1 INTEGER. The leading dimension of the array u1. If jobu1 = 'Y', ldu1 ≥
max(1,p).
ldu2 INTEGER. The leading dimension of the array u2. If jobu2 = 'Y', ldu2 ≥
max(1,m-p).
ldv1t INTEGER. The leading dimension of the array v1t. If jobv1t = 'Y', ldv1t ≥
max(1,q).
Output Parameters
work On exit,
If info > 0, work(2:r) contains the values phi(1), ...,
phi(r-1) that, together with theta(1), ...,
theta(r) define the matrix in intermediate
bidiagonal-block form remaining after
nonconvergence. info specifies the number of
nonzero phi's.
rwork On exit,
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value
See Also
?bbcsd
xerbla
?sygv
Computes all eigenvalues and, optionally,
eigenvectors of a real generalized symmetric definite
eigenproblem.
Syntax
call ssygv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, info)
call dsygv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, info)
call sygv(a, b, w [,itype] [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a real generalized symmetric-
definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be symmetric and B is also positive definite.
Input Parameters
lwork INTEGER.
The dimension of the array work;
lwork ≥ max(1, 3n-1).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
if itype = 3, Z^T*inv(B)*Z = I;
If jobz = 'N', then on exit the upper triangle (if uplo = 'U') or the
lower triangle (if uplo = 'L') of A, including the diagonal, is destroyed.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For optimum performance use lwork ≥ (nb+2)*n, where nb is the blocksize for ssytrd/dsytrd returned by
ilaenv.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run or
set lwork = -1.
If lwork has any admissible size, that is, no less than the minimal value described, the
routine completes the task, though probably not as fast as with the recommended workspace, and provides the
recommended workspace size in the first element of the array work on exit. Use this value
(work(1)) for subsequent runs.
If lwork = -1, the routine returns immediately and provides the recommended workspace size
in the first element of the array work. This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
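The workspace query described above can be sketched as a two-call pattern. The example below is a minimal illustration, not part of the reference text: the 3-by-3 test matrices and the program name are made up, the FORTRAN 77 interface is called with default (LP64) integers, and the program is assumed to be linked against Intel MKL or another LAPACK implementation.

```fortran
program sygv_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), b(n,n), w(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info
  external :: dsygv

  ! Illustrative data: A is symmetric, B is symmetric positive definite.
  a = reshape([ 2.0_dp, 1.0_dp, 0.0_dp,  &
                1.0_dp, 3.0_dp, 1.0_dp,  &
                0.0_dp, 1.0_dp, 4.0_dp ], [n, n])
  b = reshape([ 4.0_dp, 1.0_dp, 0.0_dp,  &
                1.0_dp, 3.0_dp, 0.0_dp,  &
                0.0_dp, 0.0_dp, 2.0_dp ], [n, n])

  ! Call 1: workspace query. With lwork = -1 the routine only reports the
  ! recommended workspace size in the first element of the work argument.
  call dsygv(1, 'V', 'U', n, a, n, b, n, w, wquery, -1, info)
  if (info /= 0) stop 'workspace query failed'

  ! Never go below the documented minimum max(1, 3n-1).
  lwork = max(1, 3*n - 1, int(wquery(1)))
  allocate(work(lwork))

  ! Call 2: solve A*x = lambda*B*x (itype = 1) with eigenvectors.
  call dsygv(1, 'V', 'U', n, a, n, b, n, w, work, lwork, info)
  if (info /= 0) then
     print '(a,i0)', 'dsygv failed, info = ', info
  else
     print '(a,*(f12.6))', 'eigenvalues:', w
  end if
end program sygv_workspace_query
```

On successful exit with jobz = 'V', a holds the eigenvectors and b holds the Cholesky factor of B, so keep copies of the inputs if they are needed later.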
?hegv
Computes all eigenvalues and, optionally,
eigenvectors of a complex generalized Hermitian
positive-definite eigenproblem.
Syntax
call chegv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, info)
call zhegv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, info)
call hegv(a, b, w [,itype] [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be Hermitian and B is also positive definite.
Input Parameters
lwork INTEGER.
The dimension of the array work; lwork ≥ max(1, 2n-1).
Output Parameters
if itype = 3, Z^H*inv(B)*Z = I;
If jobz = 'N', then on exit the upper triangle (if uplo = 'U') or the
lower triangle (if uplo = 'L') of A, including the diagonal, is destroyed.
If info = 0, contains the eigenvalues in ascending order.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
For optimum performance use lwork ≥ (nb+1)*n, where nb is the blocksize for chetrd/zhetrd returned by
ilaenv.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size, that is, no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
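For the complex flavor, the same query pattern applies, with one difference: work is complex and an additional real array rwork is passed. The sketch below is illustrative only. The 2-by-2 matrices are made up, the rwork size max(1, 3n-2) is taken from the reference LAPACK documentation rather than from the text above, and default (LP64) integers are assumed.

```fortran
program hegv_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  complex(dp) :: a(n,n), b(n,n), wquery(1)
  complex(dp), allocatable :: work(:)
  ! rwork size follows the reference LAPACK documentation (assumption).
  real(dp) :: w(n), rwork(max(1, 3*n - 2))
  integer :: lwork, info
  external :: zhegv

  ! Illustrative data: A is Hermitian, B is Hermitian positive definite.
  a(1,1) = (2.0_dp, 0.0_dp);  a(1,2) = (0.0_dp, 1.0_dp)
  a(2,1) = (0.0_dp, -1.0_dp); a(2,2) = (3.0_dp, 0.0_dp)
  b(1,1) = (2.0_dp, 0.0_dp);  b(1,2) = (0.0_dp, 0.0_dp)
  b(2,1) = (0.0_dp, 0.0_dp);  b(2,2) = (1.0_dp, 0.0_dp)

  ! Workspace query: the recommended size comes back in the real part
  ! of the first element of the complex work argument.
  call zhegv(1, 'N', 'U', n, a, n, b, n, w, wquery, -1, rwork, info)
  if (info /= 0) stop 'workspace query failed'
  lwork = max(1, 2*n - 1, int(real(wquery(1))))
  allocate(work(lwork))

  ! Eigenvalues only (jobz = 'N') of A*x = lambda*B*x.
  call zhegv(1, 'N', 'U', n, a, n, b, n, w, work, lwork, rwork, info)
  if (info /= 0) then
     print '(a,i0)', 'zhegv failed, info = ', info
  else
     print '(a,*(f12.6))', 'eigenvalues:', w
  end if
end program hegv_example
```

The eigenvalues are real even though the matrices are complex, which is why w is declared as a real array.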
?sygvd
Computes all eigenvalues and, optionally,
eigenvectors of a real generalized symmetric definite
eigenproblem using a divide and conquer method.
Syntax
call ssygvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, iwork, liwork, info)
call dsygvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, iwork, liwork, info)
call sygvd(a, b, w [,itype] [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a real generalized symmetric-
definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be symmetric and B is also positive definite.
It uses a divide and conquer algorithm.
Input Parameters
work is a workspace array, its dimension max(1, lwork).
lwork INTEGER.
The dimension of the array work.
Constraints:
If n ≤ 1, lwork ≥ 1;
iwork INTEGER.
Workspace array, its dimension max(1, liwork).
liwork INTEGER.
The dimension of the array iwork.
Constraints:
If n ≤ 1, liwork ≥ 1;
Output Parameters
if itype = 3, Z^T*inv(B)*Z = I;
If jobz = 'N', then on exit the upper triangle (if uplo = 'U') or the
lower triangle (if uplo = 'L') of A, including the diagonal, is destroyed.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
For info ≤ n:
Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any admissible size, that is, no less than the minimal value described, the
routine completes the task, though probably not as fast as with the recommended workspace, and provides the
recommended workspace size in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace
in the first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
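With the divide and conquer driver, two workspaces are sized by the same query: work and iwork. The following sketch shows one way to do this under the same assumptions as elsewhere in this chapter's examples (made-up 3-by-3 data, FORTRAN 77 interface, default LP64 integers, linked against MKL or another LAPACK).

```fortran
program sygvd_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), b(n,n), w(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: iquery(1), lwork, liwork, info
  external :: dsygvd

  ! Illustrative data: A is symmetric, B is symmetric positive definite.
  a = reshape([ 2.0_dp, 1.0_dp, 0.0_dp,  &
                1.0_dp, 3.0_dp, 1.0_dp,  &
                0.0_dp, 1.0_dp, 4.0_dp ], [n, n])
  b = reshape([ 4.0_dp, 1.0_dp, 0.0_dp,  &
                1.0_dp, 3.0_dp, 0.0_dp,  &
                0.0_dp, 0.0_dp, 2.0_dp ], [n, n])

  ! One query call sizes both arrays: lwork = -1 and liwork = -1.
  call dsygvd(1, 'V', 'U', n, a, n, b, n, w, wquery, -1, iquery, -1, info)
  if (info /= 0) stop 'workspace query failed'
  lwork  = max(1, int(wquery(1)))
  liwork = max(1, iquery(1))
  allocate(work(lwork), iwork(liwork))

  ! Actual computation with the sizes reported by the query.
  call dsygvd(1, 'V', 'U', n, a, n, b, n, w, work, lwork, iwork, liwork, info)
  if (info /= 0) then
     print '(a,i0)', 'dsygvd failed, info = ', info
  else
     print '(a,*(f12.6))', 'eigenvalues:', w
  end if
end program sygvd_workspace_query
```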
?hegvd
Computes all eigenvalues and, optionally,
eigenvectors of a complex generalized Hermitian
positive-definite eigenproblem using a divide and
conquer method.
Syntax
call chegvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, lrwork,
iwork, liwork, info)
call zhegvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, lrwork,
iwork, liwork, info)
call hegvd(a, b, w [,itype] [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be Hermitian and B is also positive definite.
It uses a divide and conquer algorithm.
Input Parameters
lwork INTEGER.
The dimension of the array work.
Constraints:
If n ≤ 1, lwork ≥ 1;
lrwork INTEGER.
The dimension of the array rwork.
Constraints:
If n ≤ 1, lrwork ≥ 1;
iwork INTEGER.
Workspace array, size max(1, liwork).
liwork INTEGER.
The dimension of the array iwork.
Constraints:
If n ≤ 1, liwork ≥ 1;
Output Parameters
if itype = 3, Z^H*inv(B)*Z = I;
If jobz = 'N', then on exit the upper triangle (if uplo = 'U') or the
lower triangle (if uplo = 'L') of A, including the diagonal, is destroyed.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
rwork(1) On exit, if info = 0, then rwork(1) returns the required minimal size of
lrwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
If info = i, and jobz = 'N', then the algorithm failed to converge; i off-
diagonal elements of an intermediate tridiagonal form did not converge to
zero;
if info = i, and jobz = 'V', then the algorithm failed to compute an
eigenvalue while working on the submatrix lying in rows and columns
info/(n+1) through mod(info, n+1).
If info = n + i, for 1 ≤ i ≤ n, then the leading minor of order i of B is not
positive-definite. The factorization of B could not be completed and no
eigenvalues or eigenvectors were computed.
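The info ranges described above can be told apart with a small helper. The sketch below does not call the library; the subroutine name report and the sample values are hypothetical, and the negative-info branch assumes the usual LAPACK convention that info = -i flags the i-th argument.

```fortran
program classify_gv_info
  implicit none
  integer, parameter :: n = 4
  integer, parameter :: samples(4) = [0, -3, 2, 7]
  integer :: k

  do k = 1, size(samples)
     call report(samples(k), n)
  end do

contains

  ! Print a short diagnosis for an info value returned by a
  ! generalized eigenproblem driver of order n.
  subroutine report(info, n)
    integer, intent(in) :: info, n
    if (info == 0) then
       print '(a)', 'success'
    else if (info < 0) then
       print '(a,i0,a)', 'argument ', -info, ' had an illegal value'
    else if (info <= n) then
       ! Failure inside the standard eigensolver after reduction.
       print '(a,i0)', 'eigensolver did not converge, info = ', info
    else
       ! info = n + i: Cholesky factorization of B broke down at minor i.
       print '(a,i0,a)', 'leading minor of order ', info - n, &
             ' of B is not positive definite'
    end if
  end subroutine report

end program classify_gv_info
```

For the sample values with n = 4, the last case (info = 7) corresponds to the leading minor of order 3 of B.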
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork (liwork or lrwork) for the
first run or set lwork = -1 (liwork = -1, lrwork = -1).
If you choose the first option and set lwork (liwork or lrwork) to any admissible size, that is, no less than
the minimal value described, the routine completes the task, though probably not as fast as with the
recommended workspace, and provides the recommended workspace size in the first element of the
corresponding array (work, iwork, rwork) on exit. Use this value (work(1), iwork(1), rwork(1)) for
subsequent runs.
If you set lwork = -1 (liwork = -1, lrwork = -1), the routine returns immediately and provides the
recommended workspace in the first element of the corresponding array (work, iwork, rwork). This operation
is called a workspace query.
Note that if you set lwork (liwork, lrwork) to less than the minimal required value and not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
?sygvx
Computes selected eigenvalues and, optionally,
eigenvectors of a real generalized symmetric definite
eigenproblem.
Syntax
call ssygvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu, abstol, m,
w, z, ldz, work, lwork, iwork, ifail, info)
call dsygvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu, abstol, m,
w, z, ldz, work, lwork, iwork, ifail, info)
call sygvx(a, b, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a real generalized symmetric-
definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be symmetric and B is also positive definite. Eigenvalues and eigenvectors can
be selected by specifying either a range of values or a range of indices for the desired eigenvalues.
Input Parameters
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0
if n = 0.
lwork INTEGER.
The dimension of the array work;
lwork ≥ max(1, 8n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.
iwork INTEGER.
Workspace array, size at least max(1, 5n).
Output Parameters
a On exit, the upper triangle (if uplo = 'U') or the lower triangle (if uplo =
'L') of A, including the diagonal, is overwritten.
b On exit, if info ≤ n, the part of b containing the matrix is overwritten by the
triangular factor U or L from the Cholesky factorization B = U^T*U or B =
L*L^T.
if itype = 3, Z^T*inv(B)*Z = I;
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of
width less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||_1 is used as tolerance, where T is the tridiagonal matrix
obtained by reducing C to tridiagonal form, where C is the symmetric matrix of the standard symmetric
problem to which the generalized problem is transformed. Eigenvalues will be computed most accurately
when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, set abstol to
2*?lamch('S').
For optimum performance use lwork ≥ (nb+3)*n, where nb is the blocksize for ssytrd/dsytrd returned by
ilaenv.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
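The notes above combine three things: selecting eigenvalues by index, setting abstol to twice the underflow threshold, and sizing work by a query. A minimal sketch of how these might fit together is shown below. The test matrices are made up, default (LP64) integers are assumed, and the program is assumed to be linked against MKL or another LAPACK implementation that provides dsygvx and dlamch.

```fortran
program sygvx_selected
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), b(n,n), w(n), z(n,n), wquery(1)
  real(dp), allocatable :: work(:)
  real(dp) :: abstol, vl, vu
  integer :: iwork(5*n), ifail(n)
  integer :: il, iu, m, lwork, info
  real(dp), external :: dlamch
  external :: dsygvx

  ! Illustrative data: A is symmetric, B is symmetric positive definite.
  a = reshape([ 2.0_dp, 1.0_dp, 0.0_dp,  &
                1.0_dp, 3.0_dp, 1.0_dp,  &
                0.0_dp, 1.0_dp, 4.0_dp ], [n, n])
  b = reshape([ 4.0_dp, 1.0_dp, 0.0_dp,  &
                1.0_dp, 3.0_dp, 0.0_dp,  &
                0.0_dp, 0.0_dp, 2.0_dp ], [n, n])

  il = 1; iu = 2                  ! the two smallest eigenvalues (range = 'I')
  vl = 0.0_dp; vu = 0.0_dp        ! not used when range = 'I'
  abstol = 2.0_dp * dlamch('S')   ! twice the underflow threshold

  ! Workspace query, then the real call with the reported size.
  call dsygvx(1, 'V', 'I', 'U', n, a, n, b, n, vl, vu, il, iu, abstol, &
              m, w, z, n, wquery, -1, iwork, ifail, info)
  if (info /= 0) stop 'workspace query failed'
  lwork = max(1, 8*n, int(wquery(1)))
  allocate(work(lwork))

  call dsygvx(1, 'V', 'I', 'U', n, a, n, b, n, vl, vu, il, iu, abstol, &
              m, w, z, n, work, lwork, iwork, ifail, info)

  if (info == 0) then
     print '(a,i0)', 'eigenvalues found: ', m
     print '(a,*(f12.6))', 'values:', w(1:m)
  else if (info > 0 .and. info <= n) then
     ! Nonzero entries of ifail are the indices of unconverged eigenvectors.
     print '(a,*(1x,i0))', 'unconverged eigenvectors:', &
           pack(ifail(1:m), ifail(1:m) /= 0)
  else
     print '(a,i0)', 'dsygvx failed, info = ', info
  end if
end program sygvx_selected
```

Only the first m entries of w and the first m columns of z are meaningful on exit.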
?hegvx
Computes selected eigenvalues and, optionally,
eigenvectors of a complex generalized Hermitian
positive-definite eigenproblem.
Syntax
call chegvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu, abstol, m,
w, z, ldz, work, lwork, rwork, iwork, ifail, info)
call zhegvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu, abstol, m,
w, z, ldz, work, lwork, rwork, iwork, ifail, info)
call hegvx(a, b, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be Hermitian and B is also positive definite. Eigenvalues and eigenvectors can
be selected by specifying either a range of values or a range of indices for the desired eigenvalues.
Input Parameters
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0
if n = 0.
lwork INTEGER.
The dimension of the array work; lwork ≥ max(1, 2n).
iwork INTEGER.
Workspace array, size at least max(1, 5n).
Output Parameters
a On exit, the upper triangle (if uplo = 'U') or the lower triangle (if uplo =
'L') of A, including the diagonal, is overwritten.
if itype = 3, Z^H*inv(B)*Z = I;
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
a Holds the matrix A of size (n, n).
z Holds the matrix Z of size (n, n), where the values n and m are significant.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||_1 will be used in its place, where T is the tridiagonal
matrix obtained by reducing C to tridiagonal form, where C is the symmetric matrix of the standard
symmetric problem to which the generalized problem is transformed. Eigenvalues will be computed most
accurately when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
For optimum performance use lwork ≥ (nb+1)*n, where nb is the blocksize for chetrd/zhetrd returned by
ilaenv.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size, that is, no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
?spgv
Computes all eigenvalues and, optionally,
eigenvectors of a real generalized symmetric definite
eigenproblem with matrices in packed storage.
Syntax
call sspgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, info)
call dspgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, info)
call spgv(ap, bp, w [,itype] [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a real generalized symmetric-
definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be symmetric, stored in packed format, and B is also positive definite.
Input Parameters
Arrays:
ap(*) contains the packed upper or lower triangle of the symmetric matrix
A, as specified by uplo.
The dimension of ap must be at least max(1, n*(n+1)/2).
bp(*) contains the packed upper or lower triangle of the symmetric matrix
B, as specified by uplo.
The dimension of bp must be at least max(1, n*(n+1)/2).
work(*) is a workspace array, size at least max(1, 3n).
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
Output Parameters
z(ldz,*) .
The second dimension of z must be at least max(1, n).
If jobz = 'V', then if info = 0, z contains the matrix Z of eigenvectors.
The eigenvectors are normalized as follows:
if itype = 1 or 2, Z^T*B*Z = I;
if itype = 3, Z^T*inv(B)*Z = I;
info INTEGER.
If info = 0, the execution is successful.
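The packed arrays ap and bp can be filled from full matrices as sketched below. The index formula used here, ap(i + (j-1)*j/2) = A(i,j) for i ≤ j with uplo = 'U', is the standard LAPACK packed storage convention and is not stated in the text above, so treat it as an assumption to confirm against the matrix storage schemes section. The test data is made up and default (LP64) integers are assumed.

```fortran
program spgv_packed
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), b(n,n)
  real(dp) :: ap(n*(n+1)/2), bp(n*(n+1)/2)
  real(dp) :: w(n), z(n,n), work(3*n)
  integer :: i, j, info
  external :: dspgv

  ! Illustrative data: A is symmetric, B is symmetric positive definite.
  a = reshape([ 2.0_dp, 1.0_dp, 0.0_dp,  &
                1.0_dp, 3.0_dp, 1.0_dp,  &
                0.0_dp, 1.0_dp, 4.0_dp ], [n, n])
  b = reshape([ 4.0_dp, 1.0_dp, 0.0_dp,  &
                1.0_dp, 3.0_dp, 0.0_dp,  &
                0.0_dp, 0.0_dp, 2.0_dp ], [n, n])

  ! Pack the upper triangles column by column (uplo = 'U').
  do j = 1, n
     do i = 1, j
        ap(i + (j-1)*j/2) = a(i,j)
        bp(i + (j-1)*j/2) = b(i,j)
     end do
  end do

  ! Solve A*x = lambda*B*x; no lwork argument, work has fixed size 3n.
  call dspgv(1, 'V', 'U', n, ap, bp, w, z, n, work, info)
  if (info /= 0) then
     print '(a,i0)', 'dspgv failed, info = ', info
  else
     print '(a,*(f12.6))', 'eigenvalues:', w
  end if
end program spgv_packed
```

Unlike ?sygv, the packed driver takes no lwork argument, so no workspace query is involved.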
?hpgv
Computes all eigenvalues and, optionally,
eigenvectors of a complex generalized Hermitian
positive-definite eigenproblem with matrices in packed
storage.
Syntax
call chpgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, rwork, info)
call zhpgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, rwork, info)
call hpgv(ap, bp, w [,itype] [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be Hermitian, stored in packed format, and B is also positive definite.
Input Parameters
jobz CHARACTER*1. Must be 'N' or 'V'.
If jobz = 'N', then compute eigenvalues only.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
Output Parameters
if itype = 1 or 2, Z^H*B*Z = I;
if itype = 3, Z^H*inv(B)*Z = I;
info INTEGER.
If info = 0, the execution is successful.
?spgvd
Computes all eigenvalues and, optionally,
eigenvectors of a real generalized symmetric definite
eigenproblem with matrices in packed storage using a
divide and conquer method.
Syntax
call sspgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, iwork, liwork, info)
call dspgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, iwork, liwork, info)
call spgvd(ap, bp, w [,itype] [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a real generalized symmetric-
definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be symmetric, stored in packed format, and B is also positive definite.
If eigenvectors are desired, it uses a divide and conquer algorithm.
Input Parameters
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
lwork INTEGER.
The dimension of the array work.
Constraints:
If n ≤ 1, lwork ≥ 1;
iwork INTEGER.
Workspace array, dimension max(1, liwork).
liwork INTEGER.
The dimension of the array iwork.
Constraints:
If n ≤ 1, liwork ≥ 1;
Output Parameters
z(ldz,*).
The second dimension of z must be at least max(1, n).
If jobz = 'V', then if info = 0, z contains the matrix Z of eigenvectors.
The eigenvectors are normalized as follows:
if itype = 1 or 2, Z^T*B*Z = I;
if itype = 3, Z^T*inv(B)*Z = I;
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run, or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any admissible size, that is, no less than the minimal value described, then the
routine completes the task, though probably not as fast as with the recommended workspace, and provides the
recommended workspace size in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), then the routine returns immediately and provides the recommended
workspace in the first element of the corresponding array (work, iwork). This operation is called a workspace
query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, then the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
?hpgvd
Computes all eigenvalues and, optionally,
eigenvectors of a complex generalized Hermitian
positive-definite eigenproblem with matrices in packed
storage using a divide and conquer method.
Syntax
call chpgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, rwork, lrwork,
iwork, liwork, info)
call zhpgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, rwork, lrwork,
iwork, liwork, info)
call hpgvd(ap, bp, w [,itype] [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be Hermitian, stored in packed format, and B is also positive definite.
If eigenvectors are desired, it uses a divide and conquer algorithm.
Input Parameters
bp(*) contains the packed upper or lower triangle of the Hermitian matrix
B, as specified by uplo.
The dimension of bp must be at least max(1, n*(n+1)/2).
work is a workspace array, its dimension max(1, lwork).
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
lwork INTEGER.
The dimension of the array work.
Constraints:
If n ≤ 1, lwork ≥ 1;
lrwork INTEGER.
The dimension of the array rwork.
Constraints:
If n ≤ 1, lrwork ≥ 1;
iwork INTEGER.
Workspace array, its dimension max(1, liwork).
liwork INTEGER.
The dimension of the array iwork.
Constraints:
If n ≤ 1, liwork ≥ 1;
Output Parameters
if itype = 3, Z^H*inv(B)*Z = I;
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
rwork(1) On exit, if info = 0, then rwork(1) returns the required minimal size of
lrwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or restorable arguments, see LAPACK 95
Interface Conventions.
Specific details for the routine hpgvd interface are the following:
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork (liwork or lrwork) for the
first run or set lwork = -1 (liwork = -1, lrwork = -1).
If you choose the first option and set lwork (liwork or lrwork) to any admissible size, that is, no less than
the minimal value described, the routine completes the task, though probably not as fast as with the
recommended workspace, and provides the recommended workspace size in the first element of the
corresponding array (work, iwork, rwork) on exit. Use this value (work(1), iwork(1), rwork(1)) for
subsequent runs.
If you set lwork = -1 (liwork = -1, lrwork = -1), the routine returns immediately and provides the
recommended workspace in the first element of the corresponding array (work, iwork, rwork). This operation
is called a workspace query.
Note that if you set lwork (liwork, lrwork) to less than the minimal required value and not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
?spgvx
Computes selected eigenvalues and, optionally,
eigenvectors of a real generalized symmetric definite
eigenproblem with matrices in packed storage.
Syntax
call sspgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m, w, z,
ldz, work, iwork, ifail, info)
call dspgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m, w, z,
ldz, work, iwork, ifail, info)
call spgvx(ap, bp, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail]
[,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a real generalized symmetric-
definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be symmetric, stored in packed format, and B is also positive definite.
Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices for
the desired eigenvalues.
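The packed format mentioned above keeps only one triangle of the matrix, column by column, in a one-dimensional array of n*(n+1)/2 elements. The following is a minimal sketch that assumes the standard LAPACK packed layout. The module and procedure names (packed_demo, packed_index, pack_symmetric) are illustrative only and are not part of Intel MKL.

```fortran
module packed_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Position in ap of element (i,j) of the stored triangle.
  ! The caller must pass i <= j for uplo = 'U' and i >= j for uplo = 'L'.
  pure integer function packed_index(uplo, n, i, j) result(k)
    character, intent(in) :: uplo
    integer, intent(in) :: n, i, j
    if (uplo == 'U') then
      k = i + (j - 1)*j/2            ! upper triangle, column by column
    else
      k = i + (j - 1)*(2*n - j)/2    ! lower triangle, column by column
    end if
  end function packed_index

  ! Copy the chosen triangle of a dense symmetric matrix into packed form.
  pure function pack_symmetric(uplo, n, a) result(ap)
    character, intent(in) :: uplo
    integer, intent(in) :: n
    real(dp), intent(in) :: a(n, n)
    real(dp) :: ap(max(1, n*(n + 1)/2))
    integer :: i, j
    ap = 0.0_dp
    do j = 1, n
      if (uplo == 'U') then
        do i = 1, j
          ap(packed_index(uplo, n, i, j)) = a(i, j)
        end do
      else
        do i = j, n
          ap(packed_index(uplo, n, i, j)) = a(i, j)
        end do
      end if
    end do
  end function pack_symmetric
end module packed_demo
```

An array produced this way, for example ap = pack_symmetric('U', n, a), has the size required of the ap and bp arguments of the packed-storage drivers, provided the same uplo value is passed to the routine.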
Input Parameters
work(*) is a workspace array, size at least max(1, 8n).
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0
if n = 0.
iwork INTEGER.
Workspace array, size at least max(1, 5n).
Output Parameters
z(ldz,*) .
The second dimension of z must be at least max(1, n).
if itype = 3, Z^T*inv(B)*Z = I;
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, the ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
uplo Must be 'U' or 'L'. The default value is 'U'.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
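The rules above for restoring range from the optional arguments can be sketched with the Fortran present intrinsic. This is only an illustration of the selection logic, not the code of the LAPACK 95 wrapper. The names range_demo and restore_range are hypothetical, and the 'E' return value is an invented marker for the error condition, whose actual reporting is not described here.

```fortran
module range_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Returns 'V', 'I' or 'A' according to which optional arguments are present.
  ! Returns 'E' (hypothetical marker) when value and index bounds are mixed.
  function restore_range(vl, vu, il, iu) result(rng)
    real(dp), intent(in), optional :: vl, vu
    integer, intent(in), optional :: il, iu
    character :: rng
    logical :: by_value, by_index
    by_value = present(vl) .or. present(vu)
    by_index = present(il) .or. present(iu)
    if (by_value .and. by_index) then
      rng = 'E'   ! conflicting selection: error condition
    else if (by_value) then
      rng = 'V'   ! select by interval of eigenvalues
    else if (by_index) then
      rng = 'I'   ! select by index range il..iu
    else
      rng = 'A'   ! all eigenvalues
    end if
  end function restore_range
end module range_demo
```

For instance, restore_range(il=1, iu=3) yields 'I', while restore_range(vl=0.0_dp, il=1) yields the error marker.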
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||_1 is used instead, where T is the tridiagonal matrix
obtained by reducing A to tridiagonal form. Eigenvalues are computed most accurately when abstol is set to
twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, set abstol to
2*?lamch('S').
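The acceptance rule described in these notes can be written out as a small sketch. It is not the code used inside the routine. The names abstol_demo, recommended_abstol, effective_abstol, and interval_converged are illustrative, and the intrinsics tiny and epsilon are used as stand-ins for ?lamch('S') and the machine precision, which is an assumption that holds on typical IEEE systems but is not guaranteed by this reference.

```fortran
module abstol_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Tolerance suggested in the notes: twice the underflow threshold.
  ! tiny() stands in for dlamch('S') here (assumption).
  pure function recommended_abstol() result(tol)
    real(dp) :: tol
    tol = 2.0_dp*tiny(1.0_dp)
  end function recommended_abstol

  ! Tolerance actually applied: abstol if positive, otherwise eps*||T||_1.
  ! t_norm1 is the 1-norm of the tridiagonal matrix T, supplied by the caller.
  pure function effective_abstol(abstol, t_norm1) result(tol)
    real(dp), intent(in) :: abstol, t_norm1
    real(dp) :: tol
    if (abstol <= 0.0_dp) then
      tol = epsilon(1.0_dp)*t_norm1
    else
      tol = abstol
    end if
  end function effective_abstol

  ! An interval [a,b] is accepted when its width is at most
  ! abstol + eps*max(|a|,|b|); epsilon() stands in for the machine precision.
  pure logical function interval_converged(a, b, abstol) result(ok)
    real(dp), intent(in) :: a, b, abstol
    ok = (b - a) <= abstol + epsilon(1.0_dp)*max(abs(a), abs(b))
  end function interval_converged
end module abstol_demo
```

Passing recommended_abstol() as the abstol argument corresponds to the setting that the notes describe as most accurate.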
?hpgvx
Computes selected eigenvalues and, optionally,
eigenvectors of a generalized Hermitian positive-
definite eigenproblem with matrices in packed
storage.
Syntax
call chpgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m, w, z,
ldz, work, rwork, iwork, ifail, info)
call zhpgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m, w, z,
ldz, work, rwork, iwork, ifail, info)
call hpgvx(ap, bp, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail]
[,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite eigenproblem, of the form
A*x = λ*B*x, A*B*x = λ*x, or B*A*x = λ*x.
Here A and B are assumed to be Hermitian, stored in packed format, and B is also positive definite.
Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices for
the desired eigenvalues.
Input Parameters
work(*) is a workspace array, size at least max(1, 2n).
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0
if n = 0.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
iwork INTEGER.
Workspace array, size at least max(1, 5n).
Output Parameters
if itype = 3, Z^H*inv(B)*Z = I;
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, the ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
z Holds the matrix Z of size (n, n), where the values n and m are significant.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||_1 is used as tolerance, where T is the tridiagonal matrix
obtained by reducing A to tridiagonal form. Eigenvalues will be computed most accurately when abstol is set
to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
?sbgv
Computes all eigenvalues and, optionally,
eigenvectors of a real generalized symmetric definite
eigenproblem with banded matrices.
Syntax
call ssbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, info)
call dsbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, info)
call sbgv(ab, bb, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a real generalized
symmetric-definite banded eigenproblem, of the form A*x = λ*B*x. Here A and B are assumed to be symmetric and
banded, and B is also positive definite.
Input Parameters
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1 .
ldbb INTEGER. The leading dimension of the array bb; must be at least kb+1.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
Output Parameters
bb On exit, contains the factor S from the split Cholesky factorization B =
S^T*S, as returned by ?pbstf.
z(ldz,*) .
The second dimension of z must be at least max(1, n).
If jobz = 'V', then if info = 0, z contains the matrix Z of eigenvectors,
with the i-th column of z holding the eigenvector associated with w(i). The
eigenvectors are normalized so that Z^T*B*Z = I.
info INTEGER.
If info = 0, the execution is successful.
?hbgv
Computes all eigenvalues and, optionally,
eigenvectors of a complex generalized Hermitian
positive-definite eigenproblem with banded matrices.
Syntax
call chbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, rwork, info)
call zhbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, rwork, info)
call hbgv(ab, bb, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite banded eigenproblem, of the form A*x = λ*B*x. Here A and B are Hermitian and
banded matrices, and matrix B is also positive definite.
Input Parameters
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1.
ldbb INTEGER. The leading dimension of the array bb; must be at least kb+1.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?sbgvd
Computes all eigenvalues and, optionally,
eigenvectors of a real generalized symmetric definite
eigenproblem with banded matrices. If eigenvectors
are desired, it uses a divide and conquer method.
Syntax
call ssbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork, iwork,
liwork, info)
call dsbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork, iwork,
liwork, info)
call sbgvd(ab, bb, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a real generalized
symmetric-definite banded eigenproblem, of the form A*x = λ*B*x. Here A and B are assumed to be symmetric and
banded, and B is also positive definite.
If eigenvectors are desired, it uses a divide and conquer algorithm.
Input Parameters
(ka ≥ 0).
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1.
ldbb INTEGER. The leading dimension of the array bb; must be at least kb+1.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
lwork INTEGER.
The dimension of the array work.
Constraints:
If n ≤ 1, lwork ≥ 1;
iwork INTEGER.
Workspace array, its dimension max(1, liwork).
liwork INTEGER.
The dimension of the array iwork.
Constraints:
If n ≤ 1, liwork ≥ 1;
Output Parameters
z(ldz,*).
The second dimension of z must be at least max(1, n).
If jobz = 'V', then if info = 0, z contains the matrix Z of eigenvectors,
with the i-th column of z holding the eigenvector associated with w(i). The
eigenvectors are normalized so that Z^T*B*Z = I.
If jobz = 'N', then z is not referenced.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
ab Holds the array A of size (ka+1,n).
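A (ka+1)-by-n band array of this shape can be filled from a dense symmetric matrix as in the sketch below. It assumes the standard LAPACK band layout for the upper triangle, where element A(i,j) is stored in ab(ka+1+i-j, j) for max(1, j-ka) ≤ i ≤ j. The names band_demo and to_upper_band are illustrative and are not part of Intel MKL.

```fortran
module band_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Copy the upper band (ka superdiagonals) of a dense symmetric matrix
  ! into a (ka+1)-by-n band array. The main diagonal lands in row ka+1.
  pure function to_upper_band(n, ka, a) result(ab)
    integer, intent(in) :: n, ka
    real(dp), intent(in) :: a(n, n)
    real(dp) :: ab(ka + 1, n)
    integer :: i, j
    ab = 0.0_dp                      ! unused leading entries stay zero
    do j = 1, n
      do i = max(1, j - ka), j
        ab(ka + 1 + i - j, j) = a(i, j)
      end do
    end do
  end function to_upper_band
end module band_demo
```

The result corresponds to uplo = 'U'. Entries of a outside the band are ignored, so the caller is responsible for choosing ka large enough.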
Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any admissible size, that is, a size no less than the minimal value described, the
routine completes the task, though probably not as fast as with the recommended workspace, and returns the
recommended workspace size in the first element of the corresponding array (work, iwork) on exit. Use these
values (work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace
in the first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
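The two-step pattern described in these notes (workspace query, then the actual call) might look as follows for dsbgvd. This is a sketch under stated assumptions, not a reference implementation: the wrapper name banded_eigenvalues is hypothetical, only eigenvalues are requested (jobz = 'N') with upper band storage (uplo = 'U'), default 4-byte integers (the LP64 interface) are assumed, and the routine is declared external rather than taken from mkl.fi. The program must be linked against Intel MKL or another LAPACK library.

```fortran
module sbgvd_query_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Eigenvalues of A*x = lambda*B*x for banded A and B (upper band storage).
  ! ab and bb are overwritten by the routine.
  subroutine banded_eigenvalues(n, ka, kb, ab, bb, w, info)
    integer, intent(in) :: n, ka, kb
    real(dp), intent(inout) :: ab(ka + 1, n), bb(kb + 1, n)
    real(dp), intent(out) :: w(n)
    integer, intent(out) :: info
    real(dp) :: work_query(1), z_unused(1, 1)
    integer :: iwork_query(1), lwork, liwork
    real(dp), allocatable :: work(:)
    integer, allocatable :: iwork(:)
    external :: dsbgvd

    ! Step 1: workspace query (lwork = -1, liwork = -1).
    ! The routine returns immediately with sizes in work(1) and iwork(1).
    call dsbgvd('N', 'U', n, ka, kb, ab, ka + 1, bb, kb + 1, w, z_unused, 1, &
                work_query, -1, iwork_query, -1, info)
    if (info /= 0) return

    lwork = max(1, int(work_query(1)))
    liwork = max(1, iwork_query(1))
    allocate (work(lwork), iwork(liwork))

    ! Step 2: the actual computation with the reported workspace sizes.
    call dsbgvd('N', 'U', n, ka, kb, ab, ka + 1, bb, kb + 1, w, z_unused, 1, &
                work, lwork, iwork, liwork, info)
  end subroutine banded_eigenvalues
end module sbgvd_query_demo
```

Because z is not referenced when jobz = 'N', a 1-by-1 dummy array with ldz = 1 is passed. A caller that needs eigenvectors would pass jobz = 'V' and an n-by-n array in both calls.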
?hbgvd
Computes all eigenvalues and, optionally,
eigenvectors of a complex generalized Hermitian
positive-definite eigenproblem with banded matrices.
If eigenvectors are desired, it uses a divide and
conquer method.
Syntax
call chbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork, rwork,
lrwork, iwork, liwork, info)
call zhbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork, rwork,
lrwork, iwork, liwork, info)
call hbgvd(ab, bb, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite banded eigenproblem, of the form A*x = λ*B*x. Here A and B are assumed to be
Hermitian and banded, and B is also positive definite.
If eigenvectors are desired, it uses a divide and conquer algorithm.
Input Parameters
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1.
ldbb INTEGER. The leading dimension of the array bb; must be at least kb+1.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
lwork INTEGER.
The dimension of the array work.
Constraints:
If n ≤ 1, lwork ≥ 1;
rwork REAL for chbgvd
DOUBLE PRECISION for zhbgvd.
Workspace array, size max(1, lrwork).
lrwork INTEGER.
The dimension of the array rwork.
Constraints:
If n ≤ 1, lrwork ≥ 1;
iwork INTEGER.
Workspace array, size max(1, liwork).
liwork INTEGER.
The dimension of the array iwork.
Constraints:
If n ≤ 1, liwork ≥ 1;
Output Parameters
Array z(ldz,*).
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
rwork(1) On exit, if info = 0, then rwork(1) returns the required minimal size of
lrwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork (liwork or lrwork) for the
first run or set lwork = -1 (liwork = -1, lrwork = -1).
If you choose the first option and set lwork (liwork, lrwork) to any admissible size, that is, a size no less than
the minimal value described, the routine completes the task, though probably not as fast as with the
recommended workspace, and returns the recommended workspace size in the first element of the
corresponding array (work, iwork, rwork) on exit. Use these values (work(1), iwork(1), rwork(1)) for
subsequent runs.
If you set lwork = -1 (liwork = -1, lrwork = -1), the routine returns immediately and provides the
recommended workspace in the first element of the corresponding array (work, iwork, rwork). This operation
is called a workspace query.
Note that if you set lwork (liwork, lrwork) to less than the minimal required value and not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
?sbgvx
Computes selected eigenvalues and, optionally,
eigenvectors of a real generalized symmetric definite
eigenproblem with banded matrices.
Syntax
call ssbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu, il, iu,
abstol, m, w, z, ldz, work, iwork, ifail, info)
call dsbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu, il, iu,
abstol, m, w, z, ldz, work, iwork, ifail, info)
call sbgvx(ab, bb, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a real generalized
symmetric-definite banded eigenproblem, of the form A*x = λ*B*x. Here A and B are assumed to be symmetric and
banded, and B is also positive definite. Eigenvalues and eigenvectors can be selected by specifying either all
eigenvalues, a range of values or a range of indices for the desired eigenvalues.
Input Parameters
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1.
ldbb INTEGER. The leading dimension of the array bb; must be at least kb+1.
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0
if n = 0.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
ldq INTEGER. The leading dimension of the output array q; ldq ≥ 1.
If jobz = 'V', ldq ≥ max(1, n).
iwork INTEGER.
Workspace array, size (5*n).
Output Parameters
z(ldz,*) .
The second dimension of z must be at least max(1, n).
If jobz = 'V', then if info = 0, z contains the matrix Z of eigenvectors,
with the i-th column of z holding the eigenvector associated with w(i). The
eigenvectors are normalized so that Z^T*B*Z = I.
q(ldq,*) .
The second dimension of q must be at least max(1, n).
If jobz = 'V', then q contains the n-by-n matrix used in the reduction of
A*x = lambda*B*x to standard form, that is, C*x= lambda*x and
consequently C to tridiagonal form.
If jobz = 'N', then q is not referenced.
ifail INTEGER.
Array, size (m).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, the ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||_1 is used as tolerance, where T is the tridiagonal matrix
obtained by reducing A to tridiagonal form. Eigenvalues will be computed most accurately when abstol is set
to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
?hbgvx
Computes selected eigenvalues and, optionally,
eigenvectors of a complex generalized Hermitian
positive-definite eigenproblem with banded matrices.
Syntax
call chbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu, il, iu,
abstol, m, w, z, ldz, work, rwork, iwork, ifail, info)
call zhbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu, il, iu,
abstol, m, w, z, ldz, work, rwork, iwork, ifail, info)
call hbgvx(ab, bb, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite banded eigenproblem, of the form A*x = λ*B*x. Here A and B are assumed to be
Hermitian and banded, and B is also positive definite. Eigenvalues and eigenvectors can be selected by
specifying either all eigenvalues, a range of values or a range of indices for the desired eigenvalues.
Input Parameters
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1.
ldbb INTEGER. The leading dimension of the array bb; must be at least kb+1.
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0
if n = 0.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', ldz ≥ max(1, n).
ldq INTEGER. The leading dimension of the output array q; ldq ≥ 1. If jobz =
'V', ldq ≥ max(1, n).
iwork INTEGER.
Workspace array, size at least max(1, 5n).
Output Parameters
q(ldq,*).
The second dimension of q must be at least max(1, n).
If jobz = 'V', then q contains the n-by-n matrix used in the reduction of
A*x = λ*B*x to standard form, that is, C*x = λ*x and consequently C to
tridiagonal form.
If jobz = 'N', then q is not referenced.
ifail INTEGER.
Array, size at least max(1, n).
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, the ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = 0, the execution is successful.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T||_1 will be used in its place, where T is the tridiagonal
matrix obtained by reducing A to tridiagonal form. Eigenvalues will be computed most accurately when abstol
is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
gges Computes the generalized eigenvalues, Schur form, and the left and/or right Schur
vectors for a pair of nonsymmetric matrices.
ggesx Computes the generalized eigenvalues, Schur form, and, optionally, the left and/or
right matrices of Schur vectors.
ggev Computes the generalized eigenvalues, and the left and/or right generalized
eigenvectors for a pair of nonsymmetric matrices.
ggevx Computes the generalized eigenvalues, and, optionally, the left and/or right
generalized eigenvectors.
?gges
Computes the generalized eigenvalues, Schur form,
and the left and/or right Schur vectors for a pair of
nonsymmetric matrices.
Syntax
call sgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar, alphai,
beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info)
call dgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar, alphai,
beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info)
call cgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta, vsl,
ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info)
call zgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta, vsl,
ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info)
call gges(a, b, alphar, alphai, beta [,vsl] [,vsr] [,select] [,sdim] [,info])
call gges(a, b, alpha, beta [, vsl] [,vsr] [,select] [,sdim] [,info])
Include Files
mkl.fi, lapack.f90
Description
The ?gges routine computes the generalized eigenvalues, the generalized real/complex Schur form (S,T),
optionally, the left and/or right matrices of Schur vectors (vsl and vsr) for a pair of n-by-n real/complex
nonsymmetric matrices (A,B). This gives the generalized Schur factorization
(A,B) = (vsl*S*vsr^H, vsl*T*vsr^H)
Optionally, it also orders the eigenvalues so that a selected cluster of eigenvalues appears in the leading
diagonal blocks of the upper quasi-triangular matrix S and the upper triangular matrix T. The leading
columns of vsl and vsr then form an orthonormal/unitary basis for the corresponding left and right
eigenspaces (deflating subspaces).
If only the generalized eigenvalues are needed, use the driver ggev instead, which is faster.
A generalized eigenvalue for a pair of matrices (A,B) is a scalar w or a ratio alpha / beta = w, such that A -
w*B is singular. It is usually represented as the pair (alpha, beta), as there is a reasonable interpretation
for beta=0 or for both being zero. A pair of matrices (S,T) is in the generalized real Schur form if T is upper
triangular with non-negative diagonal and S is block upper triangular with 1-by-1 and 2-by-2 blocks. 1-by-1
blocks correspond to real generalized eigenvalues, while 2-by-2 blocks of S are "standardized" by making the
corresponding elements of T have the form:
[ a 0 ]
[ 0 b ]
and the pair of corresponding 2-by-2 blocks in S and T will have a complex conjugate pair of generalized
eigenvalues. A pair of matrices (S,T) is in generalized complex Schur form if S and T are upper triangular
and, in addition, the diagonal elements of T are non-negative real numbers.
The ?gges routine replaces the deprecated ?gegs routine.
Input Parameters
sort CHARACTER*1. Must be 'N' or 'S'. Specifies whether or not to order the
eigenvalues on the diagonal of the generalized Schur form.
If sort = 'N', then eigenvalues are not ordered.
If sort = 'S', selctg is used to select eigenvalues to sort to the top left
of the Schur form.
If sort = 'N', selctg is not referenced.
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
ldb INTEGER. The leading dimension of the array b. Must be at least max(1, n).
ldvsl, ldvsr INTEGER. The leading dimensions of the output matrices vsl and vsr,
respectively. Constraints:
lwork INTEGER.
The dimension of the array work.
lwork ≥ max(1, 8n+16) for real flavors;
lwork ≥ max(1, 2n) for complex flavors.
For good performance, lwork must generally be larger.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
bwork LOGICAL.
Workspace array, size at least max(1, n).
Not referenced if sort = 'N'.
Output Parameters
a On exit, this array has been overwritten by its generalized Schur form S.
b On exit, this array has been overwritten by its generalized Schur form T.
sdim INTEGER.
If sort = 'N', sdim= 0.
Note that for real flavors complex conjugate pairs for which selctg is true
for either eigenvalue count as 2.
beta REAL for sgges
DOUBLE PRECISION for dgges
COMPLEX for cgges
DOUBLE COMPLEX for zgges.
Array, size at least max(1, n).
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,..., n, will be the generalized
eigenvalues.
alphar(j) + alphai(j)*i and beta(j), j=1,..., n are the diagonals of the
complex Schur form (S,T) that would result if the 2-by-2 diagonal blocks of
the real generalized Schur form of (A,B) were further reduced to triangular
form using complex unitary transformations. If alphai(j) is zero, then the j-
th eigenvalue is real; if positive, then the j-th and (j+1)-st eigenvalues are
a complex conjugate pair, with alphai(j+1) negative.
For complex flavors:
On exit, alpha(j)/beta(j), j=1,..., n, will be the generalized eigenvalues.
alpha(j) and beta(j), j=1,..., n are the diagonals of the complex Schur
form (S,T) output by cgges/zgges. The beta(j) will be non-negative real.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
If info = i, and
i ≤ n:
the QZ iteration failed. (A, B) is not in Schur form, but alphar(j), alphai(j)
(for real flavors), or alpha(j) (for complex flavors), and beta(j),
j=info+1,..., n should be correct.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set lwork to any admissible size, that is, a size no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The quotients alphar(j)/beta(j) and alphai(j)/beta(j) may easily over- or underflow, and beta(j) may even be
zero. Thus, you should avoid simply computing the ratio. However, alphar and alphai will be always less than
and usually comparable with norm(A) in magnitude, and beta always less than and usually comparable with
norm(B).
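One way to follow the advice about the quotients is to inspect beta before dividing. The sketch below is an illustration only. The module name gen_eig_demo, the routine classify_eigenvalue, and the three EIG_ constants are hypothetical, and treating a quotient that would overflow as infinite is a design choice of this example rather than behavior of ?gges.

```fortran
module gen_eig_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: EIG_FINITE = 0, EIG_INFINITE = 1, EIG_UNDETERMINED = 2
contains
  ! Classify the generalized eigenvalue (alphar + alphai*i)/beta and form
  ! the quotient only when it is representable.
  subroutine classify_eigenvalue(alphar, alphai, beta, kind_out, lambda)
    real(dp), intent(in) :: alphar, alphai, beta
    integer, intent(out) :: kind_out
    complex(dp), intent(out) :: lambda
    real(dp) :: amag
    lambda = (0.0_dp, 0.0_dp)
    amag = max(abs(alphar), abs(alphai))
    if (beta == 0.0_dp) then
      if (amag == 0.0_dp) then
        kind_out = EIG_UNDETERMINED      ! alpha = beta = 0
      else
        kind_out = EIG_INFINITE          ! beta = 0, alpha nonzero
      end if
    else if (abs(beta) < 1.0_dp .and. amag > abs(beta)*huge(1.0_dp)) then
      kind_out = EIG_INFINITE            ! quotient would overflow
    else
      kind_out = EIG_FINITE
      lambda = cmplx(alphar/beta, alphai/beta, kind=dp)
    end if
  end subroutine classify_eigenvalue
end module gen_eig_demo
```

For the complex flavors the same test applies with the real and imaginary parts of alpha(j) and the real part of beta(j) passed in. Many applications can avoid the division altogether by working with the pair (alpha, beta) directly.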
?ggesx
Computes the generalized eigenvalues, Schur form,
and, optionally, the left and/or right matrices of Schur
vectors.
Syntax
call sggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim, alphar,
alphai, beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, iwork, liwork,
bwork, info)
call dggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim, alphar,
alphai, beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, iwork, liwork,
bwork, info)
call cggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim, alpha,
beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, rwork, iwork, liwork,
bwork, info)
call zggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim, alpha,
beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, rwork, iwork, liwork,
bwork, info)
call ggesx(a, b, alphar, alphai, beta [,vsl] [,vsr] [,select] [,sdim] [,rconde] [,
rcondv] [,info])
call ggesx(a, b, alpha, beta [, vsl] [,vsr] [,select] [,sdim] [,rconde] [,rcondv] [,
info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for a pair of n-by-n real/complex nonsymmetric matrices (A,B), the generalized
eigenvalues, the generalized real/complex Schur form (S,T), optionally, the left and/or right matrices of
Schur vectors (vsl and vsr). This gives the generalized Schur factorization
(A,B) = (vsl*S*vsr^H, vsl*T*vsr^H)
Optionally, it also orders the eigenvalues so that a selected cluster of eigenvalues appears in the leading
diagonal blocks of the upper quasi-triangular matrix S and the upper triangular matrix T; computes a
reciprocal condition number for the average of the selected eigenvalues (rconde); and computes a reciprocal
condition number for the right and left deflating subspaces corresponding to the selected eigenvalues
(rcondv). The leading columns of vsl and vsr then form an orthonormal/unitary basis for the corresponding
left and right eigenspaces (deflating subspaces).
A generalized eigenvalue for a pair of matrices (A,B) is a scalar w or a ratio alpha / beta = w, such that A
- w*B is singular. It is usually represented as the pair (alpha, beta), as there is a reasonable interpretation
for beta=0 or for both being zero. A pair of matrices (S,T) is in generalized real Schur form if T is upper
triangular with non-negative diagonal and S is block upper triangular with 1-by-1 and 2-by-2 blocks. 1-by-1
blocks correspond to real generalized eigenvalues, while 2-by-2 blocks of S will be "standardized" by making
the corresponding elements of T have the form:
[ a 0 ]
[ 0 b ]
and the pair of corresponding 2-by-2 blocks in S and T will have a complex conjugate pair of generalized
eigenvalues. A pair of matrices (S,T) is in generalized complex Schur form if S and T are upper triangular
and, in addition, the diagonal of T are non-negative real numbers.
Input Parameters
sort CHARACTER*1. Must be 'N' or 'S'. Specifies whether or not to order the
eigenvalues on the diagonal of the generalized Schur form.
If sort = 'N', then eigenvalues are not ordered.
If sort = 'S', selctg is used to select eigenvalues to sort to the top left
of the Schur form.
If sort = 'N', selctg is not referenced.
Note that a selected complex eigenvalue may no longer satisfy
selctg(alpha(j), beta(j)) = .TRUE. after ordering, since ordering
may change the value of complex eigenvalues (especially if the eigenvalue
is ill-conditioned); in this case info is set to n+2 (see info below).
sense CHARACTER*1. Must be 'N', 'E', 'V', or 'B'. Determines which reciprocal
condition numbers are computed.
If sense = 'N', none are computed;
ldvsl, ldvsr INTEGER. The leading dimensions of the output matrices vsl and vsr,
respectively. Constraints:
ldvsl ≥ 1. If jobvsl = 'V', ldvsl ≥ max(1, n).
ldvsr ≥ 1. If jobvsr = 'V', ldvsr ≥ max(1, n).
lwork INTEGER.
The dimension of the array work.
For real flavors:
If n=0 then lwork ≥ 1.
If n>0 and sense = 'E', 'V', or 'B', then lwork ≥ max(8*n, 6*n+16,
2*sdim*(n-sdim));
For complex flavors:
If n=0 then lwork ≥ 1.
If n>0 and sense = 'E', 'V', or 'B', then lwork ≥ max(1, 2*n,
2*sdim*(n-sdim)).
Note that 2*sdim*(n-sdim) ≤ n*n/2.
iwork INTEGER.
Workspace array, size max(1, liwork).
liwork INTEGER.
The dimension of the array iwork.
If sense = 'N', or n=0, then liwork ≥ 1,
otherwise liwork ≥ (n+6) for real flavors, and liwork ≥ (n+2) for complex
flavors.
If liwork=-1, then a workspace query is assumed; the routine only
calculates the bound on the optimal size of the work array and the
minimum size of the iwork array, returns these values as the first entries of
the work and iwork arrays, and no error message related to lwork or
liwork is issued by xerbla.
bwork LOGICAL.
Workspace array, size at least max(1, n).
Not referenced if sort = 'N'.
Output Parameters
a On exit, this array has been overwritten by its generalized Schur form S.
b On exit, this array has been overwritten by its generalized Schur form T.
sdim INTEGER.
If sort = 'N', sdim= 0.
Note that for real flavors complex conjugate pairs for which selctg is true
for either eigenvalue count as 2.
If jobvsl = 'V', this array will contain the left Schur vectors.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 0, the execution is successful.
If info = i, and
i ≤ n:
the QZ iteration failed. (A, B) is not in Schur form, but alphar(j), alphai(j)
(for real flavors), or alpha(j) (for complex flavors), and beta(j), j=info+1,..., n
should be correct.
i > n: errors that usually indicate LAPACK problems:
i = n+1: other than QZ iteration failed in ?hgeqz;
i = n+2: after reordering, roundoff changed values of some complex
eigenvalues so that leading eigenvalues in the generalized Schur form no
longer satisfy selctg = .TRUE.. This could also be caused by scaling;
b Holds the matrix B of size (n, n).
sense Restored based on the presence of arguments rconde and rcondv as follows:
Note that there will be an error condition if rconde or rcondv are present and select is omitted.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork (or liwork) for the first run
or set lwork = -1 (liwork = -1).
If you choose the first option and set any admissible lwork (or liwork) size, that is, one no less than the
minimal value described, the routine completes the task, though probably not as fast as with the recommended
workspace, and provides the recommended workspace in the first element of the corresponding array (work,
iwork) on exit. Use this value (work(1), iwork(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if you set lwork (liwork) to less than the minimal required value and not -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The quotients alphar(j)/beta(j) and alphai(j)/beta(j) may easily over- or underflow, and beta(j) may even be
zero. Thus, you should avoid simply computing the ratio. However, alphar and alphai will always be less than,
and usually comparable with, norm(A) in magnitude, and beta will always be less than, and usually comparable
with, norm(B).
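The advice against blindly forming alpha/beta can be illustrated with a minimal sketch. The program below is not part of the library; the sample values and the helper name report are hypothetical. It classifies each pair (alphar, alphai, beta) before dividing and guards only against zero beta and overflow, not underflow.

```fortran
program safe_ratio
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Hypothetical sample values standing in for the output of a real flavor.
  real(dp) :: alphar(3) = [ 2.0_dp, 1.0_dp, 1.0e200_dp ]
  real(dp) :: alphai(3) = [ 0.0_dp, 0.0_dp, 0.0_dp ]
  real(dp) :: beta(3)   = [ 4.0_dp, 0.0_dp, 1.0e-200_dp ]
  integer :: j

  do j = 1, 3
     call report(alphar(j), alphai(j), beta(j))
  end do

contains

  ! Illustrative helper (an assumption, not an MKL routine).
  subroutine report(ar, ai, b)
    real(dp), intent(in) :: ar, ai, b
    real(dp) :: amax
    logical :: overflow

    amax = max(abs(ar), abs(ai))
    overflow = .false.
    ! The quotient can only overflow when |beta| < 1.
    if (abs(b) < 1.0_dp) overflow = amax > abs(b) * huge(1.0_dp)

    if (b == 0.0_dp) then
       if (amax == 0.0_dp) then
          print *, 'indeterminate: alpha and beta are both zero'
       else
          print *, 'infinite eigenvalue: beta is zero'
       end if
    else if (overflow) then
       print *, 'quotient would overflow: keep the pair (alpha, beta)'
    else
       print *, 'eigenvalue (re, im):', ar / b, ai / b
    end if
  end subroutine report

end program safe_ratio
```

With the sample data, the first pair is divided normally, the second is reported as infinite, and the third is flagged because 1.0e200/1.0e-200 exceeds the double precision range.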
?gges3
Computes generalized Schur factorization for a pair of
matrices.
Syntax
call sgges3 (jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar, alphai,
beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info )
call dgges3 (jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar, alphai,
beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info )
call cgges3 (jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta, vsl,
ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info )
call zgges3 (jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta, vsl,
ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info )
Include Files
mkl.fi
Description
For a pair of n-by-n real or complex nonsymmetric matrices (A,B), ?gges3 computes the generalized
eigenvalues, the generalized real or complex Schur form (S,T), and optionally the left or right matrices of
Schur vectors (VSL and VSR). This gives the generalized Schur factorization
(A,B) = ( (VSL)*S*(VSR)^T, (VSL)*T*(VSR)^T ) for real (A,B)
or
(A,B) = ( (VSL)*S*(VSR)^H, (VSL)*T*(VSR)^H ) for complex (A,B)
where (VSR)^H is the conjugate-transpose of VSR.
Optionally, it also orders the eigenvalues so that a selected cluster of eigenvalues appears in the leading
diagonal blocks of the upper quasi-triangular matrix S and the upper triangular matrix T. The leading
columns of VSL and VSR then form an orthonormal basis for the corresponding left and right eigenspaces
(deflating subspaces).
NOTE
If only the generalized eigenvalues are needed, use the driver ?ggev instead, which is faster.
A generalized eigenvalue for a pair of matrices (A,B) is a scalar w or a ratio alpha/beta = w, such that A -
w*B is singular. It is usually represented as the pair (alpha,beta), as there is a reasonable interpretation for
beta=0 or both being zero.
For real flavors:
A pair of matrices (S,T) is in generalized real Schur form if T is upper triangular with non-negative diagonal
and S is block upper triangular with 1-by-1 and 2-by-2 blocks. 1-by-1 blocks correspond to real generalized
eigenvalues, while 2-by-2 blocks of S will be "standardized" by making the corresponding elements of T have
the form:
a 0
0 b
and the pair of corresponding 2-by-2 blocks in S and T have a complex conjugate pair of generalized
eigenvalues.
For complex flavors:
A pair of matrices (S,T) is in generalized complex Schur form if S and T are upper triangular and, in addition,
the diagonal elements of T are non-negative real numbers.
Input Parameters
selctg LOGICAL. selctg is a function of three arguments for real flavors or two
arguments for complex flavors. selctg must be declared EXTERNAL in the
calling subroutine. If sort = 'N', selctg is not referenced. If sort = 'S',
selctg is used to select eigenvalues to sort to the top left of the Schur
form.
For real flavors:
An eigenvalue (alphar(j) + i*alphai(j))/beta(j), where i = sqrt(-1), is selected if
selctg(alphar(j),alphai(j),beta(j)) is true. In other words, if either
one of a complex conjugate pair of eigenvalues is selected, then both
complex eigenvalues are selected.
Note that in the ill-conditioned case, a selected complex eigenvalue may no
longer satisfy selctg(alphar(j),alphai(j), beta(j)) = .TRUE. after
ordering. In this case info is set to n+2.
ldvsl INTEGER. The leading dimension of the matrix VSL. ldvsl ≥ 1, and if
jobvsl = 'V', ldvsl ≥ n.
ldvsr INTEGER. The leading dimension of the matrix VSR. ldvsr ≥ 1, and if
jobvsr = 'V', ldvsr ≥ n.
lwork INTEGER. The size of the array work. If lwork = -1, then a workspace
query is assumed; the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error
message related to lwork is issued by xerbla.
Output Parameters
Array, size (n).
If jobvsl = 'V', vsl contains the left Schur vectors. Not referenced if
jobvsl = 'N'.
If jobvsr = 'V', vsr contains the right Schur vectors. Not referenced if
jobvsr = 'N'.
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
=1,...,n:
> n:
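As an illustration of the selctg argument of the real flavors of ?gges3, the sketch below defines a hypothetical selection function, sel_left_half, that picks eigenvalues with negative real part. It only inspects signs and never forms alphar/beta, so it is safe for beta = 0. The small driver exercises the function on its own and does not call the library.

```fortran
! Hypothetical selection function for sgges3/dgges3 (double precision shown).
! Selects eigenvalues whose real part alphar/beta is negative.
logical function sel_left_half(alphar, alphai, beta)
  implicit none
  double precision, intent(in) :: alphar, alphai, beta
  ! alphai is unused: both members of a conjugate pair share the real part,
  ! so they are selected or rejected together.
  sel_left_half = (alphar < 0.0d0 .and. beta > 0.0d0) .or. &
                  (alphar > 0.0d0 .and. beta < 0.0d0)
end function sel_left_half

program demo_selctg
  implicit none
  ! The calling unit must declare the function EXTERNAL.
  logical, external :: sel_left_half

  print *, sel_left_half(-1.0d0, 2.0d0, 3.0d0)   ! real part -1/3: T
  print *, sel_left_half( 1.0d0, 0.0d0, 2.0d0)   ! real part  1/2: F
  print *, sel_left_half(-1.0d0, 0.0d0, 0.0d0)   ! infinite eigenvalue: F
end program demo_selctg
```

To use such a function with the library, pass its name as the selctg argument together with sort = 'S'.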
?ggev
Computes the generalized eigenvalues, and the left
and/or right generalized eigenvectors for a pair of
nonsymmetric matrices.
Syntax
call sggev(jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl, vr, ldvr,
work, lwork, info)
call dggev(jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl, vr, ldvr,
work, lwork, info)
call cggev(jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr, work,
lwork, rwork, info)
call zggev(jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr, work,
lwork, rwork, info)
call ggev(a, b, alphar, alphai, beta [,vl] [,vr] [,info])
call ggev(a, b, alpha, beta [, vl] [,vr] [,info])
Include Files
mkl.fi, lapack.f90
Description
The ?ggev routine computes the generalized eigenvalues, and optionally, the left and/or right generalized
eigenvectors for a pair of n-by-n real/complex nonsymmetric matrices (A,B).
A generalized eigenvalue for a pair of matrices (A,B) is a scalar λ or a ratio alpha / beta = λ, such that
A - λ*B is singular. It is usually represented as the pair (alpha, beta), as there is a reasonable interpretation for
beta = 0 and even for both being zero.
The right generalized eigenvector v(j) corresponding to the generalized eigenvalue λ(j) of (A,B) satisfies
A*v(j) = λ(j)*B*v(j).
The left generalized eigenvector u(j) corresponding to the generalized eigenvalue λ(j) of (A,B) satisfies
u(j)^H*A = λ(j)*u(j)^H*B
where u(j)^H denotes the conjugate transpose of u(j).
Input Parameters
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
ldb INTEGER. The leading dimension of the array b. Must be at least max(1, n).
ldvl, ldvr INTEGER. The leading dimensions of the output matrices vl and vr,
respectively.
Constraints:
ldvl ≥ 1. If jobvl = 'V', ldvl ≥ max(1, n).
ldvr ≥ 1. If jobvr = 'V', ldvr ≥ max(1, n).
lwork INTEGER.
The dimension of the array work.
lwork ≥ max(1, 8n+16) for real flavors;
lwork ≥ max(1, 2n) for complex flavors.
For good performance, lwork must generally be larger.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
Output Parameters
For complex flavors:
On exit, alpha(j)/beta(j), j=1,..., n, are the generalized eigenvalues.
See also Application Notes below.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), u(j) = vl(:,j) + i*vl(:,j+1) and u(j+1) = vl(:,j) - i*vl(:,j+1).
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then
v(j) = vr(:,j) + i*vr(:,j+1) and v(j+1) = vr(:,j) - i*vr(:,j+1).
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
If info = i, and
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any admissible lwork size, that is, one no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The quotients alphar(j)/beta(j) and alphai(j)/beta(j) may easily over- or underflow, and beta(j) may even be
zero. Thus, you should avoid simply computing the ratio. However, alphar and alphai (for real flavors) or
alpha (for complex flavors) will always be less than, and usually comparable with, norm(A) in magnitude, and
beta will always be less than, and usually comparable with, norm(B).
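The workspace query described in these notes might be used as in the following sketch for dggev. The 2-by-2 test matrices and the program name are illustrative assumptions, and the program must be linked against Intel MKL or another LAPACK implementation. The first call passes lwork = -1 and only fills wquery(1); the second call does the actual computation.

```fortran
program ggev_workspace_query
  implicit none
  integer, parameter :: n = 2
  double precision :: a(n,n), b(n,n), alphar(n), alphai(n), beta(n)
  double precision :: vl(1,1), vr(n,n), wquery(1)
  double precision, allocatable :: work(:)
  integer :: lwork, info, j
  external :: dggev

  ! Illustrative data in column-major order: A = [1 2; 3 4], B = identity.
  a = reshape([1.0d0, 3.0d0, 2.0d0, 4.0d0], [n, n])
  b = reshape([1.0d0, 0.0d0, 0.0d0, 1.0d0], [n, n])

  ! Step 1: workspace query (lwork = -1). No eigenvalues are computed.
  call dggev('N', 'V', n, a, n, b, n, alphar, alphai, beta, &
             vl, 1, vr, n, wquery, -1, info)
  if (info /= 0) stop 'workspace query failed'

  ! Step 2: allocate the recommended workspace and solve.
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))
  call dggev('N', 'V', n, a, n, b, n, alphar, alphai, beta, &
             vl, 1, vr, n, work, lwork, info)
  if (info /= 0) stop 'dggev failed'

  ! Report the pairs (alpha, beta) rather than the quotients.
  do j = 1, n
     print *, alphar(j), alphai(j), beta(j)
  end do
end program ggev_workspace_query
```

Because jobvl = 'N', vl is not referenced and a 1-by-1 dummy array with ldvl = 1 satisfies the constraint ldvl ≥ 1.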
?ggevx
Computes the generalized eigenvalues, and,
optionally, the left and/or right generalized
eigenvectors.
Syntax
call sggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alphar, alphai, beta, vl,
ldvl, vr, ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work, lwork,
iwork, bwork, info)
call dggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alphar, alphai, beta, vl,
ldvl, vr, ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work, lwork,
iwork, bwork, info)
call cggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr,
ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work, lwork, rwork,
iwork, bwork, info)
call zggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr,
ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work, lwork, rwork,
iwork, bwork, info)
call ggevx(a, b, alphar, alphai, beta [,vl] [,vr] [,balanc] [,ilo] [,ihi] [, lscale]
[,rscale] [,abnrm] [,bbnrm] [,rconde] [,rcondv] [,info])
call ggevx(a, b, alpha, beta [, vl] [,vr] [,balanc] [,ilo] [,ihi] [,lscale] [, rscale]
[,abnrm] [,bbnrm] [,rconde] [,rcondv] [,info])
Include Files
mkl.fi, lapack.f90
Description
For a pair of n-by-n real/complex nonsymmetric matrices (A,B), the routine computes the generalized
eigenvalues and, optionally, the left and/or right generalized eigenvectors.
Optionally also, it computes a balancing transformation to improve the conditioning of the eigenvalues and
eigenvectors (ilo, ihi, lscale, rscale, abnrm, and bbnrm), reciprocal condition numbers for the eigenvalues
(rconde), and reciprocal condition numbers for the right eigenvectors (rcondv).
A generalized eigenvalue for a pair of matrices (A,B) is a scalar λ or a ratio alpha / beta = λ, such that
A - λ*B is singular. It is usually represented as the pair (alpha, beta), as there is a reasonable interpretation for
beta=0 and even for both being zero. The right generalized eigenvector v(j) corresponding to the
generalized eigenvalue λ(j) of (A,B) satisfies
A*v(j) = λ(j)*B*v(j).
The left generalized eigenvector u(j) corresponding to the generalized eigenvalue λ(j) of (A,B) satisfies
u(j)^H*A = λ(j)*u(j)^H*B
where u(j)^H denotes the conjugate transpose of u(j).
Input Parameters
balanc CHARACTER*1. Must be 'N', 'P', 'S', or 'B'. Specifies the balance option
to be performed.
If balanc = 'N', do not diagonally scale or permute;
sense CHARACTER*1. Must be 'N', 'E', 'V', or 'B'. Determines which reciprocal
condition numbers are computed.
If sense = 'N', none are computed;
ldb INTEGER. The leading dimension of the array b.
Must be at least max(1, n).
ldvl, ldvr INTEGER. The leading dimensions of the output matrices vl and vr,
respectively.
Constraints:
ldvl ≥ 1. If jobvl = 'V', ldvl ≥ max(1, n).
ldvr ≥ 1. If jobvr = 'V', ldvr ≥ max(1, n).
lwork INTEGER.
The dimension of the array work. lwork ≥ max(1, 2*n);
iwork INTEGER.
Workspace array, size at least (n+6) for real flavors and at least (n+2) for
complex flavors.
Not referenced if sense = 'E'.
Output Parameters
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), u(j) = vl(:,j) + i*vl(:,j+1) and u(j+1) =
vl(:,j) - i*vl(:,j+1).
u(j) = vl(:,j), the j-th column of vl.
vr(ldvr,*); the second dimension of vr must be at least max(1, n).
If jobvr = 'V', the right generalized eigenvectors v(j) are stored one after
another in the columns of vr, in the same order as their eigenvalues. Each
eigenvector will be scaled so the largest component has abs(Re) +
abs(Im) = 1.
If jobvr = 'N', vr is not referenced.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then
v(j) = vr(:,j) + i*vr(:,j+1) and v(j+1) = vr(:,j) - i*vr(:,j+1).
For complex flavors:
v(j) = vr(:,j), the j-th column of vr.
ilo, ihi INTEGER. ilo and ihi are integer values such that on exit A(i,j) = 0 and
B(i,j) = 0 if i > j and j = 1,..., ilo-1 or i = ihi+1,..., n.
If balanc = 'N' or 'S', ilo = 1 and ihi = n.
rconde, rcondv REAL for single precision flavors DOUBLE PRECISION for double precision
flavors.
Arrays, size at least max(1, n) each.
If sense = 'E', or 'B', rconde contains the reciprocal condition numbers
of the eigenvalues, stored in consecutive elements of the array. For a
complex conjugate pair of eigenvalues two consecutive elements of rconde
are set to the same value. Thus rconde(j), rcondv(j), and the j-th columns
of vl and vr all correspond to the same eigenpair (but not in general the j-th
eigenpair, unless all eigenpairs are selected).
If sense = 'N', or 'V', rconde is not referenced.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
If info = i, and
i ≤ n:
the QZ iteration failed. No eigenvectors have been calculated, but alphar(j),
alphai(j) (for real flavors), or alpha(j) (for complex flavors), and beta(j),
j=info+1,..., n should be correct.
i > n: errors that usually indicate LAPACK problems:
i = n+1: other than QZ iteration failed in ?hgeqz;
i = n+2: error return from ?tgevc.
alpha Holds the vector of length n. Used in complex flavors only.
sense Restored based on the presence of arguments rconde and rcondv as follows:
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and set any admissible lwork size, that is, one no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
The quotients alphar(j)/beta(j) and alphai(j)/beta(j) may easily over- or underflow, and beta(j) may even be
zero. Thus, you should avoid simply computing the ratio. However, alphar and alphai (for real flavors) or
alpha (for complex flavors) will always be less than, and usually comparable with, norm(A) in magnitude, and
beta will always be less than, and usually comparable with, norm(B).
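For the real flavors, the storage convention for complex conjugate pairs (v(j) = vr(:,j) + i*vr(:,j+1) and v(j+1) = vr(:,j) - i*vr(:,j+1)) can be unpacked into a complex array as in the sketch below. The vr and alphai values are hypothetical stand-ins for routine output, and the program does not call the library.

```fortran
program unpack_pairs
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 3
  real(dp) :: vr(n,n), alphai(n)
  complex(dp) :: v(n,n)
  integer :: j

  ! Hypothetical output of a real flavor: eigenvalues 1 and 2 form a
  ! complex conjugate pair (alphai /= 0), eigenvalue 3 is real.
  alphai = [ 1.0_dp, -1.0_dp, 0.0_dp ]
  vr = reshape([ 1.0_dp, 0.0_dp, 0.0_dp, &
                 0.0_dp, 1.0_dp, 0.0_dp, &
                 0.0_dp, 0.0_dp, 1.0_dp ], [n, n])

  j = 1
  do while (j <= n)
     if (alphai(j) /= 0.0_dp .and. j < n) then
        ! Columns j and j+1 hold the real and imaginary parts of v(j).
        v(:, j)   = cmplx(vr(:, j), vr(:, j+1), kind=dp)
        v(:, j+1) = conjg(v(:, j))
        j = j + 2
     else
        ! Real eigenvalue: the column is the eigenvector itself.
        v(:, j) = cmplx(vr(:, j), 0.0_dp, kind=dp)
        j = j + 1
     end if
  end do

  do j = 1, n
     print '(a,i0,a,3("(",f5.1,",",f5.1,") "))', 'v(', j, ') = ', v(:, j)
  end do
end program unpack_pairs
```

The same loop applies to vl and the left eigenvectors u(j).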
?ggev3
Computes the generalized eigenvalues and the left
and right generalized eigenvectors for a pair of
matrices.
Syntax
call sggev3 (jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl, vr,
ldvr, work, lwork, info )
call dggev3 (jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl, vr,
ldvr, work, lwork, info )
call cggev3 (jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr, work,
lwork, rwork, info )
call zggev3 (jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr, work,
lwork, rwork, info )
Include Files
mkl.fi
Description
For a pair of n-by-n real or complex nonsymmetric matrices (A, B), ?ggev3 computes the generalized
eigenvalues, and optionally, the left and right generalized eigenvectors.
A generalized eigenvalue for a pair of matrices (A, B) is a scalar λ or a ratio alpha/beta = λ, such that A - λ*B
is singular. It is usually represented as the pair (alpha,beta), as there is a reasonable interpretation for
beta=0, and even for both being zero.
For real flavors:
The right eigenvector v_j corresponding to the eigenvalue λ_j of (A, B) satisfies
A * v_j = λ_j * B * v_j.
The left eigenvector u_j corresponding to the eigenvalue λ_j of (A, B) satisfies
u_j^H * A = λ_j * u_j^H * B
where u_j^H is the conjugate-transpose of u_j.
For complex flavors:
The right generalized eigenvector v_j corresponding to the generalized eigenvalue λ_j of (A, B) satisfies
A * v_j = λ_j * B * v_j.
The left generalized eigenvector u_j corresponding to the generalized eigenvalue λ_j of (A, B) satisfies
u_j^H * A = λ_j * u_j^H * B
where u_j^H is the conjugate-transpose of u_j.
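Because the eigenvalue is represented by the pair (alpha, beta), the defining relation for a right eigenvector can be checked without division as beta*A*v = alpha*B*v. This form also covers beta = 0. The sketch below uses small hypothetical matrices with real eigenvalues and does not call the library.

```fortran
program check_pair
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 2
  real(dp) :: a(n,n), b(n,n), v(n), r(n)
  real(dp) :: alpha, beta

  ! Hypothetical data: A = [2 0; 0 3], B = [1 0; 0 0] (B is singular).
  a = reshape([2.0_dp, 0.0_dp, 0.0_dp, 3.0_dp], [n, n])
  b = reshape([1.0_dp, 0.0_dp, 0.0_dp, 0.0_dp], [n, n])

  ! Finite eigenvalue 2, represented as (alpha, beta) = (2, 1), v = e1.
  alpha = 2.0_dp
  beta  = 1.0_dp
  v     = [ 1.0_dp, 0.0_dp ]
  r = beta * matmul(a, v) - alpha * matmul(b, v)
  print *, 'residual, finite eigenvalue:  ', maxval(abs(r))

  ! Infinite eigenvalue, represented as (alpha, beta) = (3, 0), v = e2.
  alpha = 3.0_dp
  beta  = 0.0_dp
  v     = [ 0.0_dp, 1.0_dp ]
  r = beta * matmul(a, v) - alpha * matmul(b, v)
  print *, 'residual, infinite eigenvalue:', maxval(abs(r))
end program check_pair
```

Both residuals are exactly zero for this data. With computed eigenpairs, expect a value on the order of machine precision times the norms of A and B.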
Input Parameters
n ≥ 0.
lda ≥ max(1,n).
ldb ≥ max(1,n).
Output Parameters
a On exit, a is overwritten.
b On exit, b is overwritten.
DOUBLE COMPLEX for zggev3
Array, size (ldvl, n).
=1,...,n:
> n:
Routine Name   Data Types   Description
?rot c, z Applies a plane rotation with real cosine and complex sine to a pair
of complex vectors.
i?max1 c, z Finds the index of the vector element whose real part has
maximum absolute value.
?sum1 sc, dz Forms the 1-norm of the complex vector using the true absolute
value.
?laisnan s, d Tests input for NaN by comparing two arguments for inequality.
?lalsa s, d, c, z Computes the SVD of the coefficient matrix in compact form. Used
by ?gelsd.
?lansb s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
symmetric band matrix.
?lanhb c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
Hermitian band matrix.
?lansp s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
symmetric matrix supplied in packed form.
?lanhp c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
complex Hermitian matrix supplied in packed form.
?lanst/?lanht s, d/c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a real
symmetric or complex Hermitian tridiagonal matrix.
?lansy s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a real/
complex symmetric matrix.
?lanhe c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
complex Hermitian matrix.
?lantb s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
triangular band matrix.
?lantp s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
triangular matrix supplied in packed form.
?lantr s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
trapezoidal or triangular matrix.
?laqgb s, d, c, z Scales a general band matrix, using row and column scaling
factors computed by ?gbequ.
?laqge s, d, c, z Scales a general rectangular matrix, using row and column scaling
factors computed by ?geequ.
?laqr1 s, d, c, z Sets a scalar multiple of the first column of the product of 2-by-2
or 3-by-3 matrix H and specified shifts.
?lar1v s, d, c, z Computes the (scaled) r-th column of the inverse of the submatrix
in rows b1 through bn of the tridiagonal matrix LDL^T - I.
?lar2v s, d, c, z Applies a vector of plane rotations with real cosines and real/
complex sines from both sides to a sequence of 2-by-2 symmetric/
Hermitian matrices.
?largv s, d, c, z Generates a vector of plane rotations with real cosines and real/
complex sines.
?larrf s, d Finds a new relatively robust representation such that at least one
of the eigenvalues is relatively isolated.
?lartg s, d, c, z Generates a plane rotation with real cosine and real/complex sine.
?lartv s, d, c, z Applies a vector of plane rotations with real cosines and real/
complex sines to the elements of a pair of vectors.
?lasd2 s, d Merges the two sets of singular values together into a single
sorted set. Used by ?bdsdc.
?lasd3 s, d Finds all square roots of the roots of the secular equation, as
defined by the values in D and Z, and then updates the singular
vectors by matrix multiplication. Used by ?bdsdc.
?lasd7 s, d Merges the two sets of singular values together into a single
sorted set. Then it tries to deflate the size of the problem. Used
by ?bdsdc.
?lasd8 s, d Finds the square roots of the roots of the secular equation, and
stores, for each element in D, the distance to its two nearest
poles. Used by ?bdsdc.
?lasdq s, d Computes the SVD of a real bidiagonal matrix with diagonal d and
off-diagonal e. Used by ?bdsdc.
?lasq3 s, d Checks for deflation, computes a shift and calls dqds. Used by
?bdsqr.
?lasy2 s, d Solves the Sylvester matrix equation where the matrices are of
order 1 or 2.
?latrs s, d, c, z Solves a triangular system of equations with the scale factor set to
prevent overflow.
?lauu2 s, d, c, z Computes the product U*U^H or L^H*L, where U and L are upper or
lower triangular matrices (unblocked algorithm).
?lauum s, d, c, z Computes the product U*U^H or L^H*L, where U and L are upper or
lower triangular matrices (blocked algorithm).
?ptts2 s, d, c, z Solves a tridiagonal system of the form AX=B using the L*D*L^H
factorization computed by ?pttrf.
?lansf s, d Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
symmetric matrix in RFP format.
?lanhf c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
Hermitian matrix in RFP format.
?tfttp s, d, c, z Copies a triangular matrix from the rectangular full packed format
(TF) to the standard packed format (TP).
?tfttr s, d, c, z Copies a triangular matrix from the rectangular full packed format
(TF) to the standard full format (TR).
?tpttf s, d, c, z Copies a triangular matrix from the standard packed format (TP)
to the rectangular full packed format (TF).
?tpttr s, d, c, z Copies a triangular matrix from the standard packed format (TP)
to the standard full format (TR).
?trttf s, d, c, z Copies a triangular matrix from the standard full format (TR) to
the rectangular full packed format (TF).
?trttp s, d, c, z Copies a triangular matrix from the standard full format (TR) to
the standard packed format (TP).
?la_gbrcond s, d Estimates the Skeel condition number for a general banded matrix.
?lacgv
Conjugates a complex vector.
Syntax
call clacgv( n, x, incx )
call zlacgv( n, x, incx )
Include Files
mkl.fi
Description
The routine conjugates a complex vector x of length n and increment incx (see "Vector Arguments in BLAS"
in Appendix B).
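A minimal sketch of a call to zlacgv with a non-unit increment follows. The data are illustrative, and the program must be linked against Intel MKL or another LAPACK implementation. With n = 2 and incx = 2, the routine is expected to conjugate x(1) and x(3) and leave x(2) and x(4) untouched.

```fortran
program demo_lacgv
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  complex(dp) :: x(4)
  external :: zlacgv

  ! Illustrative data.
  x = [ (1.0_dp, 1.0_dp), (2.0_dp, 2.0_dp), (3.0_dp, 3.0_dp), (4.0_dp, 4.0_dp) ]

  ! Conjugate a vector of length n = 2 stored with increment incx = 2,
  ! that is, the elements x(1) and x(3).
  call zlacgv(2, x, 2)

  print *, x
end program demo_lacgv
```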
Input Parameters
The data types are given for the Fortran interface.
Output Parameters
?lacrm
Multiplies a complex matrix by a square real matrix.
Syntax
call clacrm( m, n, a, lda, b, ldb, c, ldc, rwork )
call zlacrm( m, n, a, lda, b, ldb, c, ldc, rwork )
Include Files
mkl.fi
Description
Input Parameters
m INTEGER. The number of rows of the matrix A and of the matrix C (m ≥ 0).
n INTEGER. The number of columns and rows of the matrix B and the number
of columns of the matrix C
(n ≥ 0).
ldc INTEGER. The leading dimension of the output array c, ldc ≥ max(1, n).
Workspace array, DIMENSION(2*m*n).
Output Parameters
?lacrt
Performs a linear transformation of a pair of complex
vectors.
Syntax
call clacrt( n, cx, incx, cy, incy, c, s )
call zlacrt( n, cx, incx, cy, incy, c, s )
Include Files
mkl.fi
Description
Input Parameters
Output Parameters
?laesy
Computes the eigenvalues and eigenvectors of a 2-
by-2 complex symmetric matrix, and checks that the
norm of the matrix of eigenvectors is larger than a
threshold value.
Syntax
call claesy( a, b, c, rt1, rt2, evscal, cs1, sn1 )
call zlaesy( a, b, c, rt1, rt2, evscal, cs1, sn1 )
Include Files
mkl.fi
Description
provided the norm of the matrix of eigenvectors is larger than some threshold value.
rt1 is the eigenvalue of larger absolute value, and rt2 of smaller absolute value. If the eigenvectors are
computed, then on return (cs1, sn1) is the unit eigenvector for rt1, hence
Input Parameters
Output Parameters
?rot
Applies a plane rotation with real cosine and complex
sine to a pair of complex vectors.
Syntax
call crot( n, cx, incx, cy, incy, c, s )
call zrot( n, cx, incx, cy, incy, c, s )
Include Files
mkl.fi
Description
The routine applies a plane rotation, where the cosine (c) is real and the sine (s) is complex, and the vectors
cx and cy are complex. This routine has its real equivalents in BLAS (see ?rot in Chapter 2).
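The effect of such a rotation can be sketched in plain Fortran for unit increments. The update formula below is the one used by the reference LAPACK implementation of crot/zrot; it is not spelled out in this section, so treat it as an assumption. The data values are illustrative, and the program does not call the library.

```fortran
program demo_rot
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 2
  complex(dp) :: cx(n), cy(n), s, temp
  real(dp) :: c
  integer :: i

  ! Real cosine and complex sine with c**2 + |s|**2 = 1.
  c = 0.6_dp
  s = (0.0_dp, 0.8_dp)

  cx = [ (1.0_dp, 0.0_dp), (0.0_dp, 1.0_dp) ]
  cy = [ (0.0_dp, 0.0_dp), (1.0_dp, 0.0_dp) ]
  print *, 'squared norm before:', sum(abs(cx)**2 + abs(cy)**2)

  ! Assumed update (reference LAPACK, incx = incy = 1):
  !   cx(i) <- c*cx(i) + s*cy(i)
  !   cy(i) <- c*cy(i) - conjg(s)*cx(i)
  do i = 1, n
     temp  = c * cx(i) + s * cy(i)
     cy(i) = c * cy(i) - conjg(s) * cx(i)
     cx(i) = temp
  end do

  print *, 'squared norm after: ', sum(abs(cx)**2 + abs(cy)**2)
  print *, cx
  print *, cy
end program demo_rot
```

Since the transformation is unitary when c**2 + |s|**2 = 1, the two printed norms should agree up to rounding.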
Input Parameters
Output Parameters
?spmv
Computes a matrix-vector product for complex vectors
using a complex symmetric packed matrix.
Syntax
call cspmv( uplo, n, alpha, ap, x, incx, beta, y, incy )
call zspmv( uplo, n, alpha, ap, x, incx, beta, y, incy )
Include Files
mkl.fi
Description
y := alpha*a*x + beta*y,
where:
alpha and beta are complex scalars,
x and y are n-element complex vectors,
a is an n-by-n complex symmetric matrix, supplied in packed form.
These routines have their real equivalents in BLAS (see ?spmv in Chapter 2).
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix a is supplied in the packed array ap.
If uplo = 'U' or 'u', the upper triangular part of the matrix a is supplied
in the array ap.
If uplo = 'L' or 'l', the lower triangular part of the matrix a is supplied
in the array ap.
n INTEGER.
Specifies the order of the matrix a.
The value of n must be at least zero.
incx INTEGER. Specifies the increment for the elements of x. The value of incx
must not be zero.
incy INTEGER. Specifies the increment for the elements of y. The value of incy
must not be zero.
Output Parameters
?spr
Performs the symmetric rank-1 update of a complex
symmetric packed matrix.
Syntax
call cspr( uplo, n, alpha, x, incx, ap )
call zspr( uplo, n, alpha, x, incx, ap )
Include Files
mkl.fi
Description
a:= alpha*x*xH + a,
where:
alpha is a complex scalar
x is an n-element complex vector
a is an n-by-n complex symmetric matrix, supplied in packed form.
These routines have their real equivalents in BLAS (see ?spr in Chapter 2).
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix a is supplied in the packed array ap, as follows:
If uplo = 'U' or 'u', the upper triangular part of the matrix a is supplied
in the array ap.
If uplo = 'L' or 'l', the lower triangular part of the matrix a is supplied
in the array ap.
n INTEGER.
Specifies the order of the matrix a.
The value of n must be at least zero.
incx INTEGER. Specifies the increment for the elements of x. The value of incx
must not be zero.
Array, DIMENSION at least ((n*(n + 1))/2). Before entry, with uplo = 'U'
or 'u', the array ap must contain the upper triangular part of the
symmetric matrix packed sequentially, column-by-column, so that ap(1)
contains a(1,1), ap(2) and ap(3) contain a(1,2) and a(2,2)
respectively, and so on.
Before entry, with uplo = 'L' or 'l', the array ap must contain the lower
triangular part of the symmetric matrix packed sequentially, column-by-
column, so that ap(1) contains a(1,1), ap(2) and ap(3) contain a(2,1)
and a(3,1) respectively, and so on.
Note that the imaginary parts of the diagonal elements need not be set,
they are assumed to be zero, and on exit they are set to zero.
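The two packing orders described for ap can be checked with a short Python sketch. The helper pack_sketch is hypothetical and only illustrates the column-by-column layout; the matrix entries are replaced by their 1-based labels so that the resulting order is visible.

```python
def pack_sketch(a, uplo):
    """Pack one triangle of the square matrix a column by column."""
    n = len(a)
    if uplo.upper() == 'U':
        # Column j contributes rows 0..j (upper triangle).
        return [a[i][j] for j in range(n) for i in range(j + 1)]
    # Column j contributes rows j..n-1 (lower triangle).
    return [a[i][j] for j in range(n) for i in range(j, n)]


# Use 1-based labels instead of numbers to show where each entry lands.
labels = [[f"a({i + 1},{j + 1})" for j in range(3)] for i in range(3)]
upper = pack_sketch(labels, 'U')
lower = pack_sketch(labels, 'L')
```

For n = 3 both lists have n*(n + 1)/2 = 6 entries. The upper list starts with a(1,1), a(1,2), a(2,2), and the lower list starts with a(1,1), a(2,1), a(3,1), matching the description of ap(1), ap(2), and ap(3).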
Output Parameters
ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.
?syconv
Converts a symmetric matrix given by a triangular
matrix factorization into two matrices and vice versa.
Syntax
call ssyconv( uplo, way, n, a, lda, ipiv, e, info )
call dsyconv( uplo, way, n, a, lda, ipiv, e, info )
call csyconv( uplo, way, n, a, lda, ipiv, e, info )
call zsyconv( uplo, way, n, a, lda, ipiv, e, info )
call syconv( a[,uplo][,way][,ipiv][,info][,e] )
Include Files
mkl.fi, lapack.f90
Description
The routine converts matrix A, which results from a triangular matrix factorization, into matrices L and D and
vice versa. The routine returns non-diagonalized elements of D and applies or reverses permutation done
with the triangular matrix factorization.
Input Parameters
The block diagonal matrix D and the multipliers used to obtain the factor U
or L as computed by ?sytrf.
Output Parameters
See Also
?sytrf
?symv
Computes a matrix-vector product for a complex
symmetric matrix.
Syntax
call csymv( uplo, n, alpha, a, lda, x, incx, beta, y, incy )
call zsymv( uplo, n, alpha, a, lda, x, incx, beta, y, incy )
Include Files
mkl.fi
Description
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used:
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.
incx INTEGER. Specifies the increment for the elements of x. The value of incx
must not be zero.
incy INTEGER. Specifies the increment for the elements of y. The value of incy
must not be zero.
Output Parameters
?syr
Performs the symmetric rank-1 update of a complex
symmetric matrix.
Syntax
call csyr( uplo, n, alpha, x, incx, a, lda )
call zsyr( uplo, n, alpha, x, incx, a, lda )
Include Files
mkl.fi
Description
These routines have their real equivalents in BLAS (see ?syr in Chapter 2).
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used:
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.
incx INTEGER. Specifies the increment for the elements of x. The value of incx
must not be zero.
Output Parameters
a With uplo = 'U' or 'u', the upper triangular part of the array a is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array a is
overwritten by the lower triangular part of the updated matrix.
i?max1
Finds the index of the vector element whose real part
has maximum absolute value.
Syntax
index = icmax1( n, cx, incx )
index = izmax1( n, cx, incx )
Include Files
mkl.fi
Description
Given a complex vector cx, the i?max1 functions return the index of the first vector element of maximum
absolute value. These functions are based on the BLAS functions icamax/izamax, but using the absolute
value of components. They are designed for use with clacon/zlacon.
Input Parameters
Output Parameters
?sum1
Forms the 1-norm of the complex vector using the
true absolute value.
Syntax
res = scsum1( n, cx, incx )
res = dzsum1( n, cx, incx )
Include Files
mkl.fi
Description
Given a complex vector cx, scsum1/dzsum1 functions take the sum of the absolute values of vector elements
and return a single/double precision result, respectively. These functions are based on scasum/dzasum from
Level 1 BLAS, but use the true absolute value and were designed for use with clacon/zlacon.
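The difference between the true absolute value and the Level 1 BLAS convention can be seen in a small Python sketch. It assumes unit increments and assumes that scasum/dzasum sum |Re| + |Im| of each element; both function names are hypothetical.

```python
def sum1_sketch(cx):
    """1-norm using the true absolute value (modulus) of each element."""
    return sum(abs(z) for z in cx)


def asum_sketch(cx):
    """Sum of |Re(z)| + |Im(z)|, the convention assumed here for ?asum."""
    return sum(abs(z.real) + abs(z.imag) for z in cx)


cx = [3 + 4j, -1j]
true_norm = sum1_sketch(cx)     # 5 + 1
blas_style = asum_sketch(cx)    # 7 + 1
```

The two results agree only when every element is purely real or purely imaginary.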
Input Parameters
Contains the input vector whose elements will be summed.
incx INTEGER. Specifies the spacing between successive elements of cx (incx > 0).
Output Parameters
?gbtf2
Computes the LU factorization of a general band
matrix using the unblocked version of the algorithm.
Syntax
call sgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
call dgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
call cgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
call zgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
Include Files
mkl.fi
Description
The routine forms the LU factorization of a general real/complex m-by-n band matrix A with kl sub-diagonals
and ku super-diagonals. The routine uses partial pivoting with row interchanges and implements the
unblocked version of the algorithm, calling Level 2 BLAS. See also ?gbtrf.
Input Parameters
The array ab contains the matrix A in band storage (see Matrix Arguments).
The second dimension of ab must be at least max(1, n).
Output Parameters
ipiv INTEGER.
Array, DIMENSION at least max(1,min(m,n)).
?gebd2
Reduces a general matrix to bidiagonal form using an
unblocked algorithm.
Syntax
call sgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
call dgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
call cgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
call zgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
Include Files
mkl.fi
Description
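The routine reduces a general m-by-n matrix A to upper or lower bidiagonal form B by an orthogonal
(unitary) transformation: Q^T*A*P = B (for real flavors) or Q^H*A*P = B (for complex flavors).
The routine does not form the matrices Q and P explicitly, but represents them as products of elementary
reflectors:
if m ≥ n, Q = H(1)*H(2)*...*H(n) and P = G(1)*G(2)*...*G(n-1);
if m < n, Q = H(1)*H(2)*...*H(m-1) and P = G(1)*G(2)*...*G(m).
Each H(i) and G(i) has the form
H(i) = I - tauq*v*v^T and G(i) = I - taup*u*u^T for real flavors, or
H(i) = I - tauq*v*v^H and G(i) = I - taup*u*u^H for complex flavors,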
where tauq and taup are scalars (real for sgebd2/dgebd2, complex for cgebd2/zgebd2), and v and u are
vectors (real for sgebd2/dgebd2, complex for cgebd2/zgebd2).
Input Parameters
Output Parameters
a if m ≥ n, the diagonal and first super-diagonal of a are overwritten with the
upper bidiagonal matrix B. Elements below the diagonal, with the array
tauq, represent the orthogonal/unitary matrix Q as a product of elementary
reflectors, and elements above the first superdiagonal, with the array taup,
represent the orthogonal/unitary matrix P as a product of elementary
reflectors.
if m < n, the diagonal and first sub-diagonal of a are overwritten by the
lower bidiagonal matrix B. Elements below the first subdiagonal, with the
array tauq, represent the orthogonal/unitary matrix Q as a product of
elementary reflectors, and elements above the diagonal, with the array
taup, represent the orthogonal/unitary matrix P as a product of elementary
reflectors.
info INTEGER.
If info = 0, the execution is successful.
?gehd2
Reduces a general square matrix to upper Hessenberg
form using an unblocked algorithm.
Syntax
call sgehd2( n, ilo, ihi, a, lda, tau, work, info )
call dgehd2( n, ilo, ihi, a, lda, tau, work, info )
call cgehd2( n, ilo, ihi, a, lda, tau, work, info )
call zgehd2( n, ilo, ihi, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine reduces a real/complex general matrix A to upper Hessenberg form H by an orthogonal or
unitary similarity transformation Q^T*A*Q = H (for real flavors) or Q^H*A*Q = H (for complex flavors).
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of elementary
reflectors.
Input Parameters
ilo, ihi INTEGER. It is assumed that A is already upper triangular in rows and
columns 1:ilo-1 and ihi+1:n.
If A has been output by ?gebal, then ilo and ihi must contain the values returned by that routine. Otherwise they
should be set to ilo = 1 and ihi = n. Constraint: 1 ≤ ilo ≤ ihi ≤ max(1, n).
COMPLEX for cgehd2
DOUBLE COMPLEX for zgehd2.
Arrays:
a (lda,*) contains the n-by-n matrix A to be reduced. The second
dimension of a must be at least max(1, n).
Output Parameters
a On exit, the upper triangle and the first subdiagonal of A are overwritten
with the upper Hessenberg matrix H and the elements below the first
subdiagonal, with the array tau, represent the orthogonal/unitary matrix Q
as a product of elementary reflectors. See Application Notes below.
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The matrix Q is represented as a product of (ihi - ilo) elementary reflectors
Q = H(ilo)*H(ilo +1)*...*H(ihi -1)
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i) = 0, v(i+1) = 1 and
v(ihi+1:n) = 0.
On exit, v(i+2:ihi) is stored in a(i+2:ihi, i) and tau in tau(i).
The contents of a are illustrated by the following example, with n = 7, ilo = 2 and ihi = 6:
where a denotes an element of the original matrix A, h denotes a modified element of the upper Hessenberg
matrix H, and vi denotes an element of the vector defining H(i).
?gelq2
Computes the LQ factorization of a general
rectangular matrix using an unblocked algorithm.
Syntax
call sgelq2( m, n, a, lda, tau, work, info )
call dgelq2( m, n, a, lda, tau, work, info )
call cgelq2( m, n, a, lda, tau, work, info )
call zgelq2( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors :
Q = H(k) ... H(2) H(1) (or Q = H(k)^H ... H(2)^H H(1)^H for complex flavors), where k = min(m, n)
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors,
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(1:i-1) = 0 and
v(i) = 1.
On exit, v(i+1:n) (for real functions) and conjg(v(i+1:n)) (for complex functions) are stored in a(i, i+1:n).
Input Parameters
The data types are given for the Fortran interface.
m INTEGER. The number of rows in the matrix A (m ≥ 0).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?geql2
Computes the QL factorization of a general
rectangular matrix using an unblocked algorithm.
Syntax
call sgeql2( m, n, a, lda, tau, work, info )
call dgeql2( m, n, a, lda, tau, work, info )
call cgeql2( m, n, a, lda, tau, work, info )
call zgeql2( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors :
Q = H(k)* ... *H(2)*H(1), where k = min(m, n).
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(m-k+i+1:m) = 0
and v(m-k+i) = 1.
Input Parameters
Output Parameters
Array, DIMENSION at least max(1, min(m, n)).
info INTEGER.
If info = 0, the execution is successful.
?geqr2
Computes the QR factorization of a general
rectangular matrix using an unblocked algorithm.
Syntax
call sgeqr2( m, n, a, lda, tau, work, info )
call dgeqr2( m, n, a, lda, tau, work, info )
call cgeqr2( m, n, a, lda, tau, work, info )
call zgeqr2( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors :
Q = H(1)*H(2)* ... *H(k), where k = min(m, n)
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(1:i-1) = 0 and
v(i) = 1.
On exit, v(i+1:m) is stored in a(i+1:m, i).
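The reflector representation above can be made concrete with a minimal Python sketch of an unblocked Householder QR for real matrices. It is an illustration under stated assumptions, not the library code: the functions geqr2_sketch and q_times are hypothetical, indexing is 0-based, and the sign convention for the reflector (beta takes the sign opposite to the pivot) is a common choice that is assumed here.

```python
import math


def geqr2_sketch(a):
    """Unblocked Householder QR of a real m-by-n matrix (list of rows).

    Returns (fact, tau): R sits on and above the diagonal of fact, and
    v(i+1:m) of reflector H(i) = I - tau*v*v^T sits below the diagonal
    of column i (v(i) = 1 is implicit).
    """
    m, n = len(a), len(a[0])
    a = [row[:] for row in a]
    tau = []
    for i in range(min(m, n)):
        alpha = a[i][i]
        xnorm = math.sqrt(sum(a[r][i] ** 2 for r in range(i + 1, m)))
        if xnorm == 0.0:
            t = 0.0                      # nothing to annihilate: H(i) = I
        else:
            beta = -math.copysign(math.hypot(alpha, xnorm), alpha)
            t = (beta - alpha) / beta
            for r in range(i + 1, m):
                a[r][i] /= (alpha - beta)   # scale so that v(i) = 1
            a[i][i] = beta
        tau.append(t)
        # Apply H(i) to the trailing columns.
        for j in range(i + 1, n):
            w = a[i][j] + sum(a[r][i] * a[r][j] for r in range(i + 1, m))
            a[i][j] -= t * w
            for r in range(i + 1, m):
                a[r][j] -= t * w * a[r][i]
    return a, tau


def q_times(fact, tau, b):
    """Return Q*b for Q = H(1)*H(2)*...*H(k), with b a vector of length m."""
    m = len(fact)
    b = b[:]
    for i in reversed(range(len(tau))):     # H(k) acts first, H(1) last
        w = b[i] + sum(fact[r][i] * b[r] for r in range(i + 1, m))
        b[i] -= tau[i] * w
        for r in range(i + 1, m):
            b[r] -= tau[i] * w * fact[r][i]
    return b


A = [[3.0, 1.0], [4.0, 2.0], [0.0, 5.0]]
fact, tau = geqr2_sketch(A)
```

Multiplying each column of R by Q with q_times reproduces the corresponding column of A up to rounding, which confirms that Q never has to be formed explicitly.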
Input Parameters
The data types are given for the Fortran interface.
Arrays:
a(lda,*) contains the m-by-n matrix A.
The second dimension of a must be at least max(1, n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?geqr2p
Computes the QR factorization of a general
rectangular matrix with non-negative diagonal
elements using an unblocked algorithm.
Syntax
call sgeqr2p( m, n, a, lda, tau, work, info )
call dgeqr2p( m, n, a, lda, tau, work, info )
call cgeqr2p( m, n, a, lda, tau, work, info )
call zgeqr2p( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine computes a QR factorization of a real/complex m-by-n matrix A as A = Q*R. The diagonal entries
of R are real and nonnegative.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors :
Q = H(1)*H(2)* ... *H(k), where k = min(m, n)
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(1:i-1) = 0 and
v(i) = 1.
On exit, v(i+1:m) is stored in a(i+1:m, i).
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?geqrt2
Computes a QR factorization of a general real or
complex matrix using the compact WY representation
of Q.
Syntax
call sgeqrt2(m, n, a, lda, t, ldt, info)
call dgeqrt2(m, n, a, lda, t, ldt, info)
call cgeqrt2(m, n, a, lda, t, ldt, info)
call zgeqrt2(m, n, a, lda, t, ldt, info)
call geqrt2(a, t, [info])
Include Files
mkl.fi, lapack.f90
Description
The strictly lower triangular matrix V contains the elementary reflectors H(i) in the i-th column below the
diagonal. For example, if m=5 and n=3, the matrix V is
    ( 1          )
    ( v1  1      )
V = ( v1  v2  1  )
    ( v1  v2  v3 )
    ( v1  v2  v3 )
where vi represents the vector that defines H(i). The vectors are returned in the lower triangular part of array
a.
NOTE
The 1s along the diagonal of V are not stored in a.
Input Parameters
n INTEGER. The number of columns in A (n ≥ 0).
Output Parameters
The n-by-n upper triangular factor of the block reflector. The elements on
and above the diagonal contain the block reflector T. The elements below
the diagonal are not used.
info INTEGER.
If info = 0, the execution is successful.
If info < 0 and info = -i, the ith argument had an illegal value.
?geqrt3
Recursively computes a QR factorization of a general
real or complex matrix using the compact WY
representation of Q.
Syntax
call sgeqrt3(m, n, a, lda, t, ldt, info)
call dgeqrt3(m, n, a, lda, t, ldt, info)
call cgeqrt3(m, n, a, lda, t, ldt, info)
call zgeqrt3(m, n, a, lda, t, ldt, info)
call geqrt3(a, t [, info])
Include Files
mkl.fi, lapack.f90
Description
The strictly lower triangular matrix V contains the elementary reflectors H(i) in the i-th column below the
diagonal. For example, if m=5 and n=3, the matrix V is
    ( 1          )
    ( v1  1      )
V = ( v1  v2  1  )
    ( v1  v2  v3 )
    ( v1  v2  v3 )
where vi represents one of the vectors that define H(i). The vectors are returned in the lower triangular
part of array a.
NOTE
The 1s along the diagonal of V are not stored in a.
Input Parameters
Output Parameters
a The elements on and above the diagonal of the array contain the n-by-n
upper triangular matrix R. The elements below the diagonal are the
columns of V.
The n-by-n upper triangular factor of the block reflector. The elements on
and above the diagonal contain the block reflector T. The elements below
the diagonal are not used.
info INTEGER.
If info = 0, the execution is successful.
If info < 0 and info = -i, the ith argument had an illegal value.
?gerq2
Computes the RQ factorization of a general
rectangular matrix using an unblocked algorithm.
Syntax
call sgerq2( m, n, a, lda, tau, work, info )
call dgerq2( m, n, a, lda, tau, work, info )
call cgerq2( m, n, a, lda, tau, work, info )
call zgerq2( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors :
Q = H(1)*H(2)* ... *H(k) for real flavors, or
Q = H(1)^H*H(2)^H* ... *H(k)^H for complex flavors
where k = min(m, n).
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(n-k+i+1:n) = 0
and v(n-k+i) = 1.
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?gesc2
Solves a system of linear equations using the LU
factorization with complete pivoting computed by
?getc2.
Syntax
call sgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
call dgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
call cgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
call zgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
Include Files
mkl.fi
Description
Input Parameters
A = P*L*U*Q.
The second dimension of a must be at least max(1, n);
rhs(n) contains on entry the right hand side vector for the system of
equations.
ipiv INTEGER.
Array, DIMENSION at least max(1,n).
The pivot indices: for 1 ≤ i ≤ n, row i of the matrix has been interchanged
with row ipiv(i).
jpiv INTEGER.
Array, DIMENSION at least max(1,n).
The pivot indices: for 1 ≤ j ≤ n, column j of the matrix has been
interchanged with column jpiv(j).
Output Parameters
?getc2
Computes the LU factorization with complete pivoting
of the general n-by-n matrix.
Syntax
call sgetc2( n, a, lda, ipiv, jpiv, info )
call dgetc2( n, a, lda, ipiv, jpiv, info )
call cgetc2( n, a, lda, ipiv, jpiv, info )
call zgetc2( n, a, lda, ipiv, jpiv, info )
Include Files
mkl.fi
Description
The routine computes an LU factorization with complete pivoting of the n-by-n matrix A. The factorization
has the form A = P*L*U*Q, where P and Q are permutation matrices, L is lower triangular with unit diagonal
elements and U is upper triangular.
The LU factorization computed by this routine is used by ?latdf to compute a contribution to the reciprocal
Dif-estimate.
Input Parameters
Output Parameters
a On exit, the factors L and U from the factorization A = P*L*U*Q; the unit
diagonal elements of L are not stored. If U(k, k) appears to be less than
smin, U(k, k) is given the value of smin, giving a nonsingular
perturbed system.
ipiv INTEGER.
Array, DIMENSION at least max(1,n).
The pivot indices: for 1 ≤ i ≤ n, row i of the matrix has been interchanged
with row ipiv(i).
jpiv INTEGER.
Array, DIMENSION at least max(1,n).
info INTEGER.
If info = 0, the execution is successful.
?getf2
Computes the LU factorization of a general m-by-n
matrix using partial pivoting with row interchanges
(unblocked algorithm).
Syntax
call sgetf2( m, n, a, lda, ipiv, info )
call dgetf2( m, n, a, lda, ipiv, info )
call cgetf2( m, n, a, lda, ipiv, info )
call zgetf2( m, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine computes the LU factorization of a general m-by-n matrix A using partial pivoting with row
interchanges. The factorization has the form
A = P*L*U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m >
n) and U is upper triangular (upper trapezoidal if m < n).
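The factorization A = P*L*U with row interchanges can be sketched in a few lines of Python. This is an illustrative right-looking, unblocked variant under stated assumptions, not the library implementation: getf2_sketch is a hypothetical name, pivot indices are reported 1-based as in the Fortran interface, and a zero pivot column is simply left unscaled.

```python
def getf2_sketch(a):
    """Unblocked LU with partial pivoting of an m-by-n matrix (list of rows).

    Returns (lu, ipiv): U on and above the diagonal of lu, the multipliers
    of L below it (unit diagonal not stored), and 1-based pivot rows.
    """
    m, n = len(a), len(a[0])
    a = [row[:] for row in a]
    ipiv = []
    for j in range(min(m, n)):
        # Pivot: first row at or below j with the largest |a(r, j)|.
        p = max(range(j, m), key=lambda r: abs(a[r][j]))
        ipiv.append(p + 1)
        if a[p][j] != 0:
            a[j], a[p] = a[p], a[j]          # row interchange
            for r in range(j + 1, m):
                a[r][j] /= a[j][j]           # multipliers of L
        # Rank-1 update of the trailing submatrix.
        for r in range(j + 1, m):
            for c in range(j + 1, n):
                a[r][c] -= a[r][j] * a[j][c]
    return a, ipiv


lu, ipiv = getf2_sketch([[1, 2], [3, 4]])
```

For this 2-by-2 input, row 1 is swapped with row 2, so ipiv is [2, 2]; lu holds U = [[3, 4], [0, 2/3]] and the single multiplier 1/3 below the diagonal.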
Input Parameters
The data types are given for the Fortran interface.
Output Parameters
ipiv INTEGER.
Array, size at least max(1,min(m,n)).
The pivot indices: for 1 ≤ i ≤ min(m,n), row i was interchanged with row ipiv(i).
?gtts2
Solves a system of linear equations with a tridiagonal
matrix using the LU factorization computed by
?gttrf.
Syntax
call sgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
call dgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
call cgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
call zgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
Include Files
mkl.fi
Description
The routine solves for X one of the following systems of linear equations with multiple right hand sides:
A*X = B, A^T*X = B, or A^H*X = B (for complex matrices only), with a tridiagonal matrix A using the LU
factorization computed by ?gttrf.
Input Parameters
If itrans = 1, then A^T*X = B (transpose).
nrhs INTEGER. The number of right-hand sides, i.e., the number of columns in B
(nrhs ≥ 0).
The array dl contains the (n - 1) multipliers that define the matrix L from
the LU factorization of A.
The array d contains the n diagonal elements of the upper triangular matrix
U from the LU factorization of A.
The array du contains the (n - 1) elements of the first super-diagonal of U.
The array du2 contains the (n - 2) elements of the second super-diagonal of
U.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations.
ipiv INTEGER.
Array, DIMENSION (n).
Output Parameters
?isnan
Tests input for NaN.
Syntax
val = sisnan( sin )
val = disnan( din )
Include Files
mkl.fi
Description
This logical routine returns .TRUE. if its argument is NaN, and .FALSE. otherwise.
Input Parameters
Output Parameters
?laisnan
Tests input for NaN.
Syntax
val = slaisnan( sin1, sin2 )
val = dlaisnan( din1, din2 )
Include Files
mkl.fi
Description
This logical routine checks for NaNs (NaN stands for 'Not A Number') by comparing its two arguments for
inequality. NaN is the only floating-point value for which NaN ≠ NaN returns .TRUE. To check for NaNs, pass the
same variable as both arguments.
This routine is not for general use. It exists solely to avoid over-optimization in ?isnan.
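The inequality test is easy to demonstrate in Python, whose float type follows IEEE 754 comparison rules. The function name laisnan_sketch is hypothetical and only mirrors the two-argument comparison described above.

```python
import math


def laisnan_sketch(x1, x2):
    """Return True when the two arguments compare unequal."""
    return x1 != x2


nan = float("nan")
inf = float("inf")

nan_detected = laisnan_sketch(nan, nan)     # NaN is unequal to itself
one_detected = laisnan_sketch(1.0, 1.0)     # ordinary value equals itself
inf_detected = laisnan_sketch(inf, inf)     # infinity equals itself
agrees = nan_detected == math.isnan(nan)    # matches the standard test
```

Passing the same variable twice therefore yields True only for NaN.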
Input Parameters
Output Parameters
?labrd
Reduces the first nb rows and columns of a general
matrix to a bidiagonal form.
Syntax
call slabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
call dlabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
call clabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
call zlabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
Include Files
mkl.fi
Description
The routine reduces the first nb rows and columns of a general m-by-n matrix A to upper or lower bidiagonal
form by an orthogonal/unitary transformation Q'*A*P, and returns the matrices X and Y which are needed to
apply the transformation to the unreduced part of A.
If m ≥ n, A is reduced to upper bidiagonal form; if m < n, to lower bidiagonal form.
The matrices Q and P are represented as products of elementary reflectors: Q = H(1)*H(2)* ...*H(nb),
and P = G(1)*G(2)* ...*G(nb).
Input Parameters
ldx INTEGER. The leading dimension of the output array x; must be at least
max(1, m).
ldy INTEGER. The leading dimension of the output array y; must be at least
max(1, n).
Output Parameters
a On exit, the first nb rows and columns of the matrix are overwritten; the
rest of the array is unchanged.
if m ≥ n, elements on and below the diagonal in the first nb columns, with the
array tauq, represent the orthogonal/unitary matrix Q as a product of
elementary reflectors; and elements above the diagonal in the first nb rows,
with the array taup, represent the orthogonal/unitary matrix P as a product
of elementary reflectors.
if m < n, elements below the diagonal in the first nb columns, with the
array tauq, represent the orthogonal/unitary matrix Q as a product of
elementary reflectors, and elements on and above the diagonal in the first
nb rows, with the array taup, represent the orthogonal/unitary matrix P as
a product of elementary reflectors.
Application Notes
If m ≥ n, then for the elementary reflectors H(i) and G(i),
v(1:i-1) = 0, v(i) = 1, and v(i:m) is stored on exit in a(i:m, i); u(1:i) = 0, u(i+1) = 1, and
u(i+1:n) is stored on exit in a(i, i+1:n);
tauq is stored in tauq(i) and taup in taup(i).
if m < n,
The contents of a on exit are illustrated by the following examples with nb = 2:
where a denotes an element of the original matrix which is unchanged, vi denotes an element of the vector
defining H(i), and ui an element of the vector defining G(i).
?lacn2
Estimates the 1-norm of a square matrix, using
reverse communication for evaluating matrix-vector
products.
Syntax
call slacn2( n, v, x, isgn, est, kase, isave )
call dlacn2( n, v, x, isgn, est, kase, isave )
call clacn2( n, v, x, est, kase, isave )
call zlacn2( n, v, x, est, kase, isave )
Include Files
mkl.fi
Description
The routine estimates the 1-norm of a square, real or complex matrix A. Reverse communication is used for
evaluating matrix-vector products.
Input Parameters
isgn INTEGER.
Workspace array, size (n), used with real flavors only.
kase INTEGER.
On the initial call to the routine, kase must be set to 0.
Output Parameters
isave This parameter is used to save variables between calls to the routine.
?lacon
Estimates the 1-norm of a square matrix, using
reverse communication for evaluating matrix-vector
products.
Syntax
call slacon( n, v, x, isgn, est, kase )
call dlacon( n, v, x, isgn, est, kase )
call clacon( n, v, x, est, kase )
call zlacon( n, v, x, est, kase )
Include Files
mkl.fi
Description
The routine estimates the 1-norm of a square, real/complex matrix A. Reverse communication is used for
evaluating matrix-vector products.
WARNING
The ?lacon routine is not thread-safe. It is deprecated and retained for backward compatibility
only. Use the thread-safe ?lacn2 routine instead.
Input Parameters
v is a workspace array.
x is used as input after an intermediate return.
isgn INTEGER.
Workspace array, DIMENSION (n), used with real flavors only.
kase INTEGER.
On the initial call to ?lacon, kase should be 0.
Output Parameters
?lacpy
Copies all or part of one two-dimensional array to
another.
Syntax
call slacpy( uplo, m, n, a, lda, b, ldb )
call dlacpy( uplo, m, n, a, lda, b, ldb )
call clacpy( uplo, m, n, a, lda, b, ldb )
call zlacpy( uplo, m, n, a, lda, b, ldb )
Include Files
mkl.fi
Description
Input Parameters
The data types are given for the Fortran interface.
uplo CHARACTER*1.
Specifies the part of the matrix A to be copied to B.
If uplo = 'U', the upper triangular part of A;
lda INTEGER. The leading dimension of a; lda ≥ max(1, m).
ldb INTEGER. The leading dimension of the output array b; ldb ≥ max(1, m).
Output Parameters
?ladiv
Performs complex division in real arithmetic, avoiding
unnecessary overflow.
Syntax
call sladiv( a, b, c, d, p, q )
call dladiv( a, b, c, d, p, q )
res = cladiv( x, y )
res = zladiv( x, y )
Include Files
mkl.fi
Description
res = x/y,
where x and y are complex. The computation of x / y will not overflow on an intermediary step unless the
result overflows.
The algorithm used is due to [Baudin12].
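To show why a scaled formulation matters, the following Python sketch divides complex numbers in real arithmetic using Smith's classic scaling. This is a simpler technique substituted for illustration; it is not the [Baudin12] algorithm that the routine uses. The sketch assumes the real flavors compute p + i*q = (a + i*b)/(c + i*d); the function names are hypothetical.

```python
def naive_div(a, b, c, d):
    """Textbook formula; c*c + d*d can overflow for large c and d."""
    den = c * c + d * d
    return (a * c + b * d) / den, (b * c - a * d) / den


def smith_div(a, b, c, d):
    """(a + i*b)/(c + i*d) -> (p, q) using Smith's scaling by a ratio."""
    if abs(d) <= abs(c):
        r = d / c
        den = c + d * r
        return (a + b * r) / den, (b - a * r) / den
    r = c / d
    den = c * r + d
    return (a * r + b) / den, (b * r - a) / den


big = 1e200
p_naive, q_naive = naive_div(big, big, big, big)   # intermediates overflow
p, q = smith_div(big, big, big, big)               # stays in range
```

In double precision the naive formula produces NaN for this input because both the numerator and the denominator overflow to infinity, while the scaled version returns the exact quotient (1, 0).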
Input Parameters
The scalars a, b, c, and d in the above expression (for real flavors only).
Output Parameters
?lae2
Computes the eigenvalues of a 2-by-2 symmetric
matrix.
Syntax
call slae2( a, b, c, rt1, rt2 )
call dlae2( a, b, c, rt1, rt2 )
Include Files
mkl.fi
Description
On return, rt1 is the eigenvalue of larger absolute value, and rt2 is the eigenvalue of smaller absolute value.
Input Parameters
Output Parameters
DOUBLE PRECISION for dlae2
The computed eigenvalues of larger and smaller absolute value,
respectively.
Application Notes
rt1 is accurate to a few ulps barring over/underflow. rt2 may be inaccurate if there is massive cancellation in
the determinant a*c-b*b; higher precision or correctly rounded or correctly truncated arithmetic would be
needed to compute rt2 accurately in all cases.
Overflow is possible only if rt1 is within a factor of 5 of overflow. Underflow is harmless if the input data is 0
or exceeds underflow_threshold / macheps.
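The accuracy remarks above can be illustrated with a Python sketch that computes rt1 from the trace and a scaled square root, and rt2 from the determinant divided by rt1. It assumes the matrix is [[a, b], [b, c]] and is one numerically careful formulation, not necessarily the library's exact code; lae2_sketch is a hypothetical name.

```python
import math


def lae2_sketch(a, b, c):
    """Eigenvalues (rt1, rt2) of [[a, b], [b, c]], with |rt1| >= |rt2|."""
    sm = a + c
    rt = math.hypot(a - c, b + b)        # sqrt((a-c)^2 + 4*b^2), scaled
    # Order a and c by magnitude so the determinant is formed safely.
    acmx, acmn = (a, c) if abs(a) > abs(c) else (c, a)
    if sm < 0.0:
        rt1 = 0.5 * (sm - rt)
    elif sm > 0.0:
        rt1 = 0.5 * (sm + rt)
    else:
        # Zero trace: the eigenvalues are +rt/2 and -rt/2.
        return 0.5 * rt, -0.5 * rt
    # rt2 = det / rt1, written to avoid forming a*c - b*b directly.
    rt2 = (acmx / rt1) * acmn - (b / rt1) * b
    return rt1, rt2


rt1, rt2 = lae2_sketch(2.0, 1.0, 2.0)
```

For a = c = 2 and b = 1 the eigenvalues are 3 and 1. The expression for rt2 is where the cancellation mentioned above can occur when a*c and b*b are nearly equal.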
?laebz
Computes the number of eigenvalues of a real
symmetric tridiagonal matrix which are less than or
equal to a given value, and performs other tasks
required by the routine ?stebz.
Syntax
call slaebz( ijob, nitmax, n, mmax, minp, nbmin, abstol, reltol, pivmin, d, e, e2,
nval, ab, c, mout, nab, work, iwork, info )
call dlaebz( ijob, nitmax, n, mmax, minp, nbmin, abstol, reltol, pivmin, d, e, e2,
nval, ab, c, mout, nab, work, iwork, info )
Include Files
mkl.fi
Description
The routine ?laebz contains the iteration loops which compute and use the function n(w), which is the count
of eigenvalues of a symmetric tridiagonal matrix T less than or equal to its argument w. It performs a choice
of two types of loops:
ijob =2: It takes as input a list of intervals and returns a list of sufficiently small
intervals whose union contains the same eigenvalues as the union of the original
intervals. The input intervals are (ab(j,1),ab(j,2)], j=1,...,minp. The
output interval (ab(j,1),ab(j,2)] will contain eigenvalues nab(j,1)+1,...,nab(j,2),
where 1 ≤ j ≤ mout.
ijob =3: It performs a binary search in each input interval (ab(j,1),ab(j,2)] for a
point w(j) such that n(w(j))=nval(j), and uses c(j) as the starting point of the
search. If such a w(j) is found, then on output ab(j,1)=ab(j,2)=w. If no such
w(j) is found, then on output (ab(j,1),ab(j,2)] will be a small interval
containing the point where n(w) jumps through nval(j), unless that point lies
outside the initial interval.
Note that the intervals are in all cases half-open intervals, that is, of the form (a,b], which includes b but
not a.
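The counting function n(w) and its use in bisection over a half-open interval can be sketched in Python. This is a minimal illustration assuming a standard Sturm-sequence recurrence for a symmetric tridiagonal matrix with diagonal d and off-diagonal e; the function names and the default value of pivmin are hypothetical, and the sketch omits the interval bookkeeping (ab, nab, mout) that the routine performs.

```python
def sturm_count(d, e, w, pivmin=1e-300):
    """Count non-positive pivots of T - w*I, i.e. eigenvalues of T <= w."""
    count = 0
    q = d[0] - w
    if abs(q) <= pivmin:
        q = -pivmin                  # treat a tiny pivot as negative
    if q <= 0.0:
        count += 1
    for j in range(1, len(d)):
        q = d[j] - e[j - 1] ** 2 / q - w
        if abs(q) <= pivmin:
            q = -pivmin
        if q <= 0.0:
            count += 1
    return count


def kth_eigenvalue(d, e, k, lo, hi, tol=1e-12):
    """Bisect the half-open interval (lo, hi], which must satisfy
    sturm_count(lo) < k <= sturm_count(hi), down to width tol."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sturm_count(d, e, mid) >= k:
            hi = mid                 # k-th eigenvalue lies in (lo, mid]
        else:
            lo = mid                 # k-th eigenvalue lies in (mid, hi]
    return hi


d = [2.0, 2.0, 2.0]
e = [1.0, 1.0]
counts = [sturm_count(d, e, w) for w in (0.0, 1.5, 2.5, 4.0)]
middle = kth_eigenvalue(d, e, 2, 0.0, 4.0)
```

This matrix has eigenvalues 2 - sqrt(2), 2, and 2 + sqrt(2), so the counts at 0, 1.5, 2.5, and 4 are 0, 1, 2, and 3, and the bisection for k = 2 converges to 2.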
To avoid underflow, the matrix should be scaled so that its largest element is no greater than overflow^(1/2) *
underflow^(1/4) in absolute value. To assure the most accurate computation of small eigenvalues, the matrix
should be scaled to be not much smaller than that, either.
NOTE
In general, the arguments are not checked for unreasonable values.
Input Parameters
mmax INTEGER. The maximum number of intervals. If more than mmax intervals
are generated, then ?laebz will quit with info=mmax+1.
minp INTEGER. The initial number of intervals. It may not be greater than mmax.
nbmin INTEGER. The smallest number of intervals that should be processed using
a vector loop. If zero, then only the scalar loop will be used.
The minimum absolute value of a "pivot" in the Sturm sequence loop. This
value must be at least (max |e(j)**2|*safe_min) and at least safe_min,
where safe_min is at least the smallest number that can divide one without
overflow.
nval INTEGER.
Array, dimension (minp).
If ijob=1 or 2, not referenced.
If ijob=3, the desired values of n(w).
nab INTEGER.
Array, dimension (mmax,2)
If ijob=2, then on input, nab(i,j) should be set. It must satisfy the
condition:
n(ab(i,1)) ≤ nab(i,1) ≤ nab(i,2) ≤ n(ab(i,2)), which means that in
interval i only eigenvalues nab(i,1)+1,...,nab(i,2) are considered.
Usually, nab(i,j)=n(ab(i,j)), from a previous call to ?laebz with
ijob=1.
If ijob=3, normally, nab should be set to some distinctive value(s) before
?laebz is called.
iwork INTEGER.
Workspace array, dimension (mmax).
Output Parameters
nval The elements of nval will be reordered to correspond with the intervals in
ab. Thus, nval(j) on output will not, in general, be the same as nval(j) on
input, but it will correspond with the interval (ab(j,1),ab(j,2)] on
output.
ab The input intervals will, in general, be modified, split, and reordered by the
calculation.
mout INTEGER.
If ijob=1, the number of eigenvalues in the intervals.
If ijob=2 or 3, the number of intervals output.
If ijob=3, mout will equal minp.
info INTEGER.
If info = 0, all intervals converged.
Application Notes
This routine is intended to be called only by other LAPACK routines, thus the interface is less user-friendly. It
is intended for two purposes:
(a) finding eigenvalues. In this case, ?laebz should have one or more initial intervals set up in ab, and
?laebz should be called with ijob=1. This sets up nab, and also counts the eigenvalues. Intervals with no
eigenvalues would usually be thrown out at this point. Also, if not all the eigenvalues in an interval i are
desired, nab(i,1) can be increased or nab(i,2) decreased. For example, set nab(i,1)=nab(i,2)-1 to get the
largest eigenvalue. ?laebz is then called with ijob=2 and mmax no smaller than the value of mout returned
by the call with ijob=1. After this (ijob=2) call, eigenvalues nab(i,1)+1 through nab(i,2) are approximately
ab(i,1) (or ab(i,2)) to the tolerance specified by abstol and reltol.
(b) finding an interval (a',b'] containing eigenvalues w(f),...,w(l). In this case, start with a Gershgorin
interval (a,b). Set up ab to contain 2 search intervals, both initially (a,b). One nval element should contain
f-1 and the other should contain l, while c should contain a and b, respectively. nab(i,1) should be -1 and
nab(i,2) should be n+1, to flag an error if the desired interval does not lie in (a,b). ?laebz is then called with
ijob=3. On exit, if w(f-1) < w(f), then one of the intervals -- j -- will have ab(j,1)=ab(j,2) and
nab(j,1)=nab(j,2)=f-1, while if, to the specified tolerance, w(f-k)=...=w(f+r), k > 0 and r ≥ 0, then the
interval will have n(ab(j,1))=nab(j,1)=f-k and n(ab(j,2))=nab(j,2)=f+r. The cases w(l) < w(l+1)
and w(l-r)=...=w(l+k) are handled similarly.
?laed0
Used by ?stedc. Computes all eigenvalues and
corresponding eigenvectors of an unreduced
symmetric tridiagonal matrix using the divide and
conquer method.
Syntax
call slaed0( icompq, qsiz, n, d, e, q, ldq, qstore, ldqs, work, iwork, info )
call dlaed0( icompq, qsiz, n, d, e, q, ldq, qstore, ldqs, work, iwork, info )
call claed0( qsiz, n, d, e, q, ldq, qstore, ldqs, rwork, iwork, info )
call zlaed0( qsiz, n, d, e, q, ldq, qstore, ldqs, rwork, iwork, info )
Include Files
mkl.fi
Description
Real flavors of this routine compute all eigenvalues and (optionally) corresponding eigenvectors of a
symmetric tridiagonal matrix using the divide and conquer method.
Complex flavors claed0/zlaed0 compute all eigenvalues of a symmetric tridiagonal matrix which is one
diagonal block of those from reducing a dense or band Hermitian matrix and corresponding eigenvectors of
the dense or band matrix.
Input Parameters
qsiz INTEGER.
The dimension of the orthogonal/unitary matrix used to reduce the full
matrix to tridiagonal form; qsiz ≥ n (for real flavors, qsiz ≥ n if icompq = 1).
d(*) contains the main diagonal of the tridiagonal matrix. The dimension of
d must be at least max(1, n).
ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
ldqs INTEGER. The leading dimension of the array qstore; ldqs ≥ max(1, n).
iwork INTEGER.
Workspace array.
For real flavors, if icompq = 0 or 1, and for complex flavors, the dimension
of iwork must be at least (6 + 6n + 5n*log2(n)).
For real flavors, if icompq = 2, the dimension of iwork must be at least
(3+5n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?laed1
Used by sstedc/dstedc. Computes the updated
eigensystem of a diagonal matrix after modification by
a rank-one symmetric matrix. Used when the original
matrix is tridiagonal.
Syntax
call slaed1( n, d, q, ldq, indxq, rho, cutpnt, work, iwork, info )
call dlaed1( n, d, q, ldq, indxq, rho, cutpnt, work, iwork, info )
Include Files
mkl.fi
Description
The routine ?laed1 computes the updated eigensystem of a diagonal matrix after modification by a rank-one
symmetric matrix. This routine is used only for the eigenproblem which requires all eigenvalues and
eigenvectors of a tridiagonal matrix. ?laed7 handles the case in which eigenvalues only or eigenvalues and
eigenvectors of a full symmetric matrix (which was reduced to tridiagonal form) are desired.
T = Q(in)*(D(in) + rho*Z*Z^T)*Q^T(in) = Q(out)*D(out)*Q^T(out)
where Z = Q^T*u, u is a vector of length n with ones in the cutpnt and (cutpnt+1)-th elements and zeros
elsewhere. The eigenvectors of the original matrix are stored in Q, and the eigenvalues are in D. The
algorithm consists of three stages:
The first stage consists of deflating the size of the problem when there are multiple eigenvalues or if there is
a zero in the z vector. For each such occurrence the dimension of the secular equation problem is reduced by
one. This stage is performed by the routine ?laed2.
The second stage consists of calculating the updated eigenvalues. This is done by finding the roots of the
secular equation via the routine ?laed4 (as called by ?laed3). This routine also calculates the eigenvectors
of the current problem.
The final stage consists of computing the updated eigenvectors directly using the updated eigenvalues. The
eigenvectors for the current problem are multiplied with the eigenvectors from the overall problem.
Input Parameters
ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
cutpnt INTEGER.
The location of the last eigenvalue in the leading sub-matrix.
min(1,n) ≤ cutpnt ≤ n/2.
iwork INTEGER.
Workspace array, dimension (4n).
Output Parameters
indxq On exit, contains the permutation which will reintegrate the subproblems
back into sorted order, that is, d(indxq(i)), i = 1, ..., n, will be in ascending
order.
info INTEGER.
If info = 0, the execution is successful.
?laed2
Used by sstedc/dstedc. Merges eigenvalues and
deflates secular equation. Used when the original
matrix is tridiagonal.
Syntax
call slaed2( k, n, n1, d, q, ldq, indxq, rho, z, dlamda, w, q2, indx, indxc, indxp,
coltyp, info )
call dlaed2( k, n, n1, d, q, ldq, indxq, rho, z, dlamda, w, q2, indx, indxc, indxp,
coltyp, info )
Include Files
mkl.fi
Description
The routine ?laed2 merges the two sets of eigenvalues together into a single sorted set. Then it tries to
deflate the size of the problem. There are two ways in which deflation can occur: when two or more
eigenvalues are close together or if there is a tiny entry in the z vector. For each such occurrence the order of
the related secular equation problem is reduced by one.
Input Parameters
z(*) contains the updating vector (the last row of the first sub-eigenvector
matrix and the first row of the second sub-eigenvector matrix).
ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
coltyp INTEGER.
Workspace array, dimension (n).
During execution, a label which will indicate which of the following types a
column in the q2 matrix is:
1 : non-zero in the upper half only;
2 : dense;
3 : non-zero in the lower half only;
4 : deflated.
Output Parameters
d On exit, d contains the trailing (n-k) updated eigenvalues (those which were
deflated) sorted into increasing order.
rho On exit, rho has been modified to the value required by ?laed3.
The array w contains the first k values of the final deflation-altered z-vector
which is passed to ?laed3.
The permutation used to arrange the columns of the deflated q matrix into
three groups: the first group contains non-zero elements only at and above
n1, the second contains non-zero elements only below n1, and the third is
dense.
coltyp On exit, coltyp(i) is the number of columns of type i, for i=1 to 4 only (see
the definition of types in the description of coltyp in Input Parameters).
info INTEGER.
If info = 0, the execution is successful.
?laed3
Used by sstedc/dstedc. Finds the roots of the
secular equation and updates the eigenvectors. Used
when the original matrix is tridiagonal.
Syntax
call slaed3( k, n, n1, d, q, ldq, rho, dlamda, q2, indx, ctot, w, s, info )
call dlaed3( k, n, n1, d, q, ldq, rho, dlamda, q2, indx, ctot, w, s, info )
Include Files
mkl.fi
Description
The routine ?laed3 finds the roots of the secular equation, as defined by the values in d, w, and rho,
between 1 and k.
It makes the appropriate calls to ?laed4 and then updates the eigenvectors by multiplying the matrix of
eigenvectors of the pair of eigensystems being combined by the matrix of eigenvectors of the k-by-k system
which is solved here.
This code makes very mild assumptions about floating point arithmetic. It will work on machines with a guard
digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray X-MP,
Cray Y-MP, Cray C-90, or Cray-2. It could conceivably fail on hexadecimal or decimal machines without guard
digits, but none are known.
Input Parameters
Array q(ldq, *). The second dimension of q must be at least max(1, n).
ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
The first k elements of the array dlamda contain the old roots of the
deflated updating problem. These are the poles of the secular equation.
The first k columns of the array q2 contain the non-deflated eigenvectors
for the split problem. The second dimension of q2 must be at least max(1,
n).
The first k elements of the array w contain the components of the deflation-
adjusted updating vector.
Output Parameters
dlamda May be changed on output by having lowest order bit set to zero on Cray X-MP,
Cray Y-MP, Cray-2, or Cray C-90, as described above.
w Destroyed on exit.
info INTEGER.
If info = 0, the execution is successful.
?laed4
Used by sstedc/dstedc. Finds a single root of the
secular equation.
Syntax
call slaed4( n, i, d, z, delta, rho, dlam, info )
call dlaed4( n, i, d, z, delta, rho, dlam, info )
Include Files
mkl.fi
Description
This routine computes the i-th updated eigenvalue of a symmetric rank-one modification to a diagonal matrix
whose elements are given in the array d. It is assumed that
D(i) < D(j) for i < j
and that rho > 0. This is arranged by the calling routine, and is no loss in generality. The rank-one modified
system is thus
diag(D) + rho*Z * transpose(Z).
where we assume the Euclidean norm of Z is 1.
The method consists of approximating the rational functions in the secular equation by simpler interpolating
rational functions.
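As an illustration of what "a root of the secular equation" means here, the sketch below locates the i-th root of the standard secular function f(lambda) = 1 + rho * sum_j z(j)^2 / (d(j) - lambda) for diag(D) + rho*Z*transpose(Z). It is not the method used by ?laed4: plain bisection is swapped in for the rational interpolation, and the function names, the iteration cap, and the requirement that every z(j) is nonzero are assumptions made for the example.

```python
def secular(lam, d, z, rho):
    """Evaluate f(lam) = 1 + rho * sum_j z[j]**2 / (d[j] - lam)."""
    return 1.0 + rho * sum(zj * zj / (dj - lam) for dj, zj in zip(d, z))


def secular_root(i, d, z, rho, max_iter=200):
    """Return the i-th (1-based) root of the secular equation by bisection.

    Assumes d is strictly increasing, rho > 0, the Euclidean norm of z is 1,
    and no component of z is zero. The i-th root then lies in (d[i-1], d[i])
    for i < n, and in (d[n-1], d[n-1] + rho] for i = n.
    """
    n = len(d)
    lo = d[i - 1]
    hi = d[i] if i < n else d[n - 1] + rho
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:
            break  # the interval cannot be split any further
        # f increases from -infinity to +infinity across the interval.
        if secular(mid, d, z, rho) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

With d = [1, 2], z = [0.6, 0.8], and rho = 1, the modified matrix has eigenvalues 1.2 and 2.8, which is what secular_root(1, ...) and secular_root(2, ...) approximate.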
Input Parameters
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?laed5
Used by sstedc/dstedc. Solves the 2-by-2 secular
equation.
Syntax
call slaed5( i, d, z, delta, rho, dlam )
call dlaed5( i, d, z, delta, rho, dlam )
Include Files
mkl.fi
Description
The routine computes the i-th eigenvalue of a symmetric rank-one modification of a 2-by-2 diagonal matrix
diag(D) + rho*Z * transpose(Z).
The diagonal elements in the array D are assumed to satisfy
D(i) < D(j) for i < j.
We also assume rho > 0 and that the Euclidean norm of the vector Z is one.
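For the 2-by-2 case the eigenvalues can be written in closed form from the trace and determinant of diag(D) + rho*Z*transpose(Z). The sketch below uses that textbook quadratic formula; it is an illustration under the assumptions stated above, not the more careful formulation a library routine would use to limit cancellation, and the function name is hypothetical.

```python
import math


def rank_one_2x2_eigenvalue(i, d, z, rho):
    """Return the i-th (i = 1 or 2) eigenvalue of diag(d) + rho * z * z^T.

    d and z each hold two entries, with d[0] < d[1].
    """
    d1, d2 = d
    z1, z2 = z
    trace = d1 + d2 + rho * (z1 * z1 + z2 * z2)
    det = d1 * d2 + rho * (d1 * z2 * z2 + d2 * z1 * z1)
    # The discriminant is non-negative for a real symmetric matrix;
    # max() only guards against rounding.
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    if i == 1:
        return 0.5 * (trace - disc)
    if i == 2:
        return 0.5 * (trace + disc)
    raise ValueError("i must be 1 or 2")
```

For d = [1, 2], z = [0.6, 0.8], and rho = 1 this gives approximately 1.2 and 2.8, and the two values interlace with d as expected: d(1) < 1.2 < d(2) < 2.8.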
Input Parameters
Arrays, dimension (2) each. The array d contains the original eigenvalues. It
is assumed that d(1) < d(2).
Output Parameters
?laed6
Used by sstedc/dstedc. Computes one Newton step
in solution of the secular equation.
Syntax
call slaed6( kniter, orgati, rho, d, z, finit, tau, info )
call dlaed6( kniter, orgati, rho, d, z, finit, tau, info )
Include Files
mkl.fi
Description
The routine computes the positive or negative root (closest to the origin) of
f(x) = rho + z(1)/(d(1)-x) + z(2)/(d(2)-x) + z(3)/(d(3)-x).
It is assumed that if orgati = .TRUE. the root is between d(2) and d(3); otherwise it is between d(1) and
d(2). This routine is called by ?laed4 when necessary. In most cases, the root sought is the smallest in
magnitude, though it might not be in some extremely rare situations.
Input Parameters
kniter INTEGER.
orgati LOGICAL.
If orgati = .TRUE., the needed root is between d(2) and d(3); otherwise
it is between d(1) and d(2). See ?laed4 for further details.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?laed7
Used by ?stedc. Computes the updated eigensystem
of a diagonal matrix after modification by a rank-one
symmetric matrix. Used when the original matrix is
dense.
Syntax
call slaed7( icompq, n, qsiz, tlvls, curlvl, curpbm, d, q, ldq, indxq, rho, cutpnt,
qstore, qptr, prmptr, perm, givptr, givcol, givnum, work, iwork, info )
call dlaed7( icompq, n, qsiz, tlvls, curlvl, curpbm, d, q, ldq, indxq, rho, cutpnt,
qstore, qptr, prmptr, perm, givptr, givcol, givnum, work, iwork, info )
call claed7( n, cutpnt, qsiz, tlvls, curlvl, curpbm, d, q, ldq, rho, indxq, qstore,
qptr, prmptr, perm, givptr, givcol, givnum, work, rwork, iwork, info )
call zlaed7( n, cutpnt, qsiz, tlvls, curlvl, curpbm, d, q, ldq, rho, indxq, qstore,
qptr, prmptr, perm, givptr, givcol, givnum, work, rwork, iwork, info )
Include Files
mkl.fi
Description
The routine ?laed7 computes the updated eigensystem of a diagonal matrix after modification by a rank-one
symmetric matrix. This routine is used only for the eigenproblem which requires all eigenvalues and
optionally eigenvectors of a dense symmetric/Hermitian matrix that has been reduced to tridiagonal form.
For real flavors, slaed1/dlaed1 handles the case in which all eigenvalues and eigenvectors of a symmetric
tridiagonal matrix are desired.
T = Q(in)*(D(in) + rho*Z*Z^T)*Q^T(in) = Q(out)*D(out)*Q^T(out) for real flavors, or
T = Q(in)*(D(in) + rho*Z*Z^H)*Q^H(in) = Q(out)*D(out)*Q^H(out) for complex flavors
where Z = Q^T*u for real flavors and Z = Q^H*u for complex flavors, u is a vector of length n with ones in the
cutpnt and (cutpnt + 1) -th elements and zeros elsewhere. The eigenvectors of the original matrix are
stored in Q, and the eigenvalues are in D. The algorithm consists of three stages:
The first stage consists of deflating the size of the problem when there are multiple eigenvalues or if there is
a zero in the z vector. For each such occurrence the dimension of the secular equation problem is reduced by
one. This stage is performed by the routine slaed8/dlaed8 (for real flavors) or by the routine slaed2/
dlaed2 (for complex flavors).
The second stage consists of calculating the updated eigenvalues. This is done by finding the roots of the
secular equation via the routine ?laed4 (as called by ?laed9 or ?laed3). This routine also calculates the
eigenvectors of the current problem.
The final stage consists of computing the updated eigenvectors directly using the updated eigenvalues. The
eigenvectors for the current problem are multiplied with the eigenvectors from the overall problem.
Input Parameters
cutpnt INTEGER. The location of the last eigenvalue in the leading sub-matrix.
min(1,n) ≤ cutpnt ≤ n.
qsiz INTEGER.
The dimension of the orthogonal/unitary matrix used to reduce the full
matrix to tridiagonal form; qsiz ≥ n (for real flavors, qsiz ≥ n if icompq = 1).
tlvls INTEGER. The total number of merging levels in the overall divide and
conquer tree.
curpbm INTEGER. The current problem in the current level in the overall merge
routine (counting from upper left to lower right).
ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
qptr INTEGER. Array, dimension (n+2). Serves also as output parameter. List of
indices pointing to beginning of submatrices stored in qstore. The
submatrices are numbered starting at the bottom left of the divide and
conquer tree, from left to right and bottom to top.
The array givptr(*) contains a list of pointers which indicate where in givcol
a level's Givens rotations are stored. givptr(i+1) - givptr(i) indicates
the number of Givens rotations.
iwork INTEGER.
Workspace array, dimension (4n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?laed8
Used by ?stedc. Merges eigenvalues and deflates
secular equation. Used when the original matrix is
dense.
Syntax
call slaed8( icompq, k, n, qsiz, d, q, ldq, indxq, rho, cutpnt, z, dlamda, q2, ldq2,
w, perm, givptr, givcol, givnum, indxp, indx, info )
call dlaed8( icompq, k, n, qsiz, d, q, ldq, indxq, rho, cutpnt, z, dlamda, q2, ldq2,
w, perm, givptr, givcol, givnum, indxp, indx, info )
call claed8( k, n, qsiz, q, ldq, d, rho, cutpnt, z, dlamda, q2, ldq2, w, indxp, indx,
indxq, perm, givptr, givcol, givnum, info )
call zlaed8( k, n, qsiz, q, ldq, d, rho, cutpnt, z, dlamda, q2, ldq2, w, indxp, indx,
indxq, perm, givptr, givcol, givnum, info )
Include Files
mkl.fi
Description
The routine merges the two sets of eigenvalues together into a single sorted set. Then it tries to deflate the
size of the problem. There are two ways in which deflation can occur: when two or more eigenvalues are
close together or if there is a tiny element in the z vector. For each such occurrence the order of the related
secular equation problem is reduced by one.
Input Parameters
cutpnt INTEGER. The location of the last eigenvalue in the leading sub-matrix.
min(1,n) ≤ cutpnt ≤ n.
qsiz INTEGER.
The dimension of the orthogonal/unitary matrix used to reduce the full
matrix to tridiagonal form; qsiz ≥ n (for real flavors, qsiz ≥ n if icompq = 1).
On entry, z(*) contains the updating vector (the last row of the first sub-
eigenvector matrix and the first row of the second sub-eigenvector matrix).
The contents of z are destroyed by the updating process.
ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
ldq2 INTEGER. The leading dimension of the output array q2; ldq2 ≥ max(1, n).
Output Parameters
d On exit, contains the trailing (n-k) updated eigenvalues (those which were
deflated) sorted into increasing order.
rho On exit, rho has been modified to the value required by ?laed3.
Arrays, dimension (n) each. The array dlamda(*) contains a copy of the
first k eigenvalues which will be used by ?laed3 to form the secular
equation.
The array w(*) will hold the first k values of the final deflation-altered z-
vector and will be passed to ?laed3.
givptr INTEGER. Contains the number of Givens rotations which took place in this
subproblem.
info INTEGER.
If info = 0, the execution is successful.
?laed9
Used by sstedc/dstedc. Finds the roots of the
secular equation and updates the eigenvectors. Used
when the original matrix is dense.
Syntax
call slaed9( k, kstart, kstop, n, d, q, ldq, rho, dlamda, w, s, lds, info )
call dlaed9( k, kstart, kstop, n, d, q, ldq, rho, dlamda, w, s, lds, info )
Include Files
mkl.fi
Description
The routine finds the roots of the secular equation, as defined by the values in d, z, and rho, between kstart
and kstop. It makes the appropriate calls to slaed4/dlaed4 and then stores the new matrix of eigenvectors
for use in calculating the next level of z vectors.
Input Parameters
ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
The first k elements of the array w(*) contain the components of the
deflation-adjusted updating vector.
lds INTEGER. The leading dimension of the output array s; lds ≥ max(1, k).
Output Parameters
dlamda On exit, the value is modified to make sure all dlamda(i) - dlamda(j)
can be computed with high relative accuracy, barring overflow and
underflow.
w Destroyed on exit.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value. If info = 1, the
eigenvalue did not converge.
?laeda
Used by ?stedc. Computes the Z vector determining
the rank-one modification of the diagonal matrix. Used
when the original matrix is dense.
Syntax
call slaeda( n, tlvls, curlvl, curpbm, prmptr, perm, givptr, givcol, givnum, q, qptr,
z, ztemp, info )
call dlaeda( n, tlvls, curlvl, curpbm, prmptr, perm, givptr, givcol, givnum, q, qptr,
z, ztemp, info )
Include Files
mkl.fi
Description
The routine ?laeda computes the Z vector corresponding to the merge step in the curlvl-th step of the
merge process with tlvls steps for the curpbm-th problem.
Input Parameters
tlvls INTEGER. The total number of merging levels in the overall divide and
conquer tree.
curpbm INTEGER. The current problem in the current level in the overall merge
routine (counting from upper left to lower right).
qptr INTEGER. Array, dimension (n+2). Contains a list of pointers which indicate
where in q an eigenblock is stored. sqrt( qptr(i+1) - qptr(i))
indicates the size of the block.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
?laein
Computes a specified right or left eigenvector of an
upper Hessenberg matrix by inverse iteration.
Syntax
call slaein( rightv, noinit, n, h, ldh, wr, wi, vr, vi, b, ldb, work, eps3, smlnum,
bignum, info )
call dlaein( rightv, noinit, n, h, ldh, wr, wi, vr, vi, b, ldb, work, eps3, smlnum,
bignum, info )
call claein( rightv, noinit, n, h, ldh, w, v, b, ldb, rwork, eps3, smlnum, info )
call zlaein( rightv, noinit, n, h, ldh, w, v, b, ldb, rwork, eps3, smlnum, info )
Include Files
mkl.fi
Description
The routine ?laein uses inverse iteration to find a right or left eigenvector corresponding to the eigenvalue
(wr,wi) of a real upper Hessenberg matrix H (for real flavors slaein/dlaein) or to the eigenvalue w of a
complex upper Hessenberg matrix H (for complex flavors claein/zlaein).
Input Parameters
rightv LOGICAL.
If rightv = .TRUE., compute right eigenvector;
if rightv = .FALSE., compute left eigenvector.
noinit LOGICAL.
If noinit = .TRUE., no initial vector is supplied in (vr,vi) or in v (for
complex flavors);
if noinit = .FALSE., initial vector is supplied in (vr,vi) or in v (for
complex flavors).
DOUBLE COMPLEX for zlaein.
Array h(ldh, *).
The second dimension of h must be at least max(1, n). Contains the upper
Hessenberg matrix H.
ldh INTEGER. The leading dimension of the array h; ldh ≥ max(1, n).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info = 1, inverse iteration did not converge. For real flavors, vr is set to
the last iterate, and so is vi, if wi ≠ 0.0. For complex flavors, v is set to the
last iterate.
?laev2
Computes the eigenvalues and eigenvectors of a 2-
by-2 symmetric/Hermitian matrix.
Syntax
call slaev2( a, b, c, rt1, rt2, cs1, sn1 )
call dlaev2( a, b, c, rt1, rt2, cs1, sn1 )
call claev2( a, b, c, rt1, rt2, cs1, sn1 )
call zlaev2( a, b, c, rt1, rt2, cs1, sn1 )
Include Files
mkl.fi
Description
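The routine performs the eigendecomposition of a 2-by-2 symmetric matrix
[ a b ]
[ b c ]
(for slaev2/dlaev2) or Hermitian matrix
[ a        b ]
[ conjg(b) c ]
(for claev2/zlaev2).
On return, rt1 is the eigenvalue of larger absolute value, rt2 of smaller absolute value, and (cs1, sn1) is the
unit right eigenvector for rt1, giving the decomposition
[  cs1 sn1 ] [ a b ] [ cs1 -sn1 ]   [ rt1 0   ]
[ -sn1 cs1 ] [ b c ] [ sn1  cs1 ] = [ 0   rt2 ]
(for slaev2/dlaev2),
or
[  cs1 conjg(sn1) ] [ a        b ] [ cs1 -conjg(sn1) ]   [ rt1 0   ]
[ -sn1 cs1        ] [ conjg(b) c ] [ sn1  cs1        ] = [ 0   rt2 ]
(for claev2/zlaev2).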
Input Parameters
Output Parameters
Application Notes
rt1 is accurate to a few ulps barring over/underflow. rt2 may be inaccurate if there is massive cancellation in
the determinant a*c-b*b; higher precision or correctly rounded or correctly truncated arithmetic would be
needed to compute rt2 accurately in all cases. cs1 and sn1 are accurate to a few ulps barring over/underflow.
Overflow is possible only if rt1 is within a factor of 5 of overflow. Underflow is harmless if the input data is 0
or exceeds underflow_threshold / macheps.
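The real symmetric case can be sketched in a few lines. The code below follows the usual scaled-square-root formulation for this 2-by-2 problem, in which rt2 is obtained from the determinant divided by rt1, which is the step the note above refers to as sensitive to cancellation. It is a Python illustration, not the MKL source, and the function name sym2x2_eig is an assumption.

```python
import math


def sym2x2_eig(a, b, c):
    """Eigen-decompose [[a, b], [b, c]]; return (rt1, rt2, cs1, sn1).

    rt1 has the larger absolute value and (cs1, sn1) is its unit eigenvector.
    """
    sm, df, tb = a + c, a - c, b + b
    adf, ab = abs(df), abs(tb)
    acmx, acmn = (a, c) if abs(a) > abs(c) else (c, a)
    # rt = sqrt(df**2 + tb**2), scaled to avoid needless overflow.
    if adf > ab:
        rt = adf * math.sqrt(1.0 + (ab / adf) ** 2)
    elif adf < ab:
        rt = ab * math.sqrt(1.0 + (adf / ab) ** 2)
    else:
        rt = ab * math.sqrt(2.0)
    if sm < 0.0:
        rt1, sgn1 = 0.5 * (sm - rt), -1
        rt2 = (acmx / rt1) * acmn - (b / rt1) * b  # determinant / rt1
    elif sm > 0.0:
        rt1, sgn1 = 0.5 * (sm + rt), 1
        rt2 = (acmx / rt1) * acmn - (b / rt1) * b
    else:
        rt1, rt2, sgn1 = 0.5 * rt, -0.5 * rt, 1
    # Eigenvector: pick the better-conditioned of two equivalent formulas.
    if df >= 0.0:
        cs, sgn2 = df + rt, 1
    else:
        cs, sgn2 = df - rt, -1
    if abs(cs) > ab:
        ct = -tb / cs
        sn1 = 1.0 / math.sqrt(1.0 + ct * ct)
        cs1 = ct * sn1
    elif ab == 0.0:
        cs1, sn1 = 1.0, 0.0
    else:
        tn = -cs / tb
        cs1 = 1.0 / math.sqrt(1.0 + tn * tn)
        sn1 = tn * cs1
    if sgn1 == sgn2:
        # The vector found belongs to rt2; rotate by 90 degrees to get rt1's.
        cs1, sn1 = -sn1, cs1
    return rt1, rt2, cs1, sn1
```

For a = 2, b = 1, c = 2 the eigenvalues are 3 and 1, and the returned vector is proportional to (1, 1).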
?laexc
Swaps adjacent diagonal blocks of a real upper quasi-
triangular matrix in Schur canonical form, by an
orthogonal similarity transformation.
Syntax
call slaexc( wantq, n, t, ldt, q, ldq, j1, n1, n2, work, info )
call dlaexc( wantq, n, t, ldt, q, ldq, j1, n1, n2, work, info )
Include Files
mkl.fi
Description
The routine swaps adjacent diagonal blocks T11 and T22 of order 1 or 2 in an upper quasi-triangular matrix T
by an orthogonal similarity transformation.
T must be in Schur canonical form, that is, block upper triangular with 1-by-1 and 2-by-2 diagonal blocks;
each 2-by-2 diagonal block has its diagonal elements equal and its off-diagonal elements of opposite sign.
Input Parameters
wantq LOGICAL.
If wantq = .TRUE., accumulate the transformation in the matrix Q;
if wantq = .FALSE., do not accumulate the transformation.
t(ldt,*) contains on entry the upper quasi-triangular matrix T, in Schur
canonical form.
The second dimension of t must be at least max(1, n).
j1 INTEGER. The index of the first row of the first block T11.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info = 1, the transformed matrix T would be too far from Schur form;
the blocks are not swapped and T and Q are unchanged.
?lag2
Computes the eigenvalues of a 2-by-2 generalized
eigenvalue problem, with scaling as necessary to
avoid over-/underflow.
Syntax
call slag2( a, lda, b, ldb, safmin, scale1, scale2, wr1, wr2, wi )
call dlag2( a, lda, b, ldb, safmin, scale1, scale2, wr1, wr2, wi )
Include Files
mkl.fi
Description
The routine computes the eigenvalues of a 2-by-2 generalized eigenvalue problem A - w*B, with scaling as
necessary to avoid over-/underflow. The scaling factor, s, results in a modified eigenvalue equation
s*A - w*B,
where s is a non-negative scaling factor chosen so that w, w*B, and s*A do not overflow and, if possible, do
not underflow, either.
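To make the unscaled problem concrete, the sketch below finds the values w for which A - w*B is singular by expanding det(A - w*B) = 0 into a quadratic in w. This is a naive illustration only: it applies none of the scaling that ?lag2 is designed to provide, so it can overflow or lose accuracy, and it assumes det(B) is nonzero. The function name and the nested-list matrix layout are assumptions.

```python
import cmath


def gen_eig_2x2(A, B):
    """Return the two roots w of det(A - w*B) = 0 for 2-by-2 A and B.

    A and B are given as [[m11, m12], [m21, m22]]. Roots are returned as
    complex numbers, so a complex-conjugate pair is handled as well.
    """
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # det(A - w*B) = qa*w**2 + qb*w + qc
    qa = b11 * b22 - b12 * b21
    qb = -(a11 * b22 + a22 * b11 - a12 * b21 - a21 * b12)
    qc = a11 * a22 - a12 * a21
    if qa == 0:
        raise ZeroDivisionError("B is singular; an eigenvalue is infinite")
    disc = cmath.sqrt(qb * qb - 4.0 * qa * qc)
    return (-qb + disc) / (2.0 * qa), (-qb - disc) / (2.0 * qa)
```

Replacing A by s*A multiplies both roots by s, which is the relationship between the scaled equation s*A - w*B and the original problem.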
Input Parameters
Output Parameters
A scaling factor used to avoid over-/underflow in the eigenvalue equation
which defines the second eigenvalue. If the eigenvalues are complex, then
scale2=scale1. If the eigenvalues are real, then the second (real)
eigenvalue is wr2/scale2, but this may overflow or underflow, and in fact,
scale2 may be zero or less than the underflow threshold if the exact
eigenvalue is sufficiently large.
If the eigenvalue is complex, then wr1=wr2 is scale1 times the real part of
the eigenvalues.
?lags2
Computes 2-by-2 orthogonal matrices U, V, and Q,
and applies them to matrices A and B such that the
rows of the transformed A and B are parallel.
Syntax
call slags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq)
call dlags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq)
call clags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq)
call zlags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq)
Include Files
mkl.fi
Description
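For real flavors, the routine computes 2-by-2 orthogonal matrices U, V and Q, such that if upper = .TRUE.,
then
U^T*A*Q = U^T*[ a1 a2; 0 a3 ]*Q = [ x 0; x x ]
and
V^T*B*Q = V^T*[ b1 b2; 0 b3 ]*Q = [ x 0; x x ],
or if upper = .FALSE., then
U^T*A*Q = U^T*[ a1 0; a2 a3 ]*Q = [ x x; 0 x ]
and
V^T*B*Q = V^T*[ b1 0; b2 b3 ]*Q = [ x x; 0 x ].
The rows of the transformed A and B are parallel, where
U = [ csu snu; -snu csu ], V = [ csv snv; -snv csv ], Q = [ csq snq; -snq csq ].
Here [ p q; r s ] denotes the 2-by-2 matrix with rows (p, q) and (r, s).
For complex flavors, U, V and Q are unitary and the transposes above are replaced by conjugate transposes.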
Input Parameters
upper LOGICAL.
If upper = .TRUE., the input matrices A and B are upper triangular; If
upper = .FALSE., the input matrices A and B are lower triangular.
Output Parameters
?lagtf
Computes an LU factorization of a matrix T - lambda*I, where
T is a general tridiagonal matrix, and lambda is a scalar,
using partial pivoting with row interchanges.
Syntax
call slagtf( n, a, lambda, b, c, tol, d, in, info )
call dlagtf( n, a, lambda, b, c, tol, d, in, info )
Include Files
mkl.fi
Description
The routine factorizes the matrix (T - lambda*I), where T is an n-by-n tridiagonal matrix and lambda is a
scalar, as
T - lambda*I = P*L*U,
where P is a permutation matrix, L is a unit lower tridiagonal matrix with at most one non-zero sub-diagonal
element per column and U is an upper triangular matrix with at most two non-zero super-diagonal elements
per column. The factorization is obtained by Gaussian elimination with partial pivoting and implicit row
scaling. The parameter lambda is included in the routine so that ?lagtf may be used, in conjunction with
?lagts, to obtain eigenvectors of T by inverse iteration.
Input Parameters
Output Parameters
in INTEGER.
info INTEGER.
If info = 0, the execution is successful.
?lagtm
Performs a matrix-matrix product of the form C =
alpha*A*B+beta*C, where A is a tridiagonal matrix,
B and C are rectangular matrices, and alpha and
beta are scalars, which may be 0, 1, or -1.
Syntax
call slagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
call dlagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
call clagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
call zlagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
Include Files
mkl.fi
Description
Input Parameters
n INTEGER. The order of the matrix A (n ≥ 0).
nrhs INTEGER. The number of right-hand sides, i.e., the number of columns in X
and B (nrhs ≥ 0).
ldx INTEGER. The leading dimension of the array x; ldx ≥ max(1, n).
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
Output Parameters
?lagts
Solves the system of equations (T - lambda*I)*x = y
or (T - lambda*I)^T*x = y, where T is a general
tridiagonal matrix and lambda is a scalar, using the LU
factorization computed by ?lagtf.
Syntax
call slagts( job, n, a, b, c, d, in, y, tol, info )
call dlagts( job, n, a, b, c, d, in, y, tol, info )
Include Files
mkl.fi
Description
The routine may be used to solve for x one of the systems of equations:
(T - lambda*I)*x = y or (T - lambda*I)^T*x = y,
where T is an n-by-n tridiagonal matrix, following the factorization of (T - lambda*I) as
T - lambda*I = P*L*U,
computed by the routine ?lagtf.
The choice of equation to be solved is controlled by the argument job, and in each case there is an option to
perturb zero or very small diagonal elements of U, this option being intended for use in applications such as
inverse iteration.
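The factor-then-solve sequence can be illustrated with a compact sketch that performs both steps in one pass: Gaussian elimination with partial pivoting on T - lambda*I, exploiting the tridiagonal structure, followed by back substitution. It is a simplified Python illustration, not the ?lagtf/?lagts implementation: it omits the implicit row scaling and the optional perturbation of small diagonal elements of U, and it handles only the non-transposed system. The function name, and the convention that a, b, and c hold the diagonal, superdiagonal, and subdiagonal of T, are assumptions.

```python
def solve_shifted_tridiagonal(a, b, c, lam, y):
    """Solve (T - lam*I) x = y for a general tridiagonal T.

    a: n diagonal entries, b: n-1 superdiagonal entries,
    c: n-1 subdiagonal entries (assumed storage convention).
    """
    n = len(a)
    d = [ai - lam for ai in a]   # diagonal of T - lam*I
    du = list(b) + [0.0]         # first superdiagonal, padded to length n
    du2 = [0.0] * n              # second superdiagonal (fill-in from swaps)
    dl = list(c)                 # subdiagonal
    x = list(y)
    for k in range(n - 1):
        if abs(dl[k]) > abs(d[k]):
            # Partial pivoting: interchange rows k and k+1.
            d[k], dl[k] = dl[k], d[k]
            du[k], d[k + 1] = d[k + 1], du[k]
            du2[k], du[k + 1] = du[k + 1], du2[k]
            x[k], x[k + 1] = x[k + 1], x[k]
        if d[k] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        m = dl[k] / d[k]         # multiplier stored in L
        d[k + 1] -= m * du[k]
        du[k + 1] -= m * du2[k]
        x[k + 1] -= m * x[k]
    # Back substitution with U, which has at most two superdiagonals.
    for k in range(n - 1, -1, -1):
        s = x[k]
        if k + 1 < n:
            s -= du[k] * x[k + 1]
        if k + 2 < n:
            s -= du2[k] * x[k + 2]
        if d[k] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        x[k] = s / d[k]
    return x
```

In inverse iteration this solve would be repeated with lam fixed at an approximate eigenvalue and y replaced by the normalized previous solution; a library routine would reuse the stored factorization instead of refactoring each time.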
Input Parameters
in INTEGER.
Array, dimension (n).
On entry, in(*) must contain details of the matrix P as returned from
?lagtf.
Output Parameters
tol On exit, tol is changed as described in Input Parameters section above, only
if tol is non-positive on entry. Otherwise tol is unchanged.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value. If info = i >0,
overflow would occur when computing the i-th element of the solution vector
x. This can only occur when job is supplied as positive and either means
that a diagonal element of U is very small, or that the elements of the right-
hand side vector y are very large.
?lagv2
Computes the Generalized Schur factorization of a real
2-by-2 matrix pencil (A,B) where B is upper
triangular.
Syntax
call slagv2( a, lda, b, ldb, alphar, alphai, beta, csl, snl, csr, snr )
call dlagv2( a, lda, b, ldb, alphar, alphai, beta, csl, snl, csr, snr )
Include Files
mkl.fi
Description
The routine computes the Generalized Schur factorization of a real 2-by-2 matrix pencil (A,B) where B is
upper triangular. The routine computes orthogonal (rotation) matrices given by csl, snl and csr, snr such
that:
1) if the pencil (A,B) has two real eigenvalues (including 0/0 or 1/0 types), then
where b11 ≥ b22 > 0.
Input Parameters
Output Parameters
Note that beta(k) may be zero.
?lahqr
Computes the eigenvalues and Schur factorization of
an upper Hessenberg matrix, using the double-shift/
single-shift QR algorithm.
Syntax
call slahqr( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, info )
call dlahqr( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, info )
call clahqr( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, info )
call zlahqr( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, info )
Include Files
mkl.fi
Description
The routine is an auxiliary routine called by ?hseqr to update the eigenvalues and Schur decomposition
already computed by ?hseqr, by dealing with the Hessenberg submatrix in rows and columns ilo to ihi.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., the full Schur form T is required;
wantz LOGICAL.
If wantz = .TRUE., the matrix of Schur vectors Z is required;
Constraints:
1 ≤ ilo ≤ max(1,ihi); ihi ≤ n.
z(ldz,*)
If wantz = .TRUE., then, on entry, z must contain the current matrix Z of
transformations accumulated by ?hseqr. The second dimension of z must
be at least max(1, n).
iloz, ihiz INTEGER. Specify the rows of z to which transformations must be applied if
wantz = .TRUE.
1 ≤ iloz ≤ ilo; ihi ≤ ihiz ≤ n.
Output Parameters
w COMPLEX for clahqr
DOUBLE COMPLEX for zlahqr.
Array, DIMENSION at least max(1, n). Used with complex flavors only. The
computed eigenvalues ilo to ihi are stored in the corresponding elements of
w.
If wantt = .TRUE., the eigenvalues are stored in the same order as on the
diagonal of the Schur form returned in h, with w(i) = h(i,i).
info INTEGER.
If info = 0, the execution is successful.
?lahrd
Reduces the first nb columns of a general rectangular
matrix A so that elements below the k-th subdiagonal
are zero, and returns auxiliary matrices which are
needed to apply the transformation to the unreduced
part of A (deprecated).
Syntax
call slahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call dlahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call clahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call zlahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
Include Files
mkl.fi
Description
This routine is deprecated; use ?lahr2.
The routine reduces the first nb columns of a real/complex general n-by-(n-k+1) matrix A so that elements
below the k-th subdiagonal are zero. The reduction is performed by an orthogonal/unitary similarity
transformation Q^T*A*Q for real flavors, or Q^H*A*Q for complex flavors. The routine returns the matrices V
and T which determine Q as a block reflector I - V*T*V^T (for real flavors) or I - V*T*V^H (for complex flavors),
and also the matrix Y = A*V*T.
The matrix Q is represented as products of nb elementary reflectors:
Q = H(1)*H(2)*... *H(nb)
Each H(i) has the form
Input Parameters
k INTEGER. The offset for the reduction. Elements below the k-th subdiagonal
in the first nb columns are reduced to zero.
ldt INTEGER. The leading dimension of the output array t; must be at least
max(1, nb).
ldy INTEGER. The leading dimension of the output array y; must be at least
max(1, n).
Output Parameters
a On exit, the elements on and above the k-th subdiagonal in the first nb
columns are overwritten with the corresponding elements of the reduced
matrix; the elements below the k-th subdiagonal, with the array tau,
represent the matrix Q as a product of elementary reflectors. The other
columns of a are unchanged. See Application Notes below.
Contains scalar factors of the elementary reflectors.
Application Notes
For the elementary reflector H(i),
v(1:i+k-1) = 0, v(i+k) = 1; v(i+k+1:n) is stored on exit in a(i+k+1:n, i) and tau is stored in tau(i).
The elements of the vectors v together form the (n-k+1)-by-nb matrix V which is needed, with T and Y, to
apply the transformation to the unreduced part of the matrix, using an update of the form:
A := (I - V*T*V^T) * (A - Y*V^T) for real flavors, or
A := (I - V*T*V^H) * (A - Y*V^H) for complex flavors.
The contents of A on exit are illustrated by the following example with n = 7, k = 3 and nb = 2:
where a denotes an element of the original matrix A, h denotes a modified element of the upper Hessenberg
matrix H, and vi denotes an element of the vector defining H(i).
See Also
?lahr2
?lahr2
Reduces the specified number of first columns of a
general rectangular matrix A so that elements below
the specified subdiagonal are zero, and returns
auxiliary matrices which are needed to apply the
transformation to the unreduced part of A.
Syntax
call slahr2( n, k, nb, a, lda, tau, t, ldt, y, ldy )
Include Files
mkl.fi
Description
The routine reduces the first nb columns of a real/complex general n-by-(n-k+1) matrix A so that elements
below the k-th subdiagonal are zero. The reduction is performed by an orthogonal/unitary similarity
transformation Q^T*A*Q for real flavors, or Q^H*A*Q for complex flavors. The routine returns the matrices V
and T which determine Q as a block reflector I - V*T*V^T (for real flavors) or I - V*T*V^H (for complex flavors), and
also the matrix Y = A*V*T.
The matrix Q is represented as products of nb elementary reflectors:
Q = H(1)*H(2)*... *H(nb)
Each H(i) has the form
Input Parameters
k INTEGER. The offset for the reduction. Elements below the k-th subdiagonal
in the first nb columns are reduced to zero (k < n).
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, n).
ldt INTEGER. The leading dimension of the output array t; ldt ≥ nb.
Output Parameters
a On exit, the elements on and above the k-th subdiagonal in the first nb
columns are overwritten with the corresponding elements of the reduced
matrix; the elements below the k-th subdiagonal, with the array tau,
represent the matrix Q as a product of elementary reflectors. The other
columns of a are unchanged. See Application Notes below.
Application Notes
For the elementary reflector H(i),
where a denotes an element of the original matrix A, h denotes a modified element of the upper Hessenberg
matrix H, and vi denotes an element of the vector defining H(i).
?laic1
Applies one step of incremental condition estimation.
Syntax
call slaic1( job, j, x, sest, w, gamma, sestpr, s, c )
call dlaic1( job, j, x, sest, w, gamma, sestpr, s, c )
call claic1( job, j, x, sest, w, gamma, sestpr, s, c )
call zlaic1( job, j, x, sest, w, gamma, sestpr, s, c )
Include Files
mkl.fi
Description
The routine ?laic1 applies one step of incremental condition estimation in its simplest version.
Let x, ||x||_2 = 1 (where ||a||_2 denotes the 2-norm of a), be an approximate singular vector of a j-by-j
lower triangular matrix L, such that
||L*x||_2 = sest
Then ?laic1 computes sestpr, s, c such that the vector
Input Parameters
job INTEGER.
If job = 1, an estimate for the largest singular value is computed;
Output Parameters
?lakf2
Forms a matrix containing Kronecker products
between the given matrices.
Syntax
call slakf2( m, n, a, lda, b, d, e, z, ldz )
call dlakf2( m, n, a, lda, b, d, e, z, ldz )
call clakf2( m, n, a, lda, b, d, e, z, ldz )
call zlakf2( m, n, a, lda, b, d, e, z, ldz )
Include Files
mkl.fi
Description
The routine forms the matrix Z from Kronecker products of the given matrices. In this notation, I_n is the
identity matrix of size n, X^T is the transpose of X, and kron(X, Y) is the Kronecker product of the
matrices X and Y.
Input Parameters
DOUBLE COMPLEX for zlakf2,
Array, size lda by m. Matrix used in forming the output matrix Z.
Output Parameters
?laln2
Solves a 1-by-1 or 2-by-2 linear system of equations
of the specified form.
Syntax
call slaln2( ltrans, na, nw, smin, ca, a, lda, d1, d2, b, ldb, wr, wi, x, ldx, scale,
xnorm, info )
call dlaln2( ltrans, na, nw, smin, ca, a, lda, d1, d2, b, ldb, wr, wi, x, ldx, scale,
xnorm, info )
Include Files
mkl.fi
Description
If both singular values of (ca*A - w*D) are less than smin, smin*I (where I stands for identity) will be used
instead of (ca*A - w*D). If only one singular value is less than smin, one element of (ca*A - w*D) will be
perturbed enough to make the smallest singular value roughly smin.
If both singular values are at least smin, (ca*A - w*D) will not be perturbed. In any case, the perturbation
will be at most some small multiple of max(smin, ulp*norm(ca*A - w*D)).
The singular values are computed by infinity-norm approximations, and thus will only be correct to a factor of
2 or so.
NOTE
All input quantities are assumed to be smaller than overflow by a reasonable factor (see bignum).
Input Parameters
ltrans LOGICAL.
If ltrans = .TRUE., the transpose of A will be used.
Array, DIMENSION (ldb,nw). The na-by-nw matrix B (right-hand side). If nw = 2
(w is complex), column 1 contains the real part of B and column 2
contains the imaginary part.
ldx INTEGER. The leading dimension of the output array x. Must be at least na.
Output Parameters
info INTEGER.
An error flag. It will be zero if no error occurs, a negative number if an
argument is in error, or a positive number if (ca*A - w*D) had to be
perturbed. The possible values are:
If info = 0: no error occurred, and (ca*A - w*D) did not have to be
perturbed.
If info = 1: (ca*A - w*D) had to be perturbed to make its smallest (or
only) singular value greater than smin.
NOTE
For higher speed, this routine does not check the inputs for errors.
?lals0
Applies back multiplying factors in solving the least
squares problem using divide and conquer SVD
approach. Used by ?gelsd.
Syntax
call slals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, info )
call dlals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, info )
call clals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, rwork, info )
call zlals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, rwork, info )
Include Files
mkl.fi
Description
The routine applies back the multiplying factors of either the left or right singular vector matrix of a diagonal
matrix appended by a row to the right hand side matrix B in solving the least squares problem using the
divide-and-conquer SVD approach.
For the left singular vector matrix, three types of orthogonal matrices are involved:
(1L) Givens rotations: the number of such rotations is givptr; the pairs of columns/rows they were applied to
are stored in givcol; and the c- and s-values of these rotations are stored in givnum.
(2L) Permutation. The (nl+1)-st row of B is to be moved to the first row, and for j=2:n, perm(j)-th row of B
is to be moved to the j-th row.
(3L) The left singular vector matrix of the remaining matrix.
For the right singular vector matrix, four types of orthogonal matrices are involved:
(1R) The right singular vector matrix of the remaining matrix.
(2R) If sqre = 1, one extra Givens rotation to generate the right null space.
Input Parameters
nr ≥ 1.
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
Contains the right hand sides of the least squares problem in rows 1
through m.
givptr INTEGER. The number of Givens rotations which took place in this
subproblem.
ldgnum INTEGER. The leading dimension of arrays difr, poles and givnum, must be
at least k.
Workspace array, DIMENSION (k*(1+nrhs) + 2*nrhs). Used with complex
flavors only.
Output Parameters
info INTEGER.
If info = 0: successful exit.
?lalsa
Computes the SVD of the coefficient matrix in
compact form. Used by ?gelsd.
Syntax
call slalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
call dlalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
call clalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, rwork, iwork, info )
call zlalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, rwork, iwork, info )
Include Files
mkl.fi
Description
The routine is an intermediate step in solving the least squares problem by computing the SVD of the
coefficient matrix in compact form. The singular vectors are computed as products of simple orthogonal
matrices.
If icompq = 0, ?lalsa applies the inverse of the left singular vector matrix of an upper bidiagonal matrix to
the right hand side; and if icompq = 1, the routine applies the right singular vector matrix to the right hand
side. The singular vector matrices were generated in the compact form by ?lalsa.
Input Parameters
icompq INTEGER. Specifies whether the left or the right singular vector matrix is
involved. If icompq = 0: left singular vector matrix is used
smlsiz INTEGER. The maximum size of the subproblems at the bottom of the
computation tree.
n INTEGER. The row and column dimensions of the upper bidiagonal matrix.
ldu INTEGER, ldu ≥ n. The leading dimension of arrays u, vt, difl, difr, poles,
givnum, and z.
poles REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION (ldu, 2*nlvl).
On entry, poles(*, 2i-1: 2i) contains the new and old singular values
involved in the secular equations on the i-th level.
givcol INTEGER. Array, DIMENSION ( ldgcol, 2*nlvl ). On entry, for each i, givcol(*,
2i-1: 2i) records the locations of Givens rotations performed on the i-th
level on the computation tree.
ldgcol INTEGER, ldgcol ≥ n. The leading dimension of arrays givcol and perm.
iwork INTEGER.
Workspace array, DIMENSION at least (3n).
Output Parameters
?lalsd
Uses the singular value decomposition of A to solve
the least squares problem.
Syntax
call slalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, iwork, info )
call dlalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, iwork, info )
call clalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, rwork, iwork,
info )
call zlalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, rwork, iwork,
info )
Include Files
mkl.fi
Description
The routine uses the singular value decomposition of A to solve the least squares problem of finding X to
minimize the Euclidean norm of each column of A*X-B, where A is n-by-n upper bidiagonal, and X and B are
n-by-nrhs. The solution X overwrites B.
The singular values of A smaller than rcond times the largest singular value are treated as zero in solving the
least squares problem; in this case a minimum norm solution is returned. The actual singular values are
returned in d in ascending order.
This code makes very mild assumptions about floating point arithmetic. It will work on machines with a guard
digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray XMP, Cray
YMP, Cray C 90, or Cray 2.
It could conceivably fail on hexadecimal or decimal machines without guard digits, but we know of none.
Input Parameters
uplo CHARACTER*1.
If uplo = 'U', d and e define an upper bidiagonal matrix.
If uplo = 'L', d and e define a lower bidiagonal matrix.
smlsiz INTEGER. The maximum size of the subproblems at the bottom of the
computation tree.
On input, b contains the right hand sides of the least squares problem. On
output, b contains the solution X.
rank INTEGER. The number of singular values of A greater than rcond times the
largest singular value.
Workspace array.
DIMENSION for real flavors at least
(9n+2n*smlsiz+8n*nlvl+n*nrhs+(smlsiz+1)^2),
where
nlvl = max(0, int(log2(n/(smlsiz+1))) + 1).
DIMENSION for complex flavors is (n*nrhs).
iwork INTEGER.
Workspace array of DIMENSION(3n*nlvl + 11n).
Output Parameters
e On exit, destroyed.
info INTEGER.
If info = 0: successful exit.
?lamrg
Creates a permutation list to merge the entries of two
independently sorted sets into a single set sorted in
ascending order.
Syntax
call slamrg( n1, n2, a, strd1, strd2, index )
call dlamrg( n1, n2, a, strd1, strd2, index )
Include Files
mkl.fi
Description
The routine creates a permutation list which will merge the elements of a (which is composed of two
independently sorted sets) into a single set which is sorted in ascending order.
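The following sketch illustrates the permutation list on a small array. The data values and the program name are illustrative assumptions; the explicit external declaration is used instead of mkl.fi so that the sketch can also be linked against a generic LAPACK library. Both strides are set to 1, which in the reference LAPACK interface means that each sublist is already stored in ascending order.

```fortran
program lamrg_demo
  implicit none
  integer, parameter :: n1 = 3, n2 = 3
  double precision :: a(n1+n2)
  integer :: idx(n1+n2)
  external :: dlamrg

  ! First sorted set: a(1:3) = 1, 3, 5.  Second sorted set: a(4:6) = 2, 4, 6.
  a = [1.0d0, 3.0d0, 5.0d0, 2.0d0, 4.0d0, 6.0d0]

  ! Both sets are traversed with stride +1 (ascending storage order).
  call dlamrg(n1, n2, a, 1, 1, idx)

  ! idx lists positions in a; a(idx) is the merged, ascending sequence.
  print '(a,6i3)',   'index  :', idx
  print '(a,6f5.1)', 'merged :', a(idx)
end program lamrg_demo
```

For this input the permutation list is 1 4 2 5 3 6, so a(idx) reads 1.0 2.0 3.0 4.0 5.0 6.0. The array a itself is not reordered by the call.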
Input Parameters
n1, n2 INTEGER. These arguments contain the respective lengths of the two sorted
lists to be merged.
Output Parameters
?laneg
Computes the Sturm count, the number of negative
pivots encountered while factoring tridiagonal
T - sigma*I = L*D*L^T.
Syntax
value = slaneg( n, d, lld, sigma, pivmin, r )
value = dlaneg( n, d, lld, sigma, pivmin, r )
Include Files
mkl.fi
Description
The routine computes the Sturm count, the number of negative pivots encountered while factoring
tridiagonal T - sigma*I = L*D*L^T. This implementation works directly on the factors without forming the
tridiagonal matrix T. The Sturm count is also the number of eigenvalues of T less than sigma. This routine is
called from ?larrb. The current routine does not use the pivmin parameter but rather requires IEEE-754
propagation of infinities and NaNs (NaN stands for 'Not a Number'). This routine also has no input range
restrictions but does require default exception handling such that x/0 produces Inf when x is non-zero, and
Inf/Inf produces NaN. (For more information, see [Marques06].)
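A deliberately trivial sketch of the Sturm count follows. It assumes L = I, so that lld (which in the reference LAPACK interface holds the products L(i)*L(i)*D(i)) is zero and T = D = diag(1, 2, 3). The data, the shift, and the twist index are illustrative assumptions; the external declaration replaces mkl.fi so that a generic LAPACK library can be used.

```fortran
program laneg_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: d(n), lld(n-1)
  integer :: negcnt
  integer, external :: dlaneg

  ! With L = I the matrix T = L*D*L^T is simply diag(d).
  d   = [1.0d0, 2.0d0, 3.0d0]
  lld = 0.0d0

  ! Count eigenvalues of T below sigma = 2.5; pivmin is passed but not used,
  ! and the twist index r = 1 is an arbitrary valid choice.
  negcnt = dlaneg(n, d, lld, 2.5d0, 0.0d0, 1)

  print *, 'eigenvalues below 2.5:', negcnt
end program laneg_demo
```

Two of the eigenvalues (1 and 2) lie below 2.5, so the count printed should be 2.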
Input Parameters
r INTEGER.
The twist index for the twisted factorization that is used for the negcount.
Output Parameters
?langb
Returns the value of the 1-norm, Frobenius norm,
infinity-norm, or the largest absolute value of any
element of general band matrix.
Syntax
val = slangb( norm, n, kl, ku, ab, ldab, work )
val = dlangb( norm, n, kl, ku, ab, ldab, work )
val = clangb( norm, n, kl, ku, ab, ldab, work )
val = zlangb( norm, n, kl, ku, ab, ldab, work )
Include Files
mkl.fi
Description
The function returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the element of
largest absolute value of an n-by-n band matrix A, with kl sub-diagonals and ku super-diagonals.
Input Parameters
Output Parameters
?lange
Returns the value of the 1-norm, Frobenius norm,
infinity-norm, or the largest absolute value of any
element of a general rectangular matrix.
Syntax
val = slange( norm, m, n, a, lda, work )
val = dlange( norm, m, n, a, lda, work )
val = clange( norm, m, n, a, lda, work )
val = zlange( norm, m, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lange returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a real/complex matrix A.
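The four quantities can be compared on a small matrix. The sketch below is illustrative only: the matrix entries and the program name are assumptions, and the external declaration is used instead of mkl.fi so that a generic LAPACK library also works. The norm selectors 'M', '1', 'I' and 'F' follow the usual ?lan* convention.

```fortran
program lange_demo
  implicit none
  integer, parameter :: m = 2, n = 3
  double precision :: a(m,n), work(m)
  double precision, external :: dlange

  ! Column-major fill, so the rows are (1, -2, 3) and (-4, 5, -6).
  a = reshape([1.0d0, -4.0d0, -2.0d0, 5.0d0, 3.0d0, -6.0d0], [m, n])

  print *, 'max |a(i,j)| :', dlange('M', m, n, a, m, work)   ! 6
  print *, '1-norm       :', dlange('1', m, n, a, m, work)   ! max column sum: 9
  print *, 'infinity norm:', dlange('I', m, n, a, m, work)   ! max row sum: 15
  print *, 'Frobenius    :', dlange('F', m, n, a, m, work)   ! sqrt(91)
end program lange_demo
```

The work array is sized m here because the reference LAPACK interface uses it only for the infinity norm.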
Input Parameters
The data types are given for the Fortran interface.
Output Parameters
?langt
Returns the value of the 1-norm, Frobenius norm,
infinity-norm, or the largest absolute value of any
element of a general tridiagonal matrix.
Syntax
val = slangt( norm, n, dl, d, du )
val = dlangt( norm, n, dl, d, du )
val = clangt( norm, n, dl, d, du )
val = zlangt( norm, n, dl, d, du )
Include Files
mkl.fi
Description
The routine returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the element of
largest absolute value of a real/complex tridiagonal matrix A.
Input Parameters
Output Parameters
?lanhs
Returns the value of the 1-norm, Frobenius norm,
infinity-norm, or the largest absolute value of any
element of an upper Hessenberg matrix.
Syntax
val = slanhs( norm, n, a, lda, work )
val = dlanhs( norm, n, a, lda, work )
val = clanhs( norm, n, a, lda, work )
val = zlanhs( norm, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lanhs returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a Hessenberg matrix A.
The value val returned by the function is:
val = max(abs(A(i,j))), if norm = 'M' or 'm'
    = norm1(A), if norm = '1' or 'O' or 'o'
    = normI(A), if norm = 'I' or 'i'
    = normF(A), if norm = 'F', 'f', 'E' or 'e'
where norm1 denotes the 1-norm of a matrix (maximum column sum), normI denotes the infinity norm of a
matrix (maximum row sum) and normF denotes the Frobenius norm of a matrix (square root of sum of
squares). Note that max(abs(A(i,j))) is not a consistent matrix norm.
Input Parameters
Output Parameters
?lansb
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a symmetric band matrix.
Syntax
val = slansb( norm, uplo, n, k, ab, ldab, work )
val = dlansb( norm, uplo, n, k, ab, ldab, work )
val = clansb( norm, uplo, n, k, ab, ldab, work )
val = zlansb( norm, uplo, n, k, ab, ldab, work )
Include Files
mkl.fi
Description
The function ?lansb returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of an n-by-n real/complex symmetric band matrix A, with k
super-diagonals.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the band matrix A is
supplied. If uplo = 'U': upper triangular part is supplied; If uplo = 'L':
lower triangular part is supplied.
The upper or lower triangle of the symmetric band matrix A, stored in the
first k+1 rows of ab. The j-th column of A is stored in the j-th column of the
array ab as follows:
if uplo = 'U', ab(k+1+i-j,j) = a(i,j)
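The upper band mapping ab(k+1+i-j,j) = a(i,j) can be sketched as follows. The matrix values and the program name are illustrative assumptions, and the index range max(1,j-k) ≤ i ≤ j used in the packing loop is taken from the reference LAPACK band-storage convention rather than from the text above. The external declaration replaces mkl.fi so that a generic LAPACK library can be used.

```fortran
program lansb_demo
  implicit none
  integer, parameter :: n = 4, k = 1, ldab = k + 1
  double precision :: a(n,n), ab(ldab,n), work(n)
  double precision, external :: dlansb
  integer :: i, j

  ! Full symmetric matrix with one super-diagonal: 2 on the diagonal, -1 beside it.
  a = 0.0d0
  do j = 1, n
    a(j,j) = 2.0d0
    if (j > 1) then
      a(j-1,j) = -1.0d0
      a(j,j-1) = -1.0d0
    end if
  end do

  ! Pack the upper triangle of the band into the first k+1 rows of ab.
  ab = 0.0d0
  do j = 1, n
    do i = max(1, j-k), j
      ab(k+1+i-j, j) = a(i,j)
    end do
  end do

  print *, '1-norm       :', dlansb('1', 'U', n, k, ab, ldab, work)   ! 4
  print *, 'max |a(i,j)| :', dlansb('M', 'U', n, k, ab, ldab, work)   ! 2
end program lansb_demo
```

With this layout the diagonal of A sits in row k+1 of ab and the super-diagonal in row k; ab(1,1) has no matrix element mapped to it.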
Output Parameters
?lanhb
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a Hermitian band matrix.
Syntax
val = clanhb( norm, uplo, n, k, ab, ldab, work )
val = zlanhb( norm, uplo, n, k, ab, ldab, work )
Include Files
mkl.fi
Description
The routine returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the element of
largest absolute value of an n-by-n Hermitian band matrix A, with k super-diagonals.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the band matrix A is
supplied.
If uplo = 'U': upper triangular part is supplied;
Note that the imaginary parts of the diagonal elements need not be set and
are assumed to be zero.
Output Parameters
?lansp
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a symmetric matrix supplied in
packed form.
Syntax
val = slansp( norm, uplo, n, ap, work )
val = dlansp( norm, uplo, n, ap, work )
val = clansp( norm, uplo, n, ap, work )
val = zlansp( norm, uplo, n, ap, work )
Include Files
mkl.fi
Description
The function ?lansp returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a real/complex symmetric matrix A, supplied in packed form.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is supplied.
If uplo = 'U': Upper triangular part of A is supplied
Output Parameters
?lanhp
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a complex Hermitian matrix supplied
in packed form.
Syntax
val = clanhp( norm, uplo, n, ap, work )
val = zlanhp( norm, uplo, n, ap, work )
Include Files
mkl.fi
Description
The function ?lanhp returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a complex Hermitian matrix A, supplied in packed form.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is supplied.
If uplo = 'U': Upper triangular part of A is supplied
DOUBLE PRECISION for zlanhp.
Workspace array, DIMENSION(max(1,lwork)), where
Output Parameters
?lanst/?lanht
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a real symmetric or complex
Hermitian tridiagonal matrix.
Syntax
val = slanst( norm, n, d, e )
val = dlanst( norm, n, d, e )
val = clanht( norm, n, d, e )
val = zlanht( norm, n, d, e )
Include Files
mkl.fi
Description
The functions ?lanst/?lanht return the value of the 1-norm, or the Frobenius norm, or the infinity norm, or
the element of largest absolute value of a real symmetric or a complex Hermitian tridiagonal matrix A.
Input Parameters
Output Parameters
?lansy
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a real/complex symmetric matrix.
Syntax
val = slansy( norm, uplo, n, a, lda, work )
val = dlansy( norm, uplo, n, a, lda, work )
val = clansy( norm, uplo, n, a, lda, work )
val = zlansy( norm, uplo, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lansy returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a real/complex symmetric matrix A.
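Because only one triangle of the symmetric matrix is referenced, the other triangle of the array may hold arbitrary data. The sketch below fills the strictly lower triangle with a sentinel value to make this visible; the matrix entries, the sentinel, and the program name are illustrative assumptions, and the external declaration replaces mkl.fi so that a generic LAPACK library can be used.

```fortran
program lansy_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n,n), work(n)
  double precision, external :: dlansy

  ! Sentinel everywhere; only the upper triangle is then given real data.
  a = -999.0d0
  a(1,1) =  2.0d0;  a(1,2) = -1.0d0;  a(1,3) =  0.0d0
  a(2,2) =  2.0d0;  a(2,3) = -1.0d0
  a(3,3) =  2.0d0

  ! With uplo = 'U' the sentinel entries below the diagonal are ignored.
  print *, '1-norm       :', dlansy('1', 'U', n, a, n, work)   ! 4
  print *, 'max |a(i,j)| :', dlansy('M', 'U', n, a, n, work)   ! 2
end program lansy_demo
```

The matrix represented is the symmetric tridiagonal matrix with 2 on the diagonal and -1 beside it, whose largest column sum is 1 + 2 + 1 = 4.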
Input Parameters
The data types are given for the Fortran interface.
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix
A (square root of sum of squares).
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is to be referenced.
= 'U': Upper triangular part of A is referenced.
Output Parameters
?lanhe
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a complex Hermitian matrix.
Syntax
val = clanhe( norm, uplo, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lanhe returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a complex Hermitian matrix A.
Input Parameters
The data types are given for the Fortran interface.
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is to be referenced.
= 'U': Upper triangular part of A is referenced.
lwork ≥ n when norm = 'I' or '1' or 'O'; otherwise, work is not
referenced.
Output Parameters
?lantb
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a triangular band matrix.
Syntax
val = slantb( norm, uplo, diag, n, k, ab, ldab, work )
val = dlantb( norm, uplo, diag, n, k, ab, ldab, work )
val = clantb( norm, uplo, diag, n, k, ab, ldab, work )
val = zlantb( norm, uplo, diag, n, k, ab, ldab, work )
Include Files
mkl.fi
Description
The function ?lantb returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of an n-by-n triangular band matrix A, with ( k + 1 ) diagonals.
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': Upper triangular
diag CHARACTER*1.
Note that when diag = 'U', the elements of the array ab corresponding to
the diagonal elements of the matrix A are not referenced, but are assumed
to be one.
Output Parameters
?lantp
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a triangular matrix supplied in
packed form.
Syntax
val = slantp( norm, uplo, diag, n, ap, work )
val = dlantp( norm, uplo, diag, n, ap, work )
val = clantp( norm, uplo, diag, n, ap, work )
val = zlantp( norm, uplo, diag, n, ap, work )
Include Files
mkl.fi
Description
The function ?lantp returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a triangular matrix A, supplied in packed form.
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': Upper triangular
diag CHARACTER*1.
Specifies whether or not the matrix A is unit triangular.
= 'N': Non-unit triangular
Note that when diag = 'U', the elements of the array ap corresponding to
the diagonal elements of the matrix A are not referenced, but are assumed
to be one.
Output Parameters
?lantr
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a trapezoidal or triangular matrix.
Syntax
val = slantr( norm, uplo, diag, m, n, a, lda, work )
val = dlantr( norm, uplo, diag, m, n, a, lda, work )
val = clantr( norm, uplo, diag, m, n, a, lda, work )
val = zlantr( norm, uplo, diag, m, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lantr returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a trapezoidal or triangular matrix A.
Input Parameters
The data types are given for the Fortran interface.
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower trapezoidal.
= 'U': Upper trapezoidal
diag CHARACTER*1.
Specifies whether or not the matrix A has unit diagonal.
= 'N': Non-unit diagonal
If uplo = 'U', the leading m-by-n upper trapezoidal part of the array a
contains the upper trapezoidal matrix, and the strictly lower triangular part
of A is not referenced.
If uplo = 'L', the leading m-by-n lower trapezoidal part of the array a
contains the lower trapezoidal matrix, and the strictly upper triangular part
of A is not referenced. Note that when diag = 'U', the diagonal elements
of A are not referenced and are assumed to be one.
Output Parameters
?lanv2
Computes the Schur factorization of a real 2-by-2
nonsymmetric matrix in standard form.
Syntax
call slanv2( a, b, c, d, rt1r, rt1i, rt2r, rt2i, cs, sn )
call dlanv2( a, b, c, d, rt1r, rt1i, rt2r, rt2i, cs, sn )
Include Files
mkl.fi
Description
The routine computes the Schur factorization of a real 2-by-2 nonsymmetric matrix in standard form:
where either
The routine was adjusted to reduce the risk of cancellation errors when computing real eigenvalues, and to
ensure, if possible, that abs(rt1r) ≥ abs(rt2r).
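As an illustrative sketch, the program below passes the 2-by-2 rotation-like matrix with rows (0, 1) and (-1, 0) to dlanv2. The input values and the program name are assumptions chosen because the eigenvalues of this matrix are known to be the complex conjugate pair +i and -i. The external declaration replaces mkl.fi so that a generic LAPACK library can be used.

```fortran
program lanv2_demo
  implicit none
  double precision :: a, b, c, d
  double precision :: rt1r, rt1i, rt2r, rt2i, cs, sn
  external :: dlanv2

  ! Input matrix [ a b ; c d ] = [ 0 1 ; -1 0 ].
  a =  0.0d0;  b = 1.0d0
  c = -1.0d0;  d = 0.0d0

  ! On exit a, b, c, d hold the standardized Schur form, (rt1r, rt1i) and
  ! (rt2r, rt2i) hold the eigenvalues, and (cs, sn) define the rotation.
  call dlanv2(a, b, c, d, rt1r, rt1i, rt2r, rt2i, cs, sn)

  print *, 'eigenvalue 1 (re, im):', rt1r, rt1i
  print *, 'eigenvalue 2 (re, im):', rt2r, rt2i
  print *, 'rotation (cs, sn)    :', cs, sn
end program lanv2_demo
```

Both real parts should be zero and the imaginary parts should have magnitude 1 with opposite signs.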
Input Parameters
Output Parameters
Parameters of the rotation matrix.
?lapll
Measures the linear dependence of two vectors.
Syntax
call slapll( n, x, incx, y, incy, ssmin )
call dlapll( n, x, incx, y, incy, ssmin )
call clapll( n, x, incx, y, incy, ssmin )
call zlapll( n, x, incx, y, incy, ssmin )
Include Files
mkl.fi
Description
Input Parameters
Output Parameters
x On exit, x is overwritten.
y On exit, y is overwritten.
?lapmr
Rearranges rows of a matrix as specified by a
permutation vector.
Syntax
call slapmr( forwrd, m, n, x, ldx, k )
call dlapmr( forwrd, m, n, x, ldx, k )
call clapmr( forwrd, m, n, x, ldx, k )
call zlapmr( forwrd, m, n, x, ldx, k )
call lapmr( x,k[,forwrd] )
Include Files
mkl.fi
Description
The ?lapmr routine rearranges the rows of the m-by-n matrix X as specified by the permutation
k(1),k(2),...,k(m) of the integers 1,...,m.
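If forwrd = .TRUE., forward permutation: X(k(i),:) is moved to X(i,:) for i = 1,2,...,m.
If forwrd = .FALSE., backward permutation: X(i,:) is moved to X(k(i),:) for i = 1,2,...,m.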
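The forward permutation can be checked against ordinary Fortran array indexing. The sketch below assumes the reference LAPACK semantics (row k(i) of the input becomes row i of the output) and a program linked against MKL through the LP64 interface; the matrix and permutation values are illustrative.

```fortran
program lapmr_demo
  implicit none
  integer, parameter :: m = 3, n = 2
  double precision :: x(m,n), xref(m,n)
  integer :: k(m)
  external :: dlapmr

  ! Column-major fill: rows are (1,10), (2,20), (3,30).
  x = reshape([1d0, 2d0, 3d0, 10d0, 20d0, 30d0], [m, n])
  k = [3, 1, 2]

  ! Reference result built with a vector subscript:
  ! row i of xref is row k(i) of x.
  xref = x(k, :)

  call dlapmr(.true., m, n, x, m, k)

  ! Expected to print T if the forward permutation matches the indexing above.
  print *, all(x == xref)
end program lapmr_demo
```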
Input Parameters
The data types are given for the Fortran interface.
forwrd LOGICAL.
If forwrd = .TRUE., forward permutation.
If forwrd = .FALSE., backward permutation.
COMPLEX for clapmr
DOUBLE COMPLEX for zlapmr
Array, size (ldx,n). On entry, the m-by-n matrix X.
k INTEGER. Array, size (m). On entry, k contains the permutation vector and
is used as internal workspace.
Output Parameters
See Also
?lapmt
?lapmt
Performs a forward or backward permutation of the
columns of a matrix.
Syntax
call slapmt( forwrd, m, n, x, ldx, k )
call dlapmt( forwrd, m, n, x, ldx, k )
call clapmt( forwrd, m, n, x, ldx, k )
call zlapmt( forwrd, m, n, x, ldx, k )
Include Files
mkl.fi
Description
The routine ?lapmt rearranges the columns of the m-by-n matrix X as specified by the permutation
k(1),k(2),...,k(n) of the integers 1,...,n.
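If forwrd = .TRUE., forward permutation: X(:,k(j)) is moved to X(:,j) for j = 1,2,...,n.
If forwrd = .FALSE., backward permutation: X(:,j) is moved to X(:,k(j)) for j = 1,2,...,n.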
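A backward permutation with the same vector k undoes a forward one. The following sketch demonstrates this round trip with dlapmt; it assumes linking against MKL (LP64 interface) and relies on the reference LAPACK behavior that k is restored to its entry value on exit. All data values are illustrative.

```fortran
program lapmt_demo
  implicit none
  integer, parameter :: m = 2, n = 3
  double precision :: x(m,n), xorig(m,n)
  integer :: k(n)
  external :: dlapmt

  ! Columns are (1,2), (3,4), (5,6).
  x = reshape([1d0, 2d0, 3d0, 4d0, 5d0, 6d0], [m, n])
  xorig = x
  k = [2, 3, 1]

  call dlapmt(.true.,  m, n, x, m, k)   ! forward: column k(j) moves to column j
  call dlapmt(.false., m, n, x, m, k)   ! backward: column j moves to column k(j)

  ! Expected to print T, because the second call reverses the first.
  print *, all(x == xorig)
end program lapmt_demo
```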
Input Parameters
forwrd LOGICAL.
If forwrd = .TRUE., forward permutation.
If forwrd = .FALSE., backward permutation.
k INTEGER. Array, size (n). On entry, k contains the permutation vector and is
used as internal workspace.
Output Parameters
See Also
?lapmr
?lapy2
Returns sqrt(x**2 + y**2).
Syntax
val = slapy2( x, y )
val = dlapy2( x, y )
Include Files
mkl.fi
Description
The function ?lapy2 returns sqrt(x**2 + y**2), avoiding unnecessary overflow or harmful underflow.
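To see why the naive formula can overflow, consider the self-contained sketch below. The helper safe_hypot is a hypothetical name and shows one common scaling technique (factor out the larger magnitude); it is not claimed to be the MKL implementation.

```fortran
program lapy2_sketch
  implicit none
  double precision :: x, y

  x = 3.0d200
  y = 4.0d200

  ! x*x exceeds the double precision range, so the naive result is not finite.
  print *, 'naive :', sqrt(x*x + y*y)
  ! The scaled form stays in range; the result should be close to 5.0d200.
  print *, 'scaled:', safe_hypot(x, y)

contains

  ! Hypothetical helper: sqrt(p**2 + q**2) computed as w*sqrt(1 + (z/w)**2),
  ! where w is the larger and z the smaller of |p| and |q|.
  double precision function safe_hypot(p, q)
    double precision, intent(in) :: p, q
    double precision :: w, z
    w = max(abs(p), abs(q))
    z = min(abs(p), abs(q))
    if (z == 0.0d0) then
      safe_hypot = w
    else
      safe_hypot = w*sqrt(1.0d0 + (z/w)**2)
    end if
  end function safe_hypot

end program lapy2_sketch
```

Because z/w never exceeds 1, the quantity under the square root lies between 1 and 2, so only the final multiplication by w can overflow, and then only if the true result is itself out of range.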
Input Parameters
The data types are given for the Fortran interface.
DOUBLE PRECISION for dlapy2
Specify the input values x and y.
Output Parameters
?lapy3
Returns sqrt(x**2 + y**2 + z**2).
Syntax
val = slapy3( x, y, z )
val = dlapy3( x, y, z )
Include Files
mkl.fi
Description
The function ?lapy3 returns sqrt(x**2 + y**2 + z**2), avoiding unnecessary overflow or harmful underflow.
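A short usage sketch for dlapy3 follows. It assumes linking against MKL through the LP64 interface; the inputs are chosen so that the exact result is easy to verify by hand.

```fortran
program lapy3_demo
  implicit none
  double precision :: val
  double precision, external :: dlapy3

  ! sqrt(1 + 4 + 4): the result should be close to 3.
  val = dlapy3(1.0d0, 2.0d0, 2.0d0)
  print *, val

  ! The same ratios at a magnitude where squaring would overflow;
  ! the result should be close to 3.0d200.
  val = dlapy3(1.0d200, 2.0d200, 2.0d200)
  print *, val
end program lapy3_demo
```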
Input Parameters
The data types are given for the Fortran interface.
Output Parameters
?laqgb
Scales a general band matrix, using row and column
scaling factors computed by ?gbequ.
Syntax
call slaqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
call dlaqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
call claqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
call zlaqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a general m-by-n band matrix A with kl subdiagonals and ku superdiagonals using
the row and column scaling factors in the vectors r and c.
Input Parameters
colcnd REAL for slaqgb/claqgb
DOUBLE PRECISION for dlaqgb/zlaqgb
Ratio of the smallest c(i) to the largest c(i).
Output Parameters
equed CHARACTER*1.
Specifies the form of equilibration that was done.
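If equed = 'N': No equilibration.
If equed = 'R': Row equilibration, that is, A has been premultiplied by diag(r).
If equed = 'C': Column equilibration, that is, A has been postmultiplied by diag(c).
If equed = 'B': Both row and column equilibration, that is, A has been replaced by diag(r)*A*diag(c).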
Application Notes
The routine uses internal parameters thresh, large, and small, which have the following meaning. thresh is a
threshold value used to decide if row or column scaling should be done based on the ratio of the row or
column scaling factors. If rowcnd < thresh, row scaling is done, and if colcnd < thresh, column scaling
is done. large and small are threshold values used to decide if row scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, row scaling is done.
?laqge
Scales a general rectangular matrix, using row and
column scaling factors computed by ?geequ.
Syntax
call slaqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
call dlaqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
call claqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
call zlaqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a general m-by-n matrix A using the row and column scaling factors in the vectors r
and c.
Input Parameters
m ≥ 0.
Output Parameters
equed CHARACTER*1.
Specifies the form of equilibration that was done.
If equed = 'N': No equilibration.
If equed = 'R': Row equilibration, that is, A has been premultiplied by
diag(r).
If equed = 'C': Column equilibration, that is, A has been postmultiplied by
diag(c).
If equed = 'B': Both row and column equilibration, that is, A has been
replaced by diag(r)*A*diag(c).
Application Notes
The routine uses internal parameters thresh, large, and small, which have the following meaning. thresh is a
threshold value used to decide if row or column scaling should be done based on the ratio of the row or
column scaling factors. If rowcnd < thresh, row scaling is done, and if colcnd < thresh, column scaling
is done. large and small are threshold values used to decide if row scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, row scaling is done.
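The decision rule described in these notes can be written down compactly. The sketch below is self-contained and does not call MKL. The value thresh = 0.1 and the definitions of small and large are assumptions taken from the reference LAPACK sources; the function name choose_equed is hypothetical.

```fortran
program laqge_decision_sketch
  implicit none
  double precision, parameter :: thresh = 0.1d0   ! assumed internal value
  double precision :: small, large

  ! Assumed definitions: safe minimum divided by precision, and its reciprocal.
  small = tiny(1.0d0) / epsilon(1.0d0)
  large = 1.0d0 / small

  print *, choose_equed(1.0d0,  1.0d0,  1.0d0)     ! N: nothing triggers scaling
  print *, choose_equed(1.0d-3, 1.0d0,  1.0d0)     ! R: rowcnd < thresh
  print *, choose_equed(1.0d0,  1.0d-3, 1.0d0)     ! C: colcnd < thresh
  print *, choose_equed(1.0d0,  1.0d-3, 1.0d300)   ! B: amax > large and colcnd < thresh

contains

  ! Returns the value of equed that the rule in the notes would produce.
  character function choose_equed(rowcnd, colcnd, amax)
    double precision, intent(in) :: rowcnd, colcnd, amax
    logical :: scale_rows, scale_cols
    scale_rows = rowcnd < thresh .or. amax > large .or. amax < small
    scale_cols = colcnd < thresh
    if (scale_rows .and. scale_cols) then
      choose_equed = 'B'
    else if (scale_rows) then
      choose_equed = 'R'
    else if (scale_cols) then
      choose_equed = 'C'
    else
      choose_equed = 'N'
    end if
  end function choose_equed

end program laqge_decision_sketch
```

Note that the absolute size test on amax influences only the row scaling decision, which is why the last case yields 'B' even though rowcnd is well above the threshold.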
?laqhb
Scales a Hermitian band matrix, using scaling factors
computed by ?pbequ.
Syntax
call claqhb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call zlaqhb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a Hermitian band matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the band matrix A is
stored.
If uplo = 'U': upper triangular.
If uplo = 'L': lower triangular.
kd ≥ 0.
Output Parameters
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, that is, A has been replaced by
diag(s)*A*diag(s).
Application Notes
The routine uses internal parameters thresh, large, and small, which have the following meaning. thresh is a
threshold value used to decide if scaling should be based on the ratio of the scaling factors. If scond <
thresh, scaling is done.
The values large and small are threshold values used to decide if scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
?laqp2
Computes a QR factorization with column pivoting of
the matrix block.
Syntax
call slaqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
call dlaqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
call claqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
call zlaqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
Include Files
mkl.fi
Description
The routine computes a QR factorization with column pivoting of the block A(offset+1:m,1:n). The block
A(1:offset,1:n) is accordingly pivoted, but not factorized.
Input Parameters
offset INTEGER. The number of rows of the matrix A that must be pivoted but not
factorized. offset ≥ 0.
jpvt INTEGER.
Array, DIMENSION (n).
Output Parameters
jpvt On exit, if jpvt(i) = k, then the i-th column of A*P was the k-th column
of A.
vn1, vn2 Contain the vectors with the partial and exact column norms, respectively.
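The meaning of jpvt on exit can be illustrated without calling the routine: column i of A*P is column jpvt(i) of A. The self-contained sketch below builds A*P from a hypothetical pivot vector; the matrix values are chosen so that each column is filled with its own index.

```fortran
program jpvt_sketch
  implicit none
  integer, parameter :: m = 2, n = 3
  double precision :: a(m,n), ap(m,n)
  integer :: jpvt(n), i

  ! Column j of a is filled with the value j.
  a = reshape([1d0, 1d0, 2d0, 2d0, 3d0, 3d0], [m, n])

  ! Hypothetical pivot vector, as a pivoted factorization might return it.
  jpvt = [3, 1, 2]

  ! The i-th column of A*P is the jpvt(i)-th column of A.
  do i = 1, n
    ap(:, i) = a(:, jpvt(i))
  end do

  ! The first row of ap lists the original column indices: 3, 1, 2.
  print *, ap(1, :)
end program jpvt_sketch
```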
?laqps
Computes a step of QR factorization with column
pivoting of an m-by-n matrix A by using BLAS level 3.
Syntax
call slaqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf )
call dlaqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf )
call claqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf )
call zlaqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf )
Include Files
mkl.fi
Description
The routine computes a step of QR factorization with column pivoting of an m-by-n matrix A by using
BLAS level 3. The routine tries to factorize nb columns from A starting from the row offset+1, and updates
all of the matrix with BLAS level 3 routine ?gemm.
In some cases, due to catastrophic cancellations, ?laqps cannot factorize nb columns. Hence, the actual
number of factorized columns is returned in kb.
Block A(1:offset,1:n) is accordingly pivoted, but not factorized.
Input Parameters
offset INTEGER. The number of rows of A that have been factorized in previous
steps.
COMPLEX for claqps
DOUBLE COMPLEX for zlaqps
Array, DIMENSION (lda,n).
Output Parameters
jpvt INTEGER array, DIMENSION (n). If jpvt(i) = k, then column k of the full
matrix A has been permuted into position i in A*P.
vn1, vn2 The vectors with the partial and exact column norms, respectively.
?laqr0
Computes the eigenvalues of a Hessenberg matrix,
and optionally the matrices from the Schur
decomposition.
Syntax
call slaqr0( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, work,
lwork, info )
call dlaqr0( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, work,
lwork, info )
call claqr0( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work, lwork,
info )
call zlaqr0( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work, lwork,
info )
Include Files
mkl.fi
Description
The routine computes the eigenvalues of a Hessenberg matrix H, and, optionally, the matrices T and Z from
the Schur decomposition H = Z*T*Z**H, where T is an upper quasi-triangular/triangular matrix (the Schur form),
and Z is the orthogonal/unitary matrix of Schur vectors.
Optionally Z may be postmultiplied into an input orthogonal/unitary matrix Q so that this routine can give the
Schur factorization of a matrix A which has been reduced to the Hessenberg form H by the orthogonal/
unitary matrix Q: A = Q*H*Q**H = (QZ)*T*(QZ)**H.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., the full Schur form T is required;
if wantt = .FALSE., only eigenvalues are required.
wantz LOGICAL.
If wantz = .TRUE., the matrix of Schur vectors Z is required;
if wantz = .FALSE., Schur vectors are not required.
ilo, ihi INTEGER.
It is assumed that H is already upper triangular in rows and columns
1:ilo-1 and ihi+1:n, and if ilo > 1 then H(ilo, ilo-1) = 0.
ilo and ihi are normally set by a previous call to ?gebal, and then
passed to ?gehrd when the matrix output by ?gebal is reduced to
Hessenberg form. Otherwise, ilo and ihi should be set to 1 and n,
respectively.
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n.
ldh INTEGER. The leading dimension of the array h. ldh ≥ max(1, n).
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE.; 1 ≤ iloz ≤ ilo; ihi ≤ ihiz ≤ n.
Output Parameters
The routine may explicitly set h(i,j) for i > j and j = 1, 2, ..., ilo-1 or
j = ihi+1, ihi+2, ..., n.
work(1) On exit work(1) contains the minimum value of lwork required for optimum
performance.
(The output values of z when info > 0 are given under the description of
the info parameter below.)
info INTEGER.
= 0: the execution is successful.
> 0: if info = i, then the routine failed to compute all the eigenvalues.
Elements 1:ilo-1 and i+1:n of wr and wi contain those eigenvalues which
have been successfully computed.
> 0: if wantt is .FALSE., then the remaining unconverged eigenvalues are
the eigenvalues of the upper Hessenberg matrix rows and columns ilo
through info of the final output value of h.
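A typical calling pattern is sketched below for dlaqr0 in the eigenvalues-only case. It assumes linking against MKL through the LP64 interface and uses the standard LAPACK workspace query convention (lwork = -1), which is documented for reference LAPACK; the test matrix is illustrative.

```fortran
program laqr0_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: h(n,n), z(1,1), wr(n), wi(n), query(1)
  double precision, allocatable :: work(:)
  integer :: lwork, info
  external :: dlaqr0

  ! Upper Hessenberg matrix stored column by column; h(3,1) = 0.
  h = reshape([2d0, 1d0, 0d0,  1d0, 3d0, 1d0,  0d0, 1d0, 4d0], [n, n])

  ! Workspace query: with lwork = -1 the recommended size is returned in query(1).
  call dlaqr0(.false., .false., n, 1, n, h, n, wr, wi, 1, n, z, 1, query, -1, info)
  lwork = max(1, int(query(1)))
  allocate(work(lwork))

  ! Eigenvalues only: wantt = wantz = .FALSE., so z is not referenced.
  call dlaqr0(.false., .false., n, 1, n, h, n, wr, wi, 1, n, z, 1, work, lwork, info)

  print *, 'info     :', info
  print *, 'real part:', wr
  print *, 'imag part:', wi
  ! The eigenvalues sum to the trace of the input, so this should be close to 9.
  print *, 'sum(wr)  :', sum(wr)
end program laqr0_demo
```

If info is positive, only part of wr and wi is valid, as described for the info parameter, so production code should test info before using the eigenvalues.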
?laqr1
Sets a scalar multiple of the first column of the
product of 2-by-2 or 3-by-3 matrix H and specified
shifts.
Syntax
call slaqr1( n, h, ldh, sr1, si1, sr2, si2, v )
call dlaqr1( n, h, ldh, sr1, si1, sr2, si2, v )
call claqr1( n, h, ldh, s1, s2, v )
call zlaqr1( n, h, ldh, s1, s2, v )
Include Files
mkl.fi
Description
Given a 2-by-2 or 3-by-3 matrix H, this routine sets v to a scalar multiple of the first column of the product
K = (H - s1*I)*(H - s2*I), or K = (H - (sr1 + i*si1)*I)*(H - (sr2 + i*si2)*I)
scaling to avoid overflows and most underflows.
It is assumed that either 1) sr1 = sr2 and si1 = -si2, or 2) si1 = si2 = 0.
This is useful for starting double implicit shift bulges in the QR algorithm.
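Because v is only determined up to a scalar factor, a convenient check is that v is parallel to the first column of K computed directly. The sketch below does this for a 2-by-2 matrix with two real shifts (si1 = si2 = 0). It assumes linking against MKL through the LP64 interface; the matrix and shift values are illustrative.

```fortran
program laqr1_demo
  implicit none
  integer, parameter :: n = 2
  double precision :: h(n,n), eye(n,n), kmat(n,n), v(n)
  double precision :: sr1, si1, sr2, si2
  external :: dlaqr1

  ! Column-major fill: h(1,1) = 4, h(2,1) = 1, h(1,2) = 2, h(2,2) = 3.
  h   = reshape([4d0, 1d0, 2d0, 3d0], [n, n])
  eye = reshape([1d0, 0d0, 0d0, 1d0], [n, n])

  ! Two real shifts, which satisfies case 2) of the assumption above.
  sr1 = 1d0
  si1 = 0d0
  sr2 = 2d0
  si2 = 0d0

  call dlaqr1(n, h, n, sr1, si1, sr2, si2, v)

  ! Direct evaluation of K = (H - s1*I)*(H - s2*I) for comparison.
  kmat = matmul(h - sr1*eye, h - sr2*eye)

  print *, 'v      :', v
  print *, 'K(:,1) :', kmat(:, 1)
  ! Two vectors in the plane are parallel when this quantity vanishes,
  ! so the value should be close to zero.
  print *, 'check  :', v(1)*kmat(2,1) - v(2)*kmat(1,1)
end program laqr1_demo
```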
Input Parameters
n INTEGER.
The order of the matrix H. n must be equal to 2 or 3.
ldh INTEGER.
The leading dimension of the array h just as declared in the calling routine.
ldh ≥ n.
Output Parameters
A scalar multiple of the first column of the matrix K in the formula above.
?laqr2
Performs the orthogonal/unitary similarity
transformation of a Hessenberg matrix to detect and
deflate fully converged eigenvalues from a trailing
principal submatrix (aggressive early deflation).
Syntax
call slaqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sr,
si, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call dlaqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sr,
si, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call claqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sh,
v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call zlaqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sh,
v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
Include Files
mkl.fi
Description
The routine accepts as input an upper Hessenberg matrix H and performs an orthogonal/unitary similarity
transformation designed to detect and deflate fully converged eigenvalues from a trailing principal submatrix.
On output H has been overwritten by a new Hessenberg matrix that is a perturbation of an orthogonal/
unitary similarity transformation of H. Ideally, the final version of H has many zero subdiagonal
entries.
This subroutine is identical to ?laqr3 except that it avoids recursion by calling ?lahqr instead of ?laqr4.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., then the Hessenberg matrix H is fully updated so that
the quasi-triangular/triangular Schur factor may be computed (in
cooperation with the calling subroutine).
If wantt = .FALSE., then only enough of H is updated to preserve the
eigenvalues.
wantz LOGICAL.
If wantz = .TRUE., then the orthogonal/unitary matrix Z is updated so
that the orthogonal/unitary Schur factor may be computed (in cooperation
with the calling subroutine).
If wantz = .FALSE., then Z is not referenced.
n INTEGER. The order of the Hessenberg matrix H and (if wantz = .TRUE.)
the order of the orthogonal/unitary matrix Z.
ktop INTEGER.
It is assumed that either ktop=1 or h(ktop,ktop-1)=0. ktop and kbot
together determine an isolated block along the diagonal of the Hessenberg
matrix.
kbot INTEGER.
It is assumed without a check that either kbot=n or h(kbot+1,kbot)=0.
ktop and kbot together determine an isolated block along the diagonal of
the Hessenberg matrix.
nw INTEGER.
Size of the deflation window. 1 ≤ nw ≤ (kbot-ktop+1).
ldh INTEGER. The leading dimension of the array h just as declared in the
calling subroutine. ldh ≥ n.
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE.; 1 ≤ iloz ≤ ihiz ≤ n.
ldz INTEGER. The leading dimension of the array z just as declared in the
calling subroutine. ldz ≥ 1.
ldv INTEGER. The leading dimension of the array v just as declared in the
calling subroutine. ldv ≥ nw.
ldt INTEGER. The leading dimension of the array t just as declared in the
calling subroutine. ldt ≥ nw.
ldwv INTEGER. The leading dimension of the array wv just as declared in the
calling subroutine. ldwv ≥ nw.
DOUBLE COMPLEX for zlaqr2.
Workspace array with dimension lwork.
Output Parameters
work(1) On exit work(1) is set to an estimate of the optimal value of lwork for the
given values of the input parameters n, nw, ktop, and kbot.
The approximate eigenvalues that may be used for shifts are stored in
sh(kbot-nd-ns+1) through sh(kbot-nd).
The converged eigenvalues are stored in sh(kbot-nd+1) through
sh(kbot).
The real and imaginary parts of the approximate eigenvalues that may be
used for shifts are stored in sr(kbot-nd-ns+1) through sr(kbot-nd)
and si(kbot-nd-ns+1) through si(kbot-nd), respectively.
The real and imaginary parts of the converged eigenvalues are stored in
sr(kbot-nd+1) through sr(kbot) and si(kbot-nd+1) through
si(kbot), respectively.
?laqr3
Performs the orthogonal/unitary similarity
transformation of a Hessenberg matrix to detect and
deflate fully converged eigenvalues from a trailing
principal submatrix (aggressive early deflation).
Syntax
call slaqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sr,
si, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call dlaqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sr,
si, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call claqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sh,
v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call zlaqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sh,
v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
Include Files
mkl.fi
Description
The routine accepts as input an upper Hessenberg matrix H and performs an orthogonal/unitary similarity
transformation designed to detect and deflate fully converged eigenvalues from a trailing principal submatrix.
On output H has been overwritten by a new Hessenberg matrix that is a perturbation of an orthogonal/
unitary similarity transformation of H. Ideally, the final version of H has many zero subdiagonal
entries.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., then the Hessenberg matrix H is fully updated so that
the quasi-triangular/triangular Schur factor may be computed (in
cooperation with the calling subroutine).
If wantt = .FALSE., then only enough of H is updated to preserve the
eigenvalues.
wantz LOGICAL.
If wantz = .TRUE., then the orthogonal/unitary matrix Z is updated so
that the orthogonal/unitary Schur factor may be computed (in cooperation
with the calling subroutine).
If wantz = .FALSE., then Z is not referenced.
n INTEGER. The order of the Hessenberg matrix H and (if wantz = .TRUE.)
the order of the orthogonal/unitary matrix Z.
ktop INTEGER.
It is assumed that either ktop=1 or h(ktop,ktop-1)=0. ktop and kbot
together determine an isolated block along the diagonal of the Hessenberg
matrix.
kbot INTEGER.
It is assumed without a check that either kbot=n or h(kbot+1,kbot)=0.
ktop and kbot together determine an isolated block along the diagonal of
the Hessenberg matrix.
nw INTEGER.
Size of the deflation window. 1 ≤ nw ≤ (kbot-ktop+1).
ldh INTEGER. The leading dimension of the array h just as declared in the
calling subroutine. ldh ≥ n.
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE.; 1 ≤ iloz ≤ ihiz ≤ n.
ldz INTEGER. The leading dimension of the array z just as declared in the
calling subroutine. ldz ≥ 1.
ldv INTEGER. The leading dimension of the array v just as declared in the
calling subroutine. ldv ≥ nw.
ldt INTEGER. The leading dimension of the array t just as declared in the
calling subroutine. ldt ≥ nw.
ldwv INTEGER. The leading dimension of the array wv just as declared in the
calling subroutine. ldwv ≥ nw.
Output Parameters
work(1) On exit work(1) is set to an estimate of the optimal value of lwork for the
given values of the input parameters n, nw, ktop, and kbot.
sh COMPLEX for claqr3
DOUBLE COMPLEX for zlaqr3.
Arrays, DIMENSION (kbot).
The approximate eigenvalues that may be used for shifts are stored in
sh(kbot-nd-ns+1) through sh(kbot-nd).
The converged eigenvalues are stored in sh(kbot-nd+1) through
sh(kbot).
The real and imaginary parts of the approximate eigenvalues that may be
used for shifts are stored in sr(kbot-nd-ns+1) through sr(kbot-nd)
and si(kbot-nd-ns+1) through si(kbot-nd), respectively.
The real and imaginary parts of the converged eigenvalues are stored in
sr(kbot-nd+1) through sr(kbot) and si(kbot-nd+1) through
si(kbot), respectively.
?laqr4
Computes the eigenvalues of a Hessenberg matrix,
and optionally the matrices from the Schur
decomposition.
Syntax
call slaqr4( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, work,
lwork, info )
call dlaqr4( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, work,
lwork, info )
call claqr4( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work, lwork,
info )
call zlaqr4( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work, lwork,
info )
Include Files
mkl.fi
Description
The routine computes the eigenvalues of a Hessenberg matrix H, and, optionally, the matrices T and Z from
the Schur decomposition H = Z*T*Z**H, where T is an upper quasi-triangular/triangular matrix (the Schur form),
and Z is the orthogonal/unitary matrix of Schur vectors.
Optionally Z may be postmultiplied into an input orthogonal/unitary matrix Q so that this routine can give the
Schur factorization of a matrix A which has been reduced to the Hessenberg form H by the orthogonal/
unitary matrix Q: A = Q*H*Q**H = (QZ)*T*(QZ)**H.
This routine implements one level of recursion for ?laqr0. It is a complete implementation of the small bulge
multi-shift QR algorithm. It may be called by ?laqr0 and, for large enough deflation window size, it may be
called by ?laqr3. This routine is identical to ?laqr0 except that it calls ?laqr2 instead of ?laqr3.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., the full Schur form T is required;
if wantt = .FALSE., only eigenvalues are required.
wantz LOGICAL.
If wantz = .TRUE., the matrix of Schur vectors Z is required;
if wantz = .FALSE., Schur vectors are not required.
ldh INTEGER. The leading dimension of the array h. ldh ≥ max(1, n).
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE.; 1 ≤ iloz ≤ ilo; ihi ≤ ihiz ≤ n.
DOUBLE COMPLEX for zlaqr4.
Workspace array with dimension lwork.
Output Parameters
The routine may explicitly set h(i,j) for i > j and j = 1, 2, ..., ilo-1 or
j = ihi+1, ihi+2, ..., n.
work(1) On exit work(1) contains the minimum value of lwork required for optimum
performance.
(The output values of z when info > 0 are given under the description of
the info parameter below.)
info INTEGER.
= 0: the execution is successful.
> 0: if info = i, then the routine failed to compute all the eigenvalues.
Elements 1:ilo-1 and i+1:n of wr and wi contain those eigenvalues which
have been successfully computed.
> 0: if wantt is .FALSE., then the remaining unconverged eigenvalues are
the eigenvalues of the upper Hessenberg matrix rows and columns ilo
through info of the final output value of h.
?laqr5
Performs a single small-bulge multi-shift QR sweep.
Syntax
call slaqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, sr, si, h, ldh, iloz, ihiz,
z, ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
call dlaqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, sr, si, h, ldh, iloz, ihiz,
z, ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
call claqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, s, h, ldh, iloz, ihiz, z,
ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
call zlaqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, s, h, ldh, iloz, ihiz, z,
ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
Include Files
mkl.fi
Description
This auxiliary routine called by ?laqr0 performs a single small-bulge multi-shift QR sweep.
Input Parameters
wantt LOGICAL.
wantt = .TRUE. if the quasi-triangular/triangular Schur factor is
computed.
wantt is set to .FALSE. otherwise.
wantz LOGICAL.
wantz = .TRUE. if the orthogonal/unitary Schur factor is computed.
wantz is set to .FALSE. otherwise.
n INTEGER. The order of the Hessenberg matrix H upon which the routine
operates.
nshfts INTEGER.
Number of simultaneous shifts, must be positive and even.
sr contains the real parts and si contains the imaginary parts of the
nshfts shifts of origin that define the multi-shift QR sweep.
ldh INTEGER. The leading dimension of the array h just as declared in the
calling routine. ldh ≥ max(1, n).
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE.; 1 ≤ iloz ≤ ihiz ≤ n.
ldz INTEGER. The leading dimension of the array z just as declared in the
calling routine. ldz ≥ n.
ldv INTEGER. The leading dimension of the array v just as declared in the
calling routine. ldv ≥ 3.
ldu INTEGER. The leading dimension of the array u just as declared in the
calling routine. ldu ≥ 3*nshfts-3.
ldwh INTEGER. The leading dimension of the array wh just as declared in the
calling routine. ldwh ≥ 3*nshfts-3.
nv INTEGER. The number of rows of the array wv available for workspace.
nv ≥ 1.
ldwv INTEGER. The leading dimension of the array wv just as declared in the
calling routine. ldwv ≥ nv.
Output Parameters
?laqsb
Scales a symmetric band matrix, using scaling factors
computed by ?pbequ.
Syntax
call slaqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call dlaqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call claqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call zlaqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a symmetric band matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored.
If uplo = 'U': upper triangular.
If uplo = 'L': lower triangular.
kd ≥ 0.
Output Parameters
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, that is, A has been replaced by
diag(s)*A*diag(s).
Application Notes
The routine uses internal parameters thresh, large, and small, which have the following meaning. thresh is a
threshold value used to decide if scaling should be based on the ratio of the scaling factors. If scond <
thresh, scaling is done. large and small are threshold values used to decide if scaling should be done based
on the absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
?laqsp
Scales a symmetric/Hermitian matrix in packed
storage, using scaling factors computed by ?ppequ.
Syntax
call slaqsp( uplo, n, ap, s, scond, amax, equed )
call dlaqsp( uplo, n, ap, s, scond, amax, equed )
call claqsp( uplo, n, ap, s, scond, amax, equed )
call zlaqsp( uplo, n, ap, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine ?laqsp equilibrates a symmetric matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored.
If uplo = 'U': upper triangular.
If uplo = 'L': lower triangular.
Output Parameters
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, that is, A has been replaced by
diag(s)*A*diag(s).
Application Notes
The routine uses internal parameters thresh, large, and small, which have the following meaning. thresh is a
threshold value used to decide if scaling should be based on the ratio of the scaling factors. If scond <
thresh, scaling is done. large and small are threshold values used to decide if scaling should be done based
on the absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
?laqsy
Scales a symmetric/Hermitian matrix, using scaling
factors computed by ?poequ.
Syntax
call slaqsy( uplo, n, a, lda, s, scond, amax, equed )
call dlaqsy( uplo, n, a, lda, s, scond, amax, equed )
call claqsy( uplo, n, a, lda, s, scond, amax, equed )
call zlaqsy( uplo, n, a, lda, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a symmetric matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored.
If uplo = 'U': upper triangular.
If uplo = 'L': lower triangular.
If uplo = 'U', the leading n-by-n upper triangular part of a contains the
upper triangular part of the matrix A, and the strictly lower triangular part
of a is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of a contains the
lower triangular part of the matrix A, and the strictly upper triangular part
of a is not referenced.
Output Parameters
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, that is, A has been replaced by
diag(s)*A*diag(s).
Application Notes
The routine uses internal parameters thresh, large, and small, which have the following meaning. thresh is a
threshold value used to decide if scaling should be based on the ratio of the scaling factors. If scond <
thresh, scaling is done. large and small are threshold values used to decide if scaling should be done based
on the absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
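A typical pairing of ?poequ with ?laqsy is sketched below for the double precision flavor. It assumes linking against MKL through the LP64 interface; the matrix is an illustrative, badly scaled symmetric positive definite example, and whether scaling is applied depends on the internal thresholds, so the program prints equed instead of assuming it.

```fortran
program laqsy_demo
  implicit none
  integer, parameter :: n = 2
  double precision :: a(n,n), s(n), scond, amax
  integer :: info
  character :: equed
  external :: dpoequ, dlaqsy

  ! Column-major fill: a(1,1) = 1.0d10, a(1,2) = 1.0d3, a(2,2) = 1.
  ! With uplo = 'U' the strictly lower part (a(2,1)) is not referenced.
  a = reshape([1.0d10, 0.0d0, 1.0d3, 1.0d0], [n, n])

  ! Compute the scaling factors s together with scond and amax.
  call dpoequ(n, a, n, s, scond, amax, info)
  if (info /= 0) stop 'dpoequ failed'

  ! Apply the scaling if the routine decides that it is worthwhile.
  call dlaqsy('U', n, a, n, s, scond, amax, equed)

  print *, 'scond =', scond, ' amax =', amax
  print *, 'equed = ', equed
  if (equed == 'Y') print *, 'scaled diagonal:', a(1,1), a(2,2)
end program laqsy_demo
```

When equed is returned as 'Y', the same vector s must later be applied to the right-hand side and the solution of any linear system solved with the scaled matrix.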
?laqtr
Solves a real quasi-triangular system of equations, or
a complex quasi-triangular system of special form, in
real arithmetic.
Syntax
call slaqtr( ltran, lreal, n, t, ldt, b, w, scale, x, work, info )
call dlaqtr( ltran, lreal, n, t, ldt, b, w, scale, x, work, info )
Include Files
mkl.fi
Description
This routine is designed for the condition number estimation in routine ?trsna.
Input Parameters
ltran LOGICAL.
On entry, ltran specifies the option of conjugate transpose:
= .FALSE., op(T + iB) = T + iB,
= .TRUE., op(T + iB) = (T + iB)**T.
lreal LOGICAL.
On entry, lreal specifies the input matrix structure:
= .FALSE., the input is complex,
= .TRUE., the input is real.
n INTEGER.
On entry, n specifies the order of T + iB. n ≥ 0.
Output Parameters
info INTEGER.
If info = 0: successful exit.
NOTE
For higher speed, this routine does not check the inputs for errors.
?lar1v
Computes the (scaled) r-th column of the inverse of
the submatrix in rows b1 through bn of tridiagonal
matrix.
Syntax
call slar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc, negcnt, ztz,
mingma, r, isuppz, nrminv, resid, rqcorr, work )
call dlar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc, negcnt, ztz,
mingma, r, isuppz, nrminv, resid, rqcorr, work )
call clar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc, negcnt, ztz,
mingma, r, isuppz, nrminv, resid, rqcorr, work )
call zlar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc, negcnt, ztz,
mingma, r, isuppz, nrminv, resid, rqcorr, work )
Include Files
mkl.fi
Description
The routine ?lar1v computes the (scaled) r-th column of the inverse of the submatrix in rows b1 through bn
of the tridiagonal matrix L*D*L**T - lambda*I. When lambda is close to an eigenvalue, the computed vector is an
accurate eigenvector. Usually, r corresponds to the index where the eigenvector is largest in magnitude.
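The following steps accomplish this computation:
Stationary qd transform, L*D*L**T - lambda*I = L(+)*D(+)*L(+)**T.
Progressive qd transform, L*D*L**T - lambda*I = U(-)*D(-)*U(-)**T.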
Computation of the diagonal elements of the inverse of L*D*L**T - lambda*I by combining the above
transforms, and choosing r as the index where the diagonal of the inverse is (one of the) largest in
magnitude.
Computation of the (scaled) r-th column of the inverse using the twisted factorization obtained by
combining the top part of the stationary and the bottom part of the progressive transform.
Input Parameters
Tolerance that indicates when eigenvector entries are negligible with respect
to their contribution to the residual.
wantnc LOGICAL.
Specifies whether negcnt has to be computed.
r INTEGER.
The twist index for the twisted factorization used to compute z. On input,
0 ≤ r ≤ n. If r is input as 0, r is set to the index where
(L*D*L**T - lambda*I)**(-1) is largest in magnitude. If 1 ≤ r ≤ n, r is unchanged.
Output Parameters
negcnt INTEGER. If wantnc is .TRUE., then negcnt = the number of pivots <
pivmin in the matrix factorization L*D*L**T, and negcnt = -1 otherwise.
isuppz INTEGER. Array, DIMENSION (2). The support of the vector in z, that is, the
vector z is nonzero only in elements isuppz(1) through isuppz(2).
nrminv REAL for slar1v/clar1v
DOUBLE PRECISION for dlar1v/zlar1v
Equals 1/sqrt( ztz ).
?lar2v
Applies a vector of plane rotations with real cosines
and real/complex sines from both sides to a sequence
of 2-by-2 symmetric/Hermitian matrices.
Syntax
call slar2v( n, x, y, z, incx, c, s, incc )
call dlar2v( n, x, y, z, incx, c, s, incc )
call clar2v( n, x, y, z, incx, c, s, incc )
call zlar2v( n, x, y, z, incx, c, s, incc )
Include Files
mkl.fi
Description
The routine ?lar2v applies a vector of real/complex plane rotations with real cosines from both sides to a
sequence of 2-by-2 real symmetric or complex Hermitian matrices, defined by the elements of the vectors x,
y and z. For i = 1,2,...,n
Input Parameters
Output Parameters
?laran
Returns a random real number from a uniform
distribution.
Syntax
res = slaran (iseed)
res = dlaran (iseed)
Description
The ?laran routine returns a random real number from a uniform (0,1) distribution. This routine uses a
multiplicative congruential method with modulus 2^48 and multiplier 33952834046453. 48-bit integers are
stored in four integer array elements with 12 bits per element. Hence the routine is portable across machines
with integers of 32 bits or more.
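The recurrence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MKL implementation: the function name laran_sketch is hypothetical, the packing order (first seed element holding the most significant 12 bits) is assumed, and any special handling the library applies to the returned value is omitted.

```python
MULTIPLIER = 33952834046453   # multiplier quoted in the description
MODULUS = 1 << 48             # modulus 2^48


def laran_sketch(iseed):
    """Advance a 4-element seed once; return (value in (0,1), updated seed)."""
    if len(iseed) != 4 or any(not 0 <= s <= 4095 for s in iseed):
        raise ValueError("iseed must hold four integers between 0 and 4095")
    if iseed[3] % 2 == 0:
        raise ValueError("iseed(4) must be odd")
    # Pack the four 12-bit pieces into one 48-bit integer
    # (assumed order: first element is the most significant).
    seed = 0
    for piece in iseed:
        seed = (seed << 12) | piece
    # One multiplicative congruential step.
    seed = (MULTIPLIER * seed) % MODULUS
    # Unpack the new 48-bit state into four 12-bit pieces.
    new_iseed = [(seed >> shift) & 0xFFF for shift in (36, 24, 12, 0)]
    return seed / MODULUS, new_iseed
```

Because the seed is odd and the multiplier is odd, the state stays odd, so the returned value is never exactly zero in this sketch.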
Input Parameters
iseed INTEGER. Array, size 4. On entry, the seed of the random number
generator. The array elements must be between 0 and 4095, and iseed(4)
must be odd.
Output Parameters
iseed INTEGER.
On exit, the seed is updated.
res REAL for slaran,
DOUBLE PRECISION for dlaran,
Random number.
?larf
Applies an elementary reflector to a general
rectangular matrix.
Syntax
call slarf( side, m, n, v, incv, tau, c, ldc, work )
call dlarf( side, m, n, v, incv, tau, c, ldc, work )
call clarf( side, m, n, v, incv, tau, c, ldc, work )
call zlarf( side, m, n, v, incv, tau, c, ldc, work )
Include Files
mkl.fi
Description
The routine applies a real/complex elementary reflector H to a real/complex m-by-n matrix C, from either the
left or the right. H is represented in one of the following forms:
H = I - tau*v*v^T
where tau is a real scalar and v is a real vector.
If tau = 0, then H is taken to be the unit matrix.
H = I - tau*v*v^H
where tau is a complex scalar and v is a complex vector.
If tau = 0, then H is taken to be the unit matrix. For clarf/zlarf, to apply H^H (the conjugate transpose
of H), supply conjg(tau) instead of tau.
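As an illustration of the real form applied from the left, the following Python sketch forms H*C without building H explicitly. It is a minimal sketch, not the library routine: the function name apply_reflector_left is hypothetical, C is a plain list of rows, and only the real case with side = 'L' is covered.

```python
def apply_reflector_left(v, tau, c):
    """Return H*C for H = I - tau*v*v^T (real case), with C given as a list of rows."""
    if tau == 0.0:
        # H is the unit matrix, so C is returned unchanged (as a copy).
        return [row[:] for row in c]
    m, n = len(c), len(c[0])
    # w = C^T * v, one entry per column of C.
    w = [sum(c[i][j] * v[i] for i in range(m)) for j in range(n)]
    # H*C = C - tau * v * w^T (a rank-one update).
    return [[c[i][j] - tau * v[i] * w[j] for j in range(n)] for i in range(m)]
```

Applying the reflector as a rank-one update costs on the order of m*n operations, whereas forming H first and multiplying would cost on the order of m*m*n.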
Input Parameters
side CHARACTER*1.
If side = 'L': form H*C
Output Parameters
?larfb
Applies a block reflector or its transpose/conjugate-
transpose to a general rectangular matrix.
Syntax
call slarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc, work,
ldwork )
call dlarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc, work,
ldwork )
call clarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc, work,
ldwork )
call zlarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc, work,
ldwork )
Include Files
mkl.fi
Description
The real flavors of the routine ?larfb apply a real block reflector H or its transpose H^T to a real m-by-n
matrix C from either left or right.
The complex flavors of the routine ?larfb apply a complex block reflector H or its conjugate transpose H^H to
a complex m-by-n matrix C from either left or right.
Input Parameters
The data types are given for the Fortran interface.
side CHARACTER*1.
If side = 'L': apply H or H^T for real flavors and H or H^H for complex
flavors from the left.
If side = 'R': apply H or H^T for real flavors and H or H^H for complex
flavors from the right.
trans CHARACTER*1.
If trans = 'N': apply H (No transpose).
direct CHARACTER*1.
Indicates how H is formed from a product of elementary reflectors:
If direct = 'F': H = H(1)*H(2)*...*H(k) (forward)
storev CHARACTER*1.
Indicates how the vectors which define the elementary reflectors are
stored:
If storev = 'C': Column-wise
ldwork INTEGER. The leading dimension of the array work.
If side = 'L', ldwork ≥ max(1, n);
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
Application Notes
The shape of the matrix V and the storage of the vectors which define the H(i) is best illustrated by the
following example with n = 5 and k = 3. The elements equal to 1 are not stored; the corresponding array
elements are modified but restored on exit. The rest of the array is not used.
?larfg
Generates an elementary reflector (Householder
matrix).
Syntax
call slarfg( n, alpha, x, incx, tau )
call dlarfg( n, alpha, x, incx, tau )
call clarfg( n, alpha, x, incx, tau )
call zlarfg( n, alpha, x, incx, tau )
Include Files
mkl.fi
Description
The routine ?larfg generates a real/complex elementary reflector H of order n, such that
Input Parameters
The data types are given for the Fortran interface.
DOUBLE COMPLEX for zlarfg
On entry, the value alpha.
incx INTEGER.
The increment between elements of x. incx > 0.
Output Parameters
?larfgp
Generates an elementary reflector (Householder
matrix) with non-negative beta.
Syntax
call slarfgp( n, alpha, x, incx, tau )
call dlarfgp( n, alpha, x, incx, tau )
call clarfgp( n, alpha, x, incx, tau )
call zlarfgp( n, alpha, x, incx, tau )
Include Files
mkl.fi
Description
The routine ?larfgp generates a real/complex elementary reflector H of order n, such that
where alpha and beta are scalars (with beta real and non-negative for all flavors), and x is an (n-1)-element
real/complex vector. H is represented in the form
Input Parameters
x REAL for slarfgp
DOUBLE PRECISION for dlarfgp
COMPLEX for clarfgp
DOUBLE COMPLEX for zlarfgp
Array, DIMENSION (1+(n-2)*abs(incx)).
incx INTEGER.
The increment between elements of x. incx > 0.
Output Parameters
The value tau.
?larft
Forms the triangular factor T of a block reflector H = I
- V*T*V**H.
Syntax
call slarft( direct, storev, n, k, v, ldv, tau, t, ldt )
call dlarft( direct, storev, n, k, v, ldv, tau, t, ldt )
call clarft( direct, storev, n, k, v, ldv, tau, t, ldt )
call zlarft( direct, storev, n, k, v, ldv, tau, t, ldt )
Include Files
mkl.fi
Description
The routine ?larft forms the triangular factor T of a real/complex block reflector H of order n, which is
defined as a product of k elementary reflectors.
If direct = 'F', H = H(1)*H(2)* . . .*H(k) and T is upper triangular;
If storev = 'C', the vector which defines the elementary reflector H(i) is stored in the i-th column of the
array v, and H = I - V*T*V^T (for real flavors) or H = I - V*T*V^H (for complex flavors).
If storev = 'R', the vector which defines the elementary reflector H(i) is stored in the i-th row of the array
v, and H = I - V^T*T*V (for real flavors) or H = I - V^H*T*V (for complex flavors).
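For direct = 'F' and storev = 'C' in the real case, the triangular factor can be built one column at a time. The recurrence used below, T(i,i) = tau(i) and T(1:i-1,i) = -tau(i) * T(1:i-1,1:i-1) * V(:,1:i-1)^T * v(i), is an assumption taken from the usual compact WY construction and is not a statement of how MKL implements ?larft. The function name and the explicit storage of the unit entries of V are also illustrative choices.

```python
def larft_forward_columnwise_sketch(v, tau):
    """Build upper triangular T so that H(1)*...*H(k) = I - V*T*V^T (real case).

    v   : n-by-k matrix as a list of rows; column i holds the vector of H(i),
          with its unit entry stored explicitly (an assumption of this sketch).
    tau : list of k scalars, one per reflector H(i) = I - tau(i)*v(i)*v(i)^T.
    """
    n, k = len(v), len(tau)
    t = [[0.0] * k for _ in range(k)]
    for i in range(k):
        t[i][i] = tau[i]
        # w = -tau(i) * V(:, 0:i)^T * v(i)
        w = [-tau[i] * sum(v[r][j] * v[r][i] for r in range(n)) for j in range(i)]
        # T(0:i, i) = T(0:i, 0:i) * w, using that T is upper triangular.
        for p in range(i):
            t[p][i] = sum(t[p][q] * w[q] for q in range(p, i))
    return t
```

With T formed this way, applying I - V*T*V^T to a matrix replaces k separate rank-one updates with matrix-matrix products, which is the purpose of the block representation.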
Input Parameters
The data types are given for the Fortran interface.
direct CHARACTER*1.
Specifies the order in which the elementary reflectors are multiplied to form
the block reflector:
= 'F': H = H(1)*H(2)*. . . *H(k) (forward)
storev CHARACTER*1.
Specifies how the vectors which define the elementary reflectors are stored
(see also Application Notes below):
= 'C': column-wise
= 'R': row-wise.
The matrix V.
Output Parameters
v The matrix V.
Application Notes
The shape of the matrix V and the storage of the vectors which define the H(i) is best illustrated by the
following example with n = 5 and k = 3. The elements equal to 1 are not stored; the corresponding array
elements are modified but restored on exit. The rest of the array is not used.
?larfx
Applies an elementary reflector to a general
rectangular matrix, with loop unrolling when the
reflector has order less than or equal to 10.
Syntax
call slarfx( side, m, n, v, tau, c, ldc, work )
call dlarfx( side, m, n, v, tau, c, ldc, work )
call clarfx( side, m, n, v, tau, c, ldc, work )
call zlarfx( side, m, n, v, tau, c, ldc, work )
Include Files
mkl.fi
Description
The routine ?larfx applies a real/complex elementary reflector H to a real/complex m-by-n matrix C, from
either the left or the right.
H is represented in the following forms:
Input Parameters
The data types are given for the Fortran interface.
side CHARACTER*1.
If side = 'L': form H*C
Output Parameters
?large
Pre- and post-multiplies a real general matrix with a
random orthogonal matrix.
Syntax
call slarge( n, a, lda, iseed, work, info )
call dlarge( n, a, lda, iseed, work, info )
call clarge( n, a, lda, iseed, work, info )
call zlarge( n, a, lda, iseed, work, info )
Include Files
mkl.fi
Description
The routine ?large pre- and post-multiplies a general n-by-n matrix A with a random orthogonal or unitary
matrix: A = U*D*U^T.
Input Parameters
Output Parameters
a On exit, A is overwritten by U*A*U' for some random orthogonal matrix U.
iseed INTEGER.
On exit, the seed is updated.
info INTEGER.
If info = 0, the execution is successful.
?largv
Generates a vector of plane rotations with real cosines
and real/complex sines.
Syntax
call slargv( n, x, incx, y, incy, c, incc )
call dlargv( n, x, incx, y, incy, c, incc )
call clargv( n, x, incx, y, incy, c, incc )
call zlargv( n, x, incx, y, incy, c, incc )
Include Files
mkl.fi
Description
The routine generates a vector of real/complex plane rotations with real cosines, determined by elements of
the real/complex vectors x and y.
For slargv/dlargv:
For clargv/zlargv:
where c(i)^2 + abs(s(i))^2 = 1 and the following conventions are used (these are the same as in clartg/
zlartg but differ from the BLAS Level 1 routine crotg/zrotg):
If y_i = 0, then c(i) = 1 and s(i) = 0;
Input Parameters
incc INTEGER. The increment between elements of the output array c. incc > 0.
Output Parameters
?larnd
Returns a random real number from a uniform or
normal distribution.
Syntax
res = slarnd( idist, iseed )
res = dlarnd( idist, iseed )
res = clarnd( idist, iseed )
res = zlarnd( idist, iseed )
Include Files
mkl.fi
Description
The routine ?larnd returns a random number from a uniform or normal distribution.
Input Parameters
idist INTEGER. Specifies the distribution of the random numbers. For slarnd and
dlarnd:
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1).
For clarnd and zlarnd:
Output Parameters
iseed INTEGER.
On exit, the seed is updated.
?larnv
Returns a vector of random numbers from a uniform
or normal distribution.
Syntax
call slarnv( idist, iseed, n, x )
call dlarnv( idist, iseed, n, x )
call clarnv( idist, iseed, n, x )
call zlarnv( idist, iseed, n, x )
Include Files
mkl.fi
Description
The routine ?larnv returns a vector of n random real/complex numbers from a uniform or normal
distribution.
This routine calls the auxiliary routine ?laruv to generate random real numbers from a uniform (0,1)
distribution, in batches of up to 128 using vectorisable code. The Box-Muller method is used to transform
numbers from a uniform to a normal distribution.
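The transformations applied to the uniform (0,1) samples can be sketched as follows for the real distributions idist = 1, 2, 3. This is a hedged illustration only: the function name is hypothetical, the uniform samples are passed in rather than generated, and the pairing of consecutive uniform values in the Box-Muller step is an assumption of this sketch.

```python
import math


def shape_uniform_samples(idist, u):
    """Map uniform (0,1) samples u to the distribution selected by idist."""
    if idist == 1:
        # uniform (0,1): samples are used as they are
        return list(u)
    if idist == 2:
        # uniform (-1,1): shift and scale
        return [2.0 * ui - 1.0 for ui in u]
    if idist == 3:
        # normal (0,1) by Box-Muller: each output consumes two uniform values
        return [
            math.sqrt(-2.0 * math.log(u[2 * i])) * math.cos(2.0 * math.pi * u[2 * i + 1])
            for i in range(len(u) // 2)
        ]
    raise ValueError("idist must be 1, 2, or 3 in this sketch")
```

Note that in this sketch the normal case needs twice as many uniform samples as outputs, and the first value of each pair must be strictly positive because its logarithm is taken.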
Input Parameters
The data types are given for the Fortran interface.
idist INTEGER. Specifies the distribution of the random numbers: for slarnv and
dlarnv:
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1).
for clarnv and zlarnv:
Output Parameters
?laror
Pre- or post-multiplies an m-by-n matrix by a random
orthogonal/unitary matrix.
Syntax
call slaror( side, init, m, n, a, lda, iseed, x, info )
call dlaror( side, init, m, n, a, lda, iseed, x, info )
call claror( side, init, m, n, a, lda, iseed, x, info )
call zlaror( side, init, m, n, a, lda, iseed, x, info )
Include Files
mkl.fi
Description
The routine ?laror pre- or post-multiplies an m-by-n matrix A by a random orthogonal or unitary matrix U,
overwriting A. A may optionally be initialized to the identity matrix before multiplying by U. U is generated
using the method of G.W. Stewart (SIAM J. Numer. Anal. 17, 1980, 403-409).
Input Parameters
If side = 'C' or 'T', multiply A on the left by U and the right by U^T.
a REAL for slaror,
DOUBLE PRECISION for dlaror,
COMPLEX for claror,
DOUBLE COMPLEX for zlaror,
Array, size lda by n.
'L' 2*m + n
'R' 2*n + m
'C' or 'T' 3*n
Output Parameters
a On exit, overwritten
by UA ( if side = 'L' ),
by AU ( if side = 'R' ),
iseed The values of iseed are changed on exit, and can be used in the next call
to continue the same random number sequence.
?larot
Applies a Givens rotation to two adjacent rows or
columns.
Syntax
call slarot( lrows, lleft, lright, nl, c, s, a, lda, xleft, xright )
call dlarot( lrows, lleft, lright, nl, c, s, a, lda, xleft, xright )
call clarot( lrows, lleft, lright, nl, c, s, a, lda, xleft, xright )
call zlarot( lrows, lleft, lright, nl, c, s, a, lda, xleft, xright )
Include Files
mkl.fi
Description
The routine ?larot applies a Givens rotation to two adjacent rows or columns, where one element of the
first or last column or row is stored in some format other than GE so that elements of the matrix may be
used or modified for which no array element is provided.
One example is a symmetric matrix in SB format (bandwidth = 4), for which uplo = 'L'. Two adjacent rows
will have the format:
If element p is set correctly, ?larot rotates the column and sets p to its new value. The next call to ?larot
rotates columns j and j+1, and restores symmetry. The element q is zero at the beginning, and non-zero
after the rotation. Later, rotations would presumably be chosen to zero q out.
Typical Calling Sequences: rotating the i-th and (i+1)-st rows.
Example
j = max(1, i-kl )
nl = min( n, i+ku+1 ) + 1-j
call dlarot( .TRUE., i-kl.GE.1, i+ku.LT.n, nl, c,s,
a(ku+i+1-j,j),lda-1, xleft, xright )
NOTE
i + 1 - j is just min(i, kl + 1).
j = max(1, i-k )
nl = min( k+1, i ) + 1
call dlarot( .TRUE., i-k.GE.1, .TRUE., nl, c,s,
a(i,j), lda, xleft, xright )
Symmetric banded matrix in SB format, bandwidth K, lower triangle only: [ same as for SY, except:]
. . . .
a(i+1-j,j), lda, xleft, xright )
NOTE
i+1-j is just min(i,k+1)
. . . .
a(k+1,i), lda-1, xleft, xright )
Rotating columns is just the transpose of rotating rows, except for GB and SB: (rotating columns i and i+1)
GB:
NOTE
ku+j+1-i is just max(1,ku+2-i)
j = max(1, i-ku )
nl = min( n, i+kl+1 ) + 1-j
call dlarot( .TRUE., i-ku.GE.1, i+kl.LT.n, nl, c,s,
a(ku+j+1-i,i),lda-1, xtop, xbottm )
. . . . . .
a(k+j+1-i,i),lda-1, xtop, xbottm )
. . . . . .
a(1,i),lda-1, xtop, xbottm )
Input Parameters
lrows LOGICAL.
If lrows = .TRUE., ?larot rotates two rows.
lleft LOGICAL.
If lleft = .TRUE., xleft is used instead of the corresponding element of
a for the first element in the second row (if lrows = .FALSE.) or column
(if lrows=.TRUE.).
lright LOGICAL.
If lright = .TRUE., xright is used instead of the corresponding element
of a for the first element in the second row (if lrows = .FALSE.) or
column (if lrows=.TRUE.).
is applied from the left.
If lrows = .FALSE., then the transpose thereof is applied from the right.
Output Parameters
?larra
Computes the splitting points with the specified
threshold.
Syntax
call slarra( n, d, e, e2, spltol, tnrm, nsplit, isplit, info )
call dlarra( n, d, e, e2, spltol, tnrm, nsplit, isplit, info )
Include Files
mkl.fi
Description
The routine computes the splitting points with the specified threshold and sets any "small" off-diagonal
elements to zero.
Input Parameters
First (n-1) entries contain the squares of the subdiagonal elements of the
tridiagonal matrix T; e2(n) need not be set.
Output Parameters
e2 On exit, the entries e2(isplit(i)), 1 ≤ i ≤ nsplit, are set to zero.
nsplit INTEGER.
The number of blocks the matrix T splits into. 1 ≤ nsplit ≤ n.
isplit INTEGER.
Array, DIMENSION (n).
The splitting points, at which T breaks up into blocks. The first block
consists of rows/columns 1 to isplit(1), the second of rows/columns
isplit(1)+1 through isplit(2), and so on, and the nsplit-th consists
of rows/columns isplit(nsplit-1)+1 through isplit(nsplit)=n.
info INTEGER.
= 0: successful exit.
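The relationship between small off-diagonal elements, nsplit, and isplit described above can be illustrated with a short Python sketch. It is not the library algorithm: the function name is hypothetical, and the test |e(i)| ≤ threshold with a single absolute threshold is an assumption standing in for the criterion that ?larra derives from spltol and tnrm.

```python
def split_points_sketch(e, threshold):
    """Zero small off-diagonal entries and return (e, isplit) with 1-based split points.

    e         : the n-1 off-diagonal elements of the tridiagonal matrix T
    threshold : absolute splitting threshold (assumed criterion for this sketch)
    """
    n = len(e) + 1
    e = list(e)
    isplit = []
    for i, value in enumerate(e):
        if abs(value) <= threshold:
            e[i] = 0.0            # T decouples between rows i+1 and i+2 (1-based)
            isplit.append(i + 1)  # the current block ends at row/column i+1
    isplit.append(n)              # the last block always ends at n
    return e, isplit
```

In this sketch nsplit is simply len(isplit), and the final entry of isplit always equals n, matching the convention isplit(nsplit) = n given above.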
?larrb
Provides limited bisection to locate eigenvalues for
more accuracy.
Syntax
call slarrb( n, d, lld, ifirst, ilast, rtol1, rtol2, offset, w, wgap, werr, work,
iwork, pivmin, spdiam, twist, info )
call dlarrb( n, d, lld, ifirst, ilast, rtol1, rtol2, offset, w, wgap, werr, work,
iwork, pivmin, spdiam, twist, info )
Include Files
mkl.fi
Description
Given the relatively robust representation (RRR) L*D*L^T, the routine does "limited" bisection to refine the
eigenvalues of L*D*L^T, w( ifirst-offset ) through w( ilast-offset ), to more accuracy. Initial guesses for these
eigenvalues are input in w. The corresponding estimate of the error in these guesses and their gaps are input
in werr and wgap, respectively. During bisection, intervals [left, right] are maintained by storing their mid-
points and semi-widths in the arrays w and werr respectively.
Input Parameters
offset INTEGER. Offset for the arrays w, wgap and werr, that is, the ifirst-offset
through ilast-offset elements of these arrays are to be used.
twist INTEGER. The twist index for the twisted factorization that is used for the
negcount.
twist = n: Compute negcount from L*D*L^T - lambda*I = N_r*D_r*N_r
iwork INTEGER.
Workspace array, DIMENSION (2*n).
Output Parameters
info INTEGER.
Error flag.
?larrc
Computes the number of eigenvalues of the
symmetric tridiagonal matrix.
Syntax
call slarrc( jobt, n, vl, vu, d, e, pivmin, eigcnt, lcnt, rcnt, info )
call dlarrc( jobt, n, vl, vu, d, e, pivmin, eigcnt, lcnt, rcnt, info )
Include Files
mkl.fi
Description
The routine finds the number of eigenvalues of the symmetric tridiagonal matrix T or of its factorization
L*D*L^T in the specified interval.
Input Parameters
jobt CHARACTER*1.
= 'T': computes Sturm count for matrix T.
= 'L': computes Sturm count for matrix L*D*L^T.
n INTEGER.
The order of the matrix. (n > 1).
Output Parameters
eigcnt INTEGER.
The number of eigenvalues of the symmetric tridiagonal matrix T that are in
the half-open interval (vl,vu].
lcnt,rcnt INTEGER.
The left and right negcounts of the interval.
info INTEGER.
Currently not used; always set to 0.
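For the case jobt = 'T', the way eigcnt follows from the two negcounts can be sketched with a Sturm sequence on the diagonal d and off-diagonal e of T. This is a minimal sketch under stated assumptions, not the library code: the function names are hypothetical, and replacing a zero pivot by a tiny negative number is a safeguard chosen for this illustration.

```python
def count_not_above(d, e, x, tiny=1e-300):
    """Count eigenvalues of the tridiagonal T (diagonal d, off-diagonal e) that are <= x."""
    count = 0
    pivot = 0.0
    for i in range(len(d)):
        if i == 0:
            pivot = d[0] - x
        else:
            # Next pivot of the factorization of T - x*I.
            pivot = (d[i] - x) - e[i - 1] ** 2 / pivot
        if pivot == 0.0:
            pivot = -tiny  # safeguard: treat an exact zero as slightly negative
        if pivot < 0.0:
            count += 1
    return count


def eigenvalue_count_sketch(d, e, vl, vu):
    """Return (eigcnt, lcnt, rcnt) for the half-open interval (vl, vu]."""
    lcnt = count_not_above(d, e, vl)
    rcnt = count_not_above(d, e, vu)
    return rcnt - lcnt, lcnt, rcnt
```

For example, the 2-by-2 matrix with diagonal (2, 2) and off-diagonal 1 has eigenvalues 1 and 3, so the interval (0, 3] contains both and the interval (1, 3] contains only the eigenvalue 3.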
?larrd
Computes the eigenvalues of a symmetric tridiagonal
matrix to suitable accuracy.
Syntax
call slarrd( range, order, n, vl, vu, il, iu, gers, reltol, d, e, e2, pivmin, nsplit,
isplit, m, w, werr, wl, wu, iblock, indexw, work, iwork, info )
call dlarrd( range, order, n, vl, vu, il, iu, gers, reltol, d, e, e2, pivmin, nsplit,
isplit, m, w, werr, wl, wu, iblock, indexw, work, iwork, info )
Include Files
mkl.fi
Description
The routine computes the eigenvalues of a symmetric tridiagonal matrix T to suitable accuracy. This is an
auxiliary code to be called from ?stemr. The user may ask for all eigenvalues, all eigenvalues in the half-
open interval (vl, vu], or the il-th through iu-th eigenvalues.
To avoid overflow, the matrix must be scaled so that its largest element is no greater than
overflow^(1/2) * underflow^(1/4) in absolute value, and for greatest accuracy, it should not be much smaller
than that. For more details see [Kahan66].
Input Parameters
range CHARACTER.
= 'A': ("All") all eigenvalues will be found.
= 'V': ("Value") all eigenvalues in the half-open interval (vl, vu] will be
found.
= 'I': ("Index") the il-th through iu-th eigenvalues will be found.
order CHARACTER.
= 'B': ("By block") the eigenvalues will be grouped by split-off block (see
iblock, isplit below) and ordered from smallest to largest within the
block.
= 'E': ("Entire matrix") the eigenvalues for the entire matrix will be
ordered from smallest to largest.
il,iu INTEGER.
If range = 'I': the indices (in ascending order) of the smallest and
largest eigenvalues to be returned. 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and
iu=0 if n=0.
If range = 'A' or 'V': not referenced.
nsplit INTEGER.
The number of diagonal blocks in the matrix T. 1 ≤ nsplit ≤ n.
isplit INTEGER.
Array, DIMENSION (n).
iwork INTEGER.
Workspace array, DIMENSION (4*n).
Output Parameters
m INTEGER.
The actual number of eigenvalues found. 0 ≤ m ≤ n. (See also the description
of info=2,3.)
The first m elements of w contain the eigenvalue approximations. ?larrd
computes an interval I_j = (a_j, b_j] that includes eigenvalue j. The
eigenvalue approximation is given as the interval midpoint w(j) = (a_j +
b_j)/2. The corresponding error is bounded by werr(j) = abs(a_j - b_j)/2.
If range = 'A': then wl and wu are the global Gerschgorin bounds on the
spectrum.
If range = 'I': then wl and wu are computed by ?laebz from the index
range specified.
iblock INTEGER.
Array, DIMENSION (n).
indexw INTEGER.
Array, DIMENSION (n).
The indices of the eigenvalues within each block (submatrix); for example,
indexw(i)= j and iblock(i)=k imply that the i-th eigenvalue w(i) is
the j-th eigenvalue in block k.
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument has an illegal value
> 0: some or all of the eigenvalues fail to converge or are not computed:
=1 or 3: bisection failed to converge for some eigenvalues; these eigenvalues
are flagged by a negative block number. The effect is that the eigenvalues
may not be as accurate as the absolute and relative tolerances.
=2 or 3: range='I' only: not all of the eigenvalues il:iu were found.
=4: range='I', and the Gershgorin interval initially used is too small. No
eigenvalues are computed.
?larre
Given the tridiagonal matrix T, sets small off-diagonal
elements to zero and for each unreduced block Ti,
finds base representations and eigenvalues.
Syntax
call slarre( range, n, vl, vu, il, iu, d, e, e2, rtol1, rtol2, spltol, nsplit, isplit,
m, w, werr, wgap, iblock, indexw, gers, pivmin, work, iwork, info )
call dlarre( range, n, vl, vu, il, iu, d, e, e2, rtol1, rtol2, spltol, nsplit, isplit,
m, w, werr, wgap, iblock, indexw, gers, pivmin, work, iwork, info )
Include Files
mkl.fi
Description
To find the desired eigenvalues of a given real symmetric tridiagonal matrix T, the routine sets any "small"
off-diagonal elements to zero, and for each unreduced block Ti, it finds
Input Parameters
range CHARACTER.
= 'A': ("All") all eigenvalues will be found.
= 'V': ("Value") all eigenvalues in the half-open interval (vl, vu] will be
found.
= 'I': ("Index") the il-th through iu-th eigenvalues of the entire matrix
will be found.
il, iu INTEGER.
If range='I', the indices (in ascending order) of the smallest and largest
eigenvalues to be returned. 1 ≤ il ≤ iu ≤ n.
Array, DIMENSION (n).
iwork INTEGER.
Workspace array, DIMENSION (5*n).
Output Parameters
vl, vu On exit, if range='I' or ='A', contain the bounds on the desired part of
the spectrum.
e2 On exit, the entries e2( isplit( i) ), 1 ≤ i ≤ nsplit, have been set to zero.
isplit INTEGER. Array, DIMENSION (n). The splitting points, at which T breaks up
into blocks. The first block consists of rows/columns 1 to isplit(1), the
second of rows/columns isplit(1)+1 through isplit(2), etc., and the
nsplit-th consists of rows/columns isplit(nsplit-1)+1 through
isplit(nsplit)=n.
m INTEGER. The total number of eigenvalues (of all the L_i*D_i*L_i^T) found.
info INTEGER.
If info = 0: successful exit
If info = -2, no base representation could be found in maxtry
iterations. Increasing maxtry and recompilation might be a remedy.
If info = -3, there is a problem in ?larrb when computing the refined
root representation for ?lasq2.
If info = -4, there is a problem in ?larrb when performing bisection
on the desired part of the spectrum.
If info = -5, there is a problem in ?lasq2.
If info = -6, there is a problem in ?lasq2.
See Also
?stemr
?lasq2
?larrb
?larrd
?larrf
Finds a new relatively robust representation such that
at least one of the eigenvalues is relatively isolated.
Syntax
call slarrf( n, d, l, ld, clstrt, clend, w, wgap, werr, spdiam, clgapl, clgapr,
pivmin, sigma, dplus, lplus, work, info )
call dlarrf( n, d, l, ld, clstrt, clend, w, wgap, werr, spdiam, clgapl, clgapr,
pivmin, sigma, dplus, lplus, work, info )
Include Files
mkl.fi
Description
Given the initial representation L*D*L^T and its cluster of close eigenvalues (in a relative measure), w(clstrt),
w(clstrt+1), ... w(clend), the routine ?larrf finds a new relatively robust representation
Input Parameters
Output Parameters
sigma REAL for slarrf
DOUBLE PRECISION for dlarrf
The shift used to form L(+)*D(+)*L(+)^T.
?larrj
Performs refinement of the initial estimates of the
eigenvalues of the matrix T.
Syntax
call slarrj( n, d, e2, ifirst, ilast, rtol, offset, w, werr, work, iwork, pivmin,
spdiam, info )
call dlarrj( n, d, e2, ifirst, ilast, rtol, offset, w, werr, work, iwork, pivmin,
spdiam, info )
Include Files
mkl.fi
Description
Given the initial eigenvalue approximations of T, this routine does bisection to refine the eigenvalues of T,
w(ifirst-offset) through w(ilast-offset), to more accuracy. Initial guesses for these eigenvalues are
input in w, the corresponding estimate of the error in these guesses in werr. During bisection, intervals
[a,b] are maintained by storing their mid-points and semi-widths in the arrays w and werr respectively.
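The refinement of one eigenvalue by bisection, with the interval kept as a midpoint and a semi-width, can be sketched as follows. This is an illustration under stated assumptions rather than the library algorithm: the function names are hypothetical, the starting interval [w - werr, w + werr] is assumed to contain the k-th smallest eigenvalue, and the zero-pivot safeguard is a choice made for this sketch.

```python
def negcount(d, e2, x, pivmin=1e-300):
    """Sturm count: number of eigenvalues of T below x (d: diagonal, e2: squared off-diagonals)."""
    count = 0
    s = d[0] - x
    for i in range(len(d)):
        if i > 0:
            s = (d[i] - x) - e2[i - 1] / s
        if abs(s) < pivmin:
            s = -pivmin  # safeguard against division by zero in the next step
        if s < 0.0:
            count += 1
    return count


def refine_eigenvalue(d, e2, k, w, werr, rtol):
    """Shrink [w - werr, w + werr] around the k-th smallest eigenvalue (k is 1-based)."""
    a, b = w - werr, w + werr
    while (b - a) > rtol * max(abs(a), abs(b)):
        mid = 0.5 * (a + b)
        if mid == a or mid == b:
            break  # no representable point left between a and b
        if negcount(d, e2, mid) >= k:
            b = mid  # at least k eigenvalues lie below mid: keep the left half
        else:
            a = mid  # fewer than k eigenvalues lie below mid: keep the right half
    return 0.5 * (a + b), 0.5 * (b - a)  # refined midpoint and semi-width
```

The loop stops once the interval width falls to rtol*max(|a|,|b|), so each extra digit of relative accuracy costs roughly three to four Sturm counts per eigenvalue.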
Input Parameters
ifirst INTEGER.
The index of the first eigenvalue to be computed.
ilast INTEGER.
The index of the last eigenvalue to be computed.
rtol REAL for slarrj
DOUBLE PRECISION for dlarrj
Tolerance for the convergence of the bisection intervals. An interval [a,b]
is considered to be converged if (b-a) ≤ rtol*max(|a|,|b|).
offset INTEGER.
Offset for the arrays w and werr, that is, the ifirst-offset through
ilast-offset elements of these arrays are to be used.
w REAL for slarrj
DOUBLE PRECISION for dlarrj
Array, DIMENSION (n).
iwork INTEGER.
Workspace array, DIMENSION (2*n).
Output Parameters
werr On exit, contains the refined errors in the estimates of the corresponding
elements in w.
info INTEGER.
Currently not used; always set to 0.
?larrk
Computes one eigenvalue of a symmetric tridiagonal
matrix T to suitable accuracy.
Syntax
call slarrk( n, iw, gl, gu, d, e2, pivmin, reltol, w, werr, info )
call dlarrk( n, iw, gl, gu, d, e2, pivmin, reltol, w, werr, info )
Include Files
mkl.fi
Description
The routine computes one eigenvalue of a symmetric tridiagonal matrix T to suitable accuracy. This is an
auxiliary code to be called from ?stemr.
To avoid overflow, the matrix must be scaled so that its largest element is no greater than
overflow^(1/2) * underflow^(1/4) in absolute value, and for greatest accuracy, it should not be much smaller
than that. For more details see [Kahan66].
Input Parameters
iw INTEGER.
The index of the eigenvalue to be returned.
Output Parameters
info INTEGER.
= 0: Eigenvalue converges
= -1: Eigenvalue does not converge
?larrr
Performs tests to decide whether the symmetric
tridiagonal matrix T warrants expensive computations
which guarantee high relative accuracy in the
eigenvalues.
Syntax
call slarrr( n, d, e, info )
call dlarrr( n, d, e, info )
Include Files
mkl.fi
Description
The routine performs tests to decide whether the symmetric tridiagonal matrix T warrants expensive
computations which guarantee high relative accuracy in the eigenvalues.
Input Parameters
Output Parameters
info INTEGER.
= 0: the matrix warrants computations preserving relative accuracy
(default value).
= -1: the matrix warrants computations guaranteeing only absolute
accuracy.
?larrv
Computes the eigenvectors of the tridiagonal matrix T
= L*D*L^T given L, D and the eigenvalues of L*D*L^T.
Syntax
call slarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1, rtol2, w,
werr, wgap, iblock, indexw, gers, z, ldz, isuppz, work, iwork, info )
call dlarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1, rtol2, w,
werr, wgap, iblock, indexw, gers, z, ldz, isuppz, work, iwork, info )
call clarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1, rtol2, w,
werr, wgap, iblock, indexw, gers, z, ldz, isuppz, work, iwork, info )
call zlarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1, rtol2, w,
werr, wgap, iblock, indexw, gers, z, ldz, isuppz, work, iwork, info )
Include Files
mkl.fi
Description
The routine ?larrv computes the eigenvectors of the tridiagonal matrix T = L*D*L^T given L, D and
approximations to the eigenvalues of L*D*L^T.
The input eigenvalues should have been computed by slarre for single precision flavors (slarrv/clarrv) and by
dlarre for double precision flavors (dlarrv/zlarrv).
Input Parameters
On entry, the (n-1) subdiagonal elements of the unit bidiagonal matrix L are
contained in elements 1 to n-1 of L if the matrix is not split. At the end
of each block the corresponding shift is stored as given by slarre for real
flavors and by dlarre for double precision flavors.
If you want to compute only selected eigenpairs, then the columns dol-1 to
dou+1 of the eigenvector space Z contain the computed eigenvectors. All
other columns of Z are set to zero.
Parameters for bisection. An interval [LEFT,RIGHT] has converged if
RIGHT-LEFT.LT.MAX( rtol1*gap, rtol2*max(|LEFT|,|RIGHT|) ).
ldz INTEGER. The leading dimension of the output array Z. ldz ≥ 1, and if jobz
= 'V', ldz ≥ max(1,n).
iwork INTEGER.
Workspace array, DIMENSION (7*n).
Output Parameters
l On exit, l is overwritten.
wgap On exit, wgap contains refined values of its input approximations. Very
small gaps are changed.
NOTE
The user must ensure that at least max(1,m) columns are supplied in
the array z.
isuppz INTEGER .
Array, DIMENSION(2*max(1,m)). The support of the eigenvectors in z, that
is, the indices indicating the nonzero elements in z. The i-th eigenvector is
nonzero only in elements isuppz(2*i-1) through isuppz(2*i).
info INTEGER.
If info = 0: successful exit
If info = -3, there is a problem in ?larrb when refining a single
eigenvalue after the Rayleigh correction was rejected.
See Also
?larrb
?larre
?larrf
?lartg
Generates a plane rotation with real cosine and real/
complex sine.
Syntax
call slartg( f, g, cs, sn, r )
call dlartg( f, g, cs, sn, r )
call clartg( f, g, cs, sn, r )
call zlartg( f, g, cs, sn, r )
Include Files
mkl.fi
Description
This is a slower, more accurate version of the BLAS Level 1 routine ?rotg, except for the following
differences.
For slartg/dlartg:
If f=0 and g ≠ 0, then cs=0 and sn=1 without doing any floating point operations (saves work in ?bdsqr
when there are zeros on the diagonal);
If f exceeds g in magnitude, cs will be positive.
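The real-flavor conventions listed above can be collected into a short Python sketch. It is an illustration only, not the library routine: the function name is hypothetical, the defining relation cs*f + sn*g = r and -sn*f + cs*g = 0 and the g = 0 case (cs = 1, sn = 0) are assumptions taken from the usual definition of this rotation, and the scaling the library uses to avoid overflow and underflow is omitted.

```python
import math


def lartg_real_sketch(f, g):
    """Return (cs, sn, r) with cs*f + sn*g = r and -sn*f + cs*g = 0 (real case)."""
    if g == 0.0:
        return 1.0, 0.0, f   # assumed convention: nothing to rotate
    if f == 0.0:
        return 0.0, 1.0, g   # no floating point operations needed
    r = math.hypot(f, g)
    cs, sn = f / r, g / r
    if abs(f) > abs(g) and cs < 0.0:
        # keep cs positive when f dominates g in magnitude
        cs, sn, r = -cs, -sn, -r
    return cs, sn, r
```

For f = -4 and g = 3 this sketch yields a positive cs and a negative r, which shows that the sign of r is not fixed by these conventions.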
For clartg/zlartg:
Input Parameters
Output Parameters
?lartgp
Generates a plane rotation.
Syntax
call slartgp( f, g, cs, sn, r )
call dlartgp( f, g, cs, sn, r )
call lartgp( f,g,cs,sn,r )
Include Files
mkl.fi
Description
The routine generates a plane rotation so that
where cs^2 + sn^2 = 1
This is a slower, more accurate version of the BLAS Level 1 routine ?rotg, except for the following
differences:
Input Parameters
The data types are given for the Fortran interface.
Output Parameters
See Also
?rotg
?lartg
?lartgs
?lartgs
Generates a plane rotation designed to introduce a
bulge in implicit QR iteration for the bidiagonal SVD
problem.
Syntax
call slartgs( x, y, sigma, cs, sn )
call dlartgs( x, y, sigma, cs, sn )
call lartgs( x,y,sigma,cs,sn )
Include Files
mkl.fi
Description
The routine generates a plane rotation designed to introduce a bulge in Golub-Reinsch-style implicit QR
iteration for the bidiagonal SVD problem. x and y are the top-row entries, and sigma is the shift. The
computed cs and sn define a plane rotation that satisfies the following:
    [  cs  sn ]   [ x^2 - sigma ]   [ r ]
    [ -sn  cs ] * [     x*y     ] = [ 0 ]
with r nonnegative.
If x^2 - sigma and x*y are 0, the rotation is by π/2.
Input Parameters
The data types are given for the Fortran interface.
Output Parameters
The cosine of the rotation.
See Also
?lartg
?lartgp
?lartv
Applies a vector of plane rotations with real cosines
and real/complex sines to the elements of a pair of
vectors.
Syntax
call slartv( n, x, incx, y, incy, c, s, incc )
call dlartv( n, x, incx, y, incy, c, s, incc )
call clartv( n, x, incx, y, incy, c, s, incc )
call zlartv( n, x, incx, y, incy, c, s, incc )
Include Files
mkl.fi
Description
The routine applies a vector of real/complex plane rotations with real cosines to elements of the real/complex
vectors x and y. For i = 1,2,...,n
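For the real flavors, the per-element update can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function name is hypothetical, all strides (incx, incy, incc) are taken to be 1, and the update is assumed to be x(i) := c(i)*x(i) + s(i)*y(i), y(i) := c(i)*y(i) - s(i)*x(i), using the old x(i) in the second assignment.

```python
def apply_rotations(x, y, c, s):
    """Apply the rotation (c[i], s[i]) to each pair (x[i], y[i]); returns new lists."""
    x_out, y_out = [], []
    for xi, yi, ci, si in zip(x, y, c, s):
        # Both results use the values of x(i) and y(i) from before the rotation.
        x_out.append(ci * xi + si * yi)
        y_out.append(ci * yi - si * xi)
    return x_out, y_out
```

A rotation with c = 1, s = 0 leaves its pair unchanged, while c = 0, s = 1 maps (x, y) to (y, -x).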
Input Parameters
Output Parameters
?laruv
Returns a vector of n random real numbers from a
uniform distribution.
Syntax
call slaruv( iseed, n, x )
call dlaruv( iseed, n, x )
Include Files
mkl.fi
Description
The routine ?laruv returns a vector of n random real numbers from a uniform (0,1) distribution (n ≤ 128).
This is an auxiliary routine called by ?larnv.
Input Parameters
iseed INTEGER. Array, DIMENSION (4). On entry, the seed of the random number
generator; the array elements must be between 0 and 4095, and iseed(4)
must be odd.
Output Parameters
?larz
Applies an elementary reflector (as returned by
?tzrzf) to a general matrix.
Syntax
call slarz( side, m, n, l, v, incv, tau, c, ldc, work )
call dlarz( side, m, n, l, v, incv, tau, c, ldc, work )
call clarz( side, m, n, l, v, incv, tau, c, ldc, work )
call zlarz( side, m, n, l, v, incv, tau, c, ldc, work )
Include Files
mkl.fi
Description
The routine ?larz applies a real/complex elementary reflector H to a real/complex m-by-n matrix C, from
either the left or the right. H is represented in the forms
H = I - tau*v*v^T for real flavors and H = I - tau*v*v^H for complex flavors,
where tau is a real/complex scalar and v is a real/complex vector, respectively.
If tau = 0, then H is taken to be the unit matrix.
For complex flavors, to apply H^H (the conjugate transpose of H), supply conjg(tau) instead of tau.
H is a product of k elementary reflectors as returned by ?tzrzf.
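The action of a single reflector from the left can be sketched for real data as follows. This Python fragment is an illustration only: the function name is hypothetical, the matrix is a list of rows, and the special structure of the vector produced by ?tzrzf (the l, incv and work arguments) is ignored, so v is treated as a full vector of length m.

```python
def apply_reflector_left(tau, v, c):
    """Return H*C for H = I - tau*v*v^T, with C given as a list of rows."""
    if tau == 0.0:
        # H is the unit matrix, so C is returned unchanged (as a copy).
        return [row[:] for row in c]
    m, n = len(c), len(c[0])
    # w = C^T * v, one entry per column of C.
    w = [sum(v[i] * c[i][j] for i in range(m)) for j in range(n)]
    # H*C = C - tau * v * w^T, a rank-one update.
    return [[c[i][j] - tau * v[i] * w[j] for j in range(n)] for i in range(m)]
```

Forming w first keeps the cost at O(m*n) instead of building the m-by-m matrix H explicitly.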
Input Parameters
side CHARACTER*1.
If side = 'L': form H*C
if side = 'R', n ≥ l ≥ 0.
(m) if side = 'R'.
Output Parameters
?larzb
Applies a block reflector or its transpose/conjugate-
transpose to a general matrix.
Syntax
call slarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc, work,
ldwork )
call dlarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc, work,
ldwork )
call clarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc, work,
ldwork )
call zlarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc, work,
ldwork )
Include Files
mkl.fi
Description
The routine applies a real/complex block reflector H or its transpose H^T (or the conjugate transpose H^H for
complex flavors) to a real/complex distributed m-by-n matrix C from the left or the right. Currently, only
storev = 'R' and direct = 'B' are supported.
Input Parameters
side CHARACTER*1.
If side = 'L': apply H or H^T/H^H from the left
trans CHARACTER*1.
If trans = 'N': apply H (No transpose)
direct CHARACTER*1.
Indicates how H is formed from a product of elementary reflectors
= 'F': H = H(1)*H(2)*...*H(k) (forward, not supported)
storev CHARACTER*1.
Indicates how the vectors which define the elementary reflectors are
stored:
= 'C': Column-wise (not supported)
= 'R': Row-wise.
If storev = 'C', nv = k;
if storev = 'R', nv = l.
work REAL for slarzb
DOUBLE PRECISION for dlarzb
COMPLEX for clarzb
DOUBLE COMPLEX for zlarzb
Workspace array, DIMENSION (ldwork, k).
Output Parameters
?larzt
Forms the triangular factor T of a block reflector H = I
- V*T*V^H.
Syntax
call slarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
call dlarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
call clarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
call zlarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
Include Files
mkl.fi
Description
The routine forms the triangular factor T of a real/complex block reflector H of order > n, which is defined as
a product of k elementary reflectors.
If direct = 'F', H = H(1)*H(2)*...*H(k), and T is upper triangular.
If storev = 'C', the vector which defines the elementary reflector H(i) is stored in the i-th column of the
array v, and H = I - V*T*V^T (for real flavors) or H = I - V*T*V^H (for complex flavors).
If storev = 'R', the vector which defines the elementary reflector H(i) is stored in the i-th row of the array
v, and H = I - V^T*T*V (for real flavors) or H = I - V^H*T*V (for complex flavors).
Input Parameters
direct CHARACTER*1.
Specifies the order in which the elementary reflectors are multiplied to form
the block reflector:
storev CHARACTER*1.
Specifies how the vectors which define the elementary reflectors are stored
(see also Application Notes below):
If storev = 'C': column-wise (not supported)
Output Parameters
Array, DIMENSION (ldt,k). The k-by-k triangular factor T of the block
reflector. If direct = 'F', T is upper triangular; if direct = 'B', T is
lower triangular. The rest of the array is not used.
Application Notes
The shape of the matrix V and the storage of the vectors which define the H(i) is best illustrated by the
following example with n = 5 and k = 3. The elements equal to 1 are not stored; the corresponding array
elements are modified but restored on exit. The rest of the array is not used.
?las2
Computes singular values of a 2-by-2 triangular
matrix.
Syntax
call slas2( f, g, h, ssmin, ssmax )
call dlas2( f, g, h, ssmin, ssmax )
Include Files
mkl.fi
Description
The routine ?las2 computes the singular values of the 2-by-2 matrix
    [ f  g ]
    [ 0  h ]
On return, ssmin is the smaller singular value and ssmax is the larger singular value.
Input Parameters
Output Parameters
Application Notes
Barring over/underflow, all output quantities are correct to within a few units in the last place (ulps), even in
the absence of a guard digit in addition/subtraction. In IEEE arithmetic, the code works correctly if one matrix
element is infinite. Overflow will not occur unless the largest singular value itself overflows, or is within a few
ulps of overflow. (On machines with partial overflow, like the Cray, overflow may occur if the largest singular
value is within a factor of 2 of overflow.) Underflow is harmless if underflow is gradual. Otherwise, results
may correspond to a matrix modified by perturbations of size near the underflow threshold.
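One way to obtain both singular values of the upper triangular matrix with rows (f, g) and (0, h) without forming f^2 + g^2 + h^2 directly is sketched below in Python. It is an illustration under assumptions, not the library source: the function name is hypothetical, and the formulation (working with ratios of the absolute values so that no intermediate quantity exceeds the largest singular value by much) is one plausible scheme for the behavior described in the notes above.

```python
import math


def singular_values_2x2_triangular(f, g, h):
    """Return (ssmin, ssmax) for the matrix [[f, g], [0, h]]."""
    fa, ga, ha = abs(f), abs(g), abs(h)
    fhmn, fhmx = min(fa, ha), max(fa, ha)
    if fhmn == 0.0:
        # A zero on the diagonal makes the matrix singular, so ssmin = 0.
        if fhmx == 0.0:
            return 0.0, ga
        big, small = max(fhmx, ga), min(fhmx, ga)
        return 0.0, big * math.sqrt(1.0 + (small / big) ** 2)
    a_s = 1.0 + fhmn / fhmx
    a_t = (fhmx - fhmn) / fhmx
    if ga < fhmx:
        # The diagonal dominates: scale everything by fhmx.
        au = (ga / fhmx) ** 2
        c = 2.0 / (math.sqrt(a_s * a_s + au) + math.sqrt(a_t * a_t + au))
        return fhmn * c, fhmx / c
    # The off-diagonal dominates: scale everything by ga.
    au = fhmx / ga
    if au == 0.0:
        # fhmx/ga underflowed; the ratios below would lose all information.
        return (fhmn * fhmx) / ga, ga
    c = 1.0 / (math.sqrt(1.0 + (a_s * au) ** 2) + math.sqrt(1.0 + (a_t * au) ** 2))
    return 2.0 * (fhmn * c) * au, ga / (c + c)
```

A quick sanity check is that ssmin*ssmax equals |f*h| and ssmin^2 + ssmax^2 equals f^2 + g^2 + h^2, up to rounding.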
?lascl
Multiplies a general rectangular matrix by a real scalar
defined as cto/cfrom.
Syntax
call slascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
call dlascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
call clascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
call zlascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?lascl multiplies the m-by-n real/complex matrix A by the real scalar cto/cfrom. The operation
is performed without over/underflow as long as the final result cto*A(i,j)/cfrom does not over/underflow.
type specifies that A may be full, upper triangular, lower triangular, upper Hessenberg, or banded.
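The idea of multiplying by cto/cfrom without intermediate over/underflow can be sketched in Python as below. This is a simplified illustration, not the library routine: the function name is hypothetical, the data is a flat list of finite real numbers, the type, kl and ku arguments are ignored, and the staging strategy (peeling off factors of the smallest or largest safe number until the remaining ratio is representable) is an assumption about how such scaling is typically organized.

```python
import math
import sys


def scale_safely(a, cfrom, cto):
    """Return [x*cto/cfrom for x in a], applying the factor in safe stages."""
    if cfrom == 0.0 or not (math.isfinite(cfrom) and math.isfinite(cto)):
        raise ValueError("cfrom must be nonzero and both scalars finite")
    smlnum = sys.float_info.min      # smallest normalized positive number
    bignum = 1.0 / smlnum
    cfromc, ctoc = cfrom, cto
    out = list(a)
    done = False
    while not done:
        cfrom1 = cfromc * smlnum
        cto1 = ctoc / bignum
        if abs(cfrom1) > abs(ctoc) and ctoc != 0.0:
            # The ratio is tiny: apply one factor of smlnum and continue.
            mul, cfromc = smlnum, cfrom1
        elif abs(cto1) > abs(cfromc):
            # The ratio is huge: apply one factor of bignum and continue.
            mul, ctoc = bignum, cto1
        else:
            # The remaining ratio is representable: apply it and stop.
            mul, done = ctoc / cfromc, True
        out = [x * mul for x in out]
    return out
```

For instance, with cfrom = 1e-200 and cto = 1e200 the ratio itself is not representable in double precision, yet an element equal to 1e-200 is scaled to approximately 1e200.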
Input Parameters
type CHARACTER*1. This parameter specifies the storage type of the input
matrix.
= 'G': A is a full matrix.
= 'Z': A is a band matrix with lower bandwidth kl and upper bandwidth ku.
See description of the ?gbtrf function for storage details.
Output Parameters
info INTEGER.
If info = 0: successful exit.
See Also
?gbtrf
?lasd0
Computes the singular values of a real upper
bidiagonal n-by-m matrix B with diagonal d and off-
diagonal e. Used by ?bdsdc.
Syntax
call slasd0( n, sqre, d, e, u, ldu, vt, ldvt, smlsiz, iwork, work, info )
call dlasd0( n, sqre, d, e, u, ldu, vt, ldvt, smlsiz, iwork, work, info )
Include Files
mkl.fi
Description
Using a divide and conquer approach, the routine ?lasd0 computes the singular value decomposition (SVD)
of a real upper bidiagonal n-by-m matrix B with diagonal d and off-diagonal e, where m = n + sqre.
The algorithm computes orthogonal matrices U and VT such that B = U*S*VT. The singular values S are
overwritten on d.
The related subroutine ?lasda computes only the singular values, and optionally, the singular vectors in
compact form.
Input Parameters
n INTEGER. On entry, the row dimension of the upper bidiagonal matrix. This
is also the dimension of the main diagonal array d.
smlsiz INTEGER. On entry, maximum size of the subproblems at the bottom of the
computation tree.
iwork INTEGER.
Workspace array, dimension must be at least (8n).
Output Parameters
info INTEGER.
If info = 0: successful exit.
?lasd1
Computes the SVD of an upper bidiagonal matrix B of
the specified size. Used by ?bdsdc.
Syntax
call slasd1( nl, nr, sqre, d, alpha, beta, u, ldu, vt, ldvt, idxq, iwork, work, info )
call dlasd1( nl, nr, sqre, d, alpha, beta, u, ldu, vt, ldvt, idxq, iwork, work, info )
Include Files
mkl.fi
Description
The routine computes the SVD of an upper bidiagonal n-by-m matrix B, where n = nl + nr + 1 and m = n
+ sqre.
The routine ?lasd1 is called from ?lasd0.
A related subroutine ?lasd7 handles the case in which the singular values (and the singular vectors in
factored form) are desired.
?lasd1 computes the SVD as follows:
              ( D1(in)   0     0      0 )
  B = U(in) * (  Z1^T    a    Z2^T    b ) * VT(in)
              (   0      0   D2(in)   0 )
    = U(out) * ( D(out) 0 ) * VT(out)
where Z^T = (Z1^T a Z2^T b) = u^T*VT^T, and u is a vector of dimension m with alpha and beta in the nl+1 and
nl+2-th entries and zeros elsewhere; and the entry b is empty if sqre = 0.
The left singular vectors of the original matrix are stored in u, and the transpose of the right singular vectors
are stored in vt, and the singular values are in d. The algorithm consists of three stages:
1. The first stage consists of deflating the size of the problem when there are multiple singular values or
when there are zeros in the Z vector. For each such occurrence the dimension of the secular equation
problem is reduced by one. This stage is performed by the routine ?lasd2.
2. The second stage consists of calculating the updated singular values. This is done by finding the square
roots of the roots of the secular equation via the routine ?lasd4 (as called by ?lasd3). This routine
also calculates the singular vectors of the current problem.
3. The final stage consists of computing the updated singular vectors directly using the updated singular
values. The singular vectors for the current problem are multiplied with the singular vectors from the
overall problem.
Input Parameters
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
iwork INTEGER.
Workspace array, DIMENSION (4n).
Output Parameters
alpha On exit, the diagonal element associated with the added row deflated by
max( abs( alpha ), abs( beta ), abs( D(I) ) ), I = 1,n.
beta On exit, the off-diagonal element associated with the added row deflated by
max( abs( alpha ), abs( beta ), abs( D(I) ) ), I = 1,n.
vt On exit, vt^T contains the right singular vectors of the bidiagonal matrix.
idxq INTEGER
Array, DIMENSION (n). Contains the permutation which will reintegrate the
subproblem just solved back into sorted order, that is, d(idxq( i = 1,
n )) will be in ascending order.
info INTEGER.
If info = 0: successful exit.
?lasd2
Merges the two sets of singular values together into a
single sorted set. Used by ?bdsdc.
Syntax
call slasd2( nl, nr, sqre, k, d, z, alpha, beta, u, ldu, vt, ldvt, dsigma, u2, ldu2,
vt2, ldvt2, idxp, idx, idxc, idxq, coltyp, info )
call dlasd2( nl, nr, sqre, k, d, z, alpha, beta, u, ldu, vt, ldvt, dsigma, u2, ldu2,
vt2, ldvt2, idxp, idx, idxc, idxq, coltyp, info )
Include Files
mkl.fi
Description
The routine ?lasd2 merges the two sets of singular values together into a single sorted set. Then it tries to
deflate the size of the problem. There are two ways in which deflation can occur: when two or more singular
values are close together or if there is a tiny entry in the Z vector. For each such occurrence the order of the
related secular equation problem is reduced by one.
The routine ?lasd2 is called from ?lasd1.
Input Parameters
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
ldu ≥ n.
ldu2 INTEGER. The leading dimension of the output array u2. ldu2 ≥ n.
ldvt2 INTEGER. The leading dimension of the output array vt2. ldvt2 ≥ m.
idxp INTEGER.
Workspace array, DIMENSION (n). This will contain the permutation used to
place deflated values of D at the end of the array. On output idxp(2:k)
points to the nondeflated d-values and idxp(k+1:n) points to the deflated
singular values.
idx INTEGER.
Workspace array, DIMENSION (n). This will contain the permutation used to
sort the contents of d into ascending order.
coltyp INTEGER.
Workspace array, DIMENSION (n). As workspace, this array contains a label
that indicates which of the following types a column in the u2 matrix or a
row in the vt2 matrix is:
1 : non-zero in the upper half only
2 : non-zero in the lower half only
3 : dense
4 : deflated.
idxq INTEGER. Array, DIMENSION (n). This parameter contains the permutation
that separately sorts the two sub-problems in D in the ascending order.
Note that entries in the first half of this permutation must first be moved
one position backwards and entries in the second half must have nl+1
added to their values.
Output Parameters
d On exit, d contains the trailing (n-k) updated singular values (those which
were deflated) sorted into increasing order.
u On exit u contains the trailing (n-k) updated left singular vectors (those
which were deflated) in its last n-k columns.
DOUBLE PRECISION for dlasd2
Array, DIMENSION (n). On exit, z contains the updating row vector in the
secular equation.
vt On exit, vt^T contains the trailing (n-k) updated right singular vectors (those
which were deflated) in its last n-k columns. In case sqre =1, the last row
of vt spans the right null space.
idxc INTEGER. Array, DIMENSION (n). This will contain the permutation used to
arrange the columns of the deflated u matrix into three groups: the first
group contains non-zero entries only at and above nl, the second contains
non-zero entries only below nl+2, and the third is dense.
info INTEGER.
If info = 0: successful exit.
?lasd3
Finds all square roots of the roots of the secular
equation, as defined by the values in D and Z, and
then updates the singular vectors by matrix
multiplication. Used by ?bdsdc.
Syntax
call slasd3( nl, nr, sqre, k, d, q, ldq, dsigma, u, ldu, u2, ldu2, vt, ldvt, vt2,
ldvt2, idxc, ctot, z, info )
call dlasd3( nl, nr, sqre, k, d, q, ldq, dsigma, u, ldu, u2, ldu2, vt, ldvt, vt2,
ldvt2, idxc, ctot, z, info )
Include Files
mkl.fi
Description
The routine ?lasd3 finds all the square roots of the roots of the secular equation, as defined by the values in
D and Z.
It makes the appropriate calls to ?lasd4 and then updates the singular vectors by matrix multiplication.
Input Parameters
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
u2 REAL for slasd3
DOUBLE PRECISION for dlasd3
Array, DIMENSION (ldu2, n).
The first k columns of this matrix contain the non-deflated left singular
vectors for the split problem.
The first k columns of vt2' contain the non-deflated right singular vectors
for the split problem.
ctot INTEGER. Array, DIMENSION (4). A count of the total number of the various
types of columns in u (or rows in vt), as described in idxc.
The fourth column type is any column which has been deflated.
Output Parameters
The last n - k columns of this matrix contain the deflated left singular
vectors.
The last m - k columns of vt' contain the deflated right singular vectors.
z Destroyed on exit.
info INTEGER.
If info = 0: successful exit.
Application Notes
This code makes very mild assumptions about floating point arithmetic. It will work on machines with a guard
digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray XMP, Cray
YMP, Cray C 90, or Cray 2. It could conceivably fail on hexadecimal or decimal machines without guard digits,
but we know of none.
?lasd4
Computes the square root of the i-th updated
eigenvalue of a positive symmetric rank-one
modification to a positive diagonal matrix. Used by
?bdsdc.
Syntax
call slasd4( n, i, d, z, delta, rho, sigma, work, info )
call dlasd4( n, i, d, z, delta, rho, sigma, work, info )
Include Files
mkl.fi
Description
The routine computes the square root of the i-th updated eigenvalue of a positive symmetric rank-one
modification to a positive diagonal matrix whose entries are given as the squares of the corresponding
entries in the array d. It is assumed that 0 ≤ d(i) < d(j) for i < j and that rho > 0. This is arranged by the
calling routine, and is no loss in generality. The rank-one modified system is thus
diag(d)*diag(d) + rho*Z*Z^T,
where the Euclidean norm of Z is equal to 1. The method consists of approximating the rational functions in
the secular equation by simpler interpolating rational functions.
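To make the quantity being computed concrete, the following Python sketch locates the i-th root by plain bisection. It deliberately swaps in bisection for the rational interpolation described above, so it illustrates the problem rather than the routine's method. The function name is hypothetical, and the secular equation is assumed to take the standard form 1 + rho * sum_j z(j)^2 / (d(j)^2 - sigma^2) = 0, with every z(j) nonzero (no deflation), d strictly increasing, and the Euclidean norm of z equal to 1.

```python
import math


def secular_sigma(d, z, rho, i, iters=200):
    """Return sigma, the square root of the i-th (1-based) updated eigenvalue."""
    n = len(d)

    def f(s):
        # Secular function; (dj - s)*(dj + s) avoids cancellation in dj^2 - s^2.
        return 1.0 + rho * sum(zj * zj / ((dj - s) * (dj + s))
                               for dj, zj in zip(d, z))

    # Root i lies between d(i) and d(i+1); the last one lies between d(n)
    # and sqrt(d(n)^2 + rho) because the norm of z is 1.
    lo = d[i - 1]
    hi = d[i] if i < n else math.sqrt(d[-1] ** 2 + rho)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break                     # interval is down to adjacent floats
        if f(mid) > 0.0:
            hi = mid                  # f increases on the interval: root is left
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

For n = 2 the result can be checked against the trace and determinant of diag(d)*diag(d) + rho*Z*Z^T, since the squares of the two roots are its eigenvalues.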
Input Parameters
i INTEGER.
The index of the eigenvalue to be computed. 1 ≤ i ≤ n.
The original eigenvalues. They must be in order, 0 ≤ d(i) < d(j) for i <
j.
If n = 1, then work( 1 ) = 1.
Output Parameters
info INTEGER.
= 0: successful exit
> 0: If info = 1, the updating process failed.
?lasd5
Computes the square root of the i-th eigenvalue of a
positive symmetric rank-one modification of a 2-by-2
diagonal matrix. Used by ?bdsdc.
Syntax
call slasd5( i, d, z, delta, rho, dsigma, work )
call dlasd5( i, d, z, delta, rho, dsigma, work )
Include Files
mkl.fi
Description
The routine computes the square root of the i-th eigenvalue of a positive symmetric rank-one modification of
a 2-by-2 diagonal matrix diag(d)*diag(d) + rho*Z*Z^T.
The diagonal entries in the array d must satisfy 0 ≤ d(i) < d(j) for i < j, rho must be greater than 0, and
the Euclidean norm of the vector Z must be equal to 1.
Input Parameters
Output Parameters
Array, dimension ( 2 ).
?lasd6
Computes the SVD of an updated upper bidiagonal
matrix obtained by merging two smaller ones by
appending a row. Used by ?bdsdc.
Syntax
call slasd6( icompq, nl, nr, sqre, d, vf, vl, alpha, beta, idxq, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, iwork, info )
call dlasd6( icompq, nl, nr, sqre, d, vf, vl, alpha, beta, idxq, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, iwork, info )
Include Files
mkl.fi
Description
The routine ?lasd6 computes the SVD of an updated upper bidiagonal matrix B obtained by merging two
smaller ones by appending a row. This routine is used only for the problem which requires all singular values
and optionally singular vector matrices in factored form. B is an n-by-m matrix with n = nl + nr + 1 and m
= n + sqre. A related subroutine, ?lasd1, handles the case in which all singular values and singular vectors
of the bidiagonal matrix are desired. ?lasd6 computes the SVD as follows:
              ( D1(in)   0     0      0 )
  B = U(in) * (   Z1'    a    Z2'     b ) * VT(in)
              (   0      0   D2(in)   0 )
    = U(out) * ( D(out) 0 ) * VT(out)
where Z' = (Z1' a Z2' b) = u'*VT', and u is a vector of dimension m with alpha and beta in the nl+1
and nl+2-th entries and zeros elsewhere; and the entry b is empty if sqre = 0.
The singular values of B can be computed using D1, D2, the first components of all the right singular vectors
of the lower block, and the last components of all the right singular vectors of the upper block. These
components are stored and updated in vf and vl, respectively, in ?lasd6. Hence U and VT are not explicitly
referenced.
The singular values are stored in D. The algorithm consists of two stages:
1. The first stage consists of deflating the size of the problem when there are multiple singular values or if
there is a zero in the Z vector. For each such occurrence the dimension of the secular equation problem
is reduced by one. This stage is performed by the routine ?lasd7.
2. The second stage consists of calculating the updated singular values. This is done by finding the roots
of the secular equation via the routine ?lasd4 (as called by ?lasd8). This routine also updates vf and
vl and computes the distances between the updated singular values and the old singular values.
?lasd6 is called from ?lasda.
Input Parameters
sqre INTEGER.
= 0: the lower block is an nr-by-nr square matrix.
= 1: the lower block is an nr-by-(nr+1) rectangular matrix.
The bidiagonal matrix has row dimension n=nl+nr+1, and column
dimension m = n + sqre.
beta REAL for slasd6
DOUBLE PRECISION for dlasd6
Contains the off-diagonal element associated with the added row.
ldgcol INTEGER. The leading dimension of the output array givcol, must be at least
n.
ldgnum INTEGER.
The leading dimension of the output arrays givnum and poles, must be at
least n.
iwork INTEGER.
Workspace array, dimension ( 3n ).
Output Parameters
vf On exit, vf contains the first components of all right singular vectors of the
bidiagonal matrix.
vl On exit, vl contains the last components of all right singular vectors of the
bidiagonal matrix.
alpha On exit, the diagonal element associated with the added row deflated by
max(abs(alpha), abs(beta), abs(D(I))), I = 1,n.
beta On exit, the off-diagonal element associated with the added row deflated by
max(abs(alpha), abs(beta), abs(D(I))), I = 1,n.
idxq INTEGER.
Array, dimension (n). This contains the permutation which will reintegrate
the subproblem just solved back into sorted order, that is, d( idxq( i =
1, n ) ) will be in ascending order.
perm INTEGER.
Array, dimension (n). The permutations (from deflation and sorting) to be
applied to each block. Not referenced if icompq = 0.
givptr INTEGER. The number of Givens rotations which took place in this
subproblem. Not referenced if icompq = 0.
givcol INTEGER.
Array, dimension ( ldgcol, 2 ). Each pair of numbers indicates a pair of
columns to take place in a Givens rotation. Not referenced if icompq = 0.
The first elements of this array contain the components of the deflation-
adjusted updating row vector.
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
?lasd7
Merges the two sets of singular values together into a
single sorted set. Then it tries to deflate the size of
the problem. Used by ?bdsdc.
Syntax
call slasd7( icompq, nl, nr, sqre, k, d, z, zw, vf, vfw, vl, vlw, alpha, beta, dsigma,
idx, idxp, idxq, perm, givptr, givcol, ldgcol, givnum, ldgnum, c, s, info )
call dlasd7( icompq, nl, nr, sqre, k, d, z, zw, vf, vfw, vl, vlw, alpha, beta, dsigma,
idx, idxp, idxq, perm, givptr, givcol, ldgcol, givnum, ldgnum, c, s, info )
Include Files
mkl.fi
Description
The routine ?lasd7 merges the two sets of singular values together into a single sorted set. Then it tries to
deflate the size of the problem. There are two ways in which deflation can occur: when two or more singular
values are close together or if there is a tiny entry in the Z vector. For each such occurrence the order of the
related secular equation problem is reduced by one. ?lasd7 is called from ?lasd6.
Input Parameters
sqre INTEGER.
= 0: the lower block is an nr-by-nr square matrix.
= 1: the lower block is an nr-by-(nr+1) rectangular matrix. The bidiagonal
matrix has n = nl + nr + 1 rows and m = n + sqre columns.
Workspace for z.
idx INTEGER.
Workspace array, DIMENSION (n). This will contain the permutation used to
sort the contents of d into ascending order.
idxp INTEGER.
Workspace array, DIMENSION (n). This will contain the permutation used to
place deflated values of d at the end of the array.
idxq INTEGER.
Array, DIMENSION (n).
This contains the permutation which separately sorts the two sub-problems
in d into ascending order. Note that entries in the first half of this
permutation must first be moved one position backward; and entries in the
second half must first have nl+1 added to their values.
ldgcol INTEGER. The leading dimension of the output array givcol, must be at least
n.
ldgnum INTEGER. The leading dimension of the output array givnum, must be at
least n.
Output Parameters
d On exit, d contains the trailing (n-k) updated singular values (those which
were deflated) sorted into increasing order.
vf On exit, vf contains the first components of all right singular vectors of the
bidiagonal matrix.
vl On exit, vl contains the last components of all right singular vectors of the
bidiagonal matrix.
idxp On output, idxp(2: k) points to the nondeflated d-values and idxp( k+1:n)
points to the deflated singular values.
perm INTEGER.
Array, DIMENSION (n).
givptr INTEGER.
The number of Givens rotations which took place in this subproblem. Not
referenced if icompq = 0.
givcol INTEGER.
Array, DIMENSION ( ldgcol, 2 ). Each pair of numbers indicates a pair of
columns to take place in a Givens rotation. Not referenced if icompq = 0.
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
?lasd8
Finds the square roots of the roots of the secular
equation, and stores, for each element in D, the
distance to its two nearest poles. Used by ?bdsdc.
Syntax
call slasd8( icompq, k, d, z, vf, vl, difl, difr, lddifr, dsigma, work, info )
call dlasd8( icompq, k, d, z, vf, vl, difl, difr, lddifr, dsigma, work, info )
Include Files
mkl.fi
Description
The routine ?lasd8 finds the square roots of the roots of the secular equation, as defined by the values in
dsigma and z. It makes the appropriate calls to ?lasd4, and stores, for each element in d, the distance to its
two nearest poles (elements in dsigma). It also updates the arrays vf and vl, the first and last components of
all the right singular vectors of the original bidiagonal matrix. ?lasd8 is called from ?lasd6.
Input Parameters
= 0: Compute singular values only.
= 1: Compute singular vectors in factored form as well.
The first k elements of this array contain the components of the deflation-
adjusted updating row vector.
lddifr INTEGER. The leading dimension of the output array difr, must be at least
k.
The first k elements of this array contain the old roots of the deflated
updating problem. These are the poles of the secular equation.
Output Parameters
z Updated on exit.
dsigma The elements of this array may be very slightly altered in value.
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
?lasd9
Finds the square roots of the roots of the secular
equation, and stores, for each element in D, the
distance to its two nearest poles. Used by ?bdsdc.
Syntax
call slasd9( icompq, ldu, k, d, z, vf, vl, difl, difr, dsigma, work, info )
call dlasd9( icompq, ldu, k, d, z, vf, vl, difl, difr, dsigma, work, info )
Include Files
mkl.fi
Description
The routine ?lasd9 finds the square roots of the roots of the secular equation, as defined by the values in
dsigma and z. It makes the appropriate calls to ?lasd4, and stores, for each element in d, the distance to its
two nearest poles (elements in dsigma). It also updates the arrays vf and vl, the first and last components of
all the right singular vectors of the original bidiagonal matrix. ?lasd9 is called from ?lasd7.
Input Parameters
If icompq = 1, compute singular vector matrices in factored form also.
The first k elements of this array contain the old roots of the deflated
updating problem. These are the poles of the secular equation.
Output Parameters
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
?lasda
Computes the singular value decomposition (SVD) of a
real upper bidiagonal matrix with diagonal d and off-
diagonal e. Used by ?bdsdc.
Syntax
call slasda( icompq, smlsiz, n, sqre, d, e, u, ldu, vt, k, difl, difr, z, poles,
givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
call dlasda( icompq, smlsiz, n, sqre, d, e, u, ldu, vt, k, difl, difr, z, poles,
givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
Include Files
mkl.fi
Description
Using a divide and conquer approach, ?lasda computes the singular value decomposition (SVD) of a real
upper bidiagonal n-by-m matrix B with diagonal d and off-diagonal e, where m = n + sqre.
The algorithm computes the singular values in the SVD B = U*S*VT. The orthogonal matrices U and VT are
optionally computed in compact form. A related subroutine ?lasd0 computes the singular values and the
singular vectors in explicit form.
Input Parameters
icompq INTEGER.
Specifies whether singular vectors are to be computed in compact form, as
follows:
= 0: Compute singular values only.
= 1: Compute singular vectors of upper bidiagonal matrix in compact form.
smlsiz INTEGER.
The maximum size of the subproblems at the bottom of the computation
tree.
n INTEGER. The row dimension of the upper bidiagonal matrix. This is also the
dimension of the main diagonal array d.
ldu INTEGER. The leading dimension of arrays u, vt, difl, difr, poles, givnum,
and z. ldu ≥ n.
ldgcol INTEGER. The leading dimension of arrays givcol and perm. ldgcol ≥ n.
iwork INTEGER.
Workspace array, Dimension must be at least (7n).
Output Parameters
k INTEGER.
Array, DIMENSION (n) if icompq = 1 and
givcol INTEGER.
Array, DIMENSION(ldgcol, 2*nlvl) if icompq = 1, and not referenced if
icompq = 0. If icompq = 1, on exit, for each i, givcol(1, 2 i - 1) and
givcol(1, 2 i) record the locations of Givens rotations performed on the i-th
level on the computation tree.
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
> 0: If info = 1, a singular value did not converge.
?lasdq
Computes the SVD of a real bidiagonal matrix with
diagonal d and off-diagonal e. Used by ?bdsdc.
Syntax
call slasdq( uplo, sqre, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work,
info )
call dlasdq( uplo, sqre, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work,
info )
Include Files
mkl.fi
Description
The routine ?lasdq computes the singular value decomposition (SVD) of a real (upper or lower) bidiagonal
matrix with diagonal d and off-diagonal e, accumulating the transformations if desired. If B is the input
bidiagonal matrix, the algorithm computes orthogonal matrices Q and P such that B = Q*S*P^T. The singular
values S are overwritten on d.
The input matrix U is changed to U*Q if desired.
Input Parameters
uplo CHARACTER*1. On entry, uplo specifies whether the input bidiagonal matrix
is upper or lower bidiagonal.
If uplo = 'U' or 'u', B is upper bidiagonal;
sqre INTEGER.
= 0: then the input matrix is n-by-n.
= 1: then the input matrix is n-by-(n+1) if uplo = 'U' and (n+1)-by-n if
uplo = 'L'. The bidiagonal matrix has n = nl + nr + 1 rows and m = n +
sqre columns.
ncvt INTEGER. On entry, ncvt specifies the number of columns of the matrix VT.
ncvt must be at least 0.
nru INTEGER. On entry, nru specifies the number of rows of the matrix U. nru
must be at least 0.
ncc INTEGER. On entry, ncc specifies the number of columns of the matrix C.
ncc must be at least 0.
Array, DIMENSION is (n-1) if sqre = 0 and n if sqre = 1. On entry, the
entries of e contain the off-diagonal entries of the bidiagonal matrix.
Output Parameters
e On normal exit, e will contain 0. If the algorithm does not converge, d and e
will contain the diagonal and superdiagonal entries of a bidiagonal matrix
orthogonally equivalent to the one given as input.
?lasdt
Creates a tree of subproblems for bidiagonal divide
and conquer. Used by ?bdsdc.
Syntax
call slasdt( n, lvl, nd, inode, ndiml, ndimr, msub )
call dlasdt( n, lvl, nd, inode, ndiml, ndimr, msub )
Include Files
mkl.fi
Description
The routine creates a tree of subproblems for bidiagonal divide and conquer.
Input Parameters
msub INTEGER. On entry, the maximum row dimension that each subproblem at the
bottom of the tree can have.
Output Parameters
inode INTEGER.
Array, DIMENSION (n). On exit, centers of subproblems.
ndiml INTEGER .
Array, DIMENSION (n). On exit, row dimensions of left children.
ndimr INTEGER .
Array, DIMENSION (n). On exit, row dimensions of right children.
?laset
Initializes the off-diagonal elements and the diagonal
elements of a matrix to given values.
Syntax
call slaset( uplo, m, n, alpha, beta, a, lda )
call dlaset( uplo, m, n, alpha, beta, a, lda )
call claset( uplo, m, n, alpha, beta, a, lda )
call zlaset( uplo, m, n, alpha, beta, a, lda )
Include Files
mkl.fi
Description
The routine initializes an m-by-n matrix A to beta on the diagonal and alpha on the off-diagonals.
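The effect described above can be sketched in a few lines of plain Python. This is only an illustration of the result for the full-matrix case, not the MKL implementation; the helper name laset_full is hypothetical and the uplo argument is ignored here.

```python
def laset_full(m, n, alpha, beta):
    """Return an m-by-n matrix (list of rows) with beta on the diagonal
    and alpha on every off-diagonal position."""
    return [[beta if i == j else alpha for j in range(n)] for i in range(m)]


# A 2-by-3 matrix with ones on the diagonal and zeros elsewhere.
a = laset_full(2, 3, 0.0, 1.0)
```

Calling it with alpha = 0 and beta = 1 gives the leading part of an identity matrix, which is a common use of ?laset.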
Input Parameters
The data types are given for the Fortran interface.
Output Parameters
?lasq1
Computes the singular values of a real square
bidiagonal matrix. Used by ?bdsqr.
Syntax
call slasq1( n, d, e, work, info )
call dlasq1( n, d, e, work, info )
Include Files
mkl.fi
Description
The routine ?lasq1 computes the singular values of a real n-by-n bidiagonal matrix Z with diagonal d and
off-diagonal e. The singular values are computed to high relative accuracy, in the absence of
denormalization, underflow and overflow.
Input Parameters
Output Parameters
e On exit, e is overwritten.
info INTEGER.
= 0: successful exit;
< 0: if info = -i, the i-th argument had an illegal value;
?lasq2
Computes all the eigenvalues of the symmetric
positive definite tridiagonal matrix associated with the
quotient difference array z to high relative accuracy.
Used by ?bdsqr and ?stegr.
Syntax
call slasq2( n, z, info )
call dlasq2( n, z, info )
Include Files
mkl.fi
Description
The routine ?lasq2 computes all the eigenvalues of the symmetric positive definite tridiagonal matrix
associated with the quotient difference array z to high relative accuracy, in the absence of denormalization,
underflow and overflow.
To see the relation of z to the tridiagonal matrix, let L be a unit lower bidiagonal matrix with subdiagonals
z(2,4,6,...) and let U be an upper bidiagonal matrix with 1's above and diagonal z(1,3,5,...). The tridiagonal
is LU or, if you prefer, the symmetric tridiagonal to which it is similar.
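The relation between z and the tridiagonal matrix can be made concrete with a small Python sketch. It assumes that z is passed as a short list z(1), z(2), ..., z(2n-1) holding only the interleaved entries named above (the routine's actual work array is larger), and the function name qd_to_tridiagonal is hypothetical.

```python
def qd_to_tridiagonal(z):
    """Build T = L*U from interleaved qd data z = [z(1), z(2), ..., z(2n-1)]."""
    q = z[0::2]   # z(1), z(3), z(5), ...: diagonal of U
    e = z[1::2]   # z(2), z(4), z(6), ...: subdiagonal of L
    n = len(q)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        upper[i][i] = q[i]
        if i + 1 < n:
            upper[i][i + 1] = 1.0      # 1's above the diagonal of U
            lower[i + 1][i] = e[i]     # subdiagonal of the unit lower L
    # Plain triple-loop product T = L * U.
    return [[sum(lower[i][k] * upper[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]


t = qd_to_tridiagonal([2.0, 3.0, 5.0])
```

For this input, L has subdiagonal 3 and U has diagonal (2, 5), so the product has diagonal (2, 8), superdiagonal 1 and subdiagonal 6.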
Input Parameters
Output Parameters
info INTEGER.
= 0: successful exit;
< 0: if the i-th argument is a scalar and had an illegal value, then info =
-i; if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j);
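A caller can unpack the negative info encoding described above as follows. This is a minimal Python sketch; the function name decode_negative_info is hypothetical, and it assumes scalar argument positions are below 100 so that the two encodings cannot collide.

```python
def decode_negative_info(info):
    """Split a negative info value into (kind, argument index, entry index)."""
    if info >= 0:
        raise ValueError("only negative info values are decoded here")
    code = -info
    if code < 100:
        # info = -i: the i-th (scalar) argument had an illegal value.
        return ("scalar", code, None)
    # info = -(i*100 + j): entry j of the i-th (array) argument was illegal.
    i, j = divmod(code, 100)
    return ("array", i, j)
```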
Application Notes
The routine ?lasq2 defines a logical variable, ieee, which is .TRUE. on machines that follow the IEEE-754
floating-point standard in their handling of infinities and NaNs, and .FALSE. otherwise. This variable is
passed to ?lasq3.
?lasq3
Checks for deflation, computes a shift and calls dqds.
Used by ?bdsqr.
Syntax
call slasq3( i0, n0, z, pp, dmin, sigma, desig, qmax, nfail, iter, ndiv, ieee, ttype,
dmin1, dmin2, dn, dn1, dn2, g, tau )
call dlasq3( i0, n0, z, pp, dmin, sigma, desig, qmax, nfail, iter, ndiv, ieee, ttype,
dmin1, dmin2, dn, dn1, dn2, g, tau )
Include Files
mkl.fi
Description
The routine ?lasq3 checks for deflation, computes a shift tau, and calls dqds. In case of failure, it changes
shifts, and tries again until output is positive.
Input Parameters
z REAL for slasq3
DOUBLE PRECISION for dlasq3.
Array, DIMENSION (4n). z holds the qd array.
pp INTEGER. pp=0 for ping, pp=1 for pong. pp=2 indicates that flipping was
applied to the Z array and that the initial tests for deflation should not be
performed.
ieee LOGICAL.
Flag for IEEE or non-IEEE arithmetic (passed to ?lasq5).
ttype INTEGER.
Shift type.
Output Parameters
pp INTEGER. pp=0 for ping, pp=1 for pong. pp=2 indicates that flipping was
applied to the Z array and that the initial tests for deflation should not be
performed.
ttype INTEGER.
Shift type.
?lasq4
Computes an approximation to the smallest
eigenvalue using values of d from the previous
transform. Used by ?bdsqr.
Syntax
call slasq4( i0, n0, z, pp, n0in, dmin, dmin1, dmin2, dn, dn1, dn2, tau, ttype, g )
call dlasq4( i0, n0, z, pp, n0in, dmin, dmin1, dmin2, dn, dn1, dn2, tau, ttype, g )
Include Files
mkl.fi
Description
The routine computes an approximation tau to the smallest eigenvalue using values of d from the previous
transform.
Input Parameters
dmin2 REAL for slasq4
DOUBLE PRECISION for dlasq4.
Minimum value of d, excluding d(n0) and d(n0-1).
Output Parameters
?lasq5
Computes one dqds transform in ping-pong form.
Used by ?bdsqr and ?stegr.
Syntax
call slasq5( i0, n0, z, pp, tau, sigma, dmin, dmin1, dmin2, dn, dnm1, dnm2, ieee,
eps )
call dlasq5( i0, n0, z, pp, tau, sigma, dmin, dmin1, dmin2, dn, dnm1, dnm2, ieee,
eps )
Include Files
mkl.fi
Description
The routine computes one dqds transform in ping-pong form: one version for IEEE machines, another for
non-IEEE machines.
Input Parameters
Output Parameters
DOUBLE PRECISION for dlasq5. Contains d(n0-1).
?lasq6
Computes one dqd transform in ping-pong form. Used
by ?bdsqr and ?stegr.
Syntax
call slasq6( i0, n0, z, pp, dmin, dmin1, dmin2, dn, dnm1, dnm2 )
call dlasq6( i0, n0, z, pp, dmin, dmin1, dmin2, dn, dnm1, dnm2 )
Include Files
mkl.fi
Description
The routine ?lasq6 computes one dqd (shift equal to zero) transform in ping-pong form, with protection
against underflow and overflow.
Input Parameters
Output Parameters
?lasr
Applies a sequence of plane rotations to a general
rectangular matrix.
Syntax
call slasr( side, pivot, direct, m, n, c, s, a, lda )
call dlasr( side, pivot, direct, m, n, c, s, a, lda )
call clasr( side, pivot, direct, m, n, c, s, a, lda )
call zlasr( side, pivot, direct, m, n, c, s, a, lda )
Include Files
mkl.fi
Description
The routine applies a sequence of plane rotations to a real/complex matrix A, from the left or the right.
A := P*A, when side = 'L' ( Left-hand side )
A := A*P', when side = 'R' ( Right-hand side )
where P is an orthogonal matrix consisting of a sequence of plane rotations with z = m when side = 'L'
and z = n when side = 'R'. When direct = 'F' (Forward sequence), then
P = P(z-1)*...*P(2)*P(1),
and when direct = 'B' (Backward sequence), then
P = P(1)*P(2)*...*P(z-1),
where P(k) is a plane rotation matrix defined by the 2-by-2 plane rotation:
R(k) = [  c(k)  s(k) ]
       [ -s(k)  c(k) ]
When pivot = 'V' ( Variable pivot ), the rotation is performed for the plane (k, k + 1), that is, P(k) is the
identity matrix with R(k) placed in rows and columns k and k+1, so that R(k) appears as a rank-2
modification to the identity matrix in rows and columns k and k+1.
When pivot = 'T' ( Top pivot ), the rotation is performed for the plane (1,k+1), so R(k) appears in rows and
columns 1 and k+1 of P(k).
When pivot = 'B' ( Bottom pivot ), the rotation is performed for the plane (k,z), so R(k) appears in rows and
columns k and z of P(k).
The rotations are performed without ever forming P(k) explicitly.
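The following Python sketch illustrates one of the cases above: side = 'L', pivot = 'V', direct = 'F'. It assumes the 2-by-2 rotation R(k) = [ c(k) s(k) ; -s(k) c(k) ] and updates rows k and k+1 directly, so P(k) is never formed. The function name is hypothetical and the code is not the MKL implementation.

```python
def lasr_left_variable_forward(c, s, a):
    """Apply A := P(m-1)*...*P(2)*P(1)*A in place, where P(k) rotates
    rows k and k+1 (1-based) with cosine c(k) and sine s(k)."""
    m = len(a)
    for k in range(m - 1):              # P(1) is applied first
        ck, sk = c[k], s[k]
        top, bottom = a[k], a[k + 1]
        a[k] = [ck * x + sk * y for x, y in zip(top, bottom)]
        a[k + 1] = [-sk * x + ck * y for x, y in zip(top, bottom)]
    return a


# A quarter-turn (c = 0, s = 1) moves row 2 into row 1 and negated row 1 into row 2.
rotated = lasr_left_variable_forward([0.0], [1.0], [[1.0, 2.0], [3.0, 4.0]])
```

The other side, pivot and direct combinations differ only in which pair of rows or columns is rotated and in the order of the loop.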
Input Parameters
pivot CHARACTER*1. Specifies the plane for which P(k) is a plane rotation matrix.
= 'V': Variable pivot, the plane (k, k+1)
c(k) and s(k) contain the cosine and sine of the plane rotations respectively
that define the 2-by-2 plane rotation part (R(k)) of the P(k) matrix as
described above in Description.
Output Parameters
?lasrt
Sorts numbers in increasing or decreasing order.
Syntax
call slasrt( id, n, d, info )
call dlasrt( id, n, d, info )
Include Files
mkl.fi
Description
The routine ?lasrt sorts the numbers in d in increasing order (if id = 'I') or in decreasing order (if id =
'D'). It uses Quick Sort, reverting to Insertion Sort on arrays of size ≤ 20. The dimension of the internal
stack limits n to about 2^32.
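The hybrid strategy mentioned above (quicksort with an explicit stack, falling back to insertion sort on short segments) can be sketched in Python as follows. The pivot rule, the cutoff parameter and the reversal used for id = 'D' are simplifying assumptions of this sketch, not a description of the MKL code.

```python
def lasrt_sketch(id_, d, cutoff=20):
    """Sort list d in place: id_ = 'I' for increasing, 'D' for decreasing."""
    if id_ not in ("I", "D"):
        raise ValueError("id must be 'I' or 'D'")
    stack = [(0, len(d) - 1)]           # explicit stack of (lo, hi) segments
    while stack:
        lo, hi = stack.pop()
        if hi - lo + 1 <= cutoff:
            # Insertion sort on the short segment d[lo..hi].
            for i in range(lo + 1, hi + 1):
                x, j = d[i], i - 1
                while j >= lo and d[j] > x:
                    d[j + 1] = d[j]
                    j -= 1
                d[j + 1] = x
        else:
            # Median-of-three pivot, then a Hoare-style partition.
            pivot = sorted((d[lo], d[(lo + hi) // 2], d[hi]))[1]
            i, j = lo, hi
            while i <= j:
                while d[i] < pivot:
                    i += 1
                while d[j] > pivot:
                    j -= 1
                if i <= j:
                    d[i], d[j] = d[j], d[i]
                    i += 1
                    j -= 1
            stack.append((lo, j))
            stack.append((i, hi))
    if id_ == "D":
        d.reverse()                     # sketch only: sort ascending, then reverse
    return d
```

The explicit stack is what bounds the problem size in a non-recursive implementation, which is the reason the description mentions a stack dimension.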
Input Parameters
The data types are given for the Fortran interface.
id CHARACTER*1.
(d(1) ≤ ... ≤ d(n)) or into decreasing order
(d(1) ≥ ... ≥ d(n)), depending on id.
Output Parameters
?lassq
Updates a sum of squares represented in scaled form.
Syntax
call slassq( n, x, incx, scale, sumsq )
call dlassq( n, x, incx, scale, sumsq )
call classq( n, x, incx, scale, sumsq )
call zlassq( n, x, incx, scale, sumsq )
Include Files
mkl.fi
Description
The real routines slassq/dlassq return the values scl and smsq such that
scl^2 * smsq = x(1)^2 + ... + x(n)^2 + scale^2 * sumsq,
where x(i) denotes the vector element x(1 + (i-1)*incx).
The value of sumsq is assumed to be non-negative and scl returns the value
scl = max( scale, abs(x(i))).
Values scale and sumsq must be supplied in scale and sumsq, and scl and smsq are overwritten on scale and
sumsq, respectively.
The complex routines classq/zlassq return the values scl and ssq such that
scl^2 * ssq = x(1)^2 + ... + x(n)^2 + scale^2 * sumsq,
where x(i) denotes abs(x(1 + (i-1)*incx)).
The value of sumsq is assumed to be at least unity and the value of ssq will then satisfy
1.0 ≤ ssq ≤ sumsq + 2n.
scale is assumed to be non-negative and scl returns the value
scl = max( scale, abs(real(x(i))), abs(aimag(x(i)))).
Values scale and sumsq must be supplied in scale and sumsq, and scl and ssq are overwritten on scale and
sumsq, respectively.
All routines ?lassq make only one pass through the vector x.
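A one-pass update of a scaled sum of squares, for the real case, can be sketched in Python as below. It maintains the invariant scale^2 * sumsq = (sum of squares seen so far) and rescales whenever a larger element appears. This follows the classic textbook formulation and assumes a contiguous vector (incx = 1); the function name is hypothetical and the MKL implementation may differ.

```python
import math


def lassq_sketch(x, scale, sumsq):
    """Return updated (scale, sumsq) after folding in the real vector x."""
    for xi in x:
        if xi != 0.0:
            absxi = abs(xi)
            if scale < absxi:
                # New largest magnitude: rescale the accumulated sum.
                sumsq = 1.0 + sumsq * (scale / absxi) ** 2
                scale = absxi
            else:
                sumsq += (absxi / scale) ** 2
    return scale, sumsq


# Starting from scale = 0, sumsq = 1 gives the 2-norm as scale*sqrt(sumsq).
scale, sumsq = lassq_sketch([3.0, 4.0], 0.0, 1.0)
norm = scale * math.sqrt(sumsq)
```

Keeping the sum in scaled form is what lets a norm be accumulated without overflow or underflow in the intermediate squares.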
Input Parameters
incx INTEGER. The increment between successive values of the vector x. incx >
0.
sumsq REAL for slassq/classq
DOUBLE PRECISION for dlassq/zlassq.
On entry, the value sumsq in the equation above.
Output Parameters
scale On exit, scale is overwritten with scl, the scaling factor for the sum of
squares.
?lasv2
Computes the singular value decomposition of a 2-
by-2 triangular matrix.
Syntax
call slasv2( f, g, h, ssmin, ssmax, snr, csr, snl, csl )
call dlasv2( f, g, h, ssmin, ssmax, snr, csr, snl, csl )
Include Files
mkl.fi
Description
The routine ?lasv2 computes the singular value decomposition of a 2-by-2 triangular matrix
[ f  g ]
[ 0  h ]
On return, abs(ssmax) is the larger singular value, abs(ssmin) is the smaller singular value, and (csl,snl) and
(csr,snr) are the left and right singular vectors for abs(ssmax), giving the decomposition
[  csl  snl ] [ f  g ] [ csr  -snr ]   [ ssmax    0   ]
[ -snl  csl ] [ 0  h ] [ snr   csr ] = [   0    ssmin ]
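As a cross-check on the singular values only, the Python sketch below computes them for the upper triangular matrix [ f g ; 0 h ] from the trace and determinant of its Gram matrix. This textbook formula is not the algorithm used by ?lasv2 (which is designed to avoid overflow and loss of accuracy) and it does not produce the rotation pairs; the function name is hypothetical.

```python
import math


def singular_values_2x2_upper(f, g, h):
    """Return (smin, smax), the singular values of [[f, g], [0, h]]."""
    t = f * f + g * g + h * h           # trace of A^T * A
    d = abs(f * h)                      # |det A| = smin * smax
    disc = math.sqrt(max(t * t - 4.0 * d * d, 0.0))
    smax = math.sqrt((t + disc) / 2.0)
    smin = d / smax if smax > 0.0 else 0.0
    return smin, smax


smin, smax = singular_values_2x2_upper(3.0, 0.0, 4.0)
```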
Input Parameters
Output Parameters
Application Notes
Any input parameter may be aliased with any output parameter.
Barring over/underflow and assuming a guard digit in subtraction, all output quantities are correct to within a
few units in the last place (ulps).
In IEEE arithmetic, the code works correctly if one matrix element is infinite. Overflow will not occur unless
the largest singular value itself overflows or is within a few ulps of overflow. (On machines with partial
overflow, like the Cray, overflow may occur if the largest singular value is within a factor of 2 of overflow.)
Underflow is harmless if underflow is gradual. Otherwise, results may correspond to a matrix modified by
perturbations of size near the underflow threshold.
?laswp
Performs a series of row interchanges on a general
rectangular matrix.
Syntax
call slaswp( n, a, lda, k1, k2, ipiv, incx )
call dlaswp( n, a, lda, k1, k2, ipiv, incx )
call claswp( n, a, lda, k1, k2, ipiv, incx )
call zlaswp( n, a, lda, k1, k2, ipiv, incx )
Include Files
mkl.fi
Description
The routine performs a series of row interchanges on the matrix A. One row interchange is initiated for each
of rows k1 through k2 of A.
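The interchange loop can be sketched in Python for the simplest case, incx = 1, with 1-based k1, k2 and ipiv values as in the Fortran interface. The function name is hypothetical, and a negative incx (which applies the interchanges in reverse order) is not covered.

```python
def laswp_sketch(a, k1, k2, ipiv):
    """Swap row i with row ipiv(i) for i = k1, ..., k2 (1-based), in place."""
    for i in range(k1, k2 + 1):
        p = ipiv[i - 1]
        if p != i:
            a[i - 1], a[p - 1] = a[p - 1], a[i - 1]
    return a


# Row 1 is swapped with row 3, then row 2 with (the new) row 3.
rows = laswp_sketch([[1.0], [2.0], [3.0]], 1, 2, [3, 3])
```

Because the swaps are applied in sequence, a later interchange acts on rows that may already have been moved by an earlier one.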
Input Parameters
The data types are given for the Fortran interface.
n INTEGER. The number of columns of the matrix A.
k1 INTEGER. The first element of ipiv for which a row interchange will be done.
k2 INTEGER. The last element of ipiv for which a row interchange will be done.
ipiv INTEGER.
Array, size (k1+(k2-k1)*|incx|).
Output Parameters
?lasy2
Solves the Sylvester matrix equation where the
matrices are of order 1 or 2.
Syntax
call slasy2( ltranl, ltranr, isgn, n1, n2, tl, ldtl, tr, ldtr, b, ldb, scale, x, ldx,
xnorm, info )
call dlasy2( ltranl, ltranr, isgn, n1, n2, tl, ldtl, tr, ldtr, b, ldb, scale, x, ldx,
xnorm, info )
Include Files
mkl.fi
Description
The routine solves for the n1-by-n2 matrix X, 1 ≤ n1, n2 ≤ 2, in
op(TL)*X + isgn*X*op(TR) = scale*B,
where
TL is n1-by-n1,
TR is n2-by-n2,
B is n1-by-n2,
and isgn = 1 or -1. op(T) = T or T^T, where T^T denotes the transpose of T.
Input Parameters
ltranl LOGICAL.
On entry, ltranl specifies the op(TL):
ltranr LOGICAL.
On entry, ltranr specifies the op(TR):
= .FALSE., op(TR) = TR,
isgn INTEGER. On entry, isgn specifies the sign of the equation as described
before. isgn may only be 1 or -1.
ldb ≥ max(1,n1).
Output Parameters
info INTEGER.
= 0: successful exit.
= 1: TL and TR have too close eigenvalues, so TL or TR is perturbed to get a nonsingular equation.
NOTE
For higher speed, this routine does not check the inputs for errors.
?lasyf
Computes a partial factorization of a symmetric
matrix, using the diagonal pivoting method.
Syntax
call slasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call dlasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call clasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Include Files
mkl.fi
Description
The routine ?lasyf computes a partial factorization of a symmetric matrix A using the Bunch-Kaufman
diagonal pivoting method. The partial factorization has the form:
If uplo = 'U':
If uplo = 'L':
This is an auxiliary routine called by ?sytrf. It uses blocked code (calling Level 3 BLAS) to update the
submatrix A11 (if uplo = 'U') or A22 (if uplo = 'L').
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored:
= 'U': Upper triangular
Output Parameters
ipiv INTEGER. Array, DIMENSION (n ). Details of the interchanges and the block
structure of D.
If uplo = 'U', only the last kb elements of ipiv are set;
If ipiv(k) > 0, then rows and columns k and ipiv(k) were interchanged
and D(k, k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, then rows and columns
k-1 and -ipiv(k) were interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, then rows and columns k
+1 and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
info INTEGER.
= 0: successful exit
> 0: if info = k, D(k, k) is exactly zero. The factorization has been
completed, but the block diagonal matrix D is exactly singular.
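The ipiv conventions described above for uplo = 'L' can be decoded with a short Python sketch. It scans the first kb entries (assuming, for this illustration, that these are the entries set when uplo = 'L') and reports each diagonal block of D with its size and interchange target. The function name and the tuple layout are hypothetical.

```python
def pivot_blocks_lower(ipiv, kb):
    """Return a list of (k, block_size, swapped_with) for uplo = 'L',
    using 1-based indices as in the Fortran interface."""
    blocks = []
    k = 1
    while k <= kb:
        if ipiv[k - 1] > 0:
            # 1-by-1 block: rows and columns k and ipiv(k) were interchanged.
            blocks.append((k, 1, ipiv[k - 1]))
            k += 1
        else:
            # ipiv(k) = ipiv(k+1) < 0: 2-by-2 block D(k:k+1, k:k+1);
            # rows and columns k+1 and -ipiv(k) were interchanged.
            blocks.append((k, 2, -ipiv[k - 1]))
            k += 2
    return blocks


blocks = pivot_blocks_lower([1, -3, -3, 4], 4)
```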
?lasyf_rook
Computes a partial factorization of a real/complex
symmetric matrix, using the bounded Bunch-Kaufman
diagonal pivoting method.
Syntax
call slasyf_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call dlasyf_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call clasyf_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlasyf_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Include Files
mkl.fi
Description
The routine ?lasyf_rook computes a partial factorization of a real/complex symmetric matrix A using the bounded
Bunch-Kaufman ("rook") diagonal pivoting method. The partial factorization has the form:
This is an auxiliary routine called by ?sytrf_rook. It uses blocked code (calling Level 3 BLAS) to update the
submatrix A11 (if uplo = 'U') or A22 (if uplo = 'L').
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored:
= 'U': Upper triangular
Output Parameters
a On exit, a contains details of the partial factorization.
ipiv INTEGER. Array, DIMENSION (n ). Details of the interchanges and the block
structure of D.
If uplo = 'U', only the last kb elements of ipiv are set;
If ipiv(k) > 0, then rows and columns k and ipiv(k) were interchanged
and D(k, k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) < 0 and ipiv(k - 1) < 0, then rows and
columns k and -ipiv(k) were interchanged, rows and columns k - 1 and
-ipiv(k - 1) were interchanged, and D(k-1:k, k-1:k) is a 2-by-2 diagonal block.
If uplo = 'L' and ipiv(k) < 0 and ipiv(k + 1) < 0, then rows and
columns k and -ipiv(k) were interchanged, rows and columns k + 1 and
-ipiv(k + 1) were interchanged, and D(k:k+1, k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
= 0: successful exit
> 0: if info = k, D(k, k) is exactly zero. The factorization has been
completed, but the block diagonal matrix D is exactly singular.
?lahef
Computes a partial factorization of a complex
Hermitian indefinite matrix, using the diagonal
pivoting method.
Syntax
call clahef( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlahef( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Include Files
mkl.fi
Description
The routine ?lahef computes a partial factorization of a complex Hermitian matrix A, using the Bunch-
Kaufman diagonal pivoting method. The partial factorization has the form:
If uplo = 'U':
If uplo = 'L':
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is stored:
= 'U': upper triangular
Output Parameters
ipiv INTEGER.
Array, DIMENSION (n ). Details of the interchanges and the block structure
of D.
If uplo = 'U', only the last kb elements of ipiv are set;
If ipiv(k) > 0, then rows and columns k and ipiv(k) are interchanged and
D(k, k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, then rows and columns
k-1 and -ipiv(k) are interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, then rows and columns k
+1 and -ipiv(k) are interchanged and D( k:k+1, k:k+1) is a 2-by-2
diagonal block.
info INTEGER.
= 0: successful exit
> 0: if info = k, D(k, k) is exactly zero. The factorization has been
completed, but the block diagonal matrix D is exactly singular.
?lahef_rook
Computes a partial factorization of a complex
Hermitian indefinite matrix, using the bounded Bunch-
Kaufman diagonal pivoting method.
Syntax
call clahef_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlahef_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Include Files
mkl.fi
Description
The routine ?lahef_rook computes a partial factorization of a complex Hermitian matrix A, using the
bounded Bunch-Kaufman ("rook") diagonal pivoting method. The partial factorization has the form:
If uplo = 'U':
If uplo = 'L':
This is an auxiliary routine called by ?hetrf_rook. It uses blocked code (calling Level 3 BLAS) to update the
submatrix A11 (if uplo = 'U') or A22 (if uplo = 'L').
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is stored:
= 'U': upper triangular
Output Parameters
ipiv INTEGER.
Array, DIMENSION (n ). Details of the interchanges and the block structure
of D.
If uplo = 'U', only the last kb elements of ipiv are set;
If ipiv(k) > 0, then rows and columns k and ipiv(k) are interchanged and
D(k, k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) < 0 and ipiv(k-1) < 0, then rows and
columns k and -ipiv(k) are interchanged, rows and columns k - 1 and
-ipiv(k - 1) are interchanged, and D(k-1:k, k-1:k) is a 2-by-2 diagonal block.
If uplo = 'L' and ipiv(k) < 0 and ipiv(k + 1) < 0, then rows and
columns k and -ipiv(k) are interchanged, rows and columns k + 1 and
-ipiv(k + 1) are interchanged, and D(k:k+1, k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
= 0: successful exit
> 0: if info = k, D(k, k) is exactly zero. The factorization has been
completed, but the block diagonal matrix D is exactly singular.
?latbs
Solves a triangular banded system of equations.
Syntax
call slatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm, info )
call dlatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm, info )
call clatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm, info )
call zlatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm, info )
Include Files
mkl.fi
Description
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': upper triangular
trans CHARACTER*1.
Specifies the operation applied to A.
= 'N': solve A*x = s*b (no transpose)
diag CHARACTER*1.
Specifies whether the matrix A is unit triangular
= 'N': non-unit triangular
normin CHARACTER*1.
Specifies whether cnorm is set.
= 'Y': cnorm contains the column norms on entry;
= 'N': cnorm is not set on entry. On exit, the norms are computed and
stored in cnorm.
The upper or lower triangular band matrix A, stored in the first kd+1 rows
of the array. The j-th column of A is stored in the j-th column of the array
ab as follows:
if uplo = 'U', ab(kd+1+i-j,j) = A(i,j) for max(1, j-kd) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = A(i,j) for j ≤ i ≤ min(n, j+kd).
If trans = 'N', cnorm(j) must be greater than or equal to the infinity-
norm, and if trans = 'T' or 'C', cnorm(j) must be greater than or equal
to the 1-norm.
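The band storage rule for uplo = 'U' given above, ab(kd+1+i-j, j) = A(i,j) for max(1, j-kd) ≤ i ≤ j, can be illustrated by packing a small dense upper triangular matrix in Python. The function name is hypothetical; positions of ab that the rule does not touch are left as 0.0 in this sketch.

```python
def pack_upper_band(a, kd):
    """Pack an n-by-n upper triangular band matrix (list of rows) into
    (kd+1)-by-n band storage, following the uplo = 'U' rule."""
    n = len(a)
    ab = [[0.0] * n for _ in range(kd + 1)]
    for j in range(1, n + 1):                       # 1-based column index
        for i in range(max(1, j - kd), j + 1):      # rows inside the band
            ab[kd + 1 + i - j - 1][j - 1] = a[i - 1][j - 1]
    return ab


dense = [[1.0, 2.0, 0.0],
         [0.0, 3.0, 4.0],
         [0.0, 0.0, 5.0]]
band = pack_upper_band(dense, 1)
```

With kd = 1 the diagonal lands in the last row of ab and the superdiagonal in the row above it, shifted one column to the right.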
Output Parameters
cnorm If normin = 'N', cnorm is an output argument and cnorm(j) returns the
1-norm of the off-diagonal part of the j-th column of A.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
?latm1
Computes the entries of a matrix as specified.
Syntax
call slatm1( mode, cond, irsign, idist, iseed, d, n, info )
call dlatm1( mode, cond, irsign, idist, iseed, d, n, info )
call clatm1( mode, cond, irsign, idist, iseed, d, n, info )
call zlatm1( mode, cond, irsign, idist, iseed, d, n, info )
Include Files
mkl.fi
Description
The ?latm1 routine computes the entries of D(1..n) as specified by mode, cond and irsign. idist and
iseed determine the generation of random numbers.
?latm1 is called by slatmr (for slatm1 and dlatm1), and by clatmr (for clatm1 and zlatm1) to generate
random test matrices for LAPACK programs.
Input Parameters
irsign INTEGER.
On entry, if mode is not -6, 0, or 6, determines sign of entries of d.
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1)
For clatm1 and zlatm1:
Array, size n.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info = -2, mode is neither -6, 0 nor 6, and irsign is neither 0 nor 1.
If info = -3, mode is neither -6, 0 nor 6 and cond is less than 1.
?latm2
Returns an entry of a random matrix.
Syntax
res = slatm2( m, n, i, j, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng, iwork,
sparse )
res = dlatm2( m, n, i, j, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng, iwork,
sparse )
res = clatm2( m, n, i, j, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng, iwork,
sparse )
res = zlatm2( m, n, i, j, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng, iwork,
sparse )
Include Files
mkl.fi
Description
The ?latm2 routine returns entry (i , j ) of a random matrix of dimension (m, n). It is called by the ?latmr
routine in order to build random test matrices. No error checking on parameters is done, because this routine
is called in a tight loop by ?latmr which has already checked the parameters.
Use of ?latm2 differs from ?latm3 in the order in which the random number generator is called to fill in
random matrix entries. With ?latm2, the generator is called to fill in the pivoted matrix columnwise. With
?latm3, the generator is called to fill in the matrix columnwise, after which it is pivoted. Thus, ?latm3 can be
used to construct random matrices which differ only in their order of rows and/or columns. ?latm2 is used to
construct band matrices while avoiding calling the random number generator for entries outside the band
(and therefore generating random numbers in different orders for different pivot orders).
The matrix whose (i , j ) entry is returned is constructed as follows (this routine only computes one entry):
1. If i is outside (1..m) or j is outside (1..n), returns zero (this is convenient for generating matrices in
band format).
2. Generate a matrix A with random entries of distribution idist.
3. Set the diagonal to D.
4. Grade the matrix, if desired, from the left (by dl) and/or from the right (by dr or dl) as specified by
igrade.
5. Permute, if desired, the rows and/or columns as specified by ipvtng and iwork.
6. Band the matrix to have lower bandwidth kl and upper bandwidth ku.
7. Set random entries to zero as specified by sparse.
Input Parameters
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1)
for clatm2 and zlatm2:
= 2: matrix postmultiplied by diag( dr )
iwork INTEGER.
Array, size (i or j), as appropriate. This array specifies the permutation
used. The row (or column) in position k was originally in position
iwork( k ). This differs from iwork for ?latm3.
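The two iwork conventions are inverse permutations of each other: for ?latm2 the row now in position k came from position iwork(k), while for ?latm3 the row originally in position k ends up in position iwork(k). The Python sketch below applies both conventions to a list of row labels; all function names are hypothetical and indices in iwork are 1-based as in the Fortran interface.

```python
def permute_latm2_style(rows, iwork):
    """New position k holds the row that was originally in position iwork(k)."""
    return [rows[iwork[k] - 1] for k in range(len(rows))]


def permute_latm3_style(rows, iwork):
    """The row originally in position k is moved to position iwork(k)."""
    out = [None] * len(rows)
    for k, row in enumerate(rows):
        out[iwork[k] - 1] = row
    return out


def invert_permutation(iwork):
    """Return the inverse of a 1-based permutation vector."""
    inverse = [0] * len(iwork)
    for k, p in enumerate(iwork, start=1):
        inverse[p - 1] = k
    return inverse


rows = ["a", "b", "c"]
same = permute_latm2_style(rows, [2, 3, 1]) == permute_latm3_style(
    rows, invert_permutation([2, 3, 1]))
```

Passing the same iwork to both routines therefore produces different pivoted matrices unless the permutation is its own inverse.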
Output Parameters
iseed INTEGER.
On exit, the seed is updated.
?latm3
Returns the (isub, jsub) entry of a random matrix.
Syntax
res = slatm3( m, n, i, j, isub, jsub, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng,
iwork, sparse )
res = dlatm3( m, n, i, j, isub, jsub, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng,
iwork, sparse )
res = clatm3( m, n, i, j, isub, jsub, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng,
iwork, sparse )
res = zlatm3( m, n, i, j, isub, jsub, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng,
iwork, sparse )
Include Files
mkl.fi
Description
The ?latm3 routine returns the (isub, jsub) entry of a random matrix of dimension (m, n) described by the
other parameters. (isub, jsub) is the final position of the (i ,j ) entry after pivoting according to ipvtng and
iwork. ?latm3 is called by the ?latmr routine in order to build random test matrices. No error checking on
parameters is done, because this routine is called in a tight loop by ?latmr which has already checked the
parameters.
Use of ?latm3 differs from ?latm2 in the order in which the random number generator is called to fill in
random matrix entries. With ?latm2, the generator is called to fill in the pivoted matrix columnwise. With
?latm3, the generator is called to fill in the matrix columnwise, after which it is pivoted. Thus, ?latm3 can be
used to construct random matrices which differ only in their order of rows and/or columns. ?latm2 is used to
construct band matrices while avoiding calling the random number generator for entries outside the band
(and therefore generating random numbers in different orders for different pivot orders).
The matrix whose (isub, jsub ) entry is returned is constructed as follows (this routine only computes one
entry):
1. If isub is outside (1..m) or jsub is outside (1..n), returns zero (this is convenient for generating
matrices in band format).
2. Generate a matrix A with random entries of distribution idist.
3. Set the diagonal to D.
4. Grade the matrix, if desired, from the left (by dl) and/or from the right (by dr or dl) as specified by
igrade.
5. Permute, if desired, the rows and/or columns as specified by ipvtng and iwork.
6. Band the matrix to have lower bandwidth kl and upper bandwidth ku.
7. Set random entries to zero as specified by sparse.
Input Parameters
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1)
for clatm3 and zlatm3:
On entry, specifies the sparsity of the matrix if sparse matrix is to be
generated. sparse should lie between 0 and 1. A uniform( 0, 1 ) random
number x is generated and compared to sparse; if x is larger the matrix
entry is unchanged and if x is smaller the entry is set to zero. Thus on the
average a fraction sparse of the entries will be set to zero.
iwork INTEGER.
Array, size (i or j, as appropriate). This array specifies the permutation
used. The row (or column) originally in position k is in position iwork( k )
after pivoting. This differs from iwork for ?latm2.
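The sparsity rule described for the sparse parameter (draw a uniform(0, 1) number x per entry, keep the entry if x is larger than sparse, zero it if x is smaller) can be sketched in Python. The sketch uses Python's own random module with a hypothetical seed argument, not the library's random number generator, so the exact pattern of zeros will differ from what the routine produces.

```python
import random


def sparsify(entries, sparse, seed=0):
    """Zero each entry with probability `sparse`, using one uniform draw per entry."""
    rng = random.Random(seed)
    out = []
    for value in entries:
        x = rng.random()                 # uniform on [0, 1)
        out.append(0.0 if x < sparse else value)
    return out


thinned = sparsify([1.0] * 10000, 0.3, seed=1)
fraction_zero = thinned.count(0.0) / len(thinned)
```

On average the fraction of zeroed entries approaches sparse, as the parameter description states.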
Output Parameters
?latm5
Generates matrices involved in the Generalized
Sylvester equation.
Syntax
call slatm5( prtype, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, r, ldr, l,
ldl, alpha, qblcka, qblckb )
call dlatm5( prtype, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, r, ldr, l,
ldl, alpha, qblcka, qblckb )
call clatm5( prtype, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, r, ldr, l,
ldl, alpha, qblcka, qblckb )
call zlatm5( prtype, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, r, ldr, l,
ldl, alpha, qblcka, qblckb )
Include Files
mkl.fi
Description
The ?latm5 routine generates matrices involved in the Generalized Sylvester equation:
A * R - L * B = C
D * R - L * E = F
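Given A, B, D, E, R and L, the two equations above determine the right-hand sides C and F directly. The Python sketch below evaluates them with plain nested lists, taking A and D as m-by-m, B and E as n-by-n, and R and L as m-by-n; the helper names are hypothetical and this is not the generator used by ?latm5.

```python
def matmul(x, y):
    """Dense matrix product of two list-of-rows matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def sylvester_rhs(a, b, d, e, r, l):
    """Return (C, F) with C = A*R - L*B and F = D*R - L*E."""
    ar, lb = matmul(a, r), matmul(l, b)
    dr, le = matmul(d, r), matmul(l, e)
    rows, cols = len(r), len(r[0])
    c = [[ar[i][j] - lb[i][j] for j in range(cols)] for i in range(rows)]
    f = [[dr[i][j] - le[i][j] for j in range(cols)] for i in range(rows)]
    return c, f


c, f = sylvester_rhs([[2.0]], [[3.0]], [[1.0]], [[1.0]], [[4.0]], [[5.0]])
```

Building C and F this way guarantees that (R, L) is an exact solution of the generated system, which is what makes the matrices useful as test data.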
The generated matrices also satisfy the diagonalization condition:
[ I  -L ] ( [ A  -C ], [ D  -F ] ) [ I  R ] = ( [ A    ], [ D    ] )
[     I ] ( [     B ]  [     E ] ) [    I ]   ( [    B ]  [    E ] )
Input Parameters
B:
If (i == j) then B(i, j) = 1.0 - alpha.
D:
If (i == j) then D(i, j) = 1.0.
E:
If (i == j) then E(i, j) = 1.0
L = R are chosen from [-10...10], which specifies the right hand sides
(C, F).
If prtype = 2 or 3: Triangular and/or quasi-triangular.
A:
If (i <= j) then A(i, j) = [-1...1].
A(k + 1, k) = [-1...1];
k = 1, m - 1, qblcka
B:
If (i <= j) then B(i, j) = [-1...1].
B(k + 1, k) = [-1...1]
sign(B(k, k + 1)) = -sign(B(k + 1, k))
k = 1, n - 1, qblckb.
D:
If (i <= j) then D(i, j) = [-1...1].
E:
If (i <= j) then E(i, j) = [-1...1].
L, R are chosen from [-10...10], which specifies the right hand sides (C,
F).
If prtype = 4: Full
A(i, j) = [-10...10]
B(i, j) = [-10...10]
R(i, j) = [-10...10]
qblcka INTEGER. When prtype = 3, specifies the distance between 2-by-2 blocks
on the diagonal in A. Otherwise, qblcka is not referenced. qblcka > 1.
qblckb INTEGER. When prtype = 3, specifies the distance between 2-by-2 blocks
on the diagonal in B. Otherwise, qblckb is not referenced. qblckb > 1.
Output Parameters
f REAL for slatm5,
DOUBLE PRECISION for dlatm5,
COMPLEX for clatm5,
DOUBLE COMPLEX for zlatm5,
Array, size (ldf, n). On exit f contains the m-by-n array F initialized
according to prtype.
?latm6
Generates test matrices for the generalized eigenvalue
problem, their corresponding right and left
eigenvector matrices, and also reciprocal condition
numbers for all eigenvalues and the reciprocal
condition numbers of eigenvectors corresponding to
the 1st and 5th eigenvalues.
Syntax
call slatm6( type, n, a, lda, b, x, ldx, y, ldy, alpha, beta, wx, wy, s, dif )
call dlatm6( type, n, a, lda, b, x, ldx, y, ldy, alpha, beta, wx, wy, s, dif )
call clatm6( type, n, a, lda, b, x, ldx, y, ldy, alpha, beta, wx, wy, s, dif )
call zlatm6( type, n, a, lda, b, x, ldx, y, ldy, alpha, beta, wx, wy, s, dif )
Include Files
mkl.fi
Description
The ?latm6 routine generates test matrices for the generalized eigenvalue problem, their corresponding right
and left eigenvector matrices, and also reciprocal condition numbers for all eigenvalues and the reciprocal
condition numbers of eigenvectors corresponding to the 1st and 5th eigenvalues.
There are two kinds of test matrix pairs:
(A, B) = inverse(Y^H) * (Da, Db) * inverse(X)
Type 1:
Type 2:
In both cases the same inverse(Y^H) and inverse(X) are used to compute (A, B), giving the exact eigenvectors
of (A, B) as (Y^H, X), where a, b, x and y will have all values independently of each other.
Input Parameters
DOUBLE COMPLEX for zlatm6,
Constant for right eigenvector matrix.
Output Parameters
Array, size (n). s(i) is the reciprocal condition number for eigenvalue i.
?latme
Generates random non-symmetric square matrices
with specified eigenvalues.
Syntax
call slatme( n, dist, iseed, d, mode, cond, dmax, ei, rsign, upper, sim, ds, modes,
conds, kl, ku, anorm, a, lda, work, info )
call dlatme( n, dist, iseed, d, mode, cond, dmax, ei, rsign, upper, sim, ds, modes,
conds, kl, ku, anorm, a, lda, work, info )
call clatme( n, dist, iseed, d, mode, cond, dmax, ei, rsign, upper, sim, ds, modes,
conds, kl, ku, anorm, a, lda, work, info )
call zlatme( n, dist, iseed, d, mode, cond, dmax, ei, rsign, upper, sim, ds, modes,
conds, kl, ku, anorm, a, lda, work, info )
Include Files
mkl.fi
Description
The ?latme routine generates random non-symmetric square matrices with specified eigenvalues. ?latme
operates by applying the following sequence of operations:
1. Set the diagonal to d, where d may be input or computed according to mode, cond, dmax, and rsign as
described below.
2. If upper = 'T', the upper triangle of a is set to random values out of distribution dist.
3. If sim='T', a is multiplied on the left by a random matrix X, whose singular values are specified by ds,
modes, and conds, and on the right by X inverse.
4. If kl < n-1, the lower bandwidth is reduced to kl using Householder transformations. If ku < n-1,
the upper bandwidth is reduced to ku.
5. If anorm is not negative, the matrix is scaled to have maximum-element-norm anorm.
NOTE
Since the matrix cannot be reduced beyond Hessenberg form, no packing options are available.
Input Parameters
dist CHARACTER*1. On entry, dist specifies the type of distribution to be used
to generate the random eigen-/singular values, and for the upper triangle
(see upper).
mode INTEGER. On entry mode describes how the eigenvalues are to be specified:
mode = 0 means use d (with ei for slatme and dlatme) as input.
mode = 1 sets d(1) = 1 and d(2:n) = 1.0/cond.
mode = 2 sets d(1:n-1) = 1 and d(n)=1.0/cond.
mode = 3 sets d(i) = cond**(-(i-1)/(n-1)).
mode = 4 sets d(i) = 1 - (i-1)/(n-1)*(1 - 1/cond).
mode = 5 sets d to random numbers in the range ( 1/cond , 1 ) such
that their logarithms are uniformly distributed.
mode = 6 sets d to random numbers from same distribution as the rest of
the matrix.
mode < 0 has the same meaning as abs(mode), except that the order of
the elements of d is reversed.
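As a rough illustration of the deterministic settings above, the following Python sketch builds d for abs(mode) = 1 through 4 and reverses it when mode is negative. It is not MKL code: the function name latme_diagonal is hypothetical, indices are 0-based, the random modes 5 and 6 and the later dmax and rsign adjustments are left out, and returning d = 1 for n = 1 is an assumption.

```python
def latme_diagonal(mode, n, cond):
    """Sketch of d for abs(mode) in 1..4 (hypothetical helper, not an MKL API)."""
    if n < 1 or cond < 1.0:
        raise ValueError("need n >= 1 and cond >= 1.0")

    def frac(i):
        # (i-1)/(n-1) in the 1-based formulas; i is 0-based here.
        # Assumption: treat the fraction as 0 when n == 1.
        return i / (n - 1) if n > 1 else 0.0

    m = abs(mode)
    if m == 1:
        # d(1) = 1 and d(2:n) = 1/cond
        d = [1.0] + [1.0 / cond] * (n - 1)
    elif m == 2:
        # d(1:n-1) = 1 and d(n) = 1/cond
        d = [1.0] * (n - 1) + [1.0 / cond]
    elif m == 3:
        # d(i) = cond**(-(i-1)/(n-1))
        d = [cond ** (-frac(i)) for i in range(n)]
    elif m == 4:
        # d(i) = 1 - (i-1)/(n-1)*(1 - 1/cond)
        d = [1.0 - frac(i) * (1.0 - 1.0 / cond) for i in range(n)]
    else:
        raise ValueError("only abs(mode) in 1..4 is sketched here")

    if mode < 0:
        # mode < 0: same values, reversed order
        d.reverse()
    return d
```

For example, latme_diagonal(4, 3, 4.0) gives evenly spaced values from 1 down to 1/cond.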
If mode = 0, and ei(1) is not ' ' (space character), this array specifies
which elements of d (on input) are real eigenvalues and which are the real
and imaginary parts of a complex conjugate pair of eigenvalues. The
elements of ei may then only have the values 'R' and 'I'.
If mode is not 0, then ei is ignored. If mode is 0 and ei(1) = ' ', then
the eigenvalues will all be real.
rsign CHARACTER*1. If mode is not 0, 6, or -6, and rsign = 'T', then the
elements of d, as computed according to mode and cond, are multiplied by
a random sign (+1 or -1) for slatme and dlatme or by a complex number
from the unit circle |z| = 1 for clatme and zlatme.
If rsign = 'F', the elements of d are not multiplied. rsign may only have
the values 'T' or 'F'.
upper CHARACTER*1. If upper = 'T', then the elements of a above the diagonal
will be set to random numbers out of dist.
If upper = 'F', they will not. upper may only have the values 'T' or 'F'.
REAL for clatme,
DOUBLE PRECISION for zlatme,
This array is used to specify the singular values of X, in the same way that
d specifies the eigenvalues of a. If modes = 0, then ds contains the singular
values, which may not be zero.
modes INTEGER.
Similar to mode, but for specifying the diagonal of S. modes = -6 and +6
are not allowed (since they would result in randomly ill-conditioned
eigenvalues.)
If kl and ku are both at least n-1, then a will be dense. Only one of ku and
kl may be less than n-1.
Output Parameters
iseed INTEGER.
On exit, the seed is updated.
info INTEGER.
If info = 0, execution is successful.
If info = -6, cond is less than 1.0, and mode is not -6, 0, or 6.
If info = -16, ku is less than 1, or kl and ku are both less than n-1.
?latmr
Generates random matrices of various types.
Syntax
call slatmr (m, n, dist, iseed, sym, d, mode, cond, dmax, rsign, grade, dl, model,
condl, dr, moder, condr, pivtng, ipivot, kl, ku, sparse, anorm, pack, a, lda, iwork,
info)
call dlatmr (m, n, dist, iseed, sym, d, mode, cond, dmax, rsign, grade, dl, model,
condl, dr, moder, condr, pivtng, ipivot, kl, ku, sparse, anorm, pack, a, lda, iwork,
info)
call clatmr (m, n, dist, iseed, sym, d, mode, cond, dmax, rsign, grade, dl, model,
condl, dr, moder, condr, pivtng, ipivot, kl, ku, sparse, anorm, pack, a, lda, iwork,
info)
call zlatmr (m, n, dist, iseed, sym, d, mode, cond, dmax, rsign, grade, dl, model,
condl, dr, moder, condr, pivtng, ipivot, kl, ku, sparse, anorm, pack, a, lda, iwork,
info)
Description
NOTE
If two calls to ?latmr differ only in the pack parameter, they generate mathematically equivalent
matrices. If two calls to ?latmr both have full bandwidth (kl = m-1 and ku = n-1), and differ only in
the pivtng and pack parameters, then the matrices generated differ only in the order of the rows and
columns, and otherwise contain the same data. This consistency cannot be and is not maintained with
less than full bandwidth.
Input Parameters
If dist = 'S', real and imaginary parts are independent uniform( -1, 1 ).
COMPLEX for clatmr,
DOUBLE COMPLEX for zlatmr,
If mode is not -6, 0, or 6, the diagonal is scaled by dmax /
max(abs(d(i))), so that the maximum absolute entry of the diagonal is
abs(dmax). If dmax is complex (or zero), the diagonal is scaled by a
complex number (or zero).
rsign CHARACTER. If mode is not -6, 0, or 6, specifies the sign of the diagonal as
follows:
For slatmr and dlatmr, if rsign = 'T', diagonal entries are multiplied by 1
or -1, each with a probability of 0.5.
For clatmr and zlatmr, if rsign = 'T', diagonal entries are multiplied by
a random complex number uniformly distributed with absolute value 1.
If rsign = 'F', diagonal entries are unchanged.
NOTE
If grade = 'E', then m must equal n.
model INTEGER. This specifies how the diagonal array dl is computed, just as
mode specifies how D is computed.
moder INTEGER. This specifies how the diagonal array dr is to be computed, just
as mode specifies how d is to be computed.
If pivtng = 'B' or 'F': both or full pivoting, i.e., on both sides. In this
case, m must equal n.
ipivot INTEGER. Array, size (n or m). This array specifies the permutation used.
After the basic matrix is generated, the rows, columns, or both are
permuted.
If row pivoting is selected, ?latmr starts with the last row and interchanges
row m and row ipivot(m), then moves to the next-to-last row,
interchanging row (m-1) and row ipivot(m-1), and so on. In terms of
"2-cycles", the permutation is (1 ipivot(1)) (2 ipivot(2)) ...
(m ipivot(m)) where the rightmost cycle is applied first. This is the inverse
of the effect of pivoting in LINPACK. The idea is that factoring (with
pivoting) an identity matrix which has been inverse-pivoted in this way
should result in a pivot vector identical to ipivot. Not referenced if pivtng
= 'N'.
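The row-pivoting order described for ipivot can be sketched in Python as follows. This is an illustration under stated assumptions, not the ?latmr implementation: the helper names apply_row_pivots and undo_row_pivots are hypothetical, the matrix is a list of rows, and ipivot holds 1-based row numbers as in the Fortran interface.

```python
def apply_row_pivots(rows, ipivot):
    """Swap row i with row ipivot(i) for i = m, m-1, ..., 1 (1-based)."""
    a = [list(r) for r in rows]          # work on a copy
    m = len(a)
    for i in range(m, 0, -1):            # last row first, as described above
        p = ipivot[i - 1]
        if not 1 <= p <= m:
            raise ValueError("ipivot entry out of range")
        a[i - 1], a[p - 1] = a[p - 1], a[i - 1]
    return a


def undo_row_pivots(rows, ipivot):
    """Apply the same interchanges in forward order, i = 1, ..., m."""
    a = [list(r) for r in rows]
    m = len(a)
    for i in range(1, m + 1):
        p = ipivot[i - 1]
        a[i - 1], a[p - 1] = a[p - 1], a[i - 1]
    return a
```

Applying the interchanges in forward order, as a pivoting factorization would, restores the original row order, which is why the generated permutation acts as the inverse of the pivoting.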
for dlatmr,
for clatmr,
for zlatmr,
If pack = 'U': zero out all subdiagonal entries (if symmetric or Hermitian)
GB 'Z'
PB, HB or TB 'B' or 'Q'
PP, HP or TP 'C' or 'R'
If two calls to ?latmr differ only in the pack parameter, they generate
mathematically equivalent matrices.
lda INTEGER. On entry, lda specifies the first dimension of a as declared in the
calling program.
If pack = 'N', 'U' or 'L', lda must be at least max( 1, m ).
iwork INTEGER. Array, size (n or m). Workspace. Not referenced if pivtng = 'N'.
Changed on exit.
Output Parameters
COMPLEX for clatmr,
DOUBLE COMPLEX for zlatmr,
On exit, a is the desired test matrix. Only those entries of a which are
significant on output are referenced (even if a is in packed or band storage
format). The unoccupied corners of a in band format are zeroed out.
info INTEGER.
If info = 0, the execution is successful.
If info = -8, cond is less than 1.0, and mode is neither -6, 0 nor 6.
If info = -10, mode is neither -6, 0 nor 6 and rsign is an illegal string.
If info = -17, condr is less than 1.0, grade = 'R' or 'B', and moder
is neither -6, 0 nor 6 .
If info = -18, pivtng is an illegal string, or pivtng = 'B' or 'F' and m
is not equal to n, or pivtng = 'L' or 'R' and sym = 'S' or 'H'.
If info = -19, ipivot contains an out-of-range number and pivtng is not
equal to 'N'.
?latdf
Uses the LU factorization of the n-by-n matrix
computed by ?getc2 and computes a contribution to
the reciprocal Dif-estimate.
Syntax
call slatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
call dlatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
call clatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
call zlatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
Include Files
mkl.fi
Description
The routine ?latdf uses the LU factorization of the n-by-n matrix Z computed by ?getc2 and computes a
contribution to the reciprocal Dif-estimate by solving Z*x = b for x, and choosing the right-hand side b such
that the norm of x is as large as possible. On entry rhs = b holds the contribution from earlier solved sub-
systems, and on return rhs = x.
The factorization of Z returned by ?getc2 has the form Z = P*L*U*Q, where P and Q are permutation
matrices. L is lower triangular with unit diagonal elements and U is upper triangular.
Input Parameters
ijob INTEGER.
ijob = 2: First compute an approximate null-vector e of Z using ?gecon,
normalize e, and solve Z*x = +/-e - f, taking the sign that gives the greater
value of 2-norm(x). This option is about 5 times as expensive as the default.
ijob is not 2 (default): Local look-ahead strategy where every entry of the
right-hand side b is chosen as either +1 or -1.
ldz INTEGER. The leading dimension of the array z. ldz >= max(1, n).
rhs REAL for slatdf/clatdf
DOUBLE PRECISION for dlatdf/zlatdf.
Array, DIMENSION (n).
Note that rdsum only makes sense when ?tgsy2 is called by ?tgsyl.
Note that rdscal only makes sense when ?tgsy2 is called by ?tgsyl.
ipiv INTEGER.
Array, DIMENSION (n).
The pivot indices; for 1 <= i <= n, row i of the matrix has been interchanged
with row ipiv(i).
jpiv INTEGER.
Array, DIMENSION (n).
The pivot indices; for 1 <= j <= n, column j of the matrix has been interchanged
with column jpiv(j).
Output Parameters
rhs On exit, rhs contains the solution of the subsystem with entries according to
the value of ijob.
rdsum On exit, the corresponding sum of squares updated with the contributions
from the current sub-system.
If trans = 'T', rdsum is not touched.
?latps
Solves a triangular system of equations with the
matrix held in packed storage.
Syntax
call slatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
call dlatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
call clatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
call zlatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
Include Files
mkl.fi
Description
The routine ?latps solves one of the triangular systems
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': upper triangular
trans CHARACTER*1.
Specifies the operation applied to A.
= 'N': solve A*x = s*b (no transpose)
diag CHARACTER*1.
Specifies whether the matrix A is unit triangular.
= 'N': non-unit triangular
= 'U': unit triangular
normin CHARACTER*1.
Specifies whether cnorm is set.
= 'Y': cnorm contains the column norms on entry;
= 'N': cnorm is not set on entry. On exit, the norms will be computed and
stored in cnorm.
ap REAL for slatps
DOUBLE PRECISION for dlatps
COMPLEX for clatps
DOUBLE COMPLEX for zlatps.
Array, DIMENSION (n*(n+1)/2).
Output Parameters
cnorm If normin = 'N', cnorm is an output argument and cnorm(j) returns the
1-norm of the off-diagonal part of the j-th column of A.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
?latrd
Reduces the first nb rows and columns of a
symmetric/Hermitian matrix A to real tridiagonal form
by an orthogonal/unitary similarity transformation.
Syntax
call slatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
call dlatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
call clatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
call zlatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
Include Files
mkl.fi
Description
The routine ?latrd reduces nb rows and columns of a real symmetric or complex Hermitian matrix A to
symmetric/Hermitian tridiagonal form by an orthogonal/unitary similarity transformation Q^T*A*Q for real
flavors, Q^H*A*Q for complex flavors, and returns the matrices V and W which are needed to apply the
transformation to the unreduced part of A.
If uplo = 'U', ?latrd reduces the last nb rows and columns of a matrix, of which the upper triangle is
supplied;
if uplo = 'L', ?latrd reduces the first nb rows and columns of a matrix, of which the lower triangle is
supplied.
This is an auxiliary routine called by ?sytrd/?hetrd.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix A is stored:
= 'U': upper triangular
If uplo = 'U', the leading n-by-n upper triangular part of a contains the
upper triangular part of the matrix A, and the strictly lower triangular part
of a is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of a contains the
lower triangular part of the matrix A, and the strictly upper triangular part
of a is not referenced.
ldw INTEGER.
The leading dimension of the output array w. ldw >= max(1, n).
Output Parameters
Application Notes
If uplo = 'U', the matrix Q is represented as a product of elementary reflectors
Q = H(n)*H(n-1)*...*H(n-nb+1)
Each H(i) has the form
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(i:n) = 0 and v(i-1) = 1; v(1:i-1)
is stored on exit in a(1:i-1, i), and tau in tau(i-1).
If uplo = 'L', the matrix Q is represented as a product of elementary reflectors
Q = H(1)*H(2)*...*H(nb)
Each H(i) has the form H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i) = 0 and v(i+1) = 1; v(i+1:n)
is stored on exit in a(i+1:n, i), and tau in tau(i).
The elements of the vectors v together form the n-by-nb matrix V which is needed, with W, to apply the
transformation to the unreduced part of the matrix, using a symmetric/Hermitian rank-2k update of the
form:
A := A - VW' - WV'.
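To make the rank-2k update concrete, here is a small Python sketch for the real case, where the prime denotes the transpose. It is illustrative only: the function name rank2k_update is hypothetical, V and W are n-by-nb lists of rows, and the full matrix is updated rather than a single triangle as a BLAS routine would do. For complex flavors the prime would be a conjugate transpose, which this sketch does not handle.

```python
def rank2k_update(a, v, w):
    """Return A - V*W' - W*V' for real n-by-n A and n-by-nb V, W."""
    n = len(a)
    nb = len(v[0])
    result = [list(row) for row in a]    # leave the input untouched
    for i in range(n):
        for j in range(n):
            # (V*W')(i,j) + (W*V')(i,j)
            s = sum(v[i][k] * w[j][k] + w[i][k] * v[j][k] for k in range(nb))
            result[i][j] -= s
    return result
```

Because V*W' + W*V' is symmetric, a symmetric A stays symmetric after the update.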
The contents of a on exit are illustrated by the following examples with n = 5 and nb = 2:
where d denotes a diagonal element of the reduced matrix, a denotes an element of the original matrix that
is unchanged, and vi denotes an element of the vector defining H(i).
?latrs
Solves a triangular system of equations with the scale
factor set to prevent overflow.
Syntax
call slatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
call dlatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
call clatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
call zlatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
Include Files
mkl.fi
Description
The routine solves one of the triangular systems
A*x = s*b, or A^T*x = s*b, or A^H*x = s*b (for complex flavors)
with scaling to prevent overflow. Here A is an upper or lower triangular matrix, A^T denotes the transpose of
A, A^H denotes the conjugate transpose of A, x and b are n-element vectors, and s is a scaling factor, usually
less than or equal to 1, chosen so that the components of x will be less than the overflow threshold. If the
unscaled problem will not cause overflow, the Level 2 BLAS routine ?trsv is called. If the matrix A is singular
(A(j,j) = 0 for some j), then s is set to 0 and a non-trivial solution to A*x = 0 is returned.
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': Upper triangular
trans CHARACTER*1.
Specifies the operation applied to A.
= 'N': solve A*x = s*b (no transpose)
diag CHARACTER*1.
Specifies whether or not the matrix A is unit triangular.
= 'N': non-unit triangular
normin CHARACTER*1.
Specifies whether cnorm has been set or not.
= 'Y': cnorm contains the column norms on entry;
If uplo = 'U', the leading n-by-n upper triangular part of the array a
contains the upper triangular matrix, and the strictly lower triangular part
of A is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of the array a
contains the lower triangular matrix, and the strictly upper triangular part
of A is not referenced.
If diag = 'U', the diagonal elements of A are also not referenced and are
assumed to be 1.
lda INTEGER. The leading dimension of the array a. lda >= max(1, n).
If normin = 'Y', cnorm is an input argument and cnorm (j) contains the
norm of the off-diagonal part of the j-th column of A.
If trans = 'N', cnorm (j) must be greater than or equal to the infinity-
norm, and if trans = 'T' or 'C', cnorm(j) must be greater than or equal
to the 1-norm.
Output Parameters
cnorm If normin = 'N', cnorm is an output argument and cnorm(j) returns the
1-norm of the off-diagonal part of the j-th column of A.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
Application Notes
A rough bound on x is computed; if it is less than the overflow threshold, ?trsv is called; otherwise, specific code is
used that checks for possible overflow or divide-by-zero at every operation.
A columnwise scheme is used for solving Ax = b. The basic algorithm if A is lower triangular is
x[1:n] := b[1:n]
for j = 1, ..., n
x(j) := x(j) / A(j,j)
x[j+1:n] := x[j+1:n] - x(j)*A[j+1:n,j]
end
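The columnwise loop above translates almost line for line into Python. The sketch below is a plain forward substitution for a lower triangular A stored as a list of rows; the name solve_lower_columnwise is hypothetical, and none of the overflow checks or scaling that ?latrs adds are included.

```python
def solve_lower_columnwise(a, b):
    """Solve A*x = b for lower triangular A, column by column (no scaling)."""
    n = len(b)
    x = list(b)                           # x[1:n] := b[1:n]
    for j in range(n):
        x[j] = x[j] / a[j][j]             # x(j) := x(j) / A(j,j)
        for i in range(j + 1, n):
            x[i] -= x[j] * a[i][j]        # x[j+1:n] -= x(j)*A[j+1:n,j]
    return x
```

Each pass finalizes x(j) and then removes its contribution from the remaining right-hand side, which is the quantity the bounds M(j) and G(j) track.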
Define bounds on the components of x after j iterations of the loop:
M(j) = bound on x[1:j]
G(j) = bound on x[j+1:n]
Initially, let M(0) = 0 and G(0) = max{x(i), i=1,...,n}.
and
Since |x(j)| <= M(j), we use the Level 2 BLAS routine ?trsv if the reciprocal of the largest M(j),
j=1,..,n, is larger than max(underflow, 1/overflow).
The bound on x(j) is also used to determine when a step in the columnwise method can be performed
without fear of overflow. If the computed bound is greater than a large constant, x is scaled to prevent
overflow, but if the bound overflows, x is set to 0, x(j) to 1, and scale to 0, and a non-trivial solution to Ax =
0 is found.
Similarly, a row-wise scheme is used to solve A^T*x = b or A^H*x = b. The basic algorithm for A upper triangular
is
for j = 1, ..., n
x(j) := ( b(j) - A[1:j-1,j]' * x[1:j-1] ) / A(j,j)
end
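A Python sketch of this row-wise loop for the real transposed case is given below. It assumes A is upper triangular and stored as a list of rows; the name solve_upper_transposed is hypothetical, the conjugation needed for the conjugate-transposed case is omitted, and no overflow protection is attempted.

```python
def solve_upper_transposed(a, b):
    """Solve transpose(A)*x = b for real upper triangular A (no scaling)."""
    n = len(b)
    x = [0.0] * n
    for j in range(n):
        # Inner product of column j of A above the diagonal with x[1:j-1].
        dot = sum(a[i][j] * x[i] for i in range(j))
        x[j] = (b[j] - dot) / a[j][j]
    return x
```

Here each x(j) comes from a single inner product, so the intermediate that can overflow is the accumulated sum rather than an updated right-hand side.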
We simultaneously compute two bounds
and we can safely call ?trsv if 1/M(n) and 1/G(n) are both greater than max(underflow, 1/overflow).
?latrz
Factors an upper trapezoidal matrix by means of
orthogonal/unitary transformations.
Syntax
call slatrz( m, n, l, a, lda, tau, work )
call dlatrz( m, n, l, a, lda, tau, work )
call clatrz( m, n, l, a, lda, tau, work )
call zlatrz( m, n, l, a, lda, tau, work )
Include Files
mkl.fi
Description
The routine ?latrz factors the m-by-(m+l) real/complex upper trapezoidal matrix
Input Parameters
DOUBLE COMPLEX for zlatrz.
Array, DIMENSION (lda, n).
On entry, the leading m-by-n upper trapezoidal part of the array a must
contain the matrix to be factorized.
Output Parameters
a On exit, the leading m-by-m upper triangular part of a contains the upper
triangular matrix R, and elements n-l+1 to n of the first m rows of a, with
the array tau, represent the orthogonal/unitary matrix Z as a product of m
elementary reflectors.
Application Notes
The factorization is obtained by Householder's method. The k-th transformation matrix, z(k), which is used to
introduce zeros into the (m - k + 1)-th row of A, is given in the form
tau is a scalar and z(k) is an l-element vector. tau and z(k) are chosen to annihilate the elements of the k-th
row of A2.
The scalar tau is returned in the k-th element of tau and the vector u(k) in the k-th row of A2, such that the
elements of z(k) are in a(k, l+1), ..., a(k, n).
?lauu2
Computes the product U*U^T (U*U^H) or L^T*L (L^H*L),
where U and L are upper or lower triangular matrices
(unblocked algorithm).
Syntax
call slauu2( uplo, n, a, lda, info )
call dlauu2( uplo, n, a, lda, info )
call clauu2( uplo, n, a, lda, info )
call zlauu2( uplo, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?lauu2 computes the product U*U^T or L^T*L for real flavors, and U*U^H or L^H*L for complex
flavors. Here the triangular factor U or L is stored in the upper or lower triangular part of the array a.
If uplo = 'U' or 'u', then the upper triangle of the result is stored, overwriting the factor U in A.
If uplo = 'L' or 'l', then the lower triangle of the result is stored, overwriting the factor L in A.
This is the unblocked form of the algorithm, calling BLAS Level 2 Routines.
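For the real case with uplo = 'U', the result can be sketched in Python as below. This only illustrates which entries are overwritten and why the product can be formed in place; the name lauu2_upper is hypothetical, plain loops stand in for the BLAS Level 2 calls, and the complex (conjugated) case is not covered.

```python
def lauu2_upper(a):
    """Overwrite the upper triangle of a (list of rows) with that of U*U^T."""
    n = len(a)
    for i in range(n):
        for j in range(i, n):
            # (U*U^T)(i,j) = sum over k >= j of U(i,k)*U(j,k), since U(j,k) = 0 for k < j.
            # Entries read here (row i from column j on, row j from column j on)
            # have not been overwritten yet, so the update is safe in place.
            a[i][j] = sum(a[i][k] * a[j][k] for k in range(j, n))
    return a
```

The strictly lower triangle is never read or written, matching the convention that only the triangle selected by uplo is referenced.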
Input Parameters
uplo CHARACTER*1.
Specifies whether the triangular factor stored in the array a is upper or
lower triangular:
= 'U': Upper triangular
a REAL for slauu2
DOUBLE PRECISION for dlauu2
COMPLEX for clauu2
DOUBLE COMPLEX for zlauu2.
Array, DIMENSION (lda, n). On entry, the triangular factor U or L.
Output Parameters
a On exit,
if uplo = 'U', then the upper triangle of a is overwritten with the upper
triangle of the product U*U^T (U*U^H);
if uplo = 'L', then the lower triangle of a is overwritten with the lower
triangle of the product L^T*L (L^H*L).
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
?lauum
Computes the product U*U^T (U*U^H) or L^T*L (L^H*L),
where U and L are upper or lower triangular matrices
(blocked algorithm).
Syntax
call slauum( uplo, n, a, lda, info )
call dlauum( uplo, n, a, lda, info )
call clauum( uplo, n, a, lda, info )
call zlauum( uplo, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?lauum computes the product U*U^T or L^T*L for real flavors, and U*U^H or L^H*L for complex
flavors. Here the triangular factor U or L is stored in the upper or lower triangular part of the array a.
If uplo = 'U' or 'u', then the upper triangle of the result is stored, overwriting the factor U in A.
If uplo = 'L' or 'l', then the lower triangle of the result is stored, overwriting the factor L in A.
This is the blocked form of the algorithm, calling BLAS Level 3 Routines.
Input Parameters
The data types are given for the Fortran interface.
uplo CHARACTER*1.
Output Parameters
a On exit,
if uplo = 'U', then the upper triangle of a is overwritten with the upper
triangle of the product U*U^T (U*U^H);
if uplo = 'L', then the lower triangle of a is overwritten with the lower
triangle of the product L^T*L (L^H*L).
info INTEGER.
If info = 0, the execution is successful.
?orbdb1/?unbdb1
Simultaneously bidiagonalizes the blocks of a tall and
skinny matrix with orthonormal columns.
Syntax
call sorbdb1( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call dorbdb1( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call cunbdb1( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call zunbdb1( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routines ?orbdb1/?unbdb1 simultaneously bidiagonalize the blocks of a tall and skinny matrix X with
orthonormal columns:
The size of x11 is p by q, and x21 is (m - p) by q. q must not be larger than p, m-p, or m-q.
Tall and Skinny Matrix Routines
q <= min(p, m - p, m - q)        ?orbdb1/?unbdb1
p <= min(q, m - p, m - q)        ?orbdb2/?unbdb2
m - p <= min(p, q, m - q)        ?orbdb3/?unbdb3
m - q <= min(p, q, m - p)        ?orbdb4/?unbdb4
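The table can be read as a dispatch rule on which of q, p, m - p, and m - q is smallest. A small Python sketch of that rule follows; the function name select_orbdb is hypothetical, and checking the conditions in table order when several hold at once is an assumption, not documented behavior.

```python
def select_orbdb(m, p, q):
    """Return the routine pair whose size condition from the table holds."""
    if not (0 <= p <= m and 0 <= q <= m):
        raise ValueError("need 0 <= p <= m and 0 <= q <= m")
    if q <= min(p, m - p, m - q):
        return "?orbdb1/?unbdb1"
    if p <= min(q, m - p, m - q):
        return "?orbdb2/?unbdb2"
    if m - p <= min(p, q, m - q):
        return "?orbdb3/?unbdb3"
    # Otherwise m - q is the smallest of the four quantities.
    return "?orbdb4/?unbdb4"
```

At least one condition always holds, because the smallest of the four quantities satisfies its own row of the table.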
The orthogonal/unitary matrices p1, p2, and q1 are p-by-p, (m-p)-by-(m-p), (m-q)-by-(m-q), respectively.
p1, p2, and q1 are represented as products of elementary reflectors. See the description of
?orcsd2by1/?uncsd2by1 for details on generating p1, p2, and q1 using ?orgqr and ?orglq.
The upper-bidiagonal matrices b11 and b12 of size q by q are represented implicitly by angles theta(1), ...,
theta(q) and phi(1), ..., phi(q-1). Every entry in each bidiagonal band is a product of a sine or cosine of
theta with a sine or cosine of phi. See [Sutton09] or the description of ?orcsd/?uncsd for details.
Input Parameters
m INTEGER. The number of rows in x11 plus the number of rows in x21.
Output Parameters
x11 The columns of tril(x11) specify reflectors for p1 and the rows of
triu(x11,1) specify reflectors for q1, where tril(A) denotes the lower
triangle of A, and triu(A) denotes the upper triangle of A.
Array, DIMENSION (p).
See Also
?orcsd/?uncsd Computes the CS decomposition of a block-partitioned orthogonal/unitary matrix.
?orcsd2by1/?uncsd2by1 Computes the CS decomposition of a block-partitioned orthogonal/unitary
matrix.
?orbdb2/?unbdb2 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb3/?unbdb3 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb4/?unbdb4 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb5/?unbdb5 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
?orbdb6/?unbdb6 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
xerbla
?orbdb2/?unbdb2
Simultaneously bidiagonalizes the blocks of a tall and
skinny matrix with orthonormal columns.
Syntax
call sorbdb2( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call dorbdb2( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call cunbdb2( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call zunbdb2( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routines ?orbdb2/?unbdb2 simultaneously bidiagonalize the blocks of a tall and skinny matrix X with
orthonormal columns:
The size of x11 is p by q, and x21 is (m - p) by q. p must not be larger than q, m-p, or m-q.
Tall and Skinny Matrix Routines
q <= min(p, m - p, m - q)        ?orbdb1/?unbdb1
p <= min(q, m - p, m - q)        ?orbdb2/?unbdb2
m - p <= min(p, q, m - q)        ?orbdb3/?unbdb3
m - q <= min(p, q, m - p)        ?orbdb4/?unbdb4
The orthogonal/unitary matrices p1, p2, and q1 are p-by-p, (m-p)-by-(m-p), (m-q)-by-(m-q), respectively.
p1, p2, and q1 are represented as products of elementary reflectors. See the description of
?orcsd2by1/?uncsd2by1 for details on generating p1, p2, and q1 using ?orgqr and ?orglq.
The upper-bidiagonal matrices b11 and b12 of size p by p are represented implicitly by angles theta(1), ...,
theta(q) and phi(1), ..., phi(q-1). Every entry in each bidiagonal band is a product of a sine or cosine of
theta with a sine or cosine of phi. See [Sutton09] or the description of ?orcsd/?uncsd for details.
Input Parameters
m INTEGER. The number of rows in x11 plus the number of rows in x21.
DOUBLE PRECISION for dorbdb2
COMPLEX for cunbdb2
DOUBLE COMPLEX for zunbdb2
Array, DIMENSION (ldx21,q).
Output Parameters
x11 On exit: the columns of tril(x11) specify reflectors for p1 and the rows of
triu(x11,1) specify reflectors for q1.
See Also
?orcsd/?uncsd Computes the CS decomposition of a block-partitioned orthogonal/unitary matrix.
?orcsd2by1/?uncsd2by1 Computes the CS decomposition of a block-partitioned orthogonal/unitary
matrix.
?orbdb1/?unbdb1 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb3/?unbdb3 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb4/?unbdb4 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb5/?unbdb5 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
?orbdb6/?unbdb6 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
xerbla
?orbdb3/?unbdb3
Simultaneously bidiagonalizes the blocks of a tall and
skinny matrix with orthonormal columns.
Syntax
call sorbdb3( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call dorbdb3( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call cunbdb3( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call zunbdb3( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routines ?orbdb3/?unbdb3 simultaneously bidiagonalize the blocks of a tall and skinny matrix X with
orthonormal columns:
The size of x11 is p by q, and x21 is (m - p) by q. m-p must not be larger than p, q, or m-q.
Tall and Skinny Matrix Routines
q <= min(p, m - p, m - q)        ?orbdb1/?unbdb1
p <= min(q, m - p, m - q)        ?orbdb2/?unbdb2
m - p <= min(p, q, m - q)        ?orbdb3/?unbdb3
m - q <= min(p, q, m - p)        ?orbdb4/?unbdb4
The orthogonal/unitary matrices p1, p2, and q1 are p-by-p, (m-p)-by-(m-p), (m-q)-by-(m-q), respectively.
p1, p2, and q1 are represented as products of elementary reflectors. See the description of
?orcsd2by1/?uncsd2by1 for details on generating p1, p2, and q1 using ?orgqr and ?orglq.
The upper-bidiagonal matrices b11 and b12 of size (m-p) by (m-p) are represented implicitly by angles
theta(1), ..., theta(q) and phi(1), ..., phi(q-1). Every entry in each bidiagonal band is a product of a
sine or cosine of theta with a sine or cosine of phi. See [Sutton09] or the description of ?orcsd/?uncsd for
details.
Input Parameters
m INTEGER. The number of rows in x11 plus the number of rows in x21.
Output Parameters
x11 On exit: the columns of tril(x11) specify reflectors for p1 and the rows of
triu(x11,1) specify reflectors for q1.
taup1 REAL for sorbdb3
DOUBLE PRECISION for dorbdb3
COMPLEX for cunbdb3
DOUBLE COMPLEX for zunbdb3
Array, DIMENSION (p).
See Also
?orcsd/?uncsd Computes the CS decomposition of a block-partitioned orthogonal/unitary matrix.
?orcsd2by1/?uncsd2by1 Computes the CS decomposition of a block-partitioned orthogonal/unitary
matrix.
?orbdb1/?unbdb1 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb2/?unbdb2 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb4/?unbdb4 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb5/?unbdb5 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
?orbdb6/?unbdb6 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
xerbla
?orbdb4/?unbdb4
Simultaneously bidiagonalizes the blocks of a tall and
skinny matrix with orthonormal columns.
Syntax
call sorbdb4( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1,
phantom, work, lwork, info )
call dorbdb4( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1,
phantom, work, lwork, info )
call cunbdb4( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1,
phantom, work, lwork, info )
call zunbdb4( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1,
phantom, work, lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routines ?orbdb4/?unbdb4 simultaneously bidiagonalize the blocks of a tall and skinny matrix X with
orthonormal columns:
The size of x11 is p by q, and x21 is (m - p) by q. m-q must not be larger than q, p, or m-p.
Tall and Skinny Matrix Routines
q ≤ min(p, m - p, m - q)      ?orbdb1/?unbdb1
p ≤ min(q, m - p, m - q)      ?orbdb2/?unbdb2
m - p ≤ min(p, q, m - q)      ?orbdb3/?unbdb3
m - q ≤ min(p, q, m - p)      ?orbdb4/?unbdb4
The orthogonal/unitary matrices p1, p2, and q1 are p-by-p, (m-p)-by-(m-p), (m-q)-by-(m-q), respectively.
p1, p2, and q1 are represented as products of elementary reflectors. See the description of
?orcsd2by1/?uncsd2by1 for details on generating p1, p2, and q1 using ?orgqr and ?orglq.
The upper-bidiagonal matrices b11 and b12 of size (m-q) by (m-q) are represented implicitly by angles
theta(1), ..., theta(q) and phi(1), ..., phi(q-1). Every entry in each bidiagonal band is a product of a
sine or cosine of theta with a sine or cosine of phi. See [Sutton09] or the description of ?orcsd/?uncsd for
details.
Input Parameters
m INTEGER. The number of rows in x11 plus the number of rows in x21.
Array, DIMENSION (ldx11,q).
Output Parameters
x11 On exit: the columns of tril(x11) specify reflectors for p1 and the rows of
triu(x11,1) specify reflectors for q1.
Array, DIMENSION (q-1). The entries of bidiagonal blocks b11 and b21 can be
computed from the angles theta and phi. See the Description section for
details.
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value.
See Also
?orcsd/?uncsd
?orcsd2by1/?uncsd2by1 Computes the CS decomposition of a block-partitioned orthogonal/unitary
matrix.
?orbdb1/?unbdb1 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb2/?unbdb2 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb3/?unbdb3 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb5/?unbdb5 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
?orbdb6/?unbdb6 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
xerbla
?orbdb5/?unbdb5
Orthogonalizes a column vector with respect to the
orthonormal basis matrix.
Syntax
call sorbdb5( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call dorbdb5( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call cunbdb5( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call zunbdb5( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The ?orbdb5/?unbdb5 routines orthogonalize the column vector
Input Parameters
m1 INTEGER
The dimension of x1 and the number of rows in q1. 0 ≤ m1.
m2 INTEGER
The dimension of x2 and the number of rows in q2. 0 ≤ m2.
n INTEGER
The number of columns in q1 and q2. 0 ≤ n.
incx1 INTEGER
Increment for entries of x1.
incx2 INTEGER
Increment for entries of x2.
ldq1 INTEGER
The leading dimension of q1. ldq1 ≥ m1.
ldq2 INTEGER
The leading dimension of q2. ldq2 ≥ m2.
COMPLEX*16 for zunbdb5
Workspace array of size lwork.
lwork INTEGER
The size of the array work. lwork ≥ n.
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value.
See Also
?orcsd/?uncsd
?orcsd2by1/?uncsd2by1 Computes the CS decomposition of a block-partitioned orthogonal/unitary
matrix.
?orbdb1/?unbdb1 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb2/?unbdb2 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb3/?unbdb3 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb4/?unbdb4 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb6/?unbdb6 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
xerbla
?orbdb6/?unbdb6
Orthogonalizes a column vector with respect to the
orthonormal basis matrix.
Syntax
call sorbdb6( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call dorbdb6( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call cunbdb6( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call zunbdb6( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The ?orbdb6/?unbdb6 routines orthogonalize the column vector
Input Parameters
m1 INTEGER
The dimension of x1 and the number of rows in q1. 0 ≤ m1.
m2 INTEGER
The dimension of x2 and the number of rows in q2. 0 ≤ m2.
n INTEGER
The number of columns in q1 and q2. 0 ≤ n.
incx1 INTEGER
Increment for entries of x1.
incx2 INTEGER
Increment for entries of x2.
ldq1 INTEGER
The leading dimension of q1. ldq1 ≥ m1.
ldq2 INTEGER
The leading dimension of q2. ldq2 ≥ m2.
lwork INTEGER
The size of the array work. lwork ≥ n.
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value.
See Also
?orcsd/?uncsd
?orcsd2by1/?uncsd2by1 Computes the CS decomposition of a block-partitioned orthogonal/unitary
matrix.
?orbdb1/?unbdb1 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb2/?unbdb2 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb3/?unbdb3 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb4/?unbdb4 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb5/?unbdb5 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
xerbla
?org2l/?ung2l
Generates all or part of the orthogonal/unitary matrix
Q from a QL factorization determined by ?geqlf
(unblocked algorithm).
Syntax
call sorg2l( m, n, k, a, lda, tau, work, info )
call dorg2l( m, n, k, a, lda, tau, work, info )
call cung2l( m, n, k, a, lda, tau, work, info )
call zung2l( m, n, k, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine ?org2l/?ung2l generates an m-by-n real/complex matrix Q with orthonormal columns, which is
defined as the last n columns of a product of k elementary reflectors of order m:
Q = H(k)*...*H(2)*H(1) as returned by ?geqlf.
Input Parameters
On entry, the (n-k+i)-th column must contain the vector which defines the
elementary reflector H(i), for i = 1,2,..., k, as returned by ?geqlf in the
last k columns of its array argument A.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ?geqlf.
work REAL for sorg2l
DOUBLE PRECISION for dorg2l
COMPLEX for cung2l
DOUBLE COMPLEX for zung2l.
Workspace array, DIMENSION (n).
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value
?org2r/?ung2r
Generates all or part of the orthogonal/unitary matrix
Q from a QR factorization determined by ?geqrf
(unblocked algorithm).
Syntax
call sorg2r( m, n, k, a, lda, tau, work, info )
call dorg2r( m, n, k, a, lda, tau, work, info )
call cung2r( m, n, k, a, lda, tau, work, info )
call zung2r( m, n, k, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine ?org2r/?ung2r generates an m-by-n real/complex matrix Q with orthonormal columns, which is
defined as the first n columns of a product of k elementary reflectors of order m
Q = H(1)*H(2)*...*H(k)
as returned by ?geqrf.
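As an illustration only, the following Python sketch forms Q explicitly from a set of reflectors. It assumes the usual LAPACK storage convention for ?geqrf output, H(i) = I - tau(i)*v*v^T with v(1:i-1) = 0, v(i) = 1 and v(i+1:m) stored below the diagonal in column i of a. The function name form_q_from_reflectors is hypothetical, and this is not the MKL implementation.

```python
def form_q_from_reflectors(m, n, a, tau):
    """Return the first n columns of Q = H(1)*H(2)*...*H(k), k = len(tau).

    a is an m-by-k list of rows; column i holds v(i+1:m) below the diagonal.
    Assumed convention: H(i) = I - tau[i] * v * v^T, v[i] = 1, v[:i] = 0.
    """
    k = len(tau)
    # Start from the first n columns of the m-by-m identity.
    q = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(m)]
    # Apply H(k), ..., H(1) from the left so that q = H(1)*...*H(k)*I(:, :n).
    for i in reversed(range(k)):
        v = [0.0] * i + [1.0] + [a[r][i] for r in range(i + 1, m)]
        for c in range(n):
            dot = sum(v[r] * q[r][c] for r in range(m))
            for r in range(m):
                q[r][c] -= tau[i] * v[r] * dot
    return q


# One reflector with v = (1, 1) and tau = 1 gives H = [[0, -1], [-1, 0]].
q = form_q_from_reflectors(2, 2, [[0.0], [1.0]], [1.0])
```

The sketch works on full storage and returns a new matrix, whereas the routine documented here takes the reflectors in the array a.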
Input Parameters
On entry, the i-th column must contain the vector which defines the
elementary reflector H(i), for i = 1,2,..., k, as returned by ?geqrf in the
first k columns of its array argument a.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ?geqrf.
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value
?orgl2/?ungl2
Generates all or part of the orthogonal/unitary matrix
Q from an LQ factorization determined by ?gelqf
(unblocked algorithm).
Syntax
call sorgl2( m, n, k, a, lda, tau, work, info )
call dorgl2( m, n, k, a, lda, tau, work, info )
call cungl2( m, n, k, a, lda, tau, work, info )
call zungl2( m, n, k, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine ?orgl2/?ungl2 generates an m-by-n real/complex matrix Q with orthonormal rows, which is
defined as the first m rows of a product of k elementary reflectors of order n:
Q = H(k)*...*H(2)*H(1) for real flavors, or Q = (H(k))^H*...*(H(2))^H*(H(1))^H for complex flavors, as
returned by ?gelqf.
Input Parameters
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ?gelqf.
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value.
?orgr2/?ungr2
Generates all or part of the orthogonal/unitary matrix
Q from an RQ factorization determined by ?gerqf
(unblocked algorithm).
Syntax
call sorgr2( m, n, k, a, lda, tau, work, info )
call dorgr2( m, n, k, a, lda, tau, work, info )
call cungr2( m, n, k, a, lda, tau, work, info )
call zungr2( m, n, k, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine ?orgr2/?ungr2 generates an m-by-n real/complex matrix Q with orthonormal rows, which is defined as
the last m rows of a product of k elementary reflectors of order n:
Q = H(1)*H(2)*...*H(k) for real flavors, or Q = (H(1))^H*(H(2))^H*...*(H(k))^H for complex flavors, as
returned by ?gerqf.
Input Parameters
k INTEGER.
The number of elementary reflectors whose product defines the matrix Q.
m ≥ k ≥ 0.
On entry, the (m-k+i)-th row must contain the vector which defines the
elementary reflector H(i), for i = 1,2,..., k, as returned by ?gerqf in
the last k rows of its array argument a.
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i), as returned by ?gerqf.
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value
?orm2l/?unm2l
Multiplies a general matrix by the orthogonal/unitary
matrix from a QL factorization determined by ?geqlf
(unblocked algorithm).
Syntax
call sorm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dorm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
The routine ?orm2l/?unm2l overwrites the general real/complex m-by-n matrix C with
Input Parameters
side CHARACTER*1.
trans CHARACTER*1.
= 'N': apply Q (no transpose)
if side = 'R', n ≥ k ≥ 0.
The i-th column must contain the vector which defines the elementary
reflector H(i), for i = 1,2,..., k, as returned by ?geqlf in the last k
columns of its array argument a. The array a is modified by the routine but
restored on exit.
ldc INTEGER. The leading dimension of the array C. ldc ≥ max(1,m).
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument had an illegal value
?orm2r/?unm2r
Multiplies a general matrix by the orthogonal/unitary
matrix from a QR factorization determined by ?geqrf
(unblocked algorithm).
Syntax
call sorm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dorm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
The routine ?orm2r/?unm2r overwrites the general real/complex m-by-n matrix C with
Input Parameters
side CHARACTER*1.
= 'L': apply Q or QT / QH from the left
trans CHARACTER*1.
= 'N': apply Q (no transpose)
if side = 'R', n ≥ k ≥ 0.
The i-th column must contain the vector which defines the elementary
reflector H(i), for i = 1,2,..., k, as returned by ?geqrf in the first k
columns of its array argument a. The array a is modified by the routine but
restored on exit.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ?geqrf.
DOUBLE COMPLEX for zunm2r.
Array, DIMENSION (ldc, n).
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument had an illegal value
?orml2/?unml2
Multiplies a general matrix by the orthogonal/unitary
matrix from an LQ factorization determined by ?gelqf
(unblocked algorithm).
Syntax
call sorml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dorml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
The routine ?orml2/?unml2 overwrites the general real/complex m-by-n matrix C with
Input Parameters
side CHARACTER*1.
= 'L': apply Q or QT / QH from the left
trans CHARACTER*1.
= 'N': apply Q (no transpose)
if side = 'R', n ≥ k ≥ 0.
The i-th row must contain the vector which defines the elementary reflector
H(i), for i = 1,2,..., k, as returned by ?gelqf in the first k rows of its
array argument a. The array a is modified by the routine but restored on
exit.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ?gelqf.
c REAL for sorml2
DOUBLE PRECISION for dorml2
COMPLEX for cunml2
DOUBLE COMPLEX for zunml2.
Array, DIMENSION (ldc, n). On entry, the m-by-n matrix C.
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument had an illegal value
?ormr2/?unmr2
Multiplies a general matrix by the orthogonal/unitary
matrix from an RQ factorization determined by ?gerqf
(unblocked algorithm).
Syntax
call sormr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dormr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunmr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunmr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
The routine ?ormr2/?unmr2 overwrites the general real/complex m-by-n matrix C with
C*QT / C*QH if side = 'R' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors).
Here Q is a real orthogonal or complex unitary matrix defined as the product of k elementary reflectors
Q = H(1)*H(2)*...*H(k) for real flavors, or Q = (H(1))^H*(H(2))^H*...*(H(k))^H for complex flavors, as returned by ?gerqf.
Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
side CHARACTER*1.
= 'L': apply Q or QT / QH from the left
trans CHARACTER*1.
= 'N': apply Q (no transpose)
if side = 'R', n ≥ k ≥ 0.
The i-th row must contain the vector which defines the elementary reflector
H(i), for i = 1,2,...,k, as returned by ?gerqf in the last k rows of its
array argument a. The array a is modified by the routine but restored on
exit.
lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,k).
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ?gerqf.
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument had an illegal value
?ormr3/?unmr3
Multiplies a general matrix by the orthogonal/unitary
matrix from an RZ factorization determined by ?tzrzf
(unblocked algorithm).
Syntax
call sormr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
call dormr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
call cunmr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
call zunmr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
The routine ?ormr3/?unmr3 overwrites the general real/complex m-by-n matrix C with
Input Parameters
side CHARACTER*1.
= 'L': apply Q or QT / QH from the left
trans CHARACTER*1.
= 'N': apply Q (no transpose)
if side = 'R', n ≥ k ≥ 0.
if side = 'R', n ≥ l ≥ 0.
The i-th row must contain the vector which defines the elementary reflector
H(i), for i = 1,2,...,k, as returned by ?tzrzf in the last k rows of its
array argument a. The array a is modified by the routine but restored on
exit.
lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,k).
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ?tzrzf.
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument had an illegal value
?pbtf2
Computes the Cholesky factorization of a symmetric/
Hermitian positive-definite band matrix (unblocked
algorithm).
Syntax
call spbtf2( uplo, n, kd, ab, ldab, info )
call dpbtf2( uplo, n, kd, ab, ldab, info )
Include Files
mkl.fi
Description
The routine computes the Cholesky factorization of a real symmetric or complex Hermitian positive definite
band matrix A.
The factorization has the form
A = UT*U for real flavors, A = UH*U for complex flavors if uplo = 'U', or
A = L*LT for real flavors, A = L*LH for complex flavors if uplo = 'L',
where U is an upper triangular matrix, and L is lower triangular. This is the unblocked version of the
algorithm, calling BLAS Level 2 Routines.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix A is stored:
= 'U': upper triangular
kd ≥ 0.
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
> 0: if info = k, the leading minor of order k is not positive definite, and
the factorization could not be completed.
?potf2
Computes the Cholesky factorization of a symmetric/
Hermitian positive-definite matrix (unblocked
algorithm).
Syntax
call spotf2( uplo, n, a, lda, info )
call dpotf2( uplo, n, a, lda, info )
call cpotf2( uplo, n, a, lda, info )
call zpotf2( uplo, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?potf2 computes the Cholesky factorization of a real symmetric or complex Hermitian positive
definite matrix A. The factorization has the form
A = UT*U for real flavors, A = UH*U for complex flavors if uplo = 'U', or
A = L*LT for real flavors, A = L*LH for complex flavors if uplo = 'L',
where U is an upper triangular matrix, and L is lower triangular.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines.
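To make the factorization concrete, here is a minimal Python sketch of an unblocked Cholesky factorization for the real, uplo = 'L' case (A = L*LT). It is an illustration on full storage, not the MKL code; the function name cholesky_lower is hypothetical, and the returned status follows the info convention of ?potf2 (0 on success, k if the leading minor of order k is not positive definite).

```python
import math


def cholesky_lower(a):
    """Return (l, info) with a = l * l^T for a real symmetric matrix a.

    info = 0 on success; info = k (1-based) if the leading minor of
    order k is not positive definite. Only the lower triangle of a is read.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: a(j,j) minus the squared entries already in row j.
        s = a[j][j] - sum(l[j][p] ** 2 for p in range(j))
        if s <= 0.0:
            return l, j + 1
        l[j][j] = math.sqrt(s)
        # Entries of column j below the diagonal.
        for i in range(j + 1, n):
            t = a[i][j] - sum(l[i][p] * l[j][p] for p in range(j))
            l[i][j] = t / l[j][j]
    return l, 0


l, info = cholesky_lower([[4.0, 2.0], [2.0, 5.0]])
```

Unlike the routine, which overwrites the stored triangle of a, the sketch returns the factor in a separate list.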
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix A is stored.
= 'U': upper triangular
Output Parameters
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
> 0: if info = k, the leading minor of order k is not positive definite, and
the factorization could not be completed.
?ptts2
Solves a tridiagonal system of the form A*X=B using
the L*D*LT/L*D*LH factorization computed by ?pttrf.
Syntax
call sptts2( n, nrhs, d, e, b, ldb )
call dptts2( n, nrhs, d, e, b, ldb )
call cptts2( iuplo, n, nrhs, d, e, b, ldb )
call zptts2( iuplo, n, nrhs, d, e, b, ldb )
Include Files
mkl.fi
Description
The routine ?ptts2 solves a tridiagonal system of the form
A*X = B
Real flavors sptts2/dptts2 use the L*D*LT factorization of A computed by spttrf/dpttrf, and complex
flavors cptts2/zptts2 use the UH*D*U or L*D*LH factorization of A computed by cpttrf/zpttrf.
D is a diagonal matrix specified in the vector d, U (or L) is a unit bidiagonal matrix whose superdiagonal
(subdiagonal) is specified in the vector e, and X and B are n-by-nrhs matrices.
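For the real case, the solve can be sketched in Python as a forward substitution with L followed by a combined diagonal scaling and back substitution with LT. This is an illustrative sketch for a single right-hand side with n ≥ 1; the function name solve_ldlt_tridiagonal is hypothetical and the code is not taken from MKL.

```python
def solve_ldlt_tridiagonal(d, e, b):
    """Solve A*x = b where A = L*D*L^T and L is unit lower bidiagonal.

    d: the n diagonal entries of D; e: the n-1 subdiagonal entries of L;
    b: one right-hand side column. Returns x as a new list (n >= 1).
    """
    n = len(d)
    x = list(b)
    # Forward substitution: solve L*y = b.
    for i in range(1, n):
        x[i] -= e[i - 1] * x[i - 1]
    # Diagonal scaling and back substitution: solve D*L^T*x = y.
    x[n - 1] /= d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = x[i] / d[i] - e[i] * x[i + 1]
    return x


# A = [[2, 1], [1, 2.5]] factors as d = [2, 2], e = [0.5].
x = solve_ldlt_tridiagonal([2.0, 2.0], [0.5], [3.0, 3.5])
```

The routine itself overwrites b with the solution and handles nrhs columns; the sketch keeps one column and returns a copy for clarity.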
Input Parameters
nrhs INTEGER. The number of right hand sides, that is, the number of columns
of the matrix B. nrhs ≥ 0.
Contains the (n-1) subdiagonal elements of the unit bidiagonal factor L from
the L*D*LT (for real flavors) or L*D*LH (for complex flavors when iuplo =
0) factorization of A.
For complex flavors when iuplo = 1, e contains the (n-1) superdiagonal
elements of the unit bidiagonal factor U from the factorization A = UH*D*U.
On entry, the right hand side vectors B for the system of linear equations.
Output Parameters
?rscl
Multiplies a vector by the reciprocal of a real scalar.
Syntax
call srscl( n, sa, sx, incx )
call drscl( n, sa, sx, incx )
call csrscl( n, sa, sx, incx )
call zdrscl( n, sa, sx, incx )
Include Files
mkl.fi
Description
The routine ?rscl multiplies an n-element real/complex vector x by the real scalar 1/a. This is done without
overflow or underflow as long as the final result x/a does not overflow or underflow.
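One way to obtain this behavior is to apply the factor 1/a in stages, so that no intermediate multiplier overflows or underflows. The Python sketch below illustrates the idea for a real vector and a nonzero a. The staging resembles the approach used in the reference LAPACK sources, but treat that as an assumption rather than a description of the MKL implementation; the function name rscl and the variable names are illustrative.

```python
import sys


def rscl(x, a):
    """Return x scaled by 1/a, applying the factor in safe stages.

    Assumption: a != 0. smlnum is the smallest normal double and
    bignum is its reciprocal.
    """
    smlnum = sys.float_info.min
    bignum = 1.0 / smlnum
    x = list(x)
    cden, cnum = a, 1.0  # the factor still to apply is cnum / cden
    done = False
    while not done:
        cden1 = cden * smlnum
        cnum1 = cnum / bignum
        if abs(cden1) > abs(cnum) and cnum != 0.0:
            # Denominator is huge: pre-multiply by smlnum first.
            mul, cden = smlnum, cden1
        elif abs(cnum1) > abs(cden):
            # Denominator is tiny: pre-multiply by bignum first.
            mul, cnum = bignum, cnum1
        else:
            # cnum / cden is now safely representable.
            mul, done = cnum / cden, True
        x = [mul * xi for xi in x]
    return x


y = rscl([2.0, 4.0], 2.0)
```

With a subnormal a such as 1e-310, forming 1/a directly would overflow, while the staged version still returns the representable quotient.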
Input Parameters
incx INTEGER. The increment between successive values of the vector sx.
If incx > 0, sx(1)=x(1), and sx(1+(i-1)*incx)=x(i), 1 < i ≤ n.
Output Parameters
?syswapr
Applies an elementary permutation on the rows and
columns of a symmetric matrix.
Syntax
call ssyswapr( uplo, n, a, lda, i1, i2 )
call dsyswapr( uplo, n, a, lda, i1, i2 )
call csyswapr( uplo, n, a, lda, i1, i2 )
call zsyswapr( uplo, n, a, lda, i1, i2 )
call syswapr( a,i1,i2[,uplo] )
Include Files
mkl.fi, lapack.f90
Description
The routine applies an elementary permutation on the rows and columns of a symmetric matrix.
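The effect of such a permutation is easiest to see on a full symmetric matrix: rows i1 and i2 are exchanged, then columns i1 and i2, so the result is again symmetric. The Python sketch below shows only this effect on full storage; the routine itself works on one stored triangle of a matrix factored by ?sytrf. The function name swap_symmetric is hypothetical.

```python
def swap_symmetric(a, i1, i2):
    """Swap rows i1 and i2 and columns i1 and i2 of a full square matrix.

    Indices are 1-based to match the Fortran convention. The input is
    left unchanged; a new list of rows is returned.
    """
    p, q = i1 - 1, i2 - 1
    b = [row[:] for row in a]
    b[p], b[q] = b[q], b[p]  # swap the two rows
    for row in b:
        row[p], row[q] = row[q], row[p]  # swap the two columns
    return b


a = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 5.0],
     [3.0, 5.0, 6.0]]
b = swap_symmetric(a, 1, 3)
```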
Input Parameters
The data types are given for the Fortran interface.
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = L*D*LT.
The array a contains the block diagonal matrix D and the multipliers used to
obtain the factor U or L as computed by ?sytrf.
Output Parameters
If uplo = 'U', the upper triangular part of the inverse is formed and the part of
A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and the part of
A above the diagonal is not referenced.
uplo Indicates how the matrix A has been factored. Must be 'U' or 'L'.
See Also
?sytrf
?heswapr
Applies an elementary permutation on the rows and
columns of a Hermitian matrix.
Syntax
call cheswapr( uplo, n, a, lda, i1, i2 )
call zheswapr( uplo, n, a, lda, i1, i2 )
call heswapr( a, i1, i2 [,uplo] )
Include Files
mkl.fi, lapack.f90
Description
The routine applies an elementary permutation on the rows and columns of a Hermitian matrix.
Input Parameters
The data types are given for the Fortran interface.
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = L*D*LH.
The array a contains the block diagonal matrix D and the multipliers used to
obtain the factor U or L as computed by ?hetrf.
i1 INTEGER. Index of the first row to swap.
Output Parameters
If uplo = 'U', the upper triangular part of the inverse is formed and the part of
A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and the part of
A above the diagonal is not referenced.
See Also
?hetrf
?syswapr1
?syswapr1
Applies an elementary permutation on the rows and
columns of a symmetric matrix.
Syntax
call ssyswapr1( uplo, n, a, lda, i1, i2 )
call dsyswapr1( uplo, n, a, lda, i1, i2 )
call csyswapr1( uplo, n, a, lda, i1, i2 )
call zsyswapr1( uplo, n, a, lda, i1, i2 )
call syswapr1( a,i1,i2[,uplo] )
Include Files
mkl.fi, lapack.f90
Description
The routine applies an elementary permutation on the rows and columns of a symmetric matrix.
Input Parameters
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = L*D*LT.
The array a contains the block diagonal matrix D and the multipliers used to
obtain the factor U or L as computed by ?sytrf.
Output Parameters
If uplo = 'U', the upper triangular part of the inverse is formed and the part of
A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and the part of
A above the diagonal is not referenced.
uplo Indicates how the matrix A has been factored. Must be 'U' or 'L'.
See Also
?sytrf
?sygs2/?hegs2
Reduces a symmetric/Hermitian positive-definite
generalized eigenproblem to standard form, using the
factorization results obtained from ?potrf (unblocked
algorithm).
Syntax
call ssygs2( itype, uplo, n, a, lda, b, ldb, info )
call dsygs2( itype, uplo, n, a, lda, b, ldb, info )
call chegs2( itype, uplo, n, a, lda, b, ldb, info )
call zhegs2( itype, uplo, n, a, lda, b, ldb, info )
Include Files
mkl.fi
Description
The routine ?sygs2/?hegs2 reduces a real symmetric-definite or a complex Hermitian positive-definite
generalized eigenproblem to standard form.
If itype = 1, the problem is
A*x = λ*B*x
and A is overwritten by inv(UH)*A*inv(U) or inv(L)*A*inv(LH) for complex flavors and by
inv(UT)*A*inv(U) or inv(L)*A*inv(LT) for real flavors.
If itype = 2 or 3, the problem is
Input Parameters
itype INTEGER.
For complex flavors:
= 1: compute inv(UH)*A*inv(U) or inv(L)*A*inv(LH);
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
symmetric/Hermitian matrix A is stored, and how B has been factorized.
lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,n).
Output Parameters
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
?sytd2/?hetd2
Reduces a symmetric/Hermitian matrix to real
symmetric tridiagonal form by an orthogonal/unitary
similarity transformation (unblocked algorithm).
Syntax
call ssytd2( uplo, n, a, lda, d, e, tau, info )
call dsytd2( uplo, n, a, lda, d, e, tau, info )
call chetd2( uplo, n, a, lda, d, e, tau, info )
call zhetd2( uplo, n, a, lda, d, e, tau, info )
Include Files
mkl.fi
Description
The routine ?sytd2/?hetd2 reduces a real symmetric/complex Hermitian matrix A to real symmetric
tridiagonal form T by an orthogonal/unitary similarity transformation: QT*A*Q = T (QH*A*Q = T for complex flavors).
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix A is stored:
= 'U': upper triangular
Output Parameters
The first n-1 elements contain scalar factors of the elementary reflectors.
tau(n) is used as workspace.
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument had an illegal value.
?sytf2
Computes the factorization of a real/complex
symmetric indefinite matrix, using the diagonal
pivoting method (unblocked algorithm).
Syntax
call ssytf2( uplo, n, a, lda, ipiv, info )
call dsytf2( uplo, n, a, lda, ipiv, info )
call csytf2( uplo, n, a, lda, ipiv, info )
call zsytf2( uplo, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine ?sytf2 computes the factorization of a real/complex symmetric matrix A using the Bunch-
Kaufman diagonal pivoting method:
A = U*D*UT, or A = L*D*LT,
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, and D is symmetric
and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored
= 'U': upper triangular
lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,n).
Output Parameters
a On exit, the block diagonal matrix D and the multipliers used to obtain the
factor U or L.
ipiv INTEGER.
Array, DIMENSION (n).
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, then rows and columns
k-1 and -ipiv(k) were interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, then rows and columns
k+1 and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument has an illegal value
> 0: if info = k, D(k,k) is exactly zero. The factorization is completed, but
the block diagonal matrix D is exactly singular, and division by zero will
occur if it is used to solve a system of equations.
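The ipiv encoding described for ?sytf2 can be decoded with a short scan. The Python sketch below lists the diagonal blocks of D from ipiv. It assumes, in addition to the 2-by-2 rule stated in the ipiv description, that a positive ipiv(k) marks a 1-by-1 block, which is the usual convention for this family of routines. The function name diagonal_blocks is hypothetical.

```python
def diagonal_blocks(ipiv, uplo):
    """List the diagonal blocks of D as (first_index, size), 1-based.

    Assumption: a positive ipiv(k) marks a 1-by-1 block, and a pair of
    equal negative entries marks a 2-by-2 block.
    """
    n = len(ipiv)
    blocks = []
    if uplo == 'U':
        # Scan from the bottom: a negative ipiv(k) pairs k with k-1.
        k = n
        while k >= 1:
            if ipiv[k - 1] > 0:
                blocks.append((k, 1))
                k -= 1
            else:
                blocks.append((k - 1, 2))
                k -= 2
        blocks.reverse()
    else:
        # Scan from the top: a negative ipiv(k) pairs k with k+1.
        k = 1
        while k <= n:
            if ipiv[k - 1] > 0:
                blocks.append((k, 1))
                k += 1
            else:
                blocks.append((k, 2))
                k += 2
    return blocks


blocks = diagonal_blocks([1, -3, -3, 4], 'L')
```

A solver that consumes the factorization needs the same scan to decide whether to divide by a scalar or to invert a 2-by-2 block at each step.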
?sytf2_rook
Computes the factorization of a real/complex
symmetric indefinite matrix, using the bounded
Bunch-Kaufman diagonal pivoting method (unblocked
algorithm).
Syntax
call ssytf2_rook( uplo, n, a, lda, ipiv, info )
call dsytf2_rook( uplo, n, a, lda, ipiv, info )
call csytf2_rook( uplo, n, a, lda, ipiv, info )
call zsytf2_rook( uplo, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine ?sytf2_rook computes the factorization of a real/complex symmetric matrix A using the
bounded Bunch-Kaufman ("rook") diagonal pivoting method:
A = U*D*UT, or A = L*D*LT,
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, and D is symmetric
and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored
= 'U': upper triangular
a REAL for ssytf2_rook
DOUBLE PRECISION for dsytf2_rook
COMPLEX for csytf2_rook
DOUBLE COMPLEX for zsytf2_rook.
Array, DIMENSION (lda, n).
lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,n).
Output Parameters
a On exit, the block diagonal matrix D and the multipliers used to obtain the
factor U or L.
ipiv INTEGER.
Array, DIMENSION (n).
If uplo = 'L' and ipiv(k) < 0 and ipiv(k + 1) < 0, then rows and
columns k and -ipiv(k) were interchanged, rows and columns k + 1 and
-ipiv(k + 1) were interchanged, and D(k:k+1, k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument has an illegal value
> 0: if info = k, D(k,k) is exactly zero. The factorization is completed, but
the block diagonal matrix D is exactly singular, and division by zero will
occur if it is used to solve a system of equations.
?hetf2
Computes the factorization of a complex Hermitian
matrix, using the diagonal pivoting method (unblocked
algorithm).
Syntax
call chetf2( uplo, n, a, lda, ipiv, info )
call zhetf2( uplo, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine computes the factorization of a complex Hermitian matrix A using the Bunch-Kaufman diagonal
pivoting method:
A = U*D*U^H or A = L*D*L^H
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, U^H is the conjugate
transpose of U, and D is Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is stored:
= 'U': Upper triangular
Output Parameters
a On exit, the block diagonal matrix D and the multipliers used to obtain the
factor U or L.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, then rows and columns
k-1 and -ipiv(k) were interchanged and D(k-1:k, k-1:k) is a 2-by-2 diagonal
block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, then rows and columns
k+1 and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
?hetf2_rook
Computes the factorization of a complex Hermitian
matrix, using the bounded Bunch-Kaufman diagonal
pivoting method (unblocked algorithm).
Syntax
call chetf2_rook( uplo, n, a, lda, ipiv, info )
call zhetf2_rook( uplo, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine computes the factorization of a complex Hermitian matrix A using the bounded Bunch-Kaufman
("rook") diagonal pivoting method:
A = U*D*U^H or A = L*D*L^H
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, U^H is the conjugate
transpose of U, and D is Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is stored:
= 'U': Upper triangular
Output Parameters
a On exit, the block diagonal matrix D and the multipliers used to obtain the
factor U or L.
If uplo = 'L' and ipiv(k) < 0 and ipiv(k + 1) < 0, then rows and
columns k and -ipiv(k) were interchanged, rows and columns k + 1 and
-ipiv(k + 1) were interchanged, and D(k:k+1, k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
?tgex2
Swaps adjacent diagonal blocks in an upper (quasi)
triangular matrix pair by an orthogonal/unitary
equivalence transformation.
Syntax
call stgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, n1, n2, work,
lwork, info )
call dtgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, n1, n2, work,
lwork, info )
call ctgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, info )
call ztgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, info )
Include Files
mkl.fi
Description
The real routines stgex2/dtgex2 swap adjacent diagonal blocks (A11, B11) and (A22, B22) of size 1-by-1 or
2-by-2 in an upper (quasi) triangular matrix pair (A, B) by an orthogonal equivalence transformation. (A, B)
must be in generalized real Schur canonical form (as returned by sgges/dgges), that is, A is block upper
triangular with 1-by-1 and 2-by-2 diagonal blocks. B is upper triangular.
The complex routines ctgex2/ztgex2 swap adjacent diagonal 1-by-1 blocks (A11, B11) and (A22, B22) in
an upper triangular matrix pair (A, B) by a unitary equivalence transformation.
(A, B) must be in generalized Schur canonical form, that is, A and B are both upper triangular.
All routines optionally update the matrices Q and Z of generalized Schur vectors:
For real flavors,
Q(in)*A(in)*Z(in)^T = Q(out)*A(out)*Z(out)^T
Q(in)*B(in)*Z(in)^T = Q(out)*B(out)*Z(out)^T.
For complex flavors,
Q(in)*A(in)*Z(in)^H = Q(out)*A(out)*Z(out)^H
Q(in)*B(in)*Z(in)^H = Q(out)*B(out)*Z(out)^H.
Input Parameters
wantq LOGICAL.
If wantq = .TRUE. : update the left transformation matrix Q;
wantz LOGICAL.
If wantz = .TRUE. : update the right transformation matrix Z;
j1 INTEGER.
The index to the first block (A11, B11). 1 ≤ j1 ≤ n.
n1 INTEGER. Used with real flavors only. The order of the first block (A11,
B11). n1 = 0, 1 or 2.
n2 INTEGER. Used with real flavors only. The order of the second block (A22,
B22). n2 = 0, 1 or 2.
Output Parameters
info INTEGER.
= 0: Successful exit.
For stgex2/dtgex2:
If info = 1, the transformed matrix (A, B) would be too far from generalized Schur form; the blocks are
not swapped and (A, B) and (Q, Z) are unchanged. The problem of swapping is too ill-conditioned.
If info = -16, lwork is too small. The appropriate value for lwork is returned in work(1).
For ctgex2/ztgex2:
If info = 1, the transformed matrix pair (A, B) would be too far from
generalized Schur form; the problem is ill-conditioned.
?tgsy2
Solves the generalized Sylvester equation (unblocked
algorithm).
Syntax
call stgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
scale, rdsum, rdscal, iwork, pq, info )
call dtgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
scale, rdsum, rdscal, iwork, pq, info )
call ctgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
scale, rdsum, rdscal, iwork, pq, info )
call ztgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
scale, rdsum, rdscal, iwork, pq, info )
Include Files
mkl.fi
Description
The routine ?tgsy2 solves the generalized Sylvester equation:
A*R-L*B=scale*C (1)
D*R-L*E=scale*F
using Level 1 and 2 BLAS, where R and L are unknown m-by-n matrices, (A, D), (B, E) and (C, F) are given
matrix pairs of size m-by-m, n-by-n and m-by-n, respectively. For stgsy2/dtgsy2, pairs (A, D) and (B, E)
must be in generalized Schur canonical form, that is, A, B are upper quasi triangular and D, E are upper
triangular. For ctgsy2/ztgsy2, matrices A, B, D and E are upper triangular (that is, (A, D) and (B, E) in
generalized Schur form).
The solution (R, L) overwrites (C, F).
0 ≤ scale ≤ 1 is an output scaling factor chosen to avoid overflow.
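In matrix notation, solving equation (1) corresponds to solving
Z*x = scale*b
where Z is defined for real flavors as
Z = [ kron(In, A)  -kron(B^T, Im) ]   (2)
    [ kron(In, D)  -kron(E^T, Im) ]
and for complex flavors as
Z = [ kron(In, A)  -kron(B^H, Im) ]   (3)
    [ kron(In, D)  -kron(E^H, Im) ]
Here Ik is the identity matrix of size k and X^T (X^H) is the transpose (conjugate transpose) of X. kron(X, Y)
denotes the Kronecker product between the matrices X and Y.
For real flavors, if trans = 'T', solve the transposed system
Z^T*y = scale*b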
for y, which is equivalent to solving for R and L in
A^T*R + D^T*L = scale*C (4)
R*B^T + L*E^T = scale*(-F)
For complex flavors, if trans = 'C', solve the conjugate transposed system
Z^H*y = scale*b
for y, which is equivalent to solving for R and L in
A^H*R + D^H*L = scale*C (5)
R*B^H + L*E^H = scale*(-F)
These cases are used to compute an estimate of Dif[(A,D),(B,E)] = sigma_min(Z) using reverse
communication with ?lacon.
?tgsy2 also (for ijob ≥ 1) contributes to the computation in ?tgsyl of an upper bound on the separation
between two matrix pairs. Then the input (A, D), (B, E) are sub-pencils of the matrix pair (two matrix pairs)
in ?tgsyl. See ?tgsyl for details.
Input Parameters
trans CHARACTER*1.
If trans = 'N', solve the generalized Sylvester equation (1);
Arrays, DIMENSION (lda, m) and (ldb, n), respectively. On entry, a contains
an upper (quasi) triangular matrix A, and b contains an upper (quasi)
triangular matrix B.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1, m).
ldb INTEGER.
The leading dimension of the array b. ldb ≥ max(1, n).
ldd INTEGER. The leading dimension of the array d. ldd ≥ max(1, m).
lde INTEGER. The leading dimension of the array e. lde ≥ max(1, n).
ldf INTEGER. The leading dimension of the array f. ldf ≥ max(1, m).
Output Parameters
rdsum On exit, the corresponding sum of squares updated with the contributions
from the current sub-system.
If trans = 'T', rdsum is not touched.
Note that rdsum only makes sense when ?tgsy2 is called by ?tgsyl.
Note that rdscal only makes sense when ?tgsy2 is called by ?tgsyl.
> 0: The matrix pairs (A, D) and (B, E) have common or very close
eigenvalues.
?trti2
Computes the inverse of a triangular matrix
(unblocked algorithm).
Syntax
call strti2( uplo, diag, n, a, lda, info )
call dtrti2( uplo, diag, n, a, lda, info )
call ctrti2( uplo, diag, n, a, lda, info )
call ztrti2( uplo, diag, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?trti2 computes the inverse of a real/complex upper or lower triangular matrix.
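As an illustration, the sketch below inverts a small upper triangular matrix in place with dtrti2 and multiplies the result by a saved copy of the original. The program name and matrix values are assumptions made for this example; the routine is declared external rather than through mkl.fi, and linking against Intel MKL or another LAPACK library is assumed.

```fortran
program trti2_demo
  implicit none
  integer, parameter :: n = 3, lda = n
  double precision :: a(lda, n), a0(lda, n), prod(n, n)
  integer :: info, i, j
  external :: dtrti2

  ! Upper triangular, non-unit test matrix (column-major fill).
  a = reshape([ 2.0d0, 0.0d0, 0.0d0, &
                1.0d0, 4.0d0, 0.0d0, &
                3.0d0, 5.0d0, 8.0d0 ], [lda, n])
  a0 = a   ! keep the original, since a is overwritten

  call dtrti2('U', 'N', n, a, lda, info)
  if (info /= 0) then
     print *, 'dtrti2 failed, info =', info
     stop 1
  end if

  ! a0 * inv(a0) should be close to the identity matrix.
  prod = matmul(a0, a)
  do i = 1, n
     print '(3f10.6)', (prod(i, j), j = 1, n)
  end do
end program trti2_demo
```

The strictly lower triangle of a is zero on entry here, so the full-matrix product with matmul is a valid check; in general only the triangle selected by uplo is meaningful.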
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': upper triangular
diag CHARACTER*1.
Specifies whether or not the matrix A is unit triangular.
= 'N': non-unit triangular
Output Parameters
a On exit, the (triangular) inverse of the original matrix, in the same storage
format.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
clag2z
Converts a complex single precision matrix to a
complex double precision matrix.
Syntax
call clag2z( m, n, sa, ldsa, a, lda, info )
Include Files
mkl.fi
Description
This routine converts a complex single precision matrix SA to a complex double precision matrix A.
Note that while it is possible to overflow while converting from double to single, it is not possible to overflow
when converting from single to double.
This is an auxiliary routine so there is no argument checking.
Input Parameters
ldsa INTEGER. The leading dimension of the array sa; ldsa ≥ max(1, m).
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, m).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
dlag2s
Converts a double precision matrix to a single
precision matrix.
Syntax
call dlag2s( m, n, a, lda, sa, ldsa, info )
Include Files
mkl.fi
Description
This routine converts a double precision matrix A to a single precision matrix SA.
RMAX is the overflow threshold for single precision arithmetic. dlag2s checks that all the entries of A are between
-RMAX and RMAX. If not, the conversion is aborted and a flag is raised.
This is an auxiliary routine so there is no argument checking.
Input Parameters
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, m).
ldsa INTEGER. The leading dimension of the array sa; ldsa ≥ max(1, m).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
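The range check described above can be exercised with a sketch such as the following, which converts one matrix whose entries fit in single precision and one that contains a value beyond the single precision range. The program name and data are illustrative assumptions; dlag2s is declared external and is assumed to come from Intel MKL or another LAPACK library.

```fortran
program lag2s_demo
  implicit none
  integer, parameter :: m = 2, n = 2, lda = m, ldsa = m
  double precision :: a(lda, n)
  real :: sa(ldsa, n)
  integer :: info
  external :: dlag2s

  ! All entries fit in single precision: the conversion should succeed.
  a = reshape([ 1.0d0, 2.5d0, -3.0d0, 4.0d0 ], [lda, n])
  call dlag2s(m, n, a, lda, sa, ldsa, info)
  print *, 'in-range input:     info =', info

  ! 1.0d300 lies far above the single precision overflow threshold.
  a(2, 1) = 1.0d300
  call dlag2s(m, n, a, lda, sa, ldsa, info)
  print *, 'out-of-range input: info =', info
  if (info /= 0) print *, 'conversion aborted; contents of sa should not be used'
end program lag2s_demo
```

Because the routine performs no argument checking, the caller is responsible for passing consistent dimensions; only the overflow condition is reported through info.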
slag2d
Converts a single precision matrix to a double
precision matrix.
Syntax
call slag2d( m, n, sa, ldsa, a, lda, info )
Include Files
mkl.fi
Description
The routine converts a single precision matrix SA to a double precision matrix A.
Note that while it is possible to overflow while converting from double to single, it is not possible to overflow
when converting from single to double.
This is an auxiliary routine so there is no argument checking.
Input Parameters
ldsa INTEGER. The leading dimension of the array sa; ldsa ≥ max(1, m).
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, m).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
zlag2c
Converts a complex double precision matrix to a
complex single precision matrix.
Syntax
call zlag2c( m, n, a, lda, sa, ldsa, info )
Include Files
mkl.fi
Description
The routine converts a double precision complex matrix A to a single precision complex matrix SA.
RMAX is the overflow threshold for single precision arithmetic. zlag2c checks that all the entries of A are between
-RMAX and RMAX. If not, the conversion is aborted and a flag is raised.
This is an auxiliary routine so there is no argument checking.
Input Parameters
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, m).
ldsa INTEGER. The leading dimension of the array sa; ldsa ≥ max(1, m).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
If info = 1, an entry of the matrix A is greater than the single precision
overflow threshold; in this case, the content of sa on exit is unspecified.
?larfp
Generates a real or complex elementary reflector.
Syntax
call slarfp(n, alpha, x, incx, tau)
call dlarfp(n, alpha, x, incx, tau)
call clarfp(n, alpha, x, incx, tau)
call zlarfp(n, alpha, x, incx, tau)
Include Files
mkl.fi
Description
The ?larfp routines generate a real or complex elementary reflector H of order n, such that
H * ( alpha ) = ( beta ),
    (   x   )   (  0   )
Here
alpha and beta are scalars, beta is real and non-negative,
x is an (n-1)-element vector.
H is represented in the form
H = I - tau*( 1 )*( 1 v' ),
            ( v )
where tau is a scalar and v is an (n-1)-element vector.
For real flavors, if the elements of x are all zero, then tau = 0 and H is taken to be the unit matrix. Otherwise
1 ≤ tau ≤ 2.
For complex flavors, if the elements of x are all zero and alpha is real, then tau = 0 and H is taken to be the
unit matrix. Otherwise 1 ≤ real(tau) ≤ 2, and abs(tau-1) ≤ 1.
Input Parameters
Output Parameters
ila?lc
Scans a matrix for its last non-zero column.
Syntax
value = ilaslc(m, n, a, lda)
value = iladlc(m, n, a, lda)
value = ilaclc(m, n, a, lda)
value = ilazlc(m, n, a, lda)
Include Files
mkl.fi
Description
The ila?lc routines scan a matrix A for its last non-zero column.
Input Parameters
DOUBLE COMPLEX for ilazlc
Array, DIMENSION(lda, *). The second dimension of a must be at least
max(1, n).
Before entry the leading m-by-n part of the array a must contain the matrix
A.
Output Parameters
value INTEGER
Number of the last non-zero column.
ila?lr
Scans a matrix for its last non-zero row.
Syntax
value = ilaslr(m, n, a, lda)
value = iladlr(m, n, a, lda)
value = ilaclr(m, n, a, lda)
value = ilazlr(m, n, a, lda)
Include Files
mkl.fi
Description
The ila?lr routines scan a matrix A for its last non-zero row.
Input Parameters
Output Parameters
value INTEGER
Number of the last non-zero row.
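A short sketch of both scanning functions is given below for the double precision flavors iladlc and iladlr. The program name and the placement of the non-zero entries are assumptions chosen for illustration; the functions are declared as external INTEGER functions and are assumed to be provided by Intel MKL or another LAPACK library.

```fortran
program ila_demo
  implicit none
  integer, parameter :: m = 4, n = 5, lda = m
  double precision :: a(lda, n)
  integer :: last_col, last_row
  integer, external :: iladlc, iladlr

  ! Only two non-zero entries; the last one sits in row 3, column 2.
  a = 0.0d0
  a(1, 1) = 1.0d0
  a(3, 2) = 2.0d0

  last_col = iladlc(m, n, a, lda)   ! expected: 2
  last_row = iladlr(m, n, a, lda)   ! expected: 3

  print *, 'last non-zero column:', last_col
  print *, 'last non-zero row:   ', last_row
end program ila_demo
```

Such a scan can be used to restrict later operations to the leading block of a that actually contains non-zero data.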
?gsvj0
Pre-processor for the routine ?gesvj.
Syntax
call sgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work,
lwork, info)
call dgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work,
lwork, info)
call cgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work,
lwork, info)
call zgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work,
lwork, info)
Include Files
mkl.fi
Description
This routine is called from ?gesvj as a pre-processor and that is its main purpose. It applies Jacobi rotations
in the same way as ?gesvj does, but it does not check convergence (stopping criterion).
The routine ?gsvj0 enables ?gesvj to use a simplified version of itself to work on a submatrix of the original
matrix.
Input Parameters
Array, DIMENSION (lda, n). Contains the m-by-n matrix A, such that
A*diag(D) represents the input matrix.
nsweep INTEGER.
The number of sweeps of Jacobi rotations to be performed.
lwork INTEGER. The size of the array work; at least max(1, m).
Output Parameters
sva On exit, contains the Euclidean norms of the columns of the output matrix
A*diag(D).
info INTEGER.
If info = 0, the execution is successful.
?gsvj1
Pre-processor for the routine ?gesvj, applies Jacobi
rotations targeting only particular pivots.
Syntax
call sgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep,
work, lwork, info)
call dgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep,
work, lwork, info)
call cgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep,
work, lwork, info)
call zgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep,
work, lwork, info)
Include Files
mkl.fi
Description
This routine is called from ?gesvj as a pre-processor and that is its main purpose. It applies Jacobi rotations
in the same way as ?gesvj does, but it targets only particular pivots and it does not check convergence
(stopping criterion).
The routine ?gsvj1 applies a few sweeps of Jacobi rotations in the column space of the input m-by-n matrix A.
The pivot pairs are taken from the (1,2) off-diagonal block in the corresponding n-by-n Gram matrix A'*A.
The block-entries (tiles) of the (1,2) off-diagonal block are marked by the [x]'s in the following scheme:
* * * x x x
* * * x x x
* * * x x x
x x x * * *
x x x * * *
x x x * * *
row-cycling in the nblr-by-nblc [x] blocks, row-cyclic pivoting inside each [x] block
In terms of the columns of the matrix A, the first n1 columns are rotated 'against' the remaining n-n1
columns, trying to increase the angle between the corresponding subspaces. The off-diagonal block is
n1-by-(n-n1) and it is tiled using quadratic tiles. The number of sweeps is specified by nsweep, and the
orthogonality threshold is set by tol.
Input Parameters
n1 INTEGER. Specifies the 2-by-2 block partition. The first n1 columns are
rotated 'against' the remaining n-n1 columns of the matrix A.
DOUBLE COMPLEX for zgsvj1.
Array, DIMENSION (ldv, n).
nsweep INTEGER.
The number of sweeps of Jacobi rotations to be performed.
lwork INTEGER. The size of the array work; at least max(1, m).
Output Parameters
sva On exit, contains the Euclidean norms of the columns of the output matrix
A*diag(D).
info INTEGER.
If info = 0, the execution is successful.
?sfrk
Performs a symmetric rank-k operation for matrix in
RFP format.
Syntax
call ssfrk(transr, uplo, trans, n, k, alpha, a, lda, beta, c)
call dsfrk(transr, uplo, trans, n, k, alpha, a, lda, beta, c)
Include Files
mkl.fi
Description
The ?sfrk routines perform a matrix-matrix operation using symmetric matrices. The operation is defined as
C := alpha*A*A^T + beta*C,
or
C := alpha*A^T*A + beta*C,
where:
alpha and beta are scalars,
C is an n-by-n symmetric matrix in rectangular full packed (RFP) format,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.
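The first form of the operation can be sketched as follows for dsfrk: the program computes C := A*A^T directly in RFP format, unpacks the result with dtfttr, and prints it next to a reference product. The program name, the data, and the choice of the lower triangle in normal RFP form are assumptions made for this example; both routines are declared external and are assumed to be provided by Intel MKL or another LAPACK library.

```fortran
program sfrk_demo
  implicit none
  integer, parameter :: n = 3, k = 2, lda = n
  double precision :: a(lda, k), c(n*(n+1)/2), cfull(n, n), ref(n, n)
  integer :: info, i, j
  external :: dsfrk, dtfttr

  ! n-by-k input matrix (column-major fill).
  a = reshape([ 1.0d0, 2.0d0, 3.0d0, &
                4.0d0, 5.0d0, 6.0d0 ], [lda, k])

  ! C := 1.0*A*A^T + 0.0*C, lower triangle, normal RFP form.
  c = 0.0d0
  call dsfrk('N', 'L', 'N', n, k, 1.0d0, a, lda, 0.0d0, c)

  ! Unpack RFP to full storage so the result can be compared.
  cfull = 0.0d0
  call dtfttr('N', 'L', n, c, cfull, n, info)
  if (info /= 0) stop 1

  ! Print the lower triangle next to the reference product.
  ref = matmul(a, transpose(a))
  do j = 1, n
     do i = j, n
        print '(2i3,2f10.3)', i, j, cfull(i, j), ref(i, j)
     end do
  end do
end program sfrk_demo
```

The two printed value columns should agree for every (i, j) in the lower triangle. The RFP array c needs n*(n+1)/2 elements regardless of transr and uplo.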
Input Parameters
transr CHARACTER*1.
if transr = 'N' or 'n', the normal form of RFP C is stored;
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
Output Parameters
?hfrk
Performs a Hermitian rank-k operation for matrix in
RFP format.
Syntax
call chfrk(transr, uplo, trans, n, k, alpha, a, lda, beta, c)
call zhfrk(transr, uplo, trans, n, k, alpha, a, lda, beta, c)
Include Files
mkl.fi
Description
The ?hfrk routines perform a matrix-matrix operation using Hermitian matrices. The operation is defined as
C := alpha*A*A^H + beta*C,
or
C := alpha*A^H*A + beta*C,
where:
alpha and beta are real scalars,
C is an n-by-n Hermitian matrix in RFP format,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.
Input Parameters
transr CHARACTER*1.
if transr = 'N' or 'n', the normal form of RFP C is stored;
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
k INTEGER. On entry with trans = 'N' or 'n', k specifies the number of
columns of the matrix a, and on entry with trans = 'T' or 't' or 'C' or
'c', k specifies the number of rows of the matrix a.
The value of k must be at least zero.
Output Parameters
?tfsm
Solves a matrix equation (one operand is a triangular
matrix in RFP format).
Syntax
call stfsm(transr, side, uplo, trans, diag, m, n, alpha, a, b, ldb)
call dtfsm(transr, side, uplo, trans, diag, m, n, alpha, a, b, ldb)
call ctfsm(transr, side, uplo, trans, diag, m, n, alpha, a, b, ldb)
call ztfsm(transr, side, uplo, trans, diag, m, n, alpha, a, b, ldb)
Include Files
mkl.fi
Description
The ?tfsm routines solve one of the following matrix equations:
op(A)*X = alpha*B,
or
X*op(A) = alpha*B,
where:
alpha is a scalar,
X and B are m-by-n matrices,
A is a unit, or non-unit, upper or lower triangular matrix in rectangular full packed (RFP) format.
op(A) can be one of the following:
Input Parameters
transr CHARACTER*1.
if transr = 'N' or 'n', the normal form of RFP A is stored;
n INTEGER. Specifies the number of columns of B. The value of n must be at
least zero.
Before entry, the leading m-by-n part of the array b must contain the right-
hand side matrix B.
Output Parameters
?lansf
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a symmetric matrix in RFP format.
Syntax
val = slansf(norm, transr, uplo, n, a, work)
val = dlansf(norm, transr, uplo, n, a, work)
Include Files
mkl.fi
Description
The function ?lansf returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of an n-by-n real symmetric matrix A in the rectangular full packed (RFP)
format.
Input Parameters
transr CHARACTER*1.
Specifies whether the RFP format of matrix A is normal or transposed
format.
If transr = 'N': RFP format is normal;
uplo CHARACTER*1.
Specifies whether the RFP matrix A came from upper or lower triangular
matrix.
If uplo = 'U': RFP matrix A came from an upper triangular matrix;
The upper (if uplo = 'U') or lower (if uplo = 'L') part of the symmetric
matrix A stored in RFP format.
Output Parameters
Value returned by the function.
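A possible way to call the function is sketched below: a small symmetric matrix is packed into normal RFP format with dtrttf and two norms are then requested from dlansf. The program name and matrix values are illustrative assumptions, as is the use of the norm selectors 'M' and 'F' in their usual LAPACK meaning; the routines are declared external and are assumed to come from Intel MKL or another LAPACK library.

```fortran
program lansf_demo
  implicit none
  integer, parameter :: n = 3, lda = n
  double precision :: a(lda, n), arf(n*(n+1)/2), work(n)
  double precision :: vmax, vfro
  integer :: info
  double precision, external :: dlansf
  external :: dtrttf

  ! Symmetric test matrix; only the lower triangle is packed below.
  a = reshape([ 1.0d0, -2.0d0,  3.0d0, &
               -2.0d0,  4.0d0,  0.0d0, &
                3.0d0,  0.0d0, -5.0d0 ], [lda, n])

  ! Full storage -> normal RFP format, lower triangle.
  call dtrttf('N', 'L', n, a, lda, arf, info)
  if (info /= 0) stop 1

  vmax = dlansf('M', 'N', 'L', n, arf, work)   ! largest |a(i,j)|, here 5
  vfro = dlansf('F', 'N', 'L', n, arf, work)   ! Frobenius norm, here sqrt(68)

  print *, 'max |a(i,j)|   =', vmax
  print *, 'Frobenius norm =', vfro
end program lansf_demo
```

The transr and uplo arguments passed to dlansf must match those used when the RFP array was created; otherwise the function interprets the packed data incorrectly.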
?lanhf
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a Hermitian matrix in RFP format.
Syntax
val = clanhf(norm, transr, uplo, n, a, work)
val = zlanhf(norm, transr, uplo, n, a, work)
Include Files
mkl.fi
Description
The function ?lanhf returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of an n-by-n complex Hermitian matrix A in the rectangular full packed
(RFP) format.
Input Parameters
norm CHARACTER*1.
Specifies the value to be returned by the routine:
= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix A.
transr CHARACTER*1.
Specifies whether the RFP format of matrix A is normal or conjugate-
transposed format.
If transr = 'N': RFP format is normal;
uplo CHARACTER*1.
Specifies whether the RFP matrix A came from upper or lower triangular
matrix.
If uplo = 'U': RFP matrix A came from an upper triangular matrix;
The upper (if uplo = 'U') or lower (if uplo = 'L') part of the Hermitian
matrix A stored in RFP format.
Output Parameters
?tfttp
Copies a triangular matrix from the rectangular full
packed format (TF) to the standard packed format
(TP).
Syntax
call stfttp( transr, uplo, n, arf, ap, info )
call dtfttp( transr, uplo, n, arf, ap, info )
call ctfttp( transr, uplo, n, arf, ap, info )
call ztfttp( transr, uplo, n, arf, ap, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the Rectangular Full Packed (RFP) format to the standard
packed format. For the description of the RFP format, see Matrix Storage Schemes.
Input Parameters
transr CHARACTER*1.
= 'N': arf is in the Normal format,
uplo CHARACTER*1.
Specifies whether A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
On entry, the upper or lower triangular matrix A stored in the RFP format.
Output Parameters
?tfttr
Copies a triangular matrix from the rectangular full
packed format (TF) to the standard full format (TR).
Syntax
call stfttr( transr, uplo, n, arf, a, lda, info )
call dtfttr( transr, uplo, n, arf, a, lda, info )
call ctfttr( transr, uplo, n, arf, a, lda, info )
call ztfttr( transr, uplo, n, arf, a, lda, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the Rectangular Full Packed (RFP) format to the standard full
format. For the description of the RFP format, see Matrix Storage Schemes.
Input Parameters
transr CHARACTER*1.
= 'N': arf is in the Normal format,
uplo CHARACTER*1.
Specifies whether A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
Output Parameters
?tpqrt2
Computes a QR factorization of a real or complex
"triangular-pentagonal" matrix, which is composed of
a triangular block and a pentagonal block, using the
compact WY representation for Q.
Syntax
call stpqrt2(m, n, l, a, lda, b, ldb, t, ldt, info)
call dtpqrt2(m, n, l, a, lda, b, ldb, t, ldt, info)
call ctpqrt2(m, n, l, a, lda, b, ldb, t, ldt, info)
call ztpqrt2(m, n, l, a, lda, b, ldb, t, ldt, info)
call tpqrt2(a, b, t [, info])
Include Files
mkl.fi, lapack.f90
Description
The input matrix C is an (n+m)-by-n matrix
C = [ A ]
    [ B ]
where A is an n-by-n upper triangular matrix, and B is an m-by-n pentagonal matrix consisting of an
(m-l)-by-n rectangular matrix B1 on top of an l-by-n upper trapezoidal matrix B2:
B = [ B1 ]  <- (m-l)-by-n rectangular
    [ B2 ]  <- l-by-n upper trapezoidal
The upper trapezoidal matrix B2 consists of the first l rows of an n-by-n upper triangular matrix, where
0 ≤ l ≤ min(m,n). If l=0, B is an m-by-n rectangular matrix. If m=l=n, B is upper triangular. The matrix W
contains the elementary reflectors H(i) in the ith column below the diagonal (of A) in the (n+m)-by-n input
matrix C so that W can be represented as
W = [ I ]  <- identity, n-by-n
    [ V ]  <- m-by-n
Thus, V contains all of the information needed for W, and is returned in array b.
NOTE
V has the same form as B:
V = [ V1 ]  <- (m-l)-by-n rectangular
    [ V2 ]  <- l-by-n upper trapezoidal
Input Parameters
b, size (ldb, n), the pentagonal m-by-n matrix B. The first (m-l) rows
contain the rectangular B1 matrix, and the next l rows contain the upper
trapezoidal B2 matrix.
Output Parameters
a The elements on and above the diagonal of the array contain the upper
triangular matrix R.
COMPLEX*16 for ztpqrt2.
Array, size (ldt, n).
info INTEGER.
If info = 0, the execution is successful.
If info < 0 and info = -i, the ith argument had an illegal value.
?tprfb
Applies a real or complex "triangular-pentagonal"
blocked reflector to a real or complex matrix, which is
composed of two blocks.
Syntax
call stprfb(side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, a, lda, b, ldb,
work, ldwork)
call dtprfb(side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, a, lda, b, ldb,
work, ldwork)
call ctprfb(side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, a, lda, b, ldb,
work, ldwork)
call ztprfb(side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, a, lda, b, ldb,
work, ldwork)
call tprfb(t, v, a, b[, direct][, storev][, side][, trans])
Include Files
mkl.fi, lapack.f90
Description
The ?tprfb routine applies a real or complex "triangular-pentagonal" block reflector H, HT, or HH from either
the left or the right to a real or complex matrix C, which is composed of two blocks A and B.
The block B is m-by-n. If side = 'R', A is m-by-k, and if side = 'L', A is of size k-by-n.
The pentagonal matrix V is composed of a rectangular block V1 and a trapezoidal block V2. The size of the
trapezoidal block is determined by the parameter l, where 0 ≤ l ≤ k. If l=k, the V2 block of V is triangular; if
l=0, there is no trapezoidal block, thus V = V1 is rectangular.
              direct='F'                           direct='B'
storev='C'    V2 is upper trapezoidal (first l     V2 is lower trapezoidal (last l
              rows of k-by-k upper triangular)     rows of k-by-k lower triangular matrix)
storev='R'

              side='L'          side='R'
storev='C'    V is m-by-k       V is n-by-k
              V2 is l-by-k      V2 is l-by-k
storev='R'    V is k-by-m       V is k-by-n
              V2 is k-by-l      V2 is k-by-l
Input Parameters
side CHARACTER*1.
= 'L': apply H, HT, or HH from the left,
= 'R': apply H, HT, or HH from the right.
trans CHARACTER*1.
= 'N': apply H (no transpose),
= 'T': apply HT (transpose),
= 'C': apply HH (conjugate transpose).
direct CHARACTER*1.
Indicates how H is formed from a product of elementary reflectors:
= 'F': H = H(1) H(2) . . . H(k) (Forward),
storev CHARACTER*1.
Indicates how the vectors that define the elementary reflectors are stored:
= 'C': Columns,
= 'R': Rows.
k INTEGER. The order of the matrix T, which is the number of elementary
reflectors whose product defines the block reflector. (k ≥ 0)
ldb INTEGER. The leading dimension of the array b (ldb ≥ max(1, m)).
Output Parameters
a Contains the corresponding block of H*C, HT*C, HH*C, C*H, C*HT, or C*HH.
b Contains the corresponding block of H*C, HT*C, HH*C, C*H, C*HT, or C*HH.
?tpttf
Copies a triangular matrix from the standard packed
format (TP) to the rectangular full packed format (TF).
Syntax
call stpttf( transr, uplo, n, ap, arf, info )
call dtpttf( transr, uplo, n, ap, arf, info )
call ctpttf( transr, uplo, n, ap, arf, info )
call ztpttf( transr, uplo, n, ap, arf, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the standard packed format to the Rectangular Full Packed
(RFP) format. For the description of the RFP format, see Matrix Storage Schemes.
Input Parameters
transr CHARACTER*1.
= 'N': arf must be in the Normal format,
= 'T': arf must be in the Transpose format (for stpttf and dtpttf),
uplo CHARACTER*1.
Specifies whether A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
Output Parameters
?tpttr
Copies a triangular matrix from the standard packed
format (TP) to the standard full format (TR).
Syntax
call stpttr( uplo, n, ap, a, lda, info )
call dtpttr( uplo, n, ap, a, lda, info )
call ctpttr( uplo, n, ap, a, lda, info )
call ztpttr( uplo, n, ap, a, lda, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the standard packed format to the standard full format.
Input Parameters
uplo CHARACTER*1.
Specifies whether A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
Output Parameters
On exit, the triangular matrix A. If uplo = 'U', the leading n-by-n upper
triangular part of the array a contains the upper triangular part of the
matrix A, and the strictly lower triangular part of a is not referenced. If
uplo = 'L', the leading n-by-n lower triangular part of the array a contains
the lower triangular part of the matrix A, and the strictly upper triangular
part of a is not referenced.
info INTEGER.
If info = 0, the execution is successful.
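The unpacking performed by ?tpttr can be illustrated with the sketch below, which expands a packed upper triangle into full storage with dtpttr. The program name and the element values, chosen so that each value encodes its own row and column, are assumptions for this example; the column-by-column packed layout is the standard LAPACK packed storage scheme, and the routine is assumed to be provided by Intel MKL or another LAPACK library.

```fortran
program tpttr_demo
  implicit none
  integer, parameter :: n = 3, lda = n
  double precision :: ap(n*(n+1)/2), a(lda, n)
  integer :: info, i, j
  external :: dtpttr

  ! Upper triangle packed column by column:
  ! a11, a12, a22, a13, a23, a33
  ap = [ 11.0d0, 12.0d0, 22.0d0, 13.0d0, 23.0d0, 33.0d0 ]

  ! The strictly lower triangle is not referenced by dtpttr,
  ! so it is cleared here to keep the printout readable.
  a = 0.0d0
  call dtpttr('U', n, ap, a, lda, info)
  if (info /= 0) then
     print *, 'dtpttr failed, info =', info
     stop 1
  end if

  do i = 1, n
     print '(3f6.1)', (a(i, j), j = 1, n)
  end do
end program tpttr_demo
```

With this input the printed rows should read 11, 12, 13; then 0, 22, 23; then 0, 0, 33. The reverse conversion is available through ?trttp.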
?trttf
Copies a triangular matrix from the standard full
format (TR) to the rectangular full packed format (TF).
Syntax
call strttf( transr, uplo, n, a, lda, arf, info )
call dtrttf( transr, uplo, n, a, lda, arf, info )
call ctrttf( transr, uplo, n, a, lda, arf, info )
call ztrttf( transr, uplo, n, a, lda, arf, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the standard full format to the Rectangular Full Packed (RFP)
format. For the description of the RFP format, see Matrix Storage Schemes.
Input Parameters
transr CHARACTER*1.
= 'N': arf must be in the Normal format,
= 'T': arf must be in the Transpose format (for strttf and dtrttf),
uplo CHARACTER*1.
Specifies whether A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
Output Parameters
?trttp
Copies a triangular matrix from the standard full
format (TR) to the standard packed format (TP).
Syntax
call strttp( uplo, n, a, lda, ap, info )
call dtrttp( uplo, n, a, lda, ap, info )
call ctrttp( uplo, n, a, lda, ap, info )
call ztrttp( uplo, n, a, lda, ap, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the standard full format to the standard packed format.
Input Parameters
uplo CHARACTER*1.
Specifies whether A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
Output Parameters
?pstf2
Computes the Cholesky factorization with complete
pivoting of a real symmetric or complex Hermitian
positive semi-definite matrix.
Syntax
call spstf2( uplo, n, a, lda, piv, rank, tol, work, info )
call dpstf2( uplo, n, a, lda, piv, rank, tol, work, info )
call cpstf2( uplo, n, a, lda, piv, rank, tol, work, info )
call zpstf2( uplo, n, a, lda, piv, rank, tol, work, info )
Include Files
mkl.fi
Description
The real flavors spstf2 and dpstf2 compute the Cholesky factorization with complete pivoting of a real
symmetric positive semi-definite matrix A. The complex flavors cpstf2 and zpstf2 compute the Cholesky
factorization with complete pivoting of a complex Hermitian positive semi-definite matrix A. The factorization
has the form:
P^T * A * P = U^T * U, if uplo = 'U' for real flavors,
P^T * A * P = U^H * U, if uplo = 'U' for complex flavors,
P^T * A * P = L * L^T, if uplo = 'L' for real flavors,
P^T * A * P = L * L^H, if uplo = 'L' for complex flavors,
where U is an upper triangular matrix and L is lower triangular, and P is stored as vector piv.
This algorithm does not check that A is positive semi-definite. This version of the algorithm calls level 2
BLAS.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric or
Hermitian matrix A is stored:
= 'U': Upper triangular,
= 'L': Lower triangular.
On entry, the symmetric matrix A. If uplo = 'U', the leading n-by-n upper
triangular part of the array a contains the upper triangular part of the
matrix A, and the strictly lower triangular part of a is not referenced. If
uplo = 'L', the leading n-by-n lower triangular part of the array a contains
the lower triangular part of the matrix A, and the strictly upper triangular
part of a is not referenced.
lda INTEGER. The leading dimension of the matrix A. lda ≥ max(1,n).
Output Parameters
rank INTEGER.
The rank of A, determined by the number of steps the algorithm completed.
info INTEGER.
< 0: if info = -k, the k-th parameter had an illegal value,
dlat2s
Converts a double-precision triangular matrix to a
single-precision triangular matrix.
Syntax
call dlat2s( uplo, n, a, lda, sa, ldsa, info )
Include Files
mkl.fi
Description
This routine converts a double-precision triangular matrix A to a single-precision triangular matrix SA.
dlat2s checks that all the elements of A are between -RMAX and RMAX, where RMAX is the overflow threshold for
single-precision arithmetic. If this condition is not met, the conversion is aborted and a flag is raised. The
routine does no parameter checking.
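The range check described above can be sketched in plain Fortran as follows. This is not the library code: the program name and sample values are hypothetical, and taking RMAX to be the largest finite single-precision value (huge(1.0)) is an assumption made for the sketch.

```fortran
program lat2s_sketch
  implicit none
  integer, parameter :: n = 2
  double precision :: a(n,n)
  real :: sa(n,n)
  double precision :: rmax
  integer :: i, j, info

  a = 0.0d0
  a(1,1) = 1.5d0
  a(1,2) = 1.0d300      ! far above the single-precision range
  a(2,2) = -2.5d0

  ! Assumption: RMAX is taken as the largest finite single-precision value.
  rmax = dble(huge(1.0))

  info = 0
  sa = 0.0
  ! uplo = 'U': only the upper triangle is examined and converted.
  outer: do j = 1, n
    do i = 1, j
      if (a(i,j) < -rmax .or. a(i,j) > rmax) then
        info = 1          ! flag the overflow and abandon the conversion
        exit outer
      end if
      sa(i,j) = real(a(i,j))
    end do
  end do outer

  print *, 'info =', info
end program lat2s_sketch
```

With the sample data the element a(1,2) is out of range, so the sketch stops early with info = 1 and the contents of sa should not be used.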
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
a DOUBLE PRECISION.
Array, DIMENSION (lda, *).
ldsa INTEGER. The leading dimension of the array sa. ldsa ≥ max(1,n).
Output Parameters
sa REAL.
Array, DIMENSION (ldsa, *).
info INTEGER.
=0: successful exit,
> 0: an element of the matrix A is greater than the single-precision
overflow threshold; in this case, the content of the part of sa determined by
uplo is unspecified on exit.
zlat2c
Converts a double complex triangular matrix to a
complex triangular matrix.
Syntax
call zlat2c( uplo, n, a, lda, sa, ldsa, info )
Include Files
mkl.fi
Description
This routine is declared in mkl_lapack.fi.
The routine converts a DOUBLE COMPLEX triangular matrix A to a COMPLEX triangular matrix SA. zlat2c
checks that the real and imaginary parts of all the elements of A are between -RMAX and RMAX, where RMAX
is the overflow threshold for single-precision arithmetic. If this condition is not met, the conversion is aborted and a
flag is raised. The routine does no parameter checking.
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
n INTEGER. The number of rows and columns in the matrix A. n ≥ 0.
a DOUBLE COMPLEX.
Array, DIMENSION (lda, *).
ldsa INTEGER. The leading dimension of the array sa. ldsa ≥ max(1,n).
Output Parameters
sa COMPLEX.
Array, DIMENSION (ldsa, *).
info INTEGER.
=0: successful exit,
> 0: the real or imaginary part of an element of the matrix A is greater than
the single-precision overflow threshold; in this case, the content of the part
of sa determined by uplo is unspecified on exit.
?lacp2
Copies all or part of a real two-dimensional array to a
complex array.
Syntax
call clacp2( uplo, m, n, a, lda, b, ldb )
call zlacp2( uplo, m, n, a, lda, b, ldb )
Include Files
mkl.fi
Description
Input Parameters
uplo CHARACTER*1.
Specifies the part of the matrix A to be copied to B.
If uplo = 'U', the upper triangular part of A;
ldb INTEGER. The leading dimension of the output array b; ldb ≥ max(1, m).
Output Parameters
?la_gbamv
Performs a matrix-vector operation to calculate error
bounds.
Syntax
call sla_gbamv(trans, m, n, kl, ku, alpha, ab, ldab, x, incx, beta, y, incy)
call dla_gbamv(trans, m, n, kl, ku, alpha, ab, ldab, x, incx, beta, y, incy)
call cla_gbamv(trans, m, n, kl, ku, alpha, ab, ldab, x, incx, beta, y, incy)
call zla_gbamv(trans, m, n, kl, ku, alpha, ab, ldab, x, incx, beta, y, incy)
Include Files
mkl.fi
Description
The ?la_gbamv function performs one of the matrix-vector operations defined as
y := alpha*abs(A)*abs(x) + beta*abs(y),
or
y := alpha*abs(A)^T*abs(x) + beta*abs(y),
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-n matrix, with kl sub-diagonals and ku super-diagonals.
This function is primarily used in calculating error bounds. To protect against underflow during evaluation,
the function perturbs components in the resulting vector away from zero by (n + 1) times the underflow
threshold. To prevent unnecessarily large errors for block structure embedded in general matrices, the
function does not perturb symbolically zero components. A zero entry is considered symbolic if all
multiplications involved in computing that entry have at least one zero multiplicand.
Input Parameters
Before entry, the leading m-by-n part of the array ab must contain the
matrix of coefficients. The second dimension of ab must be at least
max(1,n). Unchanged on exit.
incx INTEGER. Specifies the increment for the elements of x. incx must not be
zero.
Output Parameters
y Updated vector y.
?la_gbrcond
Estimates the Skeel condition number for a general
banded matrix.
Syntax
call sla_gbrcond( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, cmode, c, info, work,
iwork )
call dla_gbrcond( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, cmode, c, info, work,
iwork )
Include Files
mkl.fi
Description
The function estimates the Skeel condition number of
op(A) * op2(C)
where
the cmode parameter determines op2 as follows:
If cmode = 1, op2(C) = C.
If cmode = 0, op2(C) = I.
If cmode = -1, op2(C) = inv(C).
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
max(1,j-ku) ≤ i ≤ min(n,j+kl)
afb(ldafb,*) contains details of the LU factorization of the band matrix A, as
returned by ?gbtrf. U is stored as an upper triangular band matrix with kl+ku
superdiagonals in rows 1 to kl+ku+1, and the multipliers used during the
factorization are stored in rows kl+ku+2 to 2*kl+ku+1.
ipiv INTEGER.
Array with DIMENSION n. The pivot indices from the factorization A = P*L*U as
computed by ?gbtrf. Row i of the matrix was interchanged with row ipiv(i).
If cmode = 0, op2(C) = I.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?gbtrf
?la_gbrcond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for general banded matrices.
Syntax
call cla_gbrcond_c( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, c, capply, info,
work, rwork )
call zla_gbrcond_c( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, c, capply, info,
work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * inv(diag(c))
where c is a REAL vector for cla_gbrcond_c and a DOUBLE PRECISION vector for zla_gbrcond_c.
Input Parameters
If trans = 'C', the system has the form A^H*X = B (Conjugate Transpose =
Transpose)
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
ipiv INTEGER.
Array with DIMENSION n. The pivot indices from the factorization A = P*L*U as
computed by ?gbtrf. Row i of the matrix was interchanged with row ipiv(i).
op(A) * inv(diag(c)).
capply LOGICAL. If .TRUE., then the function uses the vector c from the formula
op(A) * inv(diag(c)).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?gbtrf
?la_gbrcond_x
Computes the infinity norm condition number of
op(A)*diag(x) for general banded matrices.
Syntax
call cla_gbrcond_x( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, x, info, work,
rwork )
call zla_gbrcond_x( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, x, info, work,
rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * diag(x)
where x is a COMPLEX vector for cla_gbrcond_x and a DOUBLE COMPLEX vector for zla_gbrcond_x.
Input Parameters
If trans = 'C', the system has the form A^H*X = B (Conjugate Transpose =
Transpose)
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
ab, afb, x, work COMPLEX for cla_gbrcond_x
DOUBLE COMPLEX for zla_gbrcond_x
Arrays:
ab(ldab,*) contains the original band matrix A stored in rows from 1 to kl + ku
+ 1. The j-th column of A is stored in the j-th column of the array ab as follows:
ab(ku+1+i-j,j) = A(i,j)
for
max(1,j-ku) ≤ i ≤ min(n,j+kl)
afb(ldafb,*) contains details of the LU factorization of the band matrix A, as
returned by ?gbtrf. U is stored as an upper triangular band matrix with kl+ku
superdiagonals in rows 1 to kl+ku+1, and the multipliers used during the
factorization are stored in rows kl+ku+2 to 2*kl+ku+1.
ipiv INTEGER.
Array with DIMENSION n. The pivot indices from the factorization A = P*L*U as
computed by ?gbtrf. Row i of the matrix was interchanged with row ipiv(i).
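The band storage rule given for ab above, ab(ku+1+i-j,j) = A(i,j) for max(1,j-ku) ≤ i ≤ min(n,j+kl), can be sketched as a short packing loop. The sketch below is illustrative only: it uses real double-precision data rather than the complex types these routines take, and the program name, sizes, and sample values are assumptions.

```fortran
program band_storage_sketch
  implicit none
  integer, parameter :: n = 4, kl = 1, ku = 2
  integer, parameter :: ldab = kl + ku + 1
  double precision :: a(n,n), ab(ldab,n)
  integer :: i, j

  ! Build a band matrix with kl sub-diagonals and ku super-diagonals.
  a = 0.0d0
  do j = 1, n
    do i = max(1, j-ku), min(n, j+kl)
      a(i,j) = 10.0d0*i + j
    end do
  end do

  ! Pack column j of A into column j of ab.
  ab = 0.0d0
  do j = 1, n
    do i = max(1, j-ku), min(n, j+kl)
      ab(ku+1+i-j, j) = a(i,j)
    end do
  end do

  ! The main diagonal of A lands in row ku+1 of ab: 11.0 22.0 33.0 44.0
  print '(4f6.1)', (ab(ku+1, j), j = 1, n)
end program band_storage_sketch
```

The row index ku+1+i-j runs from 1 (top super-diagonal) to kl+ku+1 (bottom sub-diagonal), which is why the original matrix occupies rows 1 to kl + ku + 1 of ab.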
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?gbtrf
?la_gbrfsx_extended
Improves the computed solution to a system of linear
equations for general banded matrices by performing
extra-precise iterative refinement and provides error
bounds and backward error estimates for the solution.
Syntax
call sla_gbrfsx_extended( prec_type, trans_type, n, kl, ku, nrhs, ab, ldab, afb,
ldafb, ipiv, colequ, c, b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm,
err_bnds_comp, res, ayb, dy, y_tail, rcond, ithresh, rthresh, dz_ub, ignore_cwise,
info )
call dla_gbrfsx_extended( prec_type, trans_type, n, kl, ku, nrhs, ab, ldab, afb,
ldafb, ipiv, colequ, c, b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm,
err_bnds_comp, res, ayb, dy, y_tail, rcond, ithresh, rthresh, dz_ub, ignore_cwise,
info )
call cla_gbrfsx_extended( prec_type, trans_type, n, kl, ku, nrhs, ab, ldab, afb,
ldafb, ipiv, colequ, c, b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm,
err_bnds_comp, res, ayb, dy, y_tail, rcond, ithresh, rthresh, dz_ub, ignore_cwise,
info )
call zla_gbrfsx_extended( prec_type, trans_type, n, kl, ku, nrhs, ab, ldab, afb,
ldafb, ipiv, colequ, c, b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm,
err_bnds_comp, res, ayb, dy, y_tail, rcond, ithresh, rthresh, dz_ub, ignore_cwise,
info )
Include Files
mkl.fi
Description
The ?la_gbrfsx_extended subroutine improves the computed solution to a system of linear equations by
performing extra-precise iterative refinement and provides error bounds and backward error estimates for
the solution. The ?gbrfsx routine calls ?la_gbrfsx_extended to perform iterative refinement.
In addition to the normwise error bound, the code provides the maximum componentwise error bound, if possible.
See comments for err_bnds_norm and err_bnds_comp for details of the error bounds.
Use ?la_gbrfsx_extended to set only the second fields of err_bnds_norm and err_bnds_comp.
Input Parameters
prec_type INTEGER.
Specifies the intermediate precision to be used in refinement. The value is
defined by ilaprec(p), where p is a CHARACTER and:
If p = 'S': Single.
If p = 'D': Double.
If p = 'I': Indigenous.
trans_type INTEGER.
Specifies the transposition operation on A. The value is defined by
ilatrans(t), where t is a CHARACTER and:
If t = 'N': No transpose.
If t = 'T': Transpose.
n INTEGER. The number of linear equations; the order of the matrix A; n ≥ 0.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrix B.
The array ab contains the original n-by-n matrix A. The second dimension
of ab must be at least max(1,n).
The array afb contains the factors L and U from the factorization A = P*L*U
as computed by ?gbtrf. The second dimension of afb must be at
least max(1,n).
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
The array y on entry contains the solution matrix X as computed by
?gbtrs. The second dimension of y must be at least max(1,nrhs).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ max(1,n).
ldafb INTEGER. The leading dimension of the array afb; ldafb ≥ max(1,n).
ipiv INTEGER.
Array, DIMENSION at least max(1, n). Contains the pivot indices from the
factorization A = P*L*U as computed by ?gbtrf; row i of the matrix was
interchanged with row ipiv(i).
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldy INTEGER. The leading dimension of the array y; ldy ≥ max(1, n).
numbers are 1/(norm(1/z,inf)*norm(z,inf)) for some appropriately
scaled matrix Z.
Let z=s*a, where s scales each row by a power
of the radix so all absolute row sums of z are
approximately 1.
Use this subroutine to set only the second field
above.
norm(dx_{i+1}) < rthresh * norm(dx_i)
where norm(z) is the infinity norm of Z.
rthresh satisfies
0 < rthresh ≤ 1.
The default value is 0.5. For more aggressive refinement, set rthresh to 0.9 to
permit convergence on extremely ill-conditioned matrices.
ignore_cwise LOGICAL
If .TRUE., the function ignores componentwise convergence. Default value
is .FALSE.
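The rthresh test norm(dx_{i+1}) < rthresh * norm(dx_i) shown above can be read as a simple progress check on successive corrections. The following sketch is an illustration under stated assumptions, not the refinement loop itself: the function name still_converging and the sample correction vectors are hypothetical.

```fortran
program rthresh_sketch
  implicit none
  double precision, parameter :: rthresh = 0.5d0
  double precision :: dx_old(3), dx_new(3)

  ! Hypothetical corrections from two consecutive refinement steps.
  dx_old = [ 1.0d-3, -4.0d-3, 2.0d-3 ]
  dx_new = [ 5.0d-4,  1.0d-3, -3.0d-4 ]

  if (still_converging(dx_old, dx_new, rthresh)) then
    print *, 'keep refining'
  else
    print *, 'stop: corrections no longer shrink fast enough'
  end if

contains

  ! True when norm(dx_{i+1}) < rthresh * norm(dx_i) in the infinity norm.
  logical function still_converging(dx_i, dx_next, thresh)
    double precision, intent(in) :: dx_i(:), dx_next(:), thresh
    still_converging = maxval(abs(dx_next)) < thresh * maxval(abs(dx_i))
  end function still_converging

end program rthresh_sketch
```

Here the norms are 4.0d-3 and 1.0d-3, so with rthresh = 0.5 the check passes. A value closer to 1 accepts slower decay, which is the sense in which 0.9 is the more permissive setting.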
Output Parameters
See Also
?gbrfsx
?gbtrf
?gbtrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_gbrpvgrw
Computes the reciprocal pivot growth factor
norm(A)/norm(U) for a general band matrix.
Syntax
call sla_gbrpvgrw( n, kl, ku, ncols, ab, ldab, afb, ldafb )
call dla_gbrpvgrw( n, kl, ku, ncols, ab, ldab, afb, ldafb )
call cla_gbrpvgrw( n, kl, ku, ncols, ab, ldab, afb, ldafb )
call zla_gbrpvgrw( n, kl, ku, ncols, ab, ldab, afb, ldafb )
Include Files
mkl.fi
Description
The ?la_gbrpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
absolute element norm is used. If this is much less than 1, the stability of the LU factorization of the
equilibrated matrix A could be poor. This also means that the solution X, estimated condition numbers, and
error bounds could be unreliable.
Input Parameters
afb contains details of the LU factorization of the band matrix A, as
returned by ?gbtrf. U is stored as an upper triangular band matrix
with kl+ku superdiagonals in rows 1 to kl+ku+1, and the multipliers
used during the factorization are stored in rows kl+ku+2 to 2*kl+ku+1.
See Also
?gbtrf
?la_geamv
Computes a matrix-vector product using a general
matrix to calculate error bounds.
Syntax
call sla_geamv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call dla_geamv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call cla_geamv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call zla_geamv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
Include Files
mkl.fi
Description
The ?la_geamv routines perform a matrix-vector operation defined as
y := alpha*abs(A)*abs(x) + beta*abs(y),
or
y := alpha*abs(A^T)*abs(x) + beta*abs(y),
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-n matrix.
This function is primarily used in calculating error bounds. To protect against underflow during evaluation,
the function perturbs components in the resulting vector away from zero by (n + 1) times the underflow
threshold. To prevent unnecessarily large errors for block structure embedded in general matrices, the
function does not perturb symbolically zero components. A zero entry is considered symbolic if all
multiplications involved in computing that entry have at least one zero multiplicand.
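The operation and the symbolic-zero rule described above can be sketched in plain Fortran for the non-transposed case. This is an illustration, not the library implementation: it assumes unit increments and a non-zero alpha, uses tiny(1.0d0) as a stand-in for the underflow threshold, and the program name and sample data are hypothetical.

```fortran
program geamv_sketch
  implicit none
  integer, parameter :: m = 2, n = 2
  double precision :: a(m,n), x(n), y(m)
  double precision :: alpha, beta, safe1, temp
  logical :: symb_zero
  integer :: i, j

  a = reshape([ 1.0d0, 0.0d0, -2.0d0, 0.0d0 ], [m, n])  ! column-major
  x = [ 3.0d0, -1.0d0 ]
  y = [ -4.0d0, 0.0d0 ]
  alpha = 1.0d0
  beta  = 1.0d0

  ! Assumption: tiny() stands in for the underflow threshold.
  safe1 = (n + 1) * tiny(1.0d0)

  do i = 1, m
    ! beta*abs(y(i)) has a zero factor when beta or y(i) is zero.
    symb_zero = (beta == 0.0d0) .or. (y(i) == 0.0d0)
    y(i) = beta * abs(y(i))
    do j = 1, n
      temp = abs(a(i,j))
      ! The entry stays symbolically zero only if every product has a zero factor.
      symb_zero = symb_zero .and. (x(j) == 0.0d0 .or. temp == 0.0d0)
      y(i) = y(i) + alpha * abs(x(j)) * temp
    end do
    ! Nudge non-symbolic entries away from zero.
    if (.not. symb_zero) y(i) = y(i) + sign(safe1, y(i))
  end do

  print '(2f6.1)', y
end program geamv_sketch
```

In this example the first component evaluates to 9.0 (the perturbation is far below print precision), while the second component is symbolically zero and is left at exactly 0.0.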
Input Parameters
incy INTEGER. Specifies the increment for the elements of y.
The value of incy must be non-zero.
Output Parameters
y Updated vector Y.
?la_gercond
Estimates the Skeel condition number for a general
matrix.
Syntax
call sla_gercond( trans, n, a, lda, af, ldaf, ipiv, cmode, c, info, work, iwork )
call dla_gercond( trans, n, a, lda, af, ldaf, ipiv, cmode, c, info, work, iwork )
Include Files
mkl.fi
Description
The function estimates the Skeel condition number of
op(A) * op2(C)
where
the cmode parameter determines op2 as follows:
If cmode = 1, op2(C) = C.
If cmode = 0, op2(C) = I.
If cmode = -1, op2(C) = inv(C).
Input Parameters
If trans = 'C', the system has the form A^H*X = B (Conjugate Transpose =
Transpose).
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
ipiv INTEGER.
Array with DIMENSION n. The pivot indices from the factorization A = P*L*U as
computed by ?getrf. Row i of the matrix was interchanged with row ipiv(i).
If cmode = 0, op2(C) = I.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?getrf
?la_gercond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for general matrices.
Syntax
call cla_gercond_c( trans, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
call zla_gercond_c( trans, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * inv(diag(c))
where c is a REAL vector for cla_gercond_c and a DOUBLE PRECISION vector for zla_gercond_c.
Input Parameters
If trans = 'C', the system has the form A^H*X = B (Conjugate Transpose =
Transpose)
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
ipiv INTEGER.
Array with DIMENSION n. The pivot indices from the factorization A = P*L*U as
computed by ?getrf. Row i of the matrix was interchanged with row ipiv(i).
op(A) * inv(diag(c)).
Array rwork with DIMENSION n is a workspace.
capply LOGICAL. If capply=.TRUE., then the function uses the vector c from the
formula
op(A) * inv(diag(c)).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?getrf
?la_gercond_x
Computes the infinity norm condition number of
op(A)*diag(x) for general matrices.
Syntax
call cla_gercond_x( trans, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
call zla_gercond_x( trans, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * diag(x)
where x is a COMPLEX vector for cla_gercond_x and a DOUBLE COMPLEX vector for zla_gercond_x.
Input Parameters
If trans = 'C', the system has the form A^H*X = B (Conjugate Transpose =
Transpose)
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
work is a workspace array of DIMENSION (2*n).
The second dimension of a and af must be at least max(1, n).
ipiv INTEGER.
Array with DIMENSION n. The pivot indices from the factorization A = P*L*U as
computed by ?getrf. Row i of the matrix was interchanged with row ipiv(i).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?getrf
?la_gerfsx_extended
Improves the computed solution to a system of linear
equations for general matrices by performing extra-
precise iterative refinement and provides error bounds
and backward error estimates for the solution.
Syntax
call sla_gerfsx_extended( prec_type, trans_type, n, nrhs, a, lda, af, ldaf, ipiv,
colequ, c, b, ldb, y, ldy, berr_out, n_norms, errs_n, errs_c, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call dla_gerfsx_extended( prec_type, trans_type, n, nrhs, a, lda, af, ldaf, ipiv,
colequ, c, b, ldb, y, ldy, berr_out, n_norms, errs_n, errs_c, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call cla_gerfsx_extended( prec_type, trans_type, n, nrhs, a, lda, af, ldaf, ipiv,
colequ, c, b, ldb, y, ldy, berr_out, n_norms, errs_n, errs_c, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call zla_gerfsx_extended( prec_type, trans_type, n, nrhs, a, lda, af, ldaf, ipiv,
colequ, c, b, ldb, y, ldy, berr_out, n_norms, errs_n, errs_c, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
Include Files
mkl.fi
Description
The ?la_gerfsx_extended subroutine improves the computed solution to a system of linear equations for
general matrices by performing extra-precise iterative refinement and provides error bounds and backward
error estimates for the solution. The ?gerfsx routine calls ?la_gerfsx_extended to perform iterative
refinement.
In addition to the normwise error bound, the code provides the maximum componentwise error bound, if possible.
See comments for errs_n and errs_c for details of the error bounds.
Use ?la_gerfsx_extended to set only the second fields of errs_n and errs_c.
Input Parameters
prec_type INTEGER.
Specifies the intermediate precision to be used in refinement. The value is
defined by ilaprec(p), where p is a CHARACTER and:
If p = 'S': Single.
If p = 'D': Double.
If p = 'I': Indigenous.
trans_type INTEGER.
Specifies the transposition operation on A. The value is defined by
ilatrans(t), where t is a CHARACTER and:
If t = 'N': No transpose.
If t = 'T': Transpose.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrix B.
The array a contains the original n-by-n matrix A. The second
dimension of a must be at least max(1,n).
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
The array y on entry contains the solution matrix X as computed by
?getrs. The second dimension of y must be at least max(1,nrhs).
ldaf INTEGER. The leading dimension of the array af; ldaf ≥ max(1,n).
ipiv INTEGER.
Array, DIMENSION at least max(1, n). Contains the pivot indices from the
factorization A = P*L*U as computed by ?getrf; row i of the matrix was
interchanged with row ipiv(i).
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldy INTEGER. The leading dimension of the array y; ldy ≥ max(1, n).
n_norms INTEGER. Determines which error bounds to return. See errs_n and
errs_c descriptions in the Output Parameters section below.
If n_norms ≥ 1, returns normwise error bounds.
each right-hand side. If componentwise accuracy is not requested
(params(3) = 0.0), then errs_c is not accessed. If n_err_bnds < 3, then
at most the first (:,n_err_bnds) entries are returned.
rthresh satisfies
0 < rthresh ≤ 1.
The default value is 0.5. For more aggressive refinement, set rthresh to 0.9 to
permit convergence on extremely ill-conditioned matrices.
ignore_cwise LOGICAL
If .TRUE., the function ignores componentwise convergence. Default value
is .FALSE.
Output Parameters
The improved solution matrix Y.
errs_n, errs_c Values of the corresponding input parameters improved after iterative
refinement and stored in the second column of the array ( 1:nrhs, 2 ).
The other elements are kept unchanged.
See Also
?gerfsx
?getrf
?getrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_heamv
Computes a matrix-vector product using a Hermitian
indefinite matrix to calculate error bounds.
Syntax
call cla_heamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call zla_heamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
Include Files
mkl.fi
Description
The ?la_heamv routines perform a matrix-vector operation defined as
y := alpha*abs(A)*abs(x) + beta*abs(y),
where:
alpha and beta are scalars,
x and y are vectors,
A is an n-by-n Hermitian matrix.
This function is primarily used in calculating error bounds. To protect against underflow during evaluation,
the function perturbs components in the resulting vector away from zero by (n + 1) times the underflow
threshold. To prevent unnecessarily large errors for block structure embedded in general matrices, the
function does not perturb symbolically zero components. A zero entry is considered symbolic if all
multiplications involved in computing that entry have at least one zero multiplicand.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the array A is to be
referenced:
If uplo = 'BLAS_UPPER', only the upper triangular part of A is to be
referenced,
If uplo = 'BLAS_LOWER', only the lower triangular part of A is to be
referenced.
n INTEGER. Specifies the number of rows and columns of the matrix A. The
value of n must be at least zero.
Array, DIMENSION at least (1 +(n - 1)*abs(incy)) otherwise. Before
entry with non-zero beta, the incremented array y must contain the vector
Y.
Output Parameters
y Updated vector Y.
?la_hercond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for Hermitian indefinite matrices.
Syntax
call cla_hercond_c( uplo, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
call zla_hercond_c( uplo, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * inv(diag(c))
where c is a REAL vector for cla_hercond_c and a DOUBLE PRECISION vector for zla_hercond_c.
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
Array, DIMENSION(ldaf, *). The block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?hetrf. The second dimension
of af must be at least max(1,n).
ipiv INTEGER.
Array with DIMENSION n. Details of the interchanges and the block structure of D
as determined by ?hetrf.
op(A) * inv(diag(c)).
capply LOGICAL. If .TRUE., then the function uses the vector c from the formula
op(A) * inv(diag(c)).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?hetrf
?la_hercond_x
Computes the infinity norm condition number of
op(A)*diag(x) for Hermitian indefinite matrices.
Syntax
call cla_hercond_x( uplo, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
call zla_hercond_x( uplo, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * diag(x)
where x is a COMPLEX vector for cla_hercond_x and a DOUBLE COMPLEX vector for zla_hercond_x.
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
ipiv INTEGER.
Array with DIMENSION n. Details of the interchanges and the block structure of D
as determined by ?hetrf.
op(A) * inv(diag(x)).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?hetrf
?la_herfsx_extended
Improves the computed solution to a system of linear
equations for Hermitian indefinite matrices by
performing extra-precise iterative refinement and
provides error bounds and backward error estimates
for the solution.
Syntax
call cla_herfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call zla_herfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
Include Files
mkl.fi
Description
The ?la_herfsx_extended subroutine improves the computed solution to a system of linear equations by
performing extra-precise iterative refinement and provides error bounds and backward error estimates for
the solution. The ?herfsx routine calls ?la_herfsx_extended to perform iterative refinement.
In addition to the normwise error bound, the code provides the maximum componentwise error bound, if possible.
See comments for err_bnds_norm and err_bnds_comp for details of the error bounds.
Use ?la_herfsx_extended to set only the second fields of err_bnds_norm and err_bnds_comp.
Input Parameters
prec_type INTEGER.
Specifies the intermediate precision to be used in refinement. The value is
defined by ilaprec(p), where p is a CHARACTER and:
If p = 'S': Single.
If p = 'D': Double.
If p = 'I': Indigenous.
If uplo = 'U', the upper triangle of A is stored,
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrix B.
The array a contains the original n-by-n matrix A. The second dimension of
a must be at least max(1,n).
The array af contains the block diagonal matrix D and the multipliers used
to obtain the factor U or L as computed by ?hetrf. The second dimension
of af must be at least max(1,n).
ldaf INTEGER. The leading dimension of the array af; ldaf ≥ max(1,n).
ipiv INTEGER.
Array, DIMENSION n. Details of the interchanges and the block structure of D
as determined by ?hetrf.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldy INTEGER. The leading dimension of the array y; ldy ≥ max(1, n).
Let z=s*a, where s scales each row by a power
of the radix so all absolute row sums of z are
approximately 1.
Use this subroutine to set only the second field
above.
cla_herfsx_extended and
sqrt(n)*dlamch() for
zla_herfsx_extended to determine if the error
estimate is "guaranteed". These reciprocal
condition numbers are 1/(norm(1/z,inf)*norm(z,inf))
for some appropriately scaled matrix Z.
Let z=s*(a*diag(x)), where x is the solution
for the current right-hand side and s scales each
row of a*diag(x) by a power of the radix so all
absolute row sums of z are approximately 1.
Use this subroutine to set only the second field
above.
rthresh satisfies
0 < rthresh ≤ 1.
The default value is 0.5. For more aggressive refinement, set rthresh to 0.9 to
permit convergence on extremely ill-conditioned matrices.
ignore_cwise LOGICAL
If .TRUE., the function ignores componentwise convergence. Default value
is .FALSE.
Output Parameters
See Also
?herfsx
?hetrf
?hetrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_herpvgrw
Computes the reciprocal pivot growth factor
norm(A)/norm(U) for a Hermitian indefinite matrix.
Syntax
call cla_herpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
call zla_herpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
Include Files
mkl.fi
Description
The ?la_herpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
absolute element norm is used. If this is much less than 1, the stability of the LU factorization of the
equilibrated matrix A could be poor. This also means that the solution X, estimated condition numbers, and
error bounds could be unreliable.
Input Parameters
info INTEGER. The value of INFO returned from ?hetrf, that is, the pivot
in column info is exactly 0.
ipiv INTEGER.
Array, DIMENSION n. Details of the interchanges and the block
structure of D as determined by ?hetrf.
DOUBLE PRECISION for zla_herpvgrw.
Array, DIMENSION 2*n. Workspace.
See Also
?hetrf
?la_lin_berr
Computes component-wise relative backward error.
Syntax
call sla_lin_berr(n, nz, nrhs, res, ayb, berr )
call dla_lin_berr(n, nz, nrhs, res, ayb, berr )
call cla_lin_berr(n, nz, nrhs, res, ayb, berr )
call zla_lin_berr(n, nz, nrhs, res, ayb, berr )
Include Files
mkl.fi
Description
The ?la_lin_berr routine computes a component-wise relative backward error from the formula:
max(i) ( abs(R(i)) / ( abs(op(A_s))*abs(Y) + abs(B_s) )(i) )
Input Parameters
res is the residual matrix, that is, the matrix R in the relative backward
error formula.
ayb is the denominator of that formula, that is, the matrix
abs(op(A_s))*abs(Y) + abs(B_s). The matrices A, Y, and B are from
iterative refinement. See description of ?la_gerfsx_extended.
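The relationship between res, ayb, and the returned error can be sketched in a few lines. The module below is only an illustration: lin_berr_ref is a hypothetical name, not an MKL routine, it ignores the nz argument, and it assumes that berr holds one value per right-hand side and that rows with a zero denominator are skipped.

```fortran
module lin_berr_sketch
  implicit none
contains
  ! Hypothetical helper, not an MKL routine: for each right-hand side j,
  ! returns the largest ratio abs(res(i,j)) / ayb(i,j) over the rows i.
  ! Rows whose denominator is exactly zero are skipped in this sketch.
  subroutine lin_berr_ref(n, nrhs, res, ayb, berr)
    integer, intent(in) :: n, nrhs
    double precision, intent(in) :: res(n, nrhs), ayb(n, nrhs)
    double precision, intent(out) :: berr(nrhs)
    integer :: i, j
    do j = 1, nrhs
      berr(j) = 0.0d0
      do i = 1, n
        if (ayb(i, j) /= 0.0d0) then
          berr(j) = max(berr(j), abs(res(i, j)) / ayb(i, j))
        end if
      end do
    end do
  end subroutine lin_berr_ref
end module lin_berr_sketch
```

The library routine may additionally guard the numerator against underflow; that detail is left out here.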
Output Parameters
See Also
?lamch
?la_gerfsx_extended
?la_porcond
Estimates the Skeel condition number for a symmetric
positive-definite matrix.
Syntax
call sla_porcond( uplo, n, a, lda, af, ldaf, cmode, c, info, work, iwork )
call dla_porcond( uplo, n, a, lda, af, ldaf, cmode, c, info, work, iwork )
Include Files
mkl.fi
Description
The function estimates the Skeel condition number of
op(A) * op2(C)
where
the cmode parameter determines op2 as follows:
1 C
0 I
-1 inv(C)
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
If cmode = 0, op2(C) = I.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?potrf
?la_porcond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for Hermitian positive-definite
matrices.
Syntax
call cla_porcond_c( uplo, n, a, lda, af, ldaf, c, capply, info, work, rwork )
call zla_porcond_c( uplo, n, a, lda, af, ldaf, c, capply, info, work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * inv(diag(c))
where the c is a REAL vector for cla_porcond_c and a DOUBLE PRECISION vector for zla_porcond_c.
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
op(A) * inv(diag(c)).
capply LOGICAL. If .TRUE., then the function uses the vector c from the formula
op(A) * inv(diag(c)).
Array, DIMENSION n. Workspace.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?potrf
?la_porcond_x
Computes the infinity norm condition number of
op(A)*diag(x) for Hermitian positive-definite matrices.
Syntax
call cla_porcond_x( uplo, n, a, lda, af, ldaf, x, info, work, rwork )
call zla_porcond_x( uplo, n, a, lda, af, ldaf, x, info, work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * diag(x)
where the x is a COMPLEX vector for cla_porcond_x and a DOUBLE COMPLEX vector for zla_porcond_x.
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
op(A) * inv(diag(x)).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?potrf
?la_porfsx_extended
Improves the computed solution to a system of linear
equations for symmetric or Hermitian positive-definite
matrices by performing extra-precise iterative
refinement and provides error bounds and backward
error estimates for the solution.
Syntax
call sla_porfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, colequ, c, b,
ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call dla_porfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, colequ, c, b,
ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call cla_porfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, colequ, c, b,
ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call zla_porfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, colequ, c, b,
ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
Include Files
mkl.fi
Description
The ?la_porfsx_extended subroutine improves the computed solution to a system of linear equations by
performing extra-precise iterative refinement and provides error bounds and backward error estimates for
the solution. The ?porfsx routine calls ?la_porfsx_extended to perform iterative refinement.
In addition to the normwise error bound, the code provides the maximum componentwise error bound, if possible.
See comments for err_bnds_norm and err_bnds_comp for details of the error bounds.
Use ?la_porfsx_extended to set only the second fields of err_bnds_norm and err_bnds_comp.
Input Parameters
prec_type INTEGER.
Specifies the intermediate precision to be used in refinement. The value is
defined by ilaprec(p), where p is a CHARACTER and:
If p = 'S': Single.
If p = 'D': Double.
If p = 'I': Indigenous.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrix B.
The array a contains the original n-by-n matrix A. The second dimension of
a must be at least max(1,n).
The array af contains the triangular factor L or U from the Cholesky
factorization as computed by ?potrf:
ldaf INTEGER. The leading dimension of the array af; ldaf ≥ max(1,n).
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldy INTEGER. The leading dimension of the array y; ldy ≥ max(1, n).
The first index in err_bnds_norm(i,:) corresponds to the i-th right-hand
side.
The second index in err_bnds_norm(:,err) contains the following three
fields:
condition numbers are 1/(norm(1/
z,inf)*norm(z,inf)) for some appropriately
scaled matrix Z.
Let z=s*(a*diag(x)), where x is the solution
for the current right-hand side and s scales each
row of a*diag(x) by a power of the radix so all
absolute row sums of z are approximately 1.
Use this subroutine to set only the second field
above.
rthresh satisfies
0 < rthresh ≤ 1.
The default value is 0.5. For 'aggressive', set rthresh to 0.9 to permit convergence
on extremely ill-conditioned matrices.
ignore_cwise LOGICAL
If .TRUE., the function ignores componentwise convergence. Default value
is .FALSE.
Output Parameters
See Also
?porfsx
?potrf
?potrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_porpvgrw
Computes the reciprocal pivot growth factor norm(A)/
norm(U) for a symmetric or Hermitian positive-
definite matrix.
Syntax
call sla_porpvgrw( uplo, ncols, a, lda, af, ldaf, work )
call dla_porpvgrw( uplo, ncols, a, lda, af, ldaf, work )
call cla_porpvgrw( uplo, ncols, a, lda, af, ldaf, work )
call zla_porpvgrw( uplo, ncols, a, lda, af, ldaf, work )
Include Files
mkl.fi
Description
The ?la_porpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
absolute element norm is used. If this is much less than 1, the stability of the LU factorization of the
equilibrated matrix A could be poor. This also means that the solution X, estimated condition numbers, and
error bounds could be unreliable.
Input Parameters
The array a contains the input n-by-n matrix A. The second dimension
of a must be at least max(1,n).
See Also
?potrf
?laqhe
Scales a Hermitian matrix.
Syntax
call claqhe( uplo, n, a, lda, s, scond, amax, equed )
call zlaqhe( uplo, n, a, lda, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a Hermitian matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
Specifies whether to store the upper or lower part of the Hermitian matrix
A.
If uplo = 'U', the upper triangular part of A;
If uplo = 'U', the leading n-by-n upper triangular part of a contains the
upper triangular part of matrix A and the strictly lower triangular part of a is
not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of a contains the
lower triangular part of matrix A and the strictly upper triangular part of a is
not referenced.
lda ≥ max(n,1).
Output Parameters
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, that is, A has been replaced by
diag(s)*A*diag(s).
Application Notes
The routine uses internal parameters thresh, large, and small. The parameter thresh is a threshold value
used to decide if scaling should be done based on the ratio of the scaling factors. If scond < thresh, scaling
is done.
The large and small parameters are threshold values used to decide if scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
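The decision rule in these notes can be written out as a short sketch. The module below is illustrative only: equilibrate_ref is a hypothetical name, the values of thresh, small, and large are passed in by the caller because they are internal to the library, and for brevity the whole square array is scaled rather than just the triangle selected by uplo.

```fortran
module laqhe_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Hypothetical helper, not an MKL routine. thresh, small and large are
  ! arguments because their internal values are not given in the text.
  subroutine equilibrate_ref(n, a, lda, s, scond, amax, thresh, small, large, equed)
    integer, intent(in) :: n, lda
    complex(dp), intent(inout) :: a(lda, *)
    real(dp), intent(in) :: s(n), scond, amax, thresh, small, large
    character, intent(out) :: equed
    integer :: i, j
    if (scond < thresh .or. amax > large .or. amax < small) then
      ! Replace A by diag(s)*A*diag(s): element (i,j) is scaled by s(i)*s(j).
      do j = 1, n
        do i = 1, n
          a(i, j) = s(i) * s(j) * a(i, j)
        end do
      end do
      equed = 'Y'
    else
      equed = 'N'
    end if
  end subroutine equilibrate_ref
end module laqhe_sketch
```

Because s is real, the scaled matrix stays Hermitian whenever the input is Hermitian.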
?laqhp
Scales a Hermitian matrix stored in packed form.
Syntax
call claqhp( uplo, n, ap, s, scond, amax, equed )
call zlaqhp( uplo, n, ap, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a Hermitian matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
Specifies whether to store the upper or lower part of the Hermitian matrix
A.
If uplo = 'U', the upper triangular part of A;
Output Parameters
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, that is, A has been replaced by
diag(s)*A*diag(s).
Application Notes
The routine uses internal parameters thresh, large, and small. The parameter thresh is a threshold value
used to decide if scaling should be done based on the ratio of the scaling factors. If scond < thresh, scaling
is done.
The large and small parameters are threshold values used to decide if scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
?larcm
Multiplies a square real matrix by a complex matrix.
Syntax
call clarcm( m, n, a, lda, b, ldb, c, ldc, rwork )
call zlarcm( m, n, a, lda, b, ldb, c, ldc, rwork )
Include Files
mkl.fi
Description
Input Parameters
m INTEGER. The number of rows and columns of the matrix A and the
number of rows of the matrix C (m ≥ 0).
ldc INTEGER. The leading dimension of the output array c, ldc ≥ max(1, m).
Output Parameters
?la_gerpvgrw
Computes the reciprocal pivot growth factor norm(A)/
norm(U) for a general matrix.
Syntax
call sla_gerpvgrw( n, ncols, a, lda, af, ldaf )
call dla_gerpvgrw( n, ncols, a, lda, af, ldaf )
call cla_gerpvgrw( n, ncols, a, lda, af, ldaf )
call zla_gerpvgrw( n, ncols, a, lda, af, ldaf )
Include Files
mkl.fi
Description
The ?la_gerpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
absolute element norm is used. If this is much less than 1, the stability of the LU factorization of the
equilibrated matrix A could be poor. This also means that the solution X, estimated condition numbers, and
error bounds could be unreliable.
Input Parameters
The array a contains the input n-by-n matrix A. The second dimension
of a must be at least max(1,n).
lda INTEGER. The leading dimension of a; lda ≥ max(1,n).
See Also
?getrf
?larscl2
Performs reciprocal diagonal scaling on a vector.
Syntax
call slarscl2(m, n, d, x, ldx)
call dlarscl2(m, n, d, x, ldx)
call clarscl2(m, n, d, x, ldx)
call zlarscl2(m, n, d, x, ldx)
Include Files
mkl.fi
Description
The ?larscl2 routines perform reciprocal diagonal scaling on a vector
x := D^(-1)*x,
where:
x is a vector, and
D is a diagonal matrix.
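As a rough illustration of the operation, the loop below divides each row of x by the matching diagonal entry. It is a sketch under assumptions read off the argument list rather than stated in the text: d is taken to hold the m diagonal entries of D, n is taken to be the number of columns of x, and rscl2_ref is a hypothetical name, not the MKL routine.

```fortran
module rscl2_sketch
  implicit none
contains
  ! Hypothetical reference loop, not the MKL implementation: divides row i
  ! of the m-by-n array x by d(i), which applies the inverse of D = diag(d).
  subroutine rscl2_ref(m, n, d, x, ldx)
    integer, intent(in) :: m, n, ldx
    double precision, intent(in) :: d(m)
    double precision, intent(inout) :: x(ldx, n)
    integer :: i, j
    do j = 1, n
      do i = 1, m
        x(i, j) = x(i, j) / d(i)
      end do
    end do
  end subroutine rscl2_ref
end module rscl2_sketch
```

Replacing the division by a multiplication gives the corresponding non-reciprocal scaling x := D*x.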
Input Parameters
m INTEGER. Specifies the number of rows of the matrix D and the number of
elements of the vector x. The value of m must be at least zero.
ldx INTEGER.
The leading dimension of the vector x. The value of ldx must be at least
zero.
Output Parameters
x Scaled vector x.
?lascl2
Performs diagonal scaling on a vector.
Syntax
call slascl2(m, n, d, x, ldx)
call dlascl2(m, n, d, x, ldx)
call clascl2(m, n, d, x, ldx)
call zlascl2(m, n, d, x, ldx)
Include Files
mkl.fi
Description
The ?lascl2 routines perform diagonal scaling on a vector
x := D*x,
where:
x is a vector, and
D is a diagonal matrix.
Input Parameters
m INTEGER. Specifies the number of rows of the matrix D and the number of
elements of the vector x. The value of m must be at least zero.
ldx INTEGER.
The leading dimension of the vector x. The value of ldx must be at least
zero.
Output Parameters
x Scaled vector x.
?la_syamv
Computes a matrix-vector product using a symmetric
indefinite matrix to calculate error bounds.
Syntax
call sla_syamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call dla_syamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call cla_syamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call zla_syamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
Include Files
mkl.fi
Description
The ?la_syamv routines perform a matrix-vector operation defined as
y := alpha*abs(A)*abs(x) + beta*abs(y),
where:
alpha and beta are scalars,
x and y are vectors,
A is an n-by-n symmetric matrix.
This function is primarily used in calculating error bounds. To protect against underflow during evaluation,
the function perturbs components in the resulting vector away from zero by (n + 1) times the underflow
threshold. To prevent unnecessarily large errors for block structure embedded in general matrices, the
function does not perturb symbolically zero components. A zero entry is considered symbolic if all
multiplications involved in computing that entry have at least one zero multiplicand.
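The perturbation and the symbolic-zero rule described above can be sketched for a real symmetric matrix held in full storage with unit strides. This is an illustration, not the MKL implementation: syamv_ref is a hypothetical name, uplo, incx, and incy are omitted, and tiny(1.0d0) is used as a stand-in for the underflow threshold.

```fortran
module syamv_sketch
  implicit none
contains
  ! Hypothetical reference loop, not the MKL implementation.
  ! Computes y := alpha*abs(A)*abs(x) + beta*abs(y) for a full n-by-n array.
  subroutine syamv_ref(n, alpha, a, x, beta, y)
    integer, intent(in) :: n
    double precision, intent(in) :: alpha, beta, a(n, n), x(n)
    double precision, intent(inout) :: y(n)
    double precision :: safe1, temp
    logical :: symb_zero
    integer :: i, j
    ! Stand-in for (n + 1) times the underflow threshold.
    safe1 = (n + 1) * tiny(1.0d0)
    do i = 1, n
      ! An entry stays symbolically zero only while every product that
      ! contributes to it has at least one zero factor.
      if (beta == 0.0d0) then
        symb_zero = .true.
        y(i) = 0.0d0
      else if (y(i) == 0.0d0) then
        symb_zero = .true.
      else
        symb_zero = .false.
        y(i) = beta * abs(y(i))
      end if
      if (alpha /= 0.0d0) then
        do j = 1, n
          temp = abs(a(i, j))
          symb_zero = symb_zero .and. (x(j) == 0.0d0 .or. temp == 0.0d0)
          y(i) = y(i) + alpha * abs(x(j)) * temp
        end do
      end if
      ! Only entries that are not symbolically zero are pushed away from zero.
      if (.not. symb_zero) y(i) = y(i) + sign(safe1, y(i))
    end do
  end subroutine syamv_ref
end module syamv_sketch
```

A row of A that is entirely zero, combined with a zero entry in y, therefore yields an exact zero in the result, while every other entry is shifted by the small safeguard.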
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the array A is to be
referenced:
If uplo = 'BLAS_UPPER', only the upper triangular part of A is to be
referenced,
If uplo = 'BLAS_LOWER', only the lower triangular part of A is to be
referenced.
n INTEGER. Specifies the number of rows and columns of the matrix A. The
value of n must be at least zero.
Output Parameters
y Updated vector Y.
?la_syrcond
Estimates the Skeel condition number for a symmetric
indefinite matrix.
Syntax
call sla_syrcond( uplo, n, a, lda, af, ldaf, ipiv, cmode, c, info, work, iwork )
call dla_syrcond( uplo, n, a, lda, af, ldaf, ipiv, cmode, c, info, work, iwork )
Include Files
mkl.fi
Description
The function estimates the Skeel condition number of
op(A) * op2(C)
where
the cmode parameter determines op2 as follows:
1 C
0 I
-1 inv(C)
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
ipiv INTEGER.
Array with DIMENSION n. Details of the interchanges and the block structure of D
as determined by ?sytrf.
If cmode = 0, op2(C) = I.
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?sytrf
?la_syrcond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for symmetric indefinite matrices.
Syntax
call cla_syrcond_c( uplo, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
call zla_syrcond_c( uplo, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * inv(diag(c))
where the c is a REAL vector for cla_syrcond_c and a DOUBLE PRECISION vector for zla_syrcond_c.
Input Parameters
Specifies the triangle of A to store:
If uplo = 'U', the upper triangle of A is stored,
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
ipiv INTEGER.
Array with DIMENSION n. Details of the interchanges and the block structure of D
as determined by ?sytrf.
op(A) * inv(diag(c)).
capply LOGICAL. If .TRUE., then the function uses the vector c from the formula
op(A) * inv(diag(c)).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?sytrf
?la_syrcond_x
Computes the infinity norm condition number of
op(A)*diag(x) for symmetric indefinite matrices.
Syntax
call cla_syrcond_x( uplo, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
call zla_syrcond_x( uplo, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * diag(x)
where the x is a COMPLEX vector for cla_syrcond_x and a DOUBLE COMPLEX vector for zla_syrcond_x.
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the matrix A; n ≥ 0.
ipiv INTEGER.
Array with DIMENSION n. Details of the interchanges and the block structure of D
as determined by ?sytrf.
op(A) * inv(diag(x)).
Output Parameters
info INTEGER.
If info = 0, the execution is successful.
See Also
?sytrf
?la_syrfsx_extended
Improves the computed solution to a system of linear
equations for symmetric indefinite matrices by
performing extra-precise iterative refinement and
provides error bounds and backward error estimates
for the solution.
Syntax
call sla_syrfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call dla_syrfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call cla_syrfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call zla_syrfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
Include Files
mkl.fi
Description
The ?la_syrfsx_extended subroutine improves the computed solution to a system of linear equations by
performing extra-precise iterative refinement and provides error bounds and backward error estimates for
the solution. The ?syrfsx routine calls ?la_syrfsx_extended to perform iterative refinement.
In addition to the normwise error bound, the code provides the maximum componentwise error bound, if possible.
See comments for err_bnds_norm and err_bnds_comp for details of the error bounds.
Use ?la_syrfsx_extended to set only the second fields of err_bnds_norm and err_bnds_comp.
Input Parameters
prec_type INTEGER.
Specifies the intermediate precision to be used in refinement. The value is
defined by ilaprec(p), where p is a CHARACTER and:
If p = 'S': Single.
If p = 'D': Double.
If p = 'I': Indigenous.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrix B.
The array a contains the original n-by-n matrix A. The second dimension of
a must be at least max(1,n).
The array af contains the block diagonal matrix D and the multipliers used
to obtain the factor U or L as computed by ?sytrf.
The array y on entry contains the solution matrix X as computed by
?sytrs. The second dimension of y must be at least max(1,nrhs).
ldaf INTEGER. The leading dimension of the array af; ldaf ≥ max(1,n).
ipiv INTEGER.
Array with DIMENSION n. Details of the interchanges and the block structure
of D as determined by ?sytrf.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
ldy INTEGER. The leading dimension of the array y; ldy ≥ max(1, n).
Array, DIMENSION(nrhs,n_err_bnds). For each right-hand side, contains
information about various error bounds and condition numbers
corresponding to the componentwise relative error, which is defined as
follows:
Componentwise relative error in the i-th solution vector:
where norm(z) is the infinity norm of Z.
rthresh satisfies
0 < rthresh ≤ 1.
The default value is 0.5. For 'aggressive', set rthresh to 0.9 to permit convergence
on extremely ill-conditioned matrices.
ignore_cwise LOGICAL
If .TRUE., the function ignores componentwise convergence. Default value
is .FALSE.
Output Parameters
See Also
?syrfsx
?sytrf
?sytrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_syrpvgrw
Computes the reciprocal pivot growth factor norm(A)/
norm(U) for a symmetric indefinite matrix.
Syntax
call sla_syrpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
call dla_syrpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
call cla_syrpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
call zla_syrpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
Include Files
mkl.fi
Description
The ?la_syrpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
absolute element norm is used. If this is much less than 1, the stability of the LU factorization of the
equilibrated matrix A could be poor. This also means that the solution X, estimated condition numbers, and
error bounds could be unreliable.
Input Parameters
info INTEGER. The value of INFO returned from ?sytrf, that is, the pivot
in column info is exactly 0.
The array a contains the input n-by-n matrix A. The second dimension
of a must be at least max(1,n).
The array af contains the block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?sytrf.
ipiv INTEGER.
Array, DIMENSION n. Details of the interchanges and the block
structure of D as determined by ?sytrf.
See Also
?sytrf
?la_wwaddw
Adds a vector into a doubled-single vector.
Syntax
call sla_wwaddw( n, x, y, w )
call dla_wwaddw( n, x, y, w )
call cla_wwaddw( n, x, y, w )
call zla_wwaddw( n, x, y, w )
Include Files
mkl.fi
Description
The ?la_wwaddw routine adds a vector W into a doubled-single vector (X, Y). This works for all existing IBM
hex and binary floating-point arithmetics, but not for decimal.
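One way to picture the operation is the compensated-addition loop below, in which x carries the leading part of each sum and y accumulates what rounding would otherwise discard. It is a sketch for real double precision data only; wwaddw_ref is a hypothetical name and the loop is not claimed to be MKL's implementation.

```fortran
module wwaddw_sketch
  implicit none
contains
  ! Hypothetical reference loop, not the MKL implementation: adds w into
  ! the pair (x, y), where x is the head and y the tail of each entry.
  subroutine wwaddw_ref(n, x, y, w)
    integer, intent(in) :: n
    double precision, intent(inout) :: x(n), y(n)
    double precision, intent(in) :: w(n)
    double precision :: s
    integer :: i
    do i = 1, n
      s = x(i) + w(i)
      ! Forces s to a value representable in working precision.
      s = (s + s) - s
      ! The part of x(i) + w(i) lost when forming s goes into the tail.
      y(i) = ((x(i) - s) + w(i)) + y(i)
      x(i) = s
    end do
  end subroutine wwaddw_ref
end module wwaddw_sketch
```

For example, adding w = 1.0d-20 to the pair (1.0d0, 0.0d0) leaves the head at 1.0d0 and stores the small addend in the tail instead of losing it.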
Input Parameters
Output Parameters
x, y Contain the first and second parts of the doubled-single accumulation vector,
respectively, after adding the vector W.
mkl_?tppack
Copies a triangular/symmetric matrix or submatrix
from standard full format to standard packed format.
Syntax
call mkl_stppack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_dtppack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_ctppack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_ztppack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_tppack (ap, i, j, rows, cols, a[, uplo] [, trans] [, info])
Include Files
mkl.fi, lapack.f90
Description
The routine copies a triangular or symmetric matrix or its submatrix from standard full format to packed
format.
AP(i:i+rows-1, j:j+cols-1) := op(A)
GE: general
TR: triangular
SY: symmetric indefinite
HE: Hermitian indefinite
PO: symmetric or Hermitian positive definite
NOTE
Any elements of the copied rectangular submatrix that lie outside the triangular part of the matrix AP are
skipped.
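The skipping behavior in the note can be illustrated with a small sketch for the upper triangular, non-transposed case. It assumes the usual LAPACK column-wise packed layout, in which element (r,c) with r ≤ c is stored at ap(r + (c-1)*c/2); pack_upper_ref, i0, and j0 are hypothetical names, and this is not the MKL routine itself.

```fortran
module tppack_sketch
  implicit none
contains
  ! Hypothetical helper, not mkl_?tppack: copies the rows-by-cols array a
  ! into positions (i0:i0+rows-1, j0:j0+cols-1) of an upper triangular
  ! matrix of order n held in packed form in ap. Assumes element (r,c),
  ! r <= c, lives at ap(r + (c-1)*c/2). Targets below the diagonal are skipped.
  subroutine pack_upper_ref(n, ap, i0, j0, rows, cols, a, lda)
    integer, intent(in) :: n, i0, j0, rows, cols, lda
    double precision, intent(inout) :: ap(n*(n+1)/2)
    double precision, intent(in) :: a(lda, cols)
    integer :: r, c
    do c = j0, j0 + cols - 1
      ! The upper bound min(c, ...) drops rows that fall below the diagonal.
      do r = i0, min(c, i0 + rows - 1)
        ap(r + (c - 1) * c / 2) = a(r - i0 + 1, c - j0 + 1)
      end do
    end do
  end subroutine pack_upper_ref
end module tppack_sketch
```

With n = 3, i0 = 2, and j0 = 1, for instance, only the block entry that lands on position (2,2) is copied; the entries aimed at (2,1), (3,1), and (3,2) fall below the diagonal and are left out.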
Input Parameters
The data types are given for the Fortran interface.
uplo CHARACTER*1. Specifies whether the matrix AP is upper or lower
triangular.
If uplo = 'U', AP is upper triangular.
If trans = 'C', conjugate transpose: op(A) = A^H. For real data this is the
same as trans = 'T'.
If uplo = 'L', 1 ≤ j ≤ i ≤ n.
NOTE
If there are elements outside of the triangular part of AP, they are
skipped and are not copied from a.
lda ≥ max(1, rows) for trans = 'N' and lda ≥ max(1, cols) for
trans = 'T' or trans = 'C'.
Output Parameters
info INTEGER. If info=0, the execution is successful. If info = -i, the i-th
parameter had an illegal value.
mkl_?tpunpack
Copies a triangular/symmetric matrix or submatrix
from standard packed format to full format.
Syntax
call mkl_stpunpack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_dtpunpack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_ctpunpack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_ztpunpack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_tpunpack (ap, i, j, rows, cols, a[, uplo] [, trans] [, info])
Include Files
mkl.fi, lapack.f90
Description
The routine copies a triangular or symmetric matrix or its submatrix from standard packed format to full
format.
A := op(AP(i:i+rows-1, j:j+cols-1))
GE: general
TR: triangular
SY: symmetric indefinite
HE: Hermitian indefinite
PO: symmetric or Hermitian positive definite
NOTE
Any elements of the copied rectangular submatrix that lie outside the triangular part of AP are skipped.
Input Parameters
The data types are given for the Fortran interface.
If trans = 'C', conjugate transpose: op(AP) = AP^H. For real data this is the
same as trans = 'T'.
If uplo = 'L', 1 ≤ j ≤ i ≤ n.
lda ≥ max(1,rows) for trans = 'N' and lda ≥ max(1,cols) for
trans = 'T' or trans = 'C'.
Output Parameters
NOTE
If there are elements outside of the triangular part of ap indicated by
uplo, they are skipped and are not copied to a.
info INTEGER. If info=0, the execution is successful. If info = -i, the i-th
parameter had an illegal value.
ieeeck Checks if the infinity and NaN arithmetic is safe. Called by ilaenv.
?labad s, d Returns the square root of the underflow and overflow thresholds if
the exponent-range is very large.
See Also
lsame Tests two characters for equality regardless of the case.
lsamen Tests two character strings for equality regardless of the case.
second/dsecnd Returns elapsed time in seconds. Use to estimate real time between two calls to
this function.
xerbla Error handling function called by BLAS, LAPACK, Vector Math, and Vector Statistics
functions.
ilaver
Returns the version of the LAPACK library.
Syntax
call ilaver( vers_major, vers_minor, vers_patch )
Include Files
mkl.fi
Description
This routine returns the version of the LAPACK library.
Output Parameters
vers_major INTEGER.
Returns the major version of the LAPACK library.
vers_minor INTEGER.
Returns the minor version from the major version of the LAPACK library.
vers_patch INTEGER.
Returns the patch version from the minor version of the LAPACK library.
ilaenv
Environmental enquiry function that returns values for
tuning algorithmic performance.
Syntax
value = ilaenv( ispec, name, opts, n1, n2, n3, n4 )
Include Files
mkl.fi
Description
The enquiry function ilaenv is called from the LAPACK routines to choose problem-dependent parameters
for the local environment. See ispec below for a description of the parameters.
This version provides a set of parameters that should give good, but not optimal, performance on many of
the currently available computers.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and
SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-
dependent optimizations in this product are intended for use with Intel microprocessors. Certain
optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to
the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
Input Parameters
ispec INTEGER.
Specifies the parameter to be returned as the value of ilaenv:
12 ≤ ispec ≤ 16: ?hseqr or one of its subroutines, see iparmq for detailed
explanation.
name CHARACTER*(*). The name of the calling subroutine, in either upper case
or lower case.
NOTE
Use only uppercase characters for the opts string.
n1, n2, n3, n4 INTEGER. Problem dimensions for the subroutine name; these may not all
be required.
Output Parameters
value INTEGER.
If value ≥ 0: the value of the parameter specified by ispec;
Application Notes
The following conventions have been used when calling ilaenv from the LAPACK routines:
1. opts is a concatenation of all of the character options to subroutine name, in the same order that they
appear in the argument list for name, even if they are not used in determining the value of the
parameter specified by ispec.
2. The problem dimensions n1, n2, n3, n4 are specified in the order that they appear in the argument list
for name. n1 is used first, n2 second, and so on, and unused problem dimensions are passed a value of
-1.
3. The parameter value returned by ilaenv is checked for validity in the calling subroutine. For example,
ilaenv is used to retrieve the optimal blocksize for strtri as follows:
nb = ilaenv( 1, 'STRTRI', uplo // diag, n, -1, -1, -1 )
if( nb.le.1 ) nb = max( 1, n )
See Also
?hseqr
iparmq
iparmq
Environmental enquiry function which returns values
for tuning algorithmic performance.
Syntax
value = iparmq( ispec, name, opts, n, ilo, ihi, lwork )
Include Files
mkl.fi
Description
The function sets problem and machine dependent parameters useful for ?hseqr and its subroutines. It is
called whenever ilaenv is called with 12 ≤ ispec ≤ 16.
Input Parameters
ispec INTEGER.
Specifies the parameter to be returned as the value of iparmq:
= 12: (inmin) Matrices of order nmin or less are sent directly to ?lahqr,
the implicit double shift QR algorithm. nmin must be at least 11.
= 13: (inwin) Size of the deflation window. This is best set greater than or
equal to the number of simultaneous shifts ns. Larger matrices benefit from
larger deflation windows.
= 14: (inibl) Determines when to stop nibbling and invest in an
(expensive) multi-shift QR sweep. If the aggressive early deflation
subroutine finds ld converged eigenvalues from an order nw deflation
window and ld>(nw*nibble)/100, then the next QR sweep is skipped and
early deflation is applied immediately to the remaining active diagonal
block. Setting iparmq(ispec=14)=0 causes TTQRE to skip a multi-shift QR
sweep whenever early deflation finds a converged eigenvalue. Setting
iparmq(ispec=14) greater than or equal to 100 prevents TTQRE from
skipping a multi-shift QR sweep.
= 15: (nshfts) The number of simultaneous shifts in a multi-shift QR
iteration.
= 16: (iacc22) iparmq is set to 0, 1 or 2 with the following meanings.
It is assumed that H is already upper triangular in rows and columns
1:ilo-1 and ihi+1:n.
lwork INTEGER.
The amount of workspace available.
Output Parameters
value INTEGER.
If value ≥ 0: the value of the parameter specified by iparmq;
Application Notes
The following conventions have been used when calling ilaenv from the LAPACK routines:
1. opts is a concatenation of all of the character options to subroutine name, in the same order that they
appear in the argument list for name, even if they are not used in determining the value of the
parameter specified by ispec.
2. The problem dimensions n1, n2, n3, n4 are specified in the order that they appear in the argument list
for name. n1 is used first, n2 second, and so on, and unused problem dimensions are passed a value of
-1.
3. The parameter value returned by ilaenv is checked for validity in the calling subroutine. For example,
ilaenv is used to retrieve the optimal blocksize for strtri as follows:
ieeeck
Checks if the infinity and NaN arithmetic is safe.
Called by ilaenv.
Syntax
ival = ieeeck( ispec, zero, one )
Include Files
mkl.fi
Description
The function ieeeck is called from ilaenv to verify that infinity and possibly NaN arithmetic is safe, that is,
will not trap.
Input Parameters
ispec INTEGER.
Specifies whether to test just for infinity arithmetic or both for infinity and
NaN arithmetic:
If ispec = 0: Verify infinity arithmetic only.
Output Parameters
ival INTEGER.
If ival = 0: Arithmetic failed to produce the correct answers.
?labad
Returns the square root of the underflow and overflow
thresholds if the exponent-range is very large.
Syntax
call slabad( small, large )
call dlabad( small, large )
Include Files
mkl.fi
Description
The routine takes as input the values computed by slamch/dlamch for underflow and overflow, and returns
the square root of each of these values if the log of large is sufficiently large. This subroutine is intended to
identify machines with a large exponent range, such as the Crays, and redefine the underflow and overflow
limits to be the square roots of the values computed by ?lamch. This subroutine is needed because ?lamch
does not compensate for poor arithmetic in the upper half of the exponent range, as is found on a Cray.
Input Parameters
Output Parameters
large On exit, if log10(large) is sufficiently large, the square root of large,
otherwise unchanged.
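A minimal usage sketch for dlabad follows, assuming the program is linked against Intel MKL or another LAPACK library. The selector characters 'U' (underflow threshold) and 'O' (overflow threshold) passed to dlamch are the ones defined by reference LAPACK and are an assumption here, since they are not listed in this section.

```fortran
program labad_demo
  implicit none
  double precision, external :: dlamch
  external :: dlabad
  double precision :: small, large

  small = dlamch('U')   ! underflow threshold computed by dlamch
  large = dlamch('O')   ! overflow threshold computed by dlamch

  ! dlabad replaces both values by their square roots only when the
  ! exponent range is very large; otherwise it leaves them unchanged.
  call dlabad(small, large)

  print *, 'small =', small
  print *, 'large =', large
end program labad_demo
```

On hardware with IEEE double precision arithmetic the exponent range is modest, so the values are expected to come back unchanged.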
?lamch
Determines machine parameters for floating-point
arithmetic.
Syntax
val = slamch( cmach )
val = dlamch( cmach )
Include Files
mkl.fi
Description
The function ?lamch determines single precision and double precision machine parameters.
Input Parameters
where
eps = relative machine precision;
sfmin = safe minimum, such that 1/sfmin does not overflow;
base = base of the machine;
prec = eps*base;
t = number of (base) digits in the mantissa;
rnd = 1.0 when rounding occurs in addition, 0.0 otherwise;
emin = minimum exponent before (gradual) underflow;
rmin = underflow threshold, base**(emin-1);
emax = largest exponent before overflow;
rmax = overflow threshold, (base**emax)*(1-eps).
NOTE
You can use a character string for cmach instead of a single character
in order to make your code more readable. The first character of the
string determines the value to be returned. For example, 'Precision'
is interpreted as 'p'.
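The following sketch queries a few of the parameters listed above, assuming the program is linked against Intel MKL or another LAPACK library. The selector characters 'E' (eps) and 'S' (sfmin) follow reference LAPACK and are an assumption here, since only 'p' is shown in this section.

```fortran
program lamch_demo
  implicit none
  double precision, external :: dlamch
  double precision :: eps, sfmin, prec

  eps   = dlamch('E')          ! relative machine precision
  sfmin = dlamch('S')          ! safe minimum, such that 1/sfmin does not overflow
  prec  = dlamch('Precision')  ! only the first character, 'P', is examined

  print *, 'eps   =', eps
  print *, 'sfmin =', sfmin
  print *, 'prec  =', prec     ! eps*base, per the definitions above
end program lamch_demo
```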
Output Parameters
?lamc1
Called from ?lamc2. Determines machine parameters
given by beta, t, rnd, ieee1.
Syntax
call slamc1( beta, t, rnd, ieee1 )
call dlamc1( beta, t, rnd, ieee1 )
Include Files
mkl.fi
Description
The routine ?lamc1 determines machine parameters given by beta, t, rnd, ieee1.
Output Parameters
rnd LOGICAL.
Specifies whether proper rounding ( rnd = .TRUE. ) or chopping ( rnd
= .FALSE. ) occurs in addition. This may not be a reliable guide to the way
in which the machine performs its arithmetic.
ieee1 LOGICAL.
Specifies whether rounding appears to be done in the ieee 'round to
nearest' style.
?lamc2
Used by ?lamch. Determines machine parameters
specified in its arguments list.
Syntax
call slamc2( beta, t, rnd, eps, emin, rmin, emax, rmax )
call dlamc2( beta, t, rnd, eps, emin, rmin, emax, rmax )
Include Files
mkl.fi
Description
The routine ?lamc2 determines machine parameters specified in its arguments list.
Output Parameters
rnd LOGICAL.
Specifies whether proper rounding (rnd = .TRUE.) or chopping (rnd
= .FALSE.) occurs in addition. This may not be a reliable guide to the way
in which the machine performs its arithmetic.
?lamc3
Called from ?lamc1-?lamc5. Intended to force a and
b to be stored prior to doing the addition of a and b.
Syntax
val = slamc3( a, b )
val = dlamc3( a, b )
Include Files
mkl.fi
Description
The routine is intended to force a and b to be stored prior to doing the addition of a and b, for use in
situations where optimizers might hold one of these in a register.
Input Parameters
Output Parameters
?lamc4
This is a service routine for ?lamc2.
Syntax
call slamc4( emin, start, base )
call dlamc4( emin, start, base )
Include Files
mkl.fi
Description
Input Parameters
Output Parameters
?lamc5
Called from ?lamc2. Attempts to compute the largest
machine floating-point number, without overflow.
Syntax
call slamc5( beta, p, emin, ieee, emax, rmax)
call dlamc5( beta, p, emin, ieee, emax, rmax)
Include Files
mkl.fi
Description
The routine ?lamc5 attempts to compute rmax, the largest machine floating-point number, without overflow.
It assumes that emax and abs(emin) sum approximately to a power of 2. It will fail on machines where this
assumption does not hold, for example, the Cyber 205 (emin = -28625, emax = 28718). It will also fail if
the value supplied for emin is too large (that is, too close to zero), probably with overflow.
Input Parameters
ieee LOGICAL. A logical flag specifying whether or not the arithmetic system is
thought to comply with the IEEE standard.
Output Parameters
chla_transtype
Translates a BLAST-specified integer constant to the
character string specifying a transposition operation.
Syntax
val = chla_transtype( trans )
Include Files
mkl.fi
Description
The chla_transtype function translates a BLAST-specified integer constant to the character string
specifying a transposition operation.
The function returns a CHARACTER*1. If the input is not an integer indicating a transposition operator, then
val is 'X'. Otherwise, the function returns the constant value corresponding to trans.
Input Parameters
trans INTEGER.
Specifies the form of the system of equations:
If trans = BLAS_NO_TRANS = 111: No transpose.
Output Parameters
val CHARACTER*1
Character that specifies a transposition operation.
iladiag
Translates a character string specifying whether a
matrix has a unit diagonal to the relevant BLAST-
specified integer constant.
Syntax
val = iladiag( diag )
Include Files
mkl.fi
Description
The iladiag function translates a character string specifying whether a matrix has a unit diagonal or not to
the relevant BLAST-specified integer constant.
The function returns an INTEGER. If val < 0, the input is not a character indicating a unit or non-unit
diagonal. Otherwise, the function returns the constant value corresponding to diag.
Input Parameters
diag CHARACTER*1.
Specifies whether the matrix A has a unit diagonal:
If diag = 'N': A is non-unit triangular.
Output Parameters
val INTEGER
Value returned by the function.
ilaprec
Translates a character string specifying an
intermediate precision to the relevant BLAST-specified
integer constant.
Syntax
val = ilaprec( prec )
Include Files
mkl.fi
Description
The ilaprec function translates a character string specifying an intermediate precision to the relevant
BLAST-specified integer constant.
The function returns an INTEGER. If val < 0, the input is not a character indicating a supported intermediate
precision. Otherwise, the function returns the constant value corresponding to prec.
Input Parameters
prec CHARACTER*1.
Specifies the intermediate precision:
If prec = 'S': Single.
Output Parameters
val INTEGER
Value returned by the function.
ilatrans
Translates a character string specifying a transposition
operation to the BLAST-specified integer constant.
Syntax
val = ilatrans( trans )
Include Files
mkl.fi
Description
The ilatrans function translates a character string specifying a transposition operation to the BLAST-
specified integer constant.
The function returns an INTEGER. If val < 0, the input is not a character indicating a transposition operator.
Otherwise, the function returns the constant value corresponding to trans.
Input Parameters
trans CHARACTER*1.
Specifies the form of the system of equations:
If trans = 'N': No transpose.
Output Parameters
val INTEGER
Value returned by the function.
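The two translation directions (ilatrans and chla_transtype) can be exercised together. This sketch assumes the program is linked against Intel MKL or another LAPACK library that provides both functions; the character 'Q' is used only as an example of an invalid input.

```fortran
program transtype_roundtrip
  implicit none
  integer, external :: ilatrans
  character(len=1), external :: chla_transtype
  integer :: code
  character(len=1) :: c

  code = ilatrans('N')        ! 'N' maps to BLAS_NO_TRANS = 111
  c    = chla_transtype(code) ! maps the constant back to a character

  print '(a,i0,a,a)', 'code = ', code, ', character = ', c

  ! A character that is not a transposition operator yields a negative value.
  if (ilatrans('Q') < 0) print '(a)', 'Q is not a transposition operator'
end program transtype_roundtrip
```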
ilauplo
Translates a character string specifying an upper- or
lower-triangular matrix to the relevant BLAST-
specified integer constant.
Syntax
val = ilauplo( uplo )
Include Files
mkl.fi
Description
The ilauplo function translates a character string specifying an upper- or lower-triangular matrix to the
relevant BLAST-specified integer constant.
The function returns an INTEGER. If val < 0, the input is not a character indicating an upper- or lower-
triangular matrix. Otherwise, the function returns the constant value corresponding to uplo.
Input Parameters
uplo CHARACTER.
Specifies whether the matrix A is upper or lower triangular:
If uplo = 'U': A is upper triangular.
Output Parameters
val INTEGER
Value returned by the function.
xerbla_array
Assists other languages in calling the xerbla function.
Syntax
call xerbla_array( srname_array, srname_len, info )
Include Files
mkl.fi
Description
The routine assists other languages in calling the error handling xerbla function. Rather than taking a
Fortran string argument as the function name, xerbla_array takes an array of single characters along with
the array length. The routine then copies up to 32 characters of that array into a Fortran string and passes
that to xerbla. If called with a non-positive srname_len, the routine will call xerbla with a string of all
blank characters.
If some macro or other device makes xerbla_array available to C99 by a name lapack_xerbla and with a
common Fortran calling convention, a C99 program could invoke xerbla via:
{
int flen = strlen(__func__);
lapack_xerbla(__func__, &flen, &info);
}
Providing xerbla_array is not necessary for intercepting LAPACK errors. xerbla_array calls xerbla.
Input Parameters
srname_array CHARACTER(1).
Array, dimension (srname_len). The name of the routine that called
xerbla_array.
srname_len INTEGER.
The length of the name in srname_array.
info INTEGER.
Position of the invalid parameter in the parameter list of the calling routine.
?latms
Generates a general m-by-n matrix with specific
singular values.
Syntax
call slatms (m, n, dist, iseed, sym, d, mode, cond, dmax, kl, ku, pack, a, lda, work,
info)
call dlatms (m, n, dist, iseed, sym, d, mode, cond, dmax, kl, ku, pack, a, lda, work,
info)
call clatms (m, n, dist, iseed, sym, d, mode, cond, dmax, kl, ku, pack, a, lda, work,
info)
call zlatms (m, n, dist, iseed, sym, d, mode, cond, dmax, kl, ku, pack, a, lda, work,
info)
Description
The ?latms routine generates random matrices with specified singular values, or symmetric/Hermitian
matrices with specified eigenvalues for testing LAPACK programs.
It applies this sequence of operations:
1. Set the diagonal to d, where d is input or computed according to mode, cond, dmax, and sym as
described in Input Parameters.
2. Generate a matrix with the appropriate band structure, by one of two methods:
Method A is chosen if the bandwidth is a large fraction of the order of the matrix, and lda is at least m (so a
dense matrix can be stored). Method B is chosen if the bandwidth is small (less than (1/2)*n for symmetric
or Hermitian or less than 0.3*(n+m) for nonsymmetric), or lda is less than m and not less than the bandwidth.
3. Pack the matrix if desired, using one of the methods specified by the pack parameter.
If Method B is chosen and band format is specified, then the matrix is generated in the band format and no
repacking is necessary.
Input Parameters
The data types are given for the Fortran interface.
sym CHARACTER*1.
If sym='S' or 'H', the generated matrix is symmetric or Hermitian, with
eigenvalues specified by d, cond, mode, and dmax; they can be positive,
negative, or zero.
If sym='P', the generated matrix is symmetric or Hermitian, with
eigenvalues (which equal the singular values and are non-negative) specified by d, cond,
mode, and dmax.
If sym='N', the generated matrix is nonsymmetric, with non-negative
singular values specified by d, cond, mode, and dmax.
This array is used to specify the singular values or eigenvalues of A (see the
description of sym). If mode=0, then d is assumed to contain the
eigenvalues or singular values, otherwise elements of d are computed
according to mode, cond, and dmax.
mode < 0 has the same meaning as ABS(mode), except that the order of the
elements of d is reversed. Thus, if mode is positive, d has entries ranging
from 1 to 1/cond, if negative, from 1/cond to 1.
If sym='S' or 'H', and mode is not 0, 6, nor -6, then the elements of d are
also given a random sign (multiplied by +1 or -1).
NOTE
dmax need not be positive: if dmax is negative (or zero), d will be
scaled by a negative number (or zero).
kl INTEGER. Specifies the lower bandwidth of the matrix. For example, kl=0
implies upper triangular, kl=1 implies upper Hessenberg, and kl being at
least m - 1 means that the matrix has full lower bandwidth. kl must equal
ku if the matrix is symmetric or Hermitian.
ku INTEGER. Specifies the upper bandwidth of the matrix. For example, ku=0
implies lower triangular, ku=1 implies lower Hessenberg, and ku being at
least n - 1 means that the matrix has full upper bandwidth. kl must equal
ku if the matrix is symmetric or Hermitian.
'N': no packing
'U': zero out all subdiagonal entries (if symmetric or Hermitian)
'L': zero out all superdiagonal entries (if symmetric or Hermitian)
'B': store the lower triangle in band storage scheme (only if matrix
symmetric, Hermitian, or lower triangular)
'Q': store the upper triangle in band storage scheme (only if matrix
symmetric, Hermitian, or upper triangular)
'Z': store the entire matrix in band storage scheme (pivoting can be
provided for by using this option to store A in the trailing rows of the
allocated storage)
Using these options, the various LAPACK packed and banded storage
schemes can be obtained:
'Z' 'B' 'Q' 'C' 'R'
If two calls to ?latms differ only in the pack parameter, they generate
mathematically equivalent matrices.
lda INTEGER. lda specifies the first dimension of a as declared in the calling
program.
If pack='N', 'U', 'L', 'C', or 'R', then lda must be at least m.
If pack='Z', lda must be large enough to hold the packed array: MIN( ku,
n - 1) + MIN( kl, m - 1) + 1.
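The lda bound for pack='Z' can be evaluated with a small helper before allocating a. The module and function names below are hypothetical and are not part of Intel MKL; the sketch only restates the formula given above.

```fortran
module latms_lda
  implicit none
contains
  ! Minimum lda for pack = 'Z': MIN(ku, n-1) + MIN(kl, m-1) + 1.
  pure integer function min_lda_pack_z(m, n, kl, ku)
    integer, intent(in) :: m, n    ! matrix dimensions
    integer, intent(in) :: kl, ku  ! lower and upper bandwidths
    min_lda_pack_z = min(ku, n - 1) + min(kl, m - 1) + 1
  end function min_lda_pack_z
end module latms_lda

program latms_lda_demo
  use latms_lda
  implicit none
  ! 6-by-6 tridiagonal matrix (kl = ku = 1): three rows of band storage.
  print '(a,i0)', 'lda >= ', min_lda_pack_z(6, 6, 1, 1)
  ! Bandwidths larger than the matrix are clipped: 4-by-5 with kl = ku = 10.
  print '(a,i0)', 'lda >= ', min_lda_pack_z(4, 5, 10, 10)
end program latms_lda_demo
```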
Output Parameters
NOTE
The array d is not modified if mode = 0.
elements of the first n columns are modified; the elements of the array
which do not correspond to elements of the generated matrix are set to
zero.
Workspace.
ScaLAPACK Routines 4
Intel Math Kernel Library implements routines from the ScaLAPACK package for distributed-memory
architectures. Routines are supported for both real and complex dense and band matrices to perform the
tasks of solving systems of linear equations, solving linear least-squares problems, eigenvalue and singular
value problems, as well as performing a number of related computational tasks.
Intel MKL ScaLAPACK routines are written in FORTRAN 77 with the exception of a few utility routines written in C
to exploit the IEEE arithmetic. All routines are available in all precision types: single precision, double
precision, complex, and double complex precision. See the mkl_scalapack.h header file for C declarations
of ScaLAPACK routines.
NOTE
ScaLAPACK routines are provided only for Intel 64 or Intel Many Integrated Core architectures.
See descriptions of ScaLAPACK computational routines that perform distinct computational tasks, as well as
driver routines for solving standard types of problems in one call. Additionally, Intel Math Kernel Library
implements ScaLAPACK Auxiliary Routines, Utility Functions and Routines, and Matrix Redistribution/Copy
Routines. The library includes routines for both real and complex data.
The <install_directory>/examples/scalapackf directory contains sample code demonstrating the use
of ScaLAPACK routines.
Generally, ScaLAPACK runs on a network of computers using MPI as a message-passing layer and a set of
prebuilt communication subprograms (BLACS), as well as a set of BLAS optimized for the target architecture.
The Intel MKL version of ScaLAPACK is optimized for Intel processors. For the detailed system and environment
requirements, see Intel MKL Release Notes and Intel MKL Developer Guide.
For full reference on ScaLAPACK routines and related information, see [SLUG].
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and
SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-
dependent optimizations in this product are intended for use with Intel microprocessors. Certain
optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to
the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
Element Name  Stored in       Description                                                                         Element Index Number
dtype_a       desca(dtype_)   Descriptor type (=1 for dense matrices).                                            1
ctxt_a        desca(ctxt_)    BLACS context handle for the process grid.                                          2
m_a           desca(m_)       Number of rows in the global matrix A.                                              3
n_a           desca(n_)       Number of columns in the global matrix A.                                           4
mb_a          desca(mb_)      Row blocking factor.                                                                5
nb_a          desca(nb_)      Column blocking factor.                                                             6
rsrc_a        desca(rsrc_)    Process row over which the first row of the global matrix A is distributed.         7
csrc_a        desca(csrc_)    Process column over which the first column of the global matrix A is distributed.   8
lld_a         desca(lld_)     Leading dimension of the local matrix A.                                            9
Similar notations are used for different matrices. For example: lld_b is the leading dimension of the local
matrix storing the local blocks of the distributed matrix B and dtype_z is the type of the global matrix Z.
The number of rows and columns of a global dense matrix that a particular process in a grid receives after
the data is distributed is denoted by LOCr() and LOCc(), respectively. To compute these numbers, you can use the
ScaLAPACK tool routine numroc.
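To make LOCr() and LOCc() concrete, the block-cyclic counting rule can be sketched in plain Fortran. The function local_count below is a hypothetical stand-in written for illustration, not the library routine; its argument list mirrors that of numroc (n, nb, iproc, isrcproc, nprocs), and it assumes zero-based process coordinates.

```fortran
module block_cyclic
  implicit none
contains
  ! Number of rows (or columns) of an n-long dimension, split into blocks
  ! of size nb, that land on process iproc out of nprocs when the first
  ! block is stored on process isrcproc.
  pure integer function local_count(n, nb, iproc, isrcproc, nprocs)
    integer, intent(in) :: n, nb, iproc, isrcproc, nprocs
    integer :: mydist, nblocks, extrablks

    mydist      = mod(nprocs + iproc - isrcproc, nprocs) ! distance from source process
    nblocks     = n / nb                                 ! number of complete blocks
    local_count = (nblocks / nprocs) * nb                ! blocks every process receives
    extrablks   = mod(nblocks, nprocs)                   ! leftover complete blocks

    if (mydist < extrablks) then
      local_count = local_count + nb                     ! one extra complete block
    else if (mydist == extrablks) then
      local_count = local_count + mod(n, nb)             ! the trailing partial block
    end if
  end function local_count
end module block_cyclic

program block_cyclic_demo
  use block_cyclic
  implicit none
  integer :: p
  ! 11 rows, block size 2, 3 process rows, first block on process row 0.
  do p = 0, 2
    print '(a,i0,a,i0,a)', 'process row ', p, ' holds ', &
          local_count(11, 2, p, 0, 3), ' rows'
  end do
end program block_cyclic_demo
```

In this example the three process rows hold 4, 4, and 3 rows, which add up to the 11 global rows.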
After the block-cyclic distribution of global data is done, you may choose to perform an operation on a
submatrix sub(A) of the global matrix A defined by the following 6 values (for dense matrices):
The second and third letters yy indicate the matrix type as:
ge general
gb general band
sy symmetric
he Hermitian
or orthogonal
tr triangular (or quasi-triangular)
tz trapezoidal
un unitary
For computational routines, the last three letters zzz indicate the computation performed and have the same
meaning as for LAPACK routines.
For driver routines, the last two letters zz or three letters zzz have the following meaning:
evd a simple driver for solving an eigenvalue problem using a divide and conquer
algorithm
gvx an expert driver for solving a generalized symmetric definite eigenvalue problem
Simple driver here means that the driver just solves the general problem, whereas an expert driver is more
versatile and can also optionally perform some related computations (for example, refining the
solution and computing error bounds after the linear system is solved).
A diagonally dominant-like matrix is defined as a matrix for which it is known in advance that pivoting is not
required in the LU factorization of this matrix.
For the above matrix types, the library includes routines for performing the following computations: factoring
the matrix; equilibrating the matrix; solving a system of linear equations; estimating the condition number of
a matrix; refining the solution of linear equations and computing its error bounds; inverting the matrix. Note
that for some of the listed matrix types only some of the computational routines are provided (for example,
routines that refine the solution are not provided for band or tridiagonal matrices). See Table Computational
Routines for Systems of Linear Equations for the full list of available routines.
To solve a particular problem, you can either call two or more computational routines or call a corresponding
driver routine that combines several tasks in one call. Thus, to solve a system of linear equations with a
general matrix, you can first call p?getrf (LU factorization) and then p?getrs (computing the solution).
Then, you might wish to call p?gerfs to refine the solution and get the error bounds. Alternatively, you can
just use the driver routine p?gesvx, which performs all these tasks in one call.
Table Computational Routines for Systems of Linear Equations lists the ScaLAPACK computational routines
for factorizing, equilibrating, and inverting matrices, estimating their condition numbers, solving systems of
equations with real matrices, refining the solution, and estimating its error.
Computational Routines for Systems of Linear Equations
Matrix type, storage scheme                         Factorize matrix  Equilibrate matrix  Solve system  Condition number  Estimate error  Invert matrix
general (partial pivoting)                          p?getrf           p?geequ             p?getrs       p?gecon           p?gerfs         p?getri
general band (partial pivoting)                     p?gbtrf                               p?gbtrs
general band (no pivoting)                          p?dbtrf                               p?dbtrs
general tridiagonal (no pivoting)                   p?dttrf                               p?dttrs
symmetric/Hermitian positive-definite               p?potrf           p?poequ             p?potrs       p?pocon           p?porfs         p?potri
symmetric/Hermitian positive-definite, band         p?pbtrf                               p?pbtrs
symmetric/Hermitian positive-definite, tridiagonal  p?pttrf                               p?pttrs
triangular                                                                                p?trtrs       p?trcon           p?trrfs         p?trtri
In this table ? stands for s (single precision real), d (double precision real), c (single precision complex), or z
(double precision complex).
p?getrf
Computes the LU factorization of a general m-by-n
distributed matrix.
Syntax
call psgetrf(m, n, a, ia, ja, desca, ipiv, info)
call pdgetrf(m, n, a, ia, ja, desca, ipiv, info)
call pcgetrf(m, n, a, ia, ja, desca, ipiv, info)
call pzgetrf(m, n, a, ia, ja, desca, ipiv, info)
Include Files
Description
The p?getrf routine forms the LU factorization of a general m-by-n distributed matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1) as
sub(A) = P*L*U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n)
and U is upper triangular (upper trapezoidal if m < n). L and U are stored in sub(A).
NOTE
This routine supports the Progress Routine feature. See mkl_progress for details.
Input Parameters
a (local)
REAL for psgetrf
DOUBLE PRECISION for pdgetrf
COMPLEX for pcgetrf
DOUBLE COMPLEX for pzgetrf.
Pointer into the local memory to an array of local size (lld_a, LOCc(ja+n-1)).
Contains the local pieces of the distributed matrix sub(A) to be factored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
Output Parameters
Contains the pivoting information: local row i was interchanged with global
row ipiv(i). This array is tied to the distributed matrix A.
info < 0: if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and had an
illegal value, then info = -i.
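The encoding of negative info values can be unpacked with a few lines of integer arithmetic. The module and subroutine names below are hypothetical and not part of Intel MKL. The sketch assumes that argument positions and entry indices are both below 100, so that the two cases cannot be confused.

```fortran
module scalapack_info
  implicit none
contains
  ! Splits a negative info value into the argument position iarg and,
  ! for array arguments, the offending entry jentry (0 if not applicable).
  pure subroutine decode_info(info, iarg, jentry)
    integer, intent(in)  :: info
    integer, intent(out) :: iarg, jentry
    integer :: k

    iarg   = 0
    jentry = 0
    if (info >= 0) return          ! not an illegal-argument code

    k = -info
    if (k >= 100) then             ! info = -(i*100+j): entry j of array argument i
      iarg   = k / 100
      jentry = mod(k, 100)
    else                           ! info = -i: scalar argument i
      iarg = k
    end if
  end subroutine decode_info
end module scalapack_info

program decode_info_demo
  use scalapack_info
  implicit none
  integer :: iarg, jentry

  ! For p?getrf(m, n, a, ia, ja, desca, ipiv, info), -602 points at
  ! entry 2 of the 6th argument, desca.
  call decode_info(-602, iarg, jentry)
  print '(a,i0,a,i0)', 'argument ', iarg, ', entry ', jentry

  ! -2 points at the scalar 2nd argument, n.
  call decode_info(-2, iarg, jentry)
  print '(a,i0,a,i0)', 'argument ', iarg, ', entry ', jentry
end program decode_info_demo
```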
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?gbtrf
Computes the LU factorization of a general n-by-n
banded distributed matrix.
Syntax
call psgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
call pdgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
call pcgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
call pzgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
Include Files
Description
The p?gbtrf routine computes the LU factorization of a general n-by-n real/complex banded distributed
matrix A(1:n, ja:ja+n-1) using partial pivoting with row interchanges.
The resulting factorization is not the same factorization as returned from the LAPACK routine ?gbtrf.
Additional permutations are performed on the matrix for the sake of parallelism.
The factorization has the form
A(1:n, ja:ja+n-1) = P*L*U*Q
where P and Q are permutation matrices, and L and U are banded lower and upper triangular matrices,
respectively. The matrix Q represents reordering of columns for the sake of parallelism, while P represents
reordering of rows for numerical stability using classic partial pivoting.
Input Parameters
(0 ≤ bwl ≤ n-1).
(0 ≤ bwu ≤ n-1).
a (local)
REAL for psgbtrf
DOUBLE PRECISION for pdgbtrf
COMPLEX for pcgbtrf
DOUBLE COMPLEX for pzgbtrf.
Pointer into the local memory to an array of local size (lld_a, LOCc(ja+n-1)) where
lld_a ≥ 2*bwl + 2*bwu + 1.
Contains the local pieces of the n-by-n distributed banded matrix A(1:n,
ja:ja+n-1) to be factored.
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501, then dlen_ ≥ 7;
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the work array (lwork ≥ 1). If lwork
is too small, the minimal acceptable size will be returned in work(1) and an
error code will be returned.
Output Parameters
a On exit, this array contains details of the factorization. Note that additional
permutations are performed on the matrix, so that the factors returned are
different from those returned by LAPACK.
Contains pivot indices for local factorizations. Note that you should not alter
the contents of this array between factorization and solve.
af (local)
REAL for psgbtrf
DOUBLE PRECISION for pdgbtrf
COMPLEX for pcgbtrf
DOUBLE COMPLEX for pzgbtrf.
Array of size laf.
Auxiliary fill-in space. The fill-in space is created in a call to the factorization
routine p?gbtrf and is stored in af.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = k ≤ NPROCS, the submatrix stored on processor info and factored
locally was not nonsingular, and the factorization was not completed.
If info = k > NPROCS, the submatrix stored on processor info-NPROCS
representing interactions with other processors was not nonsingular, and
the factorization was not completed.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?dbtrf
Computes the LU factorization of an n-by-n diagonally
dominant-like banded distributed matrix.
Syntax
call psdbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
call pddbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
call pcdbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
call pzdbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
Include Files
Description
The p?dbtrf routine computes the LU factorization of an n-by-n real/complex diagonally dominant-like banded
distributed matrix A(1:n, ja:ja+n-1) without pivoting.
NOTE
A matrix is called diagonally dominant-like if pivoting is not required for LU to be numerically stable.
Note that the resulting factorization is not the same factorization as returned from LAPACK. Additional
permutations are performed on the matrix for the sake of parallelism.
Input Parameters
(0 ≤ bwl ≤ n-1).
(0 ≤ bwu ≤ n-1).
a (local)
REAL for psdbtrf
DOUBLE PRECISION for pddbtrf
COMPLEX for pcdbtrf
DOUBLE COMPLEX for pzdbtrf.
Pointer into the local memory to an array of local size (lld_a, LOCc(ja+n-1)).
Contains the local pieces of the n-by-n distributed banded matrix A(1:n,
ja:ja+n-1) to be factored.
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501, then dlen_ ≥ 7;
Must be laf ≥ NB*(bwl+bwu)+6*(max(bwl,bwu))**2.
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the work array, must be lwork ≥
(max(bwl,bwu))^2. If lwork is too small, the minimal acceptable size will
be returned in work(1) and an error code is returned.
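The lower bounds on laf and lwork given above can be evaluated before allocating af and work. The following is a minimal sketch, not part of the library: the module and function names are hypothetical, and nb stands for the NB in the bound (assumed here to be the block size taken from the array descriptor).

```fortran
module dbtrf_workspace_sketch
  implicit none
contains

  ! Smallest laf allowed by the bound NB*(bwl+bwu) + 6*(max(bwl,bwu))**2.
  ! nb is assumed to be the block size from the descriptor (hypothetical name).
  pure integer function dbtrf_min_laf(nb, bwl, bwu) result(laf)
    integer, intent(in) :: nb, bwl, bwu
    laf = nb*(bwl + bwu) + 6*max(bwl, bwu)**2
  end function dbtrf_min_laf

  ! Smallest lwork allowed by the bound (max(bwl,bwu))**2.
  pure integer function dbtrf_min_lwork(bwl, bwu) result(lwork)
    integer, intent(in) :: bwl, bwu
    lwork = max(bwl, bwu)**2
  end function dbtrf_min_lwork

end module dbtrf_workspace_sketch
```

For example, with nb = 64, bwl = 2, and bwu = 3 the two functions return 374 and 9. If the routine still reports that laf or lwork is too small, the values returned in af(1) and work(1) take precedence over this estimate.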
Output Parameters
a On exit, this array contains details of the factorization. Note that additional
permutations are performed on the matrix, so that the factors returned are
different from those returned by LAPACK.
af (local)
REAL for psdbtrf
DOUBLE PRECISION for pddbtrf
COMPLEX for pcdbtrf
DOUBLE COMPLEX for pzdbtrf.
Array of size laf.
Auxiliary fill-in space. The fill-in space is created in a call to the factorization
routine p?dbtrf and is stored in af.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = k ≤ NPROCS, the submatrix stored on processor info and factored
locally was not diagonally dominant-like, and the factorization was not
completed.
If info = k > NPROCS, the submatrix stored on processor info-NPROCS
representing interactions with other processors was not nonsingular, and
the factorization was not completed.
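The info encoding described above can be unpacked with a few integer operations. The sketch below is illustrative only: all names are hypothetical, and it assumes that scalar argument positions are below 100, so that any magnitude of 100 or more is read as the array form i*100+j.

```fortran
module scalapack_info_sketch
  implicit none
contains

  ! For info < 0: position i of the offending argument.
  pure integer function info_argument(info) result(iarg)
    integer, intent(in) :: info
    if (-info >= 100) then
      iarg = (-info)/100      ! array form: info = -(i*100+j)
    else
      iarg = -info            ! scalar form: info = -i
    end if
  end function info_argument

  ! For info < 0: entry j inside an array argument, or 0 for a scalar.
  pure integer function info_entry(info) result(j)
    integer, intent(in) :: info
    if (-info >= 100) then
      j = mod(-info, 100)
    else
      j = 0
    end if
  end function info_entry

  ! For info > 0: .true. if the locally factored submatrix failed
  ! (info <= NPROCS), .false. if the interaction submatrix failed.
  pure logical function info_is_local_failure(info, nprocs) result(local)
    integer, intent(in) :: info, nprocs
    local = (info <= nprocs)
  end function info_is_local_failure

  ! For info > 0: the processor named in the description above.
  pure integer function info_process(info, nprocs) result(p)
    integer, intent(in) :: info, nprocs
    if (info <= nprocs) then
      p = info
    else
      p = info - nprocs
    end if
  end function info_process

end module scalapack_info_sketch
```

With this reading, info = -602 points at entry 2 of argument 6, and info = 6 on 4 processes points at the interaction submatrix on processor 2.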
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?dttrf
Computes the LU factorization of a diagonally
dominant-like tridiagonal distributed matrix.
Syntax
call psdttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
call pddttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
call pcdttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
call pzdttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
Include Files
Description
The p?dttrf routine computes the LU factorization of an n-by-n real/complex diagonally dominant-like
tridiagonal distributed matrix A(1:n, ja:ja+n-1) without pivoting for stability.
The resulting factorization is not the same factorization as returned from LAPACK. Additional permutations
are performed on the matrix for the sake of parallelism.
The factorization has the form:
A(1:n, ja:ja+n-1) = P*L*U*P^T,
where P is a permutation matrix, and L and U are banded lower and upper triangular matrices, respectively.
Input Parameters
n (global) INTEGER. The number of rows and columns to be operated on, that
is, the order of the distributed submatrix A(1:n, ja:ja+n-1) (n ≥ 0).
dl, d, du (local)
REAL for psdttrf
DOUBLE PRECISION for pddttrf
COMPLEX for pcdttrf
DOUBLE COMPLEX for pzdttrf.
On entry, the array dl contains the local part of the global vector storing
the subdiagonal elements of the matrix. Globally, dl(1) is not referenced,
and dl must be aligned with d.
On entry, the array d contains the local part of the global vector storing the
diagonal elements of the matrix.
On entry, the array du contains the local part of the global vector storing
the super-diagonal elements of the matrix. du(n) is not referenced, and du
must be aligned with d.
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501, then dlen_ ≥ 7;
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the work array; must satisfy
lwork ≥ 8*NPCOL.
Output Parameters
dl, d, du On exit, overwritten by the information containing the factors of the matrix.
af (local)
REAL for psdttrf
DOUBLE PRECISION for pddttrf
COMPLEX for pcdttrf
DOUBLE COMPLEX for pzdttrf.
Array of size laf.
Auxiliary fill-in space. The fill-in space is created in a call to the factorization
routine p?dttrf and is stored in af.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = k ≤ NPROCS, the submatrix stored on processor info and factored
locally was not diagonally dominant-like, and the factorization was not
completed.
If info = k > NPROCS, the submatrix stored on processor info-NPROCS
representing interactions with other processors was not nonsingular, and
the factorization was not completed.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?potrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite distributed matrix.
Syntax
call pspotrf(uplo, n, a, ia, ja, desca, info)
call pdpotrf(uplo, n, a, ia, ja, desca, info)
call pcpotrf(uplo, n, a, ia, ja, desca, info)
call pzpotrf(uplo, n, a, ia, ja, desca, info)
Include Files
Description
The p?potrf routine computes the Cholesky factorization of a real symmetric or complex Hermitian positive-definite
distributed n-by-n matrix A(ia:ia+n-1, ja:ja+n-1), denoted below as sub(A).
Input Parameters
If uplo = 'U', the array a stores the upper triangular part of the matrix
sub(A) that is factored as U^H*U.
If uplo = 'L', the array a stores the lower triangular part of the
matrix sub(A) that is factored as L*L^H.
n (global) INTEGER. The order of the distributed matrix sub(A) (n ≥ 0).
a (local)
REAL for pspotrf
DOUBLE PRECISION for pdpotrf
COMPLEX for pcpotrf
DOUBLE COMPLEX for pzpotrf.
Pointer into the local memory to an array of size (lld_a, LOCc(ja+n-1)).
On entry, this array contains the local pieces of the n-by-n symmetric/
Hermitian distributed matrix sub(A) to be factored.
Depending on uplo, the array a contains either the upper or the lower
triangular part of the matrix sub(A) (see uplo).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
Output Parameters
info < 0: if the i-th argument is an array, and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and had an
illegal value, then info = -i.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?pbtrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite banded distributed
matrix.
Syntax
call pspbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
call pdpbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
call pcpbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
call pzpbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
Include Files
Description
The p?pbtrf routine computes the Cholesky factorization of an n-by-n real symmetric or complex Hermitian
positive-definite banded distributed matrix A(1:n, ja:ja+n-1).
The resulting factorization is not the same factorization as returned from LAPACK. Additional permutations
are performed on the matrix for the sake of parallelism.
The factorization has the form:
A(1:n, ja:ja+n-1) = P*U^H*U*P^T, if uplo='U', or
A(1:n, ja:ja+n-1) = P*L*L^H*P^T, if uplo='L',
where P is a permutation matrix and U and L are banded upper and lower triangular matrices, respectively.
Input Parameters
n (global) INTEGER. The order of the distributed submatrix A(1:n, ja:ja+n-1)
(n ≥ 0).
bw (global) INTEGER.
a (local)
REAL for pspbtrf
DOUBLE PRECISION for pdpbtrf
COMPLEX for pcpbtrf
DOUBLE COMPLEX for pzpbtrf.
On entry, this array contains the local pieces of the upper or lower triangle
of the symmetric/Hermitian band distributed matrix A(1:n, ja:ja+n-1) to
be factored.
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501, then dlen_ ≥ 7;
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the work array, must be lwork ≥ bw^2.
Output Parameters
af (local)
REAL for pspbtrf
DOUBLE PRECISION for pdpbtrf
COMPLEX for pcpbtrf
DOUBLE COMPLEX for pzpbtrf.
Array of size laf. Auxiliary fill-in space. The fill-in space is created in a call
to the factorization routine p?pbtrf and stored in af. Note that if a linear
system is to be solved using p?pbtrs after the factorization routine, af
must not be altered.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = k ≤ NPROCS, the submatrix stored on processor info and factored
locally was not positive definite, and the factorization was not completed.
If info = k > NPROCS, the submatrix stored on processor info-NPROCS
representing interactions with other processors was not nonsingular, and
the factorization was not completed.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?pttrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite tridiagonal distributed
matrix.
Syntax
call pspttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
call pdpttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
call pcpttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
call pzpttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
Include Files
Description
The p?pttrf routine computes the Cholesky factorization of an n-by-n real symmetric or complex Hermitian
positive-definite tridiagonal distributed matrix A(1:n, ja:ja+n-1).
The resulting factorization is not the same factorization as returned from LAPACK. Additional permutations
are performed on the matrix for the sake of parallelism.
The factorization has the form:
A(1:n, ja:ja+n-1) = P*L*D*L^H*P^T, or
A(1:n, ja:ja+n-1) = P*U^H*D*U*P^T,
where P is a permutation matrix, and U and L are tridiagonal upper and lower triangular matrices,
respectively.
Input Parameters
d, e (local)
REAL for pspttrf
DOUBLE PRECISION for pdpttrf
COMPLEX for pcpttrf
DOUBLE COMPLEX for pzpttrf.
Pointers into the local memory to arrays of size nb_a each.
On entry, the array d contains the local part of the global vector storing the
main diagonal of the distributed matrix A.
On entry, the array e contains the local part of the global vector storing the
upper diagonal of the distributed matrix A.
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local ) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501, then dlen_ ≥ 7;
laf (local) INTEGER. The size of the array af. Must be laf ≥ nb_a+2.
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the work array; must satisfy
lwork ≥ 8*NPCOL.
Output Parameters
af (local)
REAL for pspttrf
DOUBLE PRECISION for pdpttrf
COMPLEX for pcpttrf
DOUBLE COMPLEX for pzpttrf.
Array of size laf.
Auxiliary fill-in space. The fill-in space is created in a call to the factorization
routine p?pttrf and stored in af.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = k ≤ NPROCS, the submatrix stored on processor info and factored
locally was not positive definite, and the factorization was not completed.
If info = k > NPROCS, the submatrix stored on processor info-NPROCS
representing interactions with other processors was not nonsingular, and
the factorization was not completed.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?getrs
Solves a system of distributed linear equations with a
general square matrix, using the LU factorization
computed by p?getrf.
Syntax
call psgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pdgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pcgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pzgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
Include Files
Description
The p?getrs routine solves a system of distributed linear equations with a general n-by-n distributed matrix
sub(A) = A(ia:ia+n-1, ja:ja+n-1) using the LU factorization computed by p?getrf.
Before calling this routine, you must call p?getrf to compute the LU factorization of sub(A).
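As a worked illustration for the non-transposed system sub(A)*X = sub(B) only: given the factorization sub(A) = P*L*U computed by p?getrf, and using the fact that a permutation matrix satisfies P^{-1} = P^T, the solve reduces to one row permutation and two triangular solves. The intermediate matrix Y is introduced here for exposition and is not a routine argument.

```latex
\begin{aligned}
\operatorname{sub}(A)\,X = \operatorname{sub}(B),\quad \operatorname{sub}(A) = P L U
  &\;\Longrightarrow\; L\,(U X) = P^{T}\operatorname{sub}(B), \\
L\,Y &= P^{T}\operatorname{sub}(B) \quad \text{(forward substitution, unit-diagonal } L\text{)}, \\
U\,X &= Y \quad \text{(back substitution)}.
\end{aligned}
```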
Input Parameters
n (global) INTEGER. The number of linear equations; the order of the matrix
sub(A) (n ≥ 0).
nrhs (global) INTEGER. The number of right hand sides; the number of columns
of the distributed matrix sub(B) (nrhs ≥ 0).
a, b (local)
REAL for psgetrs
DOUBLE PRECISION for pdgetrs
COMPLEX for pcgetrs
DOUBLE COMPLEX for pzgetrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja
+n-1)) and (lld_b,LOCc(jb+nrhs-1)), respectively.
On entry, the array a contains the local pieces of the factors L and U from
the factorization sub(A) = P*L*U; the unit diagonal elements of L are not
stored. On entry, the array b contains the right hand sides sub(B).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
ipiv (local) INTEGER array of size LOCr(m_a) + mb_a. Contains the pivoting
information: local row i of the matrix was interchanged with the global row
ipiv(i).
This array is tied to the distributed matrix A.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the matrix sub(B),
respectively.
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
Output Parameters
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?gbtrs
Solves a system of distributed linear equations with a
general band matrix, using the LU factorization
computed by p?gbtrf.
Syntax
call psgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af, laf,
work, lwork, info)
call pdgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af, laf,
work, lwork, info)
call pcgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af, laf,
work, lwork, info)
call pzgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af, laf,
work, lwork, info)
Include Files
Description
The p?gbtrs routine solves a system of distributed linear equations with a general band distributed matrix
sub(A) = A(1:n, ja:ja+n-1) using the LU factorization computed by p?gbtrf.
Before calling this routine, you must call p?gbtrf to compute the LU factorization of sub(A).
Input Parameters
nrhs (global) INTEGER. The number of right hand sides; the number of columns
of the distributed matrix sub(B) (nrhs ≥ 0).
a, b (local)
REAL for psgbtrs
DOUBLE PRECISION for pdgbtrs
COMPLEX for pcgbtrs
DOUBLE COMPLEX for pzgbtrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja
+n-1)) and (lld_b,LOCc(nrhs)), respectively.
The array a contains details of the LU factorization of the distributed band
matrix A.
On entry, the array b contains the local pieces of the right hand sides
B(ib:ib+n-1, 1:nrhs).
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix
of A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501, then dlen_ ≥ 7;
ib (global) INTEGER. The row index in the global matrix B indicating the first
row of the matrix to be operated on (which may be either all of B or a
submatrix of B).
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
If dtype_b = 502, then dlen_ ≥ 7;
laf (local) INTEGER. The size of the array af. Must be laf ≥ nb_a*(bwl+bwu)+6*(bwl+bwu)*(bwl+2*bwu).
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the work array; must satisfy
lwork ≥ nrhs*(nb_a+2*bwl+4*bwu).
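The two workspace bounds for p?gbtrs can be computed up front in the same way as for the factorization routines. The sketch below simply evaluates the bounds stated above; the module and function names are hypothetical.

```fortran
module gbtrs_workspace_sketch
  implicit none
contains

  ! Smallest laf allowed by the bound
  ! nb_a*(bwl+bwu) + 6*(bwl+bwu)*(bwl+2*bwu).
  pure integer function gbtrs_min_laf(nb_a, bwl, bwu) result(laf)
    integer, intent(in) :: nb_a, bwl, bwu
    laf = nb_a*(bwl + bwu) + 6*(bwl + bwu)*(bwl + 2*bwu)
  end function gbtrs_min_laf

  ! Smallest lwork allowed by the bound nrhs*(nb_a + 2*bwl + 4*bwu).
  pure integer function gbtrs_min_lwork(nrhs, nb_a, bwl, bwu) result(lwork)
    integer, intent(in) :: nrhs, nb_a, bwl, bwu
    lwork = nrhs*(nb_a + 2*bwl + 4*bwu)
  end function gbtrs_min_lwork

end module gbtrs_workspace_sketch
```

With nb_a = 64, bwl = 2, bwu = 3, and nrhs = 4 these return 560 and 320. Because lwork grows linearly with nrhs, solving for many right-hand sides at once needs a proportionally larger work array.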
Output Parameters
Contains pivot indices for local factorizations. Note that you should not alter
the contents of this array between factorization and solve.
af (local)
REAL for psgbtrs
DOUBLE PRECISION for pdgbtrs
COMPLEX for pcgbtrs
DOUBLE COMPLEX for pzgbtrs.
Array of size laf.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?dbtrs
Solves a system of linear equations with a diagonally
dominant-like banded distributed matrix using the
factorization computed by p?dbtrf.
Syntax
call psdbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pddbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pcdbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pzdbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
Include Files
Description
The p?dbtrs routine solves for X one of the systems of equations:
sub(A)*X = sub(B),
(sub(A))^T*X = sub(B), or
(sub(A))^H*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is a diagonally dominant-like banded distributed matrix, and sub(B)
denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
Input Parameters
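bwl (global) INTEGER. The number of subdiagonals within the band of A (0 ≤ bwl ≤ n-1).
bwu (global) INTEGER. The number of superdiagonals within the band of A (0 ≤ bwu ≤ n-1).
nrhs (global) INTEGER. The number of right hand sides; the number of columns
of the distributed matrix sub(B) (nrhs ≥ 0).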
a, b (local)
REAL for psdbtrs
DOUBLE PRECISION for pddbtrs
COMPLEX for pcdbtrs
DOUBLE COMPLEX for pzdbtrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja
+n-1)) and (lld_b,LOCc(nrhs)), respectively.
On entry, the array a contains details of the LU factorization of the band
matrix A, as computed by p?dbtrf.
On entry, the array b contains the local pieces of the right hand side
distributed matrix sub(B).
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501, then dlen_ ≥ 7;
ib (global) INTEGER. The row index in the global matrix B indicating the first
row of the matrix to be operated on (which may be either all of B or a
submatrix of B).
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
If dtype_b = 502, then dlen_ ≥ 7;
laf (local) INTEGER. The size of the array af. Must be laf ≥ NB*(bwl+bwu)+6*(max(bwl,bwu))^2.
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the array work; must satisfy
lwork ≥ (max(bwl,bwu))^2.
Output Parameters
b On exit, this array contains the local pieces of the solution distributed
matrix X.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?dttrs
Solves a system of linear equations with a diagonally
dominant-like tridiagonal distributed matrix using the
factorization computed by p?dttrf.
Syntax
call psdttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pddttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pcdttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pzdttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
Include Files
Description
The p?dttrs routine solves for X one of the systems of equations:
sub(A)*X = sub(B),
(sub(A))^T*X = sub(B), or
(sub(A))^H*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is a diagonally dominant-like tridiagonal distributed matrix, and sub(B)
denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
Input Parameters
nrhs (global) INTEGER. The number of right hand sides; the number of columns
of the distributed matrix sub(B) (nrhs ≥ 0).
dl, d, du (local)
REAL for psdttrs
DOUBLE PRECISION for pddttrs
COMPLEX for pcdttrs
DOUBLE COMPLEX for pzdttrs.
Pointers to the local arrays of size nb_a each.
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501 or dtype_a = 502, then dlen_ ≥ 7;
ib (global) INTEGER. The row index in the global matrix B indicating the first
row of the matrix to be operated on (which may be either all of B or a
submatrix of B).
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
If dtype_b = 502, then dlen_ ≥ 7;
The array af contains auxiliary fill-in space. The fill-in space is created in a
call to the factorization routine p?dttrf and is stored in af. If a linear
system is to be solved using p?dttrs after the factorization routine, af
must not be altered.
The array work is a workspace array.
Must be laf ≥ NB*(bwl+bwu)+6*(bwl+bwu)*(bwl+2*bwu).
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the array work; must satisfy
lwork ≥ 10*NPCOL+4*nrhs.
Output Parameters
b On exit, this array contains the local pieces of the solution distributed
matrix X.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?potrs
Solves a system of linear equations with a Cholesky-
factored symmetric/Hermitian distributed positive-
definite matrix.
Syntax
call pspotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pdpotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pcpotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pzpotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
Include Files
Description
The p?potrs routine solves for X a system of distributed linear equations in the form:
sub(A)*X = sub(B),
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is an n-by-n real symmetric or complex Hermitian positive
definite distributed matrix, and sub(B) denotes the distributed matrix B(ib:ib+n-1, jb:jb+nrhs-1).
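As a worked illustration, once p?potrf has produced the Cholesky factor (sub(A) = L*L^H or sub(A) = U^H*U), the system splits into two triangular solves. The intermediate matrix Y below is introduced for exposition only and is not a routine argument.

```latex
\begin{aligned}
\text{uplo = 'L':}\quad & L\,Y = \operatorname{sub}(B), \qquad L^{H} X = Y, \\
\text{uplo = 'U':}\quad & U^{H} Y = \operatorname{sub}(B), \qquad U\,X = Y.
\end{aligned}
```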
Input Parameters
nrhs (global) INTEGER. The number of right hand sides; the number of columns
of the distributed matrix sub(B) (nrhs ≥ 0).
a, b (local)
REAL for pspotrs
DOUBLE PRECISION for pdpotrs
COMPLEX for pcpotrs
DOUBLE COMPLEX for pzpotrs.
Pointers into the local memory to arrays of local sizes
(lld_a,LOCc(ja+n-1)) and (lld_b,LOCc(jb+nrhs-1)), respectively.
The array a contains the factors L or U from the Cholesky factorization
sub(A) = L*L^H or sub(A) = U^H*U, as computed by p?potrf.
On entry, the array b contains the local pieces of the right hand sides
sub(B).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the matrix sub(B),
respectively.
descb (local) INTEGER array of size dlen_. The array descriptor for the distributed
matrix B.
Output Parameters
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?pbtrs
Solves a system of linear equations with a Cholesky-
factored symmetric/Hermitian positive-definite band
matrix.
Syntax
call pspbtrs(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
call pdpbtrs(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
call pcpbtrs(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
call pzpbtrs(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
Include Files
Description
The p?pbtrs routine solves for X a system of distributed linear equations in the form:
sub(A)*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is an n-by-n real symmetric or complex Hermitian positive definite
distributed band matrix, and sub(B) denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
Input Parameters
nrhs (global) INTEGER. The number of right hand sides; the number of columns
of the distributed matrix sub(B) (nrhs ≥ 0).
a, b (local)
REAL for pspbtrs
DOUBLE PRECISION for pdpbtrs
COMPLEX for pcpbtrs
DOUBLE COMPLEX for pzpbtrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja
+n-1)) and (lld_b,LOCc(nrhs-1)), respectively.
The array a contains the permuted triangular factor U or L from the
Cholesky factorization sub(A) = P*U^H*U*P^T, or sub(A) = P*L*L^H*P^T of the
band matrix A, as returned by p?pbtrf.
On entry, the array b contains the local pieces of the n-by-nrhs right hand
side distributed matrix sub(B).
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501, then dlen_ ≥ 7;
ib (global) INTEGER. The row index in the global matrix B indicating the first
row of the matrix sub(B).
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
If dtype_b = 502, then dlen_ ≥ 7;
The array af is of size laf. It contains auxiliary fill-in space. The fill-in
space is created in a call to the factorization routine p?pbtrf and is stored
in af.
Must be laf ≥ nrhs*bw.
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the array work, must be at least
lwork ≥ bw^2.
Output Parameters
b On exit, if info=0, this array contains the local pieces of the n-by-nrhs
solution distributed matrix X.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?pttrs
Solves a system of linear equations with a symmetric
(Hermitian) positive-definite tridiagonal distributed
matrix using the factorization computed by p?pttrf.
Syntax
call pspttrs(n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork, info)
call pdpttrs(n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork, info)
call pcpttrs(uplo, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork, info)
call pzpttrs(uplo, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork, info)
Include Files
Description
The p?pttrs routine solves for X a system of distributed linear equations in the form:
sub(A)*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is an n-by-n real symmetric or complex Hermitian positive definite
tridiagonal distributed matrix, and sub(B) denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
Input Parameters
nrhs (global) INTEGER. The number of right hand sides; the number of columns
of the distributed matrix sub(B) (nrhs ≥ 0).
d, e (local)
REAL for pspttrs
DOUBLE PRECISION for pdpttrs
COMPLEX for pcpttrs
DOUBLE COMPLEX for pzpttrs.
Pointers into the local memory to arrays of size nb_a each.
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501 or dtype_a = 502, then dlen_ ≥ 7;
ib (global) INTEGER. The row index in the global matrix B indicating the first
row of the matrix to be operated on (which may be either all of B or a
submatrix of B).
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
If dtype_b = 502, then dlen_ ≥ 7;
Must be laf ≥ nb_a+2.
If laf is not large enough, an error code is returned and the minimum
acceptable size will be returned in af(1).
lwork (local or global) INTEGER. The size of the array work. Must be at least
lwork ≥ (10+2*min(100,nrhs))*NPCOL+4*nrhs.
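The lower bounds on laf and lwork stated for p?pttrs can be evaluated before allocating the arrays. The sketch below only transcribes those two bounds; the module name pttrs_sizes and both function names are hypothetical and are not MKL entry points.

```fortran
module pttrs_sizes
  implicit none
contains
  ! Minimum lwork from the bound (10+2*min(100,nrhs))*NPCOL+4*nrhs.
  integer function pttrs_min_lwork(nrhs, npcol) result(lwork)
    integer, intent(in) :: nrhs   ! number of right-hand sides
    integer, intent(in) :: npcol  ! number of process columns in the grid
    lwork = (10 + 2*min(100, nrhs))*npcol + 4*nrhs
  end function pttrs_min_lwork

  ! Minimum laf from the bound nb_a+2.
  integer function pttrs_min_laf(nb_a) result(laf)
    integer, intent(in) :: nb_a   ! column block size of the distributed matrix A
    laf = nb_a + 2
  end function pttrs_min_laf
end module pttrs_sizes
```

With nrhs = 3 and NPCOL = 4, for instance, the bound evaluates to (10+6)*4+12 = 76 elements of work.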
Output Parameters
b On exit, this array contains the local pieces of the solution distributed
matrix X.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?trtrs
Solves a system of linear equations with a triangular
distributed matrix.
Syntax
call pstrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pdtrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pctrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pztrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
Include Files
Description
The p?trtrs routine solves for X one of the following systems of linear equations:
sub(A)*X = sub(B),
(sub(A))^T*X = sub(B), or
(sub(A))^H*X = sub(B),
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is a triangular distributed matrix of order n, and sub(B) denotes
the distributed matrix B(ib:ib+n-1, jb:jb+nrhs-1).
Input Parameters
If trans = 'N', then sub(A)*X = sub(B) is solved for X.
nrhs (global) INTEGER. The number of right-hand sides, that is, the number of
columns of the distributed matrix sub(B) (nrhs ≥ 0).
a, b (local)
REAL for pstrtrs
DOUBLE PRECISION for pdtrtrs
COMPLEX for pctrtrs
DOUBLE COMPLEX for pztrtrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja+n-1))
and (lld_b,LOCc(jb+nrhs-1)), respectively.
The array a contains the local pieces of the distributed triangular matrix
sub(A).
If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular matrix, and the strictly lower triangular part of sub(A)
is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular matrix, and the strictly upper triangular part of sub(A)
is not referenced.
If diag = 'U', the diagonal elements of sub(A) are also not referenced
and are assumed to be 1.
On entry, the array b contains the local pieces of the right hand side
distributed matrix sub(B).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the matrix sub(B),
respectively.
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
Output Parameters
info > 0:
if info = i, the i-th diagonal element of sub(A) is zero, indicating that the
submatrix is singular and the solutions X have not been computed.
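The meaning of a positive info value can be illustrated with a serial analogue. The sketch below is an assumption-laden illustration on an ordinary (non-distributed) array: the module tri_check and the function first_zero_diagonal are hypothetical names, and the check is only meaningful when diag = 'N', since a unit diagonal is not referenced.

```fortran
module tri_check
  implicit none
contains
  ! Returns the index of the first exactly-zero diagonal element of a,
  ! or 0 when every diagonal element is nonzero. This mirrors the
  ! convention "info = i means the i-th diagonal element is zero".
  integer function first_zero_diagonal(a) result(info)
    double precision, intent(in) :: a(:,:)
    integer :: i
    info = 0
    do i = 1, min(size(a, 1), size(a, 2))
      if (a(i, i) == 0.0d0) then
        info = i
        return
      end if
    end do
  end function first_zero_diagonal
end module tri_check
```

A nonzero result indicates that the triangular matrix is singular and that no solution should be expected from a triangular solve.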
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?gecon
Estimates the reciprocal of the condition number of a
general distributed matrix in either the 1-norm or the
infinity-norm.
Syntax
call psgecon(norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork, liwork,
info)
call pdgecon(norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork, liwork,
info)
call pcgecon(norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork, lrwork,
info)
call pzgecon(norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork, lrwork,
info)
Include Files
Description
The p?gecon routine estimates the reciprocal of the condition number of a general distributed real/complex
matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1) in either the 1-norm or infinity-norm, using the LU factorization
computed by p?getrf.
An estimate is obtained for ||(sub(A))^-1||, and the reciprocal of the condition number is computed as
rcond = 1 / (||sub(A)|| * ||(sub(A))^-1||).
Input Parameters
a (local)
REAL for psgecon
DOUBLE PRECISION for pdgecon
COMPLEX for pcgecon
DOUBLE COMPLEX for pzgecon.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
The array a contains the local pieces of the factors L and U from the
factorization sub(A) = P*L*U; the unit diagonal elements of L are not
stored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
anorm (global) REAL for single precision flavors, DOUBLE PRECISION for double
precision flavors.
If norm = '1' or 'O', the 1-norm of the original distributed matrix sub(A);
work (local)
REAL for psgecon
DOUBLE PRECISION for pdgecon
COMPLEX for pcgecon
DOUBLE COMPLEX for pzgecon.
The array work of size lwork is a workspace array.
NOTE
iceil(x,y) is the ceiling of x/y, and mod(x,y) is the integer
remainder of x/y.
iwork (local) INTEGER. Workspace array of size liwork. Used in real flavors only.
liwork (local or global) INTEGER. The size of the array iwork; used in real flavors
only. Must be at least
liwork ≥ LOCr(n+mod(ia-1,mb_a)).
lrwork (local or global) INTEGER. The size of the array rwork; used in complex
flavors only. Must be at least
lrwork ≥ max(1, 2*LOCc(n+mod(ja-1,nb_a))).
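The bounds on liwork and lrwork depend on LOCr and LOCc, the numbers of rows and columns stored locally under the block-cyclic distribution. The sketch below re-implements that count in the way the ScaLAPACK tool function NUMROC computes it, so that the bounds can be evaluated without linking the library. The names gecon_sizes, local_count, gecon_min_liwork, and gecon_min_lrwork are hypothetical. The argument isrc (the process coordinate that owns the first row or column of sub(A)) is passed in directly here as an assumption; it is not derived from the array descriptor.

```fortran
module gecon_sizes
  implicit none
contains
  ! Number of indices out of n, dealt out in blocks of nb over nprocs
  ! processes starting at process isrc, that are stored on process iproc.
  integer function local_count(n, nb, iproc, isrc, nprocs) result(cnt)
    integer, intent(in) :: n, nb, iproc, isrc, nprocs
    integer :: mydist, nblocks, extra
    mydist  = mod(nprocs + iproc - isrc, nprocs)  ! distance from source process
    nblocks = n / nb                              ! number of complete blocks
    cnt     = (nblocks / nprocs) * nb             ! blocks every process receives
    extra   = mod(nblocks, nprocs)                ! leftover complete blocks
    if (mydist < extra) then
      cnt = cnt + nb
    else if (mydist == extra) then
      cnt = cnt + mod(n, nb)                      ! trailing partial block
    end if
  end function local_count

  ! liwork >= LOCr(n+mod(ia-1,mb_a)), real flavors.
  integer function gecon_min_liwork(n, ia, mb_a, myrow, isrc, nprow) result(liwork)
    integer, intent(in) :: n, ia, mb_a, myrow, isrc, nprow
    liwork = local_count(n + mod(ia - 1, mb_a), mb_a, myrow, isrc, nprow)
  end function gecon_min_liwork

  ! lrwork >= max(1, 2*LOCc(n+mod(ja-1,nb_a))), complex flavors.
  integer function gecon_min_lrwork(n, ja, nb_a, mycol, isrc, npcol) result(lrwork)
    integer, intent(in) :: n, ja, nb_a, mycol, isrc, npcol
    lrwork = max(1, 2*local_count(n + mod(ja - 1, nb_a), nb_a, mycol, isrc, npcol))
  end function gecon_min_lrwork
end module gecon_sizes
```

In a real program the local counts would normally come from NUMROC itself; the copy above only makes the arithmetic behind the bounds visible.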
Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
iwork(1) On exit, iwork(1) contains the minimum value of liwork required for
optimum performance (for real flavors).
rwork(1) On exit, rwork(1) contains the minimum value of lrwork required for
optimum performance (for complex flavors).
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?pocon
Estimates the reciprocal of the condition number (in
the 1-norm) of a symmetric/Hermitian positive-definite
distributed matrix.
Syntax
call pspocon(uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork, liwork,
info)
call pdpocon(uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork, liwork,
info)
call pcpocon(uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork, lrwork,
info)
call pzpocon(uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork, lrwork,
info)
Include Files
Description
The p?pocon routine estimates the reciprocal of the condition number (in the 1-norm) of a real symmetric
or complex Hermitian positive definite distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1), using the
Cholesky factorization sub(A) = U^H*U or sub(A) = L*L^H computed by p?potrf.
An estimate is obtained for ||(sub(A))^-1||, and the reciprocal of the condition number is computed as
rcond = 1 / (||sub(A)|| * ||(sub(A))^-1||).
Input Parameters
a (local)
REAL for pspocon
DOUBLE PRECISION for pdpocon
COMPLEX for pcpocon
DOUBLE COMPLEX for pzpocon.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
The array a contains the local pieces of the factors L or U from the Cholesky
factorization sub(A) = U^H*U, or sub(A) = L*L^H, as computed by p?potrf.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
REAL for pspocon
DOUBLE PRECISION for pdpocon
COMPLEX for pcpocon
DOUBLE COMPLEX for pzpocon.
The array work of size lwork is a workspace array.
NOTE
iceil(x,y) is the ceiling of x/y, and mod(x,y) is the integer
remainder of x/y.
iwork (local) INTEGER. Workspace array of size liwork. Used in real flavors only.
liwork (local or global) INTEGER. The size of the array iwork; used in real flavors
only. Must be at least liwork ≥ LOCr(n+mod(ia-1,mb_a)).
lrwork (local or global) INTEGER. The size of the array rwork; used in complex
flavors only. Must be at least lrwork ≥ 2*LOCc(n+mod(ja-1,nb_a)).
Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
iwork(1) On exit, iwork(1) contains the minimum value of liwork required for
optimum performance (for real flavors).
rwork(1) On exit, rwork(1) contains the minimum value of lrwork required for
optimum performance (for complex flavors).
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?trcon
Estimates the reciprocal of the condition number of a
triangular distributed matrix in either 1-norm or
infinity-norm.
Syntax
call pstrcon(norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork, iwork, liwork,
info)
call pdtrcon(norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork, iwork, liwork,
info)
call pctrcon(norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork, rwork, lrwork,
info)
call pztrcon(norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork, rwork, lrwork,
info)
Include Files
Description
The p?trcon routine estimates the reciprocal of the condition number of a triangular distributed matrix
sub(A) = A(ia:ia+n-1, ja:ja+n-1), in either the 1-norm or the infinity-norm.
The norm of sub(A) is computed and an estimate is obtained for ||(sub(A))^-1||, then the reciprocal of the
condition number is computed as
rcond = 1 / (||sub(A)|| * ||(sub(A))^-1||).
Input Parameters
a (local)
REAL for pstrcon
DOUBLE PRECISION for pdtrcon
COMPLEX for pctrcon
DOUBLE COMPLEX for pztrcon.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
The array a contains the local pieces of the triangular distributed matrix
sub(A).
If uplo = 'U', the leading n-by-n upper triangular part of this distributed
matrix contains the upper triangular matrix, and its strictly lower triangular
part is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of this distributed
matrix contains the lower triangular matrix, and its strictly upper triangular
part is not referenced.
If diag = 'U', the diagonal elements of sub(A) are also not referenced
and are assumed to be 1.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
REAL for pstrcon
DOUBLE PRECISION for pdtrcon
COMPLEX for pctrcon
DOUBLE COMPLEX for pztrcon.
The array work of size lwork is a workspace array.
NOTE
iceil(x,y) is the ceiling of x/y, and mod(x,y) is the integer
remainder of x/y.
iwork (local) INTEGER. Workspace array of size liwork. Used in real flavors only.
liwork (local or global) INTEGER. The size of the array iwork; used in real flavors
only. Must be at least
liwork ≥ LOCr(n+mod(ia-1,mb_a)).
lrwork (local or global) INTEGER. The size of the array rwork; used in complex
flavors only. Must be at least
lrwork ≥ LOCc(n+mod(ja-1,nb_a)).
Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
iwork(1) On exit, iwork(1) contains the minimum value of liwork required for
optimum performance (for real flavors).
rwork(1) On exit, rwork(1) contains the minimum value of lrwork required for
optimum performance (for complex flavors).
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
Refining the Solution and Estimating Its Error: ScaLAPACK Computational Routines
This section describes the ScaLAPACK routines for refining the computed solution of a system of linear
equations and estimating the solution error. You can call these routines after factorizing the matrix of the
system of equations and computing the solution (see Routines for Matrix Factorization and Solving Systems
of Linear Equations).
p?gerfs
Improves the computed solution to a system of linear
equations and provides error bounds and backward
error estimates for the solution.
Syntax
call psgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv, b, ib, jb,
descb, x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pdgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv, b, ib, jb,
descb, x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pcgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv, b, ib, jb,
descb, x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
call pzgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv, b, ib, jb,
descb, x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
Include Files
Description
The p?gerfs routine improves the computed solution to one of the systems of linear equations
sub(A)*sub(X) = sub(B),
sub(A)^T*sub(X) = sub(B), or
sub(A)^H*sub(X) = sub(B),
and provides error bounds and backward error estimates for the solution.
Here sub(A) = A(ia:ia+n-1, ja:ja+n-1), sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1), and
sub(X) = X(ix:ix+n-1, jx:jx+nrhs-1).
Input Parameters
nrhs (global) INTEGER. The number of right-hand sides, i.e., the number of
columns of the matrices sub(B) and sub(X) (nrhs ≥ 0).
a, af, b, x (local)
REAL for psgerfs
DOUBLE PRECISION for pdgerfs
COMPLEX for pcgerfs
DOUBLE COMPLEX for pzgerfs.
Pointers into the local memory to arrays of local sizes
a(lld_a,LOCc(ja+n-1)), af(lld_af,LOCc(jaf+n-1)), b(lld_b,LOCc(jb+nrhs-1)),
and x(lld_x,LOCc(jx+nrhs-1)), respectively.
The array a contains the local pieces of the distributed matrix sub(A).
The array af contains the local pieces of the distributed factors of the
matrix sub(A) = P*L*U as computed by p?getrf.
The array b contains the local pieces of the distributed matrix of right hand
sides sub(B).
On entry, the array x contains the local pieces of the distributed solution
matrix sub(X).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
iaf, jaf (global) INTEGER. The row and column indices in the global matrix AF
indicating the first row and the first column of the matrix sub(AF),
respectively.
descaf (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix AF.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the matrix sub(B),
respectively.
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
ix, jx (global) INTEGER. The row and column indices in the global matrix X
indicating the first row and the first column of the matrix sub(X),
respectively.
descx (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix X.
work (local)
REAL for psgerfs
DOUBLE PRECISION for pdgerfs
COMPLEX for pcgerfs
DOUBLE COMPLEX for pzgerfs.
The array work of size lwork is a workspace array.
NOTE
mod(x,y) is the integer remainder of x/y.
iwork (local) INTEGER. Workspace array, size liwork. Used in real flavors only.
liwork (local or global) INTEGER. The size of the array iwork; used in real flavors
only. Must be at least
liwork ≥ LOCr(n+mod(ib-1,mb_b)).
lrwork (local or global) INTEGER. The size of the array rwork; used in complex
flavors only. Must be at least lrwork ≥ LOCr(n+mod(ib-1,mb_b)).
Output Parameters
The array ferr contains the estimated forward error bound for each
solution vector of sub(X).
If XTRUE is the true solution corresponding to sub(X), ferr is an estimated
upper bound for the magnitude of the largest element in (sub(X) - XTRUE)
divided by the magnitude of the largest element in sub(X). The estimate is
as reliable as the estimate for rcond, and is almost always a slight
overestimate of the true error.
This array is tied to the distributed matrix X.
The array berr contains the component-wise relative backward error of
each solution vector (that is, the smallest relative change in any entry of
sub(A) or sub(B) that makes sub(X) an exact solution). This array is tied to
the distributed matrix X.
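The quantity that ferr bounds can be written out for a single solution vector when a reference solution is known. The sketch below is a serial illustration only: it evaluates the definition directly and is not the estimator used by the routine. The names forward_error and relative_forward_error are hypothetical, and the sketch assumes that x has at least one nonzero element.

```fortran
module forward_error
  implicit none
contains
  ! Relative forward error of one solution vector, following the definition
  ! in the ferr description: max |x - xtrue| divided by max |x|.
  double precision function relative_forward_error(x, xtrue) result(err)
    double precision, intent(in) :: x(:)      ! computed solution
    double precision, intent(in) :: xtrue(:)  ! reference ("true") solution
    err = maxval(abs(x - xtrue)) / maxval(abs(x))
  end function relative_forward_error
end module forward_error
```

A value returned in ferr by the routine is expected to be at least this large in most cases, since the text describes ferr as an estimated upper bound.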
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
iwork(1) On exit, iwork(1) contains the minimum value of liwork required for
optimum performance (for real flavors).
rwork(1) On exit, rwork(1) contains the minimum value of lrwork required for
optimum performance (for complex flavors).
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?porfs
Improves the computed solution to a system of linear
equations with symmetric/Hermitian positive definite
distributed matrix and provides error bounds and
backward error estimates for the solution.
Syntax
call psporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pdporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pcporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
call pzporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
Include Files
Description
The p?porfs routine improves the computed solution to the system of linear equations
sub(A)*sub(X) = sub(B),
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is a real symmetric or complex Hermitian positive definite
distributed matrix and
sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1),
sub(X) = X(ix:ix+n-1, jx:jx+nrhs-1)
are right-hand side and solution submatrices, respectively. This routine also provides error bounds and
backward error estimates for the solution.
Input Parameters
nrhs (global) INTEGER. The number of right-hand sides, i.e., the number of
columns of the matrices sub(B) and sub(X) (nrhs ≥ 0).
a, af, b, x (local)
REAL for psporfs
DOUBLE PRECISION for pdporfs
COMPLEX for pcporfs
DOUBLE COMPLEX for pzporfs.
Pointers into the local memory to arrays of local sizes
a(lld_a, LOCc(ja+n-1)), af(lld_af,LOCc(jaf+n-1)),
b(lld_b,LOCc(jb+nrhs-1)), and x(lld_x,LOCc(jx+nrhs-1)),
respectively.
The array a contains the local pieces of the n-by-n symmetric/Hermitian
distributed matrix sub(A).
If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular part of the matrix, and its strictly lower triangular part
is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular part of the distributed matrix, and its strictly upper
triangular part is not referenced.
The array af contains the factors L or U from the Cholesky factorization
sub(A) = L*L^H or sub(A) = U^H*U, as computed by p?potrf.
On entry, the array b contains the local pieces of the distributed matrix of
right hand sides sub(B).
On entry, the array x contains the local pieces of the solution vectors
sub(X).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
iaf, jaf (global) INTEGER. The row and column indices in the global matrix AF
indicating the first row and the first column of the matrix sub(AF),
respectively.
descaf (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix AF.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the matrix sub(B),
respectively.
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
ix, jx (global) INTEGER. The row and column indices in the global matrix X
indicating the first row and the first column of the matrix sub(X),
respectively.
descx (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix X.
work (local)
REAL for psporfs
DOUBLE PRECISION for pdporfs
COMPLEX for pcporfs
NOTE
mod(x,y) is the integer remainder of x/y.
iwork (local) INTEGER. Workspace array of size liwork. Used in real flavors only.
liwork (local or global) INTEGER. The size of the array iwork; used in real flavors
only. Must be at least
liwork ≥ LOCr(n+mod(ib-1,mb_b)).
lrwork (local or global) INTEGER. The size of the array rwork; used in complex
flavors only. Must be at least lrwork ≥ LOCr(n+mod(ib-1,mb_b)).
Output Parameters
The array ferr contains the estimated forward error bound for each
solution vector of sub(X).
If XTRUE is the true solution corresponding to sub(X), ferr is an estimated
upper bound for the magnitude of the largest element in (sub(X) - XTRUE)
divided by the magnitude of the largest element in sub(X). The estimate is
as reliable as the estimate for rcond, and is almost always a slight
overestimate of the true error.
This array is tied to the distributed matrix X.
The array berr contains the component-wise relative backward error of
each solution vector (that is, the smallest relative change in any entry of
sub(A) or sub(B) that makes sub(X) an exact solution). This array is tied to
the distributed matrix X.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
iwork(1) On exit, iwork(1) contains the minimum value of liwork required for
optimum performance (for real flavors).
rwork(1) On exit, rwork(1) contains the minimum value of lrwork required for
optimum performance (for complex flavors).
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?trrfs
Provides error bounds and backward error estimates
for the solution to a system of linear equations with a
distributed triangular coefficient matrix.
Syntax
call pstrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, x, ix,
jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pdtrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, x, ix,
jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pctrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, x, ix,
jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
call pztrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, x, ix,
jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
Include Files
Description
The p?trrfs routine provides error bounds and backward error estimates for the solution to one of the
systems of linear equations
sub(A)*sub(X) = sub(B),
sub(A)^T*sub(X) = sub(B), or
sub(A)^H*sub(X) = sub(B),
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is a triangular matrix, sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1), and
sub(X) = X(ix:ix+n-1, jx:jx+nrhs-1).
The solution matrix X must be computed by p?trtrs or some other means before entering this routine. The
routine p?trrfs does not do iterative refinement because doing so cannot improve the backward error.
Input Parameters
nrhs (global) INTEGER. The number of right-hand sides, that is, the number of
columns of the matrices sub(B) and sub(X) (nrhs ≥ 0).
a, b, x (local)
REAL for pstrrfs
DOUBLE PRECISION for pdtrrfs
COMPLEX for pctrrfs
DOUBLE COMPLEX for pztrrfs.
Pointers into the local memory to arrays of local sizes
a(lld_a,LOCc(ja+n-1)), b(lld_b,LOCc(jb+nrhs-1)), and x(lld_x,LOCc(jx+nrhs-1)),
respectively.
The array a contains the local pieces of the original triangular distributed
matrix sub(A).
If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular part of the matrix, and its strictly lower triangular part
is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular part of the distributed matrix, and its strictly upper
triangular part is not referenced.
If diag = 'U', the diagonal elements of sub(A) are also not referenced
and are assumed to be 1.
On entry, the array b contains the local pieces of the distributed matrix of
right hand sides sub(B).
On entry, the array x contains the local pieces of the solution vectors
sub(X).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the matrix sub(B),
respectively.
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
ix, jx (global) INTEGER. The row and column indices in the global matrix X
indicating the first row and the first column of the matrix sub(X),
respectively.
descx (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix X.
work (local)
REAL for pstrrfs
DOUBLE PRECISION for pdtrrfs
COMPLEX for pctrrfs
DOUBLE COMPLEX for pztrrfs.
The array work of size lwork is a workspace array.
NOTE
mod(x,y) is the integer remainder of x/y.
iwork (local) INTEGER. Workspace array of size liwork. Used in real flavors only.
liwork (local or global) INTEGER. The size of the array iwork; used in real flavors
only. Must be at least
liwork ≥ LOCr(n+mod(ib-1,mb_b)).
lrwork (local or global) INTEGER. The size of the array rwork; used in complex
flavors only. Must be at least lrwork ≥ LOCr(n+mod(ib-1,mb_b)).
Output Parameters
The array ferr contains the estimated forward error bound for each
solution vector of sub(X).
If XTRUE is the true solution corresponding to sub(X), ferr is an estimated
upper bound for the magnitude of the largest element in (sub(X) - XTRUE)
divided by the magnitude of the largest element in sub(X). The estimate is
as reliable as the estimate for rcond, and is almost always a slight
overestimate of the true error.
This array is tied to the distributed matrix X.
The array berr contains the component-wise relative backward error of
each solution vector (that is, the smallest relative change in any entry of
sub(A) or sub(B) that makes sub(X) an exact solution). This array is tied to
the distributed matrix X.
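A component-wise relative backward error of this kind is commonly evaluated as the largest ratio |r(i)| / (|A|*|x| + |b|)(i), where r = b - A*x. The sketch below computes that ratio for a serial dense system as an illustration. It assumes this standard definition, omits the safeguards a production routine applies to tiny denominators, and uses the hypothetical names backward_error and componentwise_berr.

```fortran
module backward_error
  implicit none
contains
  ! Component-wise relative backward error of x as a solution of a*x = b:
  ! max over i of |r(i)| / (|a|*|x| + |b|)(i), with r = b - a*x.
  ! Rows whose denominator is zero are skipped.
  double precision function componentwise_berr(a, x, b) result(berr)
    double precision, intent(in) :: a(:,:), x(:), b(:)
    double precision :: r(size(b)), denom(size(b))
    integer :: i
    r     = b - matmul(a, x)
    denom = matmul(abs(a), abs(x)) + abs(b)
    berr  = 0.0d0
    do i = 1, size(b)
      if (denom(i) > 0.0d0) berr = max(berr, abs(r(i)) / denom(i))
    end do
  end function componentwise_berr
end module backward_error
```

A result near machine precision indicates that x solves a nearby system exactly, which is the property the berr output reports for each solution vector.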
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
iwork(1) On exit, iwork(1) contains the minimum value of liwork required for
optimum performance (for real flavors).
rwork(1) On exit, rwork(1) contains the minimum value of lrwork required for
optimum performance (for complex flavors).
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?getri
Computes the inverse of an LU-factored distributed
matrix.
Syntax
call psgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
call pdgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
call pcgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
call pzgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
Include Files
Description
The p?getri routine computes the inverse of a general distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1)
using the LU factorization computed by p?getrf. This method inverts U and then computes the
inverse of sub(A) by solving the system
inv(sub(A))*L = inv(U)
for inv(sub(A)).
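The system above follows from the factorization once the permutation is written out. The derivation below is a sketch that assumes the factorization has the form sub(A) = P*L*U, as stated for p?getrf elsewhere in this chapter; how the routine applies P internally is not described in this section.

```latex
\begin{aligned}
\operatorname{sub}(A) &= P\,L\,U \\
\operatorname{sub}(A)^{-1} &= U^{-1} L^{-1} P^{T} \\
\bigl(\operatorname{sub}(A)^{-1} P\bigr)\,L &= U^{-1}
\end{aligned}
```

Under this assumption the unknown of the triangular system is sub(A)^-1 * P, so the inverse itself is recovered by undoing the permutation on the columns of that solution.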
Input Parameters
n (global) INTEGER. The number of rows and columns to be operated on, that
is, the order of the distributed matrix sub(A) (n ≥ 0).
a (local)
REAL for psgetri
DOUBLE PRECISION for pdgetri
COMPLEX for pcgetri
DOUBLE COMPLEX for pzgetri.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
On entry, the array a contains the local pieces of the L and U obtained by
the factorization sub(A) = P*L*U computed by p?getrf.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
REAL for psgetri
DOUBLE PRECISION for pdgetri
COMPLEX for pcgetri
DOUBLE COMPLEX for pzgetri.
lwork (local) INTEGER. The size of the array work. lwork must be at least
lwork ≥ LOCr(n+mod(ia-1,mb_a))*nb_a.
NOTE
mod(x,y) is the integer remainder of x/y.
The array work is used to keep at most an entire column block of sub(A).
iwork (local) INTEGER. Workspace array used for physically transposing the
pivots, size liwork.
where lcm is the least common multiple of process rows and columns
(NPROW and NPCOL).
Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
iwork(1) On exit, iwork(1) contains the minimum value of liwork required for
optimum performance.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = i, U(i,i) is exactly zero. The factorization has been completed, but
the factor U is exactly singular, and division by zero will occur if it is used to
solve a system of equations.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?potri
Computes the inverse of a symmetric/Hermitian
positive definite distributed matrix.
Syntax
call pspotri(uplo, n, a, ia, ja, desca, info)
call pdpotri(uplo, n, a, ia, ja, desca, info)
call pcpotri(uplo, n, a, ia, ja, desca, info)
call pzpotri(uplo, n, a, ia, ja, desca, info)
Include Files
Description
The p?potri routine computes the inverse of a real symmetric or complex Hermitian positive definite
distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1) using the Cholesky factorization sub(A) = U^H*U or
sub(A) = L*L^H computed by p?potrf.
Input Parameters
n (global) INTEGER. The number of rows and columns to be operated on, that
is, the order of the distributed matrix sub(A) (n ≥ 0).
a (local)
REAL for pspotri
DOUBLE PRECISION for pdpotri
COMPLEX for pcpotri
DOUBLE COMPLEX for pzpotri.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
On entry, the array a contains the local pieces of the triangular factor U or L
from the Cholesky factorization sub(A) = U^H*U, or sub(A) = L*L^H, as
computed by p?potrf.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
Output Parameters
a On exit, overwritten by the local pieces of the upper or lower triangle of the
(symmetric/Hermitian) inverse of sub(A).
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = i, the element (i, i) of the factor U or L is zero, and the inverse
could not be computed.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?trtri
Computes the inverse of a triangular distributed
matrix.
Syntax
call pstrtri(uplo, diag, n, a, ia, ja, desca, info)
call pdtrtri(uplo, diag, n, a, ia, ja, desca, info)
call pctrtri(uplo, diag, n, a, ia, ja, desca, info)
call pztrtri(uplo, diag, n, a, ia, ja, desca, info)
Include Files
Description
The p?trtri routine computes the inverse of a real or complex upper or lower triangular distributed matrix
sub(A) = A(ia:ia+n-1, ja:ja+n-1).
Input Parameters
diag CHARACTER*1. Must be 'N' or 'U'.
Specifies whether or not the distributed matrix sub(A) is unit triangular.
If diag = 'N', then sub(A) is non-unit triangular.
If diag = 'U', then sub(A) is unit triangular.
n (global) INTEGER. The number of rows and columns to be operated on, that
is, the order of the distributed matrix sub(A) (n ≥ 0).
a (local)
REAL for pstrtri
DOUBLE PRECISION for pdtrtri
COMPLEX for pctrtri
DOUBLE COMPLEX for pztrtri.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
The array a contains the local pieces of the triangular distributed matrix
sub(A).
If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular matrix to be inverted, and the strictly lower triangular
part of sub(A) is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular matrix, and the strictly upper triangular part of sub(A)
is not referenced.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
Output Parameters
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = k, A(ia+k-1, ja+k-1) is exactly zero. The triangular matrix
sub(A) is singular and its inverse cannot be computed.
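To make the singularity condition concrete, here is a serial, dense sketch of inverting a small upper triangular, non-unit matrix by back substitution. It is not how p?trtri operates on distributed data; the test matrix and all variable names are illustrative assumptions. The sketch reports the first zero diagonal entry in the same spirit as info = k.

```fortran
program trtri_sketch
  implicit none
  integer, parameter :: n = 3
  double precision :: u(n,n), x(n,n), resid(n,n)
  integer :: i, j, k, info

  ! Upper triangular test matrix, stored column by column.
  u = reshape([2d0, 0d0, 0d0,  1d0, 3d0, 0d0,  4d0, 5d0, 6d0], [n, n])

  ! Mirror the info > 0 convention: report the first zero diagonal entry.
  info = 0
  do k = 1, n
    if (u(k,k) == 0d0) then
      info = k
      exit
    end if
  end do

  if (info /= 0) then
    print '(a,i0)', 'singular: zero diagonal entry at k = ', info
  else
    ! Column j of the inverse solves U*x = e_j by back substitution.
    x = 0d0
    do j = 1, n
      x(j,j) = 1d0 / u(j,j)
      do i = j-1, 1, -1
        x(i,j) = -dot_product(u(i,i+1:j), x(i+1:j,j)) / u(i,i)
      end do
    end do
    ! Check the result: U*X should equal the identity up to rounding.
    resid = matmul(u, x)
    do k = 1, n
      resid(k,k) = resid(k,k) - 1d0
    end do
    print *, 'max |U*X - I| =', maxval(abs(resid))
  end if
end program trtri_sketch
```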
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?geequ
Computes row and column scaling factors intended to
equilibrate a general rectangular distributed matrix
and reduce its condition number.
Syntax
call psgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
call pdgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
call pcgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
call pzgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
Include Files
Description
The p?geequ routine computes row and column scalings intended to equilibrate an m-by-n distributed matrix
sub(A) = A(ia:ia+m-1, ja:ja+n-1) and reduce its condition number. The output array r returns the row
scale factors r(i), and the array c returns the column scale factors c(j). These factors are chosen to try to make
the largest element in each row and column of the matrix B with elements b(i,j) = r(i)*a(i,j)*c(j) have absolute value 1.
r(i) and c(j) are restricted to be between SMLNUM = smallest safe number and BIGNUM = largest safe number.
Use of these scaling factors is not guaranteed to reduce the condition number of sub(A) but works well in
practice.
SMLNUM and BIGNUM are parameters representing machine precision. You can use the ?lamch routines to
compute them. For example, compute single precision values of SMLNUM and BIGNUM as follows:
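The listing that originally followed this sentence is not present in this copy. As a hedged stand-in, the sketch below derives comparable single precision values from Fortran intrinsics only. Using tiny(1.0) in place of slamch('S') is an assumption made so that the program runs without linking LAPACK; with the library available, smlnum = slamch('S') would be the natural call.

```fortran
program safe_numbers_sketch
  implicit none
  real :: smlnum, bignum

  ! Assumption: tiny() stands in for slamch('S'), the LAPACK "safe minimum".
  smlnum = tiny(1.0)
  ! BIGNUM is taken as the reciprocal, so 1/SMLNUM does not overflow.
  bignum = 1.0 / smlnum

  print *, 'SMLNUM =', smlnum
  print *, 'BIGNUM =', bignum
end program safe_numbers_sketch
```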
The auxiliary function p?laqge uses scaling factors computed by p?geequ to scale a general rectangular
matrix.
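The scaling idea in the description can be tried on a small dense matrix. This is a serial sketch under stated assumptions: it picks r(i) as the reciprocal of the largest magnitude in row i and then c(j) as the reciprocal of the largest row-scaled magnitude in column j, which is one straightforward way to reach the stated goal. It omits the clamping to SMLNUM and BIGNUM and does not handle zero rows or columns. The matrix values are made up.

```fortran
program equilibrate_sketch
  implicit none
  integer, parameter :: m = 3, n = 2
  real :: a(m,n), r(m), c(n), b(m,n)
  integer :: i, j

  ! Badly scaled sample matrix, stored column by column.
  a = reshape([1.0e3, 2.0, 0.5,  4.0e3, 1.0, 0.25], [m, n])

  ! Row factors: reciprocal of the largest magnitude in each row.
  do i = 1, m
    r(i) = 1.0 / maxval(abs(a(i,:)))
  end do

  ! Column factors: reciprocal of the largest row-scaled magnitude.
  do j = 1, n
    c(j) = 1.0 / maxval(r * abs(a(:,j)))
  end do

  ! Scaled matrix with elements b(i,j) = r(i)*a(i,j)*c(j).
  do j = 1, n
    do i = 1, m
      b(i,j) = r(i) * a(i,j) * c(j)
    end do
  end do

  print *, 'row maxima of |B|   :', maxval(abs(b), dim=2)
  print *, 'column maxima of |B|:', maxval(abs(b), dim=1)
  print *, 'colcnd              :', minval(c) / maxval(c)
end program equilibrate_sketch
```

The last line computes the ratio that the colcnd output parameter reports, so a value of at least 0.1 would indicate that column scaling is not worthwhile.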
Input Parameters
m (global) INTEGER. The number of rows to be operated on, that is, the
number of rows of the distributed matrix sub(A) (m ≥ 0).
n (global) INTEGER. The number of columns to be operated on, that is, the
number of columns of the distributed matrix sub(A) (n ≥ 0).
a (local)
REAL for psgeequ
DOUBLE PRECISION for pdgeequ
COMPLEX for pcgeequ
DOUBLE COMPLEX for pzgeequ.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
The array a contains the local pieces of the m-by-n distributed matrix whose
equilibration factors are to be computed.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
Output Parameters
If info = 0, colcnd contains the ratio of the smallest c(j) to the largest
c(j) (ja ≤ j ≤ ja+n-1).
If colcnd ≥ 0.1, it is not worth scaling by c(ja:ja+n-1).
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
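info > 0:
If info = i and i ≤ m, the i-th row of the distributed matrix sub(A) is exactly zero;
if i > m, the (i-m)-th column of the distributed matrix sub(A) is exactly zero.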
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?poequ
Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive definite
distributed matrix and reduce its condition number.
Syntax
call pspoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
call pdpoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
call pcpoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
call pzpoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
Include Files
Description
The p?poequ routine computes row and column scalings intended to equilibrate a real symmetric or complex
Hermitian positive definite distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1) and reduce its condition
number (with respect to the two-norm). The output arrays sr and sc return the row and column scale
factors
s(i) = 1/sqrt(a(i,i)).
These factors are chosen so that the scaled distributed matrix B with elements b(i,j) = s(i)*a(i,j)*s(j) has ones on
the diagonal.
This choice of sr and sc puts the condition number of B within a factor n of the smallest possible condition
number over all possible diagonal scalings.
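Requiring ones on the diagonal of B, with elements s(i)*a(i,j)*s(j), forces s(i)*a(i,i)*s(i) = 1, that is, s(i) = 1/sqrt(a(i,i)). The serial sketch below checks this on a small symmetric positive definite matrix. The matrix values and variable names are illustrative assumptions, and the nonpositive-diagonal test only mirrors the info > 0 condition in spirit.

```fortran
program poequ_sketch
  implicit none
  integer, parameter :: n = 3
  real :: a(n,n), s(n), b(n,n)
  integer :: i, j

  ! Symmetric positive definite sample matrix with a poorly scaled diagonal.
  a = reshape([4.0, 2.0, 0.02,  2.0, 9.0, 0.03,  0.02, 0.03, 0.01], [n, n])

  ! Only the diagonal of the matrix is needed to form the scale factors.
  do i = 1, n
    if (a(i,i) <= 0.0) then
      print '(a,i0)', 'nonpositive diagonal entry at k = ', i
      stop
    end if
    s(i) = 1.0 / sqrt(a(i,i))
  end do

  ! Scaled matrix with elements b(i,j) = s(i)*a(i,j)*s(j).
  do j = 1, n
    do i = 1, n
      b(i,j) = s(i) * a(i,j) * s(j)
    end do
  end do

  print *, 'scale factors s :', s
  print *, 'diagonal of B   :', (b(i,i), i = 1, n)
end program poequ_sketch
```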
The auxiliary function p?laqsy uses scaling factors computed by p?poequ to scale a symmetric/Hermitian
distributed matrix.
Input Parameters
n (global) INTEGER. The number of rows and columns to be operated on, that
is, the order of the distributed matrix sub(A) (n ≥ 0).
a (local)
REAL for pspoequ
DOUBLE PRECISION for pdpoequ
COMPLEX for pcpoequ
DOUBLE COMPLEX for pzpoequ.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
The array a contains the n-by-n symmetric/Hermitian positive definite
distributed matrix sub(A) whose scaling factors are to be computed. Only
the diagonal elements of sub(A) are referenced.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
Output Parameters
sr, sc (local)
If info = 0, the array sr(ia:ia+n-1) contains the row scale factors for
sub(A). sr is aligned with the distributed matrix A, and replicated across
every process column. sr is tied to the distributed matrix A.
scond (global)
amax (global)
REAL for single precision flavors;
DOUBLE PRECISION for double precision flavors.
Absolute value of the largest matrix element. If amax is very close to
overflow or very close to underflow, the matrix should be scaled.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = k, the k-th diagonal entry of sub(A) is nonpositive.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?geqrf
Computes the QR factorization of a general m-by-n
matrix.
Syntax
call psgeqrf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgeqrf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgeqrf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgeqrf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?geqrf routine forms the QR factorization of a general m-by-n distributed matrix
sub(A) = A(ia:ia+m-1, ja:ja+n-1) as
sub(A) = Q*R.
Input Parameters
a (local)
REAL for psgeqrf
DOUBLE PRECISION for pdgeqrf
COMPLEX for pcgeqrf
DOUBLE COMPLEX for pzgeqrf.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
Contains the local pieces of the distributed matrix sub(A) to be factored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A(ia:ia+m-1,
ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
REAL for psgeqrf
DOUBLE PRECISION for pdgeqrf
COMPLEX for pcgeqrf
DOUBLE COMPLEX for pzgeqrf.
Workspace array of size lwork.
Output Parameters
a The elements on and above the diagonal of sub(A) contain the min(m,n)-by-n
upper trapezoidal matrix R (R is upper triangular if m ≥ n); the elements
below the diagonal, with the array tau, represent the orthogonal/unitary
matrix Q as a product of elementary reflectors (see Application Notes
below).
tau (local)
REAL for psgeqrf
DOUBLE PRECISION for pdgeqrf
COMPLEX for pcgeqrf
DOUBLE COMPLEX for pzgeqrf.
Array of size LOCc(ja+min(m,n)-1).
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ja)*H(ja+1)*...*H(ja+k-1),
where k = min(m,n).
Each H(i) has the form H(i) = I - tau*v*v',
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is
stored on exit in A(ia+i:ia+m-1, ja+i-1), and tau in tau(ja+i-1).
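As a small illustration of the storage convention above, the sketch below builds one real elementary reflector from a vector v with v(1:i-1) = 0 and v(i) = 1, assuming the usual LAPACK form H = I - tau*v*v'. The choice tau = 2/(v'*v) is the value that makes a real reflector of this form orthogonal; the entries of v are made up, and nothing here is taken from an actual p?geqrf result.

```fortran
program reflector_sketch
  implicit none
  integer, parameter :: m = 4, i0 = 2
  double precision :: v(m), h(m,m), hth(m,m), tau
  integer :: r, c

  ! v(1:i0-1) = 0, v(i0) = 1, and the tail holds the stored entries.
  v = 0d0
  v(i0) = 1d0
  v(i0+1:m) = [0.5d0, -0.25d0]

  ! For a real reflector, tau = 2/(v'*v) makes H orthogonal.
  tau = 2d0 / dot_product(v, v)

  ! Form H = I - tau*v*v' explicitly (fine for a tiny example).
  do c = 1, m
    do r = 1, m
      h(r,c) = -tau * v(r) * v(c)
    end do
    h(c,c) = h(c,c) + 1d0
  end do

  ! Orthogonality check: H'*H - I should vanish up to rounding.
  hth = matmul(transpose(h), h)
  do c = 1, m
    hth(c,c) = hth(c,c) - 1d0
  end do
  print *, 'tau            =', tau
  print *, 'max |H''*H - I| =', maxval(abs(hth))
end program reflector_sketch
```

In practice the reflectors are never formed explicitly like this; routines such as p?ormqr apply them directly from the stored v and tau.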
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?geqpf
Computes the QR factorization of a general m-by-n
matrix with pivoting.
Syntax
call psgeqpf(m, n, a, ia, ja, desca, ipiv, tau, work, lwork, info)
call pdgeqpf(m, n, a, ia, ja, desca, ipiv, tau, work, lwork, info)
call pcgeqpf(m, n, a, ia, ja, desca, ipiv, tau, work, lwork, rwork, lrwork, info)
call pzgeqpf(m, n, a, ia, ja, desca, ipiv, tau, work, lwork, rwork, lrwork, info)
Include Files
Description
The p?geqpf routine forms the QR factorization with column pivoting of a general m-by-n distributed matrix
sub(A)= A(ia:ia+m-1, ja:ja+n-1) as
sub(A)*P=Q*R.
Input Parameters
a (local)
REAL for psgeqpf
DOUBLE PRECISION for pdgeqpf
COMPLEX for pcgeqpf
DOUBLE COMPLEX for pzgeqpf.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
Contains the local pieces of the distributed matrix sub(A) to be factored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A(ia:ia+m-1,
ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local).
REAL for psgeqpf
DOUBLE PRECISION for pdgeqpf.
You can determine MYROW, MYCOL, NPROW and NPCOL by calling the
blacs_gridinfo subroutine.
If lwork = -1, then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
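The workspace query convention can be sketched without the library. In the program below, demo_routine is a hypothetical stand-in that merely imitates the lwork = -1 behavior described above; its required size 3*n is arbitrary. With a real routine such as pdgeqpf, the same two-call pattern would apply, with the full argument list in place of the shortened one used here.

```fortran
program workspace_query_sketch
  implicit none
  double precision, allocatable :: work(:)
  double precision :: wquery(1)
  integer :: lwork, info

  ! First call: lwork = -1 asks only for the required size in work(1).
  call demo_routine(8, wquery, -1, info)
  lwork = int(wquery(1))

  ! Second call: allocate that much and do the actual computation.
  allocate(work(lwork))
  call demo_routine(8, work, lwork, info)
  print '(a,i0,a,i0)', 'lwork = ', lwork, ', info = ', info
contains
  ! Hypothetical routine that follows the query convention.
  subroutine demo_routine(n, work, lwork, info)
    integer, intent(in) :: n, lwork
    double precision, intent(inout) :: work(*)
    integer, intent(out) :: info
    integer :: needed

    needed = 3 * n            ! arbitrary stand-in for a size formula
    info = 0
    work(1) = dble(needed)    ! report the size on every exit
    if (lwork == -1) return   ! query only, no computation
    if (lwork < needed) then
      info = -3               ! third argument (a scalar) is illegal
      return
    end if
    work(1:needed) = 0d0      ! placeholder for the real computation
    work(1) = dble(needed)
  end subroutine demo_routine
end program workspace_query_sketch
```

Querying first avoids hand-evaluating the lwork formulas, which depend on the process grid and block sizes.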
rwork (local).
REAL for pcgeqpf.
DOUBLE PRECISION for pzgeqpf.
Workspace array of size lrwork (complex flavors only).
lrwork (local or global) INTEGER, size of rwork (complex flavors only). The value
of lrwork must be at least
You can determine MYROW, MYCOL, NPROW and NPCOL by calling the
blacs_gridinfo subroutine.
If lrwork = -1, then lrwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
Output Parameters
If ipiv(i) = k, the local i-th column of sub(A)*P was the global k-th column of
sub(A). ipiv is tied to the distributed matrix A.
tau (local)
REAL for psgeqpf
DOUBLE PRECISION for pdgeqpf
COMPLEX for pcgeqpf
DOUBLE COMPLEX for pzgeqpf.
Array of size LOCc(ja+min(m, n)-1).
Contains the scalar factor tau of elementary reflectors. tau is tied to the
distributed matrix A.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
rwork(1) On exit, rwork(1) contains the minimum value of lrwork required for
optimum performance.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(1)*H(2)*...*H(k)
where k = min(m,n).
Each H(i) has the form H(i) = I - tau*v*v',
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is
stored on exit in A(ia+i:ia+m-1, ja+i-1).
The matrix P is represented in ipiv as follows: if ipiv(j)= i then the j-th column of P is the i-th canonical
unit vector.
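The ipiv encoding of P can be checked on a small serial example. The sketch below builds P column by column from a made-up ipiv, so that column j of P is the ipiv(j)-th unit vector, and confirms that multiplying by P selects the columns A(:, ipiv(j)). The matrix and pivot values are illustrative assumptions, not output of p?geqpf.

```fortran
program pivot_sketch
  implicit none
  integer, parameter :: m = 2, n = 3
  integer :: ipiv(n), j
  real :: a(m,n), p(n,n), ap(m,n)

  ! Sample matrix with columns (1,2), (3,4), (5,6).
  a = reshape([1.0, 2.0,  3.0, 4.0,  5.0, 6.0], [m, n])
  ipiv = [3, 1, 2]

  ! Column j of P is the ipiv(j)-th canonical unit vector.
  p = 0.0
  do j = 1, n
    p(ipiv(j), j) = 1.0
  end do

  ! Column j of A*P is column ipiv(j) of A.
  ap = matmul(a, p)
  print *, 'A*P equals A(:, ipiv):', all(ap == a(:, ipiv))
end program pivot_sketch
```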
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?orgqr
Generates the orthogonal matrix Q of the QR
factorization formed by p?geqrf.
Syntax
call psorgqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?orgqr routine generates the whole or part of m-by-n real distributed matrix Q denoting A(ia:ia+m-1,
ja:ja+n-1) with orthonormal columns, which is defined as the first n columns of a product of k elementary
reflectors of order m
Q= H(1)*H(2)*...*H(k)
as returned by p?geqrf.
Input Parameters
n (global) INTEGER. The number of columns in the matrix sub(Q) (m ≥ n ≥ 0).
a (local)
REAL for psorgqr
DOUBLE PRECISION for pdorgqr
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
The j-th column must contain the vector that defines the
elementary reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqrf in the k
columns of its distributed matrix argument A(ia:*, ja:ja+k-1).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A(ia:ia+m-1,
ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
REAL for psorgqr
DOUBLE PRECISION for pdorgqr
Array of size LOCc(ja+k-1).
work (local)
REAL for psorgqr
DOUBLE PRECISION for pdorgqr
Workspace array of size lwork.
Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for optimum
performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ungqr
Generates the complex unitary matrix Q of the QR
factorization formed by p?geqrf.
Syntax
call pcungqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
This routine generates the whole or part of m-by-n complex distributed matrix Q denoting A(ia:ia+m-1,
ja:ja+n-1) with orthonormal columns, which is defined as the first n columns of a product of k elementary
reflectors of order m
Q = H(1)*H(2)*...*H(k)
as returned by p?geqrf.
Input Parameters
a (local)
COMPLEX for pcungqr
DOUBLE COMPLEX for pzungqr
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
The j-th column must contain the vector that defines the elementary
reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqrf in the k columns of
its distributed matrix argument A(ia:*, ja:ja+k-1).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
COMPLEX for pcungqr
DOUBLE COMPLEX for pzungqr
Array of size LOCc(ja+k-1).
work (local)
COMPLEX for pcungqr
DOUBLE COMPLEX for pzungqr
Workspace array of size lwork.
lwork (local or global) INTEGER, size of work, must be at least
lwork ≥ nb_a*(nqa0 + mpa0 + nb_a), where
iroffa = mod(ia-1, mb_a),
indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ormqr
Multiplies a general matrix by the orthogonal matrix Q
of the QR factorization formed by p?geqrf.
Syntax
call psormqr(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormqr(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
The p?ormqr routine overwrites the general real m-by-n distributed matrix
sub(C) = C(ic:ic+m-1, jc:jc+n-1) with
                side = 'L'       side = 'R'
trans = 'N':    Q*sub(C)         sub(C)*Q
trans = 'T':    Q^T*sub(C)       sub(C)*Q^T
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors
Q = H(1) H(2)... H(k)
Input Parameters
a (local)
REAL for psormqr
DOUBLE PRECISION for pdormqr.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
The j-th column must contain the vector that defines the elementary
reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqrf in the k columns of its
distributed matrix argument A(ia:*, ja:ja+k-1). A(ia:*, ja:ja+k-1) is
modified by the routine but restored on exit.
If side = 'L', lld_a ≥ max(1, LOCr(ia+m-1))
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
REAL for psormqr
DOUBLE PRECISION for pdormqr
Array of size LOCc(ja+k-1).
c (local)
REAL for psormqr
DOUBLE PRECISION for pdormqr
Pointer into the local memory to an array of local size (lld_c,LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) to which Q is applied.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the matrix sub(C),
respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
REAL for psormqr
DOUBLE PRECISION for pdormqr.
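Workspace array of size lwork.
lwork (local or global) INTEGER, size of work, must be at least:
if side = 'L',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0 + mpc0)*nb_a) + nb_a*nb_a
else if side = 'R',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0 + max(npa0 + numroc(numroc(n+icoffc, nb_a, 0, 0, NPCOL), nb_a, 0, 0, lcmq), mpc0))*nb_a) + nb_a*nb_a
end if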
where
lcmq = lcm/NPCOL with lcm = ilcm(NPROW, NPCOL),
iroffa = mod(ia-1, mb_a),
icoffa = mod(ja-1, nb_a),
iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW),
npa0= numroc(n+iroffa, mb_a, MYROW, iarow, NPROW),
iroffc = mod(ic-1, mb_c),
icoffc = mod(jc-1, nb_c),
icrow = indxg2p(ic, mb_c, MYROW, rsrc_c, NPROW),
iccol = indxg2p(jc, nb_c, MYCOL, csrc_c, NPCOL),
mpc0= numroc(m+iroffc, mb_c, MYROW, icrow, NPROW),
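The quantities in this list are plain integer arithmetic on the block-cyclic layout, so they can be evaluated without a running process grid. The sketch below re-implements the two tool functions under the hypothetical names my_indxg2p and my_numroc, following the behavior of the reference ScaLAPACK versions as understood here, and evaluates iarow, icrow, npa0 and mpc0 for each process row of a made-up grid. All sizes and block factors are illustrative assumptions; in a real program, call the library's indxg2p and numroc instead.

```fortran
program local_sizes_sketch
  implicit none
  ! Made-up problem: 2 process rows, 2-row blocks, sources at row 0.
  integer, parameter :: nprow = 2
  integer, parameter :: m = 10, n = 7, ia = 3, ic = 5
  integer, parameter :: mb_a = 2, mb_c = 2, rsrc_a = 0, rsrc_c = 0
  integer :: myrow, iroffa, iroffc, iarow, icrow, npa0, mpc0

  iroffa = mod(ia-1, mb_a)
  iroffc = mod(ic-1, mb_c)
  do myrow = 0, nprow-1
    iarow = my_indxg2p(ia, mb_a, myrow, rsrc_a, nprow)
    icrow = my_indxg2p(ic, mb_c, myrow, rsrc_c, nprow)
    npa0 = my_numroc(n+iroffa, mb_a, myrow, iarow, nprow)
    mpc0 = my_numroc(m+iroffc, mb_c, myrow, icrow, nprow)
    print '(4(a,i0))', 'myrow = ', myrow, ': iarow = ', iarow, &
          ', npa0 = ', npa0, ', mpc0 = ', mpc0
  end do
contains
  ! Process coordinate that owns global index indxglob.
  ! iproc is unused; it is kept to match the tool function's argument list.
  integer function my_indxg2p(indxglob, nb, iproc, isrcproc, nprocs)
    integer, intent(in) :: indxglob, nb, iproc, isrcproc, nprocs
    my_indxg2p = mod(isrcproc + (indxglob-1)/nb, nprocs)
  end function my_indxg2p

  ! Number of rows or columns of a length-nglob dimension owned by iproc.
  integer function my_numroc(nglob, nb, iproc, isrcproc, nprocs)
    integer, intent(in) :: nglob, nb, iproc, isrcproc, nprocs
    integer :: mydist, nblocks, extrablks
    mydist = mod(nprocs + iproc - isrcproc, nprocs)
    nblocks = nglob / nb
    my_numroc = (nblocks / nprocs) * nb
    extrablks = mod(nblocks, nprocs)
    if (mydist < extrablks) then
      my_numroc = my_numroc + nb
    else if (mydist == extrablks) then
      my_numroc = my_numroc + mod(nglob, nb)
    end if
  end function my_numroc
end program local_sizes_sketch
```

Summing the my_numroc results over all process rows recovers the global length, which is a quick sanity check on any chosen grid.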
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?unmqr
Multiplies a complex matrix by the unitary matrix Q of
the QR factorization formed by p?geqrf.
Syntax
call pcunmqr(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmqr(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
This routine overwrites the general complex m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with
                side = 'L'       side = 'R'
trans = 'N':    Q*sub(C)         sub(C)*Q
trans = 'C':    Q^H*sub(C)       sub(C)*Q^H
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors
Q = H(1) H(2)... H(k) as returned by p?geqrf. Q is of order m if side = 'L' and of order n if side ='R'.
Input Parameters
a (local)
COMPLEX for pcunmqr
DOUBLE COMPLEX for pzunmqr.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+k-1)).
The j-th column must contain the vector that defines the elementary
reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqrf in the k columns of its
distributed matrix argument A(ia:*, ja:ja+k-1). A(ia:*, ja:ja+k-1) is
modified by the routine but restored on exit.
If side = 'L', lld_a ≥ max(1, LOCr(ia+m-1))
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
COMPLEX for pcunmqr
DOUBLE COMPLEX for pzunmqr
Array of size LOCc(ja+k-1).
c (local)
COMPLEX for pcunmqr
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the submatrix C, respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
COMPLEX for pcunmqr
DOUBLE COMPLEX for pzunmqr.
Workspace array of size lwork.
If side = 'L',
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?gelqf
Computes the LQ factorization of a general
rectangular matrix.
Syntax
call psgelqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgelqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgelqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgelqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?gelqf routine computes the LQ factorization of a real/complex distributed m-by-n matrix sub(A)=
A(ia:ia+m-1,ja:ja+n-1) = L*Q.
Input Parameters
a (local)
REAL for psgelqf
DOUBLE PRECISION for pdgelqf
COMPLEX for pcgelqf
DOUBLE COMPLEX for pzgelqf
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
ia, ja (global) INTEGER. The row and column indices in the global array A
indicating the first row and the first column of the submatrix
A(ia:ia+m-1,ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
REAL for psgelqf
DOUBLE PRECISION for pdgelqf
COMPLEX for pcgelqf
DOUBLE COMPLEX for pzgelqf
Workspace array of size lwork.
NOTE
mod(x,y) is the integer remainder of x/y.
Output Parameters
a The elements on and below the diagonal of sub(A) contain the m-by-min(m,n)
lower trapezoidal matrix L (L is lower triangular if m ≤ n); the
elements above the diagonal, with the array tau, represent the orthogonal/
unitary matrix Q as a product of elementary reflectors (see Application
Notes below).
tau (local)
REAL for psgelqf
DOUBLE PRECISION for pdgelqf
COMPLEX for pcgelqf
DOUBLE COMPLEX for pzgelqf
Array of size LOCr(ia+min(m, n)-1).
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ia+k-1)*H(ia+k-2)*...*H(ia),
where k = min(m,n)
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?orglq
Generates the real orthogonal matrix Q of the LQ
factorization formed by p?gelqf.
Syntax
call psorglq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorglq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?orglq routine generates the whole or part of m-by-n real distributed matrix Q denoting
A(ia:ia+m-1,ja:ja+n-1) with orthonormal rows, which is defined as the first m rows of a product of k elementary
reflectors of order n
Q = H(k)*...*H(2)*H(1)
as returned by p?gelqf.
Input Parameters
a (local)
REAL for psorglq
DOUBLE PRECISION for pdorglq
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
On entry, the i-th row must contain the vector that defines the
elementary reflector H(i), ia ≤ i ≤ ia+k-1, as returned by p?gelqf in the k
rows of its distributed matrix argument A(ia:ia+k-1, ja:*).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix
A(ia:ia+m-1,ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
REAL for psorglq
DOUBLE PRECISION for pdorglq
Workspace array of size lwork.
NOTE
mod(x,y) is the integer remainder of x/y.
indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
Output Parameters
tau (local)
REAL for psorglq
DOUBLE PRECISION for pdorglq
Array of size LOCr(ia+k-1).
Contains the scalar factors tau(j) of elementary reflectors H(j). tau is tied
to the distributed matrix A.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?unglq
Generates the unitary matrix Q of the LQ factorization
formed by p?gelqf.
Syntax
call pcunglq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzunglq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
This routine generates the whole or part of m-by-n complex distributed matrix Q denoting
A(ia:ia+m-1,ja:ja+n-1) with orthonormal rows, which is defined as the first m rows of a product of k elementary
reflectors of order n
Q = H(k)'*...*H(2)'*H(1)'
as returned by p?gelqf.
Input Parameters
a (local)
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix
A(ia:ia+m-1,ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
COMPLEX for pcunglq
DOUBLE COMPLEX for pzunglq
Array of size LOCr(ia+k-1).
Contains the scalar factors tau(j) of elementary reflectors H(j). tau is tied
to the distributed matrix A.
work (local)
COMPLEX for pcunglq
DOUBLE COMPLEX for pzunglq
Workspace array of size lwork.
NOTE
mod(x,y) is the integer remainder of x/y.
Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ormlq
Multiplies a general matrix by the orthogonal matrix Q
of the LQ factorization formed by p?gelqf.
Syntax
call psormlq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormlq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
The p?ormlq routine overwrites the general real m-by-n distributed matrix
sub(C) = C(ic:ic+m-1, jc:jc+n-1) with
                side = 'L'       side = 'R'
trans = 'N':    Q*sub(C)         sub(C)*Q
trans = 'T':    Q^T*sub(C)       sub(C)*Q^T
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors
Q = H(k)...H(2) H(1)
Input Parameters
a (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)),
if side = 'L' and (lld_a,LOCc(ja+n-1)), if side = 'R'. The i-th row
must contain the vector that defines the elementary reflector H(i),
ia ≤ i ≤ ia+k-1, as returned by p?gelqf in the k rows of its distributed matrix
argument A(ia:ia+k-1, ja:*).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq
Array of size LOCc(ja+k-1).
c (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq
Pointer into the local memory to an array of local size (lld_c,LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) to which Q is applied.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the submatrix C, respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq.
Workspace array of size lwork.
lwork (local or global) INTEGER, size of the array work; must be at least:
If side = 'L',
NOTE
mod(x,y) is the integer remainder of x/y.
ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL,
NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
If lwork = -1, then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?unmlq
Multiplies a general matrix by the unitary matrix Q of
the LQ factorization formed by p?gelqf.
Syntax
call pcunmlq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmlq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
This routine overwrites the general complex m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with
                side = 'L'       side = 'R'
trans = 'N':    Q*sub(C)         sub(C)*Q
trans = 'C':    Q^H*sub(C)       sub(C)*Q^H
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors
Q = H(k)' ... H(2)' H(1)'
Input Parameters
m (global) INTEGER. The number of rows in the distributed matrix sub(C)
(m >= 0).
a (local)
COMPLEX for pcunmlq
DOUBLE COMPLEX for pzunmlq.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1))
if side = 'L', and (lld_a,LOCc(ja+n-1)) if side = 'R', where lld_a >=
max(1, LOCr(ia+k-1)). The i-th row must contain the vector that
defines the elementary reflector H(i), ia <= i <= ia+k-1, as returned by p?gelqf
in the k rows of its distributed matrix argument A(ia:ia+k-1, ja:*).
A(ia:ia+k-1, ja:*) is modified by the routine but restored on exit.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
COMPLEX for pcunmlq
DOUBLE COMPLEX for pzunmlq
Array of size LOCc(ia+k-1).
c (local)
COMPLEX for pcunmlq
DOUBLE COMPLEX for pzunmlq.
Pointer into the local memory to an array of local size (lld_c,LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) that the routine overwrites.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the submatrix C, respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
lwork (local or global) INTEGER, size of the array work; must be at least:
If side = 'L',
NOTE
mod(x,y) is the integer remainder of x/y.
ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL,
NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
If lwork = -1, then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
info (global) INTEGER.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?geqlf
Computes the QL factorization of a general matrix.
Syntax
call psgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?geqlf routine forms the QL factorization of a real/complex distributed m-by-n matrix sub(A) = A(ia:ia+m-1, ja:ja+n-1) = Q*L.
Input Parameters
a (local)
REAL for psgeqlf
DOUBLE PRECISION for pdgeqlf
COMPLEX for pcgeqlf
DOUBLE COMPLEX for pzgeqlf
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
Contains the local pieces of the distributed matrix sub(A) to be factored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A(ia:ia+m-1,
ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
NOTE
mod(x,y) is the integer remainder of x/y.
numroc and indxg2p are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
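The local extents written LOCr() and LOCc() throughout these parameter lists are the values that numroc returns. The following is a minimal sketch of that block-cyclic count in plain Fortran, so that it can be compiled without ScaLAPACK. The function numroc_sketch is a hypothetical stand-in written for illustration, and the matrix size, block size, and grid width are made-up values; a real program would call the library's numroc.

```fortran
program local_extent_sketch
  implicit none
  ! Hypothetical example values: 10 global columns, block size 2,
  ! 3 process columns, first block owned by process column 0.
  integer, parameter :: n = 10, nb = 2, npcol = 3, csrc = 0
  integer :: mycol, cnt, total

  total = 0
  do mycol = 0, npcol - 1
    cnt = numroc_sketch(n, nb, mycol, csrc, npcol)
    print '(a,i0,a,i0)', 'process column ', mycol, ': LOCc = ', cnt
    total = total + cnt
  end do
  ! The local counts must add up to the global count.
  if (total /= n) error stop 'local counts do not sum to n'

contains

  ! Number of rows or columns of a block-cyclically distributed
  ! dimension of length n that land on process iproc.
  integer function numroc_sketch(n, nb, iproc, isrcproc, nprocs) result(cnt)
    integer, intent(in) :: n, nb, iproc, isrcproc, nprocs
    integer :: mydist, nblocks, extra
    mydist  = mod(nprocs + iproc - isrcproc, nprocs)  ! distance from source
    nblocks = n / nb                                  ! whole blocks
    cnt     = (nblocks / nprocs) * nb                 ! full rounds of blocks
    extra   = mod(nblocks, nprocs)                    ! leftover whole blocks
    if (mydist < extra) then
      cnt = cnt + nb
    else if (mydist == extra) then
      cnt = cnt + mod(n, nb)                          ! trailing partial block
    end if
  end function numroc_sketch

end program local_extent_sketch
```

The same function gives LOCr() when it is called with the row block size, the process row, and NPROW.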
Output Parameters
tau (local)
REAL for psgeqlf
DOUBLE PRECISION for pdgeqlf
COMPLEX for pcgeqlf
DOUBLE COMPLEX for pzgeqlf
Array of size LOCc(ja+n-1).
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ja+k-1)*...*H(ja+1)*H(ja)
where k = min(m,n).
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?orgql
Generates the orthogonal matrix Q of the QL
factorization formed by p?geqlf.
Syntax
call psorgql(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgql(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?orgql routine generates the whole or part of the m-by-n real distributed matrix Q denoting A(ia:ia+m-1, ja:ja+n-1)
with orthonormal columns, which is defined as the last n columns of a product of k elementary
reflectors of order m
Q = H(k)*...*H(2)*H(1)
as returned by p?geqlf.
Input Parameters
a (local)
REAL for psorgql
DOUBLE PRECISION for pdorgql
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
On entry, the j-th column must contain the vector that defines the
elementary reflector H(j), ja+n-k <= j <= ja+n-1, as returned by p?geqlf in the
k columns of its distributed matrix argument A(ia:*,ja+n-k:ja+n-1).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A(ia:ia+m-1,
ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
REAL for psorgql
DOUBLE PRECISION for pdorgql
Array of size LOCc(ja+n-1).
Contains the scalar factors tau(j) of elementary reflectors H(j). tau is tied
to the distributed matrix A.
work (local)
REAL for psorgql
DOUBLE PRECISION for pdorgql
Workspace array of size lwork.
NOTE
mod(x,y) is the integer remainder of x/y.
indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ungql
Generates the unitary matrix Q of the QL factorization
formed by p?geqlf.
Syntax
call pcungql(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungql(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
This routine generates the whole or part of the m-by-n complex distributed matrix Q denoting A(ia:ia+m-1, ja:ja+n-1)
with orthonormal columns, which is defined as the last n columns of a product of k
elementary reflectors of order m
Input Parameters
a (local)
COMPLEX for pcungql
DOUBLE COMPLEX for pzungql
Pointer into the local memory to an array of local size
(lld_a,LOCc(ja+n-1)). On entry, the j-th column must contain the
vector that defines the elementary reflector H(j), ja+n-k <= j <= ja+n-1,
as returned by p?geqlf in the k columns of its distributed matrix
argument A(ia:*, ja+n-k:ja+n-1).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A(ia:ia+m-1,
ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
COMPLEX for pcungql
DOUBLE COMPLEX for pzungql
Array of size LOCc(ja+n-1).
Contains the scalar factors tau(j) of elementary reflectors H(j). tau is tied
to the distributed matrix A.
work (local)
COMPLEX for pcungql
DOUBLE COMPLEX for pzungql
Workspace array of size lwork.
indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ormql
Multiplies a general matrix by the orthogonal matrix Q
of the QL factorization formed by p?geqlf.
Syntax
call psormql(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormql(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
The p?ormql routine overwrites the general real m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with Q*sub(C), Q^T*sub(C), sub(C)*Q, or sub(C)*Q^T, depending on side and trans,
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors
Input Parameters
a (local)
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
REAL for psormql
DOUBLE PRECISION for pdormql.
Array of size LOCc(ja+n-1).
c (local)
REAL for psormql
DOUBLE PRECISION for pdormql.
Pointer into the local memory to an array of local size (lld_c,LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) that the routine overwrites.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the submatrix C, respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
REAL for psormql
DOUBLE PRECISION for pdormql.
Workspace array of size lwork.
If side = 'L',
lwork >= max((nb_a*(nb_a-1))/2, (nqc0 + max(npa0 +
numroc(numroc(n+icoffc, nb_a, 0, 0, NPCOL), nb_a, 0, 0,
lcmq), mpc0))*nb_a) + nb_a*nb_a
end if
where
lcmq = lcm/NPCOL with lcm = ilcm (NPROW, NPCOL),
iroffa = mod(ia-1, mb_a),
icoffa = mod(ja-1, nb_a),
iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW),
npa0= numroc(n + iroffa, mb_a, MYROW, iarow, NPROW),
iroffc = mod(ic-1, mb_c),
icoffc = mod(jc-1, nb_c),
icrow = indxg2p(ic, mb_c, MYROW, rsrc_c, NPROW),
iccol = indxg2p(jc, nb_c, MYCOL, csrc_c, NPCOL),
mpc0 = numroc(m+iroffc, mb_c, MYROW, icrow, NPROW),
nqc0 = numroc(n+icoffc, nb_c, MYCOL, iccol, NPCOL),
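The lwork expression that survives above can be evaluated by hand once the listed quantities are known. The following is a minimal sketch that does so in plain Fortran for one process. The grid shape, matrix sizes, block sizes, and source coordinates are made-up values, and ilcm_sketch, indxg2p_sketch, and numroc_sketch are hypothetical stand-ins for the ScaLAPACK tool functions ilcm, indxg2p, and numroc, written here only so that the example compiles without the library.

```fortran
program ormql_lwork_sketch
  implicit none
  ! Hypothetical inputs: a 2 x 3 process grid seen from process (0,1).
  integer, parameter :: nprow = 2, npcol = 3, myrow = 0, mycol = 1
  integer, parameter :: m = 100, n = 60, ia = 1, ic = 1, jc = 1
  integer, parameter :: mb_a = 8, nb_a = 8, rsrc_a = 0
  integer, parameter :: mb_c = 8, nb_c = 8, rsrc_c = 0, csrc_c = 0
  integer :: lcm, lcmq, iroffa, iarow, npa0
  integer :: iroffc, icoffc, icrow, iccol, mpc0, nqc0, inner, bound

  ! Quantities from the "where" list, in the same order.
  lcm    = ilcm_sketch(nprow, npcol)
  lcmq   = lcm / npcol
  iroffa = mod(ia-1, mb_a)
  iarow  = indxg2p_sketch(ia, mb_a, myrow, rsrc_a, nprow)
  npa0   = numroc_sketch(n + iroffa, mb_a, myrow, iarow, nprow)
  iroffc = mod(ic-1, mb_c)
  icoffc = mod(jc-1, nb_c)
  icrow  = indxg2p_sketch(ic, mb_c, myrow, rsrc_c, nprow)
  iccol  = indxg2p_sketch(jc, nb_c, mycol, csrc_c, npcol)
  mpc0   = numroc_sketch(m + iroffc, mb_c, myrow, icrow, nprow)
  nqc0   = numroc_sketch(n + icoffc, nb_c, mycol, iccol, npcol)

  ! The printed bound, split in two steps for readability.
  inner = numroc_sketch(numroc_sketch(n + icoffc, nb_a, 0, 0, npcol), &
                        nb_a, 0, 0, lcmq)
  bound = max((nb_a*(nb_a-1))/2, &
              (nqc0 + max(npa0 + inner, mpc0))*nb_a) + nb_a*nb_a
  print '(a,i0)', 'lwork must be at least ', bound

contains

  integer function ilcm_sketch(a, b) result(l)   ! least common multiple
    integer, intent(in) :: a, b
    integer :: x, y, t
    x = a
    y = b
    do while (y /= 0)                            ! Euclid's algorithm
      t = mod(x, y)
      x = y
      y = t
    end do
    l = (a / x) * b
  end function ilcm_sketch

  ! Process coordinate that owns global index indxglob (iproc is unused,
  ! kept only to mirror the five-argument call shown in the text).
  integer function indxg2p_sketch(indxglob, nb, iproc, isrcproc, nprocs) result(p)
    integer, intent(in) :: indxglob, nb, iproc, isrcproc, nprocs
    p = mod(isrcproc + (indxglob - 1)/nb, nprocs)
  end function indxg2p_sketch

  integer function numroc_sketch(n, nb, iproc, isrcproc, nprocs) result(cnt)
    integer, intent(in) :: n, nb, iproc, isrcproc, nprocs
    integer :: mydist, nblocks, extra
    mydist  = mod(nprocs + iproc - isrcproc, nprocs)
    nblocks = n / nb
    cnt     = (nblocks / nprocs) * nb
    extra   = mod(nblocks, nprocs)
    if (mydist < extra) then
      cnt = cnt + nb
    else if (mydist == extra) then
      cnt = cnt + mod(n, nb)
    end if
  end function numroc_sketch

end program ormql_lwork_sketch
```

In practice the workspace query (lwork = -1) is usually simpler than evaluating the bound by hand.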
NOTE
mod(x,y) is the integer remainder of x/y.
ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL,
NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
If lwork = -1, then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
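The workspace query described above can be sketched as follows. This is an illustrative program rather than a reference implementation: it assumes an MPI launch and linking against BLACS and ScaLAPACK, uses a 1-by-nprocs process grid, and picks made-up matrix and block sizes. The matrix contents are random, since only the call sequence matters here.

```fortran
program ormql_workspace_query
  implicit none
  ! Hypothetical problem: 8-by-4 matrix A, 8-by-2 matrix C, 2-by-2 blocks.
  integer, parameter :: m = 8, n = 4, nrhs = 2, mb = 2, nb = 2
  integer :: iam, nprocs, ictxt, nprow, npcol, myrow, mycol
  integer :: mloc, nloc_a, nloc_c, lwork, info
  integer :: desca(9), descc(9)
  double precision :: wquery(1)
  double precision, allocatable :: a(:,:), c(:,:), tau(:), work(:)
  integer, external :: numroc

  ! Set up a 1 x nprocs process grid.
  call blacs_pinfo(iam, nprocs)
  nprow = 1
  npcol = nprocs
  call blacs_get(-1, 0, ictxt)
  call blacs_gridinit(ictxt, 'Row-major', nprow, npcol)
  call blacs_gridinfo(ictxt, nprow, npcol, myrow, mycol)

  ! Local sizes and descriptors; A and C share the same row blocking.
  mloc   = max(1, numroc(m, mb, myrow, 0, nprow))
  nloc_a = max(1, numroc(n, nb, mycol, 0, npcol))
  nloc_c = max(1, numroc(nrhs, nb, mycol, 0, npcol))
  call descinit(desca, m, n, mb, nb, 0, 0, ictxt, mloc, info)
  call descinit(descc, m, nrhs, mb, nb, 0, 0, ictxt, mloc, info)
  allocate(a(mloc, nloc_a), c(mloc, nloc_c), tau(nloc_a))
  call random_number(a)
  call random_number(c)

  ! QL factorization: query the workspace size, then factor.
  call pdgeqlf(m, n, a, 1, 1, desca, tau, wquery, -1, info)
  lwork = int(wquery(1))
  allocate(work(lwork))
  call pdgeqlf(m, n, a, 1, 1, desca, tau, work, lwork, info)
  deallocate(work)

  ! Apply Q**T from the left: query first, then do the real call.
  call pdormql('L', 'T', m, nrhs, n, a, 1, 1, desca, tau, &
               c, 1, 1, descc, wquery, -1, info)
  lwork = int(wquery(1))
  allocate(work(lwork))
  call pdormql('L', 'T', m, nrhs, n, a, 1, 1, desca, tau, &
               c, 1, 1, descc, work, lwork, info)
  if (iam == 0) print '(a,i0,a,i0)', 'pdormql info = ', info, ', lwork = ', lwork

  call blacs_gridexit(ictxt)
  call blacs_exit(0)
end program ormql_workspace_query
```

Because the query call returns before touching the matrices, a one-element array is enough for work in that call.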
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?unmql
Multiplies a general matrix by the unitary matrix Q of
the QL factorization formed by p?geqlf.
Syntax
call pcunmql(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmql(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
This routine overwrites the general complex m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with Q*sub(C), Q^H*sub(C), sub(C)*Q, or sub(C)*Q^H, depending on side and trans,
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors
Q = H(k)' ... H(2)' H(1)'
Input Parameters
a (local)
COMPLEX for pcunmql
DOUBLE COMPLEX for pzunmql.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+k-1)).
The j-th column must contain the vector that defines the elementary
reflector H(j), ja <= j <= ja+k-1, as returned by p?geqlf in the k columns of its
distributed matrix argument A(ia:*, ja:ja+k-1). A(ia:*, ja:ja+k-1) is
modified by the routine but restored on exit.
If side = 'L', lld_a >= max(1, LOCr(ia+m-1)),
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
COMPLEX for pcunmql
DOUBLE COMPLEX for pzunmql
Array of size LOCc(ja+n-1).
c (local)
COMPLEX for pcunmql
DOUBLE COMPLEX for pzunmql.
Pointer into the local memory to an array of local size (lld_c,LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) that the routine overwrites.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the submatrix C, respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
COMPLEX for pcunmql
DOUBLE COMPLEX for pzunmql.
Workspace array of size lwork.
If side = 'L',
end if
where
lcmq = lcm/NPCOL with lcm = ilcm (NPROW, NPCOL),
iroffa = mod(ia-1, mb_a),
icoffa = mod(ja-1, nb_a),
iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW),
npa0 = numroc (n + iroffa, mb_a, MYROW, iarow, NPROW),
iroffc = mod(ic-1, mb_c),
icoffc = mod(jc-1, nb_c),
icrow = indxg2p(ic, mb_c, MYROW, rsrc_c, NPROW),
iccol = indxg2p(jc, nb_c, MYCOL, csrc_c, NPCOL),
mpc0 = numroc(m+iroffc, mb_c, MYROW, icrow, NPROW),
nqc0 = numroc(n+icoffc, nb_c, MYCOL, iccol, NPCOL),
NOTE
mod(x,y) is the integer remainder of x/y.
ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL,
NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?gerqf
Computes the RQ factorization of a general
rectangular matrix.
Syntax
call psgerqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgerqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgerqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgerqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?gerqf routine forms the RQ factorization of a general m-by-n distributed matrix sub(A) = A(ia:ia+m-1, ja:ja+n-1) as
A = R*Q
Input Parameters
a (local)
REAL for psgerqf
DOUBLE PRECISION for pdgerqf
COMPLEX for pcgerqf
DOUBLE COMPLEX for pzgerqf.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
Contains the local pieces of the distributed matrix sub(A) to be factored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A(ia:ia+m-1,
ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
REAL for psgerqf
DOUBLE PRECISION for pdgerqf.
NOTE
mod(x,y) is the integer remainder of x/y.
Output Parameters
tau (local)
REAL for psgerqf
DOUBLE PRECISION for pdgerqf
COMPLEX for pcgerqf
DOUBLE COMPLEX for pzgerqf.
Array of size LOCr(ia+m-1).
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
< 0, if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ia)*H(ia+1)*...*H(ia+k-1),
where k = min(m,n).
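To make the "product of elementary reflectors" representation concrete, the following sketch builds two reflectors in the standard Householder form H = I - tau*v*v' and checks that their product is orthogonal. That form is assumed here; it is not spelled out in the text above. The vectors are arbitrary illustrative values, tau is chosen as 2/(v'*v) so that each H is orthogonal, and the helper reflector is a hypothetical name. The code does not reproduce how p?gerqf stores v and tau.

```fortran
program reflector_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 4
  real(dp) :: q(n,n), eye(n,n), err
  integer :: i

  eye = 0.0_dp
  do i = 1, n
    eye(i,i) = 1.0_dp
  end do

  ! Q = H(1)*H(2), each factor built from an arbitrary vector v.
  q = matmul(reflector([1.0_dp, 0.5_dp, -2.0_dp, 3.0_dp]), &
             reflector([0.0_dp, 1.0_dp, 4.0_dp, -1.0_dp]))

  ! A product of orthogonal reflectors is orthogonal: Q*Q**T = I.
  err = maxval(abs(matmul(q, transpose(q)) - eye))
  print '(a,es10.3)', 'max |Q*Q**T - I| = ', err
  if (err > 1.0e-12_dp) error stop 'Q is not orthogonal'

contains

  ! H = I - tau*v*v' with tau = 2/(v'*v).
  function reflector(v) result(h)
    real(dp), intent(in) :: v(:)
    real(dp) :: h(size(v), size(v))
    real(dp) :: tau
    integer :: k
    tau = 2.0_dp / dot_product(v, v)
    ! spread(v,2,.) * spread(v,1,.) is the outer product v*v'.
    h = -tau * spread(v, 2, size(v)) * spread(v, 1, size(v))
    do k = 1, size(v)
      h(k,k) = h(k,k) + 1.0_dp
    end do
  end function reflector

end program reflector_sketch
```

Storing only the vectors and scalars, as the factorization routines do, avoids forming Q explicitly; the p?or* and p?un* routines exist to work with Q in that compact form.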
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?orgrq
Generates the orthogonal matrix Q of the RQ
factorization formed by p?gerqf.
Syntax
call psorgrq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgrq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?orgrq routine generates the whole or part of the m-by-n real distributed matrix Q denoting A(ia:ia+m-1, ja:ja+n-1)
with orthonormal rows that is defined as the last m rows of a product of k elementary
reflectors of order n
Q = H(1)*H(2)*...*H(k)
as returned by p?gerqf.
Input Parameters
a (local)
REAL for psorgrq
DOUBLE PRECISION for pdorgrq
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
REAL for psorgrq
DOUBLE PRECISION for pdorgrq
Array of size LOCr(ia+m-1).
work (local)
REAL for psorgrq
DOUBLE PRECISION for pdorgrq
Workspace array of size lwork.
NOTE
mod(x,y) is the integer remainder of x/y.
Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ungrq
Generates the unitary matrix Q of the RQ factorization
formed by p?gerqf.
Syntax
call pcungrq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungrq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
This routine generates the m-by-n complex distributed matrix Q denoting A(ia:ia+m-1,ja:ja+n-1) with
orthonormal rows, which is defined as the last m rows of a product of k elementary reflectors of order n
Input Parameters
a (local)
COMPLEX for pcungrq
DOUBLE COMPLEX for pzungrq
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
The i-th row must contain the vector that defines the elementary reflector
H(i), ia+m-k <= i <= ia+m-1, as returned by p?gerqf in the k rows of its
distributed matrix argument A(ia+m-k:ia+m-1, ja:*).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
COMPLEX for pcungrq
DOUBLE COMPLEX for pzungrq
Array of size LOCr(ia+m-1).
work (local)
COMPLEX for pcungrq
DOUBLE COMPLEX for pzungrq
Workspace array of size lwork.
NOTE
mod(x,y) is the integer remainder of x/y.
indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ormr3
Applies an orthogonal distributed matrix to a general
m-by-n distributed matrix.
Syntax
call psormr3 (side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info )
call pdormr3 (side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info )
Description
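p?ormr3 overwrites the general real m-by-n distributed matrix sub( C ) = C(ic:ic+m-1,jc:jc+n-1) with
              side = 'L'        side = 'R'
trans = 'N':  Q * sub( C )      sub( C ) * Q
trans = 'T':  Q^T * sub( C )    sub( C ) * Q^T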
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors
Input Parameters
side (global)
CHARACTER.
= 'L': apply Q or Q^T from the Left;
= 'R': apply Q or Q^T from the Right.
trans (global)
CHARACTER.
= 'N': No transpose, apply Q;
= 'T': Transpose, apply Q^T.
m (global)
INTEGER.
The number of rows to be operated on, i.e., the number of rows of the
distributed submatrix sub( C ). m >= 0.
n (global)
INTEGER.
k (global)
INTEGER.
The number of elementary reflectors whose product defines the matrix Q.
If side = 'L', m >= k >= 0, if side = 'R', n >= k >= 0.
l (global)
INTEGER.
The columns of the distributed submatrix sub( A ) containing the
meaningful part of the Householder reflectors.
If side = 'L', m >= l >= 0, if side = 'R', n >= l >= 0.
a (local)
REAL for psormr3
DOUBLE PRECISION for pdormr3
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side='L', and (lld_a,LOCc(ja+n-1)) if side='R', where lld_a >=
MAX(1,LOCr(ia+k-1));
On entry, the i-th row must contain the vector which defines the elementary
reflector H(i), ia <= i <= ia+k-1, as returned by p?tzrzf in the k rows of
its distributed matrix argument A(ia:ia+k-1,ja:*).
ia (global)
INTEGER.
The row index in the global array a indicating the first row of sub( A ).
ja (global)
INTEGER.
The column index in the global array a indicating the first column of
sub( A ).
tau (local)
REAL for psormr3
DOUBLE PRECISION for pdormr3
Array, size LOCc(ia+k-1).
This array contains the scalar factors tau(i) of the elementary reflectors
H(i) as returned by p?tzrzf. tau is tied to the distributed matrix A.
c (local)
REAL for psormr3
DOUBLE PRECISION for pdormr3
Pointer into the local memory to an array of size (lld_c,LOCc(jc+n-1)) .
ic (global)
INTEGER.
The row index in the global array c indicating the first row of sub( C ).
jc (global)
INTEGER.
The column index in the global array c indicating the first column of
sub( C ).
work (local)
REAL for psormr3
DOUBLE PRECISION for pdormr3
Array, size (lwork)
lwork (local)
INTEGER.
The size of the array work.
Output Parameters
info (local)
INTEGER.
= 0: successful exit
< 0: If the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j), if the i-th argument is a scalar and had an illegal
value, then info = -i.
Application Notes
Alignment requirements
The distributed submatrices A(ia:*, ja:*) and C(ic:ic+m-1,jc:jc+n-1) must verify some alignment
properties, namely the following expressions should be true:
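If side = 'L', ( nb_a = mb_c and icoffa = iroffc )
If side = 'R', ( nb_a = nb_c and icoffa = icoffc and iacol = iccol )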
p?unmr3
Applies a unitary distributed matrix to a general
m-by-n distributed matrix.
Syntax
call pcunmr3 (side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info )
call pzunmr3 (side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info )
Description
p?unmr3 overwrites the general complex m-by-n distributed matrix sub( C ) = C(ic:ic+m-1,jc:jc+n-1) with
              side = 'L'        side = 'R'
trans = 'N':  Q * sub( C )      sub( C ) * Q
trans = 'C':  Q^H * sub( C )    sub( C ) * Q^H
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors
Input Parameters
side (global)
CHARACTER.
= 'L': apply Q or Q^H from the Left;
= 'R': apply Q or Q^H from the Right.
trans (global)
CHARACTER.
= 'N': No transpose, apply Q;
= 'C': Conjugate transpose, apply Q^H.
m (global)
INTEGER.
The number of rows to be operated on, i.e., the number of rows of the
distributed submatrix sub( C ). m >= 0.
n (global)
INTEGER.
The number of columns to be operated on, i.e., the number of columns of the
distributed submatrix sub( C ). n >= 0.
k (global)
INTEGER.
The number of elementary reflectors whose product defines the matrix Q.
If side = 'L', m >= k >= 0, if side = 'R', n >= k >= 0.
l (global)
INTEGER.
The columns of the distributed submatrix sub( A ) containing the
meaningful part of the Householder reflectors.
If side = 'L', m >= l >= 0, if side = 'R', n >= l >= 0.
a (local)
COMPLEX for pcunmr3
DOUBLE COMPLEX for pzunmr3
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side='L', and (lld_a,LOCc(ja+n-1)) if side='R', where lld_a >=
MAX(1,LOCr(ia+k-1));
On entry, the i-th row must contain the vector which defines the elementary
reflector H(i), ia <= i <= ia+k-1, as returned by p?tzrzf in the k rows of
its distributed matrix argument A(ia:ia+k-1,ja:*).
ia (global)
INTEGER.
The row index in the global array a indicating the first row of sub( A ).
ja (global)
INTEGER.
The column index in the global array a indicating the first column of
sub( A ).
tau (local)
COMPLEX for pcunmr3
DOUBLE COMPLEX for pzunmr3
Array, size LOCc(ia+k-1).
This array contains the scalar factors tau(i) of the elementary reflectors
H(i) as returned by p?tzrzf. tau is tied to the distributed matrix A.
c (local)
COMPLEX for pcunmr3
DOUBLE COMPLEX for pzunmr3
Pointer into the local memory to an array of size (lld_c,LOCc(jc+n-1)) .
ic (global)
INTEGER.
The row index in the global array c indicating the first row of sub( C ).
jc (global)
INTEGER.
The column index in the global array c indicating the first column of
sub( C ).
work (local)
COMPLEX for pcunmr3
DOUBLE COMPLEX for pzunmr3
Array, size (lwork)
Output Parameters
work (local)
COMPLEX for pcunmr3
DOUBLE COMPLEX for pzunmr3
Array, size (lwork)
info (local)
INTEGER.
= 0: successful exit
< 0: If the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j), if the i-th argument is a scalar and had an illegal
value, then info = -i.
Application Notes
Alignment requirements
The distributed submatrices A(ia:*, ja:*) and C(ic:ic+m-1,jc:jc+n-1) must verify some alignment
properties, namely the following expressions should be true:
If side = 'L', ( nb_a = mb_c and icoffa = iroffc )
If side = 'R', ( nb_a = nb_c and icoffa = icoffc and iacol = iccol )
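The two alignment conditions can be checked before the call. The following is a minimal sketch in plain Fortran under stated assumptions: the offsets follow the mod(...) definitions used for the other routines in this chapter, iacol is computed by analogy with iccol (an assumption, since this section does not define it), the owning process column is computed with the usual indxg2p arithmetic written inline, and all numeric inputs are made-up values.

```fortran
program unmr3_alignment_sketch
  implicit none
  ! Hypothetical inputs: 2 process columns, 4-by-4 blocks for A and C.
  integer, parameter :: npcol = 2
  integer, parameter :: ja = 5, nb_a = 4, csrc_a = 0
  integer, parameter :: ic = 1, jc = 5, mb_c = 4, nb_c = 4, csrc_c = 0
  integer :: icoffa, iroffc, icoffc, iacol, iccol
  logical :: ok_left, ok_right

  ! Offsets of the submatrix corners inside their blocks.
  icoffa = mod(ja-1, nb_a)
  iroffc = mod(ic-1, mb_c)
  icoffc = mod(jc-1, nb_c)

  ! Process columns owning the first column of sub(A) and sub(C);
  ! same arithmetic as indxg2p(j, nb, MYCOL, csrc, NPCOL).
  iacol = mod(csrc_a + (ja-1)/nb_a, npcol)
  iccol = mod(csrc_c + (jc-1)/nb_c, npcol)

  ok_left  = (nb_a == mb_c) .and. (icoffa == iroffc)
  ok_right = (nb_a == nb_c) .and. (icoffa == icoffc) .and. (iacol == iccol)

  print '(a,l1)', "side = 'L' alignment satisfied: ", ok_left
  print '(a,l1)', "side = 'R' alignment satisfied: ", ok_right
end program unmr3_alignment_sketch
```

Only the condition matching the side argument actually passed to the routine needs to hold.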
p?ormrq
Multiplies a general matrix by the orthogonal matrix Q
of the RQ factorization formed by p?gerqf.
Syntax
call psormrq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormrq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
The p?ormrq routine overwrites the general real m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with Q*sub(C), Q^T*sub(C), sub(C)*Q, or sub(C)*Q^T, depending on side and trans,
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors
Input Parameters
n (global) INTEGER. The number of columns in the distributed matrix sub(C)
(n >= 0).
a (local)
REAL for psormrq
DOUBLE PRECISION for pdormrq.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side = 'L', and (lld_a,LOCc(ja+n-1)) if side = 'R'.
The i-th row must contain the vector that defines the elementary reflector
H(i), ia <= i <= ia+k-1, as returned by p?gerqf in the k rows of its distributed
matrix argument A(ia:ia+k-1, ja:*). A(ia:ia+k-1, ja:*) is modified by
the routine but restored on exit.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
REAL for psormrq
DOUBLE PRECISION for pdormrq
Array of size LOCc(ja+k-1).
c (local)
REAL for psormrq
DOUBLE PRECISION for pdormrq
Pointer into the local memory to an array of local size (lld_c,LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) that the routine overwrites.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the matrix sub(C),
respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
REAL for psormrq
If side = 'L',
NOTE
mod(x,y) is the integer remainder of x/y.
ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL,
NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
If lwork = -1, then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
= 0: the execution is successful.
< 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
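The info encoding above can be unpacked with integer division and mod. The following is a small sketch in plain Fortran; the subroutine name report_info and the sample values are hypothetical. It relies on the fact that the entry index j in -(i*100+j) is smaller than 100 and that these routines have fewer than 100 arguments.

```fortran
program info_decode_sketch
  implicit none

  call report_info(0)      ! success
  call report_info(-3)     ! scalar argument 3 is illegal
  call report_info(-902)   ! entry 2 of array argument 9 is illegal

contains

  subroutine report_info(info)
    integer, intent(in) :: info
    integer :: code
    if (info == 0) then
      print '(a)', 'the execution is successful'
    else if (info < 0) then
      code = -info
      if (code >= 100) then
        ! info = -(i*100+j): array argument i, entry j.
        print '(a,i0,a,i0,a)', 'argument ', code/100, ', entry ', &
              mod(code, 100), ' had an illegal value'
      else
        ! info = -i: scalar argument i.
        print '(a,i0,a)', 'argument ', code, ' had an illegal value'
      end if
    end if
  end subroutine report_info

end program info_decode_sketch
```

Argument positions are counted from 1 in the order of the routine's Syntax line.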
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?unmrq
Multiplies a general matrix by the unitary matrix Q of
the RQ factorization formed by p?gerqf.
Syntax
call pcunmrq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmrq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
This routine overwrites the general complex m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with Q*sub(C), Q^H*sub(C), sub(C)*Q, or sub(C)*Q^H, depending on side and trans,
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors
Input Parameters
a (local)
COMPLEX for pcunmrq
DOUBLE COMPLEX for pzunmrq.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side = 'L', and (lld_a,LOCc(ja+n-1)) if side = 'R'. The i-th row
must contain the vector that defines the elementary reflector H(i),
ia <= i <= ia+k-1, as returned by p?gerqf in the k rows of its distributed matrix
argument A(ia:ia+k-1, ja:*). A(ia:ia+k-1, ja:*) is modified by the
routine but restored on exit.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
COMPLEX for pcunmrq
DOUBLE COMPLEX for pzunmrq
Array of size LOCc(ja+k-1).
c (local)
COMPLEX for pcunmrq
DOUBLE COMPLEX for pzunmrq.
Pointer into the local memory to an array of local size (lld_c,LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) that the routine overwrites.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the submatrix C, respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
COMPLEX for pcunmrq
DOUBLE COMPLEX for pzunmrq.
Workspace array of size lwork.
If side = 'L',
lwork >= max((mb_a*(mb_a-1))/2, (mpc0 +
max(mqa0+numroc(numroc(n+iroffc, mb_a, 0, 0, NPROW), mb_a,
0, 0, lcmp), nqc0))*mb_a) + mb_a*mb_a
else if side = 'R',
NOTE
mod(x,y) is the integer remainder of x/y.
ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL,
NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
If lwork = -1, then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
< 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?tzrzf
Reduces the upper trapezoidal matrix A to upper
triangular form.
Syntax
call pstzrzf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdtzrzf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pctzrzf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pztzrzf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?tzrzf routine reduces the m-by-n (m <= n) real/complex upper trapezoidal matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1) to upper triangular form by means of orthogonal/unitary transformations. The upper trapezoidal
matrix A is factored as
A = (R 0)*Z,
where Z is an n-by-n orthogonal/unitary matrix and R is an m-by-m upper triangular matrix.
Input Parameters
a (local)
REAL for pstzrzf
DOUBLE PRECISION for pdtzrzf.
COMPLEX for pctzrzf.
DOUBLE COMPLEX for pztzrzf.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
Contains the local pieces of the m-by-n distributed matrix sub(A) to be
factored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
REAL for pstzrzf
DOUBLE PRECISION for pdtzrzf.
COMPLEX for pctzrzf.
DOUBLE COMPLEX for pztzrzf.
Workspace array of size lwork.
NOTE
mod(x,y) is the integer remainder of x/y.
indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
Output Parameters
a On exit, the leading m-by-m upper triangular part of sub(A) contains the
upper triangular matrix R, and elements m+1 to n of the first m rows of
sub(A), with the array tau, represent the orthogonal/unitary matrix Z as a
product of m elementary reflectors.
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
tau (local)
REAL for pstzrzf
DOUBLE PRECISION for pdtzrzf.
COMPLEX for pctzrzf.
DOUBLE COMPLEX for pztzrzf.
Array of size LOCr(ia+m-1).
Application Notes
The factorization is obtained by Householder's method. The k-th transformation matrix, Z(k), which is used
(or, in the complex case, whose conjugate transpose is used) to introduce zeros into the (m - k + 1)-th row of
sub(A), is given in the form
Z(k) = ( I    0  )
       ( 0  T(k) ),
where
T(k) = I - tau*u(k)*u(k)',
u(k) = ( 1, 0, z(k) )',
tau is a scalar and z(k) is an (n - m) element vector. tau and z(k) are chosen to annihilate the elements of
the k-th row of sub(A). The scalar tau is returned in the k-th element of tau and the vector u(k) in the k-th
row of sub(A), such that the elements of z(k) are in a(k, m + 1),..., a(k, n). The elements of R are
returned in the upper triangular part of sub(A). Z is given by
Z = Z(1) * Z(2) *... * Z(m).
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ormrz
Multiplies a general matrix by the orthogonal matrix
from a reduction to upper triangular form formed by
p?tzrzf.
Syntax
call psormrz(side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormrz(side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
This routine overwrites the general real m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1) with
               side = 'L'     side = 'R'
trans = 'N':   Q*sub(C)       sub(C)*Q
trans = 'T':   Q^T*sub(C)     sub(C)*Q^T
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors
Input Parameters
l (global)
The columns of the distributed matrix sub(A) containing the meaningful
part of the Householder reflectors.
If side = 'L', m ≥ l ≥ 0
a (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side = 'L', and (lld_a,LOCc(ja+n-1)) if side = 'R', where
lld_a ≥ max(1, LOCr(ia+k-1)).
The i-th row must contain the vector that defines the elementary reflector
H(i), ia ≤ i ≤ ia+k-1, as returned by p?tzrzf in the k rows of its distributed
matrix argument A(ia:ia+k-1, ja:*). A(ia:ia+k-1, ja:*) is modified by
the routine but restored on exit.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz
Array of size LOCc(ia+k-1).
c (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz
Pointer into the local memory to an array of local size (lld_c,LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) to be factored.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the submatrix C, respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz.
Workspace array of size lwork.
If side = 'L',
iroffc = mod(ic-1, mb_c),
icoffc = mod(jc-1, nb_c),
icrow = indxg2p(ic, mb_c, MYROW, rsrc_c, NPROW),
iccol = indxg2p(jc, nb_c, MYCOL, csrc_c, NPCOL),
mpc0 = numroc(m+iroffc, mb_c, MYROW, icrow, NPROW),
nqc0 = numroc(n+icoffc, nb_c, MYCOL, iccol, NPCOL),
NOTE
mod(x,y) is the integer remainder of x/y.
ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL,
NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
If lwork = -1, then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?unmrz
Multiplies a general matrix by the unitary
transformation matrix from a reduction to upper
triangular form determined by p?tzrzf.
Syntax
call pcunmrz(side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmrz(side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
This routine overwrites the general complex m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors
Input Parameters
a (local)
COMPLEX for pcunmrz
DOUBLE COMPLEX for pzunmrz.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side = 'L', and (lld_a,LOCc(ja+n-1)) if side = 'R', where
lld_a ≥ max(1, LOCr(ia+k-1)). The i-th row must contain the vector that
defines the elementary reflector H(i), ia ≤ i ≤ ia+k-1, as returned by p?tzrzf
in the k rows of its distributed matrix argument A(ia:ia+k-1, ja:*).
A(ia:ia+k-1, ja:*) is modified by the routine but restored on exit.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
tau (local)
COMPLEX for pcunmrz
DOUBLE COMPLEX for pzunmrz
Array of size LOCc(ia+k-1).
c (local)
COMPLEX for pcunmrz
DOUBLE COMPLEX for pzunmrz.
Pointer into the local memory to an array of local size (lld_c,LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) to be factored.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the submatrix C, respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
COMPLEX for pcunmrz
DOUBLE COMPLEX for pzunmrz.
Workspace array of size lwork.
If side = 'L',
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0+max(mqa0+numroc(numroc(n+iroffc,
mb_a, 0, 0, NPROW), mb_a, 0, 0, lcmp), nqc0))*mb_a)
+ mb_a*mb_a
else if side ='R',
NOTE
mod(x,y) is the integer remainder of x/y.
ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL,
NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
If lwork = -1, then lwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ggqrf
Computes the generalized QR factorization.
Syntax
call psggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pdggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pcggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pzggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
Include Files
Description
The p?ggqrf routine forms the generalized QR factorization of an n-by-m matrix sub(A) = A(ia:ia+n-1,
ja:ja+m-1) and an n-by-p matrix sub(B) = B(ib:ib+n-1, jb:jb+p-1)
as
sub(A) = Q*R, sub(B) = Q*T*Z,
where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix, and R and T
assume one of the forms:
If n ≥ m
or if n < m
In particular, if sub(B) is square and nonsingular, the GQR factorization of sub(A) and sub(B) implicitly gives
the QR factorization of inv(sub(B))*sub(A):
inv(sub(B))*sub(A) = Z^H*(inv(T)*R)
Input Parameters
n (global) INTEGER. The number of rows in the distributed matrices sub(A)
and sub(B) (n ≥ 0).
a (local)
REAL for psggqrf
DOUBLE PRECISION for pdggqrf
COMPLEX for pcggqrf
DOUBLE COMPLEX for pzggqrf.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)).
Contains the local pieces of the n-by-m matrix sub(A) to be factored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
b (local)
REAL for psggqrf
DOUBLE PRECISION for pdggqrf
COMPLEX for pcggqrf
DOUBLE COMPLEX for pzggqrf.
Pointer into the local memory to an array of size (lld_b,LOCc(jb+p-1)).
Contains the local pieces of the n-by-p matrix sub(B) to be factored.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the submatrix B,
respectively.
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
work (local)
REAL for psggqrf
DOUBLE PRECISION for pdggqrf
COMPLEX for pcggqrf
DOUBLE COMPLEX for pzggqrf.
Workspace array of size lwork.
lwork ≥ max(nb_a*(npa0+mqa0+nb_a), max((nb_a*(nb_a-1))/2,
(pqb0+npb0)*nb_a)+nb_a*nb_a, mb_b*(npb0+pqb0+mb_b)),
where
iroffa = mod(ia-1, mb_a),
icoffa = mod(ja-1, nb_a),
iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW),
iacol = indxg2p(ja, nb_a, MYCOL, csrc_a, NPCOL),
npa0 = numroc (n+iroffa, mb_a, MYROW, iarow, NPROW),
mqa0 = numroc (m+icoffa, nb_a, MYCOL, iacol, NPCOL)
iroffb = mod(ib-1, mb_b),
icoffb = mod(jb-1, nb_b),
ibrow = indxg2p(ib, mb_b, MYROW, rsrc_b, NPROW),
ibcol = indxg2p(jb, nb_b, MYCOL, csrc_b, NPCOL),
npb0 = numroc (n+iroffb, mb_b, MYROW, ibrow, NPROW),
pqb0 = numroc (p+icoffb, nb_b, MYCOL, ibcol, NPCOL)
NOTE
mod(x,y) is the integer remainder of x/y.
numroc and indxg2p are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
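The workspace bound above can be evaluated once the local extents are known. The following Python sketch is for illustration only and is not part of the Fortran interface: numroc and indxg2p are re-implemented here from the standard definitions of the ScaLAPACK tool functions, while local_extents, ggqrf_min_lwork, and the dictionary-based descriptor are hypothetical helpers. It assumes that npb0 is computed with iroffb and pqb0 with p (the column count of sub(B)).

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows or columns of a block-cyclic array owned by iproc."""
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb
    count = (nblocks // nprocs) * nb
    extrablks = nblocks % nprocs
    if mydist < extrablks:
        count += nb
    elif mydist == extrablks:
        count += n % nb
    return count


def indxg2p(indxglob, nb, iproc, isrcproc, nprocs):
    """Process coordinate owning the 1-based global index indxglob."""
    return (isrcproc + (indxglob - 1) // nb) % nprocs


def local_extents(rows, cols, i, j, desc, grid):
    """Hypothetical helper: padded local row and column counts."""
    myrow, mycol, nprow, npcol = grid
    iroff = (i - 1) % desc["mb"]
    icoff = (j - 1) % desc["nb"]
    prow = indxg2p(i, desc["mb"], myrow, desc["rsrc"], nprow)
    pcol = indxg2p(j, desc["nb"], mycol, desc["csrc"], npcol)
    np0 = numroc(rows + iroff, desc["mb"], myrow, prow, nprow)
    nq0 = numroc(cols + icoff, desc["nb"], mycol, pcol, npcol)
    return np0, nq0


def ggqrf_min_lwork(n, m, p, ia, ja, ib, jb, desc_a, desc_b, grid):
    """Evaluate the lwork bound stated in the text (sketch, see assumptions)."""
    npa0, mqa0 = local_extents(n, m, ia, ja, desc_a, grid)
    npb0, pqb0 = local_extents(n, p, ib, jb, desc_b, grid)
    nb_a, mb_b = desc_a["nb"], desc_b["mb"]
    return max(
        nb_a * (npa0 + mqa0 + nb_a),
        max((nb_a * (nb_a - 1)) // 2, (pqb0 + npb0) * nb_a) + nb_a * nb_a,
        mb_b * (npb0 + pqb0 + mb_b),
    )


if __name__ == "__main__":
    desc = {"mb": 2, "nb": 2, "rsrc": 0, "csrc": 0}
    # Single-process grid: (MYROW, MYCOL, NPROW, NPCOL) = (0, 0, 1, 1).
    print(ggqrf_min_lwork(4, 4, 4, 1, 1, 1, 1, desc, desc, (0, 0, 1, 1)))
```

In practice a workspace query is usually simpler than evaluating the formula by hand; the sketch is mainly useful for sizing buffers ahead of time or for checking a value returned in work(1).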
Output Parameters
a On exit, the elements on and above the diagonal of sub(A) contain the
min(n, m)-by-m upper trapezoidal matrix R (R is upper triangular if n ≥ m); the
elements below the diagonal, with the array taua, represent the
orthogonal/unitary matrix Q as a product of min(n, m) elementary
reflectors. (See Application Notes below).
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ja)*H(ja+1)*...*H(ja+k-1),
where k= min(n,m).
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?ggrqf
Computes the generalized RQ factorization.
Syntax
call psggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pdggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pcggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pzggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
Include Files
Description
The p?ggrqf routine forms the generalized RQ factorization of an m-by-n matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1) and a p-by-n matrix sub(B) = B(ib:ib+p-1, jb:jb+n-1):
sub(A) = R*Q, sub(B) = Z*T*Q,
where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix, and R and T
assume one of the forms:
or
or
In particular, if sub(B) is square and nonsingular, the GRQ factorization of sub(A) and sub(B) implicitly gives
the RQ factorization of sub(A)*inv(sub(B)):
sub(A)*inv(sub(B))= (R*inv(T))*Z'
where inv(sub(B)) denotes the inverse of the matrix sub(B), and Z' denotes the transpose (conjugate
transpose) of matrix Z.
Input Parameters
m (global) INTEGER. The number of rows in the distributed matrix sub(A)
(m ≥ 0).
a (local)
REAL for psggrqf
DOUBLE PRECISION for pdggrqf
COMPLEX for pcggrqf
DOUBLE COMPLEX for pzggrqf.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
Contains the local pieces of the m-by-n distributed matrix sub(A) to be
factored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
b (local)
REAL for psggrqf
DOUBLE PRECISION for pdggrqf
COMPLEX for pcggrqf
DOUBLE COMPLEX for pzggrqf.
Pointer into the local memory to an array of size (lld_b,LOCc(jb+n-1)).
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the submatrix B, respectively.
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
work (local)
REAL for psggrqf
DOUBLE PRECISION for pdggrqf
COMPLEX for pcggrqf
DOUBLE COMPLEX for pzggrqf.
Workspace array of size lwork.
NOTE
mod(x,y) is the integer remainder of x/y.
numroc and indxg2p are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
Output Parameters
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ia)*H(ia+1)*...*H(ia+k-1),
where k= min(m,n).
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
Symmetric Eigenvalue Problems: ScaLAPACK Computational Routines
To solve a symmetric eigenproblem with ScaLAPACK, you usually need to reduce the matrix to real
tridiagonal form T and then find the eigenvalues and eigenvectors of the tridiagonal matrix T. ScaLAPACK
includes routines for reducing the matrix to a tridiagonal form by an orthogonal (or unitary) similarity
transformation A = Q*T*Q^H as well as for solving tridiagonal symmetric eigenvalue problems. These routines
are listed in Table "Computational Routines for Solving Symmetric Eigenproblems".
There are different routines for symmetric eigenproblems, depending on whether you need eigenvalues only
or eigenvectors as well, and on the algorithm used (either the QTQ algorithm, or bisection followed by
inverse iteration).
Computational Routines for Solving Symmetric Eigenproblems

Operation                               Dense symmetric/    Orthogonal/unitary   Symmetric
                                        Hermitian matrix    matrix               tridiagonal matrix

Reduce to tridiagonal form              p?sytrd/p?hetrd
A = Q*T*Q^H

Multiply matrix after reduction                             p?ormtr/p?unmtr

Find all eigenvalues and eigenvectors                                            steqr2*
of a tridiagonal matrix T by a QTQ
method

Find selected eigenvalues of a                                                   p?stebz
tridiagonal matrix T via bisection

Find selected eigenvectors of a                                                  p?stein
tridiagonal matrix T by inverse
iteration

* This routine is described as part of auxiliary ScaLAPACK routines.
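The reduction step A = Q*T*Q^H mentioned above can be illustrated on a tiny serial example. The following Python sketch is for illustration only: it applies a single Householder reflector to a real symmetric 3-by-3 matrix and is not the blocked, distributed algorithm used by p?sytrd. The helper names (matmul, transpose, householder_tridiag_3x3) are hypothetical.

```python
import math


def matmul(x, y):
    """Dense matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def transpose(x):
    return [list(row) for row in zip(*x)]


def householder_tridiag_3x3(a):
    """Return (q, t) with t = q' * a * q tridiagonal, for symmetric 3x3 a."""
    # Entries below the diagonal in the first column: one of them must vanish.
    x = [a[1][0], a[2][0]]
    norm = math.hypot(x[0], x[1])
    # Choose the sign that avoids cancellation when forming v.
    alpha = -math.copysign(norm, x[0])
    v = [x[0] - alpha, x[1]]
    vtv = v[0] * v[0] + v[1] * v[1]
    q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if vtv > 0.0:
        # Embed the 2x2 reflector I - 2*v*v'/(v'*v) in the trailing block.
        for i in range(2):
            for j in range(2):
                q[i + 1][j + 1] -= 2.0 * v[i] * v[j] / vtv
    t = matmul(transpose(q), matmul(a, q))
    return q, t


if __name__ == "__main__":
    a = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 5.0]]
    q, t = householder_tridiag_3x3(a)
    for row in t:
        print(["%.6f" % value for value in row])
```

The resulting T has zeros in positions (1,3) and (3,1), and its eigenvalues equal those of A because the transformation is an orthogonal similarity.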
p?syngst
Reduces a real symmetric-definite generalized
eigenproblem to standard form.
Syntax
call pssyngst (ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, scale, work,
lwork, info )
call pdsyngst (ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, scale, work,
lwork, info )
Description
p?syngst reduces a real symmetric-definite generalized eigenproblem to standard form.
p?syngst performs the same function as p?sygst, but is based on rank 2K updates, which are faster and
more scalable than triangular solves (the basis of p?sygst).
p?syngst calls p?sygst when uplo='U', hence p?syngst provides improved performance only when
uplo='L', ibtype=1.
p?syngst also calls p?sygst when insufficient workspace is provided, hence p?syngst provides improved
performance only when lwork >= 2 * NP0 * NB + NQ0 * NB + NB * NB.
In the following, sub( A ) denotes A( ia:ia+n-1, ja:ja+n-1 ) and sub( B ) denotes B( ib:ib+n-1,
jb:jb+n-1 ).
If ibtype = 1, the problem is sub( A )*x = lambda*sub( B )*x, and sub( A ) is overwritten by
inv(U^H)*sub( A )*inv(U) or inv(L)*sub( A )*inv(L^H).
If ibtype = 2 or 3, the problem is sub( A )*sub( B )*x = lambda*x or sub( B )*sub( A )*x = lambda*x, and
sub( A ) is overwritten by U*sub( A )*U^H or L^H*sub( A )*L.
Input Parameters
ibtype (global)
INTEGER.
= 1: compute inv(U^H)*sub( A )*inv(U) or inv(L)*sub( A )*inv(L^H);
= 2 or 3: compute U*sub( A )*U^H or L^H*sub( A )*L.
uplo (global)
CHARACTER.
= 'U': Upper triangle of sub( A ) is stored and sub( B ) is factored as U^H*U;
= 'L': Lower triangle of sub( A ) is stored and sub( B ) is factored as L*L^H.
n (global)
INTEGER.
The order of the matrices sub( A ) and sub( B ). n >= 0.
a (local)
REAL for pssyngst
DOUBLE PRECISION for pdsyngst
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
On entry, this array contains the local pieces of the n-by-n Hermitian
distributed matrix sub( A ). If uplo = 'U', the leading n-by-n upper
triangular part of sub( A ) contains the upper triangular part of the matrix,
and its strictly lower triangular part is not referenced. If uplo = 'L', the
leading n-by-n lower triangular part of sub( A ) contains the lower
triangular part of the matrix, and its strictly upper triangular part is not
referenced.
ia (global)
INTEGER.
A's global row index, which points to the beginning of the submatrix which
is to be operated on.
ja (global)
INTEGER.
A's global column index, which points to the beginning of the submatrix
which is to be operated on.
b (local)
REAL for pssyngst
DOUBLE PRECISION for pdsyngst
Pointer into the local memory to an array of size (lld_b,LOCc(jb+n-1)).
On entry, this array contains the local pieces of the triangular factor from
the Cholesky factorization of sub( B ), as returned by p?potrf.
ib (global)
INTEGER.
B's global row index, which points to the beginning of the submatrix which
is to be operated on.
jb (global)
INTEGER.
B's global column index, which points to the beginning of the submatrix
which is to be operated on.
work (local)
REAL for pssyngst
DOUBLE PRECISION for pdsyngst
Array, size (lwork)
lwork is local input and must be at least lwork >= MAX( NB * ( NP0 +1 ),
3 * NB )
When ibtype = 1 and uplo = 'L', p?syngst provides improved
performance when lwork >= 2 * NP0 * NB + NQ0 * NB + NB * NB,
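The two workspace thresholds stated for lwork can be compared directly. The following Python sketch is for illustration only; the function names are hypothetical, and NB, NP0, and NQ0 are taken as given inputs because their definitions are not reproduced here.

```python
def syngst_lwork_bounds(nb, np0, nq0):
    """Return (minimum, fast) lwork thresholds as stated in the text."""
    minimum = max(nb * (np0 + 1), 3 * nb)
    fast = 2 * np0 * nb + nq0 * nb + nb * nb
    return minimum, fast


def improved_path_expected(ibtype, uplo, lwork, nb, np0, nq0):
    """True when the text says the improved-performance path applies."""
    minimum, fast = syngst_lwork_bounds(nb, np0, nq0)
    if lwork < minimum:
        raise ValueError("lwork is below the required minimum")
    return ibtype == 1 and uplo.upper() == "L" and lwork >= fast


if __name__ == "__main__":
    print(syngst_lwork_bounds(64, 128, 128))
    print(improved_path_expected(1, "L", 30000, 64, 128, 128))
```

With NB = 64 and NP0 = NQ0 = 128 (values chosen only for the example), the minimum is 8256 elements while the improved-performance threshold is 28672 elements, so allocating only the minimum would forgo the faster rank 2K code path.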
Output Parameters
scale (global)
REAL for pssyngst
DOUBLE PRECISION for pdsyngst
Amount by which the eigenvalues should be scaled to compensate for the
scaling performed in this routine. At present, scale is always returned as
1.0; it is returned here to allow for future enhancement.
work (local)
REAL for pssyngst
DOUBLE PRECISION for pdsyngst
Array, size (lwork)
info (global)
INTEGER.
= 0: successful exit
< 0: If the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j), if the i-th argument is a scalar and had an illegal
value, then info = -i.
p?syntrd
Reduces a real symmetric matrix to symmetric
tridiagonal form.
Syntax
call pssyntrd (uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info )
call pdsyntrd (uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info )
Description
p?syntrd is a prototype version of p?sytrd which uses tailored codes (either the serial, ?sytrd, or the
parallel code, p?syttrd) when the workspace provided by the user is adequate.
p?syntrd reduces a real symmetric matrix sub( A ) to symmetric tridiagonal form T by an orthogonal
similarity transformation:
Q' * sub( A ) * Q = T, where sub( A ) = A(ia:ia+n-1,ja:ja+n-1).
Features
p?syntrd is faster than p?sytrd on almost all matrices, particularly small ones (i.e. n < 500 * sqrt(P) ),
provided that enough workspace is available to use the tailored codes.
The tailored codes provide performance that is essentially independent of the input data layout.
The tailored codes place no restrictions on ia, ja, MB or NB. At present, ia, ja, MB and NB are restricted to
those values allowed by p?hetrd to keep the interface simple (see the Application Notes section for more
information about the restrictions).
Input Parameters
uplo (global)
CHARACTER.
Specifies whether the upper or lower triangular part of the symmetric
matrix sub( A ) is stored:
= 'U': Upper triangular
= 'L': Lower triangular
n (global)
INTEGER.
The number of rows and columns to be operated on, i.e. the order of the
distributed submatrix sub( A ). n >= 0.
a (local)
REAL for pssyntrd
DOUBLE PRECISION for pdsyntrd
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
On entry, this array contains the local pieces of the symmetric distributed
matrix sub( A ). If uplo = 'U', the leading n-by-n upper triangular part of
sub( A ) contains the upper triangular part of the matrix, and its strictly
lower triangular part is not referenced. If uplo = 'L', the leading n-by-n
lower triangular part of sub( A ) contains the lower triangular part of the
matrix, and its strictly upper triangular part is not referenced.
ia (global)
INTEGER.
The row index in the global array a indicating the first row of sub( A ).
ja (global)
INTEGER.
The column index in the global array a indicating the first column of
sub( A ).
work (local)
REAL for pssyntrd
DOUBLE PRECISION for pdsyntrd
Array, size (lwork)
Output Parameters
a On exit, if uplo = 'U', the diagonal and first superdiagonal of sub( A ) are
overwritten by the corresponding elements of the tridiagonal matrix T, and
the elements above the first superdiagonal, with the array tau, represent
the orthogonal matrix Q as a product of elementary reflectors; if uplo = 'L',
the diagonal and first subdiagonal of sub( A ) are overwritten by the
corresponding elements of the tridiagonal matrix T, and the elements below
the first subdiagonal, with the array tau, represent the orthogonal matrix Q
as a product of elementary reflectors. See Application Notes below.
d (local)
REAL for pssyntrd
DOUBLE PRECISION for pdsyntrd
Array, size LOCc(ja+n-1)
e (local)
REAL for pssyntrd
DOUBLE PRECISION for pdsyntrd
Array, size LOCc(ja+n-1) if uplo = 'U', LOCc(ja+n-2) otherwise.
tau (local)
REAL for pssyntrd
DOUBLE PRECISION for pdsyntrd
Array, size LOCc(ja+n-1).
This array contains the scalar factors tau of the elementary reflectors. tau
is tied to the distributed matrix A.
work (local)
REAL for pssyntrd
DOUBLE PRECISION for pdsyntrd
Array, size (lwork)
info (global)
INTEGER.
= 0: successful exit
< 0: If the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j), if the i-th argument is a scalar and had an illegal
value, then info = -i.
Application Notes
If uplo = 'U', the matrix Q is represented as a product of elementary reflectors
The contents of sub( A ) on exit are illustrated by the following examples with n = 5:
if uplo = 'U':
d  e  v2 v3 v4
   d  e  v3 v4
      d  e  v4
         d  e
            d
if uplo = 'L':
d
e  d
v1 e  d
v1 v2 e  d
v1 v2 v3 e  d
where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector
defining H(i).
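The storage pattern illustrated above generalizes to any n. The following Python sketch is for illustration only; exit_layout is a hypothetical helper, and it assumes that for uplo = 'U' the entries of column j above the superdiagonal belong to the reflector vector v(j-1), while for uplo = 'L' the entries of column j below the subdiagonal belong to v(j).

```python
def exit_layout(n, uplo):
    """Return an n-by-n grid of labels describing sub(A) on exit."""
    upper = uplo.upper() == "U"
    rows = []
    for i in range(1, n + 1):
        row = []
        for j in range(1, n + 1):
            # lo/hi are the smaller and larger index within the stored triangle.
            lo, hi = (i, j) if upper else (j, i)
            if i == j:
                row.append("d")        # diagonal element of T
            elif hi == lo + 1:
                row.append("e")        # off-diagonal element of T
            elif hi > lo + 1:
                # Reflector entries: column j holds v(j-1) for 'U', v(j) for 'L'.
                row.append("v%d" % (j - 1 if upper else j))
            else:
                row.append("")         # triangle that is not referenced
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in exit_layout(5, "L"):
        print(" ".join("%-2s" % label for label in row).rstrip())
```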
Alignment requirements
The distributed submatrix sub( A ) must satisfy certain alignment properties; namely, the following expression
must be true:
( mb_a = nb_a and IROFFA = ICOFFA and IROFFA = 0 ) with IROFFA = mod( ia-1, mb_a), and ICOFFA =
mod( ja-1, nb_a ).
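The alignment condition can be checked before calling the routine. The following Python sketch is for illustration only; syntrd_aligned is a hypothetical helper that evaluates the expression exactly as stated, with ia and ja as 1-based global indices.

```python
def syntrd_aligned(ia, ja, mb_a, nb_a):
    """Evaluate (mb_a = nb_a and IROFFA = ICOFFA and IROFFA = 0)."""
    iroffa = (ia - 1) % mb_a
    icoffa = (ja - 1) % nb_a
    return mb_a == nb_a and iroffa == icoffa and iroffa == 0


if __name__ == "__main__":
    print(syntrd_aligned(1, 1, 64, 64))
    print(syntrd_aligned(2, 2, 64, 64))
```

In other words, the blocks must be square and sub( A ) must start on a block boundary in both dimensions.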
p?sytrd
Reduces a symmetric matrix to real symmetric
tridiagonal form by an orthogonal similarity
transformation.
Syntax
call pssytrd(uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
call pdsytrd(uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
Include Files
Description
The p?sytrd routine reduces a real symmetric matrix sub(A) to symmetric tridiagonal form T by an
orthogonal similarity transformation:
Q'*sub(A)*Q = T,
where sub(A) = A(ia:ia+n-1,ja:ja+n-1).
Input Parameters
a (local)
REAL for pssytrd
DOUBLE PRECISION for pdsytrd.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
On entry, this array contains the local pieces of the symmetric distributed
matrix sub(A).
If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular part of the matrix, and its strictly lower triangular part
is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular part of the matrix, and its strictly upper triangular part
is not referenced. See Application Notes below.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A, respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
work (local)
REAL for pssytrd
DOUBLE PRECISION for pdsytrd.
Workspace array of size lwork.
Output Parameters
a On exit, if uplo = 'U', the diagonal and first superdiagonal of sub(A) are
overwritten by the corresponding elements of the tridiagonal matrix T, and
the elements above the first superdiagonal, with the array tau, represent
the orthogonal matrix Q as a product of elementary reflectors; if uplo =
'L', the diagonal and first subdiagonal of sub(A) are overwritten by the
corresponding elements of the tridiagonal matrix T, and the elements below
the first subdiagonal, with the array tau, represent the orthogonal matrix Q
as a product of elementary reflectors. See Application Notes below.
d (local)
REAL for pssytrd
DOUBLE PRECISION for pdsytrd.
Array of size LOCc(ja+n-1). The diagonal elements of the tridiagonal
matrix T:
d(i)= A(i,i).
d is tied to the distributed matrix A.
e (local)
REAL for pssytrd
DOUBLE PRECISION for pdsytrd.
Array of size LOCc(ja+n-1) if uplo = 'U', LOCc(ja+n-2) otherwise.
tau (local)
REAL for pssytrd
DOUBLE PRECISION for pdsytrd.
Array of size LOCc(ja+n-1). This array contains the scalar factors of the
elementary reflectors. tau is tied to the distributed matrix A.
work(1) On exit work(1) contains the minimum value of lwork required for
optimum performance.
Application Notes
If uplo = 'U', the matrix Q is represented as a product of elementary reflectors
The contents of sub(A) on exit are illustrated by the following examples with n = 5:
If uplo = 'U':
If uplo = 'L':
where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector
defining H(i).
See Also
Overview of ScaLAPACK Ro