Correlation Analysis
nag_correl contains procedures that calculate the correlation coefficients for a set of
data values.
Contents
Procedures
nag_prod_mom_correl
Calculates the variance-covariance matrix and the Pearson product-moment
correlation coefficients for a set of data
nag_part_correl
Calculates the partial variance-covariance matrix and the partial correlation matrix
from a correlation or variance-covariance matrix
Examples
Example 1: Calculation of correlation and partial correlation coefficients
References
nag_prod_mom_correl
1 Description
Given a set of n observations of l variables this procedure computes the (optionally weighted) means and
sums of squares and cross-products of the deviations about the means. The variance-covariance matrix,
the standard deviations and the Pearson product-moment correlation matrix are then computed from
these basic results.
Note: all the output arguments of this procedure are optional. However, at least one output argument
must be present in every call statement.
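The quantities named in this description can be sketched directly in Fortran 90 for the unweighted case. This is only an illustration under stated assumptions, not the library's implementation: the program name, the data values and the local array names are hypothetical, all weights are taken as 1, and the variance-covariance matrix is assumed to use the divisor n − 1.

```fortran
PROGRAM prod_mom_sketch
  ! Illustrative only: unweighted statistics computed directly, without NAG.
  IMPLICIT NONE
  INTEGER, PARAMETER :: wp = KIND(1.0D0)
  INTEGER, PARAMETER :: n = 4, m = 2      ! hypothetical problem size
  REAL (wp) :: x(n,m), dev(n,m), mean(m), sd(m)
  REAL (wp) :: sscp(m,m), cov(m,m), correl(m,m)
  INTEGER :: j, k

  ! Hypothetical data: rows are observations, columns are variables
  x(:,1) = (/ 1.0_wp, 2.0_wp, 3.0_wp, 4.0_wp /)
  x(:,2) = (/ 2.0_wp, 1.0_wp, 4.0_wp, 3.0_wp /)

  ! Means and deviations about the means
  mean = SUM(x,DIM=1)/REAL(n,wp)
  DO j = 1, m
    dev(:,j) = x(:,j) - mean(j)
  END DO

  ! Sums of squares and cross-products of the deviations
  sscp = MATMUL(TRANSPOSE(dev),dev)

  ! Variance-covariance matrix (assumed divisor n - 1) and standard deviations
  cov = sscp/REAL(n-1,wp)
  DO j = 1, m
    sd(j) = SQRT(cov(j,j))
  END DO

  ! Pearson product-moment correlation matrix
  DO k = 1, m
    DO j = 1, m
      correl(j,k) = sscp(j,k)/SQRT(sscp(j,j)*sscp(k,k))
    END DO
  END DO

  WRITE (*,'(1X,A,2F10.4)') 'means      :', mean
  WRITE (*,'(1X,A,2F10.4)') 'std devs   :', sd
  WRITE (*,'(1X,A,F10.4)') 'correl(1,2):', correl(1,2)
END PROGRAM prod_mom_sketch
```

For these four observations both means are 2.5 and the correlation between the two variables is 0.6.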
2 Usage
USE nag_correl
CALL nag_prod_mom_correl(data [, optional arguments])
3 Arguments
Note. All array arguments are assumed-shape arrays. The extent in each dimension must be exactly that required by
the problem. Notation such as ‘x(n)’ is used in the argument descriptions to specify that the array x must have exactly n
elements.
This procedure derives the values of the following problem parameters from the shape of the supplied
arrays.
n > 1 — the number of observations in the data matrix
m ≥ 1 — the number of variables in the data matrix
l ≥ 1 — the number of variables actually included in the calculations. If the optional argument
var_in_correl is not present then l = m, otherwise l = COUNT(var_in_correl)
4 Error Codes
Fatal errors (error%level = 3):
error%code Description
301 An input argument has an invalid value.
302 An array argument has an invalid shape.
5 Examples of Usage
A complete example of the use of this procedure appears in Example 1 of this module document.
6 Further Comments
6.1 Mathematical Background
Let xij denote the ith observation for the jth variable and wi denote the weight for the ith observation.
The output statistics are then defined as follows.
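In their usual weighted form the definitions can be written as below. This is a sketch under assumptions rather than the library's own statement: the symbols c, v, s and r are introduced here for illustration, and the divisor for the variance-covariance matrix is assumed to be the sum of the weights minus one.

```latex
\[
\bar{x}_j = \frac{\sum_{i=1}^{n} w_i x_{ij}}{\sum_{i=1}^{n} w_i}, \qquad
c_{jk} = \sum_{i=1}^{n} w_i (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k),
\]
\[
v_{jk} = \frac{c_{jk}}{\sum_{i=1}^{n} w_i - 1}, \qquad
s_j = \sqrt{v_{jj}}, \qquad
r_{jk} = \frac{c_{jk}}{\sqrt{c_{jj}\, c_{kk}}}.
\]
```

Here the means are denoted by x̄, the sums of squares and cross-products by c, the variance-covariance matrix by v, the standard deviations by s and the correlations by r. With all weights equal to 1 the divisor reduces to the familiar n − 1.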
nag_part_correl
1 Description
nag_part_correl calculates the partial correlation matrix and the partial variance-covariance matrix
from a correlation or variance-covariance matrix.
In general, let a set of variables be partitioned into two groups Y and X with ny variables in Y and nx
variables in X and let the variance-covariance matrix of all ny + nx variables be partitioned into
\[
\begin{pmatrix} \Sigma_{xx} & \Sigma_{yx} \\ \Sigma_{xy} & \Sigma_{yy} \end{pmatrix}.
\]
The partial variance-covariance of Y conditional on fixed values of the X variables is given by:
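In its standard form this is the Schur complement of the X block. The sketch below assumes that Σxy denotes the ny by nx block in the lower left of the partition above, so that Σyx is its transpose; the symbols Σy|x, D and Ry|x are introduced here for illustration.

```latex
\[
\Sigma_{y|x} = \Sigma_{yy} - \Sigma_{xy}\, \Sigma_{xx}^{-1}\, \Sigma_{yx},
\]
\[
R_{y|x} = D^{-1/2}\, \Sigma_{y|x}\, D^{-1/2}, \qquad
D = \operatorname{diag}\left(\Sigma_{y|x}\right).
\]
```

The second line rescales the partial variance-covariance matrix by its own diagonal, which is the usual way of obtaining a partial correlation matrix with unit diagonal.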
2 Usage
USE nag_correl
CALL nag_part_correl(cov, var_in_x, part_correl [, optional arguments])
3 Arguments
Note. All array arguments are assumed-shape arrays. The extent in each dimension must be exactly that required by
the problem. Notation such as ‘x(n)’ is used in the argument descriptions to specify that the array x must have exactly n
elements.
This procedure derives the values of the following problem parameters from the shape of the supplied
arrays.
m ≥ 3 — the number of variables in the variance-covariance matrix or correlation matrix
ny ≥ 2 — the number of Y variables. If the optional argument var_in_model is present then ny =
COUNT((.NOT. var_in_x) .AND. var_in_model), otherwise ny = COUNT(.NOT. var_in_x)
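The way ny follows from the logical masks can be checked with a few lines of Fortran 90. The masks below are hypothetical values for m = 5 variables, and the program and variable names ny_all and ny_model are introduced only for this sketch.

```fortran
PROGRAM count_sketch
  IMPLICIT NONE
  ! Hypothetical masks for m = 5 variables
  LOGICAL :: var_in_x(5), var_in_model(5)
  INTEGER :: ny_all, ny_model

  var_in_x     = (/ .FALSE., .FALSE., .FALSE., .FALSE., .TRUE. /)
  var_in_model = (/ .TRUE., .TRUE., .FALSE., .TRUE., .TRUE. /)

  ! var_in_model absent: every variable not in X is a Y variable
  ny_all = COUNT( .NOT. var_in_x)

  ! var_in_model present: a Y variable must also be in the model
  ny_model = COUNT(( .NOT. var_in_x) .AND. var_in_model)

  WRITE (*,'(1X,A,I2)') 'ny without var_in_model:', ny_all
  WRITE (*,'(1X,A,I2)') 'ny with var_in_model   :', ny_model
END PROGRAM count_sketch
```

With these masks ny is 4 when var_in_model is absent and 3 when it is present, since variable 3 is excluded from the model.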
4 Error Codes
Fatal errors (error%level = 3):
error%code Description
301 An input argument has an invalid value.
302 An array argument has an invalid shape.
303 Array arguments have inconsistent shapes.
320 The procedure was unable to allocate enough memory.
5 Examples of Usage
A complete example of the use of this procedure appears in Example 1 of this module document.
6 Further Comments
6.1 Mathematical Background
Partial correlation can be used to explore the association between pairs of random variables in the
presence of other variables. For three variables, y1, y2 and x3, the partial correlation coefficient between
y1 and y2 given x3 is computed as
\[
\frac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}},
\]
where rij is the product-moment correlation coefficient between variables with subscripts i and j. The
partial correlation coefficient is a measure of the linear association between y1 and y2 having eliminated
the effect due to both y1 and y2 being linearly associated with x3. That is, it is a measure of association
between y1 and y2 conditional upon fixed values of x3. Like the full correlation coefficients, the partial
correlation coefficient takes a value in the range [−1, 1], with the value 0 indicating no linear association.
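The three-variable formula is easy to evaluate by hand or in a few lines of Fortran 90. The sketch below is not part of the library; the function name part_correl3 and the correlation values are hypothetical, chosen so that the arithmetic is simple.

```fortran
PROGRAM part_correl3_sketch
  IMPLICIT NONE
  INTEGER, PARAMETER :: wp = KIND(1.0D0)
  REAL (wp) :: r12, r13, r23

  ! Hypothetical full correlations between y1, y2 and x3
  r12 = 0.72_wp
  r13 = 0.60_wp
  r23 = 0.80_wp

  WRITE (*,'(1X,A,F8.4)') 'partial correlation of y1 and y2 given x3:', &
    part_correl3(r12,r13,r23)

CONTAINS

  FUNCTION part_correl3(r12,r13,r23) RESULT (r)
    ! Partial correlation of variables 1 and 2 given variable 3
    REAL (wp), INTENT (IN) :: r12, r13, r23
    REAL (wp) :: r

    r = (r12-r13*r23)/SQRT((1.0_wp-r13**2)*(1.0_wp-r23**2))
  END FUNCTION part_correl3

END PROGRAM part_correl3_sketch
```

Here the numerator is 0.72 − 0.48 = 0.24 and the denominator is the square root of 0.64 × 0.36, that is 0.48, so the partial correlation is 0.5. Much of the raw association of 0.72 between y1 and y2 is therefore attributable to their common association with x3.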
To test the hypothesis that a partial correlation is zero, under the assumption that the data has an
approximately Normal distribution, a test similar to the test for the full correlation coefficient can be
used. If r is the computed partial correlation coefficient then the appropriate t statistic is
\[
r \sqrt{\frac{n - n_x - 2}{1 - r^2}},
\]
which has approximately a Student’s t-distribution with n − nx − 2 degrees of freedom, where n is the
number of observations from which the full correlation coefficients were computed, and nx is the number
of variables included as X variables (see argument var_in_x).
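As a worked illustration of the test statistic, the following Fortran 90 sketch evaluates it for hypothetical values: n = 20 observations, nx = 1 conditioning variable and a partial correlation of r = 0.5. The program and variable names are assumptions made for this example, and no distribution function is called; the result would still have to be compared with Student's t tables or a suitable library routine.

```fortran
PROGRAM part_correl_t_sketch
  IMPLICIT NONE
  INTEGER, PARAMETER :: wp = KIND(1.0D0)
  INTEGER :: n, nx, df
  REAL (wp) :: r, t

  ! Hypothetical inputs
  n = 20          ! observations behind the full correlations
  nx = 1          ! number of X (conditioning) variables
  r = 0.5_wp      ! computed partial correlation coefficient

  ! Degrees of freedom and t statistic
  df = n - nx - 2
  t = r*SQRT(REAL(df,wp)/(1.0_wp-r**2))

  WRITE (*,'(1X,A,I4)') 'degrees of freedom:', df
  WRITE (*,'(1X,A,F8.4)') 't statistic       :', t
END PROGRAM part_correl_t_sketch
```

For these inputs there are 17 degrees of freedom and the statistic is about 2.38.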
Example 1: Calculation of correlation and partial correlation coefficients
1 Program Text
Note. The listing of the example program presented below is double precision. Single precision users are referred to
Section 5.2 of the Essential Introduction for further information.
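PROGRAM nag_correl_ex01
  ! .. Use Statements ..
  USE nag_examples_io, ONLY : nag_std_out, nag_std_in
  USE nag_correl, ONLY : nag_prod_mom_correl, nag_part_correl
  USE nag_write_mat, ONLY : nag_write_tri_mat
  ! .. Implicit None Statement ..
  IMPLICIT NONE
  ! .. Intrinsic Functions ..
  INTRINSIC KIND
  ! .. Parameters ..
  INTEGER, PARAMETER :: wp = KIND(1.0D0)
  ! .. Local Scalars ..
  INTEGER :: i, k, m, n, nx
  ! .. Local Arrays ..
  INTEGER, ALLOCATABLE :: index_x(:)
  REAL (wp), ALLOCATABLE :: correl(:,:), data(:,:), part_correl(:,:)
  LOGICAL, ALLOCATABLE :: var_in_x(:)
  CHARACTER (5), ALLOCATABLE :: label(:)
  ! .. Executable Statements ..
  WRITE (nag_std_out,*) 'Example Program Results for nag_correl_ex01'
  READ (nag_std_in,*) ! Skip heading in data file
  ! Read the problem sizes; array shapes are assumed from the Arguments sections
  READ (nag_std_in,*) m, n
  ALLOCATE (data(n,m),correl(m,m),var_in_x(m))
  ! Read data
  READ (nag_std_in,*) (data(i,:),i=1,n)
  ! Read the indexes of the X variables and build the logical mask
  READ (nag_std_in,*) nx
  ALLOCATE (index_x(nx),part_correl(m-nx,m-nx),label(m-nx))
  READ (nag_std_in,*) index_x
  var_in_x = .FALSE.
  var_in_x(index_x) = .TRUE.
  WRITE (nag_std_out,*)
  CALL nag_prod_mom_correl(data,correl=correl)
  CALL nag_part_correl(correl,var_in_x,part_correl=part_correl)
  ! Build a label for each Y variable (those not in X)
  k = 0
  DO i = 1, m
    IF ( .NOT. var_in_x(i)) THEN
      k = k + 1
      WRITE (label(k),'(a2,i2,a)') 'x(', i, ')'
    END IF
  END DO
  WRITE (nag_std_out,*)
  ! Print the upper triangle of the partial correlation matrix, row by row.
  ! Plain WRITE statements stand in for nag_write_tri_mat here (an assumption).
  DO i = 1, m - nx
    WRITE (nag_std_out,'(1X,A5,10F9.4)') label(i), part_correl(i,i:)
  END DO
  DEALLOCATE (data,correl,var_in_x,index_x,part_correl,label)
END PROGRAM nag_correl_ex01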
2 Program Data
Example Program Data for nag_correl_ex01
5 20 : m, n
11.25 48.9 7.43 2.270 15.48
10.87 47.7 7.45 1.971 14.97
11.18 48.2 7.44 1.979 14.20
10.62 49.0 7.38 2.026 15.02
11.02 47.4 7.43 1.974 12.92
10.83 48.3 7.72 2.124 13.58
11.18 49.3 7.05 2.064 14.12
11.05 48.2 6.95 2.001 15.34
11.15 49.1 7.12 2.035 14.52
11.23 48.6 7.28 1.970 15.25
10.94 49.9 7.45 1.974 15.34
11.18 49.0 7.34 1.942 14.48
11.02 48.2 7.29 2.063 12.92
10.99 47.8 7.37 1.973 13.61
11.03 48.9 7.45 1.974 14.20
11.09 48.8 7.08 2.039 14.51
11.46 51.2 6.75 2.008 16.07
11.57 49.8 7.00 1.944 16.60
11.07 47.9 7.04 1.947 13.41
10.89 49.6 7.07 1.798 15.84 : data
1 : nx number of x variables
5 : indexes of x variables
3 Program Results
Example Program Results for nag_correl_ex01
Additional Examples
Not all example programs supplied with NAG fl90 appear in full in this module document. The following
additional examples, associated with this module, are available.
References
[1] Morrison D F (1967) Multivariate Statistical Methods. McGraw-Hill.
[2] West D H D (1979) Updating mean and variance estimates: an improved method. Comm. ACM 22, 532–535.