Standard Error of R
Timo Gnambs
Author Note
I have no conflicts of interest to disclose. The study was not preregistered.
Version: 2023-07-14
STANDARD ERROR OF CORRELATION 2
Abstract
Sample correlations are biased indicators of the respective population correlations.
Moreover, there seems to be some uncertainty on how to properly calculate the standard error
of r, for which different estimators have been previously introduced. This note aims to briefly summarize 10 different ways to
calculate the standard error of the Pearson correlation. Moreover, a simulation study on the
accuracy of these estimators compared their relative percentage biases for different population
correlations and sample sizes. The results showed that all estimators were largely unbiased for
sample sizes of at least 40. For smaller samples, a simple approximation by Bonett (2008) led
to the least biased results. Based on these results, it is recommended to use the expression
(1 − r²)/√(N − 3) for the calculation of the standard error of the Pearson correlation.
The Pearson correlation coefficient is a popular statistic in descriptive and exploratory research but is also used as a measure of effect size to facilitate
interpretations and comparisons of results between studies. Therefore, it also plays a pivotal
role in research syntheses including quantitative meta-analyses (e.g., Cheung, 2015; Hafdahl,
2008). Although it is well known among statisticians that the sample correlation r represents a
biased estimator of the population correlation ρ (e.g., De Winter et al., 2016; Hedges, 1989;
Olkin & Pratt, 1958; Shieh, 2010; Zimmerman et al., 2003), applied researchers seldom adopt
unbiased estimators of ρ because the bias in r is often considered increasingly negligible with
larger sample sizes (Shieh, 2010). Moreover, there seems to be some uncertainty regarding
the calculation of the standard error of r that is often used in meta-analytic investigations as
precision weights and, thus, determines the contribution of each correlation coefficient to the
pooled estimate (e.g., Bonett, 2008; Hafdahl, 2008). Because ρ follows a rather complex
sampling distribution, the standard error is rarely calculated from the exact distribution.
Rather, various large-sample approximations have been suggested in the literature (e.g.,
Bonett, 2008; Hedges, 1989; Hotelling, 1953). However, if these approximations represent
biased estimators of standard errors, their use in meta-analyses might involuntarily distort
effect size estimations and interpretations. Therefore, this brief note reviews different
approaches on how to calculate the standard error of r and demonstrates their biases for different population correlations and sample sizes.
Let X = {x1, x2, …, xN} and Y = {y1, y2, …, yN} represent two bivariate normally distributed
variables with population means {μX, μY}, variances {σX², σY²}, and correlation ρ that are
observed on a sample of N cases. The sample correlation is then given by
rXY = Σᵢ₌₁ᴺ (xᵢ − μ̂X)(yᵢ − μ̂Y) / √( Σᵢ₌₁ᴺ (xᵢ − μ̂X)² · Σᵢ₌₁ᴺ (yᵢ − μ̂Y)² )    (1)
with μ̂𝑋 and μ̂𝑌 as the sample means of X and Y. The density probability distribution of r has
been derived by Hotelling (1951, 1953) based on prior work by Fisher (1915, 1921). Figure 1
displays the sampling distributions for two correlations, ρ = .20 and ρ = .80. Whereas the former represents a small
to medium effect that is typically observed in different areas of psychology (see Bosco et al.,
2015; Gignac & Szodorai, 2016; Lovakov & Agadullina, 2021), the latter reflects a rather
large effect that is often limited to specific domains such as competence research (e.g.,
Gnambs & Lockl, 2023). The distributions in Figure 1 highlight that the sample correlations r
follow a rather asymmetrical shape. In the present case, the modes of these distributions are
larger than the respective ρ, thus resulting in a negative skew. Generally, the skew is more
pronounced for larger |ρ| because correlations are bounded at -1 and +1. In contrast, smaller
|ρ| result in less skewed distributions. Moreover, the shape of the sampling distribution is
strongly affected by the sample size: the asymmetry is stronger for small sample sizes, while
larger samples result in more symmetric distributions.
Because of its skewed sampling distribution, the sample correlation r is a biased estimator
of the population correlation ρ (e.g., Hedges, 1989; Olkin & Pratt, 1958). As pointed out by
Zimmerman and colleagues (2003), this bias can reach up to .03 or .04 in many applied
situations. Therefore, Olkin and Pratt (1958) derived an estimator of ρ from (2) that corrects
for this bias:

ρ̂ = r · ₂F₁(1/2, 1/2; (N − 2)/2; 1 − r²) ≈ r + r · (1 − r²) / (2 · (N − 4))    (3)
with ≈ indicating an approximation that is accurate within ± .001 for N ≥ 18. Monte Carlo
simulations confirmed that the approximate estimator in (3) is largely unbiased for different
sample sizes and population correlations, whereas r tends to underestimate ρ, particularly for
medium- to large-sized correlations and small sample sizes (Shieh, 2010). However, r tends to
exhibit a higher precision for |ρ| < .60, as reflected by the mean squared error. Therefore, in
these cases, the sample correlation r might, despite its bias, serve as a meaningful estimator of ρ.
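As a concrete illustration, Equation (1) and the Olkin-Pratt correction in (3) can be sketched in a few lines of Python. This is a minimal sketch assuming NumPy and SciPy are available; the function names are my own, not part of any cited software.

```python
import numpy as np
from scipy.special import hyp2f1

def pearson_r(x, y):
    """Sample correlation r_XY as defined in Equation (1)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the sample mean of X
    dy = y - y.mean()  # deviations from the sample mean of Y
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

def rho_olkin_pratt(r, n, exact=True):
    """Olkin-Pratt correction of Equation (3): the exact version uses the
    hypergeometric function 2F1; the approximation adds r(1 - r^2)/(2(n - 4))."""
    if exact:
        return r * hyp2f1(0.5, 0.5, (n - 2) / 2, 1 - r**2)
    return r + r * (1 - r**2) / (2 * (n - 4))
```

Consistent with the accuracy claim above, the exact and approximate versions agree within about ±.001 for N ≥ 18.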
The sampling variance of r was derived by Hotelling (1951, 1953) using the moments of r
Consequently, the standard error of r is σr = √σr². Because (4) requires integrating over the
hypergeometric function, several large-sample approximations have been suggested in the literature that often take the general form
σr² ≈ (1 − r²)² / df · O(n)    (5)
with df representing the degrees of freedom and O(n) a series of terms of the order n that
approximates the integral over the hypergeometric function in (4). For n → ∞, the estimator in
(5) converges to the simple form without the O(n) term. It was later noticed that the original
expression referred to a rather special case where the variances of X and Y are known
(see Pearson & Filon, 1898), and the expression was therefore revised to df = N as in (7). However,
(7) still does not acknowledge the estimated means in (1) because it implies that the means of the
two variables are known, that is, μ̂X = μX and μ̂Y = μY (Olkin & Pratt, 1958). Soper (1913)
instead suggested df = N − 1 as in (8). Moreover, simulation studies led
to Bonett’s (2008) suggestion of df = N – 3 in (9). The formulas in (7) to (9) are frequently
used in applied research because they are easy to calculate and give good approximations of
(4). However, they are biased to some degree because they ignore O(n) in the approximation of (4).
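Ignoring the O(n) term, the approximations in (7) to (9) differ only in their degrees of freedom. A minimal sketch (the function name and interface are my own):

```python
import math

def se_r_approx(r, n, df):
    """Approximate standard error of r per Equations (7)-(9):
    sqrt((1 - r^2)^2 / df) with the O(n) term ignored.
    df = n for (7, Pearson & Filon), n - 1 for (8, Soper),
    and n - 3 for (9, Bonett)."""
    return (1.0 - r**2) / math.sqrt(df)

r, n = 0.30, 40
for label, df in (("(7) df=N", n), ("(8) df=N-1", n - 1), ("(9) df=N-3", n - 3)):
    print(label, round(se_r_approx(r, n, df), 4))
```

Smaller degrees of freedom yield a larger standard error, which counteracts the negative bias of (7) and (8) in small samples.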
Several authors tried to give analytic solutions for O(n) and different n. Soper (1913)
derived O(1) for df = N and df = N − 1, resulting in (10) and (11). The latter was later extended
to higher-order terms in (12), and Ghosh (1966) presented an approximation of O(6), resulting in (13). The formulas in (10) to (13) are expected to provide
better estimators of (4) because of their closer approximation of the sampling distribution of r.
However, they are hardly used anymore today because researchers trying to improve the
accuracy of the estimated standard errors as compared to (7) to (9) can easily do so by directly
evaluating the integral in (4) using modern optimization routines implemented in standard
statistical software.
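The direct evaluation just described can be sketched as follows. Equation (4) itself is not reproduced in this excerpt, so the sketch instead integrates Hotelling's (1953) sampling density of r numerically, which serves the same role; the kernel drops the constant gamma-function factors because they cancel after numerical normalization, and all names are my own.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import hyp2f1

def density_kernel(r, rho, n):
    """Hotelling's (1953) density of r, up to a constant factor that
    is removed by the numerical normalization below."""
    return ((1 - r**2) ** ((n - 4) / 2)
            * (1 - rho * r) ** (1.5 - n)
            * hyp2f1(0.5, 0.5, n - 0.5, (1 + rho * r) / 2))

def exact_se_r(rho, n):
    """Standard error of r from the exact sampling distribution,
    obtained by numerical quadrature (the role played by Equation (4))."""
    f = lambda r: density_kernel(r, rho, n)
    norm = quad(f, -1, 1)[0]
    m1 = quad(lambda r: r * f(r), -1, 1)[0] / norm
    m2 = quad(lambda r: r**2 * f(r), -1, 1)[0] / norm
    return np.sqrt(m2 - m1**2)
```

For N = 10 this should reproduce the "true standard error" rows of Table S1, for example about .333 at ρ = 0 and about .146 at ρ = .80.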
Alternatively, Hedges (1989) suggested an estimator based on the assumption σ²(ρᵢ) = σ²(ρ) + σ²(eᵢ) derived from classical test theory. Here,
the variance σ²(ρᵢ) in study i is decomposed into the variance σ²(ρ) of the distribution from
which the study-specific population correlations ρᵢ are sampled and the sampling error variance σ²(eᵢ).
Consequently, an estimator of the sampling error variance σ²(eᵢ) = σ²(ρᵢ) − σ²(ρ) can be
obtained by using unbiased estimates of ρ and of the squared correlation ρ² as
derived by Olkin and Pratt (1958). This results in the approximate and exact expressions in (14) and (15).
Finally, it might be helpful to illuminate a potential misconception arising from the fact
that the bivariate correlation can also be expressed in terms of the standardized linear
regression coefficient. The studentized regression coefficient t = (b − β)/σb, with b and β as the
estimated and true regression weights, respectively, and σb as the standard error of b, follows a
t distribution with df = N − 2. Under the null hypothesis of β = 0, this reduces to
t = b/σb = r · (√((1 − R²)/(N − 2)))⁻¹ with R² as the coefficient of determination (see Pugh &
Winslow, 1966, for the detailed derivation). Because for a single predictor R² = r², it might be
tempting to mistake the term √((1 − r²)/(N − 2)) for the standard error of r. However, this is
only true for the special case when ρ = 0, whereas this expression leads to increasingly biased
estimators of σr for larger |ρ|.
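The distinction is easy to verify numerically. The small sketch below (function names are my own) contrasts the t-statistic term with Bonett's (2008) approximation of σr:

```python
import math

def term_from_t(r, n):
    """sqrt((1 - r^2)/(n - 2)): the denominator of the t statistic,
    often mistaken for the standard error of r."""
    return math.sqrt((1 - r**2) / (n - 2))

def se_bonett(r, n):
    """Bonett's (2008) approximation (1 - r^2)/sqrt(n - 3)."""
    return (1 - r**2) / math.sqrt(n - 3)

n = 30
for r in (0.0, 0.4, 0.8):
    print(r, round(term_from_t(r, n), 3), round(se_bonett(r, n), 3))
```

At r = 0 the two terms nearly coincide, but at r = .90 the t-based term is roughly 2.25 times as large, consistent with the relative bias of up to 125% reported in the Results.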
As highlighted in Table 1, different approximations have been suggested for the estimation
of the standard error of r. However, little is known about which of these estimators might
yield noteworthy benefits in substantive research. Therefore, the accuracy and efficiency of
these estimators were compared for different population correlations and sample sizes.
Methods
The comparison varied the population correlations between ρ = .00 and .90 (in increments
of .10) and sample sizes N from 10 to 50 (in increments of 10) and 100. For each condition,
the standard error of r was calculated using (7) in Pearson and Filon (1898), (8) in Soper (1913), (9) in Bonett (2008), (4) using the integral by Hotelling (1953)
with ρ̂ = r, or (15) in Hedges (1989). Moreover, the standard error of a regression
coefficient was also derived to demonstrate its inadequacy for the correlation coefficient. The
approximations in (10) to (13) were not considered because they are rarely used in practice
and, more importantly, are superseded by the direct evaluation of the integral in (4). The
performance of these estimators was compared using the relative percentage bias (Hoogland
& Boomsma, 1998), which is given as %Bias(σ̂r, σr) = E(σ̂r − σr)/σr · 100. Values of
%Bias less than 5% were considered negligible. The true standard error σr was calculated
following (4) using the population correlation ρ from the data-generating process. Moreover,
the efficiency of the estimators was studied using the root mean squared error (RMSE). An
estimator might be more biased but at the same time also more efficient if it has a smaller
variance.¹ The bias and RMSE were computed by numerical integration using adaptive
Simpson's quadrature with respect to r, that is, Bias(σ̂r, σr) = ∫₋₁¹ (σ̂r − σr) · f(r|ρ, N) dr
and RMSE²(σ̂r, σr) = ∫₋₁¹ (σ̂r − σr)² · f(r|ρ, N) dr.
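The %Bias computation can be illustrated with a small Monte Carlo stand-in. The paper itself evaluates the expectation by quadrature over the exact density; the simulation below (with names of my own choosing) merely mimics that computation for the two simplest approximations.

```python
import numpy as np

def percent_bias_mc(rho, n, reps=20_000, seed=1):
    """Monte Carlo analogue of %Bias = E(se_hat - se_r)/se_r * 100 for
    the approximations with df = N (7) and df = N - 3 (9)."""
    rng = np.random.default_rng(seed)
    # draw reps bivariate normal samples of size n with correlation rho
    x = rng.standard_normal((reps, n))
    y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal((reps, n))
    dx = x - x.mean(axis=1, keepdims=True)
    dy = y - y.mean(axis=1, keepdims=True)
    rs = (dx * dy).sum(axis=1) / np.sqrt((dx**2).sum(axis=1) * (dy**2).sum(axis=1))
    se_true = rs.std()  # Monte Carlo stand-in for the exact sigma_r
    out = {}
    for label, df in (("df=N", n), ("df=N-3", n - 3)):
        out[label] = (np.mean((1 - rs**2) / np.sqrt(df)) - se_true) / se_true * 100
    return out
```

For N = 10 and ρ = .20 this reproduces the pattern reported below: a marked negative bias for df = N and a near-zero bias for df = N − 3.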
Results
The relative bias of the different estimators of σr for different sample sizes is summarized
in Figure 2. These results show little differences between the compared estimators for sample
sizes of N = 40 or larger. Except for the standard error of the regression coefficient, the
approximations of the standard error of the correlation yielded largely unbiased estimates. In
smaller samples, however, pronounced differences between the estimators were
observed. Estimators using N (Pearson & Filon, 1898) or N − 1 (Soper, 1913) as degrees of
freedom resulted in negative relative biases that increased for larger population correlations.
In contrast, the estimator by Bonett (2008) with N – 3 degrees of freedom resulted in unbiased
estimates of the standard error across a wide range of correlations; only for very large
correlations was a slight negative bias observed. Similarly, an evaluation of the integral in (4)
led to comparably unbiased estimates for sample sizes of at least 20. However, in extremely
small samples it was less precise than the approximation by Bonett (2008), presumably
because Equation (4) uses the estimate of the population correlation multiple times, which is rather unstable in small samples.
¹ The simulation only considered correlations ρ ≥ 0 because the bias for negative correlations is simply the opposite of the bias for positive correlations, that is, Bias(σ̂r, −σr) = −Bias(σ̂r, σr), while the root mean squared error is identical in both cases, that is, RMSE(σ̂r, −σr) = RMSE(σ̂r, σr).
The estimator by Hedges (1989) was only unbiased for correlations up to about .60; larger correlations resulted in a
slightly negative bias. These results also emphasize that, independent of the sample size, the
standard error of the regression coefficient yielded unbiased results only for population
correlations close to 0. For larger correlations, the relative percentage bias increased up to
125%. Thus, mistakenly using this standard error as an indicator of precision might result in distorted meta-analytic results.
The root mean squared error resulted in only marginal differences between the compared
estimators (see Figure 3). For samples of N = 50 or N = 100, the RMSE fell below .001 in all
conditions. Although it was slightly larger for smaller sample sizes, the RMSE did not indicate
pronouncedly different efficiencies of the studied estimators. Except for N = 10, the different
estimators resulted in comparable RMSEs. But again, the standard error of the regression
coefficient was less efficient at larger population correlations. Together, these results indicate
that the simple approximation by Bonett (2008) results in the least biased estimates of the
standard error for different values of ρ and sample sizes. More complex estimators that either
require the evaluation of the integral in (4) or rely on the hypergeometric function such as
(15), do not seem to provide noteworthy benefits for the accuracy of the standard error.
Conclusion
Despite its popularity in applied research, the distributional properties of the sample
correlation are often not well understood or simply neglected. Particularly, the calculation of
the standard error of the Pearson correlation remains challenging because of the complex
sampling distribution of r which does not give a simple analytical solution. Therefore, various
approximations are currently used in substantive research. Although some of the more
complex expressions in Table 1 have only historical value today because the integration of
complex functions became substantially easier with modern computers, it was unclear
whether the choice between the simpler approaches might matter. Therefore, the simulation
evaluated the accuracy of these estimators for different population correlations and sample
sizes. The respective findings suggest that the least biased estimator used the expression
(1 − r²)/√(N − 3) as proposed by Bonett (2008). Particularly in small
samples, this estimator should result in more precise standard errors. However, it needs to be
emphasized that differences between estimators become negligible as soon as sample sizes
increase. At typical sample sizes in psychology, which often exceed 50 or 100 respondents, the
choice of the estimator is unlikely to matter substantially. Thus, the estimator in (8) that is
currently often used in practice should usually also be acceptable in many applied situations.
However, further research is needed to identify specific conditions under which biased
standard errors might distort substantive conclusions, particularly in small samples.
Declarations
Author contributions
TG conceived the study, conducted the analyses, and wrote the manuscript.
Funding
Conflicts of Interest
The author has no competing interests to declare that are relevant to the content of this
article.
Not applicable.
Code availability
References
Bosco, F. A., Aguinis, H., Singh, K., Field, J. G., & Pierce, C. A. (2015). Correlational effect
size benchmarks. Journal of Applied Psychology, 100(2), 431-449.
https://doi.org/10.1037/a0038047
De Winter, J. C., Gosling, S. D., & Potter, J. (2016). Comparing the Pearson and Spearman
correlation coefficients across distributions and sample sizes: A tutorial using simulations
and empirical data. Psychological Methods, 21(3), 273-290.
https://doi.org/10.1037/met0000079
Fisher, R. A. (1915). Frequency distribution of the values of the correlation coefficient in
samples from an indefinitely large population. Biometrika, 10(4), 507-521.
https://doi.org/10.2307/2331838
Gignac, G. E., & Szodorai, E. T. (2016). Effect size guidelines for individual differences
researchers. Personality and Individual Differences, 102, 74-78.
https://doi.org/10.1016/j.paid.2016.06.069
Gnambs, T., & Lockl, K. (2023). Bidirectional effects between reading and mathematics
https://doi.org/10.1007/s11618-022-01108-w
https://doi.org/10.3102/1076998607309472
Hedges, L. V. (1989). An unbiased correction for sampling error in validity generalization
studies. Journal of Applied Psychology, 74(3), 469-477. https://doi.org/10.1037/0021-
9010.74.3.469
Hoogland, J. J., & Boomsma, A. (1998). Robustness studies in covariance structure modeling:
An overview and a meta-analysis. Sociological Methods & Research, 26(3), 329-367.
https://doi.org/10.1177/0049124198026003003
Hotelling, H. (1953). New light on the correlation coefficient and its transforms. Journal of
the Royal Statistical Society: Series B, 15(2), 193-232.
https://doi.org/10.1111/j.2517-6161.1953.tb00135.x
Lovakov, A., & Agadullina, E. R. (2021). Empirically derived guidelines for effect size
interpretation in social psychology. European Journal of Social Psychology, 51(3), 485-
504. https://doi.org/10.1002/ejsp.2752
Olkin, I., & Pratt, J. W. (1958). Unbiased estimation of certain correlation coefficients. Annals
of Mathematical Statistics, 29(1), 201-211.
Pearson, K., & Filon, L. N. G. (1898). Mathematical contributions to the theory of evolution.
IV. On the probable errors of frequency constants and on the influence of random selection
on variation and correlation. Proceedings of the Royal Society of London, 62, 173-176.
https://doi.org/10.1098/rspl.1897.0091
Pugh, E. M., & Winslow, G. H. (1966). The Analysis of Physical Measurement. Addison-
Wesley.
Zimmerman, D. W., Zumbo, B. D., & Williams, R. H. (2003). Bias in estimation and
hypothesis testing of correlation. Psicológica, 24(1), 133-158.
² Ghosh (1966) presented an approximation to the order of 6. Because this resulted in rather complex terms, for ease of presentation (13) reports only the first ones.
³ Hedges' (1989) approximation of q (Equation 14) seems to mistakenly use a value of 3 in the denominator instead of 4, thus adopting N as the degrees of freedom, whereas the degrees of freedom was N − 1 in the remainder of the paper. Moreover, the approximation of Q given in Equation 19 seems to be incorrect as well.
Figure 1
Sampling Distributions for Small and Large Correlations at Different Sample Sizes
Figure 2
Relative Percentage Bias of Different Estimators of the Standard Error of the Pearson Correlation
Note. Dashed lines represent relative biases of 5%. Relative biases exceeding 20% are not presented.
Figure 3
Root Mean Squared Error of Different Estimators of the Standard Error of the Pearson Correlation
Supplemental Material for
A Brief Note on the Standard Error of the Pearson Correlation
In addition to the relative bias discussed in the main text, Table S1 also summarizes the
̂𝑟 , σ𝑟 ) = 𝐸(σ
absolute bias, 𝐵𝑖𝑎𝑠(σ ̂𝑟 − σ𝑟 ). These results corroborate the basic pattern already
discussed for the relative bias, with more precise standard errors for the estimators in (4) and
Table S1
Population correlation ρ
Method Eq.
.00 .10 .20 .30 .40 .50 .60 .70 .80 .90
N = 10
True standard error (4) .333 .331 .323 .310 .292 .267 .235 .196 .146 .083
Pearson & Filon (1898) (7) -.052 -.052 -.051 -.050 -.048 -.046 -.042 -.037 -.030 -.019
Soper (1913) (8) -.037 -.037 -.037 -.036 -.035 -.034 -.032 -.029 -.024 -.015
Bonett (2008) (9) .003 .002 .002 .001 -.001 -.003 -.005 -.006 -.007 -.006
Hotelling (1953) (4) -.030 -.029 -.028 -.025 -.022 -.018 -.013 -.007 -.004 .003
Hedges (1989) (15) .013 .012 .009 .003 -.003 -.011 -.018 -.024 -.026 -.020
Regression -.001 .000 .003 .008 .015 .024 .035 .048 .061 .069
N = 20
True standard error (4) .229 .227 .221 .211 .197 .178 .154 .125 .091 .049
Pearson & Filon (1898) (7) -.017 -.017 -.017 -.017 -.016 -.015 -.014 -.012 -.009 -.005
Soper (1913) (8) -.012 -.012 -.012 -.012 -.011 -.011 -.010 -.009 -.007 -.004
Bonett (2008) (9) .001 .000 .000 .000 -.001 -.001 -.002 -.002 -.002 -.002
Hotelling (1953) (4) -.011 -.010 -.010 -.009 -.007 -.005 -.003 -.001 .000 .001
Hedges (1989) (15) .006 .006 .004 .002 -.001 -.003 -.006 -.008 -.008 -.006
Regression .000 .001 .003 .008 .014 .022 .031 .040 .049 .053
N = 30
True standard error (4) .186 .184 .179 .170 .158 .142 .123 .099 .071 .038
Pearson & Filon (1898) (7) -.009 -.009 -.009 -.009 -.009 -.008 -.007 -.006 -.005 -.003
Soper (1913) (8) -.006 -.006 -.006 -.006 -.006 -.006 -.005 -.005 -.004 -.002
Bonett (2008) (9) .000 .000 .000 .000 -.001 -.001 -.001 -.001 -.001 -.001
Hotelling (1953) (4) -.006 -.006 -.006 -.005 -.005 -.004 -.003 -.002 .000 .001
Hedges (1989) (15) .003 .003 .002 .001 .000 -.002 -.003 -.004 -.004 -.003
Regression .000 .001 .003 .007 .012 .019 .027 .035 .042 .044
N = 40
True standard error (4) .160 .159 .154 .147 .136 .122 .105 .085 .060 .032
Pearson & Filon (1898) (7) -.006 -.006 -.006 -.006 -.006 -.005 -.005 -.004 -.003 -.002
Soper (1913) (8) -.004 -.004 -.004 -.004 -.004 -.004 -.003 -.003 -.002 -.001
Bonett (2008) (9) .000 .000 .000 .000 .000 -.001 -.001 -.001 -.001 -.001
Hotelling (1953) (4) -.004 -.004 -.003 -.003 -.002 -.002 -.001 .000 .000 .000
Hedges (1989) (15) .002 .002 .001 .001 .000 -.001 -.002 -.003 -.003 -.002
Regression .000 .001 .003 .006 .011 .017 .024 .031 .037 .038
N = 50
True standard error (4) .143 .142 .137 .131 .121 .109 .093 .075 .053 .028
Pearson & Filon (1898) (7) -.004 -.004 -.004 -.004 -.004 -.004 -.003 -.003 -.002 -.001
Soper (1913) (8) -.003 -.003 -.003 -.003 -.003 -.003 -.002 -.002 -.001 -.001
Bonett (2008) (9) .000 .000 .000 .000 .000 .000 .000 -.001 .000 .000
Hotelling (1953) (4) -.003 -.003 -.002 -.002 -.002 -.001 -.001 .000 .000 .000
Hedges (1989) (15) .001 .001 .001 .000 .000 -.001 -.001 -.002 -.002 -.001
Regression .000 .001 .003 .006 .010 .015 .021 .028 .033 .034
N = 100
True standard error (4) .101 .100 .097 .092 .085 .076 .065 .052 .037 .020
Pearson & Filon (1898) (7) -.002 -.002 -.001 -.001 -.001 -.001 -.001 -.001 -.001 .000
Soper (1913) (8) -.001 -.001 -.001 -.001 -.001 -.001 -.001 -.001 -.001 .000
Bonett (2008) (9) .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
Hotelling (1953) (4) -.001 -.001 -.001 -.001 .000 .000 .000 .000 .000 .000
Hedges (1989) (15) .000 .000 .000 .000 .000 .000 -.001 -.001 -.001 .000
Regression .000 .000 .002 .004 .007 .011 .016 .020 .024 .024