Pearson's goodness of fit statistic as a score test statistic
Gordon K. Smyth

To appear in Science and Statistics: A Festschrift for Terry Speed, D. R. Goldstein (ed.), IMS Lecture Notes – Monograph Series, Volume 40, Institute of Mathematical Statistics, Hayward, California, March 2003.
Abstract. For any generalized linear model, the Pearson goodness of fit statistic is the score test statistic for testing the current model against the saturated model. The relationship between the Pearson statistic and the residual deviance is therefore the relationship between the score test and the likelihood ratio test statistics, and this clarifies the role of the Pearson statistic in generalized linear models. The result is extended to cases in which there are multiple response observations for the same combination of explanatory variables.

Keywords. Pearson statistic; score test; chi-square statistic; generalized linear model; exponential family nonlinear model; saturated model.
1 Introduction
Goodness of fit tests go back at least to Pearson's (1900) article [17] establishing the asymptotic chi-square distribution for a goodness of fit statistic for the multinomial distribution. Pearson's chi-square statistic includes the test for independence in two-way contingency tables. It has been extended in generalized linear model theory to a test for the adequacy of the current fitted model. Given a generalized linear model with responses $y_i$, weights $w_i$, fitted means $\hat\mu_i$, variance function $v(\mu)$ and dispersion $\phi = 1$, the Pearson goodness of fit statistic is

$$X^2 = \sum_i \frac{w_i (y_i - \hat\mu_i)^2}{v(\hat\mu_i)}$$

[14]. If the fitted model is correct and the observations $y_i$ are approximately normal, then $X^2$ is approximately distributed as $\chi^2$ on the residual degrees of freedom for the model.
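As a concrete illustration (not from the paper), the following minimal sketch simulates a Poisson log-linear model with unit weights and unit dispersion, so that $v(\mu) = \mu$, fits it by Fisher scoring, and evaluates $X^2$. The simulated data and the plain scoring loop are illustrative assumptions.

```python
# Minimal sketch: Pearson X^2 for a Poisson log-linear model (w_i = 1, phi = 1).
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 2
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])
y = rng.poisson(np.exp(X @ np.array([1.0, 0.5])))

# Fisher scoring / IRLS for the Poisson log-linear model.
beta = np.zeros(p)
for _ in range(25):
    mu = np.exp(X @ beta)
    W = mu                          # working weights for the log link
    z = X @ beta + (y - mu) / mu    # working response
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))

mu_hat = np.exp(X @ beta)
X2 = np.sum((y - mu_hat) ** 2 / mu_hat)   # Pearson X^2 with v(mu) = mu
print(X2, "compared with chi^2 on", n - p, "df if the model fits")
```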
The Pearson goodness of fit statistic $X^2$ is one of two goodness of fit tests in routine use in generalized linear models, the other being the residual deviance. The residual deviance is the log-likelihood ratio statistic for testing the fitted model against the saturated model in which there is a regression coefficient for every observation. The Pearson statistic is a quadratic form alternative to the residual deviance, and is often preferred over the residual deviance because of its moment estimator character. The expected value of the Pearson statistic depends only on the first two moments of the distribution of the $y_i$, and in this sense the Pearson statistic is robust against misspecification of the response distribution.

The score test, like the likelihood ratio test, is a general asymptotic parametric test associated with the likelihood function [22]. Score tests are often simpler than likelihood ratio tests because the statistic requires parameter estimators to be obtained only under the null hypothesis. For this reason score tests have been proposed frequently in generalized linear model contexts to test for various sorts of model complications such as overdispersion [5] [3] [7] [24] [13] [19], zero inflation [8], adequacy of the link function [20] [9], or extra terms in the fitted model [21] [4] [1] [2] [26] [19].

While the residual deviance arises from a general inferential principle, namely the likelihood ratio test, the origin of the Pearson statistic has seemed more ad hoc. Several authors have noted that score tests for extra terms in the linear predictor give rise to chi-square statistics, but there has been no result for the residual Pearson statistic itself. Pregibon [21] shows, by using one-step estimators, that the score statistic for extra terms in the linear predictor can be expressed as a difference between two chi-square statistics, just as the likelihood ratio test can be obtained as the difference between two residual deviances. Cox and Hinkley [6, Examples 9.17 and 9.21] show that the simplest Pearson statistic, the goodness of fit statistic for the multinomial distribution, can be derived as a score statistic.

This article shows that Cox and Hinkley's result for the multinomial extends to all generalized linear models. The Pearson goodness of fit statistic is itself a score test statistic, testing the current model against the saturated model. The relationship between the Pearson statistic and the residual deviance is therefore the relationship between the score test and the likelihood ratio test statistics, and this clarifies the role of the Pearson statistic in generalized linear models.

The result of this article extends to several more general situations. It extends to data sets with multiple counts in categories and to generalizations of exponential family models, such as overdispersion models, for which there are extra parameters in the variance function. It includes as special cases, for example, the results on tests for independence in two-way contingency tables of Thall [26] and Paul and Banerjee [19]. The general proofs given here are simpler and more transparent than the special case proofs for contingency tables. Finally, the results given here do not require link-linearity as in generalized linear models, but apply to any exponential family nonlinear regression model.

The theory of score tests is reviewed briefly in Section 2, and the background material on exponential dispersion models and generalized linear models is given in Section 3.
2 Score tests
This section briefly summarizes the theory of likelihood score tests. Further background on score tests and likelihood ratio tests can be found in Rao [23, pages 417–418] and Cox and Hinkley [6, Section 9.3].

Let $\ell(y; \theta_1, \theta_2)$ be a log-likelihood function depending on a response vector $y$ and parameter vectors $\theta_1$ and $\theta_2$. We wish to test the composite hypothesis $H_0\colon \theta_2 = 0$ against the alternative that $\theta_2$ is unrestricted. The components of $\theta_1$ are so-called nuisance parameters because they are not of interest in the test, but values must be estimated for them for a test statistic to be computed. The likelihood score vectors for $\theta_1$ and $\theta_2$ are the partial derivatives

$$\dot\ell_1 = \frac{\partial \ell}{\partial \theta_1} \quad\text{and}\quad \dot\ell_2 = \frac{\partial \ell}{\partial \theta_2}.$$
The expected or Fisher information matrix is $I = E(\dot\ell\,\dot\ell^T)$, which is partitioned conformally with $\dot\ell = (\dot\ell_1^T, \dot\ell_2^T)^T$ as

$$I = \begin{pmatrix} I_{11} & I_{12} \\ I_{21} & I_{22} \end{pmatrix}.$$

The score test statistic is based on the fact that the score vector has mean zero and covariance matrix $I$. If the nuisance vector $\theta_1$ is known, then the score test statistic of $H_0$ is $Z = I_{22}^{-T/2} \dot\ell_2$, where $I_{22}^{1/2}$ stands for any factor such that $I_{22}^{T/2} I_{22}^{1/2} = I_{22}$, or equivalently

$$S = Z^T Z = \dot\ell_2^T I_{22}^{-1} \dot\ell_2$$
with $\dot\ell_2$ and $I_{22}$ evaluated at $\theta_2 = 0$. The score vector is a sum of terms corresponding to individual observations and so is asymptotically normal under standard regularity conditions. It follows that $Z$ is asymptotically a standard normal $p_2$-vector
under the null hypothesis $H_0$, and that $S$ is asymptotically chi-square distributed on $p_2$ degrees of freedom, where $p_2$ is the dimension of $\theta_2$.

If the nuisance parameters are not known, then the score test substitutes for them their maximum likelihood estimators $\hat\theta_1$ under the null hypothesis. Setting $\theta_1 = \hat\theta_1$ is equivalent to setting $\dot\ell_1 = 0$, so we need the asymptotic distribution of $\dot\ell_2$ conditional on $\dot\ell_1 = 0$, which is normal with mean zero and covariance matrix
$$I_{2.1} = I_{22} - I_{21} I_{11}^{-1} I_{12}.$$
The score test statistic is then

$$S = \dot\ell_2^T I_{2.1}^{-1} \dot\ell_2$$

with $\dot\ell_2$ and $I_{2.1}$ evaluated at $\theta_1 = \hat\theta_1$ and $\theta_2 = 0$. If $I_{12} = 0$ then $\theta_1$ and $\theta_2$ are said to be orthogonal. In that case, $\dot\ell_1$ and $\dot\ell_2$ are independent and $I_{2.1} = I_{22}$, meaning that the information matrix $I_{22}$ does not need to be adjusted for estimation of $\theta_1$.

Neyman [15] and Neyman and Scott [16] show that the asymptotic distribution and efficiency of the score statistic $S$ are unchanged if an estimator other than the maximum likelihood estimator is used for the nuisance parameters, provided that the estimator is consistent with convergence rate at least $O(n^{-1/2})$, where $n$ is the number of observations. They show that we can substitute into $S$ any estimator $\tilde\theta_1$ of $\theta_1$ for which $\sqrt{n}\,|\tilde\theta_1 - \theta_1|$ is bounded in probability as $n \to \infty$. In that case they rename the score statistic the $C(\alpha)$ test statistic.
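As a computational summary of this section, the following sketch evaluates $S$ from the score subvector and partitioned information. How $\dot\ell_2$ and the $I_{jk}$ blocks are obtained at $\theta_1 = \hat\theta_1$, $\theta_2 = 0$ is assumed supplied by the model at hand.

```python
# Sketch of the score (C(alpha)) statistic S = l2' I_{2.1}^{-1} l2.
import numpy as np

def score_statistic(l2, I11, I12, I22):
    """Return S = l2' (I22 - I21 I11^{-1} I12)^{-1} l2, with I21 = I12'."""
    I2_1 = I22 - I12.T @ np.linalg.solve(I11, I12)  # adjusted information I_{2.1}
    return float(l2 @ np.linalg.solve(I2_1, l2))

# Under H0, S is asymptotically chi-square on p2 = len(l2) degrees of freedom.
```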
3 Generalized linear models

Generalized linear models assume that observations are distributed according to a linear exponential family with an additional dispersion parameter. The density or probability mass function for each response is assumed to be of the form

$$f(y; \theta, \phi) = a(y, \phi) \exp[\{y\theta - \kappa(\theta)\}/\phi] \tag{1}$$
where $a$ and $\kappa$ are suitable known functions. The mean is $\mu = \dot\kappa(\theta)$ and the variance is $\phi\,\ddot\kappa(\theta)$. The mean $\mu$ and the canonical parameter $\theta$ are one-to-one functions of one another. We call $\phi$ the dispersion parameter and $v(\mu) = \ddot\kappa(\theta)$ the variance function. Following Jørgensen [10, 12], we call the distribution described by (1) an exponential dispersion model and denote it $ED(\mu, \phi)$.

If $y_1, \ldots, y_n$ are independently distributed as $ED(\mu, \phi)$ then the sample mean $\bar y$ is sufficient for $\mu$ and $\bar y \sim ED(\mu, \phi/n)$. More generally, if $y_i \sim ED(\mu, \phi/w_i)$ where the $w_i$ are known weights, then the weighted mean $\bar y_w$ is sufficient for $\mu$ and

$$\bar y_w = \frac{\sum_{i=1}^n w_i y_i}{\sum_{i=1}^n w_i} \sim ED\left(\mu,\; \frac{\phi}{\sum_{i=1}^n w_i}\right).$$
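As a standard special case of the form (1), the Poisson distribution with mean $\mu$ has

$$f(y; \theta) = \frac{1}{y!}\exp\{y\theta - e^\theta\}, \qquad \theta = \log\mu, \quad \kappa(\theta) = e^\theta, \quad \phi = 1,$$

so that $\mu = \dot\kappa(\theta) = e^\theta$ and $v(\mu) = \ddot\kappa(\theta) = e^\theta = \mu$; that is, the Poisson distribution is $ED(\mu, 1)$ with variance function $v(\mu) = \mu$.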
In a generalized linear model, the means $\mu_i$ of the responses are assumed to follow the link-linear model

$$g(\mu_i) = x_i^T \beta \tag{2}$$

where $g$ is a known monotonic link function, $x_i$ is a vector of covariates and $\beta$ is an unknown vector of regression coefficients. Without loss of generality we will assume that the $n \times p$ matrix $X$ with rows $x_i^T$ is of full column rank and that $p < n$, where $p$ is the dimension of $\beta$.

More generally, we will consider generalized nonlinear models in which the mean vector $\mu = (\mu_1, \ldots, \mu_n)^T$ is a general $n$-dimensional function of the $p$-vector $\beta$. To ensure that the parametrization is not degenerate, we will assume that the gradient matrix $\partial\mu/\partial\beta$ is of full column rank, at least in a neighborhood containing the true value of $\beta$ and the maximum likelihood estimate $\hat\beta$.

In this article we mainly consider models in which the dispersion is known, $\phi = 1$ say. Most models with discrete responses have known dispersion.
4 Goodness of fit

Let $\Omega$ be the locus of possible values for $\mu$, $\Omega = \{\mu(\beta) : \beta \in \mathbb{R}^p\}$. Let $H_0$ be the null hypothesis that $\mu$ belongs to $\Omega$ and let $H_a$ be the alternative hypothesis that $\mu$ is unrestricted. The goodness of fit test for the current model tests $H_0$ against $H_a$. For a generalized linear model, $H_0$ is the hypothesis that the $\mu_i$ are described by the link-linear model (2).

Theorem 1 The score statistic for the goodness of fit test of a generalized nonlinear model with unit dispersion is the Pearson chi-square statistic
$$S = \sum_{i=1}^n \frac{w_i (y_i - \hat\mu_i)^2}{v(\hat\mu_i)}$$
where $\hat\mu_i$ is the expected value $\mu_i$ evaluated at the maximum likelihood estimator $\hat\beta$.

Proof. There exists a parameter vector $\theta_2$ of dimension $n - p$ such that $(\beta^T, \theta_2^T)^T$ is a one-to-one transformation of $\mu$ in the neighborhood of interest and such that $\theta_2 = 0$ if and only if $\mu \in \Omega$. The goodness of fit test consists of testing $H_0\colon \theta_2 = 0$ against the alternative that $\theta_2$ is unrestricted. The components of the original parameter vector $\beta$ are the nuisance parameters for this test. In the generalized linear model case, the implicit parameter vector $\theta_2$ can be constructed by finding an $n \times (n - p)$ matrix $X_2$ such that $(X, X_2)$ is of full rank. Then $H_a$ is the saturated model that $g(\mu) = X\beta + X_2\theta_2$ for some $\beta$ and some $\theta_2$.
Let $\dot\ell_1$ and $\dot\ell_2$ be the score vectors for $\beta$ and $\theta_2$ respectively, and let $I$ be the Fisher information matrix, partitioned into $I_{11}$, $I_{12}$ and $I_{22}$ as in Section 2. The Fisher information for $\theta_2$ adjusted for estimation of $\beta$ is $I_{2.1}$, and the score statistic for testing $H_0$ is

$$S = \dot\ell_2^T I_{2.1}^{-1} \dot\ell_2$$

where $\dot\ell_2$ and $I_{2.1}$ are evaluated at $\beta = \hat\beta$ and $\theta_2 = 0$.
Let $V = \mathrm{diag}\{v(\mu_i)/w_i\}$ and write $e = V^{-1/2}(y - \mu)$ for the vector of Pearson residuals. Also write

$$U_1 = V^{-1/2} \frac{\partial\mu}{\partial\beta} \quad\text{and}\quad U_2 = V^{-1/2} \frac{\partial\mu}{\partial\theta_2}.$$

It is straightforward to show that the score vectors are given by $\dot\ell_j = U_j^T e$ for $j = 1, 2$, and the information matrices are given by

$$I_{jk} = U_j^T U_k$$

for $j, k = 1, 2$ [25] [27].

Write $P_1$ for the matrix $P_1 = U_1 (U_1^T U_1)^{-1} U_1^T$ of the orthogonal projection onto the column space of $U_1$. Also write $U_{2.1} = (I - P_1) U_2$ and $P_{2.1}$ for the matrix

$$P_{2.1} = U_{2.1} (U_{2.1}^T U_{2.1})^{-1} U_{2.1}^T$$

of the orthogonal projection onto the column space of $U_{2.1}$. Then $P_1$ and $P_{2.1}$ project onto orthogonal subspaces and $P_1 + P_{2.1} = I$, since the dimensions of the subspaces add to $n$. We can rewrite

$$I_{2.1} = U_2^T U_2 - U_2^T U_1 (U_1^T U_1)^{-1} U_1^T U_2 = U_2^T (I - P_1) U_2 = U_{2.1}^T U_{2.1}.$$

We can also rewrite

$$\dot\ell_2 = (U_2^T - U_2^T P_1) e = U_{2.1}^T e$$

because evaluating at $\beta = \hat\beta$ ensures that $U_1^T e = 0$ and hence $P_1 e = 0$. This shows that the score statistic is

$$S = e^T U_{2.1} (U_{2.1}^T U_{2.1})^{-1} U_{2.1}^T e = e^T P_{2.1} e = e^T (P_1 + P_{2.1}) e = e^T e = \sum_{i=1}^n \frac{w_i (y_i - \hat\mu_i)^2}{v(\hat\mu_i)},$$

which is the Pearson statistic. $\Box$
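The algebra above can be checked numerically. The following sketch, under assumed simulated Poisson data with the log link, constructs $U_1$, an arbitrary full-rank completion playing the role of $U_2$, and the projections, and confirms that $e^T P_{2.1} e$ equals the Pearson statistic $e^T e$. The data and the particular completion are illustrative choices.

```python
# Numerical check of Theorem 1 for a Poisson log-linear model (w_i = 1).
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 2
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])
y = rng.poisson(np.exp(X @ np.array([1.0, 0.5])))

# Fit by Fisher scoring, as in the earlier sketch.
beta = np.zeros(p)
for _ in range(25):
    mu = np.exp(X @ beta)
    z = X @ beta + (y - mu) / mu
    beta = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
mu = np.exp(X @ beta)

V_isqrt = 1 / np.sqrt(mu)               # V^{-1/2}, since v(mu) = mu and w_i = 1
e = V_isqrt * (y - mu)                  # Pearson residuals
U1 = V_isqrt[:, None] * (mu[:, None] * X)   # V^{-1/2} dmu/dbeta for the log link
# Complete U1 to full rank; the orthogonal complement plays the role of U_{2.1}.
Q, _ = np.linalg.qr(np.column_stack([U1, rng.standard_normal((n, n - p))]))
U2 = Q[:, p:]
P1 = U1 @ np.linalg.solve(U1.T @ U1, U1.T)
U21 = (np.eye(n) - P1) @ U2
S = e @ U21 @ np.linalg.solve(U21.T @ U21, U21.T @ e)
print(S, np.sum(e ** 2))                # agree: score statistic = Pearson X^2
```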
The result extends immediately to the case in which there are multiple response observations for the same combination of explanatory variables.

Corollary 1 Suppose that the $y_{ij} \sim ED(\mu_i, 1/w_{ij})$, $i = 1, \ldots, n$, $j = 1, \ldots, n_i$, are independent, where the $\mu_i$ follow a generalized nonlinear model with unit dispersion. The score statistic for the goodness of fit test is

$$S = \sum_{i=1}^n \frac{w_i (\bar y_{w_i} - \hat\mu_i)^2}{v(\hat\mu_i)}$$

where

$$w_i = \sum_{j=1}^{n_i} w_{ij} \quad\text{and}\quad \bar y_{w_i} = \frac{1}{w_i} \sum_{j=1}^{n_i} w_{ij} y_{ij}.$$
Proof. The weighted means $\bar y_{w_i}$ are sufficient for the $\mu_i$, and $\bar y_{w_i} \sim ED(\mu_i, 1/w_i)$. The $\bar y_{w_i}$ are distributed as for the $y_i$ but with weights $w_i$, so the result follows immediately from Theorem 1. $\Box$
Example. Suppose that the $y_{ij}$ are binary responses and that $w_{ij} = 1$ for all $i$ and $j$. Then

$$S = \sum_{i=1}^n \frac{n_i (r_i - \hat p_i)^2}{v(\hat p_i)}$$
where $r_i$ is the empirical proportion for the $i$th covariate-combination group, $\hat p_i$ is the estimated probability that $y_{ij} = 1$, and $v(\hat p_i) = \hat p_i (1 - \hat p_i)$. If $y_{i\cdot} = \sum_{j=1}^{n_i} y_{ij}$ is the number of successes for the $i$th group, then the $y_{i\cdot}$ are binomial$(n_i, p_i)$ and
$$S = \sum_{i=1}^n \frac{(y_{i\cdot} - \hat\mu_i)^2}{v_i(\hat\mu_i)}$$
with $\mu_i = n_i p_i$ and $v_i(\mu_i) = \mu_i (n_i - \mu_i)/n_i$. This is the Pearson goodness of fit statistic for the data summarized in the usual generalized linear model way as binomial counts for each covariate-combination group.

Example. Paul and Banerjee [19] derive the score test for interaction in a two-way contingency table with multiple counts in each cell. Corollary 1 includes Paul and Banerjee's Theorem 1 as a special case.
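A quick numerical sketch of the binary-data example, with illustrative group sizes, success counts, and fitted probabilities (assumed to come from some fitted model), confirms that the proportion form and the count form of the statistic agree:

```python
# Pearson statistic for grouped binary data: proportion form vs count form.
import numpy as np

n_i = np.array([10, 12, 8, 15])             # observations per covariate group
y_i = np.array([4, 7, 2, 9])                # successes per group
p_hat = np.array([0.35, 0.55, 0.30, 0.60])  # fitted probabilities (assumed given)

r_i = y_i / n_i                             # empirical proportions
S_props = np.sum(n_i * (r_i - p_hat) ** 2 / (p_hat * (1 - p_hat)))

mu_i = n_i * p_hat                          # binomial means
v_i = mu_i * (n_i - mu_i) / n_i             # binomial variance function
S_counts = np.sum((y_i - mu_i) ** 2 / v_i)

print(S_props, S_counts)                    # identical
```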
5 Extra variance parameters

Suppose now that there are extra parameters which affect the variance of the $y_i$, but not the mean, and which are outside the exponential dispersion model framework. Let $\gamma$ be the vector of extra parameters and let $G$ be the parameter space for $\gamma$. Suppose that for each fixed value of $\gamma$, the $y_i$ follow a generalized nonlinear model with variance function $v(\mu; \gamma)$. The values of $\gamma$ effectively index a class of generalized nonlinear models. This setup arises frequently when extra parameters are introduced to accommodate overdispersion in generalized linear models [1] [2] [7] [19].

It is straightforward to show that $\beta$ and $\gamma$ are orthogonal parameters. This follows because $\dot\ell_1 = (\partial\mu/\partial\beta)^T V^{-1}(y - \mu)$ and $\mu$ does not depend on $\gamma$. Therefore the cross derivative $\partial\dot\ell_1/\partial\gamma$ will be linear in $y$ and will have expectation zero. Orthogonality of $\beta$ and $\gamma$ implies that estimation of $\gamma$ does not affect the form of the score statistics for goodness of fit. According to the theory of $C(\alpha)$ tests, $\gamma$ may be replaced in the score test statistics by any estimator which is $O(n^{-1/2})$ consistent without changing the distributional properties of $S$ to first order. This gives the following theorem.

Theorem 2 Suppose that for each $\gamma \in G$, the $y_i \sim ED(\mu_i, 1/w_i)$, $i = 1, \ldots, n$, are independent with variance function $v(\mu; \gamma)$. The $C(\alpha)$ goodness of fit statistic is
$$S = \sum_{i=1}^n \frac{w_i (y_i - \hat\mu_i)^2}{v(\hat\mu_i; \hat\gamma)}.$$
Corollary 2 With multiple observations $y_{ij} \sim ED(\mu_i, 1/w_{ij})$ as in Corollary 1, the $C(\alpha)$ goodness of fit statistic is

$$S = \sum_{i=1}^n \frac{w_i (\bar y_{w_i} - \hat\mu_i)^2}{v(\hat\mu_i; \hat\gamma)}$$
where $\hat\gamma$ is any $\sqrt{n}$-consistent estimator of $\gamma$, $\hat\mu_i$ is the maximum likelihood estimator of $\mu_i$ given $\gamma = \hat\gamma$, the $w_i$ are sums of weights and the $\bar y_{w_i}$ are weighted means as before. The proofs of Theorem 2 and the corollary are similar to the proofs of Theorem 1 and Corollary 1.

Example. Suppose that $y_{ij}$ follows a negative binomial distribution with mean $\mu_i$ and variance function $v(\mu; c) = \mu + c\mu^2$, $i = 1, \ldots, n$, $j = 1, \ldots, n_i$, for each $c \geq 0$. Suppose that the $\mu_i$ are a function of a vector of regression parameters. For fixed values of $c$, the means $\bar y_i$ are sufficient for the $\mu_i$ and are negative binomial with the same variance function and weights $n_i$. The $C(\alpha)$ goodness of fit statistic therefore is

$$S = \sum_{i=1}^n \frac{n_i (\bar y_i - \hat\mu_i)^2}{\hat\mu_i + \hat c\, \hat\mu_i^2}$$

where $\hat c$ is a $\sqrt{n}$-consistent estimator of $c$ and $\hat\mu_i$ is the maximum likelihood estimator of $\mu_i$ with $c = \hat c$. This includes Theorem 3 of Paul and Banerjee [19].

One possible estimator for $\gamma$ is the maximum likelihood estimator. An alternative estimation method is to solve $S = n - p$ with respect to $\gamma$. This estimator is often preferred in overdispersion contexts because it is evidently a consistent estimator based only on the first and second moments of the $y_i$, and therefore has a quasi-likelihood flavor (Breslow [2]). Obviously the score statistic $S$ is no longer useful as a goodness of fit statistic if $\gamma$ is estimated by either of the above methods. If there are repeat observations for covariate combinations, then an estimate of $\gamma$ may instead be obtained from the pure error or within-covariate-combination variability. In this approach, $\gamma$ can be estimated by solving

$$\sum_{i=1}^n \sum_{j=1}^{n_i} \frac{w_{ij} (y_{ij} - \bar y_{w_i})^2}{v(\bar y_{w_i}; \gamma)} = \sum_{i=1}^n (n_i - 1).$$
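The estimating equation above can be solved with one-dimensional root finding. The following sketch does this for the negative binomial variance function $v(\mu; c) = \mu + c\mu^2$ with $w_{ij} = 1$; the data and the bracketing interval passed to `brentq` are illustrative assumptions.

```python
# Sketch: pure-error estimation of c in v(mu; c) = mu + c*mu^2 (w_ij = 1).
import numpy as np
from scipy.optimize import brentq

groups = [np.array([3, 7, 5, 9, 4, 8]),       # repeat observations for
          np.array([2, 6, 10, 5, 7]),         # three covariate combinations
          np.array([12, 4, 6, 9, 8, 15, 9])]  # (illustrative data)

def pure_error_equation(c):
    # sum_ij (y_ij - ybar_i)^2 / v(ybar_i; c)  minus  sum_i (n_i - 1)
    lhs = sum(np.sum((y - y.mean()) ** 2) / (y.mean() + c * y.mean() ** 2)
              for y in groups)
    return lhs - sum(len(y) - 1 for y in groups)

c_hat = brentq(pure_error_equation, 0.0, 1.0)  # root on an assumed bracket
print(c_hat)
```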
6 Unknown dispersion

All the above results have assumed that $\phi = 1$. If $\phi$ is unknown, then both $\dot\ell$ and $I$ are divided by $\phi$ and the score statistic for goodness of fit for a generalized nonlinear model becomes

$$S = \frac{1}{\phi} \sum_{i=1}^n \frac{w_i (y_i - \hat\mu_i)^2}{v(\hat\mu_i)}.$$

The appearance of the unknown scale parameter $\phi$ in $S$ means that the statistic is no longer useful for judging goodness of fit. The statistic leads instead, by equating $S$ to its expectation, to the so-called Pearson estimator of $\phi$,

$$\hat\phi = \frac{1}{n - p} \sum_{i=1}^n \frac{w_i (y_i - \hat\mu_i)^2}{v(\hat\mu_i)},$$
which is the default estimator of $\phi$ in the generalized linear model functions of the statistical programs S-Plus and R. Other estimators of $\phi$ are discussed by Jørgensen [11].

When there are repeat observations, the difference between the full version of the score statistic in Theorem 1 and the reduced form in Corollary 1 can be used to define a pure error estimate of the dispersion parameter $\phi$,

$$\hat\phi_{\mathrm{pure}} = \frac{1}{\sum_{i=1}^n (n_i - 1)} \sum_{i=1}^n \sum_{j=1}^{n_i} \frac{w_{ij} (y_{ij} - \bar y_{w_i})^2}{v(\bar y_{w_i})}.$$
In the case of normal linear regression, this is the well known pure error estimator of the variance. With the use of this estimator, the score statistic recovers its use as a goodness of fit statistic, but now as a generalized $F$-statistic rather than chi-square. Substituting the pure error estimator into the score test for the reduced data gives

$$F = \frac{1}{(n - p)\,\hat\phi_{\mathrm{pure}}} \sum_{i=1}^n \frac{w_i (\bar y_{w_i} - \hat\mu_i)^2}{v(\hat\mu_i)}.$$

If the $y_{ij}$ are approximately normal, then $F$ follows approximately an $F$-distribution on $n - p$ and $\sum_{i=1}^n (n_i - 1)$ degrees of freedom under the null hypothesis. This is asymptotically true, for example, as the weights $w_{ij} \to \infty$ or the dispersion $\phi \to 0$, because any exponential dispersion model $ED(\mu, \phi)$ tends to normality as $\phi \to 0$ [11, 12]. The $F$ statistic above is a generalization of the normal theory equivalents, described for example by Weisberg [28, Section 4.3].
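The following sketch assembles the pure error estimator and the $F$-statistic for grouped data with $w_{ij} = 1$, so that $w_i = n_i$. The fitted means, the variance function, and the parameter count $p$ are assumed supplied by a fit of the current model.

```python
# Sketch of the pure-error dispersion estimate and the generalized F-statistic
# for grouped data with w_ij = 1 (so w_i = n_i).
import numpy as np

def pure_error_F(groups, mu_hat, v, p):
    """groups: list of arrays of repeat observations y_ij, one array per
    covariate combination; mu_hat: fitted means, one per group;
    v: variance function; p: number of regression parameters.
    Returns (phi_pure, F)."""
    n = len(groups)
    n_i = np.array([len(y) for y in groups])
    ybar = np.array([y.mean() for y in groups])
    # Pure error estimate of phi from within-group variability.
    phi_pure = sum(np.sum((y - y.mean()) ** 2) / v(y.mean()) for y in groups) \
               / np.sum(n_i - 1)
    # Score statistic for the group means, scaled by the pure error estimate;
    # approximately F(n - p, sum(n_i - 1)) under the null hypothesis.
    F = np.sum(n_i * (ybar - mu_hat) ** 2 / v(mu_hat)) / ((n - p) * phi_pure)
    return phi_pure, F

# Example use with a constant variance function (normal linear model case):
# pure_error_F([np.array([1.0, 2.0]), np.array([2.0, 3.0, 4.0])],
#              mu_hat=np.array([1.4, 3.1]), v=lambda m: 1.0, p=1)
```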
Dedication
This article is in honor of Terry Speed, from whom I learned generalized linear models while an undergraduate student in Perth, Western Australia. Terry's enthusiasm for
References
[1] Breslow, N. E. Score tests in overdispersed generalized linear models. In A. Decarli, B. J. Francis, R. Gilchrist, and G. U. H. Seeber, editors, Proceedings of GLIM 89 and the 4th International Workshop on Statistical Modelling, pages 64–74. Springer-Verlag: New York, 1989.

[2] Breslow, N. E. Tests of hypotheses in overdispersed Poisson regression and other quasi-likelihood models. Journal of the American Statistical Association, 85:565–571, 1990.

[3] Cameron, A., and Trivedi, P. Regression-based tests for overdispersion in the Poisson model. Journal of Econometrics, 46:347–364, 1990.

[4] Chen, C.-F. Score tests for regression models. Journal of the American Statistical Association, 78:158–161, 1983.

[5] Cox, D. R. Some remarks on overdispersion. Biometrika, 70:269–274, 1983.

[6] Cox, D. R., and Hinkley, D. V. Theoretical Statistics. Chapman and Hall: London, 1974.

[7] Dean, C. B. Testing for overdispersion in Poisson and binomial regression models. Journal of the American Statistical Association, 87:451–457, 1992.

[8] Deng, D., and Paul, S. R. Score tests for zero inflation in generalized linear models. Canadian Journal of Statistics, 28:563–570, 2000.

[9] Genter, F. C., and Farewell, V. T. Goodness-of-link testing in ordinal regression models. Canadian Journal of Statistics, 13:37–44, 1985.

[10] Jørgensen, B. Exponential dispersion models (with discussion). Journal of the Royal Statistical Society Series B, 49:127–162, 1987.

[11] Jørgensen, B. The theory of exponential dispersion models and analysis of deviance. Monografias de Matemática No. 51, Instituto de Matemática Pura e Aplicada, Rio de Janeiro, 1992.

[12] Jørgensen, B. Theory of Dispersion Models. Chapman and Hall: London, 1997.

[13] Lu, W.-S. Score tests for overdispersion in Poisson regression models. Journal of Statistical Computation and Simulation, 56:213–228, 1997.

[14] McCullagh, P., and Nelder, J. A. Generalized Linear Models. Chapman and Hall: London, 1989.
[15] Neyman, J. Optimal asymptotic tests of composite hypotheses. In U. Grenander, editor, Probability and Statistics: The Harald Cramér Volume, pages 213–234. Wiley: New York, 1959.

[16] Neyman, J., and Scott, E. On the use of C(α) optimal tests of composite hypotheses. Bulletin of the International Statistical Institute, Proceedings of the 35th Session, 41:477–497, 1966.

[17] Pearson, K. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine Series 5, 50:157–175, 1900.

[18] Paul, S. R., and Deng, D. Goodness of fit of generalized linear models to sparse data. Journal of the Royal Statistical Society Series B, 62:323–333, 2000.

[19] Paul, S. R., and Banerjee, T. Analysis of two-way layout of count data involving multiple counts in each cell. Journal of the American Statistical Association, 93:1419–1429, 1998.

[20] Pregibon, D. Goodness of link tests for generalized linear models. Applied Statistics, 29:15–24, 1980.

[21] Pregibon, D. Score tests in GLIM with applications. In R. Gilchrist, editor, GLIM 82: Proceedings of the International Conference on Generalised Linear Models, pages 87–97. Springer-Verlag: New York, 1982.

[22] Rao, C. R. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Proceedings of the Cambridge Philosophical Society, 44:50–57, 1948.

[23] Rao, C. R. Linear Statistical Inference and its Applications, Second Edition. Wiley: New York, 1973.

[24] Smith, P. J., and Heitjan, D. F. Testing and adjusting for departures from nominal dispersion in generalized linear models. Applied Statistics, 42:31–41, 1993.

[25] Smyth, G. K. Exponential dispersion models and the Gauss-Newton algorithm. Australian Journal of Statistics, 33:57–64, 1991.

[26] Thall, P. F. Score tests in two-way layouts of counts. Communications in Statistics Part A: Theory and Methods, 21:3017–3036, 1992.

[27] Wei, B.-C. Exponential Family Nonlinear Models. Springer-Verlag: Singapore, 1998.

[28] Weisberg, S. Applied Linear Regression. Wiley: New York, 1985.