Statistics Important Points: Properties of Normal Distribution

There are three key differences between populations and samples discussed in the document: 1) Summary measures for a population are called parameters while summary measures for a sample are called statistics. 2) The mean refers to the population mean while the average refers to the sample average. 3) Notation also differs between population and sample measures such as variance.

Uploaded by esjai

Copyright © Attribution Non-Commercial (BY-NC)

STATISTICS IMPORTANT POINTS

Another difference between populations and samples is the way summary measures are described. Summary measures for a population are called parameters. Summary measures for a sample are called statistics. As an example, suppose you calculate the average for a set of data. If the data set is a population of values, the average is a parameter, which is called the population mean. If the data set is a sample of values, the average is a statistic, which is called the sample average (or the average, for short). The rest of this book uses the word mean to indicate the population mean, and the word average to indicate the sample average.

The mean is a measure that describes the center of a distribution of values. The variance is a measure that describes the dispersion around the mean. The standard deviation is the square root of the variance. The sample average is denoted with X̄ and is called x-bar. The population mean is denoted with μ and is called mu. The sample variance is denoted with s² and is called s-squared. The population variance is denoted with σ² and is called sigma-squared. Because the standard deviation is the square root of the variance, it is denoted with s for a sample and σ for a population.
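These sample and population formulas can be illustrated with a short Python sketch (the language and the data values are my own illustration, not from the original text). The sample versions divide by n - 1, the population versions by n:

```python
import statistics

# Hypothetical set of values (made up for illustration)
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Treating the values as a sample
x_bar = statistics.mean(values)        # sample average, x-bar
s2 = statistics.variance(values)       # sample variance s-squared (divides by n - 1)
s = statistics.stdev(values)           # sample standard deviation s

# Treating the same values as a whole population
sigma2 = statistics.pvariance(values)  # population variance sigma-squared (divides by n)
sigma = statistics.pstdev(values)      # population standard deviation sigma

print(x_bar, s2, sigma2)  # 5.0, 32/7 ≈ 4.571, and 4.0
```

The gap between s² and σ² shrinks as the data set grows, since n - 1 approaches n.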

Properties of Normal Distribution


The normal distribution is completely defined by its mean and standard deviation. For a given mean and standard deviation, there is only one normal distribution whose graph we can draw. The normal distribution has mean = mode = median. The normal distribution is symmetric. The normal distribution is smooth. The normal distribution has an excess kurtosis of 0 (some texts instead report its kurtosis as 3; the two conventions differ only by the constant 3).
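These properties can be spot-checked on simulated data. The Python sketch below (simulation settings are my own, not from the text) draws from a normal distribution and confirms that the mean and median agree and that the excess kurtosis (fourth standardized moment minus 3) is near 0:

```python
import random
import statistics

# Simulated draws from a normal distribution with mean 0, standard deviation 1
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(100_000)]

mean = statistics.fmean(data)
median = statistics.median(data)

# Excess kurtosis: average fourth standardized moment minus 3
sd = statistics.pstdev(data)
excess_kurtosis = sum(((x - mean) / sd) ** 4 for x in data) / len(data) - 3

# All three values should be close to 0 for normal data
print(round(mean, 2), round(median, 2), round(excess_kurtosis, 2))
```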

The Empirical Rule


If data is from a normal distribution, the Empirical Rule gives a quick and easy way to summarize the data. The Empirical Rule says the following: About 68% of the values are within one standard deviation of the mean. About 95% of the values are within two standard deviations of the mean. About 99.7% of the values are within three standard deviations of the mean.
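The Empirical Rule is easy to verify on simulated normal data. The Python sketch below (my own illustration, not from the text) counts the fraction of values within one, two, and three standard deviations of the mean:

```python
import random
import statistics

# Simulated normal data (illustrative settings)
random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(100_000)]
mean = statistics.fmean(data)
sd = statistics.pstdev(data)

def fraction_within(k):
    """Fraction of the values within k standard deviations of the mean."""
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)

# Expect roughly 0.68, 0.95, and 0.997
print(fraction_within(1), fraction_within(2), fraction_within(3))
```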

Parametric and Nonparametric Statistical Methods


Many statistical methods rely on the assumption that the data values are a sample from a normal distribution. Other statistical methods rely on an assumption of some other distribution of the data. Statistical methods that rely on assumptions about distributions are called parametric methods. There are also statistical methods that do not assume a particular distribution for the data. Statistical methods that don't rely on assumptions about distributions are called nonparametric methods.

*Note: Use a parametric method if the data meets the assumptions, and use a nonparametric method if it doesn't.

Testing for Normality

Recall that a normal distribution is the theoretical distribution of values for a population. Many statistical methods assume that the data values are a sample from a normal distribution. For a given sample, you need to decide whether this assumption is reasonable. Because you have only a sample, you can never be absolutely sure that the assumption is correct. What you can do is test the assumption, and, based on the results of the test, decide whether the assumption is reasonable. This testing and decision process is called testing for normality.

Statistical Test for Normality

When testing for normality, you start with the idea that the sample is from a normal distribution. Then, you verify whether the data agrees or disagrees with this idea. Using the sample, you calculate a statistic and use this statistic to try to verify the idea. Because this statistic tests the idea, it is called a test statistic. The test statistic compares the shape of the sample distribution with the shape of a normal distribution.
The result of this comparison is a number called a p-value, which describes how doubtful the idea is in terms of probability. A p-value can range from 0 to 1. A p-value close to 0 means that the idea is very doubtful, and provides evidence against the idea. If you find enough evidence to reject the idea, you decide that the data is not a sample from a normal distribution. If you cannot find enough evidence to reject the idea, you proceed with the analysis based on the assumption that the data is a sample from a normal distribution. SAS provides the formal test for normality in PROC UNIVARIATE with the NORMAL option.
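The text's formal test lives in SAS, but the same logic (test statistic, then p-value, then decision) can be sketched in pure Python. Below is a minimal implementation of the Jarque–Bera normality test; this is a different test from the ones PROC UNIVARIATE reports, used here only to illustrate the idea, and the simulated data is made up:

```python
import math
import random
import statistics

def jarque_bera(data):
    """Return (test statistic, p-value) for the Jarque-Bera normality test.
    The statistic measures how far the sample's skewness and excess kurtosis
    are from 0, their values under normality. A small p-value is evidence
    against normality. The p-value uses the chi-square(2) approximation,
    whose survival function is exp(-x / 2)."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    z = [(x - mean) / sd for x in data]
    skew = sum(v ** 3 for v in z) / n
    excess_kurt = sum(v ** 4 for v in z) / n - 3
    jb = n / 6 * (skew ** 2 + excess_kurt ** 2 / 4)
    p_value = math.exp(-jb / 2)
    return jb, p_value

random.seed(2)
normal_sample = [random.gauss(0.0, 1.0) for _ in range(5_000)]
skewed_sample = [random.expovariate(1.0) for _ in range(5_000)]  # clearly non-normal

_, p_norm = jarque_bera(normal_sample)
_, p_skew = jarque_bera(skewed_sample)
print("reject normality for normal sample:", p_norm < 0.05)
print("reject normality for skewed sample:", p_skew < 0.05)
```

The skewed (exponential) sample produces a tiny p-value, so normality is rejected for it, matching the decision rule described above.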

A Type I error means rejecting the null hypothesis when it is in fact true (concluding that the alternative hypothesis is true when it is not). A Type II error means failing to reject the null hypothesis when it is in fact false.

Independent Groups

Independent groups of data contain measurements for two unrelated samples of items.
For example, suppose a researcher selects a random sample of children, some who use fluoride toothpaste and some who do not. There is no relationship between the children who use fluoride toothpaste and the children who do not. A dentist counts the number of cavities for each child. The goal of analysis is to compare the average number of cavities for children who use fluoride toothpaste and for children who do not.
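Sticking with the cavity example, the first step of such an analysis is computing the two group averages. In this Python sketch the cavity counts are made-up numbers:

```python
import statistics

# Hypothetical cavity counts for two unrelated samples of children
fluoride = [0, 1, 0, 2, 1, 0, 1]
no_fluoride = [2, 3, 1, 4, 2, 3, 2]

# Independent-groups analysis compares the two group averages
diff = statistics.mean(no_fluoride) - statistics.mean(fluoride)
print(round(diff, 4))  # difference between the group averages
```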

Paired Groups

Paired groups of data contain measurements for one sample of items, but there are two measurements for each item. A common example of paired groups is before-and-after measurements, where the goal of analysis is to decide whether the average change from before to after is greater than what could happen by chance. For example, a doctor weighs 30 people before they begin a program to quit smoking, and weighs them again six months after they have completed the program. The goal of analysis is to decide whether the average weight change is greater than what could happen by chance.
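The key point with paired groups is that the analysis works on the per-item change, not on two separate group averages. A small Python sketch with made-up before-and-after weights:

```python
import statistics

# Hypothetical before/after weights (kg) for the same six people
before = [70.2, 81.5, 65.0, 90.3, 77.8, 84.1]
after = [72.0, 83.0, 66.5, 91.0, 79.5, 85.0]

# Paired analysis: compute each person's change, then average the changes
changes = [a - b for b, a in zip(before, after)]
avg_change = statistics.mean(changes)
print(round(avg_change, 2))  # average weight change
```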
