
Continuity Correction For Iqs: Pnorm Pnorm

This document simulates dice rolls and CEO salaries to investigate the central limit theorem. For dice rolls, averaging more dice (2, 5, 20, 100) results in distributions closer to a normal curve centered around the expected value of 3.5. Averaging CEO salaries pulls the skewed distribution closer to normal as the sample size increases (10, 30, 50, 10000). The document supports that increasing sample size causes the average to approach a normal distribution, as predicted by the central limit theorem.

Uploaded by

cincinmindy

HW 13

2. Continuity Correction for IQs


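# P(99.5 < X < 100.5) for X ~ Normal(100, 15): continuity-corrected P(IQ = 100)
pnorm(100.5,100,15)-pnorm(99.5,100,15)
## [1] 0.02659
# P(X > 100.5): continuity-corrected P(IQ > 100)
1-pnorm(100.5,100,15)
## [1] 0.4867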
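
The two calls above can be wrapped in a small helper to make the continuity correction explicit. This is only a sketch: the function name `p_equal_cc` is hypothetical, and it assumes IQ scores are reported as whole numbers and modeled as Normal(100, 15), as in the calls above.

```r
# Hypothetical helper (not part of the homework): continuity-corrected
# probability that a normal variable, rounded to a whole number, equals k.
p_equal_cc <- function(k, mean, sd) {
  pnorm(k + 0.5, mean, sd) - pnorm(k - 0.5, mean, sd)
}

p_exactly_100 <- p_equal_cc(100, 100, 15)   # P(99.5 < X < 100.5)
p_above_100 <- 1 - pnorm(100.5, 100, 15)    # P(X > 100.5)
round(c(p_exactly_100, p_above_100), 4)
```

Shifting the cutoffs by 0.5 treats each whole-number score as the interval of values that would round to it.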

3. Simulating Dice Throws


# 10000 rows (trials), each with 100 simulated dice
dice=matrix(sample(1:6,1000000,replace=TRUE),nrow=10000)
barplot(table(dice[,1]))

It does look uniform.


averagesof2=apply(dice[,1:2],1,mean)
hist(averagesof2)

table(averagesof2)
## averagesof2
## 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 6
## 256 532 862 1093 1454 1676 1419 1089 809 549 261

It has an almost symmetrical shape. It's not uniform because, even with only 2 dice, an
average close to the expected value (3.5) is more likely than an extreme one: there are more
combinations that give an average of 3.5 than an average of 6. P(average=3.5)=6/36, so the
expected count is (6/36)*10000=1666.667. This is close to the 1676 I got, but they do not
match exactly.

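
The expected counts for every possible average of two dice can be checked by listing all 36 equally likely outcomes. A minimal sketch, assuming fair dice; the variable names are my own and not part of the original code:

```r
# Enumerate all 36 equally likely outcomes of two fair dice
averages <- outer(1:6, 1:6, "+") / 2

# Exact probability of each possible average
exact_prob <- table(averages) / 36

# Expected counts in 10000 simulated pairs, to compare with table(averagesof2)
expected_counts <- round(exact_prob * 10000, 1)
expected_counts
```

The simulated counts should scatter around these values rather than match them exactly.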
averagesof5=apply(dice[,1:5],1,mean)
hist(averagesof5)

averagesof20=apply(dice[,1:20],1,mean)
hist(averagesof20)

averagesof100=apply(dice[,1:100],1,mean)
hist(averagesof100)

summary(averagesof5)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     1.0     3.0     3.6     3.5     4.0     6.0

summary(averagesof20)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    2.25    3.25    3.50    3.50    3.75    4.85

summary(averagesof100)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    2.83    3.38    3.50    3.50    3.61    4.18

The shape becomes more normal as more tosses are averaged. The center stays at roughly
the expected value of 3.5, and the spread gets smaller. The normal distribution seems to
become a good approximation once enough tosses are averaged.

sd(averagesof5)
## [1] 0.762
sd(averagesof20)
## [1] 0.3821
sd(averagesof100)
## [1] 0.1699

*See attached for calculations of the mean and standard deviation. The theoretical means
and sd's are almost the same as those from the simulation. No, the outcome of a single die
will still be uniform, not normal, even when lots of dice are thrown. In c) we are saying that
as the number of tosses increases, the distribution of the averages gets closer to normal,
not the distribution of the individual outcomes.
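
The theoretical values can also be sketched in R. This is an illustration under the assumption of fair six-sided dice, not the attached calculation itself, and the variable names are my own:

```r
# Population mean and sd of a single fair die
faces <- 1:6
mu <- mean(faces)                      # 3.5
sigma <- sqrt(mean((faces - mu)^2))    # sqrt(35/12), about 1.708

# Theoretical sd of the average of n dice: sigma / sqrt(n)
n <- c(5, 20, 100)
theoretical_sd <- sigma / sqrt(n)
round(theoretical_sd, 4)
```

These come out to roughly 0.764, 0.382 and 0.171, next to the simulated 0.762, 0.3821 and 0.1699.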

4. Normal?
bodyfat=read.delim("http://sites.williams.edu/rdeveaux/files/2014/09/bodyfat.txt")
with(bodyfat,mean(Weight))
## [1] 178.1
with(bodyfat,sd(Weight))
## [1] 27.04
with(bodyfat,hist(Weight))

The histogram is unimodal and somewhat symmetrical.


library(mosaic)  # provides do()
weight=do(1000)*sample(bodyfat$Weight,size=10,replace=TRUE)
## Loading required package: parallel
averagesof10=apply(weight[,1:10],1,mean)
hist(averagesof10)

mean(averagesof10)
## [1] 177.6
sd(averagesof10)
## [1] 8.839
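
For comparison with the simulated values above, the Central Limit Theorem prediction can be computed from the mean and sd of Weight found in a). A minimal sketch with hypothetical variable names, using the rounded values printed in a):

```r
# Values reported in part a) for the Weight variable
pop_mean <- 178.1
pop_sd <- 27.04
n <- 10

expected_mean <- pop_mean          # mean of the sample averages
expected_sd <- pop_sd / sqrt(n)    # standard error of the average of 10
round(c(expected_mean, expected_sd), 2)
```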

The mean should be the same as in a), and the sd should be the sd from a) divided by
sqrt(10). The mean and sd are 177.8404 and 8.826601 respectively. The shape is almost
normal. This is not surprising according to the Central Limit Theorem.

5. Still Normal?

ceo=read.delim("http://sites.williams.edu/rdeveaux/files/2014/09/CEO_Salary_2012.txt")
sal=ceo[,4]
hist(sal)

The histogram does not look normal at all.


salary=do(1000)*sample(ceo$X1.Year.Pay...mil.,size=10,replace=TRUE)
salariesof10=apply(salary[,1:10],1,mean)
hist(salariesof10)

The shape is not normal; it is skewed to the right. That is not surprising because a sample
size of 10 is too small for such a skewed population.

salary30=do(1000)*sample(ceo$X1.Year.Pay...mil.,size=30,replace=TRUE)
salaries30=apply(salary30[,1:30],1,mean)
hist(salaries30)

This is closer to normal but still skewed to the right.


salary50=do(1000)*sample(ceo$X1.Year.Pay...mil.,size=50,replace=TRUE)
salaries50=apply(salary50[,1:50],1,mean)
hist(salaries50)

This is even more normal but still skewed to the right.


The bigger the sample size, the more normal the distribution of the sample averages looks.

salary10000=do(1000)*sample(ceo$X1.Year.Pay...mil.,size=10000,replace=TRUE)
salaries10000=apply(salary10000[,1:10000],1,mean)
hist(salaries10000)

With a sample size of 10000, the distribution of the averages is much closer to normal than
with a sample size of 50. I agree with this rule of thumb.
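
One way to put a number on "closer to normal" is to track the skewness of the sample averages as the sample size grows. The sketch below is an illustration only: it uses a simulated lognormal population as a stand-in for the CEO salaries (an assumption, not the real data), and the helper names `skewness` and `skew_of_means` are hypothetical.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Stand-in for a right-skewed population (NOT the CEO salary data)
population <- rlnorm(5000)

# Simple moment-based skewness; close to 0 for a symmetric distribution
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Skewness of `reps` sample averages for a given sample size n
skew_of_means <- function(n, reps = 1000) {
  means <- replicate(reps, mean(sample(population, size = n, replace = TRUE)))
  skewness(means)
}

sizes <- c(10, 30, 50, 10000)
skews <- sapply(sizes, skew_of_means)
round(skews, 2)
```

The skewness should shrink toward 0 as the sample size increases, which is the same pattern the histograms show.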
