Assignment 1_Ex 4.39
Olamide Gab-Opadokun
2/10/2020
4.39a) Examine variables for marginal normality
mydata<-read.csv("C:/Users/Olamide/Desktop/STAT 5509 (Multivariate Analysis)/assigndata.csv")
#Independence
myplot.i<-qqnorm(mydata$Independence,plot.it=FALSE)
rQ.I<-cor(myplot.i$x,myplot.i$y)
#Support
myplot.s<-qqnorm(mydata$Support,plot.it=FALSE)
rQ.S<-cor(myplot.s$x,myplot.s$y)
#Benevolence
myplot.b<-qqnorm(mydata$Benevolence,plot.it=FALSE)
rQ.B<-cor(myplot.b$x,myplot.b$y)
#Conformity
myplot.c<-qqnorm(mydata$Conformity,plot.it=FALSE)
rQ.C<-cor(myplot.c$x,myplot.c$y)
#Leadership
myplot.l<-qqnorm(mydata$Leadership,plot.it=FALSE)
rQ.L<-cor(myplot.l$x,myplot.l$y)
rQ.values<-cbind(rQ.I,rQ.S,rQ.B,rQ.C,rQ.L)
colnames(rQ.values)<-c('Independence','Support','Benevolence','Conformity','Leadership')
#This gives the Q-Q plot correlation coefficient for each variable being tested.
rQ.values
##      Independence  Support Benevolence Conformity Leadership
## [1,]    0.9881301 0.989288   0.9925086    0.99338  0.9812888
With n=130 and testing at a significance level of alpha=0.05, the critical point is 0.9873.
‘Leadership’ is the only variable whose correlation coefficient (0.9813) is lower than the
critical point, so we reject the hypothesis of normality for Leadership. For Independence, Support,
Benevolence and Conformity, the correlation coefficients are all higher than the critical point, so we
fail to reject the hypothesis of normality for these four variables.
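The same test can be sketched more compactly by wrapping the Q-Q correlation in a function and applying it to every column at once. This is only an illustration: `qq_cor` is a hypothetical helper name, and the data below are simulated stand-ins rather than the scores in assigndata.csv.

```r
# Hypothetical helper (not in the original solution): Q-Q plot correlation
# coefficient for one numeric vector, computed the same way as above.
qq_cor <- function(x) {
  qq <- qqnorm(x, plot.it = FALSE)   # theoretical (x) and sample (y) quantiles
  cor(qq$x, qq$y)
}

# Simulated stand-in data with n = 130; NOT the psychological profile data.
set.seed(1)
demo <- data.frame(
  normal_like = rnorm(130, mean = 15, sd = 4),
  skewed      = rexp(130, rate = 0.2)
)

r_q <- sapply(demo, qq_cor)          # one coefficient per column
critical <- 0.9873                   # critical point quoted in the text
data.frame(r_Q = round(r_q, 4), reject_normality = r_q < critical)
```

Applied to the real data, `sapply(mydata, qq_cor)` should give the same five coefficients as the column-by-column code, assuming mydata holds only the five score variables.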
4.39b) Using all five variables, check for multivariate normality
#We will construct a chi-square plot to check for multivariate normality
# to calculate the sample mean vector and v-cov matrix
xbar <- colMeans(mydata)
S<- cov(mydata)
# to calculate the squared distances (x is one row of mydata)
d2 <- apply(mydata, MARGIN = 1, function(x)
t(x - xbar) %*% solve(S) %*% (x - xbar))
# to construct the chi-square plots
plot(qchisq((1:nrow(mydata)-0.5)/nrow(mydata),df=5), sort(d2),
xlab = 'Quantiles (q((j-0.5)/130))',
ylab = "Squared Distances (d(j)^2)",
main = "Chi-Squared Plot for all Variables in Psychological Data",pch=20)
text(x=17.7, y=15,labels= c("X"), pos=2, cex=1.3)
[Figure: Chi-Squared Plot for all Variables in Psychological Data. x-axis: Quantiles (q((j-0.5)/130)); y-axis: Squared Distances (d(j)^2); the point at the top right is marked with an X.]
Looking at the plot above, we can see that it is reasonably straight except for one point at the top
right (marked with an X), which appears to be an outlier. A roughly straight chi-square plot is consistent
with multivariate normality.
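As a cross-check on the squared distances, base R's `mahalanobis()` computes the same quantity directly. The sketch below uses simulated data (an assumption, not the assignment data) to show that the two calculations agree and to count how many distances fall below the chi-square median, which should be roughly half under multivariate normality.

```r
# Simulated stand-in: 130 observations on 5 independent normal variables.
set.seed(2)
n <- 130
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

xbar <- colMeans(X)
S <- cov(X)

# mahalanobis() returns the squared distances (x - xbar)' S^{-1} (x - xbar).
d2 <- mahalanobis(X, center = xbar, cov = S)

# Row-by-row version, mirroring the apply() call used in the solution.
S_inv <- solve(S)
d2_manual <- apply(X, 1, function(x) drop(t(x - xbar) %*% S_inv %*% (x - xbar)))

# Proportion of observations inside the estimated 50% chi-square contour.
prop_inside <- mean(d2 <= qchisq(0.5, df = p))
prop_inside
```

With the real data, `mahalanobis(mydata, colMeans(mydata), cov(mydata))` could replace the apply() call, again assuming mydata contains only the five numeric variables.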
4.39c)
#The nonnormal variable is the 'Leadership' variable.
#We will take the square root of the Leadership scores then we reexamine for marginal
#normality with the new scores.
mydata$Leadership<-sqrt(mydata$Leadership)
myplot.l<-qqnorm(mydata$Leadership,plot.it=FALSE)
rQ.L.new<-cor(myplot.l$x,myplot.l$y)
#This gives the new correlation coefficient for leadership variable after the transformation.
rQ.L.new
## [1] 0.9959115
The new correlation coefficient after transforming the Leadership scores is 0.9959. With n=130 and testing
at a significance level of alpha=0.05, the critical point is 0.9873. The new ‘Leadership’ correlation
coefficient (0.9959) is higher than the critical point. Therefore, we fail to reject the hypothesis
of normality for the square-root-transformed Leadership scores.
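To illustrate why a square root can help, the sketch below compares the Q-Q correlation coefficient before and after a few candidate transformations. The data are simulated right-skewed values, not the Leadership scores, and `qq_cor` is a hypothetical helper name.

```r
# Hypothetical helper: Q-Q plot correlation coefficient for a numeric vector.
qq_cor <- function(x) {
  qq <- qqnorm(x, plot.it = FALSE)
  cor(qq$x, qq$y)
}

# Simulated right-skewed, strictly positive scores; NOT the Leadership data.
set.seed(3)
skewed <- rchisq(130, df = 4)

# Correlation coefficient under each candidate transformation.
r_candidates <- c(
  identity = qq_cor(skewed),
  sqrt     = qq_cor(sqrt(skewed)),
  log      = qq_cor(log(skewed))
)
round(r_candidates, 4)
```

The transformation with the coefficient closest to 1 gives the straightest Q-Q plot. Which one that is depends on the data, so the choice of the square root for Leadership rests on the result reported above, not on this simulation.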