Introduction To R. Graphical Representation of Multivariate Observations
1. Start by opening R and change the working directory using the menu:
File > Change dir...
Choose the appropriate path to save your work, e.g.
C:\ROLIVEIRA\MODCLIM\Project1
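The same change can be made from the console instead of the menu. The lines below are a minimal sketch: the folder name project_dir is hypothetical and is created inside R's temporary directory only so that the example runs anywhere; replace it with your own path.

```r
# Remember the current working directory so it can be restored later.
old_dir <- getwd()

# Hypothetical project folder (an assumption for this example).
# Forward slashes also work on Windows and avoid escaping backslashes.
project_dir <- file.path(tempdir(), "Project1")
dir.create(project_dir, showWarnings = FALSE, recursive = TRUE)

setwd(project_dir)   # change the working directory
getwd()              # confirm where files will now be read and written

setwd(old_dir)       # go back to the previous directory
```

In a path typed as an R string, a backslash must be doubled ("C:\\...") or replaced by a forward slash.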
1+1
10*3
c(1,2,3)
c(1,2,3)*10
x <- 5
x*x
exp(1)
Get the natural logarithm (base e), the base-10 logarithm and the square root of the vector lx.
log(lx)
log(lx,10)
sqrt(lx)
m%*%lx[1:2]
m%*%t(m)
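The commands above use the objects lx and m, whose definitions are not shown here. The sketch below assumes a small numeric vector and a 2 x 2 matrix purely for illustration, so the values are not the ones intended by the exercise.

```r
# Assumed definitions (not from the handout): any numeric vector and
# any matrix with two columns would work the same way.
lx <- c(1, 10, 100)
m  <- matrix(1:4, nrow = 2)   # filled column by column: rows (1, 3) and (2, 4)

log(lx)        # natural logarithm
log(lx, 10)    # base-10 logarithm, equivalent to log10(lx)
sqrt(lx)       # square root, element by element

# %*% is the matrix product; a plain * multiplies element by element.
v <- m %*% lx[1:2]   # 2 x 1 column vector
g <- m %*% t(m)      # 2 x 2 symmetric matrix
v
g
```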
4. The Iris data is a famous multivariate dataset and will be used to illustrate the
use of various commands. Start by typing
iris
dim(iris)
colnames(iris)
allowing you to see your dataset, the size of the data matrix and the names of
each column. Your data can be stored in different structures. Verify that the
object iris is not a matrix, but a data.frame using the commands:
is.matrix(iris)
is.data.frame(iris)
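The difference between the two structures matters because a matrix holds a single type, while a data.frame can mix numeric columns and factors. A short sketch of how to inspect the columns and convert the numeric part (the name iris_mat is introduced here only for the example):

```r
str(iris)              # type and first values of every column
sapply(iris, class)    # four numeric columns and one factor

# Keep only the numeric columns when converting to a matrix.
iris_mat <- as.matrix(iris[, 1:4])
is.matrix(iris_mat)
is.numeric(iris_mat)

# Converting all five columns forces every value to character,
# because the Species column is not numeric.
is.character(as.matrix(iris))
```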
5. Calculate summary statistics for each of the first 4 columns of your dataset. Do
it in two different ways:
summary(iris[,1])
summary(iris[,2])
summary(iris[,3])
summary(iris[,4])
sd(iris[,1])
sd(iris[,2])
sd(iris[,3])
sd(iris[,4])
apply(iris[,1:4],2,summary)
apply(iris[,1:4],2,sd)
Also use the psych package to obtain more complete summary statistics:
library(psych)
describe(iris[,1:4])
describeBy(iris[,1:4],group=iris[,5])
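If the psych package is not available, similar per-column and per-species summaries can be obtained with base R only. This is a sketch of one possible alternative, not a replacement for the output of psych; the name group_means is introduced here for the example.

```r
# One value per column, without writing each column by hand.
sapply(iris[, 1:4], mean)
sapply(iris[, 1:4], sd)

# Mean of every measurement within each species.
group_means <- aggregate(iris[, 1:4],
                         by = list(Species = iris$Species),
                         FUN = mean)
group_means

# The same idea for a single variable.
tapply(iris$Sepal.Length, iris$Species, mean)
```

Replacing mean by sd, median or summary in the calls above gives the corresponding statistic for each group.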
6. The last column gives the species of iris under study. Check how many
species there are and how many flowers belong to each category.
table(iris[,5])
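The frequency table can be stored and reused, for example to get proportions or a bar chart. A minimal sketch (the name counts is introduced here for the example):

```r
counts <- table(iris$Species)   # same result as table(iris[,5])
counts

nlevels(iris$Species)           # number of distinct species
prop.table(counts)              # proportion of flowers in each species

barplot(counts, ylab = "Number of flowers")
```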
7. Calculate the covariance matrix and correlations among the 4 variables that
characterize the flowers.
var(iris[,1:4])
cor(iris[,1:4])
cov(iris[,1:2],iris[,4])
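The two matrices are linked: each correlation is the covariance divided by the product of the two standard deviations. The sketch below checks this numerically; the names S, R, s and R_manual are introduced only for the example.

```r
S <- var(iris[, 1:4])    # covariance matrix
R <- cor(iris[, 1:4])    # correlation matrix

# Standard deviations are the square roots of the diagonal of S.
s <- sqrt(diag(S))

# Divide element (j, k) of S by s[j] * s[k].
R_manual <- S / outer(s, s)

all.equal(R, R_manual)      # TRUE up to numerical tolerance
all.equal(R, cov2cor(S))    # cov2cor() performs the same rescaling
```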
8. Plot the observations using histograms, box plots, dot plots, etc. When possible,
assign the same color and symbol to the same species (category).
par(mfrow=c(2,2))
hist(iris[,1],prob=TRUE,xlab="Sepal.Length")
hist(iris[,2],prob=TRUE,xlab="Sepal.Width")
hist(iris[,3],prob=TRUE,xlab="Petal.Length")
hist(iris[,4],prob=TRUE,xlab="Petal.Width")
par(mfrow=c(1,1))
boxplot(iris[,1:4],xlab="")
error.bars(iris[,1:4])
boxplot(iris[,1:4])
error.bars(iris[,1:4],add=TRUE,col="red",lwd=2)
par(mfrow=c(3,2))
plot(iris[,1:2],col=iris$Species,xlab="Sepal.Length",ylab="Sepal.Width")
plot(iris[,1],iris[,3],col=iris$Species,xlab="Sepal.Length",ylab="Petal.Length")
plot(iris[,1],iris[,4],col=iris$Species,xlab="Sepal.Length",ylab="Petal.Width")
plot(iris[,2],iris[,3],col=iris$Species,xlab="Sepal.Width",ylab="Petal.Length")
plot(iris[,2],iris[,4],col=iris$Species,xlab="Sepal.Width",ylab="Petal.Width")
plot(iris[,3],iris[,4],col=iris$Species,xlab="Petal.Length",ylab="Petal.Width")
par(mfrow=c(1,1))
pairs(iris[,1:4],col=iris[,5])
par(mfrow=c(2,2))
plot(iris[,1] ~ iris$Species, col="cyan")
plot(iris[,2] ~ iris$Species, col="cyan")
plot(iris[,3] ~ iris$Species, col="cyan")
plot(iris[,4] ~ iris$Species, col="cyan")
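The scatter plots above distinguish species by color only. To also assign a symbol to each species and add a legend, one possible approach is to index a vector of colors and a vector of plotting symbols by the species code. The particular colors and symbols below are arbitrary choices made for this sketch.

```r
species <- iris$Species
code <- as.integer(species)                  # 1, 2 or 3 for the three species

palette3 <- c("red", "darkgreen", "blue")    # arbitrary colors, one per species
symbols3 <- c(16, 17, 15)                    # filled circle, triangle, square

cols <- palette3[code]
pchs <- symbols3[code]

par(mfrow = c(1, 1))
plot(iris$Petal.Length, iris$Petal.Width, col = cols, pch = pchs,
     xlab = "Petal.Length", ylab = "Petal.Width")
legend("topleft", legend = levels(species), col = palette3, pch = symbols3)
```

The same col and pch arguments can be passed to pairs() to keep the coding consistent across all panels.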
9. Plot the density function of a univariate normal distribution with expected
value zero and unit variance. Start by using the help to see how the command
curve works.
?curve
par(mfrow=c(1,1))
curve(dnorm(x),from=-4,to=4)
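As a complement, a second density can be overlaid with add=TRUE, and the standard normal density can be checked numerically. The mean and standard deviation of the second curve are illustrative values chosen for this sketch.

```r
# Standard normal density, as above.
curve(dnorm(x), from = -4, to = 4, ylab = "Density")

# Overlay a normal density with mean 1 and standard deviation 1.5
# (illustrative values, not part of the exercise).
curve(dnorm(x, mean = 1, sd = 1.5), add = TRUE, lty = 2, col = "red")
legend("topleft", legend = c("N(0, 1)", "N(1, 1.5^2)"),
       lty = c(1, 2), col = c("black", "red"))

# A density integrates to 1, and the standard normal peaks at 1/sqrt(2*pi).
area <- integrate(dnorm, lower = -Inf, upper = Inf)$value
peak <- dnorm(0)
area
peak
```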
10. Draw Chernoff faces and star plots for the dataset that characterizes the
sparrows. Start by reading the file sparrows.txt.
sparrows<-read.table("sparrows.txt",header=TRUE)
library(graphics)
stars(sparrows)
library(aplpack)
faces(sparrows)
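The file sparrows.txt may not be at hand when trying these commands, so the sketch below draws a star plot for a few rows of the built-in iris data as a stand-in. The choice of rows and the name iris_sub are assumptions made for the example.

```r
# Five flowers from each species, numeric columns only.
rows <- c(1:5, 51:55, 101:105)
iris_sub <- iris[rows, 1:4]

# Each star has one ray per variable; ray length reflects the
# value rescaled to [0, 1] within its column.
stars(iris_sub,
      labels = as.character(iris$Species[rows]),
      main = "Star plot for 15 iris flowers")
```

Stars with similar shapes correspond to observations with similar measurements, which is the same reading used for the sparrows.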
11. Check the objects you have in memory by typing ls(). Use rm(xpto) if you
want to delete the object xpto from memory. Save the objects you are working
with, e.g. in the file class1.RDATA, using the command
save.image("class1.RDATA")
To read these data back, type
load("class1.RDATA")
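The full cycle of saving, deleting and restoring can be tried on a single object. This sketch uses save(), which stores only the objects named, whereas save.image() stores the whole workspace. The file is written to R's temporary directory, an assumption made so that the example does not touch your project folder.

```r
xpto <- c(1, 2, 3)    # example object, named as in the text above

# Temporary location for the example file.
rdata_file <- file.path(tempdir(), "class1.RDATA")

save(xpto, file = rdata_file)   # write only xpto to the file
rm(xpto)                        # delete it from memory
exists("xpto")                  # check whether it is still in memory

load(rdata_file)                # restore it under its original name
xpto
```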
12. Run the following demos, which show some of the graphics capabilities
of R.
demo(image)
demo(graphics)
demo(persp)
demo(plotmath)
library(lattice)
demo(lattice)
library(vcd)
demo(mosaic)
13. Build a function to draw Andrews curves using the sparrows dataset. Remember
that the individual $x_i = (x_1, \dots, x_p)$ is represented by the following function:
$f_{x_i}(t) = \frac{x_1}{\sqrt{2}} + x_2 \sin(t) + x_3 \cos(t) + x_4 \sin(2t) + x_5 \cos(2t) + \dots, \quad -\pi < t < \pi$
"Andrews"<-function(data)
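{
  # Draws one Andrews curve per row, using the first five columns of data.
  n <- nrow(data)
  data <- scale(data)   # standardize each column (mean 0, sd 1)
  i <- 1
  # The first curve sets up the axes (add=FALSE).
  curve(data[i,1]/sqrt(2) + data[i,2]*sin(x) + data[i,3]*cos(x) +
        data[i,4]*sin(2*x) + data[i,5]*cos(2*x),
        from=-pi, to=pi, ylab="Andrews curves", add=FALSE, ylim=c(-8,8))
  for (i in 2:n)
  {
    # Rows up to 21 are drawn in color 1, the remaining rows in color 2.
    if (i <= 21) colour <- 1
    else colour <- 2
    curve(data[i,1]/sqrt(2) + data[i,2]*sin(x) + data[i,3]*cos(x) +
          data[i,4]*sin(2*x) + data[i,5]*cos(2*x),
          from=-pi, to=pi, add=TRUE, col=colour)
  }
}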
sparrows<-read.table("sparrows.txt",header=TRUE)
names(sparrows)
Andrews(sparrows)
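The function above is written for exactly five variables. As a sketch of a more general version, the curves can be computed for any number of columns by building the basis functions 1/sqrt(2), sin(t), cos(t), sin(2t), cos(2t), ... as a matrix and multiplying it by the standardized data. The function name andrews_matrix and the use of the built-in iris data (in place of sparrows.txt) are assumptions made for this example.

```r
# Returns a matrix with one row per grid point and one column per observation.
andrews_matrix <- function(data, tgrid = seq(-pi, pi, length.out = 201)) {
  z <- scale(as.matrix(data))          # standardize each column
  p <- ncol(z)
  # Column j of basis holds the j-th basis function evaluated on tgrid.
  basis <- sapply(seq_len(p), function(j) {
    if (j == 1) return(rep(1 / sqrt(2), length(tgrid)))
    k <- j %/% 2                       # frequency: 1, 1, 2, 2, 3, ...
    if (j %% 2 == 0) sin(k * tgrid) else cos(k * tgrid)
  })
  basis %*% t(z)
}

tgrid <- seq(-pi, pi, length.out = 201)
curves <- andrews_matrix(iris[, 1:4], tgrid)

# One line per flower, colored by species.
matplot(tgrid, curves, type = "l", lty = 1,
        col = as.integer(iris$Species),
        xlab = "t", ylab = "Andrews curves")
```

Computing all curves in one matrix product avoids the loop over observations, and matplot() chooses the vertical range from the data instead of a fixed ylim.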
https://cran.r-project.org/web/views/Graphics.html
http://www.visualcomplexity.com/vc/