Training in R For Data Statistics
Package: cluster
Version: 1.14.4
Date: 2013-03-26
Priority: recommended
Author: Martin Maechler, based on S
Original by Peter … …… … …
Followed by a list of all the functions and data sets.
How to quit in R
Task
● To find the sum of squared deviations from the mean
● > x = c(2,3,4,5)
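One way to approach this task is sketched below, using the vector x from the slide. The names deviations and ss are chosen here for illustration and are not from the original material.

```r
# The vector from the task.
x <- c(2, 3, 4, 5)

# Deviation of each value from the mean of x.
deviations <- x - mean(x)

# Sum of the squared deviations.
ss <- sum(deviations^2)
print(ss)
```

The same quantity can be cross-checked with var(x) * (length(x) - 1), since var() divides the sum of squared deviations by n - 1.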
Day 3-
Work on Arithmetic, Logical and
Matrix operations.
Day 4-
Writing simple programs, saving,
and running programs.
Set the working directory
Create an R file
RStudio with script file open
Writing scripts
Saving R script file
Executing an R file
Add comments: single line
Add comments: multiple lines
Clear the console
Clear the environment: rm()
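The steps listed above can be sketched in a single session as follows. This is a minimal illustration rather than the exact workflow from the slides: the file name first_script.R and the variable total are hypothetical, and tempdir() is used as the working directory only so that the sketch runs on any machine.

```r
# Set the working directory (a temporary folder is assumed here).
old_dir <- setwd(tempdir())

# Create an R file: write a two-line script to disk.
script_path <- file.path(getwd(), "first_script.R")
writeLines(c(
  "# A single-line comment starts with a hash sign.",
  "total <- sum(c(2, 3, 4, 5))"
), script_path)

# R has no block-comment syntax, so a multi-line comment
# is written as consecutive lines that each start with a hash.

# Execute the R file; source() runs it in the current session.
source(script_path)
result <- total
print(result)

# Clear the environment: rm() removes the named object,
# and rm(list = ls()) would remove every object.
rm(total)

# Restore the previous working directory.
setwd(old_dir)
```

In RStudio the console can be cleared with Ctrl+L; this affects only the display, not the objects in the environment.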
plot(v, type, col, xlab, ylab, main)
S.No  Parameter  Description
1.    v          It is a vector which contains the numeric values.
2.    type       This parameter takes the value "l" to draw only the lines, "p" to draw only the points, and "o" to draw both lines and points.
3.    xlab       It is the label for the x-axis.
4.    ylab       It is the label for the y-axis.
5.    main       It is the title of the chart.
6.    col        It is used to give the color for both the points and lines.
Exercise
Write an R program to draw multiple lines.
# Create the data for the chart.
v <- c(7,12,28,3,41)
t <- c(14,7,6,19,3)
# Give the chart file a name (png() writes a PNG file, so use a .png extension).
png(file = "line_chart_2_lines.png")
# Plot the line chart for the first series.
plot(v, type = "o", col = "red", xlab = "Month", ylab = "Rain fall", main = "Rain fall chart")
# Add the second series to the same chart.
lines(t, type = "o", col = "blue")
# Save the file.
dev.off()
Histogram
● A histogram is a type of bar chart that shows how many values fall
into each of a set of consecutive ranges. A histogram is used to show
the distribution of a numeric variable, whereas a bar chart is used
for comparing different entities. In a histogram, the height of each
bar represents the number of values present in the corresponding
range.
hist(v, main, xlab, ylab, xlim, ylim, breaks, col, border)
S.No  Parameter  Description
1.    v          It is a vector that contains numeric values.
2.    main       It indicates the title of the chart.
3.    col        It is used to set the color of the bars.
4.    border     It is used to set the border color of each bar.
5.    xlab       It is used to describe the x-axis.
6.    ylab       It is used to describe the y-axis.
7.    xlim       It is used to specify the range of values on the x-axis.
8.    ylim       It is used to specify the range of values on the y-axis.
9.    breaks     It is used to set the break points between bars (or a suggested number of bars), which determines the width of each bar.
Exercise
Write an R program to draw a histogram.
Suppose you own a school uniform shop, and you want to stock up the supply based on the age of the
students residing in your locality. You can go through your bill book and write the ages of all the
customers as follows:
Age Data: 5, 5, 7, 6, 8, 11, 13, 11, 14, 12, 15, 23, 24, 15, 14, 14, 15, 10, 24, 16, 16, 17, 11, 19, 23, 14,
18, 16, 15, 19, 14, 9, 11, 10, 12, 10, 10, 16, 13, 14, 12, 15, 23, 24, 15, 14, 14, 15, 12, 24, 16, 16, 17, 18,
19, 23, 18, 9, 23, 14, 11, 16, 6, 13, 11, 14, 12, 15, 22, 22, 15, 14, 14, 15, 10, 5, 7, 6, 8, 6, 13, 11, 14, 12,
15, 23, 21, 15, 14, 14, 15, 5, 7, 6, 8, 6, 13, 11, 14, 12, 15, 9, 24, 15, 14, 14, 15, 12.
The above data can be categorized into groups as follows:
Age Groups (Class Intervals)    Number of students (frequency)
5-8                             16
9-12                            24
13-16                           46
17-20                           8
21-24                           14
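A possible solution to the exercise is sketched below, using the age data listed above. The break points, axis labels, title, and colors are choices made here for illustration and are not part of the original exercise. The breaks are placed at half-integers (4.5, 8.5, ..., 24.5) so that each bar covers exactly one of the class intervals 5-8, 9-12, 13-16, 17-20, and 21-24.

```r
# Ages of the customers, copied from the exercise.
ages <- c(
  5, 5, 7, 6, 8, 11, 13, 11, 14, 12, 15, 23, 24, 15, 14, 14, 15, 10, 24, 16, 16, 17, 11, 19, 23, 14,
  18, 16, 15, 19, 14, 9, 11, 10, 12, 10, 10, 16, 13, 14, 12, 15, 23, 24, 15, 14, 14, 15, 12, 24, 16, 16, 17, 18,
  19, 23, 18, 9, 23, 14, 11, 16, 6, 13, 11, 14, 12, 15, 22, 22, 15, 14, 14, 15, 10, 5, 7, 6, 8, 6, 13, 11, 14, 12,
  15, 23, 21, 15, 14, 14, 15, 5, 7, 6, 8, 6, 13, 11, 14, 12, 15, 9, 24, 15, 14, 14, 15, 12
)

# Break points chosen so that the bars match the class intervals.
class_breaks <- seq(4.5, 24.5, by = 4)

# Draw the histogram; hist() also returns the computed counts.
h <- hist(ages,
          breaks = class_breaks,
          main = "Ages of students",
          xlab = "Age (years)",
          ylab = "Number of students",
          xlim = c(4, 25),
          ylim = c(0, 50),
          col = "lightblue",
          border = "black")

# Frequency of each class interval.
print(h$counts)
```

The values in h$counts can be compared with the frequency column of the table above to check the grouping.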
Bar Charts
A bar chart is a pictorial representation in which the numerical
values of variables are represented by the length or height of
rectangles of equal width. A bar chart is used for summarizing a
set of categorical data: each category is shown as one rectangular
bar, and the length of the bar is proportional to the value of the
variable.
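As a minimal sketch of this idea, the age-group frequencies from the histogram exercise earlier in this section can be drawn as a bar chart, with one bar per group. The variable names, labels, and color are assumptions made for illustration.

```r
# Frequencies and class intervals taken from the age-group table.
frequency <- c(16, 24, 46, 8, 14)
age_groups <- c("5-8", "9-12", "13-16", "17-20", "21-24")

# Draw one bar per age group; barplot() returns the bar midpoints.
bar_midpoints <- barplot(frequency,
                         names.arg = age_groups,
                         xlab = "Age group",
                         ylab = "Number of students",
                         main = "Students per age group",
                         col = "steelblue")
```

Here names.arg supplies the label printed under each bar, so the chart treats the age groups as categories rather than as a numeric scale.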
barplot(H, xlab, ylab, main, names.arg, col)
S.No  Parameter  Description
1.    H          A vector or matrix which contains numeric values used in the bar chart.