Big Data File in R
Laboratory File
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical
tests, time-series analysis, classification, clustering) and graphical techniques, and is highly
extensible.
One of R's strengths is the ease with which well-designed publication-quality plots can be
produced, including mathematical symbols and formulae where needed.
R supports procedural programming with functions and, for some functions, object-
oriented programming with generic functions. A generic function acts differently depending
on the classes of arguments passed to it. In other words, the generic
function dispatches the function (method) specific to that class of object.
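The dispatch described above can be sketched with R's S3 system. The generic name describe and its methods are hypothetical, chosen only for illustration; UseMethod() is the base R call that selects a method from the class of the first argument.

```r
# Hypothetical generic `describe`; the name is for illustration only.
describe <- function(x, ...) UseMethod("describe")

# Method selected for numeric (double or integer) vectors.
describe.numeric <- function(x, ...) {
  paste("numeric vector of length", length(x))
}

# Method selected for character vectors.
describe.character <- function(x, ...) {
  paste("character vector of length", length(x))
}

# Fallback used when no class-specific method exists.
describe.default <- function(x, ...) {
  "some other kind of object"
}

describe(c(1.5, 2.5))  # uses describe.numeric
describe("abc")        # uses describe.character
describe(TRUE)         # falls back to describe.default
```

The same call, describe(x), runs different code depending on the class of x, which is the behaviour built-in generics such as print() and summary() rely on.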
Vectors
To create a vector with more than one element, use the c() function,
which combines its arguments into a single vector.
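A minimal sketch of c() in use; the variable names and values are made up for illustration.

```r
# Combine several values into one numeric vector.
marks <- c(72, 85, 90)
length(marks)

# c() can also extend an existing vector.
more_marks <- c(marks, 64, 58)
more_marks

# Mixing types coerces every element to one common type (here character).
mixed <- c(1, "a", TRUE)
class(mixed)
```

Because a vector holds a single type, R silently converts mixed inputs, which is worth checking with class() when results look unexpected.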
Lists
A list is an R object that can contain elements of many different types, such as
vectors, functions, and even other lists.
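As a small illustration, the list below holds a vector, a function, and a nested list. The element names (numbers, fn, inner) are arbitrary choices for this sketch.

```r
# A list holding a vector, a function, and another list.
items <- list(
  numbers = c(1, 2, 3),
  fn      = mean,
  inner   = list(flag = TRUE, label = "nested")
)

# $ or [[ ]] extracts a single element.
items$numbers
items[["inner"]]$label

# The stored function can be called like any other function.
items$fn(items$numbers)
```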
Matrices
A matrix is a two-dimensional rectangular data set. It can be created using a vector input
to the matrix function.
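A short sketch of building a matrix from a vector; the values 1 to 6 are arbitrary. By default matrix() fills column by column.

```r
# Fill a 2 x 3 matrix column by column from the vector 1:6.
m <- matrix(1:6, nrow = 2, ncol = 3)
m

dim(m)    # number of rows and columns
m[2, 3]   # element in row 2, column 3
```

Passing byrow = TRUE to matrix() fills the rows first instead.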
Arrays
While matrices are confined to two dimensions, arrays can have any number of
dimensions. The array() function takes a dim argument that gives the size of each
dimension.
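For example, a three-dimensional array can be sketched as follows; the values 1 to 24 and the dimensions are arbitrary.

```r
# A 2 x 3 x 4 array: four "layers", each a 2 x 3 matrix.
arr <- array(1:24, dim = c(2, 3, 4))

dim(arr)       # size of each dimension
arr[, , 1]     # the first 2 x 3 layer
arr[1, 2, 3]   # row 1, column 2, layer 3
```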
Factors
Factors are R objects created from a vector. A factor stores the vector along with
the distinct values of its elements as labels (levels). The labels are always character
strings, irrespective of whether the input vector is numeric, character, or logical.
Factors are useful in statistical modelling.
Factors are created using the factor() function. The nlevels() function gives the number of
levels.
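A brief sketch of factor() and nlevels(); the sample values are invented for illustration.

```r
# Create a factor from a character vector.
sizes <- c("small", "large", "medium", "small", "large")
f <- factor(sizes)

levels(f)      # distinct values, sorted alphabetically
nlevels(f)     # count of levels
as.integer(f)  # internal integer codes pointing into levels(f)

# Labels are character even when the input is numeric.
num_f <- factor(c(10, 20, 10))
levels(num_f)
class(levels(num_f))
```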
Experiment-2
AIM: Write a program to demonstrate elementary operations in R.
x <- 10
x
class(x)
is.integer(x)
x<- as.integer(3.8)
x
class(x)
is.integer(x)
x=1
y=4
z=x>y
class(z)
is.integer(z)
x= "vishesh"
class(x)
Outputs
Experiment-3
AIM: Write a program to demonstrate various control statements in R.
If else statement:
x=10;
if(x>1){
print("x is greater than 1")
}else{
print("x is not greater than 1")
}
For loop:
x = c(1,2,3,4,5)
for(i in seq_along(x)){
print(x[i])
}
While Loop:
x = 2.987
while(x <= 4.987) {
x = x + 0.987
print(c(x,x-2,x-1))
}
Repeat loop:
a=1
repeat { print(a)
a = a+1
if(a > 4)
break }
Experiment-4
AIM: Write a program to create a list and a vector in R.
List in R
list_data <- list("Red", "Green", c(21,32,11), TRUE, 51.23, 119.1)
print(list_data)
Vector in R:
# Atomic vector of type character.
print("abc")
# Atomic vector of type double.
print(12.5)
# Atomic vector of type integer.
print(63L)
# Atomic vector of type logical.
print(TRUE)
# Atomic vector of type complex.
print(2+3i)
# Atomic vector of type raw.
print(charToRaw('hello'))
Experiment-5
AIM: Create a vector from 1 to 5 in increments of 0.2 by using sequence.
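# Vector from 1 to 5 in increments of 0.2
x = seq(1, 5, by = 0.2)
x
# Integer sequence using the colon operator
x = 1:30
x
# Other sequences with a different start, end, and step
x = seq(2, 8, 0.5)
x
x = seq(2, 10, 2)
x
# Division by zero gives Inf rather than an error
x = 5/0
x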
Experiment-6
AIM: Generate a vector of 5000 random numbers from a normal distribution with
mean=3 and standard deviation=2. Use the functions mean and sd to compute the
sample mean and standard deviation of the values in the vector. Visualize this
distribution using hist (to generate the histogram).
n = rnorm(5000, mean = 3, sd = 2)
mean(n)
sd(n)
hist(n)
Experiment-7
AIM: To print a matrix, an array, and strings in R.
A = matrix(
c(2, 4, 3, 1, 5, 7), # the data elements
nrow=2, # number of rows
ncol=3, # number of columns
byrow = TRUE) # fill matrix by rows
A
String
a <- 'Start and end with single quote'
print(a)
b <- "Start and end with double quotes"
print(b)
c <- "single quote ' in between double quotes"
print(c)
d <- 'Double quotes " in between single quote'
print(d)
Arrays
v = c(20, 25, 34, 56, 99, 1006, 2009, 41113) # creating a vector v
v
X = matrix(v, nrow=2, ncol=4) # creating a matrix X filled with the values of v
X
arr = array(v, dim=c(2, 2, 2)) # creating a 2 x 2 x 2 array from the same values
arr
Experiment-9
AIM: Use runif to construct a 5*5 matrix b of random numbers with a uniform
distribution between 0 and 1.
(a) Extract from it the second row, the second column, and the 3*3 matrix of the
values that are not at the margins.
(b) Use sequence to replace the values of the first row of b by 2, 5, 8, 11, 14.
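One possible way to meet this aim is sketched below. The seed value and the names second_row, second_col, and inner are assumptions made for this sketch; only b comes from the aim itself.

```r
set.seed(1)  # assumed seed, only so that the run is repeatable

# 5 x 5 matrix of uniform random numbers between 0 and 1.
b <- matrix(runif(25), nrow = 5, ncol = 5)

# (a) Second row, second column, and the inner 3 x 3 block.
second_row <- b[2, ]
second_col <- b[, 2]
inner <- b[2:4, 2:4]

# (b) Replace the first row with the sequence 2, 5, 8, 11, 14.
b[1, ] <- seq(2, 14, by = 3)
b
```

The inner block could also be written as b[-c(1, 5), -c(1, 5)], which drops the margin rows and columns instead of naming the ones to keep.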
# One column per variable, one row per person
df <- data.frame(c(183, 175, 178), c(85, 76, 79), c(40, 35, 38))
names(df) <- c("Height", "Weight", "Age")
HISTOGRAM
BMI<-rnorm(n=1000, mean=24.2, sd=2.2)
hist(BMI)
BARPLOT
BMI<-rnorm(n=1000, mean=24.2, sd=2.2)
barplot(BMI)
Experiment-12
AIM: Write a program to implement functions in R.
MEAN
# Create a vector.
x <- c(12,7,5,4.2,18,2)
# Find Mean.
result.mean <- mean(x)
print(result.mean)
MEDIAN
# Create the vector.
x <- c(12,7,3,4.2,18,2,54,-21,8,-5)
# Find the median.
median.result <- median(x)
print(median.result)
MODE
# Create the function.
getmode <- function(v) {
  uniqv <- unique(v)
  uniqv[which.max(tabulate(match(v, uniqv)))]
}
# Create the vector with numbers.
v <- c(2,1,2,3,1,2,3,4,1,5,5,3,2,3)
# Calculate the mode using the user function.
result <- getmode(v)
print(result)
# Create the vector with characters.
charv <- c("o","it","the","it","it")
# Calculate the mode using the user function.
result <- getmode(charv)
print(result)
Experiment-14
AIM: Write a program in R to work with Excel data saved as a CSV file.
Save the following data as input.csv in the working directory:
id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
Now run the following in RStudio:
data <- read.csv("input.csv")
print(data)
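Once the file is read, the result is an ordinary data frame. The sketch below is self-contained: it writes a few of the rows above to a temporary file (an assumption made so that it runs without an existing input.csv) and then queries the data.

```r
# Write a few sample rows to a temporary CSV file (stand-in for input.csv).
path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,name,salary,start_date,dept",
  "1,Rick,623.3,2012-01-01,IT",
  "2,Dan,515.2,2013-09-23,Operations",
  "3,Michelle,611,2014-11-15,IT",
  "4,Ryan,729,2014-05-11,HR"
), path)

# Read the file into a data frame.
data <- read.csv(path)
nrow(data)
ncol(data)

# Highest salary, and the rows that belong to the IT department.
max(data$salary)
it_staff <- subset(data, dept == "IT")
print(it_staff)
```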
Experiment-15
AIM: Write a program in R to fit a linear regression model.
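No listing accompanies this aim, so the following is only a minimal sketch of one way to do it. It assumes the built-in cars data set (speed and stopping distance) as example data and fits the model with lm(); any other data frame with a numeric response and predictor would work the same way.

```r
# Fit stopping distance as a linear function of speed.
# `cars` ships with base R; using it here is an assumption of this sketch.
model <- lm(dist ~ speed, data = cars)

# Estimated intercept and slope.
coef(model)

# Full summary: coefficients, residuals, R-squared.
print(summary(model))

# Predict the stopping distance for a new speed value.
new_data <- data.frame(speed = 15)
predicted <- predict(model, new_data)
predicted

# Scatter plot of the data with the fitted line.
plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")
abline(model)
```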