Aditya Garg DMDW

A crop researcher wants to test the effect of three fertilizer mixtures on crop yield using a one-way ANOVA. The researcher will perform a one-way ANOVA on the data to see if there are statistically significant differences in crop yields between the three groups. The one-way ANOVA will calculate test statistics and compare them to an alpha level of 0.05 to determine if the null hypothesis that the group means are identical can be rejected or not. A conclusion will be made and the results will be plotted in a graph.

Dr. B. R. Ambedkar National Institute of Technology
Jalandhar, Punjab

Session : June-Dec 2020

CXS – 425
Data Mining and Data Warehousing Lab

Submitted to: Dr. Kunwar Pal, CSE Department
Submitted by: Kunal Khandelwal (17103045), G2

Assignment 1
1.
a. Find the matrix-matrix multiplication AB.
b. Find (AB)^T and (AB)^-1.
c. Find the mean and standard deviation of each row and column of the matrices A, B, AB and (AB)^-1.
A <- rbind(c(3,-2,1),c(-1,4,-2))
B <- rbind(c(-7,4),c(9,5),c(2,-1))

print("Matrix A : ")
print(A)
print("Matrix B :")
print(B)

#AB
C <-A%*%B
print("Multiplication AB :")
print(C)

#T(AB)
T <-t(C)
print("Transpose of Matrix AB :")
print(T)

#I(AB)
I <- solve(C)
print("Inverse of Matrix AB :")
print(I)

#Mean
print("Mean of Matrix A :")
#Row
mean(A[1,])
mean(A[2,])
#column
mean(A[,1])
mean(A[,2])
mean(A[,3])

print("Mean of Matrix B :")


#Row
mean(B[1,])
mean(B[2,])
mean(B[3,])
#column
mean(B[,1])
mean(B[,2])

print("Mean of Matrix AB :")


#Row
mean(C[1,])
mean(C[2,])
#column
mean(C[,1])
mean(C[,2])

print("Mean of Matrix Inverse of AB :")


#Row
mean(I[1,])
mean(I[2,])
#column
mean(I[,1])
mean(I[,2])

#Standard Deviations (note: sd() on a matrix treats all of its entries as one vector)
print("Standard deviation of matrix A :")
sd(A,na.rm=TRUE)
print("Standard deviation of matrix B :")
sd(B,na.rm=TRUE)
print("Standard deviation of matrix AB :")
sd(C,na.rm=TRUE)
print("Standard deviation of matrix inverse of AB :")
sd(I,na.rm=TRUE)
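The per-row and per-column means above can be collapsed into single calls, and the per-row/per-column standard deviations the question asks for can be obtained with apply(); a minimal sketch reusing the same matrices:

```r
A <- rbind(c(3, -2, 1), c(-1, 4, -2))
B <- rbind(c(-7, 4), c(9, 5), c(2, -1))
C <- A %*% B

rowMeans(A)      # one call instead of mean(A[1,]), mean(A[2,])
colMeans(A)

apply(C, 1, sd)  # standard deviation of each row (MARGIN = 1)
apply(C, 2, sd)  # standard deviation of each column (MARGIN = 2)
```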

OUTPUT

2. Write a "Function" program in R to find n!; hence find 13! and 32!. Do not name the function "Factorial". You can initialize 0! = 1 and 1! = 1.

findfactorial <- function(n){

factorial <- 1
if((n==0||n==1))
factorial <- 1
else{
for(i in 1:n)
factorial <- factorial*i
}
return (factorial)
}

print(findfactorial(13))
print(findfactorial(32))

OUTPUT

3. Write a "Function" program in R to find the maximum and minimum from a set of numbers. Do not name the function "max" or "min". As input, take (4, 44.7, 2, 40, 54, 1, 3, 4).

vector1 <- c(4,44.7,2,40,54,1,3,4)


l <- length(vector1)

min1 = vector1[1]   # initialise from the data instead of arbitrary sentinels
max1 = vector1[1]

for(i in 1:l){
if(min1>vector1[i]){
min1 = vector1[i];
}
if(max1<vector1[i]){
max1 = vector1[i];
}
}

print(paste("Minimum is", min1))


print(paste("Maximum is", max1))

OUTPUT

ASSIGNMENT 2

1. How to read/write data from the dataset in R.

In R, we can easily write data frames to a file using the write.table() command.
write.table(cars1, file = "cars1.txt", quote = FALSE)
The first argument is the data frame to be written to the output file; the second is the name of the output file. By default, R surrounds each entry in the output file with quotes, so we pass quote = FALSE.
The function read.table("/location") can then be used to read the data frame back directly.
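A minimal round trip with these two functions (using the built-in cars data frame as a stand-in for cars1):

```r
cars1 <- cars                                 # built-in data frame: speed, dist
f <- tempfile(fileext = ".txt")
write.table(cars1, file = f, quote = FALSE)   # quote = FALSE suppresses the quotes
cars2 <- read.table(f)                        # read the data frame straight back
head(cars2)
```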

Code:

library(readxl)   # provides read_excel()
data <- read_excel("BEPSxls.xlsx")


View(data)

OUTPUT:

2. Use different functions in R.


a. Read
b. Head
c. Tail
d. Names

CODE:

data <- read_excel("BEPSxls.xlsx")


#data-read
print("*************************");
print(data)
head(data,6)
tail(data,6)

print("********Data Head ***********")


#data-head
print(head(1:50,10))

print("********Data Tail***********")
#data-tail
print(tail(1:5,1))
print("******Names Data ***********")
print(names(data))

OUTPUT:

3. Download the given dataset and perform the following.


a. Mean
b. Median
c. Summary
d. Histogram
e. Plot

Code:
dataset<- read_excel("BEPSxls.xlsx")
mean(dataset$age)
median(dataset$age)
summary(dataset)
hist(dataset$age,main = 'AGE HISTOGRAM')
plot(dataset$Blair)

OUTPUT:

4. Attach and detach the dataset in R.

data <- data.frame(x1 = c(9, 8, 3, 4, 8),


x2 = c(5, 4, 7, 1, 1),
x3 = c(1, 2, 3, 4, 5))
data
x1 #give error
attach(data)
x1 #run
detach(data)
x1 # give error

library(readxl)
dataset=read_excel(file.choose())
#For dataset
attach(dataset)
cat(gender)
detach(dataset)
cat(gender)

ASSIGNMENT 3

1. Demonstration of pre-processing on dataset mtcars(R-studio)

Code:
mtcars
# replace any NA in mpg with the column mean
mtcars$mpg = ifelse(is.na(mtcars$mpg),
                    ave(mtcars$mpg, FUN = function(x) mean(x, na.rm = TRUE)),
                    mtcars$mpg)

OUTPUT:

2. Demonstrate the filter function on dataset mtcars using the dplyr package.

a. Show rows where the gear attribute = 4
b. Show rows where disp = 160
c. Show different logical operations (and, or, not)

CODE:
library(dplyr)

#1 Show where gear attribute = 4,


gear_4 <- filter(mtcars, gear == 4)
head(gear_4)

#2 Show where disp = 160.


disp_160 <- filter(mtcars, disp == 160.0)
head(disp_160)

#3 Show different operations (and, or, not)


#AND
gear4_and_carb4 <- filter(mtcars, gear == 4 & carb == 4)
head(gear4_and_carb4)
#OR

gear4_or_hp110 <- filter(mtcars, gear == 4 | hp == 110)


head(gear4_or_hp110)
#Not
gearNot4 <- filter(mtcars, gear != 4)
head(gearNot4)

OUTPUT:

3. Demonstrate different functions on dataset mtcars/Titanic


a. arrange
b. group_by
c. summarise
d. select
e. intersect
f. setdiff

CODE:
print("Arrange : ")
arrange(mtcars, desc(disp))

print("Group By : ")
group_by(mtcars,drat)

print("Summarise : ")
summarise(mtcars,mean(disp))

print("Select : ")
select(mtcars,qsec)

print("Intersect :")
A <- subset(mtcars, disp == 160)
B <- subset(mtcars, cyl == 6)   # note: use ==, not =, inside subset()
intersect(A,B)

print("SetDiff :")
setdiff(B,A)

4. Remove the columns that are not required from the mtcars dataset

Code:
DATA <- subset(mtcars,select=c(1:9))
print(DATA)

5. Show the attribute containing NA values in a column in dataset

Code:
myData <- data.frame(col1 = c(1:3, NA),
col2 = c("this", NA,"is", "text"),
col3 = c(TRUE, FALSE, TRUE, TRUE),
col4 = c(2.5, NA, 3.2, NA))
is.na(myData)

6. Repeat all the above question on downloaded dataset

Attributes containing NA VALUES

CODE:
is.na(mtcars)

ASSIGNMENT 4

1. As a crop researcher, you want to test the effect of three different fertilizer mixtures on crop yield. You can use a one-way ANOVA to find out if there is a difference in crop yields between the three groups. Using the data, perform a one-way analysis of variance with alpha = 0.05.
a. Perform a one-way analysis of variance
b. Calculate test statistics
c. Interpreting the results
d. State conclusion
e. Plot the graph for the same

The one-way analysis of variance (ANOVA), also known as one-factor ANOVA, is an extension of the independent two-sample t-test for comparing means when there are more than two groups. In one-way ANOVA, the data are organized into several groups based on a single grouping variable (also called a factor variable). One-way ANOVA is used to determine whether there are any statistically significant differences between the means of three or more independent (unrelated) groups. To check whether the data come from the same population, you can perform a one-way ANOVA. Like any other statistical test, it provides evidence on whether the null hypothesis H0 can be rejected.

Hypothesis in one-way ANOVA test:


• H0: the means of the groups are identical
• H1: the mean of at least one group is different

In other words, H0 implies that there is not enough evidence to conclude that the group (factor) means differ from one another.
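The F statistic that aov() reports can also be computed by hand from these definitions; a short sketch on the built-in PlantGrowth data (group means against the grand mean):

```r
d <- PlantGrowth                          # built-in: weight, group (3 levels)
grand <- mean(d$weight)
gm <- tapply(d$weight, d$group, mean)     # group means
n  <- tapply(d$weight, d$group, length)   # group sizes

ssb <- sum(n * (gm - grand)^2)            # between-group sum of squares
ssw <- sum((d$weight - gm[d$group])^2)    # within-group sum of squares
dfb <- nlevels(d$group) - 1
dfw <- nrow(d) - nlevels(d$group)

Fstat <- (ssb / dfb) / (ssw / dfw)        # the one-way ANOVA test statistic
pval  <- pf(Fstat, dfb, dfw, lower.tail = FALSE)
c(F = Fstat, p = pval)                    # compare p against alpha = 0.05
```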

Code:

library(readxl)
my_data <- read_excel('DMDW_LAB4.xlsx')


View(my_data)

#check and display ordered levels


my_data$group <- ordered(my_data$group, levels = c("Group1", "Group2",
"Group3"))

#compute summary statistics by group


library(dplyr)
group_by(my_data, group) %>%
summarise(
count = n(),mean = mean(values, na.rm = TRUE),
sd = sd(values, na.rm = TRUE)
)

#compute one way ANOVA


#compute analysis of variance
res.aov <- aov(values ~ group, data = my_data)
#summary of analysis
summary(res.aov)

#interpret result of ANOVA


#multiple pairwise comparison
TukeyHSD(res.aov)
#homogeneity
plot(res.aov,1)
#normality
plot(res.aov,2)

OUTPUT:

2. Repeat question 1 and perform a one-way analysis of variance using a built-in dataset in RStudio.

Code:
#build data
library(dplyr)   # needed before sample_n() below
my_data <- PlantGrowth

#check data and display ordered levels
sample_n(my_data, 10)

#show levels
levels(my_data$group)

#compute summary statistics


library(dplyr)
group_by(my_data, group) %>%
summarise(
count = n(),
mean = mean(weight, na.rm = TRUE),
sd = sd(weight, na.rm = TRUE)
)

#compute anova test


# Compute the analysis of variance
res.aov <- aov(weight ~ group, data = my_data)
# Summary of the analysis
summary(res.aov)

#Interpret the result of one-way ANOVA tests


#multiple pairwise comparison
TukeyHSD(res.aov)
#Homogeneity of variances
plot(res.aov, 1)
#Normality
plot(res.aov, 2)

OUTPUT:

ASSIGNMENT 5

1. Consider dataset “Groceries” and apply apriori algorithm on it. What are the
first 5 rules generated when the min support is 0.001 and min confidence is 0.9

Code:

library(arules)
data("Groceries")   # the Groceries transactions dataset ships with arules
rules = apriori(data = Groceries,
                parameter = list(support = 0.001, confidence = 0.9))
inspect(rules[1:5])

OUTPUT:

2. The database has four transaction. What association rule can be found in this set,
if the minimum support is 60% and minimum confidence is 80%.
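Support and confidence can be verified by hand before running apriori; a sketch on four made-up transactions (the items here are illustrative, not the assignment's actual database):

```r
# Four hypothetical transactions as item sets
tx <- list(c("milk", "bread"),
           c("milk", "bread", "butter"),
           c("milk", "butter"),
           c("milk", "bread"))

# support(X) = fraction of transactions containing every item in X
support <- function(items) mean(sapply(tx, function(t) all(items %in% t)))

# Rule {milk} -> {bread}: confidence = support(X and Y) / support(X)
supp_rule <- support(c("milk", "bread"))   # 3/4 = 0.75, passes 60% min support
conf_rule <- supp_rule / support("milk")   # 0.75 / 1.00 = 0.75, fails 80% min confidence
c(support = supp_rule, confidence = conf_rule)
```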

Code:

library(arules)
# read the four transactions as an arules transactions object (one transaction per line)
groceries2 <- read.transactions("LAB5-2.csv", sep = ",")

rules = apriori(data = groceries2, parameter = list(support = 0.6, confidence = 0.8))

rules
inspect(rules)

Output:

3. Demonstration of association rule process on dataset titanic using apriori


algorithm in rstudio.

Code:
library(arules)
library(readr)
titanic <- read_csv("titanic.csv")
# apriori expects transactions, so convert the columns to factors first
titanic <- as(data.frame(lapply(titanic, factor)), "transactions")
rules = apriori(data = titanic, parameter = list(support = 0.6, confidence = 0.8))
rules
inspect(rules[1:5])

OUTPUT:

ASSIGNMENT 6

1. Demonstrate performing linear regression on given data using R/Python.


a. Plot the scattered graph
b. Calculate test statistics
c. Find coefficient and different performance matrix

Code:
library(readxl)
dataset <- read_excel("LAB6.xlsx")
summary(dataset)
hist(dataset$X)
plot(Y~X, data=dataset)
dataset.lm <- lm(Y~X, dataset)
summary(dataset.lm)
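The coefficients and R-squared that summary() reports can be cross-checked with the closed-form formulas for simple regression; a sketch on the built-in cars data (a stand-in, since LAB6.xlsx is not reproduced here):

```r
x <- cars$speed
y <- cars$dist

b1 <- cov(x, y) / var(x)        # slope = Sxy / Sxx
b0 <- mean(y) - b1 * mean(x)    # the fitted line passes through the means
r2 <- cor(x, y)^2               # R-squared for simple linear regression

fit <- lm(dist ~ speed, data = cars)
c(intercept = b0, slope = b1, r.squared = r2)   # matches coef(fit) and summary(fit)$r.squared
```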

Output:

2. Demonstrate performing linear regression on Lung capacity dataset using


R/Python.

Code:

dataset <- read_excel("Lung Capacity.xls")


summary(dataset)
cor(dataset$Height, dataset$LungCapacity)
cor(dataset$Age, dataset$LungCapacity)
plot(LungCapacity ~ Exercise, data = dataset)   # formula interface; default plot() takes no data argument

dataset.lm <- lm(LungCapacity ~ Gender + Height + Smoker + Exercise,
                 data = dataset)
summary(dataset.lm)

Output:

ASSIGNMENT 7

1. To construct Decision tree for weather data and classify it.

Code:

library(rpart.plot)
library(rpart)
library(readr)   # read_csv()
library(dplyr)   # mutate() and %>% used below
dataset <- read_csv("austin_weather.csv")
head(dataset)
shuffle_index<-sample(1:nrow(dataset))
dataset <- dataset[shuffle_index,]
ls(dataset)
sum(is.na(dataset$Events))
dim(dataset)

sum(is.na(dataset$DewPointAvgF))
summary(dataset$TempHighF)

dataset = subset(dataset, select = -c(Date,Events,TempAvgF, DewPointAvgF,


HumidityAvgPercent,SeaLevelPressureAvgInches, VisibilityAvgMiles, WindAvgMPH ))

str(dataset)
dataset[] <- lapply(dataset, as.numeric)

dataset <- dataset %>%


mutate(TempHighF = case_when(
TempHighF < 40 ~ "<40",
TempHighF >= 40 & TempHighF < 60 ~ "40-60",
TempHighF >= 60 & TempHighF < 80 ~ "60-80",
TempHighF >= 80 & TempHighF < 100 ~ "80-100",
TempHighF >= 100 ~ ">100",
TRUE ~ "NA"
))

fit <- rpart(TempHighF~., data = dataset, method = 'class')


rpart.plot(fit, extra = 106)

Output:

2. To construct Decision tree for customer data and classify it.

Code:
library(readr)
library(rpart)
library(rpart.plot)
dataset <- read_csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")

dim(dataset)

ls(dataset)

dataset = subset(dataset, select = -c(customerID ))

fit <- rpart(Churn~., data = dataset, method = 'class')

rpart.plot(fit, extra = 106)

Output:

ASSIGNMENT 8

1. Write a procedure for clustering customer data using Simple KMeans Algorithm

 Step 1: Choose k initial cluster centres at random in the feature space.
 Step 2: Assign each observation to its nearest centre by minimising the distance to the centroid. This produces the initial groups.
 Step 3: Shift each centroid to the mean of the coordinates of its group.
 Step 4: Reassign observations using the new centroids. New boundaries are created, so observations may move from one group to another.
 Repeat until no observation changes group.
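The steps above can be sketched in a few lines of base R (toy one-dimensional data and k = 2; all names are illustrative):

```r
set.seed(1)
x <- c(rnorm(20, mean = 0), rnorm(20, mean = 5))  # toy data with two clear groups
k <- 2
centers <- sample(x, k)                           # Step 1: random initial centres

repeat {
  # Step 2: assign each observation to its nearest centre
  grp <- apply(abs(outer(x, centers, "-")), 1, which.min)
  # Step 3: move each centre to the mean of its group
  new_centers <- as.numeric(tapply(x, grp, mean))
  # Step 4: stop once no centre (and hence no assignment) changes
  if (isTRUE(all.equal(new_centers, centers))) break
  centers <- new_centers
}
centers   # should sit near the two group means
```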

2. Demonstration of clustering rule process on dataset using simple k-means.

Code:

library(readr)
dataset = read_csv("Mall_Customers.csv")
dataset = dataset[4:5]
set.seed(6)
wcss = vector()
for (i in 1:10) wcss[i] = sum(kmeans(dataset, i)$withinss)
plot(1:10,
wcss,
type = 'b',
main = paste('The Elbow Method'),
xlab = 'Number of clusters',
ylab = 'WCSS')
kmeans = kmeans(x = dataset, centers = 5)
y_kmeans = kmeans$cluster

# Visualising the clusters


library(cluster)
clusplot(dataset,
y_kmeans,
lines = 0,
shade = TRUE,
color = TRUE,
labels = 2,
plotchar = FALSE,
span = TRUE,
main = paste('Clusters of customers'),
xlab = 'Annual Income',
ylab = 'Spending Score')

Output:

ASSIGNMENT 9

1. Demonstration of classification rule process on dataset using naïve bayes


algorithm

Code:

# Installing Packages
install.packages("e1071")
install.packages("caTools")
install.packages("caret")

# Loading packages
library(e1071)
library(caTools)
library(caret)
library(dplyr)
library(readr)   # read_csv() used below

dataset = read_csv("Mall_Customers.csv")

dataset$Gender <- factor(dataset$Gender, levels = c("Male", "Female"),


labels = c(0,1))

dataset <- dataset %>%


mutate(Age = case_when(
Age < 30 ~ "<30",
Age >= 30 & Age < 45 ~ "30-45",
Age >= 45 & Age < 60 ~ "45-60",
Age >= 60 ~ ">60",
TRUE ~ "NA"
))

dataset <- dataset %>%


mutate(Income = case_when(
Income <40 ~ "<40",
Income >= 40 & Income < 60 ~ "40-60",
Income >= 60 ~ ">60",
TRUE ~ "NA"
))

dataset <- dataset %>%


mutate(Score = case_when(
Score < 20 ~ "<20",
Score >= 20 & Score < 40 ~ "20-40",
Score >= 40 & Score < 60 ~ "40-60",
Score >= 60 & Score < 80 ~ "60-80",
Score >= 80 ~ ">80",
TRUE ~ "NA"
))

trainIndex <- createDataPartition(dataset$Score, p = .7,


list = FALSE,
times = 1)

Train <- dataset[ trainIndex,]


Valid <- dataset[-trainIndex,]

# Fitting Naive Bayes Model


# to training dataset

classifier_cl <- naiveBayes(Score ~ ., data = Train)


classifier_cl

Output:

2. Demonstration of clustering rule process on dataset using EM algorithm.

Code:

install.packages("mixtools")
library(readr)
dataset = read_csv("Mall_Customers.csv")
summary(dataset$Score)
x <- dataset$Score
plot(density(x))

mem <- kmeans(x,2)$cluster


mu1 <- mean(x[mem==1])
mu2 <- mean(x[mem==2])
sigma1 <- sd(x[mem==1])
sigma2 <- sd(x[mem==2])
pi1 <- sum(mem==1)/length(mem)
pi2 <- sum(mem==2)/length(mem)
# modified sum only considers finite values
sum.finite <- function(x) {
sum(x[is.finite(x)])
}

Q <- 0
# starting value of expected value of the log likelihood
Q[2] <- sum.finite(log(pi1)+log(dnorm(x, mu1, sigma1))) +
sum.finite(log(pi2)+log(dnorm(x, mu2, sigma2)))

k <- 2

while (abs(Q[k]-Q[k-1])>=1e-6) {
# E step
comp1 <- pi1 * dnorm(x, mu1, sigma1)
comp2 <- pi2 * dnorm(x, mu2, sigma2)
comp.sum <- comp1 + comp2

p1 <- comp1/comp.sum
p2 <- comp2/comp.sum

# M step
pi1 <- sum.finite(p1) / length(x)
pi2 <- sum.finite(p2) / length(x)

mu1 <- sum.finite(p1 * x) / sum.finite(p1)


mu2 <- sum.finite(p2 * x) / sum.finite(p2)

sigma1 <- sqrt(sum.finite(p1 * (x-mu1)^2) / sum.finite(p1))


sigma2 <- sqrt(sum.finite(p2 * (x-mu2)^2) / sum.finite(p2))

p1 <- pi1
p2 <- pi2

k <- k + 1
Q[k] <- sum(log(comp.sum))
}

library(mixtools)
gm<-normalmixEM(x,k=2,lambda=c(0.9,0.1),mu=c(0.4,0.3),sigma=c(0.05,0.02))
gm$mu
gm$sigma
gm$lambda
hist(x, prob=T, breaks=32, xlim=c(range(x)[1], range(x)[2]), main='')
lines(density(x), col="green", lwd=2)
x1 <- seq(from=range(x)[1], to=range(x)[2], length.out=1000)
y <- pi1 * dnorm(x1, mean=mu1, sd=sigma1) + pi2 * dnorm(x1, mean=mu2,
sd=sigma2)
lines(x1, y, col="red", lwd=2)
legend('topright', col=c("green", 'red'), lwd=2, legend=c("kernel", "fitted"))

Output:

ASSIGNMENT 10

1. Build Data Warehouse, install and Explore WEKA.

The dataset Mall_Customers.csv was used. WEKA's Explorer visualises each column with a single click, and the data can then be clustered from the Cluster tab.



2. Perform data pre-processing tasks and Demonstrate performing association rule


mining on datasets using WEKA.

Dataset:

Min_Support : 50%

Using WEKA, we obtained the resulting association rules.
