DEPARTMENT OF COMPUTER SCIENCE
RECORD NOTE
Record work submitted to Bharathiar University in partial fulfillment of the requirements for
the Degree of
Master of Science in Computer Science
SNMV
(Shri Nehru Maha Vidyalaya)
College of Arts and Science (Affiliated to Bharathiar University)
Shri Gambhirmal Bafna Nagar,
Malumachampatti, Coimbatore-641050.
April-2023
DEPARTMENT OF COMPUTER SCIENCE
This is to certify that this is a bona fide record of work done by ____________________
studying I Year M.Sc. Computer Science, Reg. No. ___________________
Staff In-Charge Head of the Department
Submitted for: PRACTICAL III : DATA MINING USING R
Bharathiar University Practical Examination held on
Even semester (2022-2023)
Internal Examiner External Examiner
CONTENTS
S.NO DATE PROGRAM TITLE PAGE NO STAFF SIGN
1 Implement the Apriori algorithm to extract association rules in data mining.
2 Implement the k-means clustering technique.
3 Implement any one hierarchical clustering technique.
4 Implement a classification algorithm.
5 Implement a decision tree.
6 Linear regression.
7 Data visualization
Ex No.01 APRIORI ALGORITHM
DATE:
Aim:
To implement the Apriori algorithm to extract association rules using the R tool.
Algorithm:
STEP 1: Start the process.
STEP 2: Open the R tool.
STEP 3: Install the association rules package (arules).
STEP 4: Create a notepad file with a number of transactions & save the file in txt format.
STEP 5: Set the path of the notepad document.
STEP 6: Inspect the transactions to display the items.
STEP 7: Display the image for the data set.
STEP 8: Display rules parameter specifications.
STEP 9: Stop the process.
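Steps 4-6 describe reading transactions from a text file, which the recorded session below replaces with simulated and built-in data. A minimal sketch of the file-based workflow (the file name and basket contents are illustrative):

```r
# Sketch of the txt-file workflow in steps 4-6; "transactions.txt" and its
# contents are illustrative, one comma-separated basket per line.
library(arules)
writeLines(c("bread,milk", "bread,butter,jam", "milk,butter"), "transactions.txt")
trans <- read.transactions("transactions.txt", format = "basket", sep = ",")
inspect(trans)                                   # display the transactions (step 6)
rules <- apriori(trans, parameter = list(support = 0.3, confidence = 0.5))
inspect(rules)                                   # display the mined rules
```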
PROGRAM:
>library("arules");
>library("arulesViz");
>patterns = random.patterns(nItems = 1000);
>summary(patterns);
>trans = random.transactions(nItems = 1000, nTrans = 1000, method = "agrawal", patterns = patterns);
>image(trans);
>data("AdultUCI");
>Adult = as(AdultUCI, "transactions");
>rules = apriori(Adult, parameter=list(support=0.01, confidence=0.5));
>rules;
>inspect(head(sort(rules, by="lift"),3));
>plot(rules);
>head(quality(rules));
>plot(rules, measure=c("support","lift"), shading="confidence");
>plot(rules, shading="order", control=list(main ="Two-key plot"));
OUTPUT
>sel = plot(rules, measure=c("support","lift"), shading="confidence", interactive=TRUE);
>subrules = rules[quality(rules)$confidence > 0.8];
>subrules
OUTPUT
>plot(subrules, method="matrix", measure="lift");
>plot(subrules, method="matrix", measure="lift", control=list(reorder=TRUE));
>plot(subrules, method="matrix3D", measure="lift");
>plot(subrules, method="matrix3D", measure="lift", control = list(reorder=TRUE));
>plot(subrules, method="matrix", measure=c("lift", "confidence"));
>plot(subrules, method="matrix", measure=c("lift","confidence"), control = list(reorder=TRUE));
>plot(rules, method="grouped");
>plot(rules, method="grouped", control=list(k=50));
>sel = plot(rules, method="grouped", interactive=TRUE);
>subrules2 = head(sort(rules, by="lift"), 30);
>plot(subrules2, method="graph");
>plot(subrules2, method="graph", control=list(type="items"));
>plot(subrules2, method="paracoord");
>plot(subrules2, method="paracoord", control=list(reorder=TRUE));
>oneRule = sample(rules, 1);
>inspect(oneRule);
OUTPUT
RESULT:
Thus the program is executed successfully and the output is verified.
Ex No.02
K-MEANS CLUSTERING
DATE:
Aim:
To implement the k-means clustering technique using the R tool.
Algorithm:
STEP 1: Start the process.
STEP 2: Open the R tool.
STEP 3: Copy the iris dataset and remove the Species column.
STEP 4: Run k-means with 3 clusters on the data.
STEP 5: Compare the clusters with the actual species using a table.
STEP 6: Plot the clusters and mark the cluster centers.
STEP 7: Stop the process.
PROGRAM:
> newiris <- iris
> newiris$Species <- NULL
> (kc <- kmeans(newiris, 3))
OUTPUT
K-means clustering with 3 clusters of sizes 33, 96, 21
Cluster means:
Sepal.Length Sepal.Width Petal.Length Petal.Width
1 5.175758 3.624242 1.472727 0.2727273
2 6.314583 2.895833 4.973958 1.7031250
3 4.738095 2.904762 1.790476 0.3523810
Clustering vector:
[1] 1 3 3 3 1 1 1 1 3 3 1 1 3 3 1 1 1 1 1 1 1 1 1 1 3 3 1 1 1 3 3 1 1 1 3 1 1
[38] 1 3 1 1 3 3 1 1 3 1 3 1 1 2 2 2 2 2 2 2 3 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2
[75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2
[112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[149] 2 2
Within cluster sum of squares by cluster:
[1] 6.432121 118.651875 17.669524
(between_SS / total_SS = 79.0 %)
Available components:
[1] "cluster" "centers" "totss" "withinss" "tot.withinss"
[6] "betweenss" "size" "iter" "ifault"
> table(iris$Species, kc$cluster)
OUTPUT
1 2 3
setosa 33 0 17
versicolor 0 46 4
virginica 0 50 0
> plot(newiris[c("Sepal.Length", "Sepal.Width")], col=kc$cluster)
OUTPUT
> points(kc$centers[,c("Sepal.Length", "Sepal.Width")], col=1:3, pch=8, cex=2)
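A common companion check, not part of the recorded listing, is choosing k with the elbow method: run kmeans for a range of k and look for the bend in the total within-cluster sum of squares.

```r
# Elbow-method sketch (an addition for illustration, not part of the record).
newiris <- iris
newiris$Species <- NULL
set.seed(42)                                   # reproducible random starts
wss <- sapply(1:8, function(k) kmeans(newiris, centers = k, nstart = 10)$tot.withinss)
plot(1:8, wss, type = "b",
     xlab = "Number of clusters k", ylab = "Total within-cluster SS")
```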
RESULT:
Thus the program is executed successfully and the output is verified.
Ex No.03
HIERARCHICAL CLUSTERING
DATE:
Aim:
To implement hierarchical clustering using R tool.
Algorithm:
STEP 1: Start the process.
STEP 2: Open the R tool.
STEP 3: Compute the distance matrix for the mtcars dataset and cluster it with hclust.
STEP 4: Plot the graph for hierarchical clustering.
STEP 5: Stop the process.
PROGRAM:
>hc=hclust(dist(mtcars),method="average")
>plot(hc)
OUTPUT
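The dendrogram can also be cut into a fixed number of groups with cutree; a sketch, assuming 3 groups (an illustrative choice):

```r
# Cut the dendrogram into 3 groups (3 is an illustrative choice).
hc <- hclust(dist(mtcars), method = "average")
plot(hc)                                  # draw the dendrogram
groups <- cutree(hc, k = 3)               # cluster label for each car
table(groups)                             # cluster sizes
rect.hclust(hc, k = 3, border = "red")    # outline the 3 clusters on the plot
```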
RESULT:
Thus the program is executed successfully and the output is verified.
Ex No.04 CLASSIFICATION ALGORITHM
DATE:
Aim:
To implement a classification algorithm (SVM) using the R tool.
Algorithm:
STEP 1: Start the process.
STEP 2: Open the R tool.
STEP 3: Load the e1071 and MASS libraries.
STEP 4: Select the cats dataset.
STEP 5: Fit an SVM model and display its summary.
STEP 6: Stop the process.
PROGRAM:
> library(e1071)
> local({pkg <- select.list(sort(.packages(all.available = TRUE)),graphics=TRUE)
+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
> library(MASS)
> data(cats)
> model <- svm(Sex~., data = cats)
> print(model)
OUTPUT
Call:
svm(formula = Sex ~ ., data = cats)
Parameters:
SVM-Type: C-classification
SVM-Kernel: radial
cost: 1
gamma: 0.5
Number of Support Vectors: 84
> summary(model)
OUTPUT
Call:
svm(formula = Sex ~ ., data = cats)
Parameters:
SVM-Type: C-classification
SVM-Kernel: radial
cost: 1
gamma: 0.5
Number of Support Vectors: 84
( 39 45 )
Number of Classes: 2
Levels:
F M
> plot(model, cats)
OUTPUT
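The model's fit can be checked on the training data with predict; a sketch (training accuracy only, not a proper held-out evaluation):

```r
# Check the SVM's fit on its own training data (illustrative addition).
library(e1071)
library(MASS)
data(cats)
model <- svm(Sex ~ ., data = cats)
pred <- predict(model, cats)
table(Predicted = pred, Actual = cats$Sex)   # confusion matrix on training data
mean(pred == cats$Sex)                        # training accuracy
```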
RESULT:
Thus the program is executed successfully and the output is verified.
Ex No.05
DECISION TREE
DATE:
Aim:
To implement the decision tree using R Tool.
Algorithm:
STEP 1: Start the process.
STEP 2: Open the R tool.
STEP 3: Load the rpart library.
STEP 4: Fit the tree on the kyphosis data and print the cp table.
STEP 5: Plot the decision tree.
STEP 6: Print the summary.
STEP 7: Stop the process.
PROGRAM:
> library(rpart)
> fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
> fit2 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
+ parms = list(prior = c(.65,.35), split = "information"))
>
> fit3 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
+ control = rpart.control(cp = 0.05))
> par(mfrow = c(1,2), xpd = NA)
> plot(fit)
OUTPUT
>text(fit, use.n = TRUE)
OUTPUT
> plot(fit2)
> text(fit2, use.n = TRUE)
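Step 4 of the algorithm asks to print the cp table, which the listing omits; a sketch, with a training-data prediction added for illustration:

```r
# Print the complexity-parameter table and predict on the training data.
library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
printcp(fit)                                    # cp table (algorithm step 4)
pred <- predict(fit, kyphosis, type = "class")  # predicted class per case
table(Predicted = pred, Actual = kyphosis$Kyphosis)
```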
RESULT:
Thus the program is executed successfully and the output is verified.
Ex No.06
LINEAR REGRESSION
DATE:
Aim:
To implement linear regression using the R tool.
Algorithm:
STEP 1: Start the process.
STEP 2: Open the R tool.
STEP 3: Create the x and y data vectors.
STEP 4: Print the mean and variance of the data.
STEP 5: Fit the linear model and print its summary.
STEP 6: Plot the regression diagnostics for x & y.
STEP 7: Stop the process.
PROGRAM:
> x <- c(1,2,3,4,5,6)
> y <- x^2
> print(y)
OUTPUT
[1] 1 4 9 16 25 36
> mean(y)
OUTPUT
[1] 15.16667
> var(y)
OUTPUT
[1] 178.9667
> lm_1 <- lm(y ~ x)
> "y = B0 + (B1 * x)"
OUTPUT
[1] "y = B0 + (B1 * x)"
> print(lm_1)
OUTPUT
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
-9.333 7.000
> summary(lm_1)
OUTPUT
Call:
lm(formula = y ~ x)
Residuals:
1 2 3 4 5 6
3.3333 -0.6667 -2.6667 -2.6667 -0.6667 3.3333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -9.3333 2.8441 -3.282 0.030453 *
x 7.0000 0.7303 9.585 0.000662 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.055 on 4 degrees of freedom
Multiple R-squared: 0.9583, Adjusted R-squared: 0.9478
F-statistic: 91.87 on 1 and 4 DF, p-value: 0.000662
> par(mfrow=c(2, 2))
> plot(lm_1)
OUTPUT
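The fitted line can be used for prediction with predict; since y = x^2 here, the linear fit is only an approximation.

```r
# Predict a new value from the fitted line (illustrative addition).
x <- c(1, 2, 3, 4, 5, 6)
y <- x^2
lm_1 <- lm(y ~ x)
coef(lm_1)                                   # intercept -9.333, slope 7.000 (as above)
predict(lm_1, newdata = data.frame(x = 7))   # fitted value at x = 7: about 39.67
```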
RESULT:
Thus the program is executed successfully and the output is verified.
Ex No.07
DATA VISUALIZATION
DATE:
Aim:
To implement the data visualization using R tool.
Algorithm:
STEP 1: Start the process.
STEP 2: Open the R tool.
STEP 3: Plot a line chart of the AirPassengers data.
STEP 4: Plot a bar chart and box plots for the iris data.
STEP 5: Plot simple and multivariate scatter plots.
STEP 6: Use a different color for different groups.
STEP 7: Display the pie chart of the species counts.
STEP 8: Stop the process.
PROGRAM:
Line Chart
>plot(AirPassengers,type="l")
OUTPUT
Bar Chart
>barplot(iris$Petal.Length) #Creating simple Bar Graph
OUTPUT
Box Plot
>boxplot(iris$Petal.Length~iris$Species)
OUTPUT
>data(iris)
>par(mfrow=c(2,2))
>boxplot(iris$Sepal.Length,col="red")
>boxplot(iris$Sepal.Length~iris$Species,col="red")
>boxplot(iris$Sepal.Length~iris$Species,col=heat.colors(3))
>boxplot(iris$Sepal.Length~iris$Species,col=topo.colors(3))
OUTPUT
Scatter Plot
>plot(x=iris$Petal.Length) #Simple Scatter Plot
>plot(x=iris$Petal.Length,y=iris$Species) #Multivariate Scatter Plot
Pie chart
>pie(table(iris$Species))
OUTPUT
(Plots: scatter of iris$Petal.Length against Index, iris$Petal.Length against iris$Species, and a pie chart of setosa, versicolor, and virginica counts.)
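A pie chart can also be built from explicit slice values and labels with a different colour per slice; a sketch with illustrative values:

```r
# Pie chart from explicit slices (all values and labels are illustrative).
slices <- c(10, 20, 30, 40)                  # illustrative slice values
lbls <- c("A", "B", "C", "D")                # illustrative labels
pct <- round(slices / sum(slices) * 100)     # percentage per slice
lbls <- paste0(lbls, " ", pct, "%")
pie(slices, labels = lbls, col = rainbow(length(slices)),
    main = "Pie chart")                      # one colour per slice
```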
RESULT:
Thus the program is executed successfully and the output is verified.