
BUSINESS ANALYTICS

MGN801-CA2
KAJAL (11917586)
Section – Q1959

Introduction to Machine Learning


Machine learning is a tool for building logic from data. It learns from examples and past experience: instead of writing explicit rules, we feed data to generic algorithms, which produce an informative output that can then be used for forecasting or making predictions. The main goal of machine learning is to understand the nature of the data and convert it into models that people can further understand and use.

 Types of Machine Learning

 Supervised Learning- Supervised learning is a machine learning approach where
we take input variables and an output variable from the data and run an
algorithm to learn the mapping function from input to output. The goal is to
approximate the mapping function so well that, given new input data, we can
predict the output for that data. Classification and regression are the two
kinds of models used for prediction in supervised learning. In supervised
learning all the data is labeled, and algorithms are used to predict outputs
from the input data.

 Unsupervised Learning- Unsupervised learning is the other machine learning
approach, where we take only input data and no corresponding output variables.
The goal of unsupervised learning is to model the underlying structure or
distribution of the data in order to learn more about it. Unsupervised
learning is further grouped into two kinds of models, i.e. clustering and
association. In unsupervised learning the data is unlabeled, and algorithms
are used to learn the inherent structure from the input data.

Data set information


https://www.kaggle.com/srinivas1/agricuture-crops-production-in-india#datafile%20(1).csv

The data I have used for my assignment is taken from the link above. It shows the yield of
agricultural products based on the cost of cultivation and the cost of production.
REGRESSION:

Regression is a set of statistical processes used to estimate the relationships among
variables: one dependent (output) variable and one or more input variables. In this
assignment I have taken three input variables and one output variable.

Input variables – Cost of cultivation and production


Output – yield
1) Linear Regression Model

A linear regression model is simply a regression model made up of linear terms. It can be a
single-variable or a multivariable linear model: in a single-variable model we have one input
and one output, while in a multivariable model we have more than one input and one output.
In this case we are using a multivariable linear regression model.
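The commands for this model are not reproduced in the report, so here is a minimal sketch in R. The column names Cost.of.Cultivation_1 and Cost.of.Production are taken from the KNN commands later in the report; the Yield column and the synthetic data frame standing in for the Kaggle file are assumptions for illustration only.

```r
# Hypothetical stand-in for the Kaggle data (49 rows, as in the KNN split
# later in this report); the real file would be loaded with read.csv().
set.seed(99)
T1 <- data.frame(
  Cost.of.Cultivation_1 = runif(49, 10000, 40000),
  Cost.of.Production    = runif(49, 500, 3000)
)
# Synthetic yield with noise, so the fit has a relationship to recover
T1$Yield <- 2 + 1e-4 * T1$Cost.of.Cultivation_1 +
            1e-3 * T1$Cost.of.Production + rnorm(49, sd = 0.5)

# Multivariable linear regression: yield on both cost variables
lin_model <- lm(Yield ~ Cost.of.Cultivation_1 + Cost.of.Production, data = T1)
summary(lin_model)        # coefficients, residuals, R-squared
head(predict(lin_model))  # fitted yields for the first rows
```

summary() reports one coefficient per input variable plus the intercept, together with the R-squared of the fit.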
Output:
Interpretation:

2) Decision Tree Model

Decision tree learning builds regression or classification models in the form of a tree structure.
It breaks a dataset down into smaller and smaller subsets while, at the same time, an associated
decision tree is incrementally developed. The final result is a tree with decision nodes and leaf
nodes. A decision node has two or more branches, each representing a value of the attribute
tested. A leaf node represents a decision on the numerical target. The topmost decision node,
which corresponds to the best predictor, is called the root node. Decision trees can handle both
categorical and numerical data.
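A regression tree for the same setup can be sketched with the rpart package (shipped with standard R distributions). The data frame below is the same hypothetical stand-in used in the linear regression sketch, not the Kaggle file; minsplit is lowered because the dataset is small.

```r
library(rpart)  # recursive partitioning: regression and classification trees

# Hypothetical stand-in data (column names assumed from the KNN commands)
set.seed(99)
T1 <- data.frame(
  Cost.of.Cultivation_1 = runif(49, 10000, 40000),
  Cost.of.Production    = runif(49, 500, 3000)
)
T1$Yield <- 2 + 1e-4 * T1$Cost.of.Cultivation_1 +
            1e-3 * T1$Cost.of.Production + rnorm(49, sd = 0.5)

# method = "anova" fits a regression tree (numeric target)
tree_model <- rpart(Yield ~ Cost.of.Cultivation_1 + Cost.of.Production,
                    data = T1, method = "anova",
                    control = rpart.control(minsplit = 10))
printcp(tree_model)        # complexity table: number of splits vs. error
head(predict(tree_model))  # predicted yields for the first rows
```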
Output:

Interpretation:
3) Random Forest Model

The random forest is one of the most effective machine learning models for predictive
analytics, making it an industrial workhorse for machine learning. A random forest is an
ensemble model that makes predictions by combining the decisions of many decision trees,
each trained on a random sample of the data; for regression, the trees' predictions are averaged.
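A sketch of this model using the randomForest package (not part of base R; it would need install.packages("randomForest") first). As in the previous sketches, the data frame is a hypothetical stand-in for the Kaggle file, with assumed column names.

```r
library(randomForest)  # bagged ensembles of decision trees

# Hypothetical stand-in data (column names assumed from the KNN commands)
set.seed(99)
T1 <- data.frame(
  Cost.of.Cultivation_1 = runif(49, 10000, 40000),
  Cost.of.Production    = runif(49, 500, 3000)
)
T1$Yield <- 2 + 1e-4 * T1$Cost.of.Cultivation_1 +
            1e-3 * T1$Cost.of.Production + rnorm(49, sd = 0.5)

# 500 trees, each grown on a bootstrap sample; predictions are averaged
rf_model <- randomForest(Yield ~ Cost.of.Cultivation_1 + Cost.of.Production,
                         data = T1, ntree = 500, importance = TRUE)
print(rf_model)        # includes out-of-bag "% Var explained"
importance(rf_model)   # which cost variable contributes more
head(predict(rf_model))
```

The out-of-bag error printed by print(rf_model) gives an accuracy estimate without a separate test set, since each tree is evaluated on the rows it did not see during training.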
Output

Interpretation:
Accuracy
Classification
Classification is a process related to categorization, in which we assign observations to
different classes based on their characteristics. It is a useful tool for statistical surveys.
KNN Classification
The KNN algorithm is one of the simplest classification algorithms and one of the most used
learning algorithms. KNN is a non-parametric, lazy learning algorithm. Its purpose is to use a
database in which the data points are separated into several classes to predict the class of a
new sample point. KNN stores all available cases and classifies new cases based on similarity
measures, e.g. distance functions: a new point is assigned the class that is most common among
its k nearest neighbours.
Commands:
library(dplyr)
library(class)                           # provides knn()
T1=read.csv(file.choose(),header=TRUE)   # choose the downloaded Kaggle CSV
View(T1)
str(T1)
set.seed(99)                             # reproducible (knn() breaks ties at random)
T2=T1[,c(1,3,4,5)]                       # Crop label plus the three numeric columns
head(T2)
normalize=function(x){return((x-min(x))/(max(x)-min(x)))}  # min-max scaling to [0,1]
T2.new=as.data.frame(lapply(T2[,c(2,3,4)],normalize))
head(T2.new)

T2.train=T2.new[1:35,]                   # first 35 rows for training
T2.train.target=T2[1:35,1]               # their Crop labels
T2.test=T2.new[36:49,]                   # remaining 14 rows for testing
T2.test.target=T2[36:49,1]
summary(T2.new)
T_model=knn(train = T2.train,test = T2.test,
            cl=T2.train.target,k=7)      # classify each test row by its 7 nearest neighbours
T_model
plot(T_model)
table(T2.test.target,T_model)            # confusion matrix: actual vs. predicted crop
Output

Table

Interpretation
Clustering
Cluster analysis, or clustering, is a technique for grouping objects with similar
characteristics into the same group, so that objects in a group are more similar to each
other than to those in other groups.

K Means clustering:

K-means clustering aims to partition n observations into k clusters, with each observation
belonging to the cluster with the nearest mean (centroid). It is a form of unsupervised
learning, used when we have unlabeled data.
Commands
T3=T2
T3
T3$Crop=NULL                             # drop the label so clustering is unsupervised
head(T3)
results.T=kmeans(T3,3)                   # partition into k = 3 clusters
results.T
results.T$size                           # observations per cluster
results.T$cluster                        # cluster assignment of each row
table(results.T$cluster)
plot(T3$Cost.of.Cultivation_1~T3$Cost.of.Production,col=results.T$cluster)
library(ggplot2)
ggplot(T3,aes(x=Cost.of.Cultivation_1,y=Cost.of.Production))+
  geom_point(aes(col=factor(results.T$cluster)))  # factor() gives discrete colours
Output

Interpretation
Hierarchical Clustering:

Hierarchical clustering, also known as hierarchical cluster analysis, is an algorithm that
groups similar objects into clusters. The endpoint is a set of clusters, where each cluster is
distinct from the others, and the objects within each cluster are broadly similar to each
other.
Commands
#### Method complete ####
d=dist(T2[,2:4])                         # Euclidean distances on the numeric columns
clust1=hclust(d,method = "complete")
plot(clust1,cex=0.3)                     # dendrogram
cutting1=cutree(clust1,3)                # cut the tree into 3 clusters
plot(cutting1)
table(T2$Crop,cutting1)                  # crops vs. cluster membership
rect.hclust(clust1,k=3,border = c("red","blue","green"))  # k matches the cut above

#### Method ward.D ####
clust2=hclust(d,method = "ward.D")
plot(clust2,cex=0.3)
cutting2=cutree(clust2,3)
plot(cutting2)
table(T2$Crop,cutting2)
rect.hclust(clust2,k=3,border = c("red","blue","green"))

#### Method average ####
clust3=hclust(d,method = "average")
plot(clust3,cex=0.3)
cutting3=cutree(clust3,3)
plot(cutting3)
table(T2$Crop,cutting3)
rect.hclust(clust3,k=3,border = c("red","blue","green"))
Output
Method complete
Method ward.D
Method Average
