EXPLORATORY DATA ANALYSIS AND CASE

PREDICTION ON
WINE QUALITY DATASET

ENGINEERING CLINIC PROJECT REPORT

Submitted by

1. S.MIRUTHUVIKASINI 18BCS021
2. N.ABINAYASRI 18BCS033
3. R.PRAGATHI 18BCS038
4. N.JENFERO 18BCS043
5. S.RAGHAVI 18BCS056

In partial fulfillment for the award of the degree

BACHELOR OF ENGINEERING
in
COMPUTER SCIENCE AND ENGINEERING

KUMARAGURU COLLEGE OF TECHNOLOGY

COIMBATORE-641 049
(An Autonomous Institution Affiliated to Anna University, Chennai)

December 2020

Verified by

(V. Sudha)

TABLE OF CONTENTS

ABSTRACT
1. INTRODUCTION
1.1 CONCEPTUAL STUDY OF THE PROJECT
1.2 OBJECTIVE OF THE PROJECT
1.3 SCOPE OF THE PROJECT
2. LITERATURE REVIEW
2.1 LITERATURE REVIEW OF JOURNALS
3. PROBLEM DEFINITION
4. LOADING THE DATASET
4.1 BASIC DATA EXPLORATION
4.2 DATA CLEANING
4.2.1 CHECKING FOR NULL VALUES
4.2.2 CHECKING FOR OUTLIERS
4.2.3 TREATING THE OUTLIERS
4.3 DATA VISUALIZATION
4.3.1 HISTOGRAM
4.3.2 STRIPPLOT
4.3.3 COUNTPLOT
4.4 NORMALIZATION
4.5 PREDICTION OF TARGET VARIABLE
5. CONCLUSION
6. REFERENCE LINK

ABSTRACT

Machine learning has improved dramatically in the past few years. Here we
apply machine learning techniques to predict the quality of wine from its
physicochemical data. The project uses the red wine quality dataset to
analyze the data and draw inferences from it. We use the Python libraries
pandas, numpy, matplotlib, seaborn and scikit-learn to explore and validate
the dataset. Data visualization plays a vital role because it makes data
easier for the human mind to comprehend, and therefore makes it easier to
identify trends, patterns, and outliers within large data sets. We use
visualization to draw clear inferences and to find the outliers, which we
then remove. We train a machine learning model, scikit-learn's
DecisionTreeClassifier, to predict the quality of wine from the available
physicochemical data. Training is done on 80 percent of the data and testing
on the remaining 20 percent. The result we report is the accuracy of the
prediction.

1. INTRODUCTION
1.1 CONCEPTUAL STUDY OF THE PROJECT

The red wine industry has grown rapidly in recent years as social drinking is
on the rise. Nowadays, industry players use product quality certifications to
promote their products. Certification is a time-consuming process that
requires assessment by human experts, which makes it very expensive. Another
vital factor in red wine certification and quality assessment is
physicochemical testing, which is laboratory-based and considers factors such
as acidity, pH level, sugar, and other chemical properties.

Our analysis uses the Red Wine Quality Data Set, available on Kaggle:
https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009

The wine samples were obtained from the north of Portugal to model red wine
quality based on physicochemical tests. The dataset contains a total of 12
variables, which were recorded for 1,599 observations.

Attribute Details:

Input variables (based on physicochemical tests):

1 - fixed acidity

2 - volatile acidity

3 - citric acid

4 - residual sugar

5 - chlorides

6 - free sulfur dioxide

7 - total sulfur dioxide

8 - density

9 - pH

10 - sulphates

11 - alcohol

Output variable (based on sensory data):

12 - quality (score between 0 and 10)

With the advent of machine learning techniques, it is now possible to
classify wines, to determine how important each physicochemical parameter is,
and to decide which parameters can be ignored to reduce cost. Comparing
performance across different feature sets also helps to classify the wine
more distinctly. In this report, a machine learning approach is used to train
a model on the dataset and test how well it predicts the quality of wine from
the physicochemical data.

1.2 OBJECTIVES OF THE PROJECT

The main objective of this project is to study the wine quality prediction
dataset available on Kaggle and to explore the Python libraries that support
exploratory data analysis.

To analyse the data using various data visualization techniques.

To preprocess the data by removing the NULL values and treating the missing
values.

To identify and remove the outliers using various Python techniques.

Finally, to predict the quality of wine using machine learning models.

1.3 SCOPE OF THE PROJECT

Learning the attributes of a dataset and understanding the relationships
between them.

Cleaning the data by removing the NULL values and treating outliers.

Visualizing the data to understand it more precisely, using Python libraries
such as NumPy, pandas and seaborn.

Understanding various machine learning algorithms and using them
appropriately.

Splitting the data into train and test sets and predicting the quality of
wine on the test set, including with alternative machine learning models.

2. LITERATURE REVIEW

2.1 LITERATURE REVIEW OF JOURNALS

Title of the paper: Wine Quality

Research Focus: Exploratory Data Analysis (EDA) in Python for

the analysis of Wine Quality dataset.

Student level: Undergraduate

Abstract:

Wine classification is a difficult task since taste is the least understood
of the human senses. A good wine quality prediction can be very useful in the
certification phase, since sensory analysis is currently performed by human
tasters and is clearly a subjective approach. An automatic predictive system
can be integrated into a decision support system, improving the speed and
quality of the assessment. Furthermore, a feature selection process can help
to analyze the impact of the analytical tests. If several input variables are
found to be highly relevant for predicting wine quality, and some of those
variables can be controlled in the production process, this information can
be used to improve the wine quality.

The following journal papers provide details related to our dataset.

Research paper 1:

Selection of important features and predicting wine
quality using machine learning techniques

Y Gupta - Procedia Computer Science, 2018 – Elsevier

Nowadays, industries use product quality certifications to promote their
products. Certification is a time-consuming process and requires assessment
by human experts, which makes it very expensive. This paper explores the use
of machine learning techniques such as linear regression, neural networks and
support vector machines for product quality in two ways: first, determining
the dependency of the target variable on the independent variables, and
second, predicting the value of the target variable. Linear regression is
used to determine the dependency of the target variable on the independent
variables. On the basis of the computed dependency, the variables that have a
significant impact on the dependent variable are selected. Neural networks
and support vector machines are then used to predict the values of the
dependent variable. All experiments are performed on the Red Wine and White
Wine datasets. The paper reports that better predictions can be made when
only the selected features (variables) are considered rather than all the
features.

Proposed methodology:

o Linear regression
o Neural network
o Support vector machine

Experimental results and analysis:

o Determining important features for prediction
o Predicting value of dependent variable (Quality)

Conclusion and future directions:

Interest in the wine industry has increased in recent years, which demands
growth in the industry. Companies are therefore investing in new technologies
to improve wine production and selling. Wine quality certification plays a
very important role in both processes, and it requires wine testing by human
experts. This paper explores the use of machine learning techniques in two
ways: first, how linear regression determines the important features for
prediction, and second, how neural networks and support vector machines
predict the values. The benchmark Wine dataset is used for all experiments.
This dataset has two parts: Red Wine and White Wine data. The red wine data
contains 1599 samples and the white wine data contains 4898 samples. Both
datasets consist of 12 physicochemical characteristics: one (quality) is the
dependent variable and the other 11 are predictors. The experiments show that
the value of the dependent variable can be predicted more accurately if only
the important features are considered rather than all features. In future,
larger datasets can be used for experiments and other machine learning
techniques may be explored for wine quality prediction.

Reference link:
https://www.sciencedirect.com/science/article/pii/S1877050917328053

Research paper 2:
Assessing wine quality using a decision tree

S Lee, J Park, K Kang - 2015 IEEE International Symposium on …,

2015 - ieeexplore.ieee.org

Even though wine-drinkers generally agree that wines may be ranked by

quality, wine-tasting is famously subjective. There have been many attempts
to construct a more methodical approach to the assessment of wines. We
propose a method of assessing wine quality using a decision tree, and test it
against the wine-quality dataset from the UC Irvine Machine Learning
Repository. Results are 60% in agreement with traditional assessment
techniques.

Reference link: https://ieeexplore.ieee.org/abstract/document/7302752

Research paper 3:

Prediction of Quality for Different Type of Wine based on

Different Feature Sets Using Supervised Machine Learning
Techniques

S Aich, AA Al-Absi, KL Hui… - 2019 21st International …, 2019 -

ieeexplore.ieee.org

In recent years, most industries have promoted their products based on the
quality certification the products received. The traditional way of assessing
product quality is time consuming; with the advent of machine learning
techniques, however, the process has become more efficient and takes less
time than before. In this paper we explore some machine learning techniques
to assess the quality of wine based on the attributes of wine on which
quality depends. We use the white wine and red wine quality datasets for this
research work. We use different feature selection techniques, namely genetic
algorithm (GA) based feature selection and simulated annealing (SA) based
feature selection, to check the prediction performance. We use different
performance measures, namely accuracy, sensitivity, specificity, positive
predictive value and negative predictive value, to compare different feature
sets and different supervised machine learning techniques. We use nonlinear,
linear and probabilistic classifiers. We find that feature sets obtained by
feature selection provide better prediction than using all the features. We
find accuracy ranging from 95.23% to 98.81% with different feature sets. This
analysis will help industries to assess the quality of their products in less
time and in a more efficient way.

Reference link: https://ieeexplore.ieee.org/abstract/document/8702017

Research paper 4:

The classification of wine according to their physicochemical

qualities
Y Er, A Atasoy - International Journal of Intelligent Systems and …, 2016 -
ijisae.org

The main purpose of this study is to predict wine quality based on
physicochemical data. In this study, two large separate data sets which were
taken from UC Irvine Machine Learning Repository were used. These data sets
contain 1599 instances for red wine and 4898 instances for white wine with
11 features of physicochemical data such as alcohol, chlorides, density, total
sulfur dioxide, free sulfur dioxide, residual sugar, and pH. First, the instances
were successfully classified as red wine and white wine with the accuracy of
99.5229% by using Random Forests Algorithm. Then, the following three
different data mining algorithms were used to classify the quality of both red
wine and white wine: k-nearest-neighbourhood, random forests and support
vector machines. There are 6 quality classes of red wine and 7 quality classes
of white wine. The most successful classification was obtained by using
Random Forests Algorithm. In this study, it is also observed that the use of
principal component analysis in the feature selection increases the success
rate of classification in Random Forests Algorithm.

Reference link: https://www.ijisae.org/IJISAE/article/view/914

3. PROBLEM DEFINITION

This data will allow us to create different regression models to determine
how different independent variables help predict our dependent variable,
quality. Knowing how each variable will impact the wine quality will help
producers, distributors, and businesses in the wine industry better assess
their production, distribution, and pricing strategy.

4. LOADING THE DATASET

We use pandas to load and analyse the dataset.

import pandas as pd

We use numpy for scientific computing.

import numpy as np

We use matplotlib to draw visualizations such as bar charts, pie charts,
line plots and scatter plots.

import matplotlib.pyplot as plt

We use seaborn, a data visualization library built on matplotlib that
provides a high-level interface for drawing attractive and informative
statistical graphics.

import seaborn as sns
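
The loading step itself is not shown in the report, so the following is only a minimal sketch of how it might look. It assumes the data is a comma-separated file with the 12 column names listed in the attribute details; the variable name df, the file name in the comment, and the three sample rows are illustrative assumptions, used so the example runs without the Kaggle download.

```python
import io

import pandas as pd

# Stand-in for the Kaggle file: a few illustrative rows with the dataset's
# 12 column names. With the real file, pass its path to pd.read_csv instead,
# for example pd.read_csv("winequality-red.csv") (file name assumed).
SAMPLE_CSV = """fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality
7.4,0.70,0.00,1.9,0.076,11,34,0.9978,3.51,0.56,9.4,5
7.8,0.88,0.00,2.6,0.098,25,67,0.9968,3.20,0.68,9.8,5
11.2,0.28,0.56,1.9,0.075,17,60,0.9980,3.16,0.58,9.8,6
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # the 11 inputs followed by the target, quality
```

On the full dataset described in the introduction, the shape would be expected to report 1,599 rows and 12 columns.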

4.1 BASIC DATA EXPLORATION

• Head of the dataset- The head function displays the first records of
the data set. By default, pandas shows the first 5 records.

• Tail of the dataset- The tail function displays the last records of the
data set. By default, pandas shows the last 5 records.

• Shape of the dataset- The shape attribute gives the dimensions of the
data as (rows, columns).

• Info of the dataset- info() shows summary information about the data,
including the data type and non-null count of each attribute.

• Summary of the dataset- The describe method shows how the numerical
values are distributed (count, mean, standard deviation, minimum,
quartiles and maximum).

• Columns of the dataset- The columns attribute lists the names of the
columns the dataset contains.

• Unique values of the dataset- The unique function returns the unique
values in a specific column of the dataset.

• Number of unique values- The nunique function returns the number of
unique values in each column of the dataset.
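
The exploration calls listed above can be sketched together as follows. This is an illustration, not the report's own notebook: the DataFrame name df and the small synthetic data (random, made-up values in three columns) are assumptions that keep the example self-contained.

```python
import numpy as np
import pandas as pd

# Small synthetic frame standing in for the wine data (values are made up).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "alcohol": rng.uniform(8.5, 14.0, size=20).round(1),
    "pH": rng.uniform(2.9, 3.8, size=20).round(2),
    "quality": rng.integers(3, 9, size=20),
})

print(df.head())               # first 5 rows by default
print(df.tail())               # last 5 rows by default
print(df.shape)                # (number of rows, number of columns)
df.info()                      # column dtypes and non-null counts
print(df.describe())           # count, mean, std, min, quartiles, max
print(df.columns)              # column names (an attribute, not a method)
print(df["quality"].unique())  # distinct values in one column
print(df.nunique())            # number of distinct values per column
```

Note that shape and columns are attributes and take no parentheses, while the others are method calls.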

4.2 DATA CLEANING

4.2.1 CHECKING FOR NULL VALUES

TREATING THE NULL VALUES:

Here the null values were treated by replacing them with empty spaces.
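
A minimal sketch of the null check, assuming the data sits in a pandas DataFrame named df (a name not given in the report) and using made-up values. For the treatment step, the sketch swaps in a median fill rather than the blank-space replacement described above, because a blank string would turn a numeric column into text; this substitution is an assumption for illustration and not the report's method.

```python
import numpy as np
import pandas as pd

# Made-up values with two gaps, standing in for the wine data.
df = pd.DataFrame({
    "pH": [3.51, np.nan, 3.26, 3.16],
    "alcohol": [9.4, 9.8, np.nan, 9.8],
    "quality": [5, 5, 5, 6],
})

# Count the missing values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# Swapped-in treatment (not the report's): fill each gap with the column
# median so that every column stays numeric.
filled = df.fillna(df.median(numeric_only=True))
print(filled.isnull().sum().sum())  # total number of remaining nulls
```

If rows with gaps are few, dropping them with dropna() is another common option.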

4.2.2 CHECKING FOR OUTLIERS

USING BOXPLOT:

4.2.3 TREATING THE OUTLIERS
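
The report does not state which rule was used to treat the outliers. One common choice, shown here purely as an assumption, is the interquartile range (IQR) rule that box plots conventionally use for their whiskers: values more than 1.5 times the IQR below the first quartile or above the third quartile are flagged and the rows are removed. The column values below are made up.

```python
import pandas as pd

# Made-up residual sugar values; 15.5 is an obvious outlier.
df = pd.DataFrame(
    {"residual sugar": [1.9, 2.6, 2.3, 1.9, 2.0, 2.2, 15.5, 2.1]}
)

col = "residual sugar"
q1 = df[col].quantile(0.25)
q3 = df[col].quantile(0.75)
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

is_outlier = (df[col] < lower) | (df[col] > upper)
print(df[is_outlier])  # rows flagged as outliers

# Keep only the rows inside the fences.
cleaned = df[~is_outlier].reset_index(drop=True)
print(cleaned.shape)
```

Capping the values at the fences instead of dropping rows is an alternative when losing observations is a concern.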

4.3 DATA VISUALIZATION

Data visualization is the graphical representation of information and
data. With visual elements like charts, graphs, and maps, data
visualization tools provide an accessible way to see and understand
trends, outliers, and patterns in data.

4.3.1 HISTOGRAM:

4.3.2 STRIPPLOT:

4.3.3 COUNTPLOT:

4.4 NORMALIZATION

Normalization is a scaling technique in which values are shifted and
rescaled so that they end up ranging between 0 and 1. It is also known as
Min-Max scaling.
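
As a worked illustration, Min-Max scaling maps each value x in a column to (x - min) / (max - min), where min and max are taken over that column. The sketch below applies this directly with pandas on made-up values; the DataFrame name and columns are assumptions.

```python
import pandas as pd

# Made-up values standing in for two of the wine columns.
df = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 12.0, 14.0],
    "pH": [3.51, 3.20, 3.26, 3.16, 3.40],
})

# x' = (x - min) / (max - min), applied column by column.
normalized = (df - df.min()) / (df.max() - df.min())
print(normalized.round(3))
```

Each column's smallest value becomes 0 and its largest becomes 1. A column whose values are all equal would divide by zero and needs separate handling. scikit-learn's MinMaxScaler applies the same formula and can reuse the training set's minimum and maximum on test data.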
4.5 PREDICTION OF TARGET VARIABLE
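
The abstract describes training a DecisionTreeClassifier on 80 percent of the data and testing on the remaining 20 percent. A minimal sketch of that workflow is given below under stated assumptions: the data is synthetic (quality is generated from a made-up rule on alcohol), and the feature subset, random seeds and variable names are illustrative rather than taken from the report.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: quality follows a made-up rule on alcohol.
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "alcohol": rng.uniform(8.5, 14.0, size=n),
    "volatile acidity": rng.uniform(0.1, 1.2, size=n),
    "sulphates": rng.uniform(0.3, 1.5, size=n),
})
df["quality"] = np.where(df["alcohol"] > 11.0, 6, 5)

# Separate the input features from the target variable.
X = df.drop(columns="quality")
y = df["quality"]

# 80 percent for training, 20 percent for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

Because the synthetic target is a simple threshold, the accuracy printed here says nothing about performance on the real wine data; it only demonstrates the split, fit and evaluate steps.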

5. CONCLUSION

By looking into the details, we can see that good quality wines have, on
average, higher levels of alcohol, lower volatile acidity, higher levels of
sulphates, and higher levels of residual sugar.

6. REFERENCE LINK

● https://towardsdatascience.com/predicting-wine-quality-with-several-classification-techniques-179038ea6434
● https://www.kaggle.com/scsaurabh/red-wine-quality-analysis-python
● https://www.kaggle.com/sgus1318/wine-quality-exploration-and-analysis
● https://medium.com/analytics-vidhya/wine-quality-prediction-with-python-695939d34d87
● https://medium.com/datadriveninvestor/regression-from-scratch-wine-quality-prediction-d61195cb91c8
● https://github.com/vikrantkakad/Red-Wine-Quality-Analysis
● https://dzone.com/articles/predicting-wine-quality-with-several-classificatio
● https://datauab.github.io/red_wine_quality/
