Ids Unit 4 Case Study 1 Checking Patterns in Data

Uploaded by

Harsimar Singh
IDS-Unit-4-BCA SEM-IV

Case Study-1: Checking different Patterns in data


 Patterns are everywhere and belong to every aspect of our daily lives. From the
design and color of our clothes to intelligent voice assistants, everything involves
some kind of pattern.
 When we say that everything has a pattern, some common questions come to mind:
 What is a pattern?
 How can we say that patterns are part of almost everything surrounding us?
 How can pattern recognition be implemented in the technologies that we use every day?

 The answer to all these questions lies in something most of us have been doing since
childhood. At school, we were often asked to identify the missing letter or number in a
sequence, to predict which number would come next, or to join the dots to complete a
figure. Predicting the missing number or letter meant analyzing the trend followed by
the given numbers or letters. This is, in essence, what pattern recognition in Machine
Learning means.
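
The number-sequence puzzles described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the sequence is either arithmetic (constant difference) or geometric (constant ratio); the function name next_term is hypothetical and real pattern recognition algorithms handle far more general trends.

```python
def next_term(seq):
    """Predict the next term of a simple arithmetic or geometric sequence."""
    # Differences between neighbouring terms
    diffs = [b - a for a, b in zip(seq, seq[1:])]
    if len(set(diffs)) == 1:            # constant difference -> arithmetic trend
        return seq[-1] + diffs[0]
    if all(a != 0 for a in seq):
        ratios = [b / a for a, b in zip(seq, seq[1:])]
        if len(set(ratios)) == 1:       # constant ratio -> geometric trend
            return seq[-1] * ratios[0]
    raise ValueError("no simple arithmetic or geometric pattern found")


arithmetic_next = next_term([3, 7, 11, 15])   # trend: add 4 each step
geometric_next = next_term([2, 6, 18, 54])    # trend: multiply by 3 each step
print(arithmetic_next, geometric_next)
```

The function first identifies the trend and only then predicts, which mirrors the two steps of recognizing a pattern and using it for prediction.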

 What is Pattern Recognition?


Pattern Recognition is defined as the process of identifying the trends (global or local) in
the given data. A pattern can be defined as anything that follows a trend and exhibits
some kind of regularity. Patterns can be recognized physically, mathematically, or by
the use of algorithms. In machine learning, pattern recognition refers to the use of
powerful algorithms for identifying the regularities in the given data. It is widely used
in modern technical domains such as computer vision, speech recognition, and face recognition.

 Scope of Pattern Recognition:


 Data Mining – It refers to the extraction of useful information from large amounts of
data from heterogeneous sources. The meaningful data obtained from data mining
techniques is used for making predictions and for data analysis.
 Recommender Systems – Most websites dedicated to online shopping make
use of recommender systems. These systems collect data related to each customer
purchase and make suggestions using machine learning algorithms that identify the
trends in the pattern of customer purchases.
 Image processing – Image processing is of two types: digital image
processing and analog image processing. Digital image processing uses intelligent
machine learning algorithms for enhancing the quality of images obtained from
distant sources such as satellites.
 Bioinformatics – It is a field of science that uses computational tools and software to
make predictions relating to biological data. For example, suppose someone
discovered a new protein in the lab but the sequence of the protein is not known.
Using bioinformatics tools, the unknown protein is compared with a huge number of
proteins stored in the database to predict a sequence based on similar patterns.
 Analysis – Pattern recognition is used for identifying important data trends. These
trends can be used for future predictions. Analysis is required in almost every
domain, technical or non-technical. For example, the tweets made by a person on
Twitter can be used for sentiment analysis by identifying the patterns in the posts using
natural language processing.

 Advantages of Pattern Recognition


Using pattern recognition techniques provides a large number of benefits. It helps not
only in the analysis of trends but also in making predictions.
 It helps in the identification of objects at varying distances and angles.
 It is easy to use and highly automated.
 It is not rocket science and does not require out-of-the-box thinking.
 It is highly useful in the finance industry for making valuable predictions regarding sales.
 It offers efficient solutions to real-time problems.
 It is useful in medicine and forensics, for example in forensic analysis and DNA
(Deoxyribonucleic acid) sequencing.
 Applications of Pattern Recognition
 Trend Analysis– Pattern recognition helps in identifying the trend in the given data
on which appropriate analysis can be done. For example, looking at the recent trends
in the sales made by a particular company or organization, future sales can be
predicted.
 Assistance – Patterns are an integral part of our daily lives and provide immense help
in our day-to-day activities. A large number of software applications on the market
today use machine learning algorithms to make predictions regarding the presence of
obstacles and alert the user to avoid mishaps.
 E-Commerce – Visual search engines recognize the desired item based on its
specifications and provide appropriate results. Most of the websites dedicated to
online shopping make use of recommender systems. These systems collect data
related to each customer purchase and make suggestions. All these tasks are
accomplished by analyzing previous trends to make successful predictions.
 Computer vision – The user interacts with the system by giving an image or video as
the input. The machine compares it with thousands or maybe millions of images
stored in its database to find similar patterns. The extraction of the essential features is
done by an algorithm that is mainly designed for grouping similar-looking
objects and patterns. This is termed computer vision. For example, cancer detection.
 Biometric devices – These devices provide authentication and security by making
use of face recognition and fingerprint detection technologies. Behind the scenes,
the foundation that enables technologies like face and fingerprint recognition is
machine learning algorithms.

Case Study -1: Milk Quality Prediction using Linear Regression model

 Target-Grade:
Low (Bad),
Medium (Moderate),
High (Good)
 Taste, Odor, Fat, and Turbidity are each assigned 1 if they satisfy the optimal
conditions, otherwise 0.

Importing Necessary libraries


# Data processing, Linear algebra
import pandas as pd
import numpy as np

# Data Visualization
import seaborn as sns
import matplotlib.pyplot as plt
import missingno as msno

import warnings
warnings.filterwarnings("ignore")

Read Dataset: Attach CSV file to Colab notebook.
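
The notes do not show the loading step, so the following is only a minimal sketch of how the DataFrame df used below could be created. The in-memory CSV and its three rows are made up for illustration, and the column names pH and Colour are assumptions (only Temprature, Taste, Odor, Fat, Turbidity and Grade appear in these notes). In Colab you would instead pass the path of the attached CSV file to pd.read_csv.

```python
import io
import pandas as pd

# Hypothetical stand-in for the attached file: three invented rows with
# eight columns. In Colab, replace this with pd.read_csv("<your file>.csv").
sample_csv = io.StringIO(
    "pH,Temprature,Taste,Odor,Fat,Turbidity,Colour,Grade\n"
    "6.6,35,1,0,1,0,254,high\n"
    "6.6,36,0,1,0,1,253,high\n"
    "8.5,70,1,1,1,1,246,low\n"
)

df = pd.read_csv(sample_csv)   # read_csv accepts a path or a file-like object
print(df.shape)                # (rows, columns) of the loaded table
```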


Basic Information about Dataset-
df.shape
# Output: (1059, 8)

df.head()

Checking for NaN values
pd.DataFrame(df.isnull().sum(), columns=["Null Values"]).rename_axis("Column Name")

df.info()
Statistical information
df.describe(include = "all")

EDA
# Unique values in each column
for i in df.columns:
    print(i)
    print(df[i].unique())
    print('\n')

# Frequency of each value in each column
for i in df.columns:
    print(i)
    print(df[i].value_counts())
    print('\n')

# Distribution of each column (palette dropped: it only applies when hue is set)
for i in df.columns:
    plt.figure(figsize=(15,6))
    sns.histplot(df[i], kde=True, bins=20)
    plt.xticks(rotation=90)
    plt.show()

Visualizing the relationship between features


plt.figure(figsize=(15,6))
sns.barplot(x='Odor',y='Taste',hue='Grade',data=df)
plt.show()

Correlation-
Correlation is a statistical measure that expresses the extent to which two variables are linearly
related (meaning they change together at a constant rate). It's a common tool for describing simple
relationships without making a statement about cause and effect.
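
To make the definition concrete, here is a small sketch of the Pearson correlation coefficient computed by hand and checked against NumPy. The function name pearson_r and the toy values are illustrative assumptions and are not taken from the milk dataset.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation: covariance of x and y scaled by their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviations of x from its mean
    dy = y - y.mean()   # deviations of y from its mean
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])     # y rises at a constant rate with x
r_down = pearson_r(x, [10, 8, 6, 4, 2])   # y falls at a constant rate with x
r_check = float(np.corrcoef(x, [2, 4, 6, 8, 10])[0, 1])  # NumPy's own value
print(r_up, r_down, r_check)
```

A value near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship. In neither case does it say anything about cause and effect.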

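# numeric_only=True skips the text column Grade; in pandas 2.0+ df.corr()
# otherwise raises an error on non-numeric columns
df_corr = df.corr(numeric_only=True)
df_corr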
plt.figure(figsize=(10, 8))
matrix = np.triu(df_corr)
sns.heatmap(df_corr, annot=True, linewidth=.8, mask=matrix, cmap="rocket");
plt.show()
sns.pairplot(df,hue="Grade",height=3)
plt.show()
Encoding the target variable-

from sklearn.preprocessing import LabelEncoder


le = LabelEncoder()
df['Grade'] = le.fit_transform(df['Grade'])
df['Grade'].unique()
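
One detail worth checking after this step is which integer each grade receives. The sketch below uses a made-up list of labels, assuming the Grade column holds the strings Low, Medium and High as listed in the target description. LabelEncoder assigns codes in sorted (alphabetical) order of the labels, not in order of quality.

```python
from sklearn.preprocessing import LabelEncoder

# Made-up labels for illustration; the spelling in the real column is an assumption
grades = ["Low", "Medium", "High", "Medium", "Low"]

le = LabelEncoder()
codes = le.fit_transform(grades)

# classes_ holds the sorted labels; a label's position is its integer code
mapping = {label: int(code) for code, label in enumerate(le.classes_)}
print(mapping)
print(list(codes))
```

Because the codes follow alphabetical order, High maps to 0 and Medium to 2 here, so the numbers should not be read as a quality ranking.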

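# Train-Test Data
# Note: as written, the target y is the Temprature column (dataset spelling),
# and the encoded Grade is one of the input features in X.
X = df.drop("Temprature", axis=1)
y = df["Temprature"]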

#Splitting the dataset into Training and Test-

from sklearn.model_selection import train_test_split


X_train,X_test,y_train,y_test=train_test_split(X,y,train_size = 0.80,
random_state = 41)

Model Building
#LinearRegression-
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X_train,y_train)
y_pred = lr.predict(X_test)

from sklearn.metrics import mean_squared_error, r2_score


mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error:", mse)
print('R2-Score:',r2)
Mean Squared Error: 95.6316242946285
R2-Score: 0.14763496777448915
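
The two metrics printed above can also be computed by hand, which helps in reading them. The sketch below uses made-up toy values, not the milk dataset. MSE is the average squared difference between actual and predicted values, and R2 is one minus the ratio of the residual sum of squares to the total sum of squares around the mean.

```python
import numpy as np

# Toy actual and predicted values, invented for illustration
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.0, 5.0, 8.0, 9.0])

# Mean squared error: average of the squared residuals
mse_manual = float(np.mean((y_true - y_hat) ** 2))

# R2: share of the variance in y_true that the predictions account for
ss_res = float(np.sum((y_true - y_hat) ** 2))          # residual sum of squares
ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
r2_manual = 1.0 - ss_res / ss_tot

print("Mean Squared Error:", mse_manual)
print("R2-Score:", r2_manual)
```

Read this way, the R2-Score of about 0.148 reported above means the fitted linear model accounts for only about 15% of the variance in its target on the test set.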

Github Link: https://github.com/tapanjha/Data-Science/blob/main/Case_Study_1.ipynb

Colab link: https://colab.research.google.com/drive/1rgQlEirhk0dq5hsCvY5xrOwiub3rJlnc
