Machine Learning With Real Life Project: by - Rishabh Gaur
Table Of Contents
• Certificates of Course
• Company Overview
• Introduction To Machine Learning
• Need For ML
• Pre-Requisite For ML
• Methods Of ML
• ML Algorithms
• Step-Wise Procedure For ML
• Project Overview
• Future Work
• Applications
• References
Company Overview
• Ciperschools is a training institute in Chandigarh.
• It is an online learning platform where a student can get guidance on various technologies such as –
• Machine Learning
• Data Science
• MERN Stack Development
• Web Development
• About Mentor – Kanav Bansal
• Kanav Bansal has more than four years of teaching experience and strong knowledge of
various technologies that are popular today.
• Technologies – Python, Data Science, R Programming, Machine Learning, Deep Learning.
Introduction To Machine Learning
• Machine Learning is a type of artificial intelligence that extracts patterns from
raw data by using an algorithm or method.
• The main focus of ML is to enable machines to learn from experience
without being explicitly programmed and without human intervention.
Need For Machine Learning
• Let us understand this with a very basic example.
• Given a dataset of Height, Weight and Gender.
Height   Weight   Gender
160      55       Female
175      74       Male
180      82       Male
168      63       Female
168      72       Male
• Observations –
1. The average hours-per-week is 40.42.
2. There is a huge difference between the mean and median of the capital-loss and capital-gain columns, which indicates
that these columns contain many outliers.
3. The IQR of the hours-per-week column is very small, which means that most values in this column vary little.
4. Since the values of fnlwgt are not meaningful for this analysis, we can remove this column from the data set.
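The reasoning behind observations 2 and 3 can be illustrated with a minimal sketch. The numbers below are made up for illustration and are not taken from the project's data set; only Python's standard `statistics` module is used.

```python
import statistics

# Hypothetical capital-gain-style values: mostly zero with a few large outliers.
capital_gain = [0, 0, 0, 0, 0, 0, 0, 0, 5000, 99999]
# Hypothetical hours-per-week values clustered around 40.
hours_per_week = [35, 38, 40, 40, 40, 40, 40, 42, 45, 50]


def summarize(values):
    """Return the mean, median and interquartile range (IQR) of a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }


gain_summary = summarize(capital_gain)
hours_summary = summarize(hours_per_week)

# A mean far above the median suggests a few very large values (outliers).
print("capital_gain:", gain_summary)
# A small IQR means the middle 50% of the values lie close together.
print("hours_per_week:", hours_summary)
```

On a pandas DataFrame the same figures are usually read from `DataFrame.describe()`.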
• Univariate Analysis – It describes a single column of the data set with the help
of graphs or plots. We can use the Matplotlib and seaborn libraries of Python to perform univariate
analysis. The plots used are –
1. Boxplot 2. Histogram 3. Countplot
• Bivariate Analysis – Here we compare two columns of the data set with the help of
plots. Bivariate analysis can be done for two numerical columns as well as for a numerical and a categorical column.
The plots that are used are –
• For numerical and
categorical columns
1. Strip Plot
2. Box Plot
3. Bar Plot
4. Line Plot
• Step 3 – Data Preparation
• Machine learning models work on numerical input, so before giving data to the model we have to
convert every column into numeric form.
• For this purpose we use the pandas get_dummies() function to one-hot encode the categorical columns (producing 0/1
indicator columns) and the scikit-learn StandardScaler class to standardize the numerical columns.
• After transformation data looks like this
• Now the only step left is to split the data into train and test.
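Step 3 can be sketched as follows. The tiny DataFrame, its column names and the `income_high` target are hypothetical; the sketch assumes pandas and scikit-learn are installed and follows the order described above (transform, then split).

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "age": [25, 38, 52, 46, 31, 29, 60, 43],
    "hours_per_week": [40, 50, 40, 45, 35, 40, 20, 60],
    "workclass": ["Private", "Self-emp", "Private", "Gov",
                  "Private", "Gov", "Self-emp", "Private"],
    "income_high": [0, 1, 1, 1, 0, 0, 0, 1],
})

numeric_cols = ["age", "hours_per_week"]
categorical_cols = ["workclass"]

# One-hot encode categorical columns into 0/1 indicator columns.
encoded = pd.get_dummies(df[categorical_cols], dtype=int)

# Standardize numerical columns to zero mean and unit variance.
scaled = pd.DataFrame(
    StandardScaler().fit_transform(df[numeric_cols]),
    columns=numeric_cols,
    index=df.index,
)

X = pd.concat([scaled, encoded], axis=1)
y = df["income_high"]

# Hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
print(X.head())
```

A common alternative is to split first and fit the scaler on the training rows only, which keeps test-set statistics out of the preprocessing.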
• Step 4 – Modeling
• Since the target variable is discrete, this is a classification problem, which can be solved by –
1. Logistic Regression
2. SVM (Support Vector Machine)
3. KNN (K-Nearest Neighbors)
• After applying the above three algorithms, the final classification reports are
• Logistic Regression -
• SVM -
• KNN -
• Conclusion –
1. We can conclude that `SVM` has the highest `Accuracy Score`, i.e. 85.2%.
2. The `Logistic Regression` accuracy score is very close to that of `SVM`, i.e. 84.8%.
3. For `KNeighborsClassifier` the accuracy score is 82.3%.
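The comparison in Step 4 can be sketched as follows. This is not the project's code: it uses synthetic data from scikit-learn's `make_classification` and default-style hyperparameters, so the scores it prints will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic binary classification data standing in for the prepared data set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# The three classifiers named in Step 4.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)            # learn from the training split
    predictions = model.predict(X_test)    # predict on unseen rows
    scores[name] = accuracy_score(y_test, predictions)
    print(name)
    print(classification_report(y_test, predictions))

# Name of the model with the highest test accuracy on this synthetic data.
best_model = max(scores, key=scores.get)
print("Best:", best_model, round(scores[best_model], 3))
```

Accuracy is only one column of the classification report; precision and recall per class are worth checking when the classes are imbalanced.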
Applications
• Image Recognition
• Speech Recognition
• Traffic prediction
• Product recommendations
• Self-driving cars
• Email Spam and Malware Filtering
• Virtual Personal Assistant
• Online Fraud Detection
• Stock Market trading
• Medical
• Automatic Language Translation
Future Work
• Working on different data sets.
• Starting with Deep Learning.
-- Deep learning, which is built on neural networks, can give more accurate
predictions than classical ML on large and complex data sets. However, deep learning
generally requires more data and more computational power than classical machine learning.
• Natural Language Processing – It is concerned with the interaction between
computers and human language, for example speech recognition.
References
• For data sets: https://www.kaggle.com/datasets/
• For concepts and courses:
https://www.coursera.org/learn/machine-learning
https://www.geeksforgeeks.org/machine-learning/
https://www.simplilearn.com/big-data-and-analytics/machine-learning-certification-training-course
• For projects:
https://www.kdnuggets.com/2020/03/20-machine-learning-datasets-project-ideas.html