ML Unit-1
Introduction to Machine Learning with Python: Introduction to Machine Learning, basic terminology,
Types of Machine Learning and Applications, Using Python for Machine Learning: Installing Python and
packages from the Python Package Index, Introduction to NumPy, SciPy, matplotlib and scikit-learn, Tiny
application of Machine Learning.
UNIT II:
Supervised Learning: Types of Supervised Learning, Supervised Machine Learning Algorithms: k-Nearest
Neighbors, Regression Models, Naive Bayes Classifiers, Decision Trees, Ensembles of Decision Trees,
Kernelized Support Vector Machines, Uncertainty Estimates from Classifiers.
UNIT III:
Building good training datasets: Dealing with missing data, Handling categorical data, partitioning a data
set into separate training and test datasets, bringing features onto the same scale, selecting meaningful
features, assessing feature importance with random forests. Compressing data via dimensionality
reduction: Unsupervised dimensionality reduction via PCA, Supervised data compression via linear
discriminant analysis (Text Book 2)
UNIT IV:
Learning best Practices for Model Evaluation and Hyperparameter tuning: streamlining workflows with
pipelines, using k-fold cross validation to assess model performance, debugging algorithms with learning
and validation curves, fine tuning machine learning models via grid search, looking at different
performance evaluation metrics. Combining different models for Ensemble learning: learning with
ensembles, combining classifiers via majority vote, bagging-building an ensemble of classifiers from
bootstrap samples, leveraging weak learners via adaptive boosting (Text Book 2)
UNIT V:
Working with Text Data (Data Visualization): Types of Data Represented as Strings, Example Application:
Sentiment Analysis of Movie Reviews, Representing Text Data as a Bag of Words, Stop Words, Rescaling
the Data with tf-idf, Investigating Model Coefficients, Approaching a Machine Learning Problem, Testing
Production Systems, Ranking, Recommender Systems and Other kinds of Learning.
UNIT-1
Machine learning algorithms build a mathematical model from sample historical data, known as
training data, and use it to make predictions or decisions without being explicitly
programmed. To develop predictive models, machine learning brings together statistics and
computer science. In machine learning, we construct or use algorithms that learn from
historical data. In general, performance improves as we provide more relevant data, although
the gains are not strictly proportional to the quantity of data.
A machine is said to learn if it can improve its performance by gaining more data.
How does Machine Learning work
A machine learning system learns from previous data, builds prediction models, and predicts
the output whenever it receives new data. The accuracy of the predicted output depends on
the amount of data: more data generally helps to build a better model that predicts the
output more accurately.
The Machine Learning algorithm's operation is depicted in the following block diagram:
We can train machine learning algorithms by providing them with a large amount of data and
letting them automatically explore the data, build models, and predict the required output.
A cost function measures how well the algorithm performs, and that performance generally
depends on the amount of data. We can save both time and money by using machine learning.
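To make the idea of a cost function concrete, here is a minimal sketch that compares two candidate models with the mean squared error (MSE), one common cost function. The toy data (hours studied and exam scores) and the guessed line are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data: hours studied (input) and exam score (output).
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.array([52.0, 58.0, 65.0, 70.0, 78.0])


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return np.mean((y_true - y_pred) ** 2)


# Model A: a hand-guessed line, score = 50 + 5 * hours.
pred_guess = 50.0 + 5.0 * hours

# Model B: the least-squares line fitted to the data.
# np.polyfit returns coefficients from the highest degree down.
slope, intercept = np.polyfit(hours, scores, deg=1)
pred_fit = intercept + slope * hours

cost_guess = mean_squared_error(scores, pred_guess)
cost_fit = mean_squared_error(scores, pred_fit)
print("cost of guessed line:", cost_guess)
print("cost of fitted line: ", cost_fit)
```

A lower cost means the model's predictions are closer to the known outputs, so the fitted line should never cost more than the guessed one on this data.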
Following are some key points which show the importance of Machine Learning:
o Rapid increase in the production of data
o Solving complex problems that are difficult for a human
o Decision making in various sectors, including finance
o Finding hidden patterns and extracting useful information from data.
Types of Machine Learning
At a broad level, machine learning can be classified into three types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1) Supervised Learning
In supervised learning, labeled sample data are provided to the machine learning system for
training, and the system then predicts the output based on what it learned from that data.
The system uses the labeled data to build a model that captures the relationship between the
inputs and their labels. After training, we test the model with sample data to see whether it
can accurately predict the output.
The objective of supervised learning is to map input data to output data. Supervised
learning relies on supervision, much as a student learns under the guidance of a teacher.
Spam filtering is an example of supervised learning.
Supervised learning can be further grouped into two categories of algorithms:
o Classification
o Regression
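The train-then-test workflow described above can be sketched with a classification example. This is only an illustration, not a prescribed method: it assumes scikit-learn is installed and uses its bundled iris dataset with a k-nearest neighbors classifier; the split ratio, `random_state`, and `n_neighbors=3` are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Labeled data: flower measurements (inputs) and species (labels).
iris = load_iris()

# Hold out 25% of the labeled samples to test the trained model.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0,
    stratify=iris.target)

# Train: the model learns the input-to-label mapping from the training set.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Test: fraction of held-out samples whose label is predicted correctly.
accuracy = knn.score(X_test, y_test)
print("test accuracy: {:.2f}".format(accuracy))
```

A regression task follows the same pattern, except that the labels are continuous values rather than classes.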
2) Unsupervised Learning
In unsupervised learning, the machine is trained on data that has not been labeled,
classified, or categorized, and the algorithm must act on that data without any
supervision. The goal of unsupervised learning is to restructure the input data into new
features or into groups of objects with similar patterns.
In unsupervised learning, we don't have a predetermined result. The machine tries to find
useful insights in a huge amount of data. It can be further classified into two
categories of algorithms:
o Clustering
o Association
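As a rough illustration of clustering, the sketch below groups unlabeled points with k-means. The synthetic data (two well-separated blobs), the choice of two clusters, and the random seeds are assumptions made only for this example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic, unlabeled data: two well-separated groups of 2-D points.
rng = np.random.RandomState(0)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([group_a, group_b])

# No labels are given; k-means groups points by their distance to centroids.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print("cluster sizes:", np.bincount(labels))
print("cluster centers:\n{}".format(kmeans.cluster_centers_))
```

The algorithm recovers the two groups from the structure of the data alone, which is what distinguishes it from the supervised case.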
3) Reinforcement Learning
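Reinforcement learning is a feedback-based learning method in which a learning agent gets a
reward for each right action and a penalty for each wrong action, and improves its
performance from this feedback. A robotic dog that automatically learns the movement of its
limbs is an example of reinforcement learning.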
Applications of Machine Learning
1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used to
identify objects, persons, places, digital images, etc. A popular use case of image
recognition and face detection is automatic friend tagging suggestion.
It is based on the Facebook project named "DeepFace," which is responsible for face
recognition and person identification in pictures.
2. Speech Recognition
While using Google, we get a "Search by voice" option. It comes under speech
recognition, which is a popular application of machine learning.
Speech recognition is the process of converting voice instructions into text. It is also
known as "speech to text" or "computer speech recognition." At present, machine
learning algorithms are widely used in speech recognition applications. Google
Assistant, Siri, Cortana, and Alexa use speech recognition technology to follow
voice instructions.
3. Traffic prediction:
If we want to visit a new place, we take the help of Google Maps, which shows us the correct
path with the shortest route and predicts the traffic conditions.
It predicts traffic conditions, such as whether traffic is clear, slow-moving, or heavily
congested, using two sources:
o Real-time location of vehicles from the Google Maps app and sensors
o Average time taken on past days at the same time of day.
Everyone who uses Google Maps helps to make the app better. It takes information
from the user and sends it back to its database to improve performance.
4. Product recommendations:
Machine learning is widely used by various e-commerce and entertainment companies, such
as Amazon and Netflix, to recommend products to the user. Whenever we search for
a product on Amazon, we start seeing advertisements for the same product
while browsing the internet in the same browser. This is because of machine learning.
Google infers the user's interests using various machine learning algorithms and
suggests products according to those interests.
Similarly, when we use Netflix, we find recommendations for series,
movies, etc., and this is also done with the help of machine learning.
5. Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars, in which
machine learning plays a significant role. Tesla, a popular car manufacturer, is working
on self-driving cars. It uses an unsupervised learning method to train the car models to
detect people and objects while driving.
6. Email spam filtering:
Machine learning is also used to filter spam email. Common spam filters include:
o Content filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters
7. Virtual personal assistants:
Virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri record our
voice instructions, send them to a server in the cloud, decode them using ML algorithms,
and act accordingly.
8. Online fraud detection:
Machine learning makes our online transactions safe and secure by detecting fraudulent
transactions. Whenever we perform an online transaction, fraud can take place in various
ways, such as fake accounts, fake IDs, and money stolen in the middle of a transaction.
To detect this, a feed-forward neural network checks whether a transaction is genuine or
fraudulent.
For each genuine transaction, the output is converted into hash values, and these
values become the input for the next round. Each genuine transaction follows a specific
pattern, which changes for a fraudulent transaction. The network detects this change and
so makes our online transactions more secure.
9. Stock market trading:
Machine learning is widely used in stock market trading. In the stock market, there is always
a risk of ups and downs in share prices, so long short-term memory (LSTM) neural networks
are used to predict stock market trends.
10. Medical diagnosis:
In medical science, machine learning is used for disease diagnosis. With this, medical
technology is growing very fast and is able to build 3D models that can predict the exact
position of lesions in the brain.
11. Automatic language translation:
Nowadays, if we visit a new place and do not know the language, it is not a
problem at all, because machine learning helps us by converting text into a language we
know. Google's GNMT (Google Neural Machine Translation) provides this feature. It is a
neural machine translation system that translates text into our familiar language, and this
is called automatic translation.
NumPy
NumPy provides an n-dimensional array object. It also provides
mathematical functions that can be used in many calculations.
import numpy as np
# Build a 2 x 3 array from a nested list.
arr = np.array([[1, 2, 3], [4, 5, 6]])
print("NumPy array:\n{}".format(arr))
Output
NumPy array:
[[1 2 3]
 [4 5 6]]
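As a short sketch of the mathematical functions mentioned above, the following applies a few standard NumPy operations to the same array. The particular functions shown are just a small, arbitrary selection.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Shape of the array: (rows, columns).
print("shape:", arr.shape)

# Element-wise functions apply to every entry without an explicit loop.
print("square roots:\n{}".format(np.sqrt(arr)))

# Reductions: sum down each column, and the mean of all entries.
column_sums = arr.sum(axis=0)
overall_mean = arr.mean()
print("column sums:", column_sums)
print("mean:", overall_mean)
```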
SciPy
SciPy is a collection of scientific computing functions. It provides advanced linear algebra
routines, mathematical function optimization, signal processing, special mathematical
functions, and statistical distributions.
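A minimal sketch of how SciPy can be used alongside NumPy: it builds a 3 x 3 identity matrix and converts it to a sparse matrix that stores only the non-zero entries. The use of `np.eye(3)` and `scipy.sparse.csr_matrix` here is an assumption made for illustration, and the exact printed form of a sparse matrix varies between SciPy versions.

```python
import numpy as np
from scipy import sparse

# Dense 3 x 3 identity matrix: ones on the diagonal, zeros elsewhere.
eye = np.eye(3)
print("NumPy array:\n{}".format(eye))

# Compressed Sparse Row (CSR) format keeps only the non-zero entries.
sparse_matrix = sparse.csr_matrix(eye)
print(sparse_matrix)
```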
Output
NumPy array:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
(0, 0) 1.0
(1, 1) 1.0
(2, 2) 1.0
matplotlib
matplotlib is a scientific plotting library used to visualize data. Visualization is an
important part of analyzing data. You can plot histograms, scatter plots, line charts,
etc.
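A minimal sketch of a line plot with matplotlib, assuming a sine curve as the data and a hypothetical output file name `sine_plot.png`. The `Agg` backend is selected only so the script also runs without a display; in an interactive session you would typically call `plt.show()` instead of saving.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; works without a display
import matplotlib.pyplot as plt
import numpy as np

# 100 evenly spaced x values between -10 and 10, and their sine.
x = np.linspace(-10, 10, 100)
y = np.sin(x)

# Draw y against x as a line with an "x" marker at each point.
fig, ax = plt.subplots()
ax.plot(x, y, marker="x")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")

# Write the figure to an image file (hypothetical file name).
fig.savefig("sine_plot.png")
```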
scikit-learn
scikit-learn is built on NumPy, SciPy, and matplotlib, and provides tools for data analysis
and data mining. It has built-in classification and clustering algorithms and some practice
datasets, such as the iris and diabetes datasets. (The Boston house prices dataset was
removed in scikit-learn 1.2.)
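A minimal sketch of loading one of the bundled practice datasets, assuming the iris dataset and scikit-learn's `load_iris` function, and printing the names of its features:

```python
from sklearn.datasets import load_iris

# Load the bundled iris dataset (150 flowers, 4 measurements each).
iris = load_iris()

# Names of the four input features.
print(iris.feature_names)
```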
Output
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
pandas
pandas is used for data analysis. It can take multi-dimensional arrays as input and
produce charts/graphs. A pandas table (DataFrame) can have columns of different data types,
and pandas can ingest data from various files and databases, such as SQL, Excel, and CSV.
import pandas as pd
# Build a one-column table (DataFrame) from a dictionary.
age = {'age': [4, 6, 8, 34, 5, 30, 41]}
dataframe = pd.DataFrame(age)
print("all age:\n{}".format(dataframe))
# Keep only the rows where age is greater than 20.
filtered = dataframe[dataframe.age > 20]
print("age above 20:\n{}".format(filtered))
Output
all age:
   age
0    4
1    6
2    8
3   34
4    5
5   30
6   41
age above 20:
   age
3   34
5   30
6   41