Machine Learning From Scratch
AI Sciences Publishing
How to contact us
Please address comments and questions concerning this book
to our customer service by email at:
[email protected]
Preface
Introduction
What is Machine Learning & Why Big Companies are Investing
Real-World Examples You Might Not Be Aware Of
Why It's Only Becoming Popular Recently
Do You Really Need to Know Statistics & Python?
Clustering
How to Reveal the Clusters
What Can We Do with Those Clusters?
ISBN-13: 978-1643160733
You cannot amend, distribute, sell, use, quote, or paraphrase any part of the content within this book without the consent of the author.
Disclaimer Notice:
Our books have had phenomenal success and are today among the best sellers on Amazon. They have helped many people to progress and, above all, to understand techniques that are sometimes considered, rightly or wrongly, to be complicated.
The books we produce are short and pleasant to read. They focus on the essentials so that beginners can quickly understand and practice effectively. You will never regret having chosen one of our books.
We also offer completely free books on our website: visit our site and subscribe to our email list at www.aisciences.net
Subscribers also receive all of our new books for free on an ongoing basis.
To Contact Us:
Website: www.aisciences.net
Email: [email protected]
Follow us on social media and share our publications
Facebook: @aisciencesllc
LinkedIn: AI Sciences
From AI Sciences Publishing
WWW.AISCIENCES.NET
EBooks, free eBook offers, and online learning courses.
Did you know that AI Sciences offers a free eBook version of every book published? Please subscribe to our email list to be notified about our free eBook promotions. Get in touch with us at [email protected] for more details.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Load the data and show summary statistics (replace the file name with your own)
dataset = pd.read_csv('your_data_file.csv')
dataset.describe()
We'll take it slow and explain the intuition behind the code and the algorithms.
After reading this book, you'll be in a better position to form informed opinions about the opportunities (and threats) machine learning and artificial intelligence will bring. You'll also be able to apply basic machine learning algorithms to your professional or personal projects.
What is Machine Learning & Why Big Companies
are investing
#include <stdio.h>
int main(void)
{
printf("Hello world!\n");
return 0;
}
print("Hello world!")
Install Anaconda
import pandas as pd

# Load your data and plot a histogram of one feature
dataset = pd.read_csv('your_file.csv')
dataset.hist('sample_feature')
The main reasons for using libraries are speed and convenience. It's like using "off-the-shelf" techniques and tools so you can focus on the machine learning itself (instead of bothering with hundreds of lines of code). They also allow professionals to get started quickly without diving too deep into mathematics and programming.
With all that speed and convenience, though, it's still up to you how to use them effectively (or whether to use them at all). New tools and technologies will also keep appearing, each claiming to make machine learning twice as fast and easy. That's why in the next chapter we'll briefly discuss how to initially approach machine learning problems no matter which tools you use. We'll use the libraries mentioned in this chapter to better illustrate the thought process.
Steps to Solving Machine Learning
Problems
Many beginners make the mistake of immersing themselves early in scikit-learn, pandas, numpy, and TensorFlow without first learning how to approach common machine learning problems.
That's why in this short chapter we'll discuss how to approach such problems effectively, which may save you a lot of time in the long run (especially if you plan to be a machine learning engineer).
This is the first step. Before we waste any resources and use
sophisticated tools, what’s our goal and what’s the problem
we’re trying to solve?
In most cases, not having a clear goal wastes a lot of time and resources. That's why, at the start, you should set a goal and define the problem. This gives you a direction and lets you know when to stop. It also sets clear expectations within the team.
import pandas as pd

# Load the data and inspect summary statistics
dataset = pd.read_csv('your_file.csv')
dataset.describe()
At the very top you'll see the count, which shows the number of instances for each feature or attribute. Any inconsistency in the counts reveals that there's missing data (or data in the wrong format).
The output of dataset.describe() will also show you whether the data really makes sense. You can look at the mean and see if it falls within reasonable values. You can also look at the standard deviation and get a quick idea of the variability within the data (a higher standard deviation means the data points are more dispersed).
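As a quick complement to comparing the counts, here is a minimal sketch (not from the book) that counts missing values per column directly; 'your_file.csv' is just a placeholder file name:

import pandas as pd

# Placeholder file name; use your own dataset
dataset = pd.read_csv('your_file.csv')

# Number of missing values in each column (0 everywhere means no gaps)
print(dataset.isnull().sum())

# Compare the per-column counts against the total number of rows
print(dataset.count())
print(len(dataset))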
Source: https://storage.googleapis.com/mledu-datasets/california_housing_train.csv
import pandas as pd
import matplotlib.pyplot as plt

dataset = pd.read_csv('california_housing_train.csv')
X = dataset[['total_rooms']].values
y = dataset[['median_house_value']].values
plt.scatter(X, y)
plt.show()
This gives us a quick idea about the relationship of the variable
(total_rooms) and the output (median_house_value).
Perhaps you expected something a lot more sophisticated than
this. However, this seemingly “simple” Regression example
forms the foundation of many advanced Predictive Analytics
procedures.
First, we have to determine the relationship between a variable (or several features) and an output. Then we visualize the data using a scatter plot to get a better idea of that relationship. Whether it's a small dataset or a file with billions of data points (e.g. social media), the general steps are the same.
In the next chapter, we’ll take it a step further. This will require
more lines of code (that seem more abstract and reserved for
experienced programmers). As mentioned early in this book,
the goal is to introduce you to the most common things that
machine learning developers and data scientists do. This is to
give you a solid foundation if you’ll strive to be an expert.
Here’s Where Real Machine Learning
Starts
Once you've got the fundamentals really right, it's time to apply machine learning according to its true meaning.
The input feature (X) is divided into a training set and a test set. The same goes for the output (y). The test_size is 1/3, which means 1/3 of the data is for the test set and the rest (2/3) is for the training set. You can set the random_state to any value you like (the purpose of a fixed random_state is to make the "random" split reproducible).
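Here is a minimal sketch of that split, assuming X and y have already been loaded as in the earlier snippet:

from sklearn.model_selection import train_test_split

# 1/3 of the rows go to the test set; random_state fixes the shuffle so runs are reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=0)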
Once the train and test sets are ready, it’s now time to find the
best “linear fit.” In the next section we’ll discuss what this
exactly means.
y = mx + b
Where y is the output, m is the slope, x is the value of an input
feature, and b is the bias (the y-intercept).
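As a rough illustration (all numbers below are made up), a fitted scikit-learn LinearRegression exposes m and b as the coefficient and intercept:

import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data that follows y = 3x + 5 exactly
X_demo = np.array([[1], [2], [3], [4]])
y_demo = np.array([[8], [11], [14], [17]])

regressor_demo = LinearRegression().fit(X_demo, y_demo)
m = regressor_demo.coef_[0][0]       # slope (here 3)
b = regressor_demo.intercept_[0]     # bias / y-intercept (here 5)
print(m, b, m * 5 + b)               # a prediction for x = 5 via y = mx + b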
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
On the other hand, applying simple linear regression in scikit-learn may require the following code:
from sklearn import linear_model

regressor = linear_model.LinearRegression()
regressor.fit(X_train, y_train)

# This results in a linear fit. We can then make a prediction on the test set:
y_pred = regressor.predict(X_test)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

dataset = pd.read_csv('california_housing_train.csv')
dataset.describe()
dataset.hist('total_rooms')

X = dataset[['total_rooms']].values
y = dataset[['median_house_value']].values
plt.scatter(X, y)
plt.show()

# Split into training and test sets (1/3 for testing, as discussed above)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=0)

regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# Plot the linear fit on the training set
plt.scatter(X_train, y_train)
plt.plot(X_train, regressor.predict(X_train), color='red')
plt.xlabel('total_rooms')
plt.ylabel('median_house_value')
plt.show()

# Apply that linear fit to the test set
plt.scatter(X_test, y_test)
plt.plot(X_train, regressor.predict(X_train), color='red')
plt.xlabel('total_rooms')
plt.ylabel('median_house_value')
plt.show()
Notice that we're applying the same line we got from the training set to the test set. After all, we're building the best-fitting line on the training set and seeing whether it also works well on the test set.
Also notice that this is far from being the best example (it's only for illustration purposes). No matter the specifics, the goal in linear regression is to find the best-fitting line to approximate the relationship between an input and an output, and to use that line to predict values for "unseen" data (e.g. the test set and new incoming data).
What If Regression Doesn’t Apply?
● Sepal length in cm
● Sepal width in cm
● Petal length in cm
● Petal width in cm
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report

# File name assumed; this is the Iris dataset listed in the Resources section
dataset = pd.read_csv('iris.csv')
dataset.hist()

X = dataset[['sepal-length', 'sepal-width']].values
y = dataset[['class']].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=0)
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train.ravel())
y_pred = classifier.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
print(classification_report(y_test, y_pred))
Logistic Regression
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
import numpy as np
import pandas as pd
Next, we generate random data (this is only for illustration purposes). Generating random data like this is very useful for testing and for practicing Python and data science; one way to do it is sketched below.
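The book's exact generating code isn't shown here, so the following is a minimal sketch with made-up values (column 0 as income, column 1 as a purchase score, matching the plotting code that follows):

import numpy as np
import pandas as pd

np.random.seed(42)  # assumed seed, only so the example is reproducible

# Three loose customer groups (all values are made up for illustration)
income = np.concatenate([np.random.normal(30000, 6000, 70),
                         np.random.normal(60000, 6000, 70),
                         np.random.normal(90000, 6000, 60)])
purchase_score = np.concatenate([np.random.normal(30, 8, 70),
                                 np.random.normal(70, 8, 70),
                                 np.random.normal(45, 8, 60)])

# Store them as columns 0 and 1, matching dataset[0] and dataset[1] below
dataset = pd.DataFrame({0: income, 1: purchase_score})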
X_vis = dataset[0].values
y_vis = dataset[1].values
plt.xlabel('Income')
plt.ylabel('Purchase Score')
plt.scatter(X_vis, y_vis)
plt.show()
At first, it can be hard to see the Clusters because it all seems
random. However, given enough data points, some points tend
to aggregate and naturally form groups. One way to better see
those groups is by doing Clustering.
In marketing (and in life in general), we can’t please everybody
no matter how hard we try. That’s why we have to somehow
create different campaigns for different customer groups. This
way, the marketing message will better appeal to a certain
group.
First, we have to create groups for those customers. How many groups should we create? Are the groups we form meaningful? This is where our objectives and judgment come in. Too many groups and our resulting marketing strategy won't be targeted enough; too few groups and our strategy becomes overly generalized.
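The setup code for the clustering model isn't shown here, so the lines below are a minimal sketch (three clusters and the random_state are assumptions) that defines the kmeans object and the feature matrix X used in the next line:

from sklearn.cluster import KMeans

# Use the two columns (income and purchase score) as the features to cluster on
X = dataset[[0, 1]].values

# Three clusters is an assumption for this illustration
kmeans = KMeans(n_clusters = 3, random_state = 0)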
y_kmeans = kmeans.fit_predict(X)
Finally, we can again plot the points. This time though the
Clusters are more visible because of the color differences.
plt.scatter(X[y_kmeans == 0, 0], X[y_kmeans == 0, 1], s = 100, c = 'green', label = 'Cluster 1')
plt.scatter(X[y_kmeans == 1, 0], X[y_kmeans == 1, 1], s = 100, c = 'red', label = 'Cluster 2')
plt.scatter(X[y_kmeans == 2, 0], X[y_kmeans == 2, 1], s = 100, c = 'blue', label = 'Cluster 3')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 200, c = 'yellow', label = 'Centroids')
plt.xlabel('Income')
plt.ylabel('Purchase Score')
plt.legend()
plt.show()
The Yellow points are the Centroids and the Green, Red, and
Blue Points are in their respective clusters. Notice that the
Centroids seem to be at the “center” of each cluster
(minimizing the sum of squares of the distances).
The number of clusters may seem arbitrary. We could set it to 2, 4, 5, or any other number; the choice actually depends on our goals and preferences. There are mathematical, commonly recommended ways to determine a suitable number of clusters (one of them, the elbow method, is sketched below). Nevertheless, it's always up to us whether the revealed clusters suit our goals.
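As an aside (this is not from the book's code), the elbow method simply fits K-Means for several candidate cluster counts and plots the within-cluster sum of squares; the "elbow" in the curve suggests a reasonable number of clusters. It reuses the X defined above:

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

wcss = []
for k in range(1, 11):
    kmeans_k = KMeans(n_clusters = k, random_state = 0)
    kmeans_k.fit(X)
    wcss.append(kmeans_k.inertia_)   # within-cluster sum of squares

plt.plot(range(1, 11), wcss)
plt.xlabel('Number of clusters')
plt.ylabel('Within-cluster sum of squares')
plt.show()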
More data points may allow our model to “learn” more. It’s
similar to how we personally learn. If we’ve seen many
examples, our view and learning may become more accurate
and realistic. But if we’re only exposed to very few examples,
the tendency is to become biased or we’re getting incomplete
information.
That’s why companies are busy gathering more data from users
and other sources. More data could mean better prediction
accuracy (e.g. better product recommendations, customized
offers, fewer diagnostic errors).
However, gathering more data may not be worth it in some
cases. Often there are financial and time constraints in many
machine learning projects. Further data collection could be
expensive or by the time it’s been accomplished, a competitor
might have already come up with a good enough solution.
import pandas as pd
To simplify the task, we can divide the quality scores into just 2 categories. Wines with quality scores equal to or higher than 7 will be considered Good (1), while the rest will be considered Bad (0).
# File name assumed; this is the red wine file from the UCI Wine Quality dataset
# (note that the UCI file uses ';' as its separator)
dataset = pd.read_csv('winequality-red.csv', sep=';')

def isGood(quality):
    if quality >= 7:
        return 1
    else:
        return 0

X = dataset.iloc[:, 0:11].values
y = dataset['quality'].apply(isGood)
y.hist()
# Feature scaling, then a Decision Tree classifier (imports and train/test split abridged)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
classifier = DecisionTreeClassifier(random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)
Result is 0.90625
The accuracy score is around 90.6% when we use the Decision Tree Classifier. Let's then use the Random Forest Classifier and compare the results. Most of the lines of code are just a copy of the code for the Decision Tree.
# Random Forest Classifier
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# File name and separator assumed (UCI red wine quality data)
dataset = pd.read_csv('winequality-red.csv', sep=';')
dataset.describe()

def isGood(quality):
    if quality >= 7:
        return 1
    else:
        return 0

X = dataset.iloc[:, 0:11].values
y = dataset['quality'].apply(isGood)
y.hist()

# test_size of 0.2 is assumed here
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# n_estimators is assumed (the older scikit-learn default of 10 trees)
classifier = RandomForestClassifier(n_estimators = 10, random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)
Result is 0.928125
Which factors are the most important for success in any field? Hard work is surely one. What about special morning routines? Do those really contribute to a person's success?
There are many other factors that people say are crucial to success, such as diet, where a person was born, whether they had a summer job, whether they learned to be independent while young, and so on.
But let's face it, only a few of those factors actually contribute to the positive results. It's similar to the 80/20 principle popular in business and management: roughly 20% of the input is responsible for 80% of the output (not exactly, but you get the point).
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# File name and separator assumed (UCI red wine quality data)
dataset = pd.read_csv('winequality-red.csv', sep=';')
dataset.head(5)
dataset.describe()

def isGood(quality):
    if quality >= 7:
        return 1
    else:
        return 0

X = dataset.iloc[:, 0:11].values
y = dataset['quality'].apply(isGood)
y.hist()

# test_size of 0.2 is assumed here
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
Now here’s where it becomes different. We apply Principal
Component Analysis (PCA) and “reduce” the 11 features into
just 2 components (n_components = 2).
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

pca = PCA(n_components = 2)
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)
explained_variance = pca.explained_variance_ratio_

# n_estimators is assumed (10 trees)
classifier = RandomForestClassifier(n_estimators = 10, random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)
# (Abridged: the meshgrid/contour setup that defines X1, X2, X_set, and y_set is not shown here.)
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1], label = j)
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.legend()
plt.show()
Notice that in the visualization above there are only PC1 and PC2 (just 2 components). Our Random Forest Classifier used only these 2 "features" to classify the wines into 0 (bad, ratings below 7) and 1 (good, ratings equal to or higher than 7).
# (Abridged: the same plotting setup as above, applied to the test set.)
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1], label = j)
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.legend()
plt.show()
As with earlier examples, we applied the same classifier learned from the training set to the test set to see if it performs well enough.
Using Principal Component Analysis and other dimensionality reduction techniques only adds a few lines of code to our analysis. The main benefit of PCA and similar techniques is that they let us include information from all the features in our model, visualization, and analysis.
Instead of choosing just 1 or 2 input features based on instinct or domain knowledge, with PCA we create 2 "new" features that try to summarize all of the original features. This may or may not affect the accuracy of our model. Nevertheless, the good thing is that less information is lost (in contrast to ignoring all the other features).
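To see how much of the original information the 2 components retain, we can inspect the explained_variance value already computed above (the printout here is just a sketch):

# Fraction of the total variance captured by each of the 2 components
print(explained_variance)

# Their sum is the share of the original variance the 2 components keep
print(explained_variance.sum())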
Beware of overfitting
Overfitting happens when the model fits the available data points too closely. Here's a good example:
Instead of a straight line that misses most of the data points, the model literally follows the points so it can include them all. The result is a model that looks complicated, with a lot of twists and turns.
The accuracy score on the training set (or the currently available data) might be excellent (close to 100%). However, when new data comes in, the model will fail spectacularly. In addition, the model is needlessly complicated: its equation is likely a high-degree polynomial (e.g. y = x^2 - 2x^3 + 5x^5) instead of the simple y = mx + b.
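As a small illustration (not from the book), fitting a high-degree polynomial to a handful of noisy points shows this behavior; all the data here is made up:

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
x = np.linspace(0, 1, 10)
y = 2 * x + np.random.normal(0, 0.2, 10)   # roughly linear data with noise

line_fit = np.polyfit(x, y, 1)    # simple y = mx + b
wiggly_fit = np.polyfit(x, y, 9)  # degree-9 polynomial that chases every point

grid = np.linspace(0, 1, 200)
plt.scatter(x, y)
plt.plot(grid, np.polyval(line_fit, grid), label='degree 1')
plt.plot(grid, np.polyval(wiggly_fit, grid), label='degree 9 (overfit)')
plt.legend()
plt.show()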
How do we prevent overfitting and still maintain a good accuracy score? One way is wise feature selection. Using our domain knowledge (or consulting an expert in the subject area), we can select the 1 or 2 features that may have the biggest impact on the output. For example, the weather on a particular day might not have anything to do with the result of a basketball game (but other data, such as home-court advantage, might).
As hinted earlier, though, feature selection may also mean loss of information, and therefore a tradeoff when avoiding overfitting. That's why experts may instead use dimensionality reduction techniques so as not to ignore features that might have an actual impact on the output.
Another way to prevent or address overfitting is to get more data (more examples for our model to train on). It's somewhat similar to the Law of Large Numbers: as the number of experiments increases, the observed ratio of outcomes converges toward the expected one. For instance, out of 10 coin tosses you might get 8 tails (a ratio of 0.8). But if you toss the coin a million times, the result you get will likely be close to the expected value of 0.5.
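A quick simulation (made up for illustration) shows this convergence:

import numpy as np

np.random.seed(1)
tosses = np.random.randint(0, 2, size=1_000_000)   # 0 = heads, 1 = tails

print(tosses[:10].mean())   # ratio of tails in the first 10 tosses; can be far from 0.5
print(tosses.mean())        # ratio over a million tosses; very close to 0.5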
With more training data you might get a model that generalizes
well and may perform well enough with unseen data. However,
getting more data may be more expensive and time-
consuming.
There’s a way though to train on more data (while still using
the same dataset). It’s by applying Cross-Validation
(specifically k-fold cross validation). Here’s a simple example
of how it works.
You divide the dataset into 10 subsets (Subset 1, Subset 2, Subset 3, and so on). You use Subsets 2 to 10 as your training data (while Subset 1 is your test set). You then repeat the procedure, but this time Subset 2 is your test set while the rest is your training data. You continue until the "rotation" is complete; a sketch of this rotation with scikit-learn follows.
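Here is a minimal sketch of that rotation using scikit-learn's KFold directly (the classification example later reports the mean cross-validated accuracy instead of printing each split):

import numpy as np
from sklearn.model_selection import KFold

data = np.arange(20)        # a stand-in dataset of 20 rows
kf = KFold(n_splits=10)

for train_index, test_index in kf.split(data):
    # Each pass holds out a different subset as the test set
    print("test rows:", test_index)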
The result is that you've used different combinations of training data and test sets. This is a good way to prevent overfitting because the model gets a lot "more chances to learn." It's similar to what we've discussed about Ensemble Learning (using different algorithms, or combinations of them, and then averaging the results).
Let us revisit our earlier classification example (classifying flowers into Iris setosa, Iris virginica, and Iris versicolor according to their sepal length and sepal width). This time, though, let us apply Cross-Validation (CV) and compare the accuracies with and without CV.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# File name assumed (the Iris dataset from the Resources section)
dataset = pd.read_csv('iris.csv')
dataset.head(5)
dataset.sample(10)
dataset.describe()
dataset.hist()

X = dataset[['sepal-length', 'sepal-width']].values
y = dataset[['class']].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=0)

classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train.ravel())
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)
import pandas as pd
from sklearn.model_selection import cross_val_score

X = dataset[['sepal-length', 'sepal-width']].values
y = dataset[['class']].values
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train.ravel())
y_pred = classifier.predict(X_test)

# 10-fold cross-validation on the training set (cv=10 matches the ten-subset example above)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train.ravel(), cv = 10)
accuracies.mean()
Result is 0.69 (with Cross-Validation applied)
Beware of underfitting
Our basic model is loosely patterned on how brain neurons work. The difference is that instead of "Processing," we have the so-called Hidden Layer. It sounds mysterious, but all it really means is that the layer is neither an input nor an output.
In practice, though, here's what ANNs actually look like:
Notice there is 1 column of inputs, 2 columns of hidden layers, and finally just 1 output. Also notice the connections and networks between the layers. This, too, is loosely based on how the brain works (we also have neural networks, but ours are far more complicated than what computers have now).
In other words, ANNs are our attempt to recreate the brain. Scientists and researchers are still far from that, but today's ANNs can be good enough for some purposes.
This particular use of ANNs is called Deep Learning. It’s a
subfield of machine learning that specializes in applying
algorithms inspired by brain neurons.
Creating an artificial neural network in Python
import numpy as np

# Sigmoid activation; returns its derivative when deriv=True
def nonlin(x, deriv=False):
    if deriv == True:
        return x*(1-x)
    return 1/(1+np.exp(-x))

# Input dataset: 4 examples with 3 features each
X = np.array([ [0,0,1],
               [0,1,1],
               [1,0,1],
               [1,1,1] ])
y = np.array([[0,0,1,1]]).T

np.random.seed(1)
syn0 = 2*np.random.random((3,1)) - 1   # weights initialized randomly with mean 0

# Train for a number of iterations (10,000 here)
for iteration in range(10000):
    l0 = X                                    # forward propagation
    l1 = nonlin(np.dot(l0, syn0))
    l1_error = y - l1                         # how much did we miss?
    l1_delta = l1_error * nonlin(l1, True)    # scale the error by the sigmoid slope
    syn0 += np.dot(l0.T, l1_delta)            # update the weights

print(l1)
● Caffe
● Theano
● Keras
● MXNet
Why TensorFlow
It’s from Google where you can expect it will be here for a long
time. This means learning how to use TensorFlow could be
very worth your time and effort. Also, in the future you might
easily find it easier to implement especially in large projects
(there’s Google Cloud Platform which fully supports
TensorFlow).
For this chapter let’s focus on TensorFlow and how to use it.
Let’s start with a simple illustration so you know the context
before diving deep into the code.
import math
import numpy as np
import pandas as pd
import tensorflow as tf

pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format

dataset = pd.read_csv("https://storage.googleapis.com/mledu-datasets/california_housing_train.csv", sep=",")

# Shuffle the rows and scale the target to thousands of dollars
dataset = dataset.reindex(np.random.permutation(dataset.index))
dataset["median_house_value"] /= 1000.0
dataset.describe()

# Define the input feature and the target
my_feature = dataset[["total_rooms"]]
feature_columns = [tf.feature_column.numeric_column("total_rooms")]
targets = dataset["median_house_value"]

# Gradient descent with gradient clipping, then a linear regressor estimator
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
# Inside the input function: batch, repeat, and optionally shuffle the data
ds = Dataset.from_tensor_slices((features, targets))
ds = ds.batch(batch_size).repeat(num_epochs)
if shuffle:
    ds = ds.shuffle(buffer_size=10000)

# Train the model (the full input-function definitions are abridged here)
_ = linear_regressor.train(
    input_fn=training_input_fn,
    steps=100
)

# Predict on the training data and compute the root mean squared error
predictions = linear_regressor.predict(input_fn=prediction_input_fn)
mean_squared_error = metrics.mean_squared_error(predictions, targets)
root_mean_squared_error = math.sqrt(mean_squared_error)
# The following lines live inside the train_model function called later
periods = 10

my_feature = input_feature
my_feature_data = dataset[[my_feature]]
my_label = "median_house_value"
targets = dataset[my_label]

feature_columns = [tf.feature_column.numeric_column(my_feature)]

my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
plt.figure(figsize=(15, 6))
plt.subplot(1, 2, 1)
plt.ylabel(my_label)
plt.xlabel(my_feature)
sample = dataset.sample(n=300)
plt.scatter(sample[my_feature], sample[my_label])
print("Training model...")
root_mean_squared_errors = []
linear_regressor.train(
input_fn=training_input_fn,
steps=steps_per_period
predictions = linear_regressor.predict(input_fn=prediction_input_fn)
root_mean_squared_error = math.sqrt(
metrics.mean_squared_error(predictions, targets))
root_mean_squared_errors.append(root_mean_squared_error)
weight = linear_regressor.get_variable_value('linear/linear_model/%s/weights' % input_feature)[0]
bias = linear_regressor.get_variable_value('linear/linear_model/bias_weights')

x_extents = np.maximum(np.minimum(x_extents, sample[my_feature].max()),
                       sample[my_feature].min())
plt.subplot(1, 2, 2)
plt.ylabel('RMSE')
plt.xlabel('Periods')
plt.plot(root_mean_squared_errors)
calibration_data = pd.DataFrame()
calibration_data["predictions"] = pd.Series(predictions)
calibration_data["targets"] = pd.Series(targets)
display.display(calibration_data.describe())
train_model(
learning_rate=0.00001,
steps=100,
batch_size=1
)
Our model performs poorly. The lines are far off from the data
points. Let’s try again by changing the parameters (learning
rate, steps, batch size):
train_model(
learning_rate=0.00002,
steps=500,
batch_size=5
)
We got a much better result just by changing the parameters. This is called hyperparameter tuning, which is about adjusting the parameters until we get a good model or accuracy.
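One simple way to explore this (a sketch, assuming the train_model function defined above prints or plots its results) is to loop over a few candidate learning rates and compare the outcomes:

# Try a few learning rates and compare the resulting plots / RMSE values
for lr in [0.00001, 0.00002, 0.0001]:
    print("learning_rate =", lr)
    train_model(
        learning_rate=lr,
        steps=500,
        batch_size=5
    )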
In just one page you can perform a full data analysis with all
the code, figures, and notes included. Many online instructors
actually use Jupyter Notebook for convenience and
accessibility to learners.
To use Jupyter Notebook, one way is to open the Anaconda Prompt (type and search through your installed apps) and then type jupyter notebook.
Press Enter and a new browser tab will appear, showing the Jupyter file browser.
You can click through folders and files just as you normally would on your computer. You can then choose a folder (where you want to put the file) and click Python 3 to create a new notebook.
For more tutorials, you can always refer to their documentation
at https://jupyter-notebook.readthedocs.io/en/stable/
Another tool that works like Jupyter Notebook is Google Colab (https://colab.research.google.com/). It's becoming popular because of its minimal setup and online collaboration. It's like Google Docs for coding.
Python Crash Course
dataset = pd.read_csv('your_data.csv') to load the data from a CSV file
dataset.describe() to show a summary of the data such as the mean and standard deviation
dataset.head(5) to show the first 5 entries of the dataset
dataset.hist('feature_1') to create a histogram of a particular column/feature
X = dataset[['feature_1']].values to assign X to a chosen input feature
y = dataset[['output']].values to assign y to the output or target
● Python (https://www.python.org/)
● Anaconda (https://anaconda.org/)
● Numpy (http://www.numpy.org/)
● Pandas (https://pandas.pydata.org/)
● Matplotlib (https://matplotlib.org/)
● Scikit-learn (http://scikit-learn.org/)
● TensorFlow (https://www.tensorflow.org/)
Datasets
● Kaggle (https://www.kaggle.com/datasets)
● California Housing Dataset (https://www.kaggle.com/camnugent/california-housing-prices/data)
● UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets.html)
● Wine Quality Dataset (https://archive.ics.uci.edu/ml/datasets/Wine+Quality)
● Iris Dataset (https://archive.ics.uci.edu/ml/datasets/Iris)
https://www.amazon.com/dp/B07FTPKJMM